linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling
@ 2022-11-30 23:08 Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 01/50] KVM: Register /dev/kvm as the _very_ last thing during initialization Sean Christopherson
                   ` (51 more replies)
  0 siblings, 52 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

The main theme of this series is to kill off kvm_arch_init(),
kvm_arch_hardware_(un)setup(), and kvm_arch_check_processor_compat(), which
all originated in x86 code from way back when, and needlessly complicate
both common KVM code and architecture code.  E.g. many architectures don't
mark functions/data as __init/__ro_after_init purely because kvm_init()
isn't marked __init to support x86's separate vendor modules.

The idea/hope is that with those hooks gone (moved to arch code), it will
be easier for x86 (and other architectures) to modify their module init
sequences as needed without having to fight common KVM code.  E.g. I'm
hoping that ARM can build on this to simplify its hardware enabling logic,
especially the pKVM side of things.

There are bug fixes throughout this series.  They are more scattered than
I would usually prefer, but getting the sequencing correct was a gigantic
pain for many of the x86 fixes due to needing to fix common code in order
for the x86 fix to have any meaning.  And while the bugs are often fatal,
they aren't all that interesting for most users as they either require a
malicious admin or broken hardware, i.e. aren't likely to be encountered
by the vast majority of KVM users.  So unless someone _really_ wants a
particular fix isolated for backporting, I'm not planning on shuffling
patches.

v2:
 - Collect reviews/acks.
 - Reset eVMCS controls in VP assist page when disabling hardware. [Vitaly]
 - Clean up labels in kvm_init(). [Chao]
 - Fix a goof where VMX compat checks used boot_cpu_has. [Kai]
 - Reorder patches and/or tweak changelogs to not require time travel. [Paolo, Kai]
 - Rewrite the changelog for the patch to move ARM away from kvm_arch_init()
   to call out that it fixes theoretical bugs. [Philippe]
 - Document why it's safe to allow preemption and/or migration when
   accessing kvm_usage_count.

v1: https://lore.kernel.org/all/20221102231911.3107438-1-seanjc@google.com

Chao Gao (3):
  KVM: x86: Do compatibility checks when onlining CPU
  KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  KVM: Disable CPU hotplug during hardware enabling/disabling

Isaku Yamahata (3):
  KVM: Drop kvm_count_lock and instead protect kvm_usage_count with
    kvm_lock
  KVM: Remove on_each_cpu(hardware_disable_nolock) in kvm_exit()
  KVM: Make hardware_enable_failed a local variable in the "enable all"
    path

Marc Zyngier (1):
  KVM: arm64: Simplify the CPUHP logic

Sean Christopherson (43):
  KVM: Register /dev/kvm as the _very_ last thing during initialization
  KVM: Initialize IRQ FD after arch hardware setup
  KVM: Allocate cpus_hardware_enabled after arch hardware setup
  KVM: Teardown VFIO ops earlier in kvm_exit()
  KVM: s390: Unwind kvm_arch_init() piece-by-piece() if a step fails
  KVM: s390: Move hardware setup/unsetup to init/exit
  KVM: x86: Do timer initialization after XCR0 configuration
  KVM: x86: Move hardware setup/unsetup to init/exit
  KVM: Drop arch hardware (un)setup hooks
  KVM: VMX: Reset eVMCS controls in VP assist page during hardware
    disabling
  KVM: VMX: Don't bother disabling eVMCS static key on module exit
  KVM: VMX: Move Hyper-V eVMCS initialization to helper
  KVM: x86: Move guts of kvm_arch_init() to standalone helper
  KVM: VMX: Do _all_ initialization before exposing /dev/kvm to
    userspace
  KVM: x86: Serialize vendor module initialization (hardware setup)
  KVM: arm64: Free hypervisor allocations if vector slot init fails
  KVM: arm64: Unregister perf callbacks if hypervisor finalization fails
  KVM: arm64: Do arm/arch initialization without bouncing through
    kvm_init()
  KVM: arm64: Mark kvm_arm_init() and its unique descendants as __init
  KVM: MIPS: Hardcode callbacks to hardware virtualization extensions
  KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init()
  KVM: MIPS: Register die notifier prior to kvm_init()
  KVM: RISC-V: Do arch init directly in riscv_kvm_init()
  KVM: RISC-V: Tag init functions and data with __init, __ro_after_init
  KVM: PPC: Move processor compatibility check to module init
  KVM: s390: Do s390 specific init without bouncing through kvm_init()
  KVM: s390: Mark __kvm_s390_init() and its descendants as __init
  KVM: Drop kvm_arch_{init,exit}() hooks
  KVM: VMX: Make VMCS configuration/capabilities structs read-only after
    init
  KVM: x86: Do CPU compatibility checks in x86 code
  KVM: Drop kvm_arch_check_processor_compat() hook
  KVM: x86: Use KBUILD_MODNAME to specify vendor module name
  KVM: x86: Unify pr_fmt to use module name for all KVM modules
  KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks
  KVM: x86: Do VMX/SVM support checks directly in vendor code
  KVM: VMX: Shuffle support checks and hardware enabling code around
  KVM: SVM: Check for SVM support in CPU compatibility checks
  KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from
    kvm_x86_init_ops)
  KVM: Ensure CPU is stable during low level hardware enable/disable
  KVM: Use a per-CPU variable to track which CPUs have enabled
    virtualization
  KVM: Register syscore (suspend/resume) ops early in kvm_init()
  KVM: Opt out of generic hardware enabling on s390 and PPC
  KVM: Clean up error labels in kvm_init()

 Documentation/virt/kvm/locking.rst  |  25 +-
 arch/arm64/include/asm/kvm_host.h   |  15 +-
 arch/arm64/include/asm/kvm_mmu.h    |   4 +-
 arch/arm64/kvm/Kconfig              |   1 +
 arch/arm64/kvm/arch_timer.c         |  29 +-
 arch/arm64/kvm/arm.c                |  93 +++---
 arch/arm64/kvm/mmu.c                |  12 +-
 arch/arm64/kvm/reset.c              |   8 +-
 arch/arm64/kvm/sys_regs.c           |   6 +-
 arch/arm64/kvm/vgic/vgic-init.c     |  19 +-
 arch/arm64/kvm/vmid.c               |   6 +-
 arch/mips/include/asm/kvm_host.h    |   3 +-
 arch/mips/kvm/Kconfig               |   1 +
 arch/mips/kvm/Makefile              |   2 +-
 arch/mips/kvm/callback.c            |  14 -
 arch/mips/kvm/mips.c                |  34 +--
 arch/mips/kvm/vz.c                  |   7 +-
 arch/powerpc/include/asm/kvm_host.h |   3 -
 arch/powerpc/include/asm/kvm_ppc.h  |   1 -
 arch/powerpc/kvm/book3s.c           |  12 +-
 arch/powerpc/kvm/e500.c             |   6 +-
 arch/powerpc/kvm/e500mc.c           |   6 +-
 arch/powerpc/kvm/powerpc.c          |  20 --
 arch/riscv/include/asm/kvm_host.h   |   7 +-
 arch/riscv/kvm/Kconfig              |   1 +
 arch/riscv/kvm/main.c               |  23 +-
 arch/riscv/kvm/mmu.c                |  12 +-
 arch/riscv/kvm/vmid.c               |   4 +-
 arch/s390/include/asm/kvm_host.h    |   1 -
 arch/s390/kvm/interrupt.c           |   2 +-
 arch/s390/kvm/kvm-s390.c            |  84 +++---
 arch/s390/kvm/kvm-s390.h            |   2 +-
 arch/s390/kvm/pci.c                 |   2 +-
 arch/s390/kvm/pci.h                 |   2 +-
 arch/x86/include/asm/kvm-x86-ops.h  |   1 +
 arch/x86/include/asm/kvm_host.h     |   8 +-
 arch/x86/kvm/Kconfig                |   1 +
 arch/x86/kvm/cpuid.c                |   1 +
 arch/x86/kvm/debugfs.c              |   2 +
 arch/x86/kvm/emulate.c              |   1 +
 arch/x86/kvm/hyperv.c               |   1 +
 arch/x86/kvm/i8254.c                |   4 +-
 arch/x86/kvm/i8259.c                |   4 +-
 arch/x86/kvm/ioapic.c               |   1 +
 arch/x86/kvm/irq.c                  |   1 +
 arch/x86/kvm/irq_comm.c             |   7 +-
 arch/x86/kvm/kvm_onhyperv.c         |   1 +
 arch/x86/kvm/lapic.c                |   8 +-
 arch/x86/kvm/mmu/mmu.c              |   6 +-
 arch/x86/kvm/mmu/page_track.c       |   1 +
 arch/x86/kvm/mmu/spte.c             |   4 +-
 arch/x86/kvm/mmu/spte.h             |   4 +-
 arch/x86/kvm/mmu/tdp_iter.c         |   1 +
 arch/x86/kvm/mmu/tdp_mmu.c          |   1 +
 arch/x86/kvm/mtrr.c                 |   1 +
 arch/x86/kvm/pmu.c                  |   1 +
 arch/x86/kvm/smm.c                  |   1 +
 arch/x86/kvm/svm/avic.c             |   2 +-
 arch/x86/kvm/svm/nested.c           |   2 +-
 arch/x86/kvm/svm/pmu.c              |   2 +
 arch/x86/kvm/svm/sev.c              |   1 +
 arch/x86/kvm/svm/svm.c              |  89 +++---
 arch/x86/kvm/svm/svm_onhyperv.c     |   1 +
 arch/x86/kvm/svm/svm_onhyperv.h     |   4 +-
 arch/x86/kvm/vmx/capabilities.h     |   4 +-
 arch/x86/kvm/vmx/hyperv.c           |   1 +
 arch/x86/kvm/vmx/hyperv.h           |   4 +-
 arch/x86/kvm/vmx/nested.c           |   3 +-
 arch/x86/kvm/vmx/pmu_intel.c        |   5 +-
 arch/x86/kvm/vmx/posted_intr.c      |   2 +
 arch/x86/kvm/vmx/sgx.c              |   5 +-
 arch/x86/kvm/vmx/vmcs12.c           |   1 +
 arch/x86/kvm/vmx/vmx.c              | 438 +++++++++++++++-------------
 arch/x86/kvm/vmx/vmx_ops.h          |   4 +-
 arch/x86/kvm/x86.c                  | 248 +++++++++-------
 arch/x86/kvm/xen.c                  |   1 +
 include/kvm/arm_arch_timer.h        |   6 +-
 include/kvm/arm_vgic.h              |   4 +
 include/linux/cpuhotplug.h          |   5 +-
 include/linux/kvm_host.h            |  13 +-
 virt/kvm/Kconfig                    |   3 +
 virt/kvm/kvm_main.c                 | 303 ++++++++++---------
 82 files changed, 863 insertions(+), 816 deletions(-)
 delete mode 100644 arch/mips/kvm/callback.c


base-commit: 3e04435fe60590a1c79ec94d60e9897c3ff7d73b
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 01/50] KVM: Register /dev/kvm as the _very_ last thing during initialization
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 02/50] KVM: Initialize IRQ FD after arch hardware setup Sean Christopherson
                   ` (50 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)

Register /dev/kvm, i.e. expose KVM to userspace, only after all other
setup has completed.  Once /dev/kvm is exposed, userspace can start
invoking KVM ioctls, creating VMs, etc...  If userspace creates a VM
before KVM is done with its configuration, bad things may happen, e.g.
KVM will fail to properly migrate vCPU state if a VM is created before
KVM has registered preemption notifiers.
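
As an illustration only, the ordering rule can be sketched in a small
userspace toy.  Every name below (toy_init(), expose_device(), etc.) is
hypothetical and is not a KVM or kernel API; expose_device() merely stands
in for misc_register(), and the "notifiers" flag stands in for whatever
infrastructure userspace-triggered paths rely on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for KVM state; none of this is kernel code. */
static bool notifiers_ready;	/* infrastructure that ioctls rely on */
static bool device_visible;	/* models /dev/kvm being registered */

static int setup_notifiers(void)
{
	notifiers_ready = true;
	return 0;
}

static void teardown_notifiers(void)
{
	notifiers_ready = false;
}

/* Models misc_register(): once this succeeds, userspace can call in. */
static int expose_device(bool fail)
{
	if (fail)
		return -1;

	/* Everything userspace depends on must already be in place. */
	assert(notifiers_ready);
	device_visible = true;
	return 0;
}

/* Exposure is the very last step; on failure, unwind what came before. */
int toy_init(bool fail_register)
{
	int r;

	r = setup_notifiers();
	if (r)
		return r;

	r = expose_device(fail_register);
	if (r)
		goto err_register;

	return 0;

err_register:
	teardown_notifiers();
	return r;
}

/* Hide the device first so that no new callers can arrive mid-teardown. */
void toy_exit(void)
{
	device_visible = false;
	teardown_notifiers();
}

/* Models a userspace ioctl: reachable only while the device is visible. */
int toy_create_vm(void)
{
	if (!device_visible)
		return -1;
	return notifiers_ready ? 0 : -2;
}
```

With this ordering, toy_create_vm() can never observe the "visible but
not ready" state (the -2 return), which is the toy equivalent of a VM
being created before preemption notifiers are registered.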

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1782c4555d94..b60abb03606b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5919,12 +5919,6 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 
 	kvm_chardev_ops.owner = module;
 
-	r = misc_register(&kvm_dev);
-	if (r) {
-		pr_err("kvm: misc device register failed\n");
-		goto out_unreg;
-	}
-
 	register_syscore_ops(&kvm_syscore_ops);
 
 	kvm_preempt_ops.sched_in = kvm_sched_in;
@@ -5933,11 +5927,24 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	kvm_init_debug();
 
 	r = kvm_vfio_ops_init();
-	WARN_ON(r);
+	if (WARN_ON_ONCE(r))
+		goto err_vfio;
+
+	/*
+	 * Registration _must_ be the very last thing done, as this exposes
+	 * /dev/kvm to userspace, i.e. all infrastructure must be setup!
+	 */
+	r = misc_register(&kvm_dev);
+	if (r) {
+		pr_err("kvm: misc device register failed\n");
+		goto err_register;
+	}
 
 	return 0;
 
-out_unreg:
+err_register:
+	kvm_vfio_ops_exit();
+err_vfio:
 	kvm_async_pf_deinit();
 out_free_4:
 	for_each_possible_cpu(cpu)
@@ -5963,8 +5970,14 @@ void kvm_exit(void)
 {
 	int cpu;
 
-	debugfs_remove_recursive(kvm_debugfs_dir);
+	/*
+	 * Note, unregistering /dev/kvm doesn't strictly need to come first,
+	 * fops_get(), a.k.a. try_module_get(), prevents acquiring references
+	 * to KVM while the module is being stopped.
+	 */
 	misc_deregister(&kvm_dev);
+
+	debugfs_remove_recursive(kvm_debugfs_dir);
 	for_each_possible_cpu(cpu)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
 	kmem_cache_destroy(kvm_vcpu_cache);
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 02/50] KVM: Initialize IRQ FD after arch hardware setup
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 01/50] KVM: Register /dev/kvm as the _very_ last thing during initialization Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 03/50] KVM: Allocate cpus_hardware_enabled " Sean Christopherson
                   ` (49 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)

Move initialization of KVM's IRQ FD workqueue below arch hardware setup
as a step towards consolidating arch "init" and "hardware setup", and
eventually towards dropping the hooks entirely.  There is no dependency
on the workqueue being created before hardware setup; the workqueue is
used only when destroying VMs, i.e. it only needs to be created before
/dev/kvm is exposed to userspace.

Move the destruction of the workqueue before the arch hooks to maintain
symmetry, and so that arch code can move away from the hooks without
having to worry about ordering changes.

Reword the comment about kvm_irqfd_init() needing to come after
kvm_arch_init() to call out that kvm_arch_init() must come before common
KVM does _anything_, as x86 very subtly relies on that behavior to deal
with multiple calls to kvm_init(), e.g. if userspace attempts to load
kvm_amd.ko and kvm_intel.ko.  Tag the code with a FIXME, as x86's subtle
requirement is gross, and invoking an arch callback as the very first
action in a helper that is called only from arch code is silly.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b60abb03606b..43e2e4f38151 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5852,24 +5852,19 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	int r;
 	int cpu;
 
+	/*
+	 * FIXME: Get rid of kvm_arch_init(), vendor code should call arch code
+	 * directly.  Note, kvm_arch_init() _must_ be called before anything
+	 * else as x86 relies on checks buried in kvm_arch_init() to guard
+	 * against multiple calls to kvm_init().
+	 */
 	r = kvm_arch_init(opaque);
 	if (r)
-		goto out_fail;
-
-	/*
-	 * kvm_arch_init makes sure there's at most one caller
-	 * for architectures that support multiple implementations,
-	 * like intel and amd on x86.
-	 * kvm_arch_init must be called before kvm_irqfd_init to avoid creating
-	 * conflicts in case kvm is already setup for another implementation.
-	 */
-	r = kvm_irqfd_init();
-	if (r)
-		goto out_irqfd;
+		return r;
 
 	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL)) {
 		r = -ENOMEM;
-		goto out_free_0;
+		goto err_hw_enabled;
 	}
 
 	r = kvm_arch_hardware_setup(opaque);
@@ -5913,9 +5908,13 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 		}
 	}
 
+	r = kvm_irqfd_init();
+	if (r)
+		goto err_irqfd;
+
 	r = kvm_async_pf_init();
 	if (r)
-		goto out_free_4;
+		goto err_async_pf;
 
 	kvm_chardev_ops.owner = module;
 
@@ -5946,6 +5945,9 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	kvm_vfio_ops_exit();
 err_vfio:
 	kvm_async_pf_deinit();
+err_async_pf:
+	kvm_irqfd_exit();
+err_irqfd:
 out_free_4:
 	for_each_possible_cpu(cpu)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
@@ -5957,11 +5959,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	kvm_arch_hardware_unsetup();
 out_free_1:
 	free_cpumask_var(cpus_hardware_enabled);
-out_free_0:
-	kvm_irqfd_exit();
-out_irqfd:
+err_hw_enabled:
 	kvm_arch_exit();
-out_fail:
 	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_init);
@@ -5986,9 +5985,9 @@ void kvm_exit(void)
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
+	kvm_irqfd_exit();
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
-	kvm_irqfd_exit();
 	free_cpumask_var(cpus_hardware_enabled);
 	kvm_vfio_ops_exit();
 }
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 03/50] KVM: Allocate cpus_hardware_enabled after arch hardware setup
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 01/50] KVM: Register /dev/kvm as the _very_ last thing during initialization Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 02/50] KVM: Initialize IRQ FD after arch hardware setup Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 04/50] KVM: Teardown VFIO ops earlier in kvm_exit() Sean Christopherson
                   ` (48 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)

Allocate cpus_hardware_enabled after arch hardware setup so that arch
"init" and "hardware setup" are called back-to-back and thus can be
combined in a future patch.  cpus_hardware_enabled is never used before
kvm_create_vm(), i.e. doesn't depend on hardware setup and
only needs to be allocated before /dev/kvm is exposed to userspace.

Free the object before the arch hooks are invoked to maintain symmetry,
and so that arch code can move away from the hooks without having to
worry about ordering changes.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
---
 virt/kvm/kvm_main.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43e2e4f38151..ded88ad6c2d8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5862,15 +5862,15 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	if (r)
 		return r;
 
+	r = kvm_arch_hardware_setup(opaque);
+	if (r < 0)
+		goto err_hw_setup;
+
 	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL)) {
 		r = -ENOMEM;
 		goto err_hw_enabled;
 	}
 
-	r = kvm_arch_hardware_setup(opaque);
-	if (r < 0)
-		goto out_free_1;
-
 	c.ret = &r;
 	c.opaque = opaque;
 	for_each_online_cpu(cpu) {
@@ -5956,10 +5956,10 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
 out_free_2:
-	kvm_arch_hardware_unsetup();
-out_free_1:
 	free_cpumask_var(cpus_hardware_enabled);
 err_hw_enabled:
+	kvm_arch_hardware_unsetup();
+err_hw_setup:
 	kvm_arch_exit();
 	return r;
 }
@@ -5986,9 +5986,9 @@ void kvm_exit(void)
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_irqfd_exit();
+	free_cpumask_var(cpus_hardware_enabled);
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
-	free_cpumask_var(cpus_hardware_enabled);
 	kvm_vfio_ops_exit();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 04/50] KVM: Teardown VFIO ops earlier in kvm_exit()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (2 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 03/50] KVM: Allocate cpus_hardware_enabled " Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 05/50] KVM: s390: Unwind kvm_arch_init() piece-by-piece() if a step fails Sean Christopherson
                   ` (47 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)

Move the call to kvm_vfio_ops_exit() further up kvm_exit() to try and
bring some amount of symmetry to the setup order in kvm_init(), and more
importantly so that the arch hooks are invoked dead last by kvm_exit().
This will allow arch code to move away from the arch hooks without any
change in ordering between arch code and common code in kvm_exit().

That kvm_vfio_ops_exit() is called last appears to be 100% arbitrary.  It
was bolted on after the fact by commit 571ee1b68598 ("kvm: vfio: fix
unregister kvm_device_ops of vfio").  The nullified kvm_device_ops_table
is also local to kvm_main.c and is used only when there are active VMs,
so unless arch code is doing something truly bizarre, nullifying the
table earlier in kvm_exit() is little more than a nop.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ded88ad6c2d8..988f7d92db2e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5980,6 +5980,7 @@ void kvm_exit(void)
 	for_each_possible_cpu(cpu)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
 	kmem_cache_destroy(kvm_vcpu_cache);
+	kvm_vfio_ops_exit();
 	kvm_async_pf_deinit();
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
@@ -5989,7 +5990,6 @@ void kvm_exit(void)
 	free_cpumask_var(cpus_hardware_enabled);
 	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
-	kvm_vfio_ops_exit();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
 
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 05/50] KVM: s390: Unwind kvm_arch_init() piece-by-piece() if a step fails
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (3 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 04/50] KVM: Teardown VFIO ops earlier in kvm_exit() Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 06/50] KVM: s390: Move hardware setup/unsetup to init/exit Sean Christopherson
                   ` (46 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)

In preparation for folding kvm_arch_hardware_setup() into kvm_arch_init(),
unwind initialization one step at a time instead of simply calling
kvm_arch_exit().  Using kvm_arch_exit() regardless of which initialization
step failed relies on all affected state playing nice with being undone
even if said state was never set up.  That holds true for state that is
currently configured by kvm_arch_init(), but not for state that's handled
by kvm_arch_hardware_setup(), e.g. calling gmap_unregister_pte_notifier()
without first registering a notifier would result in list corruption due
to attempting to delete an entry that was never added to the list.
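
To make the difference concrete, here is a userspace toy, offered purely
as a sketch.  All names (toy_init_stepwise(), undo_step(), the STEP_*
values) are hypothetical and are not s390 or KVM code.  undo_step() counts
"bogus" undos of steps that were never done, which is the toy analogue of
deleting a list entry that was never added.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical init steps, loosely modeled on a multi-stage arch init. */
enum { STEP_DEBUG, STEP_PCI, STEP_NOTIFIER, NR_STEPS };

static bool step_done[NR_STEPS];
static int bogus_undo_count;	/* undo calls for steps never set up */

static int do_step(int step, int fail_at)
{
	assert(step >= 0 && step < NR_STEPS);
	if (step == fail_at)
		return -1;
	step_done[step] = true;
	return 0;
}

static void undo_step(int step)
{
	/* In real code, this is where e.g. list corruption could occur. */
	if (!step_done[step])
		bogus_undo_count++;
	step_done[step] = false;
}

/* Catch-all teardown: undoes every step, whether or not it was done. */
void toy_arch_exit(void)
{
	undo_step(STEP_NOTIFIER);
	undo_step(STEP_PCI);
	undo_step(STEP_DEBUG);
}

/* Old style: any failure funnels into the catch-all teardown. */
int toy_init_catch_all(int fail_at)
{
	int step, rc;

	for (step = 0; step < NR_STEPS; step++) {
		rc = do_step(step, fail_at);
		if (rc) {
			toy_arch_exit();
			return rc;
		}
	}
	return 0;
}

/* New style: each label undoes only the steps that already succeeded. */
int toy_init_stepwise(int fail_at)
{
	int rc;

	rc = do_step(STEP_DEBUG, fail_at);
	if (rc)
		return rc;

	rc = do_step(STEP_PCI, fail_at);
	if (rc)
		goto err_pci;

	rc = do_step(STEP_NOTIFIER, fail_at);
	if (rc)
		goto err_notifier;

	return 0;

err_notifier:
	undo_step(STEP_PCI);
err_pci:
	undo_step(STEP_DEBUG);
	return rc;
}

int toy_bogus_undo_count(void)
{
	return bogus_undo_count;
}
```

In the toy, a failure at any step of toy_init_stepwise() leaves the bogus
undo count untouched, whereas toy_init_catch_all() failing at STEP_PCI
undoes two steps that never ran.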

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index e4890e04b210..221481a09742 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -498,11 +498,11 @@ int kvm_arch_init(void *opaque)
 
 	kvm_s390_dbf_uv = debug_register("kvm-uv", 32, 1, 7 * sizeof(long));
 	if (!kvm_s390_dbf_uv)
-		goto out;
+		goto err_kvm_uv;
 
 	if (debug_register_view(kvm_s390_dbf, &debug_sprintf_view) ||
 	    debug_register_view(kvm_s390_dbf_uv, &debug_sprintf_view))
-		goto out;
+		goto err_debug_view;
 
 	kvm_s390_cpu_feat_init();
 
@@ -510,25 +510,32 @@ int kvm_arch_init(void *opaque)
 	rc = kvm_register_device_ops(&kvm_flic_ops, KVM_DEV_TYPE_FLIC);
 	if (rc) {
 		pr_err("A FLIC registration call failed with rc=%d\n", rc);
-		goto out;
+		goto err_flic;
 	}
 
 	if (IS_ENABLED(CONFIG_VFIO_PCI_ZDEV_KVM)) {
 		rc = kvm_s390_pci_init();
 		if (rc) {
 			pr_err("Unable to allocate AIFT for PCI\n");
-			goto out;
+			goto err_pci;
 		}
 	}
 
 	rc = kvm_s390_gib_init(GAL_ISC);
 	if (rc)
-		goto out;
+		goto err_gib;
 
 	return 0;
 
-out:
-	kvm_arch_exit();
+err_gib:
+	if (IS_ENABLED(CONFIG_VFIO_PCI_ZDEV_KVM))
+		kvm_s390_pci_exit();
+err_pci:
+err_flic:
+err_debug_view:
+	debug_unregister(kvm_s390_dbf_uv);
+err_kvm_uv:
+	debug_unregister(kvm_s390_dbf);
 	return rc;
 }
 
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 06/50] KVM: s390: Move hardware setup/unsetup to init/exit
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (4 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 05/50] KVM: s390: Unwind kvm_arch_init() piece-by-piece() if a step fails Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 07/50] KVM: x86: Do timer initialization after XCR0 configuration Sean Christopherson
                   ` (45 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)

Now that kvm_arch_hardware_setup() is called immediately after
kvm_arch_init(), fold the guts of kvm_arch_hardware_(un)setup() into
kvm_arch_{init,exit}() as a step towards dropping one of the hooks.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
---
 arch/s390/kvm/kvm-s390.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 221481a09742..97c7ccd189eb 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -331,21 +331,12 @@ static struct notifier_block kvm_clock_notifier = {
 
 int kvm_arch_hardware_setup(void *opaque)
 {
-	gmap_notifier.notifier_call = kvm_gmap_notifier;
-	gmap_register_pte_notifier(&gmap_notifier);
-	vsie_gmap_notifier.notifier_call = kvm_s390_vsie_gmap_notifier;
-	gmap_register_pte_notifier(&vsie_gmap_notifier);
-	atomic_notifier_chain_register(&s390_epoch_delta_notifier,
-				       &kvm_clock_notifier);
 	return 0;
 }
 
 void kvm_arch_hardware_unsetup(void)
 {
-	gmap_unregister_pte_notifier(&gmap_notifier);
-	gmap_unregister_pte_notifier(&vsie_gmap_notifier);
-	atomic_notifier_chain_unregister(&s390_epoch_delta_notifier,
-					 &kvm_clock_notifier);
+
 }
 
 static void allow_cpu_feat(unsigned long nr)
@@ -525,6 +516,13 @@ int kvm_arch_init(void *opaque)
 	if (rc)
 		goto err_gib;
 
+	gmap_notifier.notifier_call = kvm_gmap_notifier;
+	gmap_register_pte_notifier(&gmap_notifier);
+	vsie_gmap_notifier.notifier_call = kvm_s390_vsie_gmap_notifier;
+	gmap_register_pte_notifier(&vsie_gmap_notifier);
+	atomic_notifier_chain_register(&s390_epoch_delta_notifier,
+				       &kvm_clock_notifier);
+
 	return 0;
 
 err_gib:
@@ -541,6 +539,11 @@ int kvm_arch_init(void *opaque)
 
 void kvm_arch_exit(void)
 {
+	gmap_unregister_pte_notifier(&gmap_notifier);
+	gmap_unregister_pte_notifier(&vsie_gmap_notifier);
+	atomic_notifier_chain_unregister(&s390_epoch_delta_notifier,
+					 &kvm_clock_notifier);
+
 	kvm_s390_gib_destroy();
 	if (IS_ENABLED(CONFIG_VFIO_PCI_ZDEV_KVM))
 		kvm_s390_pci_exit();
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 07/50] KVM: x86: Do timer initialization after XCR0 configuration
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (5 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 06/50] KVM: s390: Move hardware setup/unsetup to init/exit Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 08/50] KVM: x86: Move hardware setup/unsetup to init/exit Sean Christopherson
                   ` (44 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Move kvm_arch_init()'s call to kvm_timer_init() down a few lines below
the XCR0 configuration code.  A future patch will move hardware setup
into kvm_arch_init() and slot in vendor hardware setup before the call
to kvm_timer_init() so that timer initialization (among other stuff)
doesn't need to be unwound if vendor setup fails.  XCR0 setup on the
other hand needs to happen before vendor hardware setup.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f18f579ebde8..a873618564cd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9326,13 +9326,13 @@ int kvm_arch_init(void *opaque)
 	if (r)
 		goto out_free_percpu;
 
-	kvm_timer_init();
-
 	if (boot_cpu_has(X86_FEATURE_XSAVE)) {
 		host_xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
 		kvm_caps.supported_xcr0 = host_xcr0 & KVM_SUPPORTED_XCR0;
 	}
 
+	kvm_timer_init();
+
 	if (pi_inject_timer == -1)
 		pi_inject_timer = housekeeping_enabled(HK_TYPE_TIMER);
 #ifdef CONFIG_X86_64
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 08/50] KVM: x86: Move hardware setup/unsetup to init/exit
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (6 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 07/50] KVM: x86: Do timer initialization after XCR0 configuration Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 09/50] KVM: Drop arch hardware (un)setup hooks Sean Christopherson
                   ` (43 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Now that kvm_arch_hardware_setup() is called immediately after
kvm_arch_init(), fold the guts of kvm_arch_hardware_(un)setup() into
kvm_arch_{init,exit}() as a step towards dropping one of the hooks.

To avoid having to unwind various setup, e.g. registration of several
notifiers, slot in the vendor hardware setup before the registration of
said notifiers and callbacks.  Introducing a functional change while
moving code is less than ideal, but the alternative is adding a pile of
unwinding code, which is much more error prone, e.g. several attempts to
move the setup code verbatim all introduced bugs.

Add a comment to document that kvm_ops_update() is effectively the point
of no return, e.g. it sets the kvm_x86_ops.hardware_enable canary and so
needs to be unwound.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
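For illustration only (a userspace sketch, not kernel code and not part of
the patch): the ordering argument above can be modeled by running every
fallible step first and the non-failing, must-be-undone steps last, so that
the error path stays short.  All names below (example_init(), vendor_setup(),
register_notifier(), the fail_vendor_setup knob) are hypothetical stand-ins
and do not correspond to KVM functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state flags standing in for real resources. */
static bool cache_ready, vendor_ready, notifier_registered;
static bool fail_vendor_setup;	/* knob to simulate a vendor setup failure */

static int alloc_cache(void) { cache_ready = true; return 0; }
static void free_cache(void) { cache_ready = false; }

static int vendor_setup(void)
{
	if (fail_vendor_setup)
		return -1;
	vendor_ready = true;
	return 0;
}
static void vendor_unsetup(void) { vendor_ready = false; }

/* These cannot fail, but must be undone once they have run. */
static void register_notifier(void) { notifier_registered = true; }
static void unregister_notifier(void) { notifier_registered = false; }

static int example_init(void)
{
	int r;

	r = alloc_cache();
	if (r)
		return r;

	/* Last fallible step: on failure only the cache needs unwinding. */
	r = vendor_setup();
	if (r)
		goto out_free_cache;

	/* Point of no return: nothing below this line can fail. */
	register_notifier();
	return 0;

out_free_cache:
	free_cache();
	return r;
}

static void example_exit(void)
{
	/* Exit is only valid after a successful init. */
	assert(notifier_registered);

	/* Tear down in the reverse order of setup. */
	unregister_notifier();
	vendor_unsetup();
	free_cache();
}
```

Had register_notifier() run before vendor_setup() in this model, the error
path would also need to call unregister_notifier(), which is the kind of
extra unwinding the changelog argues against.
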
 arch/x86/kvm/x86.c | 121 +++++++++++++++++++++++----------------------
 1 file changed, 63 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a873618564cd..fe5f2e49b5eb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9258,6 +9258,24 @@ static struct notifier_block pvclock_gtod_notifier = {
 };
 #endif
 
+static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
+{
+	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
+
+#define __KVM_X86_OP(func) \
+	static_call_update(kvm_x86_##func, kvm_x86_ops.func);
+#define KVM_X86_OP(func) \
+	WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
+#define KVM_X86_OP_OPTIONAL __KVM_X86_OP
+#define KVM_X86_OP_OPTIONAL_RET0(func) \
+	static_call_update(kvm_x86_##func, (void *)kvm_x86_ops.func ? : \
+					   (void *)__static_call_return0);
+#include <asm/kvm-x86-ops.h>
+#undef __KVM_X86_OP
+
+	kvm_pmu_ops_update(ops->pmu_ops);
+}
+
 int kvm_arch_init(void *opaque)
 {
 	struct kvm_x86_init_ops *ops = opaque;
@@ -9331,6 +9349,24 @@ int kvm_arch_init(void *opaque)
 		kvm_caps.supported_xcr0 = host_xcr0 & KVM_SUPPORTED_XCR0;
 	}
 
+	rdmsrl_safe(MSR_EFER, &host_efer);
+
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
+		rdmsrl(MSR_IA32_XSS, host_xss);
+
+	kvm_init_pmu_capability();
+
+	r = ops->hardware_setup();
+	if (r != 0)
+		goto out_mmu_exit;
+
+	/*
+	 * Point of no return!  DO NOT add error paths below this point unless
+	 * absolutely necessary, as most operations from this point forward
+	 * require unwinding.
+	 */
+	kvm_ops_update(ops);
+
 	kvm_timer_init();
 
 	if (pi_inject_timer == -1)
@@ -9342,8 +9378,32 @@ int kvm_arch_init(void *opaque)
 		set_hv_tscchange_cb(kvm_hyperv_tsc_notifier);
 #endif
 
+	kvm_register_perf_callbacks(ops->handle_intel_pt_intr);
+
+	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+		kvm_caps.supported_xss = 0;
+
+#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
+	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
+#undef __kvm_cpu_cap_has
+
+	if (kvm_caps.has_tsc_control) {
+		/*
+		 * Make sure the user can only configure tsc_khz values that
+		 * fit into a signed integer.
+		 * A min value is not calculated because it will always
+		 * be 1 on all machines.
+		 */
+		u64 max = min(0x7fffffffULL,
+			      __scale_tsc(kvm_caps.max_tsc_scaling_ratio, tsc_khz));
+		kvm_caps.max_guest_tsc_khz = max;
+	}
+	kvm_caps.default_tsc_scaling_ratio = 1ULL << kvm_caps.tsc_scaling_ratio_frac_bits;
+	kvm_init_msr_list();
 	return 0;
 
+out_mmu_exit:
+	kvm_mmu_vendor_module_exit();
 out_free_percpu:
 	free_percpu(user_return_msrs);
 out_free_x86_emulator_cache:
@@ -9353,6 +9413,8 @@ int kvm_arch_init(void *opaque)
 
 void kvm_arch_exit(void)
 {
+	kvm_unregister_perf_callbacks();
+
 #ifdef CONFIG_X86_64
 	if (hypervisor_is_type(X86_HYPER_MS_HYPERV))
 		clear_hv_tscchange_cb();
@@ -9368,6 +9430,7 @@ void kvm_arch_exit(void)
 	irq_work_sync(&pvclock_irq_work);
 	cancel_work_sync(&pvclock_gtod_work);
 #endif
+	static_call(kvm_x86_hardware_unsetup)();
 	kvm_x86_ops.hardware_enable = NULL;
 	kvm_mmu_vendor_module_exit();
 	free_percpu(user_return_msrs);
@@ -11957,72 +12020,14 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
-static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
-{
-	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
-
-#define __KVM_X86_OP(func) \
-	static_call_update(kvm_x86_##func, kvm_x86_ops.func);
-#define KVM_X86_OP(func) \
-	WARN_ON(!kvm_x86_ops.func); __KVM_X86_OP(func)
-#define KVM_X86_OP_OPTIONAL __KVM_X86_OP
-#define KVM_X86_OP_OPTIONAL_RET0(func) \
-	static_call_update(kvm_x86_##func, (void *)kvm_x86_ops.func ? : \
-					   (void *)__static_call_return0);
-#include <asm/kvm-x86-ops.h>
-#undef __KVM_X86_OP
-
-	kvm_pmu_ops_update(ops->pmu_ops);
-}
-
 int kvm_arch_hardware_setup(void *opaque)
 {
-	struct kvm_x86_init_ops *ops = opaque;
-	int r;
-
-	rdmsrl_safe(MSR_EFER, &host_efer);
-
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		rdmsrl(MSR_IA32_XSS, host_xss);
-
-	kvm_init_pmu_capability();
-
-	r = ops->hardware_setup();
-	if (r != 0)
-		return r;
-
-	kvm_ops_update(ops);
-
-	kvm_register_perf_callbacks(ops->handle_intel_pt_intr);
-
-	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-		kvm_caps.supported_xss = 0;
-
-#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
-	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
-#undef __kvm_cpu_cap_has
-
-	if (kvm_caps.has_tsc_control) {
-		/*
-		 * Make sure the user can only configure tsc_khz values that
-		 * fit into a signed integer.
-		 * A min value is not calculated because it will always
-		 * be 1 on all machines.
-		 */
-		u64 max = min(0x7fffffffULL,
-			      __scale_tsc(kvm_caps.max_tsc_scaling_ratio, tsc_khz));
-		kvm_caps.max_guest_tsc_khz = max;
-	}
-	kvm_caps.default_tsc_scaling_ratio = 1ULL << kvm_caps.tsc_scaling_ratio_frac_bits;
-	kvm_init_msr_list();
 	return 0;
 }
 
 void kvm_arch_hardware_unsetup(void)
 {
-	kvm_unregister_perf_callbacks();
 
-	static_call(kvm_x86_hardware_unsetup)();
 }
 
 int kvm_arch_check_processor_compat(void *opaque)
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 09/50] KVM: Drop arch hardware (un)setup hooks
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (7 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 08/50] KVM: x86: Move hardware setup/unsetup to init/exit Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 10/50] KVM: VMX: Reset eVMCS controls in VP assist page during hardware disabling Sean Christopherson
                   ` (42 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Drop kvm_arch_hardware_setup() and kvm_arch_hardware_unsetup() now that
all implementations are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Acked-by: Anup Patel <anup@brainfault.org>
---
 arch/arm64/include/asm/kvm_host.h   |  1 -
 arch/arm64/kvm/arm.c                |  5 -----
 arch/mips/include/asm/kvm_host.h    |  1 -
 arch/mips/kvm/mips.c                |  5 -----
 arch/powerpc/include/asm/kvm_host.h |  1 -
 arch/powerpc/kvm/powerpc.c          |  5 -----
 arch/riscv/include/asm/kvm_host.h   |  1 -
 arch/riscv/kvm/main.c               |  5 -----
 arch/s390/kvm/kvm-s390.c            | 10 ----------
 arch/x86/kvm/x86.c                  | 10 ----------
 include/linux/kvm_host.h            |  2 --
 virt/kvm/kvm_main.c                 |  7 -------
 12 files changed, 53 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 45e2136322ba..5d5a887e63a5 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -859,7 +859,6 @@ static inline bool kvm_system_needs_idmapped_vectors(void)
 
 void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
 
-static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 7b107fa540fa..c6732ac329ca 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -63,11 +63,6 @@ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
 }
 
-int kvm_arch_hardware_setup(void *opaque)
-{
-	return 0;
-}
-
 int kvm_arch_check_processor_compat(void *opaque)
 {
 	return 0;
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 5cedb28e8a40..28f0ba97db71 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -888,7 +888,6 @@ extern unsigned long kvm_mips_get_ramsize(struct kvm *kvm);
 extern int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
 			     struct kvm_mips_interrupt *irq);
 
-static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_free_memslot(struct kvm *kvm,
 					 struct kvm_memory_slot *slot) {}
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index a25e0b73ee70..af29490d9740 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -135,11 +135,6 @@ void kvm_arch_hardware_disable(void)
 	kvm_mips_callbacks->hardware_disable();
 }
 
-int kvm_arch_hardware_setup(void *opaque)
-{
-	return 0;
-}
-
 int kvm_arch_check_processor_compat(void *opaque)
 {
 	return 0;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index caea15dcb91d..5d2c3a487e73 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -877,7 +877,6 @@ struct kvm_vcpu_arch {
 #define __KVM_HAVE_CREATE_DEVICE
 
 static inline void kvm_arch_hardware_disable(void) {}
-static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 04494a4fb37a..5faf69421f13 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -440,11 +440,6 @@ int kvm_arch_hardware_enable(void)
 	return 0;
 }
 
-int kvm_arch_hardware_setup(void *opaque)
-{
-	return 0;
-}
-
 int kvm_arch_check_processor_compat(void *opaque)
 {
 	return kvmppc_core_check_processor_compat();
diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index dbbf43d52623..8c771fc4f5d2 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -229,7 +229,6 @@ struct kvm_vcpu_arch {
 	bool pause;
 };
 
-static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index df2d8716851f..a146fa0ce4d2 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -25,11 +25,6 @@ int kvm_arch_check_processor_compat(void *opaque)
 	return 0;
 }
 
-int kvm_arch_hardware_setup(void *opaque)
-{
-	return 0;
-}
-
 int kvm_arch_hardware_enable(void)
 {
 	unsigned long hideleg, hedeleg;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 97c7ccd189eb..829e6e046003 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -329,16 +329,6 @@ static struct notifier_block kvm_clock_notifier = {
 	.notifier_call = kvm_clock_sync,
 };
 
-int kvm_arch_hardware_setup(void *opaque)
-{
-	return 0;
-}
-
-void kvm_arch_hardware_unsetup(void)
-{
-
-}
-
 static void allow_cpu_feat(unsigned long nr)
 {
 	set_bit_inv(nr, kvm_s390_available_cpu_feat);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe5f2e49b5eb..915d57c3b41d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12020,16 +12020,6 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
-int kvm_arch_hardware_setup(void *opaque)
-{
-	return 0;
-}
-
-void kvm_arch_hardware_unsetup(void)
-{
-
-}
-
 int kvm_arch_check_processor_compat(void *opaque)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8f874a964313..f2e0e78d2d92 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1463,8 +1463,6 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
-int kvm_arch_hardware_setup(void *opaque);
-void kvm_arch_hardware_unsetup(void);
 int kvm_arch_check_processor_compat(void *opaque);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 988f7d92db2e..0e62887e8ce1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5862,10 +5862,6 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	if (r)
 		return r;
 
-	r = kvm_arch_hardware_setup(opaque);
-	if (r < 0)
-		goto err_hw_setup;
-
 	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL)) {
 		r = -ENOMEM;
 		goto err_hw_enabled;
@@ -5958,8 +5954,6 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 out_free_2:
 	free_cpumask_var(cpus_hardware_enabled);
 err_hw_enabled:
-	kvm_arch_hardware_unsetup();
-err_hw_setup:
 	kvm_arch_exit();
 	return r;
 }
@@ -5988,7 +5982,6 @@ void kvm_exit(void)
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_irqfd_exit();
 	free_cpumask_var(cpus_hardware_enabled);
-	kvm_arch_hardware_unsetup();
 	kvm_arch_exit();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 10/50] KVM: VMX: Reset eVMCS controls in VP assist page during hardware disabling
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (8 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 09/50] KVM: Drop arch hardware (un)setup hooks Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-12-01 15:42   ` Vitaly Kuznetsov
  2022-11-30 23:08 ` [PATCH v2 11/50] KVM: VMX: Don't bother disabling eVMCS static key on module exit Sean Christopherson
                   ` (41 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Reset the eVMCS controls in the per-CPU VP assist page during hardware
disabling instead of waiting until kvm-intel's module exit.  The controls
are activated if and only if KVM creates a VM, i.e. don't need to be
reset if hardware is never enabled.

Doing the reset during hardware disabling will naturally fix a potential
NULL pointer deref bug once KVM disables CPU hotplug while enabling and
disabling hardware (which is necessary to fix a variety of bugs).  If the
kernel is running as the root partition, the VP assist page is unmapped
during CPU hot unplug, and so KVM's clearing of the eVMCS controls needs
to occur with CPU hot(un)plug disabled, otherwise KVM could attempt to
write to a CPU's VP assist page after it's unmapped.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
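For illustration only (a userspace sketch, not kernel code and not part of
the patch): the idea of resetting per-CPU state on the CPU that is being
disabled, rather than walking all CPUs at module exit, might be modeled as
below.  The struct, the page map, and every example_*() helper are
hypothetical; only the two field names are borrowed from the diff for
readability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_NR_CPUS 4

/* Hypothetical per-CPU page; not the real struct hv_vp_assist_page. */
struct example_assist_page {
	unsigned long current_nested_vmcs;
	unsigned int enlighten_vmentry;
};

static struct example_assist_page example_pages[EXAMPLE_NR_CPUS];
/* A NULL entry models a page that is not mapped for that CPU. */
static struct example_assist_page *example_page_map[EXAMPLE_NR_CPUS];

static void example_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < EXAMPLE_NR_CPUS);
	example_page_map[cpu] = &example_pages[cpu];
}

static void example_cpu_offline(int cpu)
{
	assert(cpu >= 0 && cpu < EXAMPLE_NR_CPUS);
	example_page_map[cpu] = NULL;
}

static void example_hardware_enable(int cpu)
{
	struct example_assist_page *p;

	assert(cpu >= 0 && cpu < EXAMPLE_NR_CPUS);
	p = example_page_map[cpu];
	if (!p)
		return;

	p->current_nested_vmcs = 0x1000;
	p->enlighten_vmentry = 1;
}

/*
 * Reset the state of the CPU being disabled.  Returns false if the page
 * is not mapped, in which case nothing is written.
 */
static bool example_hardware_disable(int cpu)
{
	struct example_assist_page *p;

	assert(cpu >= 0 && cpu < EXAMPLE_NR_CPUS);
	p = example_page_map[cpu];
	if (!p)
		return false;

	p->current_nested_vmcs = 0;
	p->enlighten_vmentry = 0;
	return true;
}
```

In this model the reset runs while the CPU is still online, i.e. while its
page is still mapped, which is the property the changelog relies on.
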
 arch/x86/kvm/vmx/vmx.c | 50 +++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cea8c07f5229..d85d175dca70 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -551,6 +551,33 @@ static int hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static void hv_reset_evmcs(void)
+{
+	struct hv_vp_assist_page *vp_ap;
+
+	if (!static_branch_unlikely(&enable_evmcs))
+		return;
+
+	/*
+	 * KVM should enable eVMCS if and only if all CPUs have a VP assist
+	 * page, and should reject CPU onlining if eVMCS is enabled and the
+	 * CPU doesn't have a VP assist page allocated.
+	 */
+	vp_ap = hv_get_vp_assist_page(smp_processor_id());
+	if (WARN_ON_ONCE(!vp_ap))
+		return;
+
+	/*
+	 * Reset everything to support using non-enlightened VMCS access later
+	 * (e.g. when we reload the module with enlightened_vmcs=0)
+	 */
+	vp_ap->nested_control.features.directhypercall = 0;
+	vp_ap->current_nested_vmcs = 0;
+	vp_ap->enlighten_vmentry = 0;
+}
+
+#else /* IS_ENABLED(CONFIG_HYPERV) */
+static void hv_reset_evmcs(void) {}
 #endif /* IS_ENABLED(CONFIG_HYPERV) */
 
 /*
@@ -2496,6 +2523,8 @@ static void vmx_hardware_disable(void)
 	if (cpu_vmxoff())
 		kvm_spurious_fault();
 
+	hv_reset_evmcs();
+
 	intel_pt_handle_vmx(0);
 }
 
@@ -8462,27 +8491,8 @@ static void vmx_exit(void)
 	kvm_exit();
 
 #if IS_ENABLED(CONFIG_HYPERV)
-	if (static_branch_unlikely(&enable_evmcs)) {
-		int cpu;
-		struct hv_vp_assist_page *vp_ap;
-		/*
-		 * Reset everything to support using non-enlightened VMCS
-		 * access later (e.g. when we reload the module with
-		 * enlightened_vmcs=0)
-		 */
-		for_each_online_cpu(cpu) {
-			vp_ap =	hv_get_vp_assist_page(cpu);
-
-			if (!vp_ap)
-				continue;
-
-			vp_ap->nested_control.features.directhypercall = 0;
-			vp_ap->current_nested_vmcs = 0;
-			vp_ap->enlighten_vmentry = 0;
-		}
-
+	if (static_branch_unlikely(&enable_evmcs))
 		static_branch_disable(&enable_evmcs);
-	}
 #endif
 	vmx_cleanup_l1d_flush();
 
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 11/50] KVM: VMX: Don't bother disabling eVMCS static key on module exit
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (9 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 10/50] KVM: VMX: Reset eVMCS controls in VP assist page during hardware disabling Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 12/50] KVM: VMX: Move Hyper-V eVMCS initialization to helper Sean Christopherson
                   ` (40 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Don't disable the eVMCS static key on module exit.  kvm_intel.ko owns the
key, so there can't possibly be users after kvm_intel.ko is unloaded, at
least not without much bigger issues.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d85d175dca70..c0de7160700b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8490,10 +8490,6 @@ static void vmx_exit(void)
 
 	kvm_exit();
 
-#if IS_ENABLED(CONFIG_HYPERV)
-	if (static_branch_unlikely(&enable_evmcs))
-		static_branch_disable(&enable_evmcs);
-#endif
 	vmx_cleanup_l1d_flush();
 
 	allow_smaller_maxphyaddr = false;
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 12/50] KVM: VMX: Move Hyper-V eVMCS initialization to helper
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (10 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 11/50] KVM: VMX: Don't bother disabling eVMCS static key on module exit Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-12-01 15:22   ` Vitaly Kuznetsov
  2022-11-30 23:08 ` [PATCH v2 13/50] KVM: x86: Move guts of kvm_arch_init() to standalone helper Sean Christopherson
                   ` (39 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Move Hyper-V's eVMCS initialization to a dedicated helper to clean up
vmx_init(), and add a comment to call out that the Hyper-V init code
doesn't need to be unwound if vmx_init() ultimately fails.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 73 +++++++++++++++++++++++++-----------------
 1 file changed, 43 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c0de7160700b..b8bf95b9710d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -523,6 +523,8 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx)
 static unsigned long host_idt_base;
 
 #if IS_ENABLED(CONFIG_HYPERV)
+static struct kvm_x86_ops vmx_x86_ops __initdata;
+
 static bool __read_mostly enlightened_vmcs = true;
 module_param(enlightened_vmcs, bool, 0444);
 
@@ -551,6 +553,43 @@ static int hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static __init void hv_init_evmcs(void)
+{
+	int cpu;
+
+	if (!enlightened_vmcs)
+		return;
+
+	/*
+	 * Enlightened VMCS usage should be recommended and the host needs
+	 * to support eVMCS v1 or above.
+	 */
+	if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED &&
+	    (ms_hyperv.nested_features & HV_X64_ENLIGHTENED_VMCS_VERSION) >=
+	     KVM_EVMCS_VERSION) {
+
+		/* Check that we have assist pages on all online CPUs */
+		for_each_online_cpu(cpu) {
+			if (!hv_get_vp_assist_page(cpu)) {
+				enlightened_vmcs = false;
+				break;
+			}
+		}
+
+		if (enlightened_vmcs) {
+			pr_info("KVM: vmx: using Hyper-V Enlightened VMCS\n");
+			static_branch_enable(&enable_evmcs);
+		}
+
+		if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH)
+			vmx_x86_ops.enable_l2_tlb_flush
+				= hv_enable_l2_tlb_flush;
+
+	} else {
+		enlightened_vmcs = false;
+	}
+}
+
 static void hv_reset_evmcs(void)
 {
 	struct hv_vp_assist_page *vp_ap;
@@ -577,6 +616,7 @@ static void hv_reset_evmcs(void)
 }
 
 #else /* IS_ENABLED(CONFIG_HYPERV) */
+static void hv_init_evmcs(void) {}
 static void hv_reset_evmcs(void) {}
 #endif /* IS_ENABLED(CONFIG_HYPERV) */
 
@@ -8500,38 +8540,11 @@ static int __init vmx_init(void)
 {
 	int r, cpu;
 
-#if IS_ENABLED(CONFIG_HYPERV)
 	/*
-	 * Enlightened VMCS usage should be recommended and the host needs
-	 * to support eVMCS v1 or above. We can also disable eVMCS support
-	 * with module parameter.
+	 * Note, hv_init_evmcs() touches only VMX knobs, i.e. there's nothing
+	 * to unwind if a later step fails.
 	 */
-	if (enlightened_vmcs &&
-	    ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED &&
-	    (ms_hyperv.nested_features & HV_X64_ENLIGHTENED_VMCS_VERSION) >=
-	    KVM_EVMCS_VERSION) {
-
-		/* Check that we have assist pages on all online CPUs */
-		for_each_online_cpu(cpu) {
-			if (!hv_get_vp_assist_page(cpu)) {
-				enlightened_vmcs = false;
-				break;
-			}
-		}
-
-		if (enlightened_vmcs) {
-			pr_info("KVM: vmx: using Hyper-V Enlightened VMCS\n");
-			static_branch_enable(&enable_evmcs);
-		}
-
-		if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH)
-			vmx_x86_ops.enable_l2_tlb_flush
-				= hv_enable_l2_tlb_flush;
-
-	} else {
-		enlightened_vmcs = false;
-	}
-#endif
+	hv_init_evmcs();
 
 	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
 		     __alignof__(struct vcpu_vmx), THIS_MODULE);
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 13/50] KVM: x86: Move guts of kvm_arch_init() to standalone helper
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (11 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 12/50] KVM: VMX: Move Hyper-V eVMCS initialization to helper Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 14/50] KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace Sean Christopherson
                   ` (38 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Move the guts of kvm_arch_init() to a new helper, kvm_x86_vendor_init(),
so that VMX can do _all_ arch and vendor initialization before calling
kvm_init().  Calling kvm_init() must be the _very_ last step during init,
as kvm_init() exposes /dev/kvm to userspace, i.e. allows creating VMs.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
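For illustration only (a userspace sketch, not kernel code and not part of
the patch): the "common init must come last" rule can be modeled with a
vendor init step followed by a common init step that makes the device
visible.  All example_*() names and the fail_common_init knob are
hypothetical; device_exposed merely stands in for /dev/kvm being reachable
from userspace.

```c
#include <assert.h>
#include <stdbool.h>

static bool vendor_initialized;
static bool device_exposed;	/* models /dev/kvm being visible */
static bool fail_common_init;	/* knob to simulate a common init failure */

static int example_vendor_init(void) { vendor_initialized = true; return 0; }
static void example_vendor_exit(void) { vendor_initialized = false; }

static int example_common_init(void)
{
	/*
	 * Userspace can call in as soon as this succeeds, so everything
	 * else must already be set up.
	 */
	assert(vendor_initialized);

	if (fail_common_init)
		return -1;
	device_exposed = true;
	return 0;
}
static void example_common_exit(void) { device_exposed = false; }

static int example_module_init(void)
{
	int r;

	r = example_vendor_init();
	if (r)
		return r;

	/* Must be the very last step: the device becomes visible here. */
	r = example_common_init();
	if (r)
		goto err_common_init;

	return 0;

err_common_init:
	example_vendor_exit();
	return r;
}

static void example_module_exit(void)
{
	/* Hide the device first, then tear down what it depended on. */
	example_common_exit();
	example_vendor_exit();
}
```

The exit path mirrors the init path in reverse, matching the order of
kvm_exit() followed by kvm_x86_vendor_exit() in the diff below.
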
 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/svm/svm.c          | 23 +++++++++++++++++++++--
 arch/x86/kvm/vmx/vmx.c          | 21 +++++++++++++++------
 arch/x86/kvm/x86.c              | 15 +++++++++++++--
 4 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 70af7240a1d5..04a9ae66fb8d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1758,6 +1758,9 @@ extern struct kvm_x86_ops kvm_x86_ops;
 #define KVM_X86_OP_OPTIONAL_RET0 KVM_X86_OP
 #include <asm/kvm-x86-ops.h>
 
+int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops);
+void kvm_x86_vendor_exit(void);
+
 #define __KVM_HAVE_ARCH_VM_ALLOC
 static inline struct kvm *kvm_arch_alloc_vm(void)
 {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 91352d692845..19e81a99c58f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5091,15 +5091,34 @@ static struct kvm_x86_init_ops svm_init_ops __initdata = {
 
 static int __init svm_init(void)
 {
+	int r;
+
 	__unused_size_checks();
 
-	return kvm_init(&svm_init_ops, sizeof(struct vcpu_svm),
-			__alignof__(struct vcpu_svm), THIS_MODULE);
+	r = kvm_x86_vendor_init(&svm_init_ops);
+	if (r)
+		return r;
+
+	/*
+	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
+	 * exposed to userspace!
+	 */
+	r = kvm_init(&svm_init_ops, sizeof(struct vcpu_svm),
+		     __alignof__(struct vcpu_svm), THIS_MODULE);
+	if (r)
+		goto err_kvm_init;
+
+	return 0;
+
+err_kvm_init:
+	kvm_x86_vendor_exit();
+	return r;
 }
 
 static void __exit svm_exit(void)
 {
 	kvm_exit();
+	kvm_x86_vendor_exit();
 }
 
 module_init(svm_init)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b8bf95b9710d..8e81cd94407d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8529,6 +8529,7 @@ static void vmx_exit(void)
 #endif
 
 	kvm_exit();
+	kvm_x86_vendor_exit();
 
 	vmx_cleanup_l1d_flush();
 
@@ -8546,23 +8547,25 @@ static int __init vmx_init(void)
 	 */
 	hv_init_evmcs();
 
+	r = kvm_x86_vendor_init(&vmx_init_ops);
+	if (r)
+		return r;
+
 	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
 		     __alignof__(struct vcpu_vmx), THIS_MODULE);
 	if (r)
-		return r;
+		goto err_kvm_init;
 
 	/*
-	 * Must be called after kvm_init() so enable_ept is properly set
+	 * Must be called after common x86 init so enable_ept is properly set
 	 * up. Hand the parameter mitigation value in which was stored in
 	 * the pre module init parser. If no parameter was given, it will
 	 * contain 'auto' which will be turned into the default 'cond'
 	 * mitigation mode.
 	 */
 	r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
-	if (r) {
-		vmx_exit();
-		return r;
-	}
+	if (r)
+		goto err_l1d_flush;
 
 	vmx_setup_fb_clear_ctrl();
 
@@ -8587,5 +8590,11 @@ static int __init vmx_init(void)
 		allow_smaller_maxphyaddr = true;
 
 	return 0;
+
+err_l1d_flush:
+	vmx_exit();
+err_kvm_init:
+	kvm_x86_vendor_exit();
+	return r;
 }
 module_init(vmx_init);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 915d57c3b41d..b33932fca36e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9278,7 +9278,16 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
 
 int kvm_arch_init(void *opaque)
 {
-	struct kvm_x86_init_ops *ops = opaque;
+	return 0;
+}
+
+void kvm_arch_exit(void)
+{
+
+}
+
+int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
+{
 	u64 host_pat;
 	int r;
 
@@ -9410,8 +9419,9 @@ int kvm_arch_init(void *opaque)
 	kmem_cache_destroy(x86_emulator_cache);
 	return r;
 }
+EXPORT_SYMBOL_GPL(kvm_x86_vendor_init);
 
-void kvm_arch_exit(void)
+void kvm_x86_vendor_exit(void)
 {
 	kvm_unregister_perf_callbacks();
 
@@ -9440,6 +9450,7 @@ void kvm_arch_exit(void)
 	WARN_ON(static_branch_unlikely(&kvm_xen_enabled.key));
 #endif
 }
+EXPORT_SYMBOL_GPL(kvm_x86_vendor_exit);
 
 static int __kvm_emulate_halt(struct kvm_vcpu *vcpu, int state, int reason)
 {
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 14/50] KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (12 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 13/50] KVM: x86: Move guts of kvm_arch_init() to standalone helper Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:08 ` [PATCH v2 15/50] KVM: x86: Serialize vendor module initialization (hardware setup) Sean Christopherson
                   ` (37 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC

Call kvm_init() only after _all_ setup is complete, as kvm_init() exposes
/dev/kvm to userspace and thus allows userspace to create VMs (and call
other ioctls).  E.g. KVM will encounter a NULL pointer when attempting to
add a vCPU to the per-CPU loaded_vmcss_on_cpu list if userspace is able to
create a VM before vmx_init() configures said list.

 BUG: kernel NULL pointer dereference, address: 0000000000000008
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP
 CPU: 6 PID: 1143 Comm: stable Not tainted 6.0.0-rc7+ #988
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
 RIP: 0010:vmx_vcpu_load_vmcs+0x68/0x230 [kvm_intel]
  <TASK>
  vmx_vcpu_load+0x16/0x60 [kvm_intel]
  kvm_arch_vcpu_load+0x32/0x1f0 [kvm]
  vcpu_load+0x2f/0x40 [kvm]
  kvm_arch_vcpu_create+0x231/0x310 [kvm]
  kvm_vm_ioctl+0x79f/0xe10 [kvm]
  ? handle_mm_fault+0xb1/0x220
  __x64_sys_ioctl+0x80/0xb0
  do_syscall_64+0x2b/0x50
  entry_SYSCALL_64_after_hwframe+0x46/0xb0
 RIP: 0033:0x7f5a6b05743b
  </TASK>
 Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel(+) kvm irqbypass
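
For readers unfamiliar with why an unconfigured list head blows up, here
is a minimal userspace sketch, assuming a kernel-style circular list.  A
zeroed head has next == NULL, so an unchecked insertion would write
through NULL->prev, which on a 64-bit build sits at offset 8 and would be
consistent with the faulting address above.  list_add_checked(),
init_all_lists(), loaded_on_cpu and NR_CPUS_SKETCH are hypothetical names;
the kernel's list_add() has no such check.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace copy of a circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Returns 0 on success, -1 if the head was never initialized. */
static int list_add_checked(struct list_head *entry, struct list_head *head)
{
	/*
	 * Without this check, a zeroed head would make the code below
	 * store through head->next->prev, i.e. through a NULL pointer.
	 */
	if (!head->next)
		return -1;

	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
	return 0;
}

#define NR_CPUS_SKETCH 4

/* Static storage is zero-initialized, like an unconfigured per-CPU list. */
static struct list_head loaded_on_cpu[NR_CPUS_SKETCH];

/* Must run before anything can reach list_add_checked(). */
static void init_all_lists(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		init_list_head(&loaded_on_cpu[cpu]);
}
```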

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8e81cd94407d..76185a7a7ded 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8521,19 +8521,23 @@ static void vmx_cleanup_l1d_flush(void)
 	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
 }
 
-static void vmx_exit(void)
+static void __vmx_exit(void)
 {
+	allow_smaller_maxphyaddr = false;
+
 #ifdef CONFIG_KEXEC_CORE
 	RCU_INIT_POINTER(crash_vmclear_loaded_vmcss, NULL);
 	synchronize_rcu();
 #endif
-
-	kvm_exit();
-	kvm_x86_vendor_exit();
-
 	vmx_cleanup_l1d_flush();
+}
 
-	allow_smaller_maxphyaddr = false;
+static void vmx_exit(void)
+{
+	kvm_exit();
+	kvm_x86_vendor_exit();
+
+	__vmx_exit();
 }
 module_exit(vmx_exit);
 
@@ -8551,11 +8555,6 @@ static int __init vmx_init(void)
 	if (r)
 		return r;
 
-	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
-		     __alignof__(struct vcpu_vmx), THIS_MODULE);
-	if (r)
-		goto err_kvm_init;
-
 	/*
 	 * Must be called after common x86 init so enable_ept is properly set
 	 * up. Hand the parameter mitigation value in which was stored in
@@ -8589,11 +8588,20 @@ static int __init vmx_init(void)
 	if (!enable_ept)
 		allow_smaller_maxphyaddr = true;
 
+	/*
+	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
+	 * exposed to userspace!
+	 */
+	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
+		     __alignof__(struct vcpu_vmx), THIS_MODULE);
+	if (r)
+		goto err_kvm_init;
+
 	return 0;
 
-err_l1d_flush:
-	vmx_exit();
 err_kvm_init:
+	__vmx_exit();
+err_l1d_flush:
 	kvm_x86_vendor_exit();
 	return r;
 }
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 15/50] KVM: x86: Serialize vendor module initialization (hardware setup)
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (13 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 14/50] KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace Sean Christopherson
@ 2022-11-30 23:08 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 16/50] KVM: arm64: Simplify the CPUHP logic Sean Christopherson
                   ` (36 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:08 UTC

Acquire a new mutex, vendor_module_lock, in kvm_x86_vendor_init() while
doing hardware setup to ensure that concurrent calls are fully serialized.
KVM rejects attempts to load vendor modules if a different module has
already been loaded, but doesn't handle the case where multiple vendor
modules are loaded at the same time, and module_init() doesn't run under
the global module_mutex.
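
The serialization can be sketched in userspace with pthreads.  This is an
illustration under assumptions, not the kernel implementation:
loaded_vendor stands in for kvm_x86_ops.hardware_enable,
vendor_init_unlocked() for __kvm_x86_vendor_init(), and
race_two_loaders() is a hypothetical test harness.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t vendor_module_lock = PTHREAD_MUTEX_INITIALIZER;
static const char *loaded_vendor;	/* NULL until a vendor module wins */

/* Check-then-set: racy if two callers run it concurrently. */
static int vendor_init_unlocked(const char *name)
{
	if (loaded_vendor)
		return -1;	/* models the "already loaded" rejection */
	loaded_vendor = name;
	return 0;
}

/* Locked wrapper: the check and the set become one critical section. */
static int vendor_init(const char *name)
{
	int r;

	pthread_mutex_lock(&vendor_module_lock);
	r = vendor_init_unlocked(name);
	pthread_mutex_unlock(&vendor_module_lock);

	return r;
}

static void *loader(void *arg)
{
	/* A non-NULL return value means this loader succeeded. */
	return vendor_init(arg) == 0 ? arg : NULL;
}

/* Race two loaders and return how many of them succeeded. */
static int race_two_loaders(void)
{
	static char amd[] = "kvm_amd", intel[] = "kvm_intel";
	pthread_t a, b;
	void *ra, *rb;
	int rc;

	loaded_vendor = NULL;
	rc = pthread_create(&a, NULL, loader, amd);
	assert(rc == 0);
	rc = pthread_create(&b, NULL, loader, intel);
	assert(rc == 0);
	pthread_join(a, &ra);
	pthread_join(b, &rb);

	return (ra != NULL) + (rb != NULL);
}
```

Whichever thread takes the mutex first wins; the other always observes
the updated state and is rejected.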

Note, in practice, this is likely a benign bug as no platform exists that
supports both SVM and VMX, i.e. barring a weird VM setup, one of the
vendor modules is guaranteed to fail a support check before modifying
common KVM state.

Alternatively, KVM could perform an atomic CMPXCHG on .hardware_enable,
but that comes with its own ugliness as it would require setting
.hardware_enable before success is guaranteed, e.g. attempting to load
the "wrong" module could result in spurious failure to load the "right"
module.
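
To make the drawback of the CMPXCHG approach concrete, here is a small
C11 sketch, assuming the slot is claimed before hardware support is known.
claim(), release(), vmx_enable() and svm_enable() are hypothetical names,
and the atomic variable merely stands in for .hardware_enable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*enable_fn)(void);

/* Hypothetical per-vendor callbacks; only their addresses matter here. */
static int vmx_enable(void) { return 0; }
static int svm_enable(void) { return 0; }

/* Stand-in for .hardware_enable; NULL means no vendor module is loaded. */
static _Atomic(enable_fn) hardware_enable;

/* Claim the slot atomically; returns 0 on success, -1 if already taken. */
static int claim(enable_fn fn)
{
	enable_fn expected = NULL;

	return atomic_compare_exchange_strong(&hardware_enable, &expected, fn) ?
	       0 : -1;
}

/* Give the slot back, e.g. after a later setup step failed. */
static void release(void)
{
	assert(atomic_load(&hardware_enable) != NULL);
	atomic_store(&hardware_enable, (enable_fn)NULL);
}
```

If the "wrong" module calls claim() and only later fails its support
check, a "right" module that calls claim() inside that window is rejected
even though the slot is about to be released.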

Introduce a new mutex as using kvm_lock is extremely deadlock prone due
to kvm_lock being taken under cpus_write_lock(), and in the future, under
cpus_read_lock().  Any operation that takes cpus_read_lock() while
holding kvm_lock would potentially deadlock, e.g. kvm_timer_init() takes
cpus_read_lock() to register a callback.  In theory, KVM could avoid
such problematic paths, i.e. do less setup under kvm_lock, but avoiding
all calls to cpus_read_lock() is subtly difficult and thus fragile.  E.g.
updating static calls also acquires cpus_read_lock().

Inverting the lock ordering, i.e. always taking kvm_lock outside
cpus_read_lock(), is not a viable option as kvm_lock is taken in various
callbacks that may be invoked under cpus_read_lock(), e.g. x86's
kvmclock_cpufreq_notifier().
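
The inversion can be illustrated with a toy, single-threaded order
tracker.  It is loosely inspired by what lockdep reports but is in no way
lockdep's algorithm: it only remembers direct "B taken while holding A"
pairs and flags the reverse pair.  All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum lock_id { KVM_LOCK, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* order[a][b] is true once lock b has been taken while holding lock a. */
static bool order[NR_LOCKS][NR_LOCKS];
static bool held[NR_LOCKS];

/* Record an acquisition; returns false if it inverts a known ordering. */
static bool acquire(enum lock_id lock)
{
	bool ok = true;

	for (int h = 0; h < NR_LOCKS; h++) {
		if (!held[h] || h == (int)lock)
			continue;
		if (order[lock][h])
			ok = false;	/* lock -> h was seen, now h -> lock */
		order[h][lock] = true;
	}
	held[lock] = true;

	return ok;
}

static void release(enum lock_id lock)
{
	assert(held[lock]);
	held[lock] = false;
}
```

Taking KVM_LOCK then CPU_HOTPLUG_LOCK on one path, and later
CPU_HOTPLUG_LOCK then KVM_LOCK on another, makes the second acquire()
return false, which is the two-lock cycle the commit message warns about.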

The lockdep splat below is dependent on future patches to take
cpus_read_lock() in hardware_enable_all(), but as above, deadlock is
already possible.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.0.0-smp--7ec93244f194-init2 #27 Tainted: G           O
  ------------------------------------------------------
  stable/251833 is trying to acquire lock:
  ffffffffc097ea28 (kvm_lock){+.+.}-{3:3}, at: hardware_enable_all+0x1f/0xc0 [kvm]

               but task is already holding lock:
  ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm]

               which lock already depends on the new lock.

               the existing dependency chain (in reverse order) is:

               -> #1 (cpu_hotplug_lock){++++}-{0:0}:
         cpus_read_lock+0x2a/0xa0
         __cpuhp_setup_state+0x2b/0x60
         __kvm_x86_vendor_init+0x16a/0x1870 [kvm]
         kvm_x86_vendor_init+0x23/0x40 [kvm]
         0xffffffffc0a4d02b
         do_one_initcall+0x110/0x200
         do_init_module+0x4f/0x250
         load_module+0x1730/0x18f0
         __se_sys_finit_module+0xca/0x100
         __x64_sys_finit_module+0x1d/0x20
         do_syscall_64+0x3d/0x80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd

               -> #0 (kvm_lock){+.+.}-{3:3}:
         __lock_acquire+0x16f4/0x30d0
         lock_acquire+0xb2/0x190
         __mutex_lock+0x98/0x6f0
         mutex_lock_nested+0x1b/0x20
         hardware_enable_all+0x1f/0xc0 [kvm]
         kvm_dev_ioctl+0x45e/0x930 [kvm]
         __se_sys_ioctl+0x77/0xc0
         __x64_sys_ioctl+0x1d/0x20
         do_syscall_64+0x3d/0x80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd

               other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(cpu_hotplug_lock);
                                 lock(kvm_lock);
                                 lock(cpu_hotplug_lock);
    lock(kvm_lock);

                *** DEADLOCK ***

  1 lock held by stable/251833:
   #0: ffffffffa2456828 (cpu_hotplug_lock){++++}-{0:0}, at: hardware_enable_all+0xf/0xc0 [kvm]

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/locking.rst |  6 ++++++
 arch/x86/kvm/x86.c                 | 18 ++++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 845a561629f1..132a9e5436e5 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -282,3 +282,9 @@ time it will be set using the Dirty tracking mechanism described above.
 		wakeup notification event since external interrupts from the
 		assigned devices happens, we will find the vCPU on the list to
 		wakeup.
+
+``vendor_module_lock``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+:Type:		mutex
+:Arch:		x86
+:Protects:	loading a vendor module (kvm_amd or kvm_intel)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b33932fca36e..45184ca89317 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -128,6 +128,7 @@ static int kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu);
 static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 
+static DEFINE_MUTEX(vendor_module_lock);
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 
 #define KVM_X86_OP(func)					     \
@@ -9286,7 +9287,7 @@ void kvm_arch_exit(void)
 
 }
 
-int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
+static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 {
 	u64 host_pat;
 	int r;
@@ -9419,6 +9420,17 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	kmem_cache_destroy(x86_emulator_cache);
 	return r;
 }
+
+int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
+{
+	int r;
+
+	mutex_lock(&vendor_module_lock);
+	r = __kvm_x86_vendor_init(ops);
+	mutex_unlock(&vendor_module_lock);
+
+	return r;
+}
 EXPORT_SYMBOL_GPL(kvm_x86_vendor_init);
 
 void kvm_x86_vendor_exit(void)
@@ -9441,7 +9453,6 @@ void kvm_x86_vendor_exit(void)
 	cancel_work_sync(&pvclock_gtod_work);
 #endif
 	static_call(kvm_x86_hardware_unsetup)();
-	kvm_x86_ops.hardware_enable = NULL;
 	kvm_mmu_vendor_module_exit();
 	free_percpu(user_return_msrs);
 	kmem_cache_destroy(x86_emulator_cache);
@@ -9449,6 +9460,9 @@ void kvm_x86_vendor_exit(void)
 	static_key_deferred_flush(&kvm_xen_enabled);
 	WARN_ON(static_branch_unlikely(&kvm_xen_enabled.key));
 #endif
+	mutex_lock(&vendor_module_lock);
+	kvm_x86_ops.hardware_enable = NULL;
+	mutex_unlock(&vendor_module_lock);
 }
 EXPORT_SYMBOL_GPL(kvm_x86_vendor_exit);
 
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 16/50] KVM: arm64: Simplify the CPUHP logic
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (14 preceding siblings ...)
  2022-11-30 23:08 ` [PATCH v2 15/50] KVM: x86: Serialize vendor module initialization (hardware setup) Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 17/50] KVM: arm64: Free hypervisor allocations if vector slot init fails Sean Christopherson
                   ` (35 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC

From: Marc Zyngier <maz@kernel.org>

For a number of historical reasons, the KVM/arm64 hotplug setup is pretty
complicated, and we have two extra CPUHP notifiers for vGIC and timers.

It looks pretty pointless, and gets in the way of further changes.
So let's just expose some helpers that can be called from the core
CPUHP callback, and get rid of everything else.

This gives us the opportunity to drop a useless notifier entry,
as well as tidy up the timer enable/disable, which was a bit odd.
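
A simplified userspace model of the resulting flow, assuming a per-CPU
"enabled" flag gates the helpers so that repeated enable/disable calls
stay balanced.  The arrays and the explicit cpu argument are hypothetical
(the kernel uses per-CPU variables), and the pKVM special case is ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 2

static bool hardware_enabled[NR_CPUS_SKETCH];	/* models the per-CPU flag */
static int vgic_irq_enables[NR_CPUS_SKETCH];	/* net enable count per CPU */
static int timer_irq_enables[NR_CPUS_SKETCH];

static void vgic_cpu_up(int cpu)    { vgic_irq_enables[cpu]++; }
static void vgic_cpu_down(int cpu)  { vgic_irq_enables[cpu]--; }
static void timer_cpu_up(int cpu)   { timer_irq_enables[cpu]++; }
static void timer_cpu_down(int cpu) { timer_irq_enables[cpu]--; }

static void hardware_enable(int cpu)
{
	bool was_enabled;

	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	was_enabled = hardware_enabled[cpu];
	hardware_enabled[cpu] = true;

	/* Only bring the IRQs up on the disabled -> enabled transition. */
	if (!was_enabled) {
		vgic_cpu_up(cpu);
		timer_cpu_up(cpu);
	}
}

static void hardware_disable(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);

	/* Tear down in the reverse order of hardware_enable(). */
	if (hardware_enabled[cpu]) {
		timer_cpu_down(cpu);
		vgic_cpu_down(cpu);
	}
	hardware_enabled[cpu] = false;
}
```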

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/arm64/kvm/arch_timer.c     | 27 ++++++++++-----------------
 arch/arm64/kvm/arm.c            | 13 +++++++++++++
 arch/arm64/kvm/vgic/vgic-init.c | 19 ++-----------------
 include/kvm/arm_arch_timer.h    |  4 ++++
 include/kvm/arm_vgic.h          |  4 ++++
 include/linux/cpuhotplug.h      |  3 ---
 6 files changed, 33 insertions(+), 37 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index bb24a76b4224..33fca1a691a5 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -811,10 +811,18 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 	ptimer->host_timer_irq_flags = host_ptimer_irq_flags;
 }
 
-static void kvm_timer_init_interrupt(void *info)
+void kvm_timer_cpu_up(void)
 {
 	enable_percpu_irq(host_vtimer_irq, host_vtimer_irq_flags);
-	enable_percpu_irq(host_ptimer_irq, host_ptimer_irq_flags);
+	if (host_ptimer_irq)
+		enable_percpu_irq(host_ptimer_irq, host_ptimer_irq_flags);
+}
+
+void kvm_timer_cpu_down(void)
+{
+	disable_percpu_irq(host_vtimer_irq);
+	if (host_ptimer_irq)
+		disable_percpu_irq(host_ptimer_irq);
 }
 
 int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 regid, u64 value)
@@ -976,18 +984,6 @@ void kvm_arm_timer_write_sysreg(struct kvm_vcpu *vcpu,
 	preempt_enable();
 }
 
-static int kvm_timer_starting_cpu(unsigned int cpu)
-{
-	kvm_timer_init_interrupt(NULL);
-	return 0;
-}
-
-static int kvm_timer_dying_cpu(unsigned int cpu)
-{
-	disable_percpu_irq(host_vtimer_irq);
-	return 0;
-}
-
 static int timer_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu)
 {
 	if (vcpu)
@@ -1185,9 +1181,6 @@ int kvm_timer_hyp_init(bool has_gic)
 		goto out_free_irq;
 	}
 
-	cpuhp_setup_state(CPUHP_AP_KVM_ARM_TIMER_STARTING,
-			  "kvm/arm/timer:starting", kvm_timer_starting_cpu,
-			  kvm_timer_dying_cpu);
 	return 0;
 out_free_irq:
 	free_percpu_irq(host_vtimer_irq, kvm_get_running_vcpus());
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index c6732ac329ca..07f5cef5c33b 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1670,7 +1670,15 @@ static void _kvm_arch_hardware_enable(void *discard)
 
 int kvm_arch_hardware_enable(void)
 {
+	int was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
+
 	_kvm_arch_hardware_enable(NULL);
+
+	if (!was_enabled) {
+		kvm_vgic_cpu_up();
+		kvm_timer_cpu_up();
+	}
+
 	return 0;
 }
 
@@ -1684,6 +1692,11 @@ static void _kvm_arch_hardware_disable(void *discard)
 
 void kvm_arch_hardware_disable(void)
 {
+	if (__this_cpu_read(kvm_arm_hardware_enabled)) {
+		kvm_timer_cpu_down();
+		kvm_vgic_cpu_down();
+	}
+
 	if (!is_protected_kvm_enabled())
 		_kvm_arch_hardware_disable(NULL);
 }
diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index f6d4f4052555..6c7f6ae21ec0 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -465,17 +465,15 @@ int kvm_vgic_map_resources(struct kvm *kvm)
 
 /* GENERIC PROBE */
 
-static int vgic_init_cpu_starting(unsigned int cpu)
+void kvm_vgic_cpu_up(void)
 {
 	enable_percpu_irq(kvm_vgic_global_state.maint_irq, 0);
-	return 0;
 }
 
 
-static int vgic_init_cpu_dying(unsigned int cpu)
+void kvm_vgic_cpu_down(void)
 {
 	disable_percpu_irq(kvm_vgic_global_state.maint_irq);
-	return 0;
 }
 
 static irqreturn_t vgic_maintenance_handler(int irq, void *data)
@@ -584,19 +582,6 @@ int kvm_vgic_hyp_init(void)
 		return ret;
 	}
 
-	ret = cpuhp_setup_state(CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
-				"kvm/arm/vgic:starting",
-				vgic_init_cpu_starting, vgic_init_cpu_dying);
-	if (ret) {
-		kvm_err("Cannot register vgic CPU notifier\n");
-		goto out_free_irq;
-	}
-
 	kvm_info("vgic interrupt IRQ%d\n", kvm_vgic_global_state.maint_irq);
 	return 0;
-
-out_free_irq:
-	free_percpu_irq(kvm_vgic_global_state.maint_irq,
-			kvm_get_running_vcpus());
-	return ret;
 }
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index cd6d8f260eab..1638418f72dd 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -104,4 +104,8 @@ void kvm_arm_timer_write_sysreg(struct kvm_vcpu *vcpu,
 u32 timer_get_ctl(struct arch_timer_context *ctxt);
 u64 timer_get_cval(struct arch_timer_context *ctxt);
 
+/* CPU HP callbacks */
+void kvm_timer_cpu_up(void);
+void kvm_timer_cpu_down(void);
+
 #endif
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 4df9e73a8bb5..fc4acc91ba06 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -431,4 +431,8 @@ int vgic_v4_load(struct kvm_vcpu *vcpu);
 void vgic_v4_commit(struct kvm_vcpu *vcpu);
 int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
 
+/* CPU HP callbacks */
+void kvm_vgic_cpu_up(void);
+void kvm_vgic_cpu_down(void);
+
 #endif /* __KVM_ARM_VGIC_H */
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index f61447913db9..7337414e4947 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -186,9 +186,6 @@ enum cpuhp_state {
 	CPUHP_AP_TI_GP_TIMER_STARTING,
 	CPUHP_AP_HYPERV_TIMER_STARTING,
 	CPUHP_AP_KVM_STARTING,
-	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
-	CPUHP_AP_KVM_ARM_VGIC_STARTING,
-	CPUHP_AP_KVM_ARM_TIMER_STARTING,
 	/* Must be the last timer callback */
 	CPUHP_AP_DUMMY_TIMER_STARTING,
 	CPUHP_AP_ARM_XEN_STARTING,
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 17/50] KVM: arm64: Free hypervisor allocations if vector slot init fails
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (15 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 16/50] KVM: arm64: Simplify the CPUHP logic Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 18/50] KVM: arm64: Unregister perf callbacks if hypervisor finalization fails Sean Christopherson
                   ` (34 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC

Tear down hypervisor mode if vector slot setup fails in order to avoid
leaking any allocations done by init_hyp_mode().
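
The stacked-label unwinding can be modelled in userspace as follows.
This is a sketch under assumptions: the counters are hypothetical
stand-ins for the hypervisor allocations and the PM notifier, the code
assumes the !in_hyp_mode case, and the step names are invented for the
example.

```c
#include <assert.h>

enum step { STEP_NONE, STEP_HYP, STEP_VECTORS, STEP_SUBSYS, STEP_FINALIZE };

static int hyp_allocs;		/* outstanding hypervisor allocations */
static int pm_notifiers;	/* outstanding PM notifier registrations */

/* Run the init sequence, failing at @fail_at (STEP_NONE means success). */
static int arch_init_sketch(enum step fail_at)
{
	int err = -1;

	if (fail_at == STEP_HYP)
		goto out_err;
	hyp_allocs++;

	/* Jumping to out_err here instead would leak hyp_allocs. */
	if (fail_at == STEP_VECTORS)
		goto out_hyp;

	pm_notifiers++;		/* registered early in the subsystems step */
	if (fail_at == STEP_SUBSYS)
		goto out_subs;

	if (fail_at == STEP_FINALIZE)
		goto out_subs;

	return 0;

out_subs:
	pm_notifiers--;
out_hyp:
	hyp_allocs--;
out_err:
	assert(hyp_allocs >= 0 && pm_notifiers >= 0);
	return err;
}
```

Each label undoes exactly one step and falls through to the labels for
earlier steps, so a failure at any point releases everything acquired
before it and nothing more.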

Fixes: b881cdce77b4 ("KVM: arm64: Allocate hyp vectors statically")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/arm64/kvm/arm.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 07f5cef5c33b..fa986ebb4793 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2237,18 +2237,18 @@ int kvm_arch_init(void *opaque)
 	err = kvm_init_vector_slots();
 	if (err) {
 		kvm_err("Cannot initialise vector slots\n");
-		goto out_err;
-	}
-
-	err = init_subsystems();
-	if (err)
 		goto out_hyp;
+	}
+
+	err = init_subsystems();
+	if (err)
+		goto out_subs;
 
 	if (!in_hyp_mode) {
 		err = finalize_hyp_mode();
 		if (err) {
 			kvm_err("Failed to finalize Hyp protection\n");
-			goto out_hyp;
+			goto out_subs;
 		}
 	}
 
@@ -2262,8 +2262,9 @@ int kvm_arch_init(void *opaque)
 
 	return 0;
 
-out_hyp:
+out_subs:
 	hyp_cpu_pm_exit();
+out_hyp:
 	if (!in_hyp_mode)
 		teardown_hyp_mode();
 out_err:
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 18/50] KVM: arm64: Unregister perf callbacks if hypervisor finalization fails
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (16 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 17/50] KVM: arm64: Free hypervisor allocations if vector slot init fails Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 19/50] KVM: arm64: Do arm/arch initialization without bouncing through kvm_init() Sean Christopherson
                   ` (33 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC

Undo everything done by init_subsystems() if a later initialization step
fails, i.e. unregister perf callbacks in addition to unregistering the
power management notifier.

Fixes: bfa79a805454 ("KVM: arm64: Elevate hypervisor mappings creation at EL2")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/arm64/kvm/arm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fa986ebb4793..e6f6fcfe6bcc 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1839,12 +1839,21 @@ static int init_subsystems(void)
 	kvm_register_perf_callbacks(NULL);
 
 out:
+	if (err)
+		hyp_cpu_pm_exit();
+
 	if (err || !is_protected_kvm_enabled())
 		on_each_cpu(_kvm_arch_hardware_disable, NULL, 1);
 
 	return err;
 }
 
+static void teardown_subsystems(void)
+{
+	kvm_unregister_perf_callbacks();
+	hyp_cpu_pm_exit();
+}
+
 static void teardown_hyp_mode(void)
 {
 	int cpu;
@@ -2242,7 +2251,7 @@ int kvm_arch_init(void *opaque)
 
 	err = init_subsystems();
 	if (err)
-		goto out_subs;
+		goto out_hyp;
 
 	if (!in_hyp_mode) {
 		err = finalize_hyp_mode();
@@ -2263,7 +2272,7 @@ int kvm_arch_init(void *opaque)
 	return 0;
 
 out_subs:
-	hyp_cpu_pm_exit();
+	teardown_subsystems();
 out_hyp:
 	if (!in_hyp_mode)
 		teardown_hyp_mode();
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 19/50] KVM: arm64: Do arm/arch initialization without bouncing through kvm_init()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (17 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 18/50] KVM: arm64: Unregister perf callbacks if hypervisor finalization fails Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 20/50] KVM: arm64: Mark kvm_arm_init() and its unique descendants as __init Sean Christopherson
                   ` (32 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC

Do arm/arch specific initialization directly in arm's module_init(), now
called kvm_arm_init(), instead of bouncing through kvm_init() to reach
kvm_arch_init().  Invoking kvm_arch_init() is the very first action
performed by kvm_init(), so from an initialization perspective this is a
glorified nop.

Avoiding kvm_arch_init() also fixes a mostly benign bug as kvm_arch_exit()
doesn't properly unwind if a later stage of kvm_init() fails.  While the
soon-to-be-deleted comment about compiling as a module being unsupported
is correct, kvm_arch_exit() can still be called by kvm_init() if any step
after the call to kvm_arch_init() fails.

Add a FIXME to call out that pKVM initialization isn't unwound if
kvm_init() fails, which is a pre-existing problem inherited from
kvm_arch_exit().

Making kvm_arch_init() a nop will also allow dropping kvm_arch_init() and
kvm_arch_exit() entirely once all other architectures follow suit.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/arm64/kvm/arm.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e6f6fcfe6bcc..d3a4db1abf32 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2195,7 +2195,7 @@ void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons)
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
-int kvm_arch_init(void *opaque)
+int kvm_arm_init(void)
 {
 	int err;
 	bool in_hyp_mode;
@@ -2269,6 +2269,14 @@ int kvm_arch_init(void *opaque)
 		kvm_info("Hyp mode initialized successfully\n");
 	}
 
+	/*
+	 * FIXME: Do something reasonable if kvm_init() fails after pKVM
+	 * hypervisor protection is finalized.
+	 */
+	err = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	if (err)
+		goto out_subs;
+
 	return 0;
 
 out_subs:
@@ -2281,10 +2289,15 @@ int kvm_arch_init(void *opaque)
 	return err;
 }
 
+int kvm_arch_init(void *opaque)
+{
+	return 0;
+}
+
 /* NOP: Compiling as a module not supported */
 void kvm_arch_exit(void)
 {
-	kvm_unregister_perf_callbacks();
+
 }
 
 static int __init early_kvm_mode_cfg(char *arg)
@@ -2325,10 +2338,4 @@ enum kvm_mode kvm_get_mode(void)
 	return kvm_mode;
 }
 
-static int arm_init(void)
-{
-	int rc = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
-	return rc;
-}
-
-module_init(arm_init);
+module_init(kvm_arm_init);
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 20/50] KVM: arm64: Mark kvm_arm_init() and its unique descendants as __init
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (18 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 19/50] KVM: arm64: Do arm/arch initialization without bouncing through kvm_init() Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 21/50] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions Sean Christopherson
                   ` (31 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Tag kvm_arm_init() and its unique helpers as __init, and tag data that is
only ever modified under the kvm_arm_init() umbrella as read-only after
init.

Opportunistically name the boolean param in kvm_timer_hyp_init()'s
prototype to match its definition.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/arm64/include/asm/kvm_host.h | 14 ++++++-------
 arch/arm64/include/asm/kvm_mmu.h  |  4 ++--
 arch/arm64/kvm/arch_timer.c       |  2 +-
 arch/arm64/kvm/arm.c              | 34 +++++++++++++++----------------
 arch/arm64/kvm/mmu.c              | 12 +++++------
 arch/arm64/kvm/reset.c            |  8 ++++----
 arch/arm64/kvm/sys_regs.c         |  6 +++---
 arch/arm64/kvm/vmid.c             |  6 +++---
 include/kvm/arm_arch_timer.h      |  2 +-
 9 files changed, 44 insertions(+), 44 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 5d5a887e63a5..4863fe356be1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -66,8 +66,8 @@ enum kvm_mode kvm_get_mode(void);
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
-extern unsigned int kvm_sve_max_vl;
-int kvm_arm_init_sve(void);
+extern unsigned int __ro_after_init kvm_sve_max_vl;
+int __init kvm_arm_init_sve(void);
 
 u32 __attribute_const__ kvm_target_cpu(void);
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
@@ -793,7 +793,7 @@ int kvm_handle_cp10_id(struct kvm_vcpu *vcpu);
 
 void kvm_reset_sys_regs(struct kvm_vcpu *vcpu);
 
-int kvm_sys_reg_table_init(void);
+int __init kvm_sys_reg_table_init(void);
 
 /* MMIO helpers */
 void kvm_mmio_write_buf(void *buf, unsigned int len, unsigned long data);
@@ -824,9 +824,9 @@ int kvm_arm_pvtime_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
 			    struct kvm_device_attr *attr);
 
-extern unsigned int kvm_arm_vmid_bits;
-int kvm_arm_vmid_alloc_init(void);
-void kvm_arm_vmid_alloc_free(void);
+extern unsigned int __ro_after_init kvm_arm_vmid_bits;
+int __init kvm_arm_vmid_alloc_init(void);
+void __init kvm_arm_vmid_alloc_free(void);
 void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
 void kvm_arm_vmid_clear_active(void);
 
@@ -909,7 +909,7 @@ static inline void kvm_clr_pmu_events(u32 clr) {}
 void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
 void kvm_vcpu_put_sysregs_vhe(struct kvm_vcpu *vcpu);
 
-int kvm_set_ipa_limit(void);
+int __init kvm_set_ipa_limit(void);
 
 #define __KVM_HAVE_ARCH_VM_ALLOC
 struct kvm *kvm_arch_alloc_vm(void);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 7784081088e7..ced5b0028933 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -163,7 +163,7 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 			   void __iomem **haddr);
 int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
 			     void **haddr);
-void free_hyp_pgds(void);
+void __init free_hyp_pgds(void);
 
 void stage2_unmap_vm(struct kvm *kvm);
 int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu);
@@ -175,7 +175,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu);
 
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
-int kvm_mmu_init(u32 *hyp_va_bits);
+int __init kvm_mmu_init(u32 *hyp_va_bits);
 
 static inline void *__kvm_vector_slot2addr(void *base,
 					   enum arm64_hyp_spectre_vector slot)
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 33fca1a691a5..23346585a294 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -1113,7 +1113,7 @@ static int kvm_irq_init(struct arch_timer_kvm_info *info)
 	return 0;
 }
 
-int kvm_timer_hyp_init(bool has_gic)
+int __init kvm_timer_hyp_init(bool has_gic)
 {
 	struct arch_timer_kvm_info *info;
 	int err;
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d3a4db1abf32..4d34abcfc9a9 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1513,7 +1513,7 @@ static int kvm_init_vector_slots(void)
 	return 0;
 }
 
-static void cpu_prepare_hyp_mode(int cpu)
+static void __init cpu_prepare_hyp_mode(int cpu)
 {
 	struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
 	unsigned long tcr;
@@ -1739,26 +1739,26 @@ static struct notifier_block hyp_init_cpu_pm_nb = {
 	.notifier_call = hyp_init_cpu_pm_notifier,
 };
 
-static void hyp_cpu_pm_init(void)
+static void __init hyp_cpu_pm_init(void)
 {
 	if (!is_protected_kvm_enabled())
 		cpu_pm_register_notifier(&hyp_init_cpu_pm_nb);
 }
-static void hyp_cpu_pm_exit(void)
+static void __init hyp_cpu_pm_exit(void)
 {
 	if (!is_protected_kvm_enabled())
 		cpu_pm_unregister_notifier(&hyp_init_cpu_pm_nb);
 }
 #else
-static inline void hyp_cpu_pm_init(void)
+static inline void __init hyp_cpu_pm_init(void)
 {
 }
-static inline void hyp_cpu_pm_exit(void)
+static inline void __init hyp_cpu_pm_exit(void)
 {
 }
 #endif
 
-static void init_cpu_logical_map(void)
+static void __init init_cpu_logical_map(void)
 {
 	unsigned int cpu;
 
@@ -1775,7 +1775,7 @@ static void init_cpu_logical_map(void)
 #define init_psci_0_1_impl_state(config, what)	\
 	config.psci_0_1_ ## what ## _implemented = psci_ops.what
 
-static bool init_psci_relay(void)
+static bool __init init_psci_relay(void)
 {
 	/*
 	 * If PSCI has not been initialized, protected KVM cannot install
@@ -1798,7 +1798,7 @@ static bool init_psci_relay(void)
 	return true;
 }
 
-static int init_subsystems(void)
+static int __init init_subsystems(void)
 {
 	int err = 0;
 
@@ -1848,13 +1848,13 @@ static int init_subsystems(void)
 	return err;
 }
 
-static void teardown_subsystems(void)
+static void __init teardown_subsystems(void)
 {
 	kvm_unregister_perf_callbacks();
 	hyp_cpu_pm_exit();
 }
 
-static void teardown_hyp_mode(void)
+static void __init teardown_hyp_mode(void)
 {
 	int cpu;
 
@@ -1865,7 +1865,7 @@ static void teardown_hyp_mode(void)
 	}
 }
 
-static int do_pkvm_init(u32 hyp_va_bits)
+static int __init do_pkvm_init(u32 hyp_va_bits)
 {
 	void *per_cpu_base = kvm_ksym_ref(kvm_arm_hyp_percpu_base);
 	int ret;
@@ -1887,7 +1887,7 @@ static int do_pkvm_init(u32 hyp_va_bits)
 	return ret;
 }
 
-static int kvm_hyp_init_protection(u32 hyp_va_bits)
+static int __init kvm_hyp_init_protection(u32 hyp_va_bits)
 {
 	void *addr = phys_to_virt(hyp_mem_base);
 	int ret;
@@ -1917,7 +1917,7 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
 /**
  * Inits Hyp-mode on all online CPUs
  */
-static int init_hyp_mode(void)
+static int __init init_hyp_mode(void)
 {
 	u32 hyp_va_bits;
 	int cpu;
@@ -2099,7 +2099,7 @@ static int init_hyp_mode(void)
 	return err;
 }
 
-static void _kvm_host_prot_finalize(void *arg)
+static void __init _kvm_host_prot_finalize(void *arg)
 {
 	int *err = arg;
 
@@ -2107,7 +2107,7 @@ static void _kvm_host_prot_finalize(void *arg)
 		WRITE_ONCE(*err, -EINVAL);
 }
 
-static int pkvm_drop_host_privileges(void)
+static int __init pkvm_drop_host_privileges(void)
 {
 	int ret = 0;
 
@@ -2120,7 +2120,7 @@ static int pkvm_drop_host_privileges(void)
 	return ret;
 }
 
-static int finalize_hyp_mode(void)
+static int __init finalize_hyp_mode(void)
 {
 	if (!is_protected_kvm_enabled())
 		return 0;
@@ -2195,7 +2195,7 @@ void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons)
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
-int kvm_arm_init(void)
+static __init int kvm_arm_init(void)
 {
 	int err;
 	bool in_hyp_mode;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f154d4a7fae0..be1d904d3e44 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -25,11 +25,11 @@
 static struct kvm_pgtable *hyp_pgtable;
 static DEFINE_MUTEX(kvm_hyp_pgd_mutex);
 
-static unsigned long hyp_idmap_start;
-static unsigned long hyp_idmap_end;
-static phys_addr_t hyp_idmap_vector;
+static unsigned long __ro_after_init hyp_idmap_start;
+static unsigned long __ro_after_init hyp_idmap_end;
+static phys_addr_t __ro_after_init hyp_idmap_vector;
 
-static unsigned long io_map_base;
+static unsigned long __ro_after_init io_map_base;
 
 static phys_addr_t stage2_range_addr_end(phys_addr_t addr, phys_addr_t end)
 {
@@ -261,7 +261,7 @@ static void stage2_flush_vm(struct kvm *kvm)
 /**
  * free_hyp_pgds - free Hyp-mode page tables
  */
-void free_hyp_pgds(void)
+void __init free_hyp_pgds(void)
 {
 	mutex_lock(&kvm_hyp_pgd_mutex);
 	if (hyp_pgtable) {
@@ -1615,7 +1615,7 @@ static struct kvm_pgtable_mm_ops kvm_hyp_mm_ops = {
 	.virt_to_phys		= kvm_host_pa,
 };
 
-int kvm_mmu_init(u32 *hyp_va_bits)
+int __init kvm_mmu_init(u32 *hyp_va_bits)
 {
 	int err;
 
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 5ae18472205a..dd58a8629a2e 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -30,7 +30,7 @@
 #include <asm/virt.h>
 
 /* Maximum phys_shift supported for any VM on this host */
-static u32 kvm_ipa_limit;
+static u32 __ro_after_init kvm_ipa_limit;
 
 /*
  * ARMv8 Reset Values
@@ -41,9 +41,9 @@ static u32 kvm_ipa_limit;
 #define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
 				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)
 
-unsigned int kvm_sve_max_vl;
+unsigned int __ro_after_init kvm_sve_max_vl;
 
-int kvm_arm_init_sve(void)
+int __init kvm_arm_init_sve(void)
 {
 	if (system_supports_sve()) {
 		kvm_sve_max_vl = sve_max_virtualisable_vl();
@@ -352,7 +352,7 @@ u32 get_kvm_ipa_limit(void)
 	return kvm_ipa_limit;
 }
 
-int kvm_set_ipa_limit(void)
+int __init kvm_set_ipa_limit(void)
 {
 	unsigned int parange;
 	u64 mmfr0;
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f4a7c5abcbca..0359f57c2c44 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -82,7 +82,7 @@ void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
 }
 
 /* 3 bits per cache level, as per CLIDR, but non-existent caches always 0 */
-static u32 cache_levels;
+static u32 __ro_after_init cache_levels;
 
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
 #define CSSELR_MAX 14
@@ -2620,7 +2620,7 @@ static void get_ctr_el0(struct kvm_vcpu *v, const struct sys_reg_desc *r)
 }
 
 /* ->val is filled in by kvm_sys_reg_table_init() */
-static struct sys_reg_desc invariant_sys_regs[] = {
+static struct sys_reg_desc invariant_sys_regs[] __ro_after_init = {
 	{ SYS_DESC(SYS_MIDR_EL1), NULL, get_midr_el1 },
 	{ SYS_DESC(SYS_REVIDR_EL1), NULL, get_revidr_el1 },
 	{ SYS_DESC(SYS_CLIDR_EL1), NULL, get_clidr_el1 },
@@ -2944,7 +2944,7 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 	return write_demux_regids(uindices);
 }
 
-int kvm_sys_reg_table_init(void)
+int __init kvm_sys_reg_table_init(void)
 {
 	bool valid = true;
 	unsigned int i;
diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
index d78ae63d7c15..08978d0672e7 100644
--- a/arch/arm64/kvm/vmid.c
+++ b/arch/arm64/kvm/vmid.c
@@ -16,7 +16,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 
-unsigned int kvm_arm_vmid_bits;
+unsigned int __ro_after_init kvm_arm_vmid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_vmid_lock);
 
 static atomic64_t vmid_generation;
@@ -172,7 +172,7 @@ void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
 /*
  * Initialize the VMID allocator
  */
-int kvm_arm_vmid_alloc_init(void)
+int __init kvm_arm_vmid_alloc_init(void)
 {
 	kvm_arm_vmid_bits = kvm_get_vmid_bits();
 
@@ -190,7 +190,7 @@ int kvm_arm_vmid_alloc_init(void)
 	return 0;
 }
 
-void kvm_arm_vmid_alloc_free(void)
+void __init kvm_arm_vmid_alloc_free(void)
 {
 	kfree(vmid_map);
 }
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index 1638418f72dd..71916de7c6c4 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -60,7 +60,7 @@ struct arch_timer_cpu {
 	bool			enabled;
 };
 
-int kvm_timer_hyp_init(bool);
+int __init kvm_timer_hyp_init(bool has_gic);
 int kvm_timer_enable(struct kvm_vcpu *vcpu);
 int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu);
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 21/50] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (19 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 20/50] KVM: arm64: Mark kvm_arm_init() and its unique descendants as __init Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-01 22:00   ` Philippe Mathieu-Daudé
  2022-11-30 23:09 ` [PATCH v2 22/50] KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init() Sean Christopherson
                   ` (30 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Now that KVM no longer supports trap-and-emulate (see commit 45c7e8af4a5e
"MIPS: Remove KVM_TE support"), hardcode the MIPS callbacks to the
virtualization callbacks.

Hardcoding the callbacks eliminates the technically-unnecessary check on
non-NULL kvm_mips_callbacks in kvm_arch_init().  MIPS has never supported
multiple in-tree modules, i.e. barring an out-of-tree module, where
copying and renaming kvm.ko counts as "out-of-tree", KVM could never
encounter a non-NULL set of callbacks during module init.

The callback check is also subtly broken, as it is not thread safe,
i.e. if there were multiple modules, loading both concurrently would
create a race between checking and setting kvm_mips_callbacks.

Given that out-of-tree shenanigans are not the kernel's responsibility,
hardcode the callbacks to simplify the code.
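
To make the race concrete: a plain "if (ptr) return -EEXIST; ptr = cb;"
sequence lets two callers both see NULL and both install.  The userspace
sketch below shows what a race-free check-and-set could look like using
C11 atomics.  It is not what this patch does (the patch removes the check
entirely), and struct callbacks, installed, and install_callbacks() are
hypothetical names used only for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a table of callbacks. */
struct callbacks {
	int id;
};

/* Currently installed table, NULL if none. */
static struct callbacks *_Atomic installed;

/* Returns 0 if cb was installed, -1 if a table was already present. */
static int install_callbacks(struct callbacks *cb)
{
	struct callbacks *expected = NULL;

	/*
	 * The comparison against NULL and the store of cb happen as one
	 * atomic step, so two concurrent callers cannot both succeed.
	 */
	if (!atomic_compare_exchange_strong(&installed, &expected, cb))
		return -1;

	return 0;
}
```

Only the first caller wins; any later caller gets -1 and the installed
pointer is left untouched.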

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/mips/include/asm/kvm_host.h |  2 +-
 arch/mips/kvm/Makefile           |  2 +-
 arch/mips/kvm/callback.c         | 14 --------------
 arch/mips/kvm/mips.c             |  9 ++-------
 arch/mips/kvm/vz.c               |  7 ++++---
 5 files changed, 8 insertions(+), 26 deletions(-)
 delete mode 100644 arch/mips/kvm/callback.c

diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 28f0ba97db71..2803c9c21ef9 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -758,7 +758,7 @@ struct kvm_mips_callbacks {
 	void (*vcpu_reenter)(struct kvm_vcpu *vcpu);
 };
 extern struct kvm_mips_callbacks *kvm_mips_callbacks;
-int kvm_mips_emulation_init(struct kvm_mips_callbacks **install_callbacks);
+int kvm_mips_emulation_init(void);
 
 /* Debug: dump vcpu state */
 int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu);
diff --git a/arch/mips/kvm/Makefile b/arch/mips/kvm/Makefile
index 21ff75bcdbc4..805aeea2166e 100644
--- a/arch/mips/kvm/Makefile
+++ b/arch/mips/kvm/Makefile
@@ -17,4 +17,4 @@ kvm-$(CONFIG_CPU_LOONGSON64) += loongson_ipi.o
 
 kvm-y		+= vz.o
 obj-$(CONFIG_KVM)	+= kvm.o
-obj-y			+= callback.o tlb.o
+obj-y			+= tlb.o
diff --git a/arch/mips/kvm/callback.c b/arch/mips/kvm/callback.c
deleted file mode 100644
index d88aa2173fb0..000000000000
--- a/arch/mips/kvm/callback.c
+++ /dev/null
@@ -1,14 +0,0 @@
-/*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
- *
- * Copyright (C) 2012  MIPS Technologies, Inc.  All rights reserved.
- * Authors: Yann Le Du <ledu@kymasys.com>
- */
-
-#include <linux/export.h>
-#include <linux/kvm_host.h>
-
-struct kvm_mips_callbacks *kvm_mips_callbacks;
-EXPORT_SYMBOL_GPL(kvm_mips_callbacks);
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index af29490d9740..f0a6c245d1ff 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -1012,17 +1012,12 @@ long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 
 int kvm_arch_init(void *opaque)
 {
-	if (kvm_mips_callbacks) {
-		kvm_err("kvm: module already exists\n");
-		return -EEXIST;
-	}
-
-	return kvm_mips_emulation_init(&kvm_mips_callbacks);
+	return kvm_mips_emulation_init();
 }
 
 void kvm_arch_exit(void)
 {
-	kvm_mips_callbacks = NULL;
+
 }
 
 int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
diff --git a/arch/mips/kvm/vz.c b/arch/mips/kvm/vz.c
index c706f5890a05..dafab003ea0d 100644
--- a/arch/mips/kvm/vz.c
+++ b/arch/mips/kvm/vz.c
@@ -3304,7 +3304,10 @@ static struct kvm_mips_callbacks kvm_vz_callbacks = {
 	.vcpu_reenter = kvm_vz_vcpu_reenter,
 };
 
-int kvm_mips_emulation_init(struct kvm_mips_callbacks **install_callbacks)
+/* FIXME: Get rid of the callbacks now that trap-and-emulate is gone. */
+struct kvm_mips_callbacks *kvm_mips_callbacks = &kvm_vz_callbacks;
+
+int kvm_mips_emulation_init(void)
 {
 	if (!cpu_has_vz)
 		return -ENODEV;
@@ -3318,7 +3321,5 @@ int kvm_mips_emulation_init(struct kvm_mips_callbacks **install_callbacks)
 		return -ENODEV;
 
 	pr_info("Starting KVM with MIPS VZ extensions\n");
-
-	*install_callbacks = &kvm_vz_callbacks;
 	return 0;
 }
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 22/50] KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (20 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 21/50] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 23/50] KVM: MIPS: Register die notifier prior to kvm_init() Sean Christopherson
                   ` (29 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Invoke kvm_mips_emulation_init() directly from kvm_mips_init() instead
of bouncing through kvm_init()=>kvm_arch_init().  Functionally, this is
a glorified nop as invoking kvm_arch_init() is the very first action
performed by kvm_init().

Emptying kvm_arch_init() will allow dropping the hook entirely once all
architecture implementations are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 arch/mips/kvm/mips.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index f0a6c245d1ff..75681281e2df 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -1012,7 +1012,7 @@ long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 
 int kvm_arch_init(void *opaque)
 {
-	return kvm_mips_emulation_init();
+	return 0;
 }
 
 void kvm_arch_exit(void)
@@ -1636,6 +1636,10 @@ static int __init kvm_mips_init(void)
 	if (ret)
 		return ret;
 
+	ret = kvm_mips_emulation_init();
+	if (ret)
+		return ret;
+
 	ret = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 
 	if (ret)
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 23/50] KVM: MIPS: Register die notifier prior to kvm_init()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (21 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 22/50] KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init() Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 24/50] KVM: RISC-V: Do arch init directly in riscv_kvm_init() Sean Christopherson
                   ` (28 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Call kvm_init() only after _all_ setup is complete, as kvm_init() exposes
/dev/kvm to userspace and thus allows userspace to create VMs (and call
other ioctls).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 arch/mips/kvm/mips.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 75681281e2df..ae7a24342fdf 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -1640,16 +1640,17 @@ static int __init kvm_mips_init(void)
 	if (ret)
 		return ret;
 
-	ret = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
-
-	if (ret)
-		return ret;
 
 	if (boot_cpu_type() == CPU_LOONGSON64)
 		kvm_priority_to_irq = kvm_loongson3_priority_to_irq;
 
 	register_die_notifier(&kvm_mips_csr_die_notifier);
 
+	ret = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	if (ret) {
+		unregister_die_notifier(&kvm_mips_csr_die_notifier);
+		return ret;
+	}
 	return 0;
 }
 
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 24/50] KVM: RISC-V: Do arch init directly in riscv_kvm_init()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (22 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 23/50] KVM: MIPS: Register die notifier prior to kvm_init() Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 25/50] KVM: RISC-V: Tag init functions and data with __init, __ro_after_init Sean Christopherson
                   ` (27 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Fold the guts of kvm_arch_init() into riscv_kvm_init() instead of
bouncing through kvm_init()=>kvm_arch_init().  Functionally, this is a
glorified nop as invoking kvm_arch_init() is the very first action
performed by kvm_init().

Moving setup to riscv_kvm_init(), which is tagged __init, will allow
tagging more functions and data with __init and __ro_after_init.  And
emptying kvm_arch_init() will allow dropping the hook entirely once all
architecture implementations are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/kvm/main.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index a146fa0ce4d2..cb063b8a9a0f 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -66,6 +66,15 @@ void kvm_arch_hardware_disable(void)
 }
 
 int kvm_arch_init(void *opaque)
+{
+	return 0;
+}
+
+void kvm_arch_exit(void)
+{
+}
+
+static int __init riscv_kvm_init(void)
 {
 	const char *str;
 
@@ -110,15 +119,6 @@ int kvm_arch_init(void *opaque)
 
 	kvm_info("VMID %ld bits available\n", kvm_riscv_gstage_vmid_bits());
 
-	return 0;
-}
-
-void kvm_arch_exit(void)
-{
-}
-
-static int __init riscv_kvm_init(void)
-{
 	return kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 }
 module_init(riscv_kvm_init);
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 25/50] KVM: RISC-V: Tag init functions and data with __init, __ro_after_init
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (23 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 24/50] KVM: RISC-V: Do arch init directly in riscv_kvm_init() Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 26/50] KVM: PPC: Move processor compatibility check to module init Sean Christopherson
                   ` (26 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Now that KVM setup is handled directly in riscv_kvm_init(), tag functions
and data that are used/set only during init with __init/__ro_after_init.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/include/asm/kvm_host.h |  6 +++---
 arch/riscv/kvm/mmu.c              | 12 ++++++------
 arch/riscv/kvm/vmid.c             |  4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index 8c771fc4f5d2..778ff0f282b7 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -295,11 +295,11 @@ int kvm_riscv_gstage_map(struct kvm_vcpu *vcpu,
 int kvm_riscv_gstage_alloc_pgd(struct kvm *kvm);
 void kvm_riscv_gstage_free_pgd(struct kvm *kvm);
 void kvm_riscv_gstage_update_hgatp(struct kvm_vcpu *vcpu);
-void kvm_riscv_gstage_mode_detect(void);
-unsigned long kvm_riscv_gstage_mode(void);
+void __init kvm_riscv_gstage_mode_detect(void);
+unsigned long __init kvm_riscv_gstage_mode(void);
 int kvm_riscv_gstage_gpa_bits(void);
 
-void kvm_riscv_gstage_vmid_detect(void);
+void __init kvm_riscv_gstage_vmid_detect(void);
 unsigned long kvm_riscv_gstage_vmid_bits(void);
 int kvm_riscv_gstage_vmid_init(struct kvm *kvm);
 bool kvm_riscv_gstage_vmid_ver_changed(struct kvm_vmid *vmid);
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 3620ecac2fa1..f42a34c7879a 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -20,12 +20,12 @@
 #include <asm/pgtable.h>
 
 #ifdef CONFIG_64BIT
-static unsigned long gstage_mode = (HGATP_MODE_SV39X4 << HGATP_MODE_SHIFT);
-static unsigned long gstage_pgd_levels = 3;
+static unsigned long gstage_mode __ro_after_init = (HGATP_MODE_SV39X4 << HGATP_MODE_SHIFT);
+static unsigned long gstage_pgd_levels __ro_after_init = 3;
 #define gstage_index_bits	9
 #else
-static unsigned long gstage_mode = (HGATP_MODE_SV32X4 << HGATP_MODE_SHIFT);
-static unsigned long gstage_pgd_levels = 2;
+static unsigned long gstage_mode __ro_after_init = (HGATP_MODE_SV32X4 << HGATP_MODE_SHIFT);
+static unsigned long gstage_pgd_levels __ro_after_init = 2;
 #define gstage_index_bits	10
 #endif
 
@@ -760,7 +760,7 @@ void kvm_riscv_gstage_update_hgatp(struct kvm_vcpu *vcpu)
 		kvm_riscv_local_hfence_gvma_all();
 }
 
-void kvm_riscv_gstage_mode_detect(void)
+void __init kvm_riscv_gstage_mode_detect(void)
 {
 #ifdef CONFIG_64BIT
 	/* Try Sv57x4 G-stage mode */
@@ -784,7 +784,7 @@ void kvm_riscv_gstage_mode_detect(void)
 #endif
 }
 
-unsigned long kvm_riscv_gstage_mode(void)
+unsigned long __init kvm_riscv_gstage_mode(void)
 {
 	return gstage_mode >> HGATP_MODE_SHIFT;
 }
diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
index 6cd93995fb65..5246da1c9167 100644
--- a/arch/riscv/kvm/vmid.c
+++ b/arch/riscv/kvm/vmid.c
@@ -17,10 +17,10 @@
 
 static unsigned long vmid_version = 1;
 static unsigned long vmid_next;
-static unsigned long vmid_bits;
+static unsigned long vmid_bits __ro_after_init;
 static DEFINE_SPINLOCK(vmid_lock);
 
-void kvm_riscv_gstage_vmid_detect(void)
+void __init kvm_riscv_gstage_vmid_detect(void)
 {
 	unsigned long old;
 
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 26/50] KVM: PPC: Move processor compatibility check to module init
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (24 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 25/50] KVM: RISC-V: Tag init functions and data with __init, __ro_after_init Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-01  5:21   ` Michael Ellerman
  2022-11-30 23:09 ` [PATCH v2 27/50] KVM: s390: Do s390 specific init without bouncing through kvm_init() Sean Christopherson
                   ` (25 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Move KVM PPC's compatibility checks to their respective module_init()
hooks.  There's no need to wait until KVM's common compat check, nor is
there a need to perform the check on every CPU (which is what common
KVM's hook provides), because the compatibility checks operate on global
data.

  arch/powerpc/include/asm/cputable.h: extern struct cpu_spec *cur_cpu_spec;
  arch/powerpc/kvm/book3s.c: return 0
  arch/powerpc/kvm/e500.c: strcmp(cur_cpu_spec->cpu_name, "e500v2")
  arch/powerpc/kvm/e500mc.c: strcmp(cur_cpu_spec->cpu_name, "e500mc")
                             strcmp(cur_cpu_spec->cpu_name, "e5500")
                             strcmp(cur_cpu_spec->cpu_name, "e6500")
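
As a rough userspace illustration of why one check at module init is
enough, the sketch below compares a single global CPU name against a
list of supported names and bails out before any other setup runs.  It
is not kernel code: demo_cpu_name, DEMO_ERR_UNSUPPORTED and the demo_*
functions are made-up names standing in for cur_cpu_spec->cpu_name and
the real init path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for cur_cpu_spec->cpu_name: one global value. */
const char *demo_cpu_name = "e500mc";

/* Assumed error code, for illustration only. */
#define DEMO_ERR_UNSUPPORTED 1

/* Returns 0 if the (global) CPU name is supported, negative otherwise. */
int demo_check_processor_compat(void)
{
	assert(demo_cpu_name != NULL);

	if (strcmp(demo_cpu_name, "e500mc") == 0 ||
	    strcmp(demo_cpu_name, "e5500") == 0 ||
	    strcmp(demo_cpu_name, "e6500") == 0)
		return 0;

	return -DEMO_ERR_UNSUPPORTED;
}

/* Module-init style entry point: check once, before any other setup. */
int demo_module_init(void)
{
	int r;

	r = demo_check_processor_compat();
	if (r)
		return r;

	/* ... the rest of initialization would follow here ... */
	return 0;
}
```

Because the input is global, running the same comparison on every CPU
would return the same answer each time, which is why the per-CPU hook
adds nothing here.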

Cc: Fabiano Rosas <farosas@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/powerpc/include/asm/kvm_ppc.h |  1 -
 arch/powerpc/kvm/book3s.c          | 10 ----------
 arch/powerpc/kvm/e500.c            |  4 ++--
 arch/powerpc/kvm/e500mc.c          |  4 ++++
 arch/powerpc/kvm/powerpc.c         |  2 +-
 5 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index bfacf12784dd..51a1824b0a16 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -118,7 +118,6 @@ extern int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr,
 extern int kvmppc_core_vcpu_create(struct kvm_vcpu *vcpu);
 extern void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu);
 extern int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu);
-extern int kvmppc_core_check_processor_compat(void);
 extern int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu,
                                       struct kvm_translation *tr);
 
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 6d525285dbe8..87283a0e33d8 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -999,16 +999,6 @@ int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_store);
 
-int kvmppc_core_check_processor_compat(void)
-{
-	/*
-	 * We always return 0 for book3s. We check
-	 * for compatibility while loading the HV
-	 * or PR module
-	 */
-	return 0;
-}
-
 int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hcall)
 {
 	return kvm->arch.kvm_ops->hcall_implemented(hcall);
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index c8b2b4478545..0ea61190ec04 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -314,7 +314,7 @@ static void kvmppc_core_vcpu_put_e500(struct kvm_vcpu *vcpu)
 	kvmppc_booke_vcpu_put(vcpu);
 }
 
-int kvmppc_core_check_processor_compat(void)
+static int kvmppc_e500_check_processor_compat(void)
 {
 	int r;
 
@@ -507,7 +507,7 @@ static int __init kvmppc_e500_init(void)
 	unsigned long handler_len;
 	unsigned long max_ivor = 0;
 
-	r = kvmppc_core_check_processor_compat();
+	r = kvmppc_e500_check_processor_compat();
 	if (r)
 		goto err_out;
 
diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 57e0ad6a2ca3..795667f7ebf0 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -388,6 +388,10 @@ static int __init kvmppc_e500mc_init(void)
 {
 	int r;
 
+	r = kvmppc_e500mc_check_processor_compat();
+	if (r)
+		goto err_out;
+
 	r = kvmppc_booke_init();
 	if (r)
 		goto err_out;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 5faf69421f13..d44b85ba8cef 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -442,7 +442,7 @@ int kvm_arch_hardware_enable(void)
 
 int kvm_arch_check_processor_compat(void *opaque)
 {
-	return kvmppc_core_check_processor_compat();
+	return 0;
 }
 
 int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 27/50] KVM: s390: Do s390 specific init without bouncing through kvm_init()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (25 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 26/50] KVM: PPC: Move processor compatibility check to module init Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 28/50] KVM: s390: Mark __kvm_s390_init() and its descendants as __init Sean Christopherson
                   ` (24 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Move the guts of kvm_arch_init() into a new helper, __kvm_s390_init(),
and invoke the new helper directly from kvm_s390_init() instead of
bouncing through kvm_init().  Invoking kvm_arch_init() is the very
first action performed by kvm_init(), i.e. this is a glorified nop.

Moving setup to __kvm_s390_init() will allow tagging more functions as
__init, and emptying kvm_arch_init() will allow dropping the hook
entirely once all architecture implementations are nops.
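
The resulting ordering can be sketched in plain C, under the assumption
that "arch" and "common" setup are each reduced to a flag.  All names
below (demo_arch_init(), demo_common_init(), the demo_* flags and the
-12 stand-in for -ENOMEM) are hypothetical; the point is only the
shape: arch init first, common init last, unwind arch state if common
init fails, and tear down in reverse order on exit.

```c
#include <assert.h>

/* Hypothetical flags standing in for real arch/common state. */
int demo_arch_ready;
int demo_common_ready;
int demo_common_should_fail;	/* set to 1 to simulate a failure */

int demo_arch_init(void)
{
	demo_arch_ready = 1;
	return 0;
}

void demo_arch_exit(void)
{
	demo_arch_ready = 0;
}

int demo_common_init(void)
{
	/* Common init may only run once arch setup is complete. */
	assert(demo_arch_ready);

	if (demo_common_should_fail)
		return -12;	/* stand-in for -ENOMEM */

	demo_common_ready = 1;
	return 0;
}

void demo_common_exit(void)
{
	demo_common_ready = 0;
}

int demo_module_init(void)
{
	int r;

	r = demo_arch_init();
	if (r)
		return r;

	r = demo_common_init();
	if (r) {
		/* Unwind arch setup if common init fails. */
		demo_arch_exit();
		return r;
	}
	return 0;
}

void demo_module_exit(void)
{
	/* Tear down in the reverse order of init. */
	demo_common_exit();
	demo_arch_exit();
}
```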

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 arch/s390/kvm/kvm-s390.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 829e6e046003..8c4fcaf2bd36 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -469,7 +469,7 @@ static void kvm_s390_cpu_feat_init(void)
 	 */
 }
 
-int kvm_arch_init(void *opaque)
+static int __kvm_s390_init(void)
 {
 	int rc = -ENOMEM;
 
@@ -527,7 +527,7 @@ int kvm_arch_init(void *opaque)
 	return rc;
 }
 
-void kvm_arch_exit(void)
+static void __kvm_s390_exit(void)
 {
 	gmap_unregister_pte_notifier(&gmap_notifier);
 	gmap_unregister_pte_notifier(&vsie_gmap_notifier);
@@ -541,6 +541,16 @@ void kvm_arch_exit(void)
 	debug_unregister(kvm_s390_dbf_uv);
 }
 
+int kvm_arch_init(void *opaque)
+{
+	return 0;
+}
+
+void kvm_arch_exit(void)
+{
+
+}
+
 /* Section: device related */
 long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg)
@@ -5696,7 +5706,7 @@ static inline unsigned long nonhyp_mask(int i)
 
 static int __init kvm_s390_init(void)
 {
-	int i;
+	int i, r;
 
 	if (!sclp.has_sief2) {
 		pr_info("SIE is not available\n");
@@ -5712,12 +5722,23 @@ static int __init kvm_s390_init(void)
 		kvm_s390_fac_base[i] |=
 			stfle_fac_list[i] & nonhyp_mask(i);
 
-	return kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	r = __kvm_s390_init();
+	if (r)
+		return r;
+
+	r = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	if (r) {
+		__kvm_s390_exit();
+		return r;
+	}
+	return 0;
 }
 
 static void __exit kvm_s390_exit(void)
 {
 	kvm_exit();
+
+	__kvm_s390_exit();
 }
 
 module_init(kvm_s390_init);
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 28/50] KVM: s390: Mark __kvm_s390_init() and its descendants as __init
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (26 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 27/50] KVM: s390: Do s390 specific init without bouncing through kvm_init() Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 29/50] KVM: Drop kvm_arch_{init,exit}() hooks Sean Christopherson
                   ` (23 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Tag __kvm_s390_init() and its unique helpers as __init.  These functions
are only ever called during module_init(), but could not be tagged
accordingly while they were invoked from the common kvm_arch_init(),
which is not __init because of x86.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
---
 arch/s390/kvm/interrupt.c | 2 +-
 arch/s390/kvm/kvm-s390.c  | 4 ++--
 arch/s390/kvm/kvm-s390.h  | 2 +-
 arch/s390/kvm/pci.c       | 2 +-
 arch/s390/kvm/pci.h       | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 1dae78deddf2..3754d7937530 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -3411,7 +3411,7 @@ void kvm_s390_gib_destroy(void)
 	gib = NULL;
 }
 
-int kvm_s390_gib_init(u8 nisc)
+int __init kvm_s390_gib_init(u8 nisc)
 {
 	int rc = 0;
 
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 8c4fcaf2bd36..66d162723d21 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -366,7 +366,7 @@ static __always_inline void __insn32_query(unsigned int opcode, u8 *query)
 #define INSN_SORTL 0xb938
 #define INSN_DFLTCC 0xb939
 
-static void kvm_s390_cpu_feat_init(void)
+static void __init kvm_s390_cpu_feat_init(void)
 {
 	int i;
 
@@ -469,7 +469,7 @@ static void kvm_s390_cpu_feat_init(void)
 	 */
 }
 
-static int __kvm_s390_init(void)
+static int __init __kvm_s390_init(void)
 {
 	int rc = -ENOMEM;
 
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index d48588c207d8..0261d42c7d01 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -470,7 +470,7 @@ void kvm_s390_gisa_clear(struct kvm *kvm);
 void kvm_s390_gisa_destroy(struct kvm *kvm);
 void kvm_s390_gisa_disable(struct kvm *kvm);
 void kvm_s390_gisa_enable(struct kvm *kvm);
-int kvm_s390_gib_init(u8 nisc);
+int __init kvm_s390_gib_init(u8 nisc);
 void kvm_s390_gib_destroy(void);
 
 /* implemented in guestdbg.c */
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index ded1af2ddae9..39544c92ce3d 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -670,7 +670,7 @@ int kvm_s390_pci_zpci_op(struct kvm *kvm, struct kvm_s390_zpci_op *args)
 	return r;
 }
 
-int kvm_s390_pci_init(void)
+int __init kvm_s390_pci_init(void)
 {
 	zpci_kvm_hook.kvm_register = kvm_s390_pci_register_kvm;
 	zpci_kvm_hook.kvm_unregister = kvm_s390_pci_unregister_kvm;
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index 486d06ef563f..ff0972dd5e71 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -60,7 +60,7 @@ void kvm_s390_pci_clear_list(struct kvm *kvm);
 
 int kvm_s390_pci_zpci_op(struct kvm *kvm, struct kvm_s390_zpci_op *args);
 
-int kvm_s390_pci_init(void);
+int __init kvm_s390_pci_init(void);
 void kvm_s390_pci_exit(void);
 
 static inline bool kvm_s390_pci_interp_allowed(void)
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 29/50] KVM: Drop kvm_arch_{init,exit}() hooks
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (27 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 28/50] KVM: s390: Mark __kvm_s390_init() and its descendants as __init Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 30/50] KVM: VMX: Make VMCS configuration/capabilities structs read-only after init Sean Christopherson
                   ` (22 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Drop kvm_arch_init() and kvm_arch_exit() now that all implementations
are nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Anup Patel <anup@brainfault.org>
---
 arch/arm64/kvm/arm.c                | 11 -----------
 arch/mips/kvm/mips.c                | 10 ----------
 arch/powerpc/include/asm/kvm_host.h |  1 -
 arch/powerpc/kvm/powerpc.c          |  5 -----
 arch/riscv/kvm/main.c               |  9 ---------
 arch/s390/kvm/kvm-s390.c            | 10 ----------
 arch/x86/kvm/x86.c                  | 10 ----------
 include/linux/kvm_host.h            |  3 ---
 virt/kvm/kvm_main.c                 | 19 ++-----------------
 9 files changed, 2 insertions(+), 76 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 4d34abcfc9a9..936ef7d1ea94 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2289,17 +2289,6 @@ static __init int kvm_arm_init(void)
 	return err;
 }
 
-int kvm_arch_init(void *opaque)
-{
-	return 0;
-}
-
-/* NOP: Compiling as a module not supported */
-void kvm_arch_exit(void)
-{
-
-}
-
 static int __init early_kvm_mode_cfg(char *arg)
 {
 	if (!arg)
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index ae7a24342fdf..3cade648827a 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -1010,16 +1010,6 @@ long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 	return r;
 }
 
-int kvm_arch_init(void *opaque)
-{
-	return 0;
-}
-
-void kvm_arch_exit(void)
-{
-
-}
-
 int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 				  struct kvm_sregs *sregs)
 {
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 5d2c3a487e73..0a80e80c7b9e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -881,7 +881,6 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
-static inline void kvm_arch_exit(void) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index d44b85ba8cef..01d0f9935e6c 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -2539,11 +2539,6 @@ void kvmppc_init_lpid(unsigned long nr_lpids_param)
 }
 EXPORT_SYMBOL_GPL(kvmppc_init_lpid);
 
-int kvm_arch_init(void *opaque)
-{
-	return 0;
-}
-
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_ppc_instr);
 
 void kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu, struct dentry *debugfs_dentry)
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index cb063b8a9a0f..4710a6751687 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -65,15 +65,6 @@ void kvm_arch_hardware_disable(void)
 	csr_write(CSR_HIDELEG, 0);
 }
 
-int kvm_arch_init(void *opaque)
-{
-	return 0;
-}
-
-void kvm_arch_exit(void)
-{
-}
-
 static int __init riscv_kvm_init(void)
 {
 	const char *str;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 66d162723d21..25b08b956888 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -541,16 +541,6 @@ static void __kvm_s390_exit(void)
 	debug_unregister(kvm_s390_dbf_uv);
 }
 
-int kvm_arch_init(void *opaque)
-{
-	return 0;
-}
-
-void kvm_arch_exit(void)
-{
-
-}
-
 /* Section: device related */
 long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 45184ca89317..66f16458aa97 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9277,16 +9277,6 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
 	kvm_pmu_ops_update(ops->pmu_ops);
 }
 
-int kvm_arch_init(void *opaque)
-{
-	return 0;
-}
-
-void kvm_arch_exit(void)
-{
-
-}
-
 static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 {
 	u64 host_pat;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f2e0e78d2d92..7dde28333e7c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1439,9 +1439,6 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 					struct kvm_guest_debug *dbg);
 int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu);
 
-int kvm_arch_init(void *opaque);
-void kvm_arch_exit(void);
-
 void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu);
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0e62887e8ce1..a4a10a0b322f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5852,20 +5852,8 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	int r;
 	int cpu;
 
-	/*
-	 * FIXME: Get rid of kvm_arch_init(), vendor code should call arch code
-	 * directly.  Note, kvm_arch_init() _must_ be called before anything
-	 * else as x86 relies on checks buried in kvm_arch_init() to guard
-	 * against multiple calls to kvm_init().
-	 */
-	r = kvm_arch_init(opaque);
-	if (r)
-		return r;
-
-	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL)) {
-		r = -ENOMEM;
-		goto err_hw_enabled;
-	}
+	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL))
+		return -ENOMEM;
 
 	c.ret = &r;
 	c.opaque = opaque;
@@ -5953,8 +5941,6 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
 out_free_2:
 	free_cpumask_var(cpus_hardware_enabled);
-err_hw_enabled:
-	kvm_arch_exit();
 	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_init);
@@ -5982,7 +5968,6 @@ void kvm_exit(void)
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_irqfd_exit();
 	free_cpumask_var(cpus_hardware_enabled);
-	kvm_arch_exit();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
 
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 30/50] KVM: VMX: Make VMCS configuration/capabilities structs read-only after init
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (28 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 29/50] KVM: Drop kvm_arch_{init,exit}() hooks Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code Sean Christopherson
                   ` (21 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Tag vmcs_config and vmx_capability structs as __ro_after_init.  The
canonical configuration is generated during hardware_setup() and must
never be modified after that point.
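
For readers unfamiliar with the annotation, a loose userspace analogy
(Linux-only, and not how the kernel implements it) is to fill in a
structure during setup and then drop write permission on its page with
mprotect().  Everything named demo_* below is hypothetical; the
structure merely mirrors the two fields of struct vmx_capability.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for struct vmx_capability. */
struct demo_capability {
	unsigned int ept;
	unsigned int vpid;
};

/* Lives in its own page so its protection can be changed on its own. */
struct demo_capability *demo_cap;

/* "Init": allocate, fill in the canonical values, drop write access. */
int demo_setup(void)
{
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0)
		return -1;

	demo_cap = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (demo_cap == MAP_FAILED)
		return -1;

	demo_cap->ept = 0x1;
	demo_cap->vpid = 0x2;

	/* From here on, the page is readable but no longer writable. */
	return mprotect(demo_cap, (size_t)page, PROT_READ);
}

/* Returns 1 if a post-init write dies with SIGSEGV, 0 if not, -1 on error. */
int demo_write_faults(void)
{
	int status;
	pid_t pid;

	assert(demo_cap != NULL && demo_cap != MAP_FAILED);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: attempt the forbidden write. */
		*(volatile unsigned int *)&demo_cap->ept = 0;
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

Reads keep working after demo_setup(), while a stray write is caught
immediately instead of silently corrupting the configuration, which is
the property the annotation is meant to provide for these structs.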

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/capabilities.h | 4 ++--
 arch/x86/kvm/vmx/vmx.c          | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index cd2ac9536c99..45162c1bcd8f 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -66,13 +66,13 @@ struct vmcs_config {
 	u64 misc;
 	struct nested_vmx_msrs nested;
 };
-extern struct vmcs_config vmcs_config;
+extern struct vmcs_config vmcs_config __ro_after_init;
 
 struct vmx_capability {
 	u32 ept;
 	u32 vpid;
 };
-extern struct vmx_capability vmx_capability;
+extern struct vmx_capability vmx_capability __ro_after_init;
 
 static inline bool cpu_has_vmx_basic_inout(void)
 {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 76185a7a7ded..654d81f781da 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -488,8 +488,8 @@ static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
 static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);
 static DEFINE_SPINLOCK(vmx_vpid_lock);
 
-struct vmcs_config vmcs_config;
-struct vmx_capability vmx_capability;
+struct vmcs_config vmcs_config __ro_after_init;
+struct vmx_capability vmx_capability __ro_after_init;
 
 #define VMX_SEGMENT_FIELD(seg)					\
 	[VCPU_SREG_##seg] = {                                   \
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (29 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 30/50] KVM: VMX: Make VMCS configuration/capabilities structs read-only after init Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02 12:16   ` Huang, Kai
  2022-12-05 20:52   ` Isaku Yamahata
  2022-11-30 23:09 ` [PATCH v2 32/50] KVM: Drop kvm_arch_check_processor_compat() hook Sean Christopherson
                   ` (20 subsequent siblings)
  51 siblings, 2 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Move the CPU compatibility checks to pure x86 code, i.e. drop x86's use
of the common kvm_arch_check_processor_compat() arch hook.  x86 is the
only architecture that "needs" to do per-CPU compatibility checks.
Moving the logic to x86 will allow dropping the common code, and will
also give x86 more control over when/how the compatibility checks are
performed, e.g. TDX will need to enable hardware (do VMXON) in order to
perform compatibility checks.
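
The shape of the per-CPU check can be sketched in ordinary C, assuming
each CPU is reduced to a mask of reserved CR4 bits and that index 0
plays the role of the boot CPU.  The struct, DEMO_EIO and the demo_*
helpers are hypothetical; the real code runs the check on each CPU via
smp_call_function_single() and also calls the vendor hook.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EIO 5	/* stand-in for the kernel's EIO */

/* Hypothetical per-CPU data: reserved CR4 bits plus an online flag. */
struct demo_cpu {
	unsigned long cr4_reserved;
	int online;
};

/* Compare one CPU against the boot CPU (index 0 by assumption). */
int demo_check_one_cpu(const struct demo_cpu *cpus, size_t cpu)
{
	if (cpus[cpu].cr4_reserved != cpus[0].cr4_reserved)
		return -DEMO_EIO;

	return 0;
}

/* Walk all online CPUs and stop at the first incompatible one. */
int demo_check_all_cpus(const struct demo_cpu *cpus, size_t nr_cpus)
{
	size_t cpu;
	int r;

	assert(cpus != NULL && nr_cpus > 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (!cpus[cpu].online)
			continue;

		r = demo_check_one_cpu(cpus, cpu);
		if (r < 0)
			return r;
	}
	return 0;
}
```

Offline CPUs are skipped, mirroring for_each_online_cpu(), and the
first mismatch aborts the walk so the caller can unwind.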

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c |  2 +-
 arch/x86/kvm/vmx/vmx.c |  2 +-
 arch/x86/kvm/x86.c     | 49 ++++++++++++++++++++++++++++++++----------
 3 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 19e81a99c58f..d7ea1c1175c2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5103,7 +5103,7 @@ static int __init svm_init(void)
 	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
 	 * exposed to userspace!
 	 */
-	r = kvm_init(&svm_init_ops, sizeof(struct vcpu_svm),
+	r = kvm_init(NULL, sizeof(struct vcpu_svm),
 		     __alignof__(struct vcpu_svm), THIS_MODULE);
 	if (r)
 		goto err_kvm_init;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 654d81f781da..8deb1bd60c10 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8592,7 +8592,7 @@ static int __init vmx_init(void)
 	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
 	 * exposed to userspace!
 	 */
-	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
+	r = kvm_init(NULL, sizeof(struct vcpu_vmx),
 		     __alignof__(struct vcpu_vmx), THIS_MODULE);
 	if (r)
 		goto err_kvm_init;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 66f16458aa97..3571bc968cf8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9277,10 +9277,36 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
 	kvm_pmu_ops_update(ops->pmu_ops);
 }
 
+struct kvm_cpu_compat_check {
+	struct kvm_x86_init_ops *ops;
+	int *ret;
+};
+
+static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
+{
+	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
+
+	WARN_ON(!irqs_disabled());
+
+	if (__cr4_reserved_bits(cpu_has, c) !=
+	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
+		return -EIO;
+
+	return ops->check_processor_compatibility();
+}
+
+static void kvm_x86_check_cpu_compat(void *data)
+{
+	struct kvm_cpu_compat_check *c = data;
+
+	*c->ret = kvm_x86_check_processor_compatibility(c->ops);
+}
+
 static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 {
+	struct kvm_cpu_compat_check c;
 	u64 host_pat;
-	int r;
+	int r, cpu;
 
 	if (kvm_x86_ops.hardware_enable) {
 		pr_err("kvm: already loaded vendor module '%s'\n", kvm_x86_ops.name);
@@ -9360,6 +9386,14 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	if (r != 0)
 		goto out_mmu_exit;
 
+	c.ret = &r;
+	c.ops = ops;
+	for_each_online_cpu(cpu) {
+		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &c, 1);
+		if (r < 0)
+			goto out_hardware_unsetup;
+	}
+
 	/*
 	 * Point of no return!  DO NOT add error paths below this point unless
 	 * absolutely necessary, as most operations from this point forward
@@ -9402,6 +9436,8 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	kvm_init_msr_list();
 	return 0;
 
+out_hardware_unsetup:
+	ops->runtime_ops->hardware_unsetup();
 out_mmu_exit:
 	kvm_mmu_vendor_module_exit();
 out_free_percpu:
@@ -12037,16 +12073,7 @@ void kvm_arch_hardware_disable(void)
 
 int kvm_arch_check_processor_compat(void *opaque)
 {
-	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
-	struct kvm_x86_init_ops *ops = opaque;
-
-	WARN_ON(!irqs_disabled());
-
-	if (__cr4_reserved_bits(cpu_has, c) !=
-	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
-		return -EIO;
-
-	return ops->check_processor_compatibility();
+	return 0;
 }
 
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
-- 
2.38.1.584.g0f3c55d4c2-goog


* [PATCH v2 32/50] KVM: Drop kvm_arch_check_processor_compat() hook
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (30 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02 12:18   ` Huang, Kai
  2022-11-30 23:09 ` [PATCH v2 33/50] KVM: x86: Use KBUILD_MODNAME to specify vendor module name Sean Christopherson
                   ` (19 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Drop kvm_arch_check_processor_compat() and its support code now that all
architecture implementations are nops.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
Acked-by: Anup Patel <anup@brainfault.org>
---
 arch/arm64/kvm/arm.c       |  7 +------
 arch/mips/kvm/mips.c       |  7 +------
 arch/powerpc/kvm/book3s.c  |  2 +-
 arch/powerpc/kvm/e500.c    |  2 +-
 arch/powerpc/kvm/e500mc.c  |  2 +-
 arch/powerpc/kvm/powerpc.c |  5 -----
 arch/riscv/kvm/main.c      |  7 +------
 arch/s390/kvm/kvm-s390.c   |  7 +------
 arch/x86/kvm/svm/svm.c     |  4 ++--
 arch/x86/kvm/vmx/vmx.c     |  4 ++--
 arch/x86/kvm/x86.c         |  5 -----
 include/linux/kvm_host.h   |  4 +---
 virt/kvm/kvm_main.c        | 24 +-----------------------
 13 files changed, 13 insertions(+), 67 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 936ef7d1ea94..e915b1d9f2cd 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -63,11 +63,6 @@ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
-{
-	return 0;
-}
-
 int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			    struct kvm_enable_cap *cap)
 {
@@ -2273,7 +2268,7 @@ static __init int kvm_arm_init(void)
 	 * FIXME: Do something reasonable if kvm_init() fails after pKVM
 	 * hypervisor protection is finalized.
 	 */
-	err = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	err = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 	if (err)
 		goto out_subs;
 
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 3cade648827a..36c8991b5d39 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -135,11 +135,6 @@ void kvm_arch_hardware_disable(void)
 	kvm_mips_callbacks->hardware_disable();
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
-{
-	return 0;
-}
-
 extern void kvm_init_loongson_ipi(struct kvm *kvm);
 
 int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
@@ -1636,7 +1631,7 @@ static int __init kvm_mips_init(void)
 
 	register_die_notifier(&kvm_mips_csr_die_notifier);
 
-	ret = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	ret = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 	if (ret) {
 		unregister_die_notifier(&kvm_mips_csr_die_notifier);
 		return ret;
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 87283a0e33d8..57f4e7896d67 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -1052,7 +1052,7 @@ static int kvmppc_book3s_init(void)
 {
 	int r;
 
-	r = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	r = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 	if (r)
 		return r;
 #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index 0ea61190ec04..b0f695428733 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -531,7 +531,7 @@ static int __init kvmppc_e500_init(void)
 	flush_icache_range(kvmppc_booke_handlers, kvmppc_booke_handlers +
 			   ivor[max_ivor] + handler_len);
 
-	r = kvm_init(NULL, sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
+	r = kvm_init(sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
 	if (r)
 		goto err_out;
 	kvm_ops_e500.owner = THIS_MODULE;
diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 795667f7ebf0..611532a0dedc 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -404,7 +404,7 @@ static int __init kvmppc_e500mc_init(void)
 	 */
 	kvmppc_init_lpid(KVMPPC_NR_LPIDS/threads_per_core);
 
-	r = kvm_init(NULL, sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
+	r = kvm_init(sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
 	if (r)
 		goto err_out;
 	kvm_ops_e500mc.owner = THIS_MODULE;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 01d0f9935e6c..f5b4ff6bfc89 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -440,11 +440,6 @@ int kvm_arch_hardware_enable(void)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
-{
-	return 0;
-}
-
 int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
 	struct kvmppc_ops *kvm_ops = NULL;
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 4710a6751687..34c3dece6990 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -20,11 +20,6 @@ long kvm_arch_dev_ioctl(struct file *filp,
 	return -EINVAL;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
-{
-	return 0;
-}
-
 int kvm_arch_hardware_enable(void)
 {
 	unsigned long hideleg, hedeleg;
@@ -110,6 +105,6 @@ static int __init riscv_kvm_init(void)
 
 	kvm_info("VMID %ld bits available\n", kvm_riscv_gstage_vmid_bits());
 
-	return kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	return kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 }
 module_init(riscv_kvm_init);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 25b08b956888..7ad8252e92c2 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -262,11 +262,6 @@ int kvm_arch_hardware_enable(void)
 	return 0;
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
-{
-	return 0;
-}
-
 /* forward declarations */
 static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
 			      unsigned long end);
@@ -5716,7 +5711,7 @@ static int __init kvm_s390_init(void)
 	if (r)
 		return r;
 
-	r = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
+	r = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
 	if (r) {
 		__kvm_s390_exit();
 		return r;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d7ea1c1175c2..d9a54590591d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5103,8 +5103,8 @@ static int __init svm_init(void)
 	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
 	 * exposed to userspace!
 	 */
-	r = kvm_init(NULL, sizeof(struct vcpu_svm),
-		     __alignof__(struct vcpu_svm), THIS_MODULE);
+	r = kvm_init(sizeof(struct vcpu_svm), __alignof__(struct vcpu_svm),
+		     THIS_MODULE);
 	if (r)
 		goto err_kvm_init;
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8deb1bd60c10..b6f08a0a1435 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8592,8 +8592,8 @@ static int __init vmx_init(void)
 	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
 	 * exposed to userspace!
 	 */
-	r = kvm_init(NULL, sizeof(struct vcpu_vmx),
-		     __alignof__(struct vcpu_vmx), THIS_MODULE);
+	r = kvm_init(sizeof(struct vcpu_vmx), __alignof__(struct vcpu_vmx),
+		     THIS_MODULE);
 	if (r)
 		goto err_kvm_init;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3571bc968cf8..566156b34314 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12071,11 +12071,6 @@ void kvm_arch_hardware_disable(void)
 	drop_user_return_notifiers();
 }
 
-int kvm_arch_check_processor_compat(void *opaque)
-{
-	return 0;
-}
-
 bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
 {
 	return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7dde28333e7c..616e8e90558b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -953,8 +953,7 @@ static inline void kvm_irqfd_exit(void)
 {
 }
 #endif
-int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
-		  struct module *module);
+int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module);
 void kvm_exit(void);
 
 void kvm_get_kvm(struct kvm *kvm);
@@ -1460,7 +1459,6 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
-int kvm_arch_check_processor_compat(void *opaque);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a4a10a0b322f..3900bd3d75cb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5833,36 +5833,14 @@ void kvm_unregister_perf_callbacks(void)
 }
 #endif
 
-struct kvm_cpu_compat_check {
-	void *opaque;
-	int *ret;
-};
-
-static void check_processor_compat(void *data)
+int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 {
-	struct kvm_cpu_compat_check *c = data;
-
-	*c->ret = kvm_arch_check_processor_compat(c->opaque);
-}
-
-int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
-		  struct module *module)
-{
-	struct kvm_cpu_compat_check c;
 	int r;
 	int cpu;
 
 	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL))
 		return -ENOMEM;
 
-	c.ret = &r;
-	c.opaque = opaque;
-	for_each_online_cpu(cpu) {
-		smp_call_function_single(cpu, check_processor_compat, &c, 1);
-		if (r < 0)
-			goto out_free_2;
-	}
-
 	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
 				      kvm_starting_cpu, kvm_dying_cpu);
 	if (r)
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 33/50] KVM: x86: Use KBUILD_MODNAME to specify vendor module name
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (31 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 32/50] KVM: Drop kvm_arch_check_processor_compat() hook Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 34/50] KVM: x86: Unify pr_fmt to use module name for all KVM modules Sean Christopherson
                   ` (18 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Use KBUILD_MODNAME to specify the vendor module name instead of manually
writing out the name to make it a bit more obvious that the name isn't
completely arbitrary.  A future patch will also use KBUILD_MODNAME to
define pr_fmt, at which point using KBUILD_MODNAME for kvm_x86_ops.name
further reinforces the intended usage of kvm_x86_ops.name.
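
As a rough illustration of why the macro is preferable to a hand-written
string, here is a minimal userspace sketch.  The struct and helper names
(vendor_ops_sketch, svm_ops_sketch, vendor_module_name) are hypothetical
stand-ins, not KVM code, and KBUILD_MODNAME is hard-coded only because
kbuild is not involved when building the sketch.

```c
/*
 * Assumption: in a real build, kbuild defines KBUILD_MODNAME as a string
 * literal for each object.  It is hard-coded here only so that the sketch
 * compiles in userspace.
 */
#ifndef KBUILD_MODNAME
#define KBUILD_MODNAME "kvm_amd"
#endif

/* Hypothetical stand-in for struct kvm_x86_ops; only .name is modeled. */
struct vendor_ops_sketch {
	const char *name;
};

/* The name tracks the module name instead of being typed out by hand. */
static const struct vendor_ops_sketch svm_ops_sketch = {
	.name = KBUILD_MODNAME,
};

/* Hypothetical accessor, present only so the result can be inspected. */
static const char *vendor_module_name(void)
{
	return svm_ops_sketch.name;
}
```

Because the name comes from the build system, renaming the module cannot
leave a stale string behind in the ops structure.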

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 2 +-
 arch/x86/kvm/vmx/vmx.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d9a54590591d..a875cf7b2942 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4695,7 +4695,7 @@ static int svm_vm_init(struct kvm *kvm)
 }
 
 static struct kvm_x86_ops svm_x86_ops __initdata = {
-	.name = "kvm_amd",
+	.name = KBUILD_MODNAME,
 
 	.hardware_unsetup = svm_hardware_unsetup,
 	.hardware_enable = svm_hardware_enable,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b6f08a0a1435..229a9cf485f0 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8102,7 +8102,7 @@ static void vmx_vm_destroy(struct kvm *kvm)
 }
 
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
-	.name = "kvm_intel",
+	.name = KBUILD_MODNAME,
 
 	.hardware_unsetup = vmx_hardware_unsetup,
 
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 34/50] KVM: x86: Unify pr_fmt to use module name for all KVM modules
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (32 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 33/50] KVM: x86: Use KBUILD_MODNAME to specify vendor module name Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-01 10:43   ` Paul Durrant
  2022-11-30 23:09 ` [PATCH v2 35/50] KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks Sean Christopherson
                   ` (17 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Define pr_fmt using KBUILD_MODNAME for all KVM x86 code so that printks
use consistent formatting across common x86, Intel, and AMD code.  In
addition to providing consistent print formatting, using KBUILD_MODNAME,
e.g. kvm_amd and kvm_intel, allows referencing SVM and VMX (and SEV and
SGX and ...) as technologies without generating weird messages, and
without causing naming conflicts with other kernel code, e.g. "SEV: ",
"tdx: ", "sgx: ", etc. are all used by the kernel for non-KVM subsystems.
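
For readers unfamiliar with the mechanism, the prefixing relies on nothing
more than string literal concatenation.  The following is a userspace
sketch, not kernel code: pr_info_sketch() and demo_nested_paging() are
hypothetical helpers that format into a buffer, and KBUILD_MODNAME is
hard-coded under the assumption that kbuild would normally supply it.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: kbuild normally defines this; hard-coded for the sketch. */
#ifndef KBUILD_MODNAME
#define KBUILD_MODNAME "kvm_amd"
#endif

/* Same shape as the define added by this patch. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/*
 * Hypothetical stand-in for pr_info(): the format literal is pasted behind
 * the prefix at compile time.  Uses the GNU ", ## __VA_ARGS__" extension,
 * as the kernel's own printk wrappers do.
 */
#define pr_info_sketch(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ## __VA_ARGS__)

/* Formats one of the messages touched by this patch into buf. */
static const char *demo_nested_paging(char *buf, size_t len, int enabled)
{
	pr_info_sketch(buf, len, "Nested Paging %sabled\n",
		       enabled ? "en" : "dis");
	return buf;
}
```

With the define in place, a call site only spells out the message itself,
which is why the manual "kvm: " and "KVM: " prefixes are dropped from the
format strings in the diff below.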

Opportunistically move away from printk() for prints that need to be
modified anyway, e.g. to drop a manual "kvm: " prefix.

Opportunistically convert a few SGX WARNs that are similarly modified to
WARN_ONCE; in the very unlikely event that the WARNs fire, odds are good
that they would fire repeatedly and spam the kernel log without providing
unique information in each print.
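
The difference in log volume can be sketched in userspace as follows.
This is a simplified model under stated assumptions: warn_always() and
warn_once() are hypothetical macros that only print a message, whereas
the kernel's WARN() and WARN_ONCE() also dump a backtrace, and nr_prints
exists purely so the effect can be observed.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts emitted messages so the two behaviors can be compared. */
static int nr_prints;

/* Hypothetical model of WARN(): prints every time the condition holds. */
#define warn_always(cond, msg)				\
	do {						\
		if (cond) {				\
			nr_prints++;			\
			fprintf(stderr, "%s", msg);	\
		}					\
	} while (0)

/*
 * Hypothetical model of WARN_ONCE(): a static flag, private to each call
 * site, suppresses every print after the first.
 */
#define warn_once(cond, msg)				\
	do {						\
		static bool warned;			\
		if ((cond) && !warned) {		\
			warned = true;			\
			nr_prints++;			\
			fprintf(stderr, "%s", msg);	\
		}					\
	} while (0)

/* Hits the same failing condition n times and returns the print count. */
static int spam_with_warn(int n)
{
	nr_prints = 0;
	for (int i = 0; i < n; i++)
		warn_always(1, "unexpected exit on ENCLS\n");
	return nr_prints;
}

static int spam_with_warn_once(int n)
{
	nr_prints = 0;
	for (int i = 0; i < n; i++)
		warn_once(1, "unexpected exit on ENCLS\n");
	return nr_prints;
}
```

If a guest can trigger the condition repeatedly, the first variant scales
with the number of hits while the second emits a single message per call
site.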

Note, defining pr_fmt yields undesirable results for code that uses KVM's
printk wrappers, e.g. vcpu_unimpl().  But, that's a pre-existing problem
as SVM/kvm_amd already defines a pr_fmt, and thankfully use of KVM's
wrappers is relatively limited in KVM x86 code.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c            |  1 +
 arch/x86/kvm/debugfs.c          |  2 ++
 arch/x86/kvm/emulate.c          |  1 +
 arch/x86/kvm/hyperv.c           |  1 +
 arch/x86/kvm/i8254.c            |  4 ++--
 arch/x86/kvm/i8259.c            |  4 +++-
 arch/x86/kvm/ioapic.c           |  1 +
 arch/x86/kvm/irq.c              |  1 +
 arch/x86/kvm/irq_comm.c         |  7 +++---
 arch/x86/kvm/kvm_onhyperv.c     |  1 +
 arch/x86/kvm/lapic.c            |  8 +++----
 arch/x86/kvm/mmu/mmu.c          |  6 ++---
 arch/x86/kvm/mmu/page_track.c   |  1 +
 arch/x86/kvm/mmu/spte.c         |  4 ++--
 arch/x86/kvm/mmu/spte.h         |  4 ++--
 arch/x86/kvm/mmu/tdp_iter.c     |  1 +
 arch/x86/kvm/mmu/tdp_mmu.c      |  1 +
 arch/x86/kvm/mtrr.c             |  1 +
 arch/x86/kvm/pmu.c              |  1 +
 arch/x86/kvm/smm.c              |  1 +
 arch/x86/kvm/svm/avic.c         |  2 +-
 arch/x86/kvm/svm/nested.c       |  2 +-
 arch/x86/kvm/svm/pmu.c          |  2 ++
 arch/x86/kvm/svm/sev.c          |  1 +
 arch/x86/kvm/svm/svm.c          | 10 ++++-----
 arch/x86/kvm/svm/svm_onhyperv.c |  1 +
 arch/x86/kvm/svm/svm_onhyperv.h |  4 ++--
 arch/x86/kvm/vmx/hyperv.c       |  1 +
 arch/x86/kvm/vmx/hyperv.h       |  4 +---
 arch/x86/kvm/vmx/nested.c       |  3 ++-
 arch/x86/kvm/vmx/pmu_intel.c    |  5 +++--
 arch/x86/kvm/vmx/posted_intr.c  |  2 ++
 arch/x86/kvm/vmx/sgx.c          |  5 +++--
 arch/x86/kvm/vmx/vmcs12.c       |  1 +
 arch/x86/kvm/vmx/vmx.c          | 40 ++++++++++++++++-----------------
 arch/x86/kvm/vmx/vmx_ops.h      |  4 ++--
 arch/x86/kvm/x86.c              | 28 ++++++++++++-----------
 arch/x86/kvm/xen.c              |  1 +
 38 files changed, 97 insertions(+), 70 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 723502181a3a..82411693e604 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -8,6 +8,7 @@
  * Copyright 2011 Red Hat, Inc. and/or its affiliates.
  * Copyright IBM Corporation, 2008
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <linux/export.h>
diff --git a/arch/x86/kvm/debugfs.c b/arch/x86/kvm/debugfs.c
index c1390357126a..ee8c4c3496ed 100644
--- a/arch/x86/kvm/debugfs.c
+++ b/arch/x86/kvm/debugfs.c
@@ -4,6 +4,8 @@
  *
  * Copyright 2016 Red Hat, Inc. and/or its affiliates.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/kvm_host.h>
 #include <linux/debugfs.h>
 #include "lapic.h"
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5cc3efa0e21c..c3443045cd93 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -17,6 +17,7 @@
  *
  * From: xen-unstable 10676:af9809f51f81a3c43f276f00c81a52ef558afda4
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include "kvm_cache_regs.h"
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 2c7f2a26421e..4c47892d72bb 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -17,6 +17,7 @@
  *   Ben-Ami Yassour <benami@il.ibm.com>
  *   Andrey Smetanin <asmetanin@virtuozzo.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include "x86.h"
 #include "lapic.h"
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index e0a7a0e7a73c..cd57a517d04a 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -30,7 +30,7 @@
  *   Based on QEMU and Xen.
  */
 
-#define pr_fmt(fmt) "pit: " fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <linux/slab.h>
@@ -351,7 +351,7 @@ static void create_pit_timer(struct kvm_pit *pit, u32 val, int is_period)
 
 		if (ps->period < min_period) {
 			pr_info_ratelimited(
-			    "kvm: requested %lld ns "
+			    "requested %lld ns "
 			    "i8254 timer period limited to %lld ns\n",
 			    ps->period, min_period);
 			ps->period = min_period;
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index e1bb6218bb96..4756bcb5724f 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -26,6 +26,8 @@
  *   Yaozu (Eddie) Dong <Eddie.dong@intel.com>
  *   Port from Qemu.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/mm.h>
 #include <linux/slab.h>
 #include <linux/bitops.h>
@@ -35,7 +37,7 @@
 #include "trace.h"
 
 #define pr_pic_unimpl(fmt, ...)	\
-	pr_err_ratelimited("kvm: pic: " fmt, ## __VA_ARGS__)
+	pr_err_ratelimited("pic: " fmt, ## __VA_ARGS__)
 
 static void pic_irq_request(struct kvm *kvm, int level);
 
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 765943d7cfa5..042dee556125 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -26,6 +26,7 @@
  *  Yaozu (Eddie) Dong <eddie.dong@intel.com>
  *  Based on Xen 3.1 code.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <linux/kvm.h>
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index d8d50558f165..f36f7cd77fb7 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -7,6 +7,7 @@
  * Authors:
  *   Yaozu (Eddie) Dong <Eddie.dong@intel.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/export.h>
 #include <linux/kvm_host.h>
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 0687162c4f22..d48eaeacf803 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -8,6 +8,7 @@
  *
  * Copyright 2010 Red Hat, Inc. and/or its affiliates.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <linux/slab.h>
@@ -56,7 +57,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 
 	if (irq->dest_mode == APIC_DEST_PHYSICAL &&
 	    irq->dest_id == 0xff && kvm_lowest_prio_delivery(irq)) {
-		printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n");
+		pr_info("apic: phys broadcast and lowest prio\n");
 		irq->delivery_mode = APIC_DM_FIXED;
 	}
 
@@ -199,7 +200,7 @@ int kvm_request_irq_source_id(struct kvm *kvm)
 	irq_source_id = find_first_zero_bit(bitmap, BITS_PER_LONG);
 
 	if (irq_source_id >= BITS_PER_LONG) {
-		printk(KERN_WARNING "kvm: exhaust allocatable IRQ sources!\n");
+		pr_warn("exhausted allocatable IRQ sources!\n");
 		irq_source_id = -EFAULT;
 		goto unlock;
 	}
@@ -221,7 +222,7 @@ void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id)
 	mutex_lock(&kvm->irq_lock);
 	if (irq_source_id < 0 ||
 	    irq_source_id >= BITS_PER_LONG) {
-		printk(KERN_ERR "kvm: IRQ source ID out of range!\n");
+		pr_err("IRQ source ID out of range!\n");
 		goto unlock;
 	}
 	clear_bit(irq_source_id, &kvm->arch.irq_sources_bitmap);
diff --git a/arch/x86/kvm/kvm_onhyperv.c b/arch/x86/kvm/kvm_onhyperv.c
index ee4f696a0782..482d6639ef88 100644
--- a/arch/x86/kvm/kvm_onhyperv.c
+++ b/arch/x86/kvm/kvm_onhyperv.c
@@ -2,6 +2,7 @@
 /*
  * KVM L1 hypervisor optimizations on Hyper-V.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <asm/mshyperv.h>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 1bb63746e991..9335c4b05760 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -15,6 +15,7 @@
  *
  * Based on Xen 3.1 code, Copyright (c) 2004, Intel Corporation.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <linux/kvm.h>
@@ -942,8 +943,7 @@ static void kvm_apic_disabled_lapic_found(struct kvm *kvm)
 {
 	if (!kvm->arch.disabled_lapic_found) {
 		kvm->arch.disabled_lapic_found = true;
-		printk(KERN_INFO
-		       "Disabled LAPIC found during irq injection\n");
+		pr_info("Disabled LAPIC found during irq injection\n");
 	}
 }
 
@@ -1561,7 +1561,7 @@ static void limit_periodic_timer_frequency(struct kvm_lapic *apic)
 
 		if (apic->lapic_timer.period < min_period) {
 			pr_info_ratelimited(
-			    "kvm: vcpu %i: requested %lld ns "
+			    "vcpu %i: requested %lld ns "
 			    "lapic timer period limited to %lld ns\n",
 			    apic->vcpu->vcpu_id,
 			    apic->lapic_timer.period, min_period);
@@ -1846,7 +1846,7 @@ static bool set_target_expiration(struct kvm_lapic *apic, u32 count_reg)
 				deadline = apic->lapic_timer.period;
 			else if (unlikely(deadline > apic->lapic_timer.period)) {
 				pr_info_ratelimited(
-				    "kvm: vcpu %i: requested lapic timer restore with "
+				    "vcpu %i: requested lapic timer restore with "
 				    "starting count register %#x=%u (%lld ns) > initial count (%lld ns). "
 				    "Using initial count to start timer.\n",
 				    apic->vcpu->vcpu_id,
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4736d7849c60..00a8312ad31d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -14,6 +14,7 @@
  *   Yaniv Kamay  <yaniv@qumranet.com>
  *   Avi Kivity   <avi@qumranet.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include "irq.h"
 #include "ioapic.h"
@@ -3432,8 +3433,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		}
 
 		if (++retry_count > 4) {
-			printk_once(KERN_WARNING
-				"kvm: Fast #PF retrying more than 4 times.\n");
+			pr_warn_once("Fast #PF retrying more than 4 times.\n");
 			break;
 		}
 
@@ -6578,7 +6578,7 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
 	 * zap all shadow pages.
 	 */
 	if (unlikely(gen == 0)) {
-		kvm_debug_ratelimited("kvm: zapping shadow pages for mmio generation wraparound\n");
+		kvm_debug_ratelimited("zapping shadow pages for mmio generation wraparound\n");
 		kvm_mmu_zap_all_fast(kvm);
 	}
 }
diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index 2e09d1b6249f..0a2ac438d647 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -10,6 +10,7 @@
  * Author:
  *   Xiao Guangrong <guangrong.xiao@linux.intel.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <linux/rculist.h>
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index c0fd7e049b4e..fce6f047399f 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -7,7 +7,7 @@
  * Copyright (C) 2006 Qumranet, Inc.
  * Copyright 2020 Red Hat, Inc. and/or its affiliates.
  */
-
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include "mmu.h"
@@ -352,7 +352,7 @@ u64 mark_spte_for_access_track(u64 spte)
 
 	WARN_ONCE(spte & (SHADOW_ACC_TRACK_SAVED_BITS_MASK <<
 			  SHADOW_ACC_TRACK_SAVED_BITS_SHIFT),
-		  "kvm: Access Tracking saved bit locations are not zero\n");
+		  "Access Tracking saved bit locations are not zero\n");
 
 	spte |= (spte & SHADOW_ACC_TRACK_SAVED_BITS_MASK) <<
 		SHADOW_ACC_TRACK_SAVED_BITS_SHIFT;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 1f03701b943a..17f74c60c774 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -435,11 +435,11 @@ static inline void check_spte_writable_invariants(u64 spte)
 {
 	if (spte & shadow_mmu_writable_mask)
 		WARN_ONCE(!(spte & shadow_host_writable_mask),
-			  "kvm: MMU-writable SPTE is not Host-writable: %llx",
+			  KBUILD_MODNAME ": MMU-writable SPTE is not Host-writable: %llx",
 			  spte);
 	else
 		WARN_ONCE(is_writable_pte(spte),
-			  "kvm: Writable SPTE is not MMU-writable: %llx", spte);
+			  KBUILD_MODNAME ": Writable SPTE is not MMU-writable: %llx", spte);
 }
 
 static inline bool is_mmu_writable_spte(u64 spte)
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index 39b48e7d7d1a..e26e744df1d1 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include "mmu_internal.h"
 #include "tdp_iter.h"
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 771210ce5181..c0242f3fe614 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include "mmu.h"
 #include "mmu_internal.h"
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index a8502e02f479..9fac1ec03463 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -13,6 +13,7 @@
  *   Paolo Bonzini <pbonzini@redhat.com>
  *   Xiao Guangrong <guangrong.xiao@linux.intel.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include <asm/mtrr.h>
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 684393c22105..ddb818ac08c4 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -9,6 +9,7 @@
  *   Gleb Natapov <gleb@redhat.com>
  *   Wei Huang    <wei@redhat.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/types.h>
 #include <linux/kvm_host.h>
diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c
index a9c1c2af8d94..cc43638d48a3 100644
--- a/arch/x86/kvm/smm.c
+++ b/arch/x86/kvm/smm.c
@@ -1,4 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include "x86.h"
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 6919dee69f18..f52f5e0dd465 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -12,7 +12,7 @@
  *   Avi Kivity   <avi@qumranet.com>
  */
 
-#define pr_fmt(fmt) "SVM: " fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_types.h>
 #include <linux/hashtable.h>
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index bc9cd7086fa9..3bfbcb607d80 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -12,7 +12,7 @@
  *   Avi Kivity   <avi@qumranet.com>
  */
 
-#define pr_fmt(fmt) "SVM: " fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_types.h>
 #include <linux/kvm_host.h>
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 0e313fbae055..1ff068f23841 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -9,6 +9,8 @@
  *
  * Implementation is based on pmu_intel.c file
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/types.h>
 #include <linux/kvm_host.h>
 #include <linux/perf_event.h>
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 69dbf17f0d6a..339fb69f4b2d 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -6,6 +6,7 @@
  *
  * Copyright 2010 Red Hat, Inc. and/or its affiliates.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_types.h>
 #include <linux/kvm_host.h>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a875cf7b2942..ab53da3fbcd1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1,4 +1,4 @@
-#define pr_fmt(fmt) "SVM: " fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 
@@ -2076,7 +2076,7 @@ static void svm_handle_mce(struct kvm_vcpu *vcpu)
 		 * Erratum 383 triggered. Guest state is corrupt so kill the
 		 * guest.
 		 */
-		pr_err("KVM: Guest triggered AMD Erratum 383\n");
+		pr_err("Guest triggered AMD Erratum 383\n");
 
 		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
 
@@ -4623,7 +4623,7 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
 	smap = cr4 & X86_CR4_SMAP;
 	is_user = svm_get_cpl(vcpu) == 3;
 	if (smap && (!smep || is_user)) {
-		pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
+		pr_err_ratelimited("SEV Guest triggered AMD Erratum 1096\n");
 
 		/*
 		 * If the fault occurred in userspace, arbitrarily inject #GP
@@ -4972,7 +4972,7 @@ static __init int svm_hardware_setup(void)
 	}
 
 	if (nested) {
-		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
+		pr_info("Nested Virtualization enabled\n");
 		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
 	}
 
@@ -4990,7 +4990,7 @@ static __init int svm_hardware_setup(void)
 	/* Force VM NPT level equal to the host's paging level */
 	kvm_configure_mmu(npt_enabled, get_npt_level(),
 			  get_npt_level(), PG_LEVEL_1G);
-	pr_info("kvm: Nested Paging %sabled\n", npt_enabled ? "en" : "dis");
+	pr_info("Nested Paging %sabled\n", npt_enabled ? "en" : "dis");
 
 	/* Setup shadow_me_value and shadow_me_mask */
 	kvm_mmu_set_me_spte_mask(sme_me_mask, sme_me_mask);
diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
index 26a89d0da93e..7af8422d3382 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.c
+++ b/arch/x86/kvm/svm/svm_onhyperv.c
@@ -2,6 +2,7 @@
 /*
  * KVM L1 hypervisor optimizations on Hyper-V for SVM.
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 
diff --git a/arch/x86/kvm/svm/svm_onhyperv.h b/arch/x86/kvm/svm/svm_onhyperv.h
index 45faf84476ce..6981c1e9a809 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.h
+++ b/arch/x86/kvm/svm/svm_onhyperv.h
@@ -34,7 +34,7 @@ static inline void svm_hv_hardware_setup(void)
 {
 	if (npt_enabled &&
 	    ms_hyperv.nested_features & HV_X64_NESTED_ENLIGHTENED_TLB) {
-		pr_info("kvm: Hyper-V enlightened NPT TLB flush enabled\n");
+		pr_info(KBUILD_MODNAME ": Hyper-V enlightened NPT TLB flush enabled\n");
 		svm_x86_ops.tlb_remote_flush = hv_remote_flush_tlb;
 		svm_x86_ops.tlb_remote_flush_with_range =
 				hv_remote_flush_tlb_with_range;
@@ -43,7 +43,7 @@ static inline void svm_hv_hardware_setup(void)
 	if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH) {
 		int cpu;
 
-		pr_info("kvm: Hyper-V Direct TLB Flush enabled\n");
+		pr_info(KBUILD_MODNAME ": Hyper-V Direct TLB Flush enabled\n");
 		for_each_online_cpu(cpu) {
 			struct hv_vp_assist_page *vp_ap =
 				hv_get_vp_assist_page(cpu);
diff --git a/arch/x86/kvm/vmx/hyperv.c b/arch/x86/kvm/vmx/hyperv.c
index ae03d1fe0355..0d3a39059c64 100644
--- a/arch/x86/kvm/vmx/hyperv.c
+++ b/arch/x86/kvm/vmx/hyperv.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/errno.h>
 #include <linux/smp.h>
diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
index 571e7929d14e..6f0620a62dea 100644
--- a/arch/x86/kvm/vmx/hyperv.h
+++ b/arch/x86/kvm/vmx/hyperv.h
@@ -117,9 +117,7 @@ static __always_inline int get_evmcs_offset(unsigned long field,
 {
 	int offset = evmcs_field_offset(field, clean_field);
 
-	WARN_ONCE(offset < 0, "KVM: accessing unsupported EVMCS field %lx\n",
-		  field);
-
+	WARN_ONCE(offset < 0, "accessing unsupported EVMCS field %lx\n", field);
 	return offset;
 }
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b28be793de29..fbc1dbf294c1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/objtool.h>
 #include <linux/percpu.h>
@@ -203,7 +204,7 @@ static void nested_vmx_abort(struct kvm_vcpu *vcpu, u32 indicator)
 {
 	/* TODO: not to reset guest simply here. */
 	kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
-	pr_debug_ratelimited("kvm: nested vmx abort, indicator %d\n", indicator);
+	pr_debug_ratelimited("nested vmx abort, indicator %d\n", indicator);
 }
 
 static inline bool vmx_control_verify(u32 control, u32 low, u32 high)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e5cec07ca8d9..efce9ad70e4e 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -8,6 +8,8 @@
  *   Avi Kivity   <avi@redhat.com>
  *   Gleb Natapov <gleb@redhat.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/types.h>
 #include <linux/kvm_host.h>
 #include <linux/perf_event.h>
@@ -762,8 +764,7 @@ void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu)
 	return;
 
 warn:
-	pr_warn_ratelimited("kvm: vcpu-%d: fail to passthrough LBR.\n",
-		vcpu->vcpu_id);
+	pr_warn_ratelimited("vcpu-%d: fail to passthrough LBR.\n", vcpu->vcpu_id);
 }
 
 static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 1b56c5e5c9fb..94c38bea60e7 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -1,4 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/kvm_host.h>
 
 #include <asm/irq_remapping.h>
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 8f95c7c01433..a6ac83d4b6ad 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*  Copyright(c) 2021 Intel Corporation. */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <asm/sgx.h>
 
@@ -164,7 +165,7 @@ static int __handle_encls_ecreate(struct kvm_vcpu *vcpu,
 	if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
 	    (attributes & SGX_ATTR_PROVISIONKEY)) {
 		if (sgx_12_1->eax & SGX_ATTR_PROVISIONKEY)
-			pr_warn_once("KVM: SGX PROVISIONKEY advertised but not allowed\n");
+			pr_warn_once("SGX PROVISIONKEY advertised but not allowed\n");
 		kvm_inject_gp(vcpu, 0);
 		return 1;
 	}
@@ -379,7 +380,7 @@ int handle_encls(struct kvm_vcpu *vcpu)
 			return handle_encls_ecreate(vcpu);
 		if (leaf == EINIT)
 			return handle_encls_einit(vcpu);
-		WARN(1, "KVM: unexpected exit on ENCLS[%u]", leaf);
+		WARN_ONCE(1, "unexpected exit on ENCLS[%u]", leaf);
 		vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
 		vcpu->run->hw.hardware_exit_reason = EXIT_REASON_ENCLS;
 		return 0;
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index 2251b60920f8..106a72c923ca 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include "vmcs12.h"
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 229a9cf485f0..e859d2b7daa4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -12,6 +12,7 @@
  *   Avi Kivity   <avi@qumranet.com>
  *   Yaniv Kamay  <yaniv@qumranet.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/highmem.h>
 #include <linux/hrtimer.h>
@@ -444,36 +445,36 @@ void vmread_error(unsigned long field, bool fault)
 	if (fault)
 		kvm_spurious_fault();
 	else
-		vmx_insn_failed("kvm: vmread failed: field=%lx\n", field);
+		vmx_insn_failed("vmread failed: field=%lx\n", field);
 }
 
 noinline void vmwrite_error(unsigned long field, unsigned long value)
 {
-	vmx_insn_failed("kvm: vmwrite failed: field=%lx val=%lx err=%u\n",
+	vmx_insn_failed("vmwrite failed: field=%lx val=%lx err=%u\n",
 			field, value, vmcs_read32(VM_INSTRUCTION_ERROR));
 }
 
 noinline void vmclear_error(struct vmcs *vmcs, u64 phys_addr)
 {
-	vmx_insn_failed("kvm: vmclear failed: %p/%llx err=%u\n",
+	vmx_insn_failed("vmclear failed: %p/%llx err=%u\n",
 			vmcs, phys_addr, vmcs_read32(VM_INSTRUCTION_ERROR));
 }
 
 noinline void vmptrld_error(struct vmcs *vmcs, u64 phys_addr)
 {
-	vmx_insn_failed("kvm: vmptrld failed: %p/%llx err=%u\n",
+	vmx_insn_failed("vmptrld failed: %p/%llx err=%u\n",
 			vmcs, phys_addr, vmcs_read32(VM_INSTRUCTION_ERROR));
 }
 
 noinline void invvpid_error(unsigned long ext, u16 vpid, gva_t gva)
 {
-	vmx_insn_failed("kvm: invvpid failed: ext=0x%lx vpid=%u gva=0x%lx\n",
+	vmx_insn_failed("invvpid failed: ext=0x%lx vpid=%u gva=0x%lx\n",
 			ext, vpid, gva);
 }
 
 noinline void invept_error(unsigned long ext, u64 eptp, gpa_t gpa)
 {
-	vmx_insn_failed("kvm: invept failed: ext=0x%lx eptp=%llx gpa=0x%llx\n",
+	vmx_insn_failed("invept failed: ext=0x%lx eptp=%llx gpa=0x%llx\n",
 			ext, eptp, gpa);
 }
 
@@ -577,7 +578,7 @@ static __init void hv_init_evmcs(void)
 		}
 
 		if (enlightened_vmcs) {
-			pr_info("KVM: vmx: using Hyper-V Enlightened VMCS\n");
+			pr_info("Using Hyper-V Enlightened VMCS\n");
 			static_branch_enable(&enable_evmcs);
 		}
 
@@ -1678,8 +1679,8 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 		if (!instr_len)
 			goto rip_updated;
 
-		WARN(exit_reason.enclave_mode,
-		     "KVM: skipping instruction after SGX enclave VM-Exit");
+		WARN_ONCE(exit_reason.enclave_mode,
+			  "skipping instruction after SGX enclave VM-Exit");
 
 		orig_rip = kvm_rip_read(vcpu);
 		rip = orig_rip + instr_len;
@@ -2988,9 +2989,8 @@ static void fix_rmode_seg(int seg, struct kvm_segment *save)
 		var.type = 0x3;
 		var.avl = 0;
 		if (save->base & 0xf)
-			printk_once(KERN_WARNING "kvm: segment base is not "
-					"paragraph aligned when entering "
-					"protected mode (seg=%d)", seg);
+			pr_warn_once("segment base is not paragraph aligned "
+				     "when entering protected mode (seg=%d)", seg);
 	}
 
 	vmcs_write16(sf->selector, var.selector);
@@ -3020,8 +3020,7 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
 	 * vcpu. Warn the user that an update is overdue.
 	 */
 	if (!kvm_vmx->tss_addr)
-		printk_once(KERN_WARNING "kvm: KVM_SET_TSS_ADDR need to be "
-			     "called before entering vcpu\n");
+		pr_warn_once("KVM_SET_TSS_ADDR needs to be called before running vCPU\n");
 
 	vmx_segment_cache_clear(vmx);
 
@@ -6882,7 +6881,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
 	gate_desc *desc = (gate_desc *)host_idt_base + vector;
 
 	if (KVM_BUG(!is_external_intr(intr_info), vcpu->kvm,
-	    "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
+	    "unexpected VM-Exit interrupt info: 0x%x", intr_info))
 		return;
 
 	handle_interrupt_nmi_irqoff(vcpu, gate_offset(desc));
@@ -7487,7 +7486,7 @@ static int __init vmx_check_processor_compat(void)
 
 	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
 	    !this_cpu_has(X86_FEATURE_VMX)) {
-		pr_err("kvm: VMX is disabled on CPU %d\n", smp_processor_id());
+		pr_err("VMX is disabled on CPU %d\n", smp_processor_id());
 		return -EIO;
 	}
 
@@ -7496,8 +7495,7 @@ static int __init vmx_check_processor_compat(void)
 	if (nested)
 		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
 	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
-		printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
-				smp_processor_id());
+		pr_err("CPU %d feature inconsistency!\n", smp_processor_id());
 		return -EIO;
 	}
 	return 0;
@@ -8322,7 +8320,7 @@ static __init int hardware_setup(void)
 		return -EIO;
 
 	if (cpu_has_perf_global_ctrl_bug())
-		pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+		pr_warn_once("VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
 			     "does not work properly. Using workaround\n");
 
 	if (boot_cpu_has(X86_FEATURE_NX))
@@ -8330,7 +8328,7 @@ static __init int hardware_setup(void)
 
 	if (boot_cpu_has(X86_FEATURE_MPX)) {
 		rdmsrl(MSR_IA32_BNDCFGS, host_bndcfgs);
-		WARN_ONCE(host_bndcfgs, "KVM: BNDCFGS in host will be lost");
+		WARN_ONCE(host_bndcfgs, "BNDCFGS in host will be lost");
 	}
 
 	if (!cpu_has_vmx_mpx())
@@ -8349,7 +8347,7 @@ static __init int hardware_setup(void)
 
 	/* NX support is required for shadow paging. */
 	if (!enable_ept && !boot_cpu_has(X86_FEATURE_NX)) {
-		pr_err_ratelimited("kvm: NX (Execute Disable) not supported\n");
+		pr_err_ratelimited("NX (Execute Disable) not supported\n");
 		return -EOPNOTSUPP;
 	}
 
diff --git a/arch/x86/kvm/vmx/vmx_ops.h b/arch/x86/kvm/vmx/vmx_ops.h
index f6f23c7397dc..bc043f68d06b 100644
--- a/arch/x86/kvm/vmx/vmx_ops.h
+++ b/arch/x86/kvm/vmx/vmx_ops.h
@@ -86,8 +86,8 @@ static __always_inline unsigned long __vmcs_readl(unsigned long field)
 	return value;
 
 do_fail:
-	WARN_ONCE(1, "kvm: vmread failed: field=%lx\n", field);
-	pr_warn_ratelimited("kvm: vmread failed: field=%lx\n", field);
+	WARN_ONCE(1, KBUILD_MODNAME ": vmread failed: field=%lx\n", field);
+	pr_warn_ratelimited(KBUILD_MODNAME ": vmread failed: field=%lx\n", field);
 	return 0;
 
 do_exception:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 566156b34314..3d5455e08191 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -15,6 +15,7 @@
  *   Amit Shah    <amit.shah@qumranet.com>
  *   Ben-Ami Yassour <benami@il.ibm.com>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/kvm_host.h>
 #include "irq.h"
@@ -2093,7 +2094,7 @@ static int kvm_emulate_monitor_mwait(struct kvm_vcpu *vcpu, const char *insn)
 	    !guest_cpuid_has(vcpu, X86_FEATURE_MWAIT))
 		return kvm_handle_invalid_op(vcpu);
 
-	pr_warn_once("kvm: %s instruction emulated as NOP!\n", insn);
+	pr_warn_once("%s instruction emulated as NOP!\n", insn);
 	return kvm_emulate_as_nop(vcpu);
 }
 int kvm_emulate_mwait(struct kvm_vcpu *vcpu)
@@ -2442,7 +2443,8 @@ static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz)
 	thresh_lo = adjust_tsc_khz(tsc_khz, -tsc_tolerance_ppm);
 	thresh_hi = adjust_tsc_khz(tsc_khz, tsc_tolerance_ppm);
 	if (user_tsc_khz < thresh_lo || user_tsc_khz > thresh_hi) {
-		pr_debug("kvm: requested TSC rate %u falls outside tolerance [%u,%u]\n", user_tsc_khz, thresh_lo, thresh_hi);
+		pr_debug("requested TSC rate %u falls outside tolerance [%u,%u]\n",
+			 user_tsc_khz, thresh_lo, thresh_hi);
 		use_scaling = 1;
 	}
 	return set_tsc_khz(vcpu, user_tsc_khz, use_scaling);
@@ -7693,7 +7695,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
 	return X86EMUL_CONTINUE;
 
 emul_write:
-	printk_once(KERN_WARNING "kvm: emulating exchange as write\n");
+	pr_warn_once("emulating exchange as write\n");
 
 	return emulator_write_emulated(ctxt, addr, new, bytes, exception);
 }
@@ -8254,7 +8256,7 @@ static struct x86_emulate_ctxt *alloc_emulate_ctxt(struct kvm_vcpu *vcpu)
 
 	ctxt = kmem_cache_zalloc(x86_emulator_cache, GFP_KERNEL_ACCOUNT);
 	if (!ctxt) {
-		pr_err("kvm: failed to allocate vcpu's emulator\n");
+		pr_err("failed to allocate vcpu's emulator\n");
 		return NULL;
 	}
 
@@ -9309,17 +9311,17 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	int r, cpu;
 
 	if (kvm_x86_ops.hardware_enable) {
-		pr_err("kvm: already loaded vendor module '%s'\n", kvm_x86_ops.name);
+		pr_err("already loaded vendor module '%s'\n", kvm_x86_ops.name);
 		return -EEXIST;
 	}
 
 	if (!ops->cpu_has_kvm_support()) {
-		pr_err_ratelimited("kvm: no hardware support for '%s'\n",
+		pr_err_ratelimited("no hardware support for '%s'\n",
 				   ops->runtime_ops->name);
 		return -EOPNOTSUPP;
 	}
 	if (ops->disabled_by_bios()) {
-		pr_err_ratelimited("kvm: support for '%s' disabled by bios\n",
+		pr_err_ratelimited("support for '%s' disabled by bios\n",
 				   ops->runtime_ops->name);
 		return -EOPNOTSUPP;
 	}
@@ -9330,7 +9332,7 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	 * vCPU's FPU state as a fxregs_state struct.
 	 */
 	if (!boot_cpu_has(X86_FEATURE_FPU) || !boot_cpu_has(X86_FEATURE_FXSR)) {
-		printk(KERN_ERR "kvm: inadequate fpu\n");
+		pr_err("inadequate fpu\n");
 		return -EOPNOTSUPP;
 	}
 
@@ -9348,19 +9350,19 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	 */
 	if (rdmsrl_safe(MSR_IA32_CR_PAT, &host_pat) ||
 	    (host_pat & GENMASK(2, 0)) != 6) {
-		pr_err("kvm: host PAT[0] is not WB\n");
+		pr_err("host PAT[0] is not WB\n");
 		return -EIO;
 	}
 
 	x86_emulator_cache = kvm_alloc_emulator_cache();
 	if (!x86_emulator_cache) {
-		pr_err("kvm: failed to allocate cache for x86 emulator\n");
+		pr_err("failed to allocate cache for x86 emulator\n");
 		return -ENOMEM;
 	}
 
 	user_return_msrs = alloc_percpu(struct kvm_user_return_msrs);
 	if (!user_return_msrs) {
-		printk(KERN_ERR "kvm: failed to allocate percpu kvm_user_return_msrs\n");
+		pr_err("failed to allocate percpu kvm_user_return_msrs\n");
 		r = -ENOMEM;
 		goto out_free_x86_emulator_cache;
 	}
@@ -11634,7 +11636,7 @@ static int sync_regs(struct kvm_vcpu *vcpu)
 int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id)
 {
 	if (kvm_check_tsc_unstable() && kvm->created_vcpus)
-		pr_warn_once("kvm: SMP vm created on host with unstable TSC; "
+		pr_warn_once("SMP vm created on host with unstable TSC; "
 			     "guest TSC will not be reliable\n");
 
 	if (!kvm->arch.max_vcpu_ids)
@@ -11711,7 +11713,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 		goto free_wbinvd_dirty_mask;
 
 	if (!fpu_alloc_guest_fpstate(&vcpu->arch.guest_fpu)) {
-		pr_err("kvm: failed to allocate vcpu's fpu\n");
+		pr_err("failed to allocate vcpu's fpu\n");
 		goto free_emulate_ctxt;
 	}
 
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index b246decb53a9..3bf7d69373cf 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -5,6 +5,7 @@
  *
  * KVM Xen emulation
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include "x86.h"
 #include "xen.h"
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 35/50] KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (33 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 34/50] KVM: x86: Unify pr_fmt to use module name for all KVM modules Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02 12:18   ` Huang, Kai
  2022-11-30 23:09 ` [PATCH v2 36/50] KVM: x86: Do VMX/SVM support checks directly in vendor code Sean Christopherson
                   ` (16 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Use this_cpu_has() instead of boot_cpu_has() to perform the effective
"disabled by BIOS?" checks for VMX.  This will allow consolidating code
between vmx_disabled_by_bios() and vmx_check_processor_compat().

Checking the boot CPU isn't a strict requirement as any divergence in VMX
enabling between the boot CPU and other CPUs will result in KVM refusing
to load thanks to the aforementioned vmx_check_processor_compat().

Furthermore, using the boot CPU was an unintentional change introduced by
commit a4d0b2fdbcf7 ("KVM: VMX: Use VMX feature flag to query BIOS
enabling").  Prior to using the feature flags, KVM checked the raw MSR
value from the current CPU.

Reported-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
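A minimal userspace sketch of the difference described above, not kernel
code.  The feature bits, struct cpu_caps and both helpers are hypothetical
stand-ins for X86_FEATURE_*, the per-CPU capability data, boot_cpu_has() and
this_cpu_has(); index 0 is assumed to be the boot CPU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature bits; the real flags are X86_FEATURE_* in the kernel. */
enum { FEAT_MSR_FEAT_CTL = 1u << 0, FEAT_VMX = 1u << 1 };

/* Hypothetical per-CPU capability word; cpus[0] plays the boot CPU. */
struct cpu_caps {
	unsigned int flags;
};

static bool cpu_has(const struct cpu_caps *c, unsigned int feat)
{
	return (c->flags & feat) == feat;
}

/* Old behaviour: consult the boot CPU no matter which CPU runs the check. */
static bool vmx_disabled_boot(const struct cpu_caps *cpus, size_t this_cpu)
{
	(void)this_cpu;
	return !cpu_has(&cpus[0], FEAT_MSR_FEAT_CTL) ||
	       !cpu_has(&cpus[0], FEAT_VMX);
}

/* New behaviour: consult the CPU the check is actually running on. */
static bool vmx_disabled_this(const struct cpu_caps *cpus, size_t this_cpu)
{
	return !cpu_has(&cpus[this_cpu], FEAT_MSR_FEAT_CTL) ||
	       !cpu_has(&cpus[this_cpu], FEAT_VMX);
}
```

With a boot CPU that has both bits set and a second CPU that lacks the VMX
bit, the boot-CPU variant reports "not disabled" when run on the second CPU,
whereas the this-CPU variant reports "disabled".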
 arch/x86/kvm/vmx/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e859d2b7daa4..3f7d9f88b314 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2492,8 +2492,8 @@ static __init int cpu_has_kvm_support(void)
 
 static __init int vmx_disabled_by_bios(void)
 {
-	return !boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
-	       !boot_cpu_has(X86_FEATURE_VMX);
+	return !this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
+	       !this_cpu_has(X86_FEATURE_VMX);
 }
 
 static int kvm_cpu_vmxon(u64 vmxon_pointer)
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 36/50] KVM: x86: Do VMX/SVM support checks directly in vendor code
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (34 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 35/50] KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 37/50] KVM: VMX: Shuffle support checks and hardware enabling code around Sean Christopherson
                   ` (15 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Do basic VMX/SVM support checks directly in vendor code instead of
implementing them via kvm_x86_ops hooks.  Beyond the superficial benefit
of providing common messages, which isn't even clearly a net positive
since vendor code can provide more precise/detailed messages, there's
zero advantage to bouncing through common x86 code.

Consolidating the checks will also simplify performing the checks
across all CPUs (in a future patch).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
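A rough userspace model of the resulting init flow, assuming a simplified
view of the hardware.  struct hw_state, vendor_is_supported(),
common_vendor_init() and vendor_init() are illustrative names only; they
loosely mirror kvm_is_{vmx,svm}_supported(), kvm_x86_vendor_init() and
{vmx,svm}_init() but are not the kernel's definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical snapshot of what the vendor module can see about the CPU. */
struct hw_state {
	bool has_virt_ext;	/* CPUID advertises the extension */
	bool enabled_by_bios;	/* firmware left the extension enabled */
};

/* Vendor-owned check: a single helper with vendor-specific messages. */
static bool vendor_is_supported(const struct hw_state *hw)
{
	if (!hw->has_virt_ext) {
		fprintf(stderr, "CPU doesn't support the virtualization extension\n");
		return false;
	}

	if (!hw->enabled_by_bios) {
		fprintf(stderr, "extension not enabled (by BIOS)\n");
		return false;
	}

	return true;
}

/* Stand-in for common x86 init; it no longer performs support checks. */
static int common_vendor_init(void)
{
	return 0;
}

/* Vendor module init: bail before touching common code if unsupported. */
static int vendor_init(const struct hw_state *hw)
{
	if (!vendor_is_supported(hw))
		return -EOPNOTSUPP;

	return common_vendor_init();
}
```

The point of the shape is that the common code needs no function pointers
for the support checks, and the same helper can later be reused from the
per-CPU compatibility path.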
 arch/x86/include/asm/kvm_host.h |  2 --
 arch/x86/kvm/svm/svm.c          | 38 +++++++++++++++------------------
 arch/x86/kvm/vmx/vmx.c          | 37 +++++++++++++++++---------------
 arch/x86/kvm/x86.c              | 11 ----------
 4 files changed, 37 insertions(+), 51 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 04a9ae66fb8d..d79aedf70908 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1729,8 +1729,6 @@ struct kvm_x86_nested_ops {
 };
 
 struct kvm_x86_init_ops {
-	int (*cpu_has_kvm_support)(void);
-	int (*disabled_by_bios)(void);
 	int (*check_processor_compatibility)(void);
 	int (*hardware_setup)(void);
 	unsigned int (*handle_intel_pt_intr)(void);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ab53da3fbcd1..49ccef9fae81 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -519,21 +519,28 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
 		vcpu->arch.osvw.status |= 1;
 }
 
-static int has_svm(void)
+static bool kvm_is_svm_supported(void)
 {
 	const char *msg;
+	u64 vm_cr;
 
 	if (!cpu_has_svm(&msg)) {
-		printk(KERN_INFO "has_svm: %s\n", msg);
-		return 0;
+		pr_err("SVM not supported, %s\n", msg);
+		return false;
 	}
 
 	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
 		pr_info("KVM is unsupported when running as an SEV guest\n");
-		return 0;
+		return false;
 	}
 
-	return 1;
+	rdmsrl(MSR_VM_CR, vm_cr);
+	if (vm_cr & (1 << SVM_VM_CR_SVM_DISABLE)) {
+		pr_err("SVM disabled (by BIOS) in MSR_VM_CR\n");
+		return false;
+	}
+
+	return true;
 }
 
 void __svm_write_tsc_multiplier(u64 multiplier)
@@ -572,10 +579,9 @@ static int svm_hardware_enable(void)
 	if (efer & EFER_SVME)
 		return -EBUSY;
 
-	if (!has_svm()) {
-		pr_err("%s: err EOPNOTSUPP on %d\n", __func__, me);
+	if (!kvm_is_svm_supported())
 		return -EINVAL;
-	}
+
 	sd = per_cpu_ptr(&svm_data, me);
 	sd->asid_generation = 1;
 	sd->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
@@ -4070,17 +4076,6 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
 	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
 }
 
-static int is_disabled(void)
-{
-	u64 vm_cr;
-
-	rdmsrl(MSR_VM_CR, vm_cr);
-	if (vm_cr & (1 << SVM_VM_CR_SVM_DISABLE))
-		return 1;
-
-	return 0;
-}
-
 static void
 svm_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
 {
@@ -5080,8 +5075,6 @@ static __init int svm_hardware_setup(void)
 
 
 static struct kvm_x86_init_ops svm_init_ops __initdata = {
-	.cpu_has_kvm_support = has_svm,
-	.disabled_by_bios = is_disabled,
 	.hardware_setup = svm_hardware_setup,
 	.check_processor_compatibility = svm_check_processor_compat,
 
@@ -5095,6 +5088,9 @@ static int __init svm_init(void)
 
 	__unused_size_checks();
 
+	if (!kvm_is_svm_supported())
+		return -EOPNOTSUPP;
+
 	r = kvm_x86_vendor_init(&svm_init_ops);
 	if (r)
 		return r;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3f7d9f88b314..23b64bf4bfcf 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2485,17 +2485,6 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 	}
 }
 
-static __init int cpu_has_kvm_support(void)
-{
-	return cpu_has_vmx();
-}
-
-static __init int vmx_disabled_by_bios(void)
-{
-	return !this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
-	       !this_cpu_has(X86_FEATURE_VMX);
-}
-
 static int kvm_cpu_vmxon(u64 vmxon_pointer)
 {
 	u64 msr;
@@ -7479,16 +7468,29 @@ static int vmx_vm_init(struct kvm *kvm)
 	return 0;
 }
 
+static bool __init kvm_is_vmx_supported(void)
+{
+	if (!cpu_has_vmx()) {
+		pr_err("CPU doesn't support VMX\n");
+		return false;
+	}
+
+	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
+	    !this_cpu_has(X86_FEATURE_VMX)) {
+		pr_err("VMX not enabled (by BIOS) in MSR_IA32_FEAT_CTL\n");
+		return false;
+	}
+
+	return true;
+}
+
 static int __init vmx_check_processor_compat(void)
 {
 	struct vmcs_config vmcs_conf;
 	struct vmx_capability vmx_cap;
 
-	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
-	    !this_cpu_has(X86_FEATURE_VMX)) {
-		pr_err("VMX is disabled on CPU %d\n", smp_processor_id());
+	if (!kvm_is_vmx_supported())
 		return -EIO;
-	}
 
 	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
 		return -EIO;
@@ -8499,8 +8501,6 @@ static __init int hardware_setup(void)
 }
 
 static struct kvm_x86_init_ops vmx_init_ops __initdata = {
-	.cpu_has_kvm_support = cpu_has_kvm_support,
-	.disabled_by_bios = vmx_disabled_by_bios,
 	.check_processor_compatibility = vmx_check_processor_compat,
 	.hardware_setup = hardware_setup,
 	.handle_intel_pt_intr = NULL,
@@ -8543,6 +8543,9 @@ static int __init vmx_init(void)
 {
 	int r, cpu;
 
+	if (!kvm_is_vmx_supported())
+		return -EOPNOTSUPP;
+
 	/*
 	 * Note, hv_init_evmcs() touches only VMX knobs, i.e. there's nothing
 	 * to unwind if a later step fails.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3d5455e08191..5551f3552f08 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9315,17 +9315,6 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 		return -EEXIST;
 	}
 
-	if (!ops->cpu_has_kvm_support()) {
-		pr_err_ratelimited("no hardware support for '%s'\n",
-				   ops->runtime_ops->name);
-		return -EOPNOTSUPP;
-	}
-	if (ops->disabled_by_bios()) {
-		pr_err_ratelimited("support for '%s' disabled by bios\n",
-				   ops->runtime_ops->name);
-		return -EOPNOTSUPP;
-	}
-
 	/*
 	 * KVM explicitly assumes that the guest has an FPU and
 	 * FXSAVE/FXRSTOR. For example, the KVM_GET_FPU explicitly casts the
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 37/50] KVM: VMX: Shuffle support checks and hardware enabling code around
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (35 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 36/50] KVM: x86: Do VMX/SVM support checks directly in vendor code Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 38/50] KVM: SVM: Check for SVM support in CPU compatibility checks Sean Christopherson
                   ` (14 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Reorder code in vmx.c so that the VMX support check helpers reside above
the hardware enabling helpers, which will allow KVM to perform support
checks during hardware enabling (in a future patch).

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 216 ++++++++++++++++++++---------------------
 1 file changed, 108 insertions(+), 108 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 23b64bf4bfcf..2a8a6e481c76 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2485,79 +2485,6 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 	}
 }
 
-static int kvm_cpu_vmxon(u64 vmxon_pointer)
-{
-	u64 msr;
-
-	cr4_set_bits(X86_CR4_VMXE);
-
-	asm_volatile_goto("1: vmxon %[vmxon_pointer]\n\t"
-			  _ASM_EXTABLE(1b, %l[fault])
-			  : : [vmxon_pointer] "m"(vmxon_pointer)
-			  : : fault);
-	return 0;
-
-fault:
-	WARN_ONCE(1, "VMXON faulted, MSR_IA32_FEAT_CTL (0x3a) = 0x%llx\n",
-		  rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr) ? 0xdeadbeef : msr);
-	cr4_clear_bits(X86_CR4_VMXE);
-
-	return -EFAULT;
-}
-
-static int vmx_hardware_enable(void)
-{
-	int cpu = raw_smp_processor_id();
-	u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
-	int r;
-
-	if (cr4_read_shadow() & X86_CR4_VMXE)
-		return -EBUSY;
-
-	/*
-	 * This can happen if we hot-added a CPU but failed to allocate
-	 * VP assist page for it.
-	 */
-	if (static_branch_unlikely(&enable_evmcs) &&
-	    !hv_get_vp_assist_page(cpu))
-		return -EFAULT;
-
-	intel_pt_handle_vmx(1);
-
-	r = kvm_cpu_vmxon(phys_addr);
-	if (r) {
-		intel_pt_handle_vmx(0);
-		return r;
-	}
-
-	if (enable_ept)
-		ept_sync_global();
-
-	return 0;
-}
-
-static void vmclear_local_loaded_vmcss(void)
-{
-	int cpu = raw_smp_processor_id();
-	struct loaded_vmcs *v, *n;
-
-	list_for_each_entry_safe(v, n, &per_cpu(loaded_vmcss_on_cpu, cpu),
-				 loaded_vmcss_on_cpu_link)
-		__loaded_vmcs_clear(v);
-}
-
-static void vmx_hardware_disable(void)
-{
-	vmclear_local_loaded_vmcss();
-
-	if (cpu_vmxoff())
-		kvm_spurious_fault();
-
-	hv_reset_evmcs();
-
-	intel_pt_handle_vmx(0);
-}
-
 /*
  * There is no X86_FEATURE for SGX yet, but anyway we need to query CPUID
  * directly instead of going through cpu_has(), to ensure KVM is trapping
@@ -2783,6 +2710,114 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	return 0;
 }
 
+static bool __init kvm_is_vmx_supported(void)
+{
+	if (!cpu_has_vmx()) {
+		pr_err("CPU doesn't support VMX\n");
+		return false;
+	}
+
+	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
+	    !this_cpu_has(X86_FEATURE_VMX)) {
+		pr_err("VMX not enabled (by BIOS) in MSR_IA32_FEAT_CTL\n");
+		return false;
+	}
+
+	return true;
+}
+
+static int __init vmx_check_processor_compat(void)
+{
+	struct vmcs_config vmcs_conf;
+	struct vmx_capability vmx_cap;
+
+	if (!kvm_is_vmx_supported())
+		return -EIO;
+
+	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
+		return -EIO;
+	if (nested)
+		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
+	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
+		pr_err("CPU %d feature inconsistency!\n", smp_processor_id());
+		return -EIO;
+	}
+	return 0;
+}
+
+static int kvm_cpu_vmxon(u64 vmxon_pointer)
+{
+	u64 msr;
+
+	cr4_set_bits(X86_CR4_VMXE);
+
+	asm_volatile_goto("1: vmxon %[vmxon_pointer]\n\t"
+			  _ASM_EXTABLE(1b, %l[fault])
+			  : : [vmxon_pointer] "m"(vmxon_pointer)
+			  : : fault);
+	return 0;
+
+fault:
+	WARN_ONCE(1, "VMXON faulted, MSR_IA32_FEAT_CTL (0x3a) = 0x%llx\n",
+		  rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr) ? 0xdeadbeef : msr);
+	cr4_clear_bits(X86_CR4_VMXE);
+
+	return -EFAULT;
+}
+
+static int vmx_hardware_enable(void)
+{
+	int cpu = raw_smp_processor_id();
+	u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
+	int r;
+
+	if (cr4_read_shadow() & X86_CR4_VMXE)
+		return -EBUSY;
+
+	/*
+	 * This can happen if we hot-added a CPU but failed to allocate
+	 * VP assist page for it.
+	 */
+	if (static_branch_unlikely(&enable_evmcs) &&
+	    !hv_get_vp_assist_page(cpu))
+		return -EFAULT;
+
+	intel_pt_handle_vmx(1);
+
+	r = kvm_cpu_vmxon(phys_addr);
+	if (r) {
+		intel_pt_handle_vmx(0);
+		return r;
+	}
+
+	if (enable_ept)
+		ept_sync_global();
+
+	return 0;
+}
+
+static void vmclear_local_loaded_vmcss(void)
+{
+	int cpu = raw_smp_processor_id();
+	struct loaded_vmcs *v, *n;
+
+	list_for_each_entry_safe(v, n, &per_cpu(loaded_vmcss_on_cpu, cpu),
+				 loaded_vmcss_on_cpu_link)
+		__loaded_vmcs_clear(v);
+}
+
+static void vmx_hardware_disable(void)
+{
+	vmclear_local_loaded_vmcss();
+
+	if (cpu_vmxoff())
+		kvm_spurious_fault();
+
+	hv_reset_evmcs();
+
+	intel_pt_handle_vmx(0);
+}
+
 struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
 {
 	int node = cpu_to_node(cpu);
@@ -7468,41 +7503,6 @@ static int vmx_vm_init(struct kvm *kvm)
 	return 0;
 }
 
-static bool __init kvm_is_vmx_supported(void)
-{
-	if (!cpu_has_vmx()) {
-		pr_err("CPU doesn't support VMX\n");
-		return false;
-	}
-
-	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
-	    !this_cpu_has(X86_FEATURE_VMX)) {
-		pr_err("VMX not enabled (by BIOS) in MSR_IA32_FEAT_CTL\n");
-		return false;
-	}
-
-	return true;
-}
-
-static int __init vmx_check_processor_compat(void)
-{
-	struct vmcs_config vmcs_conf;
-	struct vmx_capability vmx_cap;
-
-	if (!kvm_is_vmx_supported())
-		return -EIO;
-
-	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
-		return -EIO;
-	if (nested)
-		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
-	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
-		pr_err("CPU %d feature inconsistency!\n", smp_processor_id());
-		return -EIO;
-	}
-	return 0;
-}
-
 static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 {
 	u8 cache;
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 38/50] KVM: SVM: Check for SVM support in CPU compatibility checks
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (36 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 37/50] KVM: VMX: Shuffle support checks and hardware enabling code around Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops) Sean Christopherson
                   ` (13 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Check that SVM is supported and enabled in the processor compatibility
checks.  SVM already checks for support during hardware enabling,
i.e. this doesn't really add new functionality.  The net effect is that
KVM will refuse to load if a CPU doesn't have SVM fully enabled, as
opposed to loading successfully and then failing KVM_CREATE_VM.

Opportunistically move svm_check_processor_compat() up in svm.c so that
it can be invoked during hardware enabling in a future patch.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 49ccef9fae81..9f94efcb9aa6 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -543,6 +543,14 @@ static bool kvm_is_svm_supported(void)
 	return true;
 }
 
+static int __init svm_check_processor_compat(void)
+{
+	if (!kvm_is_svm_supported())
+		return -EIO;
+
+	return 0;
+}
+
 void __svm_write_tsc_multiplier(u64 multiplier)
 {
 	preempt_disable();
@@ -4087,11 +4095,6 @@ svm_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
 	hypercall[2] = 0xd9;
 }
 
-static int __init svm_check_processor_compat(void)
-{
-	return 0;
-}
-
 /*
  * The kvm parameter can be NULL (module initialization, or invocation before
  * VM creation). Be sure to check the kvm parameter before using it.
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops)
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (37 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 38/50] KVM: SVM: Check for SVM support in CPU compatibility checks Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02 13:01   ` Huang, Kai
  2022-12-05 21:04   ` Isaku Yamahata
  2022-11-30 23:09 ` [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU Sean Christopherson
                   ` (12 subsequent siblings)
  51 siblings, 2 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)

Move the .check_processor_compatibility() callback from kvm_x86_init_ops
to kvm_x86_ops to allow a future patch to do compatibility checks during
CPU hotplug.

Do kvm_ops_update() before compat checks so that static_call() can be
used during compat checks.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  3 ++-
 arch/x86/kvm/svm/svm.c             |  5 +++--
 arch/x86/kvm/vmx/vmx.c             | 16 +++++++--------
 arch/x86/kvm/x86.c                 | 31 +++++++++++-------------------
 5 files changed, 25 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index abccd51dcfca..dba2909e5ae2 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -14,6 +14,7 @@ BUILD_BUG_ON(1)
  * to make a definition optional, but in this case the default will
  * be __static_call_return0.
  */
+KVM_X86_OP(check_processor_compatibility)
 KVM_X86_OP(hardware_enable)
 KVM_X86_OP(hardware_disable)
 KVM_X86_OP(hardware_unsetup)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d79aedf70908..ba74fea6850b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1518,6 +1518,8 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
 struct kvm_x86_ops {
 	const char *name;
 
+	int (*check_processor_compatibility)(void);
+
 	int (*hardware_enable)(void);
 	void (*hardware_disable)(void);
 	void (*hardware_unsetup)(void);
@@ -1729,7 +1731,6 @@ struct kvm_x86_nested_ops {
 };
 
 struct kvm_x86_init_ops {
-	int (*check_processor_compatibility)(void);
 	int (*hardware_setup)(void);
 	unsigned int (*handle_intel_pt_intr)(void);
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9f94efcb9aa6..c2e95c0d9fd8 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -543,7 +543,7 @@ static bool kvm_is_svm_supported(void)
 	return true;
 }
 
-static int __init svm_check_processor_compat(void)
+static int svm_check_processor_compat(void)
 {
 	if (!kvm_is_svm_supported())
 		return -EIO;
@@ -4695,6 +4695,8 @@ static int svm_vm_init(struct kvm *kvm)
 static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
+	.check_processor_compatibility = svm_check_processor_compat,
+
 	.hardware_unsetup = svm_hardware_unsetup,
 	.hardware_enable = svm_hardware_enable,
 	.hardware_disable = svm_hardware_disable,
@@ -5079,7 +5081,6 @@ static __init int svm_hardware_setup(void)
 
 static struct kvm_x86_init_ops svm_init_ops __initdata = {
 	.hardware_setup = svm_hardware_setup,
-	.check_processor_compatibility = svm_check_processor_compat,
 
 	.runtime_ops = &svm_x86_ops,
 	.pmu_ops = &amd_pmu_ops,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2a8a6e481c76..6416ed5b7f89 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2520,8 +2520,7 @@ static bool cpu_has_perf_global_ctrl_bug(void)
 	return false;
 }
 
-static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
-				      u32 msr, u32 *result)
+static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt, u32 msr, u32 *result)
 {
 	u32 vmx_msr_low, vmx_msr_high;
 	u32 ctl = ctl_min | ctl_opt;
@@ -2539,7 +2538,7 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 	return 0;
 }
 
-static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
+static u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
 {
 	u64 allowed;
 
@@ -2548,8 +2547,8 @@ static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
 	return  ctl_opt & allowed;
 }
 
-static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
-				    struct vmx_capability *vmx_cap)
+static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
+			     struct vmx_capability *vmx_cap)
 {
 	u32 vmx_msr_low, vmx_msr_high;
 	u32 _pin_based_exec_control = 0;
@@ -2710,7 +2709,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	return 0;
 }
 
-static bool __init kvm_is_vmx_supported(void)
+static bool kvm_is_vmx_supported(void)
 {
 	if (!cpu_has_vmx()) {
 		pr_err("CPU doesn't support VMX\n");
@@ -2726,7 +2725,7 @@ static bool __init kvm_is_vmx_supported(void)
 	return true;
 }
 
-static int __init vmx_check_processor_compat(void)
+static int vmx_check_processor_compat(void)
 {
 	struct vmcs_config vmcs_conf;
 	struct vmx_capability vmx_cap;
@@ -8104,6 +8103,8 @@ static void vmx_vm_destroy(struct kvm *kvm)
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
+	.check_processor_compatibility = vmx_check_processor_compat,
+
 	.hardware_unsetup = vmx_hardware_unsetup,
 
 	.hardware_enable = vmx_hardware_enable,
@@ -8501,7 +8502,6 @@ static __init int hardware_setup(void)
 }
 
 static struct kvm_x86_init_ops vmx_init_ops __initdata = {
-	.check_processor_compatibility = vmx_check_processor_compat,
 	.hardware_setup = hardware_setup,
 	.handle_intel_pt_intr = NULL,
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5551f3552f08..ee9af412ffd4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9279,12 +9279,7 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
 	kvm_pmu_ops_update(ops->pmu_ops);
 }
 
-struct kvm_cpu_compat_check {
-	struct kvm_x86_init_ops *ops;
-	int *ret;
-};
-
-static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
+static int kvm_x86_check_processor_compatibility(void)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
 
@@ -9294,19 +9289,16 @@ static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
 		return -EIO;
 
-	return ops->check_processor_compatibility();
+	return static_call(kvm_x86_check_processor_compatibility)();
 }
 
-static void kvm_x86_check_cpu_compat(void *data)
+static void kvm_x86_check_cpu_compat(void *ret)
 {
-	struct kvm_cpu_compat_check *c = data;
-
-	*c->ret = kvm_x86_check_processor_compatibility(c->ops);
+	*(int *)ret = kvm_x86_check_processor_compatibility();
 }
 
 static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 {
-	struct kvm_cpu_compat_check c;
 	u64 host_pat;
 	int r, cpu;
 
@@ -9377,12 +9369,12 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	if (r != 0)
 		goto out_mmu_exit;
 
-	c.ret = &r;
-	c.ops = ops;
+	kvm_ops_update(ops);
+
 	for_each_online_cpu(cpu) {
-		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &c, 1);
+		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &r, 1);
 		if (r < 0)
-			goto out_hardware_unsetup;
+			goto out_unwind_ops;
 	}
 
 	/*
@@ -9390,8 +9382,6 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	 * absolutely necessary, as most operations from this point forward
 	 * require unwinding.
 	 */
-	kvm_ops_update(ops);
-
 	kvm_timer_init();
 
 	if (pi_inject_timer == -1)
@@ -9427,8 +9417,9 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	kvm_init_msr_list();
 	return 0;
 
-out_hardware_unsetup:
-	ops->runtime_ops->hardware_unsetup();
+out_unwind_ops:
+	kvm_x86_ops.hardware_enable = NULL;
+	static_call(kvm_x86_hardware_unsetup)();
 out_mmu_exit:
 	kvm_mmu_vendor_module_exit();
 out_free_percpu:
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (38 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops) Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02 13:03   ` Huang, Kai
  2022-12-02 13:36   ` Huang, Kai
  2022-11-30 23:09 ` [PATCH v2 41/50] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Sean Christopherson
                   ` (11 subsequent siblings)
  51 siblings, 2 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)

From: Chao Gao <chao.gao@intel.com>

Do compatibility checks when enabling hardware to effectively add
compatibility checks when onlining a CPU.  Abort enabling, i.e. the
online process, if the (hotplugged) CPU is incompatible with the known
good setup.

At init time, KVM does compatibility checks to ensure that all online
CPUs support hardware virtualization and a common set of features. But
KVM uses hotplugged CPUs without such compatibility checks. On Intel
CPUs, this leads to #GP if the hotplugged CPU doesn't support VMX, or
VM-Entry failure if the hotplugged CPU doesn't support all features
enabled by KVM.

Note, this is little more than a NOP on SVM, as SVM already checks for
full SVM support during hardware enabling.

Opportunistically add a pr_err() if setup_vmcs_config() fails, and
tweak all error messages to output which CPU failed.
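
As a rough userspace illustration of the consistency check described above
(not part of the patch), the sketch below compares a per-CPU configuration
against a reference captured at load time and reports which CPU failed.
The struct, its fields, the values, and the demo_* names are all
hypothetical stand-ins; the real code compares struct vmcs_config.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct vmcs_config; the fields are illustrative. */
struct demo_config {
	unsigned int pin_based_ctls;
	unsigned int cpu_based_ctls;
	unsigned int vmexit_ctls;
	unsigned int vmentry_ctls;
};

/* Models the "known good" setup captured at load time (arbitrary values). */
static const struct demo_config demo_reference = {
	.pin_based_ctls = 0x16,
	.cpu_based_ctls = 0x401e172,
	.vmexit_ctls    = 0x36dff,
	.vmentry_ctls   = 0x11ff,
};

/*
 * Return 0 if the config computed on @cpu matches the reference, -EIO
 * otherwise.  memcmp() is only safe here because the struct has no padding
 * (four members of the same type).
 */
static int demo_check_processor_compat(int cpu, const struct demo_config *conf)
{
	if (memcmp(&demo_reference, conf, sizeof(*conf))) {
		fprintf(stderr, "Inconsistent config on CPU %d\n", cpu);
		return -EIO;
	}
	return 0;
}
```

A caller would abort enabling on any non-zero return, which is what turns
a mismatched hotplugged CPU into a failed online attempt instead of a
later VM-Entry failure.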

Signed-off-by: Chao Gao <chao.gao@intel.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c |  8 +++-----
 arch/x86/kvm/vmx/vmx.c | 15 ++++++++++-----
 arch/x86/kvm/x86.c     |  5 +++++
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c2e95c0d9fd8..46b658d0f46e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -521,11 +521,12 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
 
 static bool kvm_is_svm_supported(void)
 {
+	int cpu = raw_smp_processor_id();
 	const char *msg;
 	u64 vm_cr;
 
 	if (!cpu_has_svm(&msg)) {
-		pr_err("SVM not supported, %s\n", msg);
+		pr_err("SVM not supported by CPU %d, %s\n", cpu, msg);
 		return false;
 	}
 
@@ -536,7 +537,7 @@ static bool kvm_is_svm_supported(void)
 
 	rdmsrl(MSR_VM_CR, vm_cr);
 	if (vm_cr & (1 << SVM_VM_CR_SVM_DISABLE)) {
-		pr_err("SVM disabled (by BIOS) in MSR_VM_CR\n");
+		pr_err("SVM disabled (by BIOS) in MSR_VM_CR on CPU %d\n", cpu);
 		return false;
 	}
 
@@ -587,9 +588,6 @@ static int svm_hardware_enable(void)
 	if (efer & EFER_SVME)
 		return -EBUSY;
 
-	if (!kvm_is_svm_supported())
-		return -EINVAL;
-
 	sd = per_cpu_ptr(&svm_data, me);
 	sd->asid_generation = 1;
 	sd->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6416ed5b7f89..39dd3082fcd8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2711,14 +2711,16 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 
 static bool kvm_is_vmx_supported(void)
 {
+	int cpu = raw_smp_processor_id();
+
 	if (!cpu_has_vmx()) {
-		pr_err("CPU doesn't support VMX\n");
+		pr_err("VMX not supported by CPU %d\n", cpu);
 		return false;
 	}
 
 	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
 	    !this_cpu_has(X86_FEATURE_VMX)) {
-		pr_err("VMX not enabled (by BIOS) in MSR_IA32_FEAT_CTL\n");
+		pr_err("VMX not enabled (by BIOS) in MSR_IA32_FEAT_CTL on CPU %d\n", cpu);
 		return false;
 	}
 
@@ -2727,18 +2729,21 @@ static bool kvm_is_vmx_supported(void)
 
 static int vmx_check_processor_compat(void)
 {
+	int cpu = raw_smp_processor_id();
 	struct vmcs_config vmcs_conf;
 	struct vmx_capability vmx_cap;
 
 	if (!kvm_is_vmx_supported())
 		return -EIO;
 
-	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
+	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0) {
+		pr_err("Failed to setup VMCS config on CPU %d\n", cpu);
 		return -EIO;
+	}
 	if (nested)
 		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
-	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
-		pr_err("CPU %d feature inconsistency!\n", smp_processor_id());
+	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config))) {
+		pr_err("Inconsistent VMCS config on CPU %d\n", cpu);
 		return -EIO;
 	}
 	return 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ee9af412ffd4..5a9e74cedbc6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11967,6 +11967,11 @@ int kvm_arch_hardware_enable(void)
 	bool stable, backwards_tsc = false;
 
 	kvm_user_return_msr_cpu_online();
+
+	ret = kvm_x86_check_processor_compatibility();
+	if (ret)
+		return ret;
+
 	ret = static_call(kvm_x86_hardware_enable)();
 	if (ret != 0)
 		return ret;
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 41/50] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (39 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02 13:06   ` Huang, Kai
  2022-11-30 23:09 ` [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Sean Christopherson
                   ` (10 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)

From: Chao Gao <chao.gao@intel.com>

The CPU STARTING section doesn't allow callbacks to fail. Move KVM's
hotplug callback to ONLINE section so that it can abort onlining a CPU in
certain cases to avoid potentially breaking VMs running on existing CPUs.
For example, when KVM fails to enable hardware virtualization on the
hotplugged CPU.

Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures
when offlining a CPU, all user tasks and non-pinned kernel tasks have left
the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's
CPU offline callback to disable hardware virtualization at that point.
Likewise, KVM's online callback can enable hardware virtualization before
any vCPU task gets a chance to run on hotplugged CPUs.

Drop kvm_x86_check_processor_compatibility()'s WARN that IRQs are
disabled, as the ONLINE section runs with IRQs enabled.  The WARN wasn't
intended to be a requirement, e.g. disabling preemption is sufficient;
the IRQ check was purely an aggressive sanity check since the helper was
only ever invoked via SMP function call.

Rename KVM's CPU hotplug callbacks accordingly.
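
For illustration only, here is a minimal userspace model of an online
callback that is allowed to fail.  It is a simplification under stated
assumptions: the demo_* names and the per-CPU "broken" flag are
hypothetical, and the real kvm_online_cpu() reports failure through the
hardware_enable_failed atomic, not a direct return value from the enable
helper.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_NR_CPUS 4

static int demo_usage_count;                   /* models kvm_usage_count */
static bool demo_hw_enabled[DEMO_NR_CPUS];     /* models cpus_hardware_enabled */
static bool demo_cpu_is_broken[DEMO_NR_CPUS];  /* hypothetical failure injection */

/* Pretend to enable virtualization on @cpu; fail for "broken" CPUs. */
static int demo_hardware_enable(unsigned int cpu)
{
	if (demo_cpu_is_broken[cpu])
		return -EIO;

	demo_hw_enabled[cpu] = true;
	return 0;
}

/*
 * Models an ONLINE-section callback: a non-zero return aborts onlining.
 * Hardware is only touched if at least one VM exists.
 */
static int demo_online_cpu(unsigned int cpu)
{
	if (!demo_usage_count)
		return 0;

	return demo_hardware_enable(cpu);
}
```

With no VMs the callback succeeds even on a CPU that could not enable
virtualization; once the usage count is non-zero, the same CPU is refused.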

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
[sean: drop WARN that IRQs are disabled]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c         |  2 --
 include/linux/cpuhotplug.h |  2 +-
 virt/kvm/kvm_main.c        | 30 ++++++++++++++++++++++--------
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a9e74cedbc6..dad30097f0c3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9283,8 +9283,6 @@ static int kvm_x86_check_processor_compatibility(void)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
 
-	WARN_ON(!irqs_disabled());
-
 	if (__cr4_reserved_bits(cpu_has, c) !=
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
 		return -EIO;
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 7337414e4947..de45be38dd27 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -185,7 +185,6 @@ enum cpuhp_state {
 	CPUHP_AP_CSKY_TIMER_STARTING,
 	CPUHP_AP_TI_GP_TIMER_STARTING,
 	CPUHP_AP_HYPERV_TIMER_STARTING,
-	CPUHP_AP_KVM_STARTING,
 	/* Must be the last timer callback */
 	CPUHP_AP_DUMMY_TIMER_STARTING,
 	CPUHP_AP_ARM_XEN_STARTING,
@@ -200,6 +199,7 @@ enum cpuhp_state {
 
 	/* Online section invoked on the hotplugged CPU from the hotplug thread */
 	CPUHP_AP_ONLINE_IDLE,
+	CPUHP_AP_KVM_ONLINE,
 	CPUHP_AP_SCHED_WAIT_EMPTY,
 	CPUHP_AP_SMPBOOT_THREADS,
 	CPUHP_AP_X86_VDSO_VMA_ONLINE,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3900bd3d75cb..f26ea779710a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5045,13 +5045,27 @@ static void hardware_enable_nolock(void *junk)
 	}
 }
 
-static int kvm_starting_cpu(unsigned int cpu)
+static int kvm_online_cpu(unsigned int cpu)
 {
+	int ret = 0;
+
+	/*
+	 * Abort the CPU online process if hardware virtualization cannot
+	 * be enabled. Otherwise running VMs would encounter unrecoverable
+	 * errors when scheduled to this CPU.
+	 */
 	raw_spin_lock(&kvm_count_lock);
-	if (kvm_usage_count)
+	if (kvm_usage_count) {
+		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
+
 		hardware_enable_nolock(NULL);
+		if (atomic_read(&hardware_enable_failed)) {
+			atomic_set(&hardware_enable_failed, 0);
+			ret = -EIO;
+		}
+	}
 	raw_spin_unlock(&kvm_count_lock);
-	return 0;
+	return ret;
 }
 
 static void hardware_disable_nolock(void *junk)
@@ -5064,7 +5078,7 @@ static void hardware_disable_nolock(void *junk)
 	kvm_arch_hardware_disable();
 }
 
-static int kvm_dying_cpu(unsigned int cpu)
+static int kvm_offline_cpu(unsigned int cpu)
 {
 	raw_spin_lock(&kvm_count_lock);
 	if (kvm_usage_count)
@@ -5841,8 +5855,8 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL))
 		return -ENOMEM;
 
-	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
-				      kvm_starting_cpu, kvm_dying_cpu);
+	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
+				      kvm_online_cpu, kvm_offline_cpu);
 	if (r)
 		goto out_free_2;
 	register_reboot_notifier(&kvm_reboot_notifier);
@@ -5916,7 +5930,7 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 	kmem_cache_destroy(kvm_vcpu_cache);
 out_free_3:
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 out_free_2:
 	free_cpumask_var(cpus_hardware_enabled);
 	return r;
@@ -5942,7 +5956,7 @@ void kvm_exit(void)
 	kvm_async_pf_deinit();
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING);
+	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_irqfd_exit();
 	free_cpumask_var(cpus_hardware_enabled);
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (40 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 41/50] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02 12:59   ` Huang, Kai
  2022-11-30 23:09 ` [PATCH v2 43/50] KVM: Ensure CPU is stable during low level hardware enable/disable Sean Christopherson
                   ` (9 subsequent siblings)
  51 siblings, 1 reply; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)

From: Chao Gao <chao.gao@intel.com>

Disable CPU hotplug when enabling/disabling hardware to prevent the
corner case where, if the following sequence occurs:

  1. A hotplugged CPU marks itself online in cpu_online_mask
  2. The hotplugged CPU enables interrupts before invoking KVM's ONLINE
     callback
  3. hardware_{en,dis}able_all() is invoked on another CPU

the hotplugged CPU will be included in on_each_cpu() and thus get sent
through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called.

        start_secondary { ...
                set_cpu_online(smp_processor_id(), true); <- 1
                ...
                local_irq_enable();  <- 2
                ...
                cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); <- 3
        }

KVM currently fudges around this race by keeping track of which CPUs have
done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which
cpus have virtualization enabled"), but that's an inefficient, convoluted,
and hacky solution.
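
The ordering problem can be modeled deterministically in userspace.  The
sketch below is an illustration only: every demo_* name is hypothetical,
the cpus_hardware_enabled bookkeeping is deliberately left out so that the
double enable is visible, and "blocking" on the hotplug lock is modeled as
an immediate failure where the real cpus_read_lock() would sleep until
hotplug completes.

```c
#include <stdbool.h>

#define DEMO_NR_CPUS 4

static bool demo_online[DEMO_NR_CPUS];       /* models cpu_online_mask */
static int demo_enable_calls[DEMO_NR_CPUS];  /* enable attempts per CPU */
static int demo_usage_count;                 /* models kvm_usage_count */
static bool demo_hotplug_active;             /* a hotplug operation is in flight */

/* Start from a state where only CPU 0 is online and no VMs exist. */
static void demo_reset(void)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		demo_online[cpu] = (cpu == 0);
		demo_enable_calls[cpu] = 0;
	}
	demo_usage_count = 0;
	demo_hotplug_active = false;
}

/* Steps 1 and 2: the CPU is in the online mask, KVM's callback has not run. */
static void demo_hotplug_begin(int cpu)
{
	demo_hotplug_active = true;
	demo_online[cpu] = true;
}

/* Models the ONLINE callback: enable hardware only if VMs exist. */
static void demo_hotplug_finish(int cpu)
{
	if (demo_usage_count)
		demo_enable_calls[cpu]++;
	demo_hotplug_active = false;
}

/*
 * Step 3: enable hardware on every online CPU.  Returns false if the
 * caller holds off because hotplug is in flight.
 */
static bool demo_hardware_enable_all(bool take_hotplug_lock)
{
	if (take_hotplug_lock && demo_hotplug_active)
		return false;

	demo_usage_count++;
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (demo_online[cpu])
			demo_enable_calls[cpu]++;
	return true;
}
```

Without the lock, a CPU caught between demo_hotplug_begin() and
demo_hotplug_finish() sees two enable attempts; with it, the enable-all
path waits and the CPU is enabled exactly once.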

Signed-off-by: Chao Gao <chao.gao@intel.com>
[sean: split to separate patch, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c  | 11 ++++++++++-
 virt/kvm/kvm_main.c | 12 ++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dad30097f0c3..d2ad383da998 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9281,7 +9281,16 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
 
 static int kvm_x86_check_processor_compatibility(void)
 {
-	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
+	int cpu = smp_processor_id();
+	struct cpuinfo_x86 *c = &cpu_data(cpu);
+
+	/*
+	 * Compatibility checks are done when loading KVM and when enabling
+	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
+	 * compatible, i.e. KVM should never perform a compatibility check on
+	 * an offline CPU.
+	 */
+	WARN_ON(!cpu_online(cpu));
 
 	if (__cr4_reserved_bits(cpu_has, c) !=
 	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f26ea779710a..d985b24c423b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5098,15 +5098,26 @@ static void hardware_disable_all_nolock(void)
 
 static void hardware_disable_all(void)
 {
+	cpus_read_lock();
 	raw_spin_lock(&kvm_count_lock);
 	hardware_disable_all_nolock();
 	raw_spin_unlock(&kvm_count_lock);
+	cpus_read_unlock();
 }
 
 static int hardware_enable_all(void)
 {
 	int r = 0;
 
+	/*
+	 * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
+	 * is called, and so on_each_cpu() between them includes the CPU that
+	 * is being onlined.  As a result, hardware_enable_nolock() may get
+	 * invoked before kvm_online_cpu(), which also enables hardware if the
+	 * usage count is non-zero.  Disable CPU hotplug to avoid attempting to
+	 * enable hardware multiple times.
+	 */
+	cpus_read_lock();
 	raw_spin_lock(&kvm_count_lock);
 
 	kvm_usage_count++;
@@ -5121,6 +5132,7 @@ static int hardware_enable_all(void)
 	}
 
 	raw_spin_unlock(&kvm_count_lock);
+	cpus_read_unlock();
 
 	return r;
 }
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 43/50] KVM: Ensure CPU is stable during low level hardware enable/disable
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (41 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 44/50] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock Sean Christopherson
                   ` (8 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)

Use the non-raw smp_processor_id() in the low level hardware
enable/disable helpers as KVM absolutely relies on the CPU being stable,
e.g. KVM would end up with incorrect state if the task were migrated
between accessing cpus_hardware_enabled and actually enabling/disabling
hardware.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d985b24c423b..a46d61e9c053 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5028,7 +5028,7 @@ static struct miscdevice kvm_dev = {
 
 static void hardware_enable_nolock(void *junk)
 {
-	int cpu = raw_smp_processor_id();
+	int cpu = smp_processor_id();
 	int r;
 
 	if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
@@ -5070,7 +5070,7 @@ static int kvm_online_cpu(unsigned int cpu)
 
 static void hardware_disable_nolock(void *junk)
 {
-	int cpu = raw_smp_processor_id();
+	int cpu = smp_processor_id();
 
 	if (!cpumask_test_cpu(cpu, cpus_hardware_enabled))
 		return;
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 44/50] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (42 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 43/50] KVM: Ensure CPU is stable during low level hardware enable/disable Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 45/50] KVM: Remove on_each_cpu(hardware_disable_nolock) in kvm_exit() Sean Christopherson
                   ` (7 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)

From: Isaku Yamahata <isaku.yamahata@intel.com>

Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock now
that KVM hooks CPU hotplug during the ONLINE phase, which can sleep.
Previously, KVM hooked the STARTING phase, which is not allowed to sleep
and thus could not take kvm_lock (a mutex).  This effectively allows the
task that's initiating hardware enabling/disabling to be preempted and/or
migrated.

Note, the Documentation/virt/kvm/locking.rst statement that kvm_count_lock
is "raw" because hardware enabling/disabling needs to be atomic with
respect to migration is wrong on multiple fronts.  First, while regular
spinlocks can be preempted, the task holding the lock cannot be migrated.
Second, preventing migration is not required.  on_each_cpu() disables
preemption, which ensures that cpus_hardware_enabled correctly reflects
hardware state.  The task may be preempted/migrated between bumping
kvm_usage_count and invoking on_each_cpu(), but that's perfectly ok as
kvm_usage_count is still protected, e.g. other tasks that call
hardware_enable_all() will be blocked until the preempted/migrated owner
exits its critical section.

KVM does have lockless accesses to kvm_usage_count in the suspend/resume
flows, but those are safe because all tasks must be frozen prior to
suspending CPUs, and a task cannot be frozen while it holds one or more
locks (userspace tasks are frozen via a fake signal).

Preemption doesn't need to be explicitly disabled in the hotplug path.
The hotplug thread is pinned to the CPU that's being hotplugged, and KVM
only cares about having a stable CPU, i.e. to ensure hardware is enabled
on the correct CPU.  Lockdep, i.e. check_preemption_disabled(), plays nice
with this state too, as is_percpu_thread() is true for the hotplug thread.
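
For readers following along, the usage-count handling described above can
be modeled in userspace roughly as below.  This is an illustrative sketch
only, not code from the patch: a pthread mutex stands in for kvm_lock, all
usage_demo_* names are hypothetical, and CPU hotplug and on_each_cpu() are
deliberately left out.

```c
/*
 * Userspace analogy only.  The mutex plays the role of kvm_lock; the
 * "hardware" is a single bool instead of per-CPU virtualization state.
 */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t usage_demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int usage_demo_count;		/* protected by usage_demo_lock */
static bool usage_demo_hw_on;		/* protected by usage_demo_lock */

/* The first user turns the "hardware" on; later users only bump the count. */
static int usage_demo_enable_all(void)
{
	pthread_mutex_lock(&usage_demo_lock);
	if (++usage_demo_count == 1)
		usage_demo_hw_on = true;
	pthread_mutex_unlock(&usage_demo_lock);
	return 0;
}

/* The last user turns the "hardware" off. */
static void usage_demo_disable_all(void)
{
	pthread_mutex_lock(&usage_demo_lock);
	assert(usage_demo_count > 0);
	if (--usage_demo_count == 0)
		usage_demo_hw_on = false;
	pthread_mutex_unlock(&usage_demo_lock);
}
```

Because the lock is a sleeping lock, the holder may be preempted while the
count is elevated; other callers simply block on the mutex until the owner
leaves its critical section.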

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/locking.rst | 19 ++++++++--------
 virt/kvm/kvm_main.c                | 36 ++++++++++++++++++++----------
 2 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 132a9e5436e5..cd570e565522 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -9,6 +9,8 @@ KVM Lock Overview
 
 The acquisition orders for mutexes are as follows:
 
+- cpus_read_lock() is taken outside kvm_lock
+
 - kvm->lock is taken outside vcpu->mutex
 
 - kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock
@@ -216,15 +218,10 @@ time it will be set using the Dirty tracking mechanism described above.
 :Type:		mutex
 :Arch:		any
 :Protects:	- vm_list
-
-``kvm_count_lock``
-^^^^^^^^^^^^^^^^^^
-
-:Type:		raw_spinlock_t
-:Arch:		any
-:Protects:	- hardware virtualization enable/disable
-:Comment:	'raw' because hardware enabling/disabling must be atomic /wrt
-		migration.
+		- kvm_usage_count
+		- hardware virtualization enable/disable
+:Comment:	KVM also disables CPU hotplug via cpus_read_lock() during
+		enable/disable.
 
 ``kvm->mn_invalidate_lock``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -288,3 +285,7 @@ time it will be set using the Dirty tracking mechanism described above.
 :Type:		mutex
 :Arch:		x86
 :Protects:	loading a vendor module (kvm_amd or kvm_intel)
+:Comment:	Exists because using kvm_lock leads to deadlock.  cpu_hotplug_lock is
+    taken outside of kvm_lock, e.g. in KVM's CPU online/offline callbacks, and
+    many operations need to take cpu_hotplug_lock when loading a vendor module,
+    e.g. updating static calls.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a46d61e9c053..6a8fb53b32f0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -100,7 +100,6 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
  */
 
 DEFINE_MUTEX(kvm_lock);
-static DEFINE_RAW_SPINLOCK(kvm_count_lock);
 LIST_HEAD(vm_list);
 
 static cpumask_var_t cpus_hardware_enabled;
@@ -5054,17 +5053,18 @@ static int kvm_online_cpu(unsigned int cpu)
 	 * be enabled. Otherwise running VMs would encounter unrecoverable
 	 * errors when scheduled to this CPU.
 	 */
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 	if (kvm_usage_count) {
 		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
 
 		hardware_enable_nolock(NULL);
+
 		if (atomic_read(&hardware_enable_failed)) {
 			atomic_set(&hardware_enable_failed, 0);
 			ret = -EIO;
 		}
 	}
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	return ret;
 }
 
@@ -5080,10 +5080,10 @@ static void hardware_disable_nolock(void *junk)
 
 static int kvm_offline_cpu(unsigned int cpu)
 {
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 	if (kvm_usage_count)
 		hardware_disable_nolock(NULL);
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	return 0;
 }
 
@@ -5099,9 +5099,9 @@ static void hardware_disable_all_nolock(void)
 static void hardware_disable_all(void)
 {
 	cpus_read_lock();
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 	hardware_disable_all_nolock();
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	cpus_read_unlock();
 }
 
@@ -5118,7 +5118,7 @@ static int hardware_enable_all(void)
 	 * enable hardware multiple times.
 	 */
 	cpus_read_lock();
-	raw_spin_lock(&kvm_count_lock);
+	mutex_lock(&kvm_lock);
 
 	kvm_usage_count++;
 	if (kvm_usage_count == 1) {
@@ -5131,7 +5131,7 @@ static int hardware_enable_all(void)
 		}
 	}
 
-	raw_spin_unlock(&kvm_count_lock);
+	mutex_unlock(&kvm_lock);
 	cpus_read_unlock();
 
 	return r;
@@ -5737,6 +5737,17 @@ static void kvm_init_debug(void)
 
 static int kvm_suspend(void)
 {
+	/*
+	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
+	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
+	 * is stable.  Assert that kvm_lock is not held to ensure the system
+	 * isn't suspended while KVM is enabling hardware.  Hardware enabling
+	 * can be preempted, but the task cannot be frozen until it has dropped
+	 * all locks (userspace tasks are frozen via a fake signal).
+	 */
+	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_irqs_disabled();
+
 	if (kvm_usage_count)
 		hardware_disable_nolock(NULL);
 	return 0;
@@ -5744,10 +5755,11 @@ static int kvm_suspend(void)
 
 static void kvm_resume(void)
 {
-	if (kvm_usage_count) {
-		lockdep_assert_not_held(&kvm_count_lock);
+	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_irqs_disabled();
+
+	if (kvm_usage_count)
 		hardware_enable_nolock(NULL);
-	}
 }
 
 static struct syscore_ops kvm_syscore_ops = {
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 45/50] KVM: Remove on_each_cpu(hardware_disable_nolock) in kvm_exit()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (43 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 44/50] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 46/50] KVM: Use a per-CPU variable to track which CPUs have enabled virtualization Sean Christopherson
                   ` (6 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

From: Isaku Yamahata <isaku.yamahata@intel.com>

Drop the superfluous invocation of hardware_disable_nolock() during
kvm_exit(), as it's nothing more than a glorified nop.

KVM automatically disables hardware on all CPUs when the last VM is
destroyed, and kvm_exit() cannot be called until the last VM goes
away as the calling module is pinned by an elevated refcount of the fops
associated with /dev/kvm.  This holds true even on x86, where the caller
of kvm_exit() is not kvm.ko, but is instead a dependent module, kvm_amd.ko
or kvm_intel.ko, as kvm_chardev_ops.owner is set to the module that calls
kvm_init(), not hardcoded to the base kvm.ko module.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
[sean: rework changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6a8fb53b32f0..a27ded004644 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5981,7 +5981,6 @@ void kvm_exit(void)
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
-	on_each_cpu(hardware_disable_nolock, NULL, 1);
 	kvm_irqfd_exit();
 	free_cpumask_var(cpus_hardware_enabled);
 }
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 46/50] KVM: Use a per-CPU variable to track which CPUs have enabled virtualization
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (44 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 45/50] KVM: Remove on_each_cpu(hardware_disable_nolock) in kvm_exit() Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 47/50] KVM: Make hardware_enable_failed a local variable in the "enable all" path Sean Christopherson
                   ` (5 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Use a per-CPU variable instead of a shared bitmap to track which CPUs
have successfully enabled virtualization hardware.  Using a per-CPU bool
avoids the need for an additional allocation, and arguably yields easier
to read code.  Using a bitmap would be advantageous if KVM used it to
avoid generating IPIs to CPUs that failed to enable hardware, but that's
an extreme edge case and not worth optimizing, and the low level helpers
would still want to keep their individual checks as attempting to enable
virtualization hardware when it's already enabled can be problematic,
e.g. Intel's VMXON will fault.

Opportunistically change the order in hardware_enable_nolock() to set
the flag if and only if hardware enabling is successful, instead of
speculatively setting the flag and then clearing it on failure.

Add a comment explaining that the check in hardware_disable_nolock()
isn't simply paranoia.  Waaay back when, commit 1b6c016818a5 ("KVM: Keep
track of which cpus have virtualization enabled"), added the logic as a
guard against CPU hotplug racing with hardware enable/disable.  Now that
KVM has eliminated the race by taking cpu_hotplug_lock for read (via
cpus_read_lock()) when enabling or disabling hardware, at first glance it
appears that the check is now superfluous, i.e. it's tempting to remove
the per-CPU flag entirely...
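
As a rough illustration of the "set the flag if and only if enabling
succeeds" ordering, here is a userspace sketch.  It is not the patch's
code: a plain array indexed by a simulated CPU number stands in for the
per-CPU bool, and every pcpu_demo_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define PCPU_DEMO_NR_CPUS 4

static bool pcpu_demo_enabled[PCPU_DEMO_NR_CPUS];	/* one flag per simulated CPU */
static bool pcpu_demo_broken[PCPU_DEMO_NR_CPUS];	/* CPUs that fail to enable */
static int pcpu_demo_arch_disable_calls;

/* Stand-ins for the arch hooks; enabling fails on "broken" CPUs. */
static int pcpu_demo_arch_enable(int cpu)
{
	return pcpu_demo_broken[cpu] ? -1 : 0;
}

static void pcpu_demo_arch_disable(int cpu)
{
	(void)cpu;
	pcpu_demo_arch_disable_calls++;
}

static int pcpu_demo_enable(int cpu)
{
	assert(cpu >= 0 && cpu < PCPU_DEMO_NR_CPUS);

	if (pcpu_demo_enabled[cpu])
		return 0;

	if (pcpu_demo_arch_enable(cpu))
		return -1;	/* the flag is never set on failure */

	pcpu_demo_enabled[cpu] = true;
	return 0;
}

static void pcpu_demo_disable(int cpu)
{
	assert(cpu >= 0 && cpu < PCPU_DEMO_NR_CPUS);

	/* Every CPU is asked to disable, including ones that never enabled. */
	if (!pcpu_demo_enabled[cpu])
		return;

	pcpu_demo_arch_disable(cpu);
	pcpu_demo_enabled[cpu] = false;
}
```

In this model, asking every CPU to disable after one of them failed to
enable invokes the arch disable hook only on the CPUs whose flag is set.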

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 41 ++++++++++++++++++-----------------------
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a27ded004644..c1e48c18e2d9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -102,7 +102,7 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
 DEFINE_MUTEX(kvm_lock);
 LIST_HEAD(vm_list);
 
-static cpumask_var_t cpus_hardware_enabled;
+static DEFINE_PER_CPU(bool, hardware_enabled);
 static int kvm_usage_count;
 static atomic_t hardware_enable_failed;
 
@@ -5027,21 +5027,17 @@ static struct miscdevice kvm_dev = {
 
 static void hardware_enable_nolock(void *junk)
 {
-	int cpu = smp_processor_id();
-	int r;
-
-	if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
+	if (__this_cpu_read(hardware_enabled))
 		return;
 
-	cpumask_set_cpu(cpu, cpus_hardware_enabled);
-
-	r = kvm_arch_hardware_enable();
-
-	if (r) {
-		cpumask_clear_cpu(cpu, cpus_hardware_enabled);
+	if (kvm_arch_hardware_enable()) {
 		atomic_inc(&hardware_enable_failed);
-		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
+		pr_info("kvm: enabling virtualization on CPU%d failed\n",
+			raw_smp_processor_id());
+		return;
 	}
+
+	__this_cpu_write(hardware_enabled, true);
 }
 
 static int kvm_online_cpu(unsigned int cpu)
@@ -5070,12 +5066,16 @@ static int kvm_online_cpu(unsigned int cpu)
 
 static void hardware_disable_nolock(void *junk)
 {
-	int cpu = smp_processor_id();
-
-	if (!cpumask_test_cpu(cpu, cpus_hardware_enabled))
+	/*
+	 * Note, hardware_disable_all_nolock() tells all online CPUs to disable
+	 * hardware, not just CPUs that successfully enabled hardware!
+	 */
+	if (!__this_cpu_read(hardware_enabled))
 		return;
-	cpumask_clear_cpu(cpu, cpus_hardware_enabled);
+
 	kvm_arch_hardware_disable();
+
+	__this_cpu_write(hardware_enabled, false);
 }
 
 static int kvm_offline_cpu(unsigned int cpu)
@@ -5876,13 +5876,11 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 	int r;
 	int cpu;
 
-	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL))
-		return -ENOMEM;
-
 	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
 				      kvm_online_cpu, kvm_offline_cpu);
 	if (r)
-		goto out_free_2;
+		return r;
+
 	register_reboot_notifier(&kvm_reboot_notifier);
 
 	/* A kmem cache lets us meet the alignment requirements of fx_save. */
@@ -5955,8 +5953,6 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 out_free_3:
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
-out_free_2:
-	free_cpumask_var(cpus_hardware_enabled);
 	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_init);
@@ -5982,7 +5978,6 @@ void kvm_exit(void)
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 	kvm_irqfd_exit();
-	free_cpumask_var(cpus_hardware_enabled);
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
 
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 47/50] KVM: Make hardware_enable_failed a local variable in the "enable all" path
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (45 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 46/50] KVM: Use a per-CPU variable to track which CPUs have enabled virtualization Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 48/50] KVM: Register syscore (suspend/resume) ops early in kvm_init() Sean Christopherson
                   ` (4 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

From: Isaku Yamahata <isaku.yamahata@intel.com>

Rework detecting hardware enabling errors to use a local variable in the
"enable all" path to track whether or not enabling was successful across
all CPUs.  Using a global variable complicates paths that enable hardware
only on the current CPU, e.g. kvm_resume() and kvm_online_cpu().
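
The shape of the rework can be sketched in userspace as follows.  This is
an assumption-laden model rather than the kernel code: a simple loop
stands in for on_each_cpu(), C11 atomics stand in for atomic_t, and all
fail_demo_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

#define FAIL_DEMO_NR_CPUS 4

typedef void (*fail_demo_cpu_fn)(int cpu, void *data);

static int fail_demo_mask;	/* bit N set => simulated CPU N fails to enable */

/* Single-CPU helper: reports failure through its return value. */
static int fail_demo_enable_one(int cpu)
{
	return ((fail_demo_mask >> cpu) & 1) ? -EIO : 0;
}

/* Cross-CPU callback: folds the return value into the caller's counter. */
static void fail_demo_enable_cb(int cpu, void *failed)
{
	if (fail_demo_enable_one(cpu))
		atomic_fetch_add((atomic_int *)failed, 1);
}

/* Sequential stand-in for running a callback on every CPU. */
static void fail_demo_for_each_cpu(fail_demo_cpu_fn fn, void *data)
{
	assert(fn);
	for (int cpu = 0; cpu < FAIL_DEMO_NR_CPUS; cpu++)
		fn(cpu, data);
}

static int fail_demo_enable_all(void)
{
	atomic_int failed = 0;	/* local, so nothing to reset between calls */

	fail_demo_for_each_cpu(fail_demo_enable_cb, &failed);
	return atomic_load(&failed) ? -EBUSY : 0;
}
```

Paths that only touch the current CPU can call the single-CPU helper
directly and look at its return value, with no shared state to reset.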

Opportunistically add a WARN if hardware enabling fails during
kvm_resume(), as KVM is all kinds of hosed if CPU0 fails to enable
hardware.  The WARN is largely futile in the current code, as KVM BUG()s
on spurious faults on VMX instructions, e.g. attempting to run a vCPU on
a CPU where hardware enabling failed will explode.

  ------------[ cut here ]------------
  kernel BUG at arch/x86/kvm/x86.c:508!
  invalid opcode: 0000 [#1] SMP
  CPU: 3 PID: 1009 Comm: CPU 4/KVM Not tainted 6.1.0-rc1+ #11
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_spurious_fault+0xa/0x10
  Call Trace:
   vmx_vcpu_load_vmcs+0x192/0x230 [kvm_intel]
   vmx_vcpu_load+0x16/0x60 [kvm_intel]
   kvm_arch_vcpu_load+0x32/0x1f0
   vcpu_load+0x2f/0x40
   kvm_arch_vcpu_ioctl_run+0x19/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0

But, the WARN may provide a breadcrumb to understand what went awry, and
someday KVM may fix one or both of those bugs, e.g. by finding a way to
eat spurious faults no matter the context (easier said than done due to
side effects of certain operations, e.g. Intel's VMCLEAR).

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
[sean: rebase, WARN on failure in kvm_resume()]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 35 ++++++++++++++++-------------------
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c1e48c18e2d9..674a9dab5411 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -104,7 +104,6 @@ LIST_HEAD(vm_list);
 
 static DEFINE_PER_CPU(bool, hardware_enabled);
 static int kvm_usage_count;
-static atomic_t hardware_enable_failed;
 
 static struct kmem_cache *kvm_vcpu_cache;
 
@@ -5025,19 +5024,25 @@ static struct miscdevice kvm_dev = {
 	&kvm_chardev_ops,
 };
 
-static void hardware_enable_nolock(void *junk)
+static int __hardware_enable_nolock(void)
 {
 	if (__this_cpu_read(hardware_enabled))
-		return;
+		return 0;
 
 	if (kvm_arch_hardware_enable()) {
-		atomic_inc(&hardware_enable_failed);
 		pr_info("kvm: enabling virtualization on CPU%d failed\n",
 			raw_smp_processor_id());
-		return;
+		return -EIO;
 	}
 
 	__this_cpu_write(hardware_enabled, true);
+	return 0;
+}
+
+static void hardware_enable_nolock(void *failed)
+{
+	if (__hardware_enable_nolock())
+		atomic_inc(failed);
 }
 
 static int kvm_online_cpu(unsigned int cpu)
@@ -5050,16 +5055,8 @@ static int kvm_online_cpu(unsigned int cpu)
 	 * errors when scheduled to this CPU.
 	 */
 	mutex_lock(&kvm_lock);
-	if (kvm_usage_count) {
-		WARN_ON_ONCE(atomic_read(&hardware_enable_failed));
-
-		hardware_enable_nolock(NULL);
-
-		if (atomic_read(&hardware_enable_failed)) {
-			atomic_set(&hardware_enable_failed, 0);
-			ret = -EIO;
-		}
-	}
+	if (kvm_usage_count)
+		ret = __hardware_enable_nolock();
 	mutex_unlock(&kvm_lock);
 	return ret;
 }
@@ -5107,6 +5104,7 @@ static void hardware_disable_all(void)
 
 static int hardware_enable_all(void)
 {
+	atomic_t failed = ATOMIC_INIT(0);
 	int r = 0;
 
 	/*
@@ -5122,10 +5120,9 @@ static int hardware_enable_all(void)
 
 	kvm_usage_count++;
 	if (kvm_usage_count == 1) {
-		atomic_set(&hardware_enable_failed, 0);
-		on_each_cpu(hardware_enable_nolock, NULL, 1);
+		on_each_cpu(hardware_enable_nolock, &failed, 1);
 
-		if (atomic_read(&hardware_enable_failed)) {
+		if (atomic_read(&failed)) {
 			hardware_disable_all_nolock();
 			r = -EBUSY;
 		}
@@ -5759,7 +5756,7 @@ static void kvm_resume(void)
 	lockdep_assert_irqs_disabled();
 
 	if (kvm_usage_count)
-		hardware_enable_nolock(NULL);
+		WARN_ON_ONCE(__hardware_enable_nolock());
 }
 
 static struct syscore_ops kvm_syscore_ops = {
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 48/50] KVM: Register syscore (suspend/resume) ops early in kvm_init()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (46 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 47/50] KVM: Make hardware_enable_failed a local variable in the "enable all" path Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 49/50] KVM: Opt out of generic hardware enabling on s390 and PPC Sean Christopherson
                   ` (3 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Register the suspend/resume notifier hooks at the same time KVM registers
its reboot notifier so that all the code in kvm_init() that deals with
enabling/disabling hardware is bundled together.  Opportunistically move
KVM's implementations to reside near the reboot notifier code for the
same reason.

Bunching the code together will allow architectures to opt out of KVM's
generic hardware enable/disable logic with minimal #ifdeffery.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 68 ++++++++++++++++++++++-----------------------
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 674a9dab5411..c12db3839114 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5154,6 +5154,38 @@ static struct notifier_block kvm_reboot_notifier = {
 	.priority = 0,
 };
 
+static int kvm_suspend(void)
+{
+	/*
+	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
+	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
+	 * is stable.  Assert that kvm_lock is not held to ensure the system
+	 * isn't suspended while KVM is enabling hardware.  Hardware enabling
+	 * can be preempted, but the task cannot be frozen until it has dropped
+	 * all locks (userspace tasks are frozen via a fake signal).
+	 */
+	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_irqs_disabled();
+
+	if (kvm_usage_count)
+		hardware_disable_nolock(NULL);
+	return 0;
+}
+
+static void kvm_resume(void)
+{
+	lockdep_assert_not_held(&kvm_lock);
+	lockdep_assert_irqs_disabled();
+
+	if (kvm_usage_count)
+		WARN_ON_ONCE(__hardware_enable_nolock());
+}
+
+static struct syscore_ops kvm_syscore_ops = {
+	.suspend = kvm_suspend,
+	.resume = kvm_resume,
+};
+
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus)
 {
 	int i;
@@ -5732,38 +5764,6 @@ static void kvm_init_debug(void)
 	}
 }
 
-static int kvm_suspend(void)
-{
-	/*
-	 * Secondary CPUs and CPU hotplug are disabled across the suspend/resume
-	 * callbacks, i.e. no need to acquire kvm_lock to ensure the usage count
-	 * is stable.  Assert that kvm_lock is not held to ensure the system
-	 * isn't suspended while KVM is enabling hardware.  Hardware enabling
-	 * can be preempted, but the task cannot be frozen until it has dropped
-	 * all locks (userspace tasks are frozen via a fake signal).
-	 */
-	lockdep_assert_not_held(&kvm_lock);
-	lockdep_assert_irqs_disabled();
-
-	if (kvm_usage_count)
-		hardware_disable_nolock(NULL);
-	return 0;
-}
-
-static void kvm_resume(void)
-{
-	lockdep_assert_not_held(&kvm_lock);
-	lockdep_assert_irqs_disabled();
-
-	if (kvm_usage_count)
-		WARN_ON_ONCE(__hardware_enable_nolock());
-}
-
-static struct syscore_ops kvm_syscore_ops = {
-	.suspend = kvm_suspend,
-	.resume = kvm_resume,
-};
-
 static inline
 struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
 {
@@ -5879,6 +5879,7 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 		return r;
 
 	register_reboot_notifier(&kvm_reboot_notifier);
+	register_syscore_ops(&kvm_syscore_ops);
 
 	/* A kmem cache lets us meet the alignment requirements of fx_save. */
 	if (!vcpu_align)
@@ -5913,8 +5914,6 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 
 	kvm_chardev_ops.owner = module;
 
-	register_syscore_ops(&kvm_syscore_ops);
-
 	kvm_preempt_ops.sched_in = kvm_sched_in;
 	kvm_preempt_ops.sched_out = kvm_sched_out;
 
@@ -5948,6 +5947,7 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
 	kmem_cache_destroy(kvm_vcpu_cache);
 out_free_3:
+	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
 	return r;
-- 
2.38.1.584.g0f3c55d4c2-goog



* [PATCH v2 49/50] KVM: Opt out of generic hardware enabling on s390 and PPC
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (47 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 48/50] KVM: Register syscore (suspend/resume) ops early in kvm_init() Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-11-30 23:09 ` [PATCH v2 50/50] KVM: Clean up error labels in kvm_init() Sean Christopherson
                   ` (2 subsequent siblings)
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Allow architectures to opt out of the generic hardware enabling logic,
and opt out on both s390 and PPC, which don't need to manually enable
virtualization as it's always on (when available).

In addition to letting s390 and PPC drop a bit of dead code, this will
hopefully also allow ARM to clean up its related code, e.g. ARM has its
own per-CPU flag to track which CPUs have enabled hardware due to the
need to keep hardware enabled indefinitely when pKVM is enabled.
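
The opt-out pattern itself is the usual "real implementation or empty
stubs" arrangement.  A minimal standalone sketch, assuming a hypothetical
OPTOUT_DEMO_GENERIC_HARDWARE_ENABLING macro in place of the Kconfig symbol
and hypothetical optout_demo_* helpers:

```c
#include <assert.h>

/*
 * Define OPTOUT_DEMO_GENERIC_HARDWARE_ENABLING at build time to get the
 * "real" helpers; leave it undefined to get the no-op stubs.
 */
#ifdef OPTOUT_DEMO_GENERIC_HARDWARE_ENABLING
static int optout_demo_users;

static int optout_demo_enable_all(void)
{
	optout_demo_users++;
	return 0;
}

static void optout_demo_disable_all(void)
{
	assert(optout_demo_users > 0);
	optout_demo_users--;
}
#else /* OPTOUT_DEMO_GENERIC_HARDWARE_ENABLING */
static int optout_demo_enable_all(void)
{
	return 0;
}

static void optout_demo_disable_all(void)
{
}
#endif /* OPTOUT_DEMO_GENERIC_HARDWARE_ENABLING */

/* Common code calls the helpers unconditionally, with no #ifdefs of its own. */
static int optout_demo_create_vm(void)
{
	return optout_demo_enable_all();
}

static void optout_demo_destroy_vm(void)
{
	optout_demo_disable_all();
}
```

Keeping the conditional in one place means callers compile identically in
both configurations, and the stubs cost nothing once inlined.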

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Anup Patel <anup@brainfault.org>
---
 arch/arm64/kvm/Kconfig              |  1 +
 arch/mips/kvm/Kconfig               |  1 +
 arch/powerpc/include/asm/kvm_host.h |  1 -
 arch/powerpc/kvm/powerpc.c          |  5 -----
 arch/riscv/kvm/Kconfig              |  1 +
 arch/s390/include/asm/kvm_host.h    |  1 -
 arch/s390/kvm/kvm-s390.c            |  6 ------
 arch/x86/kvm/Kconfig                |  1 +
 include/linux/kvm_host.h            |  4 ++++
 virt/kvm/Kconfig                    |  3 +++
 virt/kvm/kvm_main.c                 | 30 +++++++++++++++++++++++------
 11 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 815cc118c675..0a7d2116b27b 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -21,6 +21,7 @@ if VIRTUALIZATION
 menuconfig KVM
 	bool "Kernel-based Virtual Machine (KVM) support"
 	depends on HAVE_KVM
+	select KVM_GENERIC_HARDWARE_ENABLING
 	select MMU_NOTIFIER
 	select PREEMPT_NOTIFIERS
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
index 91d197bee9c0..29e51649203b 100644
--- a/arch/mips/kvm/Kconfig
+++ b/arch/mips/kvm/Kconfig
@@ -28,6 +28,7 @@ config KVM
 	select MMU_NOTIFIER
 	select SRCU
 	select INTERVAL_TREE
+	select KVM_GENERIC_HARDWARE_ENABLING
 	help
 	  Support for hosting Guest kernels.
 
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 0a80e80c7b9e..959f566a455c 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -876,7 +876,6 @@ struct kvm_vcpu_arch {
 #define __KVM_HAVE_ARCH_WQP
 #define __KVM_HAVE_CREATE_DEVICE
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index f5b4ff6bfc89..4c5405fc5538 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -435,11 +435,6 @@ int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr,
 }
 EXPORT_SYMBOL_GPL(kvmppc_ld);
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
 	struct kvmppc_ops *kvm_ops = NULL;
diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
index f36a737d5f96..d5a658a047a7 100644
--- a/arch/riscv/kvm/Kconfig
+++ b/arch/riscv/kvm/Kconfig
@@ -20,6 +20,7 @@ if VIRTUALIZATION
 config KVM
 	tristate "Kernel-based Virtual Machine (KVM) support (EXPERIMENTAL)"
 	depends on RISCV_SBI && MMU
+	select KVM_GENERIC_HARDWARE_ENABLING
 	select MMU_NOTIFIER
 	select PREEMPT_NOTIFIERS
 	select KVM_MMIO
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index d67ce719d16a..2bbc3d54959d 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -1031,7 +1031,6 @@ extern char sie_exit;
 extern int kvm_s390_gisc_register(struct kvm *kvm, u32 gisc);
 extern int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_free_memslot(struct kvm *kvm,
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 7ad8252e92c2..bd25076aa19b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -256,12 +256,6 @@ debug_info_t *kvm_s390_dbf;
 debug_info_t *kvm_s390_dbf_uv;
 
 /* Section: not file related */
-int kvm_arch_hardware_enable(void)
-{
-	/* every s390 is virtualization enabled ;-) */
-	return 0;
-}
-
 /* forward declarations */
 static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
 			      unsigned long end);
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index fbeaa9ddef59..8e578311ca9d 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -49,6 +49,7 @@ config KVM
 	select SRCU
 	select INTERVAL_TREE
 	select HAVE_KVM_PM_NOTIFIER if PM
+	select KVM_GENERIC_HARDWARE_ENABLING
 	help
 	  Support hosting fully virtualized guest machines using hardware
 	  virtualization extensions.  You will need a fairly recent
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 616e8e90558b..ffb4f9c3371f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1457,8 +1457,10 @@ void kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu, struct dentry *debugfs_
 static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
 #endif
 
+#ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 int kvm_arch_hardware_enable(void);
 void kvm_arch_hardware_disable(void);
+#endif
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
 bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
@@ -2090,7 +2092,9 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu)
 	}
 }
 
+#ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 extern bool kvm_rebooting;
+#endif
 
 extern unsigned int halt_poll_ns;
 extern unsigned int halt_poll_ns_grow;
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 800f9470e36b..d28df77345e1 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -86,3 +86,6 @@ config KVM_XFER_TO_GUEST_WORK
 
 config HAVE_KVM_PM_NOTIFIER
        bool
+
+config KVM_GENERIC_HARDWARE_ENABLING
+       bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c12db3839114..6a2be96557c2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -102,9 +102,6 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
 DEFINE_MUTEX(kvm_lock);
 LIST_HEAD(vm_list);
 
-static DEFINE_PER_CPU(bool, hardware_enabled);
-static int kvm_usage_count;
-
 static struct kmem_cache *kvm_vcpu_cache;
 
 static __read_mostly struct preempt_ops kvm_preempt_ops;
@@ -146,9 +143,6 @@ static void hardware_disable_all(void);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 
-__visible bool kvm_rebooting;
-EXPORT_SYMBOL_GPL(kvm_rebooting);
-
 #define KVM_EVENT_CREATE_VM 0
 #define KVM_EVENT_DESTROY_VM 1
 static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm);
@@ -5024,6 +5018,13 @@ static struct miscdevice kvm_dev = {
 	&kvm_chardev_ops,
 };
 
+#ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
+__visible bool kvm_rebooting;
+EXPORT_SYMBOL_GPL(kvm_rebooting);
+
+static DEFINE_PER_CPU(bool, hardware_enabled);
+static int kvm_usage_count;
+
 static int __hardware_enable_nolock(void)
 {
 	if (__this_cpu_read(hardware_enabled))
@@ -5185,6 +5186,17 @@ static struct syscore_ops kvm_syscore_ops = {
 	.suspend = kvm_suspend,
 	.resume = kvm_resume,
 };
+#else /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */
+static int hardware_enable_all(void)
+{
+	return 0;
+}
+
+static void hardware_disable_all(void)
+{
+
+}
+#endif /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus)
 {
@@ -5873,6 +5885,7 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 	int r;
 	int cpu;
 
+#ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_ONLINE, "kvm/cpu:online",
 				      kvm_online_cpu, kvm_offline_cpu);
 	if (r)
@@ -5880,6 +5893,7 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 
 	register_reboot_notifier(&kvm_reboot_notifier);
 	register_syscore_ops(&kvm_syscore_ops);
+#endif
 
 	/* A kmem cache lets us meet the alignment requirements of fx_save. */
 	if (!vcpu_align)
@@ -5947,9 +5961,11 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
 	kmem_cache_destroy(kvm_vcpu_cache);
 out_free_3:
+#ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
+#endif
 	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_init);
@@ -5971,9 +5987,11 @@ void kvm_exit(void)
 	kmem_cache_destroy(kvm_vcpu_cache);
 	kvm_vfio_ops_exit();
 	kvm_async_pf_deinit();
+#ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
 	cpuhp_remove_state_nocalls(CPUHP_AP_KVM_ONLINE);
+#endif
 	kvm_irqfd_exit();
 }
 EXPORT_SYMBOL_GPL(kvm_exit);
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 50/50] KVM: Clean up error labels in kvm_init()
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (48 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 49/50] KVM: Opt out of generic hardware enabling on s390 and PPC Sean Christopherson
@ 2022-11-30 23:09 ` Sean Christopherson
  2022-12-02  8:02 ` [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Chao Gao
  2022-12-27 13:02 ` Paolo Bonzini
  51 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-11-30 23:09 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Sean Christopherson,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Convert the last two "out" labels to "err" labels now that the dust has
settled, i.e. now that there are no more planned changes to the order
of things in kvm_init().

Use "err" instead of "out" as it's easier to describe what failed than it
is to describe what needs to be unwound, e.g. if allocating a per-CPU kick
mask fails, KVM needs to free any masks that were allocated, and of course
needs to unwind previous operations.
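
As an illustration of the naming argument above, here is a minimal userspace
sketch (not the kvm_init() code itself). The struct, demo_init(), demo_exit()
and the fail_at knob are all hypothetical; only the label names mirror the
patch. Each label names the step that failed, and falling through the labels
unwinds everything that succeeded before it.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 4

/* Hypothetical stand-ins for the vCPU cache and the per-CPU kick masks. */
struct demo_state {
	void *vcpu_cache;
	void *kick_mask[DEMO_NR_CPUS];
};

/*
 * fail_at injects a failure for demonstration purposes:
 * 0 = nothing fails, 1 = cache allocation fails,
 * 2..5 = allocation of kick mask (fail_at - 2) fails.
 */
int demo_init(struct demo_state *s, int fail_at)
{
	int i, r;

	for (i = 0; i < DEMO_NR_CPUS; i++)
		s->kick_mask[i] = NULL;

	s->vcpu_cache = (fail_at == 1) ? NULL : malloc(64);
	if (!s->vcpu_cache) {
		r = -ENOMEM;
		goto err_vcpu_cache;
	}

	for (i = 0; i < DEMO_NR_CPUS; i++) {
		s->kick_mask[i] = (fail_at == i + 2) ? NULL : malloc(16);
		if (!s->kick_mask[i]) {
			r = -ENOMEM;
			goto err_cpu_kick_mask;
		}
	}
	return 0;

err_cpu_kick_mask:
	/* free(NULL) is a no-op, so masks that were never allocated are fine. */
	for (i = 0; i < DEMO_NR_CPUS; i++) {
		free(s->kick_mask[i]);
		s->kick_mask[i] = NULL;
	}
	free(s->vcpu_cache);
	s->vcpu_cache = NULL;
err_vcpu_cache:
	return r;
}

/* Release everything a successful demo_init() allocated. */
void demo_exit(struct demo_state *s)
{
	int i;

	for (i = 0; i < DEMO_NR_CPUS; i++)
		free(s->kick_mask[i]);
	free(s->vcpu_cache);
}
```

Note that a failure while allocating a kick mask jumps to err_cpu_kick_mask,
which frees the masks already allocated and then falls through to undo the
earlier cache allocation.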

Reported-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6a2be96557c2..b8c6bfb46066 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5907,14 +5907,14 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 					   NULL);
 	if (!kvm_vcpu_cache) {
 		r = -ENOMEM;
-		goto out_free_3;
+		goto err_vcpu_cache;
 	}
 
 	for_each_possible_cpu(cpu) {
 		if (!alloc_cpumask_var_node(&per_cpu(cpu_kick_mask, cpu),
 					    GFP_KERNEL, cpu_to_node(cpu))) {
 			r = -ENOMEM;
-			goto out_free_4;
+			goto err_cpu_kick_mask;
 		}
 	}
 
@@ -5956,11 +5956,11 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
 err_async_pf:
 	kvm_irqfd_exit();
 err_irqfd:
-out_free_4:
+err_cpu_kick_mask:
 	for_each_possible_cpu(cpu)
 		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
 	kmem_cache_destroy(kvm_vcpu_cache);
-out_free_3:
+err_vcpu_cache:
 #ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING
 	unregister_syscore_ops(&kvm_syscore_ops);
 	unregister_reboot_notifier(&kvm_reboot_notifier);
-- 
2.38.1.584.g0f3c55d4c2-goog


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 26/50] KVM: PPC: Move processor compatibility check to module init
  2022-11-30 23:09 ` [PATCH v2 26/50] KVM: PPC: Move processor compatibility check to module init Sean Christopherson
@ 2022-12-01  5:21   ` Michael Ellerman
  2022-12-01 16:38     ` Sean Christopherson
  0 siblings, 1 reply; 77+ messages in thread
From: Michael Ellerman @ 2022-12-01  5:21 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Matthew Rosato, Eric Farman,
	Sean Christopherson, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Kai Huang, Chao Gao, Thomas Gleixner

Sean Christopherson <seanjc@google.com> writes:
> Move KVM PPC's compatibility checks to their respective module_init()
> hooks, there's no need to wait until KVM's common compat check, nor is
> there a need to perform the check on every CPU (provided by common KVM's
> hook), as the compatibility checks operate on global data.
>
>   arch/powerpc/include/asm/cputable.h: extern struct cpu_spec *cur_cpu_spec;
>   arch/powerpc/kvm/book3s.c: return 0
>   arch/powerpc/kvm/e500.c: strcmp(cur_cpu_spec->cpu_name, "e500v2")
>   arch/powerpc/kvm/e500mc.c: strcmp(cur_cpu_spec->cpu_name, "e500mc")
>                              strcmp(cur_cpu_spec->cpu_name, "e5500")
>                              strcmp(cur_cpu_spec->cpu_name, "e6500")

I'm not sure that output is really useful in the change log unless you
explain more about what it is.

> diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
> index 57e0ad6a2ca3..795667f7ebf0 100644
> --- a/arch/powerpc/kvm/e500mc.c
> +++ b/arch/powerpc/kvm/e500mc.c
> @@ -388,6 +388,10 @@ static int __init kvmppc_e500mc_init(void)
>  {
>  	int r;
>  
> +	r = kvmppc_e500mc_check_processor_compat();
> +	if (r)
> +		return kvmppc_e500mc;
 
This doesn't build:

linux/arch/powerpc/kvm/e500mc.c: In function ‘kvmppc_e500mc_init’:
linux/arch/powerpc/kvm/e500mc.c:391:13: error: implicit declaration of function ‘kvmppc_e500mc_check_processor_compat’; did you mean ‘kvmppc_core_check_processor_compat’? [-Werror=implicit-function-declaration]
  391 |         r = kvmppc_e500mc_check_processor_compat();
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |             kvmppc_core_check_processor_compat
linux/arch/powerpc/kvm/e500mc.c:393:24: error: ‘kvmppc_e500mc’ undeclared (first use in this function); did you mean ‘kvm_ops_e500mc’?
  393 |                 return kvmppc_e500mc;
      |                        ^~~~~~~~~~~~~
      |                        kvm_ops_e500mc
linux/arch/powerpc/kvm/e500mc.c:393:24: note: each undeclared identifier is reported only once for each function it appears in


It needs the delta below to compile.

With that:

Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

cheers


diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 795667f7ebf0..4564aa27edcf 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -168,7 +168,7 @@ static void kvmppc_core_vcpu_put_e500mc(struct kvm_vcpu *vcpu)
 	kvmppc_booke_vcpu_put(vcpu);
 }
 
-int kvmppc_core_check_processor_compat(void)
+int kvmppc_e500mc_check_processor_compat(void)
 {
 	int r;
 
@@ -390,7 +390,7 @@ static int __init kvmppc_e500mc_init(void)
 
 	r = kvmppc_e500mc_check_processor_compat();
 	if (r)
-		return kvmppc_e500mc;
+		goto err_out;
 
 	r = kvmppc_booke_init();
 	if (r)

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 34/50] KVM: x86: Unify pr_fmt to use module name for all KVM modules
  2022-11-30 23:09 ` [PATCH v2 34/50] KVM: x86: Unify pr_fmt to use module name for all KVM modules Sean Christopherson
@ 2022-12-01 10:43   ` Paul Durrant
  0 siblings, 0 replies; 77+ messages in thread
From: Paul Durrant @ 2022-12-01 10:43 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Matthew Rosato, Eric Farman, Vitaly Kuznetsov,
	David Woodhouse
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

On 30/11/2022 23:09, Sean Christopherson wrote:
> Define pr_fmt using KBUILD_MODNAME for all KVM x86 code so that printks
> use consistent formatting across common x86, Intel, and AMD code.  In
> addition to providing consistent print formatting, using KBUILD_MODNAME,
> e.g. kvm_amd and kvm_intel, allows referencing SVM and VMX (and SEV and
> SGX and ...) as technologies without generating weird messages, and
> without causing naming conflicts with other kernel code, e.g. "SEV: ",
> "tdx: ", "sgx: " etc. are all used by the kernel for non-KVM subsystems.
> 
> Opportunistically move away from printk() for prints that need to be
> modified anyways, e.g. to drop a manual "kvm: " prefix.
> 
> Opportunistically convert a few SGX WARNs that are similarly modified to
> WARN_ONCE; in the very unlikely event that the WARNs fire, odds are good
> that they would fire repeatedly and spam the kernel log without providing
> unique information in each print.
> 
> Note, defining pr_fmt yields undesirable results for code that uses KVM's
> printk wrappers, e.g. vcpu_unimpl().  But, that's a pre-existing problem
> as SVM/kvm_amd already defines a pr_fmt, and thankfully use of KVM's
> wrappers is relatively limited in KVM x86 code.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
[snip]
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index b246decb53a9..3bf7d69373cf 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -5,6 +5,7 @@
>    *
>    * KVM Xen emulation
>    */
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>   
>   #include "x86.h"
>   #include "xen.h"

Reviewed-by: Paul Durrant <paul@xen.org>
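
For readers unfamiliar with the pr_fmt convention quoted above, a minimal
userspace sketch follows. It is not kernel code: demo_pr_info() and
demo_report_evmcs() are hypothetical stand-ins for pr_info() and a caller,
and KBUILD_MODNAME is defined by hand here on the assumption that no build
system provides it. The point is only that the prefix is glued on by string
literal concatenation at compile time.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: in a kernel build, kbuild defines this per module. */
#ifndef KBUILD_MODNAME
#define KBUILD_MODNAME "kvm_intel"
#endif

/* Same shape as the definition added by the patch. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/*
 * Hypothetical stand-in for pr_info() that formats into a buffer.
 * ##__VA_ARGS__ is a GNU extension (accepted by gcc and clang) that
 * drops the trailing comma when no arguments are passed.
 */
#define demo_pr_info(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ##__VA_ARGS__)

/* The caller no longer spells out a manual "KVM: vmx: " style prefix. */
int demo_report_evmcs(char *buf, size_t len)
{
	return demo_pr_info(buf, len, "using Hyper-V Enlightened VMCS\n");
}
```

With the hand-defined module name, the buffer ends up holding the message
prefixed by "kvm_intel: ".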


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 12/50] KVM: VMX: Move Hyper-V eVMCS initialization to helper
  2022-11-30 23:08 ` [PATCH v2 12/50] KVM: VMX: Move Hyper-V eVMCS initialization to helper Sean Christopherson
@ 2022-12-01 15:22   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 77+ messages in thread
From: Vitaly Kuznetsov @ 2022-12-01 15:22 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner, Paolo Bonzini, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Matthew Rosato, Eric Farman,
	Sean Christopherson, David Woodhouse, Paul Durrant

Sean Christopherson <seanjc@google.com> writes:

> Move Hyper-V's eVMCS initialization to a dedicated helper to clean up
> vmx_init(), and add a comment to call out that the Hyper-V init code
> doesn't need to be unwound if vmx_init() ultimately fails.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 73 +++++++++++++++++++++++++-----------------
>  1 file changed, 43 insertions(+), 30 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c0de7160700b..b8bf95b9710d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -523,6 +523,8 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx)
>  static unsigned long host_idt_base;
>  
>  #if IS_ENABLED(CONFIG_HYPERV)
> +static struct kvm_x86_ops vmx_x86_ops __initdata;
> +
>  static bool __read_mostly enlightened_vmcs = true;
>  module_param(enlightened_vmcs, bool, 0444);
>  
> @@ -551,6 +553,43 @@ static int hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +static __init void hv_init_evmcs(void)
> +{
> +	int cpu;
> +
> +	if (!enlightened_vmcs)
> +		return;
> +
> +	/*
> +	 * Enlightened VMCS usage should be recommended and the host needs
> +	 * to support eVMCS v1 or above.
> +	 */
> +	if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED &&
> +	    (ms_hyperv.nested_features & HV_X64_ENLIGHTENED_VMCS_VERSION) >=
> +	     KVM_EVMCS_VERSION) {
> +
> +		/* Check that we have assist pages on all online CPUs */
> +		for_each_online_cpu(cpu) {
> +			if (!hv_get_vp_assist_page(cpu)) {
> +				enlightened_vmcs = false;
> +				break;
> +			}
> +		}
> +
> +		if (enlightened_vmcs) {
> +			pr_info("KVM: vmx: using Hyper-V Enlightened VMCS\n");
> +			static_branch_enable(&enable_evmcs);
> +		}
> +
> +		if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH)
> +			vmx_x86_ops.enable_l2_tlb_flush
> +				= hv_enable_l2_tlb_flush;
> +
> +	} else {
> +		enlightened_vmcs = false;
> +	}
> +}
> +
>  static void hv_reset_evmcs(void)
>  {
>  	struct hv_vp_assist_page *vp_ap;
> @@ -577,6 +616,7 @@ static void hv_reset_evmcs(void)
>  }
>  
>  #else /* IS_ENABLED(CONFIG_HYPERV) */
> +static void hv_init_evmcs(void) {}
>  static void hv_reset_evmcs(void) {}
>  #endif /* IS_ENABLED(CONFIG_HYPERV) */
>  
> @@ -8500,38 +8540,11 @@ static int __init vmx_init(void)
>  {
>  	int r, cpu;
>  
> -#if IS_ENABLED(CONFIG_HYPERV)
>  	/*
> -	 * Enlightened VMCS usage should be recommended and the host needs
> -	 * to support eVMCS v1 or above. We can also disable eVMCS support
> -	 * with module parameter.
> +	 * Note, hv_init_evmcs() touches only VMX knobs, i.e. there's nothing
> +	 * to unwind if a later step fails.
>  	 */
> -	if (enlightened_vmcs &&
> -	    ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED &&
> -	    (ms_hyperv.nested_features & HV_X64_ENLIGHTENED_VMCS_VERSION) >=
> -	    KVM_EVMCS_VERSION) {
> -
> -		/* Check that we have assist pages on all online CPUs */
> -		for_each_online_cpu(cpu) {
> -			if (!hv_get_vp_assist_page(cpu)) {
> -				enlightened_vmcs = false;
> -				break;
> -			}
> -		}
> -
> -		if (enlightened_vmcs) {
> -			pr_info("KVM: vmx: using Hyper-V Enlightened VMCS\n");
> -			static_branch_enable(&enable_evmcs);
> -		}
> -
> -		if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH)
> -			vmx_x86_ops.enable_l2_tlb_flush
> -				= hv_enable_l2_tlb_flush;
> -
> -	} else {
> -		enlightened_vmcs = false;
> -	}
> -#endif
> +	hv_init_evmcs();
>  
>  	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
>  		     __alignof__(struct vcpu_vmx), THIS_MODULE);

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 10/50] KVM: VMX: Reset eVMCS controls in VP assist page during hardware disabling
  2022-11-30 23:08 ` [PATCH v2 10/50] KVM: VMX: Reset eVMCS controls in VP assist page during hardware disabling Sean Christopherson
@ 2022-12-01 15:42   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 77+ messages in thread
From: Vitaly Kuznetsov @ 2022-12-01 15:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner, Paolo Bonzini, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Matthew Rosato, Eric Farman,
	Sean Christopherson, David Woodhouse, Paul Durrant

Sean Christopherson <seanjc@google.com> writes:

> Reset the eVMCS controls in the per-CPU VP assist page during hardware
> disabling instead of waiting until kvm-intel's module exit.  The controls
> are activated if and only if KVM creates a VM, i.e. don't need to be
> reset if hardware is never enabled.
>
> Doing the reset during hardware disabling will naturally fix a potential
> NULL pointer deref bug once KVM disables CPU hotplug while enabling and
> disabling hardware (which is necessary to fix a variety of bugs).  If the
> kernel is running as the root partition, the VP assist page is unmapped
> during CPU hot unplug, and so KVM's clearing of the eVMCS controls needs
> to occur with CPU hot(un)plug disabled, otherwise KVM could attempt to
> write to a CPU's VP assist page after it's unmapped.
>
> Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 50 +++++++++++++++++++++++++-----------------
>  1 file changed, 30 insertions(+), 20 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index cea8c07f5229..d85d175dca70 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -551,6 +551,33 @@ static int hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +static void hv_reset_evmcs(void)
> +{
> +	struct hv_vp_assist_page *vp_ap;
> +
> +	if (!static_branch_unlikely(&enable_evmcs))
> +		return;
> +
> +	/*
> +	 * KVM should enable eVMCS if and only if all CPUs have a VP assist
> +	 * page, and should reject CPU onlining if eVMCS is enabled and the CPU
> +	 * doesn't have a VP assist page allocated.
> +	 */
> +	vp_ap = hv_get_vp_assist_page(smp_processor_id());
> +	if (WARN_ON_ONCE(!vp_ap))
> +		return;
> +

In case my understanding is correct, this may actually get triggered
for Hyper-V root partition: vmx_hardware_disable() gets called from
kvm_dying_cpu() which has its own CPUHP_AP_KVM_STARTING stage. VP page
unmapping happens in hv_cpu_die() which uses generic CPUHP_AP_ONLINE_DYN
(happens first on CPU offlining AFAIR). I believe we need to introduce a
new CPUHP_AP_HYPERV_STARTING stage and put it before
CPUHP_AP_KVM_STARTING so it happens after it upon offlining.

The issue is likely theoretical as Hyper-V root partition is a very
special case, I'm not sure whether KVM is used there and whether CPU
offlining is possible. In any case, WARN_ON_ONCE() is much better than
NULL pointer dereference we have now :-)
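
The ordering concern can be sketched in userspace as follows. This is a
deliberately simplified model and not the cpuhp implementation: it assumes
only that teardown callbacks run from the latest state back to the earliest
during offlining, and the arrays and demo_teardown_position() are
hypothetical. The callback names are taken from the text above.

```c
#include <string.h>

/*
 * States are listed in bring-up order (earliest first).  Offlining walks
 * them in reverse, so return the position at which "name" would run during
 * teardown (0 = first), or -1 if it is not in the list.
 */
int demo_teardown_position(const char *const *states, int nr, const char *name)
{
	int i, pos = 0;

	for (i = nr - 1; i >= 0; i--, pos++) {
		if (strcmp(states[i], name) == 0)
			return pos;
	}
	return -1;
}

/*
 * Layout described above: KVM's STARTING state comes before the dynamic
 * ONLINE state used for the VP assist page, so hv_cpu_die() runs first on
 * offlining and the page is gone when kvm_dying_cpu() runs.
 */
const char *const demo_current_order[] = { "kvm_dying_cpu", "hv_cpu_die" };

/*
 * Proposed layout: a Hyper-V STARTING state placed before KVM's, so that
 * kvm_dying_cpu() runs while the VP assist page is still mapped.
 */
const char *const demo_proposed_order[] = { "hv_cpu_die", "kvm_dying_cpu" };
```

In the first layout hv_cpu_die() is torn down before kvm_dying_cpu(); in the
second the order is reversed, which is what the suggested new stage is meant
to achieve.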

> +	/*
> +	 * Reset everything to support using non-enlightened VMCS access later
> +	 * (e.g. when we reload the module with enlightened_vmcs=0)
> +	 */
> +	vp_ap->nested_control.features.directhypercall = 0;
> +	vp_ap->current_nested_vmcs = 0;
> +	vp_ap->enlighten_vmentry = 0;
> +}
> +
> +#else /* IS_ENABLED(CONFIG_HYPERV) */
> +static void hv_reset_evmcs(void) {}
>  #endif /* IS_ENABLED(CONFIG_HYPERV) */
>  
>  /*
> @@ -2496,6 +2523,8 @@ static void vmx_hardware_disable(void)
>  	if (cpu_vmxoff())
>  		kvm_spurious_fault();
>  
> +	hv_reset_evmcs();
> +
>  	intel_pt_handle_vmx(0);
>  }
>  
> @@ -8462,27 +8491,8 @@ static void vmx_exit(void)
>  	kvm_exit();
>  
>  #if IS_ENABLED(CONFIG_HYPERV)
> -	if (static_branch_unlikely(&enable_evmcs)) {
> -		int cpu;
> -		struct hv_vp_assist_page *vp_ap;
> -		/*
> -		 * Reset everything to support using non-enlightened VMCS
> -		 * access later (e.g. when we reload the module with
> -		 * enlightened_vmcs=0)
> -		 */
> -		for_each_online_cpu(cpu) {
> -			vp_ap =	hv_get_vp_assist_page(cpu);
> -
> -			if (!vp_ap)
> -				continue;
> -
> -			vp_ap->nested_control.features.directhypercall = 0;
> -			vp_ap->current_nested_vmcs = 0;
> -			vp_ap->enlighten_vmentry = 0;
> -		}
> -
> +	if (static_branch_unlikely(&enable_evmcs))
>  		static_branch_disable(&enable_evmcs);
> -	}
>  #endif
>  	vmx_cleanup_l1d_flush();

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 26/50] KVM: PPC: Move processor compatibility check to module init
  2022-12-01  5:21   ` Michael Ellerman
@ 2022-12-01 16:38     ` Sean Christopherson
  0 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-12-01 16:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Kai Huang, Chao Gao, Thomas Gleixner

On Thu, Dec 01, 2022, Michael Ellerman wrote:
> Sean Christopherson <seanjc@google.com> writes:
> > Move KVM PPC's compatibility checks to their respective module_init()
> > hooks, there's no need to wait until KVM's common compat check, nor is
> > there a need to perform the check on every CPU (provided by common KVM's
> > hook), as the compatibility checks operate on global data.
> >
> >   arch/powerpc/include/asm/cputable.h: extern struct cpu_spec *cur_cpu_spec;
> >   arch/powerpc/kvm/book3s.c: return 0
> >   arch/powerpc/kvm/e500.c: strcmp(cur_cpu_spec->cpu_name, "e500v2")
> >   arch/powerpc/kvm/e500mc.c: strcmp(cur_cpu_spec->cpu_name, "e500mc")
> >                              strcmp(cur_cpu_spec->cpu_name, "e5500")
> >                              strcmp(cur_cpu_spec->cpu_name, "e6500")
> 
> I'm not sure that output is really useful in the change log unless you
> explain more about what it is.

Agreed, I got lazy.  I'll write a proper description.
 
> > diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
> > index 57e0ad6a2ca3..795667f7ebf0 100644
> > --- a/arch/powerpc/kvm/e500mc.c
> > +++ b/arch/powerpc/kvm/e500mc.c
> > @@ -388,6 +388,10 @@ static int __init kvmppc_e500mc_init(void)
> >  {
> >  	int r;
> >  
> > +	r = kvmppc_e500mc_check_processor_compat();
> > +	if (r)
> > +		return kvmppc_e500mc;
>  
> This doesn't build:
> 
> linux/arch/powerpc/kvm/e500mc.c: In function ‘kvmppc_e500mc_init’:
> linux/arch/powerpc/kvm/e500mc.c:391:13: error: implicit declaration of function ‘kvmppc_e500mc_check_processor_compat’; did you mean ‘kvmppc_core_check_processor_compat’? [-Werror=implicit-function-declaration]
>   391 |         r = kvmppc_e500mc_check_processor_compat();
>       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>       |             kvmppc_core_check_processor_compat
> linux/arch/powerpc/kvm/e500mc.c:393:24: error: ‘kvmppc_e500mc’ undeclared (first use in this function); did you mean ‘kvm_ops_e500mc’?
>   393 |                 return kvmppc_e500mc;
>       |                        ^~~~~~~~~~~~~
>       |                        kvm_ops_e500mc
> linux/arch/powerpc/kvm/e500mc.c:393:24: note: each undeclared identifier is reported only once for each function it appears in

Huh, CONFIG_PPC_E500MC got unselected in the config I use to compile test this
flavor.  I suspect I botched an oldconfig at some point.
 
Anyways, fixed that and the bugs.

Thanks much!

--
Subject: [PATCH] KVM: PPC: Move processor compatibility check to module init

Move KVM PPC's compatibility checks to their respective module_init()
hooks; there's no need to wait until KVM's common compat check, nor is
there a need to perform the check on every CPU (provided by common KVM's
hook).  The compatibility checks are either a nop (Book3S), or simply
check the CPU name stored in the global CPU spec (e500 and e500mc).
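
To make the "check the CPU name" case concrete, here is a hedged userspace
sketch of the e500mc flavor. struct demo_cpu_spec and the demo_ function are
hypothetical simplifications of cur_cpu_spec and the real helper, and the
-ENOTSUP return value is an assumption made for the sketch. Because the input
is a single global description of the CPU, evaluating it once at module init
gives the same answer as evaluating it on every CPU.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct cpu_spec. */
struct demo_cpu_spec {
	const char *cpu_name;
};

/*
 * Return 0 if the CPU name is one of the cores the e500mc flavor accepts
 * (names taken from the earlier changelog), else a negative errno.
 */
int demo_e500mc_check_processor_compat(const struct demo_cpu_spec *spec)
{
	if (strcmp(spec->cpu_name, "e500mc") == 0 ||
	    strcmp(spec->cpu_name, "e5500") == 0 ||
	    strcmp(spec->cpu_name, "e6500") == 0)
		return 0;

	return -ENOTSUP;
}
```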

Cc: Fabiano Rosas <farosas@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/powerpc/include/asm/kvm_ppc.h |  1 -
 arch/powerpc/kvm/book3s.c          | 10 ----------
 arch/powerpc/kvm/e500.c            |  4 ++--
 arch/powerpc/kvm/e500mc.c          |  6 +++++-
 arch/powerpc/kvm/powerpc.c         |  2 +-
 5 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index bfacf12784dd..51a1824b0a16 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -118,7 +118,6 @@ extern int kvmppc_xlate(struct kvm_vcpu *vcpu, ulong eaddr,
 extern int kvmppc_core_vcpu_create(struct kvm_vcpu *vcpu);
 extern void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu);
 extern int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu);
-extern int kvmppc_core_check_processor_compat(void);
 extern int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu,
                                       struct kvm_translation *tr);
 
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 6d525285dbe8..87283a0e33d8 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -999,16 +999,6 @@ int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_store);
 
-int kvmppc_core_check_processor_compat(void)
-{
-	/*
-	 * We always return 0 for book3s. We check
-	 * for compatibility while loading the HV
-	 * or PR module
-	 */
-	return 0;
-}
-
 int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hcall)
 {
 	return kvm->arch.kvm_ops->hcall_implemented(hcall);
diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
index c8b2b4478545..0ea61190ec04 100644
--- a/arch/powerpc/kvm/e500.c
+++ b/arch/powerpc/kvm/e500.c
@@ -314,7 +314,7 @@ static void kvmppc_core_vcpu_put_e500(struct kvm_vcpu *vcpu)
 	kvmppc_booke_vcpu_put(vcpu);
 }
 
-int kvmppc_core_check_processor_compat(void)
+static int kvmppc_e500_check_processor_compat(void)
 {
 	int r;
 
@@ -507,7 +507,7 @@ static int __init kvmppc_e500_init(void)
 	unsigned long handler_len;
 	unsigned long max_ivor = 0;
 
-	r = kvmppc_core_check_processor_compat();
+	r = kvmppc_e500_check_processor_compat();
 	if (r)
 		goto err_out;
 
diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index 57e0ad6a2ca3..4564aa27edcf 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -168,7 +168,7 @@ static void kvmppc_core_vcpu_put_e500mc(struct kvm_vcpu *vcpu)
 	kvmppc_booke_vcpu_put(vcpu);
 }
 
-int kvmppc_core_check_processor_compat(void)
+int kvmppc_e500mc_check_processor_compat(void)
 {
 	int r;
 
@@ -388,6 +388,10 @@ static int __init kvmppc_e500mc_init(void)
 {
 	int r;
 
+	r = kvmppc_e500mc_check_processor_compat();
+	if (r)
+		goto err_out;
+
 	r = kvmppc_booke_init();
 	if (r)
 		goto err_out;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 5faf69421f13..d44b85ba8cef 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -442,7 +442,7 @@ int kvm_arch_hardware_enable(void)
 
 int kvm_arch_check_processor_compat(void *opaque)
 {
-	return kvmppc_core_check_processor_compat();
+	return 0;
 }
 
 int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)

base-commit: 00e4493db7c6163d48d5b45034d1a77e16a1c8dc
-- 


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 21/50] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions
  2022-11-30 23:09 ` [PATCH v2 21/50] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions Sean Christopherson
@ 2022-12-01 22:00   ` Philippe Mathieu-Daudé
  2022-12-01 22:49     ` Sean Christopherson
  0 siblings, 1 reply; 77+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-12-01 22:00 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Christian Borntraeger, Janosch Frank,
	Claudio Imbrenda, Matthew Rosato, Eric Farman, Vitaly Kuznetsov,
	David Woodhouse, Paul Durrant
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Oliver Upton,
	Atish Patra, David Hildenbrand, kvm, linux-arm-kernel, kvmarm,
	kvmarm, linux-mips, linuxppc-dev, kvm-riscv, linux-riscv,
	linux-s390, linux-kernel, Yuan Yao, Cornelia Huck,
	Isaku Yamahata, Fabiano Rosas, Michael Ellerman, Kai Huang,
	Chao Gao, Thomas Gleixner

On 1/12/22 00:09, Sean Christopherson wrote:
> Now that KVM no longer supports trap-and-emulate (see commit 45c7e8af4a5e
> "MIPS: Remove KVM_TE support"), hardcode the MIPS callbacks to the
> virtualization callbacks.
> 
> Hardcoding the callbacks eliminates the technically-unnecessary check on
> non-NULL kvm_mips_callbacks in kvm_arch_init().  MIPS has never supported
> multiple in-tree modules, i.e. barring an out-of-tree module, where
> copying and renaming kvm.ko counts as "out-of-tree", KVM could never
> encounter a non-NULL set of callbacks during module init.
> 
> The callback check is also subtly broken, as it is not thread safe,
> i.e. if there were multiple modules, loading both concurrently would
> create a race between checking and setting kvm_mips_callbacks.
> 
> Given that out-of-tree shenanigans are not the kernel's responsibility,
> hardcode the callbacks to simplify the code.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/mips/include/asm/kvm_host.h |  2 +-
>   arch/mips/kvm/Makefile           |  2 +-
>   arch/mips/kvm/callback.c         | 14 --------------
>   arch/mips/kvm/mips.c             |  9 ++-------
>   arch/mips/kvm/vz.c               |  7 ++++---
>   5 files changed, 8 insertions(+), 26 deletions(-)
>   delete mode 100644 arch/mips/kvm/callback.c
> 
> diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> index 28f0ba97db71..2803c9c21ef9 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -758,7 +758,7 @@ struct kvm_mips_callbacks {
>   	void (*vcpu_reenter)(struct kvm_vcpu *vcpu);
>   };
>   extern struct kvm_mips_callbacks *kvm_mips_callbacks;

IIUC we could even constify this pointer.

Anyway,
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index af29490d9740..f0a6c245d1ff 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -1012,17 +1012,12 @@ long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
>   
>   int kvm_arch_init(void *opaque)
>   {
> -	if (kvm_mips_callbacks) {
> -		kvm_err("kvm: module already exists\n");
> -		return -EEXIST;
> -	}
> -
> -	return kvm_mips_emulation_init(&kvm_mips_callbacks);
> +	return kvm_mips_emulation_init();
>   }
>   
>   void kvm_arch_exit(void)
>   {
> -	kvm_mips_callbacks = NULL;
> +
>   }
>   
>   int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
> diff --git a/arch/mips/kvm/vz.c b/arch/mips/kvm/vz.c
> index c706f5890a05..dafab003ea0d 100644
> --- a/arch/mips/kvm/vz.c
> +++ b/arch/mips/kvm/vz.c
> @@ -3304,7 +3304,10 @@ static struct kvm_mips_callbacks kvm_vz_callbacks = {
>   	.vcpu_reenter = kvm_vz_vcpu_reenter,
>   };
>   
> -int kvm_mips_emulation_init(struct kvm_mips_callbacks **install_callbacks)
> +/* FIXME: Get rid of the callbacks now that trap-and-emulate is gone. */
> +struct kvm_mips_callbacks *kvm_mips_callbacks = &kvm_vz_callbacks;

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 21/50] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions
  2022-12-01 22:00   ` Philippe Mathieu-Daudé
@ 2022-12-01 22:49     ` Sean Christopherson
  0 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-12-01 22:49 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Fabiano Rosas, Michael Ellerman,
	Kai Huang, Chao Gao, Thomas Gleixner

On Thu, Dec 01, 2022, Philippe Mathieu-Daudé wrote:
> On 1/12/22 00:09, Sean Christopherson wrote:
> > Now that KVM no longer supports trap-and-emulate (see commit 45c7e8af4a5e
> > "MIPS: Remove KVM_TE support"), hardcode the MIPS callbacks to the
> > virtualization callbacks.
> > 
> > Hardcoding the callbacks eliminates the technically-unnecessary check on
> > non-NULL kvm_mips_callbacks in kvm_arch_init().  MIPS has never supported
> > multiple in-tree modules, i.e. barring an out-of-tree module, where
> > copying and renaming kvm.ko counts as "out-of-tree", KVM could never
> > encounter a non-NULL set of callbacks during module init.
> > 
> > The callback check is also subtly broken, as it is not thread safe,
> > i.e. if there were multiple modules, loading both concurrently would
> > create a race between checking and setting kvm_mips_callbacks.
> > 
> > Given that out-of-tree shenanigans are not the kernel's responsibility,
> > hardcode the callbacks to simplify the code.
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >   arch/mips/include/asm/kvm_host.h |  2 +-
> >   arch/mips/kvm/Makefile           |  2 +-
> >   arch/mips/kvm/callback.c         | 14 --------------
> >   arch/mips/kvm/mips.c             |  9 ++-------
> >   arch/mips/kvm/vz.c               |  7 ++++---
> >   5 files changed, 8 insertions(+), 26 deletions(-)
> >   delete mode 100644 arch/mips/kvm/callback.c
> > 
> > diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> > index 28f0ba97db71..2803c9c21ef9 100644
> > --- a/arch/mips/include/asm/kvm_host.h
> > +++ b/arch/mips/include/asm/kvm_host.h
> > @@ -758,7 +758,7 @@ struct kvm_mips_callbacks {
> >   	void (*vcpu_reenter)(struct kvm_vcpu *vcpu);
> >   };
> >   extern struct kvm_mips_callbacks *kvm_mips_callbacks;
> 
> IIUC we could even constify this pointer.

Good point.  Protecting the pointer itself is a bit gross, but it is a nice
stopgap until the callbacks are gone.  I'll fold this in.  Thanks!

  extern const struct kvm_mips_callbacks * const kvm_mips_callbacks;
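
For anyone unfamiliar with the double-const form, here is a minimal userspace
sketch of what the declaration above buys.  The first const makes the callback
table read-only through the pointer, the second makes the pointer itself
unassignable.  All demo_* names are hypothetical stand-ins, not the MIPS KVM
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_mips_callbacks; one hook only. */
struct demo_callbacks {
	int (*hardware_enable)(void);
};

static int demo_vz_hardware_enable(void)
{
	return 0;
}

/* The table itself is read-only... */
static const struct demo_callbacks demo_vz_callbacks = {
	.hardware_enable = demo_vz_hardware_enable,
};

/* ...and so is the pointer to it: it is fixed at build time. */
const struct demo_callbacks * const demo_callbacks = &demo_vz_callbacks;

int demo_enable(void)
{
	assert(demo_callbacks != NULL);

	/*
	 * Either of the following would be rejected by the compiler:
	 *   demo_callbacks = NULL;                   (pointer is const)
	 *   demo_callbacks->hardware_enable = NULL;  (pointee is const)
	 */
	return demo_callbacks->hardware_enable();
}
```

Callers keep using the pointer exactly as before; only writes are rejected.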

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (49 preceding siblings ...)
  2022-11-30 23:09 ` [PATCH v2 50/50] KVM: Clean up error labels in kvm_init() Sean Christopherson
@ 2022-12-02  8:02 ` Chao Gao
  2022-12-27 13:02 ` Paolo Bonzini
  51 siblings, 0 replies; 77+ messages in thread
From: Chao Gao @ 2022-12-02  8:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Thomas Gleixner

On Wed, Nov 30, 2022 at 11:08:44PM +0000, Sean Christopherson wrote:
>The main theme of this series is to kill off kvm_arch_init(),
>kvm_arch_hardware_(un)setup(), and kvm_arch_check_processor_compat(), which
>all originated in x86 code from way back when, and needlessly complicate
>both common KVM code and architecture code.  E.g. many architectures don't
>mark functions/data as __init/__ro_after_init purely because kvm_init()
>isn't marked __init to support x86's separate vendor modules.

Applied this series and verified that an attempt to online incompatible
CPUs (no VMX support) when some VMs are running will fail.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code
  2022-11-30 23:09 ` [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code Sean Christopherson
@ 2022-12-02 12:16   ` Huang, Kai
  2022-12-05 20:52   ` Isaku Yamahata
  1 sibling, 0 replies; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 12:16 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> Move the CPU compatibility checks to pure x86 code, i.e. drop x86's use
> of the common kvm_x86_check_cpu_compat() arch hook.  x86 is the only
		^
		kvm_arch_check_processor_compat()

> architecture that "needs" to do per-CPU compatibility checks, moving
> the logic to x86 will allow dropping the common code, and will also
> give x86 more control over when/how the compatibility checks are
> performed, e.g. TDX will need to enable hardware (do VMXON) in order to
> perform compatibility checks.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Kai Huang <kai.huang@intel.com>
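
For reference, the shape of the check loop this patch moves into x86 code can
be modeled in userspace roughly as below.  This is only a sketch under stated
assumptions: the demo_* names and the per-CPU feature array are hypothetical,
and a plain loop stands in for smp_call_function_single().

```c
#include <assert.h>
#include <errno.h>

#define DEMO_NR_CPUS 4

/* Hypothetical per-CPU feature words; CPU 0 plays the role of the boot CPU. */
static unsigned int demo_cpu_features[DEMO_NR_CPUS] = { 0x3, 0x3, 0x3, 0x3 };

/* Stand-in for the vendor hook, e.g. ops->check_processor_compatibility(). */
typedef int (*demo_vendor_check_t)(int cpu);

/* Helper for experiments: change one CPU's feature word. */
void demo_set_features(int cpu, unsigned int features)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	demo_cpu_features[cpu] = features;
}

/* Generic check first (mismatch against the boot CPU), then the vendor check. */
static int demo_check_one(int cpu, demo_vendor_check_t vendor_check)
{
	if (demo_cpu_features[cpu] != demo_cpu_features[0])
		return -EIO;

	return vendor_check(cpu);
}

/* Walk every CPU and bail on the first failure. */
int demo_check_all(demo_vendor_check_t vendor_check)
{
	int cpu, r;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		r = demo_check_one(cpu, vendor_check);
		if (r < 0)
			return r;
	}
	return 0;
}

/* A vendor check that always passes. */
int demo_vendor_ok(int cpu)
{
	(void)cpu;
	return 0;
}
```

With identical feature words demo_check_all() returns 0; making any one CPU
differ from CPU 0 makes it return -EIO without consulting the vendor check.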

> ---
>  arch/x86/kvm/svm/svm.c |  2 +-
>  arch/x86/kvm/vmx/vmx.c |  2 +-
>  arch/x86/kvm/x86.c     | 49 ++++++++++++++++++++++++++++++++----------
>  3 files changed, 40 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 19e81a99c58f..d7ea1c1175c2 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5103,7 +5103,7 @@ static int __init svm_init(void)
>  	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
>  	 * exposed to userspace!
>  	 */
> -	r = kvm_init(&svm_init_ops, sizeof(struct vcpu_svm),
> +	r = kvm_init(NULL, sizeof(struct vcpu_svm),
>  		     __alignof__(struct vcpu_svm), THIS_MODULE);
>  	if (r)
>  		goto err_kvm_init;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 654d81f781da..8deb1bd60c10 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8592,7 +8592,7 @@ static int __init vmx_init(void)
>  	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
>  	 * exposed to userspace!
>  	 */
> -	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
> +	r = kvm_init(NULL, sizeof(struct vcpu_vmx),
>  		     __alignof__(struct vcpu_vmx), THIS_MODULE);
>  	if (r)
>  		goto err_kvm_init;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 66f16458aa97..3571bc968cf8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9277,10 +9277,36 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
>  	kvm_pmu_ops_update(ops->pmu_ops);
>  }
>  
> +struct kvm_cpu_compat_check {
> +	struct kvm_x86_init_ops *ops;
> +	int *ret;
> +};
> +
> +static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> +{
> +	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> +
> +	WARN_ON(!irqs_disabled());
> +
> +	if (__cr4_reserved_bits(cpu_has, c) !=
> +	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> +		return -EIO;
> +
> +	return ops->check_processor_compatibility();
> +}
> +
> +static void kvm_x86_check_cpu_compat(void *data)
> +{
> +	struct kvm_cpu_compat_check *c = data;
> +
> +	*c->ret = kvm_x86_check_processor_compatibility(c->ops);
> +}
> +
>  static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  {
> +	struct kvm_cpu_compat_check c;
>  	u64 host_pat;
> -	int r;
> +	int r, cpu;
>  
>  	if (kvm_x86_ops.hardware_enable) {
>  		pr_err("kvm: already loaded vendor module '%s'\n", kvm_x86_ops.name);
> @@ -9360,6 +9386,14 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	if (r != 0)
>  		goto out_mmu_exit;
>  
> +	c.ret = &r;
> +	c.ops = ops;
> +	for_each_online_cpu(cpu) {
> +		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &c, 1);
> +		if (r < 0)
> +			goto out_hardware_unsetup;
> +	}
> +
>  	/*
>  	 * Point of no return!  DO NOT add error paths below this point unless
>  	 * absolutely necessary, as most operations from this point forward
> @@ -9402,6 +9436,8 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	kvm_init_msr_list();
>  	return 0;
>  
> +out_hardware_unsetup:
> +	ops->runtime_ops->hardware_unsetup();
>  out_mmu_exit:
>  	kvm_mmu_vendor_module_exit();
>  out_free_percpu:
> @@ -12037,16 +12073,7 @@ void kvm_arch_hardware_disable(void)
>  
>  int kvm_arch_check_processor_compat(void *opaque)
>  {
> -	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> -	struct kvm_x86_init_ops *ops = opaque;
> -
> -	WARN_ON(!irqs_disabled());
> -
> -	if (__cr4_reserved_bits(cpu_has, c) !=
> -	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> -		return -EIO;
> -
> -	return ops->check_processor_compatibility();
> +	return 0;
>  }
>  
>  bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
> -- 
> 2.38.1.584.g0f3c55d4c2-goog
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 32/50] KVM: Drop kvm_arch_check_processor_compat() hook
  2022-11-30 23:09 ` [PATCH v2 32/50] KVM: Drop kvm_arch_check_processor_compat() hook Sean Christopherson
@ 2022-12-02 12:18   ` Huang, Kai
  0 siblings, 0 replies; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 12:18 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> Drop kvm_arch_check_processor_compat() and its support code now that all
> architecture implementations are nops.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Eric Farman <farman@linux.ibm.com>	# s390
> Acked-by: Anup Patel <anup@brainfault.org>

For x86,

Reviewed-by: Kai Huang <kai.huang@intel.com>

> ---
>  arch/arm64/kvm/arm.c       |  7 +------
>  arch/mips/kvm/mips.c       |  7 +------
>  arch/powerpc/kvm/book3s.c  |  2 +-
>  arch/powerpc/kvm/e500.c    |  2 +-
>  arch/powerpc/kvm/e500mc.c  |  2 +-
>  arch/powerpc/kvm/powerpc.c |  5 -----
>  arch/riscv/kvm/main.c      |  7 +------
>  arch/s390/kvm/kvm-s390.c   |  7 +------
>  arch/x86/kvm/svm/svm.c     |  4 ++--
>  arch/x86/kvm/vmx/vmx.c     |  4 ++--
>  arch/x86/kvm/x86.c         |  5 -----
>  include/linux/kvm_host.h   |  4 +---
>  virt/kvm/kvm_main.c        | 24 +-----------------------
>  13 files changed, 13 insertions(+), 67 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 936ef7d1ea94..e915b1d9f2cd 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -63,11 +63,6 @@ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>  	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> -{
> -	return 0;
> -}
> -
>  int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  			    struct kvm_enable_cap *cap)
>  {
> @@ -2273,7 +2268,7 @@ static __init int kvm_arm_init(void)
>  	 * FIXME: Do something reasonable if kvm_init() fails after pKVM
>  	 * hypervisor protection is finalized.
>  	 */
> -	err = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
> +	err = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
>  	if (err)
>  		goto out_subs;
>  
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index 3cade648827a..36c8991b5d39 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -135,11 +135,6 @@ void kvm_arch_hardware_disable(void)
>  	kvm_mips_callbacks->hardware_disable();
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> -{
> -	return 0;
> -}
> -
>  extern void kvm_init_loongson_ipi(struct kvm *kvm);
>  
>  int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> @@ -1636,7 +1631,7 @@ static int __init kvm_mips_init(void)
>  
>  	register_die_notifier(&kvm_mips_csr_die_notifier);
>  
> -	ret = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
> +	ret = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
>  	if (ret) {
>  		unregister_die_notifier(&kvm_mips_csr_die_notifier);
>  		return ret;
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 87283a0e33d8..57f4e7896d67 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -1052,7 +1052,7 @@ static int kvmppc_book3s_init(void)
>  {
>  	int r;
>  
> -	r = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
> +	r = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
>  	if (r)
>  		return r;
>  #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
> diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c
> index 0ea61190ec04..b0f695428733 100644
> --- a/arch/powerpc/kvm/e500.c
> +++ b/arch/powerpc/kvm/e500.c
> @@ -531,7 +531,7 @@ static int __init kvmppc_e500_init(void)
>  	flush_icache_range(kvmppc_booke_handlers, kvmppc_booke_handlers +
>  			   ivor[max_ivor] + handler_len);
>  
> -	r = kvm_init(NULL, sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
> +	r = kvm_init(sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
>  	if (r)
>  		goto err_out;
>  	kvm_ops_e500.owner = THIS_MODULE;
> diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
> index 795667f7ebf0..611532a0dedc 100644
> --- a/arch/powerpc/kvm/e500mc.c
> +++ b/arch/powerpc/kvm/e500mc.c
> @@ -404,7 +404,7 @@ static int __init kvmppc_e500mc_init(void)
>  	 */
>  	kvmppc_init_lpid(KVMPPC_NR_LPIDS/threads_per_core);
>  
> -	r = kvm_init(NULL, sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
> +	r = kvm_init(sizeof(struct kvmppc_vcpu_e500), 0, THIS_MODULE);
>  	if (r)
>  		goto err_out;
>  	kvm_ops_e500mc.owner = THIS_MODULE;
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 01d0f9935e6c..f5b4ff6bfc89 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -440,11 +440,6 @@ int kvm_arch_hardware_enable(void)
>  	return 0;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> -{
> -	return 0;
> -}
> -
>  int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  {
>  	struct kvmppc_ops *kvm_ops = NULL;
> diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
> index 4710a6751687..34c3dece6990 100644
> --- a/arch/riscv/kvm/main.c
> +++ b/arch/riscv/kvm/main.c
> @@ -20,11 +20,6 @@ long kvm_arch_dev_ioctl(struct file *filp,
>  	return -EINVAL;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> -{
> -	return 0;
> -}
> -
>  int kvm_arch_hardware_enable(void)
>  {
>  	unsigned long hideleg, hedeleg;
> @@ -110,6 +105,6 @@ static int __init riscv_kvm_init(void)
>  
>  	kvm_info("VMID %ld bits available\n", kvm_riscv_gstage_vmid_bits());
>  
> -	return kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
> +	return kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
>  }
>  module_init(riscv_kvm_init);
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 25b08b956888..7ad8252e92c2 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -262,11 +262,6 @@ int kvm_arch_hardware_enable(void)
>  	return 0;
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> -{
> -	return 0;
> -}
> -
>  /* forward declarations */
>  static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
>  			      unsigned long end);
> @@ -5716,7 +5711,7 @@ static int __init kvm_s390_init(void)
>  	if (r)
>  		return r;
>  
> -	r = kvm_init(NULL, sizeof(struct kvm_vcpu), 0, THIS_MODULE);
> +	r = kvm_init(sizeof(struct kvm_vcpu), 0, THIS_MODULE);
>  	if (r) {
>  		__kvm_s390_exit();
>  		return r;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index d7ea1c1175c2..d9a54590591d 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5103,8 +5103,8 @@ static int __init svm_init(void)
>  	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
>  	 * exposed to userspace!
>  	 */
> -	r = kvm_init(NULL, sizeof(struct vcpu_svm),
> -		     __alignof__(struct vcpu_svm), THIS_MODULE);
> +	r = kvm_init(sizeof(struct vcpu_svm), __alignof__(struct vcpu_svm),
> +		     THIS_MODULE);
>  	if (r)
>  		goto err_kvm_init;
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 8deb1bd60c10..b6f08a0a1435 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8592,8 +8592,8 @@ static int __init vmx_init(void)
>  	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
>  	 * exposed to userspace!
>  	 */
> -	r = kvm_init(NULL, sizeof(struct vcpu_vmx),
> -		     __alignof__(struct vcpu_vmx), THIS_MODULE);
> +	r = kvm_init(sizeof(struct vcpu_vmx), __alignof__(struct vcpu_vmx),
> +		     THIS_MODULE);
>  	if (r)
>  		goto err_kvm_init;
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3571bc968cf8..566156b34314 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12071,11 +12071,6 @@ void kvm_arch_hardware_disable(void)
>  	drop_user_return_notifiers();
>  }
>  
> -int kvm_arch_check_processor_compat(void *opaque)
> -{
> -	return 0;
> -}
> -
>  bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
>  {
>  	return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 7dde28333e7c..616e8e90558b 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -953,8 +953,7 @@ static inline void kvm_irqfd_exit(void)
>  {
>  }
>  #endif
> -int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
> -		  struct module *module);
> +int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module);
>  void kvm_exit(void);
>  
>  void kvm_get_kvm(struct kvm *kvm);
> @@ -1460,7 +1459,6 @@ static inline void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) {}
>  
>  int kvm_arch_hardware_enable(void);
>  void kvm_arch_hardware_disable(void);
> -int kvm_arch_check_processor_compat(void *opaque);
>  int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
>  bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index a4a10a0b322f..3900bd3d75cb 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5833,36 +5833,14 @@ void kvm_unregister_perf_callbacks(void)
>  }
>  #endif
>  
> -struct kvm_cpu_compat_check {
> -	void *opaque;
> -	int *ret;
> -};
> -
> -static void check_processor_compat(void *data)
> +int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
>  {
> -	struct kvm_cpu_compat_check *c = data;
> -
> -	*c->ret = kvm_arch_check_processor_compat(c->opaque);
> -}
> -
> -int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
> -		  struct module *module)
> -{
> -	struct kvm_cpu_compat_check c;
>  	int r;
>  	int cpu;
>  
>  	if (!zalloc_cpumask_var(&cpus_hardware_enabled, GFP_KERNEL))
>  		return -ENOMEM;
>  
> -	c.ret = &r;
> -	c.opaque = opaque;
> -	for_each_online_cpu(cpu) {
> -		smp_call_function_single(cpu, check_processor_compat, &c, 1);
> -		if (r < 0)
> -			goto out_free_2;
> -	}
> -
>  	r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
>  				      kvm_starting_cpu, kvm_dying_cpu);
>  	if (r)


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 35/50] KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks
  2022-11-30 23:09 ` [PATCH v2 35/50] KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks Sean Christopherson
@ 2022-12-02 12:18   ` Huang, Kai
  0 siblings, 0 replies; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 12:18 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> Use this_cpu_has() instead of boot_cpu_has() to perform the effective
> "disabled by BIOS?" checks for VMX.  This will allow consolidating code
> between vmx_disabled_by_bios() and vmx_check_processor_compat().
> 
> Checking the boot CPU isn't a strict requirement as any divergence in VMX
> enabling between the boot CPU and other CPUs will result in KVM refusing
> to load thanks to the aforementioned vmx_check_processor_compat().
> 
> Furthermore, using the boot CPU was an unintentional change introduced by
> commit a4d0b2fdbcf7 ("KVM: VMX: Use VMX feature flag to query BIOS
> enabling").  Prior to using the feature flags, KVM checked the raw MSR
> value from the current CPU.
> 
> Reported-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Kai Huang <kai.huang@intel.com>

> ---
>  arch/x86/kvm/vmx/vmx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index e859d2b7daa4..3f7d9f88b314 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2492,8 +2492,8 @@ static __init int cpu_has_kvm_support(void)
>  
>  static __init int vmx_disabled_by_bios(void)
>  {
> -	return !boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
> -	       !boot_cpu_has(X86_FEATURE_VMX);
> +	return !this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
> +	       !this_cpu_has(X86_FEATURE_VMX);
>  }
>  
>  static int kvm_cpu_vmxon(u64 vmxon_pointer)


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling
  2022-11-30 23:09 ` [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Sean Christopherson
@ 2022-12-02 12:59   ` Huang, Kai
  2022-12-02 16:31     ` Sean Christopherson
  0 siblings, 1 reply; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 12:59 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Disable CPU hotplug when enabling/disabling hardware to prevent the
> corner case where if the following sequence occurs:
> 
>   1. A hotplugged CPU marks itself online in cpu_online_mask
>   2. The hotplugged CPU enables interrupt before invoking KVM's ONLINE
>      callback
>   3. hardware_{en,dis}able_all() is invoked on another CPU
> 
> the hotplugged CPU will be included in on_each_cpu() and thus get sent
> through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called.

Should the changelog explicitly call out the consequence of this race?
Otherwise it's hard to tell whether this is truly an issue.

IIUC, since the compatibility check has already been moved to
kvm_arch_hardware_enable(), the consequence is that hardware_enable_all() will
fail if the newly onlined CPU isn't compatible, which results in failing to
create the first VM.  This isn't ideal; the incompatible CPU should instead be
rejected when it tries to go online.

> 
>         start_secondary { ...
>                 set_cpu_online(smp_processor_id(), true); <- 1
>                 ...
>                 local_irq_enable();  <- 2
>                 ...
>                 cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); <- 3
>         }
> 
> KVM currently fudges around this race by keeping track of which CPUs have
> done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which
> cpus have virtualization enabled"), but that's an inefficient, convoluted,
> and hacky solution.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> [sean: split to separate patch, write changelog]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c  | 11 ++++++++++-
>  virt/kvm/kvm_main.c | 12 ++++++++++++
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index dad30097f0c3..d2ad383da998 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9281,7 +9281,16 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
>  
>  static int kvm_x86_check_processor_compatibility(void)
>  {
> -	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> +	int cpu = smp_processor_id();
> +	struct cpuinfo_x86 *c = &cpu_data(cpu);
> +
> +	/*
> +	 * Compatibility checks are done when loading KVM and when enabling
> +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> +	 * compatible, i.e. KVM should never perform a compatibility check on
> +	 * an offline CPU.
> +	 */
> +	WARN_ON(!cpu_online(cpu));

IMHO this chunk logically belongs in the previous patch.  IIUC, disabling CPU
hotplug during hardware_enable_all() is unrelated to this WARN().

>  
>  	if (__cr4_reserved_bits(cpu_has, c) !=
>  	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index f26ea779710a..d985b24c423b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5098,15 +5098,26 @@ static void hardware_disable_all_nolock(void)
>  
>  static void hardware_disable_all(void)
>  {
> +	cpus_read_lock();
>  	raw_spin_lock(&kvm_count_lock);
>  	hardware_disable_all_nolock();
>  	raw_spin_unlock(&kvm_count_lock);
> +	cpus_read_unlock();
>  }
>  
>  static int hardware_enable_all(void)
>  {
>  	int r = 0;
>  
> +	/*
> +	 * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
> +	 * is called, and so on_each_cpu() between them includes the CPU that
> +	 * is being onlined.  As a result, hardware_enable_nolock() may get
> +	 * invoked before kvm_online_cpu(), which also enables hardware if the
> +	 * usage count is non-zero.  Disable CPU hotplug to avoid attempting to
> +	 * enable hardware multiple times.

It won't enable hardware multiple times, right?  hardware_enable_nolock() has
the check below:

        if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
                return;

        cpumask_set_cpu(cpu, cpus_hardware_enabled);

IIUC the only issue is the one I described in my reply to the changelog above.

Or perhaps I am missing something?
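
To illustrate the point, here is a userspace model of that guard.  It is only
a sketch: a 64-bit word stands in for cpus_hardware_enabled, the counter
stands in for the actual hardware enabling, and all demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_CPUS 64

/* One bit per CPU, standing in for the cpus_hardware_enabled cpumask. */
static uint64_t demo_enabled_mask;

/* How many times "hardware" was really enabled on each CPU. */
static int demo_enable_calls[DEMO_NR_CPUS];

static bool demo_test_cpu(int cpu)
{
	return (demo_enabled_mask & (UINT64_C(1) << cpu)) != 0;
}

/* Mirrors the quoted check: a second call on the same CPU is a nop. */
void demo_hardware_enable(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);

	if (demo_test_cpu(cpu))
		return;

	demo_enabled_mask |= UINT64_C(1) << cpu;
	demo_enable_calls[cpu]++;
}

int demo_enable_count(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_enable_calls[cpu];
}
```

Calling demo_hardware_enable(3) twice leaves demo_enable_count(3) at 1, i.e.
the mask already filters the duplicate call regardless of which path gets
there first.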

> +	 */
> +	cpus_read_lock();
>  	raw_spin_lock(&kvm_count_lock);
>  
>  	kvm_usage_count++;
> @@ -5121,6 +5132,7 @@ static int hardware_enable_all(void)
>  	}
>  
>  	raw_spin_unlock(&kvm_count_lock);
> +	cpus_read_unlock();
>  
>  	return r;
>  }


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops)
  2022-11-30 23:09 ` [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops) Sean Christopherson
@ 2022-12-02 13:01   ` Huang, Kai
  2022-12-05 21:04   ` Isaku Yamahata
  1 sibling, 0 replies; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 13:01 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> Move the .check_processor_compatibility() callback from kvm_x86_init_ops
> to kvm_x86_ops to allow a future patch to do compatibility checks during
> CPU hotplug.
> 
> Do kvm_ops_update() before compat checks so that static_call() can be
> used during compat checks.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Kai Huang <kai.huang@intel.com>
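
For readers following along, a userspace sketch of why the hook has to move.
Init-only ops are consulted once at load time, whereas anything needed later,
e.g. from a CPU hotplug callback, must live in the ops that are copied into
persistent storage.  The demo_* names are hypothetical, and error unwinding is
omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Ops that must remain callable after init, like kvm_x86_ops. */
struct demo_runtime_ops {
	int (*check_processor_compatibility)(void);
};

/* Ops that are only consulted during module init, like kvm_x86_init_ops. */
struct demo_init_ops {
	int (*hardware_setup)(void);
	const struct demo_runtime_ops *runtime_ops;
};

/* Persistent copy of the runtime ops, filled in at init time. */
static struct demo_runtime_ops demo_ops;

static int demo_vendor_setup(void)
{
	return 0;
}

static int demo_vendor_compat(void)
{
	return 0;
}

static const struct demo_runtime_ops demo_vendor_runtime_ops = {
	.check_processor_compatibility = demo_vendor_compat,
};

const struct demo_init_ops demo_vendor_init_ops = {
	.hardware_setup = demo_vendor_setup,
	.runtime_ops = &demo_vendor_runtime_ops,
};

int demo_vendor_init(const struct demo_init_ops *ops)
{
	int r = ops->hardware_setup();

	if (r)
		return r;

	/* Copy the runtime ops first so the compat check goes through them. */
	demo_ops = *ops->runtime_ops;

	return demo_ops.check_processor_compatibility();
}

/* A later hotplug-style path can reuse the same hook from the copy. */
int demo_cpu_online(void)
{
	assert(demo_ops.check_processor_compatibility != NULL);
	return demo_ops.check_processor_compatibility();
}
```

After demo_vendor_init(&demo_vendor_init_ops) succeeds, demo_cpu_online() can
run the same check without touching the init-only structure.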

> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
>  arch/x86/include/asm/kvm_host.h    |  3 ++-
>  arch/x86/kvm/svm/svm.c             |  5 +++--
>  arch/x86/kvm/vmx/vmx.c             | 16 +++++++--------
>  arch/x86/kvm/x86.c                 | 31 +++++++++++-------------------
>  5 files changed, 25 insertions(+), 31 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index abccd51dcfca..dba2909e5ae2 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -14,6 +14,7 @@ BUILD_BUG_ON(1)
>   * to make a definition optional, but in this case the default will
>   * be __static_call_return0.
>   */
> +KVM_X86_OP(check_processor_compatibility)
>  KVM_X86_OP(hardware_enable)
>  KVM_X86_OP(hardware_disable)
>  KVM_X86_OP(hardware_unsetup)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d79aedf70908..ba74fea6850b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1518,6 +1518,8 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
>  struct kvm_x86_ops {
>  	const char *name;
>  
> +	int (*check_processor_compatibility)(void);
> +
>  	int (*hardware_enable)(void);
>  	void (*hardware_disable)(void);
>  	void (*hardware_unsetup)(void);
> @@ -1729,7 +1731,6 @@ struct kvm_x86_nested_ops {
>  };
>  
>  struct kvm_x86_init_ops {
> -	int (*check_processor_compatibility)(void);
>  	int (*hardware_setup)(void);
>  	unsigned int (*handle_intel_pt_intr)(void);
>  
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 9f94efcb9aa6..c2e95c0d9fd8 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -543,7 +543,7 @@ static bool kvm_is_svm_supported(void)
>  	return true;
>  }
>  
> -static int __init svm_check_processor_compat(void)
> +static int svm_check_processor_compat(void)
>  {
>  	if (!kvm_is_svm_supported())
>  		return -EIO;
> @@ -4695,6 +4695,8 @@ static int svm_vm_init(struct kvm *kvm)
>  static struct kvm_x86_ops svm_x86_ops __initdata = {
>  	.name = KBUILD_MODNAME,
>  
> +	.check_processor_compatibility = svm_check_processor_compat,
> +
>  	.hardware_unsetup = svm_hardware_unsetup,
>  	.hardware_enable = svm_hardware_enable,
>  	.hardware_disable = svm_hardware_disable,
> @@ -5079,7 +5081,6 @@ static __init int svm_hardware_setup(void)
>  
>  static struct kvm_x86_init_ops svm_init_ops __initdata = {
>  	.hardware_setup = svm_hardware_setup,
> -	.check_processor_compatibility = svm_check_processor_compat,
>  
>  	.runtime_ops = &svm_x86_ops,
>  	.pmu_ops = &amd_pmu_ops,
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2a8a6e481c76..6416ed5b7f89 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2520,8 +2520,7 @@ static bool cpu_has_perf_global_ctrl_bug(void)
>  	return false;
>  }
>  
> -static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
> -				      u32 msr, u32 *result)
> +static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt, u32 msr, u32 *result)
>  {
>  	u32 vmx_msr_low, vmx_msr_high;
>  	u32 ctl = ctl_min | ctl_opt;
> @@ -2539,7 +2538,7 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
>  	return 0;
>  }
>  
> -static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
> +static u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
>  {
>  	u64 allowed;
>  
> @@ -2548,8 +2547,8 @@ static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
>  	return  ctl_opt & allowed;
>  }
>  
> -static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> -				    struct vmx_capability *vmx_cap)
> +static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> +			     struct vmx_capability *vmx_cap)
>  {
>  	u32 vmx_msr_low, vmx_msr_high;
>  	u32 _pin_based_exec_control = 0;
> @@ -2710,7 +2709,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	return 0;
>  }
>  
> -static bool __init kvm_is_vmx_supported(void)
> +static bool kvm_is_vmx_supported(void)
>  {
>  	if (!cpu_has_vmx()) {
>  		pr_err("CPU doesn't support VMX\n");
> @@ -2726,7 +2725,7 @@ static bool __init kvm_is_vmx_supported(void)
>  	return true;
>  }
>  
> -static int __init vmx_check_processor_compat(void)
> +static int vmx_check_processor_compat(void)
>  {
>  	struct vmcs_config vmcs_conf;
>  	struct vmx_capability vmx_cap;
> @@ -8104,6 +8103,8 @@ static void vmx_vm_destroy(struct kvm *kvm)
>  static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  	.name = KBUILD_MODNAME,
>  
> +	.check_processor_compatibility = vmx_check_processor_compat,
> +
>  	.hardware_unsetup = vmx_hardware_unsetup,
>  
>  	.hardware_enable = vmx_hardware_enable,
> @@ -8501,7 +8502,6 @@ static __init int hardware_setup(void)
>  }
>  
>  static struct kvm_x86_init_ops vmx_init_ops __initdata = {
> -	.check_processor_compatibility = vmx_check_processor_compat,
>  	.hardware_setup = hardware_setup,
>  	.handle_intel_pt_intr = NULL,
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5551f3552f08..ee9af412ffd4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9279,12 +9279,7 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
>  	kvm_pmu_ops_update(ops->pmu_ops);
>  }
>  
> -struct kvm_cpu_compat_check {
> -	struct kvm_x86_init_ops *ops;
> -	int *ret;
> -};
> -
> -static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> +static int kvm_x86_check_processor_compatibility(void)
>  {
>  	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
>  
> @@ -9294,19 +9289,16 @@ static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
>  	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
>  		return -EIO;
>  
> -	return ops->check_processor_compatibility();
> +	return static_call(kvm_x86_check_processor_compatibility)();
>  }
>  
> -static void kvm_x86_check_cpu_compat(void *data)
> +static void kvm_x86_check_cpu_compat(void *ret)
>  {
> -	struct kvm_cpu_compat_check *c = data;
> -
> -	*c->ret = kvm_x86_check_processor_compatibility(c->ops);
> +	*(int *)ret = kvm_x86_check_processor_compatibility();
>  }
>  
>  static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  {
> -	struct kvm_cpu_compat_check c;
>  	u64 host_pat;
>  	int r, cpu;
>  
> @@ -9377,12 +9369,12 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	if (r != 0)
>  		goto out_mmu_exit;
>  
> -	c.ret = &r;
> -	c.ops = ops;
> +	kvm_ops_update(ops);
> +
>  	for_each_online_cpu(cpu) {
> -		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &c, 1);
> +		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &r, 1);
>  		if (r < 0)
> -			goto out_hardware_unsetup;
> +			goto out_unwind_ops;
>  	}
>  
>  	/*
> @@ -9390,8 +9382,6 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	 * absolutely necessary, as most operations from this point forward
>  	 * require unwinding.
>  	 */
> -	kvm_ops_update(ops);
> -
>  	kvm_timer_init();
>  
>  	if (pi_inject_timer == -1)
> @@ -9427,8 +9417,9 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	kvm_init_msr_list();
>  	return 0;
>  
> -out_hardware_unsetup:
> -	ops->runtime_ops->hardware_unsetup();
> +out_unwind_ops:
> +	kvm_x86_ops.hardware_enable = NULL;
> +	static_call(kvm_x86_hardware_unsetup)();
>  out_mmu_exit:
>  	kvm_mmu_vendor_module_exit();
>  out_free_percpu:


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU
  2022-11-30 23:09 ` [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU Sean Christopherson
@ 2022-12-02 13:03   ` Huang, Kai
  2022-12-02 13:36   ` Huang, Kai
  1 sibling, 0 replies; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 13:03 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
> Do compatibility checks when enabling hardware to effectively add
> compatibility checks when onlining a CPU.  Abort enabling, i.e. the
> online process, if the (hotplugged) CPU is incompatible with the known
> good setup.
> 
> At init time, KVM does compatibility checks to ensure that all online
> CPUs support hardware virtualization and a common set of features. But
> KVM uses hotplugged CPUs without such compatibility checks. On Intel
> CPUs, this leads to #GP if the hotplugged CPU doesn't support VMX, or
> VM-Entry failure if the hotplugged CPU doesn't support all features
> enabled by KVM.
> 
> Note, this is little more than a NOP on SVM, as SVM already checks for
> full SVM support during hardware enabling.
> 
> Opportunistically add a pr_err() if setup_vmcs_config() fails, and
> tweak all error messages to output which CPU failed.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Co-developed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

For VMX:

Acked-by: Kai Huang <kai.huang@intel.com>

> ---
>  arch/x86/kvm/svm/svm.c |  8 +++-----
>  arch/x86/kvm/vmx/vmx.c | 15 ++++++++++-----
>  arch/x86/kvm/x86.c     |  5 +++++
>  3 files changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index c2e95c0d9fd8..46b658d0f46e 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -521,11 +521,12 @@ static void svm_init_osvw(struct kvm_vcpu *vcpu)
>  
>  static bool kvm_is_svm_supported(void)
>  {
> +	int cpu = raw_smp_processor_id();
>  	const char *msg;
>  	u64 vm_cr;
>  
>  	if (!cpu_has_svm(&msg)) {
> -		pr_err("SVM not supported, %s\n", msg);
> +		pr_err("SVM not supported by CPU %d, %s\n", cpu, msg);
>  		return false;
>  	}
>  
> @@ -536,7 +537,7 @@ static bool kvm_is_svm_supported(void)
>  
>  	rdmsrl(MSR_VM_CR, vm_cr);
>  	if (vm_cr & (1 << SVM_VM_CR_SVM_DISABLE)) {
> -		pr_err("SVM disabled (by BIOS) in MSR_VM_CR\n");
> +		pr_err("SVM disabled (by BIOS) in MSR_VM_CR on CPU %d\n", cpu);
>  		return false;
>  	}
>  
> @@ -587,9 +588,6 @@ static int svm_hardware_enable(void)
>  	if (efer & EFER_SVME)
>  		return -EBUSY;
>  
> -	if (!kvm_is_svm_supported())
> -		return -EINVAL;
> -
>  	sd = per_cpu_ptr(&svm_data, me);
>  	sd->asid_generation = 1;
>  	sd->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6416ed5b7f89..39dd3082fcd8 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2711,14 +2711,16 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  
>  static bool kvm_is_vmx_supported(void)
>  {
> +	int cpu = raw_smp_processor_id();
> +
>  	if (!cpu_has_vmx()) {
> -		pr_err("CPU doesn't support VMX\n");
> +		pr_err("VMX not supported by CPU %d\n", cpu);
>  		return false;
>  	}
>  
>  	if (!this_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
>  	    !this_cpu_has(X86_FEATURE_VMX)) {
> -		pr_err("VMX not enabled (by BIOS) in MSR_IA32_FEAT_CTL\n");
> +		pr_err("VMX not enabled (by BIOS) in MSR_IA32_FEAT_CTL on CPU %d\n", cpu);
>  		return false;
>  	}
>  
> @@ -2727,18 +2729,21 @@ static bool kvm_is_vmx_supported(void)
>  
>  static int vmx_check_processor_compat(void)
>  {
> +	int cpu = raw_smp_processor_id();
>  	struct vmcs_config vmcs_conf;
>  	struct vmx_capability vmx_cap;
>  
>  	if (!kvm_is_vmx_supported())
>  		return -EIO;
>  
> -	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
> +	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0) {
> +		pr_err("Failed to setup VMCS config on CPU %d\n", cpu);
>  		return -EIO;
> +	}
>  	if (nested)
>  		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
> -	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
> -		pr_err("CPU %d feature inconsistency!\n", smp_processor_id());
> +	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config))) {
> +		pr_err("Inconsistent VMCS config on CPU %d\n", cpu);
>  		return -EIO;
>  	}
>  	return 0;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ee9af412ffd4..5a9e74cedbc6 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11967,6 +11967,11 @@ int kvm_arch_hardware_enable(void)
>  	bool stable, backwards_tsc = false;
>  
>  	kvm_user_return_msr_cpu_online();
> +
> +	ret = kvm_x86_check_processor_compatibility();
> +	if (ret)
> +		return ret;
> +
>  	ret = static_call(kvm_x86_hardware_enable)();
>  	if (ret != 0)
>  		return ret;


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 41/50] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-11-30 23:09 ` [PATCH v2 41/50] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Sean Christopherson
@ 2022-12-02 13:06   ` Huang, Kai
  2022-12-02 16:08     ` Sean Christopherson
  0 siblings, 1 reply; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 13:06 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> From: Chao Gao <chao.gao@intel.com>
> 
...

> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>

Perhaps I am wrong, but my recollection is that someone who has a SoB but isn't
the original author should also have a Co-developed-by?

> Reviewed-by: Yuan Yao <yuan.yao@intel.com>
> [sean: drop WARN that IRQs are disabled]
> Signed-off-by: Sean Christopherson <seanjc@google.com>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU
  2022-11-30 23:09 ` [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU Sean Christopherson
  2022-12-02 13:03   ` Huang, Kai
@ 2022-12-02 13:36   ` Huang, Kai
  2022-12-02 16:04     ` Sean Christopherson
  1 sibling, 1 reply; 77+ messages in thread
From: Huang, Kai @ 2022-12-02 13:36 UTC (permalink / raw)
  To: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	Christopherson,,
	Sean, paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel,
	imbrenda, paul, mjrosato, vkuznets, anup
  Cc: oliver.upton, kvm, cohuck, farosas, david, james.morse, Yao,
	Yuan, alexandru.elisei, linux-s390, linux-kernel, mpe, Yamahata,
	Isaku, kvmarm, tglx, suzuki.poulose, kvm-riscv, linuxppc-dev,
	linux-arm-kernel, linux-mips, kvmarm, philmd, atishp,
	linux-riscv, Gao, Chao

On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11967,6 +11967,11 @@ int kvm_arch_hardware_enable(void)
>  	bool stable, backwards_tsc = false;
>  
>  	kvm_user_return_msr_cpu_online();
> +
> +	ret = kvm_x86_check_processor_compatibility();
> +	if (ret)
> +		return ret;
> +
>  	ret = static_call(kvm_x86_hardware_enable)();
>  	if (ret != 0)
>  		return ret;

Thinking more, AFAICT, kvm_x86_vendor_init() so far still does the compatibility
check on all online cpus.  Since kvm_arch_hardware_enable() now also does the
compatibility check, IIUC the compatibility check will be done twice -- once in
kvm_x86_vendor_init() and once in hardware_enable_all() when creating the first
VM.

Do you think it's still worth doing the compatibility check in
kvm_x86_vendor_init()?

The behaviour difference should be "KVM module fails to load" vs "failing to
create the first VM" IIUC.  I don't know whether the former is better than the
latter, but it seems the duplicated compatibility checking isn't needed?



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU
  2022-12-02 13:36   ` Huang, Kai
@ 2022-12-02 16:04     ` Sean Christopherson
  0 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-12-02 16:04 UTC (permalink / raw)
  To: Huang, Kai
  Cc: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel, imbrenda,
	paul, mjrosato, vkuznets, anup, oliver.upton, kvm, cohuck,
	farosas, david, james.morse, Yao, Yuan, alexandru.elisei,
	linux-s390, linux-kernel, mpe, Yamahata, Isaku, kvmarm, tglx,
	suzuki.poulose, kvm-riscv, linuxppc-dev, linux-arm-kernel,
	linux-mips, kvmarm, philmd, atishp, linux-riscv, Gao, Chao

On Fri, Dec 02, 2022, Huang, Kai wrote:
> On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -11967,6 +11967,11 @@ int kvm_arch_hardware_enable(void)
> >  	bool stable, backwards_tsc = false;
> >  
> >  	kvm_user_return_msr_cpu_online();
> > +
> > +	ret = kvm_x86_check_processor_compatibility();
> > +	if (ret)
> > +		return ret;
> > +
> >  	ret = static_call(kvm_x86_hardware_enable)();
> >  	if (ret != 0)
> >  		return ret;
> 
> Thinking more, AFAICT, kvm_x86_vendor_init() so far still does the compatibility
> check on all online cpus.  Since kvm_arch_hardware_enable() now also does the
> compatibility check, IIUC the compatibility check will be done twice -- once in
> kvm_x86_vendor_init() and once in hardware_enable_all() when creating the first
> VM.
> 
> Do you think it's still worth doing the compatibility check in
> kvm_x86_vendor_init()?
> 
> The behaviour difference should be "KVM module fails to load" vs "failing to
> create the first VM" IIUC.  I don't know whether the former is better than the
> latter, but it seems the duplicated compatibility checking isn't needed?

It's not strictly needed, but I think it's worth keeping.  The duplicate checking
annoys me too, and I considered removing it multiple times when creating this
series.  But, if there is a hardware incompatibility for whatever reason, failing
to load and thus not instantiating /dev/kvm is friendlier to userspace, e.g.
userspace can immediately flag the platform as potentially flaky, whereas
detecting the likely hardware issue when VM creation fails would essentially require
scraping the kernel logs.
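
To make the trade-off concrete, here is a minimal userspace sketch (plain C,
not kernel code) of the two check points.  Every name in it
(model_vendor_init(), model_hardware_enable_all(), cpu_compatible[],
dev_kvm_registered) is made up for illustration; only the ordering is meant to
mirror the discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_MODEL_CPUS 4

/* Hypothetical stand-in for the real per-CPU VMX/SVM feature checks. */
static bool cpu_compatible[NR_MODEL_CPUS] = { true, true, true, true };

/* Models whether /dev/kvm has been exposed to userspace. */
static bool dev_kvm_registered;

static int model_check_cpu_compat(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	return cpu_compatible[cpu] ? 0 : -EIO;
}

/* Load time: any incompatible CPU means /dev/kvm never appears. */
static int model_vendor_init(void)
{
	for (int cpu = 0; cpu < NR_MODEL_CPUS; cpu++) {
		int r = model_check_cpu_compat(cpu);

		if (r)
			return r;
	}
	dev_kvm_registered = true;
	return 0;
}

/* First VM creation: the same check runs again as part of enabling. */
static int model_hardware_enable_all(void)
{
	if (!dev_kvm_registered)
		return -ENODEV;	/* userspace has no /dev/kvm to open */

	for (int cpu = 0; cpu < NR_MODEL_CPUS; cpu++) {
		int r = model_check_cpu_compat(cpu);

		if (r)
			return r;
	}
	return 0;
}
```

With a bad CPU present at load time, model_vendor_init() fails and
dev_kvm_registered stays false, which is the "userspace can tell immediately"
case.  If a CPU only turns bad afterwards, the failure surfaces later from
model_hardware_enable_all() instead.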

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 41/50] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section
  2022-12-02 13:06   ` Huang, Kai
@ 2022-12-02 16:08     ` Sean Christopherson
  0 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-12-02 16:08 UTC (permalink / raw)
  To: Huang, Kai
  Cc: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel, imbrenda,
	paul, mjrosato, vkuznets, anup, oliver.upton, kvm, cohuck,
	farosas, david, james.morse, Yao, Yuan, alexandru.elisei,
	linux-s390, linux-kernel, mpe, Yamahata, Isaku, kvmarm, tglx,
	suzuki.poulose, kvm-riscv, linuxppc-dev, linux-arm-kernel,
	linux-mips, kvmarm, philmd, atishp, linux-riscv, Gao, Chao

On Fri, Dec 02, 2022, Huang, Kai wrote:
> On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> > From: Chao Gao <chao.gao@intel.com>
> > 
> ...
> 
> > 
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Chao Gao <chao.gao@intel.com>
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Perhaps I am wrong, but my recollection is that someone who has a SoB but isn't
> the original author should also have a Co-developed-by?

This is the case where a patch is passed along as-is, e.g. same as when
maintainers apply a patch.  Isaku posted Chao's patch, and then I came along and
grabbed the patch that Isaku posted.  I could go back and grab Chao's patch
directly, but Yuan's review was provided for the version Isaku posted, so I
grabbed that version.

> > Reviewed-by: Yuan Yao <yuan.yao@intel.com>
> > [sean: drop WARN that IRQs are disabled]
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> 

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling
  2022-12-02 12:59   ` Huang, Kai
@ 2022-12-02 16:31     ` Sean Christopherson
  0 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-12-02 16:31 UTC (permalink / raw)
  To: Huang, Kai
  Cc: chenhuacai, maz, frankja, borntraeger, farman, aou, palmer,
	paul.walmsley, pbonzini, dwmw2, aleksandar.qemu.devel, imbrenda,
	paul, mjrosato, vkuznets, anup, oliver.upton, kvm, cohuck,
	farosas, david, james.morse, Yao, Yuan, alexandru.elisei,
	linux-s390, linux-kernel, mpe, Yamahata, Isaku, kvmarm, tglx,
	suzuki.poulose, kvm-riscv, linuxppc-dev, linux-arm-kernel,
	linux-mips, kvmarm, philmd, atishp, linux-riscv, Gao, Chao

On Fri, Dec 02, 2022, Huang, Kai wrote:
> On Wed, 2022-11-30 at 23:09 +0000, Sean Christopherson wrote:
> > From: Chao Gao <chao.gao@intel.com>
> > 
> > Disable CPU hotplug when enabling/disabling hardware to prevent the
> > corner case where if the following sequence occurs:
> > 
> >   1. A hotplugged CPU marks itself online in cpu_online_mask
> >   2. The hotplugged CPU enables interrupts before invoking KVM's ONLINE
> >      callback
> >   3. hardware_{en,dis}able_all() is invoked on another CPU
> > 
> > the hotplugged CPU will be included in on_each_cpu() and thus get sent
> > through hardware_{en,dis}able_nolock() before kvm_online_cpu() is called.
> 
> Should we explicitly call out what the consequence of such a case is?  Otherwise
> it's hard to tell whether this truly is an issue.
>
> IIUC, since the compatibility check has now already been moved to
> kvm_arch_hardware_enable(), the consequence is that hardware_enable_all() will
> fail if the now-online cpu isn't compatible, which will result in failing to
> create the first VM.  This isn't ideal since the incompatible cpu should be
> prevented from going online instead.

Actually, in that specific scenario, KVM should not reject the CPU.  E.g. if KVM
is autoloaded (common with systemd and/or qemu-kvm installed), but not used by
userspace, then KVM is overstepping by rejecting the incompatible CPU since the
user likely cares more about onlining a CPU than they do about KVM.

> > KVM currently fudges around this race by keeping track of which CPUs have
> > done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which
> > cpus have virtualization enabled"), but that's an inefficient, convoluted,
> > and hacky solution.

...

> > +	/*
> > +	 * Compatibility checks are done when loading KVM and when enabling
> > +	 * hardware, e.g. during CPU hotplug, to ensure all online CPUs are
> > +	 * compatible, i.e. KVM should never perform a compatibility check on
> > +	 * an offline CPU.
> > +	 */
> > +	WARN_ON(!cpu_online(cpu));
> 
> IMHO this chunk logically belongs in the previous patch.  IIUC disabling CPU
> hotplug during hardware_enable_all() has no relationship to this WARN().

Hmm, yeah, I agree.  I'll move it.

> >  static int hardware_enable_all(void)
> >  {
> >  	int r = 0;
> >  
> > +	/*
> > +	 * When onlining a CPU, cpu_online_mask is set before kvm_online_cpu()
> > +	 * is called, and so on_each_cpu() between them includes the CPU that
> > +	 * is being onlined.  As a result, hardware_enable_nolock() may get
> > +	 * invoked before kvm_online_cpu(), which also enables hardware if the
> > +	 * usage count is non-zero.  Disable CPU hotplug to avoid attempting to
> > +	 * enable hardware multiple times.
> 
> It won't enable hardware multiple times, right?  hardware_enable_nolock()
> has the check below:
> 
>         if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
>                 return;
> 
>         cpumask_set_cpu(cpu, cpus_hardware_enabled);
> 
> IIUC the only issue is the one that I replied in the changelog.
> 
> Or perhaps I am missing something?

You're not missing anything in terms of code.  What the comment means by "attempting"
in this case is calling hardware_enable_nolock().  As called out in the changelog,
guarding against this race with cpus_hardware_enabled is a hack, i.e. KVM should
not have to rely on a per-CPU flag.

 : KVM currently fudges around this race by keeping track of which CPUs have
 : done hardware enabling (see commit 1b6c016818a5 "KVM: Keep track of which
 : cpus have virtualization enabled"), but that's an inefficient, convoluted,
 : and hacky solution.

I actually considered removing the per-CPU flag, but decided not to because it's
simpler to blast

	on_each_cpu(hardware_disable_nolock, ...)

in kvm_reboot() and if enabling hardware fails on one or more CPUs, and taking a
#UD on VMXOFF in the latter case is really unnecessary, i.e. the flag is nice to
have for other reasons.
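
As a rough illustration of why the flag stays useful, below is a userspace
sketch in plain C, not the kernel code.  The model_* names and the
fault-injection array are invented for the example; the counter stands in for
an operation like VMXOFF that must not run on a CPU that never enabled
hardware.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_MODEL_CPUS 4

static bool model_hw_enabled[NR_MODEL_CPUS];	/* models the per-CPU flag */
static bool model_enable_fails[NR_MODEL_CPUS];	/* hypothetical fault injection */
static int model_arch_disable_calls;		/* counts VMXOFF-like operations */

static int model_enable_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);

	/* The proposed WARN_ON_ONCE() would sit on this early return. */
	if (model_hw_enabled[cpu])
		return 0;

	if (model_enable_fails[cpu])
		return -1;

	model_hw_enabled[cpu] = true;
	return 0;
}

static void model_disable_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);

	/* Never enabled (or already disabled): skip the arch operation. */
	if (!model_hw_enabled[cpu])
		return;

	model_arch_disable_calls++;
	model_hw_enabled[cpu] = false;
}

/* Blast disable at every CPU; the flag filters out the ones to skip. */
static void model_disable_all(void)
{
	for (int cpu = 0; cpu < NR_MODEL_CPUS; cpu++)
		model_disable_cpu(cpu);
}

static int model_enable_all(void)
{
	int failed = 0;

	for (int cpu = 0; cpu < NR_MODEL_CPUS; cpu++) {
		if (model_enable_cpu(cpu))
			failed++;
	}

	if (failed) {
		/* Unwind without tracking which CPUs actually succeeded. */
		model_disable_all();
		return -1;
	}
	return 0;
}
```

If enabling fails on one CPU, the unwind path only performs the arch-level
disable on the CPUs whose flag is set, and a second model_disable_all() (think
reboot) is a no-op.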

That said, after this patch, KVM should be able to WARN in the enable path.  I'll
test that and do a more thorough audit; unless I'm missing something, I'll add a
patch to do:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b8c6bfb46066..ee896fa2f196 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5027,7 +5027,7 @@ static int kvm_usage_count;
 
 static int __hardware_enable_nolock(void)
 {
-       if (__this_cpu_read(hardware_enabled))
+       if (WARN_ON_ONCE(__this_cpu_read(hardware_enabled)))
                return 0;
 
        if (kvm_arch_hardware_enable()) {


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code
  2022-11-30 23:09 ` [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code Sean Christopherson
  2022-12-02 12:16   ` Huang, Kai
@ 2022-12-05 20:52   ` Isaku Yamahata
  2022-12-05 21:12     ` Sean Christopherson
  1 sibling, 1 reply; 77+ messages in thread
From: Isaku Yamahata @ 2022-12-05 20:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner, isaku.yamahata

On Wed, Nov 30, 2022 at 11:09:15PM +0000,
Sean Christopherson <seanjc@google.com> wrote:

> Move the CPU compatibility checks to pure x86 code, i.e. drop x86's use
> of the common kvm_arch_check_processor_compat() arch hook.  x86 is the only
> architecture that "needs" to do per-CPU compatibility checks; moving
> the logic to x86 will allow dropping the common code, and will also
> give x86 more control over when/how the compatibility checks are
> performed, e.g. TDX will need to enable hardware (do VMXON) in order to
> perform compatibility checks.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/svm.c |  2 +-
>  arch/x86/kvm/vmx/vmx.c |  2 +-
>  arch/x86/kvm/x86.c     | 49 ++++++++++++++++++++++++++++++++----------
>  3 files changed, 40 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 19e81a99c58f..d7ea1c1175c2 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5103,7 +5103,7 @@ static int __init svm_init(void)
>  	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
>  	 * exposed to userspace!
>  	 */
> -	r = kvm_init(&svm_init_ops, sizeof(struct vcpu_svm),
> +	r = kvm_init(NULL, sizeof(struct vcpu_svm),
>  		     __alignof__(struct vcpu_svm), THIS_MODULE);
>  	if (r)
>  		goto err_kvm_init;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 654d81f781da..8deb1bd60c10 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8592,7 +8592,7 @@ static int __init vmx_init(void)
>  	 * Common KVM initialization _must_ come last, after this, /dev/kvm is
>  	 * exposed to userspace!
>  	 */
> -	r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx),
> +	r = kvm_init(NULL, sizeof(struct vcpu_vmx),
>  		     __alignof__(struct vcpu_vmx), THIS_MODULE);
>  	if (r)
>  		goto err_kvm_init;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 66f16458aa97..3571bc968cf8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9277,10 +9277,36 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
>  	kvm_pmu_ops_update(ops->pmu_ops);
>  }
>  
> +struct kvm_cpu_compat_check {
> +	struct kvm_x86_init_ops *ops;
> +	int *ret;

minor nitpick: just int ret. I don't see the necessity of the pointer.
Anyway overall it looks good to me.
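
A rough userspace sketch of that suggestion, not the actual patch: the result
is stored by value in the struct and read back by the caller after each
cross-CPU call.  All model_* names are hypothetical, and model_call_on_cpu()
merely stands in for smp_call_function_single() by running the callback in
place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for kvm_x86_init_ops; only one hook is modeled. */
struct model_init_ops {
	int (*check_processor_compatibility)(void);
};

/* The result is stored by value, so no separate 'r' needs to be pointed at. */
struct model_cpu_compat_check {
	struct model_init_ops *ops;
	int ret;
};

/* Two sample vendor hooks, one passing and one failing. */
static int model_compat_ok(void)
{
	return 0;
}

static int model_compat_bad(void)
{
	return -EIO;
}

/* Callback with the void * signature that cross-CPU call helpers expect. */
static void model_check_cpu_compat(void *data)
{
	struct model_cpu_compat_check *c = data;

	c->ret = c->ops->check_processor_compatibility();
}

/* Userspace stand-in for a cross-CPU call: simply runs fn in place. */
static void model_call_on_cpu(int cpu, void (*fn)(void *), void *data)
{
	(void)cpu;
	fn(data);
}

static int model_check_all_cpus(struct model_init_ops *ops, int nr_cpus)
{
	struct model_cpu_compat_check c = { .ops = ops, .ret = 0 };

	assert(ops != NULL && ops->check_processor_compatibility != NULL);

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		model_call_on_cpu(cpu, model_check_cpu_compat, &c);
		if (c.ret < 0)
			return c.ret;
	}
	return 0;
}
```

The caller tests c.ret directly after each call, so the loop stops at the
first CPU that reports an error.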

Reviewed-by: Isaku Yamahata <isaku.yamahata@intel.com>

> +};
> +
> +static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> +{
> +	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> +
> +	WARN_ON(!irqs_disabled());
> +
> +	if (__cr4_reserved_bits(cpu_has, c) !=
> +	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> +		return -EIO;
> +
> +	return ops->check_processor_compatibility();
> +}
> +
> +static void kvm_x86_check_cpu_compat(void *data)
> +{
> +	struct kvm_cpu_compat_check *c = data;
> +
> +	*c->ret = kvm_x86_check_processor_compatibility(c->ops);
> +}
> +
>  static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  {
> +	struct kvm_cpu_compat_check c;
>  	u64 host_pat;
> -	int r;
> +	int r, cpu;
>  
>  	if (kvm_x86_ops.hardware_enable) {
>  		pr_err("kvm: already loaded vendor module '%s'\n", kvm_x86_ops.name);
> @@ -9360,6 +9386,14 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	if (r != 0)
>  		goto out_mmu_exit;
>  
> +	c.ret = &r;
> +	c.ops = ops;
> +	for_each_online_cpu(cpu) {
> +		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &c, 1);
> +		if (r < 0)

Here it can be "c.ret < 0".

> +			goto out_hardware_unsetup;
> +	}
> +
>  	/*
>  	 * Point of no return!  DO NOT add error paths below this point unless
>  	 * absolutely necessary, as most operations from this point forward
> @@ -9402,6 +9436,8 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	kvm_init_msr_list();
>  	return 0;
>  
> +out_hardware_unsetup:
> +	ops->runtime_ops->hardware_unsetup();
>  out_mmu_exit:
>  	kvm_mmu_vendor_module_exit();
>  out_free_percpu:
> @@ -12037,16 +12073,7 @@ void kvm_arch_hardware_disable(void)
>  
>  int kvm_arch_check_processor_compat(void *opaque)
>  {
> -	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> -	struct kvm_x86_init_ops *ops = opaque;
> -
> -	WARN_ON(!irqs_disabled());
> -
> -	if (__cr4_reserved_bits(cpu_has, c) !=
> -	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
> -		return -EIO;
> -
> -	return ops->check_processor_compatibility();
> +	return 0;
>  }
>  
>  bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
> -- 
> 2.38.1.584.g0f3c55d4c2-goog
> 

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops)
  2022-11-30 23:09 ` [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops) Sean Christopherson
  2022-12-02 13:01   ` Huang, Kai
@ 2022-12-05 21:04   ` Isaku Yamahata
  1 sibling, 0 replies; 77+ messages in thread
From: Isaku Yamahata @ 2022-12-05 21:04 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner, isaku.yamahata

On Wed, Nov 30, 2022 at 11:09:23PM +0000,
Sean Christopherson <seanjc@google.com> wrote:

> Move the .check_processor_compatibility() callback from kvm_x86_init_ops
> to kvm_x86_ops to allow a future patch to do compatibility checks during
> CPU hotplug.
> 
> Do kvm_ops_update() before compat checks so that static_call() can be
> used during compat checks.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  1 +
>  arch/x86/include/asm/kvm_host.h    |  3 ++-
>  arch/x86/kvm/svm/svm.c             |  5 +++--
>  arch/x86/kvm/vmx/vmx.c             | 16 +++++++--------
>  arch/x86/kvm/x86.c                 | 31 +++++++++++-------------------
>  5 files changed, 25 insertions(+), 31 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index abccd51dcfca..dba2909e5ae2 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -14,6 +14,7 @@ BUILD_BUG_ON(1)
>   * to make a definition optional, but in this case the default will
>   * be __static_call_return0.
>   */
> +KVM_X86_OP(check_processor_compatibility)
>  KVM_X86_OP(hardware_enable)
>  KVM_X86_OP(hardware_disable)
>  KVM_X86_OP(hardware_unsetup)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d79aedf70908..ba74fea6850b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1518,6 +1518,8 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
>  struct kvm_x86_ops {
>  	const char *name;
>  
> +	int (*check_processor_compatibility)(void);
> +
>  	int (*hardware_enable)(void);
>  	void (*hardware_disable)(void);
>  	void (*hardware_unsetup)(void);
> @@ -1729,7 +1731,6 @@ struct kvm_x86_nested_ops {
>  };
>  
>  struct kvm_x86_init_ops {
> -	int (*check_processor_compatibility)(void);
>  	int (*hardware_setup)(void);
>  	unsigned int (*handle_intel_pt_intr)(void);
>  
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 9f94efcb9aa6..c2e95c0d9fd8 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -543,7 +543,7 @@ static bool kvm_is_svm_supported(void)
>  	return true;
>  }
>  
> -static int __init svm_check_processor_compat(void)
> +static int svm_check_processor_compat(void)
>  {
>  	if (!kvm_is_svm_supported())
>  		return -EIO;
> @@ -4695,6 +4695,8 @@ static int svm_vm_init(struct kvm *kvm)
>  static struct kvm_x86_ops svm_x86_ops __initdata = {
>  	.name = KBUILD_MODNAME,
>  
> +	.check_processor_compatibility = svm_check_processor_compat,
> +
>  	.hardware_unsetup = svm_hardware_unsetup,
>  	.hardware_enable = svm_hardware_enable,
>  	.hardware_disable = svm_hardware_disable,
> @@ -5079,7 +5081,6 @@ static __init int svm_hardware_setup(void)
>  
>  static struct kvm_x86_init_ops svm_init_ops __initdata = {
>  	.hardware_setup = svm_hardware_setup,
> -	.check_processor_compatibility = svm_check_processor_compat,
>  
>  	.runtime_ops = &svm_x86_ops,
>  	.pmu_ops = &amd_pmu_ops,
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2a8a6e481c76..6416ed5b7f89 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2520,8 +2520,7 @@ static bool cpu_has_perf_global_ctrl_bug(void)
>  	return false;
>  }
>  
> -static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
> -				      u32 msr, u32 *result)
> +static int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt, u32 msr, u32 *result)
>  {
>  	u32 vmx_msr_low, vmx_msr_high;
>  	u32 ctl = ctl_min | ctl_opt;
> @@ -2539,7 +2538,7 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
>  	return 0;
>  }
>  
> -static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
> +static u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
>  {
>  	u64 allowed;
>  
> @@ -2548,8 +2547,8 @@ static __init u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
>  	return  ctl_opt & allowed;
>  }
>  
> -static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> -				    struct vmx_capability *vmx_cap)
> +static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> +			     struct vmx_capability *vmx_cap)
>  {
>  	u32 vmx_msr_low, vmx_msr_high;
>  	u32 _pin_based_exec_control = 0;
> @@ -2710,7 +2709,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	return 0;
>  }
>  
> -static bool __init kvm_is_vmx_supported(void)
> +static bool kvm_is_vmx_supported(void)
>  {
>  	if (!cpu_has_vmx()) {
>  		pr_err("CPU doesn't support VMX\n");
> @@ -2726,7 +2725,7 @@ static bool __init kvm_is_vmx_supported(void)
>  	return true;
>  }
>  
> -static int __init vmx_check_processor_compat(void)
> +static int vmx_check_processor_compat(void)
>  {
>  	struct vmcs_config vmcs_conf;
>  	struct vmx_capability vmx_cap;
> @@ -8104,6 +8103,8 @@ static void vmx_vm_destroy(struct kvm *kvm)
>  static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  	.name = KBUILD_MODNAME,
>  
> +	.check_processor_compatibility = vmx_check_processor_compat,
> +
>  	.hardware_unsetup = vmx_hardware_unsetup,
>  
>  	.hardware_enable = vmx_hardware_enable,
> @@ -8501,7 +8502,6 @@ static __init int hardware_setup(void)
>  }
>  
>  static struct kvm_x86_init_ops vmx_init_ops __initdata = {
> -	.check_processor_compatibility = vmx_check_processor_compat,
>  	.hardware_setup = hardware_setup,
>  	.handle_intel_pt_intr = NULL,
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5551f3552f08..ee9af412ffd4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9279,12 +9279,7 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
>  	kvm_pmu_ops_update(ops->pmu_ops);
>  }
>  
> -struct kvm_cpu_compat_check {
> -	struct kvm_x86_init_ops *ops;
> -	int *ret;
> -};
> -
> -static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
> +static int kvm_x86_check_processor_compatibility(void)
>  {
>  	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
>  
> @@ -9294,19 +9289,16 @@ static int kvm_x86_check_processor_compatibility(struct kvm_x86_init_ops *ops)
>  	    __cr4_reserved_bits(cpu_has, &boot_cpu_data))
>  		return -EIO;
>  
> -	return ops->check_processor_compatibility();
> +	return static_call(kvm_x86_check_processor_compatibility)();
>  }
>  
> -static void kvm_x86_check_cpu_compat(void *data)
> +static void kvm_x86_check_cpu_compat(void *ret)
>  {
> -	struct kvm_cpu_compat_check *c = data;
> -
> -	*c->ret = kvm_x86_check_processor_compatibility(c->ops);
> +	*(int *)ret = kvm_x86_check_processor_compatibility();
>  }
>  
>  static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  {
> -	struct kvm_cpu_compat_check c;
>  	u64 host_pat;
>  	int r, cpu;
>  
> @@ -9377,12 +9369,12 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	if (r != 0)
>  		goto out_mmu_exit;
>  
> -	c.ret = &r;
> -	c.ops = ops;
> +	kvm_ops_update(ops);
> +
>  	for_each_online_cpu(cpu) {
> -		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &c, 1);
> +		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &r, 1);

Ah, here the pointer makes sense, so scratch my comment on the previous patch.

Thanks,

>  		if (r < 0)
> -			goto out_hardware_unsetup;
> +			goto out_unwind_ops;
>  	}
>  
>  	/*
> @@ -9390,8 +9382,6 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	 * absolutely necessary, as most operations from this point forward
>  	 * require unwinding.
>  	 */
> -	kvm_ops_update(ops);
> -
>  	kvm_timer_init();
>  
>  	if (pi_inject_timer == -1)
> @@ -9427,8 +9417,9 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  	kvm_init_msr_list();
>  	return 0;
>  
> -out_hardware_unsetup:
> -	ops->runtime_ops->hardware_unsetup();
> +out_unwind_ops:
> +	kvm_x86_ops.hardware_enable = NULL;
> +	static_call(kvm_x86_hardware_unsetup)();
>  out_mmu_exit:
>  	kvm_mmu_vendor_module_exit();
>  out_free_percpu:
> -- 
> 2.38.1.584.g0f3c55d4c2-goog
> 

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

* Re: [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code
  2022-12-05 20:52   ` Isaku Yamahata
@ 2022-12-05 21:12     ` Sean Christopherson
  0 siblings, 0 replies; 77+ messages in thread
From: Sean Christopherson @ 2022-12-05 21:12 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Marc Zyngier, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

On Mon, Dec 05, 2022, Isaku Yamahata wrote:
> On Wed, Nov 30, 2022 at 11:09:15PM +0000,
> > index 66f16458aa97..3571bc968cf8 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -9277,10 +9277,36 @@ static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
> >  	kvm_pmu_ops_update(ops->pmu_ops);
> >  }
> >  
> > +struct kvm_cpu_compat_check {
> > +	struct kvm_x86_init_ops *ops;
> > +	int *ret;
> 
> minor nitpick: just int ret. I don't see the necessity of the pointer.
> Anyway overall it looks good to me.

...

> > @@ -9360,6 +9386,14 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
> >  	if (r != 0)
> >  		goto out_mmu_exit;
> >  
> > +	c.ret = &r;
> > +	c.ops = ops;
> > +	for_each_online_cpu(cpu) {
> > +		smp_call_function_single(cpu, kvm_x86_check_cpu_compat, &c, 1);
> > +		if (r < 0)
> 
> Here it can be "c.ret < 0".

No, because the below goto leads to "return r", i.e. "c.ret" needs to be propagated
to "r".  That's why the code does the admittedly funky "int *ret" thing.

FWIW, this gets cleaned up in the end.  "struct kvm_cpu_compat_check" goes away and
"&r" is passed directly to kvm_x86_check_cpu_compat().
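
To illustrate the "int *ret" pattern outside the kernel, here is a minimal
userspace sketch.  Everything in it is an assumption made for the example:
run_on_cpu() is a hypothetical stand-in for smp_call_function_single(),
current_cpu fakes smp_processor_id(), and check_current_cpu() is a made-up
check that fails on CPU 2.  The point is only that the callback returns void,
so the result has to be written through a pointer, and pointing it at "r"
means the error path can simply "return r".

```c
#include <errno.h>

/* Hypothetical stand-in for struct kvm_cpu_compat_check. */
struct compat_check {
	int (*check)(void);	/* per-CPU check to run */
	int *ret;		/* where the callback stores the result */
};

/* Fakes smp_processor_id() for this single-threaded sketch. */
static int current_cpu;

/* Made-up check: pretend CPU 2 is incompatible. */
static int check_current_cpu(void)
{
	return current_cpu == 2 ? -EIO : 0;
}

/* Hypothetical stand-in for smp_call_function_single(): run func "on" cpu. */
static void run_on_cpu(int cpu, void (*func)(void *), void *data)
{
	current_cpu = cpu;
	func(data);
}

/* The callback cannot return a value, so it writes through c->ret. */
static void check_cpu_compat(void *data)
{
	struct compat_check *c = data;

	*c->ret = c->check();
}

int vendor_init(int nr_cpus)
{
	struct compat_check c;
	int r = 0, cpu;

	c.check = check_current_cpu;
	c.ret = &r;	/* results land directly in r */

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		run_on_cpu(cpu, check_cpu_compat, &c);
		if (r < 0)
			goto out;
	}
	return 0;

out:
	/* r already holds the failing CPU's error code. */
	return r;
}
```

With "int ret" embedded in the struct instead, the loop would need an extra
"r = c.ret" before the goto to get the same behavior.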

> > +			goto out_hardware_unsetup;

* Re: [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling
  2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
                   ` (50 preceding siblings ...)
  2022-12-02  8:02 ` [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Chao Gao
@ 2022-12-27 13:02 ` Paolo Bonzini
  2022-12-28 11:22   ` Marc Zyngier
  51 siblings, 1 reply; 77+ messages in thread
From: Paolo Bonzini @ 2022-12-27 13:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Anup Patel,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger,
	Janosch Frank, Claudio Imbrenda, Matthew Rosato, Eric Farman,
	Vitaly Kuznetsov, David Woodhouse, Paul Durrant, James Morse,
	Alexandru Elisei, Suzuki K Poulose, Oliver Upton, Atish Patra,
	David Hildenbrand, kvm, linux-arm-kernel, kvmarm, kvmarm,
	linux-mips, linuxppc-dev, kvm-riscv, linux-riscv, linux-s390,
	linux-kernel, Yuan Yao, Cornelia Huck, Isaku Yamahata,
	Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

Queued, thanks.  I will leave this in kvm/queue after testing everything
else and moving it to kvm/next; this way, we can wait for test results
on other architectures.

Paolo



* Re: [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling
  2022-12-27 13:02 ` Paolo Bonzini
@ 2022-12-28 11:22   ` Marc Zyngier
  2022-12-28 11:58     ` Paolo Bonzini
  2022-12-29 20:52     ` Paolo Bonzini
  0 siblings, 2 replies; 77+ messages in thread
From: Marc Zyngier @ 2022-12-28 11:22 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

On 2022-12-27 13:02, Paolo Bonzini wrote:
> Queued, thanks.  I will leave this in kvm/queue after testing everything
> else and moving it to kvm/next; this way, we can wait for test results
> on other architectures.

Can you please make this a topic branch, and if possible based
on a released -rc? It would make it a lot easier for everyone.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling
  2022-12-28 11:22   ` Marc Zyngier
@ 2022-12-28 11:58     ` Paolo Bonzini
  2022-12-29 20:52     ` Paolo Bonzini
  1 sibling, 0 replies; 77+ messages in thread
From: Paolo Bonzini @ 2022-12-28 11:58 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Sean Christopherson, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

On 12/28/22 12:22, Marc Zyngier wrote:
> 
>> Queued, thanks.  I will leave this in kvm/queue after testing everything
>> else and moving it to kvm/next; this way, we can wait for test results
>> on other architectures.
> 
> Can you please make this a topic branch, and if possible based
> on a released -rc? It would make it a lot easier for everyone.

Yes, I will (it will be based on 6.2-rc1 + pull request for rc2 that I'm 
preparing + x86 changes that this conflicts with).

Paolo


* Re: [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling
  2022-12-28 11:22   ` Marc Zyngier
  2022-12-28 11:58     ` Paolo Bonzini
@ 2022-12-29 20:52     ` Paolo Bonzini
  1 sibling, 0 replies; 77+ messages in thread
From: Paolo Bonzini @ 2022-12-29 20:52 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Sean Christopherson, Huacai Chen, Aleksandar Markovic,
	Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
	Matthew Rosato, Eric Farman, Vitaly Kuznetsov, David Woodhouse,
	Paul Durrant, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Oliver Upton, Atish Patra, David Hildenbrand, kvm,
	linux-arm-kernel, kvmarm, kvmarm, linux-mips, linuxppc-dev,
	kvm-riscv, linux-riscv, linux-s390, linux-kernel, Yuan Yao,
	Cornelia Huck, Isaku Yamahata, Philippe Mathieu-Daudé,
	Fabiano Rosas, Michael Ellerman, Kai Huang, Chao Gao,
	Thomas Gleixner

On 12/28/22 12:22, Marc Zyngier wrote:
> 
>> Queued, thanks.  I will leave this in kvm/queue after testing everything
>> else and moving it to kvm/next; this way, we can wait for test results
>> on other architectures.
> 
> Can you please make this a topic branch, and if possible based
> on a released -rc? It would make it a lot easier for everyone.

This is now refs/heads/kvm-hw-enable-refactor in 
https://git.kernel.org/pub/scm/virt/kvm/kvm.git.

The commits for this series start at hash fc471e831016.

Paolo


end of thread, other threads:[~2022-12-29 20:53 UTC | newest]

Thread overview: 77+ messages
2022-11-30 23:08 [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 01/50] KVM: Register /dev/kvm as the _very_ last thing during initialization Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 02/50] KVM: Initialize IRQ FD after arch hardware setup Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 03/50] KVM: Allocate cpus_hardware_enabled " Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 04/50] KVM: Teardown VFIO ops earlier in kvm_exit() Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 05/50] KVM: s390: Unwind kvm_arch_init() piece-by-piece() if a step fails Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 06/50] KVM: s390: Move hardware setup/unsetup to init/exit Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 07/50] KVM: x86: Do timer initialization after XCR0 configuration Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 08/50] KVM: x86: Move hardware setup/unsetup to init/exit Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 09/50] KVM: Drop arch hardware (un)setup hooks Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 10/50] KVM: VMX: Reset eVMCS controls in VP assist page during hardware disabling Sean Christopherson
2022-12-01 15:42   ` Vitaly Kuznetsov
2022-11-30 23:08 ` [PATCH v2 11/50] KVM: VMX: Don't bother disabling eVMCS static key on module exit Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 12/50] KVM: VMX: Move Hyper-V eVMCS initialization to helper Sean Christopherson
2022-12-01 15:22   ` Vitaly Kuznetsov
2022-11-30 23:08 ` [PATCH v2 13/50] KVM: x86: Move guts of kvm_arch_init() to standalone helper Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 14/50] KVM: VMX: Do _all_ initialization before exposing /dev/kvm to userspace Sean Christopherson
2022-11-30 23:08 ` [PATCH v2 15/50] KVM: x86: Serialize vendor module initialization (hardware setup) Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 16/50] KVM: arm64: Simplify the CPUHP logic Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 17/50] KVM: arm64: Free hypervisor allocations if vector slot init fails Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 18/50] KVM: arm64: Unregister perf callbacks if hypervisor finalization fails Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 19/50] KVM: arm64: Do arm/arch initialization without bouncing through kvm_init() Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 20/50] KVM: arm64: Mark kvm_arm_init() and its unique descendants as __init Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 21/50] KVM: MIPS: Hardcode callbacks to hardware virtualization extensions Sean Christopherson
2022-12-01 22:00   ` Philippe Mathieu-Daudé
2022-12-01 22:49     ` Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 22/50] KVM: MIPS: Setup VZ emulation? directly from kvm_mips_init() Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 23/50] KVM: MIPS: Register die notifier prior to kvm_init() Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 24/50] KVM: RISC-V: Do arch init directly in riscv_kvm_init() Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 25/50] KVM: RISC-V: Tag init functions and data with __init, __ro_after_init Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 26/50] KVM: PPC: Move processor compatibility check to module init Sean Christopherson
2022-12-01  5:21   ` Michael Ellerman
2022-12-01 16:38     ` Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 27/50] KVM: s390: Do s390 specific init without bouncing through kvm_init() Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 28/50] KVM: s390: Mark __kvm_s390_init() and its descendants as __init Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 29/50] KVM: Drop kvm_arch_{init,exit}() hooks Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 30/50] KVM: VMX: Make VMCS configuration/capabilities structs read-only after init Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 31/50] KVM: x86: Do CPU compatibility checks in x86 code Sean Christopherson
2022-12-02 12:16   ` Huang, Kai
2022-12-05 20:52   ` Isaku Yamahata
2022-12-05 21:12     ` Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 32/50] KVM: Drop kvm_arch_check_processor_compat() hook Sean Christopherson
2022-12-02 12:18   ` Huang, Kai
2022-11-30 23:09 ` [PATCH v2 33/50] KVM: x86: Use KBUILD_MODNAME to specify vendor module name Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 34/50] KVM: x86: Unify pr_fmt to use module name for all KVM modules Sean Christopherson
2022-12-01 10:43   ` Paul Durrant
2022-11-30 23:09 ` [PATCH v2 35/50] KVM: VMX: Use current CPU's info to perform "disabled by BIOS?" checks Sean Christopherson
2022-12-02 12:18   ` Huang, Kai
2022-11-30 23:09 ` [PATCH v2 36/50] KVM: x86: Do VMX/SVM support checks directly in vendor code Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 37/50] KVM: VMX: Shuffle support checks and hardware enabling code around Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 38/50] KVM: SVM: Check for SVM support in CPU compatibility checks Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 39/50] KVM: x86: Move CPU compat checks hook to kvm_x86_ops (from kvm_x86_init_ops) Sean Christopherson
2022-12-02 13:01   ` Huang, Kai
2022-12-05 21:04   ` Isaku Yamahata
2022-11-30 23:09 ` [PATCH v2 40/50] KVM: x86: Do compatibility checks when onlining CPU Sean Christopherson
2022-12-02 13:03   ` Huang, Kai
2022-12-02 13:36   ` Huang, Kai
2022-12-02 16:04     ` Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 41/50] KVM: Rename and move CPUHP_AP_KVM_STARTING to ONLINE section Sean Christopherson
2022-12-02 13:06   ` Huang, Kai
2022-12-02 16:08     ` Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 42/50] KVM: Disable CPU hotplug during hardware enabling/disabling Sean Christopherson
2022-12-02 12:59   ` Huang, Kai
2022-12-02 16:31     ` Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 43/50] KVM: Ensure CPU is stable during low level hardware enable/disable Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 44/50] KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 45/50] KVM: Remove on_each_cpu(hardware_disable_nolock) in kvm_exit() Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 46/50] KVM: Use a per-CPU variable to track which CPUs have enabled virtualization Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 47/50] KVM: Make hardware_enable_failed a local variable in the "enable all" path Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 48/50] KVM: Register syscore (suspend/resume) ops early in kvm_init() Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 49/50] KVM: Opt out of generic hardware enabling on s390 and PPC Sean Christopherson
2022-11-30 23:09 ` [PATCH v2 50/50] KVM: Clean up error labels in kvm_init() Sean Christopherson
2022-12-02  8:02 ` [PATCH v2 00/50] KVM: Rework kvm_init() and hardware enabling Chao Gao
2022-12-27 13:02 ` Paolo Bonzini
2022-12-28 11:22   ` Marc Zyngier
2022-12-28 11:58     ` Paolo Bonzini
2022-12-29 20:52     ` Paolo Bonzini
