* [RFC PATCH v6 00/36] KVM: x86: eVMCS rework
@ 2022-08-24  3:01 Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 01/36] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Sean Christopherson
                   ` (36 more replies)
  0 siblings, 37 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

This is what I ended up with as a way to dig ourselves out of the eVMCS
conundrum.  Not well tested, though KUT and selftests pass.  The enforcement
added by "KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses"
is not tested at all (and lacks a changelog).

I don't care if we add a new capability or extend the existing one, my goal
was purely to frame in the KVM internals and show _a_ way to let userspace
opt in.  I do think we need something that isn't CPUID-based though.

Everything from patch 22 onwards should be unchanged from your v5.

Jim Mattson (1):
  KVM: x86: VMX: Replace some Intel model numbers with mnemonics

Sean Christopherson (10):
  KVM: x86: Check for existing Hyper-V vCPU in kvm_hv_vcpu_init()
  KVM: x86: Report error when setting CPUID if Hyper-V allocation fails
  KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also
    enabled
  KVM: nVMX: Use CC() macro to handle eVMCS unsupported controls checks
  KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses
  KVM: nVMX: WARN once and fail VM-Enter if eVMCS sees VMFUNC[63:32] !=
    0
  KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to
    vmcs02
  KVM: nVMX: Always emulate PERF_GLOBAL_CTRL VM-Entry/VM-Exit controls
  KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit kernels/KVM
  KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not
    setup

Vitaly Kuznetsov (25):
  x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  KVM: x86: Zero out entire Hyper-V CPUID cache before processing
    entries
  KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array
  KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
  KVM: nVMX: Support several new fields in eVMCSv1
  KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  KVM: selftests: Switch to updated eVMCSv1 definition
  KVM: nVMX: Support PERF_GLOBAL_CTRL with enlightened VMCS
  KVM: nVMX: Support TSC scaling with enlightened VMCS
  KVM: selftests: Enable TSC scaling in evmcs selftest
  KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
    setup_vmcs_config()
  KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
    in setup_vmcs_config()
  KVM: VMX: Extend VMX controls macro shenanigans
  KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
    setup_vmcs_config()
  KVM: VMX: Add missing VMEXIT controls to vmcs_config
  KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of
    setup_vmcs_config()
  KVM: nVMX: Always set required-1 bits of pinbased_ctls to
    PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
  KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
    nested MSR

 arch/x86/include/asm/hyperv-tlfs.h            |  22 +-
 arch/x86/include/asm/kvm_host.h               |   6 +-
 arch/x86/kvm/cpuid.c                          |  18 +-
 arch/x86/kvm/hyperv.c                         |  70 +++--
 arch/x86/kvm/hyperv.h                         |   6 +-
 arch/x86/kvm/vmx/capabilities.h               |  14 +-
 arch/x86/kvm/vmx/evmcs.c                      | 249 +++++++++++-----
 arch/x86/kvm/vmx/evmcs.h                      |  30 +-
 arch/x86/kvm/vmx/nested.c                     | 109 ++++---
 arch/x86/kvm/vmx/nested.h                     |   2 +-
 arch/x86/kvm/vmx/vmx.c                        | 265 ++++++++----------
 arch/x86/kvm/vmx/vmx.h                        | 174 ++++++++++--
 arch/x86/kvm/x86.c                            |   8 +-
 include/uapi/linux/kvm.h                      |   1 +
 .../selftests/kvm/include/x86_64/evmcs.h      |  45 ++-
 .../selftests/kvm/include/x86_64/vmx.h        |   2 +
 .../testing/selftests/kvm/x86_64/evmcs_test.c |  31 +-
 17 files changed, 695 insertions(+), 357 deletions(-)


base-commit: 372d07084593dc7a399bf9bee815711b1fb1bcf2
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 01/36] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 02/36] x86/hyperv: Update " Sean Christopherson
                   ` (35 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Section 1.9 of TLFS v6.0b says:

"All structures are padded in such a way that fields are aligned
naturally (that is, an 8-byte field is aligned to an offset of 8 bytes
and so on)".

'struct hv_enlightened_vmcs' has a glitch:

...
        struct {
                u32                nested_flush_hypercall:1; /*   836: 0  4 */
                u32                msr_bitmap:1;         /*   836: 1  4 */
                u32                reserved:30;          /*   836: 2  4 */
        } hv_enlightenments_control;                     /*   836     4 */
        u32                        hv_vp_id;             /*   840     4 */
        u64                        hv_vm_id;             /*   844     8 */
        u64                        partition_assist_page; /*   852     8 */
...

And the observed values in 'partition_assist_page' make no sense at
all. Fix the layout by padding the structure properly.
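
For illustration only, the misalignment can be reproduced in a standalone
userspace sketch.  The 836-byte head array is a hypothetical stand-in for
the fields that precede the ones quoted above, and the bitfield is modeled
as a plain u32; neither struct below is the kernel definition.  The packed
attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 836 bytes that precede the quoted fields. */
#define EVMCS_HEAD_BYTES 836

/* Layout before the fix: the 8-byte hv_vm_id lands at offset 844. */
struct evmcs_tail_before {
	uint8_t  head[EVMCS_HEAD_BYTES];
	uint32_t hv_enlightenments_control;	/* a bitfield in the real struct */
	uint32_t hv_vp_id;
	uint64_t hv_vm_id;
	uint64_t partition_assist_page;
} __attribute__((packed));

/* Layout after the fix: one extra u32 of padding restores alignment. */
struct evmcs_tail_after {
	uint8_t  head[EVMCS_HEAD_BYTES];
	uint32_t hv_enlightenments_control;
	uint32_t hv_vp_id;
	uint32_t padding32_2;
	uint64_t hv_vm_id;
	uint64_t partition_assist_page;
} __attribute__((packed));

/* "Naturally aligned" means the offset is a multiple of the field size. */
static inline int is_naturally_aligned(size_t offset, size_t size)
{
	return offset % size == 0;
}

static_assert(offsetof(struct evmcs_tail_before, hv_vm_id) == 844,
	      "matches the pahole output quoted above");
static_assert(offsetof(struct evmcs_tail_after, hv_vm_id) == 848,
	      "padding32_2 moves hv_vm_id to an 8-byte boundary");
```

With the padding in place every later u64 shifts by four bytes, which would
explain why 'partition_assist_page' read as garbage beforehand.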

Fixes: 68d1eb72ee99 ("x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0a9407dc0859..6f0acc45e67a 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -546,7 +546,7 @@ struct hv_enlightened_vmcs {
 	u64 guest_rip;
 
 	u32 hv_clean_fields;
-	u32 hv_padding_32;
+	u32 padding32_1;
 	u32 hv_synthetic_controls;
 	struct {
 		u32 nested_flush_hypercall:1;
@@ -554,7 +554,7 @@ struct hv_enlightened_vmcs {
 		u32 reserved:30;
 	}  __packed hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 02/36] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 01/36] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 03/36] KVM: x86: Zero out entire Hyper-V CPUID cache before processing entries Sean Christopherson
                   ` (34 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Updated Hyper-V Enlightened VMCS specification lists several new
fields for the following features:

- PerfGlobalCtrl
- EnclsExitingBitmap
- Tsc Scaling
- GuestLbrCtl
- CET
- SSP

Update the definition.
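
As a sanity check on the approach, the new fields appear to be carved out
of what used to be padding, so the overall size and the offsets of the
pre-existing fields should not move.  A minimal userspace sketch of just the
affected tail of the struct (the struct names here are hypothetical, and the
offsets are relative to guest_bndcfgs, not to the start of the real eVMCS):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of the struct as it looked before this patch. */
struct evmcs_tail_old {
	uint64_t guest_bndcfgs;
	uint64_t padding64_5[7];
	uint64_t xss_exit_bitmap;
	uint64_t padding64_6[7];
} __attribute__((packed));

/* Same region after this patch: padding slots become named fields. */
struct evmcs_tail_new {
	uint64_t guest_bndcfgs;
	uint64_t guest_ia32_perf_global_ctrl;
	uint64_t guest_ia32_s_cet;
	uint64_t guest_ssp;
	uint64_t guest_ia32_int_ssp_table_addr;
	uint64_t guest_ia32_lbr_ctl;
	uint64_t padding64_5[2];
	uint64_t xss_exit_bitmap;
	uint64_t encls_exiting_bitmap;
	uint64_t host_ia32_perf_global_ctrl;
	uint64_t tsc_multiplier;
	uint64_t host_ia32_s_cet;
	uint64_t host_ssp;
	uint64_t host_ia32_int_ssp_table_addr;
	uint64_t padding64_6;
} __attribute__((packed));

/* The region must keep its size, and old fields must keep their offsets. */
static_assert(sizeof(struct evmcs_tail_old) == sizeof(struct evmcs_tail_new),
	      "new fields only consume former padding");
static_assert(offsetof(struct evmcs_tail_old, xss_exit_bitmap) ==
	      offsetof(struct evmcs_tail_new, xss_exit_bitmap),
	      "xss_exit_bitmap does not move");
```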

Note, the updated spec also provides an additional CPUID feature flag,
CPUID.0x4000000A.EBX BIT(0), for PerfGlobalCtrl to work around a Windows
11 quirk.  Despite what the TLFS says:

  Indicates support for the GuestPerfGlobalCtrl and HostPerfGlobalCtrl
  fields in the enlightened VMCS.

guests can safely use the fields if they are enumerated in the
architectural VMX MSRs.  I.e. KVM-on-HyperV doesn't need to check the
CPUID bit, but KVM-as-HyperV must ensure the bit is set if PerfGlobalCtrl
fields are exposed to L1.

https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
[sean: tweak CPUID name to make it PerfGlobalCtrl only]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 6f0acc45e67a..3089ec352743 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -138,6 +138,9 @@
 #define HV_X64_NESTED_GUEST_MAPPING_FLUSH		BIT(18)
 #define HV_X64_NESTED_MSR_BITMAP			BIT(19)
 
+/* Nested features #2. These are HYPERV_CPUID_NESTED_FEATURES.EBX bits. */
+#define HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL		BIT(0)
+
 /*
  * This is specific to AMD and specifies that enlightened TLB flush is
  * supported. If guest opts in to this feature, ASID invalidations only
@@ -559,9 +562,20 @@ struct hv_enlightened_vmcs {
 	u64 partition_assist_page;
 	u64 padding64_4[4];
 	u64 guest_bndcfgs;
-	u64 padding64_5[7];
+	u64 guest_ia32_perf_global_ctrl;
+	u64 guest_ia32_s_cet;
+	u64 guest_ssp;
+	u64 guest_ia32_int_ssp_table_addr;
+	u64 guest_ia32_lbr_ctl;
+	u64 padding64_5[2];
 	u64 xss_exit_bitmap;
-	u64 padding64_6[7];
+	u64 encls_exiting_bitmap;
+	u64 host_ia32_perf_global_ctrl;
+	u64 tsc_multiplier;
+	u64 host_ia32_s_cet;
+	u64 host_ssp;
+	u64 host_ia32_int_ssp_table_addr;
+	u64 padding64_6;
 } __packed;
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE			0
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 03/36] KVM: x86: Zero out entire Hyper-V CPUID cache before processing entries
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 01/36] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 02/36] x86/hyperv: Update " Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 04/36] KVM: x86: Check for existing Hyper-V vCPU in kvm_hv_vcpu_init() Sean Christopherson
                   ` (33 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Wipe the whole 'hv_vcpu->cpuid_cache' with memset() instead of having to
zero each particular member when the corresponding CPUID entry was not
found.

No functional change intended.
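
The pattern, reduced to a userspace sketch with hypothetical types (the
struct and function names below are illustrative, not KVM's): clear the
whole cache up front, then fill in only the leaves that were found, so a
missing leaf can never leave stale values behind.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of the cached CPUID registers. */
struct cpuid_cache {
	uint32_t features_eax;
	uint32_t features_ebx;
	uint32_t enlightenments_eax;
};

/* Hypothetical CPUID entry; NULL stands for "leaf not present". */
struct cpuid_entry {
	uint32_t eax;
	uint32_t ebx;
};

static void refresh_cache(struct cpuid_cache *cache,
			  const struct cpuid_entry *features,
			  const struct cpuid_entry *enlightenments)
{
	assert(cache != NULL);

	/* One memset() replaces an "else { field = 0; }" arm per leaf. */
	memset(cache, 0, sizeof(*cache));

	if (features) {
		cache->features_eax = features->eax;
		cache->features_ebx = features->ebx;
	}

	if (enlightenments)
		cache->enlightenments_eax = enlightenments->eax;
}
```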

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
[sean: split to separate patch]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/hyperv.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index ed804447589c..611c349a08bf 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2005,31 +2005,24 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
 
 	hv_vcpu = to_hv_vcpu(vcpu);
 
+	memset(&hv_vcpu->cpuid_cache, 0, sizeof(hv_vcpu->cpuid_cache));
+
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_FEATURES);
 	if (entry) {
 		hv_vcpu->cpuid_cache.features_eax = entry->eax;
 		hv_vcpu->cpuid_cache.features_ebx = entry->ebx;
 		hv_vcpu->cpuid_cache.features_edx = entry->edx;
-	} else {
-		hv_vcpu->cpuid_cache.features_eax = 0;
-		hv_vcpu->cpuid_cache.features_ebx = 0;
-		hv_vcpu->cpuid_cache.features_edx = 0;
 	}
 
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_ENLIGHTMENT_INFO);
 	if (entry) {
 		hv_vcpu->cpuid_cache.enlightenments_eax = entry->eax;
 		hv_vcpu->cpuid_cache.enlightenments_ebx = entry->ebx;
-	} else {
-		hv_vcpu->cpuid_cache.enlightenments_eax = 0;
-		hv_vcpu->cpuid_cache.enlightenments_ebx = 0;
 	}
 
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES);
 	if (entry)
 		hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
-	else
-		hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
 }
 
 int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 04/36] KVM: x86: Check for existing Hyper-V vCPU in kvm_hv_vcpu_init()
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (2 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 03/36] KVM: x86: Zero out entire Hyper-V CPUID cache before processing entries Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 05/36] KVM: x86: Report error when setting CPUID if Hyper-V allocation fails Sean Christopherson
                   ` (32 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

When potentially allocating/initializing the Hyper-V vCPU struct, check
for an existing instance in kvm_hv_vcpu_init() instead of requiring
callers to perform the check.  Relying on callers to do the check is
risky as it's all too easy for KVM to overwrite vcpu->arch.hyperv and
leak memory, and it adds additional burden on callers without much
benefit.

No functional change intended.
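
A minimal userspace sketch of the idempotent-init idea, using hypothetical
types and calloc() in place of kzalloc(): the "already allocated?" check
lives inside the init helper, so a second call is a no-op instead of an
overwrite that leaks the first allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kvm_vcpu_hv and kvm_vcpu. */
struct hv_state {
	int vp_index;
};

struct vcpu {
	struct hv_state *hv;
};

/* Returns 0 on success (including "already initialized"), -ENOMEM on failure. */
static int hv_state_init(struct vcpu *vcpu)
{
	assert(vcpu != NULL);

	/* The check callers used to open-code now lives here. */
	if (vcpu->hv)
		return 0;

	vcpu->hv = calloc(1, sizeof(*vcpu->hv));
	if (!vcpu->hv)
		return -ENOMEM;

	return 0;
}
```

Callers then collapse to a plain "r = hv_state_init(vcpu); if (r) return r;"
with no pre-check.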

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/hyperv.c | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 611c349a08bf..8aadd31ed058 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -936,9 +936,12 @@ static void stimer_init(struct kvm_vcpu_hv_stimer *stimer, int timer_index)
 
 static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
 {
-	struct kvm_vcpu_hv *hv_vcpu;
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 	int i;
 
+	if (hv_vcpu)
+		return 0;
+
 	hv_vcpu = kzalloc(sizeof(struct kvm_vcpu_hv), GFP_KERNEL_ACCOUNT);
 	if (!hv_vcpu)
 		return -ENOMEM;
@@ -962,11 +965,9 @@ int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages)
 	struct kvm_vcpu_hv_synic *synic;
 	int r;
 
-	if (!to_hv_vcpu(vcpu)) {
-		r = kvm_hv_vcpu_init(vcpu);
-		if (r)
-			return r;
-	}
+	r = kvm_hv_vcpu_init(vcpu);
+	if (r)
+		return r;
 
 	synic = to_hv_synic(vcpu);
 
@@ -1660,10 +1661,8 @@ int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
 	if (!host && !vcpu->arch.hyperv_enabled)
 		return 1;
 
-	if (!to_hv_vcpu(vcpu)) {
-		if (kvm_hv_vcpu_init(vcpu))
-			return 1;
-	}
+	if (kvm_hv_vcpu_init(vcpu))
+		return 1;
 
 	if (kvm_hv_msr_partition_wide(msr)) {
 		int r;
@@ -1683,10 +1682,8 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
 	if (!host && !vcpu->arch.hyperv_enabled)
 		return 1;
 
-	if (!to_hv_vcpu(vcpu)) {
-		if (kvm_hv_vcpu_init(vcpu))
-			return 1;
-	}
+	if (kvm_hv_vcpu_init(vcpu))
+		return 1;
 
 	if (kvm_hv_msr_partition_wide(msr)) {
 		int r;
@@ -2000,7 +1997,7 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
 		return;
 	}
 
-	if (!to_hv_vcpu(vcpu) && kvm_hv_vcpu_init(vcpu))
+	if (kvm_hv_vcpu_init(vcpu))
 		return;
 
 	hv_vcpu = to_hv_vcpu(vcpu);
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 05/36] KVM: x86: Report error when setting CPUID if Hyper-V allocation fails
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (3 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 04/36] KVM: x86: Check for existing Hyper-V vCPU in kvm_hv_vcpu_init() Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 06/36] KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled Sean Christopherson
                   ` (31 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

Return -ENOMEM back to userspace if allocating the Hyper-V vCPU struct
fails when enabling Hyper-V in guest CPUID.  Silently ignoring failure
means that KVM will not have an up-to-date CPUID cache if allocating the
struct succeeds later on, e.g. when activating SynIC.

Rejecting the CPUID operation also guarantees that vcpu->arch.hyperv is
non-NULL if hyperv_enabled is true, which will allow for additional
cleanup, e.g. in the eVMCS code.

Note, the initialization needs to be done before CPUID is set, and more
subtly before kvm_check_cpuid(), which potentially enables dynamic
XFEATURES.  Sadly, there's no easy way to avoid exposing Hyper-V details
to CPUID or vice versa.  Expose kvm_hv_vcpu_init() and the Hyper-V CPUID
signature to CPUID instead of exposing cpuid_entry2_find() outside of
CPUID code.  It's hard to envision kvm_hv_vcpu_init() being misused,
whereas cpuid_entry2_find() absolutely shouldn't be used outside of core
CPUID code.
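
To make the signature check concrete, here is a userspace sketch of the
lookup over a not-yet-committed entry array.  The entry type and helper
names are hypothetical simplifications; the leaf number and signature value
are the ones KVM uses (0x40000001 and "Hv#1" packed little-endian).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HV_CPUID_INTERFACE	0x40000001u	/* leaf that carries the signature */
#define HV_CPUID_SIGNATURE_EAX	0x31237648u	/* "Hv#1" */

/* Hypothetical cut-down CPUID entry: leaf number and EAX only. */
struct cpuid_entry {
	uint32_t function;
	uint32_t eax;
};

/* Pack four ASCII bytes into a u32, first byte in the low bits. */
static uint32_t pack_signature(const char sig[4])
{
	return (uint32_t)(unsigned char)sig[0] |
	       (uint32_t)(unsigned char)sig[1] << 8 |
	       (uint32_t)(unsigned char)sig[2] << 16 |
	       (uint32_t)(unsigned char)sig[3] << 24;
}

/* Hyper-V counts as enabled iff the interface leaf holds the signature. */
static bool cpuid_has_hyperv(const struct cpuid_entry *entries, int nent)
{
	assert(entries != NULL || nent == 0);

	for (int i = 0; i < nent; i++) {
		if (entries[i].function == HV_CPUID_INTERFACE)
			return entries[i].eax == HV_CPUID_SIGNATURE_EAX;
	}
	return false;
}
```

Because the helper takes the array directly, it can run on the entries
userspace passed in, before anything is committed to the vCPU, which is what
lets the allocation failure be reported as an error.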

Fixes: 10d7bf1e46dc ("KVM: x86: hyper-v: Cache guest CPUID leaves determining features availability")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c  | 18 +++++++++++++++++-
 arch/x86/kvm/hyperv.c | 30 ++++++++++++++----------------
 arch/x86/kvm/hyperv.h |  6 +++++-
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 75dcf7a72605..ffdc28684cb7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -311,6 +311,15 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime);
 
+static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent)
+{
+	struct kvm_cpuid_entry2 *entry;
+
+	entry = cpuid_entry2_find(entries, nent, HYPERV_CPUID_INTERFACE,
+				  KVM_CPUID_INDEX_NOT_SIGNIFICANT);
+	return entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX;
+}
+
 static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
@@ -341,7 +350,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.cr4_guest_rsvd_bits =
 	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
 
-	kvm_hv_set_cpuid(vcpu);
+	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu->arch.cpuid_entries,
+						    vcpu->arch.cpuid_nent));
 
 	/* Invoke the vendor callback only after the above state is updated. */
 	static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu);
@@ -404,6 +414,12 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 		return 0;
 	}
 
+	if (kvm_cpuid_has_hyperv(e2, nent)) {
+		r = kvm_hv_vcpu_init(vcpu);
+		if (r)
+			return r;
+	}
+
 	r = kvm_check_cpuid(vcpu, e2, nent);
 	if (r)
 		return r;
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 8aadd31ed058..bf4729e8cc80 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -38,9 +38,6 @@
 #include "irq.h"
 #include "fpu.h"
 
-/* "Hv#1" signature */
-#define HYPERV_CPUID_SIGNATURE_EAX 0x31237648
-
 #define KVM_HV_MAX_SPARSE_VCPU_SET_BITS DIV_ROUND_UP(KVM_MAX_VCPUS, 64)
 
 static void stimer_mark_pending(struct kvm_vcpu_hv_stimer *stimer,
@@ -934,7 +931,7 @@ static void stimer_init(struct kvm_vcpu_hv_stimer *stimer, int timer_index)
 	stimer_prepare_msg(stimer);
 }
 
-static int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
+int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
 {
 	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 	int i;
@@ -1984,26 +1981,27 @@ static u64 kvm_hv_send_ipi(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	return HV_STATUS_SUCCESS;
 }
 
-void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
+void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu, bool hyperv_enabled)
 {
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 	struct kvm_cpuid_entry2 *entry;
-	struct kvm_vcpu_hv *hv_vcpu;
 
-	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_INTERFACE);
-	if (entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX) {
-		vcpu->arch.hyperv_enabled = true;
-	} else {
-		vcpu->arch.hyperv_enabled = false;
+	vcpu->arch.hyperv_enabled = hyperv_enabled;
+
+	if (!hv_vcpu) {
+		/*
+		 * KVM should have already allocated kvm_vcpu_hv if Hyper-V is
+		 * enabled in CPUID.
+		 */
+		WARN_ON_ONCE(vcpu->arch.hyperv_enabled);
 		return;
 	}
 
-	if (kvm_hv_vcpu_init(vcpu))
-		return;
-
-	hv_vcpu = to_hv_vcpu(vcpu);
-
 	memset(&hv_vcpu->cpuid_cache, 0, sizeof(hv_vcpu->cpuid_cache));
 
+	if (!vcpu->arch.hyperv_enabled)
+		return;
+
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_FEATURES);
 	if (entry) {
 		hv_vcpu->cpuid_cache.features_eax = entry->eax;
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index da2737f2a956..1030b1b50552 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -23,6 +23,9 @@
 
 #include <linux/kvm_host.h>
 
+/* "Hv#1" signature */
+#define HYPERV_CPUID_SIGNATURE_EAX 0x31237648
+
 /*
  * The #defines related to the synthetic debugger are required by KDNet, but
  * they are not documented in the Hyper-V TLFS because the synthetic debugger
@@ -141,7 +144,8 @@ void kvm_hv_request_tsc_page_update(struct kvm *kvm);
 
 void kvm_hv_init_vm(struct kvm *kvm);
 void kvm_hv_destroy_vm(struct kvm *kvm);
-void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu);
+int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu);
+void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu, bool hyperv_enabled);
 int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce);
 int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
 int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 06/36] KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (4 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 05/36] KVM: x86: Report error when setting CPUID if Hyper-V allocation fails Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-25 10:21   ` Vitaly Kuznetsov
  2022-08-24  3:01 ` [RFC PATCH v6 07/36] KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array Sean Christopherson
                   ` (30 subsequent siblings)
  36 siblings, 1 reply; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

When querying whether or not eVMCS is enabled on behalf of the guest,
treat eVMCS as enabled if and only if Hyper-V is enabled/exposed to the
guest.

Note, flows that come from the host, e.g. KVM_SET_NESTED_STATE, must NOT
check for Hyper-V being enabled as KVM doesn't require guest CPUID to be
set before most ioctls().
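
The guest/host split, as a userspace sketch with a hypothetical two-flag
state struct (host_has_evmcs() is an illustrative name, not a helper this
patch adds): guest-initiated paths require both flags, host-initiated paths
look only at the userspace opt-in because CPUID may not be set yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down vCPU state holding just the two relevant flags. */
struct vcpu_state {
	bool hyperv_enabled;		/* Hyper-V signature present in guest CPUID */
	bool enlightened_vmcs_enabled;	/* userspace opted in to eVMCS */
};

/* Guest-initiated flows: both conditions must hold. */
static bool guest_has_evmcs(const struct vcpu_state *v)
{
	assert(v != NULL);
	return v->hyperv_enabled && v->enlightened_vmcs_enabled;
}

/* Host-initiated flows: CPUID may not be set yet, so ignore it. */
static bool host_has_evmcs(const struct vcpu_state *v)
{
	assert(v != NULL);
	return v->enlightened_vmcs_enabled;
}
```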

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/evmcs.c  |  3 +++
 arch/x86/kvm/vmx/nested.c |  8 ++++----
 arch/x86/kvm/vmx/vmx.c    |  3 +--
 arch/x86/kvm/vmx/vmx.h    | 10 ++++++++++
 4 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 6a61b1ae7942..9139c70b6008 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -334,6 +334,9 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
 	 * versions: lower 8 bits is the minimal version, higher 8 bits is the
 	 * maximum supported version. KVM supports versions from 1 to
 	 * KVM_EVMCS_VERSION.
+	 *
+	 * Note, do not check the Hyper-V is fully enabled in guest CPUID, this
+	 * helper is used to _get_ the vCPU's supported CPUID.
 	 */
 	if (kvm_cpu_cap_get(X86_FEATURE_VMX) &&
 	    (!vcpu || to_vmx(vcpu)->nested.enlightened_vmcs_enabled))
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ddd4367d4826..28f9d64851b3 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1982,7 +1982,7 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
 	bool evmcs_gpa_changed = false;
 	u64 evmcs_gpa;
 
-	if (likely(!vmx->nested.enlightened_vmcs_enabled))
+	if (likely(!guest_cpuid_has_evmcs(vcpu)))
 		return EVMPTRLD_DISABLED;
 
 	if (!nested_enlightened_vmentry(vcpu, &evmcs_gpa)) {
@@ -2863,7 +2863,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
 	    nested_check_vm_entry_controls(vcpu, vmcs12))
 		return -EINVAL;
 
-	if (to_vmx(vcpu)->nested.enlightened_vmcs_enabled)
+	if (guest_cpuid_has_evmcs(vcpu))
 		return nested_evmcs_check_controls(vmcs12);
 
 	return 0;
@@ -3145,7 +3145,7 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
 	 * L2 was running), map it here to make sure vmcs12 changes are
 	 * properly reflected.
 	 */
-	if (vmx->nested.enlightened_vmcs_enabled &&
+	if (guest_cpuid_has_evmcs(vcpu) &&
 	    vmx->nested.hv_evmcs_vmptr == EVMPTR_MAP_PENDING) {
 		enum nested_evmptrld_status evmptrld_status =
 			nested_vmx_handle_enlightened_vmptrld(vcpu, false);
@@ -5067,7 +5067,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
 	 * state. It is possible that the area will stay mapped as
 	 * vmx->nested.hv_evmcs but this shouldn't be a problem.
 	 */
-	if (likely(!vmx->nested.enlightened_vmcs_enabled ||
+	if (likely(!guest_cpuid_has_evmcs(vcpu) ||
 		   !nested_enlightened_vmentry(vcpu, &evmcs_gpa))) {
 		if (vmptr == vmx->nested.current_vmptr)
 			nested_release_vmcs12(vcpu);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c9b49a09e6b5..d4ed802947d7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1930,8 +1930,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * sanity checking and refuse to boot. Filter all unsupported
 		 * features out.
 		 */
-		if (!msr_info->host_initiated &&
-		    vmx->nested.enlightened_vmcs_enabled)
+		if (!msr_info->host_initiated && guest_cpuid_has_evmcs(vcpu))
 			nested_evmcs_filter_control_msr(msr_info->index,
 							&msr_info->data);
 		break;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 24d58c2ffaa3..35c7e6aef301 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -626,4 +626,14 @@ static inline bool vmx_can_use_ipiv(struct kvm_vcpu *vcpu)
 	return  lapic_in_kernel(vcpu) && enable_ipiv;
 }
 
+static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
+	 * eVMCS has been explicitly enabled by userspace.
+	 */
+	return vcpu->arch.hyperv_enabled &&
+	       to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
+}
+
 #endif /* __KVM_X86_VMX_H */
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 07/36] KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (5 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 06/36] KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-25 10:24   ` Vitaly Kuznetsov
  2022-08-24  3:01 ` [RFC PATCH v6 08/36] KVM: nVMX: Use CC() macro to handle eVMCS unsupported controls checks Sean Christopherson
                   ` (29 subsequent siblings)
  36 siblings, 1 reply; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Refactor the handling of unsupported eVMCS to use a 2-d array to store
the set of unsupported controls.  KVM's handling of eVMCS is completely
broken as there is no way for userspace to query which features are
unsupported, nor does KVM prevent userspace from attempting to enable
unsupported features.  A future commit will remedy that by filtering and
enforcing unsupported features when eVMCS is enabled, but that needs to be
opt-in from userspace to avoid breakage, i.e. KVM needs to maintain its
legacy behavior by snapshotting the exact set of controls that are
currently (un)supported by eVMCS.

No functional change intended.
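
The shape of the lookup, as a userspace sketch.  The enum names, the bit
values in the table and filter_allowed1() are all made up for illustration;
only the structure (control type x revision, with the allowed-1 bits living
in the high half of a VMX control MSR) mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical revisions; the patch starts with a single legacy entry. */
enum ctrl_revision {
	REV_LEGACY,
	NR_REVISIONS,
};

/* Hypothetical subset of control types. */
enum ctrl_type {
	CTRL_EXIT,
	CTRL_ENTRY,
	NR_CTRLS,
};

/* Made-up bit masks; the real table holds the EVMCS1_UNSUPPORTED_* values. */
static const uint32_t unsupported_ctrls[NR_CTRLS][NR_REVISIONS] = {
	[CTRL_EXIT]  = { [REV_LEGACY] = 0x00000005 },
	[CTRL_ENTRY] = { [REV_LEGACY] = 0x00000100 },
};

static uint32_t get_unsupported_ctls(enum ctrl_type type)
{
	/* Only one revision exists so far; a later opt-in could select another. */
	enum ctrl_revision rev = REV_LEGACY;

	assert(type < NR_CTRLS);
	return unsupported_ctrls[type][rev];
}

/* Clear unsupported bits from the allowed-1 (high) half of a control MSR. */
static uint64_t filter_allowed1(uint64_t msr, enum ctrl_type type)
{
	uint32_t low = (uint32_t)msr;
	uint32_t high = (uint32_t)(msr >> 32);

	high &= ~get_unsupported_ctls(type);
	return low | ((uint64_t)high << 32);
}
```

Adding a second revision then only means adding a column to the table and
teaching get_unsupported_ctls() how to pick it.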

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
[sean: split to standalone patch, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/evmcs.c | 60 +++++++++++++++++++++++++++++++++-------
 1 file changed, 50 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 9139c70b6008..10fc0be49f96 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -345,6 +345,45 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+enum evmcs_revision {
+	EVMCSv1_LEGACY,
+	NR_EVMCS_REVISIONS,
+};
+
+enum evmcs_ctrl_type {
+	EVMCS_EXIT_CTRLS,
+	EVMCS_ENTRY_CTRLS,
+	EVMCS_2NDEXEC,
+	EVMCS_PINCTRL,
+	EVMCS_VMFUNC,
+	NR_EVMCS_CTRLS,
+};
+
+static const u32 evmcs_unsupported_ctrls[NR_EVMCS_CTRLS][NR_EVMCS_REVISIONS] = {
+	[EVMCS_EXIT_CTRLS] = {
+		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
+	},
+	[EVMCS_ENTRY_CTRLS] = {
+		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMENTRY_CTRL | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
+	},
+	[EVMCS_2NDEXEC] = {
+		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_2NDEXEC | SECONDARY_EXEC_TSC_SCALING,
+	},
+	[EVMCS_PINCTRL] = {
+		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_PINCTRL,
+	},
+	[EVMCS_VMFUNC] = {
+		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMFUNC,
+	},
+};
+
+static u32 evmcs_get_unsupported_ctls(enum evmcs_ctrl_type ctrl_type)
+{
+	enum evmcs_revision evmcs_rev = EVMCSv1_LEGACY;
+
+	return evmcs_unsupported_ctrls[ctrl_type][evmcs_rev];
+}
+
 void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
 {
 	u32 ctl_low = (u32)*pdata;
@@ -357,21 +396,21 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
 	switch (msr_index) {
 	case MSR_IA32_VMX_EXIT_CTLS:
 	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_EXIT_CTRLS);
 		break;
 	case MSR_IA32_VMX_ENTRY_CTLS:
 	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_ENTRY_CTRLS);
 		break;
 	case MSR_IA32_VMX_PROCBASED_CTLS2:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
+		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_2NDEXEC);
 		break;
 	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
 	case MSR_IA32_VMX_PINBASED_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_PINCTRL);
 		break;
 	case MSR_IA32_VMX_VMFUNC:
-		ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC;
+		ctl_low &= ~evmcs_get_unsupported_ctls(EVMCS_VMFUNC);
 		break;
 	}
 
@@ -384,7 +423,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
 	u32 unsupp_ctl;
 
 	unsupp_ctl = vmcs12->pin_based_vm_exec_control &
-		EVMCS1_UNSUPPORTED_PINCTRL;
+		evmcs_get_unsupported_ctls(EVMCS_PINCTRL);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
 			"eVMCS: unsupported pin-based VM-execution controls",
@@ -393,7 +432,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
 	}
 
 	unsupp_ctl = vmcs12->secondary_vm_exec_control &
-		EVMCS1_UNSUPPORTED_2NDEXEC;
+		evmcs_get_unsupported_ctls(EVMCS_2NDEXEC);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
 			"eVMCS: unsupported secondary VM-execution controls",
@@ -402,7 +441,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
 	}
 
 	unsupp_ctl = vmcs12->vm_exit_controls &
-		EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+		evmcs_get_unsupported_ctls(EVMCS_EXIT_CTRLS);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
 			"eVMCS: unsupported VM-exit controls",
@@ -411,7 +450,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
 	}
 
 	unsupp_ctl = vmcs12->vm_entry_controls &
-		EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+		evmcs_get_unsupported_ctls(EVMCS_ENTRY_CTRLS);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
 			"eVMCS: unsupported VM-entry controls",
@@ -419,7 +458,8 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
 		ret = -EINVAL;
 	}
 
-	unsupp_ctl = vmcs12->vm_function_control & EVMCS1_UNSUPPORTED_VMFUNC;
+	unsupp_ctl = vmcs12->vm_function_control &
+		evmcs_get_unsupported_ctls(EVMCS_VMFUNC);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
 			"eVMCS: unsupported VM-function controls",
-- 
2.37.1.595.g718a3a8f04-goog
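
For readers following the refactoring above, here is a minimal userspace sketch of the 2-d lookup plus MSR filtering. It is not the kernel code: the enum names, the DEMO_* bit values and filter_control_msr() are assumptions made for illustration. It relies on the architectural layout where most VMX control capability MSRs report allowed-1 settings in bits 63:32, while the VM-function MSR is a flat bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical control classes and revisions, loosely mirroring the patch. */
enum ctrl_type { CTRL_PIN, CTRL_VMFUNC, NR_CTRLS };
enum revision { REV_LEGACY, NR_REVISIONS };

/* Placeholder masks; the real definitions live in evmcs.h. */
#define DEMO_UNSUPPORTED_PIN    (1u << 7)
#define DEMO_UNSUPPORTED_VMFUNC (1u << 0)

static const uint32_t unsupported[NR_CTRLS][NR_REVISIONS] = {
	[CTRL_PIN]    = { [REV_LEGACY] = DEMO_UNSUPPORTED_PIN },
	[CTRL_VMFUNC] = { [REV_LEGACY] = DEMO_UNSUPPORTED_VMFUNC },
};

static uint32_t get_unsupported(enum ctrl_type type, enum revision rev)
{
	assert(type < NR_CTRLS && rev < NR_REVISIONS);
	return unsupported[type][rev];
}

/*
 * Clear unsupported controls from a capability MSR value.  For most control
 * MSRs the allowed-1 bits are in the high half, so the mask is shifted up;
 * the VM-function MSR is masked in place.
 */
static uint64_t filter_control_msr(enum ctrl_type type, uint64_t data)
{
	uint64_t mask = get_unsupported(type, REV_LEGACY);

	if (type == CTRL_VMFUNC)
		return data & ~mask;
	return data & ~(mask << 32);
}
```

Adding a second revision then only requires a new column in the table, which is what the next patches in the series build on.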



* [RFC PATCH v6 08/36] KVM: nVMX: Use CC() macro to handle eVMCS unsupported controls checks
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (6 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 07/36] KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 09/36] KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses Sean Christopherson
                   ` (28 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

Locally #define and use the nested virtualization Consistency Check (CC)
macro to handle eVMCS unsupported controls checks.  Using the macro loses
the existing printing of the unsupported controls, but that's a feature
and not a bug.  The existing approach is flawed because the @err param to
trace_kvm_nested_vmenter_failed() is the error code, not the error value.

The eVMCS trickery mostly works as __print_symbolic() falls back to
printing the raw hex value, but that subtly relies on not having a match
between the unsupported value and VMX_VMENTER_INSTRUCTION_ERRORS.

If it's really truly necessary to snapshot the bad value, then the
tracepoint can be extended in the future.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/evmcs.c | 68 ++++++++++++++--------------------------
 1 file changed, 24 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 10fc0be49f96..3bf8681e5239 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -10,6 +10,8 @@
 #include "vmx.h"
 #include "trace.h"
 
+#define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
+
 DEFINE_STATIC_KEY_FALSE(enable_evmcs);
 
 #define EVMCS1_OFFSET(x) offsetof(struct hv_enlightened_vmcs, x)
@@ -417,57 +419,35 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
 	*pdata = ctl_low | ((u64)ctl_high << 32);
 }
 
+static bool nested_evmcs_is_valid_controls(enum evmcs_ctrl_type ctrl_type,
+					   u32 val)
+{
+	return !(val & evmcs_get_unsupported_ctls(ctrl_type));
+}
+
 int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
 {
-	int ret = 0;
-	u32 unsupp_ctl;
+	if (CC(!nested_evmcs_is_valid_controls(EVMCS_PINCTRL,
+					       vmcs12->pin_based_vm_exec_control)))
+		return -EINVAL;
 
-	unsupp_ctl = vmcs12->pin_based_vm_exec_control &
-		evmcs_get_unsupported_ctls(EVMCS_PINCTRL);
-	if (unsupp_ctl) {
-		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported pin-based VM-execution controls",
-			unsupp_ctl);
-		ret = -EINVAL;
-	}
+	if (CC(!nested_evmcs_is_valid_controls(EVMCS_2NDEXEC,
+					       vmcs12->secondary_vm_exec_control)))
+		return -EINVAL;
 
-	unsupp_ctl = vmcs12->secondary_vm_exec_control &
-		evmcs_get_unsupported_ctls(EVMCS_2NDEXEC);
-	if (unsupp_ctl) {
-		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported secondary VM-execution controls",
-			unsupp_ctl);
-		ret = -EINVAL;
-	}
+	if (CC(!nested_evmcs_is_valid_controls(EVMCS_EXIT_CTRLS,
+					       vmcs12->vm_exit_controls)))
+		return -EINVAL;
 
-	unsupp_ctl = vmcs12->vm_exit_controls &
-		evmcs_get_unsupported_ctls(EVMCS_EXIT_CTRLS);
-	if (unsupp_ctl) {
-		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-exit controls",
-			unsupp_ctl);
-		ret = -EINVAL;
-	}
+	if (CC(!nested_evmcs_is_valid_controls(EVMCS_ENTRY_CTRLS,
+					       vmcs12->vm_entry_controls)))
+		return -EINVAL;
 
-	unsupp_ctl = vmcs12->vm_entry_controls &
-		evmcs_get_unsupported_ctls(EVMCS_ENTRY_CTRLS);
-	if (unsupp_ctl) {
-		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-entry controls",
-			unsupp_ctl);
-		ret = -EINVAL;
-	}
+	if (CC(!nested_evmcs_is_valid_controls(EVMCS_VMFUNC,
+					       vmcs12->vm_function_control)))
+		return -EINVAL;
 
-	unsupp_ctl = vmcs12->vm_function_control &
-		evmcs_get_unsupported_ctls(EVMCS_VMFUNC);
-	if (unsupp_ctl) {
-		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-function controls",
-			unsupp_ctl);
-		ret = -EINVAL;
-	}
-
-	return ret;
+	return 0;
 }
 
 int nested_enable_evmcs(struct kvm_vcpu *vcpu,
-- 
2.37.1.595.g718a3a8f04-goog
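
As a rough illustration of the pattern patch 08 adopts, the sketch below shows a consistency-check macro that evaluates a condition and, on failure, reports the stringified condition instead of a value. It is a userspace stand-in, not KVM's macro: cc_report(), last_failed_check and check_controls() are hypothetical names, and fprintf() takes the place of the tracepoint.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the tracepoint: remembers the text of the last failed check. */
static const char *last_failed_check;

static bool cc_report(bool failed, const char *expr)
{
	assert(expr != NULL);
	if (failed) {
		last_failed_check = expr;
		fprintf(stderr, "consistency check failed: %s\n", expr);
	}
	return failed;
}

/* Evaluates to the condition; logs the condition's source text on failure. */
#define CC(cond) cc_report((cond), #cond)

/* Fails with -EINVAL as soon as any unsupported control bit is set. */
static int check_controls(unsigned int val, unsigned int unsupported)
{
	if (CC((val & unsupported) != 0))
		return -EINVAL;
	return 0;
}
```

The trade-off matches the changelog: the log names which check failed, but no longer carries the offending bits.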



* [RFC PATCH v6 09/36] KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (7 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 08/36] KVM: nVMX: Use CC() macro to handle eVMCS unsupported controls checks Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 10/36] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Sean Christopherson
                   ` (27 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  4 +-
 arch/x86/kvm/vmx/evmcs.c        | 91 ++++++++++++++++++++-------------
 arch/x86/kvm/vmx/evmcs.h        | 19 +++++--
 arch/x86/kvm/vmx/nested.c       | 15 ++++--
 arch/x86/kvm/vmx/vmx.c          | 12 ++---
 arch/x86/kvm/vmx/vmx.h          |  2 +
 arch/x86/kvm/x86.c              |  8 ++-
 include/uapi/linux/kvm.h        |  1 +
 8 files changed, 99 insertions(+), 53 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c96c43c313a..2209724b765e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1647,8 +1647,8 @@ struct kvm_x86_nested_ops {
 	bool (*get_nested_state_pages)(struct kvm_vcpu *vcpu);
 	int (*write_log_dirty)(struct kvm_vcpu *vcpu, gpa_t l2_gpa);
 
-	int (*enable_evmcs)(struct kvm_vcpu *vcpu,
-			    uint16_t *vmcs_version);
+	int (*enable_evmcs)(struct kvm_vcpu *vcpu, uint16_t *vmcs_version,
+			    bool enforce_evmcs);
 	uint16_t (*get_evmcs_version)(struct kvm_vcpu *vcpu);
 };
 
diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 3bf8681e5239..c0cb68ce7b1b 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -349,6 +349,7 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
 
 enum evmcs_revision {
 	EVMCSv1_LEGACY,
+	EVMCSv1_ENFORCED,
 	NR_EVMCS_REVISIONS,
 };
 
@@ -363,99 +364,119 @@ enum evmcs_ctrl_type {
 
 static const u32 evmcs_unsupported_ctrls[NR_EVMCS_CTRLS][NR_EVMCS_REVISIONS] = {
 	[EVMCS_EXIT_CTRLS] = {
-		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
+		[EVMCSv1_LEGACY]   = EVMCS1_UNSUPPORTED_VMEXIT_CTRL_LEGACY,
+		[EVMCSv1_ENFORCED] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL,
 	},
 	[EVMCS_ENTRY_CTRLS] = {
-		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMENTRY_CTRL | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
+		[EVMCSv1_LEGACY]   = EVMCS1_UNSUPPORTED_VMENTRY_CTRL_LEGACY,
+		[EVMCSv1_ENFORCED] = EVMCS1_UNSUPPORTED_VMENTRY_CTRL,
 	},
 	[EVMCS_2NDEXEC] = {
-		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_2NDEXEC | SECONDARY_EXEC_TSC_SCALING,
+		[EVMCSv1_LEGACY]   = EVMCS1_UNSUPPORTED_2NDEXEC_LEGACY,
+		[EVMCSv1_ENFORCED] = EVMCS1_UNSUPPORTED_2NDEXEC,
 	},
 	[EVMCS_PINCTRL] = {
-		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_PINCTRL,
+		[EVMCSv1_LEGACY]   = EVMCS1_UNSUPPORTED_PINCTRL_LEGACY,
+		[EVMCSv1_ENFORCED] = EVMCS1_UNSUPPORTED_PINCTRL,
 	},
 	[EVMCS_VMFUNC] = {
-		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMFUNC,
+		[EVMCSv1_LEGACY]   = EVMCS1_UNSUPPORTED_VMFUNC_LEGACY,
+		[EVMCSv1_ENFORCED] = EVMCS1_UNSUPPORTED_VMFUNC,
 	},
 };
 
-static u32 evmcs_get_unsupported_ctls(enum evmcs_ctrl_type ctrl_type)
+static u32 evmcs_get_unsupported_ctls(struct vcpu_vmx *vmx,
+				      enum evmcs_ctrl_type ctrl_type)
 {
 	enum evmcs_revision evmcs_rev = EVMCSv1_LEGACY;
 
+	if (vmx->nested.enforce_evmcs)
+		evmcs_rev = EVMCSv1_ENFORCED;
+
 	return evmcs_unsupported_ctrls[ctrl_type][evmcs_rev];
 }
 
-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
+u64 nested_evmcs_get_unsupported_ctrls(struct vcpu_vmx *vmx, u32 msr_index)
 {
-	u32 ctl_low = (u32)*pdata;
-	u32 ctl_high = (u32)(*pdata >> 32);
-
-	/*
-	 * Hyper-V 2016 and 2019 try using these features even when eVMCS
-	 * is enabled but there are no corresponding fields.
-	 */
 	switch (msr_index) {
 	case MSR_IA32_VMX_EXIT_CTLS:
 	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
-		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_EXIT_CTRLS);
-		break;
+		return evmcs_get_unsupported_ctls(vmx, EVMCS_EXIT_CTRLS);
 	case MSR_IA32_VMX_ENTRY_CTLS:
 	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
-		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_ENTRY_CTRLS);
-		break;
+		return evmcs_get_unsupported_ctls(vmx, EVMCS_ENTRY_CTRLS);
 	case MSR_IA32_VMX_PROCBASED_CTLS2:
-		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_2NDEXEC);
-		break;
+		return evmcs_get_unsupported_ctls(vmx, EVMCS_2NDEXEC);
 	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
 	case MSR_IA32_VMX_PINBASED_CTLS:
-		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_PINCTRL);
-		break;
+		return evmcs_get_unsupported_ctls(vmx, EVMCS_PINCTRL);
 	case MSR_IA32_VMX_VMFUNC:
-		ctl_low &= ~evmcs_get_unsupported_ctls(EVMCS_VMFUNC);
-		break;
+		return evmcs_get_unsupported_ctls(vmx, EVMCS_VMFUNC);
 	}
+	return 0;
+}
+
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu,
+				     struct msr_data *msr_info)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	u64 unsupported_ctrls;
+
+	if (!msr_info->host_initiated && !guest_cpuid_has_evmcs(vcpu))
+		return;
+
+	if (msr_info->host_initiated && !vmx->nested.enforce_evmcs)
+		return;
 
-	*pdata = ctl_low | ((u64)ctl_high << 32);
+	unsupported_ctrls = nested_evmcs_get_unsupported_ctrls(vmx, msr_info->index);
+	if (msr_info->index == MSR_IA32_VMX_VMFUNC)
+		msr_info->data &= ~unsupported_ctrls;
+	else
+		msr_info->data &= ~(unsupported_ctrls << 32);
 }
 
-static bool nested_evmcs_is_valid_controls(enum evmcs_ctrl_type ctrl_type,
+static bool nested_evmcs_is_valid_controls(struct kvm_vcpu *vcpu,
+					   enum evmcs_ctrl_type ctrl_type,
 					   u32 val)
 {
-	return !(val & evmcs_get_unsupported_ctls(ctrl_type));
+	return !(val & evmcs_get_unsupported_ctls(to_vmx(vcpu), ctrl_type));
 }
 
-int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
-	if (CC(!nested_evmcs_is_valid_controls(EVMCS_PINCTRL,
+	if (CC(!nested_evmcs_is_valid_controls(vcpu, EVMCS_PINCTRL,
 					       vmcs12->pin_based_vm_exec_control)))
 		return -EINVAL;
 
-	if (CC(!nested_evmcs_is_valid_controls(EVMCS_2NDEXEC,
+	if (CC(!nested_evmcs_is_valid_controls(vcpu, EVMCS_2NDEXEC,
 					       vmcs12->secondary_vm_exec_control)))
 		return -EINVAL;
 
-	if (CC(!nested_evmcs_is_valid_controls(EVMCS_EXIT_CTRLS,
+	if (CC(!nested_evmcs_is_valid_controls(vcpu, EVMCS_EXIT_CTRLS,
 					       vmcs12->vm_exit_controls)))
 		return -EINVAL;
 
-	if (CC(!nested_evmcs_is_valid_controls(EVMCS_ENTRY_CTRLS,
+	if (CC(!nested_evmcs_is_valid_controls(vcpu, EVMCS_ENTRY_CTRLS,
 					       vmcs12->vm_entry_controls)))
 		return -EINVAL;
 
-	if (CC(!nested_evmcs_is_valid_controls(EVMCS_VMFUNC,
+	if (CC(!nested_evmcs_is_valid_controls(vcpu, EVMCS_VMFUNC,
 					       vmcs12->vm_function_control)))
 		return -EINVAL;
 
 	return 0;
 }
 
-int nested_enable_evmcs(struct kvm_vcpu *vcpu,
-			uint16_t *vmcs_version)
+int nested_enable_evmcs(struct kvm_vcpu *vcpu, uint16_t *vmcs_version,
+			bool enforce_evmcs)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (vmx->nested.enlightened_vmcs_enabled && enforce_evmcs)
+		return -EINVAL;
+
 	vmx->nested.enlightened_vmcs_enabled = true;
+	vmx->nested.enforce_evmcs = enforce_evmcs;
 
 	if (vmcs_version)
 		*vmcs_version = nested_get_evmcs_version(vcpu);
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index f886a8ff0342..e2b3aeee57ac 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -13,6 +13,7 @@
 #include "vmcs12.h"
 
 struct vmcs_config;
+struct vcpu_vmx;
 
 DECLARE_STATIC_KEY_FALSE(enable_evmcs);
 
@@ -66,6 +67,14 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
 #define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
 #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
 
+/* TODO: explicitly define these */
+#define EVMCS1_UNSUPPORTED_PINCTRL_LEGACY	EVMCS1_UNSUPPORTED_PINCTRL
+#define EVMCS1_UNSUPPORTED_EXEC_CTRL_LEGACY	EVMCS1_UNSUPPORTED_EXEC_CTRL
+#define EVMCS1_UNSUPPORTED_2NDEXEC_LEGACY	EVMCS1_UNSUPPORTED_2NDEXEC
+#define EVMCS1_UNSUPPORTED_VMEXIT_CTRL_LEGACY	EVMCS1_UNSUPPORTED_VMEXIT_CTRL
+#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL_LEGACY	EVMCS1_UNSUPPORTED_VMENTRY_CTRL
+#define EVMCS1_UNSUPPORTED_VMFUNC_LEGACY	EVMCS1_UNSUPPORTED_VMFUNC
+
 struct evmcs_field {
 	u16 offset;
 	u16 clean_field;
@@ -241,9 +250,11 @@ enum nested_evmptrld_status {
 
 bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa);
 uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
-int nested_enable_evmcs(struct kvm_vcpu *vcpu,
-			uint16_t *vmcs_version);
-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
-int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
+int nested_enable_evmcs(struct kvm_vcpu *vcpu, uint16_t *vmcs_version,
+			bool enforce_evmcs);
+u64 nested_evmcs_get_unsupported_ctrls(struct vcpu_vmx *vmx, u32 msr_index);
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu,
+				     struct msr_data *msr_info);
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12);
 
 #endif /* __KVM_X86_VMX_EVMCS_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 28f9d64851b3..52d299b9263b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1279,12 +1279,17 @@ static void vmx_get_control_msr(struct nested_vmx_msrs *msrs, u32 msr_index,
 static int
 vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
 {
-	u32 *lowp, *highp;
+	u32 *lowp, *highp, high;
 	u64 supported;
 
 	vmx_get_control_msr(&vmcs_config.nested, msr_index, &lowp, &highp);
 
-	supported = vmx_control_msr(*lowp, *highp);
+	/* Do not overwrite the global vmcs_config.nested! */
+	high = *highp;
+	if (vmx->nested.enforce_evmcs)
+		high &= ~nested_evmcs_get_unsupported_ctrls(vmx, msr_index);
+
+	supported = vmx_control_msr(*lowp, high);
 
 	/* Check must-be-1 bits are still 1. */
 	if (!is_bitwise_subset(data, supported, GENMASK_ULL(31, 0)))
@@ -1435,6 +1440,10 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 	case MSR_IA32_VMX_VMFUNC:
 		if (data & ~vmcs_config.nested.vmfunc_controls)
 			return -EINVAL;
+		if (vmx->nested.enforce_evmcs &&
+		    (data & nested_evmcs_get_unsupported_ctrls(vmx, MSR_IA32_VMX_VMFUNC)))
+			return -EINVAL;
+
 		vmx->nested.msrs.vmfunc_controls = data;
 		return 0;
 	default:
@@ -2864,7 +2873,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 	if (guest_cpuid_has_evmcs(vcpu))
-		return nested_evmcs_check_controls(vmcs12);
+		return nested_evmcs_check_controls(vcpu, vmcs12);
 
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d4ed802947d7..73f9074efc61 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1924,15 +1924,11 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 				    &msr_info->data))
 			return 1;
 		/*
-		 * Enlightened VMCS v1 doesn't have certain VMCS fields but
-		 * instead of just ignoring the features, different Hyper-V
-		 * versions are either trying to use them and fail or do some
-		 * sanity checking and refuse to boot. Filter all unsupported
-		 * features out.
+		 * New Enlightened VMCS fields always lag behind their hardware
+		 * counterparts; filter out fields that are not yet defined.
 		 */
-		if (!msr_info->host_initiated && guest_cpuid_has_evmcs(vcpu))
-			nested_evmcs_filter_control_msr(msr_info->index,
-							&msr_info->data);
+		if (vmx->nested.enlightened_vmcs_enabled)
+			nested_evmcs_filter_control_msr(vcpu, msr_info);
 		break;
 	case MSR_IA32_RTIT_CTL:
 		if (!vmx_pt_mode_is_host_guest())
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 35c7e6aef301..a7a05b5e41d2 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -197,6 +197,8 @@ struct nested_vmx {
 	 */
 	bool enlightened_vmcs_enabled;
 
+	bool enforce_evmcs;
+
 	/* L2 must run next, and mustn't decide to exit to L1. */
 	bool nested_run_pending;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d7374d768296..fb5cecb19cf5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4452,6 +4452,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = kvm_x86_ops.enable_direct_tlbflush != NULL;
 		break;
 	case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
+	case KVM_CAP_HYPERV_ENLIGHTENED_VMCS2:
 		r = kvm_x86_ops.nested_ops->enable_evmcs != NULL;
 		break;
 	case KVM_CAP_SMALLER_MAXPHYADDR:
@@ -5429,9 +5430,13 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 		return kvm_hv_activate_synic(vcpu, cap->cap ==
 					     KVM_CAP_HYPERV_SYNIC2);
 	case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
+	case KVM_CAP_HYPERV_ENLIGHTENED_VMCS2: {
+		bool enforce_evmcs = cap->cap == KVM_CAP_HYPERV_ENLIGHTENED_VMCS2;
+
 		if (!kvm_x86_ops.nested_ops->enable_evmcs)
 			return -ENOTTY;
-		r = kvm_x86_ops.nested_ops->enable_evmcs(vcpu, &vmcs_version);
+		r = kvm_x86_ops.nested_ops->enable_evmcs(vcpu, &vmcs_version,
+							 enforce_evmcs);
 		if (!r) {
 			user_ptr = (void __user *)(uintptr_t)cap->args[0];
 			if (copy_to_user(user_ptr, &vmcs_version,
@@ -5439,6 +5444,7 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 				r = -EFAULT;
 		}
 		return r;
+	}
 	case KVM_CAP_HYPERV_DIRECT_TLBFLUSH:
 		if (!kvm_x86_ops.enable_direct_tlbflush)
 			return -ENOTTY;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index eed0315a77a6..ba08d6f74267 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1177,6 +1177,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 220
 #define KVM_CAP_S390_ZPCI_OP 221
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
+#define KVM_CAP_HYPERV_ENLIGHTENED_VMCS2 223
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.37.1.595.g718a3a8f04-goog
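
Since patch 09 has no changelog, the following sketch restates the read-side decision logic as it appears in the diff: guest reads are filtered when guest CPUID advertises eVMCS, host reads only when userspace opted in to enforcement. The struct and function names are invented for illustration and the booleans stand in for the vCPU state the kernel consults.

```c
#include <assert.h>
#include <stdbool.h>

enum revision { REV_LEGACY, REV_ENFORCED, NR_REVISIONS };

/* Hypothetical snapshot of the per-vCPU state the checks depend on. */
struct demo_vcpu {
	bool evmcs_enabled;     /* eVMCS was enabled through either capability */
	bool enforce_evmcs;     /* enabled through the new, enforcing capability */
	bool guest_cpuid_evmcs; /* guest CPUID advertises eVMCS */
};

/* Enforcement selects the second column of the unsupported-controls table. */
static enum revision pick_revision(const struct demo_vcpu *v)
{
	assert(v);
	return v->enforce_evmcs ? REV_ENFORCED : REV_LEGACY;
}

/* Should a read of a VMX control MSR have unsupported bits masked out? */
static bool should_filter_msr_read(const struct demo_vcpu *v, bool host_initiated)
{
	assert(v);
	if (!v->evmcs_enabled)
		return false;
	if (!host_initiated)
		return v->guest_cpuid_evmcs;
	return v->enforce_evmcs;
}
```

With the legacy capability, host reads stay unfiltered, which preserves what existing userspace sees.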



* [RFC PATCH v6 10/36] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (8 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 09/36] KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 11/36] KVM: nVMX: Support several new fields in eVMCSv1 Sean Christopherson
                   ` (26 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

The Enlightened VMCS v1 definition was updated with new fields; support
them in KVM by defining the VMCS-to-EVMCS conversion.

Note: SSP, CET and Guest LBR features are not supported by KVM yet and
the corresponding fields are not defined in 'enum vmcs_field', so leave
them commented out for now.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/evmcs.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index c0cb68ce7b1b..ce358b13b75b 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -30,6 +30,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_IA32_EFER, host_ia32_efer,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	EVMCS1_FIELD(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_CR0, host_cr0,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_CR3, host_cr3,
@@ -80,6 +82,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_IA32_EFER, guest_ia32_efer,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	EVMCS1_FIELD(GUEST_IA32_PERF_GLOBAL_CTRL, guest_ia32_perf_global_ctrl,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_PDPTR0, guest_pdptr0,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_PDPTR1, guest_pdptr1,
@@ -128,6 +132,28 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(XSS_EXIT_BITMAP, xss_exit_bitmap,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	EVMCS1_FIELD(ENCLS_EXITING_BITMAP, encls_exiting_bitmap,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	EVMCS1_FIELD(TSC_MULTIPLIER, tsc_multiplier,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	/*
+	 * Not used by KVM:
+	 *
+	 * EVMCS1_FIELD(0x00006828, guest_ia32_s_cet,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x0000682A, guest_ssp,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_BASIC),
+	 * EVMCS1_FIELD(0x0000682C, guest_ia32_int_ssp_table_addr,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x00002816, guest_ia32_lbr_ctl,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x00006C18, host_ia32_s_cet,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 * EVMCS1_FIELD(0x00006C1A, host_ssp,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 * EVMCS1_FIELD(0x00006C1C, host_ia32_int_ssp_table_addr,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 */
 
 	/* 64 bit read only */
 	EVMCS1_FIELD(GUEST_PHYSICAL_ADDRESS, guest_physical_address,
-- 
2.37.1.595.g718a3a8f04-goog
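
To make the idea of a VMCS-to-eVMCS conversion table concrete, here is a small sketch that maps a VMCS field encoding to a struct offset and a clean-field bit. Everything prefixed demo_/DEMO_ is hypothetical, the clean-field value is a placeholder, and the lookup is a plain linear search for brevity; the table in evmcs.c may be organized differently. The two encodings are the ones that appear in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down stand-in for struct hv_enlightened_vmcs. */
struct demo_evmcs {
	uint64_t encls_exiting_bitmap;
	uint64_t tsc_multiplier;
};

struct demo_field {
	uint32_t encoding;    /* VMCS field encoding */
	uint16_t offset;      /* byte offset within the eVMCS */
	uint16_t clean_field; /* clean-field bit covering this field */
};

#define DEMO_CLEAN_CONTROL_GRP2 (1u << 2) /* placeholder bit value */
#define DEMO_FIELD(enc, name, clean) \
	{ (enc), (uint16_t)offsetof(struct demo_evmcs, name), (clean) }

static const struct demo_field demo_fields[] = {
	DEMO_FIELD(0x202E, encls_exiting_bitmap, DEMO_CLEAN_CONTROL_GRP2),
	DEMO_FIELD(0x2032, tsc_multiplier, DEMO_CLEAN_CONTROL_GRP2),
};

/* Returns the eVMCS offset for an encoding, or -1 if no mapping exists. */
static int demo_get_offset(uint32_t encoding, uint16_t *clean_field)
{
	size_t i;

	for (i = 0; i < sizeof(demo_fields) / sizeof(demo_fields[0]); i++) {
		assert(demo_fields[i].offset < sizeof(struct demo_evmcs));
		if (demo_fields[i].encoding != encoding)
			continue;
		if (clean_field)
			*clean_field = demo_fields[i].clean_field;
		return demo_fields[i].offset;
	}
	return -1;
}
```

Fields that KVM does not know about, such as the commented-out CET entries, simply have no row and fall through to the not-found case.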



* [RFC PATCH v6 11/36] KVM: nVMX: Support several new fields in eVMCSv1
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (9 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 10/36] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 12/36] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Sean Christopherson
                   ` (25 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

The Enlightened VMCS v1 definition was updated with new fields; add
support for them for Hyper-V on KVM.

Note: SSP, CET and Guest LBR features are not supported by KVM yet
and 'struct vmcs12' has no corresponding fields.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 52d299b9263b..57e96f4ab765 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1616,6 +1616,10 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->guest_rflags = evmcs->guest_rflags;
 		vmcs12->guest_interruptibility_info =
 			evmcs->guest_interruptibility_info;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->guest_ssp = evmcs->guest_ssp;
+		 */
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1662,6 +1666,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->host_fs_selector = evmcs->host_fs_selector;
 		vmcs12->host_gs_selector = evmcs->host_gs_selector;
 		vmcs12->host_tr_selector = evmcs->host_tr_selector;
+		vmcs12->host_ia32_perf_global_ctrl = evmcs->host_ia32_perf_global_ctrl;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->host_ia32_s_cet = evmcs->host_ia32_s_cet;
+		 * vmcs12->host_ssp = evmcs->host_ssp;
+		 * vmcs12->host_ia32_int_ssp_table_addr = evmcs->host_ia32_int_ssp_table_addr;
+		 */
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1729,6 +1740,8 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->tsc_offset = evmcs->tsc_offset;
 		vmcs12->virtual_apic_page_addr = evmcs->virtual_apic_page_addr;
 		vmcs12->xss_exit_bitmap = evmcs->xss_exit_bitmap;
+		vmcs12->encls_exiting_bitmap = evmcs->encls_exiting_bitmap;
+		vmcs12->tsc_multiplier = evmcs->tsc_multiplier;
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1776,6 +1789,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->guest_bndcfgs = evmcs->guest_bndcfgs;
 		vmcs12->guest_activity_state = evmcs->guest_activity_state;
 		vmcs12->guest_sysenter_cs = evmcs->guest_sysenter_cs;
+		vmcs12->guest_ia32_perf_global_ctrl = evmcs->guest_ia32_perf_global_ctrl;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->guest_ia32_s_cet = evmcs->guest_ia32_s_cet;
+		 * vmcs12->guest_ia32_lbr_ctl = evmcs->guest_ia32_lbr_ctl;
+		 * vmcs12->guest_ia32_int_ssp_table_addr = evmcs->guest_ia32_int_ssp_table_addr;
+		 */
 	}
 
 	/*
@@ -1878,12 +1898,23 @@ static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
 	 * evmcs->vm_exit_msr_store_count = vmcs12->vm_exit_msr_store_count;
 	 * evmcs->vm_exit_msr_load_count = vmcs12->vm_exit_msr_load_count;
 	 * evmcs->vm_entry_msr_load_count = vmcs12->vm_entry_msr_load_count;
+	 * evmcs->guest_ia32_perf_global_ctrl = vmcs12->guest_ia32_perf_global_ctrl;
+	 * evmcs->host_ia32_perf_global_ctrl = vmcs12->host_ia32_perf_global_ctrl;
+	 * evmcs->encls_exiting_bitmap = vmcs12->encls_exiting_bitmap;
+	 * evmcs->tsc_multiplier = vmcs12->tsc_multiplier;
 	 *
 	 * Not present in struct vmcs12:
 	 * evmcs->exit_io_instruction_ecx = vmcs12->exit_io_instruction_ecx;
 	 * evmcs->exit_io_instruction_esi = vmcs12->exit_io_instruction_esi;
 	 * evmcs->exit_io_instruction_edi = vmcs12->exit_io_instruction_edi;
 	 * evmcs->exit_io_instruction_eip = vmcs12->exit_io_instruction_eip;
+	 * evmcs->host_ia32_s_cet = vmcs12->host_ia32_s_cet;
+	 * evmcs->host_ssp = vmcs12->host_ssp;
+	 * evmcs->host_ia32_int_ssp_table_addr = vmcs12->host_ia32_int_ssp_table_addr;
+	 * evmcs->guest_ia32_s_cet = vmcs12->guest_ia32_s_cet;
+	 * evmcs->guest_ia32_lbr_ctl = vmcs12->guest_ia32_lbr_ctl;
+	 * evmcs->guest_ia32_int_ssp_table_addr = vmcs12->guest_ia32_int_ssp_table_addr;
+	 * evmcs->guest_ssp = vmcs12->guest_ssp;
 	 */
 
 	evmcs->guest_es_selector = vmcs12->guest_es_selector;
-- 
2.37.1.595.g718a3a8f04-goog
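
The copy routine touched by patch 11 skips whole groups of fields when the corresponding clean-field bit is set. A minimal sketch of that gating, assuming two cut-down structs and a placeholder bit value (none of the demo_/DEMO_ names exist in the kernel), could look like this:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CLEAN_CONTROL_GRP2 (1u << 2) /* placeholder bit value */

/* Cut-down stand-ins for the eVMCS and for struct vmcs12. */
struct demo_evmcs {
	uint64_t tsc_multiplier;
	uint64_t encls_exiting_bitmap;
};

struct demo_vmcs12 {
	uint64_t tsc_multiplier;
	uint64_t encls_exiting_bitmap;
};

/*
 * A set clean bit means the group is unchanged since it was last consumed,
 * so the group is copied only when its bit is clear.
 */
static void demo_copy_to_vmcs12(struct demo_vmcs12 *vmcs12,
				const struct demo_evmcs *evmcs,
				uint32_t hv_clean_fields)
{
	assert(vmcs12 && evmcs);

	if (!(hv_clean_fields & DEMO_CLEAN_CONTROL_GRP2)) {
		vmcs12->tsc_multiplier = evmcs->tsc_multiplier;
		vmcs12->encls_exiting_bitmap = evmcs->encls_exiting_bitmap;
	}
}
```

A new field therefore has to be added under the same group that its table entry names, otherwise a stale value could survive a skipped copy.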



* [RFC PATCH v6 12/36] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (10 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 11/36] KVM: nVMX: Support several new fields in eVMCSv1 Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 13/36] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Sean Christopherson
                   ` (24 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

KVM has to check the guest-visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
leaf to know which Enlightened VMCS definition to use (original or 2022
update). Cache the leaf along with the other Hyper-V CPUID feature leaves
to make the check quick.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 2 ++
 arch/x86/kvm/hyperv.c           | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2209724b765e..fa399329c9f8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -615,6 +615,8 @@ struct kvm_vcpu_hv {
 		u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
 		u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
 		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
+		u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
+		u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
 	} cpuid_cache;
 };
 
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index bf4729e8cc80..a7478b61088b 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2018,6 +2018,12 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu, bool hyperv_enabled)
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES);
 	if (entry)
 		hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
+
+	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES);
+	if (entry) {
+		hv_vcpu->cpuid_cache.nested_eax = entry->eax;
+		hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
+	}
 }
 
 int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
-- 
2.37.1.595.g718a3a8f04-goog
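
The caching in patch 12 amounts to one lookup when guest CPUID is set, so that later checks read a stored register value instead of searching the table. The sketch below is a userspace approximation: the demo_ structs and functions are made up, and zeroing the cache before the lookup is an assumption borrowed from the "Zero out entire Hyper-V CPUID cache" patch earlier in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HYPERV_CPUID_NESTED_FEATURES 0x4000000Au

struct demo_cpuid_entry {
	uint32_t function, eax, ebx, ecx, edx;
};

struct demo_cpuid_cache {
	uint32_t nested_eax;
	uint32_t nested_ebx;
};

static const struct demo_cpuid_entry *
demo_find_entry(const struct demo_cpuid_entry *entries, size_t n,
		uint32_t function)
{
	for (size_t i = 0; i < n; i++) {
		if (entries[i].function == function)
			return &entries[i];
	}
	return NULL;
}

/* Refresh the cached registers whenever the guest CPUID table is set. */
static void demo_update_cache(struct demo_cpuid_cache *cache,
			      const struct demo_cpuid_entry *entries, size_t n)
{
	const struct demo_cpuid_entry *entry;

	assert(cache);
	cache->nested_eax = 0;
	cache->nested_ebx = 0;

	entry = demo_find_entry(entries, n, DEMO_HYPERV_CPUID_NESTED_FEATURES);
	if (entry) {
		cache->nested_eax = entry->eax;
		cache->nested_ebx = entry->ebx;
	}
}
```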



* [RFC PATCH v6 13/36] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (11 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 12/36] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 14/36] KVM: selftests: Switch to updated eVMCSv1 definition Sean Christopherson
                   ` (23 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

The updated Enlightened VMCS definition has an 'encls_exiting_bitmap'
field that needs to be mapped to the VMCS; add the missing encoding.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 tools/testing/selftests/kvm/include/x86_64/vmx.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 99fa1410964c..7d8c980317f7 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -208,6 +208,8 @@ enum vmcs_field {
 	VMWRITE_BITMAP_HIGH		= 0x00002029,
 	XSS_EXIT_BITMAP			= 0x0000202C,
 	XSS_EXIT_BITMAP_HIGH		= 0x0000202D,
+	ENCLS_EXITING_BITMAP		= 0x0000202E,
+	ENCLS_EXITING_BITMAP_HIGH	= 0x0000202F,
 	TSC_MULTIPLIER			= 0x00002032,
 	TSC_MULTIPLIER_HIGH		= 0x00002033,
 	GUEST_PHYSICAL_ADDRESS		= 0x00002400,
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 14/36] KVM: selftests: Switch to updated eVMCSv1 definition
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (12 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 13/36] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 15/36] KVM: nVMX: WARN once and fail VM-Enter if eVMCS sees VMFUNC[63:32] != 0 Sean Christopherson
                   ` (22 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Update the Enlightened VMCS definition in selftests to match KVM's.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/include/x86_64/evmcs.h      | 45 +++++++++++++++++--
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index 3c9260f8e116..58db74f68af2 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -203,14 +203,25 @@ struct hv_enlightened_vmcs {
 		u32 reserved:30;
 	} hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
 	u64 guest_bndcfgs;
-	u64 padding64_5[7];
+	u64 guest_ia32_perf_global_ctrl;
+	u64 guest_ia32_s_cet;
+	u64 guest_ssp;
+	u64 guest_ia32_int_ssp_table_addr;
+	u64 guest_ia32_lbr_ctl;
+	u64 padding64_5[2];
 	u64 xss_exit_bitmap;
-	u64 padding64_6[7];
+	u64 encls_exiting_bitmap;
+	u64 host_ia32_perf_global_ctrl;
+	u64 tsc_multiplier;
+	u64 host_ia32_s_cet;
+	u64 host_ssp;
+	u64 host_ia32_int_ssp_table_addr;
+	u64 padding64_6;
 };
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE                     0
@@ -656,6 +667,18 @@ static inline int evmcs_vmread(uint64_t encoding, uint64_t *value)
 	case VIRTUAL_PROCESSOR_ID:
 		*value = current_evmcs->virtual_processor_id;
 		break;
+	case HOST_IA32_PERF_GLOBAL_CTRL:
+		*value = current_evmcs->host_ia32_perf_global_ctrl;
+		break;
+	case GUEST_IA32_PERF_GLOBAL_CTRL:
+		*value = current_evmcs->guest_ia32_perf_global_ctrl;
+		break;
+	case ENCLS_EXITING_BITMAP:
+		*value = current_evmcs->encls_exiting_bitmap;
+		break;
+	case TSC_MULTIPLIER:
+		*value = current_evmcs->tsc_multiplier;
+		break;
 	default: return 1;
 	}
 
@@ -1169,6 +1192,22 @@ static inline int evmcs_vmwrite(uint64_t encoding, uint64_t value)
 		current_evmcs->virtual_processor_id = value;
 		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_XLAT;
 		break;
+	case HOST_IA32_PERF_GLOBAL_CTRL:
+		current_evmcs->host_ia32_perf_global_ctrl = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1;
+		break;
+	case GUEST_IA32_PERF_GLOBAL_CTRL:
+		current_evmcs->guest_ia32_perf_global_ctrl = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1;
+		break;
+	case ENCLS_EXITING_BITMAP:
+		current_evmcs->encls_exiting_bitmap = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+		break;
+	case TSC_MULTIPLIER:
+		current_evmcs->tsc_multiplier = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+		break;
 	default: return 1;
 	}
 
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 15/36] KVM: nVMX: WARN once and fail VM-Enter if eVMCS sees VMFUNC[63:32] != 0
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (13 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 14/36] KVM: selftests: Switch to updated eVMCSv1 definition Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 16/36] KVM: nVMX: Support PERF_GLOBAL_CTRL with enlightened VMCS Sean Christopherson
                   ` (21 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

WARN and reject nested VM-Enter if KVM is using eVMCS and manages to
allow a non-zero value in the upper 32 bits of VM-function controls.  The
eVMCS code assumes all inputs are 32-bit values and subtly drops the
upper bits.  WARN instead of adding proper "support", as it's unlikely the
upper bits will be defined/used in the next decade.
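
As a rough illustration of the truncation being guarded against (a
standalone sketch, not KVM code; controls_valid_32(), check_vmfunc() and
example_truncation() are hypothetical names), a checker that takes a
32-bit argument silently accepts a 64-bit value whose only offending bits
are in 63:32:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical checker that, like the eVMCS helper, only sees 32 bits. */
static bool controls_valid_32(uint32_t supported, uint32_t val)
{
	return (val & ~supported) == 0;
}

/* Returns 0 if the 64-bit control value is acceptable, -1 otherwise. */
static int check_vmfunc(uint32_t supported, uint64_t vmfunc)
{
	/* Reject anything in bits 63:32 before the value gets truncated. */
	if (vmfunc >> 32)
		return -1;

	return controls_valid_32(supported, (uint32_t)vmfunc) ? 0 : -1;
}

/* Usage: bit 32 is set, which the 32-bit checker alone cannot see. */
static void example_truncation(void)
{
	uint64_t vmfunc = 0x100000001ULL;

	assert(controls_valid_32(0x1, (uint32_t)vmfunc)); /* truncated: passes */
	assert(check_vmfunc(0x1, vmfunc) == -1);          /* guarded: fails */
}
```

The explicit shift-and-test mirrors the intent of the WARN_ON_ONCE() in
the patch; the return values here are illustrative only.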

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/evmcs.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index ce358b13b75b..bd1dcc077c85 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -486,6 +486,14 @@ int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 					       vmcs12->vm_entry_controls)))
 		return -EINVAL;
 
+	/*
+	 * VM-Func controls are 64-bit, but KVM currently doesn't support any
+	 * controls in bits 63:32, i.e. dropping those bits on the consistency
+	 * check is intentional.
+	 */
+	if (WARN_ON_ONCE(vmcs12->vm_function_control >> 32))
+		return -EINVAL;
+
 	if (CC(!nested_evmcs_is_valid_controls(vcpu, EVMCS_VMFUNC,
 					       vmcs12->vm_function_control)))
 		return -EINVAL;
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 16/36] KVM: nVMX: Support PERF_GLOBAL_CTRL with enlightened VMCS
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (14 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 15/36] KVM: nVMX: WARN once and fail VM-Enter if eVMCS sees VMFUNC[63:32] != 0 Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 17/36] KVM: nVMX: Support TSC scaling " Sean Christopherson
                   ` (20 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Enlightened VMCS v1 got updated and now includes the fields required for
the "load PERF_GLOBAL_CTRL" VM-Entry and VM-Exit controls. For KVM on
Hyper-V, KVM can just observe the VMX control MSRs and use the features
(with or without eVMCS) when possible.

Hyper-V on KVM is messier as Windows 11 guests fail to boot if the
controls are advertised and a new PV feature flag, CPUID.0x4000000A.EBX
BIT(0), is not set.  Honor the Hyper-V CPUID feature flag to play nice
with Windows guests.
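
A minimal standalone sketch of the filtering idea, assuming the usual
VMX control MSR layout (allowed-1 settings in bits 63:32) and the bit
values noted in the comments; filter_ctl_msr(), enum ctl_msr and
example_filter() are hypothetical names, not KVM's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as I understand them from the SDM/TLFS; treat as assumptions. */
#define VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL	0x00001000u
#define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL	0x00002000u
#define HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL	(1u << 0)

enum ctl_msr { EXIT_CTLS, ENTRY_CTLS };

/* Hide the PERF_GLOBAL_CTRL control unless the PV CPUID flag is set. */
static uint64_t filter_ctl_msr(enum ctl_msr which, uint64_t msr_val,
			       uint32_t nested_ebx)
{
	bool has_pgc = nested_ebx & HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL;
	uint32_t unsupported = 0;

	if (!has_pgc)
		unsupported |= (which == EXIT_CTLS) ?
			       VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL :
			       VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;

	/* Clear the bit from the allowed-1 half only. */
	return msr_val & ~((uint64_t)unsupported << 32);
}

/* Usage: the control is advertised only when EBX bit 0 is set. */
static void example_filter(void)
{
	uint64_t exit_msr = (uint64_t)VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL << 32;

	assert(filter_ctl_msr(EXIT_CTLS, exit_msr, 0) == 0);
	assert(filter_ctl_msr(EXIT_CTLS, exit_msr, 1) == exit_msr);
}
```

The sketch only covers the PERF_GLOBAL_CTRL bits; the other unsupported
controls handled by nested_evmcs_get_unsupported_ctrls() are left out.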

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/hyperv.c    |  2 +-
 arch/x86/kvm/vmx/evmcs.c | 32 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/evmcs.h |  7 ++-----
 3 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a7478b61088b..0adf4a437e85 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2546,7 +2546,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 		case HYPERV_CPUID_NESTED_FEATURES:
 			ent->eax = evmcs_ver;
 			ent->eax |= HV_X64_NESTED_MSR_BITMAP;
-
+			ent->ebx |= HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL;
 			break;
 
 		case HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS:
diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index bd1dcc077c85..38ec41939cab 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -442,6 +442,23 @@ u64 nested_evmcs_get_unsupported_ctrls(struct vcpu_vmx *vmx, u32 msr_index)
 	return 0;
 }
 
+static bool evmcs_has_perf_global_ctrl(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+
+	/*
+	 * PERF_GLOBAL_CTRL has a quirk where some Windows guests may fail to
+	 * boot if a PV CPUID feature flag is not also set.  Treat the fields
+	 * as unsupported if the flag is not set in guest CPUID.  This should
+	 * be called only for guest accesses, and all guest accesses should be
+	 * gated on Hyper-V being enabled and initialized.
+	 */
+	if (WARN_ON_ONCE(!hv_vcpu))
+		return false;
+
+	return hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_PERF_GLOBAL_CTRL;
+}
+
 void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu,
 				     struct msr_data *msr_info)
 {
@@ -455,6 +472,21 @@ void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu,
 		return;
 
 	unsupported_ctrls = nested_evmcs_get_unsupported_ctrls(vmx, msr_info->index);
+	switch (msr_info->index) {
+	case MSR_IA32_VMX_EXIT_CTLS:
+	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
+		if (!evmcs_has_perf_global_ctrl(vcpu))
+			unsupported_ctrls |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+		break;
+	case MSR_IA32_VMX_ENTRY_CTLS:
+	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
+		if (!evmcs_has_perf_global_ctrl(vcpu))
+			unsupported_ctrls |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		break;
+	default:
+		break;
+	}
+
 	if (msr_info->index == MSR_IA32_VMX_VMFUNC)
 		msr_info->data &= ~unsupported_ctrls;
 	else
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index e2b3aeee57ac..35b326386c50 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -43,8 +43,6 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
  *	PLE_GAP                         = 0x00004020,
  *	PLE_WINDOW                      = 0x00004022,
  *	VMX_PREEMPTION_TIMER_VALUE      = 0x0000482E,
- *      GUEST_IA32_PERF_GLOBAL_CTRL     = 0x00002808,
- *      HOST_IA32_PERF_GLOBAL_CTRL      = 0x00002c04,
  *
  * Currently unsupported in KVM:
  *	GUEST_IA32_RTIT_CTL		= 0x00002814,
@@ -62,9 +60,8 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
 	 SECONDARY_EXEC_TSC_SCALING |					\
 	 SECONDARY_EXEC_PAUSE_LOOP_EXITING)
 #define EVMCS1_UNSUPPORTED_VMEXIT_CTRL					\
-	(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |				\
-	 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
-#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
+	(VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
+#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (0)
 #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
 
 /* TODO: explicitly define these */
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 17/36] KVM: nVMX: Support TSC scaling with enlightened VMCS
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (15 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 16/36] KVM: nVMX: Support PERF_GLOBAL_CTRL with enlightened VMCS Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 18/36] KVM: selftests: Enable TSC scaling in evmcs selftest Sean Christopherson
                   ` (19 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Enlightened VMCS v1 got updated and now includes the required fields for
TSC scaling.  Enable TSC scaling for both KVM-on-Hyper-V and Hyper-V-on-KVM
simply by dropping the relevant fields from the unsupported controls.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
[sean: split to separate patch (from PERF_GLOBAL_CTRL)]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/evmcs.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index 35b326386c50..a2e21bdd17bb 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -38,8 +38,6 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
  *	EPTP_LIST_ADDRESS               = 0x00002024,
  *	VMREAD_BITMAP                   = 0x00002026,
  *	VMWRITE_BITMAP                  = 0x00002028,
- *
- *	TSC_MULTIPLIER                  = 0x00002032,
  *	PLE_GAP                         = 0x00004020,
  *	PLE_WINDOW                      = 0x00004022,
  *	VMX_PREEMPTION_TIMER_VALUE      = 0x0000482E,
@@ -57,7 +55,6 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
 	 SECONDARY_EXEC_ENABLE_PML |					\
 	 SECONDARY_EXEC_ENABLE_VMFUNC |					\
 	 SECONDARY_EXEC_SHADOW_VMCS |					\
-	 SECONDARY_EXEC_TSC_SCALING |					\
 	 SECONDARY_EXEC_PAUSE_LOOP_EXITING)
 #define EVMCS1_UNSUPPORTED_VMEXIT_CTRL					\
 	(VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 18/36] KVM: selftests: Enable TSC scaling in evmcs selftest
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (16 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 17/36] KVM: nVMX: Support TSC scaling " Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 19/36] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Sean Christopherson
                   ` (18 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

The updated Enlightened VMCS v1 definition enables TSC scaling; test
that SECONDARY_EXEC_TSC_SCALING can now be enabled.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../testing/selftests/kvm/x86_64/evmcs_test.c | 31 +++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
index 99bc202243d2..21a7a792a010 100644
--- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
@@ -18,6 +18,9 @@
 
 #include "vmx.h"
 
+/* Test flags */
+#define HOST_HAS_TSC_SCALING BIT(0)
+
 static int ud_count;
 
 static void guest_ud_handler(struct ex_regs *regs)
@@ -64,11 +67,14 @@ void l2_guest_code(void)
 	vmcall();
 	rdmsr_gs_base(); /* intercepted */
 
+	/* TSC scaling */
+	vmcall();
+
 	/* Done, exit to L1 and never come back.  */
 	vmcall();
 }
 
-void guest_code(struct vmx_pages *vmx_pages)
+void guest_code(struct vmx_pages *vmx_pages, u64 test_flags)
 {
 #define L2_GUEST_STACK_SIZE 64
 	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
@@ -150,6 +156,18 @@ void guest_code(struct vmx_pages *vmx_pages)
 	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
 	GUEST_SYNC(11);
 
+	if (test_flags & HOST_HAS_TSC_SCALING) {
+		GUEST_ASSERT((rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+			     SECONDARY_EXEC_TSC_SCALING);
+		/* Try enabling TSC scaling */
+		vmwrite(SECONDARY_VM_EXEC_CONTROL, vmreadz(SECONDARY_VM_EXEC_CONTROL) |
+			SECONDARY_EXEC_TSC_SCALING);
+		vmwrite(TSC_MULTIPLIER, 1);
+	}
+	GUEST_ASSERT(!vmresume());
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+	GUEST_SYNC(12);
+
 	/* Try enlightened vmptrld with an incorrect GPA */
 	evmcs_vmptrld(0xdeadbeef, vmx_pages->enlightened_vmcs);
 	GUEST_ASSERT(vmlaunch());
@@ -204,6 +222,7 @@ int main(int argc, char *argv[])
 	struct kvm_vm *vm;
 	struct kvm_run *run;
 	struct ucall uc;
+	u64 test_flags = 0;
 	int stage;
 
 	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
@@ -212,11 +231,19 @@ int main(int argc, char *argv[])
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));
 
+	if ((kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+	    SECONDARY_EXEC_TSC_SCALING) {
+		test_flags |= HOST_HAS_TSC_SCALING;
+		pr_info("TSC scaling is supported, adding to test\n");
+	} else {
+		pr_info("TSC scaling is not supported\n");
+	}
+
 	vcpu_set_hv_cpuid(vcpu);
 	vcpu_enable_evmcs(vcpu);
 
 	vcpu_alloc_vmx(vm, &vmx_pages_gva);
-	vcpu_args_set(vcpu, 1, vmx_pages_gva);
+	vcpu_args_set(vcpu, 2, vmx_pages_gva, test_flags);
 
 	vm_init_descriptor_tables(vm);
 	vcpu_init_descriptor_tables(vcpu);
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 19/36] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (17 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 18/36] KVM: selftests: Enable TSC scaling in evmcs selftest Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 20/36] KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to vmcs02 Sean Christopherson
                   ` (17 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

With the updated eVMCSv1 definition, there are no known 'problematic'
controls that are exposed in VMX control MSRs but not present in
eVMCSv1: all known Hyper-V versions either don't expose the new fields
(by not setting the bits in the VMX feature controls) or support the new
eVMCS revision.

Get rid of VMX control MSRs filtering for KVM on Hyper-V.

Note: VMX control MSRs filtering for Hyper-V on KVM
(nested_evmcs_filter_control_msr()) stays, as even the updated eVMCSv1
definition doesn't have all the features implemented by KVM and some
fields are still missing. Moreover, nested_evmcs_filter_control_msr()
has to support the original eVMCSv1 version when the VMM requests it.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/evmcs.c | 13 -------------
 arch/x86/kvm/vmx/evmcs.h |  1 -
 arch/x86/kvm/vmx/vmx.c   |  5 -----
 3 files changed, 19 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 38ec41939cab..2365e81cfc6e 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -322,19 +322,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 };
 const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
 
-#if IS_ENABLED(CONFIG_HYPERV)
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
-{
-	vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
-	vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
-	vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
-	vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
-
-	vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
-	vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
-}
-#endif
-
 bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
 {
 	struct hv_vp_assist_page assist_page;
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index a2e21bdd17bb..33cd4623bb0b 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -215,7 +215,6 @@ static inline void evmcs_load(u64 phys_addr)
 	vp_ap->enlighten_vmentry = 1;
 }
 
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
 #else /* !IS_ENABLED(CONFIG_HYPERV) */
 static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
 static inline void evmcs_write32(unsigned long field, u32 value) {}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 73f9074efc61..6b702c0085ff 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2762,11 +2762,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	vmcs_conf->vmexit_ctrl         = _vmexit_control;
 	vmcs_conf->vmentry_ctrl        = _vmentry_control;
 
-#if IS_ENABLED(CONFIG_HYPERV)
-	if (enlightened_vmcs)
-		evmcs_sanitize_exec_ctrls(vmcs_conf);
-#endif
-
 	return 0;
 }
 
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 20/36] KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to vmcs02
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (18 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 19/36] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 21/36] KVM: nVMX: Always emulate PERF_GLOBAL_CTRL VM-Entry/VM-Exit controls Sean Christopherson
                   ` (16 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

Don't propagate vmcs12's VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL to vmcs02.
KVM doesn't disallow L1 from using VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
even when KVM itself doesn't use the control, e.g. due to the various
CPU errata where the MSR can be corrupted on VM-Exit.

Preserve KVM's (vmcs01) setting to hopefully avoid having to toggle the
bit in vmcs02 at a later point.  E.g. if KVM is loading PERF_GLOBAL_CTRL
when running L1, then odds are good KVM will also load the MSR when
running L2.
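
The merge described above can be sketched in isolation as follows. This
is an illustration under assumed bit values, with hypothetical names
(early_entry_controls(), example_merge()); it omits the IA32E_MODE and
EFER handling that prepare_vmcs02_early() also performs:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values, taken from my reading of the SDM. */
#define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL	0x00002000u
#define VM_ENTRY_LOAD_IA32_PAT			0x00004000u

/* Start from vmcs01, then add vmcs12's bits except PERF_GLOBAL_CTRL. */
static uint32_t early_entry_controls(uint32_t vmcs01_ctrls,
				     uint32_t vmcs12_ctrls)
{
	uint32_t ctrls = vmcs01_ctrls;

	ctrls |= vmcs12_ctrls & ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
	return ctrls;
}

/* Usage: L1's request for the control does not leak into vmcs02. */
static void example_merge(void)
{
	uint32_t vmcs12 = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
			  VM_ENTRY_LOAD_IA32_PAT;

	assert(early_entry_controls(0, vmcs12) == VM_ENTRY_LOAD_IA32_PAT);
}
```

Because the bit is masked from vmcs12 only, whatever vmcs01 has for
PERF_GLOBAL_CTRL carries over to the result unchanged.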

Fixes: 8bf00a529967 ("KVM: VMX: add support for switching of PERF_GLOBAL_CTRL")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 57e96f4ab765..eed7551dd63c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2368,9 +2368,14 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
 	 * are emulated by vmx_set_efer() in prepare_vmcs02(), but speculate
 	 * on the related bits (if supported by the CPU) in the hope that
 	 * we can avoid VMWrites during vmx_set_efer().
+	 *
+	 * Similarly, take vmcs01's PERF_GLOBAL_CTRL in the hope that if KVM is
+	 * loading PERF_GLOBAL_CTRL via the VMCS for L1, then KVM will want to
+	 * do the same for L2.
 	 */
 	exec_control = __vm_entry_controls_get(vmcs01);
-	exec_control |= vmcs12->vm_entry_controls;
+	exec_control |= (vmcs12->vm_entry_controls &
+			 ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
 	exec_control &= ~(VM_ENTRY_IA32E_MODE | VM_ENTRY_LOAD_IA32_EFER);
 	if (cpu_has_load_ia32_efer()) {
 		if (guest_efer & EFER_LMA)
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 21/36] KVM: nVMX: Always emulate PERF_GLOBAL_CTRL VM-Entry/VM-Exit controls
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (19 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 20/36] KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to vmcs02 Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 22/36] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Sean Christopherson
                   ` (15 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

Advertise VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL as being supported
for nested VMs irrespective of hardware support.  KVM fully emulates
the controls, i.e. manually emulates MSR writes on entry/exit, and never
propagates the guest settings directly to vmcs02.

In addition to allowing L1 VMMs to use the controls on older hardware,
unconditionally advertising the controls will also allow KVM to use its
vmcs01 configuration as the basis for the nested VMX configuration
without causing a regression (due to the errata that cause KVM to "hide"
the control from vmcs01 but not vmcs12).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index eed7551dd63c..6e9b32744e0d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6611,11 +6611,12 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		VM_EXIT_HOST_ADDR_SPACE_SIZE |
 #endif
 		VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
-		VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+		VM_EXIT_CLEAR_BNDCFGS;
 	msrs->exit_ctls_high |=
 		VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
 		VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER |
-		VM_EXIT_SAVE_VMX_PREEMPTION_TIMER | VM_EXIT_ACK_INTR_ON_EXIT;
+		VM_EXIT_SAVE_VMX_PREEMPTION_TIMER | VM_EXIT_ACK_INTR_ON_EXIT |
+		VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
 
 	/* We support free control of debug control saving. */
 	msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;
@@ -6630,10 +6631,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 #ifdef CONFIG_X86_64
 		VM_ENTRY_IA32E_MODE |
 #endif
-		VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS |
-		VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
 	msrs->entry_ctls_high |=
-		(VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER);
+		(VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR | VM_ENTRY_LOAD_IA32_EFER |
+		 VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
 
 	/* We support free control of debug control loading. */
 	msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 22/36] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (20 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 21/36] KVM: nVMX: Always emulate PERF_GLOBAL_CTRL VM-Entry/VM-Exit controls Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 23/36] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Sean Christopherson
                   ` (14 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

The VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer(),
but setup_vmcs_config() doesn't check for its existence. In contrast,
nested_vmx_setup_ctls_msrs() does set it on x86_64. Add the missing
check and filter the bit out in vmx_vmentry_ctrl().

No (real) functional change intended as all existing CPUs supporting
long mode and VMX are supposed to have it.
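
For readers unfamiliar with the min/opt split, here is a standalone
sketch of what moving a bit into "min" implies. It is modeled loosely on
the shape of adjust_vmx_controls() but takes the MSR value as a parameter
instead of reading hardware; adjust_controls() and example_min_opt() are
hypothetical names and the bit values are assumptions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values, taken from my reading of the SDM. */
#define VM_ENTRY_LOAD_DEBUG_CONTROLS	0x00000004u
#define VM_ENTRY_IA32E_MODE		0x00000200u
#define VM_ENTRY_LOAD_IA32_EFER		0x00008000u

/* Returns 0 and fills *result, or -1 if a required ("min") bit is missing. */
static int adjust_controls(uint32_t min, uint32_t opt, uint64_t msr_val,
			   uint32_t *result)
{
	uint32_t must_be_one = (uint32_t)msr_val;         /* low word */
	uint32_t may_be_one  = (uint32_t)(msr_val >> 32); /* high word */
	uint32_t ctl = min | opt;

	ctl &= may_be_one;  /* drop anything the CPU cannot set */
	ctl |= must_be_one; /* add anything the CPU forces on */

	if (min & ~ctl)
		return -1;

	*result = ctl;
	return 0;
}

/* Usage: a missing optional bit is dropped, a missing min bit is fatal. */
static void example_min_opt(void)
{
	uint32_t min = VM_ENTRY_LOAD_DEBUG_CONTROLS | VM_ENTRY_IA32E_MODE;
	uint32_t opt = VM_ENTRY_LOAD_IA32_EFER;
	uint32_t ctl = 0;

	assert(adjust_controls(min, opt, (uint64_t)min << 32, &ctl) == 0);
	assert(ctl == min);
	assert(adjust_controls(min, opt,
			       (uint64_t)VM_ENTRY_LOAD_DEBUG_CONTROLS << 32,
			       &ctl) == -1);
}
```

In other words, with the bit in "min", a CPU that lacks it makes the
configuration fail instead of letting the control be silently assumed.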

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6b702c0085ff..eff38cbe6d35 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2683,6 +2683,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
 	min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
+#ifdef CONFIG_X86_64
+	min |= VM_ENTRY_IA32E_MODE;
+#endif
 	opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
 	      VM_ENTRY_LOAD_IA32_PAT |
 	      VM_ENTRY_LOAD_IA32_EFER |
@@ -4317,9 +4320,14 @@ static u32 vmx_vmentry_ctrl(void)
 	if (vmx_pt_mode_is_system())
 		vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
 				  VM_ENTRY_LOAD_IA32_RTIT_CTL);
-	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
-	return vmentry_ctrl &
-		~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_IA32_EFER);
+	/*
+	 * IA32e mode, and loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically.
+	 */
+	vmentry_ctrl &= ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
+			  VM_ENTRY_LOAD_IA32_EFER |
+			  VM_ENTRY_IA32E_MODE);
+
+	return vmentry_ctrl;
 }
 
 static u32 vmx_vmexit_ctrl(void)
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 23/36] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (21 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 22/36] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 24/36] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Sean Christopherson
                   ` (13 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
vmx_enable_{irq,nmi}_window(), handle_interrupt_window(), and
handle_nmi_window(), but setup_vmcs_config() doesn't check for their
existence. Add the check and filter the controls out in vmx_exec_control().

Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs
are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis()
accordingly.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/capabilities.h | 3 ++-
 arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index c5e5dfef69c7..faee1db8b0e0 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -82,7 +82,8 @@ static inline bool cpu_has_vmx_basic_inout(void)
 
 static inline bool cpu_has_virtual_nmis(void)
 {
-	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
+	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
+	       vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;
 }
 
 static inline bool cpu_has_vmx_preemption_timer(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index eff38cbe6d35..7acbe43030e4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2560,10 +2560,12 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      CPU_BASED_MWAIT_EXITING |
 	      CPU_BASED_MONITOR_EXITING |
 	      CPU_BASED_INVLPG_EXITING |
-	      CPU_BASED_RDPMC_EXITING;
+	      CPU_BASED_RDPMC_EXITING |
+	      CPU_BASED_INTR_WINDOW_EXITING;
 
 	opt = CPU_BASED_TPR_SHADOW |
 	      CPU_BASED_USE_MSR_BITMAPS |
+	      CPU_BASED_NMI_WINDOW_EXITING |
 	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
 	      CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
@@ -4374,6 +4376,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
 
+	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
+	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
+			  CPU_BASED_NMI_WINDOW_EXITING);
+
 	if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
 		exec_control &= ~CPU_BASED_MOV_DR_EXITING;
 
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 24/36] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config()
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (22 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 23/36] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 25/36] KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit kernels/KVM Sean Christopherson
                   ` (12 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

SECONDARY_EXEC_ENCLS_EXITING is the only control which is conditionally
added to the 'optional' checklist in setup_vmcs_config() but the special
case can be avoided by always checking for its presence first and filtering
out the result later.

Note: the situation when SECONDARY_EXEC_ENCLS_EXITING is present but
cpu_has_sgx() is false is possible when SGX is "soft-disabled", e.g. if
software writes MCE control MSRs or there's an uncorrectable #MC.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
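
The "check first, filter later" pattern can be sketched in isolation as
follows. This is an assumption-laden model, not the kernel implementation:
the capability MSR is replaced by plain 'fixed1'/'allowed1' parameters, all
sketch_* names are hypothetical, and the bit positions are placeholders.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_OTHER_OPTIONAL (1u << 1)		/* placeholder bit */
#define SKETCH_ENCLS_EXITING  (1u << 15)	/* placeholder bit */

/*
 * Loose model of the min/opt negotiation.  'allowed1' stands in for the
 * bits the CPU permits to be 1, 'fixed1' for the bits it forces to 1.
 * Returns 0 on success, -1 if a required control is unavailable.
 */
static int sketch_adjust_controls(uint32_t min, uint32_t opt,
				  uint32_t fixed1, uint32_t allowed1,
				  uint32_t *result)
{
	uint32_t ctl = min | opt;

	ctl &= allowed1;	/* drop what the CPU cannot set */
	ctl |= fixed1;		/* add what the CPU forces on */
	if (min & ~ctl)
		return -1;
	*result = ctl;
	return 0;
}

/* Always ask for ENCLS exiting, then filter on SGX availability. */
static int sketch_setup_secondary(uint32_t allowed1, bool has_sgx,
				  uint32_t *result)
{
	uint32_t opt = SKETCH_OTHER_OPTIONAL | SKETCH_ENCLS_EXITING;

	if (sketch_adjust_controls(0, opt, 0, allowed1, result))
		return -1;
	if (!has_sgx)
		*result &= ~SKETCH_ENCLS_EXITING;
	return 0;
}
```

The point of the shape is that the optional list becomes a compile-time
constant, while the SGX dependency is applied to the negotiated result.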

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7acbe43030e4..e694eb2190f3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2601,9 +2601,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 			SECONDARY_EXEC_PT_CONCEAL_VMX |
 			SECONDARY_EXEC_ENABLE_VMFUNC |
 			SECONDARY_EXEC_BUS_LOCK_DETECTION |
-			SECONDARY_EXEC_NOTIFY_VM_EXITING;
-		if (cpu_has_sgx())
-			opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
+			SECONDARY_EXEC_NOTIFY_VM_EXITING |
+			SECONDARY_EXEC_ENCLS_EXITING;
+
 		if (adjust_vmx_controls(min2, opt2,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
@@ -2650,6 +2650,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		vmx_cap->vpid = 0;
 	}
 
+	if (!cpu_has_sgx())
+		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
+
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
 		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
 
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 25/36] KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit kernels/KVM
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (23 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 24/36] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 26/36] KVM: VMX: Extend VMX controls macro shenanigans Sean Christopherson
                   ` (11 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

Don't toggle VM_ENTRY_IA32E_MODE in 32-bit kernels/KVM and instead bug
the VM if KVM attempts to run the guest with EFER.LMA=1. KVM doesn't
support running 64-bit guests with 32-bit hosts.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e694eb2190f3..cbb88d1fd55d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3035,10 +3035,15 @@ int vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 		return 0;
 
 	vcpu->arch.efer = efer;
+#ifdef CONFIG_X86_64
 	if (efer & EFER_LMA)
 		vm_entry_controls_setbit(vmx, VM_ENTRY_IA32E_MODE);
 	else
 		vm_entry_controls_clearbit(vmx, VM_ENTRY_IA32E_MODE);
+#else
+	if (KVM_BUG_ON(efer & EFER_LMA, vcpu->kvm))
+		return 1;
+#endif
 
 	vmx_setup_uret_msrs(vmx);
 	return 0;
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 26/36] KVM: VMX: Extend VMX controls macro shenanigans
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (24 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 25/36] KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit kernels/KVM Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 27/36] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Sean Christopherson
                   ` (10 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

When VMX controls macros are used to set or clear a control bit, make
sure that this bit was checked in setup_vmcs_config() and thus is properly
reflected in vmcs_config.

Opportunistically drop pointless "< 0" check for adjust_vmx_controls()'s
return value.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 112 +++++++----------------------
 arch/x86/kvm/vmx/vmx.h | 155 +++++++++++++++++++++++++++++++++++------
 2 files changed, 156 insertions(+), 111 deletions(-)
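
As a rough userspace analogue of the compile-time check added to the
setbit/clearbit helpers, the sketch below uses C11 _Static_assert in place
of BUILD_BUG_ON. The macro names and bit values are hypothetical. In the
kernel the check sits inside a function, so the argument has to be visible
as a constant after inlining, which is presumably why the helpers and two
of their callers become __always_inline in this patch.

```c
#include <stdint.h>

/* Placeholder "negotiated at setup" masks for one control field. */
#define SKETCH_REQUIRED_CONTROLS (1u << 2)
#define SKETCH_OPTIONAL_CONTROLS ((1u << 9) | (1u << 13))

/*
 * Set 'bit' in *shadow.  The build fails if 'bit' does not overlap the
 * required or optional masks, i.e. if it was never checked at setup.
 * 'bit' must be an integer constant expression.
 */
#define sketch_controls_setbit(shadow, bit)				\
do {									\
	_Static_assert(((bit) & (SKETCH_REQUIRED_CONTROLS |		\
				 SKETCH_OPTIONAL_CONTROLS)) != 0,	\
		       "control bit was not negotiated at setup");	\
	*(shadow) |= (bit);						\
} while (0)
```

Calling sketch_controls_setbit(&shadow, 1u << 9) compiles, whereas passing
1u << 30 would be rejected at build time under these placeholder masks.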

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cbb88d1fd55d..5f5e48d0dbcb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -864,7 +864,7 @@ unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx)
 	return flags;
 }
 
-static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
+static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 		unsigned long entry, unsigned long exit)
 {
 	vm_entry_controls_clearbit(vmx, entry);
@@ -922,7 +922,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
 }
 
-static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
+static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 		unsigned long entry, unsigned long exit,
 		unsigned long guest_val_vmcs, unsigned long host_val_vmcs,
 		u64 guest_val, u64 host_val)
@@ -2521,7 +2521,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 				    struct vmx_capability *vmx_cap)
 {
 	u32 vmx_msr_low, vmx_msr_high;
-	u32 min, opt, min2, opt2;
 	u32 _pin_based_exec_control = 0;
 	u32 _cpu_based_exec_control = 0;
 	u32 _cpu_based_2nd_exec_control = 0;
@@ -2547,29 +2546,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	};
 
 	memset(vmcs_conf, 0, sizeof(*vmcs_conf));
-	min = CPU_BASED_HLT_EXITING |
-#ifdef CONFIG_X86_64
-	      CPU_BASED_CR8_LOAD_EXITING |
-	      CPU_BASED_CR8_STORE_EXITING |
-#endif
-	      CPU_BASED_CR3_LOAD_EXITING |
-	      CPU_BASED_CR3_STORE_EXITING |
-	      CPU_BASED_UNCOND_IO_EXITING |
-	      CPU_BASED_MOV_DR_EXITING |
-	      CPU_BASED_USE_TSC_OFFSETTING |
-	      CPU_BASED_MWAIT_EXITING |
-	      CPU_BASED_MONITOR_EXITING |
-	      CPU_BASED_INVLPG_EXITING |
-	      CPU_BASED_RDPMC_EXITING |
-	      CPU_BASED_INTR_WINDOW_EXITING;
 
-	opt = CPU_BASED_TPR_SHADOW |
-	      CPU_BASED_USE_MSR_BITMAPS |
-	      CPU_BASED_NMI_WINDOW_EXITING |
-	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
-	      CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
-				&_cpu_based_exec_control) < 0)
+	if (adjust_vmx_controls(KVM_REQUIRED_VMX_CPU_BASED_VM_EXEC_CONTROL,
+				KVM_OPTIONAL_VMX_CPU_BASED_VM_EXEC_CONTROL,
+				MSR_IA32_VMX_PROCBASED_CTLS,
+				&_cpu_based_exec_control))
 		return -EIO;
 #ifdef CONFIG_X86_64
 	if (_cpu_based_exec_control & CPU_BASED_TPR_SHADOW)
@@ -2577,36 +2558,10 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 					   ~CPU_BASED_CR8_STORE_EXITING;
 #endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
-		min2 = 0;
-		opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
-			SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
-			SECONDARY_EXEC_WBINVD_EXITING |
-			SECONDARY_EXEC_ENABLE_VPID |
-			SECONDARY_EXEC_ENABLE_EPT |
-			SECONDARY_EXEC_UNRESTRICTED_GUEST |
-			SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-			SECONDARY_EXEC_DESC |
-			SECONDARY_EXEC_ENABLE_RDTSCP |
-			SECONDARY_EXEC_ENABLE_INVPCID |
-			SECONDARY_EXEC_APIC_REGISTER_VIRT |
-			SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
-			SECONDARY_EXEC_SHADOW_VMCS |
-			SECONDARY_EXEC_XSAVES |
-			SECONDARY_EXEC_RDSEED_EXITING |
-			SECONDARY_EXEC_RDRAND_EXITING |
-			SECONDARY_EXEC_ENABLE_PML |
-			SECONDARY_EXEC_TSC_SCALING |
-			SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
-			SECONDARY_EXEC_PT_USE_GPA |
-			SECONDARY_EXEC_PT_CONCEAL_VMX |
-			SECONDARY_EXEC_ENABLE_VMFUNC |
-			SECONDARY_EXEC_BUS_LOCK_DETECTION |
-			SECONDARY_EXEC_NOTIFY_VM_EXITING |
-			SECONDARY_EXEC_ENCLS_EXITING;
-
-		if (adjust_vmx_controls(min2, opt2,
+		if (adjust_vmx_controls(KVM_REQUIRED_VMX_SECONDARY_VM_EXEC_CONTROL,
+					KVM_OPTIONAL_VMX_SECONDARY_VM_EXEC_CONTROL,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
-					&_cpu_based_2nd_exec_control) < 0)
+					&_cpu_based_2nd_exec_control))
 			return -EIO;
 	}
 #ifndef CONFIG_X86_64
@@ -2653,32 +2608,21 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	if (!cpu_has_sgx())
 		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
 
-	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
-		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
-
-		_cpu_based_3rd_exec_control = adjust_vmx_controls64(opt3,
+	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
+		_cpu_based_3rd_exec_control =
+			adjust_vmx_controls64(KVM_OPTIONAL_VMX_TERTIARY_VM_EXEC_CONTROL,
 					      MSR_IA32_VMX_PROCBASED_CTLS3);
-	}
 
-	min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
-#ifdef CONFIG_X86_64
-	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
-#endif
-	opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
-	      VM_EXIT_LOAD_IA32_PAT |
-	      VM_EXIT_LOAD_IA32_EFER |
-	      VM_EXIT_CLEAR_BNDCFGS |
-	      VM_EXIT_PT_CONCEAL_PIP |
-	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
-				&_vmexit_control) < 0)
+	if (adjust_vmx_controls(KVM_REQUIRED_VMX_VM_EXIT_CONTROLS,
+				KVM_OPTIONAL_VMX_VM_EXIT_CONTROLS,
+				MSR_IA32_VMX_EXIT_CTLS,
+				&_vmexit_control))
 		return -EIO;
 
-	min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
-	opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
-		 PIN_BASED_VMX_PREEMPTION_TIMER;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
-				&_pin_based_exec_control) < 0)
+	if (adjust_vmx_controls(KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL,
+				KVM_OPTIONAL_VMX_PIN_BASED_VM_EXEC_CONTROL,
+				MSR_IA32_VMX_PINBASED_CTLS,
+				&_pin_based_exec_control))
 		return -EIO;
 
 	if (cpu_has_broken_vmx_preemption_timer())
@@ -2687,18 +2631,10 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
-	min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
-#ifdef CONFIG_X86_64
-	min |= VM_ENTRY_IA32E_MODE;
-#endif
-	opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
-	      VM_ENTRY_LOAD_IA32_PAT |
-	      VM_ENTRY_LOAD_IA32_EFER |
-	      VM_ENTRY_LOAD_BNDCFGS |
-	      VM_ENTRY_PT_CONCEAL_PIP |
-	      VM_ENTRY_LOAD_IA32_RTIT_CTL;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
-				&_vmentry_control) < 0)
+	if (adjust_vmx_controls(KVM_REQUIRED_VMX_VM_ENTRY_CONTROLS,
+				KVM_OPTIONAL_VMX_VM_ENTRY_CONTROLS,
+				MSR_IA32_VMX_ENTRY_CTLS,
+				&_vmentry_control))
 		return -EIO;
 
 	for (i = 0; i < ARRAY_SIZE(vmcs_entry_exit_pairs); i++) {
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a7a05b5e41d2..3cfacf04be09 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -479,29 +479,138 @@ static inline u8 vmx_get_rvi(void)
 	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
 }
 
-#define BUILD_CONTROLS_SHADOW(lname, uname, bits)				\
-static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)	\
-{										\
-	if (vmx->loaded_vmcs->controls_shadow.lname != val) {			\
-		vmcs_write##bits(uname, val);					\
-		vmx->loaded_vmcs->controls_shadow.lname = val;			\
-	}									\
-}										\
-static inline u##bits __##lname##_controls_get(struct loaded_vmcs *vmcs)	\
-{										\
-	return vmcs->controls_shadow.lname;					\
-}										\
-static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)		\
-{										\
-	return __##lname##_controls_get(vmx->loaded_vmcs);			\
-}										\
-static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
-{										\
-	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);		\
-}										\
-static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
-{										\
-	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);		\
+#define __KVM_REQUIRED_VMX_VM_ENTRY_CONTROLS				\
+	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
+#ifdef CONFIG_X86_64
+	#define KVM_REQUIRED_VMX_VM_ENTRY_CONTROLS			\
+		(__KVM_REQUIRED_VMX_VM_ENTRY_CONTROLS |			\
+		 VM_ENTRY_IA32E_MODE)
+#else
+	#define KVM_REQUIRED_VMX_VM_ENTRY_CONTROLS			\
+		__KVM_REQUIRED_VMX_VM_ENTRY_CONTROLS
+#endif
+#define KVM_OPTIONAL_VMX_VM_ENTRY_CONTROLS				\
+	(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |				\
+	 VM_ENTRY_LOAD_IA32_PAT |					\
+	 VM_ENTRY_LOAD_IA32_EFER |					\
+	 VM_ENTRY_LOAD_BNDCFGS |					\
+	 VM_ENTRY_PT_CONCEAL_PIP |					\
+	 VM_ENTRY_LOAD_IA32_RTIT_CTL)
+
+#define __KVM_REQUIRED_VMX_VM_EXIT_CONTROLS				\
+	(VM_EXIT_SAVE_DEBUG_CONTROLS |					\
+	 VM_EXIT_ACK_INTR_ON_EXIT)
+#ifdef CONFIG_X86_64
+	#define KVM_REQUIRED_VMX_VM_EXIT_CONTROLS			\
+		(__KVM_REQUIRED_VMX_VM_EXIT_CONTROLS |			\
+		 VM_EXIT_HOST_ADDR_SPACE_SIZE)
+#else
+	#define KVM_REQUIRED_VMX_VM_EXIT_CONTROLS			\
+		__KVM_REQUIRED_VMX_VM_EXIT_CONTROLS
+#endif
+#define KVM_OPTIONAL_VMX_VM_EXIT_CONTROLS				\
+	      (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |			\
+	       VM_EXIT_LOAD_IA32_PAT |					\
+	       VM_EXIT_LOAD_IA32_EFER |					\
+	       VM_EXIT_CLEAR_BNDCFGS |					\
+	       VM_EXIT_PT_CONCEAL_PIP |					\
+	       VM_EXIT_CLEAR_IA32_RTIT_CTL)
+
+#define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL			\
+	(PIN_BASED_EXT_INTR_MASK |					\
+	 PIN_BASED_NMI_EXITING)
+#define KVM_OPTIONAL_VMX_PIN_BASED_VM_EXEC_CONTROL			\
+	(PIN_BASED_VIRTUAL_NMIS |					\
+	 PIN_BASED_POSTED_INTR |					\
+	 PIN_BASED_VMX_PREEMPTION_TIMER)
+
+#define __KVM_REQUIRED_VMX_CPU_BASED_VM_EXEC_CONTROL			\
+	(CPU_BASED_HLT_EXITING |					\
+	 CPU_BASED_CR3_LOAD_EXITING |					\
+	 CPU_BASED_CR3_STORE_EXITING |					\
+	 CPU_BASED_UNCOND_IO_EXITING |					\
+	 CPU_BASED_MOV_DR_EXITING |					\
+	 CPU_BASED_USE_TSC_OFFSETTING |					\
+	 CPU_BASED_MWAIT_EXITING |					\
+	 CPU_BASED_MONITOR_EXITING |					\
+	 CPU_BASED_INVLPG_EXITING |					\
+	 CPU_BASED_RDPMC_EXITING |					\
+	 CPU_BASED_INTR_WINDOW_EXITING)
+
+#ifdef CONFIG_X86_64
+	#define KVM_REQUIRED_VMX_CPU_BASED_VM_EXEC_CONTROL		\
+		(__KVM_REQUIRED_VMX_CPU_BASED_VM_EXEC_CONTROL |		\
+		 CPU_BASED_CR8_LOAD_EXITING |				\
+		 CPU_BASED_CR8_STORE_EXITING)
+#else
+	#define KVM_REQUIRED_VMX_CPU_BASED_VM_EXEC_CONTROL		\
+		__KVM_REQUIRED_VMX_CPU_BASED_VM_EXEC_CONTROL
+#endif
+
+#define KVM_OPTIONAL_VMX_CPU_BASED_VM_EXEC_CONTROL			\
+	(CPU_BASED_TPR_SHADOW |						\
+	 CPU_BASED_USE_MSR_BITMAPS |					\
+	 CPU_BASED_NMI_WINDOW_EXITING |					\
+	 CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |			\
+	 CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
+
+#define KVM_REQUIRED_VMX_SECONDARY_VM_EXEC_CONTROL 0
+#define KVM_OPTIONAL_VMX_SECONDARY_VM_EXEC_CONTROL			\
+	(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |			\
+	 SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |			\
+	 SECONDARY_EXEC_WBINVD_EXITING |				\
+	 SECONDARY_EXEC_ENABLE_VPID |					\
+	 SECONDARY_EXEC_ENABLE_EPT |					\
+	 SECONDARY_EXEC_UNRESTRICTED_GUEST |				\
+	 SECONDARY_EXEC_PAUSE_LOOP_EXITING |				\
+	 SECONDARY_EXEC_DESC |						\
+	 SECONDARY_EXEC_ENABLE_RDTSCP |					\
+	 SECONDARY_EXEC_ENABLE_INVPCID |				\
+	 SECONDARY_EXEC_APIC_REGISTER_VIRT |				\
+	 SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |				\
+	 SECONDARY_EXEC_SHADOW_VMCS |					\
+	 SECONDARY_EXEC_XSAVES |					\
+	 SECONDARY_EXEC_RDSEED_EXITING |				\
+	 SECONDARY_EXEC_RDRAND_EXITING |				\
+	 SECONDARY_EXEC_ENABLE_PML |					\
+	 SECONDARY_EXEC_TSC_SCALING |					\
+	 SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |				\
+	 SECONDARY_EXEC_PT_USE_GPA |					\
+	 SECONDARY_EXEC_PT_CONCEAL_VMX |				\
+	 SECONDARY_EXEC_ENABLE_VMFUNC |					\
+	 SECONDARY_EXEC_BUS_LOCK_DETECTION |				\
+	 SECONDARY_EXEC_NOTIFY_VM_EXITING |				\
+	 SECONDARY_EXEC_ENCLS_EXITING)
+
+#define KVM_REQUIRED_VMX_TERTIARY_VM_EXEC_CONTROL 0
+#define KVM_OPTIONAL_VMX_TERTIARY_VM_EXEC_CONTROL			\
+	(TERTIARY_EXEC_IPI_VIRT)
+
+#define BUILD_CONTROLS_SHADOW(lname, uname, bits)						\
+static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)			\
+{												\
+	if (vmx->loaded_vmcs->controls_shadow.lname != val) {					\
+		vmcs_write##bits(uname, val);							\
+		vmx->loaded_vmcs->controls_shadow.lname = val;					\
+	}											\
+}												\
+static inline u##bits __##lname##_controls_get(struct loaded_vmcs *vmcs)			\
+{												\
+	return vmcs->controls_shadow.lname;							\
+}												\
+static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)				\
+{												\
+	return __##lname##_controls_get(vmx->loaded_vmcs);					\
+}												\
+static __always_inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)		\
+{												\
+	BUILD_BUG_ON(!(val & (KVM_REQUIRED_VMX_##uname | KVM_OPTIONAL_VMX_##uname)));		\
+	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);				\
+}												\
+static __always_inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
+{												\
+	BUILD_BUG_ON(!(val & (KVM_REQUIRED_VMX_##uname | KVM_OPTIONAL_VMX_##uname)));		\
+	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);				\
 }
 BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
 BUILD_CONTROLS_SHADOW(vm_exit, VM_EXIT_CONTROLS, 32)
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 27/36] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config()
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (25 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 26/36] KVM: VMX: Extend VMX controls macro shenanigans Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 28/36] KVM: VMX: Add missing VMEXIT controls to vmcs_config Sean Christopherson
                   ` (9 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

In preparation for reusing the result of setup_vmcs_config() in
nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering
to vmx_exec_control().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)
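
The resulting 64-bit runtime logic can be sketched standalone as below.
This is an illustration only: the function name, the 'need_tpr_shadow'
parameter and the bit positions are placeholders, and the 'base' argument
stands in for the unfiltered value produced at setup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for illustration. */
#define SKETCH_CR8_LOAD_EXITING  (1u << 19)
#define SKETCH_CR8_STORE_EXITING (1u << 20)
#define SKETCH_TPR_SHADOW        (1u << 21)

/* Derive the per-vCPU value without modifying the shared base config. */
static uint32_t sketch_exec_control(uint32_t base, bool need_tpr_shadow)
{
	uint32_t ctl = base;

	if (!need_tpr_shadow)
		ctl &= ~SKETCH_TPR_SHADOW;

	/* CR8 exiting is needed exactly when the TPR shadow is not in use. */
	if (ctl & SKETCH_TPR_SHADOW)
		ctl &= ~(SKETCH_CR8_LOAD_EXITING | SKETCH_CR8_STORE_EXITING);
	else
		ctl |= SKETCH_CR8_LOAD_EXITING | SKETCH_CR8_STORE_EXITING;

	return ctl;
}
```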

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5f5e48d0dbcb..23e237ad3956 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2552,11 +2552,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 				MSR_IA32_VMX_PROCBASED_CTLS,
 				&_cpu_based_exec_control))
 		return -EIO;
-#ifdef CONFIG_X86_64
-	if (_cpu_based_exec_control & CPU_BASED_TPR_SHADOW)
-		_cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
-					   ~CPU_BASED_CR8_STORE_EXITING;
-#endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
 		if (adjust_vmx_controls(KVM_REQUIRED_VMX_SECONDARY_VM_EXEC_CONTROL,
 					KVM_OPTIONAL_VMX_SECONDARY_VM_EXEC_CONTROL,
@@ -4327,13 +4322,17 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 	if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
 		exec_control &= ~CPU_BASED_MOV_DR_EXITING;
 
-	if (!cpu_need_tpr_shadow(&vmx->vcpu)) {
+	if (!cpu_need_tpr_shadow(&vmx->vcpu))
 		exec_control &= ~CPU_BASED_TPR_SHADOW;
+
 #ifdef CONFIG_X86_64
+	if (exec_control & CPU_BASED_TPR_SHADOW)
+		exec_control &= ~(CPU_BASED_CR8_LOAD_EXITING |
+				  CPU_BASED_CR8_STORE_EXITING);
+	else
 		exec_control |= CPU_BASED_CR8_STORE_EXITING |
 				CPU_BASED_CR8_LOAD_EXITING;
 #endif
-	}
 	if (!enable_ept)
 		exec_control |= CPU_BASED_CR3_STORE_EXITING |
 				CPU_BASED_CR3_LOAD_EXITING  |
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 28/36] KVM: VMX: Add missing VMEXIT controls to vmcs_config
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (26 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 27/36] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 29/36] KVM: VMX: Add missing CPU based VM execution " Sean Christopherson
                   ` (8 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

In preparation for reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
use but supports for nVMX to KVM_OPTIONAL_VMX_VM_EXIT_CONTROLS and
filter them out in vmx_vmexit_ctrl().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 7 +++++++
 arch/x86/kvm/vmx/vmx.h | 3 +++
 2 files changed, 10 insertions(+)
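
The split between "what the base config records" and "what KVM programs
for its own guests" can be sketched as below. The sketch_* names and bit
positions are placeholders, and the second helper only gestures at the
nested MSR setup that later patches are preparing for; it is not how that
code is written.

```c
#include <stdint.h>

/* Placeholder bit positions for illustration. */
#define SKETCH_EXIT_SAVE_PAT  (1u << 18)
#define SKETCH_EXIT_LOAD_PAT  (1u << 19)
#define SKETCH_EXIT_SAVE_EFER (1u << 20)

/* Controls KVM never sets itself, but that may appear in vmcs12. */
#define SKETCH_NESTED_ONLY (SKETCH_EXIT_SAVE_PAT | SKETCH_EXIT_SAVE_EFER)

/* Value for KVM's own use: the base minus the nested-only bits. */
static uint32_t sketch_l0_vmexit_ctrl(uint32_t base)
{
	return base & ~SKETCH_NESTED_ONLY;
}

/* Starting point for a nested capability report: the unfiltered base. */
static uint32_t sketch_nested_vmexit_caps(uint32_t base)
{
	return base;
}
```

Both consumers read the same base value, so adding a control to the
optional list does not change what KVM programs as long as the runtime
filter masks it out.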

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 23e237ad3956..079cc4835248 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4275,6 +4275,13 @@ static u32 vmx_vmexit_ctrl(void)
 {
 	u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
 
+	/*
+	 * Not used by KVM and never set in vmcs01 or vmcs02, but emulated for
+	 * nested virtualization and thus allowed to be set in vmcs12.
+	 */
+	vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
+			 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
+
 	if (vmx_pt_mode_is_system())
 		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
 				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 3cfacf04be09..ce99704a37b7 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -510,7 +510,10 @@ static inline u8 vmx_get_rvi(void)
 #endif
 #define KVM_OPTIONAL_VMX_VM_EXIT_CONTROLS				\
 	      (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |			\
+	       VM_EXIT_SAVE_IA32_PAT |					\
 	       VM_EXIT_LOAD_IA32_PAT |					\
+	       VM_EXIT_SAVE_IA32_EFER |					\
+	       VM_EXIT_SAVE_VMX_PREEMPTION_TIMER |			\
 	       VM_EXIT_LOAD_IA32_EFER |					\
 	       VM_EXIT_CLEAR_BNDCFGS |					\
 	       VM_EXIT_PT_CONCEAL_PIP |					\
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 29/36] KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (27 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 28/36] KVM: VMX: Add missing VMEXIT controls to vmcs_config Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 30/36] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Sean Christopherson
                   ` (7 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

In preparation for reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the CPU based VM execution controls which KVM
doesn't use but supports for nVMX to
KVM_OPTIONAL_VMX_CPU_BASED_VM_EXEC_CONTROL and filter them out in
vmx_exec_control().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 9 +++++++++
 arch/x86/kvm/vmx/vmx.h | 6 +++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 079cc4835248..11e75f2b832f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4322,6 +4322,15 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
 
+	/*
+	 * Not used by KVM, but fully supported for nesting, i.e. are allowed in
+	 * vmcs12 and propagated to vmcs02 when set in vmcs12.
+	 */
+	exec_control &= ~(CPU_BASED_RDTSC_EXITING |
+			  CPU_BASED_USE_IO_BITMAPS |
+			  CPU_BASED_MONITOR_TRAP_FLAG |
+			  CPU_BASED_PAUSE_EXITING);
+
 	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
 	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
 			  CPU_BASED_NMI_WINDOW_EXITING);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index ce99704a37b7..8a05d24f4167 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -551,9 +551,13 @@ static inline u8 vmx_get_rvi(void)
 #endif
 
 #define KVM_OPTIONAL_VMX_CPU_BASED_VM_EXEC_CONTROL			\
-	(CPU_BASED_TPR_SHADOW |						\
+	(CPU_BASED_RDTSC_EXITING |					\
+	 CPU_BASED_TPR_SHADOW |						\
+	 CPU_BASED_USE_IO_BITMAPS |					\
+	 CPU_BASED_MONITOR_TRAP_FLAG |					\
 	 CPU_BASED_USE_MSR_BITMAPS |					\
 	 CPU_BASED_NMI_WINDOW_EXITING |					\
+	 CPU_BASED_PAUSE_EXITING |					\
 	 CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |			\
 	 CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
 
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 30/36] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (28 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 29/36] KVM: VMX: Add missing CPU based VM execution " Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 31/36] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Sean Christopherson
                   ` (6 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

Clear the CR3 and INVLPG interception controls at runtime based on
whether or not EPT is being _used_, as opposed to clearing the bits at
setup if EPT is _supported_ in hardware, and then restoring them when EPT
is not used.  Not mucking with the base config will allow using the base
config as the starting point for emulating the VMX capability MSRs.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 11e75f2b832f..5dcec85db093 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2574,13 +2574,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
 		&vmx_cap->ept, &vmx_cap->vpid);
 
-	if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
-		/* CR3 accesses and invlpg don't need to cause VM Exits when EPT
-		   enabled */
-		_cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
-					     CPU_BASED_CR3_STORE_EXITING |
-					     CPU_BASED_INVLPG_EXITING);
-	} else if (vmx_cap->ept) {
+	if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
+	    vmx_cap->ept) {
 		pr_warn_once("EPT CAP should not exist if not support "
 				"1-setting enable EPT VM-execution control\n");
 
@@ -4349,10 +4344,11 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 		exec_control |= CPU_BASED_CR8_STORE_EXITING |
 				CPU_BASED_CR8_LOAD_EXITING;
 #endif
-	if (!enable_ept)
-		exec_control |= CPU_BASED_CR3_STORE_EXITING |
-				CPU_BASED_CR3_LOAD_EXITING  |
-				CPU_BASED_INVLPG_EXITING;
+	/* No need to intercept CR3 access or INVLPG when using EPT. */
+	if (enable_ept)
+		exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
+				  CPU_BASED_CR3_STORE_EXITING |
+				  CPU_BASED_INVLPG_EXITING);
 	if (kvm_mwait_in_guest(vmx->vcpu.kvm))
 		exec_control &= ~(CPU_BASED_MWAIT_EXITING |
 				CPU_BASED_MONITOR_EXITING);
-- 
2.37.1.595.g718a3a8f04-goog


* [RFC PATCH v6 31/36] KVM: x86: VMX: Replace some Intel model numbers with mnemonics
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (29 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 30/36] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 32/36] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Sean Christopherson
                   ` (5 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Jim Mattson <jmattson@google.com>

Intel processor code names are more familiar to many readers than
their decimal model numbers.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5dcec85db093..6f6d8a008183 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2652,11 +2652,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	 */
 	if (boot_cpu_data.x86 == 0x6) {
 		switch (boot_cpu_data.x86_model) {
-		case 26: /* AAK155 */
-		case 30: /* AAP115 */
-		case 37: /* AAT100 */
-		case 44: /* BC86,AAY89,BD102 */
-		case 46: /* BA97 */
+		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
+		case INTEL_FAM6_NEHALEM:	/* AAP115 */
+		case INTEL_FAM6_WESTMERE:	/* AAT100 */
+		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
+		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
 			_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 			_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
 			pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 32/36] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (30 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 31/36] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 33/36] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Sean Christopherson
                   ` (4 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

In preparation for reusing the result of setup_vmcs_config() to set
up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling
to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from
hardware_setup(). While it seems reasonable to not expose
LOAD_IA32_PERF_GLOBAL_CTRL controls to the L1 hypervisor on buggy CPUs,
such a change would inevitably break live migration from older KVMs
where the controls are exposed. Keep the status quo for now; the L1
hypervisor itself is supposed to take care of the errata.
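
For readers following along outside the kernel tree, here is a minimal
userspace sketch of the split described above: the errata check becomes a
standalone predicate, and the control bit is masked only when the runtime
VM-Entry controls are computed, so the cached config stays untouched. The
toy_* names are hypothetical, the model numbers are the decimal values
replaced in patch 31, and the bit position is an assumption to be checked
against the SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Family 6 model numbers, taken from the decimal values replaced in patch 31. */
enum {
	TOY_MODEL_NEHALEM_EP  = 26,
	TOY_MODEL_NEHALEM     = 30,
	TOY_MODEL_WESTMERE    = 37,
	TOY_MODEL_WESTMERE_EP = 44,
	TOY_MODEL_NEHALEM_EX  = 46,
};

/* Assumed bit position (VM-Entry control bit 13); verify against the SDM. */
#define TOY_ENTRY_LOAD_PERF_GLOBAL_CTRL	(1u << 13)

/* Hypothetical stand-in for cpu_has_perf_global_ctrl_bug(). */
static bool toy_has_perf_global_ctrl_bug(unsigned int family, unsigned int model)
{
	if (family != 0x6)
		return false;

	switch (model) {
	case TOY_MODEL_NEHALEM_EP:
	case TOY_MODEL_NEHALEM:
	case TOY_MODEL_WESTMERE:
	case TOY_MODEL_WESTMERE_EP:
	case TOY_MODEL_NEHALEM_EX:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical stand-in for vmx_vmentry_ctrl(): the cached value is passed
 * by value and only the returned runtime controls lose the buggy bit.
 */
static uint32_t toy_vmentry_ctrl(uint32_t cached_ctrl, unsigned int family,
				 unsigned int model)
{
	if (toy_has_perf_global_ctrl_bug(family, model))
		cached_ctrl &= ~TOY_ENTRY_LOAD_PERF_GLOBAL_CTRL;

	return cached_ctrl;
}
```

Because the cached value is never modified, whatever is derived from it
later (e.g. the nested VMX MSRs) still sees the control as supported, which
is the "status quo" the changelog refers to.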

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 59 +++++++++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6f6d8a008183..6d346edf546b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2489,6 +2489,30 @@ static bool cpu_has_sgx(void)
 	return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
 }
 
+/*
+ * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
+ * can't be used due to errata where VM Exit may incorrectly clear
+ * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
+ * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+ */
+static bool cpu_has_perf_global_ctrl_bug(void)
+{
+	if (boot_cpu_data.x86 == 0x6) {
+		switch (boot_cpu_data.x86_model) {
+		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
+		case INTEL_FAM6_NEHALEM:	/* AAP115 */
+		case INTEL_FAM6_WESTMERE:	/* AAT100 */
+		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
+		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
+			return true;
+		default:
+			break;
+		}
+	}
+
+	return false;
+}
+
 static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 				      u32 msr, u32 *result)
 {
@@ -2644,30 +2668,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_vmexit_control &= ~x_ctrl;
 	}
 
-	/*
-	 * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
-	 * can't be used due to an errata where VM Exit may incorrectly clear
-	 * IA32_PERF_GLOBAL_CTRL[34:32].  Workaround the errata by using the
-	 * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
-	 */
-	if (boot_cpu_data.x86 == 0x6) {
-		switch (boot_cpu_data.x86_model) {
-		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
-		case INTEL_FAM6_NEHALEM:	/* AAP115 */
-		case INTEL_FAM6_WESTMERE:	/* AAT100 */
-		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
-		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
-			_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-			_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-			pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
-					"does not work properly. Using workaround\n");
-			break;
-		default:
-			break;
-		}
-	}
-
-
 	rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
 
 	/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -4263,6 +4263,9 @@ static u32 vmx_vmentry_ctrl(void)
 			  VM_ENTRY_LOAD_IA32_EFER |
 			  VM_ENTRY_IA32E_MODE);
 
+	if (cpu_has_perf_global_ctrl_bug())
+		vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+
 	return vmentry_ctrl;
 }
 
@@ -4280,6 +4283,10 @@ static u32 vmx_vmexit_ctrl(void)
 	if (vmx_pt_mode_is_system())
 		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
 				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
+
+	if (cpu_has_perf_global_ctrl_bug())
+		vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+
 	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
 	return vmexit_ctrl &
 		~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
@@ -8186,6 +8193,10 @@ static __init int hardware_setup(void)
 	if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
 		return -EIO;
 
+	if (cpu_has_perf_global_ctrl_bug())
+		pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+			     "does not work properly. Using workaround\n");
+
 	if (boot_cpu_has(X86_FEATURE_NX))
 		kvm_enable_efer_bits(EFER_NX);
 
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 33/36] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (31 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 32/36] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 34/36] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Sean Christopherson
                   ` (3 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Similar to exit_ctls_low, entry_ctls_low, and procbased_ctls_low,
pinbased_ctls_low should be set to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
and not to the host's MSR_IA32_VMX_PINBASED_CTLS value ORed with
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR.

Commit eabeaaccfca0 ("KVM: nVMX: Clean up and fix pin-based
execution controls"), which introduced the '|=', doesn't explain why it
is needed; the change seems rather accidental.

Note: normally, the required-1 portion of MSR_IA32_VMX_PINBASED_CTLS should
be equal to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR, so no behavioral change
is expected. However, it is (in theory) possible to observe something
different there, e.g. when KVM is running as a nested hypervisor. Hopefully
this doesn't happen in practice.
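
To make the difference concrete, a small userspace sketch follows. The
helper names are hypothetical and the 0x16 constant is my assumption for
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR; the only point is that '|=' lets any
extra required-1 bit reported by the host leak into the nested MSR, while
'=' does not.

```c
#include <stdint.h>

/* Assumed value of PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR. */
#define TOY_PIN_ALWAYSON	0x00000016u

/* Old behavior: start from the host's low half and OR in the default bits. */
static uint32_t toy_pinbased_low_or(uint32_t host_low)
{
	return host_low | TOY_PIN_ALWAYSON;
}

/* New behavior: ignore the host's low half and use the default bits only. */
static uint32_t toy_pinbased_low_assign(uint32_t host_low)
{
	(void)host_low;
	return TOY_PIN_ALWAYSON;
}
```

With a host value of 0x16 both helpers agree. With a hypothetical host
value of 0x56 (one extra required-1 bit), the '|=' variant returns 0x56 and
the '=' variant returns 0x16.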

Reported-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6e9b32744e0d..4b8301137d75 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6588,7 +6588,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
 		msrs->pinbased_ctls_low,
 		msrs->pinbased_ctls_high);
-	msrs->pinbased_ctls_low |=
+	msrs->pinbased_ctls_low =
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 34/36] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (32 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 33/36] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 35/36] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Sean Christopherson
                   ` (2 subsequent siblings)
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Using raw host MSR values for setting up nested VMX control MSRs is
incorrect as some features need to be disabled, e.g. when KVM runs as
a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a
workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this
is done in setup_vmcs_config() and the result is stored in vmcs_config.
Use it for setting up allowed-1 bits in nested VMX MSRs too.
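
As an illustration only (toy bit names and a hypothetical helper, not KVM
code), the effect of switching the starting point from the raw MSR to the
sanitized config can be sketched like this:

```c
#include <stdint.h>

/* Made-up control bits, purely for illustration. */
#define TOY_CTRL_A	(1u << 0)
#define TOY_CTRL_B	(1u << 1)
#define TOY_CTRL_C	(1u << 2)

/*
 * Hypothetical helper: the allowed-1 half of a nested control MSR is the
 * starting value ANDed with the set of controls KVM can emulate for L1.
 * The starting value is either the raw host MSR (old) or the sanitized
 * value cached by the setup code (new).
 */
static uint32_t toy_nested_allowed1(uint32_t base, uint32_t kvm_nested_mask)
{
	return base & kvm_nested_mask;
}
```

Suppose the raw host MSR reports A|B|C, the setup code cleared B (say,
because of eVMCS or an errata workaround), and KVM can emulate A|B for L1.
Starting from the raw value advertises A|B to L1; starting from the
sanitized value advertises only A.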

Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 30 ++++++++++++------------------
 arch/x86/kvm/vmx/nested.h |  2 +-
 arch/x86/kvm/vmx/vmx.c    |  5 ++---
 3 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4b8301137d75..6208cdebd173 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6567,8 +6567,10 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
  * bit in the high half is on if the corresponding bit in the control field
  * may be on. See also vmx_control_verify().
  */
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 {
+	struct nested_vmx_msrs *msrs = &vmcs_conf->nested;
+
 	/*
 	 * Note that as a general rule, the high half of the MSRs (bits in
 	 * the control fields which may be 1) should be initialized by the
@@ -6585,11 +6587,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	 */
 
 	/* pin-based controls */
-	rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
-		msrs->pinbased_ctls_low,
-		msrs->pinbased_ctls_high);
 	msrs->pinbased_ctls_low =
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
 		PIN_BASED_NMI_EXITING |
@@ -6600,12 +6601,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		PIN_BASED_VMX_PREEMPTION_TIMER;
 
 	/* exit controls */
-	rdmsr(MSR_IA32_VMX_EXIT_CTLS,
-		msrs->exit_ctls_low,
-		msrs->exit_ctls_high);
 	msrs->exit_ctls_low =
 		VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
 
+	msrs->exit_ctls_high = vmcs_conf->vmexit_ctrl;
 	msrs->exit_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_EXIT_HOST_ADDR_SPACE_SIZE |
@@ -6622,11 +6621,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;
 
 	/* entry controls */
-	rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
-		msrs->entry_ctls_low,
-		msrs->entry_ctls_high);
 	msrs->entry_ctls_low =
 		VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->entry_ctls_high = vmcs_conf->vmentry_ctrl;
 	msrs->entry_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_ENTRY_IA32E_MODE |
@@ -6640,11 +6638,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;
 
 	/* cpu-based controls */
-	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
-		msrs->procbased_ctls_low,
-		msrs->procbased_ctls_high);
 	msrs->procbased_ctls_low =
 		CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->procbased_ctls_high = vmcs_conf->cpu_based_exec_ctrl;
 	msrs->procbased_ctls_high &=
 		CPU_BASED_INTR_WINDOW_EXITING |
 		CPU_BASED_NMI_WINDOW_EXITING | CPU_BASED_USE_TSC_OFFSETTING |
@@ -6678,12 +6675,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	 * depend on CPUID bits, they are added later by
 	 * vmx_vcpu_after_set_cpuid.
 	 */
-	if (msrs->procbased_ctls_high & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
-		rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
-		      msrs->secondary_ctls_low,
-		      msrs->secondary_ctls_high);
-
 	msrs->secondary_ctls_low = 0;
+
+	msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
 	msrs->secondary_ctls_high &=
 		SECONDARY_EXEC_DESC |
 		SECONDARY_EXEC_ENABLE_RDTSCP |
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index 88b00a7359e4..6312c9541c3c 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -17,7 +17,7 @@ enum nvmx_vmentry_status {
 };
 
 void vmx_leave_nested(struct kvm_vcpu *vcpu);
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
 void nested_vmx_hardware_unsetup(void);
 __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
 void nested_vmx_set_vmcs_shadowing_bitmap(void);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6d346edf546b..c42b6646afa4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7396,7 +7396,7 @@ static int __init vmx_check_processor_compat(void)
 	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
 		return -EIO;
 	if (nested)
-		nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
+		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
 	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
 		printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
 				smp_processor_id());
@@ -8351,8 +8351,7 @@ static __init int hardware_setup(void)
 	setup_default_sgx_lepubkeyhash();
 
 	if (nested) {
-		nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
-					   vmx_capability.ept);
+		nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);
 
 		r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
 		if (r)
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 35/36] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (33 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 34/36] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-24  3:01 ` [RFC PATCH v6 36/36] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Sean Christopherson
  2022-08-25 18:08 ` [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Vitaly Kuznetsov
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in
vmcs_config to avoid the need to re-read it later, e.g. from
cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs().
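
The pattern is the usual read-once-and-cache one. A rough userspace sketch,
with a fake MSR read and hypothetical toy_* names (the bit positions are my
assumptions about MSR_IA32_VMX_MISC, not something this patch defines):

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bits 4:0 are the preemption timer rate, bit 14 is Intel PT. */
#define TOY_MISC_TIMER_RATE_MASK	0x1fULL
#define TOY_MISC_INTEL_PT		(1ULL << 14)

struct toy_vmcs_config {
	uint64_t misc;		/* cached copy of the MISC MSR */
};

static unsigned int toy_msr_reads;	/* counts simulated rdmsr calls */

/* Stand-in for rdmsrl(); returns a fixed, made-up value. */
static uint64_t toy_rdmsr_misc(void)
{
	toy_msr_reads++;
	return TOY_MISC_INTEL_PT | 5;
}

/* Read the MSR exactly once, at setup time. */
static void toy_setup_config(struct toy_vmcs_config *conf)
{
	conf->misc = toy_rdmsr_misc();
}

/* Later queries consult the cached value instead of re-reading the MSR. */
static bool toy_has_intel_pt(const struct toy_vmcs_config *conf)
{
	return conf->misc & TOY_MISC_INTEL_PT;
}

static unsigned int toy_timer_rate(const struct toy_vmcs_config *conf)
{
	return conf->misc & TOY_MISC_TIMER_RATE_MASK;
}
```

However many times the query helpers run after setup, the read counter
stays at one.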

No (real) functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/capabilities.h | 11 +++--------
 arch/x86/kvm/vmx/vmx.c          |  8 +++++---
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index faee1db8b0e0..87c4e46daf37 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -65,6 +65,7 @@ struct vmcs_config {
 	u64 cpu_based_3rd_exec_ctrl;
 	u32 vmexit_ctrl;
 	u32 vmentry_ctrl;
+	u64 misc;
 	struct nested_vmx_msrs nested;
 };
 extern struct vmcs_config vmcs_config;
@@ -225,11 +226,8 @@ static inline bool cpu_has_vmx_vmfunc(void)
 
 static inline bool cpu_has_vmx_shadow_vmcs(void)
 {
-	u64 vmx_msr;
-
 	/* check if the cpu supports writing r/o exit information fields */
-	rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
-	if (!(vmx_msr & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
+	if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
 		return false;
 
 	return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -371,10 +369,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)
 
 static inline bool cpu_has_vmx_intel_pt(void)
 {
-	u64 vmx_msr;
-
-	rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
-	return (vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT) &&
+	return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
 		(vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
 		(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c42b6646afa4..f3d3a546dd2a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2551,6 +2551,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	u64 _cpu_based_3rd_exec_control = 0;
 	u32 _vmexit_control = 0;
 	u32 _vmentry_control = 0;
+	u64 misc_msr;
 	int i;
 
 	/*
@@ -2684,6 +2685,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	if (((vmx_msr_high >> 18) & 15) != 6)
 		return -EIO;
 
+	rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
+
 	vmcs_conf->size = vmx_msr_high & 0x1fff;
 	vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
 
@@ -2695,6 +2698,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
 	vmcs_conf->vmexit_ctrl         = _vmexit_control;
 	vmcs_conf->vmentry_ctrl        = _vmentry_control;
+	vmcs_conf->misc	= misc_msr;
 
 	return 0;
 }
@@ -8311,11 +8315,9 @@ static __init int hardware_setup(void)
 
 	if (enable_preemption_timer) {
 		u64 use_timer_freq = 5000ULL * 1000 * 1000;
-		u64 vmx_msr;
 
-		rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
 		cpu_preemption_timer_multi =
-			vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
+			vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
 
 		if (tsc_khz)
 			use_timer_freq = (u64)tsc_khz * 1000;
-- 
2.37.1.595.g718a3a8f04-goog



* [RFC PATCH v6 36/36] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (34 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 35/36] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Sean Christopherson
@ 2022-08-24  3:01 ` Sean Christopherson
  2022-08-25 18:08 ` [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Vitaly Kuznetsov
  36 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-24  3:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

From: Vitaly Kuznetsov <vkuznets@redhat.com>

vmcs_config caches the host MSR_IA32_VMX_MISC value; use it to set
up nested MSR_IA32_VMX_MISC in nested_vmx_setup_ctls_msrs() and avoid the
redundant rdmsr().

No (real) functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 6208cdebd173..a9d51afde502 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6757,10 +6757,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
 
 	/* miscellaneous data */
-	rdmsr(MSR_IA32_VMX_MISC,
-		msrs->misc_low,
-		msrs->misc_high);
-	msrs->misc_low &= VMX_MISC_SAVE_EFER_LMA;
+	msrs->misc_low = (u32)vmcs_conf->misc & VMX_MISC_SAVE_EFER_LMA;
 	msrs->misc_low |=
 		MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS |
 		VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE |
-- 
2.37.1.595.g718a3a8f04-goog



* Re: [RFC PATCH v6 06/36] KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled
  2022-08-24  3:01 ` [RFC PATCH v6 06/36] KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled Sean Christopherson
@ 2022-08-25 10:21   ` Vitaly Kuznetsov
  2022-08-25 14:48     ` Sean Christopherson
  0 siblings, 1 reply; 45+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-25 10:21 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm

Sean Christopherson <seanjc@google.com> writes:

> When querying whether or not eVMCS is enabled on behalf of the guest,
> treat eVMCS as enabled if and only if Hyper-V is enabled/exposed to the
> guest.
>
> Note, flows that come from the host, e.g. KVM_SET_NESTED_STATE, must NOT
> check for Hyper-V being enabled as KVM doesn't require guest CPUID to be
> set before most ioctls().
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/evmcs.c  |  3 +++
>  arch/x86/kvm/vmx/nested.c |  8 ++++----
>  arch/x86/kvm/vmx/vmx.c    |  3 +--
>  arch/x86/kvm/vmx/vmx.h    | 10 ++++++++++
>  4 files changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 6a61b1ae7942..9139c70b6008 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -334,6 +334,9 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
>  	 * versions: lower 8 bits is the minimal version, higher 8 bits is the
>  	 * maximum supported version. KVM supports versions from 1 to
>  	 * KVM_EVMCS_VERSION.
> +	 *
> +	 * Note, do not check the Hyper-V is fully enabled in guest CPUID, this
> +	 * helper is used to _get_ the vCPU's supported CPUID.
>  	 */
>  	if (kvm_cpu_cap_get(X86_FEATURE_VMX) &&
>  	    (!vcpu || to_vmx(vcpu)->nested.enlightened_vmcs_enabled))
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index ddd4367d4826..28f9d64851b3 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -1982,7 +1982,7 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
>  	bool evmcs_gpa_changed = false;
>  	u64 evmcs_gpa;
>  
> -	if (likely(!vmx->nested.enlightened_vmcs_enabled))
> +	if (likely(!guest_cpuid_has_evmcs(vcpu)))
>  		return EVMPTRLD_DISABLED;
>  
>  	if (!nested_enlightened_vmentry(vcpu, &evmcs_gpa)) {
> @@ -2863,7 +2863,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
>  	    nested_check_vm_entry_controls(vcpu, vmcs12))
>  		return -EINVAL;
>  
> -	if (to_vmx(vcpu)->nested.enlightened_vmcs_enabled)
> +	if (guest_cpuid_has_evmcs(vcpu))
>  		return nested_evmcs_check_controls(vmcs12);
>  
>  	return 0;
> @@ -3145,7 +3145,7 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
>  	 * L2 was running), map it here to make sure vmcs12 changes are
>  	 * properly reflected.
>  	 */
> -	if (vmx->nested.enlightened_vmcs_enabled &&
> +	if (guest_cpuid_has_evmcs(vcpu) &&
>  	    vmx->nested.hv_evmcs_vmptr == EVMPTR_MAP_PENDING) {
>  		enum nested_evmptrld_status evmptrld_status =
>  			nested_vmx_handle_enlightened_vmptrld(vcpu, false);
> @@ -5067,7 +5067,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
>  	 * state. It is possible that the area will stay mapped as
>  	 * vmx->nested.hv_evmcs but this shouldn't be a problem.
>  	 */
> -	if (likely(!vmx->nested.enlightened_vmcs_enabled ||
> +	if (likely(!guest_cpuid_has_evmcs(vcpu) ||
>  		   !nested_enlightened_vmentry(vcpu, &evmcs_gpa))) {
>  		if (vmptr == vmx->nested.current_vmptr)
>  			nested_release_vmcs12(vcpu);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c9b49a09e6b5..d4ed802947d7 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1930,8 +1930,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		 * sanity checking and refuse to boot. Filter all unsupported
>  		 * features out.
>  		 */
> -		if (!msr_info->host_initiated &&
> -		    vmx->nested.enlightened_vmcs_enabled)
> +		if (!msr_info->host_initiated && guest_cpuid_has_evmcs(vcpu))
>  			nested_evmcs_filter_control_msr(msr_info->index,
>  							&msr_info->data);
>  		break;
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 24d58c2ffaa3..35c7e6aef301 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -626,4 +626,14 @@ static inline bool vmx_can_use_ipiv(struct kvm_vcpu *vcpu)
>  	return  lapic_in_kernel(vcpu) && enable_ipiv;
>  }
>  
> +static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
> +	 * eVMCS has been explicitly enabled by userspace.
> +	 */
> +	return vcpu->arch.hyperv_enabled &&
> +	       to_vmx(vcpu)->nested.enlightened_vmcs_enabled;

I don't quite like the 'guest_cpuid_has_evmcs' name as it makes me think
we're checking if eVMCS was exposed in guest CPUID, but in fact we don't
do that. eVMCS can be enabled on a vCPU even if it is not exposed in
CPUID (and we should probably keep that to not mandate setting CPUID
before enabling eVMCS).

What about e.g. vcpu_has_evmcs_enabled() instead?

On a related note, any reason to put this in vmx/vmx.h and not
vmx/evmcs.h?


> +}
> +
>  #endif /* __KVM_X86_VMX_H */

-- 
Vitaly



* Re: [RFC PATCH v6 07/36] KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array
  2022-08-24  3:01 ` [RFC PATCH v6 07/36] KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array Sean Christopherson
@ 2022-08-25 10:24   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 45+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-25 10:24 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm

Sean Christopherson <seanjc@google.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com>
>
> Refactor the handling of unsupported eVMCS to use a 2-d array to store
> the set of unsupported controls.  KVM's handling of eVMCS is completely
> broken as there is no way for userspace to query which features are
> unsupported, nor does KVM prevent userspace from attempting to enable
> unsupported features.  A future commit will remedy that by filtering and
> enforcing unsupported features when eVMCS is enabled, but that needs to be opt-in
> from userspace to avoid breakage, i.e. KVM needs to maintain its legacy
> behavior by snapshotting the exact set of controls that are currently
> (un)supported by eVMCS.
>
> No functional change intended.
>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> [sean: split to standalone patch, write changelog]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/evmcs.c | 60 +++++++++++++++++++++++++++++++++-------
>  1 file changed, 50 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 9139c70b6008..10fc0be49f96 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -345,6 +345,45 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> +enum evmcs_revision {
> +	EVMCSv1_LEGACY,
> +	NR_EVMCS_REVISIONS,
> +};
> +
> +enum evmcs_ctrl_type {
> +	EVMCS_EXIT_CTRLS,
> +	EVMCS_ENTRY_CTRLS,
> +	EVMCS_2NDEXEC,
> +	EVMCS_PINCTRL,
> +	EVMCS_VMFUNC,
> +	NR_EVMCS_CTRLS,
> +};
> +
> +static const u32 evmcs_unsupported_ctrls[NR_EVMCS_CTRLS][NR_EVMCS_REVISIONS] = {
> +	[EVMCS_EXIT_CTRLS] = {
> +		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
> +	},
> +	[EVMCS_ENTRY_CTRLS] = {
> +		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMENTRY_CTRL | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
> +	},
> +	[EVMCS_2NDEXEC] = {
> +		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_2NDEXEC | SECONDARY_EXEC_TSC_SCALING,

By the time of this patch, VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL, SECONDARY_EXEC_TSC_SCALING are
still in 'EVMCS1_UNSUPPORTED_*' lists.

> +	},
> +	[EVMCS_PINCTRL] = {
> +		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_PINCTRL,
> +	},
> +	[EVMCS_VMFUNC] = {
> +		[EVMCSv1_LEGACY] = EVMCS1_UNSUPPORTED_VMFUNC,
> +	},
> +};
> +
> +static u32 evmcs_get_unsupported_ctls(enum evmcs_ctrl_type ctrl_type)
> +{
> +	enum evmcs_revision evmcs_rev = EVMCSv1_LEGACY;
> +
> +	return evmcs_unsupported_ctrls[ctrl_type][evmcs_rev];
> +}
> +
>  void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
>  {
>  	u32 ctl_low = (u32)*pdata;
> @@ -357,21 +396,21 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
>  	switch (msr_index) {
>  	case MSR_IA32_VMX_EXIT_CTLS:
>  	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_EXIT_CTRLS);
>  		break;
>  	case MSR_IA32_VMX_ENTRY_CTLS:
>  	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_ENTRY_CTRLS);
>  		break;
>  	case MSR_IA32_VMX_PROCBASED_CTLS2:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_2NDEXEC);
>  		break;
>  	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
>  	case MSR_IA32_VMX_PINBASED_CTLS:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(EVMCS_PINCTRL);
>  		break;
>  	case MSR_IA32_VMX_VMFUNC:
> -		ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC;
> +		ctl_low &= ~evmcs_get_unsupported_ctls(EVMCS_VMFUNC);
>  		break;
>  	}
>  
> @@ -384,7 +423,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
>  	u32 unsupp_ctl;
>  
>  	unsupp_ctl = vmcs12->pin_based_vm_exec_control &
> -		EVMCS1_UNSUPPORTED_PINCTRL;
> +		evmcs_get_unsupported_ctls(EVMCS_PINCTRL);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
>  			"eVMCS: unsupported pin-based VM-execution controls",
> @@ -393,7 +432,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
>  	}
>  
>  	unsupp_ctl = vmcs12->secondary_vm_exec_control &
> -		EVMCS1_UNSUPPORTED_2NDEXEC;
> +		evmcs_get_unsupported_ctls(EVMCS_2NDEXEC);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
>  			"eVMCS: unsupported secondary VM-execution controls",
> @@ -402,7 +441,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
>  	}
>  
>  	unsupp_ctl = vmcs12->vm_exit_controls &
> -		EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> +		evmcs_get_unsupported_ctls(EVMCS_EXIT_CTRLS);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
>  			"eVMCS: unsupported VM-exit controls",
> @@ -411,7 +450,7 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
>  	}
>  
>  	unsupp_ctl = vmcs12->vm_entry_controls &
> -		EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> +		evmcs_get_unsupported_ctls(EVMCS_ENTRY_CTRLS);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
>  			"eVMCS: unsupported VM-entry controls",
> @@ -419,7 +458,8 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
>  		ret = -EINVAL;
>  	}
>  
> -	unsupp_ctl = vmcs12->vm_function_control & EVMCS1_UNSUPPORTED_VMFUNC;
> +	unsupp_ctl = vmcs12->vm_function_control &
> +		evmcs_get_unsupported_ctls(EVMCS_VMFUNC);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
>  			"eVMCS: unsupported VM-function controls",

-- 
Vitaly



* Re: [RFC PATCH v6 06/36] KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled
  2022-08-25 10:21   ` Vitaly Kuznetsov
@ 2022-08-25 14:48     ` Sean Christopherson
  0 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-25 14:48 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm

On Thu, Aug 25, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 24d58c2ffaa3..35c7e6aef301 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -626,4 +626,14 @@ static inline bool vmx_can_use_ipiv(struct kvm_vcpu *vcpu)
> >  	return  lapic_in_kernel(vcpu) && enable_ipiv;
> >  }
> >  
> > +static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
> > +{
> > +	/*
> > +	 * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
> > +	 * eVMCS has been explicitly enabled by userspace.
> > +	 */
> > +	return vcpu->arch.hyperv_enabled &&
> > +	       to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
> 
> I don't quite like 'guest_cpuid_has_evmcs' name as it makes me think
> we're checking if eVMCS was exposed in guest CPUID but in fact we don't
> do that.

This does (indirectly) check guest CPUID.  hyperv_enabled is a direct reflection
of whether or not CPUID.HYPERV_CPUID_INTERFACE.EAX == HYPERV_CPUID_SIGNATURE_EAX.

> eVMCS can be enabled on a vCPU even if it is not exposed in
> CPUID (and we should probably keep that to not mandate setting CPUID
> before enabling eVMCS).

My intent with this helper is that it should be used only when the guest is
attempting to utilize eVMCS.  All host-initiated usage, e.g. KVM_SET_NESTED_STATE,
checks enlightened_vmcs_enabled directly.
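
To illustrate the split between guest-initiated and host-initiated checks, here
is a minimal standalone model.  The struct and the model_*() names are
hypothetical stand-ins, not the real KVM definitions; only the two flags mirror
fields named in the patch.

```c
#include <stdbool.h>

/* Simplified stand-in for the per-vCPU state; not the real KVM structs. */
struct vcpu_model {
	bool hyperv_enabled;           /* guest CPUID advertises the Hyper-V interface */
	bool enlightened_vmcs_enabled; /* userspace enabled the eVMCS capability */
};

/* Guest-initiated paths: both conditions must hold. */
static inline bool model_guest_cpuid_has_evmcs(const struct vcpu_model *v)
{
	return v->hyperv_enabled && v->enlightened_vmcs_enabled;
}

/*
 * Host-initiated paths (e.g. restoring nested state) look only at the
 * userspace-controlled flag, so they do not depend on CPUID being set first.
 */
static inline bool model_host_evmcs_enabled(const struct vcpu_model *v)
{
	return v->enlightened_vmcs_enabled;
}
```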

> What about e.g. vcpu_has_evmcs_enabled() instead?

I went with the guest_cpuid_has...() to align with the generic guest_cpuid_has()
so that it would be somewhat clear that the helper should only be used when
enforcing guest behavior.

> On a related note, any reason to put this to vmx/vmx.h and not
> vmx/evmcs.h?

Can't dereference vcpu_vmx :-(

vmx.h includes evmcs.h by way of vmx_ops.h, and that ordering can't change because
the VMREAD/VMWRITE helpers need to get at the eVMCS stuff.


* Re: [RFC PATCH v6 00/36] KVM: x86: eVMCS rework
  2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
                   ` (35 preceding siblings ...)
  2022-08-24  3:01 ` [RFC PATCH v6 36/36] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Sean Christopherson
@ 2022-08-25 18:08 ` Vitaly Kuznetsov
  2022-08-25 18:29   ` Sean Christopherson
  36 siblings, 1 reply; 45+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-25 18:08 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, Paolo Bonzini

Sean Christopherson <seanjc@google.com> writes:

> This is what I ended up with as a way to dig ourselves out of the eVMCS
> conundrum.  Not well tested, though KUT and selftests pass.  The enforcement
> added by "KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses"
> is not tested at all (and lacks a changelog).

Trying to enable KVM_CAP_HYPERV_ENLIGHTENED_VMCS2 in its new shape in
QEMU so I can test it and I immediately stumble upon

~/qemu/build/qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -cpu host,hv-evmcs-2022,hv-evmcs,hv-vpindex,hv-vapic 
qemu-system-x86_64: error: failed to set MSR 0x48d to 0xff00000016
qemu-system-x86_64: ../target/i386/kvm/kvm.c:3107: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

Turns out, at least with "-cpu host", QEMU reads VMX feature MSRs first
and enables eVMCS after. This is fixable, I believe, but it makes me
think that maybe eVMCS enablement (or even the whole Hyper-V emulation
thing) should be per-VM, as it makes very little sense to have Hyper-V
features enabled on *some* vCPUs only. As we're going to add a new CAP
anyway, maybe it's a good time to make the switch?
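
For reference, a minimal sketch of how the rejected value decomposes.  MSR 0x48d
is IA32_VMX_TRUE_PINBASED_CTLS; for VMX control capability MSRs the low 32 bits
are the "allowed-0" settings (a set bit means the control must be 1) and the
high 32 bits are the "allowed-1" settings (a set bit means the control may be
1).  The helper names and the 'unsupported' mask argument are hypothetical, this
is not KVM's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low half: controls that must be 1. */
static inline uint32_t vmx_ctl_must_be_one(uint64_t msr)
{
	return (uint32_t)msr;
}

/* High half: controls that are allowed to be 1. */
static inline uint32_t vmx_ctl_may_be_one(uint64_t msr)
{
	return (uint32_t)(msr >> 32);
}

/*
 * Hypothetical check: does the MSR value advertise any control from a mask
 * of controls that must not be exposed?
 */
static inline bool vmx_ctl_advertises(uint64_t msr, uint32_t unsupported)
{
	return (vmx_ctl_may_be_one(msr) & unsupported) != 0;
}
```

With 0xff00000016 the must-be-one half is 0x16 and the may-be-one half is 0xff,
i.e. a host value that advertises every low pin-based control, which is what one
would expect from an unfiltered read.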

-- 
Vitaly



* Re: [RFC PATCH v6 00/36] KVM: x86: eVMCS rework
  2022-08-25 18:08 ` [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Vitaly Kuznetsov
@ 2022-08-25 18:29   ` Sean Christopherson
  2022-08-26 17:19     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 45+ messages in thread
From: Sean Christopherson @ 2022-08-25 18:29 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm, Paolo Bonzini

On Thu, Aug 25, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > This is what I ended up with as a way to dig ourselves out of the eVMCS
> > conundrum.  Not well tested, though KUT and selftests pass.  The enforcement
> > added by "KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses"
> > is not tested at all (and lacks a changelog).
> 
> Trying to enable KVM_CAP_HYPERV_ENLIGHTENED_VMCS2 in its new shape in
> QEMU so I can test it and I immediately stumble upon
> 
> ~/qemu/build/qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -cpu host,hv-evmcs-2022,hv-evmcs,hv-vpindex,hv-vapic 
> qemu-system-x86_64: error: failed to set MSR 0x48d to 0xff00000016
> qemu-system-x86_64: ../target/i386/kvm/kvm.c:3107: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
> 
> Turns out, at least with "-cpu host" QEMU reads VMX feature MSRs first
> and enables eVMCS after.

Heh, of course there had to be a corner case.

> This is fixable, I believe but it makes me think that maybe eVMCS enablement
> (or even the whole Hyper-V emulation thing) should be per-VM as it makes
> really little sense to have Hyper-V features enabled on *some* vCPUs only. As
> we're going to add a new CAP anyway, maybe it's a good time to make a switch?

Works for me as long as the KVM code doesn't end up being a mess trying to smush
the two things together (I don't see any reason why it would).


* Re: [RFC PATCH v6 00/36] KVM: x86: eVMCS rework
  2022-08-25 18:29   ` Sean Christopherson
@ 2022-08-26 17:19     ` Vitaly Kuznetsov
  2022-08-27 14:03       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 45+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-26 17:19 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, Paolo Bonzini

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Aug 25, 2022, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@google.com> writes:
>> 
>> > This is what I ended up with as a way to dig ourselves out of the eVMCS
>> > conundrum.  Not well tested, though KUT and selftests pass.  The enforcement
>> > added by "KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses"
>> > is not tested at all (and lacks a changelog).
>> 
>> Trying to enable KVM_CAP_HYPERV_ENLIGHTENED_VMCS2 in its new shape in
>> QEMU so I can test it and I immediately stumble upon
>> 
>> ~/qemu/build/qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -cpu host,hv-evmcs-2022,hv-evmcs,hv-vpindex,hv-vapic 
>> qemu-system-x86_64: error: failed to set MSR 0x48d to 0xff00000016
>> qemu-system-x86_64: ../target/i386/kvm/kvm.c:3107: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
>> 
>> Turns out, at least with "-cpu host" QEMU reads VMX feature MSRs first
>> and enables eVMCS after.
>
> Heh, of course there had to be a corner case.
>

Unfortunately, it's not a corner case: named CPU models in QEMU behave
exactly the same (I just forgot to add '+vmx' yesterday). In fact,
it seems QEMU uses the system-wide KVM_GET_MSRS (which results in
vmx_get_msr_feature() for our case), which gives unfiltered values. As it
is system-wide, it just can't filter anything. This happens even before
KVM_CREATE_VCPU is called, so switching to the per-vCPU ioctl is not an
option. What's worse is that all the discovered features (including VMX
features) are passed to the upper layers of the virtualization stack,
starting with libvirt, and those upper layers may want to enable some of
the "available" features explicitly. Teaching everyone what's available
with eVMCS and what's not seems to be a hard task.

This use-case can probably be solved by making eVMCS enablement a per-VM
thing (already did locally) and creating a per-VM version of
KVM_GET_MSRS which will give us filtered VMX MSRs when eVMCS was
enabled.

Note: silently filtering out features when vCPUs are created is bad as
the list of such features will change over time. This is guaranteed to
break migrations.

Honestly, I'm starting to think the 'evmcs revisions' idea (to keep
the exact list of features in KVM and update it every couple of years
when a new Hyper-V is released) is easier. It's just a list, it doesn't
require much. The main downside, as was already noted, is that the userspace
VMM doesn't see which VMX features are actually passed to the guest
unless it is also taught about these "evmcs revisions" (more than what's
the latest number available). This, to a certain extent, can probably be
solved by the VMM itself by doing KVM_GET_MSRS after the vCPU is created (this
won't help much with feature discovery by upper layers, though). This,
however, is a new use-case, unsupported with the current
KVM_CAP_HYPERV_ENLIGHTENED_VMCS implementation.

eVMCS seems to be special in a way that a) it evolves over time b) it is
mutually exclusive with *some* other features but the list changes. We
don't seem to have anything like that in KVM/QEMU, thus all the
confusion.

-- 
Vitaly



* Re: [RFC PATCH v6 00/36] KVM: x86: eVMCS rework
  2022-08-26 17:19     ` Vitaly Kuznetsov
@ 2022-08-27 14:03       ` Vitaly Kuznetsov
  2022-08-29 15:54         ` Sean Christopherson
  0 siblings, 1 reply; 45+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-27 14:03 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, Paolo Bonzini

Vitaly Kuznetsov <vkuznets@redhat.com> writes:

...

>
> Honestly I'm starting to think the 'evmcs revisions' idea (to keep
> the exact list of features in KVM and update them every couple years
> when new Hyper-V releases) is easier. It's just a list, it doesn't
> require much. The main downside, as was already named, is that userspace
> VMM doesn't see which VMX features are actually passed to the guest
> unless it is also taught about these "evmcs revisions" (more than what's
> the latest number available). This, to certain extent, can probably be
> solved by VMM itself by doing KVM_GET_MSRS after vCPU is created (this
> won't help much with feature discovery by upper layers, though). This,
> however, is a new use-case, unsupported with the current
> KVM_CAP_HYPERV_ENLIGHTENED_VMCS implementation.

...

Thinking more about the above, if we invert the filtering logic (to
explicitly list what's supported), KVM's code which we will have to add
for every new revision can be very compact as it will only have to list
the newly added features. I can't imagine fields *disappearing* from
eVMCS definition but oh well..
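
A rough sketch of what the inverted logic could look like, assuming each
revision only lists the controls it adds and the supported set is the union of
all revisions up to the requested one.  The table name, the helper names and
the bit values are purely illustrative, not real eVMCS data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: secondary exec controls *added* by each eVMCS revision. */
static const uint32_t evmcs_added_2ndexec[] = {
	0x00000003, /* revision 0: baseline (illustrative bits) */
	0x00000010, /* revision 1: one newly supported control */
	0x00000200, /* revision 2: another newly supported control */
};

#define EVMCS_NR_REVISIONS \
	(sizeof(evmcs_added_2ndexec) / sizeof(evmcs_added_2ndexec[0]))

/* Union of everything added up to and including 'revision'. */
static inline uint32_t evmcs_supported_2ndexec(size_t revision)
{
	uint32_t supported = 0;
	size_t i;

	/* Clamp unknown (too new) revisions to the latest known one. */
	if (revision >= EVMCS_NR_REVISIONS)
		revision = EVMCS_NR_REVISIONS - 1;

	for (i = 0; i <= revision; i++)
		supported |= evmcs_added_2ndexec[i];

	return supported;
}

/* Requested controls that the given revision does not support. */
static inline uint32_t evmcs_unsupported_2ndexec(uint32_t requested,
						 size_t revision)
{
	return requested & ~evmcs_supported_2ndexec(revision);
}
```

Adding a revision is then a one-line change to the table, but a field being
removed in a later revision cannot be expressed this way.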

Anyway, I think this series is already getting too big and has many
important fixes but some parts are still controversial. What if I split
off everything-but-Hyper-V-on-KVM (where no controversy is currently
observed) and send it out so we can continue discussing the issue at
hand more conveniently?

-- 
Vitaly



* Re: [RFC PATCH v6 00/36] KVM: x86: eVMCS rework
  2022-08-27 14:03       ` Vitaly Kuznetsov
@ 2022-08-29 15:54         ` Sean Christopherson
  0 siblings, 0 replies; 45+ messages in thread
From: Sean Christopherson @ 2022-08-29 15:54 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: kvm, Paolo Bonzini

On Sat, Aug 27, 2022, Vitaly Kuznetsov wrote:
> Vitaly Kuznetsov <vkuznets@redhat.com> writes:
> 
> ...
> 
> >
> > Honestly I'm starting to think the 'evmcs revisions' idea (to keep
> > the exact list of features in KVM and update them every couple years
> > when new Hyper-V releases) is easier. It's just a list, it doesn't
> > require much. The main downside, as was already named, is that userspace
> > VMM doesn't see which VMX features are actually passed to the guest
> > unless it is also taught about these "evmcs revisions" (more than what's
> > the latest number available). This, to certain extent, can probably be
> > solved by VMM itself by doing KVM_GET_MSRS after vCPU is created (this
> > won't help much with feature discovery by upper layers, though). This,
> > however, is a new use-case, unsupported with the current
> > KVM_CAP_HYPERV_ENLIGHTENED_VMCS implementation.
> 
> ...
> 
> Thinking more about the above, if we invert the filtering logic (to
> explicitly list what's supported), KVM's code which we will have to add
> for every new revision can be very compact as it will only have to list
> the newly added features.

But at that point KVM would effectively be implementing a less flexible version of
VMX MSRs.

> I can't imagine fields *disappearing* from eVMCS definition but oh well..

It's unlikely that features will truly disappear, but it is relatively likely that
userspace will want to hide a feature.  It's also likely that hardware won't
support all "previous" features, e.g. Intel has a habit of making features like
TSC scaling and APICv available only on Xeon SKUs.

Handling arbitrary configurations via version numbers gets kludgy because it's
impossible for userspace to communicate its exact desires to KVM.  All userspace
can do is state that it's aware of features up through version X; hiding individual
features requires manipulating the VMX MSRs.

And if userspace needs to set VMX MSRs, then userspace also needs to get VMX MSRs,
and so why not simply have userspace do exactly that?  The only missing piece is a
way for userspace to opt-in to activating the "feature is available if supported in
hardware _and_ eVMCS" logic so as not to break backwards compatibility.  A per-VM
capability works very well for that.
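
As a sketch of that logic, assuming a per-VM opt-in flag and a mask of controls
representable in eVMCS (the function name and both parameters are made up for
illustration): the value reported to userspace keeps the hardware's must-be-1
half and restricts the may-be-1 half to controls that eVMCS also supports.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * hw_msr:          VMX control capability MSR as supported by hardware/KVM
 * evmcs_supported: controls that can be expressed in eVMCS (hypothetical mask)
 * opted_in:        userspace enabled the new per-VM behavior
 */
static inline uint64_t filter_vmx_ctl_msr(uint64_t hw_msr,
					  uint32_t evmcs_supported,
					  bool opted_in)
{
	uint32_t allowed0 = (uint32_t)hw_msr;         /* must-be-1 controls */
	uint32_t allowed1 = (uint32_t)(hw_msr >> 32); /* may-be-1 controls */

	/* Without the opt-in, report the unfiltered value as before. */
	if (!opted_in)
		return hw_msr;

	/* Available iff supported by hardware _and_ eVMCS; keep must-be-1 bits. */
	allowed1 &= evmcs_supported | allowed0;

	return ((uint64_t)allowed1 << 32) | allowed0;
}
```

Userspace that wants to hide an additional feature would then clear the
corresponding may-be-1 bit in the value it writes back, exactly as it does for
the VMX MSRs today.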

> Anyway, I think this series is already getting too big and has many
> important fixes but some parts are still controversial. What if I split
> off everything-but-Hyper-V-on-KVM (where no controversy is currently
> observed) and send it out so we can continue discussing the issue at
> hand more conveniently?

Yes, let's do that.


Thread overview: 45+ messages
2022-08-24  3:01 [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 01/36] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 02/36] x86/hyperv: Update " Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 03/36] KVM: x86: Zero out entire Hyper-V CPUID cache before processing entries Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 04/36] KVM: x86: Check for existing Hyper-V vCPU in kvm_hv_vcpu_init() Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 05/36] KVM: x86: Report error when setting CPUID if Hyper-V allocation fails Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 06/36] KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled Sean Christopherson
2022-08-25 10:21   ` Vitaly Kuznetsov
2022-08-25 14:48     ` Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 07/36] KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array Sean Christopherson
2022-08-25 10:24   ` Vitaly Kuznetsov
2022-08-24  3:01 ` [RFC PATCH v6 08/36] KVM: nVMX: Use CC() macro to handle eVMCS unsupported controls checks Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 09/36] KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 10/36] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 11/36] KVM: nVMX: Support several new fields in eVMCSv1 Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 12/36] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 13/36] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 14/36] KVM: selftests: Switch to updated eVMCSv1 definition Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 15/36] KVM: nVMX: WARN once and fail VM-Enter if eVMCS sees VMFUNC[63:32] != 0 Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 16/36] KVM: nVMX: Support PERF_GLOBAL_CTRL with enlightened VMCS Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 17/36] KVM: nVMX: Support TSC scaling " Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 18/36] KVM: selftests: Enable TSC scaling in evmcs selftest Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 19/36] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 20/36] KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to vmcs02 Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 21/36] KVM: nVMX: Always emulate PERF_GLOBAL_CTRL VM-Entry/VM-Exit controls Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 22/36] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 23/36] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 24/36] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 25/36] KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit kernels/KVM Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 26/36] KVM: VMX: Extend VMX controls macro shenanigans Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 27/36] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 28/36] KVM: VMX: Add missing VMEXIT controls to vmcs_config Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 29/36] KVM: VMX: Add missing CPU based VM execution " Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 30/36] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 31/36] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 32/36] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 33/36] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 34/36] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 35/36] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Sean Christopherson
2022-08-24  3:01 ` [RFC PATCH v6 36/36] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Sean Christopherson
2022-08-25 18:08 ` [RFC PATCH v6 00/36] KVM: x86: eVMCS rework Vitaly Kuznetsov
2022-08-25 18:29   ` Sean Christopherson
2022-08-26 17:19     ` Vitaly Kuznetsov
2022-08-27 14:03       ` Vitaly Kuznetsov
2022-08-29 15:54         ` Sean Christopherson
