All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs
@ 2022-07-14  9:13 Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
                   ` (24 more replies)
  0 siblings, 25 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Changes since v3 [Max]:
- Fix swapped encls_exiting_bitmap and host_ia32_perf_global_ctrl fields.
- Wipe 'hv_vcpu->cpuid_cache' with memset() on update.
- Add a comment explaining made-up HV_X64_NESTED_EVMCS1_2022_UPDATE name
  and what this bit means.
- Add R-b tags.

Original description:

Enlightened VMCS v1 definition was updates to include fields for the
following features:
    - PerfGlobalCtrl
    - EnclsExitingBitmap
    - TSC scaling
    - GuestLbrCtl
    - CET
    - SSP
While the information is missing in the publicly available TLFS, the
updated definition comes with a new feature bit in CPUID.0x4000000A.EBX
(BIT 0). Use a made up HV_X64_NESTED_EVMCS1_2022_UPDATE name for it.

Add support for the new revision to KVM. SSP, CET and GuestLbrCtl
features are not currently supported by KVM.

While on it, implement Sean's idea to use vmcs_config for setting up
L1 VMX control MSRs instead of re-reading host MSRs.

Jim Mattson (1):
  KVM: x86: VMX: Replace some Intel model numbers with mnemonics

Sean Christopherson (1):
  KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not
    setup

Vitaly Kuznetsov (23):
  KVM: x86: hyper-v: Expose access to debug MSRs in the partition
    privilege flags
  x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
  KVM: nVMX: Support several new fields in eVMCSv1
  KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  KVM: selftests: Switch to updated eVMCSv1 definition
  KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with
    enlightened VMCS
  KVM: selftests: Enable TSC scaling in evmcs selftest
  KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
    setup_vmcs_config()
  KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
    in setup_vmcs_config()
  KVM: VMX: Extend VMX controls macro shenanigans
  KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
    setup_vmcs_config()
  KVM: VMX: Add missing VMEXIT controls to vmcs_config
  KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of
    setup_vmcs_config()
  KVM: nVMX: Always set required-1 bits of pinbased_ctls to
    PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
  KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
    nested MSR

 arch/x86/include/asm/hyperv-tlfs.h            |  30 ++-
 arch/x86/include/asm/kvm_host.h               |   2 +
 arch/x86/kvm/hyperv.c                         |  20 +-
 arch/x86/kvm/vmx/capabilities.h               |  14 +-
 arch/x86/kvm/vmx/evmcs.c                      | 127 +++++++---
 arch/x86/kvm/vmx/evmcs.h                      |  18 +-
 arch/x86/kvm/vmx/nested.c                     |  70 ++++--
 arch/x86/kvm/vmx/nested.h                     |   2 +-
 arch/x86/kvm/vmx/vmx.c                        | 235 ++++++++----------
 arch/x86/kvm/vmx/vmx.h                        | 116 +++++++++
 include/asm-generic/hyperv-tlfs.h             |   2 +
 .../selftests/kvm/include/x86_64/evmcs.h      |  45 +++-
 .../selftests/kvm/include/x86_64/vmx.h        |   2 +
 .../testing/selftests/kvm/x86_64/evmcs_test.c |  31 ++-
 14 files changed, 484 insertions(+), 230 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 21:43   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

For some features, Hyper-V spec defines two separate CPUID bits: one
listing whether the feature is supported or not and another one showing
whether guest partition was granted access to the feature ("partition
privilege mask"). 'Debug MSRs available' is one of such features. Add
the missing 'access' bit.

Fixes: f97f5a56f597 ("x86/kvm/hyper-v: Add support for synthetic debugger interface")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c             | 1 +
 include/asm-generic/hyperv-tlfs.h | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e2e95a6fccfd..e08189211d9a 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2496,6 +2496,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 			ent->eax |= HV_MSR_RESET_AVAILABLE;
 			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
 			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
+			ent->eax |= HV_ACCESS_DEBUG_MSRS;
 			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
 
 			ent->ebx |= HV_POST_MESSAGES;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index fdce7a4cfc6f..1d99dd296a76 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -70,6 +70,8 @@
 #define HV_MSR_GUEST_IDLE_AVAILABLE		BIT(10)
 /* Partition local APIC and TSC frequency registers available */
 #define HV_ACCESS_FREQUENCY_MSRS		BIT(11)
+/* Debug MSRs available */
+#define HV_ACCESS_DEBUG_MSRS			BIT(12)
 /* AccessReenlightenmentControls privilege */
 #define HV_ACCESS_REENLIGHTENMENT		BIT(13)
 /* AccessTscInvariantControls privilege */
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 03/25] x86/hyperv: Update " Vitaly Kuznetsov
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Section 1.9 of TLFS v6.0b says:

"All structures are padded in such a way that fields are aligned
naturally (that is, an 8-byte field is aligned to an offset of 8 bytes
and so on)".

'struct enlightened_vmcs' has a glitch:

...
        struct {
                u32                nested_flush_hypercall:1; /*   836: 0  4 */
                u32                msr_bitmap:1;         /*   836: 1  4 */
                u32                reserved:30;          /*   836: 2  4 */
        } hv_enlightenments_control;                     /*   836     4 */
        u32                        hv_vp_id;             /*   840     4 */
        u64                        hv_vm_id;             /*   844     8 */
        u64                        partition_assist_page; /*   852     8 */
...

And the observed values in 'partition_assist_page' make no sense at
all. Fix the layout by padding the structure properly.

Fixes: 68d1eb72ee99 ("x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0a9407dc0859..6f0acc45e67a 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -546,7 +546,7 @@ struct hv_enlightened_vmcs {
 	u64 guest_rip;
 
 	u32 hv_clean_fields;
-	u32 hv_padding_32;
+	u32 padding32_1;
 	u32 hv_synthetic_controls;
 	struct {
 		u32 nested_flush_hypercall:1;
@@ -554,7 +554,7 @@ struct hv_enlightened_vmcs {
 		u32 reserved:30;
 	}  __packed hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 03/25] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:57   ` Maxim Levitsky
  2022-07-14  9:13 ` [PATCH v4 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Vitaly Kuznetsov
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Updated Hyper-V Enlightened VMCS specification lists several new
fields for the following features:

- PerfGlobalCtrl
- EnclsExitingBitmap
- Tsc Scaling
- GuestLbrCtl
- CET
- SSP

Update the definition. The updated definition is available only when
CPUID.0x4000000A.EBX BIT(0) is '1'. Add a define for it as well.

Note: The latest TLFS is available at
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 6f0acc45e67a..ebc27017fa48 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -138,6 +138,17 @@
 #define HV_X64_NESTED_GUEST_MAPPING_FLUSH		BIT(18)
 #define HV_X64_NESTED_MSR_BITMAP			BIT(19)
 
+/*
+ * Nested quirks. These are HYPERV_CPUID_NESTED_FEATURES.EBX bits.
+ *
+ * Note: HV_X64_NESTED_EVMCS1_2022_UPDATE is not currently documented in any
+ * published TLFS version. When the bit is set, nested hypervisor can use
+ * 'updated' eVMCSv1 specification (perf_global_ctrl, s_cet, ssp, lbr_ctl,
+ * encls_exiting_bitmap, tsc_multiplier fields which were missing in 2016
+ * specification).
+ */
+#define HV_X64_NESTED_EVMCS1_2022_UPDATE		BIT(0)
+
 /*
  * This is specific to AMD and specifies that enlightened TLB flush is
  * supported. If guest opts in to this feature, ASID invalidations only
@@ -559,9 +570,20 @@ struct hv_enlightened_vmcs {
 	u64 partition_assist_page;
 	u64 padding64_4[4];
 	u64 guest_bndcfgs;
-	u64 padding64_5[7];
+	u64 guest_ia32_perf_global_ctrl;
+	u64 guest_ia32_s_cet;
+	u64 guest_ssp;
+	u64 guest_ia32_int_ssp_table_addr;
+	u64 guest_ia32_lbr_ctl;
+	u64 padding64_5[2];
 	u64 xss_exit_bitmap;
-	u64 padding64_6[7];
+	u64 encls_exiting_bitmap;
+	u64 host_ia32_perf_global_ctrl;
+	u64 tsc_multiplier;
+	u64 host_ia32_s_cet;
+	u64 host_ssp;
+	u64 host_ia32_int_ssp_table_addr;
+	u64 padding64_6;
 } __packed;
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE			0
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 03/25] x86/hyperv: Update " Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 05/25] KVM: nVMX: Support several new fields in eVMCSv1 Vitaly Kuznetsov
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Enlightened VMCS v1 definition was updated with new fields, support
them in KVM by defining VMCS-to-EVMCS conversion.

Note: SSP, CET and Guest LBR features are not supported by KVM yet and
the corresponding fields are not defined in 'enum vmcs_field', leave
them commented out for now.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/evmcs.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 6a61b1ae7942..8bea5dea0341 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -28,6 +28,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_IA32_EFER, host_ia32_efer,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	EVMCS1_FIELD(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_CR0, host_cr0,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_CR3, host_cr3,
@@ -78,6 +80,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_IA32_EFER, guest_ia32_efer,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	EVMCS1_FIELD(GUEST_IA32_PERF_GLOBAL_CTRL, guest_ia32_perf_global_ctrl,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_PDPTR0, guest_pdptr0,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_PDPTR1, guest_pdptr1,
@@ -126,6 +130,28 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(XSS_EXIT_BITMAP, xss_exit_bitmap,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	EVMCS1_FIELD(ENCLS_EXITING_BITMAP, encls_exiting_bitmap,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	EVMCS1_FIELD(TSC_MULTIPLIER, tsc_multiplier,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	/*
+	 * Not used by KVM:
+	 *
+	 * EVMCS1_FIELD(0x00006828, guest_ia32_s_cet,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x0000682A, guest_ssp,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_BASIC),
+	 * EVMCS1_FIELD(0x0000682C, guest_ia32_int_ssp_table_addr,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x00002816, guest_ia32_lbr_ctl,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x00006C18, host_ia32_s_cet,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 * EVMCS1_FIELD(0x00006C1A, host_ssp,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 * EVMCS1_FIELD(0x00006C1C, host_ia32_int_ssp_table_addr,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 */
 
 	/* 64 bit read only */
 	EVMCS1_FIELD(GUEST_PHYSICAL_ADDRESS, guest_physical_address,
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 05/25] KVM: nVMX: Support several new fields in eVMCSv1
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Enlightened VMCS v1 definition was updated with new fields, add
support for them for Hyper-V on KVM.

Note: SSP, CET and Guest LBR features are not supported by KVM yet
and 'struct vmcs12' has no corresponding fields.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 778f82015f03..4fc84f0f5875 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1603,6 +1603,10 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->guest_rflags = evmcs->guest_rflags;
 		vmcs12->guest_interruptibility_info =
 			evmcs->guest_interruptibility_info;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->guest_ssp = evmcs->guest_ssp;
+		 */
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1649,6 +1653,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->host_fs_selector = evmcs->host_fs_selector;
 		vmcs12->host_gs_selector = evmcs->host_gs_selector;
 		vmcs12->host_tr_selector = evmcs->host_tr_selector;
+		vmcs12->host_ia32_perf_global_ctrl = evmcs->host_ia32_perf_global_ctrl;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->host_ia32_s_cet = evmcs->host_ia32_s_cet;
+		 * vmcs12->host_ssp = evmcs->host_ssp;
+		 * vmcs12->host_ia32_int_ssp_table_addr = evmcs->host_ia32_int_ssp_table_addr;
+		 */
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1716,6 +1727,8 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->tsc_offset = evmcs->tsc_offset;
 		vmcs12->virtual_apic_page_addr = evmcs->virtual_apic_page_addr;
 		vmcs12->xss_exit_bitmap = evmcs->xss_exit_bitmap;
+		vmcs12->encls_exiting_bitmap = evmcs->encls_exiting_bitmap;
+		vmcs12->tsc_multiplier = evmcs->tsc_multiplier;
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1763,6 +1776,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->guest_bndcfgs = evmcs->guest_bndcfgs;
 		vmcs12->guest_activity_state = evmcs->guest_activity_state;
 		vmcs12->guest_sysenter_cs = evmcs->guest_sysenter_cs;
+		vmcs12->guest_ia32_perf_global_ctrl = evmcs->guest_ia32_perf_global_ctrl;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->guest_ia32_s_cet = evmcs->guest_ia32_s_cet;
+		 * vmcs12->guest_ia32_lbr_ctl = evmcs->guest_ia32_lbr_ctl;
+		 * vmcs12->guest_ia32_int_ssp_table_addr = evmcs->guest_ia32_int_ssp_table_addr;
+		 */
 	}
 
 	/*
@@ -1865,12 +1885,23 @@ static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
 	 * evmcs->vm_exit_msr_store_count = vmcs12->vm_exit_msr_store_count;
 	 * evmcs->vm_exit_msr_load_count = vmcs12->vm_exit_msr_load_count;
 	 * evmcs->vm_entry_msr_load_count = vmcs12->vm_entry_msr_load_count;
+	 * evmcs->guest_ia32_perf_global_ctrl = vmcs12->guest_ia32_perf_global_ctrl;
+	 * evmcs->host_ia32_perf_global_ctrl = vmcs12->host_ia32_perf_global_ctrl;
+	 * evmcs->encls_exiting_bitmap = vmcs12->encls_exiting_bitmap;
+	 * evmcs->tsc_multiplier = vmcs12->tsc_multiplier;
 	 *
 	 * Not present in struct vmcs12:
 	 * evmcs->exit_io_instruction_ecx = vmcs12->exit_io_instruction_ecx;
 	 * evmcs->exit_io_instruction_esi = vmcs12->exit_io_instruction_esi;
 	 * evmcs->exit_io_instruction_edi = vmcs12->exit_io_instruction_edi;
 	 * evmcs->exit_io_instruction_eip = vmcs12->exit_io_instruction_eip;
+	 * evmcs->host_ia32_s_cet = vmcs12->host_ia32_s_cet;
+	 * evmcs->host_ssp = vmcs12->host_ssp;
+	 * evmcs->host_ia32_int_ssp_table_addr = vmcs12->host_ia32_int_ssp_table_addr;
+	 * evmcs->guest_ia32_s_cet = vmcs12->guest_ia32_s_cet;
+	 * evmcs->guest_ia32_lbr_ctl = vmcs12->guest_ia32_lbr_ctl;
+	 * evmcs->guest_ia32_int_ssp_table_addr = vmcs12->guest_ia32_int_ssp_table_addr;
+	 * evmcs->guest_ssp = vmcs12->guest_ssp;
 	 */
 
 	evmcs->guest_es_selector = vmcs12->guest_es_selector;
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 05/25] KVM: nVMX: Support several new fields in eVMCSv1 Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:59   ` Maxim Levitsky
  2022-07-14  9:13 ` [PATCH v4 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
leaf to know which Enlightened VMCS definition to use (original or 2022
update). Cache the leaf along with other Hyper-V CPUID feature leaves
to make the check quick.

While on it, wipe the whole 'hv_vcpu->cpuid_cache' with memset() instead
of having to zero each particular member when the corresponding CPUID entry
was not found.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/hyperv.c           | 17 ++++++++---------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index de5a149d0971..077ec9cf3169 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
 		u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
 		u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
 		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
+		u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
+		u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
 	} cpuid_cache;
 };
 
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e08189211d9a..a8e4944ca110 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2005,31 +2005,30 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
 
 	hv_vcpu = to_hv_vcpu(vcpu);
 
+	memset(&hv_vcpu->cpuid_cache, 0, sizeof(hv_vcpu->cpuid_cache));
+
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_FEATURES, 0);
 	if (entry) {
 		hv_vcpu->cpuid_cache.features_eax = entry->eax;
 		hv_vcpu->cpuid_cache.features_ebx = entry->ebx;
 		hv_vcpu->cpuid_cache.features_edx = entry->edx;
-	} else {
-		hv_vcpu->cpuid_cache.features_eax = 0;
-		hv_vcpu->cpuid_cache.features_ebx = 0;
-		hv_vcpu->cpuid_cache.features_edx = 0;
 	}
 
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_ENLIGHTMENT_INFO, 0);
 	if (entry) {
 		hv_vcpu->cpuid_cache.enlightenments_eax = entry->eax;
 		hv_vcpu->cpuid_cache.enlightenments_ebx = entry->ebx;
-	} else {
-		hv_vcpu->cpuid_cache.enlightenments_eax = 0;
-		hv_vcpu->cpuid_cache.enlightenments_ebx = 0;
 	}
 
 	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES, 0);
 	if (entry)
 		hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
-	else
-		hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
+
+	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
+	if (entry) {
+		hv_vcpu->cpuid_cache.nested_eax = entry->eax;
+		hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
+	}
 }
 
 int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:20   ` Kai Huang
  2022-07-14  9:13 ` [PATCH v4 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

The updated Enlightened VMCS definition has 'encls_exiting_bitmap'
field which needs mapping to VMCS, add the missing encoding.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/include/x86_64/vmx.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index cc3604f8f1d3..5292d30fb7f2 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -208,6 +208,8 @@ enum vmcs_field {
 	VMWRITE_BITMAP_HIGH		= 0x00002029,
 	XSS_EXIT_BITMAP			= 0x0000202C,
 	XSS_EXIT_BITMAP_HIGH		= 0x0000202D,
+	ENCLS_EXITING_BITMAP		= 0x0000202E,
+	ENCLS_EXITING_BITMAP_HIGH	= 0x0000202F,
 	TSC_MULTIPLIER			= 0x00002032,
 	TSC_MULTIPLIER_HIGH		= 0x00002033,
 	GUEST_PHYSICAL_ADDRESS		= 0x00002400,
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 08/25] KVM: selftests: Switch to updated eVMCSv1 definition
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14 10:07   ` Maxim Levitsky
  2022-07-14  9:13 ` [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Update Enlightened VMCS definition in selftests from KVM.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/include/x86_64/evmcs.h      | 45 +++++++++++++++++--
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index 3c9260f8e116..58db74f68af2 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -203,14 +203,25 @@ struct hv_enlightened_vmcs {
 		u32 reserved:30;
 	} hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
 	u64 guest_bndcfgs;
-	u64 padding64_5[7];
+	u64 guest_ia32_perf_global_ctrl;
+	u64 guest_ia32_s_cet;
+	u64 guest_ssp;
+	u64 guest_ia32_int_ssp_table_addr;
+	u64 guest_ia32_lbr_ctl;
+	u64 padding64_5[2];
 	u64 xss_exit_bitmap;
-	u64 padding64_6[7];
+	u64 encls_exiting_bitmap;
+	u64 host_ia32_perf_global_ctrl;
+	u64 tsc_multiplier;
+	u64 host_ia32_s_cet;
+	u64 host_ssp;
+	u64 host_ia32_int_ssp_table_addr;
+	u64 padding64_6;
 };
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE                     0
@@ -656,6 +667,18 @@ static inline int evmcs_vmread(uint64_t encoding, uint64_t *value)
 	case VIRTUAL_PROCESSOR_ID:
 		*value = current_evmcs->virtual_processor_id;
 		break;
+	case HOST_IA32_PERF_GLOBAL_CTRL:
+		*value = current_evmcs->host_ia32_perf_global_ctrl;
+		break;
+	case GUEST_IA32_PERF_GLOBAL_CTRL:
+		*value = current_evmcs->guest_ia32_perf_global_ctrl;
+		break;
+	case ENCLS_EXITING_BITMAP:
+		*value = current_evmcs->encls_exiting_bitmap;
+		break;
+	case TSC_MULTIPLIER:
+		*value = current_evmcs->tsc_multiplier;
+		break;
 	default: return 1;
 	}
 
@@ -1169,6 +1192,22 @@ static inline int evmcs_vmwrite(uint64_t encoding, uint64_t value)
 		current_evmcs->virtual_processor_id = value;
 		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_XLAT;
 		break;
+	case HOST_IA32_PERF_GLOBAL_CTRL:
+		current_evmcs->host_ia32_perf_global_ctrl = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1;
+		break;
+	case GUEST_IA32_PERF_GLOBAL_CTRL:
+		current_evmcs->guest_ia32_perf_global_ctrl = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1;
+		break;
+	case ENCLS_EXITING_BITMAP:
+		current_evmcs->encls_exiting_bitmap = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+		break;
+	case TSC_MULTIPLIER:
+		current_evmcs->tsc_multiplier = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+		break;
 	default: return 1;
 	}
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (7 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 21:58   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest Vitaly Kuznetsov
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Enlightened VMCS v1 got updated and now includes the required fields
for TSC scaling and loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT
features. For Hyper-V on KVM enablement, KVM can just observe VMX control
MSRs and use the features (with or without eVMCS) when
possible. Hyper-V on KVM case is trickier because of live migration:
the new features require explicit enablement from VMM to not break
it. Luckily, the updated eVMCS revision comes with a feature bit in
CPUID.0x4000000A.EBX.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c     |  2 +-
 arch/x86/kvm/vmx/evmcs.c  | 88 +++++++++++++++++++++++++++++++--------
 arch/x86/kvm/vmx/evmcs.h  | 17 ++------
 arch/x86/kvm/vmx/nested.c |  2 +-
 arch/x86/kvm/vmx/vmx.c    |  2 +-
 5 files changed, 78 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a8e4944ca110..995d3ab1443e 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2552,7 +2552,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 		case HYPERV_CPUID_NESTED_FEATURES:
 			ent->eax = evmcs_ver;
 			ent->eax |= HV_X64_NESTED_MSR_BITMAP;
-
+			ent->ebx |= HV_X64_NESTED_EVMCS1_2022_UPDATE;
 			break;
 
 		case HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS:
diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 8bea5dea0341..52a53debd806 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -368,7 +368,60 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
+enum evmcs_v1_revision {
+	EVMCSv1_2016,
+	EVMCSv1_2022,
+};
+
+enum evmcs_unsupported_ctrl_type {
+	EVMCS_EXIT_CTLS,
+	EVMCS_ENTRY_CTLS,
+	EVMCS_2NDEXEC,
+	EVMCS_PINCTRL,
+	EVMCS_VMFUNC,
+};
+
+static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
+				      enum evmcs_unsupported_ctrl_type ctrl_type)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
+
+	if (!hv_vcpu)
+		return 0;
+
+	if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
+		evmcs_rev = EVMCSv1_2022;
+
+	switch (ctrl_type) {
+	case EVMCS_EXIT_CTLS:
+		if (evmcs_rev == EVMCSv1_2016)
+			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
+				VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+		else
+			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+	case EVMCS_ENTRY_CTLS:
+		if (evmcs_rev == EVMCSv1_2016)
+			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
+				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		else
+			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+	case EVMCS_2NDEXEC:
+		if (evmcs_rev == EVMCSv1_2016)
+			return EVMCS1_UNSUPPORTED_2NDEXEC |
+				SECONDARY_EXEC_TSC_SCALING;
+		else
+			return EVMCS1_UNSUPPORTED_2NDEXEC;
+	case EVMCS_PINCTRL:
+		return EVMCS1_UNSUPPORTED_PINCTRL;
+	case EVMCS_VMFUNC:
+		return EVMCS1_UNSUPPORTED_VMFUNC;
+	}
+
+	return 0;
+}
+
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 {
 	u32 ctl_low = (u32)*pdata;
 	u32 ctl_high = (u32)(*pdata >> 32);
@@ -380,72 +433,73 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
 	switch (msr_index) {
 	case MSR_IA32_VMX_EXIT_CTLS:
 	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
 		break;
 	case MSR_IA32_VMX_ENTRY_CTLS:
 	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
 		break;
 	case MSR_IA32_VMX_PROCBASED_CTLS2:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
 		break;
 	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
 	case MSR_IA32_VMX_PINBASED_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
 		break;
 	case MSR_IA32_VMX_VMFUNC:
-		ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC;
+		ctl_low &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
 		break;
 	}
 
 	*pdata = ctl_low | ((u64)ctl_high << 32);
 }
 
-int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
 	int ret = 0;
 	u32 unsupp_ctl;
 
 	unsupp_ctl = vmcs12->pin_based_vm_exec_control &
-		EVMCS1_UNSUPPORTED_PINCTRL;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported pin-based VM-execution controls",
+			"eVMCS: unsupported pin-based VM-execution controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
 	unsupp_ctl = vmcs12->secondary_vm_exec_control &
-		EVMCS1_UNSUPPORTED_2NDEXEC;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported secondary VM-execution controls",
+			"eVMCS: unsupported secondary VM-execution controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
 	unsupp_ctl = vmcs12->vm_exit_controls &
-		EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-exit controls",
+			"eVMCS: unsupported VM-exit controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
 	unsupp_ctl = vmcs12->vm_entry_controls &
-		EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-entry controls",
+			"eVMCS: unsupported VM-entry controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
-	unsupp_ctl = vmcs12->vm_function_control & EVMCS1_UNSUPPORTED_VMFUNC;
+	unsupp_ctl = vmcs12->vm_function_control &
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-function controls",
+			"eVMCS: unsupported VM-function controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index f886a8ff0342..4b809c79ae63 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -37,16 +37,9 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
  *	EPTP_LIST_ADDRESS               = 0x00002024,
  *	VMREAD_BITMAP                   = 0x00002026,
  *	VMWRITE_BITMAP                  = 0x00002028,
- *
- *	TSC_MULTIPLIER                  = 0x00002032,
  *	PLE_GAP                         = 0x00004020,
  *	PLE_WINDOW                      = 0x00004022,
  *	VMX_PREEMPTION_TIMER_VALUE      = 0x0000482E,
- *      GUEST_IA32_PERF_GLOBAL_CTRL     = 0x00002808,
- *      HOST_IA32_PERF_GLOBAL_CTRL      = 0x00002c04,
- *
- * Currently unsupported in KVM:
- *	GUEST_IA32_RTIT_CTL		= 0x00002814,
  */
 #define EVMCS1_UNSUPPORTED_PINCTRL (PIN_BASED_POSTED_INTR | \
 				    PIN_BASED_VMX_PREEMPTION_TIMER)
@@ -58,12 +51,10 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
 	 SECONDARY_EXEC_ENABLE_PML |					\
 	 SECONDARY_EXEC_ENABLE_VMFUNC |					\
 	 SECONDARY_EXEC_SHADOW_VMCS |					\
-	 SECONDARY_EXEC_TSC_SCALING |					\
 	 SECONDARY_EXEC_PAUSE_LOOP_EXITING)
 #define EVMCS1_UNSUPPORTED_VMEXIT_CTRL					\
-	(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |				\
-	 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
-#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
+	(VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
+#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (0)
 #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
 
 struct evmcs_field {
@@ -243,7 +234,7 @@ bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa);
 uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
 int nested_enable_evmcs(struct kvm_vcpu *vcpu,
 			uint16_t *vmcs_version);
-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
-int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12);
 
 #endif /* __KVM_X86_VMX_EVMCS_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4fc84f0f5875..dcf3ee645212 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2891,7 +2891,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 	if (to_vmx(vcpu)->nested.enlightened_vmcs_enabled)
-		return nested_evmcs_check_controls(vmcs12);
+		return nested_evmcs_check_controls(vcpu, vmcs12);
 
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c30115b9cb33..b4915d841357 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1858,7 +1858,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 */
 		if (!msr_info->host_initiated &&
 		    vmx->nested.enlightened_vmcs_enabled)
-			nested_evmcs_filter_control_msr(msr_info->index,
+			nested_evmcs_filter_control_msr(vcpu, msr_info->index,
 							&msr_info->data);
 		break;
 	case MSR_IA32_RTIT_CTL:
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (8 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

The updated Enlightened VMCS v1 definition enables TSC scaling, test
that SECONDARY_EXEC_TSC_SCALING can now be enabled.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../testing/selftests/kvm/x86_64/evmcs_test.c | 31 +++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
index 8dda527cc080..80135b98dc3b 100644
--- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
@@ -18,6 +18,9 @@
 
 #include "vmx.h"
 
+/* Test flags */
+#define HOST_HAS_TSC_SCALING BIT(0)
+
 static int ud_count;
 
 static void guest_ud_handler(struct ex_regs *regs)
@@ -64,11 +67,14 @@ void l2_guest_code(void)
 	vmcall();
 	rdmsr_gs_base(); /* intercepted */
 
+	/* TSC scaling */
+	vmcall();
+
 	/* Done, exit to L1 and never come back.  */
 	vmcall();
 }
 
-void guest_code(struct vmx_pages *vmx_pages)
+void guest_code(struct vmx_pages *vmx_pages, u64 test_flags)
 {
 #define L2_GUEST_STACK_SIZE 64
 	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
@@ -150,6 +156,18 @@ void guest_code(struct vmx_pages *vmx_pages)
 	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
 	GUEST_SYNC(11);
 
+	if (test_flags & HOST_HAS_TSC_SCALING) {
+		GUEST_ASSERT((rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+			     SECONDARY_EXEC_TSC_SCALING);
+		/* Try enabling TSC scaling */
+		vmwrite(SECONDARY_VM_EXEC_CONTROL, vmreadz(SECONDARY_VM_EXEC_CONTROL) |
+			SECONDARY_EXEC_TSC_SCALING);
+		vmwrite(TSC_MULTIPLIER, 1);
+	}
+	GUEST_ASSERT(!vmresume());
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+	GUEST_SYNC(12);
+
 	/* Try enlightened vmptrld with an incorrect GPA */
 	evmcs_vmptrld(0xdeadbeef, vmx_pages->enlightened_vmcs);
 	GUEST_ASSERT(vmlaunch());
@@ -204,6 +222,7 @@ int main(int argc, char *argv[])
 	struct kvm_vm *vm;
 	struct kvm_run *run;
 	struct ucall uc;
+	u64 test_flags = 0;
 	int stage;
 
 	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
@@ -212,11 +231,19 @@ int main(int argc, char *argv[])
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));
 
+	if ((kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+	    SECONDARY_EXEC_TSC_SCALING) {
+		test_flags |= HOST_HAS_TSC_SCALING;
+		pr_info("TSC scaling is supported, adding to test\n");
+	} else {
+		pr_info("TSC scaling is not supported\n");
+	}
+
 	vcpu_set_hv_cpuid(vcpu);
 	vcpu_enable_evmcs(vcpu);
 
 	vcpu_alloc_vmx(vm, &vmx_pages_gva);
-	vcpu_args_set(vcpu, 1, vmx_pages_gva);
+	vcpu_args_set(vcpu, 2, vmx_pages_gva, test_flags);
 
 	vm_init_descriptor_tables(vm);
 	vcpu_init_descriptor_tables(vcpu);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (9 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14 10:04   ` Maxim Levitsky
  2022-07-14  9:13 ` [PATCH v4 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

With the updated eVMCSv1 definition, there's no known 'problematic'
controls which are exposed in VMX control MSRs but are not present in
eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.

Note: VMX control MSRs filtering for Hyper-V on KVM
(nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
definition doesn't have all the features implemented by KVM and some
fields are still missing. Moreover, nested_evmcs_filter_control_msr()
has to support the original eVMCSv1 version when VMM wishes so.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/evmcs.c | 13 -------------
 arch/x86/kvm/vmx/evmcs.h |  1 -
 arch/x86/kvm/vmx/vmx.c   |  5 -----
 3 files changed, 19 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 52a53debd806..b5cfbf7d487b 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 };
 const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
 
-#if IS_ENABLED(CONFIG_HYPERV)
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
-{
-	vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
-	vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
-	vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
-	vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
-
-	vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
-	vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
-}
-#endif
-
 bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
 {
 	struct hv_vp_assist_page assist_page;
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index 4b809c79ae63..0feac101cce4 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
 	vp_ap->enlighten_vmentry = 1;
 }
 
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
 #else /* !IS_ENABLED(CONFIG_HYPERV) */
 static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
 static inline void evmcs_write32(unsigned long field, u32 value) {}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b4915d841357..dd905ad72637 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	vmcs_conf->vmexit_ctrl         = _vmexit_control;
 	vmcs_conf->vmentry_ctrl        = _vmentry_control;
 
-#if IS_ENABLED(CONFIG_HYPERV)
-	if (enlightened_vmcs)
-		evmcs_sanitize_exec_ctrls(vmcs_conf);
-#endif
-
 	return 0;
 }
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (10 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:00   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer()
and setup_vmcs_config() doesn't check its existence. On the contrary,
nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing
check and filter the bit out in vmx_vmentry_ctrl().

No (real) functional change intended as all existing CPUs supporting
long mode and VMX are supposed to have it.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dd905ad72637..1aaec4d19e1b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2610,6 +2610,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
 	min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
+#ifdef CONFIG_X86_64
+	min |= VM_ENTRY_IA32E_MODE;
+#endif
 	opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
 	      VM_ENTRY_LOAD_IA32_PAT |
 	      VM_ENTRY_LOAD_IA32_EFER |
@@ -4242,9 +4245,14 @@ static u32 vmx_vmentry_ctrl(void)
 	if (vmx_pt_mode_is_system())
 		vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
 				  VM_ENTRY_LOAD_IA32_RTIT_CTL);
-	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
-	return vmentry_ctrl &
-		~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_IA32_EFER);
+	/*
+	 * IA32e mode, and loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically.
+	 */
+	vmentry_ctrl &= ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
+			  VM_ENTRY_LOAD_IA32_EFER |
+			  VM_ENTRY_IA32E_MODE);
+
+	return vmentry_ctrl;
 }
 
 static u32 vmx_vmexit_ctrl(void)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (11 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:01   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window()
but setup_vmcs_config() doesn't check their existence. Add the check and
filter the controls out in vmx_exec_control().

Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs
are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis()
accordingly.

No functional change intended.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/capabilities.h | 3 ++-
 arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 069d8d298e1d..07e7492fe72a 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -82,7 +82,8 @@ static inline bool cpu_has_vmx_basic_inout(void)
 
 static inline bool cpu_has_virtual_nmis(void)
 {
-	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
+	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
+		vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;
 }
 
 static inline bool cpu_has_vmx_preemption_timer(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1aaec4d19e1b..ce54f13d8da1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2487,10 +2487,12 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      CPU_BASED_MWAIT_EXITING |
 	      CPU_BASED_MONITOR_EXITING |
 	      CPU_BASED_INVLPG_EXITING |
-	      CPU_BASED_RDPMC_EXITING;
+	      CPU_BASED_RDPMC_EXITING |
+	      CPU_BASED_INTR_WINDOW_EXITING;
 
 	opt = CPU_BASED_TPR_SHADOW |
 	      CPU_BASED_USE_MSR_BITMAPS |
+	      CPU_BASED_NMI_WINDOW_EXITING |
 	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
 	      CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
@@ -4299,6 +4301,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
 
+	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
+	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
+			  CPU_BASED_NMI_WINDOW_EXITING);
+
 	if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
 		exec_control &= ~CPU_BASED_MOV_DR_EXITING;
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config()
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (12 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:11   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

SECONDARY_EXEC_ENCLS_EXITING is conditionally added to the 'optional'
checklist in setup_vmcs_config() but there's little value in doing so.
First, as the control is optional, we can always check for its
presence, no harm done. Second, the only real value cpu_has_sgx() check
gives is that on the CPUs which support SECONDARY_EXEC_ENCLS_EXITING but
don't support SGX, the control is not getting enabled. It's highly unlikely
such CPUs exist but it's possible that some hypervisors expose broken vCPU
models.

Preserve cpu_has_sgx() check but filter the result of adjust_vmx_controls()
instead of the input.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ce54f13d8da1..566be73c6509 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2528,9 +2528,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 			SECONDARY_EXEC_PT_CONCEAL_VMX |
 			SECONDARY_EXEC_ENABLE_VMFUNC |
 			SECONDARY_EXEC_BUS_LOCK_DETECTION |
-			SECONDARY_EXEC_NOTIFY_VM_EXITING;
-		if (cpu_has_sgx())
-			opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
+			SECONDARY_EXEC_NOTIFY_VM_EXITING |
+			SECONDARY_EXEC_ENCLS_EXITING;
+
 		if (adjust_vmx_controls(min2, opt2,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
@@ -2577,6 +2577,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		vmx_cap->vpid = 0;
 	}
 
+	if (!cpu_has_sgx())
+		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
+
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
 		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (13 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:28   ` Sean Christopherson
  2022-07-22 18:33   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
                   ` (9 subsequent siblings)
  24 siblings, 2 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

When VMX controls macros are used to set or clear a control bit, make
sure that this bit was checked in setup_vmcs_config() and thus is properly
reflected in vmcs_config.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c |  99 +++++++------------------------------
 arch/x86/kvm/vmx/vmx.h | 109 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 127 insertions(+), 81 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 566be73c6509..93ca9ff8e641 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2448,7 +2448,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 				    struct vmx_capability *vmx_cap)
 {
 	u32 vmx_msr_low, vmx_msr_high;
-	u32 min, opt, min2, opt2;
 	u32 _pin_based_exec_control = 0;
 	u32 _cpu_based_exec_control = 0;
 	u32 _cpu_based_2nd_exec_control = 0;
@@ -2474,28 +2473,10 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	};
 
 	memset(vmcs_conf, 0, sizeof(*vmcs_conf));
-	min = CPU_BASED_HLT_EXITING |
-#ifdef CONFIG_X86_64
-	      CPU_BASED_CR8_LOAD_EXITING |
-	      CPU_BASED_CR8_STORE_EXITING |
-#endif
-	      CPU_BASED_CR3_LOAD_EXITING |
-	      CPU_BASED_CR3_STORE_EXITING |
-	      CPU_BASED_UNCOND_IO_EXITING |
-	      CPU_BASED_MOV_DR_EXITING |
-	      CPU_BASED_USE_TSC_OFFSETTING |
-	      CPU_BASED_MWAIT_EXITING |
-	      CPU_BASED_MONITOR_EXITING |
-	      CPU_BASED_INVLPG_EXITING |
-	      CPU_BASED_RDPMC_EXITING |
-	      CPU_BASED_INTR_WINDOW_EXITING;
-
-	opt = CPU_BASED_TPR_SHADOW |
-	      CPU_BASED_USE_MSR_BITMAPS |
-	      CPU_BASED_NMI_WINDOW_EXITING |
-	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
-	      CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
+
+	if (adjust_vmx_controls(KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL,
+				KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL,
+				MSR_IA32_VMX_PROCBASED_CTLS,
 				&_cpu_based_exec_control) < 0)
 		return -EIO;
 #ifdef CONFIG_X86_64
@@ -2504,34 +2485,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 					   ~CPU_BASED_CR8_STORE_EXITING;
 #endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
-		min2 = 0;
-		opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
-			SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
-			SECONDARY_EXEC_WBINVD_EXITING |
-			SECONDARY_EXEC_ENABLE_VPID |
-			SECONDARY_EXEC_ENABLE_EPT |
-			SECONDARY_EXEC_UNRESTRICTED_GUEST |
-			SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-			SECONDARY_EXEC_DESC |
-			SECONDARY_EXEC_ENABLE_RDTSCP |
-			SECONDARY_EXEC_ENABLE_INVPCID |
-			SECONDARY_EXEC_APIC_REGISTER_VIRT |
-			SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
-			SECONDARY_EXEC_SHADOW_VMCS |
-			SECONDARY_EXEC_XSAVES |
-			SECONDARY_EXEC_RDSEED_EXITING |
-			SECONDARY_EXEC_RDRAND_EXITING |
-			SECONDARY_EXEC_ENABLE_PML |
-			SECONDARY_EXEC_TSC_SCALING |
-			SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
-			SECONDARY_EXEC_PT_USE_GPA |
-			SECONDARY_EXEC_PT_CONCEAL_VMX |
-			SECONDARY_EXEC_ENABLE_VMFUNC |
-			SECONDARY_EXEC_BUS_LOCK_DETECTION |
-			SECONDARY_EXEC_NOTIFY_VM_EXITING |
-			SECONDARY_EXEC_ENCLS_EXITING;
-
-		if (adjust_vmx_controls(min2, opt2,
+		if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
+					KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
 			return -EIO;
@@ -2581,30 +2536,20 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
 
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
-		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
-
-		_cpu_based_3rd_exec_control = adjust_vmx_controls64(opt3,
-					      MSR_IA32_VMX_PROCBASED_CTLS3);
+		_cpu_based_3rd_exec_control =
+			adjust_vmx_controls64(KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL,
+			MSR_IA32_VMX_PROCBASED_CTLS3);
 	}
 
-	min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
-#ifdef CONFIG_X86_64
-	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
-#endif
-	opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
-	      VM_EXIT_LOAD_IA32_PAT |
-	      VM_EXIT_LOAD_IA32_EFER |
-	      VM_EXIT_CLEAR_BNDCFGS |
-	      VM_EXIT_PT_CONCEAL_PIP |
-	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
+	if (adjust_vmx_controls(KVM_REQ_VMX_VM_EXIT_CONTROLS,
+				KVM_OPT_VMX_VM_EXIT_CONTROLS,
+				MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
 
-	min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
-	opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
-		 PIN_BASED_VMX_PREEMPTION_TIMER;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
+	if (adjust_vmx_controls(KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL,
+				KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL,
+				MSR_IA32_VMX_PINBASED_CTLS,
 				&_pin_based_exec_control) < 0)
 		return -EIO;
 
@@ -2614,17 +2559,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
-	min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
-#ifdef CONFIG_X86_64
-	min |= VM_ENTRY_IA32E_MODE;
-#endif
-	opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
-	      VM_ENTRY_LOAD_IA32_PAT |
-	      VM_ENTRY_LOAD_IA32_EFER |
-	      VM_ENTRY_LOAD_BNDCFGS |
-	      VM_ENTRY_PT_CONCEAL_PIP |
-	      VM_ENTRY_LOAD_IA32_RTIT_CTL;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
+	if (adjust_vmx_controls(KVM_REQ_VMX_VM_ENTRY_CONTROLS,
+				KVM_OPT_VMX_VM_ENTRY_CONTROLS,
+				MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 286c88e285ea..89eaab3495a6 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
 	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
 }
 
+#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS				\
+	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
+#ifdef CONFIG_X86_64
+	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
+		(__KVM_REQ_VMX_VM_ENTRY_CONTROLS |		\
+		VM_ENTRY_IA32E_MODE)
+#else
+	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
+		__KVM_REQ_VMX_VM_ENTRY_CONTROLS
+#endif
+#define KVM_OPT_VMX_VM_ENTRY_CONTROLS				\
+	(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |			\
+	VM_ENTRY_LOAD_IA32_PAT |				\
+	VM_ENTRY_LOAD_IA32_EFER |				\
+	VM_ENTRY_LOAD_BNDCFGS |					\
+	VM_ENTRY_PT_CONCEAL_PIP |				\
+	VM_ENTRY_LOAD_IA32_RTIT_CTL)
+
+#define __KVM_REQ_VMX_VM_EXIT_CONTROLS				\
+	(VM_EXIT_SAVE_DEBUG_CONTROLS |				\
+	VM_EXIT_ACK_INTR_ON_EXIT)
+#ifdef CONFIG_X86_64
+	#define KVM_REQ_VMX_VM_EXIT_CONTROLS			\
+		(__KVM_REQ_VMX_VM_EXIT_CONTROLS |		\
+		VM_EXIT_HOST_ADDR_SPACE_SIZE)
+#else
+	#define KVM_REQ_VMX_VM_EXIT_CONTROLS			\
+		__KVM_REQ_VMX_VM_EXIT_CONTROLS
+#endif
+#define KVM_OPT_VMX_VM_EXIT_CONTROLS				\
+	      (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |		\
+	      VM_EXIT_LOAD_IA32_PAT |				\
+	      VM_EXIT_LOAD_IA32_EFER |				\
+	      VM_EXIT_CLEAR_BNDCFGS |				\
+	      VM_EXIT_PT_CONCEAL_PIP |				\
+	      VM_EXIT_CLEAR_IA32_RTIT_CTL)
+
+#define KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL			\
+	(PIN_BASED_EXT_INTR_MASK |				\
+	 PIN_BASED_NMI_EXITING)
+#define KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL			\
+	(PIN_BASED_VIRTUAL_NMIS |				\
+	PIN_BASED_POSTED_INTR |					\
+	PIN_BASED_VMX_PREEMPTION_TIMER)
+
+#define __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL			\
+	(CPU_BASED_HLT_EXITING |				\
+	CPU_BASED_CR3_LOAD_EXITING |				\
+	CPU_BASED_CR3_STORE_EXITING |				\
+	CPU_BASED_UNCOND_IO_EXITING |				\
+	CPU_BASED_MOV_DR_EXITING |				\
+	CPU_BASED_USE_TSC_OFFSETTING |				\
+	CPU_BASED_MWAIT_EXITING |				\
+	CPU_BASED_MONITOR_EXITING |				\
+	CPU_BASED_INVLPG_EXITING |				\
+	CPU_BASED_RDPMC_EXITING |				\
+	CPU_BASED_INTR_WINDOW_EXITING)
+
+#ifdef CONFIG_X86_64
+	#define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL		\
+		(__KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL |	\
+		CPU_BASED_CR8_LOAD_EXITING |			\
+		CPU_BASED_CR8_STORE_EXITING)
+#else
+	#define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL		\
+		__KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
+#endif
+
+#define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL			\
+	(CPU_BASED_TPR_SHADOW |					\
+	CPU_BASED_USE_MSR_BITMAPS |				\
+	CPU_BASED_NMI_WINDOW_EXITING |				\
+	CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |			\
+	CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
+
+#define KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL 0
+#define KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL			\
+	(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |		\
+	SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |			\
+	SECONDARY_EXEC_WBINVD_EXITING |				\
+	SECONDARY_EXEC_ENABLE_VPID |				\
+	SECONDARY_EXEC_ENABLE_EPT |				\
+	SECONDARY_EXEC_UNRESTRICTED_GUEST |			\
+	SECONDARY_EXEC_PAUSE_LOOP_EXITING |			\
+	SECONDARY_EXEC_DESC |					\
+	SECONDARY_EXEC_ENABLE_RDTSCP |				\
+	SECONDARY_EXEC_ENABLE_INVPCID |				\
+	SECONDARY_EXEC_APIC_REGISTER_VIRT |			\
+	SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |			\
+	SECONDARY_EXEC_SHADOW_VMCS |				\
+	SECONDARY_EXEC_XSAVES |					\
+	SECONDARY_EXEC_RDSEED_EXITING |				\
+	SECONDARY_EXEC_RDRAND_EXITING |				\
+	SECONDARY_EXEC_ENABLE_PML |				\
+	SECONDARY_EXEC_TSC_SCALING |				\
+	SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |			\
+	SECONDARY_EXEC_PT_USE_GPA |				\
+	SECONDARY_EXEC_PT_CONCEAL_VMX |				\
+	SECONDARY_EXEC_ENABLE_VMFUNC |				\
+	SECONDARY_EXEC_BUS_LOCK_DETECTION |			\
+	SECONDARY_EXEC_NOTIFY_VM_EXITING |			\
+	SECONDARY_EXEC_ENCLS_EXITING)
+
+#define KVM_REQ_VMX_TERTIARY_VM_EXEC_CONTROL 0
+#define KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL			\
+	(TERTIARY_EXEC_IPI_VIRT)
+
 #define BUILD_CONTROLS_SHADOW(lname, uname, bits)				\
 static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
@@ -485,10 +592,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)		\
 }										\
 static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
+	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
 	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);		\
 }										\
 static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
+	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
 	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);		\
 }
 BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config()
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (14 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:30   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering
to vmx_exec_control().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 93ca9ff8e641..d7170990f469 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2479,11 +2479,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 				MSR_IA32_VMX_PROCBASED_CTLS,
 				&_cpu_based_exec_control) < 0)
 		return -EIO;
-#ifdef CONFIG_X86_64
-	if (_cpu_based_exec_control & CPU_BASED_TPR_SHADOW)
-		_cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
-					   ~CPU_BASED_CR8_STORE_EXITING;
-#endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
 		if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
 					KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
@@ -4248,13 +4243,17 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 	if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
 		exec_control &= ~CPU_BASED_MOV_DR_EXITING;
 
-	if (!cpu_need_tpr_shadow(&vmx->vcpu)) {
+	if (!cpu_need_tpr_shadow(&vmx->vcpu))
 		exec_control &= ~CPU_BASED_TPR_SHADOW;
+
 #ifdef CONFIG_X86_64
+	if (exec_control & CPU_BASED_TPR_SHADOW)
+		exec_control &= ~(CPU_BASED_CR8_LOAD_EXITING |
+				  CPU_BASED_CR8_STORE_EXITING);
+	else
 		exec_control |= CPU_BASED_CR8_STORE_EXITING |
 				CPU_BASED_CR8_LOAD_EXITING;
 #endif
-	}
 	if (!enable_ept)
 		exec_control |= CPU_BASED_CR3_STORE_EXITING |
 				CPU_BASED_CR3_LOAD_EXITING  |
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (15 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:34   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and
filter them out in vmx_vmexit_ctrl().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 4 ++++
 arch/x86/kvm/vmx/vmx.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d7170990f469..2fb89bdcbbd8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4196,6 +4196,10 @@ static u32 vmx_vmexit_ctrl(void)
 {
 	u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
 
+	/* Not used by KVM but supported for nesting. */
+	vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
+			 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
+
 	if (vmx_pt_mode_is_system())
 		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
 				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 89eaab3495a6..e9c392398f1b 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -498,8 +498,11 @@ static inline u8 vmx_get_rvi(void)
 #endif
 #define KVM_OPT_VMX_VM_EXIT_CONTROLS				\
 	      (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |		\
+	      VM_EXIT_SAVE_IA32_PAT |				\
 	      VM_EXIT_LOAD_IA32_PAT |				\
+	      VM_EXIT_SAVE_IA32_EFER |				\
 	      VM_EXIT_LOAD_IA32_EFER |				\
+	      VM_EXIT_SAVE_VMX_PREEMPTION_TIMER |		\
 	      VM_EXIT_CLEAR_BNDCFGS |				\
 	      VM_EXIT_PT_CONCEAL_PIP |				\
 	      VM_EXIT_CLEAR_IA32_RTIT_CTL)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 18/25] KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (16 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:39   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the CPU based VM execution controls which KVM
doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL
and filter them out in vmx_exec_control().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 6 ++++++
 arch/x86/kvm/vmx/vmx.h | 6 +++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2fb89bdcbbd8..9771c771c8f5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4240,6 +4240,12 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
 
+	/* Not used by KVM but supported for nesting. */
+	exec_control &= ~(CPU_BASED_RDTSC_EXITING |
+			  CPU_BASED_USE_IO_BITMAPS |
+			  CPU_BASED_MONITOR_TRAP_FLAG |
+			  CPU_BASED_PAUSE_EXITING);
+
 	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
 	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
 			  CPU_BASED_NMI_WINDOW_EXITING);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index e9c392398f1b..758f80c41beb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -539,9 +539,13 @@ static inline u8 vmx_get_rvi(void)
 #endif
 
 #define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL			\
-	(CPU_BASED_TPR_SHADOW |					\
+	(CPU_BASED_RDTSC_EXITING |				\
+	CPU_BASED_TPR_SHADOW |					\
+	CPU_BASED_USE_IO_BITMAPS |				\
+	CPU_BASED_MONITOR_TRAP_FLAG |				\
 	CPU_BASED_USE_MSR_BITMAPS |				\
 	CPU_BASED_NMI_WINDOW_EXITING |				\
+	CPU_BASED_PAUSE_EXITING |				\
 	CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |			\
 	CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (17 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

From: Sean Christopherson <seanjc@google.com>

Clear the CR3 and INVLPG interception controls at runtime based on
whether or not EPT is being _used_, as opposed to clearing the bits at
setup if EPT is _supported_ in hardware, and then restoring them when EPT
is not used.  Not mucking with the base config will allow using the base
config as the starting point for emulating the VMX capability MSRs.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9771c771c8f5..eca6875d6732 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2501,13 +2501,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
 		&vmx_cap->ept, &vmx_cap->vpid);
 
-	if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
-		/* CR3 accesses and invlpg don't need to cause VM Exits when EPT
-		   enabled */
-		_cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
-					     CPU_BASED_CR3_STORE_EXITING |
-					     CPU_BASED_INVLPG_EXITING);
-	} else if (vmx_cap->ept) {
+	if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
+	    vmx_cap->ept) {
 		pr_warn_once("EPT CAP should not exist if not support "
 				"1-setting enable EPT VM-execution control\n");
 
@@ -4264,10 +4259,11 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 		exec_control |= CPU_BASED_CR8_STORE_EXITING |
 				CPU_BASED_CR8_LOAD_EXITING;
 #endif
-	if (!enable_ept)
-		exec_control |= CPU_BASED_CR3_STORE_EXITING |
-				CPU_BASED_CR3_LOAD_EXITING  |
-				CPU_BASED_INVLPG_EXITING;
+	/* No need to intercept CR3 access or INVPLG when using EPT. */
+	if (enable_ept)
+		exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
+				  CPU_BASED_CR3_STORE_EXITING |
+				  CPU_BASED_INVLPG_EXITING);
 	if (kvm_mwait_in_guest(vmx->vcpu.kvm))
 		exec_control &= ~(CPU_BASED_MWAIT_EXITING |
 				CPU_BASED_MONITOR_EXITING);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (18 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

From: Jim Mattson <jmattson@google.com>

Intel processor code names are more familiar to many readers than
their decimal model numbers.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index eca6875d6732..2dff5b94c535 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2580,11 +2580,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	 */
 	if (boot_cpu_data.x86 == 0x6) {
 		switch (boot_cpu_data.x86_model) {
-		case 26: /* AAK155 */
-		case 30: /* AAP115 */
-		case 37: /* AAT100 */
-		case 44: /* BC86,AAY89,BD102 */
-		case 46: /* BA97 */
+		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
+		case INTEL_FAM6_NEHALEM:	/* AAP115 */
+		case INTEL_FAM6_WESTMERE:	/* AAT100 */
+		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
+		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
 			_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 			_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
 			pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (19 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 22:56   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() for setting
up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling
to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from
hardware_setup(). While it seems reasonable to not expose
LOAD_IA32_PERF_GLOBAL_CTRL controls to L1 hypervisor on buggy CPUs,
such change would inevitably break live migration from older KVMs
where the controls are exposed. Keep the status quo for know, L1 hypervisor
itself is supposed to take care of the errata.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 62 ++++++++++++++++++++++++++----------------
 1 file changed, 38 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2dff5b94c535..e462e5b9c0a1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2416,6 +2416,31 @@ static bool cpu_has_sgx(void)
 	return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
 }
 
+/*
+ * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
+ * can't be used due to errata where VM Exit may incorrectly clear
+ * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
+ * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+ */
+static bool cpu_has_perf_global_ctrl_bug(void)
+{
+	if (boot_cpu_data.x86 == 0x6) {
+		switch (boot_cpu_data.x86_model) {
+		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
+		case INTEL_FAM6_NEHALEM:	/* AAP115 */
+		case INTEL_FAM6_WESTMERE:	/* AAT100 */
+		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
+		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
+			return true;
+		default:
+			break;
+		}
+	}
+
+	return false;
+}
+
+
 static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 				      u32 msr, u32 *result)
 {
@@ -2572,30 +2597,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_vmexit_control &= ~x_ctrl;
 	}
 
-	/*
-	 * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
-	 * can't be used due to an errata where VM Exit may incorrectly clear
-	 * IA32_PERF_GLOBAL_CTRL[34:32].  Workaround the errata by using the
-	 * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
-	 */
-	if (boot_cpu_data.x86 == 0x6) {
-		switch (boot_cpu_data.x86_model) {
-		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
-		case INTEL_FAM6_NEHALEM:	/* AAP115 */
-		case INTEL_FAM6_WESTMERE:	/* AAT100 */
-		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
-		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
-			_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-			_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-			pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
-					"does not work properly. Using workaround\n");
-			break;
-		default:
-			break;
-		}
-	}
-
-
 	rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
 
 	/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -4184,6 +4185,10 @@ static u32 vmx_vmentry_ctrl(void)
 			  VM_ENTRY_LOAD_IA32_EFER |
 			  VM_ENTRY_IA32E_MODE);
 
+
+	if (cpu_has_perf_global_ctrl_bug())
+		vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+
 	return vmentry_ctrl;
 }
 
@@ -4198,6 +4203,10 @@ static u32 vmx_vmexit_ctrl(void)
 	if (vmx_pt_mode_is_system())
 		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
 				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
+
+	if (cpu_has_perf_global_ctrl_bug())
+		vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+
 	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
 	return vmexit_ctrl &
 		~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
@@ -8113,6 +8122,11 @@ static __init int hardware_setup(void)
 	if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
 		return -EIO;
 
+	if (cpu_has_perf_global_ctrl_bug()) {
+		pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+			     "does not work properly. Using workaround\n");
+	}
+
 	if (boot_cpu_has(X86_FEATURE_NX))
 		kvm_enable_efer_bits(EFER_NX);
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (20 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Similar to exit_ctls_low, entry_ctls_low, procbased_ctls_low,
pinbased_ctls_low should be set to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
and not host's MSR_IA32_VMX_PINBASED_CTLS value |=
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR.

The commit eabeaaccfca0 ("KVM: nVMX: Clean up and fix pin-based
execution controls") which introduced '|=' doesn't mention anything
about why this is needed, the change seems rather accidental.

Note: normally, required-1 portion of MSR_IA32_VMX_PINBASED_CTLS should
be equal to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR so no behavioral change
is expected, however, it is (in theory) possible to observe something
different there when e.g. KVM is running as a nested hypervisor. Hope
this doesn't happen in practice.

Reported-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index dcf3ee645212..09654d5c2144 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6586,7 +6586,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
 		msrs->pinbased_ctls_low,
 		msrs->pinbased_ctls_high);
-	msrs->pinbased_ctls_low |=
+	msrs->pinbased_ctls_low =
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (21 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
  2022-07-14  9:13 ` [PATCH v4 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Vitaly Kuznetsov
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Using raw host MSR values for setting up nested VMX control MSRs is
incorrect as some features need to disabled, e.g. when KVM runs as
a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a
workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this
is done in setup_vmcs_config() and the result is stored in vmcs_config.
Use it for setting up allowed-1 bits in nested VMX MSRs too.

Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 30 ++++++++++++------------------
 arch/x86/kvm/vmx/nested.h |  2 +-
 arch/x86/kvm/vmx/vmx.c    |  5 ++---
 3 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 09654d5c2144..3d386c663fac 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6565,8 +6565,10 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
  * bit in the high half is on if the corresponding bit in the control field
  * may be on. See also vmx_control_verify().
  */
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 {
+	struct nested_vmx_msrs *msrs = &vmcs_conf->nested;
+
 	/*
 	 * Note that as a general rule, the high half of the MSRs (bits in
 	 * the control fields which may be 1) should be initialized by the
@@ -6583,11 +6585,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	 */
 
 	/* pin-based controls */
-	rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
-		msrs->pinbased_ctls_low,
-		msrs->pinbased_ctls_high);
 	msrs->pinbased_ctls_low =
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
 		PIN_BASED_NMI_EXITING |
@@ -6598,12 +6599,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		PIN_BASED_VMX_PREEMPTION_TIMER;
 
 	/* exit controls */
-	rdmsr(MSR_IA32_VMX_EXIT_CTLS,
-		msrs->exit_ctls_low,
-		msrs->exit_ctls_high);
 	msrs->exit_ctls_low =
 		VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
 
+	msrs->exit_ctls_high = vmcs_conf->vmexit_ctrl;
 	msrs->exit_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_EXIT_HOST_ADDR_SPACE_SIZE |
@@ -6619,11 +6618,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;
 
 	/* entry controls */
-	rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
-		msrs->entry_ctls_low,
-		msrs->entry_ctls_high);
 	msrs->entry_ctls_low =
 		VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->entry_ctls_high = vmcs_conf->vmentry_ctrl;
 	msrs->entry_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_ENTRY_IA32E_MODE |
@@ -6637,11 +6635,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;
 
 	/* cpu-based controls */
-	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
-		msrs->procbased_ctls_low,
-		msrs->procbased_ctls_high);
 	msrs->procbased_ctls_low =
 		CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->procbased_ctls_high = vmcs_conf->cpu_based_exec_ctrl;
 	msrs->procbased_ctls_high &=
 		CPU_BASED_INTR_WINDOW_EXITING |
 		CPU_BASED_NMI_WINDOW_EXITING | CPU_BASED_USE_TSC_OFFSETTING |
@@ -6675,12 +6672,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	 * depend on CPUID bits, they are added later by
 	 * vmx_vcpu_after_set_cpuid.
 	 */
-	if (msrs->procbased_ctls_high & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
-		rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
-		      msrs->secondary_ctls_low,
-		      msrs->secondary_ctls_high);
-
 	msrs->secondary_ctls_low = 0;
+
+	msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
 	msrs->secondary_ctls_high &=
 		SECONDARY_EXEC_DESC |
 		SECONDARY_EXEC_ENABLE_RDTSCP |
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index c92cea0b8ccc..fae047c6204b 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -17,7 +17,7 @@ enum nvmx_vmentry_status {
 };
 
 void vmx_leave_nested(struct kvm_vcpu *vcpu);
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
 void nested_vmx_hardware_unsetup(void);
 __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
 void nested_vmx_set_vmcs_shadowing_bitmap(void);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e462e5b9c0a1..35285109856f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7306,7 +7306,7 @@ static int __init vmx_check_processor_compat(void)
 	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
 		return -EIO;
 	if (nested)
-		nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
+		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
 	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
 		printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
 				smp_processor_id());
@@ -8281,8 +8281,7 @@ static __init int hardware_setup(void)
 	setup_default_sgx_lepubkeyhash();
 
 	if (nested) {
-		nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
-					   vmx_capability.ept);
+		nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);
 
 		r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
 		if (r)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (22 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  2022-07-21 23:06   ` Sean Christopherson
  2022-07-14  9:13 ` [PATCH v4 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Vitaly Kuznetsov
  24 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in
vmcs_config to avoid the need to re-read it later, e.g. from
cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs().

No (real) functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/capabilities.h | 11 +++--------
 arch/x86/kvm/vmx/vmx.c          |  8 +++++---
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 07e7492fe72a..07f7a9534211 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -65,6 +65,7 @@ struct vmcs_config {
 	u64 cpu_based_3rd_exec_ctrl;
 	u32 vmexit_ctrl;
 	u32 vmentry_ctrl;
+	u64 misc;
 	struct nested_vmx_msrs nested;
 };
 extern struct vmcs_config vmcs_config;
@@ -225,11 +226,8 @@ static inline bool cpu_has_vmx_vmfunc(void)
 
 static inline bool cpu_has_vmx_shadow_vmcs(void)
 {
-	u64 vmx_msr;
-
 	/* check if the cpu supports writing r/o exit information fields */
-	rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
-	if (!(vmx_msr & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
+	if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
 		return false;
 
 	return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -371,10 +369,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)
 
 static inline bool cpu_has_vmx_intel_pt(void)
 {
-	u64 vmx_msr;
-
-	rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
-	return (vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT) &&
+	return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
 		(vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
 		(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 35285109856f..ab091758c437 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2479,6 +2479,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	u64 _cpu_based_3rd_exec_control = 0;
 	u32 _vmexit_control = 0;
 	u32 _vmentry_control = 0;
+	u64 misc_msr;
 	int i;
 
 	/*
@@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	if (((vmx_msr_high >> 18) & 15) != 6)
 		return -EIO;
 
+	rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
+
 	vmcs_conf->size = vmx_msr_high & 0x1fff;
 	vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
 
@@ -2624,6 +2627,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
 	vmcs_conf->vmexit_ctrl         = _vmexit_control;
 	vmcs_conf->vmentry_ctrl        = _vmentry_control;
+	vmcs_conf->misc	= misc_msr;
 
 	return 0;
 }
@@ -8241,11 +8245,9 @@ static __init int hardware_setup(void)
 
 	if (enable_preemption_timer) {
 		u64 use_timer_freq = 5000ULL * 1000 * 1000;
-		u64 vmx_msr;
 
-		rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
 		cpu_preemption_timer_multi =
-			vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
+			vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
 
 		if (tsc_khz)
 			use_timer_freq = (u64)tsc_khz * 1000;
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH v4 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR
  2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (23 preceding siblings ...)
  2022-07-14  9:13 ` [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
@ 2022-07-14  9:13 ` Vitaly Kuznetsov
  24 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-14  9:13 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

vmcs_config has cached host MSR_IA32_VMX_MISC value, use it for setting
up nested MSR_IA32_VMX_MISC in nested_vmx_setup_ctls_msrs() and avoid the
redundant rdmsr().

No (real) functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 3d386c663fac..8026dab71086 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6754,10 +6754,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
 
 	/* miscellaneous data */
-	rdmsr(MSR_IA32_VMX_MISC,
-		msrs->misc_low,
-		msrs->misc_high);
-	msrs->misc_low &= VMX_MISC_SAVE_EFER_LMA;
+	msrs->misc_low = (u32)vmcs_conf->misc & VMX_MISC_SAVE_EFER_LMA;
 	msrs->misc_low |=
 		MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS |
 		VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE |
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  2022-07-14  9:13 ` [PATCH v4 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
@ 2022-07-14  9:20   ` Kai Huang
  0 siblings, 0 replies; 62+ messages in thread
From: Kai Huang @ 2022-07-14  9:20 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

On Thu, 2022-07-14 at 11:13 +0200, Vitaly Kuznetsov wrote:
> The updated Enlightened VMCS definition has 'encls_exiting_bitmap'
> field which needs mapping to VMCS, add the missing encoding.
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  tools/testing/selftests/kvm/include/x86_64/vmx.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
> index cc3604f8f1d3..5292d30fb7f2 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
> @@ -208,6 +208,8 @@ enum vmcs_field {
>  	VMWRITE_BITMAP_HIGH		= 0x00002029,
>  	XSS_EXIT_BITMAP			= 0x0000202C,
>  	XSS_EXIT_BITMAP_HIGH		= 0x0000202D,
> +	ENCLS_EXITING_BITMAP		= 0x0000202E,
> +	ENCLS_EXITING_BITMAP_HIGH	= 0x0000202F,
>  	TSC_MULTIPLIER			= 0x00002032,
>  	TSC_MULTIPLIER_HIGH		= 0x00002033,
>  	GUEST_PHYSICAL_ADDRESS		= 0x00002400,

Reviewed-by: Kai Huang <kai.huang@intel.com>

-- 
Thanks,
-Kai



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 03/25] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  2022-07-14  9:13 ` [PATCH v4 03/25] x86/hyperv: Update " Vitaly Kuznetsov
@ 2022-07-14  9:57   ` Maxim Levitsky
  0 siblings, 0 replies; 62+ messages in thread
From: Maxim Levitsky @ 2022-07-14  9:57 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Thu, 2022-07-14 at 11:13 +0200, Vitaly Kuznetsov wrote:
> Updated Hyper-V Enlightened VMCS specification lists several new
> fields for the following features:
> 
> - PerfGlobalCtrl
> - EnclsExitingBitmap
> - Tsc Scaling
> - GuestLbrCtl
> - CET
> - SSP
> 
> Update the definition. The updated definition is available only when
> CPUID.0x4000000A.EBX BIT(0) is '1'. Add a define for it as well.
> 
> Note: The latest TLFS is available at
> https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/hyperv-tlfs.h | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index 6f0acc45e67a..ebc27017fa48 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -138,6 +138,17 @@
>  #define HV_X64_NESTED_GUEST_MAPPING_FLUSH              BIT(18)
>  #define HV_X64_NESTED_MSR_BITMAP                       BIT(19)
>  
> +/*
> + * Nested quirks. These are HYPERV_CPUID_NESTED_FEATURES.EBX bits.
> + *
> + * Note: HV_X64_NESTED_EVMCS1_2022_UPDATE is not currently documented in any
> + * published TLFS version. When the bit is set, nested hypervisor can use
> + * 'updated' eVMCSv1 specification (perf_global_ctrl, s_cet, ssp, lbr_ctl,
> + * encls_exiting_bitmap, tsc_multiplier fields which were missing in 2016
> + * specification).
> + */
> +#define HV_X64_NESTED_EVMCS1_2022_UPDATE               BIT(0)
> +
>  /*
>   * This is specific to AMD and specifies that enlightened TLB flush is
>   * supported. If guest opts in to this feature, ASID invalidations only
> @@ -559,9 +570,20 @@ struct hv_enlightened_vmcs {
>         u64 partition_assist_page;
>         u64 padding64_4[4];
>         u64 guest_bndcfgs;
> -       u64 padding64_5[7];
> +       u64 guest_ia32_perf_global_ctrl;
> +       u64 guest_ia32_s_cet;
> +       u64 guest_ssp;
> +       u64 guest_ia32_int_ssp_table_addr;
> +       u64 guest_ia32_lbr_ctl;
> +       u64 padding64_5[2];
>         u64 xss_exit_bitmap;
> -       u64 padding64_6[7];
> +       u64 encls_exiting_bitmap;
> +       u64 host_ia32_perf_global_ctrl;
> +       u64 tsc_multiplier;
> +       u64 host_ia32_s_cet;
> +       u64 host_ssp;
> +       u64 host_ia32_int_ssp_table_addr;
> +       u64 padding64_6;
>  } __packed;
>  
>  #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE                    0

All look good now.

I really don't like the new 'online' TLFS spec - as you said,
they can indeed change it any moment without any traces.

Seems it was done with good intentions, and it much easier to use,
but they should also provide a PDF, or at least some form or archive of
these web pages.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  2022-07-14  9:13 ` [PATCH v4 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
@ 2022-07-14  9:59   ` Maxim Levitsky
  0 siblings, 0 replies; 62+ messages in thread
From: Maxim Levitsky @ 2022-07-14  9:59 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Thu, 2022-07-14 at 11:13 +0200, Vitaly Kuznetsov wrote:
> KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
> leaf to know which Enlightened VMCS definition to use (original or 2022
> update). Cache the leaf along with other Hyper-V CPUID feature leaves
> to make the check quick.
> 
> While on it, wipe the whole 'hv_vcpu->cpuid_cache' with memset() instead
> of having to zero each particular member when the corresponding CPUID entry
> was not found.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  2 ++
>  arch/x86/kvm/hyperv.c           | 17 ++++++++---------
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index de5a149d0971..077ec9cf3169 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
>                 u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
>                 u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
>                 u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
> +               u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
> +               u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
>         } cpuid_cache;
>  };
>  
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index e08189211d9a..a8e4944ca110 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2005,31 +2005,30 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
>  
>         hv_vcpu = to_hv_vcpu(vcpu);
>  
> +       memset(&hv_vcpu->cpuid_cache, 0, sizeof(hv_vcpu->cpuid_cache));
> +
>         entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_FEATURES, 0);
>         if (entry) {
>                 hv_vcpu->cpuid_cache.features_eax = entry->eax;
>                 hv_vcpu->cpuid_cache.features_ebx = entry->ebx;
>                 hv_vcpu->cpuid_cache.features_edx = entry->edx;
> -       } else {
> -               hv_vcpu->cpuid_cache.features_eax = 0;
> -               hv_vcpu->cpuid_cache.features_ebx = 0;
> -               hv_vcpu->cpuid_cache.features_edx = 0;
>         }
>  
>         entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_ENLIGHTMENT_INFO, 0);
>         if (entry) {
>                 hv_vcpu->cpuid_cache.enlightenments_eax = entry->eax;
>                 hv_vcpu->cpuid_cache.enlightenments_ebx = entry->ebx;
> -       } else {
> -               hv_vcpu->cpuid_cache.enlightenments_eax = 0;
> -               hv_vcpu->cpuid_cache.enlightenments_ebx = 0;
>         }
>  
>         entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES, 0);
>         if (entry)
>                 hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
> -       else
> -               hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
> +
> +       entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
> +       if (entry) {
> +               hv_vcpu->cpuid_cache.nested_eax = entry->eax;
> +               hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
> +       }
>  }
>  
>  int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)

Makes sense, looks good.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  2022-07-14  9:13 ` [PATCH v4 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
@ 2022-07-14 10:04   ` Maxim Levitsky
  0 siblings, 0 replies; 62+ messages in thread
From: Maxim Levitsky @ 2022-07-14 10:04 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Thu, 2022-07-14 at 11:13 +0200, Vitaly Kuznetsov wrote:
> With the updated eVMCSv1 definition, there's no known 'problematic'
> controls which are exposed in VMX control MSRs but are not present in
> eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.

I think it still might be worth it, mentioning at least in the commit message,
that as you said, the all known HyperV versions, either don't expose the 
new fields by not setting bits in the VMX feature controls, 
or support the new eVMCS revision.

But anyway:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky

> 
> Note: VMX control MSRs filtering for Hyper-V on KVM
> (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
> definition doesn't have all the features implemented by KVM and some
> fields are still missing. Moreover, nested_evmcs_filter_control_msr()
> has to support the original eVMCSv1 version when VMM wishes so.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/evmcs.c | 13 -------------
>  arch/x86/kvm/vmx/evmcs.h |  1 -
>  arch/x86/kvm/vmx/vmx.c   |  5 -----
>  3 files changed, 19 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 52a53debd806..b5cfbf7d487b 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>  };
>  const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
>  
> -#if IS_ENABLED(CONFIG_HYPERV)
> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
> -{
> -       vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
> -       vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> -       vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> -       vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
> -
> -       vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> -       vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> -}
> -#endif
> -
>  bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
>  {
>         struct hv_vp_assist_page assist_page;
> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> index 4b809c79ae63..0feac101cce4 100644
> --- a/arch/x86/kvm/vmx/evmcs.h
> +++ b/arch/x86/kvm/vmx/evmcs.h
> @@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
>         vp_ap->enlighten_vmentry = 1;
>  }
>  
> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
>  #else /* !IS_ENABLED(CONFIG_HYPERV) */
>  static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
>  static inline void evmcs_write32(unsigned long field, u32 value) {}
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b4915d841357..dd905ad72637 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         vmcs_conf->vmexit_ctrl         = _vmexit_control;
>         vmcs_conf->vmentry_ctrl        = _vmentry_control;
>  
> -#if IS_ENABLED(CONFIG_HYPERV)
> -       if (enlightened_vmcs)
> -               evmcs_sanitize_exec_ctrls(vmcs_conf);
> -#endif
> -
>         return 0;
>  }
>  



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 08/25] KVM: selftests: Switch to updated eVMCSv1 definition
  2022-07-14  9:13 ` [PATCH v4 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
@ 2022-07-14 10:07   ` Maxim Levitsky
  0 siblings, 0 replies; 62+ messages in thread
From: Maxim Levitsky @ 2022-07-14 10:07 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Thu, 2022-07-14 at 11:13 +0200, Vitaly Kuznetsov wrote:
> Update Enlightened VMCS definition in selftests from KVM.
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  .../selftests/kvm/include/x86_64/evmcs.h      | 45 +++++++++++++++++--
>  1 file changed, 42 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> index 3c9260f8e116..58db74f68af2 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> @@ -203,14 +203,25 @@ struct hv_enlightened_vmcs {
>                 u32 reserved:30;
>         } hv_enlightenments_control;
>         u32 hv_vp_id;
> -
> +       u32 padding32_2;
>         u64 hv_vm_id;
>         u64 partition_assist_page;
>         u64 padding64_4[4];
>         u64 guest_bndcfgs;
> -       u64 padding64_5[7];
> +       u64 guest_ia32_perf_global_ctrl;
> +       u64 guest_ia32_s_cet;
> +       u64 guest_ssp;
> +       u64 guest_ia32_int_ssp_table_addr;
> +       u64 guest_ia32_lbr_ctl;
> +       u64 padding64_5[2];
>         u64 xss_exit_bitmap;
> -       u64 padding64_6[7];
> +       u64 encls_exiting_bitmap;
> +       u64 host_ia32_perf_global_ctrl;

Fixed here as well, thanks!

Best regards,
	Maxim Levitsky

> +       u64 tsc_multiplier;
> +       u64 host_ia32_s_cet;
> +       u64 host_ssp;
> +       u64 host_ia32_int_ssp_table_addr;
> +       u64 padding64_6;
>  };
>  
>  #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE                     0
> @@ -656,6 +667,18 @@ static inline int evmcs_vmread(uint64_t encoding, uint64_t *value)
>         case VIRTUAL_PROCESSOR_ID:
>                 *value = current_evmcs->virtual_processor_id;
>                 break;
> +       case HOST_IA32_PERF_GLOBAL_CTRL:
> +               *value = current_evmcs->host_ia32_perf_global_ctrl;
> +               break;
> +       case GUEST_IA32_PERF_GLOBAL_CTRL:
> +               *value = current_evmcs->guest_ia32_perf_global_ctrl;
> +               break;
> +       case ENCLS_EXITING_BITMAP:
> +               *value = current_evmcs->encls_exiting_bitmap;
> +               break;
> +       case TSC_MULTIPLIER:
> +               *value = current_evmcs->tsc_multiplier;
> +               break;
>         default: return 1;
>         }
>  
> @@ -1169,6 +1192,22 @@ static inline int evmcs_vmwrite(uint64_t encoding, uint64_t value)
>                 current_evmcs->virtual_processor_id = value;
>                 current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_XLAT;
>                 break;
> +       case HOST_IA32_PERF_GLOBAL_CTRL:
> +               current_evmcs->host_ia32_perf_global_ctrl = value;
> +               current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1;
> +               break;
> +       case GUEST_IA32_PERF_GLOBAL_CTRL:
> +               current_evmcs->guest_ia32_perf_global_ctrl = value;
> +               current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1;
> +               break;
> +       case ENCLS_EXITING_BITMAP:
> +               current_evmcs->encls_exiting_bitmap = value;
> +               current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
> +               break;
> +       case TSC_MULTIPLIER:
> +               current_evmcs->tsc_multiplier = value;
> +               current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
> +               break;
>         default: return 1;
>         }
>  



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags
  2022-07-14  9:13 ` [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
@ 2022-07-21 21:43   ` Sean Christopherson
  2022-07-22 17:22     ` Paolo Bonzini
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 21:43 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> For some features, Hyper-V spec defines two separate CPUID bits: one
> listing whether the feature is supported or not and another one showing
> whether guest partition was granted access to the feature ("partition
> privilege mask"). 'Debug MSRs available' is one of such features. Add
> the missing 'access' bit.
> 
> Fixes: f97f5a56f597 ("x86/kvm/hyper-v: Add support for synthetic debugger interface")
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c             | 1 +
>  include/asm-generic/hyperv-tlfs.h | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index e2e95a6fccfd..e08189211d9a 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2496,6 +2496,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>  			ent->eax |= HV_MSR_RESET_AVAILABLE;
>  			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>  			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
> +			ent->eax |= HV_ACCESS_DEBUG_MSRS;
>  			ent->eax |= HV_ACCESS_REENLIGHTENMENT;

Doesn't KVM also need to switch to enforcing the "access" flag?

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index c284a605e453..ca91547034e4 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1282,7 +1282,7 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
        case HV_X64_MSR_SYNDBG_OPTIONS:
        case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
                return hv_vcpu->cpuid_cache.features_edx &
-                       HV_FEATURE_DEBUG_MSRS_AVAILABLE;
+                       HV_ACCESS_DEBUG_MSRS;
        default:
                break;
        }


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-14  9:13 ` [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
@ 2022-07-21 21:58   ` Sean Christopherson
  2022-07-25 17:09     ` Paolo Bonzini
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 21:58 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

Nit of all nits, just "KVM: nVMX:" in the shortlog.

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> Enlightened VMCS v1 got updated and now includes the required fields
> for TSC scaling and loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT
> features. For Hyper-V on KVM enablement, KVM can just observe VMX control
> MSRs and use the features (with or without eVMCS) when
> possible. Hyper-V on KVM case is trickier because of live migration:
> the new features require explicit enablement from VMM to not break
> it. Luckily, the updated eVMCS revision comes with a feature bit in
> CPUID.0x4000000A.EBX.

Mostly out of curiosity, what happens if the VMM parrots back the results of
kvm_get_hv_cpuid()?

> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c     |  2 +-
>  arch/x86/kvm/vmx/evmcs.c  | 88 +++++++++++++++++++++++++++++++--------
>  arch/x86/kvm/vmx/evmcs.h  | 17 ++------
>  arch/x86/kvm/vmx/nested.c |  2 +-
>  arch/x86/kvm/vmx/vmx.c    |  2 +-
>  5 files changed, 78 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index a8e4944ca110..995d3ab1443e 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2552,7 +2552,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>  		case HYPERV_CPUID_NESTED_FEATURES:
>  			ent->eax = evmcs_ver;
>  			ent->eax |= HV_X64_NESTED_MSR_BITMAP;
> -
> +			ent->ebx |= HV_X64_NESTED_EVMCS1_2022_UPDATE;
>  			break;
>  
>  		case HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS:
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 8bea5dea0341..52a53debd806 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -368,7 +368,60 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> -void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
> +enum evmcs_v1_revision {
> +	EVMCSv1_2016,
> +	EVMCSv1_2022,
> +};

Why bother with the enums?  They don't make life any easier, e.g. if there's a
2023 update then it's easy to do:

		unsupported = <baseline>;
		if (!has_evmcs_2022_features)
			unsupported |= <2022 features>;
		if (!has_evmcs_2023_features)
			unsupported |= <2023 features>;
		return unsupported;

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index b5cfbf7d487b..7b348a03e096 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -355,11 +355,6 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
        return 0;
 }

-enum evmcs_v1_revision {
-       EVMCSv1_2016,
-       EVMCSv1_2022,
-};
-
 enum evmcs_unsupported_ctrl_type {
        EVMCS_EXIT_CTLS,
        EVMCS_ENTRY_CTLS,
@@ -372,29 +367,29 @@ static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
                                      enum evmcs_unsupported_ctrl_type ctrl_type)
 {
        struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
-       enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
+       bool has_evmcs_2022_features;

        if (!hv_vcpu)
                return 0;

-       if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
-               evmcs_rev = EVMCSv1_2022;
+       has_evmcs_2022_features = hv_vcpu->cpuid_cache.nested_ebx &
+                                 HV_X64_NESTED_EVMCS1_2022_UPDATE;

        switch (ctrl_type) {
        case EVMCS_EXIT_CTLS:
-               if (evmcs_rev == EVMCSv1_2016)
+               if (!has_evmcs_2022_features)
                        return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
                                VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
                else
                        return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
        case EVMCS_ENTRY_CTLS:
-               if (evmcs_rev == EVMCSv1_2016)
+               if (!has_evmcs_2022_features)
                        return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
                                VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
                else
                        return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
        case EVMCS_2NDEXEC:
-               if (evmcs_rev == EVMCSv1_2016)
+               if (!has_evmcs_2022_features)
                        return EVMCS1_UNSUPPORTED_2NDEXEC |
                                SECONDARY_EXEC_TSC_SCALING;
                else

> +
> +enum evmcs_unsupported_ctrl_type {
> +	EVMCS_EXIT_CTLS,
> +	EVMCS_ENTRY_CTLS,
> +	EVMCS_2NDEXEC,
> +	EVMCS_PINCTRL,
> +	EVMCS_VMFUNC,
> +};

Same question here, it just makes life more difficult.  I.e. do

  static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu, int msr_index)

and then

  void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
  {
	u32 ctl_low = (u32)*pdata;
	u32 ctl_high = (u32)(*pdata >> 32);

	/*
	 * Hyper-V 2016 and 2019 try using unsupported features when eVMCS is
	 * enabled but there are no corresponding fields.
	 */
	ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, msr_index);

	*pdata = ctl_low | ((u64)ctl_high << 32);
  }


  int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
  {
	int ret = 0;
	u32 unsupp_ctl;

	unsupp_ctl = vmcs12->pin_based_vm_exec_control &
		evmcs_get_unsupported_ctls(vcpu, MSR_IA32_VMX_TRUE_PINBASED_CTLS);

	<and so on and so forth>
  }

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  2022-07-14  9:13 ` [PATCH v4 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-21 22:00   ` Sean Christopherson
  0 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:00 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer()
> and setup_vmcs_config() doesn't check its existence. On the contrary,
> nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing
> check and filter the bit out in vmx_vmentry_ctrl().
> 
> No (real) functional change intended as all existing CPUs supporting
> long mode and VMX are supposed to have it.
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()
  2022-07-14  9:13 ` [PATCH v4 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
@ 2022-07-21 22:01   ` Sean Christopherson
  0 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
> vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window()
> but setup_vmcs_config() doesn't check their existence. Add the check and
> filter the controls out in vmx_exec_control().
> 
> Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs
> are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis()
> accordingly.
> 
> No functional change intended.
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Nit aside,

Reviewed-by: Sean Christopherson <seanjc@google.com>

>  arch/x86/kvm/vmx/capabilities.h | 3 ++-
>  arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 069d8d298e1d..07e7492fe72a 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -82,7 +82,8 @@ static inline bool cpu_has_vmx_basic_inout(void)
>  
>  static inline bool cpu_has_virtual_nmis(void)
>  {
> -	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
> +	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
> +		vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;

I prefer to align these, though Paolo always scoffs at me :-)


static inline bool cpu_has_virtual_nmis(void)
{
	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
	       vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;
}

>  }
>  
>  static inline bool cpu_has_vmx_preemption_timer(void)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 1aaec4d19e1b..ce54f13d8da1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2487,10 +2487,12 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	      CPU_BASED_MWAIT_EXITING |
>  	      CPU_BASED_MONITOR_EXITING |
>  	      CPU_BASED_INVLPG_EXITING |
> -	      CPU_BASED_RDPMC_EXITING;
> +	      CPU_BASED_RDPMC_EXITING |
> +	      CPU_BASED_INTR_WINDOW_EXITING;
>  
>  	opt = CPU_BASED_TPR_SHADOW |
>  	      CPU_BASED_USE_MSR_BITMAPS |
> +	      CPU_BASED_NMI_WINDOW_EXITING |
>  	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
>  	      CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
>  	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
> @@ -4299,6 +4301,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>  {
>  	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
>  
> +	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
> +	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
> +			  CPU_BASED_NMI_WINDOW_EXITING);
> +
>  	if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>  		exec_control &= ~CPU_BASED_MOV_DR_EXITING;
>  
> -- 
> 2.35.3
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config()
  2022-07-14  9:13 ` [PATCH v4 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
@ 2022-07-21 22:11   ` Sean Christopherson
  2022-08-02 12:52     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> SECONDARY_EXEC_ENCLS_EXITING is conditionally added to the 'optional'
> checklist in setup_vmcs_config() but there's little value in doing so.
> First, as the control is optional, we can always check for its
> presence, no harm done. Second, the only real value cpu_has_sgx() check
> gives is that on the CPUs which support SECONDARY_EXEC_ENCLS_EXITING but
> don't support SGX, the control is not getting enabled. It's highly unlikely
> such CPUs exist but it's possible that some hypervisors expose broken vCPU
> models.

It's not just broken vCPU models, SGX can be "soft-disabled" on bare metal, e.g. if
software writes MCE control MSRs or there's an uncorrectable #MC (may not be the
case on all platforms).  This is architectural behavior and needs to be handled in
KVM.  Obviously if SGX gets disabled after KVM is loaded then we're out of luck, but
having the ENCL-exiting control without SGX being enabled is 100% valid.

As for why KVM bothers with the check, it's to work around a suspected hardware
or XuCode bug (I'm still a bit shocked that's public now :-) ) where SGX got
_hard_ disabled across S3 on some CPUs and made the fields magically disappear.
The workaround was to soft-disable SGX in BIOS so that KVM wouldn't attempt to
enable the ENCLS-exiting control

> Preserve cpu_has_sgx() check but filter the result of adjust_vmx_controls()
> instead of the input.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ce54f13d8da1..566be73c6509 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2528,9 +2528,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  			SECONDARY_EXEC_PT_CONCEAL_VMX |
>  			SECONDARY_EXEC_ENABLE_VMFUNC |
>  			SECONDARY_EXEC_BUS_LOCK_DETECTION |
> -			SECONDARY_EXEC_NOTIFY_VM_EXITING;
> -		if (cpu_has_sgx())
> -			opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
> +			SECONDARY_EXEC_NOTIFY_VM_EXITING |
> +			SECONDARY_EXEC_ENCLS_EXITING;
> +
>  		if (adjust_vmx_controls(min2, opt2,
>  					MSR_IA32_VMX_PROCBASED_CTLS2,
>  					&_cpu_based_2nd_exec_control) < 0)
> @@ -2577,6 +2577,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  		vmx_cap->vpid = 0;
>  	}
>  
> +	if (!cpu_has_sgx())
> +		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
> +
>  	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
>  		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
>  
> -- 
> 2.35.3
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-14  9:13 ` [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
@ 2022-07-21 22:28   ` Sean Christopherson
  2022-07-22 18:33   ` Sean Christopherson
  1 sibling, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:28 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> @@ -2581,30 +2536,20 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
>  
>  	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {

Curly braces no longer necessary.

> -		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
> -
> -		_cpu_based_3rd_exec_control = adjust_vmx_controls64(opt3,
> -					      MSR_IA32_VMX_PROCBASED_CTLS3);
> +		_cpu_based_3rd_exec_control =
> +			adjust_vmx_controls64(KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL,
> +			MSR_IA32_VMX_PROCBASED_CTLS3);

Please align indentation.


	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
		_cpu_based_3rd_exec_control =
			adjust_vmx_controls64(KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL,
					      MSR_IA32_VMX_PROCBASED_CTLS3);


>  	}
>  
> -	min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
> -#ifdef CONFIG_X86_64
> -	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
> -#endif
> -	opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
> -	      VM_EXIT_LOAD_IA32_PAT |
> -	      VM_EXIT_LOAD_IA32_EFER |
> -	      VM_EXIT_CLEAR_BNDCFGS |
> -	      VM_EXIT_PT_CONCEAL_PIP |
> -	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
> -	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
> +	if (adjust_vmx_controls(KVM_REQ_VMX_VM_EXIT_CONTROLS,

I think we should spell out REQUIRED and OPTIONAL, almost all of the usage puts
them on their own lines, i.e. longer names doesn't change much.  "OPT" is fine,
but "REQ" already means "REQUEST" in KVM, i.e. I mentally read this as
"KVM requested controls", which is quite different from "KVM required controls".

> +				KVM_OPT_VMX_VM_EXIT_CONTROLS,
> +				MSR_IA32_VMX_EXIT_CTLS,
>  				&_vmexit_control) < 0)
>  		return -EIO;
>  
> -	min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
> -	opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
> -		 PIN_BASED_VMX_PREEMPTION_TIMER;
> -	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
> +	if (adjust_vmx_controls(KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL,
> +				KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL,
> +				MSR_IA32_VMX_PINBASED_CTLS,
>  				&_pin_based_exec_control) < 0)

I vote to opportunistically (or in a prep patch) drop the silly "< 0" check, it's
pointless and makes the code unnecessarily difficult to follow.

And at that point, I also think it makes sense to move the pointer passing to the
same line as the MSR definition, even if one or two lines run a bit long:

	if (adjust_vmx_controls(KVM_REQUIRED_VMX_VM_ENTRY_CONTROLS,
				KVM_OPTIONAL_VMX_VM_ENTRY_CONTROLS,
				MSR_IA32_VMX_ENTRY_CTLS, &_vmentry_control))

> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 286c88e285ea..89eaab3495a6 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
>  	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
>  }
>  
> +#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS				\
> +	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
> +#ifdef CONFIG_X86_64
> +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> +		(__KVM_REQ_VMX_VM_ENTRY_CONTROLS |		\
> +		VM_ENTRY_IA32E_MODE)

Align indentation (more obvious below).

> +#else
> +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> +		__KVM_REQ_VMX_VM_ENTRY_CONTROLS
> +#endif
> +#define KVM_OPT_VMX_VM_ENTRY_CONTROLS				\
> +	(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |			\
> +	VM_ENTRY_LOAD_IA32_PAT |				\
> +	VM_ENTRY_LOAD_IA32_EFER |				\
> +	VM_ENTRY_LOAD_BNDCFGS |					\
> +	VM_ENTRY_PT_CONCEAL_PIP |				\
> +	VM_ENTRY_LOAD_IA32_RTIT_CTL)

Align inside the paranthesis so that the control names all line up (goes for
everything in this file).

#define KVM_OPT_VMX_VM_ENTRY_CONTROLS				\
	(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |			\
	 VM_ENTRY_LOAD_IA32_PAT |				\
	 VM_ENTRY_LOAD_IA32_EFER |				\
	 VM_ENTRY_LOAD_BNDCFGS |				\
	 VM_ENTRY_PT_CONCEAL_PIP |				\
	 VM_ENTRY_LOAD_IA32_RTIT_CTL)

> +#define KVM_REQ_VMX_TERTIARY_VM_EXEC_CONTROL 0
> +#define KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL			\
> +	(TERTIARY_EXEC_IPI_VIRT)
> +
>  #define BUILD_CONTROLS_SHADOW(lname, uname, bits)				\
>  static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)	\
>  {										\
> @@ -485,10 +592,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)		\
>  }										\
>  static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\

I suspect these need to be __always_inline, otherwise various sanitizers might
cause these to be out of line and break the build due to @val not being a
compile-time constant.

>  {										\
> +	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
>  	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);		\
>  }										\
>  static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
>  {										\
> +	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
>  	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);		\
>  }
>  BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
> -- 
> 2.35.3
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config()
  2022-07-14  9:13 ` [PATCH v4 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-21 22:30   ` Sean Christopherson
  0 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:30 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() in
> nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering
> to vmx_exec_control().
> 
> No functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config
  2022-07-14  9:13 ` [PATCH v4 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
@ 2022-07-21 22:34   ` Sean Christopherson
  0 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:34 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() in
> nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
> use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and
> filter them out in vmx_vmexit_ctrl().
> 
> No functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 4 ++++
>  arch/x86/kvm/vmx/vmx.h | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d7170990f469..2fb89bdcbbd8 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4196,6 +4196,10 @@ static u32 vmx_vmexit_ctrl(void)
>  {
>  	u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>  
> +	/* Not used by KVM but supported for nesting. */

I think it's worth expanding the comment to clarify that "supported for nesting"
just means allowing them in vmcs12.  Most controls are fully emulated and so are
never set in vmcs02 even when they're turned on by L1.  Something like?

	/*
	 * Not used by KVM and never set in vmcs01 or vmcs02, but emulated for
	 * nested virtualization and thus allowed to be set in vmcs12.
	 */


> +	vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
> +			 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
> +
>  	if (vmx_pt_mode_is_system())
>  		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
>  				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 89eaab3495a6..e9c392398f1b 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -498,8 +498,11 @@ static inline u8 vmx_get_rvi(void)
>  #endif
>  #define KVM_OPT_VMX_VM_EXIT_CONTROLS				\
>  	      (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |		\
> +	      VM_EXIT_SAVE_IA32_PAT |				\
>  	      VM_EXIT_LOAD_IA32_PAT |				\
> +	      VM_EXIT_SAVE_IA32_EFER |				\
>  	      VM_EXIT_LOAD_IA32_EFER |				\
> +	      VM_EXIT_SAVE_VMX_PREEMPTION_TIMER |		\
>  	      VM_EXIT_CLEAR_BNDCFGS |				\
>  	      VM_EXIT_PT_CONCEAL_PIP |				\
>  	      VM_EXIT_CLEAR_IA32_RTIT_CTL)
> -- 
> 2.35.3
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 18/25] KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  2022-07-14  9:13 ` [PATCH v4 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
@ 2022-07-21 22:39   ` Sean Christopherson
  0 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:39 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() in
> nested VMX MSR setup, add the CPU based VM execution controls which KVM
> doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL
> and filter them out in vmx_exec_control().
> 
> No functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 6 ++++++
>  arch/x86/kvm/vmx/vmx.h | 6 +++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2fb89bdcbbd8..9771c771c8f5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4240,6 +4240,12 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>  {
>  	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
>  
> +	/* Not used by KVM but supported for nesting. */

And then for this one, clarify that these _are_ enabled in vmcs02.  It doesn't
really matter, I was just surprised by the "SAVE_PAT" in the previous patch because
for a second I thought we were leaking host state :-)

	/*
	 * Not used by KVM, but fully supported for nesting, i.e. are allowed in
	 * vmcs12 and propagated to vmcs02 when set in vmcs12.
	 */

> +	exec_control &= ~(CPU_BASED_RDTSC_EXITING |
> +			  CPU_BASED_USE_IO_BITMAPS |
> +			  CPU_BASED_MONITOR_TRAP_FLAG |
> +			  CPU_BASED_PAUSE_EXITING);
> +
>  	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
>  	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
>  			  CPU_BASED_NMI_WINDOW_EXITING);
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index e9c392398f1b..758f80c41beb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -539,9 +539,13 @@ static inline u8 vmx_get_rvi(void)
>  #endif
>  
>  #define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL			\
> -	(CPU_BASED_TPR_SHADOW |					\
> +	(CPU_BASED_RDTSC_EXITING |				\
> +	CPU_BASED_TPR_SHADOW |					\
> +	CPU_BASED_USE_IO_BITMAPS |				\
> +	CPU_BASED_MONITOR_TRAP_FLAG |				\
>  	CPU_BASED_USE_MSR_BITMAPS |				\
>  	CPU_BASED_NMI_WINDOW_EXITING |				\
> +	CPU_BASED_PAUSE_EXITING |				\
>  	CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |			\
>  	CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
>  
> -- 
> 2.35.3
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()
  2022-07-14  9:13 ` [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-21 22:56   ` Sean Christopherson
  2022-07-28 22:25     ` Paolo Bonzini
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 22:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() for setting
> up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling
> to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from
> hardware_setup(). While it seems reasonable to not expose
> LOAD_IA32_PERF_GLOBAL_CTRL controls to L1 hypervisor on buggy CPUs,
> such change would inevitably break live migration from older KVMs
> where the controls are exposed. Keep the status quo for know, L1 hypervisor

s/know/now

> itself is supposed to take care of the errata.

Except the errata are based on FMS and the FMS exposed to the L1 hypervisor may
not be the real FMS.

But that's moot, because they _should_ be fully emulated by KVM anyways; KVM
runs L2 with a MSR value modified by perf, not the raw MSR value requested by L1.

Of course KVM screws things up and fails to clear the flag in entry controls...
All exit controls are emulated so at least KVM gets those right.

Untested, but I believe KVM the fix is:

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d0e781c7ac72..76926147b672 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2357,7 +2357,8 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
         * we can avoid VMWrites during vmx_set_efer().
         */
        exec_control = __vm_entry_controls_get(vmcs01);
-       exec_control |= vmcs12->vm_entry_controls;
+       exec_control |= (vmcs12->vm_entry_controls &
+                        ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
        exec_control &= ~(VM_ENTRY_IA32E_MODE | VM_ENTRY_LOAD_IA32_EFER);
        if (cpu_has_load_ia32_efer()) {
                if (guest_efer & EFER_LMA)

> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 62 ++++++++++++++++++++++++++----------------
>  1 file changed, 38 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2dff5b94c535..e462e5b9c0a1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2416,6 +2416,31 @@ static bool cpu_has_sgx(void)
>  	return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
>  }
>  
> +/*
> + * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
> + * can't be used due to errata where VM Exit may incorrectly clear
> + * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
> + * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
> + */
> +static bool cpu_has_perf_global_ctrl_bug(void)
> +{
> +	if (boot_cpu_data.x86 == 0x6) {
> +		switch (boot_cpu_data.x86_model) {
> +		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
> +		case INTEL_FAM6_NEHALEM:	/* AAP115 */
> +		case INTEL_FAM6_WESTMERE:	/* AAT100 */
> +		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
> +		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
> +			return true;
> +		default:
> +			break;
> +		}
> +	}
> +
> +	return false;
> +}
> +
> +
>  static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
>  				      u32 msr, u32 *result)
>  {
> @@ -2572,30 +2597,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  		_vmexit_control &= ~x_ctrl;
>  	}
>  
> -	/*
> -	 * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
> -	 * can't be used due to an errata where VM Exit may incorrectly clear
> -	 * IA32_PERF_GLOBAL_CTRL[34:32].  Workaround the errata by using the
> -	 * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
> -	 */
> -	if (boot_cpu_data.x86 == 0x6) {
> -		switch (boot_cpu_data.x86_model) {
> -		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
> -		case INTEL_FAM6_NEHALEM:	/* AAP115 */
> -		case INTEL_FAM6_WESTMERE:	/* AAT100 */
> -		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
> -		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
> -			_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> -			_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> -			pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
> -					"does not work properly. Using workaround\n");
> -			break;
> -		default:
> -			break;
> -		}
> -	}
> -
> -
>  	rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
>  
>  	/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
> @@ -4184,6 +4185,10 @@ static u32 vmx_vmentry_ctrl(void)
>  			  VM_ENTRY_LOAD_IA32_EFER |
>  			  VM_ENTRY_IA32E_MODE);
>  
> +

Extra line.

> +	if (cpu_has_perf_global_ctrl_bug())
> +		vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> +
>  	return vmentry_ctrl;
>  }
>  
> @@ -4198,6 +4203,10 @@ static u32 vmx_vmexit_ctrl(void)
>  	if (vmx_pt_mode_is_system())
>  		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
>  				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
> +
> +	if (cpu_has_perf_global_ctrl_bug())
> +		vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> +
>  	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
>  	return vmexit_ctrl &
>  		~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
> @@ -8113,6 +8122,11 @@ static __init int hardware_setup(void)
>  	if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
>  		return -EIO;
>  
> +	if (cpu_has_perf_global_ctrl_bug()) {

Curly braces not needed.

> +		pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
> +			     "does not work properly. Using workaround\n");
> +	}
> +
>  	if (boot_cpu_has(X86_FEATURE_NX))
>  		kvm_enable_efer_bits(EFER_NX);
>  
> -- 
> 2.35.3
> 

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  2022-07-14  9:13 ` [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
@ 2022-07-21 23:06   ` Sean Christopherson
  2022-08-02 16:11     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-21 23:06 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> @@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	if (((vmx_msr_high >> 18) & 15) != 6)
>  		return -EIO;
>  
> +	rdmsrl(MSR_IA32_VMX_MISC, misc_msr);

Might make sense to sanitize fields that KVM doesn't use and that are not exposed
to L1.  Not sure it's worthwhile though as many of the bits fall into a grey area,
e.g. all the SMM stuff isn't technically used by KVM, but that's largely because
much of it just isn't relevant to virtualization.

I'm totally ok leaving it as-is, though maybe name it "unsanitized_misc" or so
to make that obvious?

>  	vmcs_conf->size = vmx_msr_high & 0x1fff;
>  	vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
>  
> @@ -2624,6 +2627,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
>  	vmcs_conf->vmexit_ctrl         = _vmexit_control;
>  	vmcs_conf->vmentry_ctrl        = _vmentry_control;
> +	vmcs_conf->misc	= misc_msr;
>  
>  	return 0;
>  }
> @@ -8241,11 +8245,9 @@ static __init int hardware_setup(void)
>  
>  	if (enable_preemption_timer) {
>  		u64 use_timer_freq = 5000ULL * 1000 * 1000;
> -		u64 vmx_msr;
>  
> -		rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
>  		cpu_preemption_timer_multi =
> -			vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
> +			vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
>  
>  		if (tsc_khz)
>  			use_timer_freq = (u64)tsc_khz * 1000;
> -- 
> 2.35.3
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags
  2022-07-21 21:43   ` Sean Christopherson
@ 2022-07-22 17:22     ` Paolo Bonzini
  2022-08-01  8:16       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 62+ messages in thread
From: Paolo Bonzini @ 2022-07-22 17:22 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: kvm, Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

On 7/21/22 23:43, Sean Christopherson wrote:
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index c284a605e453..ca91547034e4 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1282,7 +1282,7 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>          case HV_X64_MSR_SYNDBG_OPTIONS:
>          case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>                  return hv_vcpu->cpuid_cache.features_edx &
> -                       HV_FEATURE_DEBUG_MSRS_AVAILABLE;
> +                       HV_ACCESS_DEBUG_MSRS;
>          default:
>                  break;
>          }
> 

Yes, and this will need some kind of hack in QEMU to expose both CPUID 
bits.  Fortunately hv-syndbg shouldn't be in much use in the wild, so I 
think we can avoid quirks etc.

Paolo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-14  9:13 ` [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
  2022-07-21 22:28   ` Sean Christopherson
@ 2022-07-22 18:33   ` Sean Christopherson
  2022-07-22 21:04     ` Nathan Chancellor
  2022-07-28 16:27     ` Paolo Bonzini
  1 sibling, 2 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-22 18:33 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 286c88e285ea..89eaab3495a6 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
>  	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
>  }
>  
> +#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS				\
> +	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
> +#ifdef CONFIG_X86_64
> +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> +		(__KVM_REQ_VMX_VM_ENTRY_CONTROLS |		\
> +		VM_ENTRY_IA32E_MODE)

This breaks 32-bit builds, but at least we know the assert works!

vmx_set_efer() toggles VM_ENTRY_IA32E_MODE without a CONFIG_X86_64 guard.  That
should be easy enough to fix since KVM should never allow EFER_LMA.  Compile 
tested patch at the bottom.

More problematic is that clang-13 doesn't like the new asserts, and even worse gives
a very cryptic error.  I don't have bandwidth to look into this at the moment, and
probably won't next week either.

ERROR: modpost: "__compiletime_assert_533" [arch/x86/kvm/kvm-intel.ko] undefined!
ERROR: modpost: "__compiletime_assert_531" [arch/x86/kvm/kvm-intel.ko] undefined!
ERROR: modpost: "__compiletime_assert_532" [arch/x86/kvm/kvm-intel.ko] undefined!
ERROR: modpost: "__compiletime_assert_530" [arch/x86/kvm/kvm-intel.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:128: modules-only.symvers] Error 1
make[1]: *** [Makefile:1753: modules] Error 2
make[1]: *** Waiting for unfinished jobs....


> +#else
> +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> +		__KVM_REQ_VMX_VM_ENTRY_CONTROLS
> +#endif

EFER.LMA patch, compile tested only.

---
From: Sean Christopherson <seanjc@google.com>
Date: Fri, 22 Jul 2022 18:26:21 +0000
Subject: [PATCH] KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit
 kernels/KVM

Don't toggle VM_ENTRY_IA32E_MODE in 32-bit kernels/KVM and instead bug
the VM if KVM attempts to run the guest with EFER.LMA=1.  KVM doesn't
support running 64-bit guests with 32-bit hosts.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bff97babf381..8623607e596d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2894,10 +2894,15 @@ int vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 		return 0;

 	vcpu->arch.efer = efer;
+#ifdef CONFIG_X86_64
 	if (efer & EFER_LMA)
 		vm_entry_controls_setbit(vmx, VM_ENTRY_IA32E_MODE);
 	else
 		vm_entry_controls_clearbit(vmx, VM_ENTRY_IA32E_MODE);
+#else
+	if (KVM_BUG_ON(efer & EFER_LMA, vcpu->kvm))
+		return 1;
+#endif

 	vmx_setup_uret_msrs(vmx);
 	return 0;

base-commit: e22e2665637151a321433b2bb705f5c3b8da40bc
--


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-22 18:33   ` Sean Christopherson
@ 2022-07-22 21:04     ` Nathan Chancellor
  2022-07-22 21:38       ` Sean Christopherson
  2022-07-28 16:27     ` Paolo Bonzini
  1 sibling, 1 reply; 62+ messages in thread
From: Nathan Chancellor @ 2022-07-22 21:04 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Anirudh Rayabharam,
	Wanpeng Li, Jim Mattson, Maxim Levitsky, linux-hyperv,
	linux-kernel, llvm

On Fri, Jul 22, 2022 at 06:33:27PM +0000, Sean Christopherson wrote:
> On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 286c88e285ea..89eaab3495a6 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
> >  	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
> >  }
> >  
> > +#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS				\
> > +	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
> > +#ifdef CONFIG_X86_64
> > +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> > +		(__KVM_REQ_VMX_VM_ENTRY_CONTROLS |		\
> > +		VM_ENTRY_IA32E_MODE)
> 
> This breaks 32-bit builds, but at least we know the assert works!
> 
> vmx_set_efer() toggles VM_ENTRY_IA32E_MODE without a CONFIG_X86_64 guard.  That
> should be easy enough to fix since KVM should never allow EFER_LMA.  Compile 
> tested patch at the bottom.
> 
> More problematic is that clang-13 doesn't like the new asserts, and even worse gives
> a very cryptic error.  I don't have bandwidth to look into this at the moment, and
> probably won't next week either.
> 
> ERROR: modpost: "__compiletime_assert_533" [arch/x86/kvm/kvm-intel.ko] undefined!
> ERROR: modpost: "__compiletime_assert_531" [arch/x86/kvm/kvm-intel.ko] undefined!
> ERROR: modpost: "__compiletime_assert_532" [arch/x86/kvm/kvm-intel.ko] undefined!
> ERROR: modpost: "__compiletime_assert_530" [arch/x86/kvm/kvm-intel.ko] undefined!
> make[2]: *** [scripts/Makefile.modpost:128: modules-only.symvers] Error 1
> make[1]: *** [Makefile:1753: modules] Error 2
> make[1]: *** Waiting for unfinished jobs....

clang-14 added support for the error and warning attributes, which makes
the BUILD_BUG_ON failures look like GCC. With allmodconfig, this
becomes:

In file included from ../arch/x86/kvm/vmx/vmx.c:61:
In file included from ../arch/x86/kvm/vmx/nested.h:7:
../arch/x86/kvm/vmx/vmx.h:610:1: error: call to __compiletime_assert_1135 declared with 'error' attribute: BUILD_BUG_ON failed: !(val & (KVM_REQ_VMX_VM_ENTRY_CONTROLS | KVM_OPT_VMX_VM_ENTRY_CONTROLS))
BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
^
../arch/x86/kvm/vmx/vmx.h:602:2: note: expanded from macro 'BUILD_CONTROLS_SHADOW'
        BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));     \
        ^
../include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON'
        BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
        ^
../include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                    ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
../include/linux/compiler_types.h:340:2: note: expanded from macro '_compiletime_assert'
        __compiletime_assert(condition, msg, prefix, suffix)
        ^
../include/linux/compiler_types.h:333:4: note: expanded from macro '__compiletime_assert'
                        prefix ## suffix();                             \
                        ^
<scratch space>:259:1: note: expanded from here
__compiletime_assert_1135
^
In file included from ../arch/x86/kvm/vmx/vmx.c:61:
In file included from ../arch/x86/kvm/vmx/nested.h:7:
../arch/x86/kvm/vmx/vmx.h:610:1: error: call to __compiletime_assert_1136 declared with 'error' attribute: BUILD_BUG_ON failed: !(val & (KVM_REQ_VMX_VM_ENTRY_CONTROLS | KVM_OPT_VMX_VM_ENTRY_CONTROLS))
../arch/x86/kvm/vmx/vmx.h:607:2: note: expanded from macro 'BUILD_CONTROLS_SHADOW'
        BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));     \
        ^
../include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON'
        BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
        ^
../include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                    ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
../include/linux/compiler_types.h:340:2: note: expanded from macro '_compiletime_assert'
        __compiletime_assert(condition, msg, prefix, suffix)
        ^
../include/linux/compiler_types.h:333:4: note: expanded from macro '__compiletime_assert'
                        prefix ## suffix();                             \
                        ^
<scratch space>:10:1: note: expanded from here
__compiletime_assert_1136
^
In file included from ../arch/x86/kvm/vmx/vmx.c:61:
In file included from ../arch/x86/kvm/vmx/nested.h:7:
../arch/x86/kvm/vmx/vmx.h:611:1: error: call to __compiletime_assert_1137 declared with 'error' attribute: BUILD_BUG_ON failed: !(val & (KVM_REQ_VMX_VM_EXIT_CONTROLS | KVM_OPT_VMX_VM_EXIT_CONTROLS))
BUILD_CONTROLS_SHADOW(vm_exit, VM_EXIT_CONTROLS, 32)
^
../arch/x86/kvm/vmx/vmx.h:602:2: note: expanded from macro 'BUILD_CONTROLS_SHADOW'
        BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));     \
        ^
../include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON'
        BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
        ^
../include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                    ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
../include/linux/compiler_types.h:340:2: note: expanded from macro '_compiletime_assert'
        __compiletime_assert(condition, msg, prefix, suffix)
        ^
../include/linux/compiler_types.h:333:4: note: expanded from macro '__compiletime_assert'
                        prefix ## suffix();                             \
                        ^
<scratch space>:30:1: note: expanded from here
__compiletime_assert_1137
^
In file included from ../arch/x86/kvm/vmx/vmx.c:61:
In file included from ../arch/x86/kvm/vmx/nested.h:7:
../arch/x86/kvm/vmx/vmx.h:611:1: error: call to __compiletime_assert_1138 declared with 'error' attribute: BUILD_BUG_ON failed: !(val & (KVM_REQ_VMX_VM_EXIT_CONTROLS | KVM_OPT_VMX_VM_EXIT_CONTROLS))
../arch/x86/kvm/vmx/vmx.h:607:2: note: expanded from macro 'BUILD_CONTROLS_SHADOW'
        BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));     \
        ^
../include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON'
        BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
        ^
../include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                    ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
../include/linux/compiler_types.h:340:2: note: expanded from macro '_compiletime_assert'
        __compiletime_assert(condition, msg, prefix, suffix)
        ^
../include/linux/compiler_types.h:333:4: note: expanded from macro '__compiletime_assert'
                        prefix ## suffix();                             \
                        ^
<scratch space>:40:1: note: expanded from here
__compiletime_assert_1138
^
4 errors generated.

As you mentioned in the other comment on this patch, the 'inline'
keyword should be '__always_inline' in the BUILD_CONTROLS_SHADOW macro
and a couple of other functions need it for BUILD_BUG_ON to see the
value all the way through the call chain. The following diff resolves
those errors for me, hopefully it is useful!

Cheers,
Nathan

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4ce7ed835e06..b97ed63ece56 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -790,7 +790,7 @@ static bool msr_write_intercepted(struct vcpu_vmx *vmx, u32 msr)
 					 MSR_IA32_SPEC_CTRL);
 }
 
-static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
+static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 		unsigned long entry, unsigned long exit)
 {
 	vm_entry_controls_clearbit(vmx, entry);
@@ -848,7 +848,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
 }
 
-static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
+static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 		unsigned long entry, unsigned long exit,
 		unsigned long guest_val_vmcs, unsigned long host_val_vmcs,
 		u64 guest_val, u64 host_val)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 758f80c41beb..acefa5b5e1b9 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -597,12 +597,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)		\
 {										\
 	return __##lname##_controls_get(vmx->loaded_vmcs);			\
 }										\
-static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
+static __always_inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
 	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
 	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);		\
 }										\
-static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
+static __always_inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
 	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
 	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);		\

> > +#else
> > +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> > +		__KVM_REQ_VMX_VM_ENTRY_CONTROLS
> > +#endif
> 
> EFER.LMA patch, compile tested only.
> 
> ---
> From: Sean Christopherson <seanjc@google.com>
> Date: Fri, 22 Jul 2022 18:26:21 +0000
> Subject: [PATCH] KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit
>  kernels/KVM
> 
> Don't toggle VM_ENTRY_IA32E_MODE in 32-bit kernels/KVM and instead bug
> the VM if KVM attempts to run the guest with EFER.LMA=1.  KVM doesn't
> support running 64-bit guests with 32-bit hosts.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index bff97babf381..8623607e596d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2894,10 +2894,15 @@ int vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
>  		return 0;
> 
>  	vcpu->arch.efer = efer;
> +#ifdef CONFIG_X86_64
>  	if (efer & EFER_LMA)
>  		vm_entry_controls_setbit(vmx, VM_ENTRY_IA32E_MODE);
>  	else
>  		vm_entry_controls_clearbit(vmx, VM_ENTRY_IA32E_MODE);
> +#else
> +	if (KVM_BUG_ON(efer & EFER_LMA, vcpu->kvm))
> +		return 1;
> +#endif
> 
>  	vmx_setup_uret_msrs(vmx);
>  	return 0;
> 
> base-commit: e22e2665637151a321433b2bb705f5c3b8da40bc
> --
> 

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-22 21:04     ` Nathan Chancellor
@ 2022-07-22 21:38       ` Sean Christopherson
  2022-07-23  1:06         ` Nathan Chancellor
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-22 21:38 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Anirudh Rayabharam,
	Wanpeng Li, Jim Mattson, Maxim Levitsky, linux-hyperv,
	linux-kernel, llvm

On Fri, Jul 22, 2022, Nathan Chancellor wrote:
> On Fri, Jul 22, 2022 at 06:33:27PM +0000, Sean Christopherson wrote:
> > On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> > > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > > index 286c88e285ea..89eaab3495a6 100644
> > > --- a/arch/x86/kvm/vmx/vmx.h
> > > +++ b/arch/x86/kvm/vmx/vmx.h
> > > @@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
> > >  	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
> > >  }
> > >  
> > > +#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS				\
> > > +	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
> > > +#ifdef CONFIG_X86_64
> > > +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> > > +		(__KVM_REQ_VMX_VM_ENTRY_CONTROLS |		\
> > > +		VM_ENTRY_IA32E_MODE)
> > 
> > This breaks 32-bit builds, but at least we know the assert works!
> > 
> > vmx_set_efer() toggles VM_ENTRY_IA32E_MODE without a CONFIG_X86_64 guard.  That
> > should be easy enough to fix since KVM should never allow EFER_LMA.  Compile 
> > tested patch at the bottom.
> > 
> > More problematic is that clang-13 doesn't like the new asserts, and even worse gives
> > a very cryptic error.  I don't have bandwidth to look into this at the moment, and
> > probably won't next week either.
> > 
> > ERROR: modpost: "__compiletime_assert_533" [arch/x86/kvm/kvm-intel.ko] undefined!
> > ERROR: modpost: "__compiletime_assert_531" [arch/x86/kvm/kvm-intel.ko] undefined!
> > ERROR: modpost: "__compiletime_assert_532" [arch/x86/kvm/kvm-intel.ko] undefined!
> > ERROR: modpost: "__compiletime_assert_530" [arch/x86/kvm/kvm-intel.ko] undefined!
> > make[2]: *** [scripts/Makefile.modpost:128: modules-only.symvers] Error 1
> > make[1]: *** [Makefile:1753: modules] Error 2
> > make[1]: *** Waiting for unfinished jobs....
> 
> clang-14 added support for the error and warning attributes, which makes
> the BUILD_BUG_ON failures look like GCC. With allmodconfig, this
> becomes:

...

> As you mentioned in the other comment on this patch, the 'inline'
> keyword should be '__always_inline' in the BUILD_CONTROLS_SHADOW macro
> and a couple of other functions need it for BUILD_BUG_ON to see the
> value all the way through the call chain. The following diff resolves
> those errors for me, hopefully it is useful!

Thanks a ton!  Y'all are like a benevolent Beetlejuice, one needs only to mention
"clang" and you show up and solve the problem :-)

> Cheers,
> Nathan
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 4ce7ed835e06..b97ed63ece56 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -790,7 +790,7 @@ static bool msr_write_intercepted(struct vcpu_vmx *vmx, u32 msr)
>  					 MSR_IA32_SPEC_CTRL);
>  }
>  
> -static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> +static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
>  		unsigned long entry, unsigned long exit)
>  {
>  	vm_entry_controls_clearbit(vmx, entry);
> @@ -848,7 +848,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
>  	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
>  }
>  
> -static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> +static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
>  		unsigned long entry, unsigned long exit,
>  		unsigned long guest_val_vmcs, unsigned long host_val_vmcs,
>  		u64 guest_val, u64 host_val)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 758f80c41beb..acefa5b5e1b9 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -597,12 +597,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)		\
>  {										\
>  	return __##lname##_controls_get(vmx->loaded_vmcs);			\
>  }										\
> -static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
> +static __always_inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
>  {										\
>  	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
>  	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);		\
>  }										\
> -static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
> +static __always_inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
>  {										\
>  	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
>  	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);		\
> 
> > > +#else
> > > +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> > > +		__KVM_REQ_VMX_VM_ENTRY_CONTROLS
> > > +#endif

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-22 21:38       ` Sean Christopherson
@ 2022-07-23  1:06         ` Nathan Chancellor
  0 siblings, 0 replies; 62+ messages in thread
From: Nathan Chancellor @ 2022-07-23  1:06 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, kvm, Paolo Bonzini, Anirudh Rayabharam,
	Wanpeng Li, Jim Mattson, Maxim Levitsky, linux-hyperv,
	linux-kernel, llvm

On Fri, Jul 22, 2022 at 09:38:47PM +0000, Sean Christopherson wrote:
> On Fri, Jul 22, 2022, Nathan Chancellor wrote:
> > On Fri, Jul 22, 2022 at 06:33:27PM +0000, Sean Christopherson wrote:
> > > On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> > > > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > > > index 286c88e285ea..89eaab3495a6 100644
> > > > --- a/arch/x86/kvm/vmx/vmx.h
> > > > +++ b/arch/x86/kvm/vmx/vmx.h
> > > > @@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
> > > >  	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
> > > >  }
> > > >  
> > > > +#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS				\
> > > > +	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
> > > > +#ifdef CONFIG_X86_64
> > > > +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> > > > +		(__KVM_REQ_VMX_VM_ENTRY_CONTROLS |		\
> > > > +		VM_ENTRY_IA32E_MODE)
> > > 
> > > This breaks 32-bit builds, but at least we know the assert works!
> > > 
> > > vmx_set_efer() toggles VM_ENTRY_IA32E_MODE without a CONFIG_X86_64 guard.  That
> > > should be easy enough to fix since KVM should never allow EFER_LMA.  Compile 
> > > tested patch at the bottom.
> > > 
> > > More problematic is that clang-13 doesn't like the new asserts, and even worse gives
> > > a very cryptic error.  I don't have bandwidth to look into this at the moment, and
> > > probably won't next week either.
> > > 
> > > ERROR: modpost: "__compiletime_assert_533" [arch/x86/kvm/kvm-intel.ko] undefined!
> > > ERROR: modpost: "__compiletime_assert_531" [arch/x86/kvm/kvm-intel.ko] undefined!
> > > ERROR: modpost: "__compiletime_assert_532" [arch/x86/kvm/kvm-intel.ko] undefined!
> > > ERROR: modpost: "__compiletime_assert_530" [arch/x86/kvm/kvm-intel.ko] undefined!
> > > make[2]: *** [scripts/Makefile.modpost:128: modules-only.symvers] Error 1
> > > make[1]: *** [Makefile:1753: modules] Error 2
> > > make[1]: *** Waiting for unfinished jobs....
> > 
> > clang-14 added support for the error and warning attributes, which makes
> > the BUILD_BUG_ON failures look like GCC. With allmodconfig, this
> > becomes:
> 
> ...
> 
> > As you mentioned in the other comment on this patch, the 'inline'
> > keyword should be '__always_inline' in the BUILD_CONTROLS_SHADOW macro
> > and a couple of other functions need it for BUILD_BUG_ON to see the
> > value all the way through the call chain. The following diff resolves
> > those errors for me, hopefully it is useful!
> 
> Thanks a ton!  Y'all are like a benevolent Beetlejuice, one needs only to mention
> "clang" and you show up and solve the problem :-)

Praise be to the mightly lei and its filters :)

FWIW, if you ever have a question about clang's behavior or any errors,
please feel free to cc llvm@lists.linux.dev, we're always happy to look
into things so that clang stays well supported upstream (and thank you
for verifying KVM changes with it!).

Cheers,
Nathan

> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 4ce7ed835e06..b97ed63ece56 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -790,7 +790,7 @@ static bool msr_write_intercepted(struct vcpu_vmx *vmx, u32 msr)
> >  					 MSR_IA32_SPEC_CTRL);
> >  }
> >  
> > -static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> > +static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> >  		unsigned long entry, unsigned long exit)
> >  {
> >  	vm_entry_controls_clearbit(vmx, entry);
> > @@ -848,7 +848,7 @@ static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
> >  	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
> >  }
> >  
> > -static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> > +static __always_inline void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
> >  		unsigned long entry, unsigned long exit,
> >  		unsigned long guest_val_vmcs, unsigned long host_val_vmcs,
> >  		u64 guest_val, u64 host_val)
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 758f80c41beb..acefa5b5e1b9 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -597,12 +597,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)		\
> >  {										\
> >  	return __##lname##_controls_get(vmx->loaded_vmcs);			\
> >  }										\
> > -static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
> > +static __always_inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
> >  {										\
> >  	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
> >  	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);		\
> >  }										\
> > -static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
> > +static __always_inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
> >  {										\
> >  	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
> >  	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);		\
> > 
> > > > +#else
> > > > +	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
> > > > +		__KVM_REQ_VMX_VM_ENTRY_CONTROLS
> > > > +#endif

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-21 21:58   ` Sean Christopherson
@ 2022-07-25 17:09     ` Paolo Bonzini
  2022-07-25 18:18       ` Sean Christopherson
  0 siblings, 1 reply; 62+ messages in thread
From: Paolo Bonzini @ 2022-07-25 17:09 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: kvm, Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

On 7/21/22 23:58, Sean Christopherson wrote:
> 
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index b5cfbf7d487b..7b348a03e096 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -355,11 +355,6 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
>          return 0;
>   }
> 
> -enum evmcs_v1_revision {
> -       EVMCSv1_2016,
> -       EVMCSv1_2022,
> -};
> -
>   enum evmcs_unsupported_ctrl_type {
>          EVMCS_EXIT_CTLS,
>          EVMCS_ENTRY_CTLS,
> @@ -372,29 +367,29 @@ static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
>                                        enum evmcs_unsupported_ctrl_type ctrl_type)
>   {
>          struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> -       enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
> +       bool has_evmcs_2022_features;
> 
>          if (!hv_vcpu)
>                  return 0;
> 
> -       if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
> -               evmcs_rev = EVMCSv1_2022;
> +       has_evmcs_2022_features = hv_vcpu->cpuid_cache.nested_ebx &
> +                                 HV_X64_NESTED_EVMCS1_2022_UPDATE;
> 
>          switch (ctrl_type) {
>          case EVMCS_EXIT_CTLS:
> -               if (evmcs_rev == EVMCSv1_2016)
> +               if (!has_evmcs_2022_features)
>                          return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
>                                  VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
>                  else
>                          return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
>          case EVMCS_ENTRY_CTLS:
> -               if (evmcs_rev == EVMCSv1_2016)
> +               if (!has_evmcs_2022_features)
>                          return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
>                                  VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
>                  else
>                          return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
>          case EVMCS_2NDEXEC:
> -               if (evmcs_rev == EVMCSv1_2016)
> +               if (!has_evmcs_2022_features)
>                          return EVMCS1_UNSUPPORTED_2NDEXEC |
>                                  SECONDARY_EXEC_TSC_SCALING;
>                  else
> 

I kind of like the idea of having a two-dimensional array based on the 
enums instead of switch statements, so for now I'll keep Vitaly's enums.

Paolo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-25 17:09     ` Paolo Bonzini
@ 2022-07-25 18:18       ` Sean Christopherson
  2022-07-28 21:52         ` Paolo Bonzini
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-25 18:18 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm, Anirudh Rayabharam, Wanpeng Li,
	Jim Mattson, Maxim Levitsky, linux-hyperv, linux-kernel

On Mon, Jul 25, 2022, Paolo Bonzini wrote:
> On 7/21/22 23:58, Sean Christopherson wrote:
> > 
> > diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> > index b5cfbf7d487b..7b348a03e096 100644
> > --- a/arch/x86/kvm/vmx/evmcs.c
> > +++ b/arch/x86/kvm/vmx/evmcs.c
> > @@ -355,11 +355,6 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
> >          return 0;
> >   }
> > 
> > -enum evmcs_v1_revision {
> > -       EVMCSv1_2016,
> > -       EVMCSv1_2022,
> > -};
> > -
> >   enum evmcs_unsupported_ctrl_type {
> >          EVMCS_EXIT_CTLS,
> >          EVMCS_ENTRY_CTLS,
> > @@ -372,29 +367,29 @@ static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
> >                                        enum evmcs_unsupported_ctrl_type ctrl_type)
> >   {
> >          struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> > -       enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
> > +       bool has_evmcs_2022_features;
> > 
> >          if (!hv_vcpu)
> >                  return 0;
> > 
> > -       if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
> > -               evmcs_rev = EVMCSv1_2022;
> > +       has_evmcs_2022_features = hv_vcpu->cpuid_cache.nested_ebx &
> > +                                 HV_X64_NESTED_EVMCS1_2022_UPDATE;
> > 
> >          switch (ctrl_type) {
> >          case EVMCS_EXIT_CTLS:
> > -               if (evmcs_rev == EVMCSv1_2016)
> > +               if (!has_evmcs_2022_features)
> >                          return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
> >                                  VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> >                  else
> >                          return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> >          case EVMCS_ENTRY_CTLS:
> > -               if (evmcs_rev == EVMCSv1_2016)
> > +               if (!has_evmcs_2022_features)
> >                          return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
> >                                  VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> >                  else
> >                          return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> >          case EVMCS_2NDEXEC:
> > -               if (evmcs_rev == EVMCSv1_2016)
> > +               if (!has_evmcs_2022_features)
> >                          return EVMCS1_UNSUPPORTED_2NDEXEC |
> >                                  SECONDARY_EXEC_TSC_SCALING;
> >                  else
> > 
> 
> I kind of like the idea of having a two-dimensional array based on the enums
> instead of switch statements, so for now I'll keep Vitaly's enums.

I don't have a strong opinion on using a 2d array, but unless I'm missing something,
that's nowhere to be found in this patch.  IMO, having the enums without them
providing any unique value is silly and obfuscates the code.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-22 18:33   ` Sean Christopherson
  2022-07-22 21:04     ` Nathan Chancellor
@ 2022-07-28 16:27     ` Paolo Bonzini
  1 sibling, 0 replies; 62+ messages in thread
From: Paolo Bonzini @ 2022-07-28 16:27 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: kvm, Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

On 7/22/22 20:33, Sean Christopherson wrote:
> ERROR: modpost: "__compiletime_assert_533" [arch/x86/kvm/kvm-intel.ko] undefined!
> ERROR: modpost: "__compiletime_assert_531" [arch/x86/kvm/kvm-intel.ko] undefined!
> ERROR: modpost: "__compiletime_assert_532" [arch/x86/kvm/kvm-intel.ko] undefined!
> ERROR: modpost: "__compiletime_assert_530" [arch/x86/kvm/kvm-intel.ko] undefined!
> make[2]: *** [scripts/Makefile.modpost:128: modules-only.symvers] Error 1
> make[1]: *** [Makefile:1753: modules] Error 2
> make[1]: *** Waiting for unfinished jobs....

I think it comes from

static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
                 unsigned long entry, unsigned long exit,
                 unsigned long guest_val_vmcs, unsigned long host_val_vmcs,
                 u64 guest_val, u64 host_val)
{
         vmcs_write64(guest_val_vmcs, guest_val);
         if (host_val_vmcs != HOST_IA32_EFER)
                 vmcs_write64(host_val_vmcs, host_val);
         vm_entry_controls_setbit(vmx, entry);
         vm_exit_controls_setbit(vmx, exit);
}


and it can be fixed just with __always_inline.

Paolo

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-25 18:18       ` Sean Christopherson
@ 2022-07-28 21:52         ` Paolo Bonzini
  2022-07-28 22:13           ` Sean Christopherson
  0 siblings, 1 reply; 62+ messages in thread
From: Paolo Bonzini @ 2022-07-28 21:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, kvm, Anirudh Rayabharam, Wanpeng Li,
	Jim Mattson, Maxim Levitsky, linux-hyperv, linux-kernel

On 7/25/22 20:18, Sean Christopherson wrote:
>> I kind of like the idea of having a two-dimensional array based on the enums
>> instead of switch statements, so for now I'll keep Vitaly's enums.
> I don't have a strong opinion on using a 2d array, but unless I'm missing something,
> that's nowhere to be found in this patch.  IMO, having the enums without them
> providing any unique value is silly and obfuscates the code.

Yeah, like this:

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index d8da4026c93d..8055128d8638 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -342,9 +342,10 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
  	return 0;
  }
  
-enum evmcs_v1_revision {
+enum evmcs_revision {
  	EVMCSv1_2016,
  	EVMCSv1_2022,
+	EVMCS_REVISION_MAX,
  };
  
  enum evmcs_unsupported_ctrl_type {
@@ -353,13 +354,37 @@ enum evmcs_unsupported_ctrl_type {
  	EVMCS_2NDEXEC,
  	EVMCS_PINCTRL,
  	EVMCS_VMFUNC,
+	EVMCS_CTRL_MAX,
+};
+
+static u32 evmcs_unsupported_ctls[EVMCS_CTRL_MAX][EVMCS_REVISION_MAX] = {
+	[EVMCS_EXIT_CTLS] = {
+		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
+		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL,
+	},
+	[EVMCS_ENTRY_CTLS] = {
+		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_VMENTRY_CTRL | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
+		[EVMCSv1_2022] =  EVMCS1_UNSUPPORTED_VMENTRY_CTRL,
+	},
+	[EVMCS_2NDEXEC] = {
+		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_2NDEXEC | SECONDARY_EXEC_TSC_SCALING,
+		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_2NDEXEC,
+	},
+	[EVMCS_PINCTRL] = {
+		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_PINCTRL,
+		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_PINCTRL,
+	},
+	[EVMCS_VMFUNC] = {
+		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_VMFUNC,
+		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_VMFUNC,
+	},
  };
  
  static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
  				      enum evmcs_unsupported_ctrl_type ctrl_type)
  {
  	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
-	enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
+	enum evmcs_revision evmcs_rev = EVMCSv1_2016;
  
  	if (!hv_vcpu)
  		return 0;
@@ -367,32 +392,7 @@ static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
  	if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
  		evmcs_rev = EVMCSv1_2022;
  
-	switch (ctrl_type) {
-	case EVMCS_EXIT_CTLS:
-		if (evmcs_rev == EVMCSv1_2016)
-			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
-				VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-		else
-			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
-	case EVMCS_ENTRY_CTLS:
-		if (evmcs_rev == EVMCSv1_2016)
-			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
-				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-		else
-			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
-	case EVMCS_2NDEXEC:
-		if (evmcs_rev == EVMCSv1_2016)
-			return EVMCS1_UNSUPPORTED_2NDEXEC |
-				SECONDARY_EXEC_TSC_SCALING;
-		else
-			return EVMCS1_UNSUPPORTED_2NDEXEC;
-	case EVMCS_PINCTRL:
-		return EVMCS1_UNSUPPORTED_PINCTRL;
-	case EVMCS_VMFUNC:
-		return EVMCS1_UNSUPPORTED_VMFUNC;
-	}
-
-	return 0;
+	return evmcs_unsupported_ctls[ctrl_type][evmcs_rev];
  }
  
  void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)

Paolo

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-28 21:52         ` Paolo Bonzini
@ 2022-07-28 22:13           ` Sean Christopherson
  2022-07-28 22:24             ` Paolo Bonzini
  0 siblings, 1 reply; 62+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm, Anirudh Rayabharam, Wanpeng Li,
	Jim Mattson, Maxim Levitsky, linux-hyperv, linux-kernel

On Thu, Jul 28, 2022, Paolo Bonzini wrote:
> On 7/25/22 20:18, Sean Christopherson wrote:
> > > I kind of like the idea of having a two-dimensional array based on the enums
> > > instead of switch statements, so for now I'll keep Vitaly's enums.
> > I don't have a strong opinion on using a 2d array, but unless I'm missing something,
> > that's nowhere to be found in this patch.  IMO, having the enums without them
> > providing any unique value is silly and obfuscates the code.
> 
> Yeah, like this:
> 
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index d8da4026c93d..8055128d8638 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -342,9 +342,10 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
> -enum evmcs_v1_revision {
> +enum evmcs_revision {
>  	EVMCSv1_2016,
>  	EVMCSv1_2022,
> +	EVMCS_REVISION_MAX,
>  };
>  enum evmcs_unsupported_ctrl_type {
> @@ -353,13 +354,37 @@ enum evmcs_unsupported_ctrl_type {
>  	EVMCS_2NDEXEC,
>  	EVMCS_PINCTRL,
>  	EVMCS_VMFUNC,
> +	EVMCS_CTRL_MAX,
> +};
> +
> +static u32 evmcs_unsupported_ctls[EVMCS_CTRL_MAX][EVMCS_REVISION_MAX] = {

Can this be const?

> +	[EVMCS_EXIT_CTLS] = {
> +		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL | VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
> +		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_VMEXIT_CTRL,
> +	},
> +	[EVMCS_ENTRY_CTLS] = {
> +		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_VMENTRY_CTRL | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
> +		[EVMCSv1_2022] =  EVMCS1_UNSUPPORTED_VMENTRY_CTRL,
> +	},
> +	[EVMCS_2NDEXEC] = {
> +		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_2NDEXEC | SECONDARY_EXEC_TSC_SCALING,
> +		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_2NDEXEC,
> +	},
> +	[EVMCS_PINCTRL] = {
> +		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_PINCTRL,
> +		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_PINCTRL,
> +	},
> +	[EVMCS_VMFUNC] = {
> +		[EVMCSv1_2016] = EVMCS1_UNSUPPORTED_VMFUNC,
> +		[EVMCSv1_2022] = EVMCS1_UNSUPPORTED_VMFUNC,
> +	},
>  };

...

> +	return evmcs_unsupported_ctls[ctrl_type][evmcs_rev];
>  }

The only flaw in this is if KVM gets handed a CPUID model that enumerates support
for 2025 (or whenever the next update comes) but not 2022.  Hmm, though if Microsoft
defines each new "version" as a full superset, then even that theoretical bug goes
away.  I'm happy to be optimistic for once and give this a shot.  I definitely like
that it makes it easier to see the deltas between versions.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-28 22:13           ` Sean Christopherson
@ 2022-07-28 22:24             ` Paolo Bonzini
  2022-07-28 22:35               ` Sean Christopherson
                                 ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Paolo Bonzini @ 2022-07-28 22:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, kvm, Anirudh Rayabharam, Wanpeng Li,
	Jim Mattson, Maxim Levitsky, linux-hyperv, linux-kernel

On 7/29/22 00:13, Sean Christopherson wrote:
> The only flaw in this is if KVM gets handed a CPUID model that enumerates support
> for 2025 (or whenever the next update comes) but not 2022.  Hmm, though if Microsoft
> defines each new "version" as a full superset, then even that theoretical bug goes
> away.  I'm happy to be optimistic for once and give this a shot.  I definitely like
> that it makes it easier to see the deltas between versions.

Okay, I have queued the series but I still haven't gone through all the 
comments.  So this will _not_ be in the 5.21 pull request.

The first patch also needs a bit more thought to figure out the impact 
on userspace and whether we can consider syndbg niche enough to not care.

Paolo


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()
  2022-07-21 22:56   ` Sean Christopherson
@ 2022-07-28 22:25     ` Paolo Bonzini
  2022-07-28 22:34       ` Sean Christopherson
  0 siblings, 1 reply; 62+ messages in thread
From: Paolo Bonzini @ 2022-07-28 22:25 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: kvm, Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

On 7/22/22 00:56, Sean Christopherson wrote:
> Except the errata are based on FMS and the FMS exposed to the L1 hypervisor may
> not be the real FMS.
> 
> But that's moot, because they_should_  be fully emulated by KVM anyways; KVM
> runs L2 with a MSR value modified by perf, not the raw MSR value requested by L1.
> 
> Of course KVM screws things up and fails to clear the flag in entry controls...
> All exit controls are emulated so at least KVM gets those right.

Can you send this as a separate patch?

Paolo

> Untested, but I believe KVM the fix is:
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index d0e781c7ac72..76926147b672 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2357,7 +2357,8 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
>           * we can avoid VMWrites during vmx_set_efer().
>           */
>          exec_control = __vm_entry_controls_get(vmcs01);
> -       exec_control |= vmcs12->vm_entry_controls;
> +       exec_control |= (vmcs12->vm_entry_controls &
> +                        ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
>          exec_control &= ~(VM_ENTRY_IA32E_MODE | VM_ENTRY_LOAD_IA32_EFER);
>          if (cpu_has_load_ia32_efer()) {
>                  if (guest_efer & EFER_LMA)


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()
  2022-07-28 22:25     ` Paolo Bonzini
@ 2022-07-28 22:34       ` Sean Christopherson
  0 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm, Anirudh Rayabharam, Wanpeng Li,
	Jim Mattson, Maxim Levitsky, linux-hyperv, linux-kernel

On Fri, Jul 29, 2022, Paolo Bonzini wrote:
> On 7/22/22 00:56, Sean Christopherson wrote:
> > Except the errata are based on FMS and the FMS exposed to the L1 hypervisor may
> > not be the real FMS.
> > 
> > But that's moot, because they_should_  be fully emulated by KVM anyways; KVM
> > runs L2 with a MSR value modified by perf, not the raw MSR value requested by L1.
> > 
> > Of course KVM screws things up and fails to clear the flag in entry controls...
> > All exit controls are emulated so at least KVM gets those right.
> 
> Can you send this as a separate patch?

Will do.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-28 22:24             ` Paolo Bonzini
@ 2022-07-28 22:35               ` Sean Christopherson
  2022-08-01  8:54               ` Vitaly Kuznetsov
  2022-08-02 13:03               ` Vitaly Kuznetsov
  2 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, kvm, Anirudh Rayabharam, Wanpeng Li,
	Jim Mattson, Maxim Levitsky, linux-hyperv, linux-kernel

On Fri, Jul 29, 2022, Paolo Bonzini wrote:
> On 7/29/22 00:13, Sean Christopherson wrote:
> > The only flaw in this is if KVM gets handed a CPUID model that enumerates support
> > for 2025 (or whenever the next update comes) but not 2022.  Hmm, though if Microsoft
> > defines each new "version" as a full superset, then even that theoretical bug goes
> > away.  I'm happy to be optimistic for once and give this a shot.  I definitely like
> > that it makes it easier to see the deltas between versions.
> 
> Okay, I have queued the series but I still haven't gone through all the
> comments.  So this will _not_ be in the 5.21 pull request.

I assume you meant 5.20?

> The first patch also needs a bit more thought to figure out the impact on
> userspace and whether we can consider syndbg niche enough to not care.
> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags
  2022-07-22 17:22     ` Paolo Bonzini
@ 2022-08-01  8:16       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-01  8:16 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 7/21/22 23:43, Sean Christopherson wrote:
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index c284a605e453..ca91547034e4 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -1282,7 +1282,7 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>>          case HV_X64_MSR_SYNDBG_OPTIONS:
>>          case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>                  return hv_vcpu->cpuid_cache.features_edx &
>> -                       HV_FEATURE_DEBUG_MSRS_AVAILABLE;
>> +                       HV_ACCESS_DEBUG_MSRS;
>>          default:
>>                  break;
>>          }
>> 
>
> Yes, and this will need some kind of hack in QEMU to expose both CPUID 
> bits.  Fortunately hv-syndbg shouldn't be in much use in the wild, so I 
> think we can avoid quirks etc.

Properly behaving VMM should always expose both bits. I'm not sure what
would it mean if only 'access' bit is present: you can try accessing the
missing feature but you get #GP anyway most likely. When the feature is
available but 'access' bit is not set -- the result is also #GP. In case
we really want to support this behavior in KVM we should probably check
*both* bits in hv_check_msr_access() but I don't really see a
use-case. I've lazily kept HV_FEATURE_DEBUG_MSRS_AVAILABLE here just to
be QEMU compatible.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-28 22:24             ` Paolo Bonzini
  2022-07-28 22:35               ` Sean Christopherson
@ 2022-08-01  8:54               ` Vitaly Kuznetsov
  2022-08-02 13:03               ` Vitaly Kuznetsov
  2 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-01  8:54 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 7/29/22 00:13, Sean Christopherson wrote:
>> The only flaw in this is if KVM gets handed a CPUID model that enumerates support
>> for 2025 (or whenever the next update comes) but not 2022.  Hmm, though if Microsoft
>> defines each new "version" as a full superset, then even that theoretical bug goes
>> away.  I'm happy to be optimistic for once and give this a shot.  I definitely like
>> that it makes it easier to see the deltas between versions.
>
> Okay, I have queued the series but I still haven't gone through all the 
> comments.  So this will _not_ be in the 5.21 pull request.
>
> The first patch also needs a bit more thought to figure out the impact 
> on userspace and whether we can consider syndbg niche enough to not care.

(Sorry for delayed replies here, I'm back from vacation now)

The first patch is not a requirement for the rest of the series, we can
discuss it separately. I, however, think that we can just keep checking
HV_FEATURE_DEBUG_MSRS_AVAILABLE in hv_check_msr_access() to be
compatible with existing QEMUs and make QEMU expose both
HV_FEATURE_DEBUG_MSRS_AVAILABLE and HV_ACCESS_DEBUG_MSRS unconditionally
when syndbg feature is enabled as we know that missing
HV_ACCESS_DEBUG_MSRS is just a bug. I don't think we actually need to
be so picky and support VMMs which want to set 'syndbg without access to
it' and 'access to syndbg without syndbg' use-cases. All-or-nothing is
likely good enough.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config()
  2022-07-21 22:11   ` Sean Christopherson
@ 2022-08-02 12:52     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-02 12:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
>> SECONDARY_EXEC_ENCLS_EXITING is conditionally added to the 'optional'
>> checklist in setup_vmcs_config() but there's little value in doing so.
>> First, as the control is optional, we can always check for its
>> presence, no harm done. Second, the only real value cpu_has_sgx() check
>> gives is that on the CPUs which support SECONDARY_EXEC_ENCLS_EXITING but
>> don't support SGX, the control is not getting enabled. It's highly unlikely
>> such CPUs exist but it's possible that some hypervisors expose broken vCPU
>> models.
>
> It's not just broken vCPU models, SGX can be "soft-disabled" on bare metal, e.g. if
> software writes MCE control MSRs or there's an uncorrectable #MC (may not be the
> case on all platforms).  This is architectural behavior and needs to be handled in
> KVM.  Obviously if SGX gets disabled after KVM is loaded then we're out of luck, but
> having the ENCL-exiting control without SGX being enabled is 100% valid.
>
> As for why KVM bothers with the check, it's to work around a suspected hardware
> or XuCode bug (I'm still a bit shocked that's public now :-) ) where SGX got
> _hard_ disabled across S3 on some CPUs and made the fields magically disappear.
> The workaround was to soft-disable SGX in BIOS so that KVM wouldn't attempt to
> enable the ENCLS-exiting control

Oh, thanks for this insight, I had no idea! I'll adjust my commit
message accordingly.

>
>> Preserve cpu_has_sgx() check but filter the result of adjust_vmx_controls()
>> instead of the input.
>> 
>> Reviewed-by: Jim Mattson <jmattson@google.com>
>> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/kvm/vmx/vmx.c | 9 ++++++---
>>  1 file changed, 6 insertions(+), 3 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index ce54f13d8da1..566be73c6509 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -2528,9 +2528,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>>  			SECONDARY_EXEC_PT_CONCEAL_VMX |
>>  			SECONDARY_EXEC_ENABLE_VMFUNC |
>>  			SECONDARY_EXEC_BUS_LOCK_DETECTION |
>> -			SECONDARY_EXEC_NOTIFY_VM_EXITING;
>> -		if (cpu_has_sgx())
>> -			opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
>> +			SECONDARY_EXEC_NOTIFY_VM_EXITING |
>> +			SECONDARY_EXEC_ENCLS_EXITING;
>> +
>>  		if (adjust_vmx_controls(min2, opt2,
>>  					MSR_IA32_VMX_PROCBASED_CTLS2,
>>  					&_cpu_based_2nd_exec_control) < 0)
>> @@ -2577,6 +2577,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>>  		vmx_cap->vpid = 0;
>>  	}
>>  
>> +	if (!cpu_has_sgx())
>> +		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
>> +
>>  	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
>>  		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
>>  
>> -- 
>> 2.35.3
>> 
>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-28 22:24             ` Paolo Bonzini
  2022-07-28 22:35               ` Sean Christopherson
  2022-08-01  8:54               ` Vitaly Kuznetsov
@ 2022-08-02 13:03               ` Vitaly Kuznetsov
  2 siblings, 0 replies; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-02 13:03 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: kvm, Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 7/29/22 00:13, Sean Christopherson wrote:
>> The only flaw in this is if KVM gets handed a CPUID model that enumerates support
>> for 2025 (or whenever the next update comes) but not 2022.  Hmm, though if Microsoft
>> defines each new "version" as a full superset, then even that theoretical bug goes
>> away.  I'm happy to be optimistic for once and give this a shot.  I definitely like
>> that it makes it easier to see the deltas between versions.
>
> Okay, I have queued the series but I still haven't gone through all the 
> comments.

The biggest problem with this version is the EFER.LMA problem on i386
discovered (and, thankfully, fixed in the suggested patch) by
Sean. To address this and all other comment I'm going to put together a
v5 on top of the current kvm/queue (as I don't yet see any of this stuff
there).

>  So this will _not_ be in the 5.21 pull request.

At first I thought you meant 5.20 but then I got the pun: 5.20 will
likely become 6.0 and so 5.21 pull request will just never happen :-)

-- 
Vitaly


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  2022-07-21 23:06   ` Sean Christopherson
@ 2022-08-02 16:11     ` Vitaly Kuznetsov
  2022-08-02 16:28       ` Sean Christopherson
  0 siblings, 1 reply; 62+ messages in thread
From: Vitaly Kuznetsov @ 2022-08-02 16:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
>> @@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>>  	if (((vmx_msr_high >> 18) & 15) != 6)
>>  		return -EIO;
>>  
>> +	rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
>
> Might make sense to sanitize fields that KVM doesn't use and that are not exposed
> to L1.  Not sure it's worthwhile though as many of the bits fall into a grey area,
> e.g. all the SMM stuff isn't technically used by KVM, but that's largely because
> much of it just isn't relevant to virtualization.
>
> I'm totally ok leaving it as-is, though maybe name it "unsanitized_misc" or so
> to make that obvious?

I couldn't convince myself to add 'unsanitized_' prefix as I don't think
it significantly reduces possible confusion (the quiestion would be
'sanitized for what and in which way?') so a need for 'git grep' seems
imminent anyway. Hope I've addressed the rest of your comments in v5
though, thanks for your review!

-- 
Vitaly


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  2022-08-02 16:11     ` Vitaly Kuznetsov
@ 2022-08-02 16:28       ` Sean Christopherson
  0 siblings, 0 replies; 62+ messages in thread
From: Sean Christopherson @ 2022-08-02 16:28 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Anirudh Rayabharam, Wanpeng Li, Jim Mattson,
	Maxim Levitsky, linux-hyperv, linux-kernel

On Tue, Aug 02, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > On Thu, Jul 14, 2022, Vitaly Kuznetsov wrote:
> >> @@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> >>  	if (((vmx_msr_high >> 18) & 15) != 6)
> >>  		return -EIO;
> >>  
> >> +	rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
> >
> > Might make sense to sanitize fields that KVM doesn't use and that are not exposed
> > to L1.  Not sure it's worthwhile though as many of the bits fall into a grey area,
> > e.g. all the SMM stuff isn't technically used by KVM, but that's largely because
> > much of it just isn't relevant to virtualization.
> >
> > I'm totally ok leaving it as-is, though maybe name it "unsanitized_misc" or so
> > to make that obvious?
> 
> I couldn't convince myself to add 'unsanitized_' prefix as I don't think
> it significantly reduces possible confusion (the quiestion would be
> 'sanitized for what and in which way?') so a need for 'git grep' seems
> imminent anyway.

Yeah, no objection to leaving it alone.  VMX_MISC is such an oddball MSR that it
practically comes with disclaimers anyways :-)

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2022-08-02 16:28 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-14  9:13 [PATCH v4 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
2022-07-21 21:43   ` Sean Christopherson
2022-07-22 17:22     ` Paolo Bonzini
2022-08-01  8:16       ` Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 03/25] x86/hyperv: Update " Vitaly Kuznetsov
2022-07-14  9:57   ` Maxim Levitsky
2022-07-14  9:13 ` [PATCH v4 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 05/25] KVM: nVMX: Support several new fields in eVMCSv1 Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
2022-07-14  9:59   ` Maxim Levitsky
2022-07-14  9:13 ` [PATCH v4 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
2022-07-14  9:20   ` Kai Huang
2022-07-14  9:13 ` [PATCH v4 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
2022-07-14 10:07   ` Maxim Levitsky
2022-07-14  9:13 ` [PATCH v4 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
2022-07-21 21:58   ` Sean Christopherson
2022-07-25 17:09     ` Paolo Bonzini
2022-07-25 18:18       ` Sean Christopherson
2022-07-28 21:52         ` Paolo Bonzini
2022-07-28 22:13           ` Sean Christopherson
2022-07-28 22:24             ` Paolo Bonzini
2022-07-28 22:35               ` Sean Christopherson
2022-08-01  8:54               ` Vitaly Kuznetsov
2022-08-02 13:03               ` Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
2022-07-14 10:04   ` Maxim Levitsky
2022-07-14  9:13 ` [PATCH v4 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
2022-07-21 22:00   ` Sean Christopherson
2022-07-14  9:13 ` [PATCH v4 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
2022-07-21 22:01   ` Sean Christopherson
2022-07-14  9:13 ` [PATCH v4 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
2022-07-21 22:11   ` Sean Christopherson
2022-08-02 12:52     ` Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
2022-07-21 22:28   ` Sean Christopherson
2022-07-22 18:33   ` Sean Christopherson
2022-07-22 21:04     ` Nathan Chancellor
2022-07-22 21:38       ` Sean Christopherson
2022-07-23  1:06         ` Nathan Chancellor
2022-07-28 16:27     ` Paolo Bonzini
2022-07-14  9:13 ` [PATCH v4 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
2022-07-21 22:30   ` Sean Christopherson
2022-07-14  9:13 ` [PATCH v4 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
2022-07-21 22:34   ` Sean Christopherson
2022-07-14  9:13 ` [PATCH v4 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
2022-07-21 22:39   ` Sean Christopherson
2022-07-14  9:13 ` [PATCH v4 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
2022-07-21 22:56   ` Sean Christopherson
2022-07-28 22:25     ` Paolo Bonzini
2022-07-28 22:34       ` Sean Christopherson
2022-07-14  9:13 ` [PATCH v4 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Vitaly Kuznetsov
2022-07-14  9:13 ` [PATCH v4 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
2022-07-21 23:06   ` Sean Christopherson
2022-08-02 16:11     ` Vitaly Kuznetsov
2022-08-02 16:28       ` Sean Christopherson
2022-07-14  9:13 ` [PATCH v4 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Vitaly Kuznetsov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.