All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs
@ 2022-07-08 14:41 Vitaly Kuznetsov
  2022-07-08 14:41 ` [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
                   ` (24 more replies)
  0 siblings, 25 replies; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:41 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

This series continues:
- "[PATCH v2 00/28] KVM: VMX: Support TscScaling and EnclsExitingBitmap
 with eVMCS + use vmcs_config for L1 VMX MSRs" work:
https://lore.kernel.org/kvm/20220629150625.238286-1-vkuznets@redhat.com/

Changes since v1:
- Turns out the updated eVMCSv1 revision comes with a CPUID feature bit
  and this changes a lot as we don't need to invent our own eVMCS
  revisions. Adjust the whole series accordingly, drop now unneeded
  KVM_CAP_HYPERV_ENLIGHTENED_VMCS2.
- VM_{EXIT,ENTRY}_LOAD_IA32_PERF_GLOBAL_CTRL are now supported for both
  Hyper-V on KVM and KVM on Hyper-V.
- Make CPU_BASED_NMI_WINDOW_EXITING optional [Sean].
- Drop erroneous "KVM: VMX: Add missing VMENTRY controls to vmcs_config"
  [Jim].
- Include Jim's "KVM: x86: VMX: Replace some Intel model numbers with
  mnemonics" into the series.
- "KVM: nVMX: Always set required-1 bits of pinbased_ctls to 
  PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR" added [Jim]. It allows to get rid
  of required-1 bits caching in vmcs_config, patches dropped.
- Collect R-b tags [Jim].
- Other minor tweaks (descriptions, comments) [Jim, Sean].

Original description:

Enlightened VMCS v1 definition was updates to include fields for the
following features:
    - PerfGlobalCtrl
    - EnclsExitingBitmap
    - TSC scaling
    - GuestLbrCtl
    - CET
    - SSP
While the information is missing in the publicly available TLFS, the
updated definition comes with a new feature bit in CPUID.0x4000000A.EBX
(BIT 0). Use a made up HV_X64_NESTED_EVMCS1_2022_UPDATE name for it.

Add support for the new revision to KVM. SSP, CET and GuestLbrCtl
features are not currently supported by KVM.

While on it, implement Sean's idea to use vmcs_config for setting up
L1 VMX control MSRs instead of re-reading host MSRs.

Jim Mattson (1):
  KVM: x86: VMX: Replace some Intel model numbers with mnemonics

Sean Christopherson (1):
  KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not
    setup

Vitaly Kuznetsov (23):
  KVM: x86: hyper-v: Expose access to debug MSRs in the partition
    privilege flags
  x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
  KVM: nVMX: Support several new fields in eVMCSv1
  KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  KVM: selftests: Switch to updated eVMCSv1 definition
  KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with
    enlightened VMCS
  KVM: selftests: Enable TSC scaling in evmcs selftest
  KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
    setup_vmcs_config()
  KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
    in setup_vmcs_config()
  KVM: VMX: Extend VMX controls macro shenanigans
  KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
    setup_vmcs_config()
  KVM: VMX: Add missing VMEXIT controls to vmcs_config
  KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of
    setup_vmcs_config()
  KVM: nVMX: Always set required-1 bits of pinbased_ctls to
    PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
  KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
    nested MSR

 arch/x86/include/asm/hyperv-tlfs.h            |  22 +-
 arch/x86/include/asm/kvm_host.h               |   2 +
 arch/x86/kvm/hyperv.c                         |  12 +-
 arch/x86/kvm/vmx/capabilities.h               |  14 +-
 arch/x86/kvm/vmx/evmcs.c                      | 127 +++++++---
 arch/x86/kvm/vmx/evmcs.h                      |  18 +-
 arch/x86/kvm/vmx/nested.c                     |  70 ++++--
 arch/x86/kvm/vmx/nested.h                     |   2 +-
 arch/x86/kvm/vmx/vmx.c                        | 235 ++++++++----------
 arch/x86/kvm/vmx/vmx.h                        | 116 +++++++++
 include/asm-generic/hyperv-tlfs.h             |   2 +
 .../selftests/kvm/include/x86_64/evmcs.h      |  45 +++-
 .../selftests/kvm/include/x86_64/vmx.h        |   2 +
 .../testing/selftests/kvm/x86_64/evmcs_test.c |  31 ++-
 14 files changed, 477 insertions(+), 221 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
@ 2022-07-08 14:41 ` Vitaly Kuznetsov
  2022-07-12 11:49   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:41 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

For some features, Hyper-V spec defines two separate CPUID bits: one
listing whether the feature is supported or not and another one showing
whether guest partition was granted access to the feature ("partition
privilege mask"). 'Debug MSRs available' is one of such features. Add
the missing 'access' bit.

Fixes: f97f5a56f597 ("x86/kvm/hyper-v: Add support for synthetic debugger interface")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c             | 1 +
 include/asm-generic/hyperv-tlfs.h | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e2e95a6fccfd..e08189211d9a 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2496,6 +2496,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 			ent->eax |= HV_MSR_RESET_AVAILABLE;
 			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
 			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
+			ent->eax |= HV_ACCESS_DEBUG_MSRS;
 			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
 
 			ent->ebx |= HV_POST_MESSAGES;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index fdce7a4cfc6f..1d99dd296a76 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -70,6 +70,8 @@
 #define HV_MSR_GUEST_IDLE_AVAILABLE		BIT(10)
 /* Partition local APIC and TSC frequency registers available */
 #define HV_ACCESS_FREQUENCY_MSRS		BIT(11)
+/* Debug MSRs available */
+#define HV_ACCESS_DEBUG_MSRS			BIT(12)
 /* AccessReenlightenmentControls privilege */
 #define HV_ACCESS_REENLIGHTENMENT		BIT(13)
 /* AccessTscInvariantControls privilege */
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
  2022-07-08 14:41 ` [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-08 14:42 ` [PATCH v3 03/25] x86/hyperv: Update " Vitaly Kuznetsov
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Section 1.9 of TLFS v6.0b says:

"All structures are padded in such a way that fields are aligned
naturally (that is, an 8-byte field is aligned to an offset of 8 bytes
and so on)".

'struct enlightened_vmcs' has a glitch:

...
        struct {
                u32                nested_flush_hypercall:1; /*   836: 0  4 */
                u32                msr_bitmap:1;         /*   836: 1  4 */
                u32                reserved:30;          /*   836: 2  4 */
        } hv_enlightenments_control;                     /*   836     4 */
        u32                        hv_vp_id;             /*   840     4 */
        u64                        hv_vm_id;             /*   844     8 */
        u64                        partition_assist_page; /*   852     8 */
...

And the observed values in 'partition_assist_page' make no sense at
all. Fix the layout by padding the structure properly.

Fixes: 68d1eb72ee99 ("x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0a9407dc0859..6f0acc45e67a 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -546,7 +546,7 @@ struct hv_enlightened_vmcs {
 	u64 guest_rip;
 
 	u32 hv_clean_fields;
-	u32 hv_padding_32;
+	u32 padding32_1;
 	u32 hv_synthetic_controls;
 	struct {
 		u32 nested_flush_hypercall:1;
@@ -554,7 +554,7 @@ struct hv_enlightened_vmcs {
 		u32 reserved:30;
 	}  __packed hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 03/25] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
  2022-07-08 14:41 ` [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
  2022-07-08 14:42 ` [PATCH v3 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:51   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Vitaly Kuznetsov
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Updated Hyper-V Enlightened VMCS specification lists several new
fields for the following features:

- PerfGlobalCtrl
- EnclsExitingBitmap
- Tsc Scaling
- GuestLbrCtl
- CET
- SSP

Update the definition. The updated definition is available only when
CPUID.0x4000000A.EBX BIT(0) is '1'. Add a define for it as well.

Note: The latest TLFS is available at
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 6f0acc45e67a..6f2c3cdacdf4 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -138,6 +138,9 @@
 #define HV_X64_NESTED_GUEST_MAPPING_FLUSH		BIT(18)
 #define HV_X64_NESTED_MSR_BITMAP			BIT(19)
 
+/* Nested quirks. These are HYPERV_CPUID_NESTED_FEATURES.EBX bits. */
+#define HV_X64_NESTED_EVMCS1_2022_UPDATE		BIT(0)
+
 /*
  * This is specific to AMD and specifies that enlightened TLB flush is
  * supported. If guest opts in to this feature, ASID invalidations only
@@ -559,9 +562,20 @@ struct hv_enlightened_vmcs {
 	u64 partition_assist_page;
 	u64 padding64_4[4];
 	u64 guest_bndcfgs;
-	u64 padding64_5[7];
+	u64 guest_ia32_perf_global_ctrl;
+	u64 guest_ia32_s_cet;
+	u64 guest_ssp;
+	u64 guest_ia32_int_ssp_table_addr;
+	u64 guest_ia32_lbr_ctl;
+	u64 padding64_5[2];
 	u64 xss_exit_bitmap;
-	u64 padding64_6[7];
+	u64 host_ia32_perf_global_ctrl;
+	u64 encls_exiting_bitmap;
+	u64 tsc_multiplier;
+	u64 host_ia32_s_cet;
+	u64 host_ssp;
+	u64 host_ia32_int_ssp_table_addr;
+	u64 padding64_6;
 } __packed;
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE			0
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 03/25] x86/hyperv: Update " Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:50   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 05/25] KVM: nVMX: Support several new fields in eVMCSv1 Vitaly Kuznetsov
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Enlightened VMCS v1 definition was updated with new fields, support
them in KVM by defining VMCS-to-EVMCS conversion.

Note: SSP, CET and Guest LBR features are not supported by KVM yet and
the corresponding fields are not defined in 'enum vmcs_field', leave
them commented out for now.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/evmcs.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 6a61b1ae7942..8bea5dea0341 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -28,6 +28,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_IA32_EFER, host_ia32_efer,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	EVMCS1_FIELD(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_CR0, host_cr0,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
 	EVMCS1_FIELD(HOST_CR3, host_cr3,
@@ -78,6 +80,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_IA32_EFER, guest_ia32_efer,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	EVMCS1_FIELD(GUEST_IA32_PERF_GLOBAL_CTRL, guest_ia32_perf_global_ctrl,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_PDPTR0, guest_pdptr0,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(GUEST_PDPTR1, guest_pdptr1,
@@ -126,6 +130,28 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
 	EVMCS1_FIELD(XSS_EXIT_BITMAP, xss_exit_bitmap,
 		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	EVMCS1_FIELD(ENCLS_EXITING_BITMAP, encls_exiting_bitmap,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	EVMCS1_FIELD(TSC_MULTIPLIER, tsc_multiplier,
+		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
+	/*
+	 * Not used by KVM:
+	 *
+	 * EVMCS1_FIELD(0x00006828, guest_ia32_s_cet,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x0000682A, guest_ssp,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_BASIC),
+	 * EVMCS1_FIELD(0x0000682C, guest_ia32_int_ssp_table_addr,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x00002816, guest_ia32_lbr_ctl,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
+	 * EVMCS1_FIELD(0x00006C18, host_ia32_s_cet,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 * EVMCS1_FIELD(0x00006C1A, host_ssp,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 * EVMCS1_FIELD(0x00006C1C, host_ia32_int_ssp_table_addr,
+	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
+	 */
 
 	/* 64 bit read only */
 	EVMCS1_FIELD(GUEST_PHYSICAL_ADDRESS, guest_physical_address,
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 05/25] KVM: nVMX: Support several new fields in eVMCSv1
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:51   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Enlightened VMCS v1 definition was updated with new fields, add
support for them for Hyper-V on KVM.

Note: SSP, CET and Guest LBR features are not supported by KVM yet
and 'struct vmcs12' has no corresponding fields.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 778f82015f03..4fc84f0f5875 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1603,6 +1603,10 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->guest_rflags = evmcs->guest_rflags;
 		vmcs12->guest_interruptibility_info =
 			evmcs->guest_interruptibility_info;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->guest_ssp = evmcs->guest_ssp;
+		 */
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1649,6 +1653,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->host_fs_selector = evmcs->host_fs_selector;
 		vmcs12->host_gs_selector = evmcs->host_gs_selector;
 		vmcs12->host_tr_selector = evmcs->host_tr_selector;
+		vmcs12->host_ia32_perf_global_ctrl = evmcs->host_ia32_perf_global_ctrl;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->host_ia32_s_cet = evmcs->host_ia32_s_cet;
+		 * vmcs12->host_ssp = evmcs->host_ssp;
+		 * vmcs12->host_ia32_int_ssp_table_addr = evmcs->host_ia32_int_ssp_table_addr;
+		 */
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1716,6 +1727,8 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->tsc_offset = evmcs->tsc_offset;
 		vmcs12->virtual_apic_page_addr = evmcs->virtual_apic_page_addr;
 		vmcs12->xss_exit_bitmap = evmcs->xss_exit_bitmap;
+		vmcs12->encls_exiting_bitmap = evmcs->encls_exiting_bitmap;
+		vmcs12->tsc_multiplier = evmcs->tsc_multiplier;
 	}
 
 	if (unlikely(!(hv_clean_fields &
@@ -1763,6 +1776,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
 		vmcs12->guest_bndcfgs = evmcs->guest_bndcfgs;
 		vmcs12->guest_activity_state = evmcs->guest_activity_state;
 		vmcs12->guest_sysenter_cs = evmcs->guest_sysenter_cs;
+		vmcs12->guest_ia32_perf_global_ctrl = evmcs->guest_ia32_perf_global_ctrl;
+		/*
+		 * Not present in struct vmcs12:
+		 * vmcs12->guest_ia32_s_cet = evmcs->guest_ia32_s_cet;
+		 * vmcs12->guest_ia32_lbr_ctl = evmcs->guest_ia32_lbr_ctl;
+		 * vmcs12->guest_ia32_int_ssp_table_addr = evmcs->guest_ia32_int_ssp_table_addr;
+		 */
 	}
 
 	/*
@@ -1865,12 +1885,23 @@ static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
 	 * evmcs->vm_exit_msr_store_count = vmcs12->vm_exit_msr_store_count;
 	 * evmcs->vm_exit_msr_load_count = vmcs12->vm_exit_msr_load_count;
 	 * evmcs->vm_entry_msr_load_count = vmcs12->vm_entry_msr_load_count;
+	 * evmcs->guest_ia32_perf_global_ctrl = vmcs12->guest_ia32_perf_global_ctrl;
+	 * evmcs->host_ia32_perf_global_ctrl = vmcs12->host_ia32_perf_global_ctrl;
+	 * evmcs->encls_exiting_bitmap = vmcs12->encls_exiting_bitmap;
+	 * evmcs->tsc_multiplier = vmcs12->tsc_multiplier;
 	 *
 	 * Not present in struct vmcs12:
 	 * evmcs->exit_io_instruction_ecx = vmcs12->exit_io_instruction_ecx;
 	 * evmcs->exit_io_instruction_esi = vmcs12->exit_io_instruction_esi;
 	 * evmcs->exit_io_instruction_edi = vmcs12->exit_io_instruction_edi;
 	 * evmcs->exit_io_instruction_eip = vmcs12->exit_io_instruction_eip;
+	 * evmcs->host_ia32_s_cet = vmcs12->host_ia32_s_cet;
+	 * evmcs->host_ssp = vmcs12->host_ssp;
+	 * evmcs->host_ia32_int_ssp_table_addr = vmcs12->host_ia32_int_ssp_table_addr;
+	 * evmcs->guest_ia32_s_cet = vmcs12->guest_ia32_s_cet;
+	 * evmcs->guest_ia32_lbr_ctl = vmcs12->guest_ia32_lbr_ctl;
+	 * evmcs->guest_ia32_int_ssp_table_addr = vmcs12->guest_ia32_int_ssp_table_addr;
+	 * evmcs->guest_ssp = vmcs12->guest_ssp;
 	 */
 
 	evmcs->guest_es_selector = vmcs12->guest_es_selector;
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 05/25] KVM: nVMX: Support several new fields in eVMCSv1 Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:51   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
leaf to know with Enlightened VMCS definition to use (original or 2022
update). Cache the leaf along with other Hyper-V CPUID feature leaves
to make the check quick.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 2 ++
 arch/x86/kvm/hyperv.c           | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index de5a149d0971..077ec9cf3169 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
 		u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
 		u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
 		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
+		u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
+		u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
 	} cpuid_cache;
 };
 
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e08189211d9a..b666902da4d9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2030,6 +2030,15 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
 		hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
 	else
 		hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
+
+	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
+	if (entry) {
+		hv_vcpu->cpuid_cache.nested_eax = entry->eax;
+		hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
+	} else {
+		hv_vcpu->cpuid_cache.nested_eax = 0;
+		hv_vcpu->cpuid_cache.nested_ebx = 0;
+	}
 }
 
 int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:51   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

The updated Enlightened VMCS definition has 'encls_exiting_bitmap'
field which needs mapping to VMCS, add the missing encoding.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/include/x86_64/vmx.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index cc3604f8f1d3..5292d30fb7f2 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -208,6 +208,8 @@ enum vmcs_field {
 	VMWRITE_BITMAP_HIGH		= 0x00002029,
 	XSS_EXIT_BITMAP			= 0x0000202C,
 	XSS_EXIT_BITMAP_HIGH		= 0x0000202D,
+	ENCLS_EXITING_BITMAP		= 0x0000202E,
+	ENCLS_EXITING_BITMAP_HIGH	= 0x0000202F,
 	TSC_MULTIPLIER			= 0x00002032,
 	TSC_MULTIPLIER_HIGH		= 0x00002033,
 	GUEST_PHYSICAL_ADDRESS		= 0x00002400,
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:52   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Update Enlightened VMCS definition in selftests from KVM.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../selftests/kvm/include/x86_64/evmcs.h      | 45 +++++++++++++++++--
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index 3c9260f8e116..a777711d5474 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -203,14 +203,25 @@ struct hv_enlightened_vmcs {
 		u32 reserved:30;
 	} hv_enlightenments_control;
 	u32 hv_vp_id;
-
+	u32 padding32_2;
 	u64 hv_vm_id;
 	u64 partition_assist_page;
 	u64 padding64_4[4];
 	u64 guest_bndcfgs;
-	u64 padding64_5[7];
+	u64 guest_ia32_perf_global_ctrl;
+	u64 guest_ia32_s_cet;
+	u64 guest_ssp;
+	u64 guest_ia32_int_ssp_table_addr;
+	u64 guest_ia32_lbr_ctl;
+	u64 padding64_5[2];
 	u64 xss_exit_bitmap;
-	u64 padding64_6[7];
+	u64 host_ia32_perf_global_ctrl;
+	u64 encls_exiting_bitmap;
+	u64 tsc_multiplier;
+	u64 host_ia32_s_cet;
+	u64 host_ssp;
+	u64 host_ia32_int_ssp_table_addr;
+	u64 padding64_6;
 };
 
 #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE                     0
@@ -656,6 +667,18 @@ static inline int evmcs_vmread(uint64_t encoding, uint64_t *value)
 	case VIRTUAL_PROCESSOR_ID:
 		*value = current_evmcs->virtual_processor_id;
 		break;
+	case HOST_IA32_PERF_GLOBAL_CTRL:
+		*value = current_evmcs->host_ia32_perf_global_ctrl;
+		break;
+	case GUEST_IA32_PERF_GLOBAL_CTRL:
+		*value = current_evmcs->guest_ia32_perf_global_ctrl;
+		break;
+	case ENCLS_EXITING_BITMAP:
+		*value = current_evmcs->encls_exiting_bitmap;
+		break;
+	case TSC_MULTIPLIER:
+		*value = current_evmcs->tsc_multiplier;
+		break;
 	default: return 1;
 	}
 
@@ -1169,6 +1192,22 @@ static inline int evmcs_vmwrite(uint64_t encoding, uint64_t value)
 		current_evmcs->virtual_processor_id = value;
 		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_XLAT;
 		break;
+	case HOST_IA32_PERF_GLOBAL_CTRL:
+		current_evmcs->host_ia32_perf_global_ctrl = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1;
+		break;
+	case GUEST_IA32_PERF_GLOBAL_CTRL:
+		current_evmcs->guest_ia32_perf_global_ctrl = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1;
+		break;
+	case ENCLS_EXITING_BITMAP:
+		current_evmcs->encls_exiting_bitmap = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+		break;
+	case TSC_MULTIPLIER:
+		current_evmcs->tsc_multiplier = value;
+		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+		break;
 	default: return 1;
 	}
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (7 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:53   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest Vitaly Kuznetsov
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Enlightened VMCS v1 got updated and now includes the required fields
for TSC scaling and loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT
features. For Hyper-V on KVM enablement, KVM can just observe VMX control
MSRs and use the features (with or without eVMCS) when
possible. Hyper-V on KVM case is trickier because of live migration:
the new features require explicit enablement from VMM to not break
it. Luckily, the updated eVMCS revision comes with a feature bit in
CPUID.0x4000000A.EBX.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c     |  2 +-
 arch/x86/kvm/vmx/evmcs.c  | 88 +++++++++++++++++++++++++++++++--------
 arch/x86/kvm/vmx/evmcs.h  | 17 ++------
 arch/x86/kvm/vmx/nested.c |  2 +-
 arch/x86/kvm/vmx/vmx.c    |  2 +-
 5 files changed, 78 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b666902da4d9..406c5e273983 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2562,7 +2562,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
 		case HYPERV_CPUID_NESTED_FEATURES:
 			ent->eax = evmcs_ver;
 			ent->eax |= HV_X64_NESTED_MSR_BITMAP;
-
+			ent->ebx |= HV_X64_NESTED_EVMCS1_2022_UPDATE;
 			break;
 
 		case HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS:
diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 8bea5dea0341..52a53debd806 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -368,7 +368,60 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
+enum evmcs_v1_revision {
+	EVMCSv1_2016,
+	EVMCSv1_2022,
+};
+
+enum evmcs_unsupported_ctrl_type {
+	EVMCS_EXIT_CTLS,
+	EVMCS_ENTRY_CTLS,
+	EVMCS_2NDEXEC,
+	EVMCS_PINCTRL,
+	EVMCS_VMFUNC,
+};
+
+static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
+				      enum evmcs_unsupported_ctrl_type ctrl_type)
+{
+	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+	enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
+
+	if (!hv_vcpu)
+		return 0;
+
+	if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
+		evmcs_rev = EVMCSv1_2022;
+
+	switch (ctrl_type) {
+	case EVMCS_EXIT_CTLS:
+		if (evmcs_rev == EVMCSv1_2016)
+			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
+				VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+		else
+			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+	case EVMCS_ENTRY_CTLS:
+		if (evmcs_rev == EVMCSv1_2016)
+			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
+				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		else
+			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+	case EVMCS_2NDEXEC:
+		if (evmcs_rev == EVMCSv1_2016)
+			return EVMCS1_UNSUPPORTED_2NDEXEC |
+				SECONDARY_EXEC_TSC_SCALING;
+		else
+			return EVMCS1_UNSUPPORTED_2NDEXEC;
+	case EVMCS_PINCTRL:
+		return EVMCS1_UNSUPPORTED_PINCTRL;
+	case EVMCS_VMFUNC:
+		return EVMCS1_UNSUPPORTED_VMFUNC;
+	}
+
+	return 0;
+}
+
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 {
 	u32 ctl_low = (u32)*pdata;
 	u32 ctl_high = (u32)(*pdata >> 32);
@@ -380,72 +433,73 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
 	switch (msr_index) {
 	case MSR_IA32_VMX_EXIT_CTLS:
 	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
 		break;
 	case MSR_IA32_VMX_ENTRY_CTLS:
 	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
 		break;
 	case MSR_IA32_VMX_PROCBASED_CTLS2:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
 		break;
 	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
 	case MSR_IA32_VMX_PINBASED_CTLS:
-		ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
+		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
 		break;
 	case MSR_IA32_VMX_VMFUNC:
-		ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC;
+		ctl_low &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
 		break;
 	}
 
 	*pdata = ctl_low | ((u64)ctl_high << 32);
 }
 
-int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 {
 	int ret = 0;
 	u32 unsupp_ctl;
 
 	unsupp_ctl = vmcs12->pin_based_vm_exec_control &
-		EVMCS1_UNSUPPORTED_PINCTRL;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported pin-based VM-execution controls",
+			"eVMCS: unsupported pin-based VM-execution controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
 	unsupp_ctl = vmcs12->secondary_vm_exec_control &
-		EVMCS1_UNSUPPORTED_2NDEXEC;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported secondary VM-execution controls",
+			"eVMCS: unsupported secondary VM-execution controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
 	unsupp_ctl = vmcs12->vm_exit_controls &
-		EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-exit controls",
+			"eVMCS: unsupported VM-exit controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
 	unsupp_ctl = vmcs12->vm_entry_controls &
-		EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-entry controls",
+			"eVMCS: unsupported VM-entry controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
 
-	unsupp_ctl = vmcs12->vm_function_control & EVMCS1_UNSUPPORTED_VMFUNC;
+	unsupp_ctl = vmcs12->vm_function_control &
+		evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
 	if (unsupp_ctl) {
 		trace_kvm_nested_vmenter_failed(
-			"eVMCS: unsupported VM-function controls",
+			"eVMCS: unsupported VM-function controls: ",
 			unsupp_ctl);
 		ret = -EINVAL;
 	}
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index f886a8ff0342..4b809c79ae63 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -37,16 +37,9 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
  *	EPTP_LIST_ADDRESS               = 0x00002024,
  *	VMREAD_BITMAP                   = 0x00002026,
  *	VMWRITE_BITMAP                  = 0x00002028,
- *
- *	TSC_MULTIPLIER                  = 0x00002032,
  *	PLE_GAP                         = 0x00004020,
  *	PLE_WINDOW                      = 0x00004022,
  *	VMX_PREEMPTION_TIMER_VALUE      = 0x0000482E,
- *      GUEST_IA32_PERF_GLOBAL_CTRL     = 0x00002808,
- *      HOST_IA32_PERF_GLOBAL_CTRL      = 0x00002c04,
- *
- * Currently unsupported in KVM:
- *	GUEST_IA32_RTIT_CTL		= 0x00002814,
  */
 #define EVMCS1_UNSUPPORTED_PINCTRL (PIN_BASED_POSTED_INTR | \
 				    PIN_BASED_VMX_PREEMPTION_TIMER)
@@ -58,12 +51,10 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
 	 SECONDARY_EXEC_ENABLE_PML |					\
 	 SECONDARY_EXEC_ENABLE_VMFUNC |					\
 	 SECONDARY_EXEC_SHADOW_VMCS |					\
-	 SECONDARY_EXEC_TSC_SCALING |					\
 	 SECONDARY_EXEC_PAUSE_LOOP_EXITING)
 #define EVMCS1_UNSUPPORTED_VMEXIT_CTRL					\
-	(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |				\
-	 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
-#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
+	(VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
+#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (0)
 #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
 
 struct evmcs_field {
@@ -243,7 +234,7 @@ bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa);
 uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
 int nested_enable_evmcs(struct kvm_vcpu *vcpu,
 			uint16_t *vmcs_version);
-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
-int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12);
 
 #endif /* __KVM_X86_VMX_EVMCS_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4fc84f0f5875..dcf3ee645212 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2891,7 +2891,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 	if (to_vmx(vcpu)->nested.enlightened_vmcs_enabled)
-		return nested_evmcs_check_controls(vmcs12);
+		return nested_evmcs_check_controls(vcpu, vmcs12);
 
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c30115b9cb33..b4915d841357 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1858,7 +1858,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 */
 		if (!msr_info->host_initiated &&
 		    vmx->nested.enlightened_vmcs_enabled)
-			nested_evmcs_filter_control_msr(msr_info->index,
+			nested_evmcs_filter_control_msr(vcpu, msr_info->index,
 							&msr_info->data);
 		break;
 	case MSR_IA32_RTIT_CTL:
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (8 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:54   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

The updated Enlightened VMCS v1 definition enables TSC scaling, test
that SECONDARY_EXEC_TSC_SCALING can now be enabled.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 .../testing/selftests/kvm/x86_64/evmcs_test.c | 31 +++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
index 8dda527cc080..80135b98dc3b 100644
--- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
@@ -18,6 +18,9 @@
 
 #include "vmx.h"
 
+/* Test flags */
+#define HOST_HAS_TSC_SCALING BIT(0)
+
 static int ud_count;
 
 static void guest_ud_handler(struct ex_regs *regs)
@@ -64,11 +67,14 @@ void l2_guest_code(void)
 	vmcall();
 	rdmsr_gs_base(); /* intercepted */
 
+	/* TSC scaling */
+	vmcall();
+
 	/* Done, exit to L1 and never come back.  */
 	vmcall();
 }
 
-void guest_code(struct vmx_pages *vmx_pages)
+void guest_code(struct vmx_pages *vmx_pages, u64 test_flags)
 {
 #define L2_GUEST_STACK_SIZE 64
 	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
@@ -150,6 +156,18 @@ void guest_code(struct vmx_pages *vmx_pages)
 	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
 	GUEST_SYNC(11);
 
+	if (test_flags & HOST_HAS_TSC_SCALING) {
+		GUEST_ASSERT((rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+			     SECONDARY_EXEC_TSC_SCALING);
+		/* Try enabling TSC scaling */
+		vmwrite(SECONDARY_VM_EXEC_CONTROL, vmreadz(SECONDARY_VM_EXEC_CONTROL) |
+			SECONDARY_EXEC_TSC_SCALING);
+		vmwrite(TSC_MULTIPLIER, 1);
+	}
+	GUEST_ASSERT(!vmresume());
+	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+	GUEST_SYNC(12);
+
 	/* Try enlightened vmptrld with an incorrect GPA */
 	evmcs_vmptrld(0xdeadbeef, vmx_pages->enlightened_vmcs);
 	GUEST_ASSERT(vmlaunch());
@@ -204,6 +222,7 @@ int main(int argc, char *argv[])
 	struct kvm_vm *vm;
 	struct kvm_run *run;
 	struct ucall uc;
+	u64 test_flags = 0;
 	int stage;
 
 	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
@@ -212,11 +231,19 @@ int main(int argc, char *argv[])
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));
 
+	if ((kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+	    SECONDARY_EXEC_TSC_SCALING) {
+		test_flags |= HOST_HAS_TSC_SCALING;
+		pr_info("TSC scaling is supported, adding to test\n");
+	} else {
+		pr_info("TSC scaling is not supported\n");
+	}
+
 	vcpu_set_hv_cpuid(vcpu);
 	vcpu_enable_evmcs(vcpu);
 
 	vcpu_alloc_vmx(vm, &vmx_pages_gva);
-	vcpu_args_set(vcpu, 1, vmx_pages_gva);
+	vcpu_args_set(vcpu, 2, vmx_pages_gva, test_flags);
 
 	vm_init_descriptor_tables(vm);
 	vcpu_init_descriptor_tables(vcpu);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (9 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:54   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

With the updated eVMCSv1 definition, there's no known 'problematic'
controls which are exposed in VMX control MSRs but are not present in
eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.

Note: VMX control MSRs filtering for Hyper-V on KVM
(nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
definition doesn't have all the features implemented by KVM and some
fields are still missing. Moreover, nested_evmcs_filter_control_msr()
has to support the original eVMCSv1 version when VMM wishes so.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/evmcs.c | 13 -------------
 arch/x86/kvm/vmx/evmcs.h |  1 -
 arch/x86/kvm/vmx/vmx.c   |  5 -----
 3 files changed, 19 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 52a53debd806..b5cfbf7d487b 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
 };
 const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
 
-#if IS_ENABLED(CONFIG_HYPERV)
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
-{
-	vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
-	vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
-	vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
-	vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
-
-	vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
-	vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
-}
-#endif
-
 bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
 {
 	struct hv_vp_assist_page assist_page;
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index 4b809c79ae63..0feac101cce4 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
 	vp_ap->enlighten_vmentry = 1;
 }
 
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
 #else /* !IS_ENABLED(CONFIG_HYPERV) */
 static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
 static inline void evmcs_write32(unsigned long field, u32 value) {}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b4915d841357..dd905ad72637 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	vmcs_conf->vmexit_ctrl         = _vmexit_control;
 	vmcs_conf->vmentry_ctrl        = _vmentry_control;
 
-#if IS_ENABLED(CONFIG_HYPERV)
-	if (enlightened_vmcs)
-		evmcs_sanitize_exec_ctrls(vmcs_conf);
-#endif
-
 	return 0;
 }
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (10 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:55   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
                   ` (12 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer()
and setup_vmcs_config() doesn't check its existence. On the contrary,
nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing
check and filter the bit out in vmx_vmentry_ctrl().

No (real) functional change intended as all existing CPUs supporting
long mode and VMX are supposed to have it.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dd905ad72637..1aaec4d19e1b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2610,6 +2610,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
 	min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
+#ifdef CONFIG_X86_64
+	min |= VM_ENTRY_IA32E_MODE;
+#endif
 	opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
 	      VM_ENTRY_LOAD_IA32_PAT |
 	      VM_ENTRY_LOAD_IA32_EFER |
@@ -4242,9 +4245,14 @@ static u32 vmx_vmentry_ctrl(void)
 	if (vmx_pt_mode_is_system())
 		vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
 				  VM_ENTRY_LOAD_IA32_RTIT_CTL);
-	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
-	return vmentry_ctrl &
-		~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_IA32_EFER);
+	/*
+	 * IA32e mode, and loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically.
+	 */
+	vmentry_ctrl &= ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
+			  VM_ENTRY_LOAD_IA32_EFER |
+			  VM_ENTRY_IA32E_MODE);
+
+	return vmentry_ctrl;
 }
 
 static u32 vmx_vmexit_ctrl(void)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (11 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:55   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window()
but setup_vmcs_config() doesn't check their existence. Add the check and
filter the controls out in vmx_exec_control().

Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs
are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis()
accordingly.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/capabilities.h | 3 ++-
 arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 069d8d298e1d..07e7492fe72a 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -82,7 +82,8 @@ static inline bool cpu_has_vmx_basic_inout(void)
 
 static inline bool cpu_has_virtual_nmis(void)
 {
-	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
+	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
+		vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;
 }
 
 static inline bool cpu_has_vmx_preemption_timer(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1aaec4d19e1b..ce54f13d8da1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2487,10 +2487,12 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      CPU_BASED_MWAIT_EXITING |
 	      CPU_BASED_MONITOR_EXITING |
 	      CPU_BASED_INVLPG_EXITING |
-	      CPU_BASED_RDPMC_EXITING;
+	      CPU_BASED_RDPMC_EXITING |
+	      CPU_BASED_INTR_WINDOW_EXITING;
 
 	opt = CPU_BASED_TPR_SHADOW |
 	      CPU_BASED_USE_MSR_BITMAPS |
+	      CPU_BASED_NMI_WINDOW_EXITING |
 	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
 	      CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
@@ -4299,6 +4301,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
 
+	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
+	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
+			  CPU_BASED_NMI_WINDOW_EXITING);
+
 	if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
 		exec_control &= ~CPU_BASED_MOV_DR_EXITING;
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config()
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (12 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:56   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

SECONDARY_EXEC_ENCLS_EXITING is conditionally added to the 'optional'
checklist in setup_vmcs_config() but there's little value in doing so.
First, as the control is optional, we can always check for its
presence, no harm done. Second, the only real value cpu_has_sgx() check
gives is that on the CPUs which support SECONDARY_EXEC_ENCLS_EXITING but
don't support SGX, the control is not getting enabled. It's highly unlikely
such CPUs exist but it's possible that some hypervisors expose broken vCPU
models.

Preserve cpu_has_sgx() check but filter the result of adjust_vmx_controls()
instead of the input.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ce54f13d8da1..566be73c6509 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2528,9 +2528,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 			SECONDARY_EXEC_PT_CONCEAL_VMX |
 			SECONDARY_EXEC_ENABLE_VMFUNC |
 			SECONDARY_EXEC_BUS_LOCK_DETECTION |
-			SECONDARY_EXEC_NOTIFY_VM_EXITING;
-		if (cpu_has_sgx())
-			opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
+			SECONDARY_EXEC_NOTIFY_VM_EXITING |
+			SECONDARY_EXEC_ENCLS_EXITING;
+
 		if (adjust_vmx_controls(min2, opt2,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
@@ -2577,6 +2577,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		vmx_cap->vpid = 0;
 	}
 
+	if (!cpu_has_sgx())
+		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
+
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
 		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (13 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:56   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

When VMX controls macros are used to set or clear a control bit, make
sure that this bit was checked in setup_vmcs_config() and thus is properly
reflected in vmcs_config.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c |  99 +++++++------------------------------
 arch/x86/kvm/vmx/vmx.h | 109 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 127 insertions(+), 81 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 566be73c6509..93ca9ff8e641 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2448,7 +2448,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 				    struct vmx_capability *vmx_cap)
 {
 	u32 vmx_msr_low, vmx_msr_high;
-	u32 min, opt, min2, opt2;
 	u32 _pin_based_exec_control = 0;
 	u32 _cpu_based_exec_control = 0;
 	u32 _cpu_based_2nd_exec_control = 0;
@@ -2474,28 +2473,10 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	};
 
 	memset(vmcs_conf, 0, sizeof(*vmcs_conf));
-	min = CPU_BASED_HLT_EXITING |
-#ifdef CONFIG_X86_64
-	      CPU_BASED_CR8_LOAD_EXITING |
-	      CPU_BASED_CR8_STORE_EXITING |
-#endif
-	      CPU_BASED_CR3_LOAD_EXITING |
-	      CPU_BASED_CR3_STORE_EXITING |
-	      CPU_BASED_UNCOND_IO_EXITING |
-	      CPU_BASED_MOV_DR_EXITING |
-	      CPU_BASED_USE_TSC_OFFSETTING |
-	      CPU_BASED_MWAIT_EXITING |
-	      CPU_BASED_MONITOR_EXITING |
-	      CPU_BASED_INVLPG_EXITING |
-	      CPU_BASED_RDPMC_EXITING |
-	      CPU_BASED_INTR_WINDOW_EXITING;
-
-	opt = CPU_BASED_TPR_SHADOW |
-	      CPU_BASED_USE_MSR_BITMAPS |
-	      CPU_BASED_NMI_WINDOW_EXITING |
-	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
-	      CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
+
+	if (adjust_vmx_controls(KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL,
+				KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL,
+				MSR_IA32_VMX_PROCBASED_CTLS,
 				&_cpu_based_exec_control) < 0)
 		return -EIO;
 #ifdef CONFIG_X86_64
@@ -2504,34 +2485,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 					   ~CPU_BASED_CR8_STORE_EXITING;
 #endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
-		min2 = 0;
-		opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
-			SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
-			SECONDARY_EXEC_WBINVD_EXITING |
-			SECONDARY_EXEC_ENABLE_VPID |
-			SECONDARY_EXEC_ENABLE_EPT |
-			SECONDARY_EXEC_UNRESTRICTED_GUEST |
-			SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-			SECONDARY_EXEC_DESC |
-			SECONDARY_EXEC_ENABLE_RDTSCP |
-			SECONDARY_EXEC_ENABLE_INVPCID |
-			SECONDARY_EXEC_APIC_REGISTER_VIRT |
-			SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
-			SECONDARY_EXEC_SHADOW_VMCS |
-			SECONDARY_EXEC_XSAVES |
-			SECONDARY_EXEC_RDSEED_EXITING |
-			SECONDARY_EXEC_RDRAND_EXITING |
-			SECONDARY_EXEC_ENABLE_PML |
-			SECONDARY_EXEC_TSC_SCALING |
-			SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
-			SECONDARY_EXEC_PT_USE_GPA |
-			SECONDARY_EXEC_PT_CONCEAL_VMX |
-			SECONDARY_EXEC_ENABLE_VMFUNC |
-			SECONDARY_EXEC_BUS_LOCK_DETECTION |
-			SECONDARY_EXEC_NOTIFY_VM_EXITING |
-			SECONDARY_EXEC_ENCLS_EXITING;
-
-		if (adjust_vmx_controls(min2, opt2,
+		if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
+					KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
 			return -EIO;
@@ -2581,30 +2536,20 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
 
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
-		u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
-
-		_cpu_based_3rd_exec_control = adjust_vmx_controls64(opt3,
-					      MSR_IA32_VMX_PROCBASED_CTLS3);
+		_cpu_based_3rd_exec_control =
+			adjust_vmx_controls64(KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL,
+			MSR_IA32_VMX_PROCBASED_CTLS3);
 	}
 
-	min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
-#ifdef CONFIG_X86_64
-	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
-#endif
-	opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
-	      VM_EXIT_LOAD_IA32_PAT |
-	      VM_EXIT_LOAD_IA32_EFER |
-	      VM_EXIT_CLEAR_BNDCFGS |
-	      VM_EXIT_PT_CONCEAL_PIP |
-	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
+	if (adjust_vmx_controls(KVM_REQ_VMX_VM_EXIT_CONTROLS,
+				KVM_OPT_VMX_VM_EXIT_CONTROLS,
+				MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
 
-	min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
-	opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
-		 PIN_BASED_VMX_PREEMPTION_TIMER;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
+	if (adjust_vmx_controls(KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL,
+				KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL,
+				MSR_IA32_VMX_PINBASED_CTLS,
 				&_pin_based_exec_control) < 0)
 		return -EIO;
 
@@ -2614,17 +2559,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
-	min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
-#ifdef CONFIG_X86_64
-	min |= VM_ENTRY_IA32E_MODE;
-#endif
-	opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
-	      VM_ENTRY_LOAD_IA32_PAT |
-	      VM_ENTRY_LOAD_IA32_EFER |
-	      VM_ENTRY_LOAD_BNDCFGS |
-	      VM_ENTRY_PT_CONCEAL_PIP |
-	      VM_ENTRY_LOAD_IA32_RTIT_CTL;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
+	if (adjust_vmx_controls(KVM_REQ_VMX_VM_ENTRY_CONTROLS,
+				KVM_OPT_VMX_VM_ENTRY_CONTROLS,
+				MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 286c88e285ea..89eaab3495a6 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
 	return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
 }
 
+#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS				\
+	(VM_ENTRY_LOAD_DEBUG_CONTROLS)
+#ifdef CONFIG_X86_64
+	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
+		(__KVM_REQ_VMX_VM_ENTRY_CONTROLS |		\
+		VM_ENTRY_IA32E_MODE)
+#else
+	#define KVM_REQ_VMX_VM_ENTRY_CONTROLS			\
+		__KVM_REQ_VMX_VM_ENTRY_CONTROLS
+#endif
+#define KVM_OPT_VMX_VM_ENTRY_CONTROLS				\
+	(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |			\
+	VM_ENTRY_LOAD_IA32_PAT |				\
+	VM_ENTRY_LOAD_IA32_EFER |				\
+	VM_ENTRY_LOAD_BNDCFGS |					\
+	VM_ENTRY_PT_CONCEAL_PIP |				\
+	VM_ENTRY_LOAD_IA32_RTIT_CTL)
+
+#define __KVM_REQ_VMX_VM_EXIT_CONTROLS				\
+	(VM_EXIT_SAVE_DEBUG_CONTROLS |				\
+	VM_EXIT_ACK_INTR_ON_EXIT)
+#ifdef CONFIG_X86_64
+	#define KVM_REQ_VMX_VM_EXIT_CONTROLS			\
+		(__KVM_REQ_VMX_VM_EXIT_CONTROLS |		\
+		VM_EXIT_HOST_ADDR_SPACE_SIZE)
+#else
+	#define KVM_REQ_VMX_VM_EXIT_CONTROLS			\
+		__KVM_REQ_VMX_VM_EXIT_CONTROLS
+#endif
+#define KVM_OPT_VMX_VM_EXIT_CONTROLS				\
+	      (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |		\
+	      VM_EXIT_LOAD_IA32_PAT |				\
+	      VM_EXIT_LOAD_IA32_EFER |				\
+	      VM_EXIT_CLEAR_BNDCFGS |				\
+	      VM_EXIT_PT_CONCEAL_PIP |				\
+	      VM_EXIT_CLEAR_IA32_RTIT_CTL)
+
+#define KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL			\
+	(PIN_BASED_EXT_INTR_MASK |				\
+	 PIN_BASED_NMI_EXITING)
+#define KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL			\
+	(PIN_BASED_VIRTUAL_NMIS |				\
+	PIN_BASED_POSTED_INTR |					\
+	PIN_BASED_VMX_PREEMPTION_TIMER)
+
+#define __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL			\
+	(CPU_BASED_HLT_EXITING |				\
+	CPU_BASED_CR3_LOAD_EXITING |				\
+	CPU_BASED_CR3_STORE_EXITING |				\
+	CPU_BASED_UNCOND_IO_EXITING |				\
+	CPU_BASED_MOV_DR_EXITING |				\
+	CPU_BASED_USE_TSC_OFFSETTING |				\
+	CPU_BASED_MWAIT_EXITING |				\
+	CPU_BASED_MONITOR_EXITING |				\
+	CPU_BASED_INVLPG_EXITING |				\
+	CPU_BASED_RDPMC_EXITING |				\
+	CPU_BASED_INTR_WINDOW_EXITING)
+
+#ifdef CONFIG_X86_64
+	#define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL		\
+		(__KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL |	\
+		CPU_BASED_CR8_LOAD_EXITING |			\
+		CPU_BASED_CR8_STORE_EXITING)
+#else
+	#define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL		\
+		__KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
+#endif
+
+#define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL			\
+	(CPU_BASED_TPR_SHADOW |					\
+	CPU_BASED_USE_MSR_BITMAPS |				\
+	CPU_BASED_NMI_WINDOW_EXITING |				\
+	CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |			\
+	CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
+
+#define KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL 0
+#define KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL			\
+	(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |		\
+	SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |			\
+	SECONDARY_EXEC_WBINVD_EXITING |				\
+	SECONDARY_EXEC_ENABLE_VPID |				\
+	SECONDARY_EXEC_ENABLE_EPT |				\
+	SECONDARY_EXEC_UNRESTRICTED_GUEST |			\
+	SECONDARY_EXEC_PAUSE_LOOP_EXITING |			\
+	SECONDARY_EXEC_DESC |					\
+	SECONDARY_EXEC_ENABLE_RDTSCP |				\
+	SECONDARY_EXEC_ENABLE_INVPCID |				\
+	SECONDARY_EXEC_APIC_REGISTER_VIRT |			\
+	SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |			\
+	SECONDARY_EXEC_SHADOW_VMCS |				\
+	SECONDARY_EXEC_XSAVES |					\
+	SECONDARY_EXEC_RDSEED_EXITING |				\
+	SECONDARY_EXEC_RDRAND_EXITING |				\
+	SECONDARY_EXEC_ENABLE_PML |				\
+	SECONDARY_EXEC_TSC_SCALING |				\
+	SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |			\
+	SECONDARY_EXEC_PT_USE_GPA |				\
+	SECONDARY_EXEC_PT_CONCEAL_VMX |				\
+	SECONDARY_EXEC_ENABLE_VMFUNC |				\
+	SECONDARY_EXEC_BUS_LOCK_DETECTION |			\
+	SECONDARY_EXEC_NOTIFY_VM_EXITING |			\
+	SECONDARY_EXEC_ENCLS_EXITING)
+
+#define KVM_REQ_VMX_TERTIARY_VM_EXEC_CONTROL 0
+#define KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL			\
+	(TERTIARY_EXEC_IPI_VIRT)
+
 #define BUILD_CONTROLS_SHADOW(lname, uname, bits)				\
 static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
@@ -485,10 +592,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)		\
 }										\
 static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
+	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
 	lname##_controls_set(vmx, lname##_controls_get(vmx) | val);		\
 }										\
 static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)	\
 {										\
+	BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));	\
 	lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);		\
 }
 BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config()
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (14 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:56   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering
to vmx_exec_control().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 93ca9ff8e641..d7170990f469 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2479,11 +2479,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 				MSR_IA32_VMX_PROCBASED_CTLS,
 				&_cpu_based_exec_control) < 0)
 		return -EIO;
-#ifdef CONFIG_X86_64
-	if (_cpu_based_exec_control & CPU_BASED_TPR_SHADOW)
-		_cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
-					   ~CPU_BASED_CR8_STORE_EXITING;
-#endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
 		if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
 					KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
@@ -4248,13 +4243,17 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 	if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
 		exec_control &= ~CPU_BASED_MOV_DR_EXITING;
 
-	if (!cpu_need_tpr_shadow(&vmx->vcpu)) {
+	if (!cpu_need_tpr_shadow(&vmx->vcpu))
 		exec_control &= ~CPU_BASED_TPR_SHADOW;
+
 #ifdef CONFIG_X86_64
+	if (exec_control & CPU_BASED_TPR_SHADOW)
+		exec_control &= ~(CPU_BASED_CR8_LOAD_EXITING |
+				  CPU_BASED_CR8_STORE_EXITING);
+	else
 		exec_control |= CPU_BASED_CR8_STORE_EXITING |
 				CPU_BASED_CR8_LOAD_EXITING;
 #endif
-	}
 	if (!enable_ept)
 		exec_control |= CPU_BASED_CR3_STORE_EXITING |
 				CPU_BASED_CR3_LOAD_EXITING  |
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (15 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:57   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and
filter them out in vmx_vmexit_ctrl().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 4 ++++
 arch/x86/kvm/vmx/vmx.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d7170990f469..2fb89bdcbbd8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4196,6 +4196,10 @@ static u32 vmx_vmexit_ctrl(void)
 {
 	u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
 
+	/* Not used by KVM but supported for nesting. */
+	vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
+			 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
+
 	if (vmx_pt_mode_is_system())
 		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
 				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 89eaab3495a6..e9c392398f1b 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -498,8 +498,11 @@ static inline u8 vmx_get_rvi(void)
 #endif
 #define KVM_OPT_VMX_VM_EXIT_CONTROLS				\
 	      (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |		\
+	      VM_EXIT_SAVE_IA32_PAT |				\
 	      VM_EXIT_LOAD_IA32_PAT |				\
+	      VM_EXIT_SAVE_IA32_EFER |				\
 	      VM_EXIT_LOAD_IA32_EFER |				\
+	      VM_EXIT_SAVE_VMX_PREEMPTION_TIMER |		\
 	      VM_EXIT_CLEAR_BNDCFGS |				\
 	      VM_EXIT_PT_CONCEAL_PIP |				\
 	      VM_EXIT_CLEAR_IA32_RTIT_CTL)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 18/25] KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (16 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:57   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the CPU based VM execution controls which KVM
doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL
and filter them out in vmx_exec_control().

No functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 6 ++++++
 arch/x86/kvm/vmx/vmx.h | 6 +++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2fb89bdcbbd8..9771c771c8f5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4240,6 +4240,12 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
 
+	/* Not used by KVM but supported for nesting. */
+	exec_control &= ~(CPU_BASED_RDTSC_EXITING |
+			  CPU_BASED_USE_IO_BITMAPS |
+			  CPU_BASED_MONITOR_TRAP_FLAG |
+			  CPU_BASED_PAUSE_EXITING);
+
 	/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
 	exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
 			  CPU_BASED_NMI_WINDOW_EXITING);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index e9c392398f1b..758f80c41beb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -539,9 +539,13 @@ static inline u8 vmx_get_rvi(void)
 #endif
 
 #define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL			\
-	(CPU_BASED_TPR_SHADOW |					\
+	(CPU_BASED_RDTSC_EXITING |				\
+	CPU_BASED_TPR_SHADOW |					\
+	CPU_BASED_USE_IO_BITMAPS |				\
+	CPU_BASED_MONITOR_TRAP_FLAG |				\
 	CPU_BASED_USE_MSR_BITMAPS |				\
 	CPU_BASED_NMI_WINDOW_EXITING |				\
+	CPU_BASED_PAUSE_EXITING |				\
 	CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |			\
 	CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (17 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:58   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

From: Sean Christopherson <seanjc@google.com>

Clear the CR3 and INVLPG interception controls at runtime based on
whether or not EPT is being _used_, as opposed to clearing the bits at
setup if EPT is _supported_ in hardware, and then restoring them when EPT
is not used.  Not mucking with the base config will allow using the base
config as the starting point for emulating the VMX capability MSRs.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9771c771c8f5..eca6875d6732 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2501,13 +2501,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
 		&vmx_cap->ept, &vmx_cap->vpid);
 
-	if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
-		/* CR3 accesses and invlpg don't need to cause VM Exits when EPT
-		   enabled */
-		_cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
-					     CPU_BASED_CR3_STORE_EXITING |
-					     CPU_BASED_INVLPG_EXITING);
-	} else if (vmx_cap->ept) {
+	if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
+	    vmx_cap->ept) {
 		pr_warn_once("EPT CAP should not exist if not support "
 				"1-setting enable EPT VM-execution control\n");
 
@@ -4264,10 +4259,11 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 		exec_control |= CPU_BASED_CR8_STORE_EXITING |
 				CPU_BASED_CR8_LOAD_EXITING;
 #endif
-	if (!enable_ept)
-		exec_control |= CPU_BASED_CR3_STORE_EXITING |
-				CPU_BASED_CR3_LOAD_EXITING  |
-				CPU_BASED_INVLPG_EXITING;
+	/* No need to intercept CR3 access or INVPLG when using EPT. */
+	if (enable_ept)
+		exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
+				  CPU_BASED_CR3_STORE_EXITING |
+				  CPU_BASED_INVLPG_EXITING);
 	if (kvm_mwait_in_guest(vmx->vcpu.kvm))
 		exec_control &= ~(CPU_BASED_MWAIT_EXITING |
 				CPU_BASED_MONITOR_EXITING);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (18 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:58   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

From: Jim Mattson <jmattson@google.com>

Intel processor code names are more familiar to many readers than
their decimal model numbers.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index eca6875d6732..2dff5b94c535 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2580,11 +2580,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	 */
 	if (boot_cpu_data.x86 == 0x6) {
 		switch (boot_cpu_data.x86_model) {
-		case 26: /* AAK155 */
-		case 30: /* AAP115 */
-		case 37: /* AAT100 */
-		case 44: /* BC86,AAY89,BD102 */
-		case 46: /* BA97 */
+		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
+		case INTEL_FAM6_NEHALEM:	/* AAP115 */
+		case INTEL_FAM6_WESTMERE:	/* AAT100 */
+		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
+		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
 			_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
 			_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
 			pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (19 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 11:59   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

As a preparation to reusing the result of setup_vmcs_config() for setting
up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling
to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from
hardware_setup(). While it seems reasonable to not expose
LOAD_IA32_PERF_GLOBAL_CTRL controls to L1 hypervisor on buggy CPUs,
such change would inevitably break live migration from older KVMs
where the controls are exposed. Keep the status quo for know, L1 hypervisor
itself is supposed to take care of the errata.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 62 ++++++++++++++++++++++++++----------------
 1 file changed, 38 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2dff5b94c535..e462e5b9c0a1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2416,6 +2416,31 @@ static bool cpu_has_sgx(void)
 	return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
 }
 
+/*
+ * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
+ * can't be used due to errata where VM Exit may incorrectly clear
+ * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
+ * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+ */
+static bool cpu_has_perf_global_ctrl_bug(void)
+{
+	if (boot_cpu_data.x86 == 0x6) {
+		switch (boot_cpu_data.x86_model) {
+		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
+		case INTEL_FAM6_NEHALEM:	/* AAP115 */
+		case INTEL_FAM6_WESTMERE:	/* AAT100 */
+		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
+		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
+			return true;
+		default:
+			break;
+		}
+	}
+
+	return false;
+}
+
+
 static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 				      u32 msr, u32 *result)
 {
@@ -2572,30 +2597,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 		_vmexit_control &= ~x_ctrl;
 	}
 
-	/*
-	 * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
-	 * can't be used due to an errata where VM Exit may incorrectly clear
-	 * IA32_PERF_GLOBAL_CTRL[34:32].  Workaround the errata by using the
-	 * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
-	 */
-	if (boot_cpu_data.x86 == 0x6) {
-		switch (boot_cpu_data.x86_model) {
-		case INTEL_FAM6_NEHALEM_EP:	/* AAK155 */
-		case INTEL_FAM6_NEHALEM:	/* AAP115 */
-		case INTEL_FAM6_WESTMERE:	/* AAT100 */
-		case INTEL_FAM6_WESTMERE_EP:	/* BC86,AAY89,BD102 */
-		case INTEL_FAM6_NEHALEM_EX:	/* BA97 */
-			_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-			_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-			pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
-					"does not work properly. Using workaround\n");
-			break;
-		default:
-			break;
-		}
-	}
-
-
 	rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
 
 	/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -4184,6 +4185,10 @@ static u32 vmx_vmentry_ctrl(void)
 			  VM_ENTRY_LOAD_IA32_EFER |
 			  VM_ENTRY_IA32E_MODE);
 
+
+	if (cpu_has_perf_global_ctrl_bug())
+		vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+
 	return vmentry_ctrl;
 }
 
@@ -4198,6 +4203,10 @@ static u32 vmx_vmexit_ctrl(void)
 	if (vmx_pt_mode_is_system())
 		vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
 				 VM_EXIT_CLEAR_IA32_RTIT_CTL);
+
+	if (cpu_has_perf_global_ctrl_bug())
+		vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+
 	/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
 	return vmexit_ctrl &
 		~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
@@ -8113,6 +8122,11 @@ static __init int hardware_setup(void)
 	if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
 		return -EIO;
 
+	if (cpu_has_perf_global_ctrl_bug()) {
+		pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+			     "does not work properly. Using workaround\n");
+	}
+
 	if (boot_cpu_has(X86_FEATURE_NX))
 		kvm_enable_efer_bits(EFER_NX);
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (20 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 12:02   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Similar to exit_ctls_low, entry_ctls_low, procbased_ctls_low,
pinbased_ctls_low should be set to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
and not host's MSR_IA32_VMX_PINBASED_CTLS value |=
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR.

The commit eabeaaccfca0 ("KVM: nVMX: Clean up and fix pin-based
execution controls") which introduced '|=' doesn't mention anything
about why this is needed, the change seems rather accidental.

Note: normally, required-1 portion of MSR_IA32_VMX_PINBASED_CTLS should
be equal to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR so no behavioral change
is expected, however, it is (in theory) possible to observe something
different there when e.g. KVM is running as a nested hypervisor. Hope
this doesn't happen in practice.

Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index dcf3ee645212..09654d5c2144 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6586,7 +6586,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
 		msrs->pinbased_ctls_low,
 		msrs->pinbased_ctls_high);
-	msrs->pinbased_ctls_low |=
+	msrs->pinbased_ctls_low =
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (21 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 12:11   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
  2022-07-08 14:42 ` [PATCH v3 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Vitaly Kuznetsov
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Using raw host MSR values for setting up nested VMX control MSRs is
incorrect as some features need to disabled, e.g. when KVM runs as
a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a
workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this
is done in setup_vmcs_config() and the result is stored in vmcs_config.
Use it for setting up allowed-1 bits in nested VMX MSRs too.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 30 ++++++++++++------------------
 arch/x86/kvm/vmx/nested.h |  2 +-
 arch/x86/kvm/vmx/vmx.c    |  5 ++---
 3 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 09654d5c2144..3d386c663fac 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6565,8 +6565,10 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
  * bit in the high half is on if the corresponding bit in the control field
  * may be on. See also vmx_control_verify().
  */
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 {
+	struct nested_vmx_msrs *msrs = &vmcs_conf->nested;
+
 	/*
 	 * Note that as a general rule, the high half of the MSRs (bits in
 	 * the control fields which may be 1) should be initialized by the
@@ -6583,11 +6585,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	 */
 
 	/* pin-based controls */
-	rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
-		msrs->pinbased_ctls_low,
-		msrs->pinbased_ctls_high);
 	msrs->pinbased_ctls_low =
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
 	msrs->pinbased_ctls_high &=
 		PIN_BASED_EXT_INTR_MASK |
 		PIN_BASED_NMI_EXITING |
@@ -6598,12 +6599,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		PIN_BASED_VMX_PREEMPTION_TIMER;
 
 	/* exit controls */
-	rdmsr(MSR_IA32_VMX_EXIT_CTLS,
-		msrs->exit_ctls_low,
-		msrs->exit_ctls_high);
 	msrs->exit_ctls_low =
 		VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
 
+	msrs->exit_ctls_high = vmcs_conf->vmexit_ctrl;
 	msrs->exit_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_EXIT_HOST_ADDR_SPACE_SIZE |
@@ -6619,11 +6618,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;
 
 	/* entry controls */
-	rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
-		msrs->entry_ctls_low,
-		msrs->entry_ctls_high);
 	msrs->entry_ctls_low =
 		VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->entry_ctls_high = vmcs_conf->vmentry_ctrl;
 	msrs->entry_ctls_high &=
 #ifdef CONFIG_X86_64
 		VM_ENTRY_IA32E_MODE |
@@ -6637,11 +6635,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;
 
 	/* cpu-based controls */
-	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
-		msrs->procbased_ctls_low,
-		msrs->procbased_ctls_high);
 	msrs->procbased_ctls_low =
 		CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+	msrs->procbased_ctls_high = vmcs_conf->cpu_based_exec_ctrl;
 	msrs->procbased_ctls_high &=
 		CPU_BASED_INTR_WINDOW_EXITING |
 		CPU_BASED_NMI_WINDOW_EXITING | CPU_BASED_USE_TSC_OFFSETTING |
@@ -6675,12 +6672,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	 * depend on CPUID bits, they are added later by
 	 * vmx_vcpu_after_set_cpuid.
 	 */
-	if (msrs->procbased_ctls_high & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
-		rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
-		      msrs->secondary_ctls_low,
-		      msrs->secondary_ctls_high);
-
 	msrs->secondary_ctls_low = 0;
+
+	msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
 	msrs->secondary_ctls_high &=
 		SECONDARY_EXEC_DESC |
 		SECONDARY_EXEC_ENABLE_RDTSCP |
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index c92cea0b8ccc..fae047c6204b 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -17,7 +17,7 @@ enum nvmx_vmentry_status {
 };
 
 void vmx_leave_nested(struct kvm_vcpu *vcpu);
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
 void nested_vmx_hardware_unsetup(void);
 __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
 void nested_vmx_set_vmcs_shadowing_bitmap(void);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e462e5b9c0a1..35285109856f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7306,7 +7306,7 @@ static int __init vmx_check_processor_compat(void)
 	if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
 		return -EIO;
 	if (nested)
-		nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
+		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
 	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
 		printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
 				smp_processor_id());
@@ -8281,8 +8281,7 @@ static __init int hardware_setup(void)
 	setup_default_sgx_lepubkeyhash();
 
 	if (nested) {
-		nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
-					   vmx_capability.ept);
+		nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);
 
 		r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
 		if (r)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (22 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 12:11   ` Maxim Levitsky
  2022-07-08 14:42 ` [PATCH v3 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Vitaly Kuznetsov
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in
vmcs_config to avoid the need to re-read it later, e.g. from
cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs().

No (real) functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/capabilities.h | 11 +++--------
 arch/x86/kvm/vmx/vmx.c          |  8 +++++---
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 07e7492fe72a..07f7a9534211 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -65,6 +65,7 @@ struct vmcs_config {
 	u64 cpu_based_3rd_exec_ctrl;
 	u32 vmexit_ctrl;
 	u32 vmentry_ctrl;
+	u64 misc;
 	struct nested_vmx_msrs nested;
 };
 extern struct vmcs_config vmcs_config;
@@ -225,11 +226,8 @@ static inline bool cpu_has_vmx_vmfunc(void)
 
 static inline bool cpu_has_vmx_shadow_vmcs(void)
 {
-	u64 vmx_msr;
-
 	/* check if the cpu supports writing r/o exit information fields */
-	rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
-	if (!(vmx_msr & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
+	if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
 		return false;
 
 	return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -371,10 +369,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)
 
 static inline bool cpu_has_vmx_intel_pt(void)
 {
-	u64 vmx_msr;
-
-	rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
-	return (vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT) &&
+	return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
 		(vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
 		(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 35285109856f..ab091758c437 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2479,6 +2479,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	u64 _cpu_based_3rd_exec_control = 0;
 	u32 _vmexit_control = 0;
 	u32 _vmentry_control = 0;
+	u64 misc_msr;
 	int i;
 
 	/*
@@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	if (((vmx_msr_high >> 18) & 15) != 6)
 		return -EIO;
 
+	rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
+
 	vmcs_conf->size = vmx_msr_high & 0x1fff;
 	vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
 
@@ -2624,6 +2627,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
 	vmcs_conf->vmexit_ctrl         = _vmexit_control;
 	vmcs_conf->vmentry_ctrl        = _vmentry_control;
+	vmcs_conf->misc	= misc_msr;
 
 	return 0;
 }
@@ -8241,11 +8245,9 @@ static __init int hardware_setup(void)
 
 	if (enable_preemption_timer) {
 		u64 use_timer_freq = 5000ULL * 1000 * 1000;
-		u64 vmx_msr;
 
-		rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
 		cpu_preemption_timer_multi =
-			vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
+			vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
 
 		if (tsc_khz)
 			use_timer_freq = (u64)tsc_khz * 1000;
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR
  2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
                   ` (23 preceding siblings ...)
  2022-07-08 14:42 ` [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
@ 2022-07-08 14:42 ` Vitaly Kuznetsov
  2022-07-12 12:11   ` Maxim Levitsky
  24 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-08 14:42 UTC (permalink / raw)
  To: kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, Maxim Levitsky,
	linux-hyperv, linux-kernel

vmcs_config has cased host MSR_IA32_VMX_MISC value, use it for setting
up nested MSR_IA32_VMX_MISC in nested_vmx_setup_ctls_msrs() and avoid the
redundant rdmsr().

No (real) functional change intended.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 3d386c663fac..8026dab71086 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6754,10 +6754,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
 
 	/* miscellaneous data */
-	rdmsr(MSR_IA32_VMX_MISC,
-		msrs->misc_low,
-		msrs->misc_high);
-	msrs->misc_low &= VMX_MISC_SAVE_EFER_LMA;
+	msrs->misc_low = (u32)vmcs_conf->misc & VMX_MISC_SAVE_EFER_LMA;
 	msrs->misc_low |=
 		MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS |
 		VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE |
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags
  2022-07-08 14:41 ` [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
@ 2022-07-12 11:49   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:49 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:41 +0200, Vitaly Kuznetsov wrote:
> For some features, Hyper-V spec defines two separate CPUID bits: one
> listing whether the feature is supported or not and another one showing
> whether guest partition was granted access to the feature ("partition
> privilege mask"). 'Debug MSRs available' is one of such features. Add
> the missing 'access' bit.
> 
> Fixes: f97f5a56f597 ("x86/kvm/hyper-v: Add support for synthetic debugger interface")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c             | 1 +
>  include/asm-generic/hyperv-tlfs.h | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index e2e95a6fccfd..e08189211d9a 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2496,6 +2496,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>  			ent->eax |= HV_MSR_RESET_AVAILABLE;
>  			ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>  			ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
> +			ent->eax |= HV_ACCESS_DEBUG_MSRS;
>  			ent->eax |= HV_ACCESS_REENLIGHTENMENT;
>  
>  			ent->ebx |= HV_POST_MESSAGES;
> diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> index fdce7a4cfc6f..1d99dd296a76 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -70,6 +70,8 @@
>  #define HV_MSR_GUEST_IDLE_AVAILABLE		BIT(10)
>  /* Partition local APIC and TSC frequency registers available */
>  #define HV_ACCESS_FREQUENCY_MSRS		BIT(11)
> +/* Debug MSRs available */
> +#define HV_ACCESS_DEBUG_MSRS			BIT(12)
>  /* AccessReenlightenmentControls privilege */
>  #define HV_ACCESS_REENLIGHTENMENT		BIT(13)
>  /* AccessTscInvariantControls privilege */


I guess you mean HV_FEATURE_DEBUG_MSRS_AVAILABLE and the new HV_ACCESS_DEBUG_MSRS

I checked the spec and the bits match, so I guess:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
  2022-07-08 14:42 ` [PATCH v3 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Vitaly Kuznetsov
@ 2022-07-12 11:50   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Enlightened VMCS v1 definition was updated with new fields, support
> them in KVM by defining VMCS-to-EVMCS conversion.
> 
> Note: SSP, CET and Guest LBR features are not supported by KVM yet and
> the corresponding fields are not defined in 'enum vmcs_field', leave
> them commented out for now.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/evmcs.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 6a61b1ae7942..8bea5dea0341 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -28,6 +28,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
>  	EVMCS1_FIELD(HOST_IA32_EFER, host_ia32_efer,
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
> +	EVMCS1_FIELD(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl,
> +		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
>  	EVMCS1_FIELD(HOST_CR0, host_cr0,
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
>  	EVMCS1_FIELD(HOST_CR3, host_cr3,
> @@ -78,6 +80,8 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
>  	EVMCS1_FIELD(GUEST_IA32_EFER, guest_ia32_efer,
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
> +	EVMCS1_FIELD(GUEST_IA32_PERF_GLOBAL_CTRL, guest_ia32_perf_global_ctrl,
> +		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
>  	EVMCS1_FIELD(GUEST_PDPTR0, guest_pdptr0,
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
>  	EVMCS1_FIELD(GUEST_PDPTR1, guest_pdptr1,
> @@ -126,6 +130,28 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
>  	EVMCS1_FIELD(XSS_EXIT_BITMAP, xss_exit_bitmap,
>  		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
> +	EVMCS1_FIELD(ENCLS_EXITING_BITMAP, encls_exiting_bitmap,
> +		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
> +	EVMCS1_FIELD(TSC_MULTIPLIER, tsc_multiplier,
> +		     HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2),
> +	/*
> +	 * Not used by KVM:
> +	 *
> +	 * EVMCS1_FIELD(0x00006828, guest_ia32_s_cet,
> +	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),
> +	 * EVMCS1_FIELD(0x0000682A, guest_ssp,
> +	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_BASIC),

> +	 * EVMCS1_FIELD(0x0000682C, guest_ia32_int_ssp_table_addr,
> +	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),


> +	 * EVMCS1_FIELD(0x00002816, guest_ia32_lbr_ctl,
> +	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1),

> +	 * EVMCS1_FIELD(0x00006C18, host_ia32_s_cet,
> +	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),

> +	 * EVMCS1_FIELD(0x00006C1A, host_ssp,
> +	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
> +	 * EVMCS1_FIELD(0x00006C1C, host_ia32_int_ssp_table_addr,
> +	 *	     HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1),
> +	 */

>  
>  	/* 64 bit read only */
>  	EVMCS1_FIELD(GUEST_PHYSICAL_ADDRESS, guest_physical_address,


I checked the Intel's SDM for encodings of the unused fields,
and checked the [1] for the clean bits.
All look good but as usual I could have missed something.

[1] https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/datatypes/hv_vmx_enlightened_vmcs

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 03/25] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  2022-07-08 14:42 ` [PATCH v3 03/25] x86/hyperv: Update " Vitaly Kuznetsov
@ 2022-07-12 11:51   ` Maxim Levitsky
  2022-07-12 12:19     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Updated Hyper-V Enlightened VMCS specification lists several new
> fields for the following features:
> 
> - PerfGlobalCtrl
> - EnclsExitingBitmap
> - Tsc Scaling
> - GuestLbrCtl
> - CET
> - SSP
> 
> Update the definition. The updated definition is available only when
> CPUID.0x4000000A.EBX BIT(0) is '1'. Add a define for it as well.
> 
> Note: The latest TLFS is available at
> https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/hyperv-tlfs.h | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index 6f0acc45e67a..6f2c3cdacdf4 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -138,6 +138,9 @@
>  #define HV_X64_NESTED_GUEST_MAPPING_FLUSH		BIT(18)
>  #define HV_X64_NESTED_MSR_BITMAP			BIT(19)
>  

Maybe add a comment that this is undocumented + what that cpuid bit does?

> +/* Nested quirks. These are HYPERV_CPUID_NESTED_FEATURES.EBX bits. */
> +#define HV_X64_NESTED_EVMCS1_2022_UPDATE		BIT(0)
> +
>  /*
>   * This is specific to AMD and specifies that enlightened TLB flush is
>   * supported. If guest opts in to this feature, ASID invalidations only
> @@ -559,9 +562,20 @@ struct hv_enlightened_vmcs {
>  	u64 partition_assist_page;
>  	u64 padding64_4[4];
>  	u64 guest_bndcfgs;
> -	u64 padding64_5[7];
> +	u64 guest_ia32_perf_global_ctrl;
> +	u64 guest_ia32_s_cet;
> +	u64 guest_ssp;
> +	u64 guest_ia32_int_ssp_table_addr;
> +	u64 guest_ia32_lbr_ctl;
> +	u64 padding64_5[2];

This change looks OK

>  	u64 xss_exit_bitmap;
> -	u64 padding64_6[7];
> +	u64 host_ia32_perf_global_ctrl;
> +	u64 encls_exiting_bitmap;
> +	u64 tsc_multiplier;
> +	u64 host_ia32_s_cet;
> +	u64 host_ssp;
> +	u64 host_ia32_int_ssp_table_addr;
> +	u64 padding64_6;

I think we have a mistake here:

UINT64 XssExitingBitmap;
UINT64 EnclsExitingBitmap;
UINT64 HostPerfGlobalCtrl;
UINT64 TscMultiplier;
UINT64 HostSCet;
UINT64 HostSsp;
UINT64 HostInterruptSspTableAddr;
UINT64 Rsvd8;


I think you need to swap encls_exiting_bitmap and host_ia32_perf_global_ctrl

I used https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/datatypes/hv_vmx_enlightened_vmcs
as the reference. 


Best regards,
	Maxim Levitsky


>  } __packed;
>  
>  #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE			0



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 05/25] KVM: nVMX: Support several new fields in eVMCSv1
  2022-07-08 14:42 ` [PATCH v3 05/25] KVM: nVMX: Support several new fields in eVMCSv1 Vitaly Kuznetsov
@ 2022-07-12 11:51   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Enlightened VMCS v1 definition was updated with new fields, add
> support for them for Hyper-V on KVM.
> 
> Note: SSP, CET and Guest LBR features are not supported by KVM yet
> and 'struct vmcs12' has no corresponding fields.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 778f82015f03..4fc84f0f5875 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -1603,6 +1603,10 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
>  		vmcs12->guest_rflags = evmcs->guest_rflags;
>  		vmcs12->guest_interruptibility_info =
>  			evmcs->guest_interruptibility_info;
> +		/*
> +		 * Not present in struct vmcs12:
> +		 * vmcs12->guest_ssp = evmcs->guest_ssp;
> +		 */
>  	}
>  
>  	if (unlikely(!(hv_clean_fields &
> @@ -1649,6 +1653,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
>  		vmcs12->host_fs_selector = evmcs->host_fs_selector;
>  		vmcs12->host_gs_selector = evmcs->host_gs_selector;
>  		vmcs12->host_tr_selector = evmcs->host_tr_selector;
> +		vmcs12->host_ia32_perf_global_ctrl = evmcs->host_ia32_perf_global_ctrl;
> +		/*
> +		 * Not present in struct vmcs12:
> +		 * vmcs12->host_ia32_s_cet = evmcs->host_ia32_s_cet;
> +		 * vmcs12->host_ssp = evmcs->host_ssp;
> +		 * vmcs12->host_ia32_int_ssp_table_addr = evmcs->host_ia32_int_ssp_table_addr;
> +		 */
>  	}
>  
>  	if (unlikely(!(hv_clean_fields &
> @@ -1716,6 +1727,8 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
>  		vmcs12->tsc_offset = evmcs->tsc_offset;
>  		vmcs12->virtual_apic_page_addr = evmcs->virtual_apic_page_addr;
>  		vmcs12->xss_exit_bitmap = evmcs->xss_exit_bitmap;
> +		vmcs12->encls_exiting_bitmap = evmcs->encls_exiting_bitmap;
> +		vmcs12->tsc_multiplier = evmcs->tsc_multiplier;
>  	}
>  
>  	if (unlikely(!(hv_clean_fields &
> @@ -1763,6 +1776,13 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
>  		vmcs12->guest_bndcfgs = evmcs->guest_bndcfgs;
>  		vmcs12->guest_activity_state = evmcs->guest_activity_state;
>  		vmcs12->guest_sysenter_cs = evmcs->guest_sysenter_cs;
> +		vmcs12->guest_ia32_perf_global_ctrl = evmcs->guest_ia32_perf_global_ctrl;
> +		/*
> +		 * Not present in struct vmcs12:
> +		 * vmcs12->guest_ia32_s_cet = evmcs->guest_ia32_s_cet;
> +		 * vmcs12->guest_ia32_lbr_ctl = evmcs->guest_ia32_lbr_ctl;
> +		 * vmcs12->guest_ia32_int_ssp_table_addr = evmcs->guest_ia32_int_ssp_table_addr;
> +		 */
>  	}
>  
>  	/*
> @@ -1865,12 +1885,23 @@ static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
>  	 * evmcs->vm_exit_msr_store_count = vmcs12->vm_exit_msr_store_count;
>  	 * evmcs->vm_exit_msr_load_count = vmcs12->vm_exit_msr_load_count;
>  	 * evmcs->vm_entry_msr_load_count = vmcs12->vm_entry_msr_load_count;
> +	 * evmcs->guest_ia32_perf_global_ctrl = vmcs12->guest_ia32_perf_global_ctrl;
> +	 * evmcs->host_ia32_perf_global_ctrl = vmcs12->host_ia32_perf_global_ctrl;
> +	 * evmcs->encls_exiting_bitmap = vmcs12->encls_exiting_bitmap;
> +	 * evmcs->tsc_multiplier = vmcs12->tsc_multiplier;
>  	 *
>  	 * Not present in struct vmcs12:
>  	 * evmcs->exit_io_instruction_ecx = vmcs12->exit_io_instruction_ecx;
>  	 * evmcs->exit_io_instruction_esi = vmcs12->exit_io_instruction_esi;
>  	 * evmcs->exit_io_instruction_edi = vmcs12->exit_io_instruction_edi;
>  	 * evmcs->exit_io_instruction_eip = vmcs12->exit_io_instruction_eip;
> +	 * evmcs->host_ia32_s_cet = vmcs12->host_ia32_s_cet;
> +	 * evmcs->host_ssp = vmcs12->host_ssp;
> +	 * evmcs->host_ia32_int_ssp_table_addr = vmcs12->host_ia32_int_ssp_table_addr;
> +	 * evmcs->guest_ia32_s_cet = vmcs12->guest_ia32_s_cet;
> +	 * evmcs->guest_ia32_lbr_ctl = vmcs12->guest_ia32_lbr_ctl;
> +	 * evmcs->guest_ia32_int_ssp_table_addr = vmcs12->guest_ia32_int_ssp_table_addr;
> +	 * evmcs->guest_ssp = vmcs12->guest_ssp;
>  	 */
>  
>  	evmcs->guest_es_selector = vmcs12->guest_es_selector;

Looks good.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  2022-07-08 14:42 ` [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
@ 2022-07-12 11:51   ` Maxim Levitsky
  2022-07-13 15:45     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
> leaf to know with Enlightened VMCS definition to use (original or 2022
> update). Cache the leaf along with other Hyper-V CPUID feature leaves
> to make the check quick.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 2 ++
>  arch/x86/kvm/hyperv.c           | 9 +++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index de5a149d0971..077ec9cf3169 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
>  		u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
>  		u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
>  		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
> +		u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
> +		u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
>  	} cpuid_cache;
>  };
>  
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index e08189211d9a..b666902da4d9 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2030,6 +2030,15 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
>  		hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
>  	else
>  		hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
> +
> +	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
> +	if (entry) {
> +		hv_vcpu->cpuid_cache.nested_eax = entry->eax;
> +		hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
> +	} else {
> +		hv_vcpu->cpuid_cache.nested_eax = 0;
> +		hv_vcpu->cpuid_cache.nested_ebx = 0;
> +	}
>  }
>  
>  int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)


Small Nitpick:

If I understand correctly, the kvm_find_cpuid_entry can fail if the userspace didn't provide the
cpuid entry.

Since the code that deals with failback is now repeated 3 times, how about some wrapper function that
will return all zeros for a non present cpuid entry?

That can be done later of course, so

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
  2022-07-08 14:42 ` [PATCH v3 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
@ 2022-07-12 11:51   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> The updated Enlightened VMCS definition has 'encls_exiting_bitmap'
> field which needs mapping to VMCS, add the missing encoding.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  tools/testing/selftests/kvm/include/x86_64/vmx.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
> index cc3604f8f1d3..5292d30fb7f2 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
> @@ -208,6 +208,8 @@ enum vmcs_field {
>  	VMWRITE_BITMAP_HIGH		= 0x00002029,
>  	XSS_EXIT_BITMAP			= 0x0000202C,
>  	XSS_EXIT_BITMAP_HIGH		= 0x0000202D,
> +	ENCLS_EXITING_BITMAP		= 0x0000202E,
> +	ENCLS_EXITING_BITMAP_HIGH	= 0x0000202F,
>  	TSC_MULTIPLIER			= 0x00002032,
>  	TSC_MULTIPLIER_HIGH		= 0x00002033,
>  	GUEST_PHYSICAL_ADDRESS		= 0x00002400,

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition
  2022-07-08 14:42 ` [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
@ 2022-07-12 11:52   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:52 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Update Enlightened VMCS definition in selftests from KVM.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  .../selftests/kvm/include/x86_64/evmcs.h      | 45 +++++++++++++++++--
>  1 file changed, 42 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> index 3c9260f8e116..a777711d5474 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> @@ -203,14 +203,25 @@ struct hv_enlightened_vmcs {
>  		u32 reserved:30;
>  	} hv_enlightenments_control;
>  	u32 hv_vp_id;
> -
> +	u32 padding32_2;

This is not just an update, this is a fix - I do see that padding field,
in the spec. It is possible that compiler added that padding on its own
to pad everything to 8 bytes though.

>  	u64 hv_vm_id;
>  	u64 partition_assist_page;
>  	u64 padding64_4[4];
>  	u64 guest_bndcfgs;
> -	u64 padding64_5[7];
> +	u64 guest_ia32_perf_global_ctrl;
> +	u64 guest_ia32_s_cet;
> +	u64 guest_ssp;
> +	u64 guest_ia32_int_ssp_table_addr;
> +	u64 guest_ia32_lbr_ctl;
> +	u64 padding64_5[2];
Matches the spec.


>  	u64 xss_exit_bitmap;
> -	u64 padding64_6[7];
> +	u64 host_ia32_perf_global_ctrl;
> +	u64 encls_exiting_bitmap;
This is swapped here as well, expains why it was not caught in the testing.

> +	u64 tsc_multiplier;
> +	u64 host_ia32_s_cet;
> +	u64 host_ssp;
> +	u64 host_ia32_int_ssp_table_addr;
> +	u64 padding64_6;
>  };
>  
>  #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE                     0
> @@ -656,6 +667,18 @@ static inline int evmcs_vmread(uint64_t encoding, uint64_t *value)
>  	case VIRTUAL_PROCESSOR_ID:
>  		*value = current_evmcs->virtual_processor_id;
>  		break;
> +	case HOST_IA32_PERF_GLOBAL_CTRL:
> +		*value = current_evmcs->host_ia32_perf_global_ctrl;
> +		break;
> +	case GUEST_IA32_PERF_GLOBAL_CTRL:
> +		*value = current_evmcs->guest_ia32_perf_global_ctrl;
> +		break;
> +	case ENCLS_EXITING_BITMAP:
> +		*value = current_evmcs->encls_exiting_bitmap;
> +		break;
> +	case TSC_MULTIPLIER:
> +		*value = current_evmcs->tsc_multiplier;
> +		break;
>  	default: return 1;
>  	}
>  
> @@ -1169,6 +1192,22 @@ static inline int evmcs_vmwrite(uint64_t encoding, uint64_t value)
>  		current_evmcs->virtual_processor_id = value;
>  		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_XLAT;
>  		break;
> +	case HOST_IA32_PERF_GLOBAL_CTRL:
> +		current_evmcs->host_ia32_perf_global_ctrl = value;
> +		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1;
> +		break;
> +	case GUEST_IA32_PERF_GLOBAL_CTRL:
> +		current_evmcs->guest_ia32_perf_global_ctrl = value;
> +		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1;
> +		break;
> +	case ENCLS_EXITING_BITMAP:
> +		current_evmcs->encls_exiting_bitmap = value;
> +		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
> +		break;
> +	case TSC_MULTIPLIER:
> +		current_evmcs->tsc_multiplier = value;
> +		current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
> +		break;
>  	default: return 1;
>  	}
>  

Other than bug in the order of host_ia32_perf_global_ctrl/encls_exiting_bitmap, everything else looks fine to me.

So with this bug fixed:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS
  2022-07-08 14:42 ` [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
@ 2022-07-12 11:53   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:53 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Enlightened VMCS v1 got updated and now includes the required fields
> for TSC scaling and loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT
> features. For Hyper-V on KVM enablement, KVM can just observe VMX control
> MSRs and use the features (with or without eVMCS) when
> possible. Hyper-V on KVM case is trickier because of live migration:
> the new features require explicit enablement from VMM to not break
> it. Luckily, the updated eVMCS revision comes with a feature bit in
> CPUID.0x4000000A.EBX.

Very tiny nitpick about the commit summary. It might make sense instead
of specifying which new fields added, just say that this patch
updates the EVMCS to 2022 revision, or something, because the patch
series as a whole addes some other fields (even if as commented out),
and a new cpuid bit to detect the new eVMCS revision.

Again, this is a very tiny nitpick, feel free to ignore.

> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c     |  2 +-
>  arch/x86/kvm/vmx/evmcs.c  | 88 +++++++++++++++++++++++++++++++--------
>  arch/x86/kvm/vmx/evmcs.h  | 17 ++------
>  arch/x86/kvm/vmx/nested.c |  2 +-
>  arch/x86/kvm/vmx/vmx.c    |  2 +-
>  5 files changed, 78 insertions(+), 33 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index b666902da4d9..406c5e273983 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2562,7 +2562,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>  		case HYPERV_CPUID_NESTED_FEATURES:
>  			ent->eax = evmcs_ver;
>  			ent->eax |= HV_X64_NESTED_MSR_BITMAP;
> -
> +			ent->ebx |= HV_X64_NESTED_EVMCS1_2022_UPDATE;
>  			break;
>  
>  		case HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS:
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 8bea5dea0341..52a53debd806 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -368,7 +368,60 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> -void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
> +enum evmcs_v1_revision {
> +	EVMCSv1_2016,
> +	EVMCSv1_2022,
> +};
> +
> +enum evmcs_unsupported_ctrl_type {
> +	EVMCS_EXIT_CTLS,
> +	EVMCS_ENTRY_CTLS,
> +	EVMCS_2NDEXEC,
> +	EVMCS_PINCTRL,
> +	EVMCS_VMFUNC,
> +};
> +
> +static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
> +				      enum evmcs_unsupported_ctrl_type ctrl_type)
Tiny nitpick: line break not aligned in the above.

> +{
> +	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> +	enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
> +
> +	if (!hv_vcpu)
> +		return 0;
> +
> +	if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
> +		evmcs_rev = EVMCSv1_2022;
> +
> +	switch (ctrl_type) {
> +	case EVMCS_EXIT_CTLS:
> +		if (evmcs_rev == EVMCSv1_2016)
> +			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
> +				VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> +		else
> +			return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> +	case EVMCS_ENTRY_CTLS:
> +		if (evmcs_rev == EVMCSv1_2016)
> +			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
> +				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> +		else
> +			return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> +	case EVMCS_2NDEXEC:
> +		if (evmcs_rev == EVMCSv1_2016)
> +			return EVMCS1_UNSUPPORTED_2NDEXEC |
> +				SECONDARY_EXEC_TSC_SCALING;
> +		else
> +			return EVMCS1_UNSUPPORTED_2NDEXEC;
> +	case EVMCS_PINCTRL:
> +		return EVMCS1_UNSUPPORTED_PINCTRL;
> +	case EVMCS_VMFUNC:
> +		return EVMCS1_UNSUPPORTED_VMFUNC;
> +	}
> +
> +	return 0;
> +}
> +
> +void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
>  {
>  	u32 ctl_low = (u32)*pdata;
>  	u32 ctl_high = (u32)(*pdata >> 32);
> @@ -380,72 +433,73 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
>  	switch (msr_index) {
>  	case MSR_IA32_VMX_EXIT_CTLS:
>  	case MSR_IA32_VMX_TRUE_EXIT_CTLS:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
>  		break;
>  	case MSR_IA32_VMX_ENTRY_CTLS:
>  	case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
>  		break;
>  	case MSR_IA32_VMX_PROCBASED_CTLS2:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
>  		break;
>  	case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
>  	case MSR_IA32_VMX_PINBASED_CTLS:
> -		ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> +		ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
>  		break;
>  	case MSR_IA32_VMX_VMFUNC:
> -		ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC;
> +		ctl_low &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
>  		break;
>  	}
>  
>  	*pdata = ctl_low | ((u64)ctl_high << 32);
>  }
>  
> -int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
> +int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
>  {
>  	int ret = 0;
>  	u32 unsupp_ctl;
>  
>  	unsupp_ctl = vmcs12->pin_based_vm_exec_control &
> -		EVMCS1_UNSUPPORTED_PINCTRL;
> +		evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
> -			"eVMCS: unsupported pin-based VM-execution controls",
> +			"eVMCS: unsupported pin-based VM-execution controls: ",
>  			unsupp_ctl);
>  		ret = -EINVAL;
>  	}
>  
>  	unsupp_ctl = vmcs12->secondary_vm_exec_control &
> -		EVMCS1_UNSUPPORTED_2NDEXEC;
> +		evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
> -			"eVMCS: unsupported secondary VM-execution controls",
> +			"eVMCS: unsupported secondary VM-execution controls: ",
>  			unsupp_ctl);
>  		ret = -EINVAL;
>  	}
>  
>  	unsupp_ctl = vmcs12->vm_exit_controls &
> -		EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> +		evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
> -			"eVMCS: unsupported VM-exit controls",
> +			"eVMCS: unsupported VM-exit controls: ",
>  			unsupp_ctl);
>  		ret = -EINVAL;
>  	}
>  
>  	unsupp_ctl = vmcs12->vm_entry_controls &
> -		EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> +		evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
> -			"eVMCS: unsupported VM-entry controls",
> +			"eVMCS: unsupported VM-entry controls: ",
>  			unsupp_ctl);
>  		ret = -EINVAL;
>  	}
>  
> -	unsupp_ctl = vmcs12->vm_function_control & EVMCS1_UNSUPPORTED_VMFUNC;
> +	unsupp_ctl = vmcs12->vm_function_control &
> +		evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
>  	if (unsupp_ctl) {
>  		trace_kvm_nested_vmenter_failed(
> -			"eVMCS: unsupported VM-function controls",
> +			"eVMCS: unsupported VM-function controls: ",
>  			unsupp_ctl);
>  		ret = -EINVAL;
>  	}
> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> index f886a8ff0342..4b809c79ae63 100644
> --- a/arch/x86/kvm/vmx/evmcs.h
> +++ b/arch/x86/kvm/vmx/evmcs.h
> @@ -37,16 +37,9 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
>   *	EPTP_LIST_ADDRESS               = 0x00002024,
>   *	VMREAD_BITMAP                   = 0x00002026,
>   *	VMWRITE_BITMAP                  = 0x00002028,
> - *
> - *	TSC_MULTIPLIER                  = 0x00002032,
>   *	PLE_GAP                         = 0x00004020,
>   *	PLE_WINDOW                      = 0x00004022,
>   *	VMX_PREEMPTION_TIMER_VALUE      = 0x0000482E,
> - *      GUEST_IA32_PERF_GLOBAL_CTRL     = 0x00002808,
> - *      HOST_IA32_PERF_GLOBAL_CTRL      = 0x00002c04,
> - *
> - * Currently unsupported in KVM:
> - *	GUEST_IA32_RTIT_CTL		= 0x00002814,
>   */
>  #define EVMCS1_UNSUPPORTED_PINCTRL (PIN_BASED_POSTED_INTR | \
>  				    PIN_BASED_VMX_PREEMPTION_TIMER)
> @@ -58,12 +51,10 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
>  	 SECONDARY_EXEC_ENABLE_PML |					\
>  	 SECONDARY_EXEC_ENABLE_VMFUNC |					\
>  	 SECONDARY_EXEC_SHADOW_VMCS |					\
> -	 SECONDARY_EXEC_TSC_SCALING |					\
>  	 SECONDARY_EXEC_PAUSE_LOOP_EXITING)
>  #define EVMCS1_UNSUPPORTED_VMEXIT_CTRL					\
> -	(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |				\
> -	 VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
> -#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
> +	(VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
> +#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (0)
>  #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
>  
>  struct evmcs_field {
> @@ -243,7 +234,7 @@ bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa);
>  uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
>  int nested_enable_evmcs(struct kvm_vcpu *vcpu,
>  			uint16_t *vmcs_version);
> -void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
> -int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
> +void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
> +int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12);
>  
>  #endif /* __KVM_X86_VMX_EVMCS_H */
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 4fc84f0f5875..dcf3ee645212 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2891,7 +2891,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
>  		return -EINVAL;
>  
>  	if (to_vmx(vcpu)->nested.enlightened_vmcs_enabled)
> -		return nested_evmcs_check_controls(vmcs12);
> +		return nested_evmcs_check_controls(vcpu, vmcs12);
>  
>  	return 0;
>  }
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c30115b9cb33..b4915d841357 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1858,7 +1858,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		 */
>  		if (!msr_info->host_initiated &&
>  		    vmx->nested.enlightened_vmcs_enabled)
> -			nested_evmcs_filter_control_msr(msr_info->index,
> +			nested_evmcs_filter_control_msr(vcpu, msr_info->index,
>  							&msr_info->data);
>  		break;
>  	case MSR_IA32_RTIT_CTL:


Looks all good.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky





^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest
  2022-07-08 14:42 ` [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest Vitaly Kuznetsov
@ 2022-07-12 11:54   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:54 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> The updated Enlightened VMCS v1 definition enables TSC scaling, test
> that SECONDARY_EXEC_TSC_SCALING can now be enabled.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  .../testing/selftests/kvm/x86_64/evmcs_test.c | 31 +++++++++++++++++--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
> index 8dda527cc080..80135b98dc3b 100644
> --- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
> @@ -18,6 +18,9 @@
>  
>  #include "vmx.h"
>  
> +/* Test flags */
> +#define HOST_HAS_TSC_SCALING BIT(0)
> +
>  static int ud_count;
>  
>  static void guest_ud_handler(struct ex_regs *regs)
> @@ -64,11 +67,14 @@ void l2_guest_code(void)
>         vmcall();
>         rdmsr_gs_base(); /* intercepted */
>  
> +       /* TSC scaling */
> +       vmcall();
> +
>         /* Done, exit to L1 and never come back.  */
>         vmcall();
>  }
>  
> -void guest_code(struct vmx_pages *vmx_pages)
> +void guest_code(struct vmx_pages *vmx_pages, u64 test_flags)
>  {
>  #define L2_GUEST_STACK_SIZE 64
>         unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
> @@ -150,6 +156,18 @@ void guest_code(struct vmx_pages *vmx_pages)
>         GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
>         GUEST_SYNC(11);
>  
> +       if (test_flags & HOST_HAS_TSC_SCALING) {
> +               GUEST_ASSERT((rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
> +                            SECONDARY_EXEC_TSC_SCALING);
> +               /* Try enabling TSC scaling */
> +               vmwrite(SECONDARY_VM_EXEC_CONTROL, vmreadz(SECONDARY_VM_EXEC_CONTROL) |
> +                       SECONDARY_EXEC_TSC_SCALING);
> +               vmwrite(TSC_MULTIPLIER, 1);
> +       }
> +       GUEST_ASSERT(!vmresume());
> +       GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
> +       GUEST_SYNC(12);
> +
>         /* Try enlightened vmptrld with an incorrect GPA */
>         evmcs_vmptrld(0xdeadbeef, vmx_pages->enlightened_vmcs);
>         GUEST_ASSERT(vmlaunch());
> @@ -204,6 +222,7 @@ int main(int argc, char *argv[])
>         struct kvm_vm *vm;
>         struct kvm_run *run;
>         struct ucall uc;
> +       u64 test_flags = 0;
>         int stage;
>  
>         vm = vm_create_with_one_vcpu(&vcpu, guest_code);
> @@ -212,11 +231,19 @@ int main(int argc, char *argv[])
>         TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
>         TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));
>  
> +       if ((kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
> +           SECONDARY_EXEC_TSC_SCALING) {
> +               test_flags |= HOST_HAS_TSC_SCALING;
> +               pr_info("TSC scaling is supported, adding to test\n");
> +       } else {
> +               pr_info("TSC scaling is not supported\n");
> +       }
> +
>         vcpu_set_hv_cpuid(vcpu);
>         vcpu_enable_evmcs(vcpu);
>  
>         vcpu_alloc_vmx(vm, &vmx_pages_gva);
> -       vcpu_args_set(vcpu, 1, vmx_pages_gva);
> +       vcpu_args_set(vcpu, 2, vmx_pages_gva, test_flags);
>  
>         vm_init_descriptor_tables(vm);
>         vcpu_init_descriptor_tables(vcpu);

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  2022-07-08 14:42 ` [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
@ 2022-07-12 11:54   ` Maxim Levitsky
  2022-07-12 12:14     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:54 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> With the updated eVMCSv1 definition, there's no known 'problematic'
> controls which are exposed in VMX control MSRs but are not present in
> eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.

If I understand correctly we are taking about running KVM as a nested guest of Hyper-V here:

Don't we need to check the new CPUID bit and only then use the new fields of eVMCS,
aka check that the 'cpu' supports the updated eVMCS version?

Best regards,
	Maxim Levitsky



> 
> Note: VMX control MSRs filtering for Hyper-V on KVM
> (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
> definition doesn't have all the features implemented by KVM and some
> fields are still missing. Moreover, nested_evmcs_filter_control_msr()
> has to support the original eVMCSv1 version when VMM wishes so.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/evmcs.c | 13 -------------
>  arch/x86/kvm/vmx/evmcs.h |  1 -
>  arch/x86/kvm/vmx/vmx.c   |  5 -----
>  3 files changed, 19 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 52a53debd806..b5cfbf7d487b 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>  };
>  const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
>  
> -#if IS_ENABLED(CONFIG_HYPERV)
> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
> -{
> -       vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
> -       vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> -       vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> -       vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
> -
> -       vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> -       vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> -}
> -#endif
> -
>  bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
>  {
>         struct hv_vp_assist_page assist_page;
> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> index 4b809c79ae63..0feac101cce4 100644
> --- a/arch/x86/kvm/vmx/evmcs.h
> +++ b/arch/x86/kvm/vmx/evmcs.h
> @@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
>         vp_ap->enlighten_vmentry = 1;
>  }
>  
> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
>  #else /* !IS_ENABLED(CONFIG_HYPERV) */
>  static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
>  static inline void evmcs_write32(unsigned long field, u32 value) {}
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b4915d841357..dd905ad72637 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         vmcs_conf->vmexit_ctrl         = _vmexit_control;
>         vmcs_conf->vmentry_ctrl        = _vmentry_control;
>  
> -#if IS_ENABLED(CONFIG_HYPERV)
> -       if (enlightened_vmcs)
> -               evmcs_sanitize_exec_ctrls(vmcs_conf);
> -#endif
> -
>         return 0;
>  }
>  



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
  2022-07-08 14:42 ` [PATCH v3 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-12 11:55   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:55 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer()
> and setup_vmcs_config() doesn't check its existence. On the contrary,
> nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing
> check and filter the bit out in vmx_vmentry_ctrl().
> 
> No (real) functional change intended as all existing CPUs supporting
> long mode and VMX are supposed to have it.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index dd905ad72637..1aaec4d19e1b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2610,6 +2610,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                 _pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
>  
>         min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
> +#ifdef CONFIG_X86_64
> +       min |= VM_ENTRY_IA32E_MODE;
> +#endif
>         opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
>               VM_ENTRY_LOAD_IA32_PAT |
>               VM_ENTRY_LOAD_IA32_EFER |
> @@ -4242,9 +4245,14 @@ static u32 vmx_vmentry_ctrl(void)
>         if (vmx_pt_mode_is_system())
>                 vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
>                                   VM_ENTRY_LOAD_IA32_RTIT_CTL);
> -       /* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
> -       return vmentry_ctrl &
> -               ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_IA32_EFER);
> +       /*
> +        * IA32e mode, and loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically.
> +        */
> +       vmentry_ctrl &= ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
> +                         VM_ENTRY_LOAD_IA32_EFER |
> +                         VM_ENTRY_IA32E_MODE);
> +
> +       return vmentry_ctrl;
>  }
>  
>  static u32 vmx_vmexit_ctrl(void)

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()
  2022-07-08 14:42 ` [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
@ 2022-07-12 11:55   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:55 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
> vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window()
> but setup_vmcs_config() doesn't check their existence. Add the check and
> filter the controls out in vmx_exec_control().
> 
> Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs
> are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis()
> accordingly.
> 
> No functional change intended.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/capabilities.h | 3 ++-
>  arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 069d8d298e1d..07e7492fe72a 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -82,7 +82,8 @@ static inline bool cpu_has_vmx_basic_inout(void)
>  
>  static inline bool cpu_has_virtual_nmis(void)
>  {
> -       return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
> +       return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
> +               vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;
>  }

Oh, a bit offtopic, I see how VMX handles the case of no support of vNMI,
VMX has no IRET intercept, and so it cheats by using regular interrupt window
and assuming that NMI hanlder will not enable normal interrupts....
Oh well....

>  
>  static inline bool cpu_has_vmx_preemption_timer(void)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 1aaec4d19e1b..ce54f13d8da1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2487,10 +2487,12 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>               CPU_BASED_MWAIT_EXITING |
>               CPU_BASED_MONITOR_EXITING |
>               CPU_BASED_INVLPG_EXITING |
> -             CPU_BASED_RDPMC_EXITING;
> +             CPU_BASED_RDPMC_EXITING |
> +             CPU_BASED_INTR_WINDOW_EXITING;
>  
>         opt = CPU_BASED_TPR_SHADOW |
>               CPU_BASED_USE_MSR_BITMAPS |
> +             CPU_BASED_NMI_WINDOW_EXITING |
>               CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
>               CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
>         if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
> @@ -4299,6 +4301,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>  {
>         u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
>  
> +       /* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
> +       exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
> +                         CPU_BASED_NMI_WINDOW_EXITING);
> +
>         if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>                 exec_control &= ~CPU_BASED_MOV_DR_EXITING;
>  

Also makes sense.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config()
  2022-07-08 14:42 ` [PATCH v3 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
@ 2022-07-12 11:56   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> SECONDARY_EXEC_ENCLS_EXITING is conditionally added to the 'optional'
> checklist in setup_vmcs_config() but there's little value in doing so.
> First, as the control is optional, we can always check for its
> presence, no harm done. Second, the only real value cpu_has_sgx() check
> gives is that on the CPUs which support SECONDARY_EXEC_ENCLS_EXITING but
> don't support SGX, the control is not getting enabled. It's highly unlikely
> such CPUs exist but it's possible that some hypervisors expose broken vCPU
> models.
> 
> Preserve cpu_has_sgx() check but filter the result of adjust_vmx_controls()
> instead of the input.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ce54f13d8da1..566be73c6509 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2528,9 +2528,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                         SECONDARY_EXEC_PT_CONCEAL_VMX |
>                         SECONDARY_EXEC_ENABLE_VMFUNC |
>                         SECONDARY_EXEC_BUS_LOCK_DETECTION |
> -                       SECONDARY_EXEC_NOTIFY_VM_EXITING;
> -               if (cpu_has_sgx())
> -                       opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
> +                       SECONDARY_EXEC_NOTIFY_VM_EXITING |
> +                       SECONDARY_EXEC_ENCLS_EXITING;
> +
>                 if (adjust_vmx_controls(min2, opt2,
>                                         MSR_IA32_VMX_PROCBASED_CTLS2,
>                                         &_cpu_based_2nd_exec_control) < 0)
> @@ -2577,6 +2577,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                 vmx_cap->vpid = 0;
>         }
>  
> +       if (!cpu_has_sgx())
> +               _cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
> +
>         if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
>                 u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
>  


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 15/25] KVM: VMX: Extend VMX controls macro shenanigans
  2022-07-08 14:42 ` [PATCH v3 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
@ 2022-07-12 11:56   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> When VMX controls macros are used to set or clear a control bit, make
> sure that this bit was checked in setup_vmcs_config() and thus is properly
> reflected in vmcs_config.
> 
> No functional change intended.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c |  99 +++++++------------------------------
>  arch/x86/kvm/vmx/vmx.h | 109 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 127 insertions(+), 81 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 566be73c6509..93ca9ff8e641 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2448,7 +2448,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                                     struct vmx_capability *vmx_cap)
>  {
>         u32 vmx_msr_low, vmx_msr_high;
> -       u32 min, opt, min2, opt2;
>         u32 _pin_based_exec_control = 0;
>         u32 _cpu_based_exec_control = 0;
>         u32 _cpu_based_2nd_exec_control = 0;
> @@ -2474,28 +2473,10 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         };
>  
>         memset(vmcs_conf, 0, sizeof(*vmcs_conf));
> -       min = CPU_BASED_HLT_EXITING |
> -#ifdef CONFIG_X86_64
> -             CPU_BASED_CR8_LOAD_EXITING |
> -             CPU_BASED_CR8_STORE_EXITING |
> -#endif
> -             CPU_BASED_CR3_LOAD_EXITING |
> -             CPU_BASED_CR3_STORE_EXITING |
> -             CPU_BASED_UNCOND_IO_EXITING |
> -             CPU_BASED_MOV_DR_EXITING |
> -             CPU_BASED_USE_TSC_OFFSETTING |
> -             CPU_BASED_MWAIT_EXITING |
> -             CPU_BASED_MONITOR_EXITING |
> -             CPU_BASED_INVLPG_EXITING |
> -             CPU_BASED_RDPMC_EXITING |
> -             CPU_BASED_INTR_WINDOW_EXITING;
> -
> -       opt = CPU_BASED_TPR_SHADOW |
> -             CPU_BASED_USE_MSR_BITMAPS |
> -             CPU_BASED_NMI_WINDOW_EXITING |
> -             CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
> -             CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
> -       if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
> +
> +       if (adjust_vmx_controls(KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL,
> +                               KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL,
> +                               MSR_IA32_VMX_PROCBASED_CTLS,
>                                 &_cpu_based_exec_control) < 0)
>                 return -EIO;
OK


>  #ifdef CONFIG_X86_64
> @@ -2504,34 +2485,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                                            ~CPU_BASED_CR8_STORE_EXITING;
>  #endif
>         if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
> -               min2 = 0;
> -               opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
> -                       SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
> -                       SECONDARY_EXEC_WBINVD_EXITING |
> -                       SECONDARY_EXEC_ENABLE_VPID |
> -                       SECONDARY_EXEC_ENABLE_EPT |
> -                       SECONDARY_EXEC_UNRESTRICTED_GUEST |
> -                       SECONDARY_EXEC_PAUSE_LOOP_EXITING |
> -                       SECONDARY_EXEC_DESC |
> -                       SECONDARY_EXEC_ENABLE_RDTSCP |
> -                       SECONDARY_EXEC_ENABLE_INVPCID |
> -                       SECONDARY_EXEC_APIC_REGISTER_VIRT |
> -                       SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
> -                       SECONDARY_EXEC_SHADOW_VMCS |
> -                       SECONDARY_EXEC_XSAVES |
> -                       SECONDARY_EXEC_RDSEED_EXITING |
> -                       SECONDARY_EXEC_RDRAND_EXITING |
> -                       SECONDARY_EXEC_ENABLE_PML |
> -                       SECONDARY_EXEC_TSC_SCALING |
> -                       SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
> -                       SECONDARY_EXEC_PT_USE_GPA |
> -                       SECONDARY_EXEC_PT_CONCEAL_VMX |
> -                       SECONDARY_EXEC_ENABLE_VMFUNC |
> -                       SECONDARY_EXEC_BUS_LOCK_DETECTION |
> -                       SECONDARY_EXEC_NOTIFY_VM_EXITING |
> -                       SECONDARY_EXEC_ENCLS_EXITING;
> -
> -               if (adjust_vmx_controls(min2, opt2,
> +               if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
> +                                       KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
>                                         MSR_IA32_VMX_PROCBASED_CTLS2,
>                                         &_cpu_based_2nd_exec_control) < 0)
>                         return -EIO;
OK

> @@ -2581,30 +2536,20 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                 _cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
>  
>         if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
> -               u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
> -
> -               _cpu_based_3rd_exec_control = adjust_vmx_controls64(opt3,
> -                                             MSR_IA32_VMX_PROCBASED_CTLS3);
> +               _cpu_based_3rd_exec_control =
> +                       adjust_vmx_controls64(KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL,
> +                       MSR_IA32_VMX_PROCBASED_CTLS3);
>         }
OK

>  
> -       min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
> -#ifdef CONFIG_X86_64
> -       min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
> -#endif
> -       opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
> -             VM_EXIT_LOAD_IA32_PAT |
> -             VM_EXIT_LOAD_IA32_EFER |
> -             VM_EXIT_CLEAR_BNDCFGS |
> -             VM_EXIT_PT_CONCEAL_PIP |
> -             VM_EXIT_CLEAR_IA32_RTIT_CTL;
> -       if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
> +       if (adjust_vmx_controls(KVM_REQ_VMX_VM_EXIT_CONTROLS,
> +                               KVM_OPT_VMX_VM_EXIT_CONTROLS,
> +                               MSR_IA32_VMX_EXIT_CTLS,
>                                 &_vmexit_control) < 0)
>                 return -EIO;
OK.

>  
> -       min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
> -       opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
> -                PIN_BASED_VMX_PREEMPTION_TIMER;
> -       if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
> +       if (adjust_vmx_controls(KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL,
> +                               KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL,
> +                               MSR_IA32_VMX_PINBASED_CTLS,
>                                 &_pin_based_exec_control) < 0)
>                 return -EIO;
OK


>  
> @@ -2614,17 +2559,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                 SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
>                 _pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
>  
> -       min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
> -#ifdef CONFIG_X86_64
> -       min |= VM_ENTRY_IA32E_MODE;
> -#endif
> -       opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
> -             VM_ENTRY_LOAD_IA32_PAT |
> -             VM_ENTRY_LOAD_IA32_EFER |
> -             VM_ENTRY_LOAD_BNDCFGS |
> -             VM_ENTRY_PT_CONCEAL_PIP |
> -             VM_ENTRY_LOAD_IA32_RTIT_CTL;
> -       if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
> +       if (adjust_vmx_controls(KVM_REQ_VMX_VM_ENTRY_CONTROLS,
> +                               KVM_OPT_VMX_VM_ENTRY_CONTROLS,
> +                               MSR_IA32_VMX_ENTRY_CTLS,
>                                 &_vmentry_control) < 0)

OK.

>                 return -EIO;

>  
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 286c88e285ea..89eaab3495a6 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
>         return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
>  }
>  
> +#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS                                \
> +       (VM_ENTRY_LOAD_DEBUG_CONTROLS)
> +#ifdef CONFIG_X86_64
> +       #define KVM_REQ_VMX_VM_ENTRY_CONTROLS                   \
> +               (__KVM_REQ_VMX_VM_ENTRY_CONTROLS |              \
> +               VM_ENTRY_IA32E_MODE)
> +#else
> +       #define KVM_REQ_VMX_VM_ENTRY_CONTROLS                   \
> +               __KVM_REQ_VMX_VM_ENTRY_CONTROLS
> +#endif
> +#define KVM_OPT_VMX_VM_ENTRY_CONTROLS                          \
> +       (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |                  \
> +       VM_ENTRY_LOAD_IA32_PAT |                                \
> +       VM_ENTRY_LOAD_IA32_EFER |                               \
> +       VM_ENTRY_LOAD_BNDCFGS |                                 \
> +       VM_ENTRY_PT_CONCEAL_PIP |                               \
> +       VM_ENTRY_LOAD_IA32_RTIT_CTL)
> +
> +#define __KVM_REQ_VMX_VM_EXIT_CONTROLS                         \
> +       (VM_EXIT_SAVE_DEBUG_CONTROLS |                          \
> +       VM_EXIT_ACK_INTR_ON_EXIT)
> +#ifdef CONFIG_X86_64
> +       #define KVM_REQ_VMX_VM_EXIT_CONTROLS                    \
> +               (__KVM_REQ_VMX_VM_EXIT_CONTROLS |               \
> +               VM_EXIT_HOST_ADDR_SPACE_SIZE)
> +#else
> +       #define KVM_REQ_VMX_VM_EXIT_CONTROLS                    \
> +               __KVM_REQ_VMX_VM_EXIT_CONTROLS
> +#endif
> +#define KVM_OPT_VMX_VM_EXIT_CONTROLS                           \
> +             (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |             \
> +             VM_EXIT_LOAD_IA32_PAT |                           \
> +             VM_EXIT_LOAD_IA32_EFER |                          \
> +             VM_EXIT_CLEAR_BNDCFGS |                           \
> +             VM_EXIT_PT_CONCEAL_PIP |                          \
> +             VM_EXIT_CLEAR_IA32_RTIT_CTL)
> +
> +#define KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL                  \
> +       (PIN_BASED_EXT_INTR_MASK |                              \
> +        PIN_BASED_NMI_EXITING)
> +#define KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL                  \
> +       (PIN_BASED_VIRTUAL_NMIS |                               \
> +       PIN_BASED_POSTED_INTR |                                 \
> +       PIN_BASED_VMX_PREEMPTION_TIMER)
> +
> +#define __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL                        \
> +       (CPU_BASED_HLT_EXITING |                                \
> +       CPU_BASED_CR3_LOAD_EXITING |                            \
> +       CPU_BASED_CR3_STORE_EXITING |                           \
> +       CPU_BASED_UNCOND_IO_EXITING |                           \
> +       CPU_BASED_MOV_DR_EXITING |                              \
> +       CPU_BASED_USE_TSC_OFFSETTING |                          \
> +       CPU_BASED_MWAIT_EXITING |                               \
> +       CPU_BASED_MONITOR_EXITING |                             \
> +       CPU_BASED_INVLPG_EXITING |                              \
> +       CPU_BASED_RDPMC_EXITING |                               \
> +       CPU_BASED_INTR_WINDOW_EXITING)
> +
> +#ifdef CONFIG_X86_64
> +       #define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL           \
> +               (__KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL |      \
> +               CPU_BASED_CR8_LOAD_EXITING |                    \
> +               CPU_BASED_CR8_STORE_EXITING)
> +#else
> +       #define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL           \
> +               __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
> +#endif
> +
> +#define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL                  \
> +       (CPU_BASED_TPR_SHADOW |                                 \
> +       CPU_BASED_USE_MSR_BITMAPS |                             \
> +       CPU_BASED_NMI_WINDOW_EXITING |                          \
> +       CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |                 \
> +       CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
> +
> +#define KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL 0
> +#define KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL                  \
> +       (SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |              \
> +       SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |                 \
> +       SECONDARY_EXEC_WBINVD_EXITING |                         \
> +       SECONDARY_EXEC_ENABLE_VPID |                            \
> +       SECONDARY_EXEC_ENABLE_EPT |                             \
> +       SECONDARY_EXEC_UNRESTRICTED_GUEST |                     \
> +       SECONDARY_EXEC_PAUSE_LOOP_EXITING |                     \
> +       SECONDARY_EXEC_DESC |                                   \
> +       SECONDARY_EXEC_ENABLE_RDTSCP |                          \
> +       SECONDARY_EXEC_ENABLE_INVPCID |                         \
> +       SECONDARY_EXEC_APIC_REGISTER_VIRT |                     \
> +       SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |                  \
> +       SECONDARY_EXEC_SHADOW_VMCS |                            \
> +       SECONDARY_EXEC_XSAVES |                                 \
> +       SECONDARY_EXEC_RDSEED_EXITING |                         \
> +       SECONDARY_EXEC_RDRAND_EXITING |                         \
> +       SECONDARY_EXEC_ENABLE_PML |                             \
> +       SECONDARY_EXEC_TSC_SCALING |                            \
> +       SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |                  \
> +       SECONDARY_EXEC_PT_USE_GPA |                             \
> +       SECONDARY_EXEC_PT_CONCEAL_VMX |                         \
> +       SECONDARY_EXEC_ENABLE_VMFUNC |                          \
> +       SECONDARY_EXEC_BUS_LOCK_DETECTION |                     \
> +       SECONDARY_EXEC_NOTIFY_VM_EXITING |                      \
> +       SECONDARY_EXEC_ENCLS_EXITING)
> +
> +#define KVM_REQ_VMX_TERTIARY_VM_EXEC_CONTROL 0
> +#define KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL                   \
> +       (TERTIARY_EXEC_IPI_VIRT)
> +
>  #define BUILD_CONTROLS_SHADOW(lname, uname, bits)                              \
>  static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)     \
>  {                                                                              \
> @@ -485,10 +592,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx)          \
>  }                                                                              \
>  static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val)  \
>  {                                                                              \
> +       BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));     \
>         lname##_controls_set(vmx, lname##_controls_get(vmx) | val);             \
>  }                                                                              \
>  static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val)        \
>  {                                                                              \
> +       BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname)));     \
>         lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);            \
>  }
>  BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)

I cross checked that all bits were copied correctly.

This patch is a very nice idea!

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config()
  2022-07-08 14:42 ` [PATCH v3 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-12 11:56   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() in
> nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering
> to vmx_exec_control().
> 
> No functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 93ca9ff8e641..d7170990f469 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2479,11 +2479,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                                 MSR_IA32_VMX_PROCBASED_CTLS,
>                                 &_cpu_based_exec_control) < 0)
>                 return -EIO;
> -#ifdef CONFIG_X86_64
> -       if (_cpu_based_exec_control & CPU_BASED_TPR_SHADOW)
> -               _cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
> -                                          ~CPU_BASED_CR8_STORE_EXITING;
> -#endif
>         if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
>                 if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
>                                         KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
> @@ -4248,13 +4243,17 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>         if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>                 exec_control &= ~CPU_BASED_MOV_DR_EXITING;
>  
> -       if (!cpu_need_tpr_shadow(&vmx->vcpu)) {
> +       if (!cpu_need_tpr_shadow(&vmx->vcpu))
>                 exec_control &= ~CPU_BASED_TPR_SHADOW;
> +
>  #ifdef CONFIG_X86_64
> +       if (exec_control & CPU_BASED_TPR_SHADOW)
> +               exec_control &= ~(CPU_BASED_CR8_LOAD_EXITING |
> +                                 CPU_BASED_CR8_STORE_EXITING);
> +       else
>                 exec_control |= CPU_BASED_CR8_STORE_EXITING |
>                                 CPU_BASED_CR8_LOAD_EXITING;
>  #endif
> -       }
>         if (!enable_ept)
>                 exec_control |= CPU_BASED_CR3_STORE_EXITING |
>                                 CPU_BASED_CR3_LOAD_EXITING  |

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config
  2022-07-08 14:42 ` [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
@ 2022-07-12 11:57   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:57 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() in
> nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
> use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and
> filter them out in vmx_vmexit_ctrl().
> 
> No functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 4 ++++
>  arch/x86/kvm/vmx/vmx.h | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d7170990f469..2fb89bdcbbd8 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4196,6 +4196,10 @@ static u32 vmx_vmexit_ctrl(void)
>  {
>         u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>  
> +       /* Not used by KVM but supported for nesting. */
> +       vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
> +                        VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
> +
>         if (vmx_pt_mode_is_system())
>                 vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
>                                  VM_EXIT_CLEAR_IA32_RTIT_CTL);
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 89eaab3495a6..e9c392398f1b 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -498,8 +498,11 @@ static inline u8 vmx_get_rvi(void)
>  #endif
>  #define KVM_OPT_VMX_VM_EXIT_CONTROLS                           \
>               (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |             \
> +             VM_EXIT_SAVE_IA32_PAT |                           \
>               VM_EXIT_LOAD_IA32_PAT |                           \
> +             VM_EXIT_SAVE_IA32_EFER |                          \
>               VM_EXIT_LOAD_IA32_EFER |                          \
> +             VM_EXIT_SAVE_VMX_PREEMPTION_TIMER |               \
>               VM_EXIT_CLEAR_BNDCFGS |                           \
>               VM_EXIT_PT_CONCEAL_PIP |                          \
>               VM_EXIT_CLEAR_IA32_RTIT_CTL)


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 18/25] KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
  2022-07-08 14:42 ` [PATCH v3 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
@ 2022-07-12 11:57   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:57 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() in
> nested VMX MSR setup, add the CPU based VM execution controls which KVM
> doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL
> and filter them out in vmx_exec_control().
> 
> No functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 6 ++++++
>  arch/x86/kvm/vmx/vmx.h | 6 +++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2fb89bdcbbd8..9771c771c8f5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4240,6 +4240,12 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>  {
>         u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
>  
> +       /* Not used by KVM but supported for nesting. */
> +       exec_control &= ~(CPU_BASED_RDTSC_EXITING |
> +                         CPU_BASED_USE_IO_BITMAPS |
> +                         CPU_BASED_MONITOR_TRAP_FLAG |
> +                         CPU_BASED_PAUSE_EXITING);
> +
>         /* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
>         exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
>                           CPU_BASED_NMI_WINDOW_EXITING);
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index e9c392398f1b..758f80c41beb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -539,9 +539,13 @@ static inline u8 vmx_get_rvi(void)
>  #endif
>  
>  #define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL                  \
> -       (CPU_BASED_TPR_SHADOW |                                 \
> +       (CPU_BASED_RDTSC_EXITING |                              \
> +       CPU_BASED_TPR_SHADOW |                                  \
> +       CPU_BASED_USE_IO_BITMAPS |                              \
> +       CPU_BASED_MONITOR_TRAP_FLAG |                           \
>         CPU_BASED_USE_MSR_BITMAPS |                             \
>         CPU_BASED_NMI_WINDOW_EXITING |                          \
> +       CPU_BASED_PAUSE_EXITING |                               \
>         CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |                 \
>         CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
>  

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky <mlevitsk@redhat.com>


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup
  2022-07-08 14:42 ` [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Vitaly Kuznetsov
@ 2022-07-12 11:58   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:58 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Clear the CR3 and INVLPG interception controls at runtime based on
> whether or not EPT is being _used_, as opposed to clearing the bits at
> setup if EPT is _supported_ in hardware, and then restoring them when EPT
> is not used.  Not mucking with the base config will allow using the base
> config as the starting point for emulating the VMX capability MSRs.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 18 +++++++-----------
>  1 file changed, 7 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9771c771c8f5..eca6875d6732 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2501,13 +2501,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
>                 &vmx_cap->ept, &vmx_cap->vpid);
>  
> -       if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
> -               /* CR3 accesses and invlpg don't need to cause VM Exits when EPT
> -                  enabled */
> -               _cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
> -                                            CPU_BASED_CR3_STORE_EXITING |
> -                                            CPU_BASED_INVLPG_EXITING);
> -       } else if (vmx_cap->ept) {
> +       if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
> +           vmx_cap->ept) {
>                 pr_warn_once("EPT CAP should not exist if not support "
>                                 "1-setting enable EPT VM-execution control\n");
>  
> @@ -4264,10 +4259,11 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>                 exec_control |= CPU_BASED_CR8_STORE_EXITING |
>                                 CPU_BASED_CR8_LOAD_EXITING;
>  #endif
> -       if (!enable_ept)
> -               exec_control |= CPU_BASED_CR3_STORE_EXITING |
> -                               CPU_BASED_CR3_LOAD_EXITING  |
> -                               CPU_BASED_INVLPG_EXITING;
> +       /* No need to intercept CR3 access or INVPLG when using EPT. */
> +       if (enable_ept)
> +               exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
> +                                 CPU_BASED_CR3_STORE_EXITING |
> +                                 CPU_BASED_INVLPG_EXITING);
>         if (kvm_mwait_in_guest(vmx->vcpu.kvm))
>                 exec_control &= ~(CPU_BASED_MWAIT_EXITING |
>                                 CPU_BASED_MONITOR_EXITING);


Makes sense, although the 'runtime' word a bit misleading, as we don't allow to change
'enable_ept' after kvm_intel is loaded I think.


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics
  2022-07-08 14:42 ` [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Vitaly Kuznetsov
@ 2022-07-12 11:58   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:58 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> From: Jim Mattson <jmattson@google.com>
> 
> Intel processor code names are more familiar to many readers than
> their decimal model numbers.
> 
> Signed-off-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index eca6875d6732..2dff5b94c535 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2580,11 +2580,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>          */
>         if (boot_cpu_data.x86 == 0x6) {
>                 switch (boot_cpu_data.x86_model) {
> -               case 26: /* AAK155 */
> -               case 30: /* AAP115 */
> -               case 37: /* AAT100 */
> -               case 44: /* BC86,AAY89,BD102 */
> -               case 46: /* BA97 */
> +               case INTEL_FAM6_NEHALEM_EP:     /* AAK155 */
> +               case INTEL_FAM6_NEHALEM:        /* AAP115 */
> +               case INTEL_FAM6_WESTMERE:       /* AAT100 */
> +               case INTEL_FAM6_WESTMERE_EP:    /* BC86,AAY89,BD102 */
> +               case INTEL_FAM6_NEHALEM_EX:     /* BA97 */
>                         _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
>                         _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
>                         pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "

I cross checked the values, seems correct.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()
  2022-07-08 14:42 ` [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
@ 2022-07-12 11:59   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 11:59 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() for setting
> up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling
> to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from
> hardware_setup(). While it seems reasonable to not expose
> LOAD_IA32_PERF_GLOBAL_CTRL controls to L1 hypervisor on buggy CPUs,
> such change would inevitably break live migration from older KVMs
> where the controls are exposed. Keep the status quo for know, L1 hypervisor
> itself is supposed to take care of the errata.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 62 ++++++++++++++++++++++++++----------------
>  1 file changed, 38 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2dff5b94c535..e462e5b9c0a1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2416,6 +2416,31 @@ static bool cpu_has_sgx(void)
>         return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
>  }
>  
> +/*
> + * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
> + * can't be used due to errata where VM Exit may incorrectly clear
> + * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
> + * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
> + */
> +static bool cpu_has_perf_global_ctrl_bug(void)
> +{
> +       if (boot_cpu_data.x86 == 0x6) {
> +               switch (boot_cpu_data.x86_model) {
> +               case INTEL_FAM6_NEHALEM_EP:     /* AAK155 */
> +               case INTEL_FAM6_NEHALEM:        /* AAP115 */
> +               case INTEL_FAM6_WESTMERE:       /* AAT100 */
> +               case INTEL_FAM6_WESTMERE_EP:    /* BC86,AAY89,BD102 */
> +               case INTEL_FAM6_NEHALEM_EX:     /* BA97 */
> +                       return true;
> +               default:
> +                       break;
> +               }
> +       }
> +
> +       return false;
> +}
> +
> +
>  static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
>                                       u32 msr, u32 *result)
Nitpick: Indention is incorrect
>  {
> @@ -2572,30 +2597,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                 _vmexit_control &= ~x_ctrl;
>         }
>  
> -       /*
> -        * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
> -        * can't be used due to an errata where VM Exit may incorrectly clear
> -        * IA32_PERF_GLOBAL_CTRL[34:32].  Workaround the errata by using the
> -        * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
> -        */
> -       if (boot_cpu_data.x86 == 0x6) {
> -               switch (boot_cpu_data.x86_model) {
> -               case INTEL_FAM6_NEHALEM_EP:     /* AAK155 */
> -               case INTEL_FAM6_NEHALEM:        /* AAP115 */
> -               case INTEL_FAM6_WESTMERE:       /* AAT100 */
> -               case INTEL_FAM6_WESTMERE_EP:    /* BC86,AAY89,BD102 */
> -               case INTEL_FAM6_NEHALEM_EX:     /* BA97 */
> -                       _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> -                       _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> -                       pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
> -                                       "does not work properly. Using workaround\n");
> -                       break;
> -               default:
> -                       break;
> -               }
> -       }
> -
> -
>         rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
>  
>         /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
> @@ -4184,6 +4185,10 @@ static u32 vmx_vmentry_ctrl(void)
>                           VM_ENTRY_LOAD_IA32_EFER |
>                           VM_ENTRY_IA32E_MODE);
>  
> +
> +       if (cpu_has_perf_global_ctrl_bug())
> +               vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> +
>         return vmentry_ctrl;
>  }
>  
> @@ -4198,6 +4203,10 @@ static u32 vmx_vmexit_ctrl(void)
>         if (vmx_pt_mode_is_system())
>                 vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
>                                  VM_EXIT_CLEAR_IA32_RTIT_CTL);
> +
> +       if (cpu_has_perf_global_ctrl_bug())
> +               vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> +
>         /* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
>         return vmexit_ctrl &
>                 ~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
> @@ -8113,6 +8122,11 @@ static __init int hardware_setup(void)
>         if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
>                 return -EIO;
>  
> +       if (cpu_has_perf_global_ctrl_bug()) {
> +               pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
> +                            "does not work properly. Using workaround\n");
> +       }
> +
>         if (boot_cpu_has(X86_FEATURE_NX))
>                 kvm_enable_efer_bits(EFER_NX);
>  

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
  2022-07-08 14:42 ` [PATCH v3 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Vitaly Kuznetsov
@ 2022-07-12 12:02   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 12:02 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Similar to exit_ctls_low, entry_ctls_low, procbased_ctls_low,
> pinbased_ctls_low should be set to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
> and not host's MSR_IA32_VMX_PINBASED_CTLS value |=
> PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR.
> 
> The commit eabeaaccfca0 ("KVM: nVMX: Clean up and fix pin-based
> execution controls") which introduced '|=' doesn't mention anything
> about why this is needed, the change seems rather accidental.
> 
> Note: normally, required-1 portion of MSR_IA32_VMX_PINBASED_CTLS should
> be equal to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR so no behavioral change
> is expected, however, it is (in theory) possible to observe something
> different there when e.g. KVM is running as a nested hypervisor. Hope
> this doesn't happen in practice.
> 
> Reported-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index dcf3ee645212..09654d5c2144 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6586,7 +6586,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>         rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
>                 msrs->pinbased_ctls_low,
>                 msrs->pinbased_ctls_high);
> -       msrs->pinbased_ctls_low |=
> +       msrs->pinbased_ctls_low =
>                 PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
>         msrs->pinbased_ctls_high &=
>                 PIN_BASED_EXT_INTR_MASK |

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
  2022-07-08 14:42 ` [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Vitaly Kuznetsov
@ 2022-07-12 12:11   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 12:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Using raw host MSR values for setting up nested VMX control MSRs is
> incorrect as some features need to disabled, e.g. when KVM runs as
> a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a
> workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this
> is done in setup_vmcs_config() and the result is stored in vmcs_config.
> Use it for setting up allowed-1 bits in nested VMX MSRs too.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 30 ++++++++++++------------------
>  arch/x86/kvm/vmx/nested.h |  2 +-
>  arch/x86/kvm/vmx/vmx.c    |  5 ++---
>  3 files changed, 15 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 09654d5c2144..3d386c663fac 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6565,8 +6565,10 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
>   * bit in the high half is on if the corresponding bit in the control field
>   * may be on. See also vmx_control_verify().
>   */
> -void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
> +void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
>  {
> +       struct nested_vmx_msrs *msrs = &vmcs_conf->nested;
> +
>         /*
>          * Note that as a general rule, the high half of the MSRs (bits in
>          * the control fields which may be 1) should be initialized by the
> @@ -6583,11 +6585,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>          */
>  
>         /* pin-based controls */
> -       rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
> -               msrs->pinbased_ctls_low,
> -               msrs->pinbased_ctls_high);
>         msrs->pinbased_ctls_low =
>                 PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
> +
> +       msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
>         msrs->pinbased_ctls_high &=
>                 PIN_BASED_EXT_INTR_MASK |
>                 PIN_BASED_NMI_EXITING |
> @@ -6598,12 +6599,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>                 PIN_BASED_VMX_PREEMPTION_TIMER;
>  
>         /* exit controls */
> -       rdmsr(MSR_IA32_VMX_EXIT_CTLS,
> -               msrs->exit_ctls_low,
> -               msrs->exit_ctls_high);
>         msrs->exit_ctls_low =
>                 VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
>  
> +       msrs->exit_ctls_high = vmcs_conf->vmexit_ctrl;
>         msrs->exit_ctls_high &=
>  #ifdef CONFIG_X86_64
>                 VM_EXIT_HOST_ADDR_SPACE_SIZE |
> @@ -6619,11 +6618,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>         msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;
>  
>         /* entry controls */
> -       rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
> -               msrs->entry_ctls_low,
> -               msrs->entry_ctls_high);
>         msrs->entry_ctls_low =
>                 VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
> +
> +       msrs->entry_ctls_high = vmcs_conf->vmentry_ctrl;
>         msrs->entry_ctls_high &=
>  #ifdef CONFIG_X86_64
>                 VM_ENTRY_IA32E_MODE |
> @@ -6637,11 +6635,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>         msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;
>  
>         /* cpu-based controls */
> -       rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
> -               msrs->procbased_ctls_low,
> -               msrs->procbased_ctls_high);
>         msrs->procbased_ctls_low =
>                 CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
> +
> +       msrs->procbased_ctls_high = vmcs_conf->cpu_based_exec_ctrl;
>         msrs->procbased_ctls_high &=
>                 CPU_BASED_INTR_WINDOW_EXITING |
>                 CPU_BASED_NMI_WINDOW_EXITING | CPU_BASED_USE_TSC_OFFSETTING |
> @@ -6675,12 +6672,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>          * depend on CPUID bits, they are added later by
>          * vmx_vcpu_after_set_cpuid.
>          */
> -       if (msrs->procbased_ctls_high & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
> -               rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
> -                     msrs->secondary_ctls_low,
> -                     msrs->secondary_ctls_high);
> -
>         msrs->secondary_ctls_low = 0;
> +
> +       msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
>         msrs->secondary_ctls_high &=
>                 SECONDARY_EXEC_DESC |
>                 SECONDARY_EXEC_ENABLE_RDTSCP |
> diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> index c92cea0b8ccc..fae047c6204b 100644
> --- a/arch/x86/kvm/vmx/nested.h
> +++ b/arch/x86/kvm/vmx/nested.h
> @@ -17,7 +17,7 @@ enum nvmx_vmentry_status {
>  };
>  
>  void vmx_leave_nested(struct kvm_vcpu *vcpu);
> -void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
> +void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
>  void nested_vmx_hardware_unsetup(void);
>  __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
>  void nested_vmx_set_vmcs_shadowing_bitmap(void);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index e462e5b9c0a1..35285109856f 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7306,7 +7306,7 @@ static int __init vmx_check_processor_compat(void)
>         if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
>                 return -EIO;
>         if (nested)
> -               nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
> +               nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
>         if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
>                 printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
>                                 smp_processor_id());
> @@ -8281,8 +8281,7 @@ static __init int hardware_setup(void)
>         setup_default_sgx_lepubkeyhash();
>  
>         if (nested) {
> -               nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
> -                                          vmx_capability.ept);
> +               nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);
>  
>                 r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
>                 if (r)


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
  2022-07-08 14:42 ` [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
@ 2022-07-12 12:11   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 12:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in
> vmcs_config to avoid the need to re-read it later, e.g. from
> cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs().
> 
> No (real) functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/capabilities.h | 11 +++--------
>  arch/x86/kvm/vmx/vmx.c          |  8 +++++---
>  2 files changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 07e7492fe72a..07f7a9534211 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -65,6 +65,7 @@ struct vmcs_config {
>         u64 cpu_based_3rd_exec_ctrl;
>         u32 vmexit_ctrl;
>         u32 vmentry_ctrl;
> +       u64 misc;
>         struct nested_vmx_msrs nested;
>  };
>  extern struct vmcs_config vmcs_config;
> @@ -225,11 +226,8 @@ static inline bool cpu_has_vmx_vmfunc(void)
>  
>  static inline bool cpu_has_vmx_shadow_vmcs(void)
>  {
> -       u64 vmx_msr;
> -
>         /* check if the cpu supports writing r/o exit information fields */
> -       rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
> -       if (!(vmx_msr & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
> +       if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
>                 return false;
>  
>         return vmcs_config.cpu_based_2nd_exec_ctrl &
> @@ -371,10 +369,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)
>  
>  static inline bool cpu_has_vmx_intel_pt(void)
>  {
> -       u64 vmx_msr;
> -
> -       rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
> -       return (vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT) &&
> +       return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
>                 (vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
>                 (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
>  }
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 35285109856f..ab091758c437 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2479,6 +2479,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         u64 _cpu_based_3rd_exec_control = 0;
>         u32 _vmexit_control = 0;
>         u32 _vmentry_control = 0;
> +       u64 misc_msr;
>         int i;
>  
>         /*
> @@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         if (((vmx_msr_high >> 18) & 15) != 6)
>                 return -EIO;
>  
> +       rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
> +
>         vmcs_conf->size = vmx_msr_high & 0x1fff;
>         vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
>  
> @@ -2624,6 +2627,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
>         vmcs_conf->vmexit_ctrl         = _vmexit_control;
>         vmcs_conf->vmentry_ctrl        = _vmentry_control;
> +       vmcs_conf->misc = misc_msr;
>  
>         return 0;
>  }
> @@ -8241,11 +8245,9 @@ static __init int hardware_setup(void)
>  
>         if (enable_preemption_timer) {
>                 u64 use_timer_freq = 5000ULL * 1000 * 1000;
> -               u64 vmx_msr;
>  
> -               rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
>                 cpu_preemption_timer_multi =
> -                       vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
> +                       vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
>  
>                 if (tsc_khz)
>                         use_timer_freq = (u64)tsc_khz * 1000;


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR
  2022-07-08 14:42 ` [PATCH v3 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Vitaly Kuznetsov
@ 2022-07-12 12:11   ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 12:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini, Sean Christopherson
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv, linux-kernel

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> vmcs_config has cased host MSR_IA32_VMX_MISC value, use it for setting
 ^^ typo
> up nested MSR_IA32_VMX_MISC in nested_vmx_setup_ctls_msrs() and avoid the
> redundant rdmsr().
> 
> No (real) functional change intended.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 3d386c663fac..8026dab71086 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6754,10 +6754,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
>                 msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
>  
>         /* miscellaneous data */
> -       rdmsr(MSR_IA32_VMX_MISC,
> -               msrs->misc_low,
> -               msrs->misc_high);
> -       msrs->misc_low &= VMX_MISC_SAVE_EFER_LMA;
> +       msrs->misc_low = (u32)vmcs_conf->misc & VMX_MISC_SAVE_EFER_LMA;
>         msrs->misc_low |=
>                 MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS |
>                 VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE |

Besides the typo,
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
 Maxim Levitsky



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  2022-07-12 11:54   ` Maxim Levitsky
@ 2022-07-12 12:14     ` Vitaly Kuznetsov
  2022-07-12 12:16       ` Maxim Levitsky
  0 siblings, 1 reply; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-12 12:14 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv,
	linux-kernel, Paolo Bonzini, Sean Christopherson

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
>> With the updated eVMCSv1 definition, there's no known 'problematic'
>> controls which are exposed in VMX control MSRs but are not present in
>> eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.
>
> If I understand correctly we are taking about running KVM as a nested guest of Hyper-V here:
>
> Don't we need to check the new CPUID bit and only then use the new fields of eVMCS,
> aka check that the 'cpu' supports the updated eVMCS version?
>

I've checked various Hyper-V versions available around and it seems
there's no need for that: these new features are exposed in VMX control
MSRs only when the updated eVMCS is supported.

We can, in theory, preserve the filtering for non-updated eVMCS verison
but I'd vote for putting a WARN_ON() or something around: we can
eventually get rid of it in case we don't get any reports.

> Best regards,
> 	Maxim Levitsky
>
>
>
>> 
>> Note: VMX control MSRs filtering for Hyper-V on KVM
>> (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
>> definition doesn't have all the features implemented by KVM and some
>> fields are still missing. Moreover, nested_evmcs_filter_control_msr()
>> has to support the original eVMCSv1 version when VMM wishes so.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/kvm/vmx/evmcs.c | 13 -------------
>>  arch/x86/kvm/vmx/evmcs.h |  1 -
>>  arch/x86/kvm/vmx/vmx.c   |  5 -----
>>  3 files changed, 19 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
>> index 52a53debd806..b5cfbf7d487b 100644
>> --- a/arch/x86/kvm/vmx/evmcs.c
>> +++ b/arch/x86/kvm/vmx/evmcs.c
>> @@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>>  };
>>  const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
>>  
>> -#if IS_ENABLED(CONFIG_HYPERV)
>> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
>> -{
>> -       vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
>> -       vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
>> -       vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
>> -       vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
>> -
>> -       vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
>> -       vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
>> -}
>> -#endif
>> -
>>  bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
>>  {
>>         struct hv_vp_assist_page assist_page;
>> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
>> index 4b809c79ae63..0feac101cce4 100644
>> --- a/arch/x86/kvm/vmx/evmcs.h
>> +++ b/arch/x86/kvm/vmx/evmcs.h
>> @@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
>>         vp_ap->enlighten_vmentry = 1;
>>  }
>>  
>> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
>>  #else /* !IS_ENABLED(CONFIG_HYPERV) */
>>  static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
>>  static inline void evmcs_write32(unsigned long field, u32 value) {}
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index b4915d841357..dd905ad72637 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>>         vmcs_conf->vmexit_ctrl         = _vmexit_control;
>>         vmcs_conf->vmentry_ctrl        = _vmentry_control;
>>  
>> -#if IS_ENABLED(CONFIG_HYPERV)
>> -       if (enlightened_vmcs)
>> -               evmcs_sanitize_exec_ctrls(vmcs_conf);
>> -#endif
>> -
>>         return 0;
>>  }
>>  
>
>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
  2022-07-12 12:14     ` Vitaly Kuznetsov
@ 2022-07-12 12:16       ` Maxim Levitsky
  0 siblings, 0 replies; 54+ messages in thread
From: Maxim Levitsky @ 2022-07-12 12:16 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv,
	linux-kernel, Paolo Bonzini, Sean Christopherson

On Tue, 2022-07-12 at 14:14 +0200, Vitaly Kuznetsov wrote:
> Maxim Levitsky <mlevitsk@redhat.com> writes:
> 
> > On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> > > With the updated eVMCSv1 definition, there's no known 'problematic'
> > > controls which are exposed in VMX control MSRs but are not present in
> > > eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.
> > 
> > If I understand correctly we are taking about running KVM as a nested guest of Hyper-V here:
> > 
> > Don't we need to check the new CPUID bit and only then use the new fields of eVMCS,
> > aka check that the 'cpu' supports the updated eVMCS version?
> > 
> 
> I've checked various Hyper-V versions available around and it seems
> there's no need for that: these new features are exposed in VMX control
> MSRs only when the updated eVMCS is supported.

Makes sense now. Might be worth a comment somewhere.

Best regards,
	Maxim Levitsky

> 
> We can, in theory, preserve the filtering for non-updated eVMCS verison
> but I'd vote for putting a WARN_ON() or something around: we can
> eventually get rid of it in case we don't get any reports.
> 
> > Best regards,
> >         Maxim Levitsky
> > 
> > 
> > 
> > > 
> > > Note: VMX control MSRs filtering for Hyper-V on KVM
> > > (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
> > > definition doesn't have all the features implemented by KVM and some
> > > fields are still missing. Moreover, nested_evmcs_filter_control_msr()
> > > has to support the original eVMCSv1 version when VMM wishes so.
> > > 
> > > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> > > ---
> > >  arch/x86/kvm/vmx/evmcs.c | 13 -------------
> > >  arch/x86/kvm/vmx/evmcs.h |  1 -
> > >  arch/x86/kvm/vmx/vmx.c   |  5 -----
> > >  3 files changed, 19 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> > > index 52a53debd806..b5cfbf7d487b 100644
> > > --- a/arch/x86/kvm/vmx/evmcs.c
> > > +++ b/arch/x86/kvm/vmx/evmcs.c
> > > @@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
> > >  };
> > >  const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
> > >  
> > > -#if IS_ENABLED(CONFIG_HYPERV)
> > > -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
> > > -{
> > > -       vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
> > > -       vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> > > -       vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> > > -       vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
> > > -
> > > -       vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> > > -       vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> > > -}
> > > -#endif
> > > -
> > >  bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
> > >  {
> > >         struct hv_vp_assist_page assist_page;
> > > diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> > > index 4b809c79ae63..0feac101cce4 100644
> > > --- a/arch/x86/kvm/vmx/evmcs.h
> > > +++ b/arch/x86/kvm/vmx/evmcs.h
> > > @@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
> > >         vp_ap->enlighten_vmentry = 1;
> > >  }
> > >  
> > > -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
> > >  #else /* !IS_ENABLED(CONFIG_HYPERV) */
> > >  static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
> > >  static inline void evmcs_write32(unsigned long field, u32 value) {}
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index b4915d841357..dd905ad72637 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> > >         vmcs_conf->vmexit_ctrl         = _vmexit_control;
> > >         vmcs_conf->vmentry_ctrl        = _vmentry_control;
> > >  
> > > -#if IS_ENABLED(CONFIG_HYPERV)
> > > -       if (enlightened_vmcs)
> > > -               evmcs_sanitize_exec_ctrls(vmcs_conf);
> > > -#endif
> > > -
> > >         return 0;
> > >  }
> > >  
> > 
> > 
> 



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 03/25] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
  2022-07-12 11:51   ` Maxim Levitsky
@ 2022-07-12 12:19     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-12 12:19 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv,
	linux-kernel, Paolo Bonzini, Sean Christopherson

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
>> Updated Hyper-V Enlightened VMCS specification lists several new
>> fields for the following features:
>> 
>> - PerfGlobalCtrl
>> - EnclsExitingBitmap
>> - Tsc Scaling
>> - GuestLbrCtl
>> - CET
>> - SSP
>> 
>> Update the definition. The updated definition is available only when
>> CPUID.0x4000000A.EBX BIT(0) is '1'. Add a define for it as well.
>> 
>> Note: The latest TLFS is available at
>> https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/include/asm/hyperv-tlfs.h | 18 ++++++++++++++++--
>>  1 file changed, 16 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
>> index 6f0acc45e67a..6f2c3cdacdf4 100644
>> --- a/arch/x86/include/asm/hyperv-tlfs.h
>> +++ b/arch/x86/include/asm/hyperv-tlfs.h
>> @@ -138,6 +138,9 @@
>>  #define HV_X64_NESTED_GUEST_MAPPING_FLUSH		BIT(18)
>>  #define HV_X64_NESTED_MSR_BITMAP			BIT(19)
>>  
>
> Maybe add a comment that this is undocumented + what that cpuid bit does?
>
>> +/* Nested quirks. These are HYPERV_CPUID_NESTED_FEATURES.EBX bits. */
>> +#define HV_X64_NESTED_EVMCS1_2022_UPDATE		BIT(0)
>> +
>>  /*
>>   * This is specific to AMD and specifies that enlightened TLB flush is
>>   * supported. If guest opts in to this feature, ASID invalidations only
>> @@ -559,9 +562,20 @@ struct hv_enlightened_vmcs {
>>  	u64 partition_assist_page;
>>  	u64 padding64_4[4];
>>  	u64 guest_bndcfgs;
>> -	u64 padding64_5[7];
>> +	u64 guest_ia32_perf_global_ctrl;
>> +	u64 guest_ia32_s_cet;
>> +	u64 guest_ssp;
>> +	u64 guest_ia32_int_ssp_table_addr;
>> +	u64 guest_ia32_lbr_ctl;
>> +	u64 padding64_5[2];
>
> This change looks OK
>
>>  	u64 xss_exit_bitmap;
>> -	u64 padding64_6[7];
>> +	u64 host_ia32_perf_global_ctrl;
>> +	u64 encls_exiting_bitmap;
>> +	u64 tsc_multiplier;
>> +	u64 host_ia32_s_cet;
>> +	u64 host_ssp;
>> +	u64 host_ia32_int_ssp_table_addr;
>> +	u64 padding64_6;
>
> I think we have a mistake here:
>
> UINT64 XssExitingBitmap;
> UINT64 EnclsExitingBitmap;
> UINT64 HostPerfGlobalCtrl;
> UINT64 TscMultiplier;
> UINT64 HostSCet;
> UINT64 HostSsp;
> UINT64 HostInterruptSspTableAddr;
> UINT64 Rsvd8;
>
>
> I think you need to swap encls_exiting_bitmap and host_ia32_perf_global_ctrl
>
> I used https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/datatypes/hv_vmx_enlightened_vmcs
> as the reference. 

Oh, nice catch, thanks! I have no idea how this mistake crept in. A
conspiracy theory: maybe the online version of TLFS was updated
under our feet? :-)

v4 is coming to rescue.

>
>
> Best regards,
> 	Maxim Levitsky
>
>
>>  } __packed;
>>  
>>  #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE			0
>
>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
  2022-07-12 11:51   ` Maxim Levitsky
@ 2022-07-13 15:45     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 54+ messages in thread
From: Vitaly Kuznetsov @ 2022-07-13 15:45 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: Anirudh Rayabharam, Wanpeng Li, Jim Mattson, linux-hyperv,
	linux-kernel, Paolo Bonzini, Sean Christopherson

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
>> KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
>> leaf to know with Enlightened VMCS definition to use (original or 2022
>> update). Cache the leaf along with other Hyper-V CPUID feature leaves
>> to make the check quick.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  arch/x86/include/asm/kvm_host.h | 2 ++
>>  arch/x86/kvm/hyperv.c           | 9 +++++++++
>>  2 files changed, 11 insertions(+)
>> 
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index de5a149d0971..077ec9cf3169 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
>>  		u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
>>  		u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
>>  		u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
>> +		u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
>> +		u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
>>  	} cpuid_cache;
>>  };
>>  
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index e08189211d9a..b666902da4d9 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -2030,6 +2030,15 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
>>  		hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
>>  	else
>>  		hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
>> +
>> +	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
>> +	if (entry) {
>> +		hv_vcpu->cpuid_cache.nested_eax = entry->eax;
>> +		hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
>> +	} else {
>> +		hv_vcpu->cpuid_cache.nested_eax = 0;
>> +		hv_vcpu->cpuid_cache.nested_ebx = 0;
>> +	}
>>  }
>>  
>>  int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
>
>
> Small Nitpick:
>
> If I understand correctly, the kvm_find_cpuid_entry can fail if the userspace didn't provide the
> cpuid entry.
>
> Since the code that deals with failback is now repeated 3 times, how about some wrapper function that
> will return all zeros for a non present cpuid entry?

I've opted for wiping the whole hv_vcpu->cpuid_cache with memset(), this
way we don't even need a new helper.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2022-07-13 15:45 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-08 14:41 [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs Vitaly Kuznetsov
2022-07-08 14:41 ` [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags Vitaly Kuznetsov
2022-07-12 11:49   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 02/25] x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Vitaly Kuznetsov
2022-07-08 14:42 ` [PATCH v3 03/25] x86/hyperv: Update " Vitaly Kuznetsov
2022-07-12 11:51   ` Maxim Levitsky
2022-07-12 12:19     ` Vitaly Kuznetsov
2022-07-08 14:42 ` [PATCH v3 04/25] KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Vitaly Kuznetsov
2022-07-12 11:50   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 05/25] KVM: nVMX: Support several new fields in eVMCSv1 Vitaly Kuznetsov
2022-07-12 11:51   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf Vitaly Kuznetsov
2022-07-12 11:51   ` Maxim Levitsky
2022-07-13 15:45     ` Vitaly Kuznetsov
2022-07-08 14:42 ` [PATCH v3 07/25] KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields Vitaly Kuznetsov
2022-07-12 11:51   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition Vitaly Kuznetsov
2022-07-12 11:52   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS Vitaly Kuznetsov
2022-07-12 11:53   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest Vitaly Kuznetsov
2022-07-12 11:54   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization Vitaly Kuznetsov
2022-07-12 11:54   ` Maxim Levitsky
2022-07-12 12:14     ` Vitaly Kuznetsov
2022-07-12 12:16       ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 12/25] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() Vitaly Kuznetsov
2022-07-12 11:55   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING " Vitaly Kuznetsov
2022-07-12 11:55   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 14/25] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING " Vitaly Kuznetsov
2022-07-12 11:56   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 15/25] KVM: VMX: Extend VMX controls macro shenanigans Vitaly Kuznetsov
2022-07-12 11:56   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 16/25] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() Vitaly Kuznetsov
2022-07-12 11:56   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config Vitaly Kuznetsov
2022-07-12 11:57   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 18/25] KVM: VMX: Add missing CPU based VM execution " Vitaly Kuznetsov
2022-07-12 11:57   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Vitaly Kuznetsov
2022-07-12 11:58   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics Vitaly Kuznetsov
2022-07-12 11:58   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config() Vitaly Kuznetsov
2022-07-12 11:59   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 22/25] KVM: nVMX: Always set required-1 bits of pinbased_ctls to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR Vitaly Kuznetsov
2022-07-12 12:02   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs Vitaly Kuznetsov
2022-07-12 12:11   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config Vitaly Kuznetsov
2022-07-12 12:11   ` Maxim Levitsky
2022-07-08 14:42 ` [PATCH v3 25/25] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR Vitaly Kuznetsov
2022-07-12 12:11   ` Maxim Levitsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.