All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs
@ 2022-12-13  6:23 Sean Christopherson
  2022-12-13  6:23 ` [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 Sean Christopherson
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Sean Christopherson @ 2022-12-13  6:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

Fix bugs in KVM's (mis)handling of secondary execution controls.

KVM overrides the secondary execution control VMX MSR during KVM_SET_CPUID.
Similar to the somewhat recent reverts

  8805875aa473 ("Revert "KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled"")
  9389d5774aca ("Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control"")

undo misguided KVM behavior where KVM overrides allowed-1 settings in the
secondary execution controls in response to changes to the guest's CPUID
model.  To avoid breaking userspace that doesn't take ownership of the
VMX MSRs, go hands off if and only if userpace sets the MSR in question.

Before fixing that, fix another bug it was hiding where the umwait/tpause
control was being exposed to L1 for nVMX only after KVM_SET_CPUID, and
harden KVM against similar bugs in the future.

v2: Fix the ENABLE_USR_WAIT_PAUSE bug too. [Aaron]

v1: https://lore.kernel.org/all/20221110005706.1064832-1-seanjc@google.com

Sean Christopherson (4):
  KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
  KVM: nVMX: Don't stuff secondary execution control if it's not
    supported
  KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes
  KVM: selftests: Test KVM's handling of VMX's sec exec MSR on
    KVM_SET_CPUID

 arch/x86/kvm/vmx/capabilities.h               |  1 +
 arch/x86/kvm/vmx/nested.c                     |  6 +-
 arch/x86/kvm/vmx/vmx.c                        | 17 +++-
 .../selftests/kvm/include/x86_64/processor.h  |  1 +
 .../selftests/kvm/include/x86_64/vmx.h        |  4 +-
 .../selftests/kvm/x86_64/vmx_msrs_test.c      | 92 +++++++++++++++++++
 6 files changed, 116 insertions(+), 5 deletions(-)


base-commit: 02076de83f4de19a045227b9d44084a30e936c26
-- 
2.39.0.rc1.256.g54fd8350bd-goog


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
  2022-12-13  6:23 [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Sean Christopherson
@ 2022-12-13  6:23 ` Sean Christopherson
  2022-12-13 10:26   ` Yu Zhang
  2022-12-13 18:08   ` Jim Mattson
  2022-12-13  6:23 ` [PATCH v2 2/4] KVM: nVMX: Don't stuff secondary execution control if it's not supported Sean Christopherson
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 15+ messages in thread
From: Sean Christopherson @ 2022-12-13  6:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the
feature is supported in hardware and enabled in KVM's base, non-nested
configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported.
This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail
if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and
obviously allows L1 to enable the feature for L2.

KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing
the allowed-1 control ina vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when
updating secondary controls in response to KVM_SET_CPUID(2), but (a) that
depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID
updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction
that the guest value must be a strict subset of the supported host value.

Although no past commit explicitly enabled nested support for WAITPKG,
doing so is safe and functionally correct from an architectural
perspective as no additional KVM support is needed to virtualize TPAUSE,
UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards
VM-Exits to L1 as necessary (commit bf653b78f960, "KVM: vmx: Introduce
handle_unexpected_vmexit and handle WAITPKG vmexit").

Note, KVM always keeps the hosts MSR_IA32_UMWAIT_CONTROL resident in
hardware, i.e. always runs both L1 and L2 with the host's power management
settings for TPAUSE and UMWAIT.  See commit bf09fb6cba4f ("KVM: VMX: Stop
context switching MSR_IA32_UMWAIT_CONTROL") for more details.

Fixes: e69e72faa3a0 ("KVM: x86: Add support for user wait instructions")
Cc: stable@vger.kernel.org
Reported-by: Aaron Lewis <aaronlewis@google.com>
Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b6f4411b613e..d131375f347a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6873,7 +6873,8 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 		SECONDARY_EXEC_ENABLE_INVPCID |
 		SECONDARY_EXEC_RDSEED_EXITING |
 		SECONDARY_EXEC_XSAVES |
-		SECONDARY_EXEC_TSC_SCALING;
+		SECONDARY_EXEC_TSC_SCALING |
+		SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
 
 	/*
 	 * We can emulate "VMCS shadowing," even if the hardware
-- 
2.39.0.rc1.256.g54fd8350bd-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/4] KVM: nVMX: Don't stuff secondary execution control if it's not supported
  2022-12-13  6:23 [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Sean Christopherson
  2022-12-13  6:23 ` [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 Sean Christopherson
@ 2022-12-13  6:23 ` Sean Christopherson
  2022-12-13  6:23 ` [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes Sean Christopherson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2022-12-13  6:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

When stuffing the allowed secondary execution controls for nested VMX in
response to CPUID updates, don't set the allowed-1 bit for a feature that
isn't supported by KVM, i.e. isn't allowed by the canonical vmcs_config.

WARN if KVM attempts to manipulate a feature that isn't supported.  All
features that are currently stuffed are always advertised to L1 for
nested VMX if they are supported in KVM's base configuration, and no
additional features should ever be added to the CPUID-induced stuffing
(updating VMX MSRs in response to CPUID updates is a long-standing KVM
flaw that is slowly being fixed).

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fe5615fd8295..13d3f5eb4c32 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4459,6 +4459,16 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
 	 * controls for features that are/aren't exposed to the guest.
 	 */
 	if (nested) {
+		/*
+		 * All features that got grandfathered into KVM's flawed CPUID-
+		 * induced manipulation of VMX MSRs are unconditionally exposed
+		 * to L1 if the feature is supported by KVM (for nested).  I.e.
+		 * KVM should never attempt to stuff a feature that isn't
+		 * already exposed to L1 for nested virtualization.
+		 */
+		if (WARN_ON_ONCE(!(vmcs_config.nested.secondary_ctls_high & control)))
+			enabled = false;
+
 		if (enabled)
 			vmx->nested.msrs.secondary_ctls_high |= control;
 		else
-- 
2.39.0.rc1.256.g54fd8350bd-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes
  2022-12-13  6:23 [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Sean Christopherson
  2022-12-13  6:23 ` [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 Sean Christopherson
  2022-12-13  6:23 ` [PATCH v2 2/4] KVM: nVMX: Don't stuff secondary execution control if it's not supported Sean Christopherson
@ 2022-12-13  6:23 ` Sean Christopherson
  2022-12-23 17:30   ` Paolo Bonzini
  2022-12-13  6:23 ` [PATCH v2 4/4] KVM: selftests: Test KVM's handling of VMX's sec exec MSR on KVM_SET_CPUID Sean Christopherson
  2022-12-14  3:00 ` [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Yu Zhang
  4 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2022-12-13  6:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

Don't modify the set of allowed secondary execution controls, i.e. the
virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes.
To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies
on KVM to provide a consistent vCPU model, keep the existing behavior if
userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2.

KVM should not modify the VMX capabilities presented to L1 based on CPUID
as doing so may discard explicit settings provided by userspace.  E.g. if
userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in
the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then
stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2
against userspace's wishes.

Alternatively, KVM could add a quirk, but that's less than ideal as a VMM
that is affected by the bug would need to be updated in order to opt out
of the buggy behavior.  The "has the MSR ever been written" logic handles
both the care where an enlightened userspace sets the MSR during setup,
and the case where userspace blindly migrates the MSR, as the migrated
value will already have been sanitized by the source KVM.

Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/capabilities.h | 1 +
 arch/x86/kvm/vmx/nested.c       | 3 +++
 arch/x86/kvm/vmx/vmx.c          | 7 +++++--
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index cd2ac9536c99..7b08d6006f52 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -51,6 +51,7 @@ struct nested_vmx_msrs {
 	u64 cr4_fixed1;
 	u64 vmcs_enum;
 	u64 vmfunc_controls;
+	bool secondary_set_by_userspace;
 };
 
 struct vmcs_config {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d131375f347a..0140893412b7 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1271,6 +1271,9 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
 	if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
 		return -EINVAL;
 
+	if (msr_index == MSR_IA32_VMX_PROCBASED_CTLS2)
+		vmx->nested.msrs.secondary_set_by_userspace = true;
+
 	vmx_get_control_msr(&vmx->nested.msrs, msr_index, &lowp, &highp);
 	*lowp = data;
 	*highp = data >> 32;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 13d3f5eb4c32..dd0247bc7193 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4456,9 +4456,12 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,
 
 	/*
 	 * Update the nested MSR settings so that a nested VMM can/can't set
-	 * controls for features that are/aren't exposed to the guest.
+	 * controls for features that are/aren't exposed to the guest.  Stuff
+	 * the MSR if and only if userspace hasn't explicitly set the MSR, i.e.
+	 * to avoid ABI breakage if userspace might be relying on KVM's flawed
+	 * behavior to expose features to L1.
 	 */
-	if (nested) {
+	if (nested && !vmx->nested.msrs.secondary_set_by_userspace) {
 		/*
 		 * All features that got grandfathered into KVM's flawed CPUID-
 		 * induced manipulation of VMX MSRs are unconditionally exposed
-- 
2.39.0.rc1.256.g54fd8350bd-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 4/4] KVM: selftests: Test KVM's handling of VMX's sec exec MSR on KVM_SET_CPUID
  2022-12-13  6:23 [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Sean Christopherson
                   ` (2 preceding siblings ...)
  2022-12-13  6:23 ` [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes Sean Christopherson
@ 2022-12-13  6:23 ` Sean Christopherson
  2022-12-14  3:00 ` [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Yu Zhang
  4 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2022-12-13  6:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

Verify that KVM does, and does not, modify the allowed set of VMX's
secondary execution controls during KVM_SET_CPUID.  Historically, KVM has
modified select bits in response to guest CPUID changes to try and force
a consistent CPU model.  KVM's meddling causes problems if userspace
invokes KVM_SET_CPUID after explicitly setting the MSR, as KVM may end up
overriding a legal userspace config.

Newer, fixed KVM versions maintain the historical meddling for backwards
compatibility, but only if userspace has never set the MSR for the vCPU.
I.e. KVM transfers ownership to userspace on the first write.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 .../selftests/kvm/include/x86_64/processor.h  |  1 +
 .../selftests/kvm/include/x86_64/vmx.h        |  4 +-
 .../selftests/kvm/x86_64/vmx_msrs_test.c      | 92 +++++++++++++++++++
 3 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index b1a31de7108a..9314a06f56d3 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -109,6 +109,7 @@ struct kvm_x86_cpu_feature {
 #define	X86_FEATURE_INVPCID		KVM_X86_CPU_FEATURE(0x7, 0, EBX, 10)
 #define	X86_FEATURE_RTM			KVM_X86_CPU_FEATURE(0x7, 0, EBX, 11)
 #define	X86_FEATURE_MPX			KVM_X86_CPU_FEATURE(0x7, 0, EBX, 14)
+#define X86_FEATURE_RDSEED		KVM_X86_CPU_FEATURE(0x7, 0, EBX, 18)
 #define	X86_FEATURE_SMAP		KVM_X86_CPU_FEATURE(0x7, 0, EBX, 20)
 #define	X86_FEATURE_PCOMMIT		KVM_X86_CPU_FEATURE(0x7, 0, EBX, 22)
 #define	X86_FEATURE_CLFLUSHOPT		KVM_X86_CPU_FEATURE(0x7, 0, EBX, 23)
diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 5f0c0a29c556..b66661ba28c8 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -61,8 +61,8 @@
 #define SECONDARY_EXEC_SHADOW_VMCS		0x00004000
 #define SECONDARY_EXEC_RDSEED_EXITING		0x00010000
 #define SECONDARY_EXEC_ENABLE_PML		0x00020000
-#define SECONDARY_EPT_VE			0x00040000
-#define SECONDARY_ENABLE_XSAV_RESTORE		0x00100000
+#define SECONDARY_EXEC_EPT_VE			0x00040000
+#define SECONDARY_EXEC_ENABLE_XSAVES		0x00100000
 #define SECONDARY_EXEC_TSC_SCALING		0x02000000
 
 #define PIN_BASED_EXT_INTR_MASK			0x00000001
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c b/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c
index 90720b6205f4..d7b1a72a8912 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c
@@ -12,6 +12,96 @@
 #include "kvm_util.h"
 #include "vmx.h"
 
+static void vmx_sec_exec_assert_allowed(struct kvm_vcpu *vcpu,
+					const char *name, uint64_t ctrl)
+{
+	TEST_ASSERT(vcpu_get_msr(vcpu, MSR_IA32_VMX_PROCBASED_CTLS2) & ctrl,
+		    "Expected '%s' to be allowed in sec exec controls", name);
+}
+
+static void vmx_sec_exec_assert_denied(struct kvm_vcpu *vcpu,
+				       const char *name, uint64_t ctrl)
+{
+	TEST_ASSERT(!(vcpu_get_msr(vcpu, MSR_IA32_VMX_PROCBASED_CTLS2) & ctrl),
+		    "Expected '%s' to be denied in sec exec controls", name);
+}
+
+static void vmx_sec_exec_control_test(struct kvm_vcpu *vcpu,
+				      const char *name,
+				      struct kvm_x86_cpu_feature feature,
+				      uint64_t ctrl, bool kvm_owned)
+{
+	/* Allowed-1 settings are in the upper 32 bits. */
+	ctrl <<= 32;
+
+	if (!this_cpu_has(feature))
+		return;
+
+	if (kvm_owned) {
+		vcpu_set_cpuid_feature(vcpu, feature);
+		vmx_sec_exec_assert_allowed(vcpu, name, ctrl);
+
+		vcpu_clear_cpuid_feature(vcpu, feature);
+		vmx_sec_exec_assert_denied(vcpu, name, ctrl);
+
+		/* Make sure KVM is actually toggling the bit. */
+		vcpu_set_cpuid_feature(vcpu, feature);
+		vmx_sec_exec_assert_allowed(vcpu, name, ctrl);
+	} else {
+		vcpu_set_msr(vcpu, MSR_IA32_VMX_PROCBASED_CTLS2,
+			     vcpu_get_msr(vcpu, MSR_IA32_VMX_PROCBASED_CTLS2) | ctrl);
+		vmx_sec_exec_assert_allowed(vcpu, name, ctrl);
+
+		vcpu_set_cpuid_feature(vcpu, feature);
+		vmx_sec_exec_assert_allowed(vcpu, name, ctrl);
+
+		vcpu_clear_cpuid_feature(vcpu, feature);
+		vmx_sec_exec_assert_allowed(vcpu, name, ctrl);
+
+		vcpu_set_msr(vcpu, MSR_IA32_VMX_PROCBASED_CTLS2,
+			     vcpu_get_msr(vcpu, MSR_IA32_VMX_PROCBASED_CTLS2) & ~ctrl);
+		vmx_sec_exec_assert_denied(vcpu, name, ctrl);
+
+		vcpu_set_cpuid_feature(vcpu, feature);
+		vmx_sec_exec_assert_denied(vcpu, name, ctrl);
+
+		vcpu_clear_cpuid_feature(vcpu, feature);
+		vmx_sec_exec_assert_denied(vcpu, name, ctrl);
+	}
+}
+
+#define vmx_sec_exec_feature_test(vcpu, name, kvm_owned)			\
+	vmx_sec_exec_control_test(vcpu, #name, X86_FEATURE_##name,		\
+				  SECONDARY_EXEC_ENABLE_##name, kvm_owned)
+
+#define vmx_sec_exec_exiting_test(vcpu, name, kvm_owned)			\
+	vmx_sec_exec_control_test(vcpu, #name, X86_FEATURE_##name,		\
+				  SECONDARY_EXEC_##name##_EXITING, kvm_owned)
+
+static void vmx_sec_exec_controls_test(struct kvm_vcpu *vcpu)
+{
+	int i;
+
+	if (this_cpu_has(X86_FEATURE_XSAVE))
+		vcpu_set_cpuid_feature(vcpu, X86_FEATURE_XSAVE);
+
+	if (this_cpu_has(X86_FEATURE_RDPID))
+		vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_RDPID);
+
+	/*
+	 * Verify that for features KVM has historically taken control of, KVM
+	 * updates PROCBASED_CTLS2 during KVM_SET_CPUID if userspace has never
+	 * set the MSR, but leaves it alone once userspace writes the MSR.
+	 */
+	for (i = 0; i < 2; i++) {
+		vmx_sec_exec_feature_test(vcpu, XSAVES, !i);
+		vmx_sec_exec_feature_test(vcpu, RDTSCP, !i);
+		vmx_sec_exec_feature_test(vcpu, INVPCID, !i);
+		vmx_sec_exec_exiting_test(vcpu, RDRAND, !i);
+		vmx_sec_exec_exiting_test(vcpu, RDSEED, !i);
+	}
+}
+
 static void vmx_fixed1_msr_test(struct kvm_vcpu *vcpu, uint32_t msr_index,
 				  uint64_t mask)
 {
@@ -124,6 +214,8 @@ int main(void)
 	/* No need to actually do KVM_RUN, thus no guest code. */
 	vm = vm_create_with_one_vcpu(&vcpu, NULL);
 
+	vmx_sec_exec_controls_test(vcpu);
+
 	vmx_save_restore_msrs_test(vcpu);
 	ia32_feature_control_msr_test(vcpu);
 
-- 
2.39.0.rc1.256.g54fd8350bd-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
  2022-12-13  6:23 ` [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 Sean Christopherson
@ 2022-12-13 10:26   ` Yu Zhang
  2022-12-13 18:08   ` Jim Mattson
  1 sibling, 0 replies; 15+ messages in thread
From: Yu Zhang @ 2022-12-13 10:26 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Aaron Lewis, Lixiao Yang

On Tue, Dec 13, 2022 at 06:23:03AM +0000, Sean Christopherson wrote:
> Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the
> feature is supported in hardware and enabled in KVM's base, non-nested
> configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported.
> This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail
> if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and
> obviously allows L1 to enable the feature for L2.
> 
> KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing
> the allowed-1 control ina vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when
> updating secondary controls in response to KVM_SET_CPUID(2), but (a) that
> depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID
> updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction
> that the guest value must be a strict subset of the supported host value.
> 
> Although no past commit explicitly enabled nested support for WAITPKG,
> doing so is safe and functionally correct from an architectural
> perspective as no additional KVM support is needed to virtualize TPAUSE,
> UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards
> VM-Exits to L1 as necessary (commit bf653b78f960, "KVM: vmx: Introduce
> handle_unexpected_vmexit and handle WAITPKG vmexit").
> 
> Note, KVM always keeps the hosts MSR_IA32_UMWAIT_CONTROL resident in
> hardware, i.e. always runs both L1 and L2 with the host's power management
> settings for TPAUSE and UMWAIT.  See commit bf09fb6cba4f ("KVM: VMX: Stop
> context switching MSR_IA32_UMWAIT_CONTROL") for more details.
> 
> Fixes: e69e72faa3a0 ("KVM: x86: Add support for user wait instructions")
> Cc: stable@vger.kernel.org
> Reported-by: Aaron Lewis <aaronlewis@google.com>
> Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Could you please help add
"Reported-by: Yang, Lixiao <lixiao.yang@intel.com>"
She identified the failure of vmx_msrs_test in KVM selftest first. Thanks!

B.R.
Yu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
  2022-12-13  6:23 ` [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 Sean Christopherson
  2022-12-13 10:26   ` Yu Zhang
@ 2022-12-13 18:08   ` Jim Mattson
  1 sibling, 0 replies; 15+ messages in thread
From: Jim Mattson @ 2022-12-13 18:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Aaron Lewis, Yu Zhang

On Mon, Dec 12, 2022 at 10:23 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the
> feature is supported in hardware and enabled in KVM's base, non-nested
> configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported.
> This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail
> if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and
> obviously allows L1 to enable the feature for L2.
>
> KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing
> the allowed-1 control ina vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when
> updating secondary controls in response to KVM_SET_CPUID(2), but (a) that
> depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID
> updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction
> that the guest value must be a strict subset of the supported host value.
>
> Although no past commit explicitly enabled nested support for WAITPKG,
> doing so is safe and functionally correct from an architectural
> perspective as no additional KVM support is needed to virtualize TPAUSE,
> UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards
> VM-Exits to L1 as necessary (commit bf653b78f960, "KVM: vmx: Introduce
> handle_unexpected_vmexit and handle WAITPKG vmexit").
>
> Note, KVM always keeps the hosts MSR_IA32_UMWAIT_CONTROL resident in
> hardware, i.e. always runs both L1 and L2 with the host's power management
> settings for TPAUSE and UMWAIT.  See commit bf09fb6cba4f ("KVM: VMX: Stop
> context switching MSR_IA32_UMWAIT_CONTROL") for more details.
>
> Fixes: e69e72faa3a0 ("KVM: x86: Add support for user wait instructions")
> Cc: stable@vger.kernel.org
> Reported-by: Aaron Lewis <aaronlewis@google.com>
> Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs
  2022-12-13  6:23 [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Sean Christopherson
                   ` (3 preceding siblings ...)
  2022-12-13  6:23 ` [PATCH v2 4/4] KVM: selftests: Test KVM's handling of VMX's sec exec MSR on KVM_SET_CPUID Sean Christopherson
@ 2022-12-14  3:00 ` Yu Zhang
  2022-12-15  0:18   ` Sean Christopherson
  4 siblings, 1 reply; 15+ messages in thread
From: Yu Zhang @ 2022-12-14  3:00 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Aaron Lewis

On Tue, Dec 13, 2022 at 06:23:02AM +0000, Sean Christopherson wrote:
> Fix bugs in KVM's (mis)handling of secondary execution controls.
> 
> KVM overrides the secondary execution control VMX MSR during KVM_SET_CPUID.
> Similar to the somewhat recent reverts
> 
>   8805875aa473 ("Revert "KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled"")
>   9389d5774aca ("Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control"")
> 
> undo misguided KVM behavior where KVM overrides allowed-1 settings in the
> secondary execution controls in response to changes to the guest's CPUID
> model.  To avoid breaking userspace that doesn't take ownership of the
> VMX MSRs, go hands off if and only if userpace sets the MSR in question.
> 
> Before fixing that, fix another bug it was hiding where the umwait/tpause
> control was being exposed to L1 for nVMX only after KVM_SET_CPUID, and
> harden KVM against similar bugs in the future.
> 
> v2: Fix the ENABLE_USR_WAIT_PAUSE bug too. [Aaron]
> 
> v1: https://lore.kernel.org/all/20221110005706.1064832-1-seanjc@google.com
> 
> Sean Christopherson (4):
>   KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
>   KVM: nVMX: Don't stuff secondary execution control if it's not
>     supported
>   KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes
>   KVM: selftests: Test KVM's handling of VMX's sec exec MSR on
>     KVM_SET_CPUID

BTW, we may need another patch to remove the obsolete comments in
nested_vmx_setup_ctls_msrs():

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b6f4411b613e..42ceddcafd3e 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6854,11 +6854,6 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
        msrs->procbased_ctls_low &=
                ~(CPU_BASED_CR3_LOAD_EXITING | CPU_BASED_CR3_STORE_EXITING);

-       /*
-        * secondary cpu-based controls.  Do not include those that
-        * depend on CPUID bits, they are added later by
-        * vmx_vcpu_after_set_cpuid.
-        */
        msrs->secondary_ctls_low = 0;

        msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;

B.R.
Yu

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs
  2022-12-14  3:00 ` [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Yu Zhang
@ 2022-12-15  0:18   ` Sean Christopherson
  2022-12-15 11:24     ` Yu Zhang
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2022-12-15  0:18 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paolo Bonzini, kvm, linux-kernel, Aaron Lewis

On Wed, Dec 14, 2022, Yu Zhang wrote:
> On Tue, Dec 13, 2022 at 06:23:02AM +0000, Sean Christopherson wrote:
> > Fix bugs in KVM's (mis)handling of secondary execution controls.
> > 
> > KVM overrides the secondary execution control VMX MSR during KVM_SET_CPUID.
> > Similar to the somewhat recent reverts
> > 
> >   8805875aa473 ("Revert "KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled"")
> >   9389d5774aca ("Revert "KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control"")
> > 
> > undo misguided KVM behavior where KVM overrides allowed-1 settings in the
> > secondary execution controls in response to changes to the guest's CPUID
> > model.  To avoid breaking userspace that doesn't take ownership of the
> > VMX MSRs, go hands off if and only if userpace sets the MSR in question.
> > 
> > Before fixing that, fix another bug it was hiding where the umwait/tpause
> > control was being exposed to L1 for nVMX only after KVM_SET_CPUID, and
> > harden KVM against similar bugs in the future.
> > 
> > v2: Fix the ENABLE_USR_WAIT_PAUSE bug too. [Aaron]
> > 
> > v1: https://lore.kernel.org/all/20221110005706.1064832-1-seanjc@google.com
> > 
> > Sean Christopherson (4):
> >   KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1
> >   KVM: nVMX: Don't stuff secondary execution control if it's not
> >     supported
> >   KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes
> >   KVM: selftests: Test KVM's handling of VMX's sec exec MSR on
> >     KVM_SET_CPUID
> 
> BTW, we may need another patch to remove the obsolete comments in
> nested_vmx_setup_ctls_msrs():

Ouch, indeed.  Want to send a proper patch?  Or provide your SoB and I'll write
a changelog?

The comment was added by commit 80154d77c922 ("KVM: VMX: cache secondary exec controls"),
but arguably the below is the appropriate Fixes, as it's the commit that fixed the
existing cases where KVM didn't enumerate supported-but-conditional controls.

Fixes: 6defc591846d ("KVM: nVMX: include conditional controls in /dev/kvm KVM_GET_MSRS")

> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index b6f4411b613e..42ceddcafd3e 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6854,11 +6854,6 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
>         msrs->procbased_ctls_low &=
>                 ~(CPU_BASED_CR3_LOAD_EXITING | CPU_BASED_CR3_STORE_EXITING);
> 
> -       /*
> -        * secondary cpu-based controls.  Do not include those that
> -        * depend on CPUID bits, they are added later by
> -        * vmx_vcpu_after_set_cpuid.
> -        */
>         msrs->secondary_ctls_low = 0;
> 
>         msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
> 
> B.R.
> Yu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs
  2022-12-15  0:18   ` Sean Christopherson
@ 2022-12-15 11:24     ` Yu Zhang
  2022-12-15 18:33       ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Yu Zhang @ 2022-12-15 11:24 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Aaron Lewis

On Thu, Dec 15, 2022 at 12:18:33AM +0000, Sean Christopherson wrote:
> > 
> > BTW, we may need another patch to remove the obsolete comments in
> > nested_vmx_setup_ctls_msrs():
> 
> Ouch, indeed.  Want to send a proper patch?  Or provide your SoB and I'll write
> a changelog?
> 
> The comment was added by commit 80154d77c922 ("KVM: VMX: cache secondary exec controls"),
> but arguably the below is the appropriate Fixes, as it's the commit that fixed the
> existing cases where KVM didn't enumerate supported-but-conditional controls.
> 
> Fixes: 6defc591846d ("KVM: nVMX: include conditional controls in /dev/kvm KVM_GET_MSRS")
> 

Thanks a lot, Sean, especially for sharing the commit history.

And I just sent out a patch to fix it.

One question is about the process of small cleanup patches like
this: would it be better off to include the cleanup patches as
part of a larger submission, or is it OK to be sent seperately?

Previously I submitted some small patches(e.g. [1] & [2]) but
have not received any reply. So I am just wondering, maybe those
patches are too trivial and sometimes time-wasting for the reviewers?

Any suggestion? Thanks!

B.R.
Yu

[1]: [PATCH] KVM: MMU: Add wrapper to check whether MMU is in direct mode
https://www.spinics.net/lists/kvm/msg297583.html
[2]: [PATCH v2 0/2] Cleanup VMFUNC handling in KVM. 
https://www.spinics.net/lists/kernel/msg4582139.html

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs
  2022-12-15 11:24     ` Yu Zhang
@ 2022-12-15 18:33       ` Sean Christopherson
  2022-12-16  9:59         ` Yu Zhang
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2022-12-15 18:33 UTC (permalink / raw)
  To: Yu Zhang; +Cc: Paolo Bonzini, kvm, linux-kernel, Aaron Lewis

On Thu, Dec 15, 2022, Yu Zhang wrote:
> On Thu, Dec 15, 2022 at 12:18:33AM +0000, Sean Christopherson wrote:
> > > 
> > > BTW, we may need another patch to remove the obsolete comments in
> > > nested_vmx_setup_ctls_msrs():
> > 
> > Ouch, indeed.  Want to send a proper patch?  Or provide your SoB and I'll write
> > a changelog?
> > 
> > The comment was added by commit 80154d77c922 ("KVM: VMX: cache secondary exec controls"),
> > but arguably the below is the appropriate Fixes, as it's the commit that fixed the
> > existing cases where KVM didn't enumerate supported-but-conditional controls.
> > 
> > Fixes: 6defc591846d ("KVM: nVMX: include conditional controls in /dev/kvm KVM_GET_MSRS")
> > 
> 
> Thanks a lot, Sean, especially for sharing the commit history.
> 
> And I just sent out a patch to fix it.
> 
> One question is about the process of small cleanup patches like
> this: would it be better off to include the cleanup patches as
> part of a larger submission, or is it OK to be sent seperately?

In this case, it's ok to be sent separately.  There are no code dependencies, and
the changelog can be written to say "this comment is no longer accurate", even if
there is still broken code in KVM.

> Previously I submitted some small patches(e.g. [1] & [2]) but
> have not received any reply. So I am just wondering, maybe those
> patches are too trivial and sometimes time-wasting for the reviewers?

They're definitely not too trivial.  This is just an especially rough time of
year for reviews, e.g. end of year corporate stuff, merge window, holidays, etc.

Part of why I haven't provided reviews is that the patches _aren't_ super trivial,
e.g. I'm on the fence on whether mmu_is_direct() should take @vcpu or @mmu, and if
I vote to have it take @mmu, then that'll conflict with mmu_is_nested().  So I end
up staying silent until I can come back to it with fresh eyes to see if there's a
better alternative, or if I'm just being nitpicky.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs
  2022-12-15 18:33       ` Sean Christopherson
@ 2022-12-16  9:59         ` Yu Zhang
  0 siblings, 0 replies; 15+ messages in thread
From: Yu Zhang @ 2022-12-16  9:59 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Aaron Lewis

> 
> They're definitely not too trivial.  This is just an especially rough time of
> year for reviews, e.g. end of year corporate stuff, merge window, holidays, etc.
> 

Glad to know that. Thanks!

> Part of why I haven't provided reviews is that the patches _aren't_ super trivial,
> e.g. I'm on the fence on whether mmu_is_direct() should take @vcpu or @mmu, and if
> I vote to have it take @mmu, then that'll conflict with mmu_is_nested().  So I end
> up staying silent until I can come back to it with fresh eyes to see if there's a
> better alternative, or if I'm just being nitpicky.
> 
Well, though I would prefer mmu_is_direct(), I appreciate for
being considerate. We can discuss it later. :)

B.R.
Yu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes
  2022-12-13  6:23 ` [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes Sean Christopherson
@ 2022-12-23 17:30   ` Paolo Bonzini
  2023-01-04 14:31     ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2022-12-23 17:30 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

On 12/13/22 07:23, Sean Christopherson wrote:
> Don't modify the set of allowed secondary execution controls, i.e. the
> virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes.
> To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies
> on KVM to provide a consistent vCPU model, keep the existing behavior if
> userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2.
> 
> KVM should not modify the VMX capabilities presented to L1 based on CPUID
> as doing so may discard explicit settings provided by userspace.  E.g. if
> userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in
> the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then
> stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2
> against userspace's wishes.

The commit message doesn't explain *why* KVM_SET_CPUID would be done 
before KVM_SET_MSRS.  The presence of certain MSRs or bits within is 
signaled by CPUID bits, and even though KVM is more lenient on 
host-initiated MSR writes it still verifies in general that the bits are 
valid.

For now I applied patch 1 and (with a reworded comment) patch 2.  I'm 
not opposed to the rest, but I would like to better understand the 
reason for them.  (If it has been reported to the mailing list, please 
add a "Link" trailer too).

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes
  2022-12-23 17:30   ` Paolo Bonzini
@ 2023-01-04 14:31     ` Sean Christopherson
  2023-01-04 14:42       ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2023-01-04 14:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

On Fri, Dec 23, 2022, Paolo Bonzini wrote:
> On 12/13/22 07:23, Sean Christopherson wrote:
> > Don't modify the set of allowed secondary execution controls, i.e. the
> > virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes.
> > To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies
> > on KVM to provide a consistent vCPU model, keep the existing behavior if
> > userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2.
> > 
> > KVM should not modify the VMX capabilities presented to L1 based on CPUID
> > as doing so may discard explicit settings provided by userspace.  E.g. if
> > userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in
> > the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then
> > stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2
> > against userspace's wishes.
> 
> The commit message doesn't explain *why* KVM_SET_CPUID would be done before
> KVM_SET_MSRS.

I assume you mean why KVM_SET_MSRS would be done before KVM_SET_CPUID2?

This patch is mostly paranoia, AFAIK there is no userspace that is negatively
affected by KVM's manipulations.  The only case I can think of is if userspace
wanted to emulate dynamic CPUID updates, e.g. set an MSR filter to intercept writes
to MISC_ENABLES to update MONITOR/MWAIT support, but that behavior isn't allowed
since commit feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN").

There are scenarios where userspace might do KVM_SET_MSRS before KVM_SET_CPUID,
e.g. QEMU's reuse of a vCPU for CPU hotplug, but in those cases I would expect
userspace to follow up with another KVM_SET_MSRS.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes
  2023-01-04 14:31     ` Sean Christopherson
@ 2023-01-04 14:42       ` Sean Christopherson
  0 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2023-01-04 14:42 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, linux-kernel, Aaron Lewis, Yu Zhang

On Wed, Jan 04, 2023, Sean Christopherson wrote:
> On Fri, Dec 23, 2022, Paolo Bonzini wrote:
> > On 12/13/22 07:23, Sean Christopherson wrote:
> > > Don't modify the set of allowed secondary execution controls, i.e. the
> > > virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes.
> > > To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies
> > > on KVM to provide a consistent vCPU model, keep the existing behavior if
> > > userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2.
> > > 
> > > KVM should not modify the VMX capabilities presented to L1 based on CPUID
> > > as doing so may discard explicit settings provided by userspace.  E.g. if
> > > userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in
> > > the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then
> > > stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2
> > > against userspace's wishes.
> > 
> > The commit message doesn't explain *why* KVM_SET_CPUID would be done before
> > KVM_SET_MSRS.
> 
> I assume you mean why KVM_SET_MSRS would be done before KVM_SET_CPUID2?
> 
> This patch is mostly paranoia, AFAIK there is no userspace that is negatively
> affected by KVM's manipulations.  The only case I can think of is if userspace
> wanted to emulate dynamic CPUID updates, e.g. set an MSR filter to intercept writes
> to MISC_ENABLES to update MONITOR/MWAIT support, but that behavior isn't allowed
> since commit feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN").
> 
> There are scenarios where userspace might do KVM_SET_MSRS before KVM_SET_CPUID,
> e.g. QEMU's reuse of a vCPU for CPU hotplug, but in those cases I would expect
> userspace to follow up with another KVM_SET_MSRS.

An argument for taking this patch is that it might be necessary to disallow
KVM_SET_MSRS after KVM_RUN[*].  If KVM manipulates VMX MSRs during KVM_SET_CPUID2,
reusing a vCPU with sequence:

  SET_CPUID2 => SET_MSRS => RUN => unplug => hotplug => SET_CPUID2 => SET_MSRS

sequence will cause the second SET_MSRS to fail due to userspace "changing" the
MSR value.

[*] https://lore.kernel.org/all/20220805172945.35412-4-seanjc@google.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-01-04 14:43 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-13  6:23 [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Sean Christopherson
2022-12-13  6:23 ` [PATCH v2 1/4] KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 Sean Christopherson
2022-12-13 10:26   ` Yu Zhang
2022-12-13 18:08   ` Jim Mattson
2022-12-13  6:23 ` [PATCH v2 2/4] KVM: nVMX: Don't stuff secondary execution control if it's not supported Sean Christopherson
2022-12-13  6:23 ` [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes Sean Christopherson
2022-12-23 17:30   ` Paolo Bonzini
2023-01-04 14:31     ` Sean Christopherson
2023-01-04 14:42       ` Sean Christopherson
2022-12-13  6:23 ` [PATCH v2 4/4] KVM: selftests: Test KVM's handling of VMX's sec exec MSR on KVM_SET_CPUID Sean Christopherson
2022-12-14  3:00 ` [PATCH v2 0/4] KVM: nVMX: Fix 2nd exec controls override goofs Yu Zhang
2022-12-15  0:18   ` Sean Christopherson
2022-12-15 11:24     ` Yu Zhang
2022-12-15 18:33       ` Sean Christopherson
2022-12-16  9:59         ` Yu Zhang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.