All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes
@ 2022-02-04 20:46 Oliver Upton
  2022-02-04 20:46 ` [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:46 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

There are a few bits in the VMX entry/exit control MSRs where KVM
intervenes. The "load IA32_PERF_GLOBAL_CTRL" and "{load,clear}
IA32_BNDCFGS" VMX entry/exit control bits are under KVM control and
conditionally exposed based on the guest CPUID. If the guest CPUID
provides a supporting vPMU or MPX, the respective VMX control bits are
enabled.

These rules have not been upheld in all cases, though. KVM will only
apply its updates to the MSRs when the guest CPUID is set. If an
unsuspecting VMM writes to these VMX control MSRs after the CPUID has
been set, KVM fails to configure the appropriate bits. There does not
exist any ordering requirements between setting CPUID and writing to an
MSR.

[Patch 1-2]
Fix the immediate issue by hooking writes to the VMX control MSRs. If
userspace writes to one of the affected MSRs, reapply KVMs tweaks to
these registers. Note that these patches employ the minimal change
required to fix the issue, in case they are worthy of a backport.

[Patch 3-4]
Of course, it is not ideal to have KVM fiddling with the guest's MSRs in
this way. Add a quirk allowing sane VMMs to take complete ownership of
these VMX control bits.

[Patch 5-6]
Add tests to verify correct behavior for these bits with the quirk
enabled (KVM control) and quirk disabled (userspace control).

Finally, patch 7 is a nit cleanup that I noticed while doing the
renovations above.

This series applies cleanly to 5.17-rc2. Tested on a Skylake host.

Oliver Upton (7):
  KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR
    write
  KVM: nVMX: Roll all entry/exit ctl updates into a single helper
  KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits
  selftests: KVM: Add test for BNDCFGS VMX control MSR bits
  KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid()

 arch/x86/include/uapi/asm/kvm.h               |  11 +-
 arch/x86/kvm/vmx/nested.c                     |  30 +---
 arch/x86/kvm/vmx/nested.h                     |   1 -
 arch/x86/kvm/vmx/pmu_intel.c                  |   2 -
 arch/x86/kvm/vmx/vmx.c                        |  17 +-
 arch/x86/kvm/vmx/vmx.h                        |   2 +
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/vmx.h        |   2 +
 .../kvm/x86_64/vmx_control_msrs_test.c        | 166 ++++++++++++++++++
 10 files changed, 201 insertions(+), 32 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c

-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
@ 2022-02-04 20:46 ` Oliver Upton
  2022-02-07 17:21   ` Paolo Bonzini
  2022-02-04 20:47 ` [PATCH v2 2/7] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:46 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

Since commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls
when guest MPX disabled"), KVM has taken ownership of the "load
IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls. The ABI
is that these bits must be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS
MSRs if the guest's CPUID supports MPX, and clear otherwise.

However, KVM will only do so if userspace sets the CPUID before writing
to the corresponding MSRs. Of course, there are no ordering requirements
between these ioctls. Uphold the ABI regardless of ordering by
reapplying KVMs tweaks to the VMX control MSRs after userspace has
written to them.

Fixes: 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c | 9 +++++++++
 arch/x86/kvm/vmx/vmx.c    | 2 +-
 arch/x86/kvm/vmx/vmx.h    | 2 ++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ba34e94049c7..59164394569f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1291,6 +1291,15 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
 
 	*lowp = data;
 	*highp = data >> 32;
+
+	/*
+	 * Ensure KVM fiddling with these MSRs is preserved after userspace
+	 * write.
+	 */
+	if (msr_index == MSR_IA32_VMX_TRUE_ENTRY_CTLS ||
+	    msr_index == MSR_IA32_VMX_TRUE_EXIT_CTLS)
+		nested_vmx_entry_exit_ctls_update(&vmx->vcpu);
+
 	return 0;
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index aca3ae2a02f3..d63d6dfbadbf 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7227,7 +7227,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 #undef cr4_fixed1_update
 }
 
-static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
+void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 7f2c82e7f38f..e134e2763502 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -423,6 +423,8 @@ static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
 
 void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
 
+void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
+
 /*
  * Note, early Intel manuals have the write-low and read-high bitmap offsets
  * the wrong way round.  The bitmaps control MSRs 0x00000000-0x00001fff and
-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 2/7] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR write
  2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
  2022-02-04 20:46 ` [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
@ 2022-02-04 20:47 ` Oliver Upton
  2022-02-07 16:33   ` Paolo Bonzini
  2022-02-04 20:47 ` [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper Oliver Upton
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:47 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

Since commit 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
VM-{Entry,Exit} control"), KVM has taken ownership of the "load
IA32_PERF_GLOBAL_CTRL" VMX entry/exit control bits. The ABI is that
these bits will be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if
the guest's CPUID exposes a vPMU that supports the IA32_PERF_GLOBAL_CTRL
MSR (CPUID.0AH:EAX[7:0] > 1), and clear otherwise.

However, KVM will only do so if userspace sets the CPUID before writing
to the corresponding MSRs. Of course, there are no ordering requirements
between these ioctls. Uphold the ABI regardless of ordering by
reapplying KVMs tweaks to the VMX control MSRs after userspace has
written to them.

Note that older kernels without commit c44d9b34701d ("KVM: x86: Invoke
vendor's vcpu_after_set_cpuid() after all common updates") still require
that the entry/exit controls be updated from kvm_pmu_refresh(). Leave
the benign call in place to allow for cleaner backporting and punt the
cleanup to a later change.

Fixes: 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control")
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d63d6dfbadbf..54ac382a0b73 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7242,6 +7242,8 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
 		}
 	}
+
+	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
 }
 
 static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper
  2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
  2022-02-04 20:46 ` [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
  2022-02-04 20:47 ` [PATCH v2 2/7] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
@ 2022-02-04 20:47 ` Oliver Upton
  2022-02-05  7:43     ` kernel test robot
  2022-02-04 20:47 ` [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:47 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

nested_vmx_pmu_entry_exit_ctls_update() is no longer useful; updating
the entry/exit ctrl bits in the vendor vcpu_after_set_cpuid() hook is
sufficient as KVM has already recalculated the vPMU version.

Keep all of KVM's bad behavior with regards to the VMX entry/exit
control MSRs in one place. Remove all traces of the PMU helper and
inline the bit twiddling to nested_vmx_entry_exit_ctls_update().

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c    | 21 ---------------------
 arch/x86/kvm/vmx/nested.h    |  1 -
 arch/x86/kvm/vmx/pmu_intel.c |  2 --
 arch/x86/kvm/vmx/vmx.c       |  8 +++++++-
 4 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 59164394569f..2e8facff93f8 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4806,27 +4806,6 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 	return 0;
 }
 
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_vmx *vmx;
-
-	if (!nested_vmx_allowed(vcpu))
-		return;
-
-	vmx = to_vmx(vcpu);
-	if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
-		vmx->nested.msrs.entry_ctls_high |=
-				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-		vmx->nested.msrs.exit_ctls_high |=
-				VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-	} else {
-		vmx->nested.msrs.entry_ctls_high &=
-				~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-		vmx->nested.msrs.exit_ctls_high &=
-				~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-	}
-}
-
 static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer,
 				int *ret)
 {
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index b69a80f43b37..14ad756aac46 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -32,7 +32,6 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
 int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 			u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
 void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
 bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
 				 int size);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 466d18fc0c5d..ad1adbaa7d9e 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -541,8 +541,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
-	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
-
 	if (intel_pmu_lbr_is_compatible(vcpu))
 		x86_perf_get_lbr(&lbr_desc->records);
 	else
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 54ac382a0b73..395787b7e7ac 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7243,7 +7243,13 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
+	if (kvm_pmu_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+		vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+	} else {
+		vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+	}
 }
 
 static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
                   ` (2 preceding siblings ...)
  2022-02-04 20:47 ` [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper Oliver Upton
@ 2022-02-04 20:47 ` Oliver Upton
  2022-02-07 18:06   ` Sean Christopherson
  2022-02-04 20:47 ` [PATCH v2 5/7] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits Oliver Upton
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:47 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

KVM really has no business messing with the vCPU state. Nonetheless, it
has become ABI for KVM to adjust certain bits of the VMX entry/exit
control MSRs depending on the guest CPUID. Namely, the bits associated
with the IA32_PERF_GLOBAL_CTRL and IA32_BNDCFGS MSRs were conditionally
enabled if the guest CPUID allows for it.

Allow userspace to opt-out of changes to VMX control MSRs by adding a
new KVM quirk.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/include/uapi/asm/kvm.h | 11 ++++++-----
 arch/x86/kvm/vmx/vmx.c          |  3 +++
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index bf6e96011dfe..acbab6a97fae 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -428,11 +428,12 @@ struct kvm_sync_regs {
 	struct kvm_vcpu_events events;
 };
 
-#define KVM_X86_QUIRK_LINT0_REENABLED	   (1 << 0)
-#define KVM_X86_QUIRK_CD_NW_CLEARED	   (1 << 1)
-#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE	   (1 << 2)
-#define KVM_X86_QUIRK_OUT_7E_INC_RIP	   (1 << 3)
-#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
+#define KVM_X86_QUIRK_LINT0_REENABLED		(1 << 0)
+#define KVM_X86_QUIRK_CD_NW_CLEARED		(1 << 1)
+#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE		(1 << 2)
+#define KVM_X86_QUIRK_OUT_7E_INC_RIP		(1 << 3)
+#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT	(1 << 4)
+#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS	(1 << 5)
 
 #define KVM_STATE_NESTED_FORMAT_VMX	0
 #define KVM_STATE_NESTED_FORMAT_SVM	1
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 395787b7e7ac..60b1b76782e1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7231,6 +7231,9 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
+		return;
+
 	if (kvm_mpx_supported()) {
 		bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX);
 
-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 5/7] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits
  2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
                   ` (3 preceding siblings ...)
  2022-02-04 20:47 ` [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
@ 2022-02-04 20:47 ` Oliver Upton
  2022-02-04 20:47 ` [PATCH v2 6/7] selftests: KVM: Add test for BNDCFGS " Oliver Upton
  2022-02-04 20:47 ` [PATCH v2 7/7] KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid() Oliver Upton
  6 siblings, 0 replies; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:47 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

Test that the default behavior of KVM is to ignore userspace MSR writes
and conditionally expose the "load IA32_PERF_GLOBAL_CTRL" bits in the
VMX control MSRs if the guest CPUID exposes a supporting vPMU.
Additionally, test that when the corresponding quirk is disabled,
userspace can still clear these bits regardless of what is exposed in
CPUID.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/x86_64/vmx_control_msrs_test.c        | 113 ++++++++++++++++++
 3 files changed, 115 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index dce7de7755e6..044aef3a8574 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -36,6 +36,7 @@
 /x86_64/userspace_io_test
 /x86_64/userspace_msr_exit_test
 /x86_64/vmx_apic_access_test
+/x86_64/vmx_control_msrs_test
 /x86_64/vmx_close_while_nested_test
 /x86_64/vmx_dirty_log_test
 /x86_64/vmx_exception_with_invalid_guest_state
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 0e4926bc9a58..88b99d9de373 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -68,6 +68,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/sync_regs_test
 TEST_GEN_PROGS_x86_64 += x86_64/userspace_io_test
 TEST_GEN_PROGS_x86_64 += x86_64/userspace_msr_exit_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_apic_access_test
+TEST_GEN_PROGS_x86_64 += x86_64/vmx_control_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_close_while_nested_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_exception_with_invalid_guest_state
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
new file mode 100644
index 000000000000..ac5fdeb50eee
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VMX control MSR test
+ *
+ * Copyright (C) 2022 Google LLC.
+ *
+ * Tests for KVM ownership of bits in the VMX entry/exit control MSRs. Checks
+ * that KVM will set owned bits where appropriate, and will not if
+ * KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS is disabled.
+ */
+
+#include "kvm_util.h"
+#include "vmx.h"
+
+#define VCPU_ID 0
+
+static void get_vmx_control_msr(struct kvm_vm *vm, uint32_t msr_index,
+				uint32_t *low, uint32_t *high)
+{
+	uint64_t val;
+
+	val = vcpu_get_msr(vm, VCPU_ID, msr_index);
+	*low = val;
+	*high = val >> 32;
+}
+
+static void set_vmx_control_msr(struct kvm_vm *vm, uint32_t msr_index,
+				uint32_t low, uint32_t high)
+{
+	uint64_t val = (((uint64_t) high) << 32) | low;
+
+	vcpu_set_msr(vm, VCPU_ID, msr_index, val);
+}
+
+static void test_vmx_control_msr(struct kvm_vm *vm, uint32_t msr_index, uint32_t set,
+				 uint32_t clear, uint32_t exp_set, uint32_t exp_clear)
+{
+	uint32_t low, high;
+
+	get_vmx_control_msr(vm, msr_index, &low, &high);
+
+	high &= ~clear;
+	high |= set;
+
+	set_vmx_control_msr(vm, msr_index, low, high);
+
+	get_vmx_control_msr(vm, msr_index, &low, &high);
+	ASSERT_EQ(high & exp_set, exp_set);
+	ASSERT_EQ(~high & exp_clear, exp_clear);
+}
+
+static void load_perf_global_ctrl_test(struct kvm_vm *vm)
+{
+	uint32_t entry_low, entry_high, exit_low, exit_high;
+	struct kvm_enable_cap cap = {0};
+
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, &entry_low, &entry_high);
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, &exit_low, &exit_high);
+
+	if (!(entry_high & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) ||
+	    !(exit_high & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)) {
+		print_skip("\"load IA32_PERF_GLOBAL_CTRL\" VM-{Entry,Exit} controls not supported");
+		return;
+	}
+
+	/*
+	 * Test that KVM will set these bits regardless of userspace if the
+	 * guest CPUID exposes a supporting vPMU.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, 0,
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
+			     0);
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, 0,
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
+			     0);
+
+	/*
+	 * Disable the quirk, giving userspace control of the VMX capability
+	 * MSRs.
+	 */
+	cap.cap = KVM_CAP_DISABLE_QUIRKS;
+	cap.args[0] = KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS;
+	vm_enable_cap(vm, &cap);
+
+	/*
+	 * Test that userspace can clear these bits, even if it exposes a vPMU
+	 * that supports IA32_PERF_GLOBAL_CTRL.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, 0,
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
+			     0,
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, 0,
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
+			     0,
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
+}
+
+int main(void)
+{
+	struct kvm_vm *vm;
+
+	nested_vmx_check_supported();
+
+	/* No need to run a guest for these tests */
+	vm = vm_create_default(VCPU_ID, 0, NULL);
+
+	load_perf_global_ctrl_test(vm);
+
+	kvm_vm_free(vm);
+}
-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 6/7] selftests: KVM: Add test for BNDCFGS VMX control MSR bits
  2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
                   ` (4 preceding siblings ...)
  2022-02-04 20:47 ` [PATCH v2 5/7] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits Oliver Upton
@ 2022-02-04 20:47 ` Oliver Upton
  2022-02-07 16:42   ` Paolo Bonzini
  2022-02-04 20:47 ` [PATCH v2 7/7] KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid() Oliver Upton
  6 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:47 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

Test that the default behavior of KVM is to ignore userspace MSR writes
and conditionally expose the "{load,clear} IA32_BNDCFGS" bits in the VMX
control MSRs if the guest CPUID exposes MPX. Additionally, test that
when the corresponding quirk is disabled, userspace can still clear
these bits regardless of what is exposed in CPUID.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 .../selftests/kvm/include/x86_64/vmx.h        |  2 +
 .../kvm/x86_64/vmx_control_msrs_test.c        | 53 +++++++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 583ceb0d1457..811c66d9be74 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -80,6 +80,7 @@
 #define VM_EXIT_SAVE_IA32_EFER			0x00100000
 #define VM_EXIT_LOAD_IA32_EFER			0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER	0x00400000
+#define VM_EXIT_CLEAR_BNDCFGS			0x00800000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -90,6 +91,7 @@
 #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL	0x00002000
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER			0x00008000
+#define VM_ENTRY_LOAD_BNDCFGS			0x00010000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
index ac5fdeb50eee..21e1dee0f83f 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
@@ -96,6 +96,58 @@ static void load_perf_global_ctrl_test(struct kvm_vm *vm)
 			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
 			     0,
 			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
+
+	/* cleanup, enable the quirk again */
+	cap.args[0] = 0;
+	vm_enable_cap(vm, &cap);
+}
+
+static void bndcfgs_test(struct kvm_vm *vm)
+{
+	uint32_t entry_low, entry_high, exit_low, exit_high;
+	struct kvm_enable_cap cap = {0};
+
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, &entry_low, &entry_high);
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, &exit_low, &exit_high);
+
+	if (!(entry_high & VM_ENTRY_LOAD_BNDCFGS) ||
+	    !(exit_high & VM_EXIT_CLEAR_BNDCFGS)) {
+		print_skip("\"load/clear IA32_BNDCFGS\" VM-{Entry,Exit} controls not supported");
+		return;
+	}
+
+	/*
+	 * Test that KVM will set these bits regardless of userspace if the
+	 * guest CPUID exposes MPX.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, 0,
+			     VM_ENTRY_LOAD_BNDCFGS,
+			     VM_ENTRY_LOAD_BNDCFGS,
+			     0);
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, 0,
+			     VM_EXIT_CLEAR_BNDCFGS,
+			     VM_EXIT_CLEAR_BNDCFGS,
+			     0);
+
+	/*
+	 * Disable the quirk, giving userspace control of the VMX capability
+	 * MSRs.
+	 */
+	cap.cap = KVM_CAP_DISABLE_QUIRKS;
+	cap.args[0] = KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS;
+	vm_enable_cap(vm, &cap);
+
+	/*
+	 * Test that userspace can clear these bits, even if it exposes MPX.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, 0,
+			     VM_ENTRY_LOAD_BNDCFGS,
+			     0,
+			     VM_ENTRY_LOAD_BNDCFGS);
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, 0,
+			     VM_EXIT_CLEAR_BNDCFGS,
+			     0,
+			     VM_EXIT_CLEAR_BNDCFGS);
 }
 
 int main(void)
@@ -108,6 +160,7 @@ int main(void)
 	vm = vm_create_default(VCPU_ID, 0, NULL);
 
 	load_perf_global_ctrl_test(vm);
+	bndcfgs_test(vm);
 
 	kvm_vm_free(vm);
 }
-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 7/7] KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid()
  2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
                   ` (5 preceding siblings ...)
  2022-02-04 20:47 ` [PATCH v2 6/7] selftests: KVM: Add test for BNDCFGS " Oliver Upton
@ 2022-02-04 20:47 ` Oliver Upton
  2022-02-07 16:42   ` Paolo Bonzini
  6 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2022-02-04 20:47 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Oliver Upton

There is a local that contains a pointer to vcpu_vmx already. Just use
that instead to get at the structure directly instead of doing pointer
arithmetic.

No functional change intended.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 60b1b76782e1..11b6332769c5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7338,11 +7338,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 						vmx_secondary_exec_control(vmx));
 
 	if (nested_vmx_allowed(vcpu))
-		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits |=
+		vmx->msr_ia32_feature_control_valid_bits |=
 			FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
 	else
-		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits &=
+		vmx->msr_ia32_feature_control_valid_bits &=
 			~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
 			  FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
 
-- 
2.35.0.263.gb82422642f-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper
  2022-02-04 20:47 ` [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper Oliver Upton
@ 2022-02-05  7:43     ` kernel test robot
  0 siblings, 0 replies; 26+ messages in thread
From: kernel test robot @ 2022-02-05  7:43 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: kbuild-all, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Oliver Upton

Hi Oliver,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.17-rc2 next-20220204]
[cannot apply to kvm/queue mst-vhost/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Oliver-Upton/VMX-nVMX-VMX-control-MSR-fixes/20220205-044901
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 86286e486cbdd68f01d330409307f6a6efcd4298
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20220205/202202051529.y26BVBiF-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/3053b58337c5bb48b67fce9fb0616887b7180370
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Oliver-Upton/VMX-nVMX-VMX-control-MSR-fixes/20220205-044901
        git checkout 3053b58337c5bb48b67fce9fb0616887b7180370
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "kvm_pmu_is_valid_msr" [arch/x86/kvm/kvm-intel.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper
@ 2022-02-05  7:43     ` kernel test robot
  0 siblings, 0 replies; 26+ messages in thread
From: kernel test robot @ 2022-02-05  7:43 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1668 bytes --]

Hi Oliver,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.17-rc2 next-20220204]
[cannot apply to kvm/queue mst-vhost/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Oliver-Upton/VMX-nVMX-VMX-control-MSR-fixes/20220205-044901
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 86286e486cbdd68f01d330409307f6a6efcd4298
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20220205/202202051529.y26BVBiF-lkp(a)intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/3053b58337c5bb48b67fce9fb0616887b7180370
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Oliver-Upton/VMX-nVMX-VMX-control-MSR-fixes/20220205-044901
        git checkout 3053b58337c5bb48b67fce9fb0616887b7180370
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "kvm_pmu_is_valid_msr" [arch/x86/kvm/kvm-intel.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper
  2022-02-05  7:43     ` kernel test robot
@ 2022-02-05 19:41       ` Oliver Upton
  -1 siblings, 0 replies; 26+ messages in thread
From: Oliver Upton @ 2022-02-05 19:41 UTC (permalink / raw)
  To: kernel test robot
  Cc: kvm, kbuild-all, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel

On Fri, Feb 4, 2022 at 11:44 PM kernel test robot <lkp@intel.com> wrote:
> >> ERROR: modpost: "kvm_pmu_is_valid_msr" [arch/x86/kvm/kvm-intel.ko] undefined!

Argh... Local tooling defaults to building KVM nonmodular so I missed this.

Squashing the following in fixes the issue.

--
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index f614f95acc6b..18430547357d 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -396,6 +396,7 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
        return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
                kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
 }
+EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);

 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper
@ 2022-02-05 19:41       ` Oliver Upton
  0 siblings, 0 replies; 26+ messages in thread
From: Oliver Upton @ 2022-02-05 19:41 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 766 bytes --]

On Fri, Feb 4, 2022 at 11:44 PM kernel test robot <lkp@intel.com> wrote:
> >> ERROR: modpost: "kvm_pmu_is_valid_msr" [arch/x86/kvm/kvm-intel.ko] undefined!

Argh... Local tooling defaults to building KVM nonmodular so I missed this.

Squashing the following in fixes the issue.

--
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index f614f95acc6b..18430547357d 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -396,6 +396,7 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
        return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
                kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
 }
+EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);

 static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
 {

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 2/7] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR write
  2022-02-04 20:47 ` [PATCH v2 2/7] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
@ 2022-02-07 16:33   ` Paolo Bonzini
  0 siblings, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2022-02-07 16:33 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On 2/4/22 21:47, Oliver Upton wrote:
> Since commit 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
> VM-{Entry,Exit} control"), KVM has taken ownership of the "load
> IA32_PERF_GLOBAL_CTRL" VMX entry/exit control bits. The ABI is that
> these bits will be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if
> the guest's CPUID exposes a vPMU that supports the IA32_PERF_GLOBAL_CTRL
> MSR (CPUID.0AH:EAX[7:0] > 1), and clear otherwise.
> 
> However, KVM will only do so if userspace sets the CPUID before writing
> to the corresponding MSRs. Of course, there are no ordering requirements
> between these ioctls. Uphold the ABI regardless of ordering by
> reapplying KVMs tweaks to the VMX control MSRs after userspace has
> written to them.

Ok, this makes more sense.  Here you have KVM_SET_MSR before 
KVM_SET_CPUID2, so KVM_SET_CPUID2 does to PERF_GLOBAL_CTRL controls what 
it's already doing with BNDCFGS controls.  Is this correct?

Paolo

> Note that older kernels without commit c44d9b34701d ("KVM: x86: Invoke
> vendor's vcpu_after_set_cpuid() after all common updates") still require
> that the entry/exit controls be updated from kvm_pmu_refresh(). Leave
> the benign call in place to allow for cleaner backporting and punt the
> cleanup to a later change.
> 
> Fixes: 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control")
> Reported-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>   arch/x86/kvm/vmx/vmx.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d63d6dfbadbf..54ac382a0b73 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7242,6 +7242,8 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>   			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
>   		}
>   	}
> +
> +	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
>   }
>   
>   static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 6/7] selftests: KVM: Add test for BNDCFGS VMX control MSR bits
  2022-02-04 20:47 ` [PATCH v2 6/7] selftests: KVM: Add test for BNDCFGS " Oliver Upton
@ 2022-02-07 16:42   ` Paolo Bonzini
  0 siblings, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2022-02-07 16:42 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On 2/4/22 21:47, Oliver Upton wrote:
> +	/*
> +	 * Test that KVM will set these bits regardless of userspace if the
> +	 * guest CPUID exposes MPX.
> +	 */
> +	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, 0,
> +			     VM_ENTRY_LOAD_BNDCFGS,
> +			     VM_ENTRY_LOAD_BNDCFGS,
> +			     0);
> +	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, 0,
> +			     VM_EXIT_CLEAR_BNDCFGS,
> +			     VM_EXIT_CLEAR_BNDCFGS,
> +			     0);
> +

I wouldn't expect this behavior.

> +	/*
> +	 * Disable the quirk, giving userspace control of the VMX capability
> +	 * MSRs.
> +	 */
> +	cap.cap = KVM_CAP_DISABLE_QUIRKS;
> +	cap.args[0] = KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS;
> +	vm_enable_cap(vm, &cap);
> +
> +	/*
> +	 * Test that userspace can clear these bits, even if it exposes MPX.
> +	 */
> +	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, 0,
> +			     VM_ENTRY_LOAD_BNDCFGS,
> +			     0,
> +			     VM_ENTRY_LOAD_BNDCFGS);
> +	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, 0,
> +			     VM_EXIT_CLEAR_BNDCFGS,
> +			     0,
> +			     VM_EXIT_CLEAR_BNDCFGS);

and likewise I would have expected this one to work without need for a 
quirk.

It's also missing a testcase that sets clears MPX and checks that the 
BNDCFGS controls disappear, I think.

Paolo


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 7/7] KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid()
  2022-02-04 20:47 ` [PATCH v2 7/7] KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid() Oliver Upton
@ 2022-02-07 16:42   ` Paolo Bonzini
  0 siblings, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2022-02-07 16:42 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On 2/4/22 21:47, Oliver Upton wrote:
> There is a local that contains a pointer to vcpu_vmx already. Just use
> that instead to get at the structure directly instead of doing pointer
> arithmetic.
> 
> No functional change intended.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>   arch/x86/kvm/vmx/vmx.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 60b1b76782e1..11b6332769c5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7338,11 +7338,11 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   						vmx_secondary_exec_control(vmx));
>   
>   	if (nested_vmx_allowed(vcpu))
> -		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits |=
> +		vmx->msr_ia32_feature_control_valid_bits |=
>   			FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
>   			FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
>   	else
> -		to_vmx(vcpu)->msr_ia32_feature_control_valid_bits &=
> +		vmx->msr_ia32_feature_control_valid_bits &=
>   			~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
>   			  FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
>   

Queued, thanks.

Paolo


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-02-04 20:46 ` [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
@ 2022-02-07 17:21   ` Paolo Bonzini
  2022-02-07 18:13     ` Sean Christopherson
  2022-02-07 18:22     ` Oliver Upton
  0 siblings, 2 replies; 26+ messages in thread
From: Paolo Bonzini @ 2022-02-07 17:21 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On 2/4/22 21:46, Oliver Upton wrote:
> Since commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls
> when guest MPX disabled"), KVM has taken ownership of the "load
> IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls. The ABI
> is that these bits must be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS
> MSRs if the guest's CPUID supports MPX, and clear otherwise.
> 
> However, KVM will only do so if userspace sets the CPUID before writing
> to the corresponding MSRs. Of course, there are no ordering requirements
> between these ioctls. Uphold the ABI regardless of ordering by
> reapplying KVMs tweaks to the VMX control MSRs after userspace has
> written to them.

I don't understand this patch.  If you first write the CPUID and then 
the MSR, the consistency is upheld by these checks:

         if (!is_bitwise_subset(data, supported, GENMASK_ULL(31, 0)))
                 return -EINVAL;

         if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
                 return -EINVAL;

If you're fixing a case where userspace first writes the MSR and then 
sets CPUID, then I would expect that KVM_SET_CPUID2 redoes those checks 
and, if they fail, it adjusts the MSRs.

The selftests confirm that I'm a bit confused, but in general 
vmx_restore_control_msr is not the place where I was expecting the change.

Paolo

> Fixes: 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled")
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>   arch/x86/kvm/vmx/nested.c | 9 +++++++++
>   arch/x86/kvm/vmx/vmx.c    | 2 +-
>   arch/x86/kvm/vmx/vmx.h    | 2 ++
>   3 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index ba34e94049c7..59164394569f 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -1291,6 +1291,15 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
>   
>   	*lowp = data;
>   	*highp = data >> 32;
> +
> +	/*
> +	 * Ensure KVM fiddling with these MSRs is preserved after userspace
> +	 * write.
> +	 */
> +	if (msr_index == MSR_IA32_VMX_TRUE_ENTRY_CTLS ||
> +	    msr_index == MSR_IA32_VMX_TRUE_EXIT_CTLS)
> +		nested_vmx_entry_exit_ctls_update(&vmx->vcpu);
> +
>   	return 0;
>   }
>   
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index aca3ae2a02f3..d63d6dfbadbf 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7227,7 +7227,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
>   #undef cr4_fixed1_update
>   }
>   
> -static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
> +void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>   {
>   	struct vcpu_vmx *vmx = to_vmx(vcpu);
>   
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 7f2c82e7f38f..e134e2763502 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -423,6 +423,8 @@ static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>   
>   void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
>   
> +void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
> +
>   /*
>    * Note, early Intel manuals have the write-low and read-high bitmap offsets
>    * the wrong way round.  The bitmaps control MSRs 0x00000000-0x00001fff and


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper
  2022-02-05 19:41       ` Oliver Upton
@ 2022-02-07 17:56         ` Sean Christopherson
  -1 siblings, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2022-02-07 17:56 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kernel test robot, kvm, kbuild-all, Paolo Bonzini,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel

On Sat, Feb 05, 2022, Oliver Upton wrote:
> On Fri, Feb 4, 2022 at 11:44 PM kernel test robot <lkp@intel.com> wrote:
> > >> ERROR: modpost: "kvm_pmu_is_valid_msr" [arch/x86/kvm/kvm-intel.ko] undefined!
> 
> Argh... Local tooling defaults to building KVM nonmodular so I missed this.
> 
> Squashing the following in fixes the issue.
> 
> --
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index f614f95acc6b..18430547357d 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -396,6 +396,7 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
>         return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
>                 kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
>  }
> +EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);

I'd much prefer to avoid this mess entirely.

[*] https://lore.kernel.org/all/20220128005208.4008533-9-seanjc@google.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper
@ 2022-02-07 17:56         ` Sean Christopherson
  0 siblings, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2022-02-07 17:56 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 895 bytes --]

On Sat, Feb 05, 2022, Oliver Upton wrote:
> On Fri, Feb 4, 2022 at 11:44 PM kernel test robot <lkp@intel.com> wrote:
> > >> ERROR: modpost: "kvm_pmu_is_valid_msr" [arch/x86/kvm/kvm-intel.ko] undefined!
> 
> Argh... Local tooling defaults to building KVM nonmodular so I missed this.
> 
> Squashing the following in fixes the issue.
> 
> --
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index f614f95acc6b..18430547357d 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -396,6 +396,7 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
>         return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
>                 kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
>  }
> +EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);

I'd much prefer to avoid this mess entirely.

[*] https://lore.kernel.org/all/20220128005208.4008533-9-seanjc(a)google.com

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  2022-02-04 20:47 ` [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
@ 2022-02-07 18:06   ` Sean Christopherson
  2022-02-09  1:50     ` Oliver Upton
  0 siblings, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2022-02-07 18:06 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On Fri, Feb 04, 2022, Oliver Upton wrote:
> KVM really has no business messing with the vCPU state. Nonetheless, it
> has become ABI for KVM to adjust certain bits of the VMX entry/exit
> control MSRs depending on the guest CPUID. Namely, the bits associated
> with the IA32_PERF_GLOBAL_CTRL and IA32_BNDCFGS MSRs were conditionally
> enabled if the guest CPUID allows for it.
> 
> Allow userspace to opt-out of changes to VMX control MSRs by adding a
> new KVM quirk.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/include/uapi/asm/kvm.h | 11 ++++++-----
>  arch/x86/kvm/vmx/vmx.c          |  3 +++
>  2 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index bf6e96011dfe..acbab6a97fae 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -428,11 +428,12 @@ struct kvm_sync_regs {
>  	struct kvm_vcpu_events events;
>  };
>  
> -#define KVM_X86_QUIRK_LINT0_REENABLED	   (1 << 0)
> -#define KVM_X86_QUIRK_CD_NW_CLEARED	   (1 << 1)
> -#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE	   (1 << 2)
> -#define KVM_X86_QUIRK_OUT_7E_INC_RIP	   (1 << 3)
> -#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
> +#define KVM_X86_QUIRK_LINT0_REENABLED		(1 << 0)
> +#define KVM_X86_QUIRK_CD_NW_CLEARED		(1 << 1)
> +#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE		(1 << 2)
> +#define KVM_X86_QUIRK_OUT_7E_INC_RIP		(1 << 3)
> +#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT	(1 << 4)
> +#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS	(1 << 5)

I'd prefer we include msr_ia32_feature_control_valid_bits in this quirk, it should
be relatively easy to do since most of the modifications stem from
vmx_vcpu_after_set_cpuid().  vmx_setup_mce() is a bit odd, but IMO it's worth
excising as much crud as we can.

>  #define KVM_STATE_NESTED_FORMAT_VMX	0
>  #define KVM_STATE_NESTED_FORMAT_SVM	1
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 395787b7e7ac..60b1b76782e1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7231,6 +7231,9 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
> +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
> +		return;


Probably worth calling out that nested_vmx_cr_fixed1_bits_update() is intentionally
exempt from this "rule":

	case MSR_IA32_VMX_CR0_FIXED1:
	case MSR_IA32_VMX_CR4_FIXED1:
		/*
		 * These MSRs are generated based on the vCPU's CPUID, so we
		 * do not support restoring them directly.
		 */
		return -EINVAL;

> +
>  	if (kvm_mpx_supported()) {
>  		bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX);
>  
> -- 
> 2.35.0.263.gb82422642f-goog
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-02-07 17:21   ` Paolo Bonzini
@ 2022-02-07 18:13     ` Sean Christopherson
  2022-02-07 18:22     ` Oliver Upton
  1 sibling, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2022-02-07 18:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Oliver Upton, kvm, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On Mon, Feb 07, 2022, Paolo Bonzini wrote:
> On 2/4/22 21:46, Oliver Upton wrote:
> > Since commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls
> > when guest MPX disabled"), KVM has taken ownership of the "load
> > IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls. The ABI
> > is that these bits must be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS
> > MSRs if the guest's CPUID supports MPX, and clear otherwise.
> > 
> > However, KVM will only do so if userspace sets the CPUID before writing
> > to the corresponding MSRs. Of course, there are no ordering requirements
> > between these ioctls. Uphold the ABI regardless of ordering by
> > reapplying KVMs tweaks to the VMX control MSRs after userspace has
> > written to them.
> 
> I don't understand this patch.  If you first write the CPUID and then the
> MSR, the consistency is upheld by these checks:
> 
>         if (!is_bitwise_subset(data, supported, GENMASK_ULL(31, 0)))
>                 return -EINVAL;
> 
>         if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
>                 return -EINVAL;
> 
> If you're fixing a case where userspace first writes the MSR and then sets
> CPUID, then I would expect that KVM_SET_CPUID2 redoes those checks and, if
> they fail, it adjusts the MSRs.
> 
> The selftests confirm that I'm a bit confused, but in general
> vmx_restore_control_msr is not the place where I was expecting the change.

Do we even need this change?  The ABI is whatever it is, not what may or may not
have been intended by a flawed, 3+ year old commit.   E.g. if there's a userspace
that relies on being able to override KVM's tweaks by writing the MSRs after
setting CPUID, then this commit will break the ABI for that userspace.  The quirk
should be sufficient warning that KVM's behavior is funky.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-02-07 17:21   ` Paolo Bonzini
  2022-02-07 18:13     ` Sean Christopherson
@ 2022-02-07 18:22     ` Oliver Upton
  2022-02-07 18:27       ` Paolo Bonzini
  2022-02-07 18:34       ` Sean Christopherson
  1 sibling, 2 replies; 26+ messages in thread
From: Oliver Upton @ 2022-02-07 18:22 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel

Hi Paolo,

On Mon, Feb 07, 2022 at 06:21:30PM +0100, Paolo Bonzini wrote:
> On 2/4/22 21:46, Oliver Upton wrote:
> > Since commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls
> > when guest MPX disabled"), KVM has taken ownership of the "load
> > IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls. The ABI
> > is that these bits must be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS
> > MSRs if the guest's CPUID supports MPX, and clear otherwise.
> > 
> > However, KVM will only do so if userspace sets the CPUID before writing
> > to the corresponding MSRs. Of course, there are no ordering requirements
> > between these ioctls. Uphold the ABI regardless of ordering by
> > reapplying KVMs tweaks to the VMX control MSRs after userspace has
> > written to them.
> 
> I don't understand this patch.  If you first write the CPUID and then the
> MSR, the consistency is upheld by these checks:
> 
>         if (!is_bitwise_subset(data, supported, GENMASK_ULL(31, 0)))
>                 return -EINVAL;
> 
>         if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
>                 return -EINVAL;

Right, this works if KVM chose to clear the bit, but userspace is trying
to set it. If KVM chose to set the bit, and userspace attempts to clear
it, these checks would pass.

> If you're fixing a case where userspace first writes the MSR and then sets
> CPUID, then I would expect that KVM_SET_CPUID2 redoes those checks and, if
> they fail, it adjusts the MSRs.

I am fixing the case where userspace issues KVM_SET_CPUID2 then writes
to the corresponding MSRs.

Until recently, this all sort of 'worked'. Since we called
kvm_update_cpuid() all the time it was possible for KVM to overwrite the
bits after the MSR write, just not immediately so. After the whole CPUID
rework, we only update the VMX control MSRs immediately after a
KVM_SET_CPUID2, meaning we've missed the case of MSR write after CPUID.

The entire spirit of this series is to back KVM out of touching these
bits, but it seemingly requires a quirk to do so given the userspace
expectations around these bits. Given that, these first two patches fix
the ordering issue between KVM_SET_CPUID2 and an MSR write and enforce
KVM ownership unconditionally.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-02-07 18:22     ` Oliver Upton
@ 2022-02-07 18:27       ` Paolo Bonzini
  2022-02-07 18:34       ` Sean Christopherson
  1 sibling, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2022-02-07 18:27 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel

On 2/7/22 19:22, Oliver Upton wrote:
> Hi Paolo,
> 
> On Mon, Feb 07, 2022 at 06:21:30PM +0100, Paolo Bonzini wrote:
>> On 2/4/22 21:46, Oliver Upton wrote:
>>> Since commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls
>>> when guest MPX disabled"), KVM has taken ownership of the "load
>>> IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls. The ABI
>>> is that these bits must be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS
>>> MSRs if the guest's CPUID supports MPX, and clear otherwise.
>>>
>>> However, KVM will only do so if userspace sets the CPUID before writing
>>> to the corresponding MSRs. Of course, there are no ordering requirements
>>> between these ioctls. Uphold the ABI regardless of ordering by
>>> reapplying KVMs tweaks to the VMX control MSRs after userspace has
>>> written to them.
>>
>> I don't understand this patch.  If you first write the CPUID and then the
>> MSR, the consistency is upheld by these checks:
>>
>>          if (!is_bitwise_subset(data, supported, GENMASK_ULL(31, 0)))
>>                  return -EINVAL;
>>
>>          if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
>>                  return -EINVAL;
> 
> Right, this works if KVM chose to clear the bit, but userspace is trying
> to set it. If KVM chose to set the bit, and userspace attempts to clear
> it, these checks would pass.

Okay, that's what I expected too but I thought it would be okay that the 
checks pass.  Are you trying to undo an involuntary API change, and if 
so why was the change a bug and not a fix?

Paolo


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-02-07 18:22     ` Oliver Upton
  2022-02-07 18:27       ` Paolo Bonzini
@ 2022-02-07 18:34       ` Sean Christopherson
  2022-02-07 18:52         ` Oliver Upton
  1 sibling, 1 reply; 26+ messages in thread
From: Sean Christopherson @ 2022-02-07 18:34 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Paolo Bonzini, kvm, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On Mon, Feb 07, 2022, Oliver Upton wrote:
> Until recently, this all sort of 'worked'. Since we called
> kvm_update_cpuid() all the time it was possible for KVM to overwrite the
> bits after the MSR write, just not immediately so. After the whole CPUID
> rework, we only update the VMX control MSRs immediately after a
> KVM_SET_CPUID2, meaning we've missed the case of MSR write after CPUID.

That needs to be explained in the changelog (ditto for patch 02), and arguably
the Fixes tag is wrong too, or at least incomplete.  The commit that truly broke
things was

  aedbaf4f6afd ("KVM: x86: Extract kvm_update_cpuid_runtime() from kvm_update_cpuid()")

I'm guessing this is why Paolo is also confused.  Without understanding that KVM
used too (eventually) enforce its overrides, it looks like you're proposing an
arbitrary, unnecessary ABI change.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-02-07 18:34       ` Sean Christopherson
@ 2022-02-07 18:52         ` Oliver Upton
  0 siblings, 0 replies; 26+ messages in thread
From: Oliver Upton @ 2022-02-07 18:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On Mon, Feb 07, 2022 at 06:34:19PM +0000, Sean Christopherson wrote:
> On Mon, Feb 07, 2022, Oliver Upton wrote:
> > Until recently, this all sort of 'worked'. Since we called
> > kvm_update_cpuid() all the time it was possible for KVM to overwrite the
> > bits after the MSR write, just not immediately so. After the whole CPUID
> > rework, we only update the VMX control MSRs immediately after a
> > KVM_SET_CPUID2, meaning we've missed the case of MSR write after CPUID.
> 
> That needs to be explained in the changelog (ditto for patch 02), and arguably
> the Fixes tag is wrong too, or at least incomplete.  The commit that truly broke
> things was
> 
>   aedbaf4f6afd ("KVM: x86: Extract kvm_update_cpuid_runtime() from kvm_update_cpuid()")
> 
> I'm guessing this is why Paolo is also confused.  Without understanding that KVM
> used too (eventually) enforce its overrides, it looks like you're proposing an
> arbitrary, unnecessary ABI change.

Gah, sorry, I really didn't provide the full context on this. I chose to
blame the original commits for these since it was still possible to
write the MSR and avoid a KVM update (just looking for paths where
kvm_update_cpuid() is not called), but agree that full breakage came
from the above commit.

I'll add some language discussing how commit aedbaf4f6afd ("KVM: x86: Extract
kvm_update_cpuid_runtime() from kvm_update_cpuid()") fully broke this.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  2022-02-07 18:06   ` Sean Christopherson
@ 2022-02-09  1:50     ` Oliver Upton
  2022-02-09 20:23       ` Sean Christopherson
  0 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2022-02-09  1:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On Mon, Feb 7, 2022 at 10:06 AM Sean Christopherson <seanjc@google.com> wrote:

[...]

> > +#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS    (1 << 5)
>
> I'd prefer we include msr_ia32_feature_control_valid_bits in this quirk, it should
> be relatively easy to do since most of the modifications stem from
> vmx_vcpu_after_set_cpuid().  vmx_setup_mce() is a bit odd, but IMO it's worth
> excising as much crud as we can.
>

Sure, this is a good opportunity to rip out the crud.
msr_ia32_feature_control_valid_bits is a bit messy, since the default
value does not contain all the bits we support. At least with
IA32_VM_TRUE_{ENTRY,EXIT}_CTLS we slim down the hardware values to get
the default value.

Not at all objecting, but it looks like we will need to populate some
bits in the default value of the IA32_FEAT_CTL mask, otherwise with
the quirk enabled guests could never set any of the bits in the MSR.

> >  #define KVM_STATE_NESTED_FORMAT_VMX  0
> >  #define KVM_STATE_NESTED_FORMAT_SVM  1
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 395787b7e7ac..60b1b76782e1 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7231,6 +7231,9 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
> >  {
> >       struct vcpu_vmx *vmx = to_vmx(vcpu);
> >
> > +     if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
> > +             return;
>
>
> Probably worth calling out that nested_vmx_cr_fixed1_bits_update() is intentionally
> exempt from this "rule":

Agreed.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  2022-02-09  1:50     ` Oliver Upton
@ 2022-02-09 20:23       ` Sean Christopherson
  0 siblings, 0 replies; 26+ messages in thread
From: Sean Christopherson @ 2022-02-09 20:23 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel

On Tue, Feb 08, 2022, Oliver Upton wrote:
> On Mon, Feb 7, 2022 at 10:06 AM Sean Christopherson <seanjc@google.com> wrote:
> 
> [...]
> 
> > > +#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS    (1 << 5)
> >
> > I'd prefer we include msr_ia32_feature_control_valid_bits in this quirk, it should
> > be relatively easy to do since most of the modifications stem from
> > vmx_vcpu_after_set_cpuid().  vmx_setup_mce() is a bit odd, but IMO it's worth
> > excising as much crud as we can.
> >
> 
> Sure, this is a good opportunity to rip out the crud.
> msr_ia32_feature_control_valid_bits is a bit messy, since the default
> value does not contain all the bits we support. At least with
> IA32_VM_TRUE_{ENTRY,EXIT}_CTLS we slim down the hardware values to get
> the default value.
> 
> Not at all objecting, but it looks like we will need to populate some
> bits in the default value of the IA32_FEAT_CTL mask, otherwise with
> the quirk enabled guests could never set any of the bits in the MSR.

I assume you mean "quirk disabled"?  Because quirks are on by default, i.e. KVM's
default behavior will be to populate msr_ia32_feature_control_valid_bits based on
CPUID updates.

That said, after typing up what I had in mind, I don't think we need a quirk at all.
The only weird part is that KVM doesn't allow host userspace to set the MSR without
first setting CPUID.  That's trivial to fix and we can do so without impacting KVM's
modeling of WRMSR from the guest.  Modeling WRMSR is no different than KVM enforcing
CR4 bits based on CPUID.  The VMX MSRs are weird because they are technically
independent of the non-virtualization support reported in CPUID, i.e. KVM is overstepping
by manipulating the MSRs based on CPUID.

I'll send this is a formal patch, obviously with KVM_SUPPORTED_FEATURE_CONTROL
defined...

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d05b4955d14f..d50ae2de8b51 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1749,11 +1749,16 @@ bool nested_vmx_allowed(struct kvm_vcpu *vcpu)
 }

 static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
-                                                uint64_t val)
+                                                struct msr_data *msr)
 {
-       uint64_t valid_bits = to_vmx(vcpu)->msr_ia32_feature_control_valid_bits;
+       uint64_t valid_bits;

-       return !(val & ~valid_bits);
+       if (msr->host_initiated)
+               valid_bits = KVM_SUPPORTED_FEATURE_CONTROL;
+       else
+               to_vmx(vcpu)->msr_ia32_feature_control_valid_bits;
+
+       return !(msr->data & ~valid_bits);
 }

 static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
@@ -2146,7 +2151,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
                vcpu->arch.mcg_ext_ctl = data;
                break;
        case MSR_IA32_FEAT_CTL:
-               if (!vmx_feature_control_msr_valid(vcpu, data) ||
+               if (!vmx_feature_control_msr_valid(vcpu, msr_info) ||
                    (to_vmx(vcpu)->msr_ia32_feature_control &
                     FEAT_CTL_LOCKED && !msr_info->host_initiated))
                        return 1;

^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-02-09 20:23 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-04 20:46 [PATCH v2 0/7] VMX: nVMX: VMX control MSR fixes Oliver Upton
2022-02-04 20:46 ` [PATCH v2 1/7] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
2022-02-07 17:21   ` Paolo Bonzini
2022-02-07 18:13     ` Sean Christopherson
2022-02-07 18:22     ` Oliver Upton
2022-02-07 18:27       ` Paolo Bonzini
2022-02-07 18:34       ` Sean Christopherson
2022-02-07 18:52         ` Oliver Upton
2022-02-04 20:47 ` [PATCH v2 2/7] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
2022-02-07 16:33   ` Paolo Bonzini
2022-02-04 20:47 ` [PATCH v2 3/7] KVM: nVMX: Roll all entry/exit ctl updates into a single helper Oliver Upton
2022-02-05  7:43   ` kernel test robot
2022-02-05  7:43     ` kernel test robot
2022-02-05 19:41     ` Oliver Upton
2022-02-05 19:41       ` Oliver Upton
2022-02-07 17:56       ` Sean Christopherson
2022-02-07 17:56         ` Sean Christopherson
2022-02-04 20:47 ` [PATCH v2 4/7] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
2022-02-07 18:06   ` Sean Christopherson
2022-02-09  1:50     ` Oliver Upton
2022-02-09 20:23       ` Sean Christopherson
2022-02-04 20:47 ` [PATCH v2 5/7] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits Oliver Upton
2022-02-04 20:47 ` [PATCH v2 6/7] selftests: KVM: Add test for BNDCFGS " Oliver Upton
2022-02-07 16:42   ` Paolo Bonzini
2022-02-04 20:47 ` [PATCH v2 7/7] KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid() Oliver Upton
2022-02-07 16:42   ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.