All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes
@ 2022-03-01  6:03 Oliver Upton
  2022-03-01  6:03 ` [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

There are a few bits in the VMX entry/exit control MSRs where KVM
intervenes. The "load IA32_PERF_GLOBAL_CTRL" and "{load,clear}
IA32_BNDCFGS" VM-{Entry,Exit} control bits are under KVM control and
conditionally exposed based on the guest CPUID. If the guest CPUID
provides a supporting vPMU or MPX, the respective VMX control bits are
enabled.

These rules have not been upheld in all cases, though. Since commit
aedbaf4f6afd ("KVM: x86: Extract kvm_update_cpuid_runtime() from
kvm_update_cpuid()") KVM will only apply its updates to the MSRs
when the guest CPUID is set. Before, KVM called kvm_update_cpuid()
frequently when running a guest, which had the effect of overriding
any userspace setting of these MSRs.

If an unsuspecting VMM writes to these VMX control MSRs after the
CPUID has been set, KVM fails to configure the appropriate bits.
There does not exist any ordering requirements between setting CPUID
and writing to an MSR.

At the same time, we probably want to get KVM out of the business of
fiddling with these control MSRs. This series adds a quirk that allows
userspace to opt-out of KVM tweaks to these MSRs.

[Patch 1-2]
Fix the immediate issue by hooking writes to the VMX control MSRs. If
userspace writes to one of the affected MSRs, reapply KVMs tweaks to
these registers. Note that these patches employ the minimal change
required to fix the issue, in case they are worthy of a backport.

[Patch 3]
With the hook added in Patch 2, updating
IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs is unnecessary on PMU refresh. Drop
everything related to updating these controls on PMU refresh.

[Patch 4]
KVM_CAP_DISABLE_QUIRKS2 is broken beyond repair. Create a new capability
that makes quirks discoverable and rejects invalid bits.

[Patch 5]
Add a quirk to opt out of KVM ownership of the aforementioned MSRs. It
is really userspace's responsibility to set up sane vCPU state.

[Patches 6-8]
Add test cases to verify expected behavior with the quirk enabled (KVM
control) and quirk disabled (userspace control).

Applies cleanly to kvm/queue, at the following commit:

  625e7ef7da1a ("KVM: selftests: Add test to verify KVM handling of ICR")

Tested with the included selftest on an Intel Skylake machine.

v3: http://lore.kernel.org/r/20220225200823.2522321-1-oupton@google.com

v3 -> v4:
 - Rebased to kvm/queue. Avoids conflicts with new CAPs and commit
   0bcd556e15f9 ("KVM: nVMX: Refactor PMU refresh to avoid referencing
   kvm_x86_ops.pmu_ops") on kvm/queue.
 - Grabbed KVM_CAP_DISABLE_QUIRKS2 patch, since this series also
   introduces a quirk.
 - Fix typo in KVM_CAP_DISABLE_QUIRKS2 documentation (Sean)
 - Eliminated the need to refresh 'load IA32_PGC' bits from PMU refresh.
 - Use consistent formatting to make test cases more easily readable
   (David Dunn)
 - Use correct 'Fixes: ' tag and correct a typo in Patch 2 changelog.

Oliver Upton (8):
  KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR
    write
  KVM: nVMX: Drop nested_vmx_pmu_refresh()
  KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
  KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  selftests: KVM: Separate static alloc from KVM_GET_SUPPORTED_CPUID
    call
  selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits
  selftests: KVM: Add test for BNDCFGS VMX control MSR bits

 Documentation/virt/kvm/api.rst                |  74 +++++
 arch/x86/include/asm/kvm_host.h               |   8 +
 arch/x86/include/uapi/asm/kvm.h               |  11 +-
 arch/x86/kvm/pmu.h                            |   5 +
 arch/x86/kvm/vmx/nested.c                     |  31 +--
 arch/x86/kvm/vmx/nested.h                     |   2 -
 arch/x86/kvm/vmx/pmu_intel.c                  |   3 -
 arch/x86/kvm/vmx/vmx.c                        |  17 +-
 arch/x86/kvm/vmx/vmx.h                        |   2 +
 arch/x86/kvm/x86.c                            |   8 +
 include/uapi/linux/kvm.h                      |   1 +
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/processor.h  |   1 +
 .../selftests/kvm/include/x86_64/vmx.h        |   2 +
 .../selftests/kvm/lib/x86_64/processor.c      |  33 ++-
 .../kvm/x86_64/vmx_control_msrs_test.c        | 257 ++++++++++++++++++
 17 files changed, 418 insertions(+), 39 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c

-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  2022-03-01 18:00   ` Paolo Bonzini
  2022-03-01  6:03 ` [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

Since commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls
when guest MPX disabled"), KVM has taken ownership of the "load
IA32_BNDCFGS" and "clear IA32_BNDCFGS" VMX entry/exit controls. The ABI
is that these bits must be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS
MSRs if the guest's CPUID supports MPX, and clear otherwise.

However, commit aedbaf4f6afd ("KVM: x86: Extract
kvm_update_cpuid_runtime() from kvm_update_cpuid()") partially broke KVM
ownership of the aforementioned bits. Before, kvm_update_cpuid() was
exercised frequently when running a guest and constantly applied its own
changes to the BNDCFGS bits. Now, the BNDCFGS bits are only ever
updated after a KVM_SET_CPUID/KVM_SET_CPUID2 ioctl, meaning that a
subsequent MSR write from userspace will clobber these values.

Uphold the old ABI by reapplying KVM's tweaks to the BNDCFGS bits after
an MSR write from userspace.

Fixes: aedbaf4f6afd ("KVM: x86: Extract kvm_update_cpuid_runtime() from kvm_update_cpuid()")
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c | 9 +++++++++
 arch/x86/kvm/vmx/vmx.c    | 2 +-
 arch/x86/kvm/vmx/vmx.h    | 2 ++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 1dfe23963a9e..a13f8f4e3d82 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1291,6 +1291,15 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
 
 	*lowp = data;
 	*highp = data >> 32;
+
+	/*
+	 * Ensure KVM fiddling with these MSRs is preserved after userspace
+	 * write.
+	 */
+	if (msr_index == MSR_IA32_VMX_TRUE_ENTRY_CTLS ||
+	    msr_index == MSR_IA32_VMX_TRUE_EXIT_CTLS)
+		nested_vmx_entry_exit_ctls_update(&vmx->vcpu);
+
 	return 0;
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b325f99b2177..3a97220c5f78 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7246,7 +7246,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 #undef cr4_fixed1_update
 }
 
-static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
+void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 7f2c82e7f38f..e134e2763502 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -423,6 +423,8 @@ static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
 
 void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
 
+void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
+
 /*
  * Note, early Intel manuals have the write-low and read-high bitmap offsets
  * the wrong way round.  The bitmaps control MSRs 0x00000000-0x00001fff and
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR write
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
  2022-03-01  6:03 ` [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  2022-03-01 18:01   ` Paolo Bonzini
  2022-04-07  0:21   ` Sean Christopherson
  2022-03-01  6:03 ` [PATCH v4 3/8] KVM: nVMX: Drop nested_vmx_pmu_refresh() Oliver Upton
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

Since commit 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
VM-{Entry,Exit} control"), KVM has taken ownership of the "load
IA32_PERF_GLOBAL_CTRL" VMX entry/exit control bits. The ABI is that
these bits will be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if
the guest's CPUID exposes a vPMU that supports the IA32_PERF_GLOBAL_CTRL
MSR (CPUID.0AH:EAX[7:0] > 1), and clear otherwise.

However, commit aedbaf4f6afd ("KVM: x86: Extract
kvm_update_cpuid_runtime() from kvm_update_cpuid()") partially broke KVM
ownership of the aforementioned bits. Before, kvm_update_cpuid() was
exercised frequently when running a guest and constantly applied its own
changes to the "load IA32_PERF_GLOBAL_CTRL" bits. Now, the "load
IA32_PERF_GLOBAL_CTRL" bits are only ever updated after a
KVM_SET_CPUID/KVM_SET_CPUID2 ioctl, meaning that a subsequent MSR write
from userspace will clobber these values.

Note that older kernels without commit c44d9b34701d ("KVM: x86: Invoke
vendor's vcpu_after_set_cpuid() after all common updates") still require
that the entry/exit controls be updated from kvm_pmu_refresh(). Leave
the benign call in place to allow for cleaner backporting and punt the
cleanup to a later change.

Uphold the old ABI by reapplying KVM's tweaks to the "load
IA32_PERF_GLOBAL_CTRL" bits after an MSR write from userspace.

Fixes: aedbaf4f6afd ("KVM: x86: Extract kvm_update_cpuid_runtime() from kvm_update_cpuid()")
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/pmu.h     |  5 +++++
 arch/x86/kvm/vmx/vmx.c | 12 ++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7a7b8d5b775e..2d9995668e0b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -140,6 +140,11 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
 	return sample_period;
 }
 
+static inline u8 kvm_pmu_version(struct kvm_vcpu *vcpu)
+{
+	return vcpu_to_pmu(vcpu)->version;
+}
+
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3a97220c5f78..224ef4c19a5d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7261,6 +7261,18 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
 		}
 	}
+
+	/*
+	 * KVM supports a 1-setting of the "load IA32_PERF_GLOBAL_CTRL"
+	 * VM-{Entry,Exit} controls if the vPMU supports IA32_PERF_GLOBAL_CTRL.
+	 */
+	if (kvm_pmu_version(vcpu) >= 2) {
+		vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+	} else {
+		vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+	}
 }
 
 static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 3/8] KVM: nVMX: Drop nested_vmx_pmu_refresh()
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
  2022-03-01  6:03 ` [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
  2022-03-01  6:03 ` [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  2022-03-01  6:03 ` [PATCH v4 4/8] KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 Oliver Upton
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

nested_vmx_pmu_refresh() is now unneeded, as the call to
nested_vmx_entry_exit_ctls_update() in vmx_vcpu_after_set_cpuid()
guarantees that the VM-{Entry,Exit} control MSR changes are applied
after setting CPUID. Drop all vestiges of nested_vmx_pmu_refresh().

No functional change intended.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c    | 22 ----------------------
 arch/x86/kvm/vmx/nested.h    |  2 --
 arch/x86/kvm/vmx/pmu_intel.c |  3 ---
 3 files changed, 27 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index a13f8f4e3d82..dec45606ce0c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4806,28 +4806,6 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 	return 0;
 }
 
-void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
-			    bool vcpu_has_perf_global_ctrl)
-{
-	struct vcpu_vmx *vmx;
-
-	if (!nested_vmx_allowed(vcpu))
-		return;
-
-	vmx = to_vmx(vcpu);
-	if (vcpu_has_perf_global_ctrl) {
-		vmx->nested.msrs.entry_ctls_high |=
-				VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-		vmx->nested.msrs.exit_ctls_high |=
-				VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-	} else {
-		vmx->nested.msrs.entry_ctls_high &=
-				~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
-		vmx->nested.msrs.exit_ctls_high &=
-				~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
-	}
-}
-
 static int nested_vmx_get_vmptr(struct kvm_vcpu *vcpu, gpa_t *vmpointer,
 				int *ret)
 {
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index c92cea0b8ccc..14ad756aac46 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -32,8 +32,6 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
 int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
 int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 			u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
-void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
-			    bool vcpu_has_perf_global_ctrl);
 void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
 bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
 				 int size);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4e5b1eeeb77c..6433a1091333 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -541,9 +541,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
-	nested_vmx_pmu_refresh(vcpu,
-			       intel_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL));
-
 	if (intel_pmu_lbr_is_compatible(vcpu))
 		x86_perf_get_lbr(&lbr_desc->records);
 	else
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 4/8] KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
                   ` (2 preceding siblings ...)
  2022-03-01  6:03 ` [PATCH v4 3/8] KVM: nVMX: Drop nested_vmx_pmu_refresh() Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  2022-03-09 16:01   ` Paolo Bonzini
  2022-03-01  6:03 ` [PATCH v4 5/8] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

KVM_CAP_DISABLE_QUIRKS is irrevocably broken. The capability does not
advertise the set of quirks which may be disabled to userspace, so it is
impossible to predict the behavior of KVM. Worse yet,
KVM_CAP_DISABLE_QUIRKS will tolerate any value for cap->args[0], meaning
it fails to reject attempts to set invalid quirk bits.

The only valid workaround for the quirky quirks API is to add a new CAP.
Actually advertise the set of quirks that can be disabled to userspace
so it can predict KVM's behavior. Reject values for cap->args[0] that
contain invalid bits.

Finally, add documentation for the new capability and describe the
existing quirks.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/api.rst  | 50 +++++++++++++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  7 +++++
 arch/x86/kvm/x86.c              |  8 ++++++
 include/uapi/linux/kvm.h        |  1 +
 4 files changed, 66 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f5d011351016..8f7240e79cc0 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7079,6 +7079,56 @@ resource that is controlled with the H_SET_MODE hypercall.
 This capability allows a guest kernel to use a better-performance mode for
 handling interrupts and system calls.
 
+7.31 KVM_CAP_DISABLE_QUIRKS2
+----------------------------
+
+:Capability: KVM_CAP_DISABLE_QUIRKS2
+:Parameters: args[0] - set of KVM quirks to disable
+:Architectures: x86
+:Type: vm
+
+This capability, if enabled, will cause KVM to disable some behavior
+quirks.
+
+Calling KVM_CHECK_EXTENSION for this capability returns a bitmask of
+quirks that can be disabled in KVM.
+
+The argument to KVM_ENABLE_CAP for this capability is a bitmask of
+quirks to disable, and must be a subset of the bitmask returned by
+KVM_CHECK_EXTENSION.
+
+The valid bits in cap.args[0] are:
+
+=================================== ============================================
+ KVM_X86_QUIRK_LINT0_REENABLED      By default, the reset value for the LVT
+                                    LINT0 register is 0x700 (APIC_MODE_EXTINT).
+                                    When this quirk is disabled, the reset value
+                                    is 0x10000 (APIC_LVT_MASKED).
+
+ KVM_X86_QUIRK_CD_NW_CLEARED        By default, KVM clears CR0.CD and CR0.NW.
+                                    When this quirk is disabled, KVM does not
+                                    change the value of CR0.CD and CR0.NW.
+
+ KVM_X86_QUIRK_LAPIC_MMIO_HOLE      By default, the MMIO LAPIC interface is
+                                    available even when configured for x2APIC
+                                    mode. When this quirk is disabled, KVM
+                                    disables the MMIO LAPIC interface if the
+                                    LAPIC is in x2APIC mode.
+
+ KVM_X86_QUIRK_OUT_7E_INC_RIP       By default, KVM pre-increments %rip before
+                                    exiting to userspace for an OUT instruction
+                                    to port 0x7e. When this quirk is disabled,
+                                    KVM does not pre-increment %rip before
+                                    exiting to userspace.
+
+ KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT When this quirk is disabled, KVM sets
+                                    CPUID.01H:ECX[bit 3] (MONITOR/MWAIT) if
+                                    IA32_MISC_ENABLE[bit 18] (MWAIT) is set.
+                                    Additionally, when this quirk is disabled,
+                                    KVM clears CPUID.01H:ECX[bit 3] if
+                                    IA32_MISC_ENABLE[bit 18] is cleared.
+=================================== ============================================
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ccec837e520d..bc3405565967 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1963,4 +1963,11 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
 #define KVM_CLOCK_VALID_FLAGS						\
 	(KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC)
 
+#define KVM_X86_VALID_QUIRKS			\
+	(KVM_X86_QUIRK_LINT0_REENABLED |	\
+	 KVM_X86_QUIRK_CD_NW_CLEARED |		\
+	 KVM_X86_QUIRK_LAPIC_MMIO_HOLE |	\
+	 KVM_X86_QUIRK_OUT_7E_INC_RIP |		\
+	 KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c712c33c1521..ec9b602be8da 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4350,6 +4350,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
 		break;
 	}
+	case KVM_CAP_DISABLE_QUIRKS2:
+		r = KVM_X86_VALID_QUIRKS;
+		break;
 	default:
 		break;
 	}
@@ -5896,6 +5899,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		return -EINVAL;
 
 	switch (cap->cap) {
+	case KVM_CAP_DISABLE_QUIRKS2:
+		r = -EINVAL;
+		if (cap->args[0] & ~KVM_X86_VALID_QUIRKS)
+			break;
+		fallthrough;
 	case KVM_CAP_DISABLE_QUIRKS:
 		kvm->arch.disabled_quirks = cap->args[0];
 		r = 0;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d2f1efc3aa35..91a6fe4e02c0 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1143,6 +1143,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_AIL_MODE_3 210
 #define KVM_CAP_S390_MEM_OP_EXTENSION 211
 #define KVM_CAP_PMU_CAPABILITY 212
+#define KVM_CAP_DISABLE_QUIRKS2 213
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 5/8] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
                   ` (3 preceding siblings ...)
  2022-03-01  6:03 ` [PATCH v4 4/8] KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  2022-04-07  0:28   ` Sean Christopherson
  2022-03-01  6:03 ` [PATCH v4 6/8] selftests: KVM: Separate static alloc from KVM_GET_SUPPORTED_CPUID call Oliver Upton
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

KVM really has no business messing with the vCPU state. Nonetheless, it
has become ABI for KVM to adjust certain bits of the VMX entry/exit
control MSRs depending on the guest CPUID. Namely, the bits associated
with the IA32_PERF_GLOBAL_CTRL and IA32_BNDCFGS MSRs were conditionally
enabled if the guest CPUID allows for it.

Allow userspace to opt-out of changes to VMX control MSRs by adding a
new KVM quirk.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
---
 Documentation/virt/kvm/api.rst  | 24 ++++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/include/uapi/asm/kvm.h | 11 ++++++-----
 arch/x86/kvm/vmx/vmx.c          |  3 +++
 4 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 8f7240e79cc0..9bb79ca9de89 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7127,6 +7127,30 @@ The valid bits in cap.args[0] are:
                                     Additionally, when this quirk is disabled,
                                     KVM clears CPUID.01H:ECX[bit 3] if
                                     IA32_MISC_ENABLE[bit 18] is cleared.
+
+ KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS  By default, KVM adjusts the values of
+                                    IA32_VMX_TRUE_ENTRY_CTLS and
+                                    IA32_VMX_TRUE_EXIT_CTLS MSRs under the
+                                    following conditions:
+
+                                    - If CPUID.07H:EBX[bit 14] (MPX) is set, KVM
+                                      sets IA32_VMX_TRUE_ENTRY_CTLS[bit 48]
+                                      ('load IA32_BNDCFGS') and
+                                      IA32_VMX_TRUE_EXIT_CTLS[bit 55]
+                                      ('clear IA32_BNDCFGS'). Otherwise, these
+                                      corresponding MSR bits are cleared.
+                                    - If CPUID.0AH:EAX[bits 7:0] > 1, KVM sets
+                                      IA32_VMX_TRUE_ENTRY_CTLS[bit 45]
+                                      ('load IA32_PERF_GLOBAL_CTRL') and
+                                      IA32_VMX_TRUE_EXIT_CTLS[bit 44]
+                                      ('load IA32_PERF_GLOBAL_CTRL'). Otherwise,
+                                      these corresponding MSR bits are cleared.
+
+                                    When this quirk is disabled, KVM will not
+                                    change the values of
+                                    IA32_VMX_TRUE_ENTRY_CTLS or
+                                    IA32_VMX_TRUE_EXIT_CTLS based on the
+                                    aforementioned CPUID bits.
 =================================== ============================================
 
 8. Other capabilities.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index bc3405565967..1b905e6c4760 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1968,6 +1968,7 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
 	 KVM_X86_QUIRK_CD_NW_CLEARED |		\
 	 KVM_X86_QUIRK_LAPIC_MMIO_HOLE |	\
 	 KVM_X86_QUIRK_OUT_7E_INC_RIP |		\
-	 KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)
+	 KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT |	\
+	 KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS)
 
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index bf6e96011dfe..acbab6a97fae 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -428,11 +428,12 @@ struct kvm_sync_regs {
 	struct kvm_vcpu_events events;
 };
 
-#define KVM_X86_QUIRK_LINT0_REENABLED	   (1 << 0)
-#define KVM_X86_QUIRK_CD_NW_CLEARED	   (1 << 1)
-#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE	   (1 << 2)
-#define KVM_X86_QUIRK_OUT_7E_INC_RIP	   (1 << 3)
-#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
+#define KVM_X86_QUIRK_LINT0_REENABLED		(1 << 0)
+#define KVM_X86_QUIRK_CD_NW_CLEARED		(1 << 1)
+#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE		(1 << 2)
+#define KVM_X86_QUIRK_OUT_7E_INC_RIP		(1 << 3)
+#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT	(1 << 4)
+#define KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS	(1 << 5)
 
 #define KVM_STATE_NESTED_FORMAT_VMX	0
 #define KVM_STATE_NESTED_FORMAT_SVM	1
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 224ef4c19a5d..21b98bad1319 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7250,6 +7250,9 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
+		return;
+
 	if (kvm_mpx_supported()) {
 		bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX);
 
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 6/8] selftests: KVM: Separate static alloc from KVM_GET_SUPPORTED_CPUID call
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
                   ` (4 preceding siblings ...)
  2022-03-01  6:03 ` [PATCH v4 5/8] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  2022-03-01  6:03 ` [PATCH v4 7/8] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits Oliver Upton
  2022-03-01  6:03 ` [PATCH v4 8/8] selftests: KVM: Add test for BNDCFGS " Oliver Upton
  7 siblings, 0 replies; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

The library code allows for a single allocation of CPUID to store the
value returned by KVM_GET_SUPPORTED_CPUID. Subsequent calls to the
helper simply return a pointer to the aforementioned allocation. A
subsequent change introduces a selftest that contains test cases which
adjust the CPUID value before calling KVM_SET_CPUID2. Using a single
definition of CPUID is problematic, as the changes are not isolated to a
single test case.

Create a helper that allocates memory for CPUID on a per-call basis.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 .../selftests/kvm/include/x86_64/processor.h  |  1 +
 .../selftests/kvm/lib/x86_64/processor.c      | 33 +++++++++++++++----
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 8a470da7b71a..e36ab7de7717 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -390,6 +390,7 @@ void kvm_x86_state_cleanup(struct kvm_x86_state *state);
 
 struct kvm_msr_list *kvm_get_msr_index_list(void);
 uint64_t kvm_get_feature_msr(uint64_t msr_index);
+struct kvm_cpuid2 *_kvm_get_supported_cpuid(void);
 struct kvm_cpuid2 *kvm_get_supported_cpuid(void);
 
 struct kvm_cpuid2 *vcpu_get_cpuid(struct kvm_vm *vm, uint32_t vcpuid);
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 9f000dfb5594..b8921cd09ede 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -772,17 +772,14 @@ static struct kvm_cpuid2 *allocate_kvm_cpuid2(void)
  *
  * Return: The supported KVM CPUID
  *
- * Get the guest CPUID supported by KVM.
+ * Gets the supported guest CPUID with the KVM_GET_SUPPORTED_CPUID ioctl.
  */
-struct kvm_cpuid2 *kvm_get_supported_cpuid(void)
+struct kvm_cpuid2 *_kvm_get_supported_cpuid(void)
 {
-	static struct kvm_cpuid2 *cpuid;
+	struct kvm_cpuid2 *cpuid;
 	int ret;
 	int kvm_fd;
 
-	if (cpuid)
-		return cpuid;
-
 	cpuid = allocate_kvm_cpuid2();
 	kvm_fd = open_kvm_dev_path_or_exit();
 
@@ -794,6 +791,30 @@ struct kvm_cpuid2 *kvm_get_supported_cpuid(void)
 	return cpuid;
 }
 
+/*
+ * KVM Supported CPUID Get
+ *
+ * Input Args: None
+ *
+ * Output Args: None
+ *
+ * Return: The supported KVM CPUID
+ *
+ * Gets the supported guest CPUID with the KVM_GET_SUPPORTED_CPUID ioctl.
+ * The first call creates a static allocation of CPUID for the process.
+ * Subsequent calls will return a pointer to the previously allocated CPUID.
+ */
+struct kvm_cpuid2 *kvm_get_supported_cpuid(void)
+{
+	static struct kvm_cpuid2 *cpuid;
+
+	if (cpuid)
+		return cpuid;
+
+	cpuid = _kvm_get_supported_cpuid();
+	return cpuid;
+}
+
 /*
  * KVM Get MSR
  *
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 7/8] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
                   ` (5 preceding siblings ...)
  2022-03-01  6:03 ` [PATCH v4 6/8] selftests: KVM: Separate static alloc from KVM_GET_SUPPORTED_CPUID call Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  2022-03-01 16:59   ` David Dunn
  2022-03-01  6:03 ` [PATCH v4 8/8] selftests: KVM: Add test for BNDCFGS " Oliver Upton
  7 siblings, 1 reply; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

Test that the default behavior of KVM is to ignore userspace MSR writes
and conditionally expose the "load IA32_PERF_GLOBAL_CTRL" bits in the
VMX control MSRs if the guest CPUID exposes a supporting vPMU.
Additionally, test that when the corresponding quirk is disabled,
userspace can still clear these bits regardless of what is exposed in
CPUID.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/x86_64/vmx_control_msrs_test.c        | 160 ++++++++++++++++++
 3 files changed, 162 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 052ddfe4b23a..38edeace1432 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -38,6 +38,7 @@
 /x86_64/userspace_msr_exit_test
 /x86_64/vmx_apic_access_test
 /x86_64/vmx_close_while_nested_test
+/x86_64/vmx_control_msrs_test
 /x86_64/vmx_dirty_log_test
 /x86_64/vmx_exception_with_invalid_guest_state
 /x86_64/vmx_invalid_nested_guest_state
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index f7fa5655e535..a1f0c5885b6d 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -70,6 +70,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/userspace_io_test
 TEST_GEN_PROGS_x86_64 += x86_64/userspace_msr_exit_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_apic_access_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_close_while_nested_test
+TEST_GEN_PROGS_x86_64 += x86_64/vmx_control_msrs_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_exception_with_invalid_guest_state
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_invalid_nested_guest_state
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
new file mode 100644
index 000000000000..4ab780483e15
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VMX control MSR test
+ *
+ * Copyright (C) 2022 Google LLC.
+ *
+ * Tests for KVM ownership of bits in the VMX entry/exit control MSRs. Checks
+ * that KVM will set owned bits where appropriate, and will not if
+ * KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS is disabled.
+ */
+
+#include "kvm_util.h"
+#include "vmx.h"
+
+#define VCPU_ID 0
+
+static void get_vmx_control_msr(struct kvm_vm *vm, uint32_t msr_index,
+				uint32_t *low, uint32_t *high)
+{
+	uint64_t val;
+
+	val = vcpu_get_msr(vm, VCPU_ID, msr_index);
+	*low = val;
+	*high = val >> 32;
+}
+
+static void set_vmx_control_msr(struct kvm_vm *vm, uint32_t msr_index,
+				uint32_t low, uint32_t high)
+{
+	uint64_t val = (((uint64_t) high) << 32) | low;
+
+	vcpu_set_msr(vm, VCPU_ID, msr_index, val);
+}
+
+static void test_vmx_control_msr(struct kvm_vm *vm, uint32_t msr_index, uint32_t set,
+				 uint32_t clear, uint32_t exp_set, uint32_t exp_clear)
+{
+	uint32_t low, high;
+
+	get_vmx_control_msr(vm, msr_index, &low, &high);
+
+	high &= ~clear;
+	high |= set;
+
+	set_vmx_control_msr(vm, msr_index, low, high);
+
+	get_vmx_control_msr(vm, msr_index, &low, &high);
+	ASSERT_EQ(high & exp_set, exp_set);
+	ASSERT_EQ(~high & exp_clear, exp_clear);
+}
+
+static void clear_performance_monitoring_leaf(struct kvm_cpuid2 *cpuid)
+{
+	struct kvm_cpuid_entry2 ent = {0};
+
+	ent.function = 0xa;
+	TEST_ASSERT(set_cpuid(cpuid, &ent),
+		    "failed to clear Architectual Performance Monitoring leaf (0xA)");
+}
+
+static void load_perf_global_ctrl_test(struct kvm_vm *vm)
+{
+	uint32_t entry_low, entry_high, exit_low, exit_high;
+	struct kvm_enable_cap cap = {0};
+	struct kvm_cpuid2 *cpuid;
+
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, &entry_low, &entry_high);
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, &exit_low, &exit_high);
+
+	if (!(entry_high & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) ||
+	    !(exit_high & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)) {
+		print_skip("\"load IA32_PERF_GLOBAL_CTRL\" VM-{Entry,Exit} controls not supported");
+		return;
+	}
+
+	/*
+	 * Test that KVM will set these bits regardless of userspace if the
+	 * guest CPUID exposes a supporting vPMU.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
+			     0,						/* set */
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,	/* clear */
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,	/* exp_set */
+			     0);					/* exp_clear */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
+			     0,						/* set */
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,	/* clear */
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,	/* exp_set */
+			     0);					/* exp_clear */
+
+	/*
+	 * Hide vPMU in CPUID
+	 */
+	cpuid = _kvm_get_supported_cpuid();
+	clear_performance_monitoring_leaf(cpuid);
+	vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+	free(cpuid);
+
+	/*
+	 * Test that KVM will clear these bits if guest CPUID does not expose a
+	 * supporting vPMU.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
+			     0,						/* set */
+			     0,						/* clear */
+			     0,						/* exp_set */
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);	/* exp_clear */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
+			     0,						/* set */
+			     0,						/* clear */
+			     0,						/* exp_set */
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);	/* exp_clear */
+
+	/*
+	 * Re-enable vPMU in CPUID
+	 */
+	vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+
+	/*
+	 * Disable the quirk, giving userspace control of the VMX capability
+	 * MSRs.
+	 */
+	cap.cap = KVM_CAP_DISABLE_QUIRKS2;
+	cap.args[0] = KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS;
+	vm_enable_cap(vm, &cap);
+
+	/*
+	 * Test that userspace can clear these bits, even if it exposes a vPMU
+	 * that supports IA32_PERF_GLOBAL_CTRL.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
+			     0,						/* set */
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,	/* clear */
+			     0,						/* exp_set */
+			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);	/* exp_clear */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
+			     0,						/* set */
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,	/* clear */
+			     0,						/* exp_set */
+			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);	/* exp_clear */
+}
+
+int main(void)
+{
+	struct kvm_vm *vm;
+
+	if (!kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2)) {
+		print_skip("KVM_CAP_DISABLE_QUIRKS2 not supported");
+		exit(KSFT_SKIP);
+	}
+
+	nested_vmx_check_supported();
+
+	/* No need to run a guest for these tests */
+	vm = vm_create_default(VCPU_ID, 0, NULL);
+
+	load_perf_global_ctrl_test(vm);
+
+	kvm_vm_free(vm);
+}
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 8/8] selftests: KVM: Add test for BNDCFGS VMX control MSR bits
  2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
                   ` (6 preceding siblings ...)
  2022-03-01  6:03 ` [PATCH v4 7/8] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits Oliver Upton
@ 2022-03-01  6:03 ` Oliver Upton
  7 siblings, 0 replies; 31+ messages in thread
From: Oliver Upton @ 2022-03-01  6:03 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn, Oliver Upton

Test that the default behavior of KVM is to ignore userspace MSR writes
and conditionally expose the "{load,clear} IA32_BNDCFGS" bits in the VMX
control MSRs if the guest CPUID exposes MPX. Additionally, test that
when the corresponding quirk is disabled, userspace can still clear
these bits regardless of what is exposed in CPUID.

Signed-off-by: Oliver Upton <oupton@google.com>
---
 .../selftests/kvm/include/x86_64/vmx.h        |  2 +
 .../kvm/x86_64/vmx_control_msrs_test.c        | 97 +++++++++++++++++++
 2 files changed, 99 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/x86_64/vmx.h b/tools/testing/selftests/kvm/include/x86_64/vmx.h
index 583ceb0d1457..811c66d9be74 100644
--- a/tools/testing/selftests/kvm/include/x86_64/vmx.h
+++ b/tools/testing/selftests/kvm/include/x86_64/vmx.h
@@ -80,6 +80,7 @@
 #define VM_EXIT_SAVE_IA32_EFER			0x00100000
 #define VM_EXIT_LOAD_IA32_EFER			0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER	0x00400000
+#define VM_EXIT_CLEAR_BNDCFGS			0x00800000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -90,6 +91,7 @@
 #define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL	0x00002000
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER			0x00008000
+#define VM_ENTRY_LOAD_BNDCFGS			0x00010000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
index 4ab780483e15..132e7e435bfa 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_control_msrs_test.c
@@ -138,6 +138,102 @@ static void load_perf_global_ctrl_test(struct kvm_vm *vm)
 			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,	/* clear */
 			     0,						/* exp_set */
 			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);	/* exp_clear */
+
+	/* cleanup, enable the quirk again */
+	cap.args[0] = 0;
+	vm_enable_cap(vm, &cap);
+}
+
+static void clear_mpx_bit(struct kvm_cpuid2 *cpuid)
+{
+	struct kvm_cpuid_entry2 ent;
+
+	ent = *kvm_get_supported_cpuid_index(0x7, 0x0);
+	ent.ebx &= ~(1u << 14);
+
+	TEST_ASSERT(set_cpuid(cpuid, &ent),
+		    "failed to clear CPUID.07H:EBX[14] (MPX)");
+}
+
+static void bndcfgs_test(struct kvm_vm *vm)
+{
+	uint32_t entry_low, entry_high, exit_low, exit_high;
+	struct kvm_enable_cap cap = {0};
+	struct kvm_cpuid2 *cpuid;
+
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS, &entry_low, &entry_high);
+	get_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS, &exit_low, &exit_high);
+
+	if (!(entry_high & VM_ENTRY_LOAD_BNDCFGS) ||
+	    !(exit_high & VM_EXIT_CLEAR_BNDCFGS)) {
+		print_skip("\"load/clear IA32_BNDCFGS\" VM-{Entry,Exit} controls not supported");
+		return;
+	}
+
+	/*
+	 * Test that KVM will set these bits regardless of userspace if the
+	 * guest CPUID exposes MPX.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
+			     0,					/* set */
+			     VM_ENTRY_LOAD_BNDCFGS,		/* clear */
+			     VM_ENTRY_LOAD_BNDCFGS,		/* exp_set */
+			     0);				/* exp_clear */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
+			     0,					/* set */
+			     VM_EXIT_CLEAR_BNDCFGS,		/* clear */
+			     VM_EXIT_CLEAR_BNDCFGS,		/* exp_set */
+			     0);				/* exp_clear */
+
+	/*
+	 * Hide MPX in CPUID
+	 */
+	cpuid = _kvm_get_supported_cpuid();
+	clear_mpx_bit(cpuid);
+	vcpu_set_cpuid(vm, VCPU_ID, cpuid);
+	free(cpuid);
+
+	/*
+	 * Test that KVM will clear these bits if the guest CPUID does not
+	 * expose MPX
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
+			     0,					/* set */
+			     0,					/* clear */
+			     0,					/* exp_set */
+			     VM_ENTRY_LOAD_BNDCFGS);		/* exp_clear */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
+			     0,					/* set */
+			     0,					/* clear */
+			     0,					/* exp_set */
+			     VM_EXIT_CLEAR_BNDCFGS);		/* exp_clear */
+
+	/*
+	 * Re-enable MPX in CPUID
+	 */
+	vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+
+	/*
+	 * Disable the quirk, giving userspace control of the VMX capability
+	 * MSRs.
+	 */
+	cap.cap = KVM_CAP_DISABLE_QUIRKS2;
+	cap.args[0] = KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS;
+	vm_enable_cap(vm, &cap);
+
+	/*
+	 * Test that userspace can clear these bits, even if it exposes MPX.
+	 */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
+			     0,					/* set */
+			     VM_ENTRY_LOAD_BNDCFGS,		/* clear */
+			     0,					/* exp_set */
+			     VM_ENTRY_LOAD_BNDCFGS);		/* exp_clear */
+	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
+			     0,					/* set */
+			     VM_EXIT_CLEAR_BNDCFGS,		/* clear */
+			     0,					/* exp_set */
+			     VM_EXIT_CLEAR_BNDCFGS);		/* exp_clear */
 }
 
 int main(void)
@@ -155,6 +251,7 @@ int main(void)
 	vm = vm_create_default(VCPU_ID, 0, NULL);
 
 	load_perf_global_ctrl_test(vm);
+	bndcfgs_test(vm);
 
 	kvm_vm_free(vm);
 }
-- 
2.35.1.574.g5d30c73bfb-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 7/8] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits
  2022-03-01  6:03 ` [PATCH v4 7/8] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits Oliver Upton
@ 2022-03-01 16:59   ` David Dunn
  0 siblings, 0 replies; 31+ messages in thread
From: David Dunn @ 2022-03-01 16:59 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel

Reviewed-by: David Dunn <daviddunn@google.com>

See below.

On Mon, Feb 28, 2022 at 10:04 PM Oliver Upton <oupton@google.com> wrote:
>
> +       /*
> +        * Re-enable vPMU in CPUID
> +        */
> +       vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
> +
> +       /*
> +        * Disable the quirk, giving userspace control of the VMX capability
> +        * MSRs.
> +        */
> +       cap.cap = KVM_CAP_DISABLE_QUIRKS2;
> +       cap.args[0] = KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS;
> +       vm_enable_cap(vm, &cap);
> +
> +       /*
> +        * Test that userspace can clear these bits, even if it exposes a vPMU
> +        * that supports IA32_PERF_GLOBAL_CTRL.
> +        */
> +       test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
> +                            0,                                         /* set */
> +                            VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,       /* clear */
> +                            0,                                         /* exp_set */
> +                            VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);      /* exp_clear */
> +       test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
> +                            0,                                         /* set */
> +                            VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,        /* clear */
> +                            0,                                         /* exp_set */
> +                            VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);       /* exp_clear */
> +}

Appreciate the formatting change.   Can you also add a test for cpuid
clear while quirk is disabled?

Thanks,

Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-01  6:03 ` [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
@ 2022-03-01 18:00   ` Paolo Bonzini
  2022-03-01 18:43     ` Oliver Upton
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2022-03-01 18:00 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, David Dunn

On 3/1/22 07:03, Oliver Upton wrote:
> +
> +	/*
> +	 * Ensure KVM fiddling with these MSRs is preserved after userspace
> +	 * write.
> +	 */
> +	if (msr_index == MSR_IA32_VMX_TRUE_ENTRY_CTLS ||
> +	    msr_index == MSR_IA32_VMX_TRUE_EXIT_CTLS)
> +		nested_vmx_entry_exit_ctls_update(&vmx->vcpu);
> +

I still don't understand this patch.  You say:

> Now, the BNDCFGS bits are only ever
> updated after a KVM_SET_CPUID/KVM_SET_CPUID2 ioctl, meaning that a
> subsequent MSR write from userspace will clobber these values.

but I don't understand what's wrong with that.  If you can (if so 
inclined) define a VM without LOAD_BNDCFGS or CLEAR_BNDCFGS even if MPX 
enabled, commit aedbaf4f6afd counts as a bugfix.

Paolo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR write
  2022-03-01  6:03 ` [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
@ 2022-03-01 18:01   ` Paolo Bonzini
  2022-04-07  0:21   ` Sean Christopherson
  1 sibling, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2022-03-01 18:01 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, David Dunn

On 3/1/22 07:03, Oliver Upton wrote:
> +
> +	/*
> +	 * KVM supports a 1-setting of the "load IA32_PERF_GLOBAL_CTRL"
> +	 * VM-{Entry,Exit} controls if the vPMU supports IA32_PERF_GLOBAL_CTRL.
> +	 */
> +	if (kvm_pmu_version(vcpu) >= 2) {
> +		vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> +		vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> +	} else {
> +		vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> +		vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> +	}

This one I understand, following what's done with MPX, but I cannot make 
sense of the commit message just like in the case of patch 1.

Paolo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-01 18:00   ` Paolo Bonzini
@ 2022-03-01 18:43     ` Oliver Upton
  2022-03-02 12:21       ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Oliver Upton @ 2022-03-01 18:43 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn

Hi Paolo,

On Tue, Mar 01, 2022 at 07:00:57PM +0100, Paolo Bonzini wrote:
> On 3/1/22 07:03, Oliver Upton wrote:
> > +
> > +	/*
> > +	 * Ensure KVM fiddling with these MSRs is preserved after userspace
> > +	 * write.
> > +	 */
> > +	if (msr_index == MSR_IA32_VMX_TRUE_ENTRY_CTLS ||
> > +	    msr_index == MSR_IA32_VMX_TRUE_EXIT_CTLS)
> > +		nested_vmx_entry_exit_ctls_update(&vmx->vcpu);
> > +
> 
> I still don't understand this patch.  You say:
> 
> > Now, the BNDCFGS bits are only ever
> > updated after a KVM_SET_CPUID/KVM_SET_CPUID2 ioctl, meaning that a
> > subsequent MSR write from userspace will clobber these values.
> 
> but I don't understand what's wrong with that.  If you can (if so inclined)
> define a VM without LOAD_BNDCFGS or CLEAR_BNDCFGS even if MPX enabled,
> commit aedbaf4f6afd counts as a bugfix.

Right, a 1-setting of '{load,clear} IA32_BNDCFGS' should really be the
responsibility of userspace. My issue is that the commit message in
commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when
guest MPX disabled") suggests that userspace can expect these bits to be
configured based on guest CPUID. Furthermore, before commit aedbaf4f6afd
("KVM: x86: Extract kvm_update_cpuid_runtime() from
kvm_update_cpuid()"), if userspace clears these bits, KVM will continue
to set them based on CPUID.

What is the userspace expectation here? If we are saying that changes to
IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS after userspace writes these MSRs is a
bug, then I agree aedbaf4f6afd is in fact a bugfix. But, the commit
message in 5f76f6f5ff96 seems to indicate that userspace wants KVM to
configure these bits based on guest CPUID.

Given that there were previous userspace expectations, I attempted to
restore the old behavior of KVM (ignore userspace writes) and add a
quirk to fully back out of the mess. All this logic also applies to
Patch 2 as well.

--
Oliver

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-01 18:43     ` Oliver Upton
@ 2022-03-02 12:21       ` Paolo Bonzini
  2022-03-02 20:51         ` Oliver Upton
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2022-03-02 12:21 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn

On 3/1/22 19:43, Oliver Upton wrote:
> Right, a 1-setting of '{load,clear} IA32_BNDCFGS' should really be the
> responsibility of userspace. My issue is that the commit message in
> commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when
> guest MPX disabled") suggests that userspace can expect these bits to be
> configured based on guest CPUID. Furthermore, before commit aedbaf4f6afd
> ("KVM: x86: Extract kvm_update_cpuid_runtime() from
> kvm_update_cpuid()"), if userspace clears these bits, KVM will continue
> to set them based on CPUID.
> 
> What is the userspace expectation here? If we are saying that changes to
> IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS after userspace writes these MSRs is a
> bug, then I agree aedbaf4f6afd is in fact a bugfix. But, the commit
> message in 5f76f6f5ff96 seems to indicate that userspace wants KVM to
> configure these bits based on guest CPUID.

Yes, but I think it's reasonable that userspace wants to override them. 
  It has to do that after KVM_SET_CPUID2, but that's okay too.

Paolo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-02 12:21       ` Paolo Bonzini
@ 2022-03-02 20:51         ` Oliver Upton
  2022-03-02 21:22           ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Oliver Upton @ 2022-03-02 20:51 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn

On Wed, Mar 02, 2022 at 01:21:23PM +0100, Paolo Bonzini wrote:
> On 3/1/22 19:43, Oliver Upton wrote:
> > Right, a 1-setting of '{load,clear} IA32_BNDCFGS' should really be the
> > responsibility of userspace. My issue is that the commit message in
> > commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when
> > guest MPX disabled") suggests that userspace can expect these bits to be
> > configured based on guest CPUID. Furthermore, before commit aedbaf4f6afd
> > ("KVM: x86: Extract kvm_update_cpuid_runtime() from
> > kvm_update_cpuid()"), if userspace clears these bits, KVM will continue
> > to set them based on CPUID.
> > 
> > What is the userspace expectation here? If we are saying that changes to
> > IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS after userspace writes these MSRs is a
> > bug, then I agree aedbaf4f6afd is in fact a bugfix. But, the commit
> > message in 5f76f6f5ff96 seems to indicate that userspace wants KVM to
> > configure these bits based on guest CPUID.
> 
> Yes, but I think it's reasonable that userspace wants to override them.  It
> has to do that after KVM_SET_CPUID2, but that's okay too.
> 

In that case, I can rework the tests at the end of this series to ensure
userspace's ability to override w/o a quirk. Sorry for the toil,
aedbaf4f6afd caused some breakage for us internally, but really is just
a userspace bug.

Is it possible to pick up patch 4/8 "KVM: x86: Introduce
KVM_CAP_DISABLE_QUIRKS2" independent of the rest of this series?

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-02 20:51         ` Oliver Upton
@ 2022-03-02 21:22           ` Paolo Bonzini
  2022-03-02 21:54             ` Oliver Upton
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2022-03-02 21:22 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn

On 3/2/22 21:51, Oliver Upton wrote:
> On Wed, Mar 02, 2022 at 01:21:23PM +0100, Paolo Bonzini wrote:
>> On 3/1/22 19:43, Oliver Upton wrote:
>>> Right, a 1-setting of '{load,clear} IA32_BNDCFGS' should really be the
>>> responsibility of userspace. My issue is that the commit message in
>>> commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when
>>> guest MPX disabled") suggests that userspace can expect these bits to be
>>> configured based on guest CPUID. Furthermore, before commit aedbaf4f6afd
>>> ("KVM: x86: Extract kvm_update_cpuid_runtime() from
>>> kvm_update_cpuid()"), if userspace clears these bits, KVM will continue
>>> to set them based on CPUID.
>>>
>>> What is the userspace expectation here? If we are saying that changes to
>>> IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS after userspace writes these MSRs is a
>>> bug, then I agree aedbaf4f6afd is in fact a bugfix. But, the commit
>>> message in 5f76f6f5ff96 seems to indicate that userspace wants KVM to
>>> configure these bits based on guest CPUID.
>>
>> Yes, but I think it's reasonable that userspace wants to override them.  It
>> has to do that after KVM_SET_CPUID2, but that's okay too.
>>
> 
> In that case, I can rework the tests at the end of this series to ensure
> userspace's ability to override w/o a quirk. Sorry for the toil,
> aedbaf4f6afd caused some breakage for us internally, but really is just
> a userspace bug.

How did vanadium break?

Paolo

> Is it possible to pick up patch 4/8 "KVM: x86: Introduce
> KVM_CAP_DISABLE_QUIRKS2" independent of the rest of this series?


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-02 21:22           ` Paolo Bonzini
@ 2022-03-02 21:54             ` Oliver Upton
  2022-03-03  1:43               ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Oliver Upton @ 2022-03-02 21:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, David Dunn

On Wed, Mar 02, 2022 at 10:22:43PM +0100, Paolo Bonzini wrote:
> On 3/2/22 21:51, Oliver Upton wrote:
> > On Wed, Mar 02, 2022 at 01:21:23PM +0100, Paolo Bonzini wrote:
> > > On 3/1/22 19:43, Oliver Upton wrote:
> > > > Right, a 1-setting of '{load,clear} IA32_BNDCFGS' should really be the
> > > > responsibility of userspace. My issue is that the commit message in
> > > > commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when
> > > > guest MPX disabled") suggests that userspace can expect these bits to be
> > > > configured based on guest CPUID. Furthermore, before commit aedbaf4f6afd
> > > > ("KVM: x86: Extract kvm_update_cpuid_runtime() from
> > > > kvm_update_cpuid()"), if userspace clears these bits, KVM will continue
> > > > to set them based on CPUID.
> > > > 
> > > > What is the userspace expectation here? If we are saying that changes to
> > > > IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS after userspace writes these MSRs is a
> > > > bug, then I agree aedbaf4f6afd is in fact a bugfix. But, the commit
> > > > message in 5f76f6f5ff96 seems to indicate that userspace wants KVM to
> > > > configure these bits based on guest CPUID.
> > > 
> > > Yes, but I think it's reasonable that userspace wants to override them.  It
> > > has to do that after KVM_SET_CPUID2, but that's okay too.
> > > 
> > 
> > In that case, I can rework the tests at the end of this series to ensure
> > userspace's ability to override w/o a quirk. Sorry for the toil,
> > aedbaf4f6afd caused some breakage for us internally, but really is just
> > a userspace bug.
> 
> How did vanadium break?

Maybe I can redirect you to a test case to highlight a possible
regression in KVM, as seen by userspace ;-)

from Patch 7/8:

> +	/*
> +	 * Test that KVM will set these bits regardless of userspace if the
> +	 * guest CPUID exposes a supporting vPMU.
> +	 */
> +	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_ENTRY_CTLS,
> +			     0,						/* set */
> +			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,	/* clear */
> +			     VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,	/* exp_set */
> +			     0);					/* exp_clear */
> +	test_vmx_control_msr(vm, MSR_IA32_VMX_TRUE_EXIT_CTLS,
> +			     0,						/* set */
> +			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,	/* clear */
> +			     VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,	/* exp_set */
> +			     0);					/* exp_clear */

Same goes for the "{load,clear} IA32_BNDCFGS" bits too.

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-02 21:54             ` Oliver Upton
@ 2022-03-03  1:43               ` Sean Christopherson
  2022-03-03  6:29                 ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2022-03-03  1:43 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Paolo Bonzini, kvm, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, David Dunn

On Wed, Mar 02, 2022, Oliver Upton wrote:
> On Wed, Mar 02, 2022 at 10:22:43PM +0100, Paolo Bonzini wrote:
> > On 3/2/22 21:51, Oliver Upton wrote:
> > > On Wed, Mar 02, 2022 at 01:21:23PM +0100, Paolo Bonzini wrote:
> > > > On 3/1/22 19:43, Oliver Upton wrote:
> > > > > Right, a 1-setting of '{load,clear} IA32_BNDCFGS' should really be the
> > > > > responsibility of userspace. My issue is that the commit message in
> > > > > commit 5f76f6f5ff96 ("KVM: nVMX: Do not expose MPX VMX controls when
> > > > > guest MPX disabled") suggests that userspace can expect these bits to be
> > > > > configured based on guest CPUID. Furthermore, before commit aedbaf4f6afd
> > > > > ("KVM: x86: Extract kvm_update_cpuid_runtime() from
> > > > > kvm_update_cpuid()"), if userspace clears these bits, KVM will continue
> > > > > to set them based on CPUID.
> > > > > 
> > > > > What is the userspace expectation here? If we are saying that changes to
> > > > > IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS after userspace writes these MSRs is a
> > > > > bug, then I agree aedbaf4f6afd is in fact a bugfix. But, the commit
> > > > > message in 5f76f6f5ff96 seems to indicate that userspace wants KVM to
> > > > > configure these bits based on guest CPUID.
> > > > 
> > > > Yes, but I think it's reasonable that userspace wants to override them.  It
> > > > has to do that after KVM_SET_CPUID2, but that's okay too.
> > > > 
> > > 
> > > In that case, I can rework the tests at the end of this series to ensure
> > > userspace's ability to override w/o a quirk. Sorry for the toil,
> > > aedbaf4f6afd caused some breakage for us internally, but really is just
> > > a userspace bug.
> > 
> > How did vanadium break?
> 
> Maybe I can redirect you to a test case to highlight a possible
> regression in KVM, as seen by userspace ;-)

Regressions aside, VMCS controls are not tied to CPUID, KVM should not be mucking
with unrelated things.  The original hack was to fix a userspace bug and should
never have been mreged.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-03  1:43               ` Sean Christopherson
@ 2022-03-03  6:29                 ` Paolo Bonzini
  2022-03-03 16:15                   ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2022-03-03  6:29 UTC (permalink / raw)
  To: Sean Christopherson, Oliver Upton
  Cc: kvm, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, David Dunn

On 3/3/22 02:43, Sean Christopherson wrote:
>> Maybe I can redirect you to a test case to highlight a possible
>> regression in KVM, as seen by userspace;-)
> Regressions aside, VMCS controls are not tied to CPUID, KVM should not be mucking
> with unrelated things.  The original hack was to fix a userspace bug and should
> never have been mreged.

Note that it dates back to:

     commit 5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8
     Author: Liran Alon <liran.alon@oracle.com>
     Date:   Fri Sep 14 03:25:52 2018 +0300

     KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
     
     Before this commit, KVM exposes MPX VMX controls to L1 guest only based
     on if KVM and host processor supports MPX virtualization.
     However, these controls should be exposed to guest only in case guest
     vCPU supports MPX.

It's not to fix a userspace bug, it's to support userspace that doesn't
know about using KVM_SET_MSR for VMX features---which is okay since unlike
KVM_SET_CPUID2 it's not a mandatory call.

Paolo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-03  6:29                 ` Paolo Bonzini
@ 2022-03-03 16:15                   ` Sean Christopherson
  2022-03-03 21:44                     ` Jim Mattson
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2022-03-03 16:15 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Oliver Upton, kvm, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, David Dunn

On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> On 3/3/22 02:43, Sean Christopherson wrote:
> > > Maybe I can redirect you to a test case to highlight a possible
> > > regression in KVM, as seen by userspace;-)
> > Regressions aside, VMCS controls are not tied to CPUID, KVM should not be mucking
> > with unrelated things.  The original hack was to fix a userspace bug and should
> > never have been mreged.
> 
> Note that it dates back to:
> 
>     commit 5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8
>     Author: Liran Alon <liran.alon@oracle.com>
>     Date:   Fri Sep 14 03:25:52 2018 +0300
> 
>     KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
>     Before this commit, KVM exposes MPX VMX controls to L1 guest only based
>     on if KVM and host processor supports MPX virtualization.
>     However, these controls should be exposed to guest only in case guest
>     vCPU supports MPX.
> 
> It's not to fix a userspace bug, it's to support userspace that doesn't
> know about using KVM_SET_MSR for VMX features---which is okay since unlike
> KVM_SET_CPUID2 it's not a mandatory call.

I disagree, IMO failure to properly configure the vCPU model is a userspace bug.
Maybe it was a userspace bug induced by a haphazard and/or poorly documented KVM
ABI, but it's still a userspace bug.  One could argue that KVM should disable/clear
VMX features if userspace clears a related CPUID feature, but _setting_ a VMX
feature based on CPUID is architecturally wrong.  Even if we consider one or both
cases to be desirable behavior in terms of creating a consistent vCPU model, forcing
a consistent vCPU model for this one case goes against every other ioctl in KVM's
ABI.

If we consider it KVM's responsibility to propagate CPUID state to VMX MSRs, then
KVM has a bunch of "bugs".

  X86_FEATURE_LM => VM_EXIT_HOST_ADDR_SPACE_SIZE, VM_ENTRY_IA32E_MODE, VMX_MISC_SAVE_EFER_LMA

  X86_FEATURE_TSC => CPU_BASED_RDTSC_EXITING, CPU_BASED_USE_TSC_OFFSETTING,
                     SECONDARY_EXEC_TSC_SCALING

  X86_FEATURE_INVPCID_SINGLE => SECONDARY_EXEC_ENABLE_INVPCID

  X86_FEATURE_MWAIT => CPU_BASED_MONITOR_EXITING, CPU_BASED_MWAIT_EXITING

  X86_FEATURE_INTEL_PT => SECONDARY_EXEC_PT_CONCEAL_VMX, SECONDARY_EXEC_PT_USE_GPA,
                          VM_EXIT_CLEAR_IA32_RTIT_CTL, VM_ENTRY_LOAD_IA32_RTIT_CTL

  X86_FEATURE_XSAVES => SECONDARY_EXEC_XSAVES

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-03 16:15                   ` Sean Christopherson
@ 2022-03-03 21:44                     ` Jim Mattson
  2022-03-03 23:44                       ` Sean Christopherson
  0 siblings, 1 reply; 31+ messages in thread
From: Jim Mattson @ 2022-03-03 21:44 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Oliver Upton, kvm, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, David Dunn

On Thu, Mar 3, 2022 at 8:15 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> > On 3/3/22 02:43, Sean Christopherson wrote:
> > > > Maybe I can redirect you to a test case to highlight a possible
> > > > regression in KVM, as seen by userspace;-)
> > > Regressions aside, VMCS controls are not tied to CPUID, KVM should not be mucking
> > > with unrelated things.  The original hack was to fix a userspace bug and should
> > > never have been mreged.
> >
> > Note that it dates back to:
> >
> >     commit 5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8
> >     Author: Liran Alon <liran.alon@oracle.com>
> >     Date:   Fri Sep 14 03:25:52 2018 +0300
> >
> >     KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
> >     Before this commit, KVM exposes MPX VMX controls to L1 guest only based
> >     on if KVM and host processor supports MPX virtualization.
> >     However, these controls should be exposed to guest only in case guest
> >     vCPU supports MPX.
> >
> > It's not to fix a userspace bug, it's to support userspace that doesn't
> > know about using KVM_SET_MSR for VMX features---which is okay since unlike
> > KVM_SET_CPUID2 it's not a mandatory call.
>
> I disagree, IMO failure to properly configure the vCPU model is a userspace bug.
> Maybe it was a userspace bug induced by a haphazard and/or poorly documented KVM
> ABI, but it's still a userspace bug.  One could argue that KVM should disable/clear
> VMX features if userspace clears a related CPUID feature, but _setting_ a VMX
> feature based on CPUID is architecturally wrong.  Even if we consider one or both
> cases to be desirable behavior in terms of creating a consistent vCPU model, forcing
> a consistent vCPU model for this one case goes against every other ioctl in KVM's
> ABI.
>
> If we consider it KVM's responsibility to propagate CPUID state to VMX MSRs, then
> KVM has a bunch of "bugs".
>
>   X86_FEATURE_LM => VM_EXIT_HOST_ADDR_SPACE_SIZE, VM_ENTRY_IA32E_MODE, VMX_MISC_SAVE_EFER_LMA
>
>   X86_FEATURE_TSC => CPU_BASED_RDTSC_EXITING, CPU_BASED_USE_TSC_OFFSETTING,
>                      SECONDARY_EXEC_TSC_SCALING
>
>   X86_FEATURE_INVPCID_SINGLE => SECONDARY_EXEC_ENABLE_INVPCID
>
>   X86_FEATURE_MWAIT => CPU_BASED_MONITOR_EXITING, CPU_BASED_MWAIT_EXITING
>
>   X86_FEATURE_INTEL_PT => SECONDARY_EXEC_PT_CONCEAL_VMX, SECONDARY_EXEC_PT_USE_GPA,
>                           VM_EXIT_CLEAR_IA32_RTIT_CTL, VM_ENTRY_LOAD_IA32_RTIT_CTL
>
>   X86_FEATURE_XSAVES => SECONDARY_EXEC_XSAVES

I don't disagree with you, but this does beg the question, "What's
going on with all of the invocations of cr4_fixed1_update()?"

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-03 21:44                     ` Jim Mattson
@ 2022-03-03 23:44                       ` Sean Christopherson
  2022-03-04 15:50                         ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2022-03-03 23:44 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Oliver Upton, kvm, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, David Dunn

On Thu, Mar 03, 2022, Jim Mattson wrote:
> On Thu, Mar 3, 2022 at 8:15 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> > > On 3/3/22 02:43, Sean Christopherson wrote:
> > > > > Maybe I can redirect you to a test case to highlight a possible
> > > > > regression in KVM, as seen by userspace;-)
> > > > Regressions aside, VMCS controls are not tied to CPUID, KVM should not be mucking
> > > > with unrelated things.  The original hack was to fix a userspace bug and should
> > > > never have been mreged.
> > >
> > > Note that it dates back to:
> > >
> > >     commit 5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8
> > >     Author: Liran Alon <liran.alon@oracle.com>
> > >     Date:   Fri Sep 14 03:25:52 2018 +0300
> > >
> > >     KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
> > >     Before this commit, KVM exposes MPX VMX controls to L1 guest only based
> > >     on if KVM and host processor supports MPX virtualization.
> > >     However, these controls should be exposed to guest only in case guest
> > >     vCPU supports MPX.
> > >
> > > It's not to fix a userspace bug, it's to support userspace that doesn't
> > > know about using KVM_SET_MSR for VMX features---which is okay since unlike
> > > KVM_SET_CPUID2 it's not a mandatory call.
> >
> > I disagree, IMO failure to properly configure the vCPU model is a userspace bug.
> > Maybe it was a userspace bug induced by a haphazard and/or poorly documented KVM
> > ABI, but it's still a userspace bug.  One could argue that KVM should disable/clear
> > VMX features if userspace clears a related CPUID feature, but _setting_ a VMX
> > feature based on CPUID is architecturally wrong.  Even if we consider one or both
> > cases to be desirable behavior in terms of creating a consistent vCPU model, forcing
> > a consistent vCPU model for this one case goes against every other ioctl in KVM's
> > ABI.
> >
> > If we consider it KVM's responsibility to propagate CPUID state to VMX MSRs, then
> > KVM has a bunch of "bugs".
> >
> >   X86_FEATURE_LM => VM_EXIT_HOST_ADDR_SPACE_SIZE, VM_ENTRY_IA32E_MODE, VMX_MISC_SAVE_EFER_LMA
> >
> >   X86_FEATURE_TSC => CPU_BASED_RDTSC_EXITING, CPU_BASED_USE_TSC_OFFSETTING,
> >                      SECONDARY_EXEC_TSC_SCALING
> >
> >   X86_FEATURE_INVPCID_SINGLE => SECONDARY_EXEC_ENABLE_INVPCID
> >
> >   X86_FEATURE_MWAIT => CPU_BASED_MONITOR_EXITING, CPU_BASED_MWAIT_EXITING
> >
> >   X86_FEATURE_INTEL_PT => SECONDARY_EXEC_PT_CONCEAL_VMX, SECONDARY_EXEC_PT_USE_GPA,
> >                           VM_EXIT_CLEAR_IA32_RTIT_CTL, VM_ENTRY_LOAD_IA32_RTIT_CTL
> >
> >   X86_FEATURE_XSAVES => SECONDARY_EXEC_XSAVES
> 
> I don't disagree with you, but this does beg the question, "What's
> going on with all of the invocations of cr4_fixed1_update()?"

Boo, I forgot legal CR4 is controlled via MSRs too.  Ha!  That's a bug in nVMX.
nVMX only checks msrs.cr4_fixed0/1, it doesn't check "cr4_reserved_bits", which
is KVM's set of host reserved bits.  That means userspace can bypass those reserved
bits by setting guest CPUID and/or VMX MSRs and loading CR4 via VM-Enter/VM-Exit.

The immediate nVMX bug can be fixed by calling kvm_is_valid_cr4(), which calls
back into nVMX to do the VMX MSR checks.

My vote would be to include nested_vmx_cr_fixed1_bits_update() in the quirk, but
keep the guest CPUID enforcement that's in kvm_is_valid_cr4().  I.e. let userspace
further restrict CR4, but don't let it allow nested VM-Enter/VM-Exit to load bits
that L1 can't set via MOV CR4.

I'll send this as a proper patch:

diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index c92cea0b8ccc..46dd1967ec08 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -285,8 +285,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
 }

 /* No difference in the restrictions on guest and host CR4 in VMX operation. */
-#define nested_guest_cr4_valid nested_cr4_valid
-#define nested_host_cr4_valid  nested_cr4_valid
+#define nested_guest_cr4_valid kvm_is_valid_cr4
+#define nested_host_cr4_valid  kvm_is_valid_cr4

 extern struct kvm_x86_nested_ops vmx_nested_ops;


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-03 23:44                       ` Sean Christopherson
@ 2022-03-04 15:50                         ` Paolo Bonzini
  2022-04-07  0:26                           ` Sean Christopherson
  2022-05-27 16:55                           ` Sean Christopherson
  0 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2022-03-04 15:50 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: Oliver Upton, kvm, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	David Dunn

On 3/4/22 00:44, Sean Christopherson wrote:
> 
> diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> index c92cea0b8ccc..46dd1967ec08 100644
> --- a/arch/x86/kvm/vmx/nested.h
> +++ b/arch/x86/kvm/vmx/nested.h
> @@ -285,8 +285,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
>  }
> 
>  /* No difference in the restrictions on guest and host CR4 in VMX operation. */
> -#define nested_guest_cr4_valid nested_cr4_valid
> -#define nested_host_cr4_valid  nested_cr4_valid
> +#define nested_guest_cr4_valid kvm_is_valid_cr4
> +#define nested_host_cr4_valid  kvm_is_valid_cr4

This doesn't allow the theoretically possible case of L0 setting some 
CR4-fixed-0 bits for L1.  I'll send another one.

Paolo


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/8] KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
  2022-03-01  6:03 ` [PATCH v4 4/8] KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 Oliver Upton
@ 2022-03-09 16:01   ` Paolo Bonzini
  0 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2022-03-09 16:01 UTC (permalink / raw)
  To: Oliver Upton, kvm
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, David Dunn

On 3/1/22 07:03, Oliver Upton wrote:
> KVM_CAP_DISABLE_QUIRKS is irrevocably broken. The capability does not
> advertise the set of quirks which may be disabled to userspace, so it is
> impossible to predict the behavior of KVM. Worse yet,
> KVM_CAP_DISABLE_QUIRKS will tolerate any value for cap->args[0], meaning
> it fails to reject attempts to set invalid quirk bits.
> 
> The only valid workaround for the quirky quirks API is to add a new CAP.
> Actually advertise the set of quirks that can be disabled to userspace
> so it can predict KVM's behavior. Reject values for cap->args[0] that
> contain invalid bits.
> 
> Finally, add documentation for the new capability and describe the
> existing quirks.
> 
> Signed-off-by: Oliver Upton <oupton@google.com>

Queued, thanks.

Paolo

> ---
>   Documentation/virt/kvm/api.rst  | 50 +++++++++++++++++++++++++++++++++
>   arch/x86/include/asm/kvm_host.h |  7 +++++
>   arch/x86/kvm/x86.c              |  8 ++++++
>   include/uapi/linux/kvm.h        |  1 +
>   4 files changed, 66 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index f5d011351016..8f7240e79cc0 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7079,6 +7079,56 @@ resource that is controlled with the H_SET_MODE hypercall.
>   This capability allows a guest kernel to use a better-performance mode for
>   handling interrupts and system calls.
>   
> +7.31 KVM_CAP_DISABLE_QUIRKS2
> +----------------------------
> +
> +:Capability: KVM_CAP_DISABLE_QUIRKS2
> +:Parameters: args[0] - set of KVM quirks to disable
> +:Architectures: x86
> +:Type: vm
> +
> +This capability, if enabled, will cause KVM to disable some behavior
> +quirks.
> +
> +Calling KVM_CHECK_EXTENSION for this capability returns a bitmask of
> +quirks that can be disabled in KVM.
> +
> +The argument to KVM_ENABLE_CAP for this capability is a bitmask of
> +quirks to disable, and must be a subset of the bitmask returned by
> +KVM_CHECK_EXTENSION.
> +
> +The valid bits in cap.args[0] are:
> +
> +=================================== ============================================
> + KVM_X86_QUIRK_LINT0_REENABLED      By default, the reset value for the LVT
> +                                    LINT0 register is 0x700 (APIC_MODE_EXTINT).
> +                                    When this quirk is disabled, the reset value
> +                                    is 0x10000 (APIC_LVT_MASKED).
> +
> + KVM_X86_QUIRK_CD_NW_CLEARED        By default, KVM clears CR0.CD and CR0.NW.
> +                                    When this quirk is disabled, KVM does not
> +                                    change the value of CR0.CD and CR0.NW.
> +
> + KVM_X86_QUIRK_LAPIC_MMIO_HOLE      By default, the MMIO LAPIC interface is
> +                                    available even when configured for x2APIC
> +                                    mode. When this quirk is disabled, KVM
> +                                    disables the MMIO LAPIC interface if the
> +                                    LAPIC is in x2APIC mode.
> +
> + KVM_X86_QUIRK_OUT_7E_INC_RIP       By default, KVM pre-increments %rip before
> +                                    exiting to userspace for an OUT instruction
> +                                    to port 0x7e. When this quirk is disabled,
> +                                    KVM does not pre-increment %rip before
> +                                    exiting to userspace.
> +
> + KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT When this quirk is disabled, KVM sets
> +                                    CPUID.01H:ECX[bit 3] (MONITOR/MWAIT) if
> +                                    IA32_MISC_ENABLE[bit 18] (MWAIT) is set.
> +                                    Additionally, when this quirk is disabled,
> +                                    KVM clears CPUID.01H:ECX[bit 3] if
> +                                    IA32_MISC_ENABLE[bit 18] is cleared.
> +=================================== ============================================
> +
>   8. Other capabilities.
>   ======================
>   
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index ccec837e520d..bc3405565967 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1963,4 +1963,11 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
>   #define KVM_CLOCK_VALID_FLAGS						\
>   	(KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC)
>   
> +#define KVM_X86_VALID_QUIRKS			\
> +	(KVM_X86_QUIRK_LINT0_REENABLED |	\
> +	 KVM_X86_QUIRK_CD_NW_CLEARED |		\
> +	 KVM_X86_QUIRK_LAPIC_MMIO_HOLE |	\
> +	 KVM_X86_QUIRK_OUT_7E_INC_RIP |		\
> +	 KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)
> +
>   #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c712c33c1521..ec9b602be8da 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4350,6 +4350,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>   		r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
>   		break;
>   	}
> +	case KVM_CAP_DISABLE_QUIRKS2:
> +		r = KVM_X86_VALID_QUIRKS;
> +		break;
>   	default:
>   		break;
>   	}
> @@ -5896,6 +5899,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>   		return -EINVAL;
>   
>   	switch (cap->cap) {
> +	case KVM_CAP_DISABLE_QUIRKS2:
> +		r = -EINVAL;
> +		if (cap->args[0] & ~KVM_X86_VALID_QUIRKS)
> +			break;
> +		fallthrough;
>   	case KVM_CAP_DISABLE_QUIRKS:
>   		kvm->arch.disabled_quirks = cap->args[0];
>   		r = 0;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index d2f1efc3aa35..91a6fe4e02c0 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1143,6 +1143,7 @@ struct kvm_ppc_resize_hpt {
>   #define KVM_CAP_PPC_AIL_MODE_3 210
>   #define KVM_CAP_S390_MEM_OP_EXTENSION 211
>   #define KVM_CAP_PMU_CAPABILITY 212
> +#define KVM_CAP_DISABLE_QUIRKS2 213
>   
>   #ifdef KVM_CAP_IRQ_ROUTING
>   


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL ctrl bits across MSR write
  2022-03-01  6:03 ` [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
  2022-03-01 18:01   ` Paolo Bonzini
@ 2022-04-07  0:21   ` Sean Christopherson
  1 sibling, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2022-04-07  0:21 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, David Dunn

On Tue, Mar 01, 2022, Oliver Upton wrote:
> Since commit 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
> VM-{Entry,Exit} control"), KVM has taken ownership of the "load
> IA32_PERF_GLOBAL_CTRL" VMX entry/exit control bits. The ABI is that
> these bits will be set in the IA32_VMX_TRUE_{ENTRY,EXIT}_CTLS MSRs if
> the guest's CPUID exposes a vPMU that supports the IA32_PERF_GLOBAL_CTRL
> MSR (CPUID.0AH:EAX[7:0] > 1), and clear otherwise.
> 
> However, commit aedbaf4f6afd ("KVM: x86: Extract
> kvm_update_cpuid_runtime() from kvm_update_cpuid()") partially broke KVM
> ownership of the aforementioned bits. Before, kvm_update_cpuid() was
> exercised frequently when running a guest and constantly applied its own
> changes to the "load IA32_PERF_GLOBAL_CTRL" bits. Now, the "load
> IA32_PERF_GLOBAL_CTRL" bits are only ever updated after a
> KVM_SET_CPUID/KVM_SET_CPUID2 ioctl, meaning that a subsequent MSR write
> from userspace will clobber these values.
> 
> Note that older kernels without commit c44d9b34701d ("KVM: x86: Invoke
> vendor's vcpu_after_set_cpuid() after all common updates") still require
> that the entry/exit controls be updated from kvm_pmu_refresh(). Leave
> the benign call in place to allow for cleaner backporting and punt the
> cleanup to a later change.
> 
> Uphold the old ABI by reapplying KVM's tweaks to the "load
> IA32_PERF_GLOBAL_CTRL" bits after an MSR write from userspace.
> 
> Fixes: aedbaf4f6afd ("KVM: x86: Extract kvm_update_cpuid_runtime() from kvm_update_cpuid()")
> Reported-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/kvm/pmu.h     |  5 +++++
>  arch/x86/kvm/vmx/vmx.c | 12 ++++++++++++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 7a7b8d5b775e..2d9995668e0b 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -140,6 +140,11 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
>  	return sample_period;
>  }
>  
> +static inline u8 kvm_pmu_version(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu_to_pmu(vcpu)->version;
> +}
> +
>  void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
>  void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
>  void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 3a97220c5f78..224ef4c19a5d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7261,6 +7261,18 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>  			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
>  		}
>  	}
> +
> +	/*
> +	 * KVM supports a 1-setting of the "load IA32_PERF_GLOBAL_CTRL"
> +	 * VM-{Entry,Exit} controls if the vPMU supports IA32_PERF_GLOBAL_CTRL.
> +	 */
> +	if (kvm_pmu_version(vcpu) >= 2) {

Oh c'mon, now you're just being mean.  Not only is this behavior obliquely described
in the changelog and not explained in the comment, it doesn't even use the same code,
e.g. grepping for ">= 2" didn't lead me to the "> 1" check in intel_is_valid_msr().

Rather than encourage open coding the version check, how about adding a helper in
vmx.h and using that in pmu_intel.c?

---
 arch/x86/kvm/vmx/pmu_intel.c |  2 +-
 arch/x86/kvm/vmx/vmx.c       | 12 ++++++++++++
 arch/x86/kvm/vmx/vmx.h       |  6 ++++++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index efa172a7278e..7cabc584da6c 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -212,7 +212,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_CORE_PERF_GLOBAL_STATUS:
 	case MSR_CORE_PERF_GLOBAL_CTRL:
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
-		ret = pmu->version > 1;
+		ret = intel_pmu_has_perf_global_ctrl(vcpu);
 		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 53225b1aec46..ce0000202c5e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7262,6 +7262,18 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
 		}
 	}
+
+	/*
+	 * KVM supports a 1-setting of the "load IA32_PERF_GLOBAL_CTRL"
+	 * VM-{Entry,Exit} controls if the vPMU supports IA32_PERF_GLOBAL_CTRL.
+	 */
+	if (intel_pmu_has_perf_global_ctrl(vcpu)) {
+		vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+	} else {
+		vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+		vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+	}
 }

 static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index d4b809abda62..fbdd00250bbf 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -100,6 +100,12 @@ bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
 void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu);

+static inline bool intel_pmu_has_perf_global_ctrl(struct kvm_vcpu *vcpu)
+{
+	/* Oliver needs to add a comment here as penance for being mean. */
+	return vcpu_to_pmu(vcpu)->version > 1;
+}
+
 struct lbr_desc {
 	/* Basic info about guest LBR records. */
 	struct x86_pmu_lbr records;

base-commit: fdc20fdbbb5b1982bd5cec3db509ede60c85222e
--



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-04 15:50                         ` Paolo Bonzini
@ 2022-04-07  0:26                           ` Sean Christopherson
  2022-04-07  0:29                             ` Oliver Upton
  2022-05-27 16:55                           ` Sean Christopherson
  1 sibling, 1 reply; 31+ messages in thread
From: Sean Christopherson @ 2022-04-07  0:26 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jim Mattson, Oliver Upton, kvm, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, David Dunn

On Fri, Mar 04, 2022, Paolo Bonzini wrote:
> On 3/4/22 00:44, Sean Christopherson wrote:
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> > index c92cea0b8ccc..46dd1967ec08 100644
> > --- a/arch/x86/kvm/vmx/nested.h
> > +++ b/arch/x86/kvm/vmx/nested.h
> > @@ -285,8 +285,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
> >  }
> > 
> >  /* No difference in the restrictions on guest and host CR4 in VMX operation. */
> > -#define nested_guest_cr4_valid nested_cr4_valid
> > -#define nested_host_cr4_valid  nested_cr4_valid
> > +#define nested_guest_cr4_valid kvm_is_valid_cr4
> > +#define nested_host_cr4_valid  kvm_is_valid_cr4
> 
> This doesn't allow the theoretically possible case of L0 setting some
> CR4-fixed-0 bits for L1.  I'll send another one.

Are you still planning on sending a proper patch for this?

And more importantly, have we shifted your view on this patch/series?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 5/8] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs
  2022-03-01  6:03 ` [PATCH v4 5/8] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
@ 2022-04-07  0:28   ` Sean Christopherson
  0 siblings, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2022-04-07  0:28 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvm, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, David Dunn

On Tue, Mar 01, 2022, Oliver Upton wrote:
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 224ef4c19a5d..21b98bad1319 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7250,6 +7250,9 @@ void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
> +	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS))
> +		return;
> +
>  	if (kvm_mpx_supported()) {
>  		bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX);

Assuming the proposed L2 CR4 checking changes don't throw a wrench in things, I'd
I'd prefer we include the CR4 fixed1 massaging in the quirk:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ce0000202c5e..17b19946d6cc 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7367,7 +7367,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
                        ~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
                          FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);

-       if (nested_vmx_allowed(vcpu)) {
+       if (nested_vmx_allowed(vcpu) &&
+           !kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_TWEAK_VMX_CTRL_MSRS)) {
                nested_vmx_cr_fixed1_bits_update(vcpu);
                nested_vmx_entry_exit_ctls_update(vcpu);
        }

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-04-07  0:26                           ` Sean Christopherson
@ 2022-04-07  0:29                             ` Oliver Upton
  2022-04-07  0:32                               ` Oliver Upton
  2022-04-07  0:34                               ` Sean Christopherson
  0 siblings, 2 replies; 31+ messages in thread
From: Oliver Upton @ 2022-04-07  0:29 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, David Dunn

Hey Sean,

On Wed, Apr 6, 2022 at 5:26 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Mar 04, 2022, Paolo Bonzini wrote:
> > On 3/4/22 00:44, Sean Christopherson wrote:
> > >
> > > diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> > > index c92cea0b8ccc..46dd1967ec08 100644
> > > --- a/arch/x86/kvm/vmx/nested.h
> > > +++ b/arch/x86/kvm/vmx/nested.h
> > > @@ -285,8 +285,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
> > >  }
> > >
> > >  /* No difference in the restrictions on guest and host CR4 in VMX operation. */
> > > -#define nested_guest_cr4_valid nested_cr4_valid
> > > -#define nested_host_cr4_valid  nested_cr4_valid
> > > +#define nested_guest_cr4_valid kvm_is_valid_cr4
> > > +#define nested_host_cr4_valid  kvm_is_valid_cr4
> >
> > This doesn't allow the theoretically possible case of L0 setting some
> > CR4-fixed-0 bits for L1.  I'll send another one.
>
> Are you still planning on sending a proper patch for this?
>
> And more importantly, have we shifted your view on this patch/series?

Sorry, I should've followed up. If nobody else complains, let's just
leave everything as-is and avoid repeating the mistakes of the patches
to blame (hey, I authored one of those!)

--
Best,
Oliver

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-04-07  0:29                             ` Oliver Upton
@ 2022-04-07  0:32                               ` Oliver Upton
  2022-04-07  0:34                               ` Sean Christopherson
  1 sibling, 0 replies; 31+ messages in thread
From: Oliver Upton @ 2022-04-07  0:32 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Jim Mattson, kvm, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, David Dunn

On Wed, Apr 6, 2022 at 5:29 PM Oliver Upton <oupton@google.com> wrote:
>
> Hey Sean,
>
> On Wed, Apr 6, 2022 at 5:26 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Mar 04, 2022, Paolo Bonzini wrote:
> > > On 3/4/22 00:44, Sean Christopherson wrote:
> > > >
> > > > diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> > > > index c92cea0b8ccc..46dd1967ec08 100644
> > > > --- a/arch/x86/kvm/vmx/nested.h
> > > > +++ b/arch/x86/kvm/vmx/nested.h
> > > > @@ -285,8 +285,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
> > > >  }
> > > >
> > > >  /* No difference in the restrictions on guest and host CR4 in VMX operation. */
> > > > -#define nested_guest_cr4_valid nested_cr4_valid
> > > > -#define nested_host_cr4_valid  nested_cr4_valid
> > > > +#define nested_guest_cr4_valid kvm_is_valid_cr4
> > > > +#define nested_host_cr4_valid  kvm_is_valid_cr4
> > >
> > > This doesn't allow the theoretically possible case of L0 setting some
> > > CR4-fixed-0 bits for L1.  I'll send another one.
> >
> > Are you still planning on sending a proper patch for this?
> >
> > And more importantly, have we shifted your view on this patch/series?
>
> Sorry, I should've followed up. If nobody else complains, let's just
> leave everything as-is and avoid repeating the mistakes of the patches
> to blame (hey, I authored one of those!)

Well, that is to say we do nothing to plug the half-baked behavior of
changing MSRs based on CPUID. Quirk the rest of it away if we don't
want to fudge with these bits any more :)

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-04-07  0:29                             ` Oliver Upton
  2022-04-07  0:32                               ` Oliver Upton
@ 2022-04-07  0:34                               ` Sean Christopherson
  1 sibling, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2022-04-07  0:34 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Paolo Bonzini, Jim Mattson, kvm, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, David Dunn

On Wed, Apr 06, 2022, Oliver Upton wrote:
> Hey Sean,
> 
> On Wed, Apr 6, 2022 at 5:26 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Mar 04, 2022, Paolo Bonzini wrote:
> > > On 3/4/22 00:44, Sean Christopherson wrote:
> > > >
> > > > diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> > > > index c92cea0b8ccc..46dd1967ec08 100644
> > > > --- a/arch/x86/kvm/vmx/nested.h
> > > > +++ b/arch/x86/kvm/vmx/nested.h
> > > > @@ -285,8 +285,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
> > > >  }
> > > >
> > > >  /* No difference in the restrictions on guest and host CR4 in VMX operation. */
> > > > -#define nested_guest_cr4_valid nested_cr4_valid
> > > > -#define nested_host_cr4_valid  nested_cr4_valid
> > > > +#define nested_guest_cr4_valid kvm_is_valid_cr4
> > > > +#define nested_host_cr4_valid  kvm_is_valid_cr4
> > >
> > > This doesn't allow the theoretically possible case of L0 setting some
> > > CR4-fixed-0 bits for L1.  I'll send another one.
> >
> > Are you still planning on sending a proper patch for this?
> >
> > And more importantly, have we shifted your view on this patch/series?
> 
> Sorry, I should've followed up. If nobody else complains, let's just
> leave everything as-is and avoid repeating the mistakes of the patches
> to blame (hey, I authored one of those!)

The problem is that if we leave things as is, someone will inevitably think it's
the right thing to do and will repeat those mistakes.  I don't see why we wouldn't
add the quirk, broken userspace gets to keep its broken behavior unless it opts
into to disabling the quirk.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write
  2022-03-04 15:50                         ` Paolo Bonzini
  2022-04-07  0:26                           ` Sean Christopherson
@ 2022-05-27 16:55                           ` Sean Christopherson
  1 sibling, 0 replies; 31+ messages in thread
From: Sean Christopherson @ 2022-05-27 16:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jim Mattson, Oliver Upton, kvm, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, David Dunn

/cast resurrect

On Fri, Mar 04, 2022, Paolo Bonzini wrote:
> On 3/4/22 00:44, Sean Christopherson wrote:
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> > index c92cea0b8ccc..46dd1967ec08 100644
> > --- a/arch/x86/kvm/vmx/nested.h
> > +++ b/arch/x86/kvm/vmx/nested.h
> > @@ -285,8 +285,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
> >  }
> > 
> >  /* No difference in the restrictions on guest and host CR4 in VMX operation. */
> > -#define nested_guest_cr4_valid nested_cr4_valid
> > -#define nested_host_cr4_valid  nested_cr4_valid
> > +#define nested_guest_cr4_valid kvm_is_valid_cr4
> > +#define nested_host_cr4_valid  kvm_is_valid_cr4
> 
> This doesn't allow the theoretically possible case of L0 setting some
> CR4-fixed-0 bits for L1.  I'll send another one.

Ha!  My "patch" is correct.  kvm_is_valid_cr4() calls vmx_is_valid_cr4(), which
calls nested_cr4_valid() when the vCPU is post-VMXON, so it _does_ cover the fixed0
case.  I'll send a proper patch with a comment to call out that subtlety.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2022-05-27 16:55 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-01  6:03 [PATCH v4 0/8] KVM: x86: VMX ctrl MSR + KVM quirk fixes Oliver Upton
2022-03-01  6:03 ` [PATCH v4 1/8] KVM: nVMX: Keep KVM updates to BNDCFGS ctrl bits across MSR write Oliver Upton
2022-03-01 18:00   ` Paolo Bonzini
2022-03-01 18:43     ` Oliver Upton
2022-03-02 12:21       ` Paolo Bonzini
2022-03-02 20:51         ` Oliver Upton
2022-03-02 21:22           ` Paolo Bonzini
2022-03-02 21:54             ` Oliver Upton
2022-03-03  1:43               ` Sean Christopherson
2022-03-03  6:29                 ` Paolo Bonzini
2022-03-03 16:15                   ` Sean Christopherson
2022-03-03 21:44                     ` Jim Mattson
2022-03-03 23:44                       ` Sean Christopherson
2022-03-04 15:50                         ` Paolo Bonzini
2022-04-07  0:26                           ` Sean Christopherson
2022-04-07  0:29                             ` Oliver Upton
2022-04-07  0:32                               ` Oliver Upton
2022-04-07  0:34                               ` Sean Christopherson
2022-05-27 16:55                           ` Sean Christopherson
2022-03-01  6:03 ` [PATCH v4 2/8] KVM: nVMX: Keep KVM updates to PERF_GLOBAL_CTRL " Oliver Upton
2022-03-01 18:01   ` Paolo Bonzini
2022-04-07  0:21   ` Sean Christopherson
2022-03-01  6:03 ` [PATCH v4 3/8] KVM: nVMX: Drop nested_vmx_pmu_refresh() Oliver Upton
2022-03-01  6:03 ` [PATCH v4 4/8] KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 Oliver Upton
2022-03-09 16:01   ` Paolo Bonzini
2022-03-01  6:03 ` [PATCH v4 5/8] KVM: nVMX: Add a quirk for KVM tweaks to VMX control MSRs Oliver Upton
2022-04-07  0:28   ` Sean Christopherson
2022-03-01  6:03 ` [PATCH v4 6/8] selftests: KVM: Separate static alloc from KVM_GET_SUPPORTED_CPUID call Oliver Upton
2022-03-01  6:03 ` [PATCH v4 7/8] selftests: KVM: Add test for PERF_GLOBAL_CTRL VMX control MSR bits Oliver Upton
2022-03-01 16:59   ` David Dunn
2022-03-01  6:03 ` [PATCH v4 8/8] selftests: KVM: Add test for BNDCFGS " Oliver Upton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.