linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups
@ 2021-05-04 17:17 Sean Christopherson
  2021-05-04 17:17 ` [PATCH 01/15] KVM: VMX: Do not advertise RDPID if ENABLE_RDTSCP control is unsupported Sean Christopherson
                   ` (15 more replies)
  0 siblings, 16 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

This is a continuation of a less ambitious effort to unify MSR_TSC_AUX
handling across SVM and VMX.  Reiji pointed out that MSR_TSC_AUX exists if
RDTSCP *or* RDPID is supported, and things went downhill from there. 

The first half of this series fixes a variety of RDTSCP and RDPID related
bugs.

The second half of the series cleans up VMX's user return MSR framework
and consolidates more of the uret logic into common x86.

The last two patches leverage the uret MSR cleanups to move MSR_TSC_AUX
handling to common x86 and add sanity checks to guard against misreporting
of RDPID and/or RDTSCP support.

This will conflict with my vCPU RESET/INIT cleanup series.  Feel free to
punt the conflicts to me.

Other "fun" things to tackle:

 - The kernel proper also botches RDPID vs. RDTSCP, as MSR_TSC_AUX is
   configured if RDTSCP is supported, but is consumed if RDPID is
   supported (see the rough sketch after this list).  I'll send this fix
   separately.

 - Commit 844d69c26d83 ("KVM: SVM: Delay restoration of host MSR_TSC_AUX
   until return to userspace") unwittingly fixed a bug where KVM would
   write MSR_TSC_AUX with the guest's value when svm->guest_state_loaded
   is false, which could lead to running the host with the guest's value.
   The bug only exists in 5.12 (maybe 5.11 too?), so crafting a fix for
   stable won't be too awful.
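
A rough sketch of that RDTSCP-vs-RDPID mismatch, purely for illustration
(the function names below are invented, not the actual kernel code, which
may differ in detail):

	/* MSR_TSC_AUX is only initialized when RDTSCP is enumerated... */
	static void init_tsc_aux(int cpu)
	{
		if (boot_cpu_has(X86_FEATURE_RDTSCP))
			wrmsr(MSR_TSC_AUX, cpu, 0);
	}

	/*
	 * ...but the consumer keys off RDPID, which also reads MSR_TSC_AUX,
	 * so a CPU with RDPID but without RDTSCP reads an uninitialized MSR.
	 */
	static unsigned int read_cpu_number(void)
	{
		unsigned long aux = 0;

		if (boot_cpu_has(X86_FEATURE_RDPID))
			asm volatile(".byte 0xf3,0x0f,0xc7,0xf8" : "=a" (aux));

		return (unsigned int)aux;
	}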

Sean Christopherson (15):
  KVM: VMX: Do not advertise RDPID if ENABLE_RDTSCP control is
    unsupported
  KVM: x86: Emulate RDPID only if RDTSCP is supported
  KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  KVM: x86: Move RDPID emulation intercept to its own enum
  KVM: VMX: Disable preemption when probing user return MSRs
  KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in
    host
  KVM: x86: Add support for RDPID without RDTSCP
  KVM: VMX: Configure list of user return MSRs at module init
  KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting
    list
  KVM: VMX: Use common x86's uret MSR list as the one true list
  KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way
  KVM: x86: Export the number of uret MSRs to vendor modules
  KVM: x86: Move uret MSR slot management to common x86
  KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU
    model
  KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed

 arch/x86/include/asm/kvm_host.h |   9 +-
 arch/x86/kvm/cpuid.c            |  18 ++-
 arch/x86/kvm/emulate.c          |   2 +-
 arch/x86/kvm/kvm_emulate.h      |   1 +
 arch/x86/kvm/svm/svm.c          |  50 +++-----
 arch/x86/kvm/vmx/vmx.c          | 217 ++++++++++++++++----------------
 arch/x86/kvm/vmx/vmx.h          |  12 +-
 arch/x86/kvm/x86.c              | 101 ++++++++++++---
 8 files changed, 245 insertions(+), 165 deletions(-)

-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 01/15] KVM: VMX: Do not advertise RDPID if ENABLE_RDTSCP control is unsupported
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-04 17:37   ` Jim Mattson
                     ` (2 more replies)
  2021-05-04 17:17 ` [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported Sean Christopherson
                   ` (14 subsequent siblings)
  15 siblings, 3 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Clear KVM's RDPID capability if the ENABLE_RDTSCP secondary exec control is
unsupported.  Despite being enumerated in a separate CPUID flag, RDPID is
bundled under the same VMCS control as RDTSCP and will #UD in VMX non-root
if ENABLE_RDTSCP is not enabled.

Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 10b610fc7bbc..82404ee2520e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,9 +7377,11 @@ static __init void vmx_set_cpu_caps(void)
 	if (!cpu_has_vmx_xsaves())
 		kvm_cpu_cap_clear(X86_FEATURE_XSAVES);
 
-	/* CPUID 0x80000001 */
-	if (!cpu_has_vmx_rdtscp())
+	/* CPUID 0x80000001 and 0x7 (RDPID) */
+	if (!cpu_has_vmx_rdtscp()) {
 		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
+		kvm_cpu_cap_clear(X86_FEATURE_RDPID);
+	}
 
 	if (cpu_has_vmx_waitpkg())
 		kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
  2021-05-04 17:17 ` [PATCH 01/15] KVM: VMX: Do not advertise RDPID if ENABLE_RDTSCP control is unsupported Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-04 17:50   ` Jim Mattson
                     ` (2 more replies)
  2021-05-04 17:17 ` [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest Sean Christopherson
                   ` (13 subsequent siblings)
  15 siblings, 3 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Do not advertise emulation support for RDPID if RDTSCP is unsupported.
RDPID emulation subtly relies on MSR_TSC_AUX to exist in hardware, as
both vmx_get_msr() and svm_get_msr() will return an error if the MSR is
unsupported, i.e. ctxt->ops->get_msr() will fail and the emulator will
inject a #UD.

Note, RDPID emulation also relies on RDTSCP being enabled in the guest,
but this is a KVM bug and will eventually be fixed.

Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index f765bf7a529c..c96f79c9fff2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -637,7 +637,8 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
 	case 7:
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 		entry->eax = 0;
-		entry->ecx = F(RDPID);
+		if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
+			entry->ecx = F(RDPID);
 		++array->nent;
 	default:
 		break;
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
  2021-05-04 17:17 ` [PATCH 01/15] KVM: VMX: Do not advertise RDPID if ENABLE_RDTSCP control is unsupported Sean Christopherson
  2021-05-04 17:17 ` [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-04 21:45   ` Jim Mattson
                     ` (2 more replies)
  2021-05-04 17:17 ` [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum Sean Christopherson
                   ` (12 subsequent siblings)
  15 siblings, 3 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Intercept RDTSCP to inject #UD if RDTSCP is disabled in the guest.

Note, SVM does not support intercepting RDPID.  Unlike VMX's
ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
is a benign virtualization hole as the host kernel (incorrectly) sets
MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
the host's MSR_TSC_AUX to the guest.

But, when the kernel bug is fixed, KVM will start leaking the host's
MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
for whatever reason.  This leak will be remedied in a future commit.

Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
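For reference, the kvm_handle_invalid_op() handler wired up below lives in
common x86; from memory it amounts to roughly the following (a sketch, not
the exact source):

	int kvm_handle_invalid_op(struct kvm_vcpu *vcpu)
	{
		kvm_queue_exception(vcpu, UD_VECTOR);
		return 1;
	}
	EXPORT_SYMBOL_GPL(kvm_handle_invalid_op);
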
 arch/x86/kvm/svm/svm.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a7271f31df47..8f2b184270c0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1100,7 +1100,9 @@ static u64 svm_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
 	return svm->vmcb->control.tsc_offset;
 }
 
-static void svm_check_invpcid(struct vcpu_svm *svm)
+/* Evaluate instruction intercepts that depend on guest CPUID features. */
+static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
+					      struct vcpu_svm *svm)
 {
 	/*
 	 * Intercept INVPCID if shadow paging is enabled to sync/free shadow
@@ -1113,6 +1115,13 @@ static void svm_check_invpcid(struct vcpu_svm *svm)
 		else
 			svm_clr_intercept(svm, INTERCEPT_INVPCID);
 	}
+
+	if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP)) {
+		if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+			svm_clr_intercept(svm, INTERCEPT_RDTSCP);
+		else
+			svm_set_intercept(svm, INTERCEPT_RDTSCP);
+	}
 }
 
 static void init_vmcb(struct kvm_vcpu *vcpu)
@@ -1248,7 +1257,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 		svm_clr_intercept(svm, INTERCEPT_PAUSE);
 	}
 
-	svm_check_invpcid(svm);
+	svm_recalc_instruction_intercepts(vcpu, svm);
 
 	/*
 	 * If the host supports V_SPEC_CTRL then disable the interception
@@ -3084,6 +3093,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
 	[SVM_EXIT_STGI]				= stgi_interception,
 	[SVM_EXIT_CLGI]				= clgi_interception,
 	[SVM_EXIT_SKINIT]			= skinit_interception,
+	[SVM_EXIT_RDTSCP]			= kvm_handle_invalid_op,
 	[SVM_EXIT_WBINVD]                       = kvm_emulate_wbinvd,
 	[SVM_EXIT_MONITOR]			= kvm_emulate_monitor,
 	[SVM_EXIT_MWAIT]			= kvm_emulate_mwait,
@@ -4007,8 +4017,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
 			     guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
 
-	/* Check again if INVPCID interception if required */
-	svm_check_invpcid(svm);
+	svm_recalc_instruction_intercepts(vcpu, svm);
 
 	/* For sev guests, the memory encryption bit is not reserved in CR3.  */
 	if (sev_guest(vcpu->kvm)) {
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (2 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-04 23:24   ` Jim Mattson
  2021-05-10  8:14   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs Sean Christopherson
                   ` (11 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Add a dedicated intercept enum for RDPID instead of piggybacking RDTSCP.
Unlike VMX's ENABLE_RDTSCP, RDPID is not bound to SVM's RDTSCP intercept.

Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/emulate.c     | 2 +-
 arch/x86/kvm/kvm_emulate.h | 1 +
 arch/x86/kvm/vmx/vmx.c     | 3 ++-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index abd9a4db11a8..8fc71e70857d 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4502,7 +4502,7 @@ static const struct opcode group8[] = {
  * from the register case of group9.
  */
 static const struct gprefix pfx_0f_c7_7 = {
-	N, N, N, II(DstMem | ModRM | Op3264 | EmulateOnUD, em_rdpid, rdtscp),
+	N, N, N, II(DstMem | ModRM | Op3264 | EmulateOnUD, em_rdpid, rdpid),
 };
 
 
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 0d359115429a..f016838faedd 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -468,6 +468,7 @@ enum x86_intercept {
 	x86_intercept_clgi,
 	x86_intercept_skinit,
 	x86_intercept_rdtscp,
+	x86_intercept_rdpid,
 	x86_intercept_icebp,
 	x86_intercept_wbinvd,
 	x86_intercept_monitor,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 82404ee2520e..99591e523b47 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7437,8 +7437,9 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu,
 	/*
 	 * RDPID causes #UD if disabled through secondary execution controls.
 	 * Because it is marked as EmulateOnUD, we need to intercept it here.
+	 * Note, RDPID is hidden behind ENABLE_RDTSCP.
 	 */
-	case x86_intercept_rdtscp:
+	case x86_intercept_rdpid:
 		if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_RDTSCP)) {
 			exception->vector = UD_VECTOR;
 			exception->error_code_valid = false;
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (3 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-04 23:36   ` Jim Mattson
  2021-05-10  8:18   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 06/15] KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host Sean Christopherson
                   ` (10 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Disable preemption when probing a user return MSR via RDMSR/WRMSR.  If
the MSR holds a different value per logical CPU, the WRMSR could corrupt
the host's value if KVM is preempted between the RDMSR and WRMSR, and
then rescheduled on a different CPU.

Opportunistically land the helper in common x86; SVM will use the helper
in a future commit.

Fixes: 4be534102624 ("KVM: VMX: Initialize vmx->guest_msrs[] right after allocation")
Cc: stable@vger.kernel.org
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/vmx.c          |  5 +----
 arch/x86/kvm/x86.c              | 16 ++++++++++++++++
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3e5fc80a35c8..a02c9bf3f7f1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1778,6 +1778,7 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 		    unsigned long icr, int op_64_bit);
 
 void kvm_define_user_return_msr(unsigned index, u32 msr);
+int kvm_probe_user_return_msr(u32 msr);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
 
 u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 99591e523b47..990ee339a05f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6914,12 +6914,9 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 
 	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
 		u32 index = vmx_uret_msrs_list[i];
-		u32 data_low, data_high;
 		int j = vmx->nr_uret_msrs;
 
-		if (rdmsr_safe(index, &data_low, &data_high) < 0)
-			continue;
-		if (wrmsr_safe(index, data_low, data_high) < 0)
+		if (kvm_probe_user_return_msr(index))
 			continue;
 
 		vmx->guest_uret_msrs[j].slot = i;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3bf52ba5f2bb..e304447be42d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -339,6 +339,22 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
 	}
 }
 
+int kvm_probe_user_return_msr(u32 msr)
+{
+	u64 val;
+	int ret;
+
+	preempt_disable();
+	ret = rdmsrl_safe(msr, &val);
+	if (ret)
+		goto out;
+	ret = wrmsrl_safe(msr, val);
+out:
+	preempt_enable();
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_probe_user_return_msr);
+
 void kvm_define_user_return_msr(unsigned slot, u32 msr)
 {
 	BUG_ON(slot >= KVM_MAX_NR_USER_RETURN_MSRS);
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 06/15] KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (4 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:20   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 07/15] KVM: x86: Add support for RDPID without RDTSCP Sean Christopherson
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Probe MSR_TSC_AUX whether or not RDTSCP is supported in the host, and
if probing succeeds, load the guest's MSR_TSC_AUX into hardware prior to
VMRUN.  Because SVM doesn't support interception of RDPID, RDPID cannot
be disallowed in the guest (without resorting to binary translation).
Leaving the host's MSR_TSC_AUX in hardware would leak the host's value to
the guest if RDTSCP is not supported.

Note, there is also a kernel bug that prevents leaking the host's value.
The host kernel initializes MSR_TSC_AUX if and only if RDTSCP is
supported, even though the vDSO usage consumes MSR_TSC_AUX via RDPID.
I.e. if RDTSCP is not supported, there is no host value to leak.  But,
if/when the host kernel bug is fixed, KVM would start leaking MSR_TSC_AUX
in the case where hardware supports RDPID but RDTSCP is unavailable for
whatever reason.

Probing MSR_TSC_AUX will also allow consolidating the probe and define
logic in common x86, and will make it simpler to condition the existence
of MSR_TSC_AUX (from the guest's perspective) on RDTSCP *or* RDPID.

Fixes: AMD CPUs
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8f2b184270c0..b3153d40cc4d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -212,7 +212,7 @@ DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
  * RDTSCP and RDPID are not used in the kernel, specifically to allow KVM to
  * defer the restoration of TSC_AUX until the CPU returns to userspace.
  */
-#define TSC_AUX_URET_SLOT	0
+static int tsc_aux_uret_slot __read_mostly = -1;
 
 static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
 
@@ -959,8 +959,10 @@ static __init int svm_hardware_setup(void)
 		kvm_tsc_scaling_ratio_frac_bits = 32;
 	}
 
-	if (boot_cpu_has(X86_FEATURE_RDTSCP))
-		kvm_define_user_return_msr(TSC_AUX_URET_SLOT, MSR_TSC_AUX);
+	if (!kvm_probe_user_return_msr(MSR_TSC_AUX)) {
+		tsc_aux_uret_slot = 0;
+		kvm_define_user_return_msr(tsc_aux_uret_slot, MSR_TSC_AUX);
+	}
 
 	/* Check for pause filtering support */
 	if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
@@ -1454,8 +1456,8 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	if (static_cpu_has(X86_FEATURE_RDTSCP))
-		kvm_set_user_return_msr(TSC_AUX_URET_SLOT, svm->tsc_aux, -1ull);
+	if (likely(tsc_aux_uret_slot >= 0))
+		kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
 
 	svm->guest_state_loaded = true;
 }
@@ -2664,7 +2666,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
 		break;
 	case MSR_TSC_AUX:
-		if (!boot_cpu_has(X86_FEATURE_RDTSCP))
+		if (tsc_aux_uret_slot < 0)
 			return 1;
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2885,7 +2887,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		svm->sysenter_esp_hi = guest_cpuid_is_intel(vcpu) ? (data >> 32) : 0;
 		break;
 	case MSR_TSC_AUX:
-		if (!boot_cpu_has(X86_FEATURE_RDTSCP))
+		if (tsc_aux_uret_slot < 0)
 			return 1;
 
 		if (!msr->host_initiated &&
@@ -2908,7 +2910,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		 * guest via direct_access_msrs, and switch it via user return.
 		 */
 		preempt_disable();
-		r = kvm_set_user_return_msr(TSC_AUX_URET_SLOT, data, -1ull);
+		r = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
 		preempt_enable();
 		if (r)
 			return 1;
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 07/15] KVM: x86: Add support for RDPID without RDTSCP
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (5 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 06/15] KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:20   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init Sean Christopherson
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Allow userspace to enable RDPID for a guest without also enabling RDTSCP.
Aside from checking for RDPID support in the obvious flows, VMX also needs
to set ENABLE_RDTSCP=1 when RDPID is exposed.

For the record, there is no known scenario where enabling RDPID without
RDTSCP is desirable.  But, both AMD and Intel architectures allow for the
condition, i.e. this is purely to make KVM more architecturally accurate.

Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
Cc: stable@vger.kernel.org
Reported-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c |  6 ++++--
 arch/x86/kvm/vmx/vmx.c | 27 +++++++++++++++++++++++----
 arch/x86/kvm/x86.c     |  3 ++-
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b3153d40cc4d..231b9650d864 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2669,7 +2669,8 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (tsc_aux_uret_slot < 0)
 			return 1;
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 		msr_info->data = svm->tsc_aux;
 		break;
@@ -2891,7 +2892,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 			return 1;
 
 		if (!msr->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 
 		/*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 990ee339a05f..42e4bbaa299a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1788,7 +1788,8 @@ static void setup_msrs(struct vcpu_vmx *vmx)
 	if (update_transition_efer(vmx))
 		vmx_setup_uret_msr(vmx, MSR_EFER);
 
-	if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP))
+	if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP)  ||
+	    guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID))
 		vmx_setup_uret_msr(vmx, MSR_TSC_AUX);
 
 	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL);
@@ -1994,7 +1995,8 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 		goto find_uret_msr;
 	case MSR_IA32_DEBUGCTLMSR:
@@ -2314,7 +2316,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
 			return 1;
 		/* Check reserved bit, higher 32 bits should be zero */
 		if ((data >> 32) != 0)
@@ -4368,7 +4371,23 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
 						  xsaves_enabled, false);
 	}
 
-	vmx_adjust_sec_exec_feature(vmx, &exec_control, rdtscp, RDTSCP);
+	/*
+	 * RDPID is also gated by ENABLE_RDTSCP, turn on the control if either
+	 * feature is exposed to the guest.  This creates a virtualization hole
+	 * if both are supported in hardware but only one is exposed to the
+	 * guest, but letting the guest execute RDTSCP or RDPID when either one
+	 * is advertised is preferable to emulating the advertised instruction
+	 * in KVM on #UD, and obviously better than incorrectly injecting #UD.
+	 */
+	if (cpu_has_vmx_rdtscp()) {
+		bool rdpid_or_rdtscp_enabled =
+			guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
+			guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
+
+		vmx_adjust_secondary_exec_control(vmx, &exec_control,
+						  SECONDARY_EXEC_ENABLE_RDTSCP,
+						  rdpid_or_rdtscp_enabled, false);
+	}
 	vmx_adjust_sec_exec_feature(vmx, &exec_control, invpcid, INVPCID);
 
 	vmx_adjust_sec_exec_exiting(vmx, &exec_control, rdrand, RDRAND);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e304447be42d..b4516d303413 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5978,7 +5978,8 @@ static void kvm_init_msr_list(void)
 				continue;
 			break;
 		case MSR_TSC_AUX:
-			if (!kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
+			if (!kvm_cpu_cap_has(X86_FEATURE_RDTSCP) &&
+			    !kvm_cpu_cap_has(X86_FEATURE_RDPID))
 				continue;
 			break;
 		case MSR_IA32_UMWAIT_CONTROL:
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (6 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 07/15] KVM: x86: Add support for RDPID without RDTSCP Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:23   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list Sean Christopherson
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Configure the list of user return MSRs that are actually supported at
module init instead of reprobing the list of possible MSRs every time a
vCPU is created.  Curating the list on a per-vCPU basis is pointless; KVM
is completely hosed if the set of supported MSRs changes after module init,
or if the set of MSRs differs per physical CPU.

The per-vCPU lists also increase complexity (see __vmx_find_uret_msr()) and
create corner cases that _should_ be impossible, but theoretically exist
in KVM, e.g. advertising RDTSCP to userspace without actually being able to
virtualize RDTSCP if probing MSR_TSC_AUX fails.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 61 ++++++++++++++++++++++++++++--------------
 arch/x86/kvm/vmx/vmx.h | 10 ++++++-
 2 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 42e4bbaa299a..68454b0de2b1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -461,7 +461,7 @@ static unsigned long host_idt_base;
  * support this emulation, IA32_STAR must always be included in
  * vmx_uret_msrs_list[], even in i386 builds.
  */
-static const u32 vmx_uret_msrs_list[] = {
+static u32 vmx_uret_msrs_list[] = {
 #ifdef CONFIG_X86_64
 	MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
 #endif
@@ -469,6 +469,12 @@ static const u32 vmx_uret_msrs_list[] = {
 	MSR_IA32_TSX_CTRL,
 };
 
+/*
+ * Number of user return MSRs that are actually supported in hardware.
+ * vmx_uret_msrs_list is modified when KVM is loaded to drop unsupported MSRs.
+ */
+static int vmx_nr_uret_msrs;
+
 #if IS_ENABLED(CONFIG_HYPERV)
 static bool __read_mostly enlightened_vmcs = true;
 module_param(enlightened_vmcs, bool, 0444);
@@ -700,9 +706,16 @@ static inline int __vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
 {
 	int i;
 
-	for (i = 0; i < vmx->nr_uret_msrs; ++i)
+	/*
+	 * Note, vmx->guest_uret_msrs is the same size as vmx_uret_msrs_list,
+	 * but is ordered differently.  The MSR is matched against the list of
+	 * supported uret MSRs using "slot", but the index that is returned is
+	 * the index into guest_uret_msrs.
+	 */
+	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
 		if (vmx_uret_msrs_list[vmx->guest_uret_msrs[i].slot] == msr)
 			return i;
+	}
 	return -1;
 }
 
@@ -6929,18 +6942,10 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			goto free_vpid;
 	}
 
-	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
+	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
+		vmx->guest_uret_msrs[i].data = 0;
 
-	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
-		u32 index = vmx_uret_msrs_list[i];
-		int j = vmx->nr_uret_msrs;
-
-		if (kvm_probe_user_return_msr(index))
-			continue;
-
-		vmx->guest_uret_msrs[j].slot = i;
-		vmx->guest_uret_msrs[j].data = 0;
-		switch (index) {
+		switch (vmx_uret_msrs_list[i]) {
 		case MSR_IA32_TSX_CTRL:
 			/*
 			 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
@@ -6954,15 +6959,14 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			 * host so that TSX remains always disabled.
 			 */
 			if (boot_cpu_has(X86_FEATURE_RTM))
-				vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+				vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
 			else
-				vmx->guest_uret_msrs[j].mask = 0;
+				vmx->guest_uret_msrs[i].mask = 0;
 			break;
 		default:
-			vmx->guest_uret_msrs[j].mask = -1ull;
+			vmx->guest_uret_msrs[i].mask = -1ull;
 			break;
 		}
-		++vmx->nr_uret_msrs;
 	}
 
 	err = alloc_loaded_vmcs(&vmx->vmcs01);
@@ -7821,17 +7825,34 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
 };
 
+static __init void vmx_setup_user_return_msrs(void)
+{
+	u32 msr;
+	int i;
+
+	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
+
+	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
+		msr = vmx_uret_msrs_list[i];
+
+		if (kvm_probe_user_return_msr(msr))
+			continue;
+
+		kvm_define_user_return_msr(vmx_nr_uret_msrs, msr);
+		vmx_uret_msrs_list[vmx_nr_uret_msrs++] = msr;
+	}
+}
+
 static __init int hardware_setup(void)
 {
 	unsigned long host_bndcfgs;
 	struct desc_ptr dt;
-	int r, i, ept_lpage_level;
+	int r, ept_lpage_level;
 
 	store_idt(&dt);
 	host_idt_base = dt.address;
 
-	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i)
-		kvm_define_user_return_msr(i, vmx_uret_msrs_list[i]);
+	vmx_setup_user_return_msrs();
 
 	if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
 		return -EIO;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 008cb87ff088..d71ed8b425c5 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -245,8 +245,16 @@ struct vcpu_vmx {
 	u32                   idt_vectoring_info;
 	ulong                 rflags;
 
+	/*
+	 * User return MSRs are always emulated when enabled in the guest, but
+	 * only loaded into hardware when necessary, e.g. SYSCALL #UDs outside
+	 * of 64-bit mode or if EFER.SCE=1, thus the SYSCALL MSRs don't need to
+	 * be loaded into hardware if those conditions aren't met.
+	 * nr_active_uret_msrs tracks the number of MSRs that need to be loaded
+	 * into hardware when running the guest.  guest_uret_msrs[] is resorted
+	 * whenever the number of "active" uret MSRs is modified.
+	 */
 	struct vmx_uret_msr   guest_uret_msrs[MAX_NR_USER_RETURN_MSRS];
-	int                   nr_uret_msrs;
 	int                   nr_active_uret_msrs;
 	bool                  guest_uret_msrs_loaded;
 #ifdef CONFIG_X86_64
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (7 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-08  3:31   ` Reiji Watanabe
  2021-05-10  8:25   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 10/15] KVM: VMX: Use common x86's uret MSR list as the one true list Sean Christopherson
                   ` (6 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Explicitly flag a uret MSR as needing to be loaded into hardware instead of
re-sorting the list of "active" MSRs and tracking how many MSRs in total
need to be loaded.  The only benefit to sorting the list is that the loop
to load MSRs during vmx_prepare_switch_to_guest() doesn't need to iterate
over all supported uret MSRs, only those that are active.  But that is a
pointless optimization, as the most common case, running a 64-bit guest,
will load the vast majority of MSRs.  Not to mention that a single WRMSR is
far more expensive than iterating over the list.

Providing a stable list order obviates the need to track a given MSR's
"slot" in the per-CPU list of user return MSRs; all lists simply use the
same ordering.  Future patches will take advantage of the stable order to
further simplify the related code.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 80 ++++++++++++++++++++++--------------------
 arch/x86/kvm/vmx/vmx.h |  2 +-
 2 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 68454b0de2b1..6caabcd5037e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -458,8 +458,9 @@ static unsigned long host_idt_base;
  * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
  * will emulate SYSCALL in legacy mode if the vendor string in guest
  * CPUID.0:{EBX,ECX,EDX} is "AuthenticAMD" or "AMDisbetter!" To
- * support this emulation, IA32_STAR must always be included in
- * vmx_uret_msrs_list[], even in i386 builds.
+ * support this emulation, MSR_STAR is included in the list for i386,
+ * but is never loaded into hardware.  MSR_CSTAR is also never loaded
+ * into hardware and is here purely for emulation purposes.
  */
 static u32 vmx_uret_msrs_list[] = {
 #ifdef CONFIG_X86_64
@@ -702,18 +703,12 @@ static bool is_valid_passthrough_msr(u32 msr)
 	return r;
 }
 
-static inline int __vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
+static inline int __vmx_find_uret_msr(u32 msr)
 {
 	int i;
 
-	/*
-	 * Note, vmx->guest_uret_msrs is the same size as vmx_uret_msrs_list,
-	 * but is ordered differently.  The MSR is matched against the list of
-	 * supported uret MSRs using "slot", but the index that is returned is
-	 * the index into guest_uret_msrs.
-	 */
 	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
-		if (vmx_uret_msrs_list[vmx->guest_uret_msrs[i].slot] == msr)
+		if (vmx_uret_msrs_list[i] == msr)
 			return i;
 	}
 	return -1;
@@ -723,7 +718,7 @@ struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
 {
 	int i;
 
-	i = __vmx_find_uret_msr(vmx, msr);
+	i = __vmx_find_uret_msr(msr);
 	if (i >= 0)
 		return &vmx->guest_uret_msrs[i];
 	return NULL;
@@ -732,13 +727,14 @@ struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
 static int vmx_set_guest_uret_msr(struct vcpu_vmx *vmx,
 				  struct vmx_uret_msr *msr, u64 data)
 {
+	unsigned int slot = msr - vmx->guest_uret_msrs;
 	int ret = 0;
 
 	u64 old_msr_data = msr->data;
 	msr->data = data;
-	if (msr - vmx->guest_uret_msrs < vmx->nr_active_uret_msrs) {
+	if (msr->load_into_hardware) {
 		preempt_disable();
-		ret = kvm_set_user_return_msr(msr->slot, msr->data, msr->mask);
+		ret = kvm_set_user_return_msr(slot, msr->data, msr->mask);
 		preempt_enable();
 		if (ret)
 			msr->data = old_msr_data;
@@ -1090,7 +1086,7 @@ static bool update_transition_efer(struct vcpu_vmx *vmx)
 		return false;
 	}
 
-	i = __vmx_find_uret_msr(vmx, MSR_EFER);
+	i = __vmx_find_uret_msr(MSR_EFER);
 	if (i < 0)
 		return false;
 
@@ -1252,11 +1248,14 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	 */
 	if (!vmx->guest_uret_msrs_loaded) {
 		vmx->guest_uret_msrs_loaded = true;
-		for (i = 0; i < vmx->nr_active_uret_msrs; ++i)
-			kvm_set_user_return_msr(vmx->guest_uret_msrs[i].slot,
+		for (i = 0; i < vmx_nr_uret_msrs; ++i) {
+			if (!vmx->guest_uret_msrs[i].load_into_hardware)
+				continue;
+
+			kvm_set_user_return_msr(i,
 						vmx->guest_uret_msrs[i].data,
 						vmx->guest_uret_msrs[i].mask);
-
+		}
 	}
 
     	if (vmx->nested.need_vmcs12_to_shadow_sync)
@@ -1763,19 +1762,16 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
 	vmx_clear_hlt(vcpu);
 }
 
-static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
+static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr,
+			       bool load_into_hardware)
 {
-	struct vmx_uret_msr tmp;
-	int from, to;
+	struct vmx_uret_msr *uret_msr;
 
-	from = __vmx_find_uret_msr(vmx, msr);
-	if (from < 0)
+	uret_msr = vmx_find_uret_msr(vmx, msr);
+	if (!uret_msr)
 		return;
-	to = vmx->nr_active_uret_msrs++;
 
-	tmp = vmx->guest_uret_msrs[to];
-	vmx->guest_uret_msrs[to] = vmx->guest_uret_msrs[from];
-	vmx->guest_uret_msrs[from] = tmp;
+	uret_msr->load_into_hardware = load_into_hardware;
 }
 
 /*
@@ -1785,30 +1781,36 @@ static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
  */
 static void setup_msrs(struct vcpu_vmx *vmx)
 {
-	vmx->guest_uret_msrs_loaded = false;
-	vmx->nr_active_uret_msrs = 0;
 #ifdef CONFIG_X86_64
+	bool load_syscall_msrs;
+
 	/*
 	 * The SYSCALL MSRs are only needed on long mode guests, and only
 	 * when EFER.SCE is set.
 	 */
-	if (is_long_mode(&vmx->vcpu) && (vmx->vcpu.arch.efer & EFER_SCE)) {
-		vmx_setup_uret_msr(vmx, MSR_STAR);
-		vmx_setup_uret_msr(vmx, MSR_LSTAR);
-		vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK);
-	}
+	load_syscall_msrs = is_long_mode(&vmx->vcpu) &&
+			    (vmx->vcpu.arch.efer & EFER_SCE);
+
+	vmx_setup_uret_msr(vmx, MSR_STAR, load_syscall_msrs);
+	vmx_setup_uret_msr(vmx, MSR_LSTAR, load_syscall_msrs);
+	vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK, load_syscall_msrs);
 #endif
-	if (update_transition_efer(vmx))
-		vmx_setup_uret_msr(vmx, MSR_EFER);
+	vmx_setup_uret_msr(vmx, MSR_EFER, update_transition_efer(vmx));
 
-	if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP)  ||
-	    guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID))
-		vmx_setup_uret_msr(vmx, MSR_TSC_AUX);
+	vmx_setup_uret_msr(vmx, MSR_TSC_AUX,
+			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
+			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
 
-	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL);
+	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, true);
 
 	if (cpu_has_vmx_msr_bitmap())
 		vmx_update_msr_bitmap(&vmx->vcpu);
+
+	/*
+	 * The set of MSRs to load may have changed, reload MSRs before the
+	 * next VM-Enter.
+	 */
+	vmx->guest_uret_msrs_loaded = false;
 }
 
 static u64 vmx_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index d71ed8b425c5..16e4e457ba23 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -36,7 +36,7 @@ struct vmx_msrs {
 };
 
 struct vmx_uret_msr {
-	unsigned int slot; /* The MSR's slot in kvm_user_return_msrs. */
+	bool load_into_hardware;
 	u64 data;
 	u64 mask;
 };
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 10/15] KVM: VMX: Use common x86's uret MSR list as the one true list
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (8 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:25   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way Sean Christopherson
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Drop VMX's global list of user return MSRs now that VMX doesn't re-sort said
list to isolate "active" MSRs, i.e. now that VMX's list and x86's list have
the same MSRs in the same order.

In addition to eliminating the redundant list, this will also allow moving
more of the list management into common x86.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/vmx.c          | 97 ++++++++++++++-------------------
 arch/x86/kvm/x86.c              | 12 ++++
 3 files changed, 53 insertions(+), 57 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a02c9bf3f7f1..c9452472ed55 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1778,6 +1778,7 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 		    unsigned long icr, int op_64_bit);
 
 void kvm_define_user_return_msr(unsigned index, u32 msr);
+int kvm_find_user_return_msr(u32 msr);
 int kvm_probe_user_return_msr(u32 msr);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6caabcd5037e..4b432d2bbd06 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -454,26 +454,7 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx)
 
 static unsigned long host_idt_base;
 
-/*
- * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
- * will emulate SYSCALL in legacy mode if the vendor string in guest
- * CPUID.0:{EBX,ECX,EDX} is "AuthenticAMD" or "AMDisbetter!" To
- * support this emulation, MSR_STAR is included in the list for i386,
- * but is never loaded into hardware.  MSR_CSTAR is also never loaded
- * into hardware and is here purely for emulation purposes.
- */
-static u32 vmx_uret_msrs_list[] = {
-#ifdef CONFIG_X86_64
-	MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
-#endif
-	MSR_EFER, MSR_TSC_AUX, MSR_STAR,
-	MSR_IA32_TSX_CTRL,
-};
-
-/*
- * Number of user return MSRs that are actually supported in hardware.
- * vmx_uret_msrs_list is modified when KVM is loaded to drop unsupported MSRs.
- */
+/* Number of user return MSRs that are actually supported in hardware. */
 static int vmx_nr_uret_msrs;
 
 #if IS_ENABLED(CONFIG_HYPERV)
@@ -703,22 +684,11 @@ static bool is_valid_passthrough_msr(u32 msr)
 	return r;
 }
 
-static inline int __vmx_find_uret_msr(u32 msr)
-{
-	int i;
-
-	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
-		if (vmx_uret_msrs_list[i] == msr)
-			return i;
-	}
-	return -1;
-}
-
 struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
 {
 	int i;
 
-	i = __vmx_find_uret_msr(msr);
+	i = kvm_find_user_return_msr(msr);
 	if (i >= 0)
 		return &vmx->guest_uret_msrs[i];
 	return NULL;
@@ -1086,7 +1056,7 @@ static bool update_transition_efer(struct vcpu_vmx *vmx)
 		return false;
 	}
 
-	i = __vmx_find_uret_msr(MSR_EFER);
+	i = kvm_find_user_return_msr(MSR_EFER);
 	if (i < 0)
 		return false;
 
@@ -6922,6 +6892,7 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
 
 static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 {
+	struct vmx_uret_msr *tsx_ctrl;
 	struct vcpu_vmx *vmx;
 	int i, cpu, err;
 
@@ -6946,29 +6917,25 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 
 	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
 		vmx->guest_uret_msrs[i].data = 0;
-
-		switch (vmx_uret_msrs_list[i]) {
-		case MSR_IA32_TSX_CTRL:
-			/*
-			 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
-			 * interception.  Keep the host value unchanged to avoid
-			 * changing CPUID bits under the host kernel's feet.
-			 *
-			 * hle=0, rtm=0, tsx_ctrl=1 can be found with some
-			 * combinations of new kernel and old userspace.  If
-			 * those guests run on a tsx=off host, do allow guests
-			 * to use TSX_CTRL, but do not change the value on the
-			 * host so that TSX remains always disabled.
-			 */
-			if (boot_cpu_has(X86_FEATURE_RTM))
-				vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
-			else
-				vmx->guest_uret_msrs[i].mask = 0;
-			break;
-		default:
-			vmx->guest_uret_msrs[i].mask = -1ull;
-			break;
-		}
+		vmx->guest_uret_msrs[i].mask = -1ull;
+	}
+	tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
+	if (tsx_ctrl) {
+		/*
+		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
+		 * Keep the host value unchanged to avoid changing CPUID bits
+		 * under the host kernel's feet.
+		 *
+		 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations
+		 * of new kernel and old userspace.  If those guests run on a
+		 * tsx=off host, do allow guests to use TSX_CTRL, but do not
+		 * change the value on the host so that TSX remains always
+		 * disabled.
+		 */
+		if (boot_cpu_has(X86_FEATURE_RTM))
+			vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+		else
+			vmx->guest_uret_msrs[i].mask = 0;
 	}
 
 	err = alloc_loaded_vmcs(&vmx->vmcs01);
@@ -7829,6 +7796,22 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 
 static __init void vmx_setup_user_return_msrs(void)
 {
+
+	/*
+	 * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
+	 * will emulate SYSCALL in legacy mode if the vendor string in guest
+	 * CPUID.0:{EBX,ECX,EDX} is "AuthenticAMD" or "AMDisbetter!" To
+	 * support this emulation, MSR_STAR is included in the list for i386,
+	 * but is never loaded into hardware.  MSR_CSTAR is also never loaded
+	 * into hardware and is here purely for emulation purposes.
+	 */
+	const u32 vmx_uret_msrs_list[] = {
+	#ifdef CONFIG_X86_64
+		MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
+	#endif
+		MSR_EFER, MSR_TSC_AUX, MSR_STAR,
+		MSR_IA32_TSX_CTRL,
+	};
 	u32 msr;
 	int i;
 
@@ -7841,7 +7824,7 @@ static __init void vmx_setup_user_return_msrs(void)
 			continue;
 
 		kvm_define_user_return_msr(vmx_nr_uret_msrs, msr);
-		vmx_uret_msrs_list[vmx_nr_uret_msrs++] = msr;
+		vmx_nr_uret_msrs++;
 	}
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4516d303413..90ef340565a4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -364,6 +364,18 @@ void kvm_define_user_return_msr(unsigned slot, u32 msr)
 }
 EXPORT_SYMBOL_GPL(kvm_define_user_return_msr);
 
+int kvm_find_user_return_msr(u32 msr)
+{
+	int i;
+
+	for (i = 0; i < user_return_msrs_global.nr; ++i) {
+		if (user_return_msrs_global.msrs[i] == msr)
+			return i;
+	}
+	return -1;
+}
+EXPORT_SYMBOL_GPL(kvm_find_user_return_msr);
+
 static void kvm_user_return_msr_cpu_online(void)
 {
 	unsigned int cpu = smp_processor_id();
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (9 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 10/15] KVM: VMX: Use common x86's uret MSR list as the one true list Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-05  8:49   ` Paolo Bonzini
  2021-05-10  8:26   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 12/15] KVM: x86: Export the number of uret MSRs to vendor modules Sean Christopherson
                   ` (4 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Tag TSX_CTRL as not needing to be loaded when RTM isn't supported in the
host.  Crushing the write mask to '0' has the same effect, but requires
more mental gymnastics to understand.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4b432d2bbd06..7a53568b34fc 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1771,7 +1771,13 @@ static void setup_msrs(struct vcpu_vmx *vmx)
 			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
 			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
 
-	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, true);
+	/*
+	 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations of new
+	 * kernel and old userspace.  If those guests run on a tsx=off host, do
+	 * allow guests to use TSX_CTRL, but don't change the value in hardware
+	 * so that TSX remains always disabled.
+	 */
+	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, boot_cpu_has(X86_FEATURE_RTM));
 
 	if (cpu_has_vmx_msr_bitmap())
 		vmx_update_msr_bitmap(&vmx->vcpu);
@@ -6919,23 +6925,15 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 		vmx->guest_uret_msrs[i].data = 0;
 		vmx->guest_uret_msrs[i].mask = -1ull;
 	}
-	tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
-	if (tsx_ctrl) {
+	if (boot_cpu_has(X86_FEATURE_RTM)) {
 		/*
 		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
 		 * Keep the host value unchanged to avoid changing CPUID bits
 		 * under the host kernel's feet.
-		 *
-		 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations
-		 * of new kernel and old userspace.  If those guests run on a
-		 * tsx=off host, do allow guests to use TSX_CTRL, but do not
-		 * change the value on the host so that TSX remains always
-		 * disabled.
 		 */
-		if (boot_cpu_has(X86_FEATURE_RTM))
+		tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
+		if (tsx_ctrl)
 			vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
-		else
-			vmx->guest_uret_msrs[i].mask = 0;
 	}
 
 	err = alloc_loaded_vmcs(&vmx->vmcs01);
-- 
2.31.1.527.g47e6f16901-goog



* [PATCH 12/15] KVM: x86: Export the number of uret MSRs to vendor modules
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (10 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:27   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 13/15] KVM: x86: Move uret MSR slot management to common x86 Sean Christopherson
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Split out and export the number of configured user return MSRs so that
VMX can iterate over the set of MSRs without having to do its own tracking.
Keep the list itself internal to x86 so that vendor code still has to go
through the "official" APIs to add/modify entries.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 29 +++++++++++++----------------
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c9452472ed55..10663610f105 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1419,6 +1419,7 @@ struct kvm_arch_async_pf {
 	bool direct_map;
 };
 
+extern u32 __read_mostly kvm_nr_uret_msrs;
 extern u64 __read_mostly host_efer;
 extern bool __read_mostly allow_smaller_maxphyaddr;
 extern struct kvm_x86_ops kvm_x86_ops;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 90ef340565a4..2fd46e917666 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -184,11 +184,6 @@ module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
  */
 #define KVM_MAX_NR_USER_RETURN_MSRS 16
 
-struct kvm_user_return_msrs_global {
-	int nr;
-	u32 msrs[KVM_MAX_NR_USER_RETURN_MSRS];
-};
-
 struct kvm_user_return_msrs {
 	struct user_return_notifier urn;
 	bool registered;
@@ -198,7 +193,9 @@ struct kvm_user_return_msrs {
 	} values[KVM_MAX_NR_USER_RETURN_MSRS];
 };
 
-static struct kvm_user_return_msrs_global __read_mostly user_return_msrs_global;
+u32 __read_mostly kvm_nr_uret_msrs;
+EXPORT_SYMBOL_GPL(kvm_nr_uret_msrs);
+static u32 __read_mostly kvm_uret_msrs_list[KVM_MAX_NR_USER_RETURN_MSRS];
 static struct kvm_user_return_msrs __percpu *user_return_msrs;
 
 #define KVM_SUPPORTED_XCR0     (XFEATURE_MASK_FP | XFEATURE_MASK_SSE \
@@ -330,10 +327,10 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
 		user_return_notifier_unregister(urn);
 	}
 	local_irq_restore(flags);
-	for (slot = 0; slot < user_return_msrs_global.nr; ++slot) {
+	for (slot = 0; slot < kvm_nr_uret_msrs; ++slot) {
 		values = &msrs->values[slot];
 		if (values->host != values->curr) {
-			wrmsrl(user_return_msrs_global.msrs[slot], values->host);
+			wrmsrl(kvm_uret_msrs_list[slot], values->host);
 			values->curr = values->host;
 		}
 	}
@@ -358,9 +355,9 @@ EXPORT_SYMBOL_GPL(kvm_probe_user_return_msr);
 void kvm_define_user_return_msr(unsigned slot, u32 msr)
 {
 	BUG_ON(slot >= KVM_MAX_NR_USER_RETURN_MSRS);
-	user_return_msrs_global.msrs[slot] = msr;
-	if (slot >= user_return_msrs_global.nr)
-		user_return_msrs_global.nr = slot + 1;
+	kvm_uret_msrs_list[slot] = msr;
+	if (slot >= kvm_nr_uret_msrs)
+		kvm_nr_uret_msrs = slot + 1;
 }
 EXPORT_SYMBOL_GPL(kvm_define_user_return_msr);
 
@@ -368,8 +365,8 @@ int kvm_find_user_return_msr(u32 msr)
 {
 	int i;
 
-	for (i = 0; i < user_return_msrs_global.nr; ++i) {
-		if (user_return_msrs_global.msrs[i] == msr)
+	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
+		if (kvm_uret_msrs_list[i] == msr)
 			return i;
 	}
 	return -1;
@@ -383,8 +380,8 @@ static void kvm_user_return_msr_cpu_online(void)
 	u64 value;
 	int i;
 
-	for (i = 0; i < user_return_msrs_global.nr; ++i) {
-		rdmsrl_safe(user_return_msrs_global.msrs[i], &value);
+	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
+		rdmsrl_safe(kvm_uret_msrs_list[i], &value);
 		msrs->values[i].host = value;
 		msrs->values[i].curr = value;
 	}
@@ -399,7 +396,7 @@ int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 	value = (value & mask) | (msrs->values[slot].host & ~mask);
 	if (value == msrs->values[slot].curr)
 		return 0;
-	err = wrmsrl_safe(user_return_msrs_global.msrs[slot], value);
+	err = wrmsrl_safe(kvm_uret_msrs_list[slot], value);
 	if (err)
 		return 1;
 
-- 
2.31.1.527.g47e6f16901-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 13/15] KVM: x86: Move uret MSR slot management to common x86
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (11 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 12/15] KVM: x86: Export the number of uret MSRs to vendor modules Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:28   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model Sean Christopherson
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Now that SVM and VMX both probe MSRs before "defining" user return slots
for them, consolidate the code for probe+define into common x86 and
eliminate the odd behavior of having the vendor code define the slot for
a given MSR.
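
As a sketch of the resulting calling convention (illustrative, not part of the
diff below), a vendor module makes a single call and records the returned slot,
with a negative value meaning the MSR failed probing and must not be used:

	/* e.g. SVM's MSR_TSC_AUX slot; a negative slot means probing failed. */
	tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);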

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  3 +--
 arch/x86/kvm/svm/svm.c          |  5 +----
 arch/x86/kvm/vmx/vmx.c          | 19 ++++---------------
 arch/x86/kvm/x86.c              | 19 +++++++++++--------
 4 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 10663610f105..a4b912f7e427 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1778,9 +1778,8 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 		    unsigned long ipi_bitmap_high, u32 min,
 		    unsigned long icr, int op_64_bit);
 
-void kvm_define_user_return_msr(unsigned index, u32 msr);
+int kvm_add_user_return_msr(u32 msr);
 int kvm_find_user_return_msr(u32 msr);
-int kvm_probe_user_return_msr(u32 msr);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
 
 u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 231b9650d864..de921935e8de 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -959,10 +959,7 @@ static __init int svm_hardware_setup(void)
 		kvm_tsc_scaling_ratio_frac_bits = 32;
 	}
 
-	if (!kvm_probe_user_return_msr(MSR_TSC_AUX)) {
-		tsc_aux_uret_slot = 0;
-		kvm_define_user_return_msr(tsc_aux_uret_slot, MSR_TSC_AUX);
-	}
+	tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
 
 	/* Check for pause filtering support */
 	if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7a53568b34fc..26f82f302391 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -454,9 +454,6 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx)
 
 static unsigned long host_idt_base;
 
-/* Number of user return MSRs that are actually supported in hardware. */
-static int vmx_nr_uret_msrs;
-
 #if IS_ENABLED(CONFIG_HYPERV)
 static bool __read_mostly enlightened_vmcs = true;
 module_param(enlightened_vmcs, bool, 0444);
@@ -1218,7 +1215,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	 */
 	if (!vmx->guest_uret_msrs_loaded) {
 		vmx->guest_uret_msrs_loaded = true;
-		for (i = 0; i < vmx_nr_uret_msrs; ++i) {
+		for (i = 0; i < kvm_nr_uret_msrs; ++i) {
 			if (!vmx->guest_uret_msrs[i].load_into_hardware)
 				continue;
 
@@ -6921,7 +6918,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			goto free_vpid;
 	}
 
-	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
+	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
 		vmx->guest_uret_msrs[i].data = 0;
 		vmx->guest_uret_msrs[i].mask = -1ull;
 	}
@@ -7810,20 +7807,12 @@ static __init void vmx_setup_user_return_msrs(void)
 		MSR_EFER, MSR_TSC_AUX, MSR_STAR,
 		MSR_IA32_TSX_CTRL,
 	};
-	u32 msr;
 	int i;
 
 	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
 
-	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
-		msr = vmx_uret_msrs_list[i];
-
-		if (kvm_probe_user_return_msr(msr))
-			continue;
-
-		kvm_define_user_return_msr(vmx_nr_uret_msrs, msr);
-		vmx_nr_uret_msrs++;
-	}
+	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i)
+		kvm_add_user_return_msr(vmx_uret_msrs_list[i]);
 }
 
 static __init int hardware_setup(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2fd46e917666..adca491d3b4b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -336,7 +336,7 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
 	}
 }
 
-int kvm_probe_user_return_msr(u32 msr)
+static int kvm_probe_user_return_msr(u32 msr)
 {
 	u64 val;
 	int ret;
@@ -350,16 +350,18 @@ int kvm_probe_user_return_msr(u32 msr)
 	preempt_enable();
 	return ret;
 }
-EXPORT_SYMBOL_GPL(kvm_probe_user_return_msr);
 
-void kvm_define_user_return_msr(unsigned slot, u32 msr)
+int kvm_add_user_return_msr(u32 msr)
 {
-	BUG_ON(slot >= KVM_MAX_NR_USER_RETURN_MSRS);
-	kvm_uret_msrs_list[slot] = msr;
-	if (slot >= kvm_nr_uret_msrs)
-		kvm_nr_uret_msrs = slot + 1;
+	BUG_ON(kvm_nr_uret_msrs >= KVM_MAX_NR_USER_RETURN_MSRS);
+
+	if (kvm_probe_user_return_msr(msr))
+		return -1;
+
+	kvm_uret_msrs_list[kvm_nr_uret_msrs] = msr;
+	return kvm_nr_uret_msrs++;
 }
-EXPORT_SYMBOL_GPL(kvm_define_user_return_msr);
+EXPORT_SYMBOL_GPL(kvm_add_user_return_msr);
 
 int kvm_find_user_return_msr(u32 msr)
 {
@@ -8169,6 +8171,7 @@ int kvm_arch_init(void *opaque)
 		printk(KERN_ERR "kvm: failed to allocate percpu kvm_user_return_msrs\n");
 		goto out_free_x86_emulator_cache;
 	}
+	kvm_nr_uret_msrs = 0;
 
 	r = kvm_mmu_module_init();
 	if (r)
-- 
2.31.1.527.g47e6f16901-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (12 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 13/15] KVM: x86: Move uret MSR slot management to common x86 Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:29   ` Maxim Levitsky
  2021-05-04 17:17 ` [PATCH 15/15] KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed Sean Christopherson
  2021-05-05  8:51 ` [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Paolo Bonzini
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

Squish the Intel and AMD emulation of MSR_TSC_AUX together and tie it to
the guest CPU model instead of the host CPU behavior.  While not strictly
necessary to avoid guest breakage, emulating cross-vendor "architecture"
will provide consistent behavior for the guest, e.g. WRMSR fault behavior
won't change if the vCPU is migrated to a host with divergent behavior.

Note, the "new" kvm_is_supported_user_return_msr() checks do not add new
functionality on either SVM or VMX.  On SVM, the equivalent was
"tsc_aux_uret_slot < 0", and on VMX the check was buried in the
vmx_find_uret_msr() call at the find_uret_msr label.
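
As a concrete illustration (example values only), after this patch a WRMSR of
MSR_TSC_AUX with bits 63:32 set behaves according to the guest's CPUID vendor,
regardless of what the host CPU does:

	data = 0x100000001ULL;			/* bits 63:32 non-zero */

	if (guest_cpuid_is_intel(vcpu) && (data >> 32) != 0)
		return 1;			/* Intel guest model: WRMSR faults */

	data = (u32)data;			/* AMD/other guest model: stored as 0x1 */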

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +++++
 arch/x86/kvm/svm/svm.c          | 24 ----------------------
 arch/x86/kvm/vmx/vmx.c          | 15 --------------
 arch/x86/kvm/x86.c              | 36 +++++++++++++++++++++++++++++++++
 4 files changed, 41 insertions(+), 39 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a4b912f7e427..00fb9efb9984 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1782,6 +1782,11 @@ int kvm_add_user_return_msr(u32 msr);
 int kvm_find_user_return_msr(u32 msr);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
 
+static inline bool kvm_is_supported_user_return_msr(u32 msr)
+{
+	return kvm_find_user_return_msr(msr) >= 0;
+}
+
 u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
 u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index de921935e8de..6c7c6a303cc5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2663,12 +2663,6 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
 		break;
 	case MSR_TSC_AUX:
-		if (tsc_aux_uret_slot < 0)
-			return 1;
-		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
-			return 1;
 		msr_info->data = svm->tsc_aux;
 		break;
 	/*
@@ -2885,24 +2879,6 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		svm->sysenter_esp_hi = guest_cpuid_is_intel(vcpu) ? (data >> 32) : 0;
 		break;
 	case MSR_TSC_AUX:
-		if (tsc_aux_uret_slot < 0)
-			return 1;
-
-		if (!msr->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
-			return 1;
-
-		/*
-		 * Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
-		 * incomplete and conflicting architectural behavior.  Current
-		 * AMD CPUs completely ignore bits 63:32, i.e. they aren't
-		 * reserved and always read as zeros.  Emulate AMD CPU behavior
-		 * to avoid explosions if the vCPU is migrated from an AMD host
-		 * to an Intel host.
-		 */
-		data = (u32)data;
-
 		/*
 		 * TSC_AUX is usually changed only during boot and never read
 		 * directly.  Intercept TSC_AUX instead of exposing it to the
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 26f82f302391..d85ac5876982 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1981,12 +1981,6 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
 		break;
-	case MSR_TSC_AUX:
-		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
-			return 1;
-		goto find_uret_msr;
 	case MSR_IA32_DEBUGCTLMSR:
 		msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
 		break;
@@ -2302,15 +2296,6 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			vmx->pt_desc.guest.addr_a[index / 2] = data;
 		break;
-	case MSR_TSC_AUX:
-		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
-			return 1;
-		/* Check reserved bit, higher 32 bits should be zero */
-		if ((data >> 32) != 0)
-			return 1;
-		goto find_uret_msr;
 	case MSR_IA32_PERF_CAPABILITIES:
 		if (data && !vcpu_to_pmu(vcpu)->version)
 			return 1;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index adca491d3b4b..896127ea4d4f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1642,6 +1642,30 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
 		 * invokes 64-bit SYSENTER.
 		 */
 		data = get_canonical(data, vcpu_virt_addr_bits(vcpu));
+		break;
+	case MSR_TSC_AUX:
+		if (!kvm_is_supported_user_return_msr(MSR_TSC_AUX))
+			return 1;
+
+		if (!host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
+			return 1;
+
+		/*
+		 * Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
+		 * incomplete and conflicting architectural behavior.  Current
+		 * AMD CPUs completely ignore bits 63:32, i.e. they aren't
+		 * reserved and always read as zeros.  Enforce Intel's reserved
+		 * bits check if and only if the guest CPU is Intel, and clear
+		 * the bits in all other cases.  This ensures cross-vendor
+		 * migration will provide consistent behavior for the guest.
+		 */
+		if (guest_cpuid_is_intel(vcpu) && (data >> 32) != 0)
+			return 1;
+
+		data = (u32)data;
+		break;
 	}
 
 	msr.data = data;
@@ -1678,6 +1702,18 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
 	if (!host_initiated && !kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ))
 		return KVM_MSR_RET_FILTERED;
 
+	switch (index) {
+	case MSR_TSC_AUX:
+		if (!kvm_is_supported_user_return_msr(MSR_TSC_AUX))
+			return 1;
+
+		if (!host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
+			return 1;
+		break;
+	}
+
 	msr.index = index;
 	msr.host_initiated = host_initiated;
 
-- 
2.31.1.527.g47e6f16901-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH 15/15] KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (13 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model Sean Christopherson
@ 2021-05-04 17:17 ` Sean Christopherson
  2021-05-10  8:29   ` Maxim Levitsky
  2021-05-05  8:51 ` [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Paolo Bonzini
  15 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 17:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

If probing MSR_TSC_AUX failed, hide RDTSCP and RDPID, and WARN if either
feature was reported as supported.  In theory, such a scenario should
never happen as both Intel and AMD state that MSR_TSC_AUX is available if
RDTSCP or RDPID is supported.  But, KVM injects #GP on MSR_TSC_AUX
accesses if probing failed, faults on WRMSR(MSR_TSC_AUX) may be fatal to
the guest (because they happen during early CPU bringup), and KVM itself
has effectively misreported RDPID support in the past.

Note, this also has the happy side effect of omitting MSR_TSC_AUX from
the list of MSRs that are exposed to userspace if probing the MSR fails.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c96f79c9fff2..bf0f74ce4974 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -567,6 +567,21 @@ void kvm_set_cpu_caps(void)
 		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
 		F(PMM) | F(PMM_EN)
 	);
+
+	/*
+	 * Hide RDTSCP and RDPID if either feature is reported as supported but
+	 * probing MSR_TSC_AUX failed.  This is purely a sanity check and
+	 * should never happen, but the guest will likely crash if RDTSCP or
+	 * RDPID is misreported, and KVM has botched MSR_TSC_AUX emulation in
+	 * the past, e.g. the sanity check may fire if this instance of KVM is
+	 * running as L1 on top of an older, broken KVM.
+	 */
+	if (WARN_ON((kvm_cpu_cap_has(X86_FEATURE_RDTSCP) ||
+		     kvm_cpu_cap_has(X86_FEATURE_RDPID)) &&
+		     !kvm_is_supported_user_return_msr(MSR_TSC_AUX))) {
+		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
+		kvm_cpu_cap_clear(X86_FEATURE_RDPID);
+	}
 }
 EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
 
-- 
2.31.1.527.g47e6f16901-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported
  2021-05-04 17:17 ` [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported Sean Christopherson
@ 2021-05-04 17:37   ` Jim Mattson
  2021-05-04 17:53     ` Jim Mattson
  2021-05-05  3:04   ` Reiji Watanabe
  2021-05-10  8:03   ` Maxim Levitsky
  2 siblings, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 17:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Clear KVM's RDPID capability if the ENABLE_RDTSCP secondary exec control is
> unsupported.  Despite being enumerated in a separate CPUID flag, RDPID is
> bundled under the same VMCS control as RDTSCP and will #UD in VMX non-root
> if ENABLE_RDTSCP is not enabled.
>
> Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>

But KVM will happily emulate RDPID if the instruction causes a #UD
VM-exit, won't it? See commit fb6d4d340e05 (KVM: x86: emulate RDPID).

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported
  2021-05-04 17:17 ` [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported Sean Christopherson
@ 2021-05-04 17:50   ` Jim Mattson
  2021-05-05  3:51   ` Reiji Watanabe
  2021-05-10  8:08   ` Maxim Levitsky
  2 siblings, 0 replies; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 17:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Do not advertise emulation support for RDPID if RDTSCP is unsupported.
> RDPID emulation subtly relies on MSR_TSC_AUX to exist in hardware, as
> both vmx_get_msr() and svm_get_msr() will return an error if the MSR is
> unsupported, i.e. ctxt->ops->get_msr() will fail and the emulator will
> inject a #UD.
>
> Note, RDPID emulation also relies on RDTSCP being enabled in the guest,
> but this is a KVM bug and will eventually be fixed.
>
> Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported
  2021-05-04 17:37   ` Jim Mattson
@ 2021-05-04 17:53     ` Jim Mattson
  2021-05-04 18:14       ` Sean Christopherson
  0 siblings, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 17:53 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 10:37 AM Jim Mattson <jmattson@google.com> wrote:
>
> On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Clear KVM's RDPID capability if the ENABLE_RDTSCP secondary exec control is
> > unsupported.  Despite being enumerated in a separate CPUID flag, RDPID is
> > bundled under the same VMCS control as RDTSCP and will #UD in VMX non-root
> > if ENABLE_RDTSCP is not enabled.
> >
> > Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
>
> But KVM will happily emulate RDPID if the instruction causes a #UD
> VM-exit, won't it? See commit fb6d4d340e05 (KVM: x86: emulate RDPID).

Oh, after reading the second patch, I now see why this is needed.

You misspelled 'advertise' in the summary line.

Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported
  2021-05-04 17:53     ` Jim Mattson
@ 2021-05-04 18:14       ` Sean Christopherson
  0 siblings, 0 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 18:14 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 04, 2021, Jim Mattson wrote:
> On Tue, May 4, 2021 at 10:37 AM Jim Mattson <jmattson@google.com> wrote:
> >
> > On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > Clear KVM's RDPID capability if the ENABLE_RDTSCP secondary exec control is
> > > unsupported.  Despite being enumerated in a separate CPUID flag, RDPID is
> > > bundled under the same VMCS control as RDTSCP and will #UD in VMX non-root
> > > if ENABLE_RDTSCP is not enabled.
> > >
> > > Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> >
> > But KVM will happily emulate RDPID if the instruction causes a #UD
> > VM-exit, won't it? See commit fb6d4d340e05 (KVM: x86: emulate RDPID).
> 
> Oh, after reading the second patch, I now see why this is needed.

Yeah.  Technically, once common x86 can query MSR_TSC_AUX support directly at
the end of the series, the emulation enumeration could be:

		if (kvm_is_supported_user_return_msr(MSR_TSC_AUX))
			entry->ecx = F(RDPID);

I think I actually meant to do that, then lost track of that TODO item when
reworking the series for the umpteenth time.

Practically speaking, the only way for kvm_is_supported_user_return_msr() to be
meaningful vs. kvm_cpu_cap_has() is if RDTSCP is supported in hardware but the
VMCS control is not available.  And I suppose there's also the case where
X86_FEATURE_RDTSCP was cleared by the kernel, but I feel like KVM should respect
the kernel's avoidance of RDTSCP/MSR_TSC_AUX in that case.  Regarding the silly
VMCS case, I have no objection to making the change, but I also don't care if we
sweep it under the rug.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 17:17 ` [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest Sean Christopherson
@ 2021-05-04 21:45   ` Jim Mattson
  2021-05-04 21:53     ` Sean Christopherson
  2021-05-05  4:26   ` Reiji Watanabe
  2021-05-10  8:08   ` Maxim Levitsky
  2 siblings, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 21:45 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest.
>
> Note, SVM does not support intercepting RDPID.  Unlike VMX's
> ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
> is a benign virtualization hole as the host kernel (incorrectly) sets
> MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
> into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
> the host's MSR_TSC_AUX to the guest.
>
> But, when the kernel bug is fixed, KVM will start leaking the host's
> MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
> for whatever reason.  This leak will be remedied in a future commit.
>
> Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
...
> @@ -4007,8 +4017,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>         svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
>                              guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
>
> -       /* Check again if INVPCID interception if required */
> -       svm_check_invpcid(svm);
> +       svm_recalc_instruction_intercepts(vcpu, svm);

Does the right thing happen here if the vCPU is in guest mode when
userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
off?

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 21:45   ` Jim Mattson
@ 2021-05-04 21:53     ` Sean Christopherson
  2021-05-04 21:56       ` Jim Mattson
  2021-05-04 21:57       ` Paolo Bonzini
  0 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 21:53 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 04, 2021, Jim Mattson wrote:
> On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest.
> >
> > Note, SVM does not support intercepting RDPID.  Unlike VMX's
> > ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
> > is a benign virtualization hole as the host kernel (incorrectly) sets
> > MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
> > into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
> > the host's MSR_TSC_AUX to the guest.
> >
> > But, when the kernel bug is fixed, KVM will start leaking the host's
> > MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
> > for whatever reason.  This leak will be remedied in a future commit.
> >
> > Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> ...
> > @@ -4007,8 +4017,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >         svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
> >                              guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
> >
> > -       /* Check again if INVPCID interception if required */
> > -       svm_check_invpcid(svm);
> > +       svm_recalc_instruction_intercepts(vcpu, svm);
> 
> Does the right thing happen here if the vCPU is in guest mode when
> userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> off?

I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
you specifically mean running in L2?

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 21:53     ` Sean Christopherson
@ 2021-05-04 21:56       ` Jim Mattson
  2021-05-04 22:10         ` Sean Christopherson
  2021-05-04 21:57       ` Paolo Bonzini
  1 sibling, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 21:56 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 2:53 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, May 04, 2021, Jim Mattson wrote:
> > On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest.
> > >
> > > Note, SVM does not support intercepting RDPID.  Unlike VMX's
> > > ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
> > > is a benign virtualization hole as the host kernel (incorrectly) sets
> > > MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
> > > into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
> > > the host's MSR_TSC_AUX to the guest.
> > >
> > > But, when the kernel bug is fixed, KVM will start leaking the host's
> > > MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
> > > for whatever reason.  This leak will be remedied in a future commit.
> > >
> > > Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > ...
> > > @@ -4007,8 +4017,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > >         svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
> > >                              guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
> > >
> > > -       /* Check again if INVPCID interception if required */
> > > -       svm_check_invpcid(svm);
> > > +       svm_recalc_instruction_intercepts(vcpu, svm);
> >
> > Does the right thing happen here if the vCPU is in guest mode when
> > userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> > off?
>
> I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> you specifically mean running in L2?

I mean is_guest_mode(vcpu) is true (i.e. running L2).

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 21:53     ` Sean Christopherson
  2021-05-04 21:56       ` Jim Mattson
@ 2021-05-04 21:57       ` Paolo Bonzini
  2021-05-04 21:58         ` Jim Mattson
  1 sibling, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2021-05-04 21:57 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm list, LKML,
	Xiaoyao Li, Reiji Watanabe

On 04/05/21 23:53, Sean Christopherson wrote:
>> Does the right thing happen here if the vCPU is in guest mode when
>> userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
>> off?
> I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> you specifically mean running in L2?
> 

Guest mode should mean L2.

(I wonder if we should have a capability that says "KVM_SET_CPUID2 can 
only be called prior to KVM_RUN").

Paolo


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 21:57       ` Paolo Bonzini
@ 2021-05-04 21:58         ` Jim Mattson
  2021-05-10  8:08           ` Maxim Levitsky
  0 siblings, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 21:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 2:57 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 04/05/21 23:53, Sean Christopherson wrote:
> >> Does the right thing happen here if the vCPU is in guest mode when
> >> userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> >> off?
> > I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> > you specifically mean running in L2?
> >
>
> Guest mode should mean L2.
>
> (I wonder if we should have a capability that says "KVM_SET_CPUID2 can
> only be called prior to KVM_RUN").

It would certainly make it easier to reason about potential security issues.
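
Purely as a thought experiment, a hypothetical gate might look something like
the sketch below; the capability flag is a placeholder, not an existing KVM
API, and last_vmentry_cpu is only borrowed as a cheap "has this vCPU ever run"
marker:

	/* Hypothetical: reject KVM_SET_CPUID2 once the vCPU has entered the guest. */
	if (vcpu->kvm->arch.cpuid_set_before_run_only &&	/* placeholder cap flag */
	    vcpu->arch.last_vmentry_cpu != -1)
		return -EBUSY;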

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 21:56       ` Jim Mattson
@ 2021-05-04 22:10         ` Sean Christopherson
  2021-05-04 22:24           ` Jim Mattson
  0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-04 22:10 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 04, 2021, Jim Mattson wrote:
> On Tue, May 4, 2021 at 2:53 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, May 04, 2021, Jim Mattson wrote:
> > > On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
> > > >
> > > > Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest.
> > > >
> > > > Note, SVM does not support intercepting RDPID.  Unlike VMX's
> > > > ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
> > > > is a benign virtualization hole as the host kernel (incorrectly) sets
> > > > MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
> > > > into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
> > > > the host's MSR_TSC_AUX to the guest.
> > > >
> > > > But, when the kernel bug is fixed, KVM will start leaking the host's
> > > > MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
> > > > for whatever reason.  This leak will be remedied in a future commit.
> > > >
> > > > Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > > ---
> > > ...
> > > > @@ -4007,8 +4017,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > > >         svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
> > > >                              guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
> > > >
> > > > -       /* Check again if INVPCID interception if required */
> > > > -       svm_check_invpcid(svm);
> > > > +       svm_recalc_instruction_intercepts(vcpu, svm);
> > >
> > > Does the right thing happen here if the vCPU is in guest mode when
> > > userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> > > off?
> >
> > I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> > you specifically mean running in L2?
> 
> I mean is_guest_mode(vcpu) is true (i.e. running L2).

No, it will not do the right thing, whatever "right thing" even means in this
context.  That's a pre-existing issue, e.g. INVPCID handling is also wrong.
I highly doubt VMX does, or even can, do the right thing either.

I'm pretty sure I lobbied in the past to disallow KVM_SET_CPUID* if the vCPU is
in guest mode since it's impossible to do the right thing without forcing an
exit to L1, e.g. changing MAXPHYSADDR will allow running L2 with an illegal
CR3, ditto for various CR4 bits.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 22:10         ` Sean Christopherson
@ 2021-05-04 22:24           ` Jim Mattson
  0 siblings, 0 replies; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 22:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 3:10 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, May 04, 2021, Jim Mattson wrote:
> > On Tue, May 4, 2021 at 2:53 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Tue, May 04, 2021, Jim Mattson wrote:
> > > > On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
> > > > >
> > > > > Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest.
> > > > >
> > > > > Note, SVM does not support intercepting RDPID.  Unlike VMX's
> > > > > ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
> > > > > is a benign virtualization hole as the host kernel (incorrectly) sets
> > > > > MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
> > > > > into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
> > > > > the host's MSR_TSC_AUX to the guest.
> > > > >
> > > > > But, when the kernel bug is fixed, KVM will start leaking the host's
> > > > > MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
> > > > > for whatever reason.  This leak will be remedied in a future commit.
> > > > >
> > > > > Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > > > ---
> > > > ...
> > > > > @@ -4007,8 +4017,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > > > >         svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
> > > > >                              guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
> > > > >
> > > > > -       /* Check again if INVPCID interception if required */
> > > > > -       svm_check_invpcid(svm);
> > > > > +       svm_recalc_instruction_intercepts(vcpu, svm);
> > > >
> > > > Does the right thing happen here if the vCPU is in guest mode when
> > > > userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> > > > off?
> > >
> > > I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> > > you specifically mean running in L2?
> >
> > I mean is_guest_mode(vcpu) is true (i.e. running L2).
>
> No, it will not do the right thing, whatever "right thing" even means in this
> context.  That's a pre-existing issue, e.g. INVPCID handling is also wrong.
> I highly doubt VMX does, or even can, do the right thing either.
>
> I'm pretty sure I lobbied in the past to disallow KVM_SET_CPUID* if the vCPU is
> in guest mode since it's impossible to do the right thing without forcing an
> exit to L1, e.g. changing MAXPHYSADDR will allow running L2 with an illegal
> CR3, ditto for various CR4 bits.

With that caveat understood,

Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum
  2021-05-04 17:17 ` [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum Sean Christopherson
@ 2021-05-04 23:24   ` Jim Mattson
  2021-05-10  8:14   ` Maxim Levitsky
  1 sibling, 0 replies; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 23:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Add a dedicated intercept enum for RDPID instead of piggybacking RDTSCP.
> Unlike VMX's ENABLE_RDTSCP, RDPID is not bound to SVM's RDTSCP intercept.
>
> Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs
  2021-05-04 17:17 ` [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs Sean Christopherson
@ 2021-05-04 23:36   ` Jim Mattson
  2021-05-10  8:18   ` Maxim Levitsky
  1 sibling, 0 replies; 68+ messages in thread
From: Jim Mattson @ 2021-05-04 23:36 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Disable preemption when probing a user return MSR via RDSMR/WRMSR.  If
> the MSR holds a different value per logical CPU, the WRMSR could corrupt
> the host's value if KVM is preempted between the RDMSR and WRMSR, and
> then rescheduled on a different CPU.
>
> Opportunistically land the helper in common x86, SVM will use the helper
> in a future commit.
>
> Fixes: 4be534102624 ("KVM: VMX: Initialize vmx->guest_msrs[] right after allocation")
> Cc: stable@vger.kernel.org
> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported
  2021-05-04 17:17 ` [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported Sean Christopherson
  2021-05-04 17:37   ` Jim Mattson
@ 2021-05-05  3:04   ` Reiji Watanabe
  2021-05-10  8:03   ` Maxim Levitsky
  2 siblings, 0 replies; 68+ messages in thread
From: Reiji Watanabe @ 2021-05-05  3:04 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Clear KVM's RDPID capability if the ENABLE_RDTSCP secondary exec control is
> unsupported.  Despite being enumerated in a separate CPUID flag, RDPID is
> bundled under the same VMCS control as RDTSCP and will #UD in VMX non-root
> if ENABLE_RDTSCP is not enabled.
>
> Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Reiji Watanabe <reijiw@google.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported
  2021-05-04 17:17 ` [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported Sean Christopherson
  2021-05-04 17:50   ` Jim Mattson
@ 2021-05-05  3:51   ` Reiji Watanabe
  2021-05-05  8:01     ` Paolo Bonzini
  2021-05-10  8:08   ` Maxim Levitsky
  2 siblings, 1 reply; 68+ messages in thread
From: Reiji Watanabe @ 2021-05-05  3:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -637,7 +637,8 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>         case 7:
>                 entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>                 entry->eax = 0;
> -               entry->ecx = F(RDPID);
> +               if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
> +                       entry->ecx = F(RDPID);
>                 ++array->nent;
>         default:
>                 break;

I'm wondering if entry->ecx should be set to F(RDPID) here
even if the CPU supports RDPID natively.
(i.e. kvm_cpu_cap_has(X86_FEATURE_RDPID) is true)

The document "Documentation/virt/kvm/api.rst" says:
---
4.88 KVM_GET_EMULATED_CPUID
---------------------------
<...>
Userspace can use the information returned by this ioctl to query
which features are emulated by kvm instead of being present natively.
---

Thanks,
Reiji

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 17:17 ` [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest Sean Christopherson
  2021-05-04 21:45   ` Jim Mattson
@ 2021-05-05  4:26   ` Reiji Watanabe
  2021-05-10  8:08   ` Maxim Levitsky
  2 siblings, 0 replies; 68+ messages in thread
From: Reiji Watanabe @ 2021-05-05  4:26 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On Tue, May 4, 2021 at 10:17 AM Sean Christopherson <seanjc@google.com> wrote:
>
> Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest.
>
> Note, SVM does not support intercepting RDPID.  Unlike VMX's
> ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
> is a benign virtualization hole as the host kernel (incorrectly) sets
> MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
> into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
> the host's MSR_TSC_AUX to the guest.
>
> But, when the kernel bug is fixed, KVM will start leaking the host's
> MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
> for whatever reason.  This leak will be remedied in a future commit.
>
> Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Reiji Watanabe <reijiw@google.com>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported
  2021-05-05  3:51   ` Reiji Watanabe
@ 2021-05-05  8:01     ` Paolo Bonzini
  0 siblings, 0 replies; 68+ messages in thread
From: Paolo Bonzini @ 2021-05-05  8:01 UTC (permalink / raw)
  To: Reiji Watanabe, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li

On 05/05/21 05:51, Reiji Watanabe wrote:
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -637,7 +637,8 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>>          case 7:
>>                  entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>>                  entry->eax = 0;
>> -               entry->ecx = F(RDPID);
>> +               if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
>> +                       entry->ecx = F(RDPID);
>>                  ++array->nent;
>>          default:
>>                  break;
> I'm wondering if entry->ecx should be set to F(RDPID) here
> even if the CPU supports RDPID natively.
> (i.e. kvm_cpu_cap_has(X86_FEATURE_RDPID) is true)
> 
> The document "Documentation/virt/kvm/api.rst" says:
> ---
> 4.88 KVM_GET_EMULATED_CPUID
> ---------------------------
> <...>
> Userspace can use the information returned by this ioctl to query
> which features are emulated by kvm instead of being present natively.
> ---

Setting it always is consistent with the treatment of MOVBE above. 
Either way is okay but it should be done for both bits.
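
As a sketch of one reading of that suggestion (illustrative only, not a
proposed patch), treating the two bits the same would mean reporting both
emulated features unconditionally in __do_cpuid_func_emulated():

	case 1:
		entry->ecx = F(MOVBE);
		++array->nent;
		break;
	case 7:
		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
		entry->eax = 0;
		entry->ecx = F(RDPID);	/* always reported as emulated, like MOVBE */
		++array->nent;
		break;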

Paolo


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way
  2021-05-04 17:17 ` [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way Sean Christopherson
@ 2021-05-05  8:49   ` Paolo Bonzini
  2021-05-05 15:36     ` Sean Christopherson
  2021-05-10  8:26   ` Maxim Levitsky
  1 sibling, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2021-05-05  8:49 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On 04/05/21 19:17, Sean Christopherson wrote:
> Tag TSX_CTRL as not needing to be loaded when RTM isn't supported in the
> host.  Crushing the write mask to '0' has the same effect, but requires
> more mental gymnastics to understand.

This doesn't explain _why_ this is now possible.  What about:

Now that user return MSRs is always present in the list, we don't have
the problem that the TSX_CTRL MSR needs a slot vmx->guest_uret_msrs even
if RTM is not supported in the host (and therefore there is nothing to
enable).  Thus we can simply tag TSX_CTRL as not needing to be loaded
instead of crushing the write mask to '0'.  That has the same effect,
but requires less mental gymnastics to understand.

Paolo

> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/vmx/vmx.c | 22 ++++++++++------------
>   1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 4b432d2bbd06..7a53568b34fc 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1771,7 +1771,13 @@ static void setup_msrs(struct vcpu_vmx *vmx)
>   			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
>   			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
>   
> -	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, true);
> +	/*
> +	 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations of new
> +	 * kernel and old userspace.  If those guests run on a tsx=off host, do
> +	 * allow guests to use TSX_CTRL, but don't change the value in hardware
> +	 * so that TSX remains always disabled.
> +	 */
> +	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, boot_cpu_has(X86_FEATURE_RTM));
>   
>   	if (cpu_has_vmx_msr_bitmap())
>   		vmx_update_msr_bitmap(&vmx->vcpu);
> @@ -6919,23 +6925,15 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>   		vmx->guest_uret_msrs[i].data = 0;
>   		vmx->guest_uret_msrs[i].mask = -1ull;
>   	}
> -	tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
> -	if (tsx_ctrl) {
> +	if (boot_cpu_has(X86_FEATURE_RTM)) {
>   		/*
>   		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
>   		 * Keep the host value unchanged to avoid changing CPUID bits
>   		 * under the host kernel's feet.
> -		 *
> -		 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations
> -		 * of new kernel and old userspace.  If those guests run on a
> -		 * tsx=off host, do allow guests to use TSX_CTRL, but do not
> -		 * change the value on the host so that TSX remains always
> -		 * disabled.
>   		 */
> -		if (boot_cpu_has(X86_FEATURE_RTM))
> +		tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
> +		if (tsx_ctrl)
>   			vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
> -		else
> -			vmx->guest_uret_msrs[i].mask = 0;
>   	}
>   
>   	err = alloc_loaded_vmcs(&vmx->vmcs01);
> 


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups
  2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
                   ` (14 preceding siblings ...)
  2021-05-04 17:17 ` [PATCH 15/15] KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed Sean Christopherson
@ 2021-05-05  8:51 ` Paolo Bonzini
  15 siblings, 0 replies; 68+ messages in thread
From: Paolo Bonzini @ 2021-05-05  8:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On 04/05/21 19:17, Sean Christopherson wrote:
> This is a continuation of a less ambitious effort to unify MSR_TSC_AUX
> handling across SVM and VMX.  Reiji pointed out that MSR_TSC_AUX exists if
> RDTSCP *or* RDPID is supported, and things went downhill from there.
> 
> The first half of this series fixes a variety of RDTSCP and RDPID related
> bugs.
> 
> The second half of the series cleans up VMX's user return MSR framework
> and consolidates more of the uret logic into common x86.
> 
> The last two patches leverage the uret MSR cleanups to move MSR_TSC_AUX
> handling to common x86 and add sanity checks to guard against misreporting
> of RDPID and/or RDTSCP support.
> 
> This will conflict with my vCPU RESET/INIT cleanup series.  Feel free to
> punt the conflicts to me.
> 
> Other "fun" things to tackle:
> 
>   - The kernel proper also botches RDPID vs. RDTSCP, as MSR_TSC_AUX is
>     configured if RDTSCP is supported, but is consumed if RDPID is
>     supported.  I'll send this fix separately.
> 
>   - Commit 844d69c26d83 ("KVM: SVM: Delay restoration of host MSR_TSC_AUX
>     until return to userspace") unwittingly fixed a bug where KVM would
>     write MSR_TSC_AUX with the guest's value when svm->guest_state_loaded
>     is false, which could lead to running the host with the guest's value.
>     The bug only exists in 5.12 (maybe 5.11 too?), so crafting a fix for
>     stable won't be too awful.
> 
> Sean Christopherson (15):
>    KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is
>      unsupported
>    KVM: x86: Emulate RDPID only if RDTSCP is supported
>    KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
>    KVM: x86: Move RDPID emulation intercept to its own enum
>    KVM: VMX: Disable preemption when probing user return MSRs
>    KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in
>      host
>    KVM: x86: Add support for RDPID without RDTSCP
>    KVM: VMX: Configure list of user return MSRs at module init
>    KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting
>      list
>    KVM: VMX: Use common x86's uret MSR list as the one true list
>    KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way
>    KVM: x86: Export the number of uret MSRs to vendor modules
>    KVM: x86: Move uret MSR slot management to common x86
>    KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU
>      model
>    KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed
> 
>   arch/x86/include/asm/kvm_host.h |   9 +-
>   arch/x86/kvm/cpuid.c            |  18 ++-
>   arch/x86/kvm/emulate.c          |   2 +-
>   arch/x86/kvm/kvm_emulate.h      |   1 +
>   arch/x86/kvm/svm/svm.c          |  50 +++-----
>   arch/x86/kvm/vmx/vmx.c          | 217 ++++++++++++++++----------------
>   arch/x86/kvm/vmx/vmx.h          |  12 +-
>   arch/x86/kvm/x86.c              | 101 ++++++++++++---
>   8 files changed, 245 insertions(+), 165 deletions(-)
> 

Queued, thanks.

Paolo


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way
  2021-05-05  8:49   ` Paolo Bonzini
@ 2021-05-05 15:36     ` Sean Christopherson
  2021-05-05 15:50       ` Paolo Bonzini
  0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-05 15:36 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Wed, May 05, 2021, Paolo Bonzini wrote:
> On 04/05/21 19:17, Sean Christopherson wrote:
> > Tag TSX_CTRL as not needing to be loaded when RTM isn't supported in the
> > host.  Crushing the write mask to '0' has the same effect, but requires
> > more mental gymnastics to understand.
> 
> This doesn't explain _why_ this is now possible.  What about:
> 
> Now that user return MSRs is always present in the list, we don't have

User return MSRs aren't always present in the list; this series doesn't change
that behavior at all.

> the problem that the TSX_CTRL MSR needs a slot vmx->guest_uret_msrs even
> if RTM is not supported in the host (and therefore there is nothing to
> enable).  Thus we can simply tag TSX_CTRL as not needing to be loaded
> instead of crushing the write mask to '0'.

Unless I'm missing something, it would have been possible to give TSX_CTRL a
slot but not load it even before this refactoring, we just missed that approach
when handling the TSX_CTRL without HLE/RTM case.  Several other MSRs rely on
this behavior, notably the SYSCALL MSRs, which are present in the list so that
the guest can read/write the MSRs, but are loaded into hardware iff the guest
has enabled SYSCALL.
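
A tiny sketch of that pattern (illustrative only, using the helper signature
from patch 09): the MSR stays findable in the list so guest RDMSR/WRMSR keeps
working, but it is skipped when loading guest state into hardware:

	/* Present in vmx->guest_uret_msrs, but not loaded while EFER.SCE=0. */
	vmx_setup_uret_msr(vmx, MSR_STAR, false);	/* load_into_hardware = false */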

All that said, I certainly have no objection to writing a longer changelog.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way
  2021-05-05 15:36     ` Sean Christopherson
@ 2021-05-05 15:50       ` Paolo Bonzini
  0 siblings, 0 replies; 68+ messages in thread
From: Paolo Bonzini @ 2021-05-05 15:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On 05/05/21 17:36, Sean Christopherson wrote:
> On Wed, May 05, 2021, Paolo Bonzini wrote:
>> On 04/05/21 19:17, Sean Christopherson wrote:
>>> Tag TSX_CTRL as not needing to be loaded when RTM isn't supported in the
>>> host.  Crushing the write mask to '0' has the same effect, but requires
>>> more mental gymnastics to understand.
>>
>> This doesn't explain _why_ this is now possible.  What about:
>>
>> Now that user return MSRs is always present in the list, we don't have
> 
> User return MSRs aren't always present in the list; this series doesn't change
> that behavior at all.
> 
>> the problem that the TSX_CTRL MSR needs a slot vmx->guest_uret_msrs even
>> if RTM is not supported in the host (and therefore there is nothing to
>> enable).  Thus we can simply tag TSX_CTRL as not needing to be loaded
>> instead of crushing the write mask to '0'.
> 
> Unless I'm missing something, it would have been possible to give TSX_CTRL a
> slot but not load it even before this refactoring, we just missed that approach
> when handling the TSX_CTRL without HLE/RTM case.  Several other MSRs rely on
> this behavior, notably the SYSCALL MSRs, which are present in the list so that
> the guest can read/write the MSRs, but are loaded into hardware iff the guest
> has enabled SYSCALL.

You're right, it used to be done with vmx->nr_active_uret_msrs.

Paolo

> All that said, I certainly have no objection to writing a longer changelog.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list
  2021-05-04 17:17 ` [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list Sean Christopherson
@ 2021-05-08  3:31   ` Reiji Watanabe
  2021-05-10 16:43     ` Sean Christopherson
  2021-05-10  8:25   ` Maxim Levitsky
  1 sibling, 1 reply; 68+ messages in thread
From: Reiji Watanabe @ 2021-05-08  3:31 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

> -static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
> +static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr,
> +                              bool load_into_hardware)
>  {
> -       struct vmx_uret_msr tmp;
> -       int from, to;
> +       struct vmx_uret_msr *uret_msr;
>
> -       from = __vmx_find_uret_msr(vmx, msr);
> -       if (from < 0)
> +       uret_msr = vmx_find_uret_msr(vmx, msr);
> +       if (!uret_msr)
>                 return;
> -       to = vmx->nr_active_uret_msrs++;
>
> -       tmp = vmx->guest_uret_msrs[to];
> -       vmx->guest_uret_msrs[to] = vmx->guest_uret_msrs[from];
> -       vmx->guest_uret_msrs[from] = tmp;
> +       uret_msr->load_into_hardware = load_into_hardware;
>  }
>
>  /*
> @@ -1785,30 +1781,36 @@ static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
>   */
>  static void setup_msrs(struct vcpu_vmx *vmx)
>  {
> -       vmx->guest_uret_msrs_loaded = false;
> -       vmx->nr_active_uret_msrs = 0;
>  #ifdef CONFIG_X86_64
> +       bool load_syscall_msrs;
> +
>         /*
>          * The SYSCALL MSRs are only needed on long mode guests, and only
>          * when EFER.SCE is set.
>          */
> -       if (is_long_mode(&vmx->vcpu) && (vmx->vcpu.arch.efer & EFER_SCE)) {
> -               vmx_setup_uret_msr(vmx, MSR_STAR);
> -               vmx_setup_uret_msr(vmx, MSR_LSTAR);
> -               vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK);
> -       }
> +       load_syscall_msrs = is_long_mode(&vmx->vcpu) &&
> +                           (vmx->vcpu.arch.efer & EFER_SCE);
> +
> +       vmx_setup_uret_msr(vmx, MSR_STAR, load_syscall_msrs);
> +       vmx_setup_uret_msr(vmx, MSR_LSTAR, load_syscall_msrs);
> +       vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK, load_syscall_msrs);
>  #endif
> -       if (update_transition_efer(vmx))
> -               vmx_setup_uret_msr(vmx, MSR_EFER);
> +       vmx_setup_uret_msr(vmx, MSR_EFER, update_transition_efer(vmx));
>
> -       if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP)  ||
> -           guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID))
> -               vmx_setup_uret_msr(vmx, MSR_TSC_AUX);
> +       vmx_setup_uret_msr(vmx, MSR_TSC_AUX,
> +                          guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
> +                          guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));

Shouldn't vmx_setup_uret_msr(,MSR_TSC_AUX,) be called to update
the new flag load_into_hardware for MSR_TSC_AUX when CPUID
(X86_FEATURE_RDTSCP/X86_FEATURE_RDPID) of the vCPU is updated?
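
E.g., conceptually something like the below in vmx_vcpu_after_set_cpuid(),
though I haven't checked where the call would best live; this is only a sketch
of the concern, not a tested change:

	static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
	{
		struct vcpu_vmx *vmx = to_vmx(vcpu);
		...
		/*
		 * Re-evaluate which uret MSRs should be loaded into hardware
		 * now that the guest's RDTSCP/RDPID CPUID bits may have
		 * changed.
		 */
		setup_msrs(vmx);
	}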


Thanks,
Reiji

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported
  2021-05-04 17:17 ` [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported Sean Christopherson
  2021-05-04 17:37   ` Jim Mattson
  2021-05-05  3:04   ` Reiji Watanabe
@ 2021-05-10  8:03   ` Maxim Levitsky
  2 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:03 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Clear KVM's RDPID capability if the ENABLE_RDTSCP secondary exec control is
> unsupported.  Despite being enumerated in a separate CPUID flag, RDPID is
> bundled under the same VMCS control as RDTSCP and will #UD in VMX non-root
> if ENABLE_RDTSCP is not enabled.
> 
> Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 10b610fc7bbc..82404ee2520e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7377,9 +7377,11 @@ static __init void vmx_set_cpu_caps(void)
>  	if (!cpu_has_vmx_xsaves())
>  		kvm_cpu_cap_clear(X86_FEATURE_XSAVES);
>  
> -	/* CPUID 0x80000001 */
> -	if (!cpu_has_vmx_rdtscp())
> +	/* CPUID 0x80000001 and 0x7 (RDPID) */
> +	if (!cpu_has_vmx_rdtscp()) {
>  		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
> +		kvm_cpu_cap_clear(X86_FEATURE_RDPID);
> +	}
>  
>  	if (cpu_has_vmx_waitpkg())
>  		kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported
  2021-05-04 17:17 ` [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported Sean Christopherson
  2021-05-04 17:50   ` Jim Mattson
  2021-05-05  3:51   ` Reiji Watanabe
@ 2021-05-10  8:08   ` Maxim Levitsky
  2021-05-10 17:20     ` Sean Christopherson
  2 siblings, 1 reply; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:08 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Do not advertise emulation support for RDPID if RDTSCP is unsupported.
> RDPID emulation subtly relies on MSR_TSC_AUX to exist in hardware, as
> both vmx_get_msr() and svm_get_msr() will return an error if the MSR is
> unsupported, i.e. ctxt->ops->get_msr() will fail and the emulator will
> inject a #UD.
> 
> Note, RDPID emulation also relies on RDTSCP being enabled in the guest,
> but this is a KVM bug and will eventually be fixed.
> 
> Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index f765bf7a529c..c96f79c9fff2 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -637,7 +637,8 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
>  	case 7:
>  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>  		entry->eax = 0;
> -		entry->ecx = F(RDPID);
> +		if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
> +			entry->ecx = F(RDPID);
>  		++array->nent;
>  	default:
>  		break;

Just to make sure that I understand this correctly:

This is what I know:

Both RDTSCP and RDPID are instructions that read IA32_TSC_AUX
(and RDTSCP also reads the TSC).

Both instructions have their own CPUID bits (X86_FEATURE_RDPID, X86_FEATURE_RDTSCP)
If either of these CPUID bits is present, IA32_TSC_AUX should be supported.


RDPID is a newer feature, so for sanity's sake I can assume that a CPU will
usually either have neither feature, have only RDTSCP and IA32_TSC_AUX, or
have both RDTSCP and RDPID.

If not supported in hardware, KVM only emulates RDPID, as far as I can see.
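
For reference, the emulation in question is tiny; roughly the following (from
memory, so the details may be slightly off):

	static int em_rdpid(struct x86_emulate_ctxt *ctxt)
	{
		u64 tsc_aux = 0;

		/* If MSR_TSC_AUX doesn't exist, get_msr() fails and we #UD. */
		if (ctxt->ops->get_msr(ctxt, MSR_TSC_AUX, &tsc_aux))
			return emulate_ud(ctxt);
		ctxt->dst.val = tsc_aux;
		return X86EMUL_CONTINUE;
	}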

Why, by the way? Performance-wise, a guest that only wants IA32_TSC_AUX in
userspace is better off using RDTSCP and paying the penalty of saving/restoring
the unwanted registers than using RDPID and taking a VM exit.

My own guess for an answer to this question is that RDPID emulation is there
to aid migration from a host that does support RDPID to a host that doesn't.

Having said all that, assuming we don't want to emulate RDTSCP as well when it
is not supported, this patch does make sense.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 21:58         ` Jim Mattson
@ 2021-05-10  8:08           ` Maxim Levitsky
  2021-05-10 16:56             ` Sean Christopherson
  0 siblings, 1 reply; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:08 UTC (permalink / raw)
  To: Jim Mattson, Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 14:58 -0700, Jim Mattson wrote:
> On Tue, May 4, 2021 at 2:57 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > On 04/05/21 23:53, Sean Christopherson wrote:
> > > > Does the right thing happen here if the vCPU is in guest mode when
> > > > userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> > > > off?
> > > I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> > > you specifically mean running in L2?
> > > 
> > 
> > Guest mode should mean L2.
> > 
> > (I wonder if we should have a capability that says "KVM_SET_CPUID2 can
> > only be called prior to KVM_RUN").
> 
> It would certainly make it easier to reason about potential security issues.
> 
I vote for this too.
Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-04 17:17 ` [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest Sean Christopherson
  2021-05-04 21:45   ` Jim Mattson
  2021-05-05  4:26   ` Reiji Watanabe
@ 2021-05-10  8:08   ` Maxim Levitsky
  2 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:08 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest.
> 
> Note, SVM does not support intercepting RDPID.  Unlike VMX's
> ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID.  This
> is a benign virtualization hole as the host kernel (incorrectly) sets
> MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX
> into hardware if RDTSCP is supported in the host, i.e. KVM will not leak
> the host's MSR_TSC_AUX to the guest.
> 
> But, when the kernel bug is fixed, KVM will start leaking the host's
> MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available
> for whatever reason.  This leak will be remedied in a future commit.
> 
> Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/svm.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index a7271f31df47..8f2b184270c0 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1100,7 +1100,9 @@ static u64 svm_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
>  	return svm->vmcb->control.tsc_offset;
>  }
>  
> -static void svm_check_invpcid(struct vcpu_svm *svm)
> +/* Evaluate instruction intercepts that depend on guest CPUID features. */
> +static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
> +					      struct vcpu_svm *svm)
>  {
>  	/*
>  	 * Intercept INVPCID if shadow paging is enabled to sync/free shadow
> @@ -1113,6 +1115,13 @@ static void svm_check_invpcid(struct vcpu_svm *svm)
>  		else
>  			svm_clr_intercept(svm, INTERCEPT_INVPCID);
>  	}
> +
> +	if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP)) {
> +		if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> +			svm_clr_intercept(svm, INTERCEPT_RDTSCP);
> +		else
> +			svm_set_intercept(svm, INTERCEPT_RDTSCP);
> +	}
>  }
>  
>  static void init_vmcb(struct kvm_vcpu *vcpu)
> @@ -1248,7 +1257,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>  		svm_clr_intercept(svm, INTERCEPT_PAUSE);
>  	}
>  
> -	svm_check_invpcid(svm);
> +	svm_recalc_instruction_intercepts(vcpu, svm);
>  
>  	/*
>  	 * If the host supports V_SPEC_CTRL then disable the interception
> @@ -3084,6 +3093,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>  	[SVM_EXIT_STGI]				= stgi_interception,
>  	[SVM_EXIT_CLGI]				= clgi_interception,
>  	[SVM_EXIT_SKINIT]			= skinit_interception,
> +	[SVM_EXIT_RDTSCP]			= kvm_handle_invalid_op,
>  	[SVM_EXIT_WBINVD]                       = kvm_emulate_wbinvd,
>  	[SVM_EXIT_MONITOR]			= kvm_emulate_monitor,
>  	[SVM_EXIT_MWAIT]			= kvm_emulate_mwait,
> @@ -4007,8 +4017,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
>  			     guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
>  
> -	/* Check again if INVPCID interception if required */
> -	svm_check_invpcid(svm);
> +	svm_recalc_instruction_intercepts(vcpu, svm);
>  
>  	/* For sev guests, the memory encryption bit is not reserved in CR3.  */
>  	if (sev_guest(vcpu->kvm)) {
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum
  2021-05-04 17:17 ` [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum Sean Christopherson
  2021-05-04 23:24   ` Jim Mattson
@ 2021-05-10  8:14   ` Maxim Levitsky
  1 sibling, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:14 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Add a dedicated intercept enum for RDPID instead of piggybacking RDTSCP.
> Unlike VMX's ENABLE_RDTSCP, RDPID is not bound to SVM's RDTSCP intercept.
> 
> Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/emulate.c     | 2 +-
>  arch/x86/kvm/kvm_emulate.h | 1 +
>  arch/x86/kvm/vmx/vmx.c     | 3 ++-
>  3 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index abd9a4db11a8..8fc71e70857d 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -4502,7 +4502,7 @@ static const struct opcode group8[] = {
>   * from the register case of group9.
>   */
>  static const struct gprefix pfx_0f_c7_7 = {
> -	N, N, N, II(DstMem | ModRM | Op3264 | EmulateOnUD, em_rdpid, rdtscp),
> +	N, N, N, II(DstMem | ModRM | Op3264 | EmulateOnUD, em_rdpid, rdpid),
>  };
>  
>  
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 0d359115429a..f016838faedd 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -468,6 +468,7 @@ enum x86_intercept {
>  	x86_intercept_clgi,
>  	x86_intercept_skinit,
>  	x86_intercept_rdtscp,
> +	x86_intercept_rdpid,
>  	x86_intercept_icebp,
>  	x86_intercept_wbinvd,
>  	x86_intercept_monitor,
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 82404ee2520e..99591e523b47 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7437,8 +7437,9 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu,
>  	/*
>  	 * RDPID causes #UD if disabled through secondary execution controls.
>  	 * Because it is marked as EmulateOnUD, we need to intercept it here.
> +	 * Note, RDPID is hidden behind ENABLE_RDTSCP.
>  	 */
> -	case x86_intercept_rdtscp:
> +	case x86_intercept_rdpid:
Shouldn't this path still handle the x86_intercept_rdtscp as I described below,
or should we remove it from the SVM side as well?

>  		if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_RDTSCP)) {
>  			exception->vector = UD_VECTOR;
>  			exception->error_code_valid = false;

I have a possibly unrelated question that caught my eye:
I see this:

	DIP(SrcNone, rdtscp, check_rdtsc),

As far as I can see, this means that if a nested guest executes rdtscp and L1
intercepts it, then we will emulate the rdtscp by doing a nested VM exit, but
if we emulate rdtscp for L1, we will fail since there is no .execute callback.

Is this intentional? As I understand it, at least in theory the emulator can be called
on any instruction due to things like lack of unrestricted guest, and/or emulating an
instruction on page fault (although the latter is usually done by re-executing the instruction)).

I know that the x86 emulator is far from complete for such cases, but I do
wonder why rdtscp behaves differently in the nested vs. the non-nested case.

So this patch (since it removes the x86_intercept_rdtscp handling from VMX)
should break rdtscp emulation for a nested guest on VMX, although that path is
probably not used anyway and should be removed.

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs
  2021-05-04 17:17 ` [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs Sean Christopherson
  2021-05-04 23:36   ` Jim Mattson
@ 2021-05-10  8:18   ` Maxim Levitsky
  1 sibling, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:18 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Disable preemption when probing a user return MSR via RDSMR/WRMSR.  If
> the MSR holds a different value per logical CPU, the WRMSR could corrupt
> the host's value if KVM is preempted between the RDMSR and WRMSR, and
> then rescheduled on a different CPU.
> 
> Opportunistically land the helper in common x86, SVM will use the helper
> in a future commit.
> 
> Fixes: 4be534102624 ("KVM: VMX: Initialize vmx->guest_msrs[] right after allocation")
> Cc: stable@vger.kernel.org
> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/vmx/vmx.c          |  5 +----
>  arch/x86/kvm/x86.c              | 16 ++++++++++++++++
>  3 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 3e5fc80a35c8..a02c9bf3f7f1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1778,6 +1778,7 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
>  		    unsigned long icr, int op_64_bit);
>  
>  void kvm_define_user_return_msr(unsigned index, u32 msr);
> +int kvm_probe_user_return_msr(u32 msr);
>  int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
>  
>  u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 99591e523b47..990ee339a05f 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6914,12 +6914,9 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  
>  	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
>  		u32 index = vmx_uret_msrs_list[i];
> -		u32 data_low, data_high;
>  		int j = vmx->nr_uret_msrs;
>  
> -		if (rdmsr_safe(index, &data_low, &data_high) < 0)
> -			continue;
> -		if (wrmsr_safe(index, data_low, data_high) < 0)
> +		if (kvm_probe_user_return_msr(index))
>  			continue;
>  
>  		vmx->guest_uret_msrs[j].slot = i;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3bf52ba5f2bb..e304447be42d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -339,6 +339,22 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
>  	}
>  }
>  
> +int kvm_probe_user_return_msr(u32 msr)
> +{
> +	u64 val;
> +	int ret;
> +
> +	preempt_disable();
> +	ret = rdmsrl_safe(msr, &val);
> +	if (ret)
> +		goto out;
> +	ret = wrmsrl_safe(msr, val);
> +out:
> +	preempt_enable();
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(kvm_probe_user_return_msr);
> +
>  void kvm_define_user_return_msr(unsigned slot, u32 msr)
>  {
>  	BUG_ON(slot >= KVM_MAX_NR_USER_RETURN_MSRS);

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

One note though: every time we probe for an MSR's existence by
checking for a #GP on write, we risk getting nonsense results
if L1 has ignore_msrs=1.

Thus, if it is possible to use CPUID instead,
that would be preferred.
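
Purely to illustrate the idea (I haven't audited whether every uret MSR has a
matching architectural feature bit, so treat this as a sketch only):

	/*
	 * Hypothetical: trust the architectural CPUID bits instead of a #GP
	 * probe, which L1's ignore_msrs=1 can silently defeat.
	 */
	if (boot_cpu_has(X86_FEATURE_RDTSCP) || boot_cpu_has(X86_FEATURE_RDPID))
		kvm_define_user_return_msr(vmx_nr_uret_msrs++, MSR_TSC_AUX);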

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 06/15] KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host
  2021-05-04 17:17 ` [PATCH 06/15] KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host Sean Christopherson
@ 2021-05-10  8:20   ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:20 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Probe MSR_TSC_AUX whether or not RDTSCP is supported in the host, and
> if probing succeeds, load the guest's MSR_TSC_AUX into hardware prior to
> VMRUN.  Because SVM doesn't support interception of RDPID, RDPID cannot
> be disallowed in the guest (without resorting to binary translation).
> Leaving the host's MSR_TSC_AUX in hardware would leak the host's value to
> the guest if RDTSCP is not supported.
> 
> Note, there is also a kernel bug that prevents leaking the host's value.
> The host kernel initializes MSR_TSC_AUX if and only if RDTSCP is
> supported, even though the vDSO usage consumes MSR_TSC_AUX via RDPID.
> I.e. if RDTSCP is not supported, there is no host value to leak.  But,
> if/when the host kernel bug is fixed, KVM would start leaking MSR_TSC_AUX
> in the case where hardware supports RDPID but RDTSCP is unavailable for
> whatever reason.
> 
> Probing MSR_TSC_AUX will also allow consolidating the probe and define
> logic in common x86, and will make it simpler to condition the existence
> of MSR_TSC_AUX (from the guest's perspective) on RDTSCP *or* RDPID.
> 
> Fixes: AMD CPUs
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/svm.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8f2b184270c0..b3153d40cc4d 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -212,7 +212,7 @@ DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
>   * RDTSCP and RDPID are not used in the kernel, specifically to allow KVM to
>   * defer the restoration of TSC_AUX until the CPU returns to userspace.
>   */
> -#define TSC_AUX_URET_SLOT	0
> +static int tsc_aux_uret_slot __read_mostly = -1;
>  
>  static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
>  
> @@ -959,8 +959,10 @@ static __init int svm_hardware_setup(void)
>  		kvm_tsc_scaling_ratio_frac_bits = 32;
>  	}
>  
> -	if (boot_cpu_has(X86_FEATURE_RDTSCP))
> -		kvm_define_user_return_msr(TSC_AUX_URET_SLOT, MSR_TSC_AUX);
> +	if (!kvm_probe_user_return_msr(MSR_TSC_AUX)) {
> +		tsc_aux_uret_slot = 0;
> +		kvm_define_user_return_msr(tsc_aux_uret_slot, MSR_TSC_AUX);
> +	}
>  
>  	/* Check for pause filtering support */
>  	if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
> @@ -1454,8 +1456,8 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
>  		}
>  	}
>  
> -	if (static_cpu_has(X86_FEATURE_RDTSCP))
> -		kvm_set_user_return_msr(TSC_AUX_URET_SLOT, svm->tsc_aux, -1ull);
> +	if (likely(tsc_aux_uret_slot >= 0))
> +		kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
>  
>  	svm->guest_state_loaded = true;
>  }
> @@ -2664,7 +2666,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
>  		break;
>  	case MSR_TSC_AUX:
> -		if (!boot_cpu_has(X86_FEATURE_RDTSCP))
> +		if (tsc_aux_uret_slot < 0)
>  			return 1;
>  		if (!msr_info->host_initiated &&
>  		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> @@ -2885,7 +2887,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		svm->sysenter_esp_hi = guest_cpuid_is_intel(vcpu) ? (data >> 32) : 0;
>  		break;
>  	case MSR_TSC_AUX:
> -		if (!boot_cpu_has(X86_FEATURE_RDTSCP))
> +		if (tsc_aux_uret_slot < 0)
>  			return 1;
>  
>  		if (!msr->host_initiated &&
> @@ -2908,7 +2910,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		 * guest via direct_access_msrs, and switch it via user return.
>  		 */
>  		preempt_disable();
> -		r = kvm_set_user_return_msr(TSC_AUX_URET_SLOT, data, -1ull);
> +		r = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
>  		preempt_enable();
>  		if (r)
>  			return 1;

If L1 has ignore_msrs=1, then we will end up writing IA32_TSC_AUX for nothing,
but this shouldn't be that big of a deal, so:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 07/15] KVM: x86: Add support for RDPID without RDTSCP
  2021-05-04 17:17 ` [PATCH 07/15] KVM: x86: Add support for RDPID without RDTSCP Sean Christopherson
@ 2021-05-10  8:20   ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:20 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Allow userspace to enable RDPID for a guest without also enabling RDTSCP.
> Aside from checking for RDPID support in the obvious flows, VMX also needs
> to set ENABLE_RDTSCP=1 when RDPID is exposed.
> 
> For the record, there is no known scenario where enabling RDPID without
> RDTSCP is desirable.  But, both AMD and Intel architectures allow for the
> condition, i.e. this is purely to make KVM more architecturally accurate.
> 
> Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
> Cc: stable@vger.kernel.org
> Reported-by: Reiji Watanabe <reijiw@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/svm.c |  6 ++++--
>  arch/x86/kvm/vmx/vmx.c | 27 +++++++++++++++++++++++----
>  arch/x86/kvm/x86.c     |  3 ++-
>  3 files changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index b3153d40cc4d..231b9650d864 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2669,7 +2669,8 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		if (tsc_aux_uret_slot < 0)
>  			return 1;
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
>  			return 1;
>  		msr_info->data = svm->tsc_aux;
>  		break;
> @@ -2891,7 +2892,8 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  			return 1;
>  
>  		if (!msr->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
>  			return 1;
>  
>  		/*
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 990ee339a05f..42e4bbaa299a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1788,7 +1788,8 @@ static void setup_msrs(struct vcpu_vmx *vmx)
>  	if (update_transition_efer(vmx))
>  		vmx_setup_uret_msr(vmx, MSR_EFER);
>  
> -	if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP))
> +	if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP)  ||
> +	    guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID))
>  		vmx_setup_uret_msr(vmx, MSR_TSC_AUX);
>  
>  	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL);
> @@ -1994,7 +1995,8 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		break;
>  	case MSR_TSC_AUX:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
>  			return 1;
>  		goto find_uret_msr;
>  	case MSR_IA32_DEBUGCTLMSR:
> @@ -2314,7 +2316,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		break;
>  	case MSR_TSC_AUX:
>  		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
>  			return 1;
>  		/* Check reserved bit, higher 32 bits should be zero */
>  		if ((data >> 32) != 0)
> @@ -4368,7 +4371,23 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
>  						  xsaves_enabled, false);
>  	}
>  
> -	vmx_adjust_sec_exec_feature(vmx, &exec_control, rdtscp, RDTSCP);
> +	/*
> +	 * RDPID is also gated by ENABLE_RDTSCP, turn on the control if either
> +	 * feature is exposed to the guest.  This creates a virtualization hole
> +	 * if both are supported in hardware but only one is exposed to the
> +	 * guest, but letting the guest execute RDTSCP or RDPID when either one
> +	 * is advertised is preferable to emulating the advertised instruction
> +	 * in KVM on #UD, and obviously better than incorrectly injecting #UD.
> +	 */
> +	if (cpu_has_vmx_rdtscp()) {
> +		bool rdpid_or_rdtscp_enabled =
> +			guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
> +			guest_cpuid_has(vcpu, X86_FEATURE_RDPID);
> +
> +		vmx_adjust_secondary_exec_control(vmx, &exec_control,
> +						  SECONDARY_EXEC_ENABLE_RDTSCP,
> +						  rdpid_or_rdtscp_enabled, false);
> +	}
>  	vmx_adjust_sec_exec_feature(vmx, &exec_control, invpcid, INVPCID);
>  
>  	vmx_adjust_sec_exec_exiting(vmx, &exec_control, rdrand, RDRAND);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e304447be42d..b4516d303413 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5978,7 +5978,8 @@ static void kvm_init_msr_list(void)
>  				continue;
>  			break;
>  		case MSR_TSC_AUX:
> -			if (!kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
> +			if (!kvm_cpu_cap_has(X86_FEATURE_RDTSCP) &&
> +			    !kvm_cpu_cap_has(X86_FEATURE_RDPID))
>  				continue;
>  			break;
>  		case MSR_IA32_UMWAIT_CONTROL:

Reviewed-by : Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init
  2021-05-04 17:17 ` [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init Sean Christopherson
@ 2021-05-10  8:23   ` Maxim Levitsky
  2021-05-10 15:13     ` Sean Christopherson
  0 siblings, 1 reply; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:23 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Configure the list of user return MSRs that are actually supported at
> module init instead of reprobing the list of possible MSRs every time a
> vCPU is created.  Curating the list on a per-vCPU basis is pointless; KVM
> is completely hosed if the set of supported MSRs changes after module init,
> or if the set of MSRs differs per physical CPU.
> 
> The per-vCPU lists also increase complexity (see __vmx_find_uret_msr()) and
> creates corner cases that _should_ be impossible, but theoretically exist
> in KVM, e.g. advertising RDTSCP to userspace without actually being able to
> virtualize RDTSCP if probing MSR_TSC_AUX fails.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 61 ++++++++++++++++++++++++++++--------------
>  arch/x86/kvm/vmx/vmx.h | 10 ++++++-
>  2 files changed, 50 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 42e4bbaa299a..68454b0de2b1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -461,7 +461,7 @@ static unsigned long host_idt_base;
>   * support this emulation, IA32_STAR must always be included in
>   * vmx_uret_msrs_list[], even in i386 builds.
>   */
> -static const u32 vmx_uret_msrs_list[] = {
> +static u32 vmx_uret_msrs_list[] = {
>  #ifdef CONFIG_X86_64
>  	MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
>  #endif
> @@ -469,6 +469,12 @@ static const u32 vmx_uret_msrs_list[] = {
>  	MSR_IA32_TSX_CTRL,
>  };
>  
> +/*
> + * Number of user return MSRs that are actually supported in hardware.
> + * vmx_uret_msrs_list is modified when KVM is loaded to drop unsupported MSRs.
> + */
> +static int vmx_nr_uret_msrs;
> +
>  #if IS_ENABLED(CONFIG_HYPERV)
>  static bool __read_mostly enlightened_vmcs = true;
>  module_param(enlightened_vmcs, bool, 0444);
> @@ -700,9 +706,16 @@ static inline int __vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
>  {
>  	int i;
>  
> -	for (i = 0; i < vmx->nr_uret_msrs; ++i)
> +	/*
> +	 * Note, vmx->guest_uret_msrs is the same size as vmx_uret_msrs_list,
> +	 * but is ordered differently.  The MSR is matched against the list of
> +	 * supported uret MSRs using "slot", but the index that is returned is
> +	 * the index into guest_uret_msrs.
> +	 */
> +	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
>  		if (vmx_uret_msrs_list[vmx->guest_uret_msrs[i].slot] == msr)
>  			return i;
> +	}
>  	return -1;
>  }
>  
> @@ -6929,18 +6942,10 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  			goto free_vpid;
>  	}
>  
> -	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
> +	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> +		vmx->guest_uret_msrs[i].data = 0;
>  
> -	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
> -		u32 index = vmx_uret_msrs_list[i];
> -		int j = vmx->nr_uret_msrs;
> -
> -		if (kvm_probe_user_return_msr(index))
> -			continue;
> -
> -		vmx->guest_uret_msrs[j].slot = i;
I don't see anything initializing the .slot field after this patch.
The code is removed later, which masks this bug,
but for bisection's sake I think this patch
should still be fixed.
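
I.e., presumably something like the below is needed in this patch to keep
bisection working (untested; I'm assuming guest_uret_msrs[] now mirrors
vmx_uret_msrs_list index-for-index):

	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
		vmx->guest_uret_msrs[i].slot = i;
		vmx->guest_uret_msrs[i].data = 0;
		...
	}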

Other than this minor bug:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



> -		vmx->guest_uret_msrs[j].data = 0;
> -		switch (index) {
> +		switch (vmx_uret_msrs_list[i]) {
>  		case MSR_IA32_TSX_CTRL:
>  			/*
>  			 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
> @@ -6954,15 +6959,14 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  			 * host so that TSX remains always disabled.
>  			 */
>  			if (boot_cpu_has(X86_FEATURE_RTM))
> -				vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
> +				vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
>  			else
> -				vmx->guest_uret_msrs[j].mask = 0;
> +				vmx->guest_uret_msrs[i].mask = 0;
>  			break;
>  		default:
> -			vmx->guest_uret_msrs[j].mask = -1ull;
> +			vmx->guest_uret_msrs[i].mask = -1ull;
>  			break;
>  		}
> -		++vmx->nr_uret_msrs;
>  	}
>  
>  	err = alloc_loaded_vmcs(&vmx->vmcs01);
> @@ -7821,17 +7825,34 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>  };
>  
> +static __init void vmx_setup_user_return_msrs(void)
> +{
> +	u32 msr;
> +	int i;
> +
> +	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
> +
> +	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
> +		msr = vmx_uret_msrs_list[i];
> +
> +		if (kvm_probe_user_return_msr(msr))
> +			continue;
> +
> +		kvm_define_user_return_msr(vmx_nr_uret_msrs, msr);
> +		vmx_uret_msrs_list[vmx_nr_uret_msrs++] = msr;
> +	}
> +}
> +
>  static __init int hardware_setup(void)
>  {
>  	unsigned long host_bndcfgs;
>  	struct desc_ptr dt;
> -	int r, i, ept_lpage_level;
> +	int r, ept_lpage_level;
>  
>  	store_idt(&dt);
>  	host_idt_base = dt.address;
>  
> -	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i)
> -		kvm_define_user_return_msr(i, vmx_uret_msrs_list[i]);
> +	vmx_setup_user_return_msrs();
>  
>  	if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
>  		return -EIO;
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 008cb87ff088..d71ed8b425c5 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -245,8 +245,16 @@ struct vcpu_vmx {
>  	u32                   idt_vectoring_info;
>  	ulong                 rflags;
>  
> +	/*
> +	 * User return MSRs are always emulated when enabled in the guest, but
> +	 * only loaded into hardware when necessary, e.g. SYSCALL #UDs outside
> +	 * of 64-bit mode or if EFER.SCE=1, thus the SYSCALL MSRs don't need to
> +	 * be loaded into hardware if those conditions aren't met.
> +	 * nr_active_uret_msrs tracks the number of MSRs that need to be loaded
> +	 * into hardware when running the guest.  guest_uret_msrs[] is resorted
> +	 * whenever the number of "active" uret MSRs is modified.
> +	 */
>  	struct vmx_uret_msr   guest_uret_msrs[MAX_NR_USER_RETURN_MSRS];
> -	int                   nr_uret_msrs;
>  	int                   nr_active_uret_msrs;
>  	bool                  guest_uret_msrs_loaded;
>  #ifdef CONFIG_X86_64







^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list
  2021-05-04 17:17 ` [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list Sean Christopherson
  2021-05-08  3:31   ` Reiji Watanabe
@ 2021-05-10  8:25   ` Maxim Levitsky
  1 sibling, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:25 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Explicitly flag a uret MSR as needing to be loaded into hardware instead of
> resorting the list of "active" MSRs and tracking how many MSRs in total
> need to be loaded.  The only benefit to sorting the list is that the loop
> to load MSRs during vmx_prepare_switch_to_guest() doesn't need to iterate
> over all supported uret MSRs, only those that are active.  But that is a
> pointless optimization, as the most common case, running a 64-bit guest,
> will load the vast majority of MSRs.  Not to mention that a single WRMSR is
> far more expensive than iterating over the list.
> 
> Providing a stable list order obviates the need to track a given MSR's
> "slot" in the per-CPU list of user return MSRs; all lists simply use the
> same ordering.  Future patches will take advantage of the stable order to
> further simplify the related code.

> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 80 ++++++++++++++++++++++--------------------
>  arch/x86/kvm/vmx/vmx.h |  2 +-
>  2 files changed, 42 insertions(+), 40 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 68454b0de2b1..6caabcd5037e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -458,8 +458,9 @@ static unsigned long host_idt_base;
>   * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
>   * will emulate SYSCALL in legacy mode if the vendor string in guest
>   * CPUID.0:{EBX,ECX,EDX} is "AuthenticAMD" or "AMDisbetter!" To
> - * support this emulation, IA32_STAR must always be included in
> - * vmx_uret_msrs_list[], even in i386 builds.
> + * support this emulation, MSR_STAR is included in the list for i386,
> + * but is never loaded into hardware.  MSR_CSTAR is also never loaded
> + * into hardware and is here purely for emulation purposes.
>   */
>  static u32 vmx_uret_msrs_list[] = {
>  #ifdef CONFIG_X86_64
> @@ -702,18 +703,12 @@ static bool is_valid_passthrough_msr(u32 msr)
>  	return r;
>  }
>  
> -static inline int __vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
> +static inline int __vmx_find_uret_msr(u32 msr)
>  {
>  	int i;
>  
> -	/*
> -	 * Note, vmx->guest_uret_msrs is the same size as vmx_uret_msrs_list,
> -	 * but is ordered differently.  The MSR is matched against the list of
> -	 * supported uret MSRs using "slot", but the index that is returned is
> -	 * the index into guest_uret_msrs.
> -	 */
>  	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> -		if (vmx_uret_msrs_list[vmx->guest_uret_msrs[i].slot] == msr)
> +		if (vmx_uret_msrs_list[i] == msr)
>  			return i;
>  	}
>  	return -1;
> @@ -723,7 +718,7 @@ struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
>  {
>  	int i;
>  
> -	i = __vmx_find_uret_msr(vmx, msr);
> +	i = __vmx_find_uret_msr(msr);
>  	if (i >= 0)
>  		return &vmx->guest_uret_msrs[i];
>  	return NULL;
> @@ -732,13 +727,14 @@ struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
>  static int vmx_set_guest_uret_msr(struct vcpu_vmx *vmx,
>  				  struct vmx_uret_msr *msr, u64 data)
>  {
> +	unsigned int slot = msr - vmx->guest_uret_msrs;
>  	int ret = 0;
>  
>  	u64 old_msr_data = msr->data;
>  	msr->data = data;
> -	if (msr - vmx->guest_uret_msrs < vmx->nr_active_uret_msrs) {
> +	if (msr->load_into_hardware) {
>  		preempt_disable();
> -		ret = kvm_set_user_return_msr(msr->slot, msr->data, msr->mask);
> +		ret = kvm_set_user_return_msr(slot, msr->data, msr->mask);
>  		preempt_enable();
>  		if (ret)
>  			msr->data = old_msr_data;
> @@ -1090,7 +1086,7 @@ static bool update_transition_efer(struct vcpu_vmx *vmx)
>  		return false;
>  	}
>  
> -	i = __vmx_find_uret_msr(vmx, MSR_EFER);
> +	i = __vmx_find_uret_msr(MSR_EFER);
>  	if (i < 0)
>  		return false;
>  
> @@ -1252,11 +1248,14 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  	 */
>  	if (!vmx->guest_uret_msrs_loaded) {
>  		vmx->guest_uret_msrs_loaded = true;
> -		for (i = 0; i < vmx->nr_active_uret_msrs; ++i)
> -			kvm_set_user_return_msr(vmx->guest_uret_msrs[i].slot,
> +		for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> +			if (!vmx->guest_uret_msrs[i].load_into_hardware)
> +				continue;
> +
> +			kvm_set_user_return_msr(i,
>  						vmx->guest_uret_msrs[i].data,
>  						vmx->guest_uret_msrs[i].mask);
> -
> +		}
>  	}
>  
>      	if (vmx->nested.need_vmcs12_to_shadow_sync)
> @@ -1763,19 +1762,16 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
>  	vmx_clear_hlt(vcpu);
>  }
>  
> -static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
> +static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr,
> +			       bool load_into_hardware)
>  {
> -	struct vmx_uret_msr tmp;
> -	int from, to;
> +	struct vmx_uret_msr *uret_msr;
>  
> -	from = __vmx_find_uret_msr(vmx, msr);
> -	if (from < 0)
> +	uret_msr = vmx_find_uret_msr(vmx, msr);
> +	if (!uret_msr)
>  		return;
> -	to = vmx->nr_active_uret_msrs++;
>  
> -	tmp = vmx->guest_uret_msrs[to];
> -	vmx->guest_uret_msrs[to] = vmx->guest_uret_msrs[from];
> -	vmx->guest_uret_msrs[from] = tmp;
> +	uret_msr->load_into_hardware = load_into_hardware;
>  }
>  
>  /*
> @@ -1785,30 +1781,36 @@ static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
>   */
>  static void setup_msrs(struct vcpu_vmx *vmx)
>  {
> -	vmx->guest_uret_msrs_loaded = false;
> -	vmx->nr_active_uret_msrs = 0;
>  #ifdef CONFIG_X86_64
> +	bool load_syscall_msrs;
> +
>  	/*
>  	 * The SYSCALL MSRs are only needed on long mode guests, and only
>  	 * when EFER.SCE is set.
>  	 */
> -	if (is_long_mode(&vmx->vcpu) && (vmx->vcpu.arch.efer & EFER_SCE)) {
> -		vmx_setup_uret_msr(vmx, MSR_STAR);
> -		vmx_setup_uret_msr(vmx, MSR_LSTAR);
> -		vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK);
> -	}
> +	load_syscall_msrs = is_long_mode(&vmx->vcpu) &&
> +			    (vmx->vcpu.arch.efer & EFER_SCE);
> +
> +	vmx_setup_uret_msr(vmx, MSR_STAR, load_syscall_msrs);
> +	vmx_setup_uret_msr(vmx, MSR_LSTAR, load_syscall_msrs);
> +	vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK, load_syscall_msrs);
>  #endif
> -	if (update_transition_efer(vmx))
> -		vmx_setup_uret_msr(vmx, MSR_EFER);
> +	vmx_setup_uret_msr(vmx, MSR_EFER, update_transition_efer(vmx));
>  
> -	if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP)  ||
> -	    guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID))
> -		vmx_setup_uret_msr(vmx, MSR_TSC_AUX);
> +	vmx_setup_uret_msr(vmx, MSR_TSC_AUX,
> +			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
> +			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
>  
> -	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL);
> +	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, true);
>  
>  	if (cpu_has_vmx_msr_bitmap())
>  		vmx_update_msr_bitmap(&vmx->vcpu);
> +
> +	/*
> +	 * The set of MSRs to load may have changed, reload MSRs before the
> +	 * next VM-Enter.
> +	 */
> +	vmx->guest_uret_msrs_loaded = false;
>  }
>  
>  static u64 vmx_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index d71ed8b425c5..16e4e457ba23 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -36,7 +36,7 @@ struct vmx_msrs {
>  };
>  
>  struct vmx_uret_msr {
> -	unsigned int slot; /* The MSR's slot in kvm_user_return_msrs. */
> +	bool load_into_hardware;
>  	u64 data;
>  	u64 mask;
>  };

This is a very welcome change; the old code was very complicated
to understand.

However, I still feel it would be nice to have a comment explaining
that vmx->guest_uret_msrs now follows the same order as the (eventually common)
uret MSR list, and is basically a parallel array that extends the
per-CPU 'user_return_msrs' and the global kvm_uret_msrs_list arrays.

In fact, why not fold vmx->guest_uret_msrs into the common x86 uret MSR list?
There is nothing VMX-specific in this list IMHO, and SVM could use it as well;
it already has 'svm->tsc_aux', which is effectively the 'data' field of a
'struct vmx_uret_msr'.
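
Roughly what I have in mind, purely as an untested sketch; the names below are
made up, not an existing API:

	/*
	 * Hypothetical vendor-neutral per-vCPU uret MSR state, shared by VMX
	 * and SVM, e.g. svm->tsc_aux would become the 'data' field of the
	 * MSR_TSC_AUX entry.
	 */
	struct kvm_uret_msr {
		bool	load_into_hardware;
		u64	data;
		u64	mask;
	};

	struct kvm_vcpu_arch {
		...
		struct kvm_uret_msr guest_uret_msrs[KVM_MAX_NR_USER_RETURN_MSRS];
	};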


Reviewed-by: Maxim Levitsky <mlevitsk@gmail.com>

Best regards,
	Maxim Levitsky






^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 10/15] KVM: VMX: Use common x86's uret MSR list as the one true list
  2021-05-04 17:17 ` [PATCH 10/15] KVM: VMX: Use common x86's uret MSR list as the one true list Sean Christopherson
@ 2021-05-10  8:25   ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:25 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Drop VMX's global list of user return MSRs now that VMX doesn't resort said
> list to isolate "active" MSRs, i.e. now that VMX's list and x86's list have
> the same MSRs in the same order.
> 
> In addition to eliminating the redundant list, this will also allow moving
> more of the list management into common x86.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/vmx/vmx.c          | 97 ++++++++++++++-------------------
>  arch/x86/kvm/x86.c              | 12 ++++
>  3 files changed, 53 insertions(+), 57 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index a02c9bf3f7f1..c9452472ed55 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1778,6 +1778,7 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
>  		    unsigned long icr, int op_64_bit);
>  
>  void kvm_define_user_return_msr(unsigned index, u32 msr);
> +int kvm_find_user_return_msr(u32 msr);
>  int kvm_probe_user_return_msr(u32 msr);
>  int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6caabcd5037e..4b432d2bbd06 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -454,26 +454,7 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx)
>  
>  static unsigned long host_idt_base;
>  
> -/*
> - * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
> - * will emulate SYSCALL in legacy mode if the vendor string in guest
> - * CPUID.0:{EBX,ECX,EDX} is "AuthenticAMD" or "AMDisbetter!" To
> - * support this emulation, MSR_STAR is included in the list for i386,
> - * but is never loaded into hardware.  MSR_CSTAR is also never loaded
> - * into hardware and is here purely for emulation purposes.
> - */
> -static u32 vmx_uret_msrs_list[] = {
> -#ifdef CONFIG_X86_64
> -	MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
> -#endif
> -	MSR_EFER, MSR_TSC_AUX, MSR_STAR,
> -	MSR_IA32_TSX_CTRL,
> -};
> -
> -/*
> - * Number of user return MSRs that are actually supported in hardware.
> - * vmx_uret_msrs_list is modified when KVM is loaded to drop unsupported MSRs.
> - */
> +/* Number of user return MSRs that are actually supported in hardware. */
>  static int vmx_nr_uret_msrs;
>  
>  #if IS_ENABLED(CONFIG_HYPERV)
> @@ -703,22 +684,11 @@ static bool is_valid_passthrough_msr(u32 msr)
>  	return r;
>  }
>  
> -static inline int __vmx_find_uret_msr(u32 msr)
> -{
> -	int i;
> -
> -	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> -		if (vmx_uret_msrs_list[i] == msr)
> -			return i;
> -	}
> -	return -1;
> -}
> -
>  struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr)
>  {
>  	int i;
>  
> -	i = __vmx_find_uret_msr(msr);
> +	i = kvm_find_user_return_msr(msr);
>  	if (i >= 0)
>  		return &vmx->guest_uret_msrs[i];
>  	return NULL;
> @@ -1086,7 +1056,7 @@ static bool update_transition_efer(struct vcpu_vmx *vmx)
>  		return false;
>  	}
>  
> -	i = __vmx_find_uret_msr(MSR_EFER);
> +	i = kvm_find_user_return_msr(MSR_EFER);
>  	if (i < 0)
>  		return false;
>  
> @@ -6922,6 +6892,7 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
>  
>  static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  {
> +	struct vmx_uret_msr *tsx_ctrl;
>  	struct vcpu_vmx *vmx;
>  	int i, cpu, err;
>  
> @@ -6946,29 +6917,25 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  
>  	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
>  		vmx->guest_uret_msrs[i].data = 0;
> -
> -		switch (vmx_uret_msrs_list[i]) {
> -		case MSR_IA32_TSX_CTRL:
> -			/*
> -			 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
> -			 * interception.  Keep the host value unchanged to avoid
> -			 * changing CPUID bits under the host kernel's feet.
> -			 *
> -			 * hle=0, rtm=0, tsx_ctrl=1 can be found with some
> -			 * combinations of new kernel and old userspace.  If
> -			 * those guests run on a tsx=off host, do allow guests
> -			 * to use TSX_CTRL, but do not change the value on the
> -			 * host so that TSX remains always disabled.
> -			 */
> -			if (boot_cpu_has(X86_FEATURE_RTM))
> -				vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
> -			else
> -				vmx->guest_uret_msrs[i].mask = 0;
> -			break;
> -		default:
> -			vmx->guest_uret_msrs[i].mask = -1ull;
> -			break;
> -		}
> +		vmx->guest_uret_msrs[i].mask = -1ull;
> +	}
> +	tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
> +	if (tsx_ctrl) {
> +		/*
> +		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
> +		 * Keep the host value unchanged to avoid changing CPUID bits
> +		 * under the host kernel's feet.
> +		 *
> +		 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations
> +		 * of new kernel and old userspace.  If those guests run on a
> +		 * tsx=off host, do allow guests to use TSX_CTRL, but do not
> +		 * change the value on the host so that TSX remains always
> +		 * disabled.
> +		 */
> +		if (boot_cpu_has(X86_FEATURE_RTM))
> +			vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
> +		else
> +			vmx->guest_uret_msrs[i].mask = 0;
>  	}
>  
>  	err = alloc_loaded_vmcs(&vmx->vmcs01);
> @@ -7829,6 +7796,22 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  
>  static __init void vmx_setup_user_return_msrs(void)
>  {
> +
> +	/*
> +	 * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
> +	 * will emulate SYSCALL in legacy mode if the vendor string in guest
> +	 * CPUID.0:{EBX,ECX,EDX} is "AuthenticAMD" or "AMDisbetter!" To
> +	 * support this emulation, MSR_STAR is included in the list for i386,
> +	 * but is never loaded into hardware.  MSR_CSTAR is also never loaded
> +	 * into hardware and is here purely for emulation purposes.
> +	 */
> +	const u32 vmx_uret_msrs_list[] = {
> +	#ifdef CONFIG_X86_64
> +		MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
> +	#endif
> +		MSR_EFER, MSR_TSC_AUX, MSR_STAR,
> +		MSR_IA32_TSX_CTRL,
> +	};
>  	u32 msr;
>  	int i;
>  
> @@ -7841,7 +7824,7 @@ static __init void vmx_setup_user_return_msrs(void)
>  			continue;
>  
>  		kvm_define_user_return_msr(vmx_nr_uret_msrs, msr);
> -		vmx_uret_msrs_list[vmx_nr_uret_msrs++] = msr;
> +		vmx_nr_uret_msrs++;
>  	}
>  }
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b4516d303413..90ef340565a4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -364,6 +364,18 @@ void kvm_define_user_return_msr(unsigned slot, u32 msr)
>  }
>  EXPORT_SYMBOL_GPL(kvm_define_user_return_msr);
>  
> +int kvm_find_user_return_msr(u32 msr)
> +{
> +	int i;
> +
> +	for (i = 0; i < user_return_msrs_global.nr; ++i) {
> +		if (user_return_msrs_global.msrs[i] == msr)
> +			return i;
> +	}
> +	return -1;
> +}
> +EXPORT_SYMBOL_GPL(kvm_find_user_return_msr);
> +
>  static void kvm_user_return_msr_cpu_online(void)
>  {
>  	unsigned int cpu = smp_processor_id();


Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky <mlevitsk@redhat.com>



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way
  2021-05-04 17:17 ` [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way Sean Christopherson
  2021-05-05  8:49   ` Paolo Bonzini
@ 2021-05-10  8:26   ` Maxim Levitsky
  1 sibling, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:26 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Tag TSX_CTRL as not needing to be loaded when RTM isn't supported in the
> host.  Crushing the write mask to '0' has the same effect, but requires
> more mental gymnastics to understand.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 4b432d2bbd06..7a53568b34fc 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1771,7 +1771,13 @@ static void setup_msrs(struct vcpu_vmx *vmx)
>  			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
>  			   guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
>  
> -	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, true);
> +	/*
> +	 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations of new
> +	 * kernel and old userspace.  If those guests run on a tsx=off host, do
> +	 * allow guests to use TSX_CTRL, but don't change the value in hardware
> +	 * so that TSX remains always disabled.
> +	 */
> +	vmx_setup_uret_msr(vmx, MSR_IA32_TSX_CTRL, boot_cpu_has(X86_FEATURE_RTM));
>  
>  	if (cpu_has_vmx_msr_bitmap())
>  		vmx_update_msr_bitmap(&vmx->vcpu);
> @@ -6919,23 +6925,15 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  		vmx->guest_uret_msrs[i].data = 0;
>  		vmx->guest_uret_msrs[i].mask = -1ull;
>  	}
> -	tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
> -	if (tsx_ctrl) {
> +	if (boot_cpu_has(X86_FEATURE_RTM)) {
>  		/*
>  		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
>  		 * Keep the host value unchanged to avoid changing CPUID bits
>  		 * under the host kernel's feet.
> -		 *
> -		 * hle=0, rtm=0, tsx_ctrl=1 can be found with some combinations
> -		 * of new kernel and old userspace.  If those guests run on a
> -		 * tsx=off host, do allow guests to use TSX_CTRL, but do not
> -		 * change the value on the host so that TSX remains always
> -		 * disabled.
>  		 */
> -		if (boot_cpu_has(X86_FEATURE_RTM))
> +		tsx_ctrl = vmx_find_uret_msr(vmx, MSR_IA32_TSX_CTRL);
> +		if (tsx_ctrl)
>  			vmx->guest_uret_msrs[i].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
> -		else
> -			vmx->guest_uret_msrs[i].mask = 0;
>  	}
>  
>  	err = alloc_loaded_vmcs(&vmx->vmcs01);

I also agree that commit message should be updated as Paolo said,
but other than that:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky <mlevitsk@redhat.com>




^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 12/15] KVM: x86: Export the number of uret MSRs to vendor modules
  2021-05-04 17:17 ` [PATCH 12/15] KVM: x86: Export the number of uret MSRs to vendor modules Sean Christopherson
@ 2021-05-10  8:27   ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:27 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Split out and export the number of configured user return MSRs so that
> VMX can iterate over the set of MSRs without having to do its own tracking.
> Keep the list itself internal to x86 so that vendor code still has to go
> through the "official" APIs to add/modify entries.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/x86.c              | 29 +++++++++++++----------------
>  2 files changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c9452472ed55..10663610f105 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1419,6 +1419,7 @@ struct kvm_arch_async_pf {
>  	bool direct_map;
>  };
>  
> +extern u32 __read_mostly kvm_nr_uret_msrs;
>  extern u64 __read_mostly host_efer;
>  extern bool __read_mostly allow_smaller_maxphyaddr;
>  extern struct kvm_x86_ops kvm_x86_ops;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 90ef340565a4..2fd46e917666 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -184,11 +184,6 @@ module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
>   */
>  #define KVM_MAX_NR_USER_RETURN_MSRS 16
>  
> -struct kvm_user_return_msrs_global {
> -	int nr;
> -	u32 msrs[KVM_MAX_NR_USER_RETURN_MSRS];
> -};
> -
>  struct kvm_user_return_msrs {
>  	struct user_return_notifier urn;
>  	bool registered;
> @@ -198,7 +193,9 @@ struct kvm_user_return_msrs {
>  	} values[KVM_MAX_NR_USER_RETURN_MSRS];
>  };
>  
> -static struct kvm_user_return_msrs_global __read_mostly user_return_msrs_global;
> +u32 __read_mostly kvm_nr_uret_msrs;
> +EXPORT_SYMBOL_GPL(kvm_nr_uret_msrs);
> +static u32 __read_mostly kvm_uret_msrs_list[KVM_MAX_NR_USER_RETURN_MSRS];
>  static struct kvm_user_return_msrs __percpu *user_return_msrs;
>  
>  #define KVM_SUPPORTED_XCR0     (XFEATURE_MASK_FP | XFEATURE_MASK_SSE \
> @@ -330,10 +327,10 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
>  		user_return_notifier_unregister(urn);
>  	}
>  	local_irq_restore(flags);
> -	for (slot = 0; slot < user_return_msrs_global.nr; ++slot) {
> +	for (slot = 0; slot < kvm_nr_uret_msrs; ++slot) {
>  		values = &msrs->values[slot];
>  		if (values->host != values->curr) {
> -			wrmsrl(user_return_msrs_global.msrs[slot], values->host);
> +			wrmsrl(kvm_uret_msrs_list[slot], values->host);
>  			values->curr = values->host;
>  		}
>  	}
> @@ -358,9 +355,9 @@ EXPORT_SYMBOL_GPL(kvm_probe_user_return_msr);
>  void kvm_define_user_return_msr(unsigned slot, u32 msr)
>  {
>  	BUG_ON(slot >= KVM_MAX_NR_USER_RETURN_MSRS);
> -	user_return_msrs_global.msrs[slot] = msr;
> -	if (slot >= user_return_msrs_global.nr)
> -		user_return_msrs_global.nr = slot + 1;
> +	kvm_uret_msrs_list[slot] = msr;
> +	if (slot >= kvm_nr_uret_msrs)
> +		kvm_nr_uret_msrs = slot + 1;
>  }
>  EXPORT_SYMBOL_GPL(kvm_define_user_return_msr);
>  
> @@ -368,8 +365,8 @@ int kvm_find_user_return_msr(u32 msr)
>  {
>  	int i;
>  
> -	for (i = 0; i < user_return_msrs_global.nr; ++i) {
> -		if (user_return_msrs_global.msrs[i] == msr)
> +	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
> +		if (kvm_uret_msrs_list[i] == msr)
>  			return i;
>  	}
>  	return -1;
> @@ -383,8 +380,8 @@ static void kvm_user_return_msr_cpu_online(void)
>  	u64 value;
>  	int i;
>  
> -	for (i = 0; i < user_return_msrs_global.nr; ++i) {
> -		rdmsrl_safe(user_return_msrs_global.msrs[i], &value);
> +	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
> +		rdmsrl_safe(kvm_uret_msrs_list[i], &value);
>  		msrs->values[i].host = value;
>  		msrs->values[i].curr = value;
>  	}
> @@ -399,7 +396,7 @@ int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
>  	value = (value & mask) | (msrs->values[slot].host & ~mask);
>  	if (value == msrs->values[slot].curr)
>  		return 0;
> -	err = wrmsrl_safe(user_return_msrs_global.msrs[slot], value);
> +	err = wrmsrl_safe(kvm_uret_msrs_list[slot], value);
>  	if (err)
>  		return 1;
>  

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 13/15] KVM: x86: Move uret MSR slot management to common x86
  2021-05-04 17:17 ` [PATCH 13/15] KVM: x86: Move uret MSR slot management to common x86 Sean Christopherson
@ 2021-05-10  8:28   ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:28 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Now that SVM and VMX both probe MSRs before "defining" user return slots
> for them, consolidate the code for probe+define into common x86 and
> eliminate the odd behavior of having the vendor code define the slot for
> a given MSR.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  3 +--
>  arch/x86/kvm/svm/svm.c          |  5 +----
>  arch/x86/kvm/vmx/vmx.c          | 19 ++++---------------
>  arch/x86/kvm/x86.c              | 19 +++++++++++--------
>  4 files changed, 17 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 10663610f105..a4b912f7e427 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1778,9 +1778,8 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
>  		    unsigned long ipi_bitmap_high, u32 min,
>  		    unsigned long icr, int op_64_bit);
>  
> -void kvm_define_user_return_msr(unsigned index, u32 msr);
> +int kvm_add_user_return_msr(u32 msr);
>  int kvm_find_user_return_msr(u32 msr);
> -int kvm_probe_user_return_msr(u32 msr);
>  int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
>  
>  u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 231b9650d864..de921935e8de 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -959,10 +959,7 @@ static __init int svm_hardware_setup(void)
>  		kvm_tsc_scaling_ratio_frac_bits = 32;
>  	}
>  
> -	if (!kvm_probe_user_return_msr(MSR_TSC_AUX)) {
> -		tsc_aux_uret_slot = 0;
> -		kvm_define_user_return_msr(tsc_aux_uret_slot, MSR_TSC_AUX);
> -	}
> +	tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
>  
>  	/* Check for pause filtering support */
>  	if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7a53568b34fc..26f82f302391 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -454,9 +454,6 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx)
>  
>  static unsigned long host_idt_base;
>  
> -/* Number of user return MSRs that are actually supported in hardware. */
> -static int vmx_nr_uret_msrs;
> -
>  #if IS_ENABLED(CONFIG_HYPERV)
>  static bool __read_mostly enlightened_vmcs = true;
>  module_param(enlightened_vmcs, bool, 0444);
> @@ -1218,7 +1215,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  	 */
>  	if (!vmx->guest_uret_msrs_loaded) {
>  		vmx->guest_uret_msrs_loaded = true;
> -		for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> +		for (i = 0; i < kvm_nr_uret_msrs; ++i) {
>  			if (!vmx->guest_uret_msrs[i].load_into_hardware)
>  				continue;
>  
> @@ -6921,7 +6918,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  			goto free_vpid;
>  	}
>  
> -	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> +	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
>  		vmx->guest_uret_msrs[i].data = 0;
>  		vmx->guest_uret_msrs[i].mask = -1ull;
>  	}
> @@ -7810,20 +7807,12 @@ static __init void vmx_setup_user_return_msrs(void)
>  		MSR_EFER, MSR_TSC_AUX, MSR_STAR,
>  		MSR_IA32_TSX_CTRL,
>  	};
> -	u32 msr;
>  	int i;
>  
>  	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
>  
> -	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
> -		msr = vmx_uret_msrs_list[i];
> -
> -		if (kvm_probe_user_return_msr(msr))
> -			continue;
> -
> -		kvm_define_user_return_msr(vmx_nr_uret_msrs, msr);
> -		vmx_nr_uret_msrs++;
> -	}
> +	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i)
> +		kvm_add_user_return_msr(vmx_uret_msrs_list[i]);
>  }
>  
>  static __init int hardware_setup(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2fd46e917666..adca491d3b4b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -336,7 +336,7 @@ static void kvm_on_user_return(struct user_return_notifier *urn)
>  	}
>  }
>  
> -int kvm_probe_user_return_msr(u32 msr)
> +static int kvm_probe_user_return_msr(u32 msr)
>  {
>  	u64 val;
>  	int ret;
> @@ -350,16 +350,18 @@ int kvm_probe_user_return_msr(u32 msr)
>  	preempt_enable();
>  	return ret;
>  }
> -EXPORT_SYMBOL_GPL(kvm_probe_user_return_msr);
>  
> -void kvm_define_user_return_msr(unsigned slot, u32 msr)
> +int kvm_add_user_return_msr(u32 msr)
>  {
> -	BUG_ON(slot >= KVM_MAX_NR_USER_RETURN_MSRS);
> -	kvm_uret_msrs_list[slot] = msr;
> -	if (slot >= kvm_nr_uret_msrs)
> -		kvm_nr_uret_msrs = slot + 1;
> +	BUG_ON(kvm_nr_uret_msrs >= KVM_MAX_NR_USER_RETURN_MSRS);
> +
> +	if (kvm_probe_user_return_msr(msr))
> +		return -1;
> +
> +	kvm_uret_msrs_list[kvm_nr_uret_msrs] = msr;
> +	return kvm_nr_uret_msrs++;
>  }
> -EXPORT_SYMBOL_GPL(kvm_define_user_return_msr);
> +EXPORT_SYMBOL_GPL(kvm_add_user_return_msr);
>  
>  int kvm_find_user_return_msr(u32 msr)
>  {
> @@ -8169,6 +8171,7 @@ int kvm_arch_init(void *opaque)
>  		printk(KERN_ERR "kvm: failed to allocate percpu kvm_user_return_msrs\n");
>  		goto out_free_x86_emulator_cache;
>  	}
> +	kvm_nr_uret_msrs = 0;
>  
>  	r = kvm_mmu_module_init();
>  	if (r)
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky





^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model
  2021-05-04 17:17 ` [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model Sean Christopherson
@ 2021-05-10  8:29   ` Maxim Levitsky
  2021-05-10 16:50     ` Sean Christopherson
  0 siblings, 1 reply; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:29 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Squish the Intel and AMD emulation of MSR_TSC_AUX together and tie it to
> the guest CPU model instead of the host CPU behavior.  While not strictly
> necessary to avoid guest breakage, emulating cross-vendor "architecture"
> will provide consistent behavior for the guest, e.g. WRMSR fault behavior
> won't change if the vCPU is migrated to a host with divergent behavior.
> 
> Note, the "new" kvm_is_supported_user_return_msr() checks do not add new
> functionality on either SVM or VMX.  On SVM, the equivalent was
> "tsc_aux_uret_slot < 0", and on VMX the check was buried in the
> vmx_find_uret_msr() call at the find_uret_msr label.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  5 +++++
>  arch/x86/kvm/svm/svm.c          | 24 ----------------------
>  arch/x86/kvm/vmx/vmx.c          | 15 --------------
>  arch/x86/kvm/x86.c              | 36 +++++++++++++++++++++++++++++++++
>  4 files changed, 41 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index a4b912f7e427..00fb9efb9984 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1782,6 +1782,11 @@ int kvm_add_user_return_msr(u32 msr);
>  int kvm_find_user_return_msr(u32 msr);
>  int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
>  
> +static inline bool kvm_is_supported_user_return_msr(u32 msr)
> +{
> +	return kvm_find_user_return_msr(msr) >= 0;
> +}
> +
>  u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
>  u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
>  
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index de921935e8de..6c7c6a303cc5 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2663,12 +2663,6 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
>  		break;
>  	case MSR_TSC_AUX:
> -		if (tsc_aux_uret_slot < 0)
> -			return 1;
> -		if (!msr_info->host_initiated &&
Not related to this patch, but I do wonder why we need
to always allow writing this MSR when done by the host,
since if neither RDTSCP nor RDPID is supported, the guest
won't be able to read this MSR at all.


> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> -			return 1;
>  		msr_info->data = svm->tsc_aux;
>  		break;
>  	/*
> @@ -2885,24 +2879,6 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		svm->sysenter_esp_hi = guest_cpuid_is_intel(vcpu) ? (data >> 32) : 0;
>  		break;
>  	case MSR_TSC_AUX:
> -		if (tsc_aux_uret_slot < 0)
> -			return 1;
> -
> -		if (!msr->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> -			return 1;
> -
> -		/*
> -		 * Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
> -		 * incomplete and conflicting architectural behavior.  Current
> -		 * AMD CPUs completely ignore bits 63:32, i.e. they aren't
> -		 * reserved and always read as zeros.  Emulate AMD CPU behavior
> -		 * to avoid explosions if the vCPU is migrated from an AMD host
> -		 * to an Intel host.
> -		 */
> -		data = (u32)data;
> -
>  		/*
>  		 * TSC_AUX is usually changed only during boot and never read
>  		 * directly.  Intercept TSC_AUX instead of exposing it to the
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 26f82f302391..d85ac5876982 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1981,12 +1981,6 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		else
>  			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
>  		break;
> -	case MSR_TSC_AUX:
> -		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> -			return 1;
> -		goto find_uret_msr;
>  	case MSR_IA32_DEBUGCTLMSR:
>  		msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL);
>  		break;
> @@ -2302,15 +2296,6 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		else
>  			vmx->pt_desc.guest.addr_a[index / 2] = data;
>  		break;
> -	case MSR_TSC_AUX:
> -		if (!msr_info->host_initiated &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> -			return 1;
> -		/* Check reserved bit, higher 32 bits should be zero */
> -		if ((data >> 32) != 0)
> -			return 1;
> -		goto find_uret_msr;
>  	case MSR_IA32_PERF_CAPABILITIES:
>  		if (data && !vcpu_to_pmu(vcpu)->version)
>  			return 1;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index adca491d3b4b..896127ea4d4f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1642,6 +1642,30 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
>  		 * invokes 64-bit SYSENTER.
>  		 */
>  		data = get_canonical(data, vcpu_virt_addr_bits(vcpu));
> +		break;
> +	case MSR_TSC_AUX:
> +		if (!kvm_is_supported_user_return_msr(MSR_TSC_AUX))
> +			return 1;
> +
> +		if (!host_initiated &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> +			return 1;
> +
> +		/*
> +		 * Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
> +		 * incomplete and conflicting architectural behavior.  Current
> +		 * AMD CPUs completely ignore bits 63:32, i.e. they aren't
> +		 * reserved and always read as zeros.  Enforce Intel's reserved
> +		 * bits check if and only if the guest CPU is Intel, and clear
> +		 * the bits in all other cases.  This ensures cross-vendor
> +		 * migration will provide consistent behavior for the guest.
> +		 */
> +		if (guest_cpuid_is_intel(vcpu) && (data >> 32) != 0)
> +			return 1;
> +
> +		data = (u32)data;
> +		break;
>  	}
>  
>  	msr.data = data;
> @@ -1678,6 +1702,18 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
>  	if (!host_initiated && !kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ))
>  		return KVM_MSR_RET_FILTERED;
>  
> +	switch (index) {
> +	case MSR_TSC_AUX:
> +		if (!kvm_is_supported_user_return_msr(MSR_TSC_AUX))
> +			return 1;
> +
> +		if (!host_initiated &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> +			return 1;
> +		break;
> +	}
> +
>  	msr.index = index;
>  	msr.host_initiated = host_initiated;
>  

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 15/15] KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed
  2021-05-04 17:17 ` [PATCH 15/15] KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed Sean Christopherson
@ 2021-05-10  8:29   ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-10  8:29 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> If probing MSR_TSC_AUX failed, hide RDTSCP and RDPID, and WARN if either
> feature was reported as supported.  In theory, such a scenario should
> never happen as both Intel and AMD state that MSR_TSC_AUX is available if
> RDTSCP or RDPID is supported.  But, KVM injects #GP on MSR_TSC_AUX
> accesses if probing failed, faults on WRMSR(MSR_TSC_AUX) may be fatal to
> the guest (because they happen during early CPU bringup), and KVM itself
> has effectively misreported RDPID support in the past.
> 
> Note, this also has the happy side effect of omitting MSR_TSC_AUX from
> the list of MSRs that are exposed to userspace if probing the MSR fails.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index c96f79c9fff2..bf0f74ce4974 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -567,6 +567,21 @@ void kvm_set_cpu_caps(void)
>  		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
>  		F(PMM) | F(PMM_EN)
>  	);
> +
> +	/*
> +	 * Hide RDTSCP and RDPID if either feature is reported as supported but
> +	 * probing MSR_TSC_AUX failed.  This is purely a sanity check and
> +	 * should never happen, but the guest will likely crash if RDTSCP or
> +	 * RDPID is misreported, and KVM has botched MSR_TSC_AUX emulation in
> +	 * the past, e.g. the sanity check may fire if this instance of KVM is
> +	 * running as L1 on top of an older, broken KVM.
> +	 */
> +	if (WARN_ON((kvm_cpu_cap_has(X86_FEATURE_RDTSCP) ||
> +		     kvm_cpu_cap_has(X86_FEATURE_RDPID)) &&
> +		     !kvm_is_supported_user_return_msr(MSR_TSC_AUX))) {
> +		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
> +		kvm_cpu_cap_clear(X86_FEATURE_RDPID);
> +	}
>  }
>  EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
>  
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init
  2021-05-10  8:23   ` Maxim Levitsky
@ 2021-05-10 15:13     ` Sean Christopherson
  2021-05-11 12:34       ` Maxim Levitsky
  0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-10 15:13 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

On Mon, May 10, 2021, Maxim Levitsky wrote:
> On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > @@ -6929,18 +6942,10 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
> >  			goto free_vpid;
> >  	}
> >  
> > -	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
> > +	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> > +		vmx->guest_uret_msrs[i].data = 0;
> >  
> > -	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
> > -		u32 index = vmx_uret_msrs_list[i];
> > -		int j = vmx->nr_uret_msrs;
> > -
> > -		if (kvm_probe_user_return_msr(index))
> > -			continue;
> > -
> > -		vmx->guest_uret_msrs[j].slot = i;
> I don't see anything initializing the .slot after this patch.
> Now this code is removed later, which masks this bug, but for
> bisection's sake I think this patch should still be fixed.

Egad, indeed it's broken.  I'll retest the whole series to verify the other
patches will bisect cleanly.

Nice catch!

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list
  2021-05-08  3:31   ` Reiji Watanabe
@ 2021-05-10 16:43     ` Sean Christopherson
  2021-05-10 17:55       ` Reiji Watanabe
  0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-10 16:43 UTC (permalink / raw)
  To: Reiji Watanabe
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On Fri, May 07, 2021, Reiji Watanabe wrote:
> > -static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
> > +static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr,
> > +                              bool load_into_hardware)
> >  {
> > -       struct vmx_uret_msr tmp;
> > -       int from, to;
> > +       struct vmx_uret_msr *uret_msr;
> >
> > -       from = __vmx_find_uret_msr(vmx, msr);
> > -       if (from < 0)
> > +       uret_msr = vmx_find_uret_msr(vmx, msr);
> > +       if (!uret_msr)
> >                 return;
> > -       to = vmx->nr_active_uret_msrs++;
> >
> > -       tmp = vmx->guest_uret_msrs[to];
> > -       vmx->guest_uret_msrs[to] = vmx->guest_uret_msrs[from];
> > -       vmx->guest_uret_msrs[from] = tmp;
> > +       uret_msr->load_into_hardware = load_into_hardware;
> >  }
> >
> >  /*
> > @@ -1785,30 +1781,36 @@ static void vmx_setup_uret_msr(struct vcpu_vmx *vmx, unsigned int msr)
> >   */
> >  static void setup_msrs(struct vcpu_vmx *vmx)
> >  {
> > -       vmx->guest_uret_msrs_loaded = false;
> > -       vmx->nr_active_uret_msrs = 0;
> >  #ifdef CONFIG_X86_64
> > +       bool load_syscall_msrs;
> > +
> >         /*
> >          * The SYSCALL MSRs are only needed on long mode guests, and only
> >          * when EFER.SCE is set.
> >          */
> > -       if (is_long_mode(&vmx->vcpu) && (vmx->vcpu.arch.efer & EFER_SCE)) {
> > -               vmx_setup_uret_msr(vmx, MSR_STAR);
> > -               vmx_setup_uret_msr(vmx, MSR_LSTAR);
> > -               vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK);
> > -       }
> > +       load_syscall_msrs = is_long_mode(&vmx->vcpu) &&
> > +                           (vmx->vcpu.arch.efer & EFER_SCE);
> > +
> > +       vmx_setup_uret_msr(vmx, MSR_STAR, load_syscall_msrs);
> > +       vmx_setup_uret_msr(vmx, MSR_LSTAR, load_syscall_msrs);
> > +       vmx_setup_uret_msr(vmx, MSR_SYSCALL_MASK, load_syscall_msrs);
> >  #endif
> > -       if (update_transition_efer(vmx))
> > -               vmx_setup_uret_msr(vmx, MSR_EFER);
> > +       vmx_setup_uret_msr(vmx, MSR_EFER, update_transition_efer(vmx));
> >
> > -       if (guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP)  ||
> > -           guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID))
> > -               vmx_setup_uret_msr(vmx, MSR_TSC_AUX);
> > +       vmx_setup_uret_msr(vmx, MSR_TSC_AUX,
> > +                          guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP) ||
> > +                          guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDPID));
> 
> Shouldn't vmx_setup_uret_msr(,MSR_TSC_AUX,) be called to update
> the new flag load_into_hardware for MSR_TSC_AUX when CPUID
> (X86_FEATURE_RDTSCP/X86_FEATURE_RDPID) of the vCPU is updated ?

Yes.  I have a patch in the massive vCPU RESET/INIT series to do exactly that.
I honestly can't remember if there was a dependency that "required" the fix to
be buried in the middle of that series.  I suspect not; I'm guessing I just
didn't think it was worth backporting to stable kernels and so didn't prioritize
hoisting the patch out of that mess.

https://lkml.kernel.org/r/20210424004645.3950558-34-seanjc@google.com
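
Roughly, the idea is something like this (a sketch of the approach only,
not the patch at the link; the function body is abridged and illustrative):

static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);

	/* ... existing CPUID-dependent updates elided ... */

	/* Recompute which uret MSRs need to be loaded into hardware. */
	setup_msrs(vmx);
}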

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model
  2021-05-10  8:29   ` Maxim Levitsky
@ 2021-05-10 16:50     ` Sean Christopherson
  2021-05-10 17:11       ` Jim Mattson
  0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-10 16:50 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

On Mon, May 10, 2021, Maxim Levitsky wrote:
> On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index de921935e8de..6c7c6a303cc5 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2663,12 +2663,6 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >  			msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
> >  		break;
> >  	case MSR_TSC_AUX:
> > -		if (tsc_aux_uret_slot < 0)
> > -			return 1;
> > -		if (!msr_info->host_initiated &&
> Not related to this patch, but I do wonder why we need
> to always allow writing this MSR when done by the host,
> since if neither RDTSCP nor RDPID is supported, the guest
> won't be able to read this MSR at all.

It's an ordering thing and not specific to MSR_TSC_AUX.  Exempting host userspace
from guest CPUID checks allows userspace to set MSR state, e.g. during migration,
before setting the guest CPUID model.
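
As a concrete (simplified) illustration of that ordering, a VMM restoring a
vCPU might do the following; vcpu_fd, saved_tsc_aux and cpuid2 are assumed
to come from the migration stream, and error handling is omitted:

struct {
	struct kvm_msrs hdr;
	struct kvm_msr_entry entry;
} msrs = {
	.hdr   = { .nmsrs = 1 },
	.entry = { .index = 0xc0000103 /* MSR_TSC_AUX */, .data = saved_tsc_aux },
};

ioctl(vcpu_fd, KVM_SET_MSRS, &msrs);	/* MSR state first ...          */
ioctl(vcpu_fd, KVM_SET_CPUID2, cpuid2);	/* ... guest CPUID model later. */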

> > -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> > -		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> > -			return 1;
> >  		msr_info->data = svm->tsc_aux;
> >  		break;
> >  	/*

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-10  8:08           ` Maxim Levitsky
@ 2021-05-10 16:56             ` Sean Christopherson
  2021-05-11 12:34               ` Maxim Levitsky
  2021-05-18 10:59               ` Paolo Bonzini
  0 siblings, 2 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-10 16:56 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Jim Mattson, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Mon, May 10, 2021, Maxim Levitsky wrote:
> On Tue, 2021-05-04 at 14:58 -0700, Jim Mattson wrote:
> > On Tue, May 4, 2021 at 2:57 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > On 04/05/21 23:53, Sean Christopherson wrote:
> > > > > Does the right thing happen here if the vCPU is in guest mode when
> > > > > userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> > > > > off?
> > > > I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> > > > you specifically mean running in L2?
> > > > 
> > > 
> > > Guest mode should mean L2.
> > > 
> > > (I wonder if we should have a capability that says "KVM_SET_CPUID2 can
> > > only be called prior to KVM_RUN").
> > 
> > It would certainly make it easier to reason about potential security issues.
> > 
> I vote too for this.

Alternatively, what about adding KVM_VCPU_RESET to let userspace explicitly
pull RESET#, and defining that ioctl() to freeze the vCPU model?  I.e. after
userspace resets the vCPU, KVM_SET_CPUID (and any other relevant ioctls) is
disallowed.

Lack of proper RESET emulation is annoying, e.g. userspace has to manually stuff
EDX after vCPU creation to get the right value at RESET.  A dedicated ioctl()
would kill two birds with one stone, without having to add yet another "2"
ioctl().
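
For illustration only (KVM_VCPU_RESET does not exist today, so the ioctl
name, its argument-less form and the failure mode below are assumptions),
the flow could look like:

ioctl(vcpu_fd, KVM_SET_CPUID2, cpuid2);   /* configure the vCPU model         */
ioctl(vcpu_fd, KVM_VCPU_RESET, 0);        /* emulate RESET#, freeze the model */

/* From here on, re-configuring the model would be rejected, e.g.: */
if (ioctl(vcpu_fd, KVM_SET_CPUID2, cpuid2) == 0)
	errx(1, "KVM_SET_CPUID2 unexpectedly succeeded after reset");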

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model
  2021-05-10 16:50     ` Sean Christopherson
@ 2021-05-10 17:11       ` Jim Mattson
  2021-05-11 12:34         ` Maxim Levitsky
  0 siblings, 1 reply; 68+ messages in thread
From: Jim Mattson @ 2021-05-10 17:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Maxim Levitsky, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Mon, May 10, 2021 at 9:50 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, May 10, 2021, Maxim Levitsky wrote:
> > On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index de921935e8de..6c7c6a303cc5 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -2663,12 +2663,6 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > >                     msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
> > >             break;
> > >     case MSR_TSC_AUX:
> > > -           if (tsc_aux_uret_slot < 0)
> > > -                   return 1;
> > > -           if (!msr_info->host_initiated &&
> > Not related to this patch, but I do wonder why we need
> > to always allow writing this MSR when done by the host,
> > since if neither RDTSCP nor RDPID is supported, the guest
> > won't be able to read this MSR at all.
>
> It's an ordering thing and not specific to MSR_TSC_AUX.  Exempting host userspace
> from guest CPUID checks allows userspace to set MSR state, e.g. during migration,
> before setting the guest CPUID model.

I thought the rule was that if an MSR was enumerated by
KVM_GET_MSR_INDEX_LIST, then KVM had to accept legal writes from the
host. The only "ordering thing" is that KVM_GET_MSR_INDEX_LIST is a
device ioctl, so it can't take guest CPUID information into account.
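
For reference, a sketch of that device-scoped query (kvm_fd is assumed to be
the open /dev/kvm descriptor; the zero-sized probe relies on the documented
-E2BIG convention, and error handling is omitted):

struct kvm_msr_list probe = { .nmsrs = 0 };
struct kvm_msr_list *list;

ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, &probe);	/* fails with E2BIG, fills in nmsrs */

list = calloc(1, sizeof(*list) + probe.nmsrs * sizeof(__u32));
list->nmsrs = probe.nmsrs;
ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, list);	/* no vCPU, hence no guest CPUID, in sight */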

> > > -               !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> > > -               !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> > > -                   return 1;
> > >             msr_info->data = svm->tsc_aux;
> > >             break;
> > >     /*

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported
  2021-05-10  8:08   ` Maxim Levitsky
@ 2021-05-10 17:20     ` Sean Christopherson
  2021-05-11 12:32       ` Maxim Levitsky
  0 siblings, 1 reply; 68+ messages in thread
From: Sean Christopherson @ 2021-05-10 17:20 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

On Mon, May 10, 2021, Maxim Levitsky wrote:
> On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > Do not advertise emulation support for RDPID if RDTSCP is unsupported.
> > RDPID emulation subtly relies on MSR_TSC_AUX to exist in hardware, as
> > both vmx_get_msr() and svm_get_msr() will return an error if the MSR is
> > unsupported, i.e. ctxt->ops->get_msr() will fail and the emulator will
> > inject a #UD.
> > 
> > Note, RDPID emulation also relies on RDTSCP being enabled in the guest,
> > but this is a KVM bug and will eventually be fixed.
> > 
> > Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index f765bf7a529c..c96f79c9fff2 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -637,7 +637,8 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
> >  	case 7:
> >  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> >  		entry->eax = 0;
> > -		entry->ecx = F(RDPID);
> > +		if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
> > +			entry->ecx = F(RDPID);
> >  		++array->nent;
> >  	default:
> >  		break;
> 
> Just to make sure that I understand this correctly:
> 
> This is what I know:
> 
> Both RDTSCP and RDPID are instructions that read IA32_TSC_AUX
> (and RDTSCP also reads the TSC).
> 
> Both instructions have their own CPUID bits (X86_FEATURE_RDPID, X86_FEATURE_RDTSCP).
> If either of these CPUID bits is present, IA32_TSC_AUX should be supported.

Yep.

> RDPID is a newer feature, so for sanity's sake I can at least assume that
> a CPU will usually either have neither feature, have only RDTSCP and
> IA32_TSC_AUX, or have both RDTSCP and RDPID.

Yep.

> If not supported in hardware, KVM only emulates RDPID, as far as I can see.

Yep.

> Why, btw? Performance-wise, a guest that only wants IA32_TSC_AUX in userspace
> is better off using RDTSCP and paying the penalty of saving/restoring the
> unwanted registers than using RDPID with a vmexit.

FWIW, Linux doesn't even fall back to RDTSCP.  If RDPID isn't supported, Linux
throws the info into the limit of a dummy segment in the GDT and uses LSL to get
at the data.  Turns out that RDTSCP is too slow for its intended use case :-)
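
Something like this sketch, modeled loosely on the vDSO (the 0x7b selector
and the 12-bit CPU field are restated from memory as assumptions, not
quoted from a header):

static inline unsigned int lsl_getcpu(void)
{
	unsigned int p;

	/* 0x7b: ring-3 selector of the per-CPU "CPUNODE" GDT entry. */
	asm("lsl %1, %0" : "=r" (p) : "r" (0x7bu));

	return p & 0xfff;	/* low 12 bits hold the CPU number */
}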

> My own guess for an answer to this question is that RDPID emulation is there
> to aid migration from a host that does support RDPID to a host that doesn't.

That's my assumption as well.  Paolo's commit is a bit light on why emulation
was added in the first place, but emulating to allow migrating to old hardware
is the only motivation I can come up with.

commit fb6d4d340e0532032c808a9933eaaa7b8de435ab
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Tue Jul 12 11:04:26 2016 +0200

    KVM: x86: emulate RDPID

    This is encoded as F3 0F C7 /7 with a register argument.  The register
    argument is the second array in the group9 GroupDual, while F3 is the
    fourth element of a Prefix.
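
For reference, a hedged sketch of that encoding in use; the .byte form is
only needed for toolchains that don't know the rdpid mnemonic, and
F3 0F C7 F8 is RDPID with %rax as the destination register (/7, rm = rax):

static inline unsigned long rdpid(void)
{
	unsigned long p;

	asm volatile(".byte 0xf3, 0x0f, 0xc7, 0xf8" : "=a" (p));
	return p;	/* contents of IA32_TSC_AUX */
}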

> Having said all that, assuming that we don't want to emulate the RDTSCP too,
> when it is not supported, then this patch does make sense.
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> 
> Best regards,
> 	Maxim Levitsky
> 
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list
  2021-05-10 16:43     ` Sean Christopherson
@ 2021-05-10 17:55       ` Reiji Watanabe
  0 siblings, 0 replies; 68+ messages in thread
From: Reiji Watanabe @ 2021-05-10 17:55 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

> > Shouldn't vmx_setup_uret_msr(,MSR_TSC_AUX,) be called to update
> > the new flag load_into_hardware for MSR_TSC_AUX when CPUID
> > (X86_FEATURE_RDTSCP/X86_FEATURE_RDPID) of the vCPU is updated ?
>
> Yes.  I have a patch in the massive vCPU RESET/INIT series to do exactly that.
> I honestly can't remember if there was a dependency that "required" the fix to
> be buried in the middle of that series.  I suspect not; I'm guessing I just
> didn't think it was worth backporting to stable kernels and so didn't prioritize
> hoisting the patch out of that mess.
>
> https://lkml.kernel.org/r/20210424004645.3950558-34-seanjc@google.com

I see. I hadn't checked that patch.
Thank you for your answer!

Regards,
Reiji

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported
  2021-05-10 17:20     ` Sean Christopherson
@ 2021-05-11 12:32       ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-11 12:32 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

On Mon, 2021-05-10 at 17:20 +0000, Sean Christopherson wrote:
> On Mon, May 10, 2021, Maxim Levitsky wrote:
> > On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > > Do not advertise emulation support for RDPID if RDTSCP is unsupported.
> > > RDPID emulation subtly relies on MSR_TSC_AUX to exist in hardware, as
> > > both vmx_get_msr() and svm_get_msr() will return an error if the MSR is
> > > unsupported, i.e. ctxt->ops->get_msr() will fail and the emulator will
> > > inject a #UD.
> > > 
> > > Note, RDPID emulation also relies on RDTSCP being enabled in the guest,
> > > but this is a KVM bug and will eventually be fixed.
> > > 
> > > Fixes: fb6d4d340e05 ("KVM: x86: emulate RDPID")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >  arch/x86/kvm/cpuid.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index f765bf7a529c..c96f79c9fff2 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -637,7 +637,8 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
> > >  	case 7:
> > >  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> > >  		entry->eax = 0;
> > > -		entry->ecx = F(RDPID);
> > > +		if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
> > > +			entry->ecx = F(RDPID);
> > >  		++array->nent;
> > >  	default:
> > >  		break;
> > 
> > Just to make sure that I understand this correctly:
> > 
> > This is what I know:
> > 
> > Both RDTSCP and RDPID are instructions that read IA32_TSC_AUX
> > (and RDTSCP also reads the TSC).
> > 
> > Both instructions have their own CPUID bits (X86_FEATURE_RDPID, X86_FEATURE_RDTSCP).
> > If either of these CPUID bits is present, IA32_TSC_AUX should be supported.
> 
> Yep.
> 
> > RDPID is a newer feature, so for sanity's sake I can at least assume that
> > a CPU will usually either have neither feature, have only RDTSCP and
> > IA32_TSC_AUX, or have both RDTSCP and RDPID.
> 
> Yep.
> 
> > If not supported in hardware, KVM only emulates RDPID, as far as I can see.
> 
> Yep.
> 
> > Why, btw? Performance-wise, a guest that only wants IA32_TSC_AUX in userspace
> > is better off using RDTSCP and paying the penalty of saving/restoring the
> > unwanted registers than using RDPID with a vmexit.
> 
> FWIW, Linux doesn't even fall back to RDTSCP.  If RDPID isn't supported, Linux
> throws the info into the limit of a dummy segment in the GDT and uses LSL to get
> at the data.  Turns out that RDTSCP is too slow for its intended use case :-)
> 
> > My own guess for an answer to this question is that RDPID emulation is there
> > to aid migration from a host that does support RDPID to a host that doesn't.
> 
> That's my assumption as well.  Paolo's commit is a bit light on why emulation
> was added in the first place, but emulating to allow migrating to old hardware
> is the only motivation I can come up with.

Cool thanks!
Best regards,
	Maxim Levitsky

> 
> commit fb6d4d340e0532032c808a9933eaaa7b8de435ab
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Tue Jul 12 11:04:26 2016 +0200
> 
>     KVM: x86: emulate RDPID
> 
>     This is encoded as F3 0F C7 /7 with a register argument.  The register
>     argument is the second array in the group9 GroupDual, while F3 is the
>     fourth element of a Prefix.
> 
> > Having said all that, assuming that we don't want to emulate the RDTSCP too,
> > when it is not supported, then this patch does make sense.
> > 
> > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> > 
> > Best regards,
> > 	Maxim Levitsky
> > 
> > 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-10 16:56             ` Sean Christopherson
@ 2021-05-11 12:34               ` Maxim Levitsky
  2021-05-18 10:59               ` Paolo Bonzini
  1 sibling, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-11 12:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Jim Mattson, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Mon, 2021-05-10 at 16:56 +0000, Sean Christopherson wrote:
> On Mon, May 10, 2021, Maxim Levitsky wrote:
> > On Tue, 2021-05-04 at 14:58 -0700, Jim Mattson wrote:
> > > On Tue, May 4, 2021 at 2:57 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > On 04/05/21 23:53, Sean Christopherson wrote:
> > > > > > Does the right thing happen here if the vCPU is in guest mode when
> > > > > > userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> > > > > > off?
> > > > > I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> > > > > you specifically mean running in L2?
> > > > > 
> > > > 
> > > > Guest mode should mean L2.
> > > > 
> > > > (I wonder if we should have a capability that says "KVM_SET_CPUID2 can
> > > > only be called prior to KVM_RUN").
> > > 
> > > It would certainly make it easier to reason about potential security issues.
> > > 
> > I vote too for this.
> 
> Alternatively, what about adding KVM_VCPU_RESET to let userspace explicitly
> pull RESET#, and defining that ioctl() to freeze the vCPU model?  I.e. after
> userspace resets the vCPU, KVM_SET_CPUID (and any other relevant ioctls) is
> disallowed.

I vote even more strongly for this!

I recently created what Paolo called an 'evil' test to stress KVM-related
bugs in nested migration by simulating a nested migration with a vCPU reset,
followed by a reload of all of its state, including nested state.

It is ugly since, as you say, I have to manually load the reset state, so
using a 'KVM_VCPU_RESET' ioctl instead would make this test much cleaner
and even more 'evil'.

Best regards,
	Maxim Levitsky


> 
> Lack of proper RESET emulation is annoying, e.g. userspace has to manually stuff
> EDX after vCPU creation to get the right value at RESET.  A dedicated ioctl()
> would kill two birds with one stone, without having to add yet another "2"
> ioctl().
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init
  2021-05-10 15:13     ` Sean Christopherson
@ 2021-05-11 12:34       ` Maxim Levitsky
  2021-05-11 20:10         ` Sean Christopherson
  0 siblings, 1 reply; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-11 12:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

On Mon, 2021-05-10 at 15:13 +0000, Sean Christopherson wrote:
> On Mon, May 10, 2021, Maxim Levitsky wrote:
> > On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > > @@ -6929,18 +6942,10 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
> > >  			goto free_vpid;
> > >  	}
> > >  
> > > -	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
> > > +	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> > > +		vmx->guest_uret_msrs[i].data = 0;
> > >  
> > > -	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
> > > -		u32 index = vmx_uret_msrs_list[i];
> > > -		int j = vmx->nr_uret_msrs;
> > > -
> > > -		if (kvm_probe_user_return_msr(index))
> > > -			continue;
> > > -
> > > -		vmx->guest_uret_msrs[j].slot = i;
> > I don't see anything initializing the .slot after this patch.
> > Now this code is removed later, which masks this bug, but for
> > bisection's sake I think this patch should still be fixed.
> 
> Egad, indeed it's broken.  I'll retest the whole series to verify the other
> patches will bisect cleanly.
> 
> Nice catch!
> 
Thanks!

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model
  2021-05-10 17:11       ` Jim Mattson
@ 2021-05-11 12:34         ` Maxim Levitsky
  0 siblings, 0 replies; 68+ messages in thread
From: Maxim Levitsky @ 2021-05-11 12:34 UTC (permalink / raw)
  To: Jim Mattson, Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Mon, 2021-05-10 at 10:11 -0700, Jim Mattson wrote:
> On Mon, May 10, 2021 at 9:50 AM Sean Christopherson <seanjc@google.com> wrote:
> > On Mon, May 10, 2021, Maxim Levitsky wrote:
> > > On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > > index de921935e8de..6c7c6a303cc5 100644
> > > > --- a/arch/x86/kvm/svm/svm.c
> > > > +++ b/arch/x86/kvm/svm/svm.c
> > > > @@ -2663,12 +2663,6 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > > >                     msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
> > > >             break;
> > > >     case MSR_TSC_AUX:
> > > > -           if (tsc_aux_uret_slot < 0)
> > > > -                   return 1;
> > > > -           if (!msr_info->host_initiated &&
> > > Not related to this patch, but I do wonder why we need
> > > to always allow writing this MSR when done by the host,
> > > since if neither RDTSCP nor RDPID is supported, the guest
> > > won't be able to read this MSR at all.
> > 
> > It's an ordering thing and not specific to MSR_TSC_AUX.  Exempting host userspace
> > from guest CPUID checks allows userspace to set MSR state, e.g. during migration,
> > before setting the guest CPUID model.
> 
> I thought the rule was that if an MSR was enumerated by
> KVM_GET_MSR_INDEX_LIST, then KVM had to accept legal writes from the
> host. The only "ordering thing" is that KVM_GET_MSR_INDEX_LIST is a
> device ioctl, so it can't take guest CPUID information into account.

This makes sense.

Thanks!
Best regards,
	Maxim Levitsky
> 
> > > > -               !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> > > > -               !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> > > > -                   return 1;
> > > >             msr_info->data = svm->tsc_aux;
> > > >             break;
> > > >     /*



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init
  2021-05-11 12:34       ` Maxim Levitsky
@ 2021-05-11 20:10         ` Sean Christopherson
  0 siblings, 0 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-11 20:10 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li, Reiji Watanabe

On Tue, May 11, 2021, Maxim Levitsky wrote:
> On Mon, 2021-05-10 at 15:13 +0000, Sean Christopherson wrote:
> > On Mon, May 10, 2021, Maxim Levitsky wrote:
> > > On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> > > > @@ -6929,18 +6942,10 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
> > > >  			goto free_vpid;
> > > >  	}
> > > >  
> > > > -	BUILD_BUG_ON(ARRAY_SIZE(vmx_uret_msrs_list) != MAX_NR_USER_RETURN_MSRS);
> > > > +	for (i = 0; i < vmx_nr_uret_msrs; ++i) {
> > > > +		vmx->guest_uret_msrs[i].data = 0;
> > > >  
> > > > -	for (i = 0; i < ARRAY_SIZE(vmx_uret_msrs_list); ++i) {
> > > > -		u32 index = vmx_uret_msrs_list[i];
> > > > -		int j = vmx->nr_uret_msrs;
> > > > -
> > > > -		if (kvm_probe_user_return_msr(index))
> > > > -			continue;
> > > > -
> > > > -		vmx->guest_uret_msrs[j].slot = i;
> > > I don't see anything initializing the .slot after this patch.
> > > Now this code is removed later, which masks this bug, but for
> > > bisection's sake I think this patch should still be fixed.
> > 
> > Egad, indeed it's broken.  I'll retest the whole series to verify the other
> > patches will bisect cleanly.
> > 
> > Nice catch!
> > 
> Thanks!

Unfortunately this made its way to Linus before I could fix my goof.  Fingers
crossed no one is unfortunate enough to land on this while bisecting...

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-10 16:56             ` Sean Christopherson
  2021-05-11 12:34               ` Maxim Levitsky
@ 2021-05-18 10:59               ` Paolo Bonzini
  2021-05-18 19:24                 ` Sean Christopherson
  1 sibling, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2021-05-18 10:59 UTC (permalink / raw)
  To: Sean Christopherson, Maxim Levitsky
  Cc: Jim Mattson, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On 10/05/21 18:56, Sean Christopherson wrote:
> On Mon, May 10, 2021, Maxim Levitsky wrote:
>> On Tue, 2021-05-04 at 14:58 -0700, Jim Mattson wrote:
>>> On Tue, May 4, 2021 at 2:57 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> On 04/05/21 23:53, Sean Christopherson wrote:
>>>>>> Does the right thing happen here if the vCPU is in guest mode when
>>>>>> userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
>>>>>> off?
>>>>> I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
>>>>> you specifically mean running in L2?
>>>>>
>>>>
>>>> Guest mode should mean L2.
>>>>
>>>> (I wonder if we should have a capability that says "KVM_SET_CPUID2 can
>>>> only be called prior to KVM_RUN").
>>>
>>> It would certainly make it easier to reason about potential security issues.
>>>
>> I vote too for this.
> 
> Alternatively, what about adding KVM_VCPU_RESET to let userspace explicitly
> pull RESET#, and defining that ioctl() to freeze the vCPU model?  I.e. after
> userspace resets the vCPU, KVM_SET_CPUID (and any other relevant ioctls) is
> disallowed.
> 
> Lack of proper RESET emulation is annoying, e.g. userspace has to manually stuff
> EDX after vCPU creation to get the right value at RESET.  A dedicated ioctl()
> would kill two birds with one stone, without having to add yet another "2"
> ioctl().

That has the disadvantage of making the more secure behavior opt-in, but we
can do both (forbidding KVM_SET_CPUID2 after both KVM_RUN and KVM_RESET).

Paolo


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest
  2021-05-18 10:59               ` Paolo Bonzini
@ 2021-05-18 19:24                 ` Sean Christopherson
  0 siblings, 0 replies; 68+ messages in thread
From: Sean Christopherson @ 2021-05-18 19:24 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Maxim Levitsky, Jim Mattson, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, kvm list, LKML, Xiaoyao Li, Reiji Watanabe

On Tue, May 18, 2021, Paolo Bonzini wrote:
> On 10/05/21 18:56, Sean Christopherson wrote:
> > On Mon, May 10, 2021, Maxim Levitsky wrote:
> > > On Tue, 2021-05-04 at 14:58 -0700, Jim Mattson wrote:
> > > > On Tue, May 4, 2021 at 2:57 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > > On 04/05/21 23:53, Sean Christopherson wrote:
> > > > > > > Does the right thing happen here if the vCPU is in guest mode when
> > > > > > > userspace decides to toggle the CPUID.80000001H:EDX.RDTSCP bit on or
> > > > > > > off?
> > > > > > I hate our terminology.  By "guest mode", do you mean running the vCPU, or do
> > > > > > you specifically mean running in L2?
> > > > > > 
> > > > > 
> > > > > Guest mode should mean L2.
> > > > > 
> > > > > (I wonder if we should have a capability that says "KVM_SET_CPUID2 can
> > > > > only be called prior to KVM_RUN").
> > > > 
> > > > It would certainly make it easier to reason about potential security issues.
> > > > 
> > > I vote too for this.
> > 
> > Alternatively, what about adding KVM_VCPU_RESET to let userspace explicitly
> > pull RESET#, and defining that ioctl() to freeze the vCPU model?  I.e. after
> > userspace resets the vCPU, KVM_SET_CPUID (and any other relevant ioctls) is
> > disallowed.
> > 
> > Lack of proper RESET emulation is annoying, e.g. userspace has to manually stuff
> > EDX after vCPU creation to get the right value at RESET.  A dedicated ioctl()
> > would kill two birds with one stone, without having to add yet another "2"
> > ioctl().
> 
> That has the disadvantage of making the more secure behavior opt-in, but we can
> do both (forbidding KVM_SET_CPUID2 after both KVM_RUN and KVM_RESET).

Doesn't changing KVM_SET_CPUID2 need to be opt-in as well, e.g. if the strict
behavior is activated via a capability?
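
E.g., something along these lines, where the capability name is made up
purely for illustration (KVM_ENABLE_CAP itself exists; this capability does
not):

struct kvm_enable_cap cap = { .cap = KVM_CAP_STRICT_CPUID_ORDERING };

ioctl(vm_fd, KVM_ENABLE_CAP, &cap);	/* VMM opts into the strict behavior */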

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread

Thread overview: 68+ messages
2021-05-04 17:17 [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Sean Christopherson
2021-05-04 17:17 ` [PATCH 01/15] KVM: VMX: Do not adverise RDPID if ENABLE_RDTSCP control is unsupported Sean Christopherson
2021-05-04 17:37   ` Jim Mattson
2021-05-04 17:53     ` Jim Mattson
2021-05-04 18:14       ` Sean Christopherson
2021-05-05  3:04   ` Reiji Watanabe
2021-05-10  8:03   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 02/15] KVM: x86: Emulate RDPID only if RDTSCP is supported Sean Christopherson
2021-05-04 17:50   ` Jim Mattson
2021-05-05  3:51   ` Reiji Watanabe
2021-05-05  8:01     ` Paolo Bonzini
2021-05-10  8:08   ` Maxim Levitsky
2021-05-10 17:20     ` Sean Christopherson
2021-05-11 12:32       ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 03/15] KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guest Sean Christopherson
2021-05-04 21:45   ` Jim Mattson
2021-05-04 21:53     ` Sean Christopherson
2021-05-04 21:56       ` Jim Mattson
2021-05-04 22:10         ` Sean Christopherson
2021-05-04 22:24           ` Jim Mattson
2021-05-04 21:57       ` Paolo Bonzini
2021-05-04 21:58         ` Jim Mattson
2021-05-10  8:08           ` Maxim Levitsky
2021-05-10 16:56             ` Sean Christopherson
2021-05-11 12:34               ` Maxim Levitsky
2021-05-18 10:59               ` Paolo Bonzini
2021-05-18 19:24                 ` Sean Christopherson
2021-05-05  4:26   ` Reiji Watanabe
2021-05-10  8:08   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 04/15] KVM: x86: Move RDPID emulation intercept to its own enum Sean Christopherson
2021-05-04 23:24   ` Jim Mattson
2021-05-10  8:14   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 05/15] KVM: VMX: Disable preemption when probing user return MSRs Sean Christopherson
2021-05-04 23:36   ` Jim Mattson
2021-05-10  8:18   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 06/15] KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host Sean Christopherson
2021-05-10  8:20   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 07/15] KVM: x86: Add support for RDPID without RDTSCP Sean Christopherson
2021-05-10  8:20   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 08/15] KVM: VMX: Configure list of user return MSRs at module init Sean Christopherson
2021-05-10  8:23   ` Maxim Levitsky
2021-05-10 15:13     ` Sean Christopherson
2021-05-11 12:34       ` Maxim Levitsky
2021-05-11 20:10         ` Sean Christopherson
2021-05-04 17:17 ` [PATCH 09/15] KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list Sean Christopherson
2021-05-08  3:31   ` Reiji Watanabe
2021-05-10 16:43     ` Sean Christopherson
2021-05-10 17:55       ` Reiji Watanabe
2021-05-10  8:25   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 10/15] KVM: VMX: Use common x86's uret MSR list as the one true list Sean Christopherson
2021-05-10  8:25   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 11/15] KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way Sean Christopherson
2021-05-05  8:49   ` Paolo Bonzini
2021-05-05 15:36     ` Sean Christopherson
2021-05-05 15:50       ` Paolo Bonzini
2021-05-10  8:26   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 12/15] KVM: x86: Export the number of uret MSRs to vendor modules Sean Christopherson
2021-05-10  8:27   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 13/15] KVM: x86: Move uret MSR slot management to common x86 Sean Christopherson
2021-05-10  8:28   ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 14/15] KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model Sean Christopherson
2021-05-10  8:29   ` Maxim Levitsky
2021-05-10 16:50     ` Sean Christopherson
2021-05-10 17:11       ` Jim Mattson
2021-05-11 12:34         ` Maxim Levitsky
2021-05-04 17:17 ` [PATCH 15/15] KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed Sean Christopherson
2021-05-10  8:29   ` Maxim Levitsky
2021-05-05  8:51 ` [PATCH 00/15] KVM: x86: RDPID/RDTSCP fixes and uret MSR cleanups Paolo Bonzini
