All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest
@ 2018-01-08 18:08 Paolo Bonzini
  2018-01-08 18:08 ` [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors Paolo Bonzini
                   ` (7 more replies)
  0 siblings, 8 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

This series allows guests to use the MSR_IA32_SPEC_CTRL and
MSR_IA32_PRED_CMD model specific registers that were added as mitigations
for CVE-2017-5715.

These are only the KVM specific parts of the fix.  It does *not* yet
include any protection for reading host memory from the guest, because
that would be done in the same way as the rest of Linux.  So there is no
IBRS *usage* here, no retpolines, no stuffing of the return stack buffer.
(KVM already includes a fix to clear all registers on vmexit, which is
enough to block Google Project Zero's PoC exploit).

However, I am including the changes to use IBPB (indirect branch
predictor barrier) if available.  That occurs only when there is a VCPU
switch on a physical CPU, thus it has a small impact on performance.

The patches are a bit hackish because the relevant cpufeatures have
not been included yet, and because I wanted to make the patches easier
to backport to distro kernels if desired, but I would still like to
have them in 4.16.

Please review.

Thanks,

Paolo

Paolo Bonzini (5):
  KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors
  x86/msr: add definitions for indirect branch predictor MSRs
  kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  kvm: svm: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to guest
  KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists

Tim Chen (1):
  kvm: vmx: Set IBPB when running a different VCPU

Tom Lendacky (1):
  x86/svm: Set IBPB when running a different VCPU

 arch/x86/include/asm/msr-index.h |  5 ++++
 arch/x86/kvm/cpuid.c             | 27 +++++++++++++----
 arch/x86/kvm/cpuid.h             | 22 ++++++++++++++
 arch/x86/kvm/svm.c               | 65 +++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx.c               | 41 +++++++++++++++++++++++++
 arch/x86/kvm/x86.c               |  1 +
 6 files changed, 154 insertions(+), 7 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
@ 2018-01-08 18:08 ` Paolo Bonzini
  2018-01-08 18:33   ` Konrad Rzeszutek Wilk
                     ` (2 more replies)
  2018-01-08 18:08 ` [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs Paolo Bonzini
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

As an interim measure until SPEC_CTRL is supported by upstream
Linux in cpufeatures, add a function that lets vmx.c and svm.c
know whether to save/restore MSR_IA32_SPEC_CTRL.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/cpuid.c |  3 ---
 arch/x86/kvm/cpuid.h | 22 ++++++++++++++++++++++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8e9a07c557f1..767af697c20c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -67,9 +67,6 @@ u64 kvm_supported_xcr0(void)
 
 #define F(x) bit(X86_FEATURE_##x)
 
-/* These are scattered features in cpufeatures.h. */
-#define KVM_CPUID_BIT_AVX512_4VNNIW     2
-#define KVM_CPUID_BIT_AVX512_4FMAPS     3
 #define KF(x) bit(KVM_CPUID_BIT_##x)
 
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index c2cea6651279..8d04ccf177ce 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -155,6 +155,28 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
 	return x86_stepping(best->eax);
 }
 
+/* These are scattered features in cpufeatures.h. */
+#define KVM_CPUID_BIT_AVX512_4VNNIW	2
+#define KVM_CPUID_BIT_AVX512_4FMAPS	3
+#define KVM_CPUID_BIT_SPEC_CTRL		26
+#define KVM_CPUID_BIT_STIBP		27
+
+/* CPUID[eax=0x80000008].ebx */
+#define KVM_CPUID_BIT_IBPB_SUPPORT	12
+
+static inline bool cpu_has_spec_ctrl(void)
+{
+	u32 eax, ebx, ecx, edx;
+	cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
+
+	return edx & bit(KVM_CPUID_BIT_SPEC_CTRL);
+}
+
+static inline bool cpu_has_ibpb_support(void)
+{
+	return cpuid_ebx(0x80000008) & bit(KVM_CPUID_BIT_IBPB_SUPPORT);
+}
+
 static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
  2018-01-08 18:08 ` [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors Paolo Bonzini
@ 2018-01-08 18:08 ` Paolo Bonzini
  2018-01-08 18:35   ` Konrad Rzeszutek Wilk
  2018-01-08 19:10   ` Liran Alon
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

KVM will start using them soon.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/msr-index.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 03ffde6217d0..ec08f1d8d39b 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -39,6 +39,11 @@
 
 /* Intel MSRs. Some also available on other CPUs */
 
+#define MSR_IA32_SPEC_CTRL		0x00000048
+
+#define MSR_IA32_PRED_CMD		0x00000049
+#define FEATURE_SET_IBPB		(1UL << 0)
+
 #define MSR_PPIN_CTL			0x0000004e
 #define MSR_PPIN			0x0000004f
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
  2018-01-08 18:08 ` [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors Paolo Bonzini
  2018-01-08 18:08 ` [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs Paolo Bonzini
@ 2018-01-08 18:08 ` Paolo Bonzini
  2018-01-08 18:43   ` Konrad Rzeszutek Wilk
                     ` (5 more replies)
  2018-01-08 18:08 ` [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU Paolo Bonzini
                   ` (4 subsequent siblings)
  7 siblings, 6 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
it yet).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 669f5f74857d..d00bcad7336e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -120,6 +120,8 @@
 module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
 #endif
 
+static bool __read_mostly have_spec_ctrl;
+
 #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
 #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
 #define KVM_VM_CR0_ALWAYS_ON						\
@@ -609,6 +611,8 @@ struct vcpu_vmx {
 	u64 		      msr_host_kernel_gs_base;
 	u64 		      msr_guest_kernel_gs_base;
 #endif
+	u64		      spec_ctrl;
+
 	u32 vm_entry_controls_shadow;
 	u32 vm_exit_controls_shadow;
 	u32 secondary_exec_control;
@@ -3361,6 +3365,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_TSC:
 		msr_info->data = guest_read_tsc(vcpu);
 		break;
+	case MSR_IA32_SPEC_CTRL:
+		msr_info->data = to_vmx(vcpu)->spec_ctrl;
+		break;
 	case MSR_IA32_SYSENTER_CS:
 		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
 		break;
@@ -3500,6 +3507,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr_info);
 		break;
+	case MSR_IA32_SPEC_CTRL:
+		to_vmx(vcpu)->spec_ctrl = msr_info->data;
+		break;
 	case MSR_IA32_CR_PAT:
 		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
 			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -7062,6 +7072,17 @@ static __init int hardware_setup(void)
 		goto out;
 	}
 
+	/*
+	 * FIXME: this is only needed until SPEC_CTRL is supported
+	 * by upstream Linux in cpufeatures, then it can be replaced
+	 * with static_cpu_has.
+	 */
+	have_spec_ctrl = cpu_has_spec_ctrl();
+	if (have_spec_ctrl)
+		pr_info("kvm: SPEC_CTRL available\n");
+	else
+		pr_info("kvm: SPEC_CTRL not available\n");
+
 	if (boot_cpu_has(X86_FEATURE_NX))
 		kvm_enable_efer_bits(EFER_NX);
 
@@ -7131,6 +7152,8 @@ static __init int hardware_setup(void)
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
 	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
+	vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
+	vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
 
 	memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
 			vmx_msr_bitmap_legacy, PAGE_SIZE);
@@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	pt_guest_enter(vmx);
 
+	if (have_spec_ctrl && vmx->spec_ctrl != 0)
+		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+
 	atomic_switch_perf_msrs(vmx);
 
 	vmx_arm_hv_timer(vcpu);
@@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 #endif
 	      );
 
+	if (have_spec_ctrl) {
+		rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+		if (vmx->spec_ctrl)
+			wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+	}
+
 	/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
 	if (vmx->host_debugctlmsr)
 		update_debugctlmsr(vmx->host_debugctlmsr);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
                   ` (2 preceding siblings ...)
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
@ 2018-01-08 18:08 ` Paolo Bonzini
  2018-01-08 19:23   ` Liran Alon
                     ` (2 more replies)
  2018-01-08 18:08 ` [PATCH 5/7] kvm: svm: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to guest Paolo Bonzini
                   ` (3 subsequent siblings)
  7 siblings, 3 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp, Tim Chen

From: Tim Chen <tim.c.chen@linux.intel.com>

Ensure an IBPB (Indirect branch prediction barrier) before every VCPU
switch.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d00bcad7336e..bf127c570675 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2375,6 +2375,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
 		per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
 		vmcs_load(vmx->loaded_vmcs->vmcs);
+		if (have_spec_ctrl)
+			wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
 	}
 
 	if (!already_loaded) {
@@ -4029,6 +4031,13 @@ static void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
 	free_vmcs(loaded_vmcs->vmcs);
 	loaded_vmcs->vmcs = NULL;
 	WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
+
+	/*
+	 * The VMCS could be recycled, causing a false negative in
+	 * vmx_vcpu_load; block speculative execution.
+	 */
+	if (have_spec_ctrl)
+		wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
 }
 
 static void vmx_nested_free_vmcs02(struct vcpu_vmx *vmx)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 5/7] kvm: svm: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to guest
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
                   ` (3 preceding siblings ...)
  2018-01-08 18:08 ` [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU Paolo Bonzini
@ 2018-01-08 18:08 ` Paolo Bonzini
  2018-01-08 19:41   ` Liran Alon
  2018-01-08 18:08 ` [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU Paolo Bonzini
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
it yet).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 55dde3896898..779981a00d01 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -183,6 +183,8 @@ struct vcpu_svm {
 		u64 gs_base;
 	} host;
 
+	u64 spec_ctrl;
+
 	u32 *msrpm;
 
 	ulong nmi_iret_rip;
@@ -236,7 +238,7 @@ struct amd_svm_iommu_ir {
 
 static const struct svm_direct_access_msrs {
 	u32 index;   /* Index of the MSR */
-	bool always; /* True if intercept is always on */
+	bool always; /* True if intercept is always off */
 } direct_access_msrs[] = {
 	{ .index = MSR_STAR,				.always = true  },
 	{ .index = MSR_IA32_SYSENTER_CS,		.always = true  },
@@ -248,6 +250,8 @@ struct amd_svm_iommu_ir {
 	{ .index = MSR_CSTAR,				.always = true  },
 	{ .index = MSR_SYSCALL_MASK,			.always = true  },
 #endif
+	{ .index = MSR_IA32_SPEC_CTRL,			.always = true  },
+	{ .index = MSR_IA32_PRED_CMD,			.always = true  },
 	{ .index = MSR_IA32_LASTBRANCHFROMIP,		.always = false },
 	{ .index = MSR_IA32_LASTBRANCHTOIP,		.always = false },
 	{ .index = MSR_IA32_LASTINTFROMIP,		.always = false },
@@ -283,6 +287,8 @@ struct amd_svm_iommu_ir {
 /* enable/disable Virtual GIF */
 static int vgif = true;
 module_param(vgif, int, 0444);
+ 
+static bool __read_mostly have_spec_ctrl;
 
 static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
@@ -1135,6 +1141,17 @@ static __init int svm_hardware_setup(void)
 			pr_info("Virtual GIF supported\n");
 	}
 
+	/*
+	 * FIXME: this is only needed until SPEC_CTRL is supported
+	 * by upstream Linux in cpufeatures, then it can be replaced
+	 * with static_cpu_has.
+	 */
+	have_spec_ctrl = cpu_has_spec_ctrl();
+	if (have_spec_ctrl)
+		pr_info("kvm: SPEC_CTRL available\n");
+	else
+		pr_info("kvm: SPEC_CTRL not available\n");
+
 	return 0;
 
 err:
@@ -3599,6 +3616,9 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_VM_CR:
 		msr_info->data = svm->nested.vm_cr_msr;
 		break;
+	case MSR_IA32_SPEC_CTRL:
+		msr_info->data = svm->spec_ctrl;
+		break;
 	case MSR_IA32_UCODE_REV:
 		msr_info->data = 0x01000065;
 		break;
@@ -3754,6 +3774,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	case MSR_VM_IGNNE:
 		vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
 		break;
+	case MSR_IA32_SPEC_CTRL:
+		svm->spec_ctrl = data;
+		break;
 	case MSR_IA32_APICBASE:
 		if (kvm_vcpu_apicv_active(vcpu))
 			avic_update_vapic_bar(to_svm(vcpu), data);
@@ -4942,6 +4965,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	local_irq_enable();
 
+	if (have_spec_ctrl && svm->spec_ctrl != 0)
+		wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+
 	asm volatile (
 		"push %%" _ASM_BP "; \n\t"
 		"mov %c[rbx](%[svm]), %%" _ASM_BX " \n\t"
@@ -5015,6 +5041,12 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 #endif
 		);
 
+	if (have_spec_ctrl) {
+		rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+		if (svm->spec_ctrl != 0)
+			wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+	}
+
 #ifdef CONFIG_X86_64
 	wrmsrl(MSR_GS_BASE, svm->host.gs_base);
 #else
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
                   ` (4 preceding siblings ...)
  2018-01-08 18:08 ` [PATCH 5/7] kvm: svm: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to guest Paolo Bonzini
@ 2018-01-08 18:08 ` Paolo Bonzini
  2018-01-08 20:00   ` Liran Alon
  2018-01-08 18:08 ` [PATCH 7/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists Paolo Bonzini
  2018-01-09 10:15 ` [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Thomas Gleixner
  7 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

From: Tom Lendacky <thomas.lendacky@amd.com>

Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
going to run a VCPU different from what was previously run.  Nested
virtualization uses the same VMCB for the second level guest, but the
L1 hypervisor should be using IBRS to protect itself.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 779981a00d01..dd744d896cec 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -289,6 +289,7 @@ struct amd_svm_iommu_ir {
 module_param(vgif, int, 0444);
  
 static bool __read_mostly have_spec_ctrl;
+static bool __read_mostly have_ibpb_support;
 
 static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
@@ -540,6 +541,7 @@ struct svm_cpu_data {
 	struct kvm_ldttss_desc *tss_desc;
 
 	struct page *save_area;
+	struct vmcb *current_vmcb;
 };
 
 static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -1151,6 +1153,11 @@ static __init int svm_hardware_setup(void)
 		pr_info("kvm: SPEC_CTRL available\n");
 	else
 		pr_info("kvm: SPEC_CTRL not available\n");
+	have_ibpb_support = have_spec_ctrl || cpu_has_ibpb_support();
+	if (have_ibpb_support)
+		pr_info("kvm: IBPB_SUPPORT available\n");
+	else
+		pr_info("kvm: IBPB_SUPPORT not available\n");
 
 	return 0;
 
@@ -1725,11 +1732,19 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
 	kvm_vcpu_uninit(vcpu);
 	kmem_cache_free(kvm_vcpu_cache, svm);
+
+	/*
+	 * The VMCB could be recycled, causing a false negative in
+	 * svm_vcpu_load; block speculative execution.
+	 */
+	if (have_ibpb_support)
+		wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
 }
 
 static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
 	int i;
 
 	if (unlikely(cpu != vcpu->cpu)) {
@@ -1758,6 +1773,12 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (static_cpu_has(X86_FEATURE_RDTSCP))
 		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
 
+	if (sd->current_vmcb != svm->vmcb) {
+		sd->current_vmcb = svm->vmcb;
+		if (have_ibpb_support)
+			wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
+	}
+
 	avic_vcpu_load(vcpu, cpu);
 }
 
@@ -2798,6 +2819,11 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
 	if (!nested_vmcb)
 		return 1;
 
+	/*
+	 * No need for IBPB here, the L1 hypervisor should be running with
+	 * IBRS=1 and inserts one already when switching L2 VMs.
+	 */
+
 	/* Exit Guest-Mode */
 	leave_guest_mode(&svm->vcpu);
 	svm->nested.vmcb = 0;
@@ -3061,6 +3087,11 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	if (!nested_vmcb)
 		return false;
 
+	/*
+	 * No need for IBPB here, since the nested VM is less privileged.  The
+	 * L1 hypervisor inserts one already when switching L2 VMs.
+	 */
+
 	if (!nested_vmcb_checks(nested_vmcb)) {
 		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
 		nested_vmcb->control.exit_code_hi = 0;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 7/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
                   ` (5 preceding siblings ...)
  2018-01-08 18:08 ` [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU Paolo Bonzini
@ 2018-01-08 18:08 ` Paolo Bonzini
  2018-01-08 20:07   ` Liran Alon
  2018-01-09 10:15 ` [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Thomas Gleixner
  7 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 18:08 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

Expose them to userspace, now that guests can use them.
I am not adding cpufeatures here to avoid having a kernel
that shows spec_ctrl in /proc/cpuinfo and actually has no
support whatsoever for IBRS/IBPB.  Keep the ugly special-casing
for now, and clean it up once the generic arch/x86/ code
learns about them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/cpuid.c | 24 +++++++++++++++++++++---
 arch/x86/kvm/x86.c   |  1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 767af697c20c..5f43a5940275 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -397,7 +397,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 
 	/* cpuid 7.0.edx*/
 	const u32 kvm_cpuid_7_0_edx_x86_features =
-		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) |
+		KF(SPEC_CTRL) | KF(STIBP);
+
+	/* cpuid 0x80000008.edx */
+	const u32 kvm_cpuid_80000008_ebx_x86_features =
+		KF(IBPB_SUPPORT);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
@@ -483,7 +488,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
 				entry->ecx &= ~F(PKU);
 			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
-			entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
+			/*
+			 * FIXME: the special casing of SPEC_CTRL and STIBP
+			 * can be removed once they become regular
+			 * cpufeatures.
+			 */
+			entry->edx &= (
+				get_scattered_cpuid_leaf(7, 0, CPUID_EDX) |
+				KF(SPEC_CTRL) | KF(STIBP));
 		} else {
 			entry->ebx = 0;
 			entry->ecx = 0;
@@ -651,7 +663,13 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		if (!g_phys_as)
 			g_phys_as = phys_as;
 		entry->eax = g_phys_as | (virt_as << 8);
-		entry->ebx = entry->edx = 0;
+		/*
+		 * FIXME: mask against cpufeatures, with
+		 * get_scattered_cpuid_leaf(0x80000008, 0, CPUID_EBX),
+		 * once IBPB_SUPPORT becomes a regular cpufeature.
+		 */
+		entry->ebx &= kvm_cpuid_80000008_ebx_x86_features;
+		entry->edx = 0;
 		break;
 	}
 	case 0x80000019:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index daa1918031df..4abb37d9f4d8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1032,6 +1032,7 @@ unsigned int kvm_get_pt_addr_cnt(void)
 	MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
 	MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
 	MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
+	MSR_IA32_SPEC_CTRL,
 };
 
 static unsigned num_msrs_to_save;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors
  2018-01-08 18:08 ` [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors Paolo Bonzini
@ 2018-01-08 18:33   ` Konrad Rzeszutek Wilk
  2018-01-08 19:09   ` Liran Alon
  2018-01-09 11:14   ` David Hildenbrand
  2 siblings, 0 replies; 44+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-01-08 18:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas.lendacky, dwmw, bp

On Mon, Jan 08, 2018 at 07:08:39PM +0100, Paolo Bonzini wrote:
> As an interim measure until SPEC_CTRL is supported by upstream
> Linux in cpufeatures, add a function that lets vmx.c and svm.c
> know whether to save/restore MSR_IA32_SPEC_CTRL.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/cpuid.c |  3 ---
>  arch/x86/kvm/cpuid.h | 22 ++++++++++++++++++++++
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 8e9a07c557f1..767af697c20c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -67,9 +67,6 @@ u64 kvm_supported_xcr0(void)
>  
>  #define F(x) bit(X86_FEATURE_##x)
>  
> -/* These are scattered features in cpufeatures.h. */
> -#define KVM_CPUID_BIT_AVX512_4VNNIW     2
> -#define KVM_CPUID_BIT_AVX512_4FMAPS     3
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>  
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index c2cea6651279..8d04ccf177ce 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -155,6 +155,28 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
>  	return x86_stepping(best->eax);
>  }
>  
> +/* These are scattered features in cpufeatures.h. */
> +#define KVM_CPUID_BIT_AVX512_4VNNIW	2
> +#define KVM_CPUID_BIT_AVX512_4FMAPS	3
> +#define KVM_CPUID_BIT_SPEC_CTRL		26
> +#define KVM_CPUID_BIT_STIBP		27
> +
> +/* CPUID[eax=0x80000008].ebx */
> +#define KVM_CPUID_BIT_IBPB_SUPPORT	12
> +
> +static inline bool cpu_has_spec_ctrl(void)
> +{
> +	u32 eax, ebx, ecx, edx;
> +	cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
> +
> +	return edx & bit(KVM_CPUID_BIT_SPEC_CTRL);
> +}
> +
> +static inline bool cpu_has_ibpb_support(void)
> +{
> +	return cpuid_ebx(0x80000008) & bit(KVM_CPUID_BIT_IBPB_SUPPORT);
> +}
> +
>  static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
>  {
>  	return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
> -- 
> 1.8.3.1
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs
  2018-01-08 18:08 ` [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs Paolo Bonzini
@ 2018-01-08 18:35   ` Konrad Rzeszutek Wilk
  2018-01-08 18:52     ` Jim Mattson
  2018-01-08 19:10   ` Liran Alon
  1 sibling, 1 reply; 44+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-01-08 18:35 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas.lendacky, dwmw, bp

On Mon, Jan 08, 2018 at 07:08:40PM +0100, Paolo Bonzini wrote:
> KVM will start using them soon.

Perhaps include a bit of description?
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/include/asm/msr-index.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 03ffde6217d0..ec08f1d8d39b 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -39,6 +39,11 @@
>  
>  /* Intel MSRs. Some also available on other CPUs */
>  
> +#define MSR_IA32_SPEC_CTRL		0x00000048
> +
> +#define MSR_IA32_PRED_CMD		0x00000049
> +#define FEATURE_SET_IBPB		(1UL << 0)
> +
>  #define MSR_PPIN_CTL			0x0000004e
>  #define MSR_PPIN			0x0000004f
>  
> -- 
> 1.8.3.1
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
@ 2018-01-08 18:43   ` Konrad Rzeszutek Wilk
  2018-01-08 19:18   ` Jim Mattson
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 44+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-01-08 18:43 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas.lendacky, dwmw, bp

On Mon, Jan 08, 2018 at 07:08:41PM +0100, Paolo Bonzini wrote:
> Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
> for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
> IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
> it yet).
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 669f5f74857d..d00bcad7336e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -120,6 +120,8 @@
>  module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
>  #endif
>  
> +static bool __read_mostly have_spec_ctrl;
> +
>  #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
>  #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
>  #define KVM_VM_CR0_ALWAYS_ON						\
> @@ -609,6 +611,8 @@ struct vcpu_vmx {
>  	u64 		      msr_host_kernel_gs_base;
>  	u64 		      msr_guest_kernel_gs_base;
>  #endif
> +	u64		      spec_ctrl;
> +
>  	u32 vm_entry_controls_shadow;
>  	u32 vm_exit_controls_shadow;
>  	u32 secondary_exec_control;
> @@ -3361,6 +3365,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_TSC:
>  		msr_info->data = guest_read_tsc(vcpu);
>  		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		msr_info->data = to_vmx(vcpu)->spec_ctrl;
> +		break;
>  	case MSR_IA32_SYSENTER_CS:
>  		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>  		break;
> @@ -3500,6 +3507,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_TSC:
>  		kvm_write_tsc(vcpu, msr_info);
>  		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		to_vmx(vcpu)->spec_ctrl = msr_info->data;
> +		break;
>  	case MSR_IA32_CR_PAT:
>  		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>  			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> @@ -7062,6 +7072,17 @@ static __init int hardware_setup(void)
>  		goto out;
>  	}
>  
> +	/*
> +	 * FIXME: this is only needed until SPEC_CTRL is supported
> +	 * by upstream Linux in cpufeatures, then it can be replaced
> +	 * with static_cpu_has.
> +	 */
> +	have_spec_ctrl = cpu_has_spec_ctrl();
> +	if (have_spec_ctrl)
> +		pr_info("kvm: SPEC_CTRL available\n");
> +	else
> +		pr_info("kvm: SPEC_CTRL not available\n");
> +
>  	if (boot_cpu_has(X86_FEATURE_NX))
>  		kvm_enable_efer_bits(EFER_NX);
>  
> @@ -7131,6 +7152,8 @@ static __init int hardware_setup(void)
>  	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
>  	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
>  	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
> +	vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
> +	vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
>  
>  	memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
>  			vmx_msr_bitmap_legacy, PAGE_SIZE);
> @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  
>  	pt_guest_enter(vmx);
>  
> +	if (have_spec_ctrl && vmx->spec_ctrl)
> +		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +

Shouldn't this be as close to the assembler code as possible?

>  	atomic_switch_perf_msrs(vmx);
>  
>  	vmx_arm_hv_timer(vcpu);

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs
  2018-01-08 18:35   ` Konrad Rzeszutek Wilk
@ 2018-01-08 18:52     ` Jim Mattson
  0 siblings, 0 replies; 44+ messages in thread
From: Jim Mattson @ 2018-01-08 18:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Paolo Bonzini, LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp

I don't really understand the organization of this file, but I put
these MSRs in the /* Intel defined MSRs. */ block, between
MSR_IA32_TSC_ADJUST and MSR_IA32_BNDCFGS.

On Mon, Jan 8, 2018 at 10:35 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> On Mon, Jan 08, 2018 at 07:08:40PM +0100, Paolo Bonzini wrote:
>> KVM will start using them soon.
>
> Perhaps include a bit of description?
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  arch/x86/include/asm/msr-index.h | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index 03ffde6217d0..ec08f1d8d39b 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -39,6 +39,11 @@
>>
>>  /* Intel MSRs. Some also available on other CPUs */
>>
>> +#define MSR_IA32_SPEC_CTRL           0x00000048
>> +
>> +#define MSR_IA32_PRED_CMD            0x00000049
>> +#define FEATURE_SET_IBPB             (1UL << 0)
>> +
>>  #define MSR_PPIN_CTL                 0x0000004e
>>  #define MSR_PPIN                     0x0000004f
>>
>> --
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors
  2018-01-08 18:08 ` [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors Paolo Bonzini
  2018-01-08 18:33   ` Konrad Rzeszutek Wilk
@ 2018-01-08 19:09   ` Liran Alon
  2018-01-09 10:32     ` Paolo Bonzini
  2018-01-09 11:14   ` David Hildenbrand
  2 siblings, 1 reply; 44+ messages in thread
From: Liran Alon @ 2018-01-08 19:09 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp



On 08/01/18 20:08, Paolo Bonzini wrote:
> As an interim measure until SPEC_CTRL is supported by upstream
> Linux in cpufeatures, add a function that lets vmx.c and svm.c
> know whether to save/restore MSR_IA32_SPEC_CTRL.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/cpuid.c |  3 ---
>   arch/x86/kvm/cpuid.h | 22 ++++++++++++++++++++++
>   2 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 8e9a07c557f1..767af697c20c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -67,9 +67,6 @@ u64 kvm_supported_xcr0(void)
>
>   #define F(x) bit(X86_FEATURE_##x)
>
> -/* These are scattered features in cpufeatures.h. */
> -#define KVM_CPUID_BIT_AVX512_4VNNIW     2
> -#define KVM_CPUID_BIT_AVX512_4FMAPS     3
>   #define KF(x) bit(KVM_CPUID_BIT_##x)
>
>   int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index c2cea6651279..8d04ccf177ce 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -155,6 +155,28 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
>   	return x86_stepping(best->eax);
>   }
>
> +/* These are scattered features in cpufeatures.h. */
> +#define KVM_CPUID_BIT_AVX512_4VNNIW	2
> +#define KVM_CPUID_BIT_AVX512_4FMAPS	3
> +#define KVM_CPUID_BIT_SPEC_CTRL		26
> +#define KVM_CPUID_BIT_STIBP		27
> +
> +/* CPUID[eax=0x80000008].ebx */
> +#define KVM_CPUID_BIT_IBPB_SUPPORT	12
> +
> +static inline bool cpu_has_spec_ctrl(void)
> +{
> +	u32 eax, ebx, ecx, edx;
> +	cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
> +
> +	return edx & bit(KVM_CPUID_BIT_SPEC_CTRL);

Why not just "return cpuid_edx(7) & bit(KVM_CPUID_BIT_SPEC_CTRL);"?
This is also consistent with how cpu_has_ibpb_support() is written.

> +}
> +
> +static inline bool cpu_has_ibpb_support(void)
> +{
> +	return cpuid_ebx(0x80000008) & bit(KVM_CPUID_BIT_IBPB_SUPPORT);
> +}
> +
>   static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
>   {
>   	return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs
  2018-01-08 18:08 ` [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs Paolo Bonzini
  2018-01-08 18:35   ` Konrad Rzeszutek Wilk
@ 2018-01-08 19:10   ` Liran Alon
  1 sibling, 0 replies; 44+ messages in thread
From: Liran Alon @ 2018-01-08 19:10 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp


On 08/01/18 20:08, Paolo Bonzini wrote:
> KVM will start using them soon.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/include/asm/msr-index.h | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 03ffde6217d0..ec08f1d8d39b 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -39,6 +39,11 @@
>
>   /* Intel MSRs. Some also available on other CPUs */
>
> +#define MSR_IA32_SPEC_CTRL		0x00000048
> +
> +#define MSR_IA32_PRED_CMD		0x00000049
> +#define FEATURE_SET_IBPB		(1UL << 0)
> +
>   #define MSR_PPIN_CTL			0x0000004e
>   #define MSR_PPIN			0x0000004f
>
>

Trivially,
Reviewed-by: Liran Alon <liran.alon@oracle.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
  2018-01-08 18:43   ` Konrad Rzeszutek Wilk
@ 2018-01-08 19:18   ` Jim Mattson
  2018-01-08 20:23     ` Liran Alon
  2018-01-08 22:32     ` Paolo Bonzini
  2018-01-08 19:22   ` Liran Alon
                     ` (3 subsequent siblings)
  5 siblings, 2 replies; 44+ messages in thread
From: Jim Mattson @ 2018-01-08 19:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp

Guest usage of MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD should be
predicated on guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL).

On Mon, Jan 8, 2018 at 10:08 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
> for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
> IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
> it yet).
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 669f5f74857d..d00bcad7336e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -120,6 +120,8 @@
>  module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
>  #endif
>
> +static bool __read_mostly have_spec_ctrl;
> +
>  #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
>  #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
>  #define KVM_VM_CR0_ALWAYS_ON                                           \
> @@ -609,6 +611,8 @@ struct vcpu_vmx {
>         u64                   msr_host_kernel_gs_base;
>         u64                   msr_guest_kernel_gs_base;
>  #endif
> +       u64                   spec_ctrl;
> +
>         u32 vm_entry_controls_shadow;
>         u32 vm_exit_controls_shadow;
>         u32 secondary_exec_control;
> @@ -3361,6 +3365,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case MSR_IA32_TSC:
>                 msr_info->data = guest_read_tsc(vcpu);
>                 break;
> +       case MSR_IA32_SPEC_CTRL:
> +               msr_info->data = to_vmx(vcpu)->spec_ctrl;
> +               break;

I have:

if (!have_spec_ctrl ||
    (!msr_info->host_initiated &&
     !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
return 1;
msr_info->data = to_vmx(vcpu)->msr_ia32_spec_ctrl;
break;

>         case MSR_IA32_SYSENTER_CS:
>                 msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>                 break;
> @@ -3500,6 +3507,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case MSR_IA32_TSC:
>                 kvm_write_tsc(vcpu, msr_info);
>                 break;
> +       case MSR_IA32_SPEC_CTRL:
> +               to_vmx(vcpu)->spec_ctrl = msr_info->data;
> +               break;

I have:

if (!have_spec_ctrl ||
    (!msr_info->host_initiated &&
     !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
return 1;
to_vmx(vcpu)->msr_ia32_spec_ctrl = data;
break;

>         case MSR_IA32_CR_PAT:
>                 if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>                         if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> @@ -7062,6 +7072,17 @@ static __init int hardware_setup(void)
>                 goto out;
>         }
>
> +       /*
> +        * FIXME: this is only needed until SPEC_CTRL is supported
> +        * by upstream Linux in cpufeatures, then it can be replaced
> +        * with static_cpu_has.
> +        */
> +       have_spec_ctrl = cpu_has_spec_ctrl();
> +       if (have_spec_ctrl)
> +               pr_info("kvm: SPEC_CTRL available\n");
> +       else
> +               pr_info("kvm: SPEC_CTRL not available\n");
> +
>         if (boot_cpu_has(X86_FEATURE_NX))
>                 kvm_enable_efer_bits(EFER_NX);
>
> @@ -7131,6 +7152,8 @@ static __init int hardware_setup(void)
>         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
>         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
>         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
> +       vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
> +       vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);

I have a lot of changes to MSR permission bitmap handling, but these
intercepts should only be disabled when guest_cpuid_has(vcpu,
X86_FEATURE_SPEC_CTRL).

>         memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
>                         vmx_msr_bitmap_legacy, PAGE_SIZE);
> @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>
>         pt_guest_enter(vmx);
>
> +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
> +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +

Here, I have:

/*
* If the guest is allowed to write to MSR_IA32_SPEC_CTRL,
* store it on VM-exit.
*/
if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
add_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);
else
clear_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);

/*
* If the guest's IA32_SPEC_CTRL MSR doesn't match the host's
* IA32_SPEC_CTRL MSR, then add the MSR to the atomic switch
* MSRs, so that the guest value will be loaded on VM-entry
* and the host value will be loaded on VM-exit.
*/
if (vmx->msr_ia32_spec_ctrl != spec_ctrl_enabled())
add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL,
      vmx->msr_ia32_spec_ctrl,
      spec_ctrl_enabled());
else
clear_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL);

>         atomic_switch_perf_msrs(vmx);
>
>         vmx_arm_hv_timer(vcpu);
> @@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  #endif
>               );
>
> +       if (have_spec_ctrl) {
> +               rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +               if (vmx->spec_ctrl)
> +                       wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> +       }
> +

I know the VM-exit MSR load and store lists are probably slower, but
I'm a little uncomfortable restoring the host's IA32_SPEC_CTRL MSR
late if the guest has it clear and the host has it set.

>         /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
>         if (vmx->host_debugctlmsr)
>                 update_debugctlmsr(vmx->host_debugctlmsr);
> --
> 1.8.3.1
>
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
  2018-01-08 18:43   ` Konrad Rzeszutek Wilk
  2018-01-08 19:18   ` Jim Mattson
@ 2018-01-08 19:22   ` Liran Alon
  2018-01-08 19:41   ` David Woodhouse
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 44+ messages in thread
From: Liran Alon @ 2018-01-08 19:22 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp



On 08/01/18 20:08, Paolo Bonzini wrote:
> Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
> for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
> IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
> it yet).
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/vmx.c | 32 ++++++++++++++++++++++++++++++++
>   1 file changed, 32 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 669f5f74857d..d00bcad7336e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -120,6 +120,8 @@
>   module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
>   #endif
>
> +static bool __read_mostly have_spec_ctrl;
> +
>   #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
>   #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
>   #define KVM_VM_CR0_ALWAYS_ON						\
> @@ -609,6 +611,8 @@ struct vcpu_vmx {
>   	u64 		      msr_host_kernel_gs_base;
>   	u64 		      msr_guest_kernel_gs_base;
>   #endif
> +	u64		      spec_ctrl;
> +
>   	u32 vm_entry_controls_shadow;
>   	u32 vm_exit_controls_shadow;
>   	u32 secondary_exec_control;
> @@ -3361,6 +3365,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   	case MSR_IA32_TSC:
>   		msr_info->data = guest_read_tsc(vcpu);
>   		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		msr_info->data = to_vmx(vcpu)->spec_ctrl;
> +		break;
>   	case MSR_IA32_SYSENTER_CS:
>   		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>   		break;
> @@ -3500,6 +3507,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   	case MSR_IA32_TSC:
>   		kvm_write_tsc(vcpu, msr_info);
>   		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		to_vmx(vcpu)->spec_ctrl = msr_info->data;
> +		break;
>   	case MSR_IA32_CR_PAT:
>   		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>   			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> @@ -7062,6 +7072,17 @@ static __init int hardware_setup(void)
>   		goto out;
>   	}
>
> +	/*
> +	 * FIXME: this is only needed until SPEC_CTRL is supported
> +	 * by upstream Linux in cpufeatures, then it can be replaced
> +	 * with static_cpu_has.
> +	 */
> +	have_spec_ctrl = cpu_has_spec_ctrl();
> +	if (have_spec_ctrl)
> +		pr_info("kvm: SPEC_CTRL available\n");
> +	else
> +		pr_info("kvm: SPEC_CTRL not available\n");
> +
>   	if (boot_cpu_has(X86_FEATURE_NX))
>   		kvm_enable_efer_bits(EFER_NX);
>
> @@ -7131,6 +7152,8 @@ static __init int hardware_setup(void)
>   	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
>   	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
>   	vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
> +	vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
> +	vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
>
>   	memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
>   			vmx_msr_bitmap_legacy, PAGE_SIZE);
> @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>
>   	pt_guest_enter(vmx);
>
> +	if (have_spec_ctrl && vmx->spec_ctrl != 0)
> +		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +

Intel specifies that the restore of MSR_IA32_SPEC_CTRL to guest value 
using WRMSR should be done after the last indirect branch before VMEntry.
However, atomic_switch_perf_msrs() calls perf_guest_get_msrs() which 
calls x86_pmu.guest_get_msrs() which is an indirect branch.

Therefore, it seems that this block should be done after the call to 
vmx_arm_hv_timer().

>   	atomic_switch_perf_msrs(vmx);
>
>   	vmx_arm_hv_timer(vcpu);
> @@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>   #endif
>   	      );
>
> +	if (have_spec_ctrl) {
> +		rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +		if (vmx->spec_ctrl)
> +			wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> +	}
> +
>   	/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
>   	if (vmx->host_debugctlmsr)
>   		update_debugctlmsr(vmx->host_debugctlmsr);
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU
  2018-01-08 18:08 ` [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU Paolo Bonzini
@ 2018-01-08 19:23   ` Liran Alon
  2018-01-08 19:36   ` Jim Mattson
  2018-01-09 11:01   ` David Hildenbrand
  2 siblings, 0 replies; 44+ messages in thread
From: Liran Alon @ 2018-01-08 19:23 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp, Tim Chen



On 08/01/18 20:08, Paolo Bonzini wrote:
> From: Tim Chen <tim.c.chen@linux.intel.com>
>
> Ensure an IBPB (Indirect branch prediction barrier) before every VCPU
> switch.
>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/vmx.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d00bcad7336e..bf127c570675 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2375,6 +2375,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>   	if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
>   		per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
>   		vmcs_load(vmx->loaded_vmcs->vmcs);
> +		if (have_spec_ctrl)
> +			wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>   	}
>
>   	if (!already_loaded) {
> @@ -4029,6 +4031,13 @@ static void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
>   	free_vmcs(loaded_vmcs->vmcs);
>   	loaded_vmcs->vmcs = NULL;
>   	WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
> +
> +	/*
> +	 * The VMCS could be recycled, causing a false negative in
> +	 * vmx_vcpu_load; block speculative execution.
> +	 */
> +	if (have_spec_ctrl)
> +		wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>   }
>
>   static void vmx_nested_free_vmcs02(struct vcpu_vmx *vmx)
>

Reviewed-by: Liran Alon <liran.alon@oracle.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU
  2018-01-08 18:08 ` [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU Paolo Bonzini
  2018-01-08 19:23   ` Liran Alon
@ 2018-01-08 19:36   ` Jim Mattson
  2018-01-09  8:33     ` Paolo Bonzini
  2018-01-09 11:01   ` David Hildenbrand
  2 siblings, 1 reply; 44+ messages in thread
From: Jim Mattson @ 2018-01-08 19:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp, Tim Chen

Shouldn't there be an IBPB on *any* context switch away from a VCPU
thread, even if it is to a non-VCPU thread?

On Mon, Jan 8, 2018 at 10:08 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> From: Tim Chen <tim.c.chen@linux.intel.com>
>
> Ensure an IBPB (Indirect branch prediction barrier) before every VCPU
> switch.
>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d00bcad7336e..bf127c570675 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2375,6 +2375,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>         if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
>                 per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
>                 vmcs_load(vmx->loaded_vmcs->vmcs);
> +               if (have_spec_ctrl)
> +                       wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>         }
>
>         if (!already_loaded) {
> @@ -4029,6 +4031,13 @@ static void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
>         free_vmcs(loaded_vmcs->vmcs);
>         loaded_vmcs->vmcs = NULL;
>         WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
> +
> +       /*
> +        * The VMCS could be recycled, causing a false negative in
> +        * vmx_vcpu_load; block speculative execution.
> +        */
> +       if (have_spec_ctrl)
> +               wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>  }
>
>  static void vmx_nested_free_vmcs02(struct vcpu_vmx *vmx)
> --
> 1.8.3.1
>
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
                     ` (2 preceding siblings ...)
  2018-01-08 19:22   ` Liran Alon
@ 2018-01-08 19:41   ` David Woodhouse
  2018-01-08 22:33     ` Paolo Bonzini
  2018-01-08 22:09   ` Ashok Raj
  2018-01-11  2:47   ` Tim Chen
  5 siblings, 1 reply; 44+ messages in thread
From: David Woodhouse @ 2018-01-08 19:41 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm; +Cc: jmattson, aliguori, thomas.lendacky, bp

[-- Attachment #1: Type: text/plain, Size: 424 bytes --]

On Mon, 2018-01-08 at 19:08 +0100, Paolo Bonzini wrote:
> 
> +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
> +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +

I think this one probably *is* safe even without an 'else lfence',
which means that the CPU can speculate around it, but it wants a
comment explaining that someone has properly analysed it and saying
precisely why.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5213 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 5/7] kvm: svm: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to guest
  2018-01-08 18:08 ` [PATCH 5/7] kvm: svm: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to guest Paolo Bonzini
@ 2018-01-08 19:41   ` Liran Alon
  0 siblings, 0 replies; 44+ messages in thread
From: Liran Alon @ 2018-01-08 19:41 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp



On 08/01/18 20:08, Paolo Bonzini wrote:
> Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
> for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
> IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
> it yet).
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/svm.c | 34 +++++++++++++++++++++++++++++++++-
>   1 file changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 55dde3896898..779981a00d01 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -183,6 +183,8 @@ struct vcpu_svm {
>   		u64 gs_base;
>   	} host;
>
> +	u64 spec_ctrl;
> +
>   	u32 *msrpm;
>
>   	ulong nmi_iret_rip;
> @@ -236,7 +238,7 @@ struct amd_svm_iommu_ir {
>
>   static const struct svm_direct_access_msrs {
>   	u32 index;   /* Index of the MSR */
> -	bool always; /* True if intercept is always on */
> +	bool always; /* True if intercept is always off */

By examining svm_vcpu_init_msrpm() it seems you are correct as if 
always==true then read/write bits are cleared from msrpm (which 
pass-through the MSR to the guest).

However, I do think this comment change should be done in a separate 
commit as it is unrelated.

>   } direct_access_msrs[] = {
>   	{ .index = MSR_STAR,				.always = true  },
>   	{ .index = MSR_IA32_SYSENTER_CS,		.always = true  },
> @@ -248,6 +250,8 @@ struct amd_svm_iommu_ir {
>   	{ .index = MSR_CSTAR,				.always = true  },
>   	{ .index = MSR_SYSCALL_MASK,			.always = true  },
>   #endif
> +	{ .index = MSR_IA32_SPEC_CTRL,			.always = true  },
> +	{ .index = MSR_IA32_PRED_CMD,			.always = true  },
>   	{ .index = MSR_IA32_LASTBRANCHFROMIP,		.always = false },
>   	{ .index = MSR_IA32_LASTBRANCHTOIP,		.always = false },
>   	{ .index = MSR_IA32_LASTINTFROMIP,		.always = false },
> @@ -283,6 +287,8 @@ struct amd_svm_iommu_ir {
>   /* enable/disable Virtual GIF */
>   static int vgif = true;
>   module_param(vgif, int, 0444);
> +
> +static bool __read_mostly have_spec_ctrl;
>
>   static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
> @@ -1135,6 +1141,17 @@ static __init int svm_hardware_setup(void)
>   			pr_info("Virtual GIF supported\n");
>   	}
>
> +	/*
> +	 * FIXME: this is only needed until SPEC_CTRL is supported
> +	 * by upstream Linux in cpufeatures, then it can be replaced
> +	 * with static_cpu_has.
> +	 */
> +	have_spec_ctrl = cpu_has_spec_ctrl();
> +	if (have_spec_ctrl)
> +		pr_info("kvm: SPEC_CTRL available\n");
> +	else
> +		pr_info("kvm: SPEC_CTRL not available\n");
> +
>   	return 0;
>
>   err:
> @@ -3599,6 +3616,9 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>   	case MSR_VM_CR:
>   		msr_info->data = svm->nested.vm_cr_msr;
>   		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		msr_info->data = svm->spec_ctrl;
> +		break;
>   	case MSR_IA32_UCODE_REV:
>   		msr_info->data = 0x01000065;
>   		break;
> @@ -3754,6 +3774,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>   	case MSR_VM_IGNNE:
>   		vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
>   		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		svm->spec_ctrl = data;
> +		break;
>   	case MSR_IA32_APICBASE:
>   		if (kvm_vcpu_apicv_active(vcpu))
>   			avic_update_vapic_bar(to_svm(vcpu), data);
> @@ -4942,6 +4965,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
>
>   	local_irq_enable();
>
> +	if (have_spec_ctrl && svm->spec_ctrl != 0)
> +		wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
> +
>   	asm volatile (
>   		"push %%" _ASM_BP "; \n\t"
>   		"mov %c[rbx](%[svm]), %%" _ASM_BX " \n\t"
> @@ -5015,6 +5041,12 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
>   #endif
>   		);
>
> +	if (have_spec_ctrl) {
> +		rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
> +		if (svm->spec_ctrl != 0)

Very small stuff but I hate inconsistencies:
Either you put here (svm->spec_ctrl) like in arch/x86/kvm/vmx.c or you 
will use there (vmx->spec_ctrl != 0).

> +			wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> +	}
> +
>   #ifdef CONFIG_X86_64
>   	wrmsrl(MSR_GS_BASE, svm->host.gs_base);
>   #else
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU
  2018-01-08 18:08 ` [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU Paolo Bonzini
@ 2018-01-08 20:00   ` Liran Alon
  2018-01-09 11:07     ` Paolo Bonzini
  0 siblings, 1 reply; 44+ messages in thread
From: Liran Alon @ 2018-01-08 20:00 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp



On 08/01/18 20:08, Paolo Bonzini wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
>
> Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
> going to run a VCPU different from what was previously run.  Nested
> virtualization uses the same VMCB for the second level guest, but the
> L1 hypervisor should be using IBRS to protect itself.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/svm.c | 31 +++++++++++++++++++++++++++++++
>   1 file changed, 31 insertions(+)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 779981a00d01..dd744d896cec 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -289,6 +289,7 @@ struct amd_svm_iommu_ir {
>   module_param(vgif, int, 0444);
>
>   static bool __read_mostly have_spec_ctrl;
> +static bool __read_mostly have_ibpb_support;
>
>   static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
> @@ -540,6 +541,7 @@ struct svm_cpu_data {
>   	struct kvm_ldttss_desc *tss_desc;
>
>   	struct page *save_area;
> +	struct vmcb *current_vmcb;
>   };
>
>   static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
> @@ -1151,6 +1153,11 @@ static __init int svm_hardware_setup(void)
>   		pr_info("kvm: SPEC_CTRL available\n");
>   	else
>   		pr_info("kvm: SPEC_CTRL not available\n");
> +	have_ibpb_support = have_spec_ctrl || cpu_has_ibpb_support();
> +	if (have_ibpb_support)
> +		pr_info("kvm: IBPB_SUPPORT available\n");
> +	else
> +		pr_info("kvm: IBPB_SUPPORT not available\n");
>
>   	return 0;
>
> @@ -1725,11 +1732,19 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
>   	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
>   	kvm_vcpu_uninit(vcpu);
>   	kmem_cache_free(kvm_vcpu_cache, svm);
> +
> +	/*
> +	 * The VMCB could be recycled, causing a false negative in
> +	 * svm_vcpu_load; block speculative execution.
> +	 */
> +	if (have_ibpb_support)
> +		wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>   }
>
>   static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>   {
>   	struct vcpu_svm *svm = to_svm(vcpu);
> +	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
>   	int i;
>
>   	if (unlikely(cpu != vcpu->cpu)) {
> @@ -1758,6 +1773,12 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>   	if (static_cpu_has(X86_FEATURE_RDTSCP))
>   		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>
> +	if (sd->current_vmcb != svm->vmcb) {
> +		sd->current_vmcb = svm->vmcb;
> +		if (have_ibpb_support)
> +			wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
> +	}
> +
>   	avic_vcpu_load(vcpu, cpu);
>   }
>
> @@ -2798,6 +2819,11 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
>   	if (!nested_vmcb)
>   		return 1;
>
> +	/*
> +	 * No need for IBPB here, the L1 hypervisor should be running with
> +	 * IBRS=1 and inserts one already when switching L2 VMs.
> +	 */
> +

I am not fully convinced yet this is OK from security perspective.
 From the CPU point-of-view, both L1 & L2 run in the same 
prediction-mode (as they are both running in SVM guest-mode). Therefore, 
IBRS will not hide the BHB & BTB of L1 from L2.

This is how I understand things:
1. When transition between contexts in same prediction-mode, software is 
responsible for issuing an IBPB to basically "delete" the "branch 
prediction data" of the previous context such that the new context won't 
be able to use it.
This is orthogonal to IBRS which makes sure new context won't use 
"branch prediction data" that was created by a previous less-privileged 
prediction-mode as we are talking about transitioning between 2 contexts 
in same physical prediction-mode.
2. Because of (1), KVM was modified to issue IBPB when replacing active 
VMCB/VMCS on CPU.
3. For the nested-virtualization case, L1 & L2 both run in same physical 
prediction-mode. As from physical CPU perspective, they are both running 
in SVM guest-mode. Therefore, L0 should issue IBPB when transitioning 
from L1 to L2 and vice-versa.
4. In nested VMX case, this already happens because transitioning 
between L1 & L2 involves changing active VMCS on CPU (from vmcs01 to 
vmcs02) which already issues an IBPB.
5. However, nested SVM is reusing the same VMCB when transitioning 
between L1 & L2 and therefore we should explicitly issue an IBPB in 
nested_svm_vmrun() & nested_svm_vmexit().
6. One can implicitly assume that L1 hypervisor did issued a IBPB when 
loading an VMCB and therefore it doesn't have to do it again in L0.
However, I see 2 problems with this approach:
(a) We don't protect old non-patched L1 hypervisors from Spectre even 
though we could easily do that from L0 with small performance hit.
(b) When L1 hypervisor runs only a single L2 VM, it doesn't have to 
issue an IBPB when loading the VMCB (as it didn't run any other VMCB 
before) and it should be sufficient from L1 perspective to just write 1 
to IBRS. However, in our nested-case, this is a security-hole.

Am I misunderstanding something?

Regards,
-Liran

>   	/* Exit Guest-Mode */
>   	leave_guest_mode(&svm->vcpu);
>   	svm->nested.vmcb = 0;
> @@ -3061,6 +3087,11 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
>   	if (!nested_vmcb)
>   		return false;
>
> +	/*
> +	 * No need for IBPB here, since the nested VM is less privileged.  The
> +	 * L1 hypervisor inserts one already when switching L2 VMs.
> +	 */
> +
>   	if (!nested_vmcb_checks(nested_vmcb)) {
>   		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
>   		nested_vmcb->control.exit_code_hi = 0;
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 7/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists
  2018-01-08 18:08 ` [PATCH 7/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists Paolo Bonzini
@ 2018-01-08 20:07   ` Liran Alon
  2018-01-08 20:15     ` Jim Mattson
  0 siblings, 1 reply; 44+ messages in thread
From: Liran Alon @ 2018-01-08 20:07 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp



On 08/01/18 20:08, Paolo Bonzini wrote:
> Expose them to userspace, now that guests can use them.
> I am not adding cpufeatures here to avoid having a kernel
> that shows spec_ctrl in /proc/cpuinfo and actually has no
> support whatsoever for IBRS/IBPB.  Keep the ugly special-casing
> for now, and clean it up once the generic arch/x86/ code
> learns about them.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/cpuid.c | 24 +++++++++++++++++++++---
>   arch/x86/kvm/x86.c   |  1 +
>   2 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 767af697c20c..5f43a5940275 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -397,7 +397,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>
>   	/* cpuid 7.0.edx*/
>   	const u32 kvm_cpuid_7_0_edx_x86_features =
> -		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
> +		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) |
> +		KF(SPEC_CTRL) | KF(STIBP);
> +
> +	/* cpuid 0x80000008.edx */
> +	const u32 kvm_cpuid_80000008_ebx_x86_features =
> +		KF(IBPB_SUPPORT);
>
>   	/* all calls to cpuid_count() should be made on the same cpu */
>   	get_cpu();
> @@ -483,7 +488,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>   			if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
>   				entry->ecx &= ~F(PKU);
>   			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
> -			entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
> +			/*
> +			 * FIXME: the special casing of SPEC_CTRL and STIBP
> +			 * can be removed once they become regular
> +			 * cpufeatures.
> +			 */
> +			entry->edx &= (
> +				get_scattered_cpuid_leaf(7, 0, CPUID_EDX) |
> +				KF(SPEC_CTRL) | KF(STIBP));
>   		} else {
>   			entry->ebx = 0;
>   			entry->ecx = 0;
> @@ -651,7 +663,13 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>   		if (!g_phys_as)
>   			g_phys_as = phys_as;
>   		entry->eax = g_phys_as | (virt_as << 8);
> -		entry->ebx = entry->edx = 0;
> +		/*
> +		 * FIXME: mask against cpufeatures, with
> +		 * get_scattered_cpuid_leaf(0x80000008, 0, CPUID_EBX),
> +		 * once IBPB_SUPPORT becomes a regular cpufeature.
> +		 */
> +		entry->ebx &= kvm_cpuid_80000008_ebx_x86_features;
> +		entry->edx = 0;
>   		break;
>   	}
>   	case 0x80000019:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index daa1918031df..4abb37d9f4d8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1032,6 +1032,7 @@ unsigned int kvm_get_pt_addr_cnt(void)
>   	MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
>   	MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
>   	MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
> +	MSR_IA32_SPEC_CTRL,
>   };
>
>   static unsigned num_msrs_to_save;
>

Reviewed-by: Liran Alon <liran.alon@oracle.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 7/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists
  2018-01-08 20:07   ` Liran Alon
@ 2018-01-08 20:15     ` Jim Mattson
  0 siblings, 0 replies; 44+ messages in thread
From: Jim Mattson @ 2018-01-08 20:15 UTC (permalink / raw)
  To: Liran Alon
  Cc: Paolo Bonzini, LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp

Reviewed-by: Jim Mattson <jmattson@google.com>

On Mon, Jan 8, 2018 at 12:07 PM, Liran Alon <LIRAN.ALON@oracle.com> wrote:
>
>
> On 08/01/18 20:08, Paolo Bonzini wrote:
>>
>> Expose them to userspace, now that guests can use them.
>> I am not adding cpufeatures here to avoid having a kernel
>> that shows spec_ctrl in /proc/cpuinfo and actually has no
>> support whatsoever for IBRS/IBPB.  Keep the ugly special-casing
>> for now, and clean it up once the generic arch/x86/ code
>> learns about them.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   arch/x86/kvm/cpuid.c | 24 +++++++++++++++++++++---
>>   arch/x86/kvm/x86.c   |  1 +
>>   2 files changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 767af697c20c..5f43a5940275 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -397,7 +397,12 @@ static inline int __do_cpuid_ent(struct
>> kvm_cpuid_entry2 *entry, u32 function,
>>
>>         /* cpuid 7.0.edx*/
>>         const u32 kvm_cpuid_7_0_edx_x86_features =
>> -               KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
>> +               KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) |
>> +               KF(SPEC_CTRL) | KF(STIBP);
>> +
>> +       /* cpuid 0x80000008.edx */
>> +       const u32 kvm_cpuid_80000008_ebx_x86_features =
>> +               KF(IBPB_SUPPORT);
>>
>>         /* all calls to cpuid_count() should be made on the same cpu */
>>         get_cpu();
>> @@ -483,7 +488,14 @@ static inline int __do_cpuid_ent(struct
>> kvm_cpuid_entry2 *entry, u32 function,
>>                         if (!tdp_enabled ||
>> !boot_cpu_has(X86_FEATURE_OSPKE))
>>                                 entry->ecx &= ~F(PKU);
>>                         entry->edx &= kvm_cpuid_7_0_edx_x86_features;
>> -                       entry->edx &= get_scattered_cpuid_leaf(7, 0,
>> CPUID_EDX);
>> +                       /*
>> +                        * FIXME: the special casing of SPEC_CTRL and
>> STIBP
>> +                        * can be removed once they become regular
>> +                        * cpufeatures.
>> +                        */
>> +                       entry->edx &= (
>> +                               get_scattered_cpuid_leaf(7, 0, CPUID_EDX)
>> |
>> +                               KF(SPEC_CTRL) | KF(STIBP));
>>                 } else {
>>                         entry->ebx = 0;
>>                         entry->ecx = 0;
>> @@ -651,7 +663,13 @@ static inline int __do_cpuid_ent(struct
>> kvm_cpuid_entry2 *entry, u32 function,
>>                 if (!g_phys_as)
>>                         g_phys_as = phys_as;
>>                 entry->eax = g_phys_as | (virt_as << 8);
>> -               entry->ebx = entry->edx = 0;
>> +               /*
>> +                * FIXME: mask against cpufeatures, with
>> +                * get_scattered_cpuid_leaf(0x80000008, 0, CPUID_EBX),
>> +                * once IBPB_SUPPORT becomes a regular cpufeature.
>> +                */
>> +               entry->ebx &= kvm_cpuid_80000008_ebx_x86_features;
>> +               entry->edx = 0;
>>                 break;
>>         }
>>         case 0x80000019:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index daa1918031df..4abb37d9f4d8 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1032,6 +1032,7 @@ unsigned int kvm_get_pt_addr_cnt(void)
>>         MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
>>         MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
>>         MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
>> +       MSR_IA32_SPEC_CTRL,
>>   };
>>
>>   static unsigned num_msrs_to_save;
>>
>
> Reviewed-by: Liran Alon <liran.alon@oracle.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 19:18   ` Jim Mattson
@ 2018-01-08 20:23     ` Liran Alon
  2018-01-08 22:32     ` Paolo Bonzini
  1 sibling, 0 replies; 44+ messages in thread
From: Liran Alon @ 2018-01-08 20:23 UTC (permalink / raw)
  To: Jim Mattson, Paolo Bonzini
  Cc: LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp



On 08/01/18 21:18, Jim Mattson wrote:
> Guest usage of MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD should be
> predicated on guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL).
>
> On Mon, Jan 8, 2018 at 10:08 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
>> for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
>> IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
>> it yet).
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   arch/x86/kvm/vmx.c | 32 ++++++++++++++++++++++++++++++++
>>   1 file changed, 32 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 669f5f74857d..d00bcad7336e 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -120,6 +120,8 @@
>>   module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
>>   #endif
>>
>> +static bool __read_mostly have_spec_ctrl;
>> +
>>   #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
>>   #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
>>   #define KVM_VM_CR0_ALWAYS_ON                                           \
>> @@ -609,6 +611,8 @@ struct vcpu_vmx {
>>          u64                   msr_host_kernel_gs_base;
>>          u64                   msr_guest_kernel_gs_base;
>>   #endif
>> +       u64                   spec_ctrl;
>> +
>>          u32 vm_entry_controls_shadow;
>>          u32 vm_exit_controls_shadow;
>>          u32 secondary_exec_control;
>> @@ -3361,6 +3365,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>          case MSR_IA32_TSC:
>>                  msr_info->data = guest_read_tsc(vcpu);
>>                  break;
>> +       case MSR_IA32_SPEC_CTRL:
>> +               msr_info->data = to_vmx(vcpu)->spec_ctrl;
>> +               break;
>
> I have:
>
> if (!have_spec_ctrl ||
>      (!msr_info->host_initiated &&
>       !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
> return 1;
> msr_info->data = to_vmx(vcpu)->msr_ia32_spec_ctrl;
> break;
>
>>          case MSR_IA32_SYSENTER_CS:
>>                  msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>>                  break;
>> @@ -3500,6 +3507,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>          case MSR_IA32_TSC:
>>                  kvm_write_tsc(vcpu, msr_info);
>>                  break;
>> +       case MSR_IA32_SPEC_CTRL:
>> +               to_vmx(vcpu)->spec_ctrl = msr_info->data;
>> +               break;
>
> I have:
>
> if (!have_spec_ctrl ||
>      (!msr_info->host_initiated &&
>       !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
> return 1;
> to_vmx(vcpu)->msr_ia32_spec_ctrl = data;
> break;
>
>>          case MSR_IA32_CR_PAT:
>>                  if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>>                          if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
>> @@ -7062,6 +7072,17 @@ static __init int hardware_setup(void)
>>                  goto out;
>>          }
>>
>> +       /*
>> +        * FIXME: this is only needed until SPEC_CTRL is supported
>> +        * by upstream Linux in cpufeatures, then it can be replaced
>> +        * with static_cpu_has.
>> +        */
>> +       have_spec_ctrl = cpu_has_spec_ctrl();
>> +       if (have_spec_ctrl)
>> +               pr_info("kvm: SPEC_CTRL available\n");
>> +       else
>> +               pr_info("kvm: SPEC_CTRL not available\n");
>> +
>>          if (boot_cpu_has(X86_FEATURE_NX))
>>                  kvm_enable_efer_bits(EFER_NX);
>>
>> @@ -7131,6 +7152,8 @@ static __init int hardware_setup(void)
>>          vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
>>          vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
>>          vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
>> +       vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
>> +       vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
>
> I have a lot of changes to MSR permission bitmap handling, but these
> intercepts should only be disabled when guest_cpuid_has(vcpu,
> X86_FEATURE_SPEC_CTRL).
>
>>          memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
>>                          vmx_msr_bitmap_legacy, PAGE_SIZE);
>> @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>
>>          pt_guest_enter(vmx);
>>
>> +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
>> +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
>> +
>
> Here, I have:
>
> /*
> * If the guest is allowed to write to MSR_IA32_SPEC_CTRL,
> * store it on VM-exit.
> */
> if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> add_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);
> else
> clear_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);
>
> /*
> * If the guest's IA32_SPEC_CTRL MSR doesn't match the host's
> * IA32_SPEC_CTRL MSR, then add the MSR to the atomic switch
> * MSRs, so that the guest value will be loaded on VM-entry
> * and the host value will be loaded on VM-exit.
> */
> if (vmx->msr_ia32_spec_ctrl != spec_ctrl_enabled())
> add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL,
>        vmx->msr_ia32_spec_ctrl,
>        spec_ctrl_enabled());
> else
> clear_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL);
>

I totally agree with this.

This exactly solves the issue I mentioned before of restoring the guest 
value of MSR_IA32_SPEC_CTRL using WRMSR before calling 
atomic_switch_perf_msrs() which does an indirect branch.

>>          atomic_switch_perf_msrs(vmx);
>>
>>          vmx_arm_hv_timer(vcpu);
>> @@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>   #endif
>>                );
>>
>> +       if (have_spec_ctrl) {
>> +               rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
>> +               if (vmx->spec_ctrl)
>> +                       wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>> +       }
>> +
>
> I know the VM-exit MSR load and store lists are probably slower, but
> I'm a little uncomfortable restoring the host's IA32_SPEC_CTRL MSR
> late if the guest has it clear and the host has it set.

Again, I totally agree. This is a better approach for handling this.

>
>>          /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
>>          if (vmx->host_debugctlmsr)
>>                  update_debugctlmsr(vmx->host_debugctlmsr);
>> --
>> 1.8.3.1
>>
>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
                     ` (3 preceding siblings ...)
  2018-01-08 19:41   ` David Woodhouse
@ 2018-01-08 22:09   ` Ashok Raj
  2018-01-08 22:25     ` Paolo Bonzini
  2018-01-11  2:47   ` Tim Chen
  5 siblings, 1 reply; 44+ messages in thread
From: Ashok Raj @ 2018-01-08 22:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas.lendacky, dwmw, bp,
	Raj, Ashok

Hi Paolo

Do you assume that host isn't using IBRS and only guest uses it?



On Mon, Jan 8, 2018 at 10:08 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
> for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
> IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
> it yet).
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 669f5f74857d..d00bcad7336e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -120,6 +120,8 @@
>  module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
>  #endif
>
> +static bool __read_mostly have_spec_ctrl;
> +
>  #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
>  #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
>  #define KVM_VM_CR0_ALWAYS_ON                                           \
> @@ -609,6 +611,8 @@ struct vcpu_vmx {
>         u64                   msr_host_kernel_gs_base;
>         u64                   msr_guest_kernel_gs_base;
>  #endif
> +       u64                   spec_ctrl;
> +
>         u32 vm_entry_controls_shadow;
>         u32 vm_exit_controls_shadow;
>         u32 secondary_exec_control;
> @@ -3361,6 +3365,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case MSR_IA32_TSC:
>                 msr_info->data = guest_read_tsc(vcpu);
>                 break;
> +       case MSR_IA32_SPEC_CTRL:
> +               msr_info->data = to_vmx(vcpu)->spec_ctrl;
> +               break;
>         case MSR_IA32_SYSENTER_CS:
>                 msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>                 break;
> @@ -3500,6 +3507,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case MSR_IA32_TSC:
>                 kvm_write_tsc(vcpu, msr_info);
>                 break;
> +       case MSR_IA32_SPEC_CTRL:
> +               to_vmx(vcpu)->spec_ctrl = msr_info->data;
> +               break;
>         case MSR_IA32_CR_PAT:
>                 if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>                         if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> @@ -7062,6 +7072,17 @@ static __init int hardware_setup(void)
>                 goto out;
>         }
>
> +       /*
> +        * FIXME: this is only needed until SPEC_CTRL is supported
> +        * by upstream Linux in cpufeatures, then it can be replaced
> +        * with static_cpu_has.
> +        */
> +       have_spec_ctrl = cpu_has_spec_ctrl();
> +       if (have_spec_ctrl)
> +               pr_info("kvm: SPEC_CTRL available\n");
> +       else
> +               pr_info("kvm: SPEC_CTRL not available\n");
> +
>         if (boot_cpu_has(X86_FEATURE_NX))
>                 kvm_enable_efer_bits(EFER_NX);
>
> @@ -7131,6 +7152,8 @@ static __init int hardware_setup(void)
>         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
>         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
>         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
> +       vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
> +       vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
>
>         memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
>                         vmx_msr_bitmap_legacy, PAGE_SIZE);
> @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>
>         pt_guest_enter(vmx);
>
> +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
> +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +

Do we even need to optimize this? what if host Linux enabled IBRS, but
guest has it turned off?
Thought it might be simpler to blindly update it with what
vmx->spec_ctrl value is?

>         atomic_switch_perf_msrs(vmx);
>
>         vmx_arm_hv_timer(vcpu);
> @@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  #endif
>               );
>
> +       if (have_spec_ctrl) {
> +               rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +               if (vmx->spec_ctrl)
> +                       wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> +       }
> +

Same thing here.. if the host OS has enabled IBRS wouldn't you want to
keep the same value?

>         /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
>         if (vmx->host_debugctlmsr)
>                 update_debugctlmsr(vmx->host_debugctlmsr);
> --
> 1.8.3.1
>
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 22:09   ` Ashok Raj
@ 2018-01-08 22:25     ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 22:25 UTC (permalink / raw)
  To: ashok raj
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas lendacky, dwmw, bp



----- Original Message -----
> From: "Ashok Raj" <lkml.araj@gmail.com>
> To: "Paolo Bonzini" <pbonzini@redhat.com>
> Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, jmattson@google.com, aliguori@amazon.com, "thomas lendacky"
> <thomas.lendacky@amd.com>, dwmw@amazon.co.uk, bp@alien8.de, "Ashok Raj" <ashok.raj@linux.intel.com>
> Sent: Monday, January 8, 2018 11:09:53 PM
> Subject: Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
> 
> Hi Paolo
> 
> Do you assume that host isn't using IBRS and only guest uses it?

For now, yes.

Patches to add the IBRS and IBPB cpufeatures will have to adjust the
MSR writes from this patch.

Paolo

> 
> 
> 
> On Mon, Jan 8, 2018 at 10:08 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > Direct access to MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD is important
> > for performance.  Allow load/store of MSR_IA32_SPEC_CTRL, restore guest
> > IBRS on VM entry and set it to 0 on VM exit (because Linux does not use
> > it yet).
> >
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  arch/x86/kvm/vmx.c | 32 ++++++++++++++++++++++++++++++++
> >  1 file changed, 32 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 669f5f74857d..d00bcad7336e 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -120,6 +120,8 @@
> >  module_param_named(preemption_timer, enable_preemption_timer, bool,
> >  S_IRUGO);
> >  #endif
> >
> > +static bool __read_mostly have_spec_ctrl;
> > +
> >  #define KVM_GUEST_CR0_MASK (X86_CR0_NW | X86_CR0_CD)
> >  #define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST (X86_CR0_WP | X86_CR0_NE)
> >  #define KVM_VM_CR0_ALWAYS_ON                                           \
> > @@ -609,6 +611,8 @@ struct vcpu_vmx {
> >         u64                   msr_host_kernel_gs_base;
> >         u64                   msr_guest_kernel_gs_base;
> >  #endif
> > +       u64                   spec_ctrl;
> > +
> >         u32 vm_entry_controls_shadow;
> >         u32 vm_exit_controls_shadow;
> >         u32 secondary_exec_control;
> > @@ -3361,6 +3365,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct
> > msr_data *msr_info)
> >         case MSR_IA32_TSC:
> >                 msr_info->data = guest_read_tsc(vcpu);
> >                 break;
> > +       case MSR_IA32_SPEC_CTRL:
> > +               msr_info->data = to_vmx(vcpu)->spec_ctrl;
> > +               break;
> >         case MSR_IA32_SYSENTER_CS:
> >                 msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
> >                 break;
> > @@ -3500,6 +3507,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct
> > msr_data *msr_info)
> >         case MSR_IA32_TSC:
> >                 kvm_write_tsc(vcpu, msr_info);
> >                 break;
> > +       case MSR_IA32_SPEC_CTRL:
> > +               to_vmx(vcpu)->spec_ctrl = msr_info->data;
> > +               break;
> >         case MSR_IA32_CR_PAT:
> >                 if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
> >                         if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> > @@ -7062,6 +7072,17 @@ static __init int hardware_setup(void)
> >                 goto out;
> >         }
> >
> > +       /*
> > +        * FIXME: this is only needed until SPEC_CTRL is supported
> > +        * by upstream Linux in cpufeatures, then it can be replaced
> > +        * with static_cpu_has.
> > +        */
> > +       have_spec_ctrl = cpu_has_spec_ctrl();
> > +       if (have_spec_ctrl)
> > +               pr_info("kvm: SPEC_CTRL available\n");
> > +       else
> > +               pr_info("kvm: SPEC_CTRL not available\n");
> > +
> >         if (boot_cpu_has(X86_FEATURE_NX))
> >                 kvm_enable_efer_bits(EFER_NX);
> >
> > @@ -7131,6 +7152,8 @@ static __init int hardware_setup(void)
> >         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
> >         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
> >         vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
> > +       vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
> > +       vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
> >
> >         memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
> >                         vmx_msr_bitmap_legacy, PAGE_SIZE);
> > @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu
> > *vcpu)
> >
> >         pt_guest_enter(vmx);
> >
> > +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
> > +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> > +
> 
> Do we even need to optimize this? what if host Linux enabled IBRS, but
> guest has it turned off?
> Thought it might be simpler to blindly update it with what
> vmx->spec_ctrl value is?
> 
> >         atomic_switch_perf_msrs(vmx);
> >
> >         vmx_arm_hv_timer(vcpu);
> > @@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu
> > *vcpu)
> >  #endif
> >               );
> >
> > +       if (have_spec_ctrl) {
> > +               rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> > +               if (vmx->spec_ctrl)
> > +                       wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> > +       }
> > +
> 
> Same thing here.. if the host OS has enabled IBRS wouldn't you want to
> keep the same value?
> 
> >         /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed
> >         */
> >         if (vmx->host_debugctlmsr)
> >                 update_debugctlmsr(vmx->host_debugctlmsr);
> > --
> > 1.8.3.1
> >
> >
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 19:18   ` Jim Mattson
  2018-01-08 20:23     ` Liran Alon
@ 2018-01-08 22:32     ` Paolo Bonzini
  2018-01-08 23:19       ` Jim Mattson
  1 sibling, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 22:32 UTC (permalink / raw)
  To: Jim Mattson; +Cc: LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp


> I have:
> 
> if (!have_spec_ctrl ||
>     (!msr_info->host_initiated &&
>      !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
> return 1;
> msr_info->data = to_vmx(vcpu)->msr_ia32_spec_ctrl;
> break;

> I have:
> 
> if (!have_spec_ctrl ||
>     (!msr_info->host_initiated &&
>      !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
> return 1;
> to_vmx(vcpu)->msr_ia32_spec_ctrl = data;
> break;

Both better than mine.

> > +       vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
> > +       vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
> 
> I have a lot of changes to MSR permission bitmap handling, but these
> intercepts should only be disabled when guest_cpuid_has(vcpu,
> X86_FEATURE_SPEC_CTRL).

That's harder to backport and not strictly necessary (just like
e.g. we don't check guest CPUID bits before emulation).  I agree that
your version is better, but I think the above is fine as a minimal
fix.

> Here, I have:
> 
> /*
> * If the guest is allowed to write to MSR_IA32_SPEC_CTRL,
> * store it on VM-exit.
> */
> if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> add_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);
> else
> clear_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);
> 
> /*
> * If the guest's IA32_SPEC_CTRL MSR doesn't match the host's
> * IA32_SPEC_CTRL MSR, then add the MSR to the atomic switch
> * MSRs, so that the guest value will be loaded on VM-entry
> * and the host value will be loaded on VM-exit.
> */
> if (vmx->msr_ia32_spec_ctrl != spec_ctrl_enabled())
> add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL,
>       vmx->msr_ia32_spec_ctrl,
>       spec_ctrl_enabled());
> else
> clear_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL);
>
> >         atomic_switch_perf_msrs(vmx);
> >
> >         vmx_arm_hv_timer(vcpu);
> > @@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu
> > *vcpu)
> >  #endif
> >               );
> >
> > +       if (have_spec_ctrl) {
> > +               rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> > +               if (vmx->spec_ctrl)
> > +                       wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> > +       }
> > +
> 
> I know the VM-exit MSR load and store lists are probably slower, but
> I'm a little uncomfortable restoring the host's IA32_SPEC_CTRL MSR
> late if the guest has it clear and the host has it set.

There is no indirect branch before though, isn't it?

Paolo

> >         /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed
> >         */
> >         if (vmx->host_debugctlmsr)
> >                 update_debugctlmsr(vmx->host_debugctlmsr);
> > --
> > 1.8.3.1
> >
> >
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 19:41   ` David Woodhouse
@ 2018-01-08 22:33     ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-08 22:33 UTC (permalink / raw)
  To: David Woodhouse
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas lendacky, bp



----- Original Message -----
> From: "David Woodhouse" <dwmw2@infradead.org>
> To: "Paolo Bonzini" <pbonzini@redhat.com>, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
> Cc: jmattson@google.com, aliguori@amazon.com, "thomas lendacky" <thomas.lendacky@amd.com>, bp@alien8.de
> Sent: Monday, January 8, 2018 8:41:07 PM
> Subject: Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
> 
> On Mon, 2018-01-08 at 19:08 +0100, Paolo Bonzini wrote:
> > 
> > +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
> > +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> > +
> 
> I think this one probably *is* safe even without an 'else lfence',
> which means that the CPU can speculate around it, but it wants a
> comment explaining that someone has properly analysed it and saying
> precisely why.

This one is okay as long as there are no indirect jumps until
vmresume.  But the one on vmexit is only okay because right now
it's *disabling* IBRS.  Once IBRS is used by Linux, we'll need an
lfence there.  I'll add a comment.

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 22:32     ` Paolo Bonzini
@ 2018-01-08 23:19       ` Jim Mattson
  2018-01-09 10:11         ` Paolo Bonzini
  0 siblings, 1 reply; 44+ messages in thread
From: Jim Mattson @ 2018-01-08 23:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp

On Mon, Jan 8, 2018 at 2:32 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>> I have:
>>
>> if (!have_spec_ctrl ||
>>     (!msr_info->host_initiated &&
>>      !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
>> return 1;
>> msr_info->data = to_vmx(vcpu)->msr_ia32_spec_ctrl;
>> break;
>
>> I have:
>>
>> if (!have_spec_ctrl ||
>>     (!msr_info->host_initiated &&
>>      !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL)))
>> return 1;
>> to_vmx(vcpu)->msr_ia32_spec_ctrl = data;
>> break;
>
> Both better than mine.
>
>> > +       vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
>> > +       vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
>>
>> I have a lot of changes to MSR permission bitmap handling, but these
>> intercepts should only be disabled when guest_cpuid_has(vcpu,
>> X86_FEATURE_SPEC_CTRL).
>
> That's harder to backport and not strictly necessary (just like
> e.g. we don't check guest CPUID bits before emulation).  I agree that
> your version is better, but I think the above is fine as a minimal
> fix.

Due to the impacts that spec_ctrl has on the neighboring hyperthread,
one may want to disable MSRs 0x48 and 0x49 for a particular VM. We do
this by masking off CPUID.(EAX=7, ECX=0).EDX[26] and CPUID.(EAX=7,
ECX=0).EDX[27] from the userspace agent. However, with your patch,
*any* VCPU gets unrestricted access to these MSRs, without any
mechanism for disabling them.

>> Here, I have:
>>
>> /*
>> * If the guest is allowed to write to MSR_IA32_SPEC_CTRL,
>> * store it on VM-exit.
>> */
>> if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
>> add_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);
>> else
>> clear_autostore_msr(vmx, MSR_IA32_SPEC_CTRL);
>>
>> /*
>> * If the guest's IA32_SPEC_CTRL MSR doesn't match the host's
>> * IA32_SPEC_CTRL MSR, then add the MSR to the atomic switch
>> * MSRs, so that the guest value will be loaded on VM-entry
>> * and the host value will be loaded on VM-exit.
>> */
>> if (vmx->msr_ia32_spec_ctrl != spec_ctrl_enabled())
>> add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL,
>>       vmx->msr_ia32_spec_ctrl,
>>       spec_ctrl_enabled());
>> else
>> clear_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL);
>>
>> >         atomic_switch_perf_msrs(vmx);
>> >
>> >         vmx_arm_hv_timer(vcpu);
>> > @@ -9707,6 +9733,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu
>> > *vcpu)
>> >  #endif
>> >               );
>> >
>> > +       if (have_spec_ctrl) {
>> > +               rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
>> > +               if (vmx->spec_ctrl)
>> > +                       wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>> > +       }
>> > +
>>
>> I know the VM-exit MSR load and store lists are probably slower, but
>> I'm a little uncomfortable restoring the host's IA32_SPEC_CTRL MSR
>> late if the guest has it clear and the host has it set.
>
> There is no indirect branch before though, isn't it?

I guess that depends on how you define "before."

> Paolo
>
>> >         /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed
>> >         */
>> >         if (vmx->host_debugctlmsr)
>> >                 update_debugctlmsr(vmx->host_debugctlmsr);
>> > --
>> > 1.8.3.1
>> >
>> >
>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU
  2018-01-08 19:36   ` Jim Mattson
@ 2018-01-09  8:33     ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09  8:33 UTC (permalink / raw)
  To: Jim Mattson; +Cc: LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp, Tim Chen

On 08/01/2018 20:36, Jim Mattson wrote:
> Shouldn't there be an IBPB on *any* context switch away from a VCPU
> thread, even if it is to a non-VCPU thread?

Yes, but that's the task of patches to the generic Linux context
switching code.  As mentioned in the cover letter, this isn't yet a full
solution---just the KVM bits, which is what Radim and I have full
control on.  Hence the hacks with cpuid in local bools.

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 23:19       ` Jim Mattson
@ 2018-01-09 10:11         ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09 10:11 UTC (permalink / raw)
  To: Jim Mattson; +Cc: LKML, kvm list, aliguori, Tom Lendacky, dwmw, bp

On 09/01/2018 00:19, Jim Mattson wrote:
>>>> +       vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
>>>> +       vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
>>> I have a lot of changes to MSR permission bitmap handling, but these
>>> intercepts should only be disabled when guest_cpuid_has(vcpu,
>>> X86_FEATURE_SPEC_CTRL).
>> That's harder to backport and not strictly necessary (just like
>> e.g. we don't check guest CPUID bits before emulation).  I agree that
>> your version is better, but I think the above is fine as a minimal
>> fix.
> 
> Due to the impacts that spec_ctrl has on the neighboring hyperthread,
> one may want to disable MSRs 0x48 and 0x49 for a particular VM. We do
> this by masking off CPUID.(EAX=7, ECX=0).EDX[26] and CPUID.(EAX=7,
> ECX=0).EDX[27] from the userspace agent. However, with your patch,
> *any* VCPU gets unrestricted access to these MSRs, without any
> mechanism for disabling them.

Yes, I agree that having the check is superior.  However, I also want to
get there step by step.

>>>> +       if (have_spec_ctrl) {
>>>> +               rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
>>>> +               if (vmx->spec_ctrl)
>>>> +                       wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>>>> +       }
>>>> +
>>>
>>> I know the VM-exit MSR load and store lists are probably slower, but
>>> I'm a little uncomfortable restoring the host's IA32_SPEC_CTRL MSR
>>> late if the guest has it clear and the host has it set.
>>
>> There is no indirect branch before though, isn't it?
>
> I guess that depends on how you define "before."

--verbose? :-/

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest
  2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
                   ` (6 preceding siblings ...)
  2018-01-08 18:08 ` [PATCH 7/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists Paolo Bonzini
@ 2018-01-09 10:15 ` Thomas Gleixner
  2018-01-09 11:12   ` Paolo Bonzini
  7 siblings, 1 reply; 44+ messages in thread
From: Thomas Gleixner @ 2018-01-09 10:15 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas.lendacky, dwmw, bp

On Mon, 8 Jan 2018, Paolo Bonzini wrote:

> This series allows guests to use the MSR_IA32_SPEC_CTRL and
> MSR_IA32_PRED_CMD model specific registers that were added as mitigations
> for CVE-2017-5715.
> 
> These are only the KVM specific parts of the fix.  It does *not* yet
> include any protection for reading host memory from the guest, because
> that would be done in the same way as the rest of Linux.  So there is no
> IBRS *usage* here, no retpolines, no stuffing of the return stack buffer.
> (KVM already includes a fix to clear all registers on vmexit, which is
> enough to block Google Project Zero's PoC exploit).
> 
> However, I am including the changes to use IBPB (indirect branch
> predictor barrier) if available.  That occurs only when there is a VCPU
> switch on a physical CPU, thus it has a small impact on performance.
> 
> The patches are a bit hackish because the relevant cpufeatures have
> not been included yet, and because I wanted to make the patches easier
> to backport to distro kernels if desired, but I would still like to
> have them in 4.16.
> 
> Please review.

CC'ing x86@kernel.org on this would have been asked too much, right?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors
  2018-01-08 19:09   ` Liran Alon
@ 2018-01-09 10:32     ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09 10:32 UTC (permalink / raw)
  To: Liran Alon, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

On 08/01/2018 20:09, Liran Alon wrote:
>>
>> +static inline bool cpu_has_spec_ctrl(void)
>> +{
>> +    u32 eax, ebx, ecx, edx;
>> +    cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
>> +
>> +    return edx & bit(KVM_CPUID_BIT_SPEC_CTRL);
> 
> Why not just "return cpuid_edx(7) & bit(KVM_CPUID_BIT_SPEC_CTRL);"?
> This is also consistent with how cpu_has_ibpb_support() is written.

Leaf 7 explicitly requires you to clear ECX (there could be a leaf for
EAX=7,ECX=1 in the future).  Even though cpuid_edx does do that, it's
not clear from the function that cpuid_edx(7) would work.

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU
  2018-01-08 18:08 ` [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU Paolo Bonzini
  2018-01-08 19:23   ` Liran Alon
  2018-01-08 19:36   ` Jim Mattson
@ 2018-01-09 11:01   ` David Hildenbrand
  2 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2018-01-09 11:01 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp, Tim Chen

On 08.01.2018 19:08, Paolo Bonzini wrote:
> From: Tim Chen <tim.c.chen@linux.intel.com>
> 
> Ensure an IBPB (Indirect branch prediction barrier) before every VCPU
> switch.
> 
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d00bcad7336e..bf127c570675 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2375,6 +2375,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  	if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
>  		per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
>  		vmcs_load(vmx->loaded_vmcs->vmcs);
> +		if (have_spec_ctrl)
> +			wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>  	}
>  
>  	if (!already_loaded) {
> @@ -4029,6 +4031,13 @@ static void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
>  	free_vmcs(loaded_vmcs->vmcs);
>  	loaded_vmcs->vmcs = NULL;
>  	WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
> +
> +	/*
> +	 * The VMCS could be recycled, causing a false negative in
> +	 * vmx_vcpu_load; block speculative execution.
> +	 */
> +	if (have_spec_ctrl)
> +		wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>  }
>  
>  static void vmx_nested_free_vmcs02(struct vcpu_vmx *vmx)
> 

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU
  2018-01-08 20:00   ` Liran Alon
@ 2018-01-09 11:07     ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09 11:07 UTC (permalink / raw)
  To: Liran Alon, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

On 08/01/2018 21:00, Liran Alon wrote:
> 
> 
> On 08/01/18 20:08, Paolo Bonzini wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
>> going to run a VCPU different from what was previously run.  Nested
>> virtualization uses the same VMCB for the second level guest, but the
>> L1 hypervisor should be using IBRS to protect itself.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   arch/x86/kvm/svm.c | 31 +++++++++++++++++++++++++++++++
>>   1 file changed, 31 insertions(+)
>>
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 779981a00d01..dd744d896cec 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -289,6 +289,7 @@ struct amd_svm_iommu_ir {
>>   module_param(vgif, int, 0444);
>>
>>   static bool __read_mostly have_spec_ctrl;
>> +static bool __read_mostly have_ibpb_support;
>>
>>   static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
>>   static void svm_flush_tlb(struct kvm_vcpu *vcpu, bool invalidate_gpa);
>> @@ -540,6 +541,7 @@ struct svm_cpu_data {
>>       struct kvm_ldttss_desc *tss_desc;
>>
>>       struct page *save_area;
>> +    struct vmcb *current_vmcb;
>>   };
>>
>>   static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
>> @@ -1151,6 +1153,11 @@ static __init int svm_hardware_setup(void)
>>           pr_info("kvm: SPEC_CTRL available\n");
>>       else
>>           pr_info("kvm: SPEC_CTRL not available\n");
>> +    have_ibpb_support = have_spec_ctrl || cpu_has_ibpb_support();
>> +    if (have_ibpb_support)
>> +        pr_info("kvm: IBPB_SUPPORT available\n");
>> +    else
>> +        pr_info("kvm: IBPB_SUPPORT not available\n");
>>
>>       return 0;
>>
>> @@ -1725,11 +1732,19 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
>>       __free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
>>       kvm_vcpu_uninit(vcpu);
>>       kmem_cache_free(kvm_vcpu_cache, svm);
>> +
>> +    /*
>> +     * The VMCB could be recycled, causing a false negative in
>> +     * svm_vcpu_load; block speculative execution.
>> +     */
>> +    if (have_ibpb_support)
>> +        wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>>   }
>>
>>   static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>   {
>>       struct vcpu_svm *svm = to_svm(vcpu);
>> +    struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
>>       int i;
>>
>>       if (unlikely(cpu != vcpu->cpu)) {
>> @@ -1758,6 +1773,12 @@ static void svm_vcpu_load(struct kvm_vcpu
>> *vcpu, int cpu)
>>       if (static_cpu_has(X86_FEATURE_RDTSCP))
>>           wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>>
>> +    if (sd->current_vmcb != svm->vmcb) {
>> +        sd->current_vmcb = svm->vmcb;
>> +        if (have_ibpb_support)
>> +            wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
>> +    }
>> +
>>       avic_vcpu_load(vcpu, cpu);
>>   }
>>
>> @@ -2798,6 +2819,11 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
>>       if (!nested_vmcb)
>>           return 1;
>>
>> +    /*
>> +     * No need for IBPB here, the L1 hypervisor should be running with
>> +     * IBRS=1 and inserts one already when switching L2 VMs.
>> +     */
>> +
> 
> I am not fully convinced yet this is OK from security perspective.
> From the CPU point-of-view, both L1 & L2 run in the same prediction-mode
> (as they are both running in SVM guest-mode). Therefore, IBRS will not
> hide the BHB & BTB of L1 from L2.

Indeed, IBRS will not hide the indirect branch predictor state of L2
user mode from L1 user mode.

On current generation processors, as I understand it, IBRS simply
disables the indirect branch predictor when set to 1.  Therefore, as
long as the L1 hypervisor sets IBRS=1, the indirect branch predictor
state left by L2 does not affect execution of the L1 hypervisor.

Future processors might have a mode that says "just set IBRS=1 and I'll
DTRT".  If that's done by indexing the BTB using the full virtual
address, the CPL _and_ the ASID/PCID/VPID, then IBPB is not needed here
because the L2 VM uses a different ASID.  If that's done by only
augmenting the BTB index with the CPL, we'd need an IBPB.  But in this
new world we've been thrown into, that would be a processor bug...

Note that AMD currently doesn't implement IBRS at all.  In order to
mitigate against CVE-2017-5715, the hypervisor should issue an IBPB on
every vmexit (in both L0 and L1).  However, the cost of that is very
high.  While we may want a paranoia mode that does it, it is outside the
scope of this series (which is more of a "let's unblock QEMU patches"
thing than a complete fix).

> 6. One can implicitly assume that L1 hypervisor did issued a IBPB when
> loading an VMCB and therefore it doesn't have to do it again in L0.
> However, I see 2 problems with this approach:
> (a) We don't protect old non-patched L1 hypervisors from Spectre even
> though we could easily do that from L0 with small performance hit.

Yeah, that's a nice thing to have.  However, I disagree on the "small"
performance hit... on a patched hypervisor, the cost of a double fix can
be very noticeable.

> (b) When L1 hypervisor runs only a single L2 VM, it doesn't have to
> issue an IBPB when loading the VMCB (as it didn't run any other VMCB
> before) and it should be sufficient from L1 perspective to just write 1
> to IBRS. However, in our nested-case, this is a security-hole.

This is the main difference in our reasoning.  I think IBRS=1 (or IBPB
on vmexit) should be enough.

Paolo

> Am I misunderstanding something?
> 
> Regards,
> -Liran
> 
>>       /* Exit Guest-Mode */
>>       leave_guest_mode(&svm->vcpu);
>>       svm->nested.vmcb = 0;
>> @@ -3061,6 +3087,11 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
>>       if (!nested_vmcb)
>>           return false;
>>
>> +    /*
>> +     * No need for IBPB here, since the nested VM is less
>> privileged.  The
>> +     * L1 hypervisor inserts one already when switching L2 VMs.
>> +     */
>> +
>>       if (!nested_vmcb_checks(nested_vmcb)) {
>>           nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
>>           nested_vmcb->control.exit_code_hi = 0;
>>
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest
  2018-01-09 10:15 ` [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Thomas Gleixner
@ 2018-01-09 11:12   ` Paolo Bonzini
  2018-01-09 12:03     ` Thomas Gleixner
  0 siblings, 1 reply; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09 11:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, kvm, jmattson, aliguori, thomas.lendacky, dwmw, bp

On 09/01/2018 11:15, Thomas Gleixner wrote:
> On Mon, 8 Jan 2018, Paolo Bonzini wrote:
> 
>> This series allows guests to use the MSR_IA32_SPEC_CTRL and
>> MSR_IA32_PRED_CMD model specific registers that were added as mitigations
>> for CVE-2017-5715.
>>
>> These are only the KVM specific parts of the fix.  It does *not* yet
>> include any protection for reading host memory from the guest, because
>> that would be done in the same way as the rest of Linux.  So there is no
>> IBRS *usage* here, no retpolines, no stuffing of the return stack buffer.
>> (KVM already includes a fix to clear all registers on vmexit, which is
>> enough to block Google Project Zero's PoC exploit).
>>
>> However, I am including the changes to use IBPB (indirect branch
>> predictor barrier) if available.  That occurs only when there is a VCPU
>> switch on a physical CPU, thus it has a small impact on performance.
>>
>> The patches are a bit hackish because the relevant cpufeatures have
>> not been included yet, and because I wanted to make the patches easier
>> to backport to distro kernels if desired, but I would still like to
>> have them in 4.16.
>>
>> Please review.
> 
> CC'ing x86@kernel.org on this would have been asked too much, right?

Sorry, my mistake.  I'll CC you on v2.

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors
  2018-01-08 18:08 ` [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors Paolo Bonzini
  2018-01-08 18:33   ` Konrad Rzeszutek Wilk
  2018-01-08 19:09   ` Liran Alon
@ 2018-01-09 11:14   ` David Hildenbrand
  2018-01-09 11:18     ` Paolo Bonzini
  2 siblings, 1 reply; 44+ messages in thread
From: David Hildenbrand @ 2018-01-09 11:14 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

On 08.01.2018 19:08, Paolo Bonzini wrote:
> As an interim measure until SPEC_CTRL is supported by upstream
> Linux in cpufeatures, add a function that lets vmx.c and svm.c
> know whether to save/restore MSR_IA32_SPEC_CTRL.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/cpuid.c |  3 ---
>  arch/x86/kvm/cpuid.h | 22 ++++++++++++++++++++++
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 8e9a07c557f1..767af697c20c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -67,9 +67,6 @@ u64 kvm_supported_xcr0(void)
>  
>  #define F(x) bit(X86_FEATURE_##x)
>  
> -/* These are scattered features in cpufeatures.h. */
> -#define KVM_CPUID_BIT_AVX512_4VNNIW     2
> -#define KVM_CPUID_BIT_AVX512_4FMAPS     3
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>  
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index c2cea6651279..8d04ccf177ce 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -155,6 +155,28 @@ static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
>  	return x86_stepping(best->eax);
>  }
>  
> +/* These are scattered features in cpufeatures.h. */
> +#define KVM_CPUID_BIT_AVX512_4VNNIW	2
> +#define KVM_CPUID_BIT_AVX512_4FMAPS	3
> +#define KVM_CPUID_BIT_SPEC_CTRL		26
> +#define KVM_CPUID_BIT_STIBP		27

I can see that STIBP is never checked in KVM code but only forwarded to
the guest if available.

I am wondering if we would have to check against that, too, before
issuing a FEATURE_SET_IBPB.

(can somebody point me at the intel documentation?)

> +
> +/* CPUID[eax=0x80000008].ebx */
> +#define KVM_CPUID_BIT_IBPB_SUPPORT	12
> +
> +static inline bool cpu_has_spec_ctrl(void)
> +{
> +	u32 eax, ebx, ecx, edx;
> +	cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
> +
> +	return edx & bit(KVM_CPUID_BIT_SPEC_CTRL);
> +}
> +
> +static inline bool cpu_has_ibpb_support(void)
> +{
> +	return cpuid_ebx(0x80000008) & bit(KVM_CPUID_BIT_IBPB_SUPPORT);
> +}
> +
>  static inline bool supports_cpuid_fault(struct kvm_vcpu *vcpu)
>  {
>  	return vcpu->arch.msr_platform_info & MSR_PLATFORM_INFO_CPUID_FAULT;
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors
  2018-01-09 11:14   ` David Hildenbrand
@ 2018-01-09 11:18     ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09 11:18 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp

On 09/01/2018 12:14, David Hildenbrand wrote:
>>  
>> +/* These are scattered features in cpufeatures.h. */
>> +#define KVM_CPUID_BIT_AVX512_4VNNIW	2
>> +#define KVM_CPUID_BIT_AVX512_4FMAPS	3
>> +#define KVM_CPUID_BIT_SPEC_CTRL		26
>> +#define KVM_CPUID_BIT_STIBP		27
> I can see that STIBP is never checked in KVM code but only forwarded to
> the guest if available.
> 
> I am wondering if we would have to check against that, too, before
> issuing a FEATURE_SET_IBPB.

STIBP is "single-threaded indirect branch prediction".  SPEC_CTRL always
implies that FEATURE_SET_IBPB is available.

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest
  2018-01-09 11:12   ` Paolo Bonzini
@ 2018-01-09 12:03     ` Thomas Gleixner
  2018-01-09 14:06       ` Paolo Bonzini
  0 siblings, 1 reply; 44+ messages in thread
From: Thomas Gleixner @ 2018-01-09 12:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: LKML, kvm, jmattson, aliguori, thomas.lendacky, dwmw,
	Borislav Petkov, Peter Zijlstra

On Tue, 9 Jan 2018, Paolo Bonzini wrote:
> On 09/01/2018 11:15, Thomas Gleixner wrote:
> > On Mon, 8 Jan 2018, Paolo Bonzini wrote:
> > 
> >> This series allows guests to use the MSR_IA32_SPEC_CTRL and
> >> MSR_IA32_PRED_CMD model specific registers that were added as mitigations
> >> for CVE-2017-5715.
> >>
> >> These are only the KVM specific parts of the fix.  It does *not* yet
> >> include any protection for reading host memory from the guest, because
> >> that would be done in the same way as the rest of Linux.  So there is no
> >> IBRS *usage* here, no retpolines, no stuffing of the return stack buffer.
> >> (KVM already includes a fix to clear all registers on vmexit, which is
> >> enough to block Google Project Zero's PoC exploit).
> >>
> >> However, I am including the changes to use IBPB (indirect branch
> >> predictor barrier) if available.  That occurs only when there is a VCPU
> >> switch on a physical CPU, thus it has a small impact on performance.
> >>
> >> The patches are a bit hackish because the relevant cpufeatures have
> >> not been included yet, and because I wanted to make the patches easier
> >> to backport to distro kernels if desired, but I would still like to
> >> have them in 4.16.

We really want to coordinate that proper with the ongoing integration of
the IB** for bare metal.

And that stuff really does not need to be hackish at all. We've spent a lot
of effort keeping all of it clean _AND_ available for 4.14 stable
consumption. Everything before 4.9 is a big fricking and incompatible mess
anyway.

> >> Please review.
> > 
> > CC'ing x86@kernel.org on this would have been asked too much, right?
> 
> Sorry, my mistake.  I'll CC you on v2.

All good ...

Please add the crowd which has been involved in the bare metal IBRS stuff
as well.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest
  2018-01-09 12:03     ` Thomas Gleixner
@ 2018-01-09 14:06       ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09 14:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, kvm, jmattson, aliguori, thomas.lendacky, dwmw,
	Borislav Petkov, Peter Zijlstra

On 09/01/2018 13:03, Thomas Gleixner wrote:
> On Tue, 9 Jan 2018, Paolo Bonzini wrote:
>> On 09/01/2018 11:15, Thomas Gleixner wrote:
>>> On Mon, 8 Jan 2018, Paolo Bonzini wrote:
>>>
>>>> This series allows guests to use the MSR_IA32_SPEC_CTRL and
>>>> MSR_IA32_PRED_CMD model specific registers that were added as mitigations
>>>> for CVE-2017-5715.
>>>>
>>>> These are only the KVM specific parts of the fix.  It does *not* yet
>>>> include any protection for reading host memory from the guest, because
>>>> that would be done in the same way as the rest of Linux.  So there is no
>>>> IBRS *usage* here, no retpolines, no stuffing of the return stack buffer.
>>>> (KVM already includes a fix to clear all registers on vmexit, which is
>>>> enough to block Google Project Zero's PoC exploit).
>>>>
>>>> However, I am including the changes to use IBPB (indirect branch
>>>> predictor barrier) if available.  That occurs only when there is a VCPU
>>>> switch on a physical CPU, thus it has a small impact on performance.
>>>>
>>>> The patches are a bit hackish because the relevant cpufeatures have
>>>> not been included yet, and because I wanted to make the patches easier
>>>> to backport to distro kernels if desired, but I would still like to
>>>> have them in 4.16.
> 
> We really want to coordinate that proper with the ongoing integration of
> the IB** for bare metal.

Yes, this can get merged to master together with the bare-metal parts.
If, as I expect, the -rc rules will be be bent a bit for IB** (and
perhaps retpolines too) in 4.16, we have some time to sort it out.

Thanks,

Paolo

> And that stuff really does not need to be hackish at all. We've spent a lot
> of effort keeping all of it clean _AND_ available for 4.14 stable
> consumption. Everything before 4.9 is a big fricking and incompatible mess
> anyway.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
                     ` (4 preceding siblings ...)
  2018-01-08 22:09   ` Ashok Raj
@ 2018-01-11  2:47   ` Tim Chen
  2018-01-11 10:41     ` Paolo Bonzini
  5 siblings, 1 reply; 44+ messages in thread
From: Tim Chen @ 2018-01-11  2:47 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp, ashok.raj

On 01/08/2018 10:08 AM, Paolo Bonzini wrote:

> @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  
>  	pt_guest_enter(vmx);
>  
> +	if (have_spec_ctrl && vmx->spec_ctrl != 0)
> +		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> +

Say host uses IBRS (have_spec_ctrl==1), and have IBRS set to 1. 
And say guest's vmx->spec_ctrl is 0 and not using IBRS.

We will be leaving IBRS msr as 1 and won't be
switching it to 0 before entering guest.
Guest will be running with incorrect IBRS value.

Seems like the correct logic is 

if (host_supports_ibrs)
	/* switch IBRS if host and guest ibrs differs */
	if ((host_uses_ibrs && vmx->spec_ctrl == 0) ||    /* host IBRS 1, guest IBRS 0 */
	    (!host_uses_ibrs && vmx->spec_ctrl == 1))     /* host IBRS 0, guest IBRS 1 */
		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
}

Your have_spec_ctrl logic specifies that IBRS is available.
But that doesn't necessarily mean that we will use IBRS in
host.

Tim

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-11  2:47   ` Tim Chen
@ 2018-01-11 10:41     ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-11 10:41 UTC (permalink / raw)
  To: Tim Chen, linux-kernel, kvm
  Cc: jmattson, aliguori, thomas.lendacky, dwmw, bp, ashok.raj

On 11/01/2018 03:47, Tim Chen wrote:
> On 01/08/2018 10:08 AM, Paolo Bonzini wrote:
> 
>> @@ -9597,6 +9620,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>  
>>  	pt_guest_enter(vmx);
>>  
>> +	if (have_spec_ctrl && vmx->spec_ctrl != 0)
>> +		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
>> +
> 
> Say host uses IBRS (have_spec_ctrl==1), and have IBRS set to 1. 
> And say guest's vmx->spec_ctrl is 0 and not using IBRS.
> 
> We will be leaving IBRS msr as 1 and won't be
> switching it to 0 before entering guest.
> Guest will be running with incorrect IBRS value.
> 
> Seems like the correct logic is 
> 
> if (host_supports_ibrs)
> 	/* switch IBRS if host and guest ibrs differs */
> 	if ((host_uses_ibrs && vmx->spec_ctrl == 0) ||    /* host IBRS 1, guest IBRS 0 */
> 	    (!host_uses_ibrs && vmx->spec_ctrl == 1))     /* host IBRS 0, guest IBRS 1 */
> 		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> }
> 
> Your have_spec_ctrl logic specifies that IBRS is available.
> But that doesn't necessarily mean that we will use IBRS in
> host.

Of course.  But these patches are just an initial version for a host
that doesn't support IBRS.

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
  2018-01-08 23:58 [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Liran Alon
@ 2018-01-09  8:35 ` Paolo Bonzini
  0 siblings, 0 replies; 44+ messages in thread
From: Paolo Bonzini @ 2018-01-09  8:35 UTC (permalink / raw)
  To: Liran Alon
  Cc: jmattson, bp, thomas.lendacky, aliguori, linux-kernel, dwmw2, kvm

On 09/01/2018 00:58, Liran Alon wrote:
> 
> ----- pbonzini@redhat.com wrote:
> 
>> ----- Original Message -----
>>> From: "David Woodhouse" <dwmw2@infradead.org>
>>> To: "Paolo Bonzini" <pbonzini@redhat.com>,
>> linux-kernel@vger.kernel.org, kvm@vger.kernel.org
>>> Cc: jmattson@google.com, aliguori@amazon.com, "thomas lendacky"
>> <thomas.lendacky@amd.com>, bp@alien8.de
>>> Sent: Monday, January 8, 2018 8:41:07 PM
>>> Subject: Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and
>> MSR_IA32_PRED_CMD down to the guest
>>>
>>> On Mon, 2018-01-08 at 19:08 +0100, Paolo Bonzini wrote:
>>>>
>>>> +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
>>>> +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
>>>> +
>>>
>>> I think this one probably *is* safe even without an 'else lfence',
>>> which means that the CPU can speculate around it, but it wants a
>>> comment explaining that someone has properly analysed it and saying
>>> precisely why.
>>
>> This one is okay as long as there are no indirect jumps until
>> vmresume.  But the one on vmexit is only okay because right now
>> it's *disabling* IBRS.  Once IBRS is used by Linux, we'll need an
>> lfence there.  I'll add a comment.
>>
>> Paolo
> 
> That is true but from what I understand, there is an indirect branch from this point until vmresume.
> That indirect branch resides in atomic_switch_perf_msrs() immediately called after this WRMSR:
> atomic_switch_perf_msrs() -> perf_guest_get_msrs() -> x86_pmu.guest_get_msrs().

Sure, it has to move later as pointed out by other reviewers.

Paolo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest
@ 2018-01-08 23:58 Liran Alon
  2018-01-09  8:35 ` Paolo Bonzini
  0 siblings, 1 reply; 44+ messages in thread
From: Liran Alon @ 2018-01-08 23:58 UTC (permalink / raw)
  To: pbonzini
  Cc: jmattson, bp, thomas.lendacky, aliguori, linux-kernel, dwmw2, kvm


----- pbonzini@redhat.com wrote:

> ----- Original Message -----
> > From: "David Woodhouse" <dwmw2@infradead.org>
> > To: "Paolo Bonzini" <pbonzini@redhat.com>,
> linux-kernel@vger.kernel.org, kvm@vger.kernel.org
> > Cc: jmattson@google.com, aliguori@amazon.com, "thomas lendacky"
> <thomas.lendacky@amd.com>, bp@alien8.de
> > Sent: Monday, January 8, 2018 8:41:07 PM
> > Subject: Re: [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and
> MSR_IA32_PRED_CMD down to the guest
> > 
> > On Mon, 2018-01-08 at 19:08 +0100, Paolo Bonzini wrote:
> > > 
> > > +       if (have_spec_ctrl && vmx->spec_ctrl != 0)
> > > +               wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
> > > +
> > 
> > I think this one probably *is* safe even without an 'else lfence',
> > which means that the CPU can speculate around it, but it wants a
> > comment explaining that someone has properly analysed it and saying
> > precisely why.
> 
> This one is okay as long as there are no indirect jumps until
> vmresume.  But the one on vmexit is only okay because right now
> it's *disabling* IBRS.  Once IBRS is used by Linux, we'll need an
> lfence there.  I'll add a comment.
> 
> Paolo

That is true but from what I understand, there is an indirect branch from this point until vmresume.
That indirect branch resides in atomic_switch_perf_msrs() immediately called after this WRMSR:
atomic_switch_perf_msrs() -> perf_guest_get_msrs() -> x86_pmu.guest_get_msrs().

-Liran

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2018-01-11 10:41 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-08 18:08 [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Paolo Bonzini
2018-01-08 18:08 ` [PATCH 1/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT accessors Paolo Bonzini
2018-01-08 18:33   ` Konrad Rzeszutek Wilk
2018-01-08 19:09   ` Liran Alon
2018-01-09 10:32     ` Paolo Bonzini
2018-01-09 11:14   ` David Hildenbrand
2018-01-09 11:18     ` Paolo Bonzini
2018-01-08 18:08 ` [PATCH 2/7] x86/msr: add definitions for indirect branch predictor MSRs Paolo Bonzini
2018-01-08 18:35   ` Konrad Rzeszutek Wilk
2018-01-08 18:52     ` Jim Mattson
2018-01-08 19:10   ` Liran Alon
2018-01-08 18:08 ` [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Paolo Bonzini
2018-01-08 18:43   ` Konrad Rzeszutek Wilk
2018-01-08 19:18   ` Jim Mattson
2018-01-08 20:23     ` Liran Alon
2018-01-08 22:32     ` Paolo Bonzini
2018-01-08 23:19       ` Jim Mattson
2018-01-09 10:11         ` Paolo Bonzini
2018-01-08 19:22   ` Liran Alon
2018-01-08 19:41   ` David Woodhouse
2018-01-08 22:33     ` Paolo Bonzini
2018-01-08 22:09   ` Ashok Raj
2018-01-08 22:25     ` Paolo Bonzini
2018-01-11  2:47   ` Tim Chen
2018-01-11 10:41     ` Paolo Bonzini
2018-01-08 18:08 ` [PATCH 4/7] kvm: vmx: Set IBPB when running a different VCPU Paolo Bonzini
2018-01-08 19:23   ` Liran Alon
2018-01-08 19:36   ` Jim Mattson
2018-01-09  8:33     ` Paolo Bonzini
2018-01-09 11:01   ` David Hildenbrand
2018-01-08 18:08 ` [PATCH 5/7] kvm: svm: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to guest Paolo Bonzini
2018-01-08 19:41   ` Liran Alon
2018-01-08 18:08 ` [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU Paolo Bonzini
2018-01-08 20:00   ` Liran Alon
2018-01-09 11:07     ` Paolo Bonzini
2018-01-08 18:08 ` [PATCH 7/7] KVM: x86: add SPEC_CTRL and IBPB_SUPPORT to MSR and CPUID lists Paolo Bonzini
2018-01-08 20:07   ` Liran Alon
2018-01-08 20:15     ` Jim Mattson
2018-01-09 10:15 ` [PATCH 0/7] KVM: x86: expose CVE-2017-5715 ("Spectre variant 2") mitigations to guest Thomas Gleixner
2018-01-09 11:12   ` Paolo Bonzini
2018-01-09 12:03     ` Thomas Gleixner
2018-01-09 14:06       ` Paolo Bonzini
2018-01-08 23:58 [PATCH 3/7] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest Liran Alon
2018-01-09  8:35 ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.