* [PATCH v2 0/4] KVM: Expose speculation control feature to guests
@ 2018-01-29  0:58 ` KarimAllah Ahmed
  0 siblings, 0 replies; 18+ messages in thread
From: KarimAllah Ahmed @ 2018-01-29  0:58 UTC (permalink / raw)
  To: kvm, linux-kernel, x86
  Cc: KarimAllah Ahmed, Andi Kleen, Andrea Arcangeli, Andy Lutomirski,
	Arjan van de Ven, Ashok Raj, Asit Mallick, Borislav Petkov,
	Dan Williams, Dave Hansen, David Woodhouse, Greg Kroah-Hartman,
	H . Peter Anvin, Ingo Molnar, Janakarajan Natarajan,
	Joerg Roedel, Jun Nakajima, Laura Abbott, Linus Torvalds,
	Masami Hiramatsu, Paolo Bonzini, Peter Zijlstra,
	Radim Krčmář,
	Thomas Gleixner, Tim Chen, Tom Lendacky

Add direct access to speculation control MSRs for KVM guests. This allows the
guest to protect itself against Spectre V2 using IBRS+IBPB instead of a
retpoline+IBPB-based approach.

It also exposes the ARCH_CAPABILITIES MSR, which future Intel processors will
use to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).
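
For context, a guest kernel taking the IBRS+IBPB route issues plain MSR writes
such as the following on kernel entry/exit and context switches (illustrative
sketch only, using the msr-index.h names; not part of this series), which is
why low-overhead pass-through access matters:

  wrmsrl(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS);	/* restrict on kernel entry */
  wrmsrl(MSR_IA32_SPEC_CTRL, 0);		/* relax on return to user */
  wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);	/* barrier on context switch */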

Ashok Raj (1):
  x86/kvm: Add IBPB support

KarimAllah Ahmed (3):
  x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX
  x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
  x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES

 arch/x86/kvm/cpuid.c |  6 ++++-
 arch/x86/kvm/cpuid.h |  1 +
 arch/x86/kvm/svm.c   | 14 +++++++++++
 arch/x86/kvm/vmx.c   | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c   |  1 +
 5 files changed, 92 insertions(+), 1 deletion(-)

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org

-- 
2.7.4

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX
  2018-01-29  0:58 ` KarimAllah Ahmed
  (?)
@ 2018-01-29  0:58 ` KarimAllah Ahmed
  2018-01-29 10:37   ` Paolo Bonzini
  2018-01-29 18:58   ` Jim Mattson
  -1 siblings, 2 replies; 18+ messages in thread
From: KarimAllah Ahmed @ 2018-01-29  0:58 UTC (permalink / raw)
  To: kvm, linux-kernel, x86
  Cc: KarimAllah Ahmed, Paolo Bonzini, Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H . Peter Anvin

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 arch/x86/kvm/cpuid.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index cdc70a3..dcfe227 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
 	[CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
 	[CPUID_7_ECX]         = {         7, 0, CPUID_ECX},
 	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
+	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
 };
 
 static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
  2018-01-29  0:58 ` KarimAllah Ahmed
  (?)
  (?)
@ 2018-01-29  0:58 ` KarimAllah Ahmed
  2018-01-29  8:15     ` David Woodhouse
                     ` (2 more replies)
  -1 siblings, 3 replies; 18+ messages in thread
From: KarimAllah Ahmed @ 2018-01-29  0:58 UTC (permalink / raw)
  To: kvm, linux-kernel, x86
  Cc: KarimAllah Ahmed, Asit Mallick, Arjan Van De Ven, Dave Hansen,
	Andi Kleen, Andrea Arcangeli, Linus Torvalds, Tim Chen,
	Thomas Gleixner, Dan Williams, Jun Nakajima, Paolo Bonzini,
	David Woodhouse, Greg KH, Andy Lutomirski, Ashok Raj

Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
be using a retpoline+IBPB-based approach.

To avoid the overhead of atomically saving and restoring MSR_IA32_SPEC_CTRL
for guests that do not actually use the MSR, only call add_atomic_switch_msr()
when a non-zero value is written to it.

Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>

---
v2:
- remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
- special case writing '0' in SPEC_CTRL to avoid confusing live-migration
  when the instance never used the MSR (dwmw@).
- depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
- add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
---
 arch/x86/kvm/cpuid.c |  4 +++-
 arch/x86/kvm/vmx.c   | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c   |  1 +
 3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0099e10..32c0c14 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
 /* These are scattered features in cpufeatures.h. */
 #define KVM_CPUID_BIT_AVX512_4VNNIW     2
 #define KVM_CPUID_BIT_AVX512_4FMAPS     3
+#define KVM_CPUID_BIT_IBRS              26
 #define KF(x) bit(KVM_CPUID_BIT_##x)
 
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 
 	/* cpuid 7.0.edx*/
 	const u32 kvm_cpuid_7_0_edx_x86_features =
-		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
+		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index aa8638a..dac564d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
 static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
 					    u16 error_code);
 static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
+static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+							  u32 msr, int type);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 	m->host[i].value = host_val;
 }
 
+/* do not touch guest_val and host_val if the msr is not found */
+static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
+				  u64 *guest_val, u64 *host_val)
+{
+	unsigned i;
+	struct msr_autoload *m = &vmx->msr_autoload;
+
+	for (i = 0; i < m->nr; ++i)
+		if (m->guest[i].index == msr)
+			break;
+
+	if (i == m->nr)
+		return 1;
+
+	if (guest_val)
+		*guest_val = m->guest[i].value;
+	if (host_val)
+		*host_val = m->host[i].value;
+
+	return 0;
+}
+
 static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
 {
 	u64 guest_efer = vmx->vcpu.arch.efer;
@@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
  */
 static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
+	u64 spec_ctrl = 0;
 	struct shared_msr_entry *msr;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
 	switch (msr_info->index) {
 #ifdef CONFIG_X86_64
@@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_TSC:
 		msr_info->data = guest_read_tsc(vcpu);
 		break;
+	case MSR_IA32_SPEC_CTRL:
+		if (!msr_info->host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+			return 1;
+
+		/*
+		 * If the MSR is not in the atomic list yet, then the guest
+		 * never wrote a non-zero value to it yet i.e. the MSR value is
+		 * '0'.
+		 */
+		read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
+
+		msr_info->data = spec_ctrl;
+		break;
 	case MSR_IA32_SYSENTER_CS:
 		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
 		break;
@@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	int ret = 0;
 	u32 msr_index = msr_info->index;
 	u64 data = msr_info->data;
+	unsigned long *msr_bitmap;
 
 	switch (msr_index) {
 	case MSR_EFER:
@@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_TSC:
 		kvm_write_tsc(vcpu, msr_info);
 		break;
+	case MSR_IA32_SPEC_CTRL:
+		if (!msr_info->host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
+			return 1;
+
+		if (!msr_info->data)
+			break;
+
+		/*
+		 * Now we know that the guest is actually using the MSR, so
+		 * atomically load and save the SPEC_CTRL MSR and pass it
+		 * through to the guest.
+		 *
+		 * NOTE:
+		 * IBRS is not supported yet as a mitigation for the host. Once
+		 * it is supported, the "host_value" will need to be '1'
+		 * instead of '0' if IBRS is used also by the host.
+		 */
+		add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
+
+		msr_bitmap = vmx->vmcs01.msr_bitmap;
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
+
+		break;
 	case MSR_IA32_CR_PAT:
 		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
 			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 03869eb..cabaad3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
 #endif
 	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
 	MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
+	MSR_IA32_SPEC_CTRL
 };
 
 static unsigned num_msrs_to_save;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] x86/kvm: Add IBPB support
  2018-01-29  0:58 ` KarimAllah Ahmed
                   ` (2 preceding siblings ...)
  (?)
@ 2018-01-29  0:58 ` KarimAllah Ahmed
  2018-01-29 19:19   ` Jim Mattson
  -1 siblings, 1 reply; 18+ messages in thread
From: KarimAllah Ahmed @ 2018-01-29  0:58 UTC (permalink / raw)
  To: kvm, linux-kernel, x86
  Cc: Ashok Raj, Asit Mallick, Dave Hansen, Arjan Van De Ven, Tim Chen,
	Linus Torvalds, Andrea Arcangeli, Andi Kleen, Thomas Gleixner,
	Dan Williams, Jun Nakajima, Andy Lutomirski, Greg KH,
	Paolo Bonzini, Peter Zijlstra, David Woodhouse, KarimAllah Ahmed

From: Ashok Raj <ashok.raj@intel.com>

Add MSR passthrough for MSR_IA32_PRED_CMD and place branch predictor
barriers on switching between VMs to avoid inter-VM Spectre v2 attacks.

[peterz: rebase and changelog rewrite]
[karahmed: - rebase
           - vmx: expose PRED_CMD whenever it is available
           - svm: only pass through IBPB if it is available]
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 arch/x86/kvm/svm.c | 14 ++++++++++++++
 arch/x86/kvm/vmx.c |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2744b973..c886e46 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -529,6 +529,7 @@ struct svm_cpu_data {
 	struct kvm_ldttss_desc *tss_desc;
 
 	struct page *save_area;
+	struct vmcb *current_vmcb;
 };
 
 static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -918,6 +919,9 @@ static void svm_vcpu_init_msrpm(u32 *msrpm)
 
 		set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
 	}
+
+	if (boot_cpu_has(X86_FEATURE_IBPB))
+		set_msr_interception(msrpm, MSR_IA32_PRED_CMD, 1, 1);
 }
 
 static void add_msr_offset(u32 offset)
@@ -1706,11 +1710,17 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
 	kvm_vcpu_uninit(vcpu);
 	kmem_cache_free(kvm_vcpu_cache, svm);
+	/*
+	 * The vmcb page can be recycled, causing a false negative in
+	 * svm_vcpu_load(). So do a full IBPB now.
+	 */
+	indirect_branch_prediction_barrier();
 }
 
 static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
 	int i;
 
 	if (unlikely(cpu != vcpu->cpu)) {
@@ -1739,6 +1749,10 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (static_cpu_has(X86_FEATURE_RDTSCP))
 		wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
 
+	if (sd->current_vmcb != svm->vmcb) {
+		sd->current_vmcb = svm->vmcb;
+		indirect_branch_prediction_barrier();
+	}
 	avic_vcpu_load(vcpu, cpu);
 }
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index dac564d..f82a44c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2296,6 +2296,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
 		per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
 		vmcs_load(vmx->loaded_vmcs->vmcs);
+		indirect_branch_prediction_barrier();
 	}
 
 	if (!already_loaded) {
@@ -9613,6 +9614,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 		goto free_msrs;
 
 	msr_bitmap = vmx->vmcs01.msr_bitmap;
+
+	if (boot_cpu_has(X86_FEATURE_IBPB))
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES
  2018-01-29  0:58 ` KarimAllah Ahmed
                   ` (3 preceding siblings ...)
  (?)
@ 2018-01-29  0:58 ` KarimAllah Ahmed
  2018-01-29 10:45   ` Paolo Bonzini
  2018-01-29 18:55   ` Jim Mattson
  -1 siblings, 2 replies; 18+ messages in thread
From: KarimAllah Ahmed @ 2018-01-29  0:58 UTC (permalink / raw)
  To: kvm, linux-kernel, x86
  Cc: KarimAllah Ahmed, Asit Mallick, Dave Hansen, Arjan Van De Ven,
	Tim Chen, Linus Torvalds, Andrea Arcangeli, Andi Kleen,
	Thomas Gleixner, Dan Williams, Jun Nakajima, Andy Lutomirski,
	Greg KH, Paolo Bonzini, Ashok Raj

Add direct access to MSR_IA32_ARCH_CAPABILITIES for guests. Future Intel
processors will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).

Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 arch/x86/kvm/cpuid.c | 4 +++-
 arch/x86/kvm/vmx.c   | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 32c0c14..2339b1a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
 #define KVM_CPUID_BIT_AVX512_4VNNIW     2
 #define KVM_CPUID_BIT_AVX512_4FMAPS     3
 #define KVM_CPUID_BIT_IBRS              26
+#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
 #define KF(x) bit(KVM_CPUID_BIT_##x)
 
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.edx*/
 	const u32 kvm_cpuid_7_0_edx_x86_features =
 		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
-		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
+		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
+		(boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f82a44c..99cb761 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 
 	if (boot_cpu_has(X86_FEATURE_IBPB))
 		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
+	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
 	vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
  2018-01-29  0:58 ` [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL KarimAllah Ahmed
@ 2018-01-29  8:15     ` David Woodhouse
  2018-01-29 10:44   ` Paolo Bonzini
  2018-01-29 19:03   ` Jim Mattson
  2 siblings, 0 replies; 18+ messages in thread
From: David Woodhouse @ 2018-01-29  8:15 UTC (permalink / raw)
  To: KarimAllah Ahmed, kvm, linux-kernel, x86
  Cc: Asit Mallick, Arjan Van De Ven, Dave Hansen, Andi Kleen,
	Andrea Arcangeli, Linus Torvalds, Tim Chen, Thomas Gleixner,
	Dan Williams, Jun Nakajima, Paolo Bonzini, Greg KH,
	Andy Lutomirski, Ashok Raj

On Mon, 2018-01-29 at 01:58 +0100, KarimAllah Ahmed wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
> be using a retpoline+IBPB based approach.
> 
> To avoid the overhead of atomically saving and restoring the
> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
> add_atomic_switch_msr when a non-zero is written to it.
> 
> Cc: Asit Mallick <asit.k.mallick@intel.com>
> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> 
> ---
> v2:
> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
>   when the instance never used the MSR (dwmw@).

Possibly wants a comment in the code explaining this in slightly more
detail. The point being that if we migrate a guest which has never used
the MSR, we don't want the act of setting it to zero on resume to flip
it into the auto-saved mode.
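
Something along these lines above the early return, perhaps (wording sketch
only):

	case MSR_IA32_SPEC_CTRL:
		...
		/*
		 * A write of zero while the MSR is not yet on the atomic
		 * switch list is ignored: otherwise a host-initiated '0'
		 * write on live-migration resume would flip a guest that
		 * never used SPEC_CTRL into the auto-save/restore mode.
		 */
		if (!msr_info->data)
			break;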
 
> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
> ---
>  arch/x86/kvm/cpuid.c |  4 +++-
>  arch/x86/kvm/vmx.c   | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c   |  1 +
>  3 files changed, 69 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..32c0c14 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>  /* These are scattered features in cpufeatures.h. */
>  #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>  #define KVM_CPUID_BIT_AVX512_4FMAPS     3
> +#define KVM_CPUID_BIT_IBRS              26
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>  
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  
>  	/* cpuid 7.0.edx*/
>  	const u32 kvm_cpuid_7_0_edx_x86_features =
> -		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
> +		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> +		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>  	/* all calls to cpuid_count() should be made on the same cpu */
>  	get_cpu();

I think we need to expose more feature bits than that. See
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/pti&id=2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2

There are three AMD bits for IBRS, IBPB & STIBP which are the user-
visible ones in /proc/cpuinfo, and the ones we use within the kernel to
indicate the hardware availability (there are separate feature bits for
when we're *using* IBPB etc., but that's only because feature bits are
the only thing that ALTERNATIVEs can work from).

In addition to those bits, Intel has its own. The Intel SPEC_CTRL CPUID
bit (which you're setting above) indicates *both* IBRS and IBPB
capability. The kernel sets the corresponding AMD bits when it sees
SPEC_CTRL. Likewise Intel has a different bit for STIBP.

You could construct a set of CPUID bits for the guest based on what the
host has. So all three of the AMD IBRS/IBPB/STIBP bits in 80000008/EBX
should just be passed through, and you could set the Intel SPEC_CTRL
bit (7/EDX bit 26 that you're looking at above) only when you have
X86_FEATURE_IBPB && X86_FEATURE_IBRS. And the Intel STIBP when you have
X86_FEATURE_STIBP.

The Intel ARCH_CAPABILITIES CPUID bit is separate. Pass that through if
you have it, and expose the corresponding MSR read-only.
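
Roughly like this (sketch only; the F()/KF() names for the new bits are
placeholders for whatever cpuid.c ends up defining):

	/* cpuid 0x80000008.ebx: AMD bits, passed straight through from the host */
	const u32 kvm_cpuid_8000_0008_ebx_x86_features =
		F(IBPB) | F(IBRS) | F(STIBP);

	/* cpuid 7.0.edx: Intel bits, derived from what the host has */
	const u32 kvm_cpuid_7_0_edx_x86_features =
		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) |
		((boot_cpu_has(X86_FEATURE_IBRS) &&
		  boot_cpu_has(X86_FEATURE_IBPB)) ? KF(SPEC_CTRL) : 0) |
		(boot_cpu_has(X86_FEATURE_STIBP) ? KF(INTEL_STIBP) : 0) |
		(boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);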


> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aa8638a..dac564d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>  static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>  					    u16 error_code);
>  static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
> +							  u32 msr, int type);
>  
>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);

Perhaps move that whole function further up?

>  static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>  {
>  	u64 guest_efer = vmx->vcpu.arch.efer;
> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>   */
>  static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  {
> +	u64 spec_ctrl = 0;

Could you ditch this additional variable and...

>  	struct shared_msr_entry *msr;
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
>  	switch (msr_info->index) {
>  #ifdef CONFIG_X86_64
> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_TSC:
>  		msr_info->data = guest_read_tsc(vcpu);
>  		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		if (!msr_info->host_initiated &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> +			return 1;
> +
> +		/*
> +		 * If the MSR is not in the atomic list yet, then the guest
> +		 * never wrote a non-zero value to it yet i.e. the MSR value is
> +		 * '0'.
> +		 */

...
    if (read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &msr_info->data, NULL))
        msr_info->data = 0;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX
  2018-01-29  0:58 ` [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX KarimAllah Ahmed
@ 2018-01-29 10:37   ` Paolo Bonzini
  2018-01-29 18:58   ` Jim Mattson
  1 sibling, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2018-01-29 10:37 UTC (permalink / raw)
  To: KarimAllah Ahmed, kvm, linux-kernel, x86
  Cc: Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H . Peter Anvin

On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  arch/x86/kvm/cpuid.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index cdc70a3..dcfe227 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
>  	[CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
>  	[CPUID_7_ECX]         = {         7, 0, CPUID_ECX},
>  	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
> +	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
>  };
>  
>  static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
> 

You can also stop using KF(...) and KVM_CPUID_BIT_* for bits that come
from this leaf.  F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(IBRS) will be
enough.
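
i.e. the 7.0.edx mask in patch 2 would then collapse to something like:

	/* cpuid 7.0.edx */
	const u32 kvm_cpuid_7_0_edx_x86_features =
		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(IBRS);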

Paolo

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
  2018-01-29  0:58 ` [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL KarimAllah Ahmed
  2018-01-29  8:15     ` David Woodhouse
@ 2018-01-29 10:44   ` Paolo Bonzini
  2018-01-29 12:12     ` KarimAllah Ahmed
  2018-01-29 19:03   ` Jim Mattson
  2 siblings, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2018-01-29 10:44 UTC (permalink / raw)
  To: KarimAllah Ahmed, kvm, linux-kernel, x86
  Cc: Asit Mallick, Arjan Van De Ven, Dave Hansen, Andi Kleen,
	Andrea Arcangeli, Linus Torvalds, Tim Chen, Thomas Gleixner,
	Dan Williams, Jun Nakajima, David Woodhouse, Greg KH,
	Andy Lutomirski, Ashok Raj

On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
> be using a retpoline+IBPB based approach.
> 
> To avoid the overhead of atomically saving and restoring the
> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
> add_atomic_switch_msr when a non-zero is written to it.

You are not storing the guest's MSR value on vmexit though, are you?
Also, there's an obvious typo here:

+		add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
+
+		msr_bitmap = vmx->vmcs01.msr_bitmap;
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
+

Finally, apparently add_atomic_switch_msr is slower than just rdmsr/wrmsr
on vmexit.  Can you reuse the patches I had posted in mid-January instead?  They
are also assuming no IBRS usage on the host, so the changes shouldn't be large,
and should be limited mostly to using actual X86_FEATURE_* bits instead of
cpuid_count().

They lack the code to only read/write SPEC_CTRL if the direct access is enabled,
but that's small too...  Enabling the direct access on the first write, as in
this patch, is okay.
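
For reference, the shape of that alternative is roughly the following (sketch;
'spec_ctrl' and 'spec_ctrl_used' are hypothetical vcpu_vmx fields, and the host
is assumed not to use IBRS):

	/* in vmx_vcpu_run(), just before VM entry */
	if (vmx->spec_ctrl_used && vmx->spec_ctrl)
		wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);

	/* ... VMLAUNCH/VMRESUME ... */

	/* just after VM exit: save the guest value, restore the host's (0) */
	if (vmx->spec_ctrl_used) {
		rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
		if (vmx->spec_ctrl)
			wrmsrl(MSR_IA32_SPEC_CTRL, 0);
	}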

Thanks,

Paolo

> Cc: Asit Mallick <asit.k.mallick@intel.com>
> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> 
> ---
> v2:
> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
>   when the instance never used the MSR (dwmw@).
> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
> ---
>  arch/x86/kvm/cpuid.c |  4 +++-
>  arch/x86/kvm/vmx.c   | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c   |  1 +
>  3 files changed, 69 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..32c0c14 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>  /* These are scattered features in cpufeatures.h. */
>  #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>  #define KVM_CPUID_BIT_AVX512_4FMAPS     3
> +#define KVM_CPUID_BIT_IBRS              26
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>  
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  
>  	/* cpuid 7.0.edx*/
>  	const u32 kvm_cpuid_7_0_edx_x86_features =
> -		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
> +		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> +		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>  
>  	/* all calls to cpuid_count() should be made on the same cpu */
>  	get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aa8638a..dac564d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>  static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>  					    u16 error_code);
>  static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
> +							  u32 msr, int type);
>  
>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
> @@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>  	m->host[i].value = host_val;
>  }
>  
> +/* do not touch guest_val and host_val if the msr is not found */
> +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> +				  u64 *guest_val, u64 *host_val)
> +{
> +	unsigned i;
> +	struct msr_autoload *m = &vmx->msr_autoload;
> +
> +	for (i = 0; i < m->nr; ++i)
> +		if (m->guest[i].index == msr)
> +			break;
> +
> +	if (i == m->nr)
> +		return 1;
> +
> +	if (guest_val)
> +		*guest_val = m->guest[i].value;
> +	if (host_val)
> +		*host_val = m->host[i].value;
> +
> +	return 0;
> +}
> +
>  static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>  {
>  	u64 guest_efer = vmx->vcpu.arch.efer;
> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>   */
>  static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  {
> +	u64 spec_ctrl = 0;
>  	struct shared_msr_entry *msr;
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
>  	switch (msr_info->index) {
>  #ifdef CONFIG_X86_64
> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_TSC:
>  		msr_info->data = guest_read_tsc(vcpu);
>  		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		if (!msr_info->host_initiated &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> +			return 1;
> +
> +		/*
> +		 * If the MSR is not in the atomic list yet, then the guest
> +		 * never wrote a non-zero value to it yet i.e. the MSR value is
> +		 * '0'.
> +		 */
> +		read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
> +
> +		msr_info->data = spec_ctrl;
> +		break;
>  	case MSR_IA32_SYSENTER_CS:
>  		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>  		break;
> @@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	int ret = 0;
>  	u32 msr_index = msr_info->index;
>  	u64 data = msr_info->data;
> +	unsigned long *msr_bitmap;
>  
>  	switch (msr_index) {
>  	case MSR_EFER:
> @@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_TSC:
>  		kvm_write_tsc(vcpu, msr_info);
>  		break;
> +	case MSR_IA32_SPEC_CTRL:
> +		if (!msr_info->host_initiated &&
> +		    !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
> +			return 1;
> +
> +		if (!msr_info->data)
> +			break;
> +
> +		/*
> +		 * Now we know that the guest is actually using the MSR, so
> +		 * atomically load and save the SPEC_CTRL MSR and pass it
> +		 * through to the guest.
> +		 *
> +		 * NOTE:
> +		 * IBRS is not supported yet as a mitigation for the host. Once
> +		 * it is supported, the "host_value" will need to be '1'
> +		 * instead of '0' if IBRS is used also by the host.
> +		 */
> +		add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
> +
> +		msr_bitmap = vmx->vmcs01.msr_bitmap;
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> +
> +		break;
>  	case MSR_IA32_CR_PAT:
>  		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>  			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 03869eb..cabaad3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
>  #endif
>  	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
>  	MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
> +	MSR_IA32_SPEC_CTRL
>  };
>  
>  static unsigned num_msrs_to_save;
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES
  2018-01-29  0:58 ` [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES KarimAllah Ahmed
@ 2018-01-29 10:45   ` Paolo Bonzini
  2018-01-29 14:18     ` Van De Ven, Arjan
  2018-01-29 18:55   ` Jim Mattson
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Bonzini @ 2018-01-29 10:45 UTC (permalink / raw)
  To: KarimAllah Ahmed, kvm, linux-kernel, x86
  Cc: Asit Mallick, Dave Hansen, Arjan Van De Ven, Tim Chen,
	Linus Torvalds, Andrea Arcangeli, Andi Kleen, Thomas Gleixner,
	Dan Williams, Jun Nakajima, Andy Lutomirski, Greg KH, Ashok Raj

On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. Future intel processors
> will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).

This has to be customizable per-VM (similar to the patches Amazon posted
a while ago for UCODE_REV for example).  It's not urgent anyway given
that no processor has it yet.
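
i.e. rather than pass-through, keep a per-vCPU value that host userspace can
set (sketch; the 'arch_capabilities' field is hypothetical):

	/* vmx_get_msr() */
	case MSR_IA32_ARCH_CAPABILITIES:
		if (!msr_info->host_initiated &&
		    !guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
			return 1;
		msr_info->data = vmx->arch_capabilities;
		break;

	/* vmx_set_msr(): only the host may seed/override the value */
	case MSR_IA32_ARCH_CAPABILITIES:
		if (!msr_info->host_initiated)
			return 1;
		vmx->arch_capabilities = data;
		break;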

Thanks,

Paolo

> Cc: Asit Mallick <asit.k.mallick@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  arch/x86/kvm/cpuid.c | 4 +++-
>  arch/x86/kvm/vmx.c   | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 32c0c14..2339b1a 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
>  #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>  #define KVM_CPUID_BIT_AVX512_4FMAPS     3
>  #define KVM_CPUID_BIT_IBRS              26
> +#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>  
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  	/* cpuid 7.0.edx*/
>  	const u32 kvm_cpuid_7_0_edx_x86_features =
>  		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> -		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
> +		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
> +		(boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);
>  
>  	/* all calls to cpuid_count() should be made on the same cpu */
>  	get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f82a44c..99cb761 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>  
>  	if (boot_cpu_has(X86_FEATURE_IBPB))
>  		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
> +	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
>  	vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>  	vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
>  	vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
  2018-01-29 10:44   ` Paolo Bonzini
@ 2018-01-29 12:12     ` KarimAllah Ahmed
  0 siblings, 0 replies; 18+ messages in thread
From: KarimAllah Ahmed @ 2018-01-29 12:12 UTC (permalink / raw)
  To: Paolo Bonzini, KarimAllah Ahmed, kvm, linux-kernel, x86
  Cc: Asit Mallick, Arjan Van De Ven, Dave Hansen, Andi Kleen,
	Andrea Arcangeli, Linus Torvalds, Tim Chen, Thomas Gleixner,
	Dan Williams, Jun Nakajima, David Woodhouse, Greg KH,
	Andy Lutomirski, Ashok Raj

On 01/29/2018 11:44 AM, Paolo Bonzini wrote:
> On 29/01/2018 01:58, KarimAllah Ahmed wrote:
>> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
>> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
>> be using a retpoline+IBPB based approach.
>>
>> To avoid the overhead of atomically saving and restoring the
>> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
>> add_atomic_switch_msr when a non-zero is written to it.
> 
> You are not storing the guest's MSR value on vmexit though, are you?

I originally thought that atomic_switch was also saving the guest MSR on 
VM-exit. Now I know it is not.
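
(add_atomic_switch_msr() only programs the VM-entry MSR-load and VM-exit
MSR-load areas; writing the guest value back on exit would also need the
VM-exit MSR-store area to be set up, e.g. -- sketch only:

	vmcs_write64(VM_EXIT_MSR_STORE_ADDR, __pa(vmx->msr_autoload.guest));
	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autoload.nr);
)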

> Also, there's an obvious typo here:
> 
> +		add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
> +
> +		msr_bitmap = vmx->vmcs01.msr_bitmap;
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
> +

oops! copy & paste error :)

> 
> Finally, apparently add_atomic_switch_msr is slower than just rdmsr/wrmsr
> on vmexit.  Can you reuse the patches I had posted mid January instead?  They
> are also assuming no IBRS usage on the host, so the changes shouldn't be large,
> and limited mostly to using actual X86_FEATURE_* bits instead of cpuid_count().
> 
> They lack the code to only read/write SPEC_CTRL if the direct access is enabled,
> but that's small too...  Enabling the direct access on the first write, as in
> this patches, is okay.
> 
> Thanks,
> 
> Paolo
> 
>> Cc: Asit Mallick <asit.k.mallick@intel.com>
>> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
>> Cc: Dave Hansen <dave.hansen@intel.com>
>> Cc: Andi Kleen <ak@linux.intel.com>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>> Cc: Greg KH <gregkh@linuxfoundation.org>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
>>
>> ---
>> v2:
>> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
>> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
>>    when the instance never used the MSR (dwmw@).
>> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
>> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
>> ---
>>   arch/x86/kvm/cpuid.c |  4 +++-
>>   arch/x86/kvm/vmx.c   | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   arch/x86/kvm/x86.c   |  1 +
>>   3 files changed, 69 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 0099e10..32c0c14 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>>   /* These are scattered features in cpufeatures.h. */
>>   #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>>   #define KVM_CPUID_BIT_AVX512_4FMAPS     3
>> +#define KVM_CPUID_BIT_IBRS              26
>>   #define KF(x) bit(KVM_CPUID_BIT_##x)
>>   
>>   int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>   
>>   	/* cpuid 7.0.edx*/
>>   	const u32 kvm_cpuid_7_0_edx_x86_features =
>> -		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
>> +		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
>> +		(boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>>   
>>   	/* all calls to cpuid_count() should be made on the same cpu */
>>   	get_cpu();
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index aa8638a..dac564d 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>>   static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>>   					    u16 error_code);
>>   static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
>> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
>> +							  u32 msr, int type);
>>   
>>   static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>>   static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
>> @@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>>   	m->host[i].value = host_val;
>>   }
>>   
>> +/* do not touch guest_val and host_val if the msr is not found */
>> +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>> +				  u64 *guest_val, u64 *host_val)
>> +{
>> +	unsigned i;
>> +	struct msr_autoload *m = &vmx->msr_autoload;
>> +
>> +	for (i = 0; i < m->nr; ++i)
>> +		if (m->guest[i].index == msr)
>> +			break;
>> +
>> +	if (i == m->nr)
>> +		return 1;
>> +
>> +	if (guest_val)
>> +		*guest_val = m->guest[i].value;
>> +	if (host_val)
>> +		*host_val = m->host[i].value;
>> +
>> +	return 0;
>> +}
>> +
>>   static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>>   {
>>   	u64 guest_efer = vmx->vcpu.arch.efer;
>> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>>    */
>>   static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>   {
>> +	u64 spec_ctrl = 0;
>>   	struct shared_msr_entry *msr;
>> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>   
>>   	switch (msr_info->index) {
>>   #ifdef CONFIG_X86_64
>> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>   	case MSR_IA32_TSC:
>>   		msr_info->data = guest_read_tsc(vcpu);
>>   		break;
>> +	case MSR_IA32_SPEC_CTRL:
>> +		if (!msr_info->host_initiated &&
>> +		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
>> +			return 1;
>> +
>> +		/*
>> +		 * If the MSR is not in the atomic list yet, then the guest
>> +		 * never wrote a non-zero value to it yet i.e. the MSR value is
>> +		 * '0'.
>> +		 */
>> +		read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
>> +
>> +		msr_info->data = spec_ctrl;
>> +		break;
>>   	case MSR_IA32_SYSENTER_CS:
>>   		msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>>   		break;
>> @@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>   	int ret = 0;
>>   	u32 msr_index = msr_info->index;
>>   	u64 data = msr_info->data;
>> +	unsigned long *msr_bitmap;
>>   
>>   	switch (msr_index) {
>>   	case MSR_EFER:
>> @@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>   	case MSR_IA32_TSC:
>>   		kvm_write_tsc(vcpu, msr_info);
>>   		break;
>> +	case MSR_IA32_SPEC_CTRL:
>> +		if (!msr_info->host_initiated &&
>> +		    !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
>> +			return 1;
>> +
>> +		if (!msr_info->data)
>> +			break;
>> +
>> +		/*
>> +		 * Now we know that the guest is actually using the MSR, so
>> +		 * atomically load and save the SPEC_CTRL MSR and pass it
>> +		 * through to the guest.
>> +		 *
>> +		 * NOTE:
>> +		 * IBRS is not supported yet as a mitigation for the host. Once
>> +		 * it is supported, the "host_value" will need to be '1'
>> +		 * instead of '0' if IBRS is used also by the host.
>> +		 */
>> +		add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
>> +
>> +		msr_bitmap = vmx->vmcs01.msr_bitmap;
>> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>> +
>> +		break;
>>   	case MSR_IA32_CR_PAT:
>>   		if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>>   			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 03869eb..cabaad3 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
>>   #endif
>>   	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
>>   	MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
>> +	MSR_IA32_SPEC_CTRL
>>   };
>>   
>>   static unsigned num_msrs_to_save;
>>
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES
  2018-01-29 10:45   ` Paolo Bonzini
@ 2018-01-29 14:18     ` Van De Ven, Arjan
  0 siblings, 0 replies; 18+ messages in thread
From: Van De Ven, Arjan @ 2018-01-29 14:18 UTC (permalink / raw)
  To: Paolo Bonzini, KarimAllah Ahmed, kvm, linux-kernel, x86
  Cc: Mallick, Asit K, Hansen, Dave, Tim Chen, Linus Torvalds,
	Andrea Arcangeli, Andi Kleen, Thomas Gleixner, Williams, Dan J,
	Nakajima, Jun, Andy Lutomirski, Greg KH, Raj, Ashok

> On 29/01/2018 01:58, KarimAllah Ahmed wrote:
> > Add direct access to MSR_IA32_ARCH_CAPABILITIES for guests. Future Intel processors
> > will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).
> 
> This has to be customizable per-VM (similar to the patches Amazon posted
> a while ago for UCODE_REV for example).  It's not urgent anyway given
> that no processor has it yet.


Apollo lake has it once you update the microcode.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES
  2018-01-29  0:58 ` [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES KarimAllah Ahmed
  2018-01-29 10:45   ` Paolo Bonzini
@ 2018-01-29 18:55   ` Jim Mattson
  2018-01-29 18:57     ` KarimAllah Ahmed
  1 sibling, 1 reply; 18+ messages in thread
From: Jim Mattson @ 2018-01-29 18:55 UTC (permalink / raw)
  To: KarimAllah Ahmed
  Cc: kvm list, LKML, the arch/x86 maintainers, Asit Mallick,
	Dave Hansen, Arjan Van De Ven, Tim Chen, Linus Torvalds,
	Andrea Arcangeli, Andi Kleen, Thomas Gleixner, Dan Williams,
	Jun Nakajima, Andy Lutomirski, Greg KH, Paolo Bonzini, Ashok Raj

Why should this MSR be pass-through? I doubt that it would be accessed
frequently.
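
For illustration, an intercepted (emulated) read could look roughly like
the sketch below (untested; the cached vmx->arch_capabilities field is an
assumption, not something this series adds, and the guest CPUID check
relies on the CPUID_7_EDX mapping from patch 1):

	case MSR_IA32_ARCH_CAPABILITIES:
		if (!msr_info->host_initiated &&
		    !guest_cpuid_has(vcpu, X86_FEATURE_ARCH_CAPABILITIES))
			return 1;
		/* Report a value cached at vCPU setup instead of the raw MSR. */
		msr_info->data = to_vmx(vcpu)->arch_capabilities;
		break;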

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <karahmed@amazon.de> wrote:
> Add direct access to MSR_IA32_ARCH_CAPABILITIES for guests. Future Intel processors
> will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).
>
> Cc: Asit Mallick <asit.k.mallick@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  arch/x86/kvm/cpuid.c | 4 +++-
>  arch/x86/kvm/vmx.c   | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 32c0c14..2339b1a 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
>  #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>  #define KVM_CPUID_BIT_AVX512_4FMAPS     3
>  #define KVM_CPUID_BIT_IBRS              26
> +#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>         /* cpuid 7.0.edx*/
>         const u32 kvm_cpuid_7_0_edx_x86_features =
>                 KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> -               (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
> +               (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
> +               (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);
>
>         /* all calls to cpuid_count() should be made on the same cpu */
>         get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f82a44c..99cb761 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>
>         if (boot_cpu_has(X86_FEATURE_IBPB))
>                 vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
> +       if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
> +               vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
>         vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>         vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
>         vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES
  2018-01-29 18:55   ` Jim Mattson
@ 2018-01-29 18:57     ` KarimAllah Ahmed
  0 siblings, 0 replies; 18+ messages in thread
From: KarimAllah Ahmed @ 2018-01-29 18:57 UTC (permalink / raw)
  To: Jim Mattson, KarimAllah Ahmed
  Cc: kvm list, LKML, the arch/x86 maintainers, Asit Mallick,
	Dave Hansen, Arjan Van De Ven, Tim Chen, Linus Torvalds,
	Andrea Arcangeli, Andi Kleen, Thomas Gleixner, Dan Williams,
	Jun Nakajima, Andy Lutomirski, Greg KH, Paolo Bonzini, Ashok Raj

On 01/29/2018 07:55 PM, Jim Mattson wrote:
> Why should this MSR be pass-through? I doubt that it would be accessed
> frequently.

True. Will update it to be emulated and allow user-space to set the 
value exposed.
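
A minimal sketch of that (assuming a per-vCPU arch_capabilities field,
initialized from the host MSR at vCPU creation and writable only by
user-space, neither of which exists in this series yet):

	case MSR_IA32_ARCH_CAPABILITIES:
		if (!msr_info->host_initiated)
			return 1;	/* reject guest writes */
		vmx->arch_capabilities = data;
		break;

with vmx_get_msr() returning vmx->arch_capabilities, so that user-space
(e.g. via KVM_SET_MSRS) fully controls the value the guest sees.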

> 
> On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <karahmed@amazon.de> wrote:
>> Add direct access to MSR_IA32_ARCH_CAPABILITIES for guests. Future Intel processors
>> will use this MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1).
>>
>> Cc: Asit Mallick <asit.k.mallick@intel.com>
>> Cc: Dave Hansen <dave.hansen@intel.com>
>> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Andi Kleen <ak@linux.intel.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Greg KH <gregkh@linuxfoundation.org>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
>> ---
>>   arch/x86/kvm/cpuid.c | 4 +++-
>>   arch/x86/kvm/vmx.c   | 2 ++
>>   2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 32c0c14..2339b1a 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -71,6 +71,7 @@ u64 kvm_supported_xcr0(void)
>>   #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>>   #define KVM_CPUID_BIT_AVX512_4FMAPS     3
>>   #define KVM_CPUID_BIT_IBRS              26
>> +#define KVM_CPUID_BIT_ARCH_CAPABILITIES 29
>>   #define KF(x) bit(KVM_CPUID_BIT_##x)
>>
>>   int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>> @@ -394,7 +395,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>>          /* cpuid 7.0.edx*/
>>          const u32 kvm_cpuid_7_0_edx_x86_features =
>>                  KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
>> -               (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>> +               (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0) | \
>> +               (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) ? KF(ARCH_CAPABILITIES) : 0);
>>
>>          /* all calls to cpuid_count() should be made on the same cpu */
>>          get_cpu();
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index f82a44c..99cb761 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -9617,6 +9617,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>>
>>          if (boot_cpu_has(X86_FEATURE_IBPB))
>>                  vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);
>> +       if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
>> +               vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_ARCH_CAPABILITIES, MSR_TYPE_R);
>>          vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>>          vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
>>          vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
>> --
>> 2.7.4
>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX
  2018-01-29  0:58 ` [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX KarimAllah Ahmed
  2018-01-29 10:37   ` Paolo Bonzini
@ 2018-01-29 18:58   ` Jim Mattson
  1 sibling, 0 replies; 18+ messages in thread
From: Jim Mattson @ 2018-01-29 18:58 UTC (permalink / raw)
  To: KarimAllah Ahmed
  Cc: kvm list, LKML, the arch/x86 maintainers, Paolo Bonzini,
	Radim Krčmář,
	Thomas Gleixner, Ingo Molnar, H . Peter Anvin

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <karahmed@amazon.de> wrote:
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  arch/x86/kvm/cpuid.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index cdc70a3..dcfe227 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -54,6 +54,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
>         [CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},
>         [CPUID_7_ECX]         = {         7, 0, CPUID_ECX},
>         [CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
> +       [CPUID_7_EDX]         = {         7, 0, CPUID_EDX},

I haven't seen the change that introduces CPUID_7_EDX, but shouldn't
it be CPUID_7_0_EDX?

>  };
>
>  static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned x86_feature)
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL
  2018-01-29  0:58 ` [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL KarimAllah Ahmed
  2018-01-29  8:15     ` David Woodhouse
  2018-01-29 10:44   ` Paolo Bonzini
@ 2018-01-29 19:03   ` Jim Mattson
  2 siblings, 0 replies; 18+ messages in thread
From: Jim Mattson @ 2018-01-29 19:03 UTC (permalink / raw)
  To: KarimAllah Ahmed
  Cc: kvm list, LKML, the arch/x86 maintainers, Asit Mallick,
	Arjan Van De Ven, Dave Hansen, Andi Kleen, Andrea Arcangeli,
	Linus Torvalds, Tim Chen, Thomas Gleixner, Dan Williams,
	Jun Nakajima, Paolo Bonzini, David Woodhouse, Greg KH,
	Andy Lutomirski, Ashok Raj

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <karahmed@amazon.de> wrote:
> Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
> guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
> be using a retpoline+IBPB based approach.
>
> To avoid the overhead of atomically saving and restoring the
> MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only
> call add_atomic_switch_msr() when a non-zero value is written to it.
>
> Cc: Asit Mallick <asit.k.mallick@intel.com>
> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
>
> ---
> v2:
> - remove 'host_spec_ctrl' in favor of only a comment (dwmw@).
> - special case writing '0' in SPEC_CTRL to avoid confusing live-migration
>   when the instance never used the MSR (dwmw@).
> - depend on X86_FEATURE_IBRS instead of X86_FEATURE_SPEC_CTRL (dwmw@).
> - add MSR_IA32_SPEC_CTRL to the list of MSRs to save (dropped it by accident).
> ---
>  arch/x86/kvm/cpuid.c |  4 +++-
>  arch/x86/kvm/vmx.c   | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c   |  1 +
>  3 files changed, 69 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..32c0c14 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
>  /* These are scattered features in cpufeatures.h. */
>  #define KVM_CPUID_BIT_AVX512_4VNNIW     2
>  #define KVM_CPUID_BIT_AVX512_4FMAPS     3
> +#define KVM_CPUID_BIT_IBRS              26
>  #define KF(x) bit(KVM_CPUID_BIT_##x)
>
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -392,7 +393,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>
>         /* cpuid 7.0.edx*/
>         const u32 kvm_cpuid_7_0_edx_x86_features =
> -               KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
> +               KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) | \
> +               (boot_cpu_has(X86_FEATURE_IBRS) ? KF(IBRS) : 0);
>
>         /* all calls to cpuid_count() should be made on the same cpu */
>         get_cpu();
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aa8638a..dac564d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -920,6 +920,8 @@ static void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked);
>  static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
>                                             u16 error_code);
>  static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
> +static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
> +                                                         u32 msr, int type);
>
>  static DEFINE_PER_CPU(struct vmcs *, vmxarea);
>  static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
> @@ -2007,6 +2009,28 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
>         m->host[i].value = host_val;
>  }
>
> +/* do not touch guest_val and host_val if the msr is not found */
> +static int read_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
> +                                 u64 *guest_val, u64 *host_val)
> +{
> +       unsigned i;
> +       struct msr_autoload *m = &vmx->msr_autoload;
> +
> +       for (i = 0; i < m->nr; ++i)
> +               if (m->guest[i].index == msr)
> +                       break;
> +
> +       if (i == m->nr)
> +               return 1;
> +
> +       if (guest_val)
> +               *guest_val = m->guest[i].value;
> +       if (host_val)
> +               *host_val = m->host[i].value;
> +
> +       return 0;
> +}
> +
>  static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
>  {
>         u64 guest_efer = vmx->vcpu.arch.efer;
> @@ -3203,7 +3227,9 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>   */
>  static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  {
> +       u64 spec_ctrl = 0;
>         struct shared_msr_entry *msr;
> +       struct vcpu_vmx *vmx = to_vmx(vcpu);
>
>         switch (msr_info->index) {
>  #ifdef CONFIG_X86_64
> @@ -3223,6 +3249,20 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case MSR_IA32_TSC:
>                 msr_info->data = guest_read_tsc(vcpu);
>                 break;
> +       case MSR_IA32_SPEC_CTRL:
> +               if (!msr_info->host_initiated &&
> +                   !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
> +                       return 1;
> +
> +               /*
> +                * If the MSR is not on the atomic switch list yet, the guest has
> +                * never written a non-zero value to it, i.e. the MSR value is
> +                * '0'.
> +                */
> +               read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, &spec_ctrl, NULL);
> +
> +               msr_info->data = spec_ctrl;
> +               break;
>         case MSR_IA32_SYSENTER_CS:
>                 msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
>                 break;
> @@ -3289,6 +3329,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         int ret = 0;
>         u32 msr_index = msr_info->index;
>         u64 data = msr_info->data;
> +       unsigned long *msr_bitmap;
>
>         switch (msr_index) {
>         case MSR_EFER:
> @@ -3330,6 +3371,30 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case MSR_IA32_TSC:
>                 kvm_write_tsc(vcpu, msr_info);
>                 break;
> +       case MSR_IA32_SPEC_CTRL:
> +               if (!msr_info->host_initiated &&
> +                   !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
> +                       return 1;
> +
> +               if (!msr_info->data)
> +                       break;

What if the MSR has already been set? Doesn't this throw away the new
zero value? (e.g. on vcpu reset)?
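
One illustrative way to avoid dropping a later zero write (a sketch only;
it assumes add_atomic_switch_msr() updates an entry in place when the MSR
is already on the autoload list, which it appears to do, and it omits the
msr_bitmap handling from the patch for brevity):

	case MSR_IA32_SPEC_CTRL:
		if (!msr_info->host_initiated &&
		    !guest_cpuid_has(vcpu, X86_FEATURE_IBRS))
			return 1;

		/* Nothing to do if the MSR was never armed and stays zero. */
		if (!msr_info->data &&
		    read_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, NULL, NULL))
			break;

		/* Otherwise (re)program the entry, including a zero value. */
		add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
		break;

The intercept would still only be disabled on the first non-zero write,
as in the patch.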

> +
> +               /*
> +                * Now we know that the guest is actually using the MSR, so
> +                * atomically load and save the SPEC_CTRL MSR and pass it
> +                * through to the guest.
> +                *
> +                * NOTE:
> +                * IBRS is not supported yet as a mitigation for the host. Once
> +                * it is supported, the "host_value" will need to be '1'
> +                * instead of '0' if IBRS is used also by the host.
> +                */
> +               add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL, msr_info->data, 0);
> +
> +               msr_bitmap = vmx->vmcs01.msr_bitmap;
> +               vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
> +
> +               break;
>         case MSR_IA32_CR_PAT:
>                 if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
>                         if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 03869eb..cabaad3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1006,6 +1006,7 @@ static u32 msrs_to_save[] = {
>  #endif
>         MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
>         MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
> +       MSR_IA32_SPEC_CTRL
>  };
>
>  static unsigned num_msrs_to_save;
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] x86/kvm: Add IBPB support
  2018-01-29  0:58 ` [PATCH v2 3/4] x86/kvm: Add IBPB support KarimAllah Ahmed
@ 2018-01-29 19:19   ` Jim Mattson
  0 siblings, 0 replies; 18+ messages in thread
From: Jim Mattson @ 2018-01-29 19:19 UTC (permalink / raw)
  To: KarimAllah Ahmed
  Cc: kvm list, LKML, the arch/x86 maintainers, Ashok Raj,
	Asit Mallick, Dave Hansen, Arjan Van De Ven, Tim Chen,
	Linus Torvalds, Andrea Arcangeli, Andi Kleen, Thomas Gleixner,
	Dan Williams, Jun Nakajima, Andy Lutomirski, Greg KH,
	Paolo Bonzini, Peter Zijlstra, David Woodhouse

On Sun, Jan 28, 2018 at 4:58 PM, KarimAllah Ahmed <karahmed@amazon.de> wrote:
> From: Ashok Raj <ashok.raj@intel.com>
>
> Add MSR passthrough for MSR_IA32_PRED_CMD and place branch predictor
> barriers on switching between VMs to avoid inter VM Spectre-v2 attacks.
>
> [peterz: rebase and changelog rewrite]
> [karahmed: - rebase
>            - vmx: expose PRED_CMD whenever it is available
>            - svm: only pass through IBPB if it is available]
> Cc: Asit Mallick <asit.k.mallick@intel.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  arch/x86/kvm/svm.c | 14 ++++++++++++++
>  arch/x86/kvm/vmx.c |  4 ++++
>  2 files changed, 18 insertions(+)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 2744b973..c886e46 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -529,6 +529,7 @@ struct svm_cpu_data {
>         struct kvm_ldttss_desc *tss_desc;
>
>         struct page *save_area;
> +       struct vmcb *current_vmcb;
>  };
>
>  static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
> @@ -918,6 +919,9 @@ static void svm_vcpu_init_msrpm(u32 *msrpm)
>
>                 set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
>         }
> +
> +       if (boot_cpu_has(X86_FEATURE_IBPB))
> +               set_msr_interception(msrpm, MSR_IA32_PRED_CMD, 1, 1);

What if the guest doesn't have X86_FEATURE_IBPB?
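
One way to tie this to the guest's CPUID rather than only the host's
(just a sketch; variable and flag names are approximate, and PRED_CMD_IBPB
is the define from the IBPB series) is to keep the MSR intercepted at
vCPU init and only open up writes from svm_set_msr() once a guest that
actually has the feature writes to it:

	case MSR_IA32_PRED_CMD:
		if (!msr_info->host_initiated &&
		    !guest_cpuid_has(vcpu, X86_FEATURE_IBPB))
			return 1;
		if (data & ~PRED_CMD_IBPB)
			return 1;
		if (!data)
			break;
		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
		/* Pass writes through from now on; keep reads intercepted. */
		set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1);
		break;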

>  }
>
>  static void add_msr_offset(u32 offset)
> @@ -1706,11 +1710,17 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
>         __free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
>         kvm_vcpu_uninit(vcpu);
>         kmem_cache_free(kvm_vcpu_cache, svm);
> +       /*
> +        * The vmcb page can be recycled, causing a false negative in
> +        * svm_vcpu_load(). So do a full IBPB now.
> +        */
> +       indirect_branch_prediction_barrier();
>  }
>
>  static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  {
>         struct vcpu_svm *svm = to_svm(vcpu);
> +       struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
>         int i;
>
>         if (unlikely(cpu != vcpu->cpu)) {
> @@ -1739,6 +1749,10 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>         if (static_cpu_has(X86_FEATURE_RDTSCP))
>                 wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
>
> +       if (sd->current_vmcb != svm->vmcb) {
> +               sd->current_vmcb = svm->vmcb;
> +               indirect_branch_prediction_barrier();
> +       }
>         avic_vcpu_load(vcpu, cpu);
>  }
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index dac564d..f82a44c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2296,6 +2296,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>         if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
>                 per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
>                 vmcs_load(vmx->loaded_vmcs->vmcs);
> +               indirect_branch_prediction_barrier();
>         }
>
>         if (!already_loaded) {
> @@ -9613,6 +9614,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
>                 goto free_msrs;
>
>         msr_bitmap = vmx->vmcs01.msr_bitmap;
> +
> +       if (boot_cpu_has(X86_FEATURE_IBPB))
> +               vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PRED_CMD, MSR_TYPE_RW);

What if the guest doesn't have X86_FEATURE_IBPB?
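
Same remark as on the SVM side; a sketch of the alternative would be to
leave the intercept in place here and only clear it from vmx_set_msr()
once a guest that has the CPUID bit writes to the MSR:

	case MSR_IA32_PRED_CMD:
		/* Same guest CPUID and PRED_CMD_IBPB checks as in the SVM sketch. */
		if (!msr_info->host_initiated &&
		    !guest_cpuid_has(vcpu, X86_FEATURE_IBPB))
			return 1;
		if (data & ~PRED_CMD_IBPB)
			return 1;
		if (!data)
			break;
		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
					      MSR_IA32_PRED_CMD, MSR_TYPE_W);
		break;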

>         vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
>         vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
>         vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-01-29 19:19 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-29  0:58 [PATCH v2 0/4] KVM: Expose speculation control feature to guests KarimAllah Ahmed
2018-01-29  0:58 ` KarimAllah Ahmed
2018-01-29  0:58 ` [PATCH v2 1/4] x86: kvm: Update the reverse_cpuid list to include CPUID_7_EDX KarimAllah Ahmed
2018-01-29 10:37   ` Paolo Bonzini
2018-01-29 18:58   ` Jim Mattson
2018-01-29  0:58 ` [PATCH v2 2/4] x86: vmx: Allow direct access to MSR_IA32_SPEC_CTRL KarimAllah Ahmed
2018-01-29  8:15   ` David Woodhouse
2018-01-29  8:15     ` David Woodhouse
2018-01-29 10:44   ` Paolo Bonzini
2018-01-29 12:12     ` KarimAllah Ahmed
2018-01-29 19:03   ` Jim Mattson
2018-01-29  0:58 ` [PATCH v2 3/4] x86/kvm: Add IBPB support KarimAllah Ahmed
2018-01-29 19:19   ` Jim Mattson
2018-01-29  0:58 ` [PATCH v2 4/4] x86: vmx: Allow direct access to MSR_IA32_ARCH_CAPABILITIES KarimAllah Ahmed
2018-01-29 10:45   ` Paolo Bonzini
2018-01-29 14:18     ` Van De Ven, Arjan
2018-01-29 18:55   ` Jim Mattson
2018-01-29 18:57     ` KarimAllah Ahmed
