linux-kernel.vger.kernel.org archive mirror
* [PATCH] KVM: x86: Extend Spectre-v1 mitigation
@ 2019-11-22 18:40 Marios Pomonis
  2019-11-22 19:45 ` Sean Christopherson
  2019-11-22 22:03 ` Liran Alon
  0 siblings, 2 replies; 4+ messages in thread
From: Marios Pomonis @ 2019-11-22 18:40 UTC (permalink / raw)
  To: Paolo Bonzini, rkrcmar, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, kvm, linux-kernel, Marios Pomonis, Nick Finco, Andrew Honig

From: Nick Finco <nifi@google.com>

This extends the Spectre-v1 mitigation introduced in
commit 75f139aaf896 ("KVM: x86: Add memory barrier on vmcs field lookup")
and commit 085331dfc6bb ("x86/kvm: Update spectre-v1 mitigation") in light
of the Spectre-v1/L1TF combination described here:
https://xenbits.xen.org/xsa/advisory-289.html

As reported in the link, an attacker can use the cache-load part of a
Spectre-v1 gadget to bring memory into the L1 cache, then use L1TF to
leak the loaded memory. Note that this attack is not fully mitigated by
core scheduling; an attacker could employ L1TF on the same thread that
loaded the memory in L1 instead of relying on neighboring hyperthreads.
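
For illustration only (the names below are hypothetical and do not appear
in the patch), the cache-load half of such a gadget has the familiar
bounds-check-then-load shape:

	if (index < ARRAY_SIZE(table))      /* branch may be predicted taken */
		val = table[index];         /* speculative load fills an L1D line */

Even when the architectural result is discarded after a misprediction, the
cache line touched by the speculative load remains in L1D, where L1TF can
later leak it.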

This patch uses array_index_nospec() to prevent index computations from
causing speculative loads into the L1 cache. These cases involve a
bounds check followed by a memory read using the index; this is more
common than the full Spectre-v1 pattern. In some cases, the index
computation can be eliminated entirely by small amounts of refactoring.
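
As a minimal sketch of the hardened form (again with hypothetical names),
the index is clamped before it can feed a speculative load:

	if (index < ARRAY_SIZE(table)) {
		index = array_index_nospec(index, ARRAY_SIZE(table));
		val = table[index];
	}

array_index_nospec() forces the index to 0 when the bounds check is
mispredicted, so an out-of-bounds line is never pulled into L1D.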

Signed-off-by: Nick Finco <nifi@google.com>
Signed-off-by: Marios Pomonis <pomonis@google.com>
Acked-by: Andrew Honig <ahonig@google.com>
---
 arch/x86/kvm/emulate.c       | 11 ++++++++---
 arch/x86/kvm/hyperv.c        | 10 ++++++----
 arch/x86/kvm/i8259.c         |  6 +++++-
 arch/x86/kvm/ioapic.c        | 15 +++++++++------
 arch/x86/kvm/lapic.c         | 13 +++++++++----
 arch/x86/kvm/mtrr.c          |  8 ++++++--
 arch/x86/kvm/pmu.h           | 18 ++++++++++++++----
 arch/x86/kvm/vmx/pmu_intel.c | 24 ++++++++++++++++--------
 arch/x86/kvm/vmx/vmx.c       | 22 ++++++++++++++++------
 arch/x86/kvm/x86.c           | 18 ++++++++++++++----
 10 files changed, 103 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 952d1a4f4d7e..fcf7cdb21d60 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5303,10 +5303,15 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
 			}
 			break;
 		case Escape:
-			if (ctxt->modrm > 0xbf)
-				opcode = opcode.u.esc->high[ctxt->modrm - 0xc0];
-			else
+			if (ctxt->modrm > 0xbf) {
+				size_t size = ARRAY_SIZE(opcode.u.esc->high);
+				u32 index = array_index_nospec(
+					ctxt->modrm - 0xc0, size);
+
+				opcode = opcode.u.esc->high[index];
+			} else {
 				opcode = opcode.u.esc->op[(ctxt->modrm >> 3) & 7];
+			}
 			break;
 		case InstrDual:
 			if ((ctxt->modrm >> 6) == 3)
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 23ff65504d7e..26408434b9bc 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -809,11 +809,12 @@ static int kvm_hv_msr_get_crash_data(struct kvm_vcpu *vcpu,
 				     u32 index, u64 *pdata)
 {
 	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+	size_t size = ARRAY_SIZE(hv->hv_crash_param);
 
-	if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
+	if (WARN_ON_ONCE(index >= size))
 		return -EINVAL;
 
-	*pdata = hv->hv_crash_param[index];
+	*pdata = hv->hv_crash_param[array_index_nospec(index, size)];
 	return 0;
 }
 
@@ -852,11 +853,12 @@ static int kvm_hv_msr_set_crash_data(struct kvm_vcpu *vcpu,
 				     u32 index, u64 data)
 {
 	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+	size_t size = ARRAY_SIZE(hv->hv_crash_param);
 
-	if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
+	if (WARN_ON_ONCE(index >= size))
 		return -EINVAL;
 
-	hv->hv_crash_param[index] = data;
+	hv->hv_crash_param[array_index_nospec(index, size)] = data;
 	return 0;
 }
 
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 8b38bb4868a6..629a09ca9860 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -460,10 +460,14 @@ static int picdev_write(struct kvm_pic *s,
 	switch (addr) {
 	case 0x20:
 	case 0x21:
+		pic_lock(s);
+		pic_ioport_write(&s->pics[0], addr, data);
+		pic_unlock(s);
+		break;
 	case 0xa0:
 	case 0xa1:
 		pic_lock(s);
-		pic_ioport_write(&s->pics[addr >> 7], addr, data);
+		pic_ioport_write(&s->pics[1], addr, data);
 		pic_unlock(s);
 		break;
 	case 0x4d0:
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 9fd2dd89a1c5..8aa58727045e 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -36,6 +36,7 @@
 #include <linux/io.h>
 #include <linux/slab.h>
 #include <linux/export.h>
+#include <linux/nospec.h>
 #include <asm/processor.h>
 #include <asm/page.h>
 #include <asm/current.h>
@@ -68,13 +69,14 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
 	default:
 		{
 			u32 redir_index = (ioapic->ioregsel - 0x10) >> 1;
-			u64 redir_content;
+			u64 redir_content = ~0ULL;
 
-			if (redir_index < IOAPIC_NUM_PINS)
-				redir_content =
-					ioapic->redirtbl[redir_index].bits;
-			else
-				redir_content = ~0ULL;
+			if (redir_index < IOAPIC_NUM_PINS) {
+				u32 index = array_index_nospec(
+					redir_index, IOAPIC_NUM_PINS);
+
+				redir_content = ioapic->redirtbl[index].bits;
+			}
 
 			result = (ioapic->ioregsel & 0x1) ?
 			    (redir_content >> 32) & 0xffffffff :
@@ -292,6 +294,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
 
 		if (index >= IOAPIC_NUM_PINS)
 			return;
+		index = array_index_nospec(index, IOAPIC_NUM_PINS);
 		e = &ioapic->redirtbl[index];
 		mask_before = e->fields.mask;
 		/* Preserve read-only fields */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index cf9177b4a07f..3323115f52d5 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1963,15 +1963,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 	case APIC_LVTTHMR:
 	case APIC_LVTPC:
 	case APIC_LVT1:
-	case APIC_LVTERR:
+	case APIC_LVTERR: {
 		/* TODO: Check vector */
+		size_t size;
+		u32 index;
+
 		if (!kvm_apic_sw_enabled(apic))
 			val |= APIC_LVT_MASKED;
-
-		val &= apic_lvt_mask[(reg - APIC_LVTT) >> 4];
+		size = ARRAY_SIZE(apic_lvt_mask);
+		index = array_index_nospec(
+				(reg - APIC_LVTT) >> 4, size);
+		val &= apic_lvt_mask[index];
 		kvm_lapic_set_reg(apic, reg, val);
-
 		break;
+	}
 
 	case APIC_LVTT:
 		if (!kvm_apic_sw_enabled(apic))
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 25ce3edd1872..7f0059aa30e1 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -192,11 +192,15 @@ static bool fixed_msr_to_seg_unit(u32 msr, int *seg, int *unit)
 		break;
 	case MSR_MTRRfix16K_80000 ... MSR_MTRRfix16K_A0000:
 		*seg = 1;
-		*unit = msr - MSR_MTRRfix16K_80000;
+		*unit = array_index_nospec(
+			msr - MSR_MTRRfix16K_80000,
+			MSR_MTRRfix16K_A0000 - MSR_MTRRfix16K_80000 + 1);
 		break;
 	case MSR_MTRRfix4K_C0000 ... MSR_MTRRfix4K_F8000:
 		*seg = 2;
-		*unit = msr - MSR_MTRRfix4K_C0000;
+		*unit = array_index_nospec(
+			msr - MSR_MTRRfix4K_C0000,
+			MSR_MTRRfix4K_F8000 - MSR_MTRRfix4K_C0000 + 1);
 		break;
 	default:
 		return false;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7ebb62326c14..13332984b6d5 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -2,6 +2,8 @@
 #ifndef __KVM_X86_PMU_H
 #define __KVM_X86_PMU_H
 
+#include <linux/nospec.h>
+
 #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
 #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
 #define pmc_to_pmu(pmc)   (&(pmc)->vcpu->arch.pmu)
@@ -102,8 +104,12 @@ static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
 static inline struct kvm_pmc *get_gp_pmc(struct kvm_pmu *pmu, u32 msr,
 					 u32 base)
 {
-	if (msr >= base && msr < base + pmu->nr_arch_gp_counters)
-		return &pmu->gp_counters[msr - base];
+	if (msr >= base && msr < base + pmu->nr_arch_gp_counters) {
+		u32 index = array_index_nospec(msr - base,
+					       pmu->nr_arch_gp_counters);
+
+		return &pmu->gp_counters[index];
+	}
 
 	return NULL;
 }
@@ -113,8 +119,12 @@ static inline struct kvm_pmc *get_fixed_pmc(struct kvm_pmu *pmu, u32 msr)
 {
 	int base = MSR_CORE_PERF_FIXED_CTR0;
 
-	if (msr >= base && msr < base + pmu->nr_arch_fixed_counters)
-		return &pmu->fixed_counters[msr - base];
+	if (msr >= base && msr < base + pmu->nr_arch_fixed_counters) {
+		u32 index = array_index_nospec(msr - base,
+					       pmu->nr_arch_fixed_counters);
+
+		return &pmu->fixed_counters[index];
+	}
 
 	return NULL;
 }
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 7023138b1cb0..34a3a17bb6d7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -86,10 +86,14 @@ static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
 
 static unsigned intel_find_fixed_event(int idx)
 {
-	if (idx >= ARRAY_SIZE(fixed_pmc_events))
+	u32 event;
+	size_t size = ARRAY_SIZE(fixed_pmc_events);
+
+	if (idx >= size)
 		return PERF_COUNT_HW_MAX;
 
-	return intel_arch_events[fixed_pmc_events[idx]].event_type;
+	event = fixed_pmc_events[array_index_nospec(idx, size)];
+	return intel_arch_events[event].event_type;
 }
 
 /* check if a PMC is enabled by comparing it with globl_ctrl bits. */
@@ -130,16 +134,20 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	bool fixed = idx & (1u << 30);
 	struct kvm_pmc *counters;
+	unsigned int num_counters;
 
 	idx &= ~(3u << 30);
-	if (!fixed && idx >= pmu->nr_arch_gp_counters)
-		return NULL;
-	if (fixed && idx >= pmu->nr_arch_fixed_counters)
+	if (fixed) {
+		counters = pmu->fixed_counters;
+		num_counters = pmu->nr_arch_fixed_counters;
+	} else {
+		counters = pmu->gp_counters;
+		num_counters = pmu->nr_arch_gp_counters;
+	}
+	if (idx >= num_counters)
 		return NULL;
-	counters = fixed ? pmu->fixed_counters : pmu->gp_counters;
 	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
-
-	return &counters[idx];
+	return &counters[array_index_nospec(idx, num_counters)];
 }
 
 static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d39475e2d44e..84def8e46d10 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -753,7 +753,9 @@ static bool vmx_segment_cache_test_set(struct vcpu_vmx *vmx, unsigned seg,
 
 static u16 vmx_read_guest_seg_selector(struct vcpu_vmx *vmx, unsigned seg)
 {
-	u16 *p = &vmx->segment_cache.seg[seg].selector;
+	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
+	size_t index = array_index_nospec(seg, size);
+	u16 *p = &vmx->segment_cache.seg[index].selector;
 
 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_SEL))
 		*p = vmcs_read16(kvm_vmx_segment_fields[seg].selector);
@@ -762,7 +764,9 @@ static u16 vmx_read_guest_seg_selector(struct vcpu_vmx *vmx, unsigned seg)
 
 static ulong vmx_read_guest_seg_base(struct vcpu_vmx *vmx, unsigned seg)
 {
-	ulong *p = &vmx->segment_cache.seg[seg].base;
+	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
+	size_t index = array_index_nospec(seg, size);
+	ulong *p = &vmx->segment_cache.seg[index].base;
 
 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_BASE))
 		*p = vmcs_readl(kvm_vmx_segment_fields[seg].base);
@@ -771,7 +775,9 @@ static ulong vmx_read_guest_seg_base(struct vcpu_vmx *vmx, unsigned seg)
 
 static u32 vmx_read_guest_seg_limit(struct vcpu_vmx *vmx, unsigned seg)
 {
-	u32 *p = &vmx->segment_cache.seg[seg].limit;
+	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
+	size_t index = array_index_nospec(seg, size);
+	u32 *p = &vmx->segment_cache.seg[index].limit;
 
 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_LIMIT))
 		*p = vmcs_read32(kvm_vmx_segment_fields[seg].limit);
@@ -780,7 +786,9 @@ static u32 vmx_read_guest_seg_limit(struct vcpu_vmx *vmx, unsigned seg)
 
 static u32 vmx_read_guest_seg_ar(struct vcpu_vmx *vmx, unsigned seg)
 {
-	u32 *p = &vmx->segment_cache.seg[seg].ar;
+	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
+	size_t index = array_index_nospec(seg, size);
+	u32 *p = &vmx->segment_cache.seg[index].ar;
 
 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_AR))
 		*p = vmcs_read32(kvm_vmx_segment_fields[seg].ar_bytes);
@@ -5828,6 +5836,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	u32 exit_reason = vmx->exit_reason;
+	u32 bounded_exit_reason = array_index_nospec(exit_reason,
+						kvm_vmx_max_exit_handlers);
 	u32 vectoring_info = vmx->idt_vectoring_info;
 
 	trace_kvm_exit(exit_reason, vcpu, KVM_ISA_VMX);
@@ -5911,7 +5921,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 	}
 
 	if (exit_reason < kvm_vmx_max_exit_handlers
-	    && kvm_vmx_exit_handlers[exit_reason]) {
+	    && kvm_vmx_exit_handlers[bounded_exit_reason]) {
 #ifdef CONFIG_RETPOLINE
 		if (exit_reason == EXIT_REASON_MSR_WRITE)
 			return kvm_emulate_wrmsr(vcpu);
@@ -5926,7 +5936,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
 			return handle_ept_misconfig(vcpu);
 #endif
-		return kvm_vmx_exit_handlers[exit_reason](vcpu);
+		return kvm_vmx_exit_handlers[bounded_exit_reason](vcpu);
 	} else {
 		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
 				exit_reason);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a256e09f321a..9a2789652231 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1057,9 +1057,11 @@ static u64 kvm_dr6_fixed(struct kvm_vcpu *vcpu)
 
 static int __kvm_set_dr(struct kvm_vcpu *vcpu, int dr, unsigned long val)
 {
+	size_t size = ARRAY_SIZE(vcpu->arch.db);
+
 	switch (dr) {
 	case 0 ... 3:
-		vcpu->arch.db[dr] = val;
+		vcpu->arch.db[array_index_nospec(dr, size)] = val;
 		if (!(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP))
 			vcpu->arch.eff_db[dr] = val;
 		break;
@@ -1096,9 +1098,11 @@ EXPORT_SYMBOL_GPL(kvm_set_dr);
 
 int kvm_get_dr(struct kvm_vcpu *vcpu, int dr, unsigned long *val)
 {
+	size_t size = ARRAY_SIZE(vcpu->arch.db);
+
 	switch (dr) {
 	case 0 ... 3:
-		*val = vcpu->arch.db[dr];
+		*val = vcpu->arch.db[array_index_nospec(dr, size)];
 		break;
 	case 4:
 		/* fall through */
@@ -2496,7 +2500,10 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	default:
 		if (msr >= MSR_IA32_MC0_CTL &&
 		    msr < MSR_IA32_MCx_CTL(bank_num)) {
-			u32 offset = msr - MSR_IA32_MC0_CTL;
+			u32 offset = array_index_nospec(
+				msr - MSR_IA32_MC0_CTL,
+				MSR_IA32_MCx_CTL(bank_num) - MSR_IA32_MC0_CTL);
+
 			/* only 0 or all 1s can be written to IA32_MCi_CTL
 			 * some Linux kernels though clear bit 10 in bank 4 to
 			 * workaround a BIOS/GART TBL issue on AMD K8s, ignore
@@ -2937,7 +2944,10 @@ static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
 	default:
 		if (msr >= MSR_IA32_MC0_CTL &&
 		    msr < MSR_IA32_MCx_CTL(bank_num)) {
-			u32 offset = msr - MSR_IA32_MC0_CTL;
+			u32 offset = array_index_nospec(
+				msr - MSR_IA32_MC0_CTL,
+				MSR_IA32_MCx_CTL(bank_num) - MSR_IA32_MC0_CTL);
+
 			data = vcpu->arch.mce_banks[offset];
 			break;
 		}
-- 
2.24.0.432.g9d3f5f5b63-goog



* Re: [PATCH] KVM: x86: Extend Spectre-v1 mitigation
  2019-11-22 18:40 [PATCH] KVM: x86: Extend Spectre-v1 mitigation Marios Pomonis
@ 2019-11-22 19:45 ` Sean Christopherson
  2019-11-22 22:03 ` Liran Alon
  1 sibling, 0 replies; 4+ messages in thread
From: Sean Christopherson @ 2019-11-22 19:45 UTC (permalink / raw)
  To: Marios Pomonis
  Cc: Paolo Bonzini, rkrcmar, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, kvm, linux-kernel,
	Nick Finco, Andrew Honig

On Fri, Nov 22, 2019 at 10:40:39AM -0800, Marios Pomonis wrote:
> From: Nick Finco <nifi@google.com>
> 
> This extends the Spectre-v1 mitigation introduced in
> commit 75f139aaf896 ("KVM: x86: Add memory barrier on vmcs field lookup")
> and commit 085331dfc6bb ("x86/kvm: Update spectre-v1 mitigation") in light
> of the Spectre-v1/L1TF combination described here:
> https://xenbits.xen.org/xsa/advisory-289.html
> 
> As reported in the link, an attacker can use the cache-load part of a
> Spectre-v1 gadget to bring memory into the L1 cache, then use L1TF to
> leak the loaded memory. Note that this attack is not fully mitigated by
> core scheduling; an attacker could employ L1TF on the same thread that
> loaded the memory in L1 instead of relying on neighboring hyperthreads.
> 
> This patch uses array_index_nospec() to prevent index computations from
> causing speculative loads into the L1 cache. These cases involve a
> bounds check followed by a memory read using the index; this is more
> common than the full Spectre-v1 pattern. In some cases, the index
> computation can be eliminated entirely by small amounts of refactoring.
> 
> Signed-off-by: Nick Finco <nifi@google.com>
> Signed-off-by: Marios Pomonis <pomonis@google.com>

+cc stable?

> Acked-by: Andrew Honig <ahonig@google.com>
> ---
>  arch/x86/kvm/emulate.c       | 11 ++++++++---
>  arch/x86/kvm/hyperv.c        | 10 ++++++----
>  arch/x86/kvm/i8259.c         |  6 +++++-
>  arch/x86/kvm/ioapic.c        | 15 +++++++++------
>  arch/x86/kvm/lapic.c         | 13 +++++++++----
>  arch/x86/kvm/mtrr.c          |  8 ++++++--
>  arch/x86/kvm/pmu.h           | 18 ++++++++++++++----
>  arch/x86/kvm/vmx/pmu_intel.c | 24 ++++++++++++++++--------
>  arch/x86/kvm/vmx/vmx.c       | 22 ++++++++++++++++------
>  arch/x86/kvm/x86.c           | 18 ++++++++++++++----
>  10 files changed, 103 insertions(+), 42 deletions(-)

Can you split this up into multiple patches?  I assume this needs to be
backported to stable, and throwing everything into a single patch will
make the process unnecessarily painful.  Reviewing things would also be
a lot easier.

...

> @@ -5828,6 +5836,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	u32 exit_reason = vmx->exit_reason;
> +	u32 bounded_exit_reason = array_index_nospec(exit_reason,
> +						kvm_vmx_max_exit_handlers);
>  	u32 vectoring_info = vmx->idt_vectoring_info;
>  
>  	trace_kvm_exit(exit_reason, vcpu, KVM_ISA_VMX);
> @@ -5911,7 +5921,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>  	}
>  
>  	if (exit_reason < kvm_vmx_max_exit_handlers
> -	    && kvm_vmx_exit_handlers[exit_reason]) {
> +	    && kvm_vmx_exit_handlers[bounded_exit_reason]) {
>  #ifdef CONFIG_RETPOLINE
>  		if (exit_reason == EXIT_REASON_MSR_WRITE)
>  			return kvm_emulate_wrmsr(vcpu);
> @@ -5926,7 +5936,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
>  		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
>  			return handle_ept_misconfig(vcpu);
>  #endif
> -		return kvm_vmx_exit_handlers[exit_reason](vcpu);
> +		return kvm_vmx_exit_handlers[bounded_exit_reason](vcpu);
>  	} else {
>  		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
>  				exit_reason);

Oof, using exit_reason for the comparison is subtle.  Rather than
precompute the bounded exit reason, what about refactoring the code so
that the array_index_nospec() tomfoolery can be done after the actual
bounds check?  This would also avoid the nospec stuff when using the
direct retpoline handling.

---
 arch/x86/kvm/vmx/vmx.c | 56 ++++++++++++++++++++++--------------------
 1 file changed, 30 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d39475e2d44e..14c2efd66300 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5910,34 +5910,38 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	if (exit_reason < kvm_vmx_max_exit_handlers
-	    && kvm_vmx_exit_handlers[exit_reason]) {
+	if (exit_reason >= kvm_vmx_max_exit_handlers)
+		goto unexpected_vmexit;
+
 #ifdef CONFIG_RETPOLINE
-		if (exit_reason == EXIT_REASON_MSR_WRITE)
-			return kvm_emulate_wrmsr(vcpu);
-		else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
-			return handle_preemption_timer(vcpu);
-		else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT)
-			return handle_interrupt_window(vcpu);
-		else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
-			return handle_external_interrupt(vcpu);
-		else if (exit_reason == EXIT_REASON_HLT)
-			return kvm_emulate_halt(vcpu);
-		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
-			return handle_ept_misconfig(vcpu);
+	if (exit_reason == EXIT_REASON_MSR_WRITE)
+		return kvm_emulate_wrmsr(vcpu);
+	else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
+		return handle_preemption_timer(vcpu);
+	else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT)
+		return handle_interrupt_window(vcpu);
+	else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+		return handle_external_interrupt(vcpu);
+	else if (exit_reason == EXIT_REASON_HLT)
+		return kvm_emulate_halt(vcpu);
+	else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
+		return handle_ept_misconfig(vcpu);
 #endif
-		return kvm_vmx_exit_handlers[exit_reason](vcpu);
-	} else {
-		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
-				exit_reason);
-		dump_vmcs();
-		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
-		vcpu->run->internal.suberror =
-			KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
-		vcpu->run->internal.ndata = 1;
-		vcpu->run->internal.data[0] = exit_reason;
-		return 0;
-	}
+
+	exit_reason = array_index_nospec(exit_reason, kvm_vmx_max_exit_handlers);
+	if (!kvm_vmx_exit_handlers[exit_reason])
+		goto unexpected_vmexit;
+
+	return kvm_vmx_exit_handlers[exit_reason](vcpu);
+
+unexpected_vmexit:
+	vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n", exit_reason);
+	dump_vmcs();
+	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+	vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
+	vcpu->run->internal.ndata = 1;
+	vcpu->run->internal.data[0] = exit_reason;
+	return 0;
 }
 
 /*
-- 
2.24.0



* Re: [PATCH] KVM: x86: Extend Spectre-v1 mitigation
  2019-11-22 18:40 [PATCH] KVM: x86: Extend Spectre-v1 mitigation Marios Pomonis
  2019-11-22 19:45 ` Sean Christopherson
@ 2019-11-22 22:03 ` Liran Alon
  2019-11-22 22:19   ` Sean Christopherson
  1 sibling, 1 reply; 4+ messages in thread
From: Liran Alon @ 2019-11-22 22:03 UTC (permalink / raw)
  To: Marios Pomonis
  Cc: Paolo Bonzini, rkrcmar, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, kvm,
	linux-kernel, Nick Finco, Andrew Honig



> On 22 Nov 2019, at 20:40, Marios Pomonis <pomonis@google.com> wrote:
> 
> From: Nick Finco <nifi@google.com>
> 
> This extends the Spectre-v1 mitigation introduced in
> commit 75f139aaf896 ("KVM: x86: Add memory barrier on vmcs field lookup")
> and commit 085331dfc6bb ("x86/kvm: Update spectre-v1 mitigation") in light
> of the Spectre-v1/L1TF combination described here:
> https://xenbits.xen.org/xsa/advisory-289.html
> 
> As reported in the link, an attacker can use the cache-load part of a
> Spectre-v1 gadget to bring memory into the L1 cache, then use L1TF to
> leak the loaded memory. Note that this attack is not fully mitigated by
> core scheduling; an attacker could employ L1TF on the same thread that
> loaded the memory in L1 instead of relying on neighboring hyperthreads.

This description is not accurate. The attack still relies on SMT.

If KVM flushes L1D$ on every VMEntry to the guest, then the only way to exploit this is via a sibling hyperthread.
It’s true that core-scheduling does not mitigate this scenario completely, but not for the reason described above.

The attack vector that still works when hyperthreading and core-scheduling are enabled is if, on a single core,
one hyperthread runs a VMExit handler which executes a cache-load gadget (even speculatively) while
the sibling hyperthread runs inside the guest and executes an L1TF code sequence to leak from the core’s L1D$.

You can find more information on this specific attack vector in the following KVM Forum 2019 slides:
https://static.sched.com/hosted_files/kvmforum2019/34/KVM%20Forum%202019%20KVM%20ASI.pdf

> 
> This patch uses array_index_nospec() to prevent index computations from
> causing speculative loads into the L1 cache. These cases involve a
> bounds check followed by a memory read using the index; this is more
> common than the full Spectre-v1 pattern. In some cases, the index
> computation can be eliminated entirely by small amounts of refactoring.
> 
> Signed-off-by: Nick Finco <nifi@google.com>
> Signed-off-by: Marios Pomonis <pomonis@google.com>
> Acked-by: Andrew Honig <ahonig@google.com>

I agree with Sean that this patch should be split into a patch-series with multiple patches.
It will be easier to review and more importantly, bisect in case one of the changes here cause a regression.

In addition, I think that every such patch should have a good explanation in its commit message of why it is required,
i.e. a proof that the index used as part of the cache-load gadget is attacker-controllable.
I had a brief look over the patch below and it seems that there are some changes to code paths where the attacker doesn’t control the index.
(I commented inline on those changes)
Therefore, I think these changes shouldn’t be necessary.

-Liran

> ---
> arch/x86/kvm/emulate.c       | 11 ++++++++---
> arch/x86/kvm/hyperv.c        | 10 ++++++----
> arch/x86/kvm/i8259.c         |  6 +++++-
> arch/x86/kvm/ioapic.c        | 15 +++++++++------
> arch/x86/kvm/lapic.c         | 13 +++++++++----
> arch/x86/kvm/mtrr.c          |  8 ++++++--
> arch/x86/kvm/pmu.h           | 18 ++++++++++++++----
> arch/x86/kvm/vmx/pmu_intel.c | 24 ++++++++++++++++--------
> arch/x86/kvm/vmx/vmx.c       | 22 ++++++++++++++++------
> arch/x86/kvm/x86.c           | 18 ++++++++++++++----
> 10 files changed, 103 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 952d1a4f4d7e..fcf7cdb21d60 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -5303,10 +5303,15 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
> 			}
> 			break;
> 		case Escape:
> -			if (ctxt->modrm > 0xbf)
> -				opcode = opcode.u.esc->high[ctxt->modrm - 0xc0];
> -			else
> +			if (ctxt->modrm > 0xbf) {
> +				size_t size = ARRAY_SIZE(opcode.u.esc->high);
> +				u32 index = array_index_nospec(
> +					ctxt->modrm - 0xc0, size);
> +
> +				opcode = opcode.u.esc->high[index];
> +			} else {
> 				opcode = opcode.u.esc->op[(ctxt->modrm >> 3) & 7];
> +			}
> 			break;
> 		case InstrDual:
> 			if ((ctxt->modrm >> 6) == 3)
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 23ff65504d7e..26408434b9bc 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -809,11 +809,12 @@ static int kvm_hv_msr_get_crash_data(struct kvm_vcpu *vcpu,
> 				     u32 index, u64 *pdata)
> {
> 	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
> +	size_t size = ARRAY_SIZE(hv->hv_crash_param);
> 
> -	if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
> +	if (WARN_ON_ONCE(index >= size))
> 		return -EINVAL;
> 
> -	*pdata = hv->hv_crash_param[index];
> +	*pdata = hv->hv_crash_param[array_index_nospec(index, size)];
> 	return 0;
> }
> 
> @@ -852,11 +853,12 @@ static int kvm_hv_msr_set_crash_data(struct kvm_vcpu *vcpu,
> 				     u32 index, u64 data)
> {
> 	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
> +	size_t size = ARRAY_SIZE(hv->hv_crash_param);
> 
> -	if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
> +	if (WARN_ON_ONCE(index >= size))
> 		return -EINVAL;
> 
> -	hv->hv_crash_param[index] = data;
> +	hv->hv_crash_param[array_index_nospec(index, size)] = data;
> 	return 0;
> }
> 
> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> index 8b38bb4868a6..629a09ca9860 100644
> --- a/arch/x86/kvm/i8259.c
> +++ b/arch/x86/kvm/i8259.c
> @@ -460,10 +460,14 @@ static int picdev_write(struct kvm_pic *s,
> 	switch (addr) {
> 	case 0x20:
> 	case 0x21:
> +		pic_lock(s);
> +		pic_ioport_write(&s->pics[0], addr, data);
> +		pic_unlock(s);
> +		break;
> 	case 0xa0:
> 	case 0xa1:
> 		pic_lock(s);
> -		pic_ioport_write(&s->pics[addr >> 7], addr, data);
> +		pic_ioport_write(&s->pics[1], addr, data);
> 		pic_unlock(s);
> 		break;
> 	case 0x4d0:
> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
> index 9fd2dd89a1c5..8aa58727045e 100644
> --- a/arch/x86/kvm/ioapic.c
> +++ b/arch/x86/kvm/ioapic.c
> @@ -36,6 +36,7 @@
> #include <linux/io.h>
> #include <linux/slab.h>
> #include <linux/export.h>
> +#include <linux/nospec.h>
> #include <asm/processor.h>
> #include <asm/page.h>
> #include <asm/current.h>
> @@ -68,13 +69,14 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
> 	default:
> 		{
> 			u32 redir_index = (ioapic->ioregsel - 0x10) >> 1;
> -			u64 redir_content;
> +			u64 redir_content = ~0ULL;
> 
> -			if (redir_index < IOAPIC_NUM_PINS)
> -				redir_content =
> -					ioapic->redirtbl[redir_index].bits;
> -			else
> -				redir_content = ~0ULL;
> +			if (redir_index < IOAPIC_NUM_PINS) {
> +				u32 index = array_index_nospec(
> +					redir_index, IOAPIC_NUM_PINS);
> +
> +				redir_content = ioapic->redirtbl[index].bits;
> +			}
> 
> 			result = (ioapic->ioregsel & 0x1) ?
> 			    (redir_content >> 32) & 0xffffffff :
> @@ -292,6 +294,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
> 
> 		if (index >= IOAPIC_NUM_PINS)
> 			return;
> +		index = array_index_nospec(index, IOAPIC_NUM_PINS);
> 		e = &ioapic->redirtbl[index];
> 		mask_before = e->fields.mask;
> 		/* Preserve read-only fields */
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index cf9177b4a07f..3323115f52d5 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1963,15 +1963,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
> 	case APIC_LVTTHMR:
> 	case APIC_LVTPC:
> 	case APIC_LVT1:
> -	case APIC_LVTERR:
> +	case APIC_LVTERR: {
> 		/* TODO: Check vector */
> +		size_t size;
> +		u32 index;
> +
> 		if (!kvm_apic_sw_enabled(apic))
> 			val |= APIC_LVT_MASKED;
> -
> -		val &= apic_lvt_mask[(reg - APIC_LVTT) >> 4];
> +		size = ARRAY_SIZE(apic_lvt_mask);
> +		index = array_index_nospec(
> +				(reg - APIC_LVTT) >> 4, size);
> +		val &= apic_lvt_mask[index];
> 		kvm_lapic_set_reg(apic, reg, val);
> -
> 		break;
> +	}
> 
> 	case APIC_LVTT:
> 		if (!kvm_apic_sw_enabled(apic))
> diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> index 25ce3edd1872..7f0059aa30e1 100644
> --- a/arch/x86/kvm/mtrr.c
> +++ b/arch/x86/kvm/mtrr.c
> @@ -192,11 +192,15 @@ static bool fixed_msr_to_seg_unit(u32 msr, int *seg, int *unit)
> 		break;
> 	case MSR_MTRRfix16K_80000 ... MSR_MTRRfix16K_A0000:
> 		*seg = 1;
> -		*unit = msr - MSR_MTRRfix16K_80000;
> +		*unit = array_index_nospec(
> +			msr - MSR_MTRRfix16K_80000,
> +			MSR_MTRRfix16K_A0000 - MSR_MTRRfix16K_80000 + 1);
> 		break;
> 	case MSR_MTRRfix4K_C0000 ... MSR_MTRRfix4K_F8000:
> 		*seg = 2;
> -		*unit = msr - MSR_MTRRfix4K_C0000;
> +		*unit = array_index_nospec(
> +			msr - MSR_MTRRfix4K_C0000,
> +			MSR_MTRRfix4K_F8000 - MSR_MTRRfix4K_C0000 + 1);
> 		break;
> 	default:
> 		return false;
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 7ebb62326c14..13332984b6d5 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -2,6 +2,8 @@
> #ifndef __KVM_X86_PMU_H
> #define __KVM_X86_PMU_H
> 
> +#include <linux/nospec.h>
> +
> #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
> #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
> #define pmc_to_pmu(pmc)   (&(pmc)->vcpu->arch.pmu)
> @@ -102,8 +104,12 @@ static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
> static inline struct kvm_pmc *get_gp_pmc(struct kvm_pmu *pmu, u32 msr,
> 					 u32 base)
> {
> -	if (msr >= base && msr < base + pmu->nr_arch_gp_counters)
> -		return &pmu->gp_counters[msr - base];
> +	if (msr >= base && msr < base + pmu->nr_arch_gp_counters) {
> +		u32 index = array_index_nospec(msr - base,
> +					       pmu->nr_arch_gp_counters);
> +
> +		return &pmu->gp_counters[index];
> +	}
> 
> 	return NULL;
> }
> @@ -113,8 +119,12 @@ static inline struct kvm_pmc *get_fixed_pmc(struct kvm_pmu *pmu, u32 msr)
> {
> 	int base = MSR_CORE_PERF_FIXED_CTR0;
> 
> -	if (msr >= base && msr < base + pmu->nr_arch_fixed_counters)
> -		return &pmu->fixed_counters[msr - base];
> +	if (msr >= base && msr < base + pmu->nr_arch_fixed_counters) {
> +		u32 index = array_index_nospec(msr - base,
> +					       pmu->nr_arch_fixed_counters);
> +
> +		return &pmu->fixed_counters[index];
> +	}
> 
> 	return NULL;
> }
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 7023138b1cb0..34a3a17bb6d7 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -86,10 +86,14 @@ static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
> 
> static unsigned intel_find_fixed_event(int idx)
> {
> -	if (idx >= ARRAY_SIZE(fixed_pmc_events))
> +	u32 event;
> +	size_t size = ARRAY_SIZE(fixed_pmc_events);
> +
> +	if (idx >= size)
> 		return PERF_COUNT_HW_MAX;
> 
> -	return intel_arch_events[fixed_pmc_events[idx]].event_type;
> +	event = fixed_pmc_events[array_index_nospec(idx, size)];
> +	return intel_arch_events[event].event_type;
> }
> 
> /* check if a PMC is enabled by comparing it with globl_ctrl bits. */
> @@ -130,16 +134,20 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
> 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> 	bool fixed = idx & (1u << 30);
> 	struct kvm_pmc *counters;
> +	unsigned int num_counters;
> 
> 	idx &= ~(3u << 30);
> -	if (!fixed && idx >= pmu->nr_arch_gp_counters)
> -		return NULL;
> -	if (fixed && idx >= pmu->nr_arch_fixed_counters)
> +	if (fixed) {
> +		counters = pmu->fixed_counters;
> +		num_counters = pmu->nr_arch_fixed_counters;
> +	} else {
> +		counters = pmu->gp_counters;
> +		num_counters = pmu->nr_arch_gp_counters;
> +	}
> +	if (idx >= num_counters)
> 		return NULL;
> -	counters = fixed ? pmu->fixed_counters : pmu->gp_counters;
> 	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
> -
> -	return &counters[idx];
> +	return &counters[array_index_nospec(idx, num_counters)];
> }
> 
> static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d39475e2d44e..84def8e46d10 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -753,7 +753,9 @@ static bool vmx_segment_cache_test_set(struct vcpu_vmx *vmx, unsigned seg,
> 
> static u16 vmx_read_guest_seg_selector(struct vcpu_vmx *vmx, unsigned seg)
> {
> -	u16 *p = &vmx->segment_cache.seg[seg].selector;
> +	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
> +	size_t index = array_index_nospec(seg, size);
> +	u16 *p = &vmx->segment_cache.seg[index].selector;
> 
> 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_SEL))
> 		*p = vmcs_read16(kvm_vmx_segment_fields[seg].selector);
> @@ -762,7 +764,9 @@ static u16 vmx_read_guest_seg_selector(struct vcpu_vmx *vmx, unsigned seg)
> 
> static ulong vmx_read_guest_seg_base(struct vcpu_vmx *vmx, unsigned seg)
> {
> -	ulong *p = &vmx->segment_cache.seg[seg].base;
> +	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
> +	size_t index = array_index_nospec(seg, size);
> +	ulong *p = &vmx->segment_cache.seg[index].base;
> 
> 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_BASE))
> 		*p = vmcs_readl(kvm_vmx_segment_fields[seg].base);
> @@ -771,7 +775,9 @@ static ulong vmx_read_guest_seg_base(struct vcpu_vmx *vmx, unsigned seg)
> 
> static u32 vmx_read_guest_seg_limit(struct vcpu_vmx *vmx, unsigned seg)
> {
> -	u32 *p = &vmx->segment_cache.seg[seg].limit;
> +	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
> +	size_t index = array_index_nospec(seg, size);
> +	u32 *p = &vmx->segment_cache.seg[index].limit;
> 
> 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_LIMIT))
> 		*p = vmcs_read32(kvm_vmx_segment_fields[seg].limit);
> @@ -780,7 +786,9 @@ static u32 vmx_read_guest_seg_limit(struct vcpu_vmx *vmx, unsigned seg)
> 
> static u32 vmx_read_guest_seg_ar(struct vcpu_vmx *vmx, unsigned seg)
> {
> -	u32 *p = &vmx->segment_cache.seg[seg].ar;
> +	size_t size = ARRAY_SIZE(vmx->segment_cache.seg);
> +	size_t index = array_index_nospec(seg, size);
> +	u32 *p = &vmx->segment_cache.seg[index].ar;
> 
> 	if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_AR))
> 		*p = vmcs_read32(kvm_vmx_segment_fields[seg].ar_bytes);
> @@ -5828,6 +5836,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
> {
> 	struct vcpu_vmx *vmx = to_vmx(vcpu);
> 	u32 exit_reason = vmx->exit_reason;
> +	u32 bounded_exit_reason = array_index_nospec(exit_reason,
> +						kvm_vmx_max_exit_handlers);

Unlike the rest of this patch’s changes, exit_reason is not attacker-controllable.
Therefore, I don’t think we need this change to vmx_handle_exit().

> 	u32 vectoring_info = vmx->idt_vectoring_info;
> 
> 	trace_kvm_exit(exit_reason, vcpu, KVM_ISA_VMX);
> @@ -5911,7 +5921,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
> 	}
> 
> 	if (exit_reason < kvm_vmx_max_exit_handlers
> -	    && kvm_vmx_exit_handlers[exit_reason]) {
> +	    && kvm_vmx_exit_handlers[bounded_exit_reason]) {
> #ifdef CONFIG_RETPOLINE
> 		if (exit_reason == EXIT_REASON_MSR_WRITE)
> 			return kvm_emulate_wrmsr(vcpu);
> @@ -5926,7 +5936,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
> 		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
> 			return handle_ept_misconfig(vcpu);
> #endif
> -		return kvm_vmx_exit_handlers[exit_reason](vcpu);
> +		return kvm_vmx_exit_handlers[bounded_exit_reason](vcpu);
> 	} else {
> 		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
> 				exit_reason);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a256e09f321a..9a2789652231 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1057,9 +1057,11 @@ static u64 kvm_dr6_fixed(struct kvm_vcpu *vcpu)
> 
> static int __kvm_set_dr(struct kvm_vcpu *vcpu, int dr, unsigned long val)
> {
> +	size_t size = ARRAY_SIZE(vcpu->arch.db);
> +

Why is this change needed?
“dr” shouldn’t be attacker-controllable.
Its value is read from vmcs->exit_qualification or vmcb->control.exit_code on a DR-access intercept.
Relevant also for kvm_get_dr().

> 	switch (dr) {
> 	case 0 ... 3:
> -		vcpu->arch.db[dr] = val;
> +		vcpu->arch.db[array_index_nospec(dr, size)] = val;
> 		if (!(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP))
> 			vcpu->arch.eff_db[dr] = val;
> 		break;
> @@ -1096,9 +1098,11 @@ EXPORT_SYMBOL_GPL(kvm_set_dr);
> 
> int kvm_get_dr(struct kvm_vcpu *vcpu, int dr, unsigned long *val)
> {
> +	size_t size = ARRAY_SIZE(vcpu->arch.db);
> +
> 	switch (dr) {
> 	case 0 ... 3:
> -		*val = vcpu->arch.db[dr];
> +		*val = vcpu->arch.db[array_index_nospec(dr, size)];
> 		break;
> 	case 4:
> 		/* fall through */
> @@ -2496,7 +2500,10 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> 	default:
> 		if (msr >= MSR_IA32_MC0_CTL &&
> 		    msr < MSR_IA32_MCx_CTL(bank_num)) {
> -			u32 offset = msr - MSR_IA32_MC0_CTL;
> +			u32 offset = array_index_nospec(
> +				msr - MSR_IA32_MC0_CTL,
> +				MSR_IA32_MCx_CTL(bank_num) - MSR_IA32_MC0_CTL);
> +
> 			/* only 0 or all 1s can be written to IA32_MCi_CTL
> 			 * some Linux kernels though clear bit 10 in bank 4 to
> 			 * workaround a BIOS/GART TBL issue on AMD K8s, ignore
> @@ -2937,7 +2944,10 @@ static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
> 	default:
> 		if (msr >= MSR_IA32_MC0_CTL &&
> 		    msr < MSR_IA32_MCx_CTL(bank_num)) {
> -			u32 offset = msr - MSR_IA32_MC0_CTL;
> +			u32 offset = array_index_nospec(
> +				msr - MSR_IA32_MC0_CTL,
> +				MSR_IA32_MCx_CTL(bank_num) - MSR_IA32_MC0_CTL);
> +
> 			data = vcpu->arch.mce_banks[offset];
> 			break;
> 		}
> -- 
> 2.24.0.432.g9d3f5f5b63-goog
> 



* Re: [PATCH] KVM: x86: Extend Spectre-v1 mitigation
  2019-11-22 22:03 ` Liran Alon
@ 2019-11-22 22:19   ` Sean Christopherson
  0 siblings, 0 replies; 4+ messages in thread
From: Sean Christopherson @ 2019-11-22 22:19 UTC (permalink / raw)
  To: Liran Alon
  Cc: Marios Pomonis, Paolo Bonzini, rkrcmar, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, kvm,
	linux-kernel, Nick Finco, Andrew Honig

On Sat, Nov 23, 2019 at 12:03:27AM +0200, Liran Alon wrote:
> 
> > On 22 Nov 2019, at 20:40, Marios Pomonis <pomonis@google.com> wrote:
> > @@ -5828,6 +5836,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu)
> > {
> > 	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > 	u32 exit_reason = vmx->exit_reason;
> > +	u32 bounded_exit_reason = array_index_nospec(exit_reason,
> > +						kvm_vmx_max_exit_handlers);
> 
> Unlike the rest of this patch’s changes, exit_reason is not attacker-controllable.
> Therefore, I don’t think we need this change to vmx_handle_exit().

I waffled on this one too.  Theoretically, if an attacker finds a way to
trigger a VM-Exit that isn't yet known to KVM, and coordinates across
userspace and guest to keep rerunning the attack in the guest instead of
killing the VM (on the unexpected VM-Exit), then exit_reason is sort of
under attacker control.

Of course the above scenario would require a bug in KVM, e.g. enabling an
unknown enabling/exiting control, or a bug in a CPU, e.g. generating a new
VM-Exit without software opt-in or generating a completely bogus VM-Exit.
The whole thing is pretty far-fetched...


