KVM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths
@ 2019-08-13 13:53 Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Changes since v3 [Sean Christopherson]:
- add Reviewed-by tag to PATCH5
- __skip_emulated_instruction()/skip_emulated_instruction() split,
  'unlikely(r != EMULATE_DONE)' in PATCH2
- Make nested_svm_vmrun() return an int in PATCH6 (moved from PATCH7)
- Avoid weird-looking 'if (rc) return ret' in PATCH7

Original description:

Jim rightfully complains that hardcoding instuctions lengths is not always
correct: additional (redundant) prefixes can be used. Luckily, the ugliness
is mostly harmless: modern AMD CPUs support NRIP_SAVE feature but I'd like
to clean things up and sacrifice speed in favor of correctness.

Vitaly Kuznetsov (7):
  x86: KVM: svm: don't pretend to advance RIP in case
    wrmsr_interception() results in #GP
  x86: kvm: svm: propagate errors from skip_emulated_instruction()
  x86: KVM: clear interrupt shadow on EMULTYPE_SKIP
  x86: KVM: add xsetbv to the emulator
  x86: KVM: svm: remove hardcoded instruction length from intercepts
  x86: KVM: svm: eliminate weird goto from vmrun_interception()
  x86: KVM: svm: eliminate hardcoded RIP advancement from
    vmrun_interception()

 arch/x86/include/asm/kvm_emulate.h |   3 +-
 arch/x86/include/asm/kvm_host.h    |   2 +-
 arch/x86/kvm/emulate.c             |  23 ++++++-
 arch/x86/kvm/svm.c                 | 100 +++++++++++++----------------
 arch/x86/kvm/vmx/vmx.c             |  16 ++++-
 arch/x86/kvm/x86.c                 |  13 +++-
 6 files changed, 92 insertions(+), 65 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

svm->next_rip is only used by skip_emulated_instruction() and in case
kvm_set_msr() fails we rightfully don't do that. Move svm->next_rip
advancement to 'else' branch to avoid creating false impression that
it's always advanced (and make it look like rdmsr_interception()).

This is a preparatory change to removing hardcoded RIP advancement
from instruction intercepts, no functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7eafc6907861..7e843b340490 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4447,13 +4447,13 @@ static int wrmsr_interception(struct vcpu_svm *svm)
 	msr.index = ecx;
 	msr.host_initiated = false;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 	if (kvm_set_msr(&svm->vcpu, &msr)) {
 		trace_kvm_msr_write_ex(ecx, data);
 		kvm_inject_gp(&svm->vcpu, 0);
 		return 1;
 	} else {
 		trace_kvm_msr_write(ecx, data);
+		svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 }
-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 18:07   ` Sean Christopherson
  2019-08-13 13:53 ` [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

On AMD, kvm_x86_ops->skip_emulated_instruction(vcpu) can, in theory,
fail: in !nrips case we call kvm_emulate_instruction(EMULTYPE_SKIP).
Currently, we only do printk(KERN_DEBUG) when this happens and this
is not ideal. Propagate the error up the stack.

On VMX, skip_emulated_instruction() doesn't fail, we have two call
sites calling it explicitly: handle_exception_nmi() and
handle_task_switch(), we can just ignore the result.

On SVM, we also have two explicit call sites:
svm_queue_exception() and it seems we don't need to do anything there as
we check if RIP was advanced or not. In task_switch_interception(),
however, we are better off not proceeding to kvm_task_switch() in case
skip_emulated_instruction() failed.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/svm.c              | 36 ++++++++++++++++++---------------
 arch/x86/kvm/vmx/vmx.c          | 16 ++++++++++++---
 arch/x86/kvm/x86.c              |  6 ++++--
 4 files changed, 38 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7b0a4ee77313..f9e6d0b0f581 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1068,7 +1068,7 @@ struct kvm_x86_ops {
 
 	void (*run)(struct kvm_vcpu *vcpu);
 	int (*handle_exit)(struct kvm_vcpu *vcpu);
-	void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
 	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
 	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7e843b340490..8299b0de06e2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -770,7 +770,7 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
 
 }
 
-static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
@@ -779,18 +779,17 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 		svm->next_rip = svm->vmcb->control.next_rip;
 	}
 
-	if (!svm->next_rip) {
-		if (kvm_emulate_instruction(vcpu, EMULTYPE_SKIP) !=
-				EMULATE_DONE)
-			printk(KERN_DEBUG "%s: NOP\n", __func__);
-		return;
-	}
+	if (!svm->next_rip)
+		return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP);
+
 	if (svm->next_rip - kvm_rip_read(vcpu) > MAX_INST_SIZE)
 		printk(KERN_ERR "%s: ip 0x%lx next 0x%llx\n",
 		       __func__, kvm_rip_read(vcpu), svm->next_rip);
 
 	kvm_rip_write(vcpu, svm->next_rip);
 	svm_set_interrupt_shadow(vcpu, 0);
+
+	return EMULATE_DONE;
 }
 
 static void svm_queue_exception(struct kvm_vcpu *vcpu)
@@ -821,7 +820,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
 		 * raises a fault that is not intercepted. Still better than
 		 * failing in all cases.
 		 */
-		skip_emulated_instruction(&svm->vcpu);
+		(void)skip_emulated_instruction(&svm->vcpu);
 		rip = kvm_rip_read(&svm->vcpu);
 		svm->int3_rip = rip + svm->vmcb->save.cs.base;
 		svm->int3_injected = rip - old_rip;
@@ -3899,20 +3898,25 @@ static int task_switch_interception(struct vcpu_svm *svm)
 	if (reason != TASK_SWITCH_GATE ||
 	    int_type == SVM_EXITINTINFO_TYPE_SOFT ||
 	    (int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
-	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR)))
-		skip_emulated_instruction(&svm->vcpu);
+	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
+		if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)
+			goto fail;
+	}
 
 	if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
 		int_vec = -1;
 
 	if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
-				has_error_code, error_code) == EMULATE_FAIL) {
-		svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
-		svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
-		svm->vcpu.run->internal.ndata = 0;
-		return 0;
-	}
+				has_error_code, error_code) == EMULATE_FAIL)
+		goto fail;
+
 	return 1;
+
+fail:
+	svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+	svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
+	svm->vcpu.run->internal.ndata = 0;
+	return 0;
 }
 
 static int cpuid_interception(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 074385c86c09..358827b5bc44 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1472,8 +1472,11 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
 	return 0;
 }
 
-
-static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+/*
+ * Returns an int to be compatible with SVM implementation (which can fail).
+ * Do not use directly, use skip_emulated_instruction() instead.
+ */
+static int __skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	unsigned long rip;
 
@@ -1483,6 +1486,13 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 
 	/* skipping an emulated instruction also counts */
 	vmx_set_interrupt_shadow(vcpu, 0);
+
+	return EMULATE_DONE;
+}
+
+static inline void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+	(void)__skip_emulated_instruction(vcpu);
 }
 
 static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
@@ -7700,7 +7710,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 
 	.run = vmx_vcpu_run,
 	.handle_exit = vmx_handle_exit,
-	.skip_emulated_instruction = skip_emulated_instruction,
+	.skip_emulated_instruction = __skip_emulated_instruction,
 	.set_interrupt_shadow = vmx_set_interrupt_shadow,
 	.get_interrupt_shadow = vmx_get_interrupt_shadow,
 	.patch_hypercall = vmx_patch_hypercall,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c6d951cbd76c..e8f797fe9d9e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6383,9 +6383,11 @@ static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
 int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
-	int r = EMULATE_DONE;
+	int r;
 
-	kvm_x86_ops->skip_emulated_instruction(vcpu);
+	r = kvm_x86_ops->skip_emulated_instruction(vcpu);
+	if (unlikely(r != EMULATE_DONE))
+		return 0;
 
 	/*
 	 * rflags is the old, "raw" value of the flags.  The new value has
-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

When doing x86_emulate_instruction(EMULTYPE_SKIP) interrupt shadow has to
be cleared if and only if the skipping is successful.

There are two immediate issues:
- In SVM skip_emulated_instruction() we are not zapping interrupt shadow
  in case kvm_emulate_instruction(EMULTYPE_SKIP) is used to advance RIP
  (!nrpip_save).
- In VMX handle_ept_misconfig() when running as a nested hypervisor we
  (static_cpu_has(X86_FEATURE_HYPERVISOR) case) forget to clear interrupt
  shadow.

Note that we intentionally don't handle the case when the skipped
instruction is supposed to prolong the interrupt shadow ("MOV/POP SS") as
skip-emulation of those instructions should not happen under normal
circumstances.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e8f797fe9d9e..c2409d06c114 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6539,6 +6539,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		kvm_rip_write(vcpu, ctxt->_eip);
 		if (ctxt->eflags & X86_EFLAGS_RF)
 			kvm_set_rflags(vcpu, ctxt->eflags & ~X86_EFLAGS_RF);
+		kvm_x86_ops->set_interrupt_shadow(vcpu, 0);
 		return EMULATE_DONE;
 	}
 
-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

To avoid hardcoding xsetbv length to '3' we need to support decoding it in
the emulator.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |  3 ++-
 arch/x86/kvm/emulate.c             | 23 ++++++++++++++++++++++-
 arch/x86/kvm/svm.c                 |  1 +
 arch/x86/kvm/x86.c                 |  6 ++++++
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index feab24cac610..77cf6c11f66b 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -229,7 +229,7 @@ struct x86_emulate_ops {
 	int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt,
 			     const char *smstate);
 	void (*post_leave_smm)(struct x86_emulate_ctxt *ctxt);
-
+	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
 };
 
 typedef u32 __attribute__((vector_size(16))) sse128_t;
@@ -429,6 +429,7 @@ enum x86_intercept {
 	x86_intercept_ins,
 	x86_intercept_out,
 	x86_intercept_outs,
+	x86_intercept_xsetbv,
 
 	nr_x86_intercepts
 };
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 718f7d9afedc..f9e843dd992a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4156,6 +4156,20 @@ static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 	return rc;
 }
 
+static int em_xsetbv(struct x86_emulate_ctxt *ctxt)
+{
+	u32 eax, ecx, edx;
+
+	eax = reg_read(ctxt, VCPU_REGS_RAX);
+	edx = reg_read(ctxt, VCPU_REGS_RDX);
+	ecx = reg_read(ctxt, VCPU_REGS_RCX);
+
+	if (ctxt->ops->set_xcr(ctxt, ecx, ((u64)edx << 32) | eax))
+		return emulate_gp(ctxt, 0);
+
+	return X86EMUL_CONTINUE;
+}
+
 static bool valid_cr(int nr)
 {
 	switch (nr) {
@@ -4409,6 +4423,12 @@ static const struct opcode group7_rm1[] = {
 	N, N, N, N, N, N,
 };
 
+static const struct opcode group7_rm2[] = {
+	N,
+	II(ImplicitOps | Priv,			em_xsetbv,	xsetbv),
+	N, N, N, N, N, N,
+};
+
 static const struct opcode group7_rm3[] = {
 	DIP(SrcNone | Prot | Priv,		vmrun,		check_svme_pa),
 	II(SrcNone  | Prot | EmulateOnUD,	em_hypercall,	vmmcall),
@@ -4498,7 +4518,8 @@ static const struct group_dual group7 = { {
 }, {
 	EXT(0, group7_rm0),
 	EXT(0, group7_rm1),
-	N, EXT(0, group7_rm3),
+	EXT(0, group7_rm2),
+	EXT(0, group7_rm3),
 	II(SrcNone | DstMem | Mov,		em_smsw, smsw), N,
 	II(SrcMem16 | Mov | Priv,		em_lmsw, lmsw),
 	EXT(0, group7_rm7),
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8299b0de06e2..858feeac01a4 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6067,6 +6067,7 @@ static const struct __x86_intercept {
 	[x86_intercept_ins]		= POST_EX(SVM_EXIT_IOIO),
 	[x86_intercept_out]		= POST_EX(SVM_EXIT_IOIO),
 	[x86_intercept_outs]		= POST_EX(SVM_EXIT_IOIO),
+	[x86_intercept_xsetbv]		= PRE_EX(SVM_EXIT_XSETBV),
 };
 
 #undef PRE_EX
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c2409d06c114..337559294169 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6068,6 +6068,11 @@ static void emulator_post_leave_smm(struct x86_emulate_ctxt *ctxt)
 	kvm_smm_changed(emul_to_vcpu(ctxt));
 }
 
+static int emulator_set_xcr(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr)
+{
+	return __kvm_set_xcr(emul_to_vcpu(ctxt), index, xcr);
+}
+
 static const struct x86_emulate_ops emulate_ops = {
 	.read_gpr            = emulator_read_gpr,
 	.write_gpr           = emulator_write_gpr,
@@ -6109,6 +6114,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.set_hflags          = emulator_set_hflags,
 	.pre_leave_smm       = emulator_pre_leave_smm,
 	.post_leave_smm      = emulator_post_leave_smm,
+	.set_xcr             = emulator_set_xcr,
 };
 
 static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Various intercepts hard-code the respective instruction lengths to optimize
skip_emulated_instruction(): when next_rip is pre-set we skip
kvm_emulate_instruction(vcpu, EMULTYPE_SKIP). The optimization is, however,
incorrect: different (redundant) prefixes could be used to enlarge the
instruction. We can't really avoid decoding.

svm->next_rip is not used when CPU supports 'nrips' (X86_FEATURE_NRIPS)
feature: next RIP is provided in VMCB. The feature is not really new
(Opteron G3s had it already) and the change should have zero affect.

Remove manual svm->next_rip setting with hard-coded instruction lengths.
The only case where we now use svm->next_rip is EXIT_IOIO: the instruction
length is provided to us by hardware.

Hardcoded RIP advancement remains in vmrun_interception(), this is going to
be taken care of separately.

Reported-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 858feeac01a4..6d16d1898810 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2903,13 +2903,11 @@ static int nop_on_interception(struct vcpu_svm *svm)
 
 static int halt_interception(struct vcpu_svm *svm)
 {
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 1;
 	return kvm_emulate_halt(&svm->vcpu);
 }
 
 static int vmmcall_interception(struct vcpu_svm *svm)
 {
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	return kvm_emulate_hypercall(&svm->vcpu);
 }
 
@@ -3697,7 +3695,6 @@ static int vmload_interception(struct vcpu_svm *svm)
 
 	nested_vmcb = map.hva;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 
 	nested_svm_vmloadsave(nested_vmcb, svm->vmcb);
@@ -3724,7 +3721,6 @@ static int vmsave_interception(struct vcpu_svm *svm)
 
 	nested_vmcb = map.hva;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 
 	nested_svm_vmloadsave(svm->vmcb, nested_vmcb);
@@ -3775,7 +3771,6 @@ static int stgi_interception(struct vcpu_svm *svm)
 	if (vgif_enabled(svm))
 		clr_intercept(svm, INTERCEPT_STGI);
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 	kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 
@@ -3791,7 +3786,6 @@ static int clgi_interception(struct vcpu_svm *svm)
 	if (nested_svm_check_permissions(svm))
 		return 1;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 
 	disable_gif(svm);
@@ -3816,7 +3810,6 @@ static int invlpga_interception(struct vcpu_svm *svm)
 	/* Let's treat INVLPGA the same as INVLPG (can be optimized!) */
 	kvm_mmu_invlpg(vcpu, kvm_rax_read(&svm->vcpu));
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	return kvm_skip_emulated_instruction(&svm->vcpu);
 }
 
@@ -3839,7 +3832,6 @@ static int xsetbv_interception(struct vcpu_svm *svm)
 	u32 index = kvm_rcx_read(&svm->vcpu);
 
 	if (kvm_set_xcr(&svm->vcpu, index, new_bv) == 0) {
-		svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 
@@ -3921,7 +3913,6 @@ static int task_switch_interception(struct vcpu_svm *svm)
 
 static int cpuid_interception(struct vcpu_svm *svm)
 {
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 	return kvm_emulate_cpuid(&svm->vcpu);
 }
 
@@ -4251,7 +4242,6 @@ static int rdmsr_interception(struct vcpu_svm *svm)
 
 		kvm_rax_write(&svm->vcpu, msr_info.data & 0xffffffff);
 		kvm_rdx_write(&svm->vcpu, msr_info.data >> 32);
-		svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 }
@@ -4457,7 +4447,6 @@ static int wrmsr_interception(struct vcpu_svm *svm)
 		return 1;
 	} else {
 		trace_kvm_msr_write(ecx, data);
-		svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 }
-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception()
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 18:11   ` Sean Christopherson
  2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
  2019-08-14 13:25 ` [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Paolo Bonzini
  7 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Regardless of whether or not nested_svm_vmrun_msrpm() fails, we return 1
from vmrun_interception() so there's no point in doing goto. Also,
nested_svm_vmrun_msrpm() call can be made from nested_svm_vmrun() where
other nested launch issues are handled.

nested_svm_vmrun() returns a bool, however, its result is ignored in
vmrun_interception() as we always return '1'. As a preparatory change
to putting kvm_skip_emulated_instruction() inside nested_svm_vmrun()
make nested_svm_vmrun() return an int (always '1' for now).

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm.c | 36 ++++++++++++++----------------------
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 6d16d1898810..51c39b608ef7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3586,7 +3586,7 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 	mark_all_dirty(svm->vmcb);
 }
 
-static bool nested_svm_vmrun(struct vcpu_svm *svm)
+static int nested_svm_vmrun(struct vcpu_svm *svm)
 {
 	int rc;
 	struct vmcb *nested_vmcb;
@@ -3601,7 +3601,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	if (rc) {
 		if (rc == -EINVAL)
 			kvm_inject_gp(&svm->vcpu, 0);
-		return false;
+		return 1;
 	}
 
 	nested_vmcb = map.hva;
@@ -3614,7 +3614,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 
 		kvm_vcpu_unmap(&svm->vcpu, &map, true);
 
-		return false;
+		return 1;
 	}
 
 	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb_gpa,
@@ -3658,7 +3658,16 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 
 	enter_svm_guest_mode(svm, vmcb_gpa, nested_vmcb, &map);
 
-	return true;
+	if (!nested_svm_vmrun_msrpm(svm)) {
+		svm->vmcb->control.exit_code    = SVM_EXIT_ERR;
+		svm->vmcb->control.exit_code_hi = 0;
+		svm->vmcb->control.exit_info_1  = 0;
+		svm->vmcb->control.exit_info_2  = 0;
+
+		nested_svm_vmexit(svm);
+	}
+
+	return 1;
 }
 
 static void nested_svm_vmloadsave(struct vmcb *from_vmcb, struct vmcb *to_vmcb)
@@ -3737,24 +3746,7 @@ static int vmrun_interception(struct vcpu_svm *svm)
 	/* Save rip after vmrun instruction */
 	kvm_rip_write(&svm->vcpu, kvm_rip_read(&svm->vcpu) + 3);
 
-	if (!nested_svm_vmrun(svm))
-		return 1;
-
-	if (!nested_svm_vmrun_msrpm(svm))
-		goto failed;
-
-	return 1;
-
-failed:
-
-	svm->vmcb->control.exit_code    = SVM_EXIT_ERR;
-	svm->vmcb->control.exit_code_hi = 0;
-	svm->vmcb->control.exit_info_1  = 0;
-	svm->vmcb->control.exit_info_2  = 0;
-
-	nested_svm_vmexit(svm);
-
-	return 1;
+	return nested_svm_vmrun(svm);
 }
 
 static int stgi_interception(struct vcpu_svm *svm)
-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
@ 2019-08-13 13:53 ` " Vitaly Kuznetsov
  2019-08-13 18:11   ` Sean Christopherson
  2019-08-14 13:25 ` [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Paolo Bonzini
  7 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Just like we do with other intercepts, in vmrun_interception() we should be
doing kvm_skip_emulated_instruction() and not just RIP += 3. Also, it is
wrong to increment RIP before nested_svm_vmrun() as it can result in
kvm_inject_gp().

We can't call kvm_skip_emulated_instruction() after nested_svm_vmrun() so
move it inside.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 51c39b608ef7..8473cbea7e8b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3588,7 +3588,7 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 
 static int nested_svm_vmrun(struct vcpu_svm *svm)
 {
-	int rc;
+	int ret;
 	struct vmcb *nested_vmcb;
 	struct vmcb *hsave = svm->nested.hsave;
 	struct vmcb *vmcb = svm->vmcb;
@@ -3597,13 +3597,16 @@ static int nested_svm_vmrun(struct vcpu_svm *svm)
 
 	vmcb_gpa = svm->vmcb->save.rax;
 
-	rc = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb_gpa), &map);
-	if (rc) {
-		if (rc == -EINVAL)
-			kvm_inject_gp(&svm->vcpu, 0);
+	ret = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb_gpa), &map);
+	if (ret == EINVAL) {
+		kvm_inject_gp(&svm->vcpu, 0);
 		return 1;
+	} else if (ret) {
+		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 
+	ret = kvm_skip_emulated_instruction(&svm->vcpu);
+
 	nested_vmcb = map.hva;
 
 	if (!nested_vmcb_checks(nested_vmcb)) {
@@ -3614,7 +3617,7 @@ static int nested_svm_vmrun(struct vcpu_svm *svm)
 
 		kvm_vcpu_unmap(&svm->vcpu, &map, true);
 
-		return 1;
+		return ret;
 	}
 
 	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb_gpa,
@@ -3667,7 +3670,7 @@ static int nested_svm_vmrun(struct vcpu_svm *svm)
 		nested_svm_vmexit(svm);
 	}
 
-	return 1;
+	return ret;
 }
 
 static void nested_svm_vmloadsave(struct vmcb *from_vmcb, struct vmcb *to_vmcb)
@@ -3743,9 +3746,6 @@ static int vmrun_interception(struct vcpu_svm *svm)
 	if (nested_svm_check_permissions(svm))
 		return 1;
 
-	/* Save rip after vmrun instruction */
-	kvm_rip_write(&svm->vcpu, kvm_rip_read(&svm->vcpu) + 3);
-
 	return nested_svm_vmrun(svm);
 }
 
-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
@ 2019-08-13 18:07   ` Sean Christopherson
  2019-08-14  9:34     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2019-08-13 18:07 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Tue, Aug 13, 2019 at 03:53:30PM +0200, Vitaly Kuznetsov wrote:
> @@ -3899,20 +3898,25 @@ static int task_switch_interception(struct vcpu_svm *svm)
>  	if (reason != TASK_SWITCH_GATE ||
>  	    int_type == SVM_EXITINTINFO_TYPE_SOFT ||
>  	    (int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
> -	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR)))
> -		skip_emulated_instruction(&svm->vcpu);
> +	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
> +		if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)

This isn't broken in the current code, but it's flawed in the sense that
it assumes skip_emulated_instruction() never returns EMULATE_USER_EXIT.

Note, both SVM and VMX make the opposite assumption when handling
kvm_task_switch() and kvm_inject_realmode_interrupt().

More below...

> +			goto fail;
> +	}
>  
>  	if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
>  		int_vec = -1;
>  
>  	if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
> -				has_error_code, error_code) == EMULATE_FAIL) {
> -		svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> -		svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> -		svm->vcpu.run->internal.ndata = 0;
> -		return 0;
> -	}
> +				has_error_code, error_code) == EMULATE_FAIL)
> +		goto fail;
> +
>  	return 1;
> +
> +fail:
> +	svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> +	svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> +	svm->vcpu.run->internal.ndata = 0;
> +	return 0;
>  }
>  
>  static int cpuid_interception(struct vcpu_svm *svm)

...

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c6d951cbd76c..e8f797fe9d9e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6383,9 +6383,11 @@ static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
>  int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
> -	int r = EMULATE_DONE;
> +	int r;
>  
> -	kvm_x86_ops->skip_emulated_instruction(vcpu);
> +	r = kvm_x86_ops->skip_emulated_instruction(vcpu);
> +	if (unlikely(r != EMULATE_DONE))
> +		return 0;

x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
the exit_reason.

handle_ept_misconfig() also has the same (pre-existing) flaw.

Given the handle_ept_misconfig() bug and that kvm_emulate_instruction()
sets vcpu->run->exit_reason when it returns EMULATE_FAIL in the normal
case, I think it makes sense to fix the issue in x86_emulate_instruction().
That would also eliminate the need to worry about EMULATE_USER_EXIT in
task_switch_interception(), e.g. the SVM code can just return 0 when it
gets a non-EMULATE_DONE return type.

E.g.:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 07ab14d73094..73b86f81ed9c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6201,7 +6201,8 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
        if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
                return EMULATE_FAIL;

-       if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
+       if ((!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) ||
+           (emulation_type & EMULTYPE_SKIP)) {
                vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
                vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
                vcpu->run->internal.ndata = 0;
@@ -6525,8 +6526,6 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
                                return EMULATE_DONE;
                        if (ctxt->have_exception && inject_emulated_exception(vcpu))
                                return EMULATE_DONE;
-                       if (emulation_type & EMULTYPE_SKIP)
-                               return EMULATE_FAIL;
                        return handle_emulation_failure(vcpu, emulation_type);
                }
        }


As for the kvm_task_switch() handling and other cases, I think it's
possible to rework all of the functions and callers that return/handle
EMULATE_DONE to instead return 0/1, i.e. contain EMULATE_* to x86.c.
I'll put together a series, I think you've suffered more than enough scope
creep as it is :-)

>  
>  	/*
>  	 * rflags is the old, "raw" value of the flags.  The new value has
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception()
  2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
@ 2019-08-13 18:11   ` Sean Christopherson
  0 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-08-13 18:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Tue, Aug 13, 2019 at 03:53:34PM +0200, Vitaly Kuznetsov wrote:
> Regardless of whether or not nested_svm_vmrun_msrpm() fails, we return 1
> from vmrun_interception() so there's no point in doing goto. Also,
> nested_svm_vmrun_msrpm() call can be made from nested_svm_vmrun() where
> other nested launch issues are handled.
> 
> nested_svm_vmrun() returns a bool, however, its result is ignored in
> vmrun_interception() as we always return '1'. As a preparatory change
> to putting kvm_skip_emulated_instruction() inside nested_svm_vmrun()
> make nested_svm_vmrun() return an int (always '1' for now).
> 
> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()
  2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
@ 2019-08-13 18:11   ` Sean Christopherson
  0 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-08-13 18:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Tue, Aug 13, 2019 at 03:53:35PM +0200, Vitaly Kuznetsov wrote:
> Just like we do with other intercepts, in vmrun_interception() we should be
> doing kvm_skip_emulated_instruction() and not just RIP += 3. Also, it is
> wrong to increment RIP before nested_svm_vmrun() as it can result in
> kvm_inject_gp().
> 
> We can't call kvm_skip_emulated_instruction() after nested_svm_vmrun() so
> move it inside.
> 
> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-13 18:07   ` Sean Christopherson
@ 2019-08-14  9:34     ` Vitaly Kuznetsov
  2019-08-15  0:19       ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-14  9:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Tue, Aug 13, 2019 at 03:53:30PM +0200, Vitaly Kuznetsov wrote:
>> @@ -3899,20 +3898,25 @@ static int task_switch_interception(struct vcpu_svm *svm)
>>  	if (reason != TASK_SWITCH_GATE ||
>>  	    int_type == SVM_EXITINTINFO_TYPE_SOFT ||
>>  	    (int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
>> -	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR)))
>> -		skip_emulated_instruction(&svm->vcpu);
>> +	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
>> +		if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)
>
> This isn't broken in the current code, but it's flawed in the sense that
> it assumes skip_emulated_instruction() never returns EMULATE_USER_EXIT.
>
> Note, both SVM and VMX make the opposite assumption when handling
> kvm_task_switch() and kvm_inject_realmode_interrupt().
>
> More below...
>
>> +			goto fail;
>> +	}
>>  
>>  	if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
>>  		int_vec = -1;
>>  
>>  	if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
>> -				has_error_code, error_code) == EMULATE_FAIL) {
>> -		svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> -		svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
>> -		svm->vcpu.run->internal.ndata = 0;
>> -		return 0;
>> -	}
>> +				has_error_code, error_code) == EMULATE_FAIL)
>> +		goto fail;
>> +
>>  	return 1;
>> +
>> +fail:
>> +	svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> +	svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
>> +	svm->vcpu.run->internal.ndata = 0;
>> +	return 0;
>>  }
>>  
>>  static int cpuid_interception(struct vcpu_svm *svm)
>
> ...
>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c6d951cbd76c..e8f797fe9d9e 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6383,9 +6383,11 @@ static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
>>  int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
>>  {
>>  	unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
>> -	int r = EMULATE_DONE;
>> +	int r;
>>  
>> -	kvm_x86_ops->skip_emulated_instruction(vcpu);
>> +	r = kvm_x86_ops->skip_emulated_instruction(vcpu);
>> +	if (unlikely(r != EMULATE_DONE))
>> +		return 0;
>
> x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
> fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
> the exit_reason.

Oh, nice catch, will take a look!

>
> handle_ept_misconfig() also has the same (pre-existing) flaw.
>
> Given the handle_ept_misconfig() bug and that kvm_emulate_instruction()
> sets vcpu->run->exit_reason when it returns EMULATE_FAIL in the normal
> case, I think it makes sense to fix the issue in x86_emulate_instruction().
> That would also eliminate the need to worry about EMULATE_USER_EXIT in
> task_switch_interception(), e.g. the SVM code can just return 0 when it
> gets a non-EMULATE_DONE return type.
>
> E.g.:
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 07ab14d73094..73b86f81ed9c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6201,7 +6201,8 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
>         if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
>                 return EMULATE_FAIL;
>
> -       if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
> +       if ((!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) ||
> +           (emulation_type & EMULTYPE_SKIP)) {
>                 vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>                 vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
>                 vcpu->run->internal.ndata = 0;
> @@ -6525,8 +6526,6 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>                                 return EMULATE_DONE;
>                         if (ctxt->have_exception && inject_emulated_exception(vcpu))
>                                 return EMULATE_DONE;
> -                       if (emulation_type & EMULTYPE_SKIP)
> -                               return EMULATE_FAIL;
>                         return handle_emulation_failure(vcpu, emulation_type);
>                 }
>         }
>
>
> As for the kvm_task_switch() handling and other cases, I think it's
> possible to rework all of the functions and callers that return/handle
> EMULATE_DONE to instead return 0/1, i.e. contain EMULATE_* to x86.c.
> I'll put together a series, I think you've suffered more than enough scope
> creep as it is :-)

No problem at all, you seem to have a lot of great ideas on how to
improve things :-) Thanks for your review!

-- 
Vitaly

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
@ 2019-08-14 13:25 ` Paolo Bonzini
  7 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2019-08-14 13:25 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm
  Cc: linux-kernel, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

On 13/08/19 15:53, Vitaly Kuznetsov wrote:
> Changes since v3 [Sean Christopherson]:
> - add Reviewed-by tag to PATCH5
> - __skip_emulated_instruction()/skip_emulated_instruction() split,
>   'unlikely(r != EMULATE_DONE)' in PATCH2
> - Make nested_svm_vmrun() return an int in PATCH6 (moved from PATCH7)
> - Avoid weird-looking 'if (rc) return ret' in PATCH7
> 
> Original description:
> 
> Jim rightfully complains that hardcoding instuctions lengths is not always
> correct: additional (redundant) prefixes can be used. Luckily, the ugliness
> is mostly harmless: modern AMD CPUs support NRIP_SAVE feature but I'd like
> to clean things up and sacrifice speed in favor of correctness.

Queued, thanks.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-14  9:34     ` Vitaly Kuznetsov
@ 2019-08-15  0:19       ` Sean Christopherson
  2019-08-15  9:24         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2019-08-15  0:19 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Wed, Aug 14, 2019 at 11:34:52AM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>
> > x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
> > fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
> > the exit_reason.
> 
> Oh, nice catch, will take a look!

Don't worry about addressing this.  Paolo has already queued the series,
and I've got a patch set waiting that purges emulation_result entirely
that I'll post once your series hits kvm/queue.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-15  0:19       ` Sean Christopherson
@ 2019-08-15  9:24         ` Vitaly Kuznetsov
  0 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-15  9:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Wed, Aug 14, 2019 at 11:34:52AM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>>
>> > x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
>> > fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
>> > the exit_reason.
>> 
>> Oh, nice catch, will take a look!
>
> Don't worry about addressing this.  Paolo has already queued the series,
> and I've got a patch set waiting that purges emulation_result entirely
> that I'll post once your series hits kvm/queue.

Sometimes being slow and lazy pays off :-)

Thanks a lot!

-- 
Vitaly

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, back to index

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
2019-08-13 18:07   ` Sean Christopherson
2019-08-14  9:34     ` Vitaly Kuznetsov
2019-08-15  0:19       ` Sean Christopherson
2019-08-15  9:24         ` Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
2019-08-13 18:11   ` Sean Christopherson
2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
2019-08-13 18:11   ` Sean Christopherson
2019-08-14 13:25 ` [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Paolo Bonzini

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org kvm@archiver.kernel.org
	public-inbox-index kvm


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox