All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths
@ 2019-08-13 13:53 Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Changes since v3 [Sean Christopherson]:
- add Reviewed-by tag to PATCH5
- __skip_emulated_instruction()/skip_emulated_instruction() split,
  'unlikely(r != EMULATE_DONE)' in PATCH2
- Make nested_svm_vmrun() return an int in PATCH6 (moved from PATCH7)
- Avoid weird-looking 'if (rc) return ret' in PATCH7

Original description:

Jim rightfully complains that hardcoding instuctions lengths is not always
correct: additional (redundant) prefixes can be used. Luckily, the ugliness
is mostly harmless: modern AMD CPUs support NRIP_SAVE feature but I'd like
to clean things up and sacrifice speed in favor of correctness.

Vitaly Kuznetsov (7):
  x86: KVM: svm: don't pretend to advance RIP in case
    wrmsr_interception() results in #GP
  x86: kvm: svm: propagate errors from skip_emulated_instruction()
  x86: KVM: clear interrupt shadow on EMULTYPE_SKIP
  x86: KVM: add xsetbv to the emulator
  x86: KVM: svm: remove hardcoded instruction length from intercepts
  x86: KVM: svm: eliminate weird goto from vmrun_interception()
  x86: KVM: svm: eliminate hardcoded RIP advancement from
    vmrun_interception()

 arch/x86/include/asm/kvm_emulate.h |   3 +-
 arch/x86/include/asm/kvm_host.h    |   2 +-
 arch/x86/kvm/emulate.c             |  23 ++++++-
 arch/x86/kvm/svm.c                 | 100 +++++++++++++----------------
 arch/x86/kvm/vmx/vmx.c             |  16 ++++-
 arch/x86/kvm/x86.c                 |  13 +++-
 6 files changed, 92 insertions(+), 65 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

svm->next_rip is only used by skip_emulated_instruction() and in case
kvm_set_msr() fails we rightfully don't do that. Move svm->next_rip
advancement to 'else' branch to avoid creating false impression that
it's always advanced (and make it look like rdmsr_interception()).

This is a preparatory change to removing hardcoded RIP advancement
from instruction intercepts, no functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7eafc6907861..7e843b340490 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4447,13 +4447,13 @@ static int wrmsr_interception(struct vcpu_svm *svm)
 	msr.index = ecx;
 	msr.host_initiated = false;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 	if (kvm_set_msr(&svm->vcpu, &msr)) {
 		trace_kvm_msr_write_ex(ecx, data);
 		kvm_inject_gp(&svm->vcpu, 0);
 		return 1;
 	} else {
 		trace_kvm_msr_write(ecx, data);
+		svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 18:07   ` Sean Christopherson
  2019-08-13 13:53 ` [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP Vitaly Kuznetsov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

On AMD, kvm_x86_ops->skip_emulated_instruction(vcpu) can, in theory,
fail: in !nrips case we call kvm_emulate_instruction(EMULTYPE_SKIP).
Currently, we only do printk(KERN_DEBUG) when this happens and this
is not ideal. Propagate the error up the stack.

On VMX, skip_emulated_instruction() doesn't fail, we have two call
sites calling it explicitly: handle_exception_nmi() and
handle_task_switch(), we can just ignore the result.

On SVM, we also have two explicit call sites:
svm_queue_exception() and it seems we don't need to do anything there as
we check if RIP was advanced or not. In task_switch_interception(),
however, we are better off not proceeding to kvm_task_switch() in case
skip_emulated_instruction() failed.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/svm.c              | 36 ++++++++++++++++++---------------
 arch/x86/kvm/vmx/vmx.c          | 16 ++++++++++++---
 arch/x86/kvm/x86.c              |  6 ++++--
 4 files changed, 38 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7b0a4ee77313..f9e6d0b0f581 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1068,7 +1068,7 @@ struct kvm_x86_ops {
 
 	void (*run)(struct kvm_vcpu *vcpu);
 	int (*handle_exit)(struct kvm_vcpu *vcpu);
-	void (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+	int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
 	void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
 	u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
 	void (*patch_hypercall)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7e843b340490..8299b0de06e2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -770,7 +770,7 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
 
 }
 
-static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
@@ -779,18 +779,17 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 		svm->next_rip = svm->vmcb->control.next_rip;
 	}
 
-	if (!svm->next_rip) {
-		if (kvm_emulate_instruction(vcpu, EMULTYPE_SKIP) !=
-				EMULATE_DONE)
-			printk(KERN_DEBUG "%s: NOP\n", __func__);
-		return;
-	}
+	if (!svm->next_rip)
+		return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP);
+
 	if (svm->next_rip - kvm_rip_read(vcpu) > MAX_INST_SIZE)
 		printk(KERN_ERR "%s: ip 0x%lx next 0x%llx\n",
 		       __func__, kvm_rip_read(vcpu), svm->next_rip);
 
 	kvm_rip_write(vcpu, svm->next_rip);
 	svm_set_interrupt_shadow(vcpu, 0);
+
+	return EMULATE_DONE;
 }
 
 static void svm_queue_exception(struct kvm_vcpu *vcpu)
@@ -821,7 +820,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
 		 * raises a fault that is not intercepted. Still better than
 		 * failing in all cases.
 		 */
-		skip_emulated_instruction(&svm->vcpu);
+		(void)skip_emulated_instruction(&svm->vcpu);
 		rip = kvm_rip_read(&svm->vcpu);
 		svm->int3_rip = rip + svm->vmcb->save.cs.base;
 		svm->int3_injected = rip - old_rip;
@@ -3899,20 +3898,25 @@ static int task_switch_interception(struct vcpu_svm *svm)
 	if (reason != TASK_SWITCH_GATE ||
 	    int_type == SVM_EXITINTINFO_TYPE_SOFT ||
 	    (int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
-	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR)))
-		skip_emulated_instruction(&svm->vcpu);
+	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
+		if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)
+			goto fail;
+	}
 
 	if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
 		int_vec = -1;
 
 	if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
-				has_error_code, error_code) == EMULATE_FAIL) {
-		svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
-		svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
-		svm->vcpu.run->internal.ndata = 0;
-		return 0;
-	}
+				has_error_code, error_code) == EMULATE_FAIL)
+		goto fail;
+
 	return 1;
+
+fail:
+	svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+	svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
+	svm->vcpu.run->internal.ndata = 0;
+	return 0;
 }
 
 static int cpuid_interception(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 074385c86c09..358827b5bc44 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1472,8 +1472,11 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
 	return 0;
 }
 
-
-static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+/*
+ * Returns an int to be compatible with SVM implementation (which can fail).
+ * Do not use directly, use skip_emulated_instruction() instead.
+ */
+static int __skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	unsigned long rip;
 
@@ -1483,6 +1486,13 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 
 	/* skipping an emulated instruction also counts */
 	vmx_set_interrupt_shadow(vcpu, 0);
+
+	return EMULATE_DONE;
+}
+
+static inline void skip_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+	(void)__skip_emulated_instruction(vcpu);
 }
 
 static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
@@ -7700,7 +7710,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 
 	.run = vmx_vcpu_run,
 	.handle_exit = vmx_handle_exit,
-	.skip_emulated_instruction = skip_emulated_instruction,
+	.skip_emulated_instruction = __skip_emulated_instruction,
 	.set_interrupt_shadow = vmx_set_interrupt_shadow,
 	.get_interrupt_shadow = vmx_get_interrupt_shadow,
 	.patch_hypercall = vmx_patch_hypercall,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c6d951cbd76c..e8f797fe9d9e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6383,9 +6383,11 @@ static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
 int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
-	int r = EMULATE_DONE;
+	int r;
 
-	kvm_x86_ops->skip_emulated_instruction(vcpu);
+	r = kvm_x86_ops->skip_emulated_instruction(vcpu);
+	if (unlikely(r != EMULATE_DONE))
+		return 0;
 
 	/*
 	 * rflags is the old, "raw" value of the flags.  The new value has
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator Vitaly Kuznetsov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

When doing x86_emulate_instruction(EMULTYPE_SKIP) interrupt shadow has to
be cleared if and only if the skipping is successful.

There are two immediate issues:
- In SVM skip_emulated_instruction() we are not zapping interrupt shadow
  in case kvm_emulate_instruction(EMULTYPE_SKIP) is used to advance RIP
  (!nrpip_save).
- In VMX handle_ept_misconfig() when running as a nested hypervisor we
  (static_cpu_has(X86_FEATURE_HYPERVISOR) case) forget to clear interrupt
  shadow.

Note that we intentionally don't handle the case when the skipped
instruction is supposed to prolong the interrupt shadow ("MOV/POP SS") as
skip-emulation of those instructions should not happen under normal
circumstances.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e8f797fe9d9e..c2409d06c114 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6539,6 +6539,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		kvm_rip_write(vcpu, ctxt->_eip);
 		if (ctxt->eflags & X86_EFLAGS_RF)
 			kvm_set_rflags(vcpu, ctxt->eflags & ~X86_EFLAGS_RF);
+		kvm_x86_ops->set_interrupt_shadow(vcpu, 0);
 		return EMULATE_DONE;
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (2 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts Vitaly Kuznetsov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

To avoid hardcoding xsetbv length to '3' we need to support decoding it in
the emulator.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |  3 ++-
 arch/x86/kvm/emulate.c             | 23 ++++++++++++++++++++++-
 arch/x86/kvm/svm.c                 |  1 +
 arch/x86/kvm/x86.c                 |  6 ++++++
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index feab24cac610..77cf6c11f66b 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -229,7 +229,7 @@ struct x86_emulate_ops {
 	int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt,
 			     const char *smstate);
 	void (*post_leave_smm)(struct x86_emulate_ctxt *ctxt);
-
+	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
 };
 
 typedef u32 __attribute__((vector_size(16))) sse128_t;
@@ -429,6 +429,7 @@ enum x86_intercept {
 	x86_intercept_ins,
 	x86_intercept_out,
 	x86_intercept_outs,
+	x86_intercept_xsetbv,
 
 	nr_x86_intercepts
 };
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 718f7d9afedc..f9e843dd992a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4156,6 +4156,20 @@ static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 	return rc;
 }
 
+static int em_xsetbv(struct x86_emulate_ctxt *ctxt)
+{
+	u32 eax, ecx, edx;
+
+	eax = reg_read(ctxt, VCPU_REGS_RAX);
+	edx = reg_read(ctxt, VCPU_REGS_RDX);
+	ecx = reg_read(ctxt, VCPU_REGS_RCX);
+
+	if (ctxt->ops->set_xcr(ctxt, ecx, ((u64)edx << 32) | eax))
+		return emulate_gp(ctxt, 0);
+
+	return X86EMUL_CONTINUE;
+}
+
 static bool valid_cr(int nr)
 {
 	switch (nr) {
@@ -4409,6 +4423,12 @@ static const struct opcode group7_rm1[] = {
 	N, N, N, N, N, N,
 };
 
+static const struct opcode group7_rm2[] = {
+	N,
+	II(ImplicitOps | Priv,			em_xsetbv,	xsetbv),
+	N, N, N, N, N, N,
+};
+
 static const struct opcode group7_rm3[] = {
 	DIP(SrcNone | Prot | Priv,		vmrun,		check_svme_pa),
 	II(SrcNone  | Prot | EmulateOnUD,	em_hypercall,	vmmcall),
@@ -4498,7 +4518,8 @@ static const struct group_dual group7 = { {
 }, {
 	EXT(0, group7_rm0),
 	EXT(0, group7_rm1),
-	N, EXT(0, group7_rm3),
+	EXT(0, group7_rm2),
+	EXT(0, group7_rm3),
 	II(SrcNone | DstMem | Mov,		em_smsw, smsw), N,
 	II(SrcMem16 | Mov | Priv,		em_lmsw, lmsw),
 	EXT(0, group7_rm7),
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8299b0de06e2..858feeac01a4 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6067,6 +6067,7 @@ static const struct __x86_intercept {
 	[x86_intercept_ins]		= POST_EX(SVM_EXIT_IOIO),
 	[x86_intercept_out]		= POST_EX(SVM_EXIT_IOIO),
 	[x86_intercept_outs]		= POST_EX(SVM_EXIT_IOIO),
+	[x86_intercept_xsetbv]		= PRE_EX(SVM_EXIT_XSETBV),
 };
 
 #undef PRE_EX
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c2409d06c114..337559294169 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6068,6 +6068,11 @@ static void emulator_post_leave_smm(struct x86_emulate_ctxt *ctxt)
 	kvm_smm_changed(emul_to_vcpu(ctxt));
 }
 
+static int emulator_set_xcr(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr)
+{
+	return __kvm_set_xcr(emul_to_vcpu(ctxt), index, xcr);
+}
+
 static const struct x86_emulate_ops emulate_ops = {
 	.read_gpr            = emulator_read_gpr,
 	.write_gpr           = emulator_write_gpr,
@@ -6109,6 +6114,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.set_hflags          = emulator_set_hflags,
 	.pre_leave_smm       = emulator_pre_leave_smm,
 	.post_leave_smm      = emulator_post_leave_smm,
+	.set_xcr             = emulator_set_xcr,
 };
 
 static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (3 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Various intercepts hard-code the respective instruction lengths to optimize
skip_emulated_instruction(): when next_rip is pre-set we skip
kvm_emulate_instruction(vcpu, EMULTYPE_SKIP). The optimization is, however,
incorrect: different (redundant) prefixes could be used to enlarge the
instruction. We can't really avoid decoding.

svm->next_rip is not used when CPU supports 'nrips' (X86_FEATURE_NRIPS)
feature: next RIP is provided in VMCB. The feature is not really new
(Opteron G3s had it already) and the change should have zero affect.

Remove manual svm->next_rip setting with hard-coded instruction lengths.
The only case where we now use svm->next_rip is EXIT_IOIO: the instruction
length is provided to us by hardware.

Hardcoded RIP advancement remains in vmrun_interception(), this is going to
be taken care of separately.

Reported-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 858feeac01a4..6d16d1898810 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2903,13 +2903,11 @@ static int nop_on_interception(struct vcpu_svm *svm)
 
 static int halt_interception(struct vcpu_svm *svm)
 {
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 1;
 	return kvm_emulate_halt(&svm->vcpu);
 }
 
 static int vmmcall_interception(struct vcpu_svm *svm)
 {
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	return kvm_emulate_hypercall(&svm->vcpu);
 }
 
@@ -3697,7 +3695,6 @@ static int vmload_interception(struct vcpu_svm *svm)
 
 	nested_vmcb = map.hva;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 
 	nested_svm_vmloadsave(nested_vmcb, svm->vmcb);
@@ -3724,7 +3721,6 @@ static int vmsave_interception(struct vcpu_svm *svm)
 
 	nested_vmcb = map.hva;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 
 	nested_svm_vmloadsave(svm->vmcb, nested_vmcb);
@@ -3775,7 +3771,6 @@ static int stgi_interception(struct vcpu_svm *svm)
 	if (vgif_enabled(svm))
 		clr_intercept(svm, INTERCEPT_STGI);
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 	kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 
@@ -3791,7 +3786,6 @@ static int clgi_interception(struct vcpu_svm *svm)
 	if (nested_svm_check_permissions(svm))
 		return 1;
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	ret = kvm_skip_emulated_instruction(&svm->vcpu);
 
 	disable_gif(svm);
@@ -3816,7 +3810,6 @@ static int invlpga_interception(struct vcpu_svm *svm)
 	/* Let's treat INVLPGA the same as INVLPG (can be optimized!) */
 	kvm_mmu_invlpg(vcpu, kvm_rax_read(&svm->vcpu));
 
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 	return kvm_skip_emulated_instruction(&svm->vcpu);
 }
 
@@ -3839,7 +3832,6 @@ static int xsetbv_interception(struct vcpu_svm *svm)
 	u32 index = kvm_rcx_read(&svm->vcpu);
 
 	if (kvm_set_xcr(&svm->vcpu, index, new_bv) == 0) {
-		svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 
@@ -3921,7 +3913,6 @@ static int task_switch_interception(struct vcpu_svm *svm)
 
 static int cpuid_interception(struct vcpu_svm *svm)
 {
-	svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 	return kvm_emulate_cpuid(&svm->vcpu);
 }
 
@@ -4251,7 +4242,6 @@ static int rdmsr_interception(struct vcpu_svm *svm)
 
 		kvm_rax_write(&svm->vcpu, msr_info.data & 0xffffffff);
 		kvm_rdx_write(&svm->vcpu, msr_info.data >> 32);
-		svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 }
@@ -4457,7 +4447,6 @@ static int wrmsr_interception(struct vcpu_svm *svm)
 		return 1;
 	} else {
 		trace_kvm_msr_write(ecx, data);
-		svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
 		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception()
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (4 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 18:11   ` Sean Christopherson
  2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
  2019-08-14 13:25 ` [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Paolo Bonzini
  7 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Regardless of whether or not nested_svm_vmrun_msrpm() fails, we return 1
from vmrun_interception() so there's no point in doing goto. Also,
nested_svm_vmrun_msrpm() call can be made from nested_svm_vmrun() where
other nested launch issues are handled.

nested_svm_vmrun() returns a bool, however, its result is ignored in
vmrun_interception() as we always return '1'. As a preparatory change
to putting kvm_skip_emulated_instruction() inside nested_svm_vmrun()
make nested_svm_vmrun() return an int (always '1' for now).

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm.c | 36 ++++++++++++++----------------------
 1 file changed, 14 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 6d16d1898810..51c39b608ef7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3586,7 +3586,7 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 	mark_all_dirty(svm->vmcb);
 }
 
-static bool nested_svm_vmrun(struct vcpu_svm *svm)
+static int nested_svm_vmrun(struct vcpu_svm *svm)
 {
 	int rc;
 	struct vmcb *nested_vmcb;
@@ -3601,7 +3601,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	if (rc) {
 		if (rc == -EINVAL)
 			kvm_inject_gp(&svm->vcpu, 0);
-		return false;
+		return 1;
 	}
 
 	nested_vmcb = map.hva;
@@ -3614,7 +3614,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 
 		kvm_vcpu_unmap(&svm->vcpu, &map, true);
 
-		return false;
+		return 1;
 	}
 
 	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb_gpa,
@@ -3658,7 +3658,16 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 
 	enter_svm_guest_mode(svm, vmcb_gpa, nested_vmcb, &map);
 
-	return true;
+	if (!nested_svm_vmrun_msrpm(svm)) {
+		svm->vmcb->control.exit_code    = SVM_EXIT_ERR;
+		svm->vmcb->control.exit_code_hi = 0;
+		svm->vmcb->control.exit_info_1  = 0;
+		svm->vmcb->control.exit_info_2  = 0;
+
+		nested_svm_vmexit(svm);
+	}
+
+	return 1;
 }
 
 static void nested_svm_vmloadsave(struct vmcb *from_vmcb, struct vmcb *to_vmcb)
@@ -3737,24 +3746,7 @@ static int vmrun_interception(struct vcpu_svm *svm)
 	/* Save rip after vmrun instruction */
 	kvm_rip_write(&svm->vcpu, kvm_rip_read(&svm->vcpu) + 3);
 
-	if (!nested_svm_vmrun(svm))
-		return 1;
-
-	if (!nested_svm_vmrun_msrpm(svm))
-		goto failed;
-
-	return 1;
-
-failed:
-
-	svm->vmcb->control.exit_code    = SVM_EXIT_ERR;
-	svm->vmcb->control.exit_code_hi = 0;
-	svm->vmcb->control.exit_info_1  = 0;
-	svm->vmcb->control.exit_info_2  = 0;
-
-	nested_svm_vmexit(svm);
-
-	return 1;
+	return nested_svm_vmrun(svm);
 }
 
 static int stgi_interception(struct vcpu_svm *svm)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (5 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
@ 2019-08-13 13:53 ` Vitaly Kuznetsov
  2019-08-13 18:11   ` Sean Christopherson
  2019-08-14 13:25 ` [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Paolo Bonzini
  7 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-13 13:53 UTC (permalink / raw)
  To: kvm
  Cc: linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

Just like we do with other intercepts, in vmrun_interception() we should be
doing kvm_skip_emulated_instruction() and not just RIP += 3. Also, it is
wrong to increment RIP before nested_svm_vmrun() as it can result in
kvm_inject_gp().

We can't call kvm_skip_emulated_instruction() after nested_svm_vmrun() so
move it inside.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/svm.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 51c39b608ef7..8473cbea7e8b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3588,7 +3588,7 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 
 static int nested_svm_vmrun(struct vcpu_svm *svm)
 {
-	int rc;
+	int ret;
 	struct vmcb *nested_vmcb;
 	struct vmcb *hsave = svm->nested.hsave;
 	struct vmcb *vmcb = svm->vmcb;
@@ -3597,13 +3597,16 @@ static int nested_svm_vmrun(struct vcpu_svm *svm)
 
 	vmcb_gpa = svm->vmcb->save.rax;
 
-	rc = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb_gpa), &map);
-	if (rc) {
-		if (rc == -EINVAL)
-			kvm_inject_gp(&svm->vcpu, 0);
+	ret = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb_gpa), &map);
+	if (ret == EINVAL) {
+		kvm_inject_gp(&svm->vcpu, 0);
 		return 1;
+	} else if (ret) {
+		return kvm_skip_emulated_instruction(&svm->vcpu);
 	}
 
+	ret = kvm_skip_emulated_instruction(&svm->vcpu);
+
 	nested_vmcb = map.hva;
 
 	if (!nested_vmcb_checks(nested_vmcb)) {
@@ -3614,7 +3617,7 @@ static int nested_svm_vmrun(struct vcpu_svm *svm)
 
 		kvm_vcpu_unmap(&svm->vcpu, &map, true);
 
-		return 1;
+		return ret;
 	}
 
 	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb_gpa,
@@ -3667,7 +3670,7 @@ static int nested_svm_vmrun(struct vcpu_svm *svm)
 		nested_svm_vmexit(svm);
 	}
 
-	return 1;
+	return ret;
 }
 
 static void nested_svm_vmloadsave(struct vmcb *from_vmcb, struct vmcb *to_vmcb)
@@ -3743,9 +3746,6 @@ static int vmrun_interception(struct vcpu_svm *svm)
 	if (nested_svm_check_permissions(svm))
 		return 1;
 
-	/* Save rip after vmrun instruction */
-	kvm_rip_write(&svm->vcpu, kvm_rip_read(&svm->vcpu) + 3);
-
 	return nested_svm_vmrun(svm);
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
@ 2019-08-13 18:07   ` Sean Christopherson
  2019-08-14  9:34     ` Vitaly Kuznetsov
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2019-08-13 18:07 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Tue, Aug 13, 2019 at 03:53:30PM +0200, Vitaly Kuznetsov wrote:
> @@ -3899,20 +3898,25 @@ static int task_switch_interception(struct vcpu_svm *svm)
>  	if (reason != TASK_SWITCH_GATE ||
>  	    int_type == SVM_EXITINTINFO_TYPE_SOFT ||
>  	    (int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
> -	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR)))
> -		skip_emulated_instruction(&svm->vcpu);
> +	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
> +		if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)

This isn't broken in the current code, but it's flawed in the sense that
it assumes skip_emulated_instruction() never returns EMULATE_USER_EXIT.

Note, both SVM and VMX make the opposite assumption when handling
kvm_task_switch() and kvm_inject_realmode_interrupt().

More below...

> +			goto fail;
> +	}
>  
>  	if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
>  		int_vec = -1;
>  
>  	if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
> -				has_error_code, error_code) == EMULATE_FAIL) {
> -		svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> -		svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> -		svm->vcpu.run->internal.ndata = 0;
> -		return 0;
> -	}
> +				has_error_code, error_code) == EMULATE_FAIL)
> +		goto fail;
> +
>  	return 1;
> +
> +fail:
> +	svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> +	svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
> +	svm->vcpu.run->internal.ndata = 0;
> +	return 0;
>  }
>  
>  static int cpuid_interception(struct vcpu_svm *svm)

...

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c6d951cbd76c..e8f797fe9d9e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6383,9 +6383,11 @@ static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
>  int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
> -	int r = EMULATE_DONE;
> +	int r;
>  
> -	kvm_x86_ops->skip_emulated_instruction(vcpu);
> +	r = kvm_x86_ops->skip_emulated_instruction(vcpu);
> +	if (unlikely(r != EMULATE_DONE))
> +		return 0;

x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
the exit_reason.

handle_ept_misconfig() also has the same (pre-existing) flaw.

Given the handle_ept_misconfig() bug and that kvm_emulate_instruction()
sets vcpu->run->exit_reason when it returns EMULATE_FAIL in the normal
case, I think it makes sense to fix the issue in x86_emulate_instruction().
That would also eliminate the need to worry about EMULATE_USER_EXIT in
task_switch_interception(), e.g. the SVM code can just return 0 when it
gets a non-EMULATE_DONE return type.

E.g.:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 07ab14d73094..73b86f81ed9c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6201,7 +6201,8 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
        if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
                return EMULATE_FAIL;

-       if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
+       if ((!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) ||
+           (emulation_type & EMULTYPE_SKIP)) {
                vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
                vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
                vcpu->run->internal.ndata = 0;
@@ -6525,8 +6526,6 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
                                return EMULATE_DONE;
                        if (ctxt->have_exception && inject_emulated_exception(vcpu))
                                return EMULATE_DONE;
-                       if (emulation_type & EMULTYPE_SKIP)
-                               return EMULATE_FAIL;
                        return handle_emulation_failure(vcpu, emulation_type);
                }
        }


As for the kvm_task_switch() handling and other cases, I think it's
possible to rework all of the functions and callers that return/handle
EMULATE_DONE to instead return 0/1, i.e. contain EMULATE_* to x86.c.
I'll put together a series, I think you've suffered more than enough scope
creep as it is :-)

>  
>  	/*
>  	 * rflags is the old, "raw" value of the flags.  The new value has
> -- 
> 2.20.1
> 

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception()
  2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
@ 2019-08-13 18:11   ` Sean Christopherson
  0 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-08-13 18:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Tue, Aug 13, 2019 at 03:53:34PM +0200, Vitaly Kuznetsov wrote:
> Regardless of whether or not nested_svm_vmrun_msrpm() fails, we return 1
> from vmrun_interception() so there's no point in doing goto. Also,
> nested_svm_vmrun_msrpm() call can be made from nested_svm_vmrun() where
> other nested launch issues are handled.
> 
> nested_svm_vmrun() returns a bool, however, its result is ignored in
> vmrun_interception() as we always return '1'. As a preparatory change
> to putting kvm_skip_emulated_instruction() inside nested_svm_vmrun()
> make nested_svm_vmrun() return an int (always '1' for now).
> 
> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()
  2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
@ 2019-08-13 18:11   ` Sean Christopherson
  0 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2019-08-13 18:11 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Tue, Aug 13, 2019 at 03:53:35PM +0200, Vitaly Kuznetsov wrote:
> Just like we do with other intercepts, in vmrun_interception() we should be
> doing kvm_skip_emulated_instruction() and not just RIP += 3. Also, it is
> wrong to increment RIP before nested_svm_vmrun() as it can result in
> kvm_inject_gp().
> 
> We can't call kvm_skip_emulated_instruction() after nested_svm_vmrun() so
> move it inside.
> 
> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-13 18:07   ` Sean Christopherson
@ 2019-08-14  9:34     ` Vitaly Kuznetsov
  2019-08-15  0:19       ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-14  9:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Tue, Aug 13, 2019 at 03:53:30PM +0200, Vitaly Kuznetsov wrote:
>> @@ -3899,20 +3898,25 @@ static int task_switch_interception(struct vcpu_svm *svm)
>>  	if (reason != TASK_SWITCH_GATE ||
>>  	    int_type == SVM_EXITINTINFO_TYPE_SOFT ||
>>  	    (int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
>> -	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR)))
>> -		skip_emulated_instruction(&svm->vcpu);
>> +	     (int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
>> +		if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)
>
> This isn't broken in the current code, but it's flawed in the sense that
> it assumes skip_emulated_instruction() never returns EMULATE_USER_EXIT.
>
> Note, both SVM and VMX make the opposite assumption when handling
> kvm_task_switch() and kvm_inject_realmode_interrupt().
>
> More below...
>
>> +			goto fail;
>> +	}
>>  
>>  	if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
>>  		int_vec = -1;
>>  
>>  	if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason,
>> -				has_error_code, error_code) == EMULATE_FAIL) {
>> -		svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> -		svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
>> -		svm->vcpu.run->internal.ndata = 0;
>> -		return 0;
>> -	}
>> +				has_error_code, error_code) == EMULATE_FAIL)
>> +		goto fail;
>> +
>>  	return 1;
>> +
>> +fail:
>> +	svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> +	svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
>> +	svm->vcpu.run->internal.ndata = 0;
>> +	return 0;
>>  }
>>  
>>  static int cpuid_interception(struct vcpu_svm *svm)
>
> ...
>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c6d951cbd76c..e8f797fe9d9e 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6383,9 +6383,11 @@ static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
>>  int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
>>  {
>>  	unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
>> -	int r = EMULATE_DONE;
>> +	int r;
>>  
>> -	kvm_x86_ops->skip_emulated_instruction(vcpu);
>> +	r = kvm_x86_ops->skip_emulated_instruction(vcpu);
>> +	if (unlikely(r != EMULATE_DONE))
>> +		return 0;
>
> x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
> fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
> the exit_reason.

Oh, nice catch, will take a look!

>
> handle_ept_misconfig() also has the same (pre-existing) flaw.
>
> Given the handle_ept_misconfig() bug and that kvm_emulate_instruction()
> sets vcpu->run->exit_reason when it returns EMULATE_FAIL in the normal
> case, I think it makes sense to fix the issue in x86_emulate_instruction().
> That would also eliminate the need to worry about EMULATE_USER_EXIT in
> task_switch_interception(), e.g. the SVM code can just return 0 when it
> gets a non-EMULATE_DONE return type.
>
> E.g.:
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 07ab14d73094..73b86f81ed9c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6201,7 +6201,8 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
>         if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
>                 return EMULATE_FAIL;
>
> -       if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
> +       if ((!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) ||
> +           (emulation_type & EMULTYPE_SKIP)) {
>                 vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>                 vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
>                 vcpu->run->internal.ndata = 0;
> @@ -6525,8 +6526,6 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>                                 return EMULATE_DONE;
>                         if (ctxt->have_exception && inject_emulated_exception(vcpu))
>                                 return EMULATE_DONE;
> -                       if (emulation_type & EMULTYPE_SKIP)
> -                               return EMULATE_FAIL;
>                         return handle_emulation_failure(vcpu, emulation_type);
>                 }
>         }
>
>
> As for the kvm_task_switch() handling and other cases, I think it's
> possible to rework all of the functions and callers that return/handle
> EMULATE_DONE to instead return 0/1, i.e. contain EMULATE_* to x86.c.
> I'll put together a series, I think you've suffered more than enough scope
> creep as it is :-)

No problem at all, you seem to have a lot of great ideas on how to
improve things :-) Thanks for your review!

-- 
Vitaly

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths
  2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
                   ` (6 preceding siblings ...)
  2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
@ 2019-08-14 13:25 ` Paolo Bonzini
  7 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2019-08-14 13:25 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm
  Cc: linux-kernel, Radim Krčmář,
	Joerg Roedel, Jim Mattson, Sean Christopherson

On 13/08/19 15:53, Vitaly Kuznetsov wrote:
> Changes since v3 [Sean Christopherson]:
> - add Reviewed-by tag to PATCH5
> - __skip_emulated_instruction()/skip_emulated_instruction() split,
>   'unlikely(r != EMULATE_DONE)' in PATCH2
> - Make nested_svm_vmrun() return an int in PATCH6 (moved from PATCH7)
> - Avoid weird-looking 'if (rc) return ret' in PATCH7
> 
> Original description:
> 
> Jim rightfully complains that hardcoding instuctions lengths is not always
> correct: additional (redundant) prefixes can be used. Luckily, the ugliness
> is mostly harmless: modern AMD CPUs support NRIP_SAVE feature but I'd like
> to clean things up and sacrifice speed in favor of correctness.

Queued, thanks.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-14  9:34     ` Vitaly Kuznetsov
@ 2019-08-15  0:19       ` Sean Christopherson
  2019-08-15  9:24         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2019-08-15  0:19 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

On Wed, Aug 14, 2019 at 11:34:52AM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>
> > x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
> > fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
> > the exit_reason.
> 
> Oh, nice catch, will take a look!

Don't worry about addressing this.  Paolo has already queued the series,
and I've got a patch set waiting that purges emulation_result entirely
that I'll post once your series hits kvm/queue.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction()
  2019-08-15  0:19       ` Sean Christopherson
@ 2019-08-15  9:24         ` Vitaly Kuznetsov
  0 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2019-08-15  9:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krčmář,
	Joerg Roedel, Jim Mattson

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Wed, Aug 14, 2019 at 11:34:52AM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>>
>> > x86_emulate_instruction() doesn't set vcpu->run->exit_reason when emulation
>> > fails with EMULTYPE_SKIP, i.e. this will exit to userspace with garbage in
>> > the exit_reason.
>> 
>> Oh, nice catch, will take a look!
>
> Don't worry about addressing this.  Paolo has already queued the series,
> and I've got a patch set waiting that purges emulation_result entirely
> that I'll post once your series hits kvm/queue.

Sometimes being slow and lazy pays off :-)

Thanks a lot!

-- 
Vitaly

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2019-08-15  9:24 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-13 13:53 [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 1/7] x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 2/7] x86: kvm: svm: propagate errors from skip_emulated_instruction() Vitaly Kuznetsov
2019-08-13 18:07   ` Sean Christopherson
2019-08-14  9:34     ` Vitaly Kuznetsov
2019-08-15  0:19       ` Sean Christopherson
2019-08-15  9:24         ` Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 3/7] x86: KVM: clear interrupt shadow on EMULTYPE_SKIP Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 4/7] x86: KVM: add xsetbv to the emulator Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 5/7] x86: KVM: svm: remove hardcoded instruction length from intercepts Vitaly Kuznetsov
2019-08-13 13:53 ` [PATCH v4 6/7] x86: KVM: svm: eliminate weird goto from vmrun_interception() Vitaly Kuznetsov
2019-08-13 18:11   ` Sean Christopherson
2019-08-13 13:53 ` [PATCH v4 7/7] x86: KVM: svm: eliminate hardcoded RIP advancement " Vitaly Kuznetsov
2019-08-13 18:11   ` Sean Christopherson
2019-08-14 13:25 ` [PATCH v4 0/7] x86: KVM: svm: get rid of hardcoded instructions lengths Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.