All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/24] [RFC] emulator cleanup
@ 2010-03-09 14:08 Gleb Natapov
  2010-03-09 14:08 ` [PATCH 01/24] KVM: Remove pointer to rflags from realmode_set_cr parameters Gleb Natapov
                   ` (23 more replies)
  0 siblings, 24 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:08 UTC (permalink / raw)
  To: kvm

This is the first series of patches that tries to cleanup emulator code.
This is mix of bug fixes and moving code that does emulation from x86.c
to emulator.c while making it KVM independent. The status of the patches:
works for me. realtime.flat test now also pass where it failed before.

Gleb Natapov (24):
  KVM: Remove pointer to rflags from realmode_set_cr parameters.
  KVM: Provide callback to get/set control registers in emulator ops.
  KVM: remove realmode_lmsw function.
  KVM: Provide current CPL as part of emulator context.
  KVM: Provide current eip as part of emulator context.
  KVM: x86 emulator: fix mov r/m, sreg emulation.
  KVM: x86 emulator: fix 0f 01 /5 emulation
  KVM: x86 emulator: 0f (20|21|22|23) ignore mod bits.
  KVM: x86 emulator: inject #UD on access to non-existing CR
  KVM: x86 emulator: fix mov dr to inject #UD when needed.
  KVM: x86 emulator: fix return values of syscall/sysenter/sysexit
    emulations
  KVM: x86 emulator: do not call writeback if msr access fails.
  KVM: x86 emulator: If LOCK prefix is used dest arg should be memory.
  KVM: x86 emulator: cleanup grp3 return value
  KVM: x86 emulator: Provide more callbacks for x86 emulator.
  KVM: x86 emulator: Emulate task switch in emulator.c
  KVM: x86 emulator: Use load_segment_descriptor() instead of
    kvm_load_segment_descriptor()
  KVM: Use task switch from emulator.c
  KVM: x86 emulator: fix in/out emulation.
  KVM: x86 emulator: Move string pio emulation into emulator.c
  KVM: x86 emulator: remove saved_eip
  KVM: x86 emulator: restart string instruction without going back to a
    guest.
  KVM: x86 emulator: introduce pio in string read ahead.
  KVM: small kvm_arch_vcpu_ioctl_run() cleanup.

 arch/x86/include/asm/kvm_emulate.h |   41 ++-
 arch/x86/include/asm/kvm_host.h    |   10 -
 arch/x86/kvm/emulate.c             |  813 +++++++++++++++++++++++----
 arch/x86/kvm/svm.c                 |   22 +-
 arch/x86/kvm/vmx.c                 |   19 +-
 arch/x86/kvm/x86.c                 | 1112 +++++++++---------------------------
 6 files changed, 1016 insertions(+), 1001 deletions(-)


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 01/24] KVM: Remove pointer to rflags from realmode_set_cr parameters.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
@ 2010-03-09 14:08 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 02/24] KVM: Provide callback to get/set control registers in emulator ops Gleb Natapov
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:08 UTC (permalink / raw)
  To: kvm

Mov reg, cr instruction doesn't change flags in any meaningful way, so
no need to update rflags after instruction execution.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |    3 +--
 arch/x86/kvm/emulate.c          |    3 +--
 arch/x86/kvm/x86.c              |    4 +---
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ec891a2..3b178d8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -586,8 +586,7 @@ void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
 		   unsigned long *rflags);
 
 unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr);
-void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value,
-		     unsigned long *rflags);
+void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value);
 void kvm_enable_efer_bits(u64);
 int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *data);
 int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 4f6ccab..a91bb42 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2531,8 +2531,7 @@ twobyte_insn:
 	case 0x22: /* mov reg, cr */
 		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		realmode_set_cr(ctxt->vcpu,
-				c->modrm_reg, c->modrm_val, &ctxt->eflags);
+		realmode_set_cr(ctxt->vcpu, c->modrm_reg, c->modrm_val);
 		c->dst.type = OP_NONE;
 		break;
 	case 0x23: /* mov from reg to dr */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3753c11..d8711fe 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4054,13 +4054,11 @@ unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr)
 	return value;
 }
 
-void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long val,
-		     unsigned long *rflags)
+void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long val)
 {
 	switch (cr) {
 	case 0:
 		kvm_set_cr0(vcpu, mk_cr_64(kvm_read_cr0(vcpu), val));
-		*rflags = kvm_get_rflags(vcpu);
 		break;
 	case 2:
 		vcpu->arch.cr2 = val;
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 02/24] KVM: Provide callback to get/set control registers in emulator ops.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
  2010-03-09 14:08 ` [PATCH 01/24] KVM: Remove pointer to rflags from realmode_set_cr parameters Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:18   ` Avi Kivity
  2010-03-09 14:09 ` [PATCH 03/24] KVM: remove realmode_lmsw function Gleb Natapov
                   ` (21 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Use this callback instead of directly call kvm function. Also rename
realmode_(set|get)_cr to emulator_(set|get)_cr since function has nothing
to do with real mode.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |    3 +-
 arch/x86/include/asm/kvm_host.h    |    2 -
 arch/x86/kvm/emulate.c             |    7 +-
 arch/x86/kvm/x86.c                 |  114 ++++++++++++++++++------------------
 4 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 2666d7a..0c5caa4 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -108,7 +108,8 @@ struct x86_emulate_ops {
 				const void *new,
 				unsigned int bytes,
 				struct kvm_vcpu *vcpu);
-
+	ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu);
+	void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu);
 };
 
 /* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3b178d8..e8e108a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -585,8 +585,6 @@ void realmode_lidt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
 void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
 		   unsigned long *rflags);
 
-unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr);
-void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long value);
 void kvm_enable_efer_bits(u64);
 int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *data);
 int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index a91bb42..d515795 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2480,7 +2480,7 @@ twobyte_insn:
 			break;
 		case 4: /* smsw */
 			c->dst.bytes = 2;
-			c->dst.val = realmode_get_cr(ctxt->vcpu, 0);
+			c->dst.val = ops->get_cr(0, ctxt->vcpu);
 			break;
 		case 6: /* lmsw */
 			realmode_lmsw(ctxt->vcpu, (u16)c->src.val,
@@ -2516,8 +2516,7 @@ twobyte_insn:
 	case 0x20: /* mov cr, reg */
 		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		c->regs[c->modrm_rm] =
-				realmode_get_cr(ctxt->vcpu, c->modrm_reg);
+		c->regs[c->modrm_rm] = ops->get_cr(c->modrm_reg, ctxt->vcpu);
 		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x21: /* mov from dr to reg */
@@ -2531,7 +2530,7 @@ twobyte_insn:
 	case 0x22: /* mov reg, cr */
 		if (c->modrm_mod != 3)
 			goto cannot_emulate;
-		realmode_set_cr(ctxt->vcpu, c->modrm_reg, c->modrm_val);
+		ops->set_cr(c->modrm_reg, c->modrm_val, ctxt->vcpu);
 		c->dst.type = OP_NONE;
 		break;
 	case 0x23: /* mov from reg to dr */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d8711fe..7b62ef2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3397,12 +3397,70 @@ void kvm_report_emulation_failure(struct kvm_vcpu *vcpu, const char *context)
 }
 EXPORT_SYMBOL_GPL(kvm_report_emulation_failure);
 
+static u64 mk_cr_64(u64 curr_cr, u32 new_val)
+{
+	return (curr_cr & ~((1ULL << 32) - 1)) | new_val;
+}
+
+static unsigned long emulator_get_cr(int cr, struct kvm_vcpu *vcpu)
+{
+	unsigned long value;
+
+	switch (cr) {
+	case 0:
+		value = kvm_read_cr0(vcpu);
+		break;
+	case 2:
+		value = vcpu->arch.cr2;
+		break;
+	case 3:
+		value = vcpu->arch.cr3;
+		break;
+	case 4:
+		value = kvm_read_cr4(vcpu);
+		break;
+	case 8:
+		value = kvm_get_cr8(vcpu);
+		break;
+	default:
+		vcpu_printf(vcpu, "%s: unexpected cr %u\n", __func__, cr);
+		return 0;
+	}
+
+	return value;
+}
+
+static void emulator_set_cr(int cr, unsigned long val, struct kvm_vcpu *vcpu)
+{
+	switch (cr) {
+	case 0:
+		kvm_set_cr0(vcpu, mk_cr_64(kvm_read_cr0(vcpu), val));
+		break;
+	case 2:
+		vcpu->arch.cr2 = val;
+		break;
+	case 3:
+		kvm_set_cr3(vcpu, val);
+		break;
+	case 4:
+		kvm_set_cr4(vcpu, mk_cr_64(kvm_read_cr4(vcpu), val));
+		break;
+	case 8:
+		kvm_set_cr8(vcpu, val & 0xfUL);
+		break;
+	default:
+		vcpu_printf(vcpu, "%s: unexpected cr %u\n", __func__, cr);
+	}
+}
+
 static struct x86_emulate_ops emulate_ops = {
 	.read_std            = kvm_read_guest_virt_system,
 	.fetch               = kvm_fetch_guest_virt,
 	.read_emulated       = emulator_read_emulated,
 	.write_emulated      = emulator_write_emulated,
 	.cmpxchg_emulated    = emulator_cmpxchg_emulated,
+	.get_cr              = emulator_get_cr,
+	.set_cr              = emulator_set_cr,
 };
 
 static void cache_all_regs(struct kvm_vcpu *vcpu)
@@ -4000,11 +4058,6 @@ int kvm_fix_hypercall(struct kvm_vcpu *vcpu)
 	return emulator_write_emulated(rip, instruction, 3, vcpu);
 }
 
-static u64 mk_cr_64(u64 curr_cr, u32 new_val)
-{
-	return (curr_cr & ~((1ULL << 32) - 1)) | new_val;
-}
-
 void realmode_lgdt(struct kvm_vcpu *vcpu, u16 limit, unsigned long base)
 {
 	struct desc_ptr dt = { limit, base };
@@ -4026,57 +4079,6 @@ void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
 	*rflags = kvm_get_rflags(vcpu);
 }
 
-unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr)
-{
-	unsigned long value;
-
-	switch (cr) {
-	case 0:
-		value = kvm_read_cr0(vcpu);
-		break;
-	case 2:
-		value = vcpu->arch.cr2;
-		break;
-	case 3:
-		value = vcpu->arch.cr3;
-		break;
-	case 4:
-		value = kvm_read_cr4(vcpu);
-		break;
-	case 8:
-		value = kvm_get_cr8(vcpu);
-		break;
-	default:
-		vcpu_printf(vcpu, "%s: unexpected cr %u\n", __func__, cr);
-		return 0;
-	}
-
-	return value;
-}
-
-void realmode_set_cr(struct kvm_vcpu *vcpu, int cr, unsigned long val)
-{
-	switch (cr) {
-	case 0:
-		kvm_set_cr0(vcpu, mk_cr_64(kvm_read_cr0(vcpu), val));
-		break;
-	case 2:
-		vcpu->arch.cr2 = val;
-		break;
-	case 3:
-		kvm_set_cr3(vcpu, val);
-		break;
-	case 4:
-		kvm_set_cr4(vcpu, mk_cr_64(kvm_read_cr4(vcpu), val));
-		break;
-	case 8:
-		kvm_set_cr8(vcpu, val & 0xfUL);
-		break;
-	default:
-		vcpu_printf(vcpu, "%s: unexpected cr %u\n", __func__, cr);
-	}
-}
-
 static int move_to_next_stateful_cpuid_entry(struct kvm_vcpu *vcpu, int i)
 {
 	struct kvm_cpuid_entry2 *e = &vcpu->arch.cpuid_entries[i];
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 03/24] KVM: remove realmode_lmsw function.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
  2010-03-09 14:08 ` [PATCH 01/24] KVM: Remove pointer to rflags from realmode_set_cr parameters Gleb Natapov
  2010-03-09 14:09 ` [PATCH 02/24] KVM: Provide callback to get/set control registers in emulator ops Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 04/24] KVM: Provide current CPL as part of emulator context Gleb Natapov
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Use (get|set)_cr callback to emulate lmsw inside emulator.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |    2 --
 arch/x86/kvm/emulate.c          |    4 ++--
 arch/x86/kvm/x86.c              |    7 -------
 3 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e8e108a..1e15a0a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -582,8 +582,6 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 void kvm_report_emulation_failure(struct kvm_vcpu *cvpu, const char *context);
 void realmode_lgdt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
 void realmode_lidt(struct kvm_vcpu *vcpu, u16 size, unsigned long address);
-void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
-		   unsigned long *rflags);
 
 void kvm_enable_efer_bits(u64);
 int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *data);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index d515795..3d1ee74 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2483,8 +2483,8 @@ twobyte_insn:
 			c->dst.val = ops->get_cr(0, ctxt->vcpu);
 			break;
 		case 6: /* lmsw */
-			realmode_lmsw(ctxt->vcpu, (u16)c->src.val,
-				      &ctxt->eflags);
+			ops->set_cr(0, (ops->get_cr(0, ctxt->vcpu) & ~0x0ful) |
+				    (c->src.val & 0x0f), ctxt->vcpu);
 			c->dst.type = OP_NONE;
 			break;
 		case 7: /* invlpg*/
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7b62ef2..8f2b61c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4072,13 +4072,6 @@ void realmode_lidt(struct kvm_vcpu *vcpu, u16 limit, unsigned long base)
 	kvm_x86_ops->set_idt(vcpu, &dt);
 }
 
-void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
-		   unsigned long *rflags)
-{
-	kvm_lmsw(vcpu, msw);
-	*rflags = kvm_get_rflags(vcpu);
-}
-
 static int move_to_next_stateful_cpuid_entry(struct kvm_vcpu *vcpu, int i)
 {
 	struct kvm_cpuid_entry2 *e = &vcpu->arch.cpuid_entries[i];
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 04/24] KVM: Provide current CPL as part of emulator context.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (2 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 03/24] KVM: remove realmode_lmsw function Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:24   ` Avi Kivity
  2010-03-09 14:09 ` [PATCH 05/24] KVM: Provide current eip " Gleb Natapov
                   ` (19 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Eliminate the need to call back into KVM to get it from emulator.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |    1 +
 arch/x86/kvm/emulate.c             |    6 +++---
 arch/x86/kvm/x86.c                 |    1 +
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 0c5caa4..d8b2da0 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -159,6 +159,7 @@ struct x86_emulate_ctxt {
 	struct kvm_vcpu *vcpu;
 
 	unsigned long eflags;
+	int cpl;
 	/* Emulated execution mode, represented by an X86EMUL_MODE value. */
 	int mode;
 	u32 cs_base;
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 3d1ee74..ed29a52 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1254,7 +1254,7 @@ static int emulate_popf(struct x86_emulate_ctxt *ctxt,
 	int rc;
 	unsigned long val, change_mask;
 	int iopl = (ctxt->eflags & X86_EFLAGS_IOPL) >> IOPL_SHIFT;
-	int cpl = kvm_x86_ops->get_cpl(ctxt->vcpu);
+	int cpl = ctxt->cpl;
 
 	rc = emulate_pop(ctxt, ops, &val, len);
 	if (rc != X86EMUL_CONTINUE)
@@ -1763,7 +1763,7 @@ static bool emulator_bad_iopl(struct x86_emulate_ctxt *ctxt)
 	if (ctxt->mode == X86EMUL_MODE_VM86)
 		return true;
 	iopl = (ctxt->eflags & X86_EFLAGS_IOPL) >> IOPL_SHIFT;
-	return kvm_x86_ops->get_cpl(ctxt->vcpu) > iopl;
+	return ctxt->cpl > iopl;
 }
 
 static bool emulator_io_port_access_allowed(struct x86_emulate_ctxt *ctxt,
@@ -1839,7 +1839,7 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	}
 
 	/* Privileged instruction can be executed only in CPL=0 */
-	if ((c->d & Priv) && kvm_x86_ops->get_cpl(ctxt->vcpu)) {
+	if ((c->d & Priv) && ctxt->cpl) {
 		kvm_inject_gp(ctxt->vcpu, 0);
 		goto done;
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8f2b61c..9b5fb43 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3499,6 +3499,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 
 		vcpu->arch.emulate_ctxt.vcpu = vcpu;
 		vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
+		vcpu->arch.emulate_ctxt.cpl = kvm_x86_ops->get_cpl(vcpu);
 		vcpu->arch.emulate_ctxt.mode =
 			(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
 			(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 05/24] KVM: Provide current eip as part of emulator context.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (3 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 04/24] KVM: Provide current CPL as part of emulator context Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 06/24] KVM: x86 emulator: fix mov r/m, sreg emulation Gleb Natapov
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Eliminate the need to call back into KVM to get it from emulator.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |    3 ++-
 arch/x86/kvm/emulate.c             |   12 ++++++------
 arch/x86/kvm/x86.c                 |    1 +
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index d8b2da0..032d02f 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -140,7 +140,7 @@ struct decode_cache {
 	u8 seg_override;
 	unsigned int d;
 	unsigned long regs[NR_VCPU_REGS];
-	unsigned long eip, eip_orig;
+	unsigned long eip;
 	/* modrm */
 	u8 modrm;
 	u8 modrm_mod;
@@ -159,6 +159,7 @@ struct x86_emulate_ctxt {
 	struct kvm_vcpu *vcpu;
 
 	unsigned long eflags;
+	unsigned long eip; /* eip before instruction emulation */
 	int cpl;
 	/* Emulated execution mode, represented by an X86EMUL_MODE value. */
 	int mode;
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index ed29a52..2cc9ef4 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -667,7 +667,7 @@ static int do_insn_fetch(struct x86_emulate_ctxt *ctxt,
 	int rc;
 
 	/* x86 instructions are limited to 15 bytes. */
-	if (eip + size - ctxt->decode.eip_orig > 15)
+	if (eip + size - ctxt->eip > 15)
 		return X86EMUL_UNHANDLEABLE;
 	eip += ctxt->cs_base;
 	while (size--) {
@@ -927,7 +927,7 @@ x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	/* Shadow copy of register state. Committed on successful emulation. */
 
 	memset(c, 0, sizeof(struct decode_cache));
-	c->eip = c->eip_orig = kvm_rip_read(ctxt->vcpu);
+	c->eip = ctxt->eip;
 	ctxt->cs_base = seg_base(ctxt, VCPU_SREG_CS);
 	memcpy(c->regs, ctxt->vcpu->arch.regs, sizeof c->regs);
 
@@ -1874,7 +1874,7 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 			}
 		}
 		c->regs[VCPU_REGS_RCX]--;
-		c->eip = kvm_rip_read(ctxt->vcpu);
+		c->eip = ctxt->eip;
 	}
 
 	if (c->src.type == OP_MEM) {
@@ -2443,7 +2443,7 @@ twobyte_insn:
 				goto done;
 
 			/* Let the processor re-execute the fixed hypercall */
-			c->eip = kvm_rip_read(ctxt->vcpu);
+			c->eip = ctxt->eip;
 			/* Disable writeback. */
 			c->dst.type = OP_NONE;
 			break;
@@ -2547,7 +2547,7 @@ twobyte_insn:
 			| ((u64)c->regs[VCPU_REGS_RDX] << 32);
 		if (kvm_set_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], msr_data)) {
 			kvm_inject_gp(ctxt->vcpu, 0);
-			c->eip = kvm_rip_read(ctxt->vcpu);
+			c->eip = ctxt->eip;
 		}
 		rc = X86EMUL_CONTINUE;
 		c->dst.type = OP_NONE;
@@ -2556,7 +2556,7 @@ twobyte_insn:
 		/* rdmsr */
 		if (kvm_get_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], &msr_data)) {
 			kvm_inject_gp(ctxt->vcpu, 0);
-			c->eip = kvm_rip_read(ctxt->vcpu);
+			c->eip = ctxt->eip;
 		} else {
 			c->regs[VCPU_REGS_RAX] = (u32)msr_data;
 			c->regs[VCPU_REGS_RDX] = msr_data >> 32;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9b5fb43..41cf54c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3500,6 +3500,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 		vcpu->arch.emulate_ctxt.vcpu = vcpu;
 		vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
 		vcpu->arch.emulate_ctxt.cpl = kvm_x86_ops->get_cpl(vcpu);
+		vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
 		vcpu->arch.emulate_ctxt.mode =
 			(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
 			(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 06/24] KVM: x86 emulator: fix mov r/m, sreg emulation.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (4 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 05/24] KVM: Provide current eip " Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation Gleb Natapov
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

mov r/m, sreg generates #UD ins sreg is incorrect.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 2cc9ef4..2df510b 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2122,12 +2122,11 @@ special_insn:
 	case 0x8c: { /* mov r/m, sreg */
 		struct kvm_segment segreg;
 
-		if (c->modrm_reg <= 5)
+		if (c->modrm_reg <= VCPU_SREG_GS)
 			kvm_get_segment(ctxt->vcpu, &segreg, c->modrm_reg);
 		else {
-			printk(KERN_INFO "0x8c: Invalid segreg in modrm byte 0x%02x\n",
-			       c->modrm);
-			goto cannot_emulate;
+			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
+			goto done;
 		}
 		c->dst.val = segreg.selector;
 		break;
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (5 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 06/24] KVM: x86 emulator: fix mov r/m, sreg emulation Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:27   ` Avi Kivity
  2010-03-09 14:09 ` [PATCH 08/24] KVM: x86 emulator: 0f (20|21|22|23) ignore mod bits Gleb Natapov
                   ` (16 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

It is undefined and should generate #UD.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 2df510b..1a32b78 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2486,6 +2486,9 @@ twobyte_insn:
 				    (c->src.val & 0x0f), ctxt->vcpu);
 			c->dst.type = OP_NONE;
 			break;
+		case 5: /* not defined */
+			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
+			goto done;
 		case 7: /* invlpg*/
 			emulate_invlpg(ctxt->vcpu, memop);
 			/* Disable writeback. */
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 08/24] KVM: x86 emulator: 0f (20|21|22|23) ignore mod bits.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (6 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 09/24] KVM: x86 emulator: inject #UD on access to non-existing CR Gleb Natapov
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Resent spec says that for 0f (20|21|22|23) the 2 bits in the mod field
are ignored. Interestingly enough older spec says that 11 is only valid
encoding.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 1a32b78..54e62dc 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2516,28 +2516,20 @@ twobyte_insn:
 		c->dst.type = OP_NONE;
 		break;
 	case 0x20: /* mov cr, reg */
-		if (c->modrm_mod != 3)
-			goto cannot_emulate;
 		c->regs[c->modrm_rm] = ops->get_cr(c->modrm_reg, ctxt->vcpu);
 		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x21: /* mov from dr to reg */
-		if (c->modrm_mod != 3)
-			goto cannot_emulate;
 		if (emulator_get_dr(ctxt, c->modrm_reg, &c->regs[c->modrm_rm]))
 			goto cannot_emulate;
 		rc = X86EMUL_CONTINUE;
 		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x22: /* mov reg, cr */
-		if (c->modrm_mod != 3)
-			goto cannot_emulate;
 		ops->set_cr(c->modrm_reg, c->modrm_val, ctxt->vcpu);
 		c->dst.type = OP_NONE;
 		break;
 	case 0x23: /* mov from reg to dr */
-		if (c->modrm_mod != 3)
-			goto cannot_emulate;
 		if (emulator_set_dr(ctxt, c->modrm_reg, c->regs[c->modrm_rm]))
 			goto cannot_emulate;
 		rc = X86EMUL_CONTINUE;
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 09/24] KVM: x86 emulator: inject #UD on access to non-existing CR
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (7 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 08/24] KVM: x86 emulator: 0f (20|21|22|23) ignore mod bits Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 10/24] KVM: x86 emulator: fix mov dr to inject #UD when needed Gleb Natapov
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm


Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 54e62dc..5ba082a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2516,6 +2516,13 @@ twobyte_insn:
 		c->dst.type = OP_NONE;
 		break;
 	case 0x20: /* mov cr, reg */
+		switch (c->modrm_reg) {
+		case 1:
+		case 5 ... 7:
+		case 9 ... 15:
+			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
+			goto done;
+		}
 		c->regs[c->modrm_rm] = ops->get_cr(c->modrm_reg, ctxt->vcpu);
 		c->dst.type = OP_NONE;	/* no writeback */
 		break;
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 10/24] KVM: x86 emulator: fix mov dr to inject #UD when needed.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (8 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 09/24] KVM: x86 emulator: inject #UD on access to non-existing CR Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 11/24] KVM: x86 emulator: fix return values of syscall/sysenter/sysexit emulations Gleb Natapov
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

If CR4.DE=1 access to registers DR4/DR5 cause #UD.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |   18 ++++++++++++------
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 5ba082a..dcb9720 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2527,9 +2527,12 @@ twobyte_insn:
 		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x21: /* mov from dr to reg */
-		if (emulator_get_dr(ctxt, c->modrm_reg, &c->regs[c->modrm_rm]))
-			goto cannot_emulate;
-		rc = X86EMUL_CONTINUE;
+		if ((ops->get_cr(4, ctxt->vcpu) & X86_CR4_DE) &&
+		    (c->modrm_reg == 4 || c->modrm_reg == 5)) {
+			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
+			goto done;
+		}
+		emulator_get_dr(ctxt, c->modrm_reg, &c->regs[c->modrm_rm]);
 		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x22: /* mov reg, cr */
@@ -2537,9 +2540,12 @@ twobyte_insn:
 		c->dst.type = OP_NONE;
 		break;
 	case 0x23: /* mov from reg to dr */
-		if (emulator_set_dr(ctxt, c->modrm_reg, c->regs[c->modrm_rm]))
-			goto cannot_emulate;
-		rc = X86EMUL_CONTINUE;
+		if ((ops->get_cr(4, ctxt->vcpu) & X86_CR4_DE) &&
+		    (c->modrm_reg == 4 || c->modrm_reg == 5)) {
+			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
+			goto done;
+		}
+		emulator_set_dr(ctxt, c->modrm_reg, c->regs[c->modrm_rm]);
 		c->dst.type = OP_NONE;	/* no writeback */
 		break;
 	case 0x30:
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 11/24] KVM: x86 emulator: fix return values of syscall/sysenter/sysexit emulations
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (9 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 10/24] KVM: x86 emulator: fix mov dr to inject #UD when needed Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 12/24] KVM: x86 emulator: do not call writeback if msr access fails Gleb Natapov
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Return X86EMUL_PROPAGATE_FAULT is fault was injected. Also inject #UD
for those instruction when appropriate.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index dcb9720..6381df9 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1597,8 +1597,11 @@ emulate_syscall(struct x86_emulate_ctxt *ctxt)
 	u64 msr_data;
 
 	/* syscall is not available in real mode */
-	if (ctxt->mode == X86EMUL_MODE_REAL || ctxt->mode == X86EMUL_MODE_VM86)
-		return X86EMUL_UNHANDLEABLE;
+	if (ctxt->mode == X86EMUL_MODE_REAL ||
+	    ctxt->mode == X86EMUL_MODE_VM86) {
+		kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
+		return X86EMUL_PROPAGATE_FAULT;
+	}
 
 	setup_syscalls_segments(ctxt, &cs, &ss);
 
@@ -1648,14 +1651,16 @@ emulate_sysenter(struct x86_emulate_ctxt *ctxt)
 	/* inject #GP if in real mode */
 	if (ctxt->mode == X86EMUL_MODE_REAL) {
 		kvm_inject_gp(ctxt->vcpu, 0);
-		return X86EMUL_UNHANDLEABLE;
+		return X86EMUL_PROPAGATE_FAULT;
 	}
 
 	/* XXX sysenter/sysexit have not been tested in 64bit mode.
 	* Therefore, we inject an #UD.
 	*/
-	if (ctxt->mode == X86EMUL_MODE_PROT64)
-		return X86EMUL_UNHANDLEABLE;
+	if (ctxt->mode == X86EMUL_MODE_PROT64) {
+		kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
+		return X86EMUL_PROPAGATE_FAULT;
+	}
 
 	setup_syscalls_segments(ctxt, &cs, &ss);
 
@@ -1710,7 +1715,7 @@ emulate_sysexit(struct x86_emulate_ctxt *ctxt)
 	if (ctxt->mode == X86EMUL_MODE_REAL ||
 	    ctxt->mode == X86EMUL_MODE_VM86) {
 		kvm_inject_gp(ctxt->vcpu, 0);
-		return X86EMUL_UNHANDLEABLE;
+		return X86EMUL_PROPAGATE_FAULT;
 	}
 
 	setup_syscalls_segments(ctxt, &cs, &ss);
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 12/24] KVM: x86 emulator: do not call writeback if msr access fails.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (10 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 11/24] KVM: x86 emulator: fix return values of syscall/sysenter/sysexit emulations Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 13/24] KVM: x86 emulator: If LOCK prefix is used dest arg should be memory Gleb Natapov
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm


Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 6381df9..45ded7f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2559,7 +2559,7 @@ twobyte_insn:
 			| ((u64)c->regs[VCPU_REGS_RDX] << 32);
 		if (kvm_set_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], msr_data)) {
 			kvm_inject_gp(ctxt->vcpu, 0);
-			c->eip = ctxt->eip;
+			goto done;
 		}
 		rc = X86EMUL_CONTINUE;
 		c->dst.type = OP_NONE;
@@ -2568,7 +2568,7 @@ twobyte_insn:
 		/* rdmsr */
 		if (kvm_get_msr(ctxt->vcpu, c->regs[VCPU_REGS_RCX], &msr_data)) {
 			kvm_inject_gp(ctxt->vcpu, 0);
-			c->eip = ctxt->eip;
+			goto done;
 		} else {
 			c->regs[VCPU_REGS_RAX] = (u32)msr_data;
 			c->regs[VCPU_REGS_RDX] = msr_data >> 32;
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 13/24] KVM: x86 emulator: If LOCK prefix is used dest arg should be memory.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (11 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 12/24] KVM: x86 emulator: do not call writeback if msr access fails Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 14/24] KVM: x86 emulator: cleanup grp3 return value Gleb Natapov
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

If LOCK prefix is used dest arg should be memory, otherwise instruction
should generate #UD.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 45ded7f..018abb3 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1838,7 +1838,7 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	}
 
 	/* LOCK prefix is allowed only with some instructions */
-	if (c->lock_prefix && !(c->d & Lock)) {
+	if (c->lock_prefix && (!(c->d & Lock) || c->dst.type != OP_MEM)) {
 		kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
 		goto done;
 	}
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 14/24] KVM: x86 emulator: cleanup grp3 return value
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (12 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 13/24] KVM: x86 emulator: If LOCK prefix is used dest arg should be memory Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator Gleb Natapov
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

When x86_emulate_insn() does not know how to emulate instruction it
exits via cannot_emulate label in all cases except when emulating
grp3. Fix that.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |   12 ++++--------
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 018abb3..6e2b34b 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1394,7 +1394,6 @@ static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt,
 			       struct x86_emulate_ops *ops)
 {
 	struct decode_cache *c = &ctxt->decode;
-	int rc = X86EMUL_CONTINUE;
 
 	switch (c->modrm_reg) {
 	case 0 ... 1:	/* test */
@@ -1407,11 +1406,9 @@ static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt,
 		emulate_1op("neg", c->dst, ctxt->eflags);
 		break;
 	default:
-		DPRINTF("Cannot emulate %02x\n", c->b);
-		rc = X86EMUL_UNHANDLEABLE;
-		break;
+		return 0;
 	}
-	return rc;
+	return 1;
 }
 
 static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt,
@@ -2370,9 +2367,8 @@ special_insn:
 		c->dst.type = OP_NONE;	/* Disable writeback. */
 		break;
 	case 0xf6 ... 0xf7:	/* Grp3 */
-		rc = emulate_grp3(ctxt, ops);
-		if (rc != X86EMUL_CONTINUE)
-			goto done;
+		if (!emulate_grp3(ctxt, ops))
+			goto cannot_emulate;
 		break;
 	case 0xf8: /* clc */
 		ctxt->eflags &= ~EFLG_CF;
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (13 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 14/24] KVM: x86 emulator: cleanup grp3 return value Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:43   ` Avi Kivity
  2010-03-09 14:09 ` [PATCH 16/24] KVM: x86 emulator: Emulate task switch in emulator.c Gleb Natapov
                   ` (8 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Provide get_cached_descriptor(), set_cached_descriptor(),
get_segment_selector(), set_segment_selector(), get_gdt(),
write_std() callbacks.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |   16 +++++
 arch/x86/kvm/x86.c                 |  130 +++++++++++++++++++++++++++++++----
 2 files changed, 131 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 032d02f..e881618 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -63,6 +63,15 @@ struct x86_emulate_ops {
 			unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
 
 	/*
+	 * write_std: Write bytes of standard (non-emulated/special) memory.
+	 *            Used for descriptor writing.
+	 *  @addr:  [IN ] Linear address to which to write.
+	 *  @val:   [OUT] Value write to memory, zero-extended to 'u_long'.
+	 *  @bytes: [IN ] Number of bytes to write to memory.
+	 */
+	int (*write_std)(unsigned long addr, void *val,
+			 unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
+	/*
 	 * fetch: Read bytes of standard (non-emulated/special) memory.
 	 *        Used for instruction fetch.
 	 *  @addr:  [IN ] Linear address from which to read.
@@ -108,6 +117,13 @@ struct x86_emulate_ops {
 				const void *new,
 				unsigned int bytes,
 				struct kvm_vcpu *vcpu);
+	bool (*get_cached_descriptor)(struct desc_struct *desc,
+				      int seg, struct kvm_vcpu *vcpu);
+	void (*set_cached_descriptor)(struct desc_struct *desc,
+				      int seg, struct kvm_vcpu *vcpu);
+	u16 (*get_segment_selector)(int seg, struct kvm_vcpu *vcpu);
+	void (*set_segment_selector)(u16 sel, int seg, struct kvm_vcpu *vcpu);
+	void (*get_gdt)(struct desc_ptr *dt, struct kvm_vcpu *vcpu);
 	ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu);
 	void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu);
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 41cf54c..f89502d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3077,6 +3077,18 @@ static int vcpu_mmio_read(struct kvm_vcpu *vcpu, gpa_t addr, int len, void *v)
 	return kvm_io_bus_read(vcpu->kvm, KVM_MMIO_BUS, addr, len, v);
 }
 
+static void kvm_set_segment(struct kvm_vcpu *vcpu,
+			struct kvm_segment *var, int seg)
+{
+	kvm_x86_ops->set_segment(vcpu, var, seg);
+}
+
+void kvm_get_segment(struct kvm_vcpu *vcpu,
+		     struct kvm_segment *var, int seg)
+{
+	kvm_x86_ops->get_segment(vcpu, var, seg);
+}
+
 gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva, u32 *error)
 {
 	u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
@@ -3157,14 +3169,18 @@ static int kvm_read_guest_virt_system(gva_t addr, void *val, unsigned int bytes,
 	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, 0, error);
 }
 
-static int kvm_write_guest_virt(gva_t addr, void *val, unsigned int bytes,
-				struct kvm_vcpu *vcpu, u32 *error)
+static int kvm_write_guest_virt_helper(gva_t addr, void *val,
+				       unsigned int bytes,
+				       struct kvm_vcpu *vcpu, u32 access,
+				       u32 *error)
 {
 	void *data = val;
 	int r = X86EMUL_CONTINUE;
 
+	access |= PFERR_WRITE_MASK;
+
 	while (bytes) {
-		gpa_t gpa = kvm_mmu_gva_to_gpa_write(vcpu, addr, error);
+		gpa_t gpa =  vcpu->arch.mmu.gva_to_gpa(vcpu, addr, access, error);
 		unsigned offset = addr & (PAGE_SIZE-1);
 		unsigned towrite = min(bytes, (unsigned)PAGE_SIZE - offset);
 		int ret;
@@ -3187,6 +3203,19 @@ out:
 	return r;
 }
 
+static int kvm_write_guest_virt(gva_t addr, void *val, unsigned int bytes,
+				struct kvm_vcpu *vcpu, u32 *error)
+{
+	u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
+	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu, access, error);
+}
+
+static int kvm_write_guest_virt_system(gva_t addr, void *val,
+				       unsigned int bytes,
+				       struct kvm_vcpu *vcpu, u32 *error)
+{
+	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu, 0, error);
+}
 
 static int emulator_read_emulated(unsigned long addr,
 				  void *val,
@@ -3453,12 +3482,95 @@ static void emulator_set_cr(int cr, unsigned long val, struct kvm_vcpu *vcpu)
 	}
 }
 
+static void emulator_get_gdt(struct desc_ptr *dt, struct kvm_vcpu *vcpu)
+{
+	kvm_x86_ops->get_gdt(vcpu, dt);
+}
+
+static bool emulator_get_cached_descriptor(struct desc_struct *desc, int seg,
+					   struct kvm_vcpu *vcpu)
+{
+	struct kvm_segment var;
+
+	kvm_get_segment(vcpu, &var, seg);
+
+	if (var.unusable)
+		return false;
+
+	if (var.g)
+		var.limit >>= 12;
+	set_desc_limit(desc, var.limit);
+	set_desc_base(desc, (unsigned long)var.base);
+	desc->type = var.type;
+	desc->s = var.s;
+	desc->dpl = var.dpl;
+	desc->p = var.present;
+	desc->avl = var.avl;
+	desc->l = var.l;
+	desc->d = var.db;
+	desc->g = var.g;
+
+	return true;
+}
+
+static void emulator_set_cached_descriptor(struct desc_struct *desc, int seg,
+					   struct kvm_vcpu *vcpu)
+{
+	struct kvm_segment var;
+
+	/* needed to preserve selector */
+	kvm_get_segment(vcpu, &var, seg);
+
+	var.base = get_desc_base(desc);
+	var.limit = get_desc_limit(desc);
+	if (desc->g)
+		var.limit = (var.limit << 12) | 0xfff;
+	var.type = desc->type;
+	var.present = desc->p;
+	var.dpl = desc->dpl;
+	var.db = desc->d;
+	var.s = desc->s;
+	var.l = desc->l;
+	var.g = desc->g;
+	var.avl = desc->avl;
+	var.present = desc->p;
+	var.unusable = !var.present;
+	var.padding = 0;
+
+	kvm_set_segment(vcpu, &var, seg);
+	return;
+}
+
+static u16 emulator_get_segment_selector(int seg, struct kvm_vcpu *vcpu)
+{
+	struct kvm_segment kvm_seg;
+
+	kvm_get_segment(vcpu, &kvm_seg, seg);
+	return kvm_seg.selector;
+}
+
+static void emulator_set_segment_selector(u16 sel, int seg,
+					  struct kvm_vcpu *vcpu)
+{
+	struct kvm_segment kvm_seg;
+
+	kvm_get_segment(vcpu, &kvm_seg, seg);
+	kvm_seg.selector = sel;
+	kvm_set_segment(vcpu, &kvm_seg, seg);
+}
+
 static struct x86_emulate_ops emulate_ops = {
 	.read_std            = kvm_read_guest_virt_system,
+	.write_std           = kvm_write_guest_virt_system,
 	.fetch               = kvm_fetch_guest_virt,
 	.read_emulated       = emulator_read_emulated,
 	.write_emulated      = emulator_write_emulated,
 	.cmpxchg_emulated    = emulator_cmpxchg_emulated,
+	.get_cached_descriptor = emulator_get_cached_descriptor,
+	.set_cached_descriptor = emulator_set_cached_descriptor,
+	.get_segment_selector = emulator_get_segment_selector,
+	.set_segment_selector = emulator_set_segment_selector,
+	.get_gdt             = emulator_get_gdt,
 	.get_cr              = emulator_get_cr,
 	.set_cr              = emulator_set_cr,
 };
@@ -4615,12 +4727,6 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 	return 0;
 }
 
-void kvm_get_segment(struct kvm_vcpu *vcpu,
-		     struct kvm_segment *var, int seg)
-{
-	kvm_x86_ops->get_segment(vcpu, var, seg);
-}
-
 void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
 {
 	struct kvm_segment cs;
@@ -4692,12 +4798,6 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
-static void kvm_set_segment(struct kvm_vcpu *vcpu,
-			struct kvm_segment *var, int seg)
-{
-	kvm_x86_ops->set_segment(vcpu, var, seg);
-}
-
 static void seg_desct_to_kvm_desct(struct desc_struct *seg_desc, u16 selector,
 				   struct kvm_segment *kvm_desct)
 {
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 16/24] KVM: x86 emulator: Emulate task switch in emulator.c
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (14 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 17/24] KVM: x86 emulator: Use load_segment_descriptor() instead of kvm_load_segment_descriptor() Gleb Natapov
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Implement emulation of 16/32 bit task switch in emulator.c

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |    5 +
 arch/x86/kvm/emulate.c             |  564 ++++++++++++++++++++++++++++++++++++
 2 files changed, 569 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index e881618..4268330 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -11,6 +11,8 @@
 #ifndef _ASM_X86_KVM_X86_EMULATE_H
 #define _ASM_X86_KVM_X86_EMULATE_H
 
+#include <asm/desc_defs.h>
+
 struct x86_emulate_ctxt;
 
 /*
@@ -210,5 +212,8 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt,
 		    struct x86_emulate_ops *ops);
 int x86_emulate_insn(struct x86_emulate_ctxt *ctxt,
 		     struct x86_emulate_ops *ops);
+int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
+			 struct x86_emulate_ops *ops,
+			 u16 tss_selector, int reason);
 
 #endif /* _ASM_X86_KVM_X86_EMULATE_H */
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 6e2b34b..094d17c 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -33,6 +33,7 @@
 #include <asm/kvm_emulate.h>
 
 #include "x86.h"
+#include "tss.h"
 
 /*
  * Opcode effective-address decode tables.
@@ -1218,6 +1219,199 @@ done:
 	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
 }
 
+static u32 desc_limit_scaled(struct desc_struct *desc)
+{
+	u32 limit = get_desc_limit(desc);
+
+	return desc->g ? (limit << 12) | 0xfff : limit;
+}
+
+static void get_descriptor_table_ptr(struct x86_emulate_ctxt *ctxt,
+				     struct x86_emulate_ops *ops,
+				     u16 selector, struct desc_ptr *dt)
+{
+	if (selector & 1 << 2) {
+		struct desc_struct desc;
+		memset (dt, 0, sizeof *dt);
+		if(!ops->get_cached_descriptor(&desc, VCPU_SREG_LDTR, ctxt->vcpu))
+			return;
+
+		dt->size = desc_limit_scaled(&desc); /* what if limit > 65535? */
+		dt->address = get_desc_base(&desc);
+	}
+	else
+		ops->get_gdt(dt, ctxt->vcpu);
+}
+
+/* allowed just for 8 bytes segments */
+static int read_segment_descriptor(struct x86_emulate_ctxt *ctxt,
+				   struct x86_emulate_ops *ops,
+				   u16 selector, struct desc_struct *desc)
+{
+	struct desc_ptr dt;
+	u16 index = selector >> 3;
+	int ret;
+	u32 err;
+	ulong addr;
+
+	get_descriptor_table_ptr(ctxt, ops, selector, &dt);
+
+	if (dt.size < index * 8 + 7) {
+		kvm_inject_gp(ctxt->vcpu, selector & 0xfffc);
+		return X86EMUL_PROPAGATE_FAULT;
+	}
+	addr = dt.address + index * 8;
+	ret = ops->read_std(addr, desc, sizeof *desc, ctxt->vcpu,  &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT)
+		kvm_inject_page_fault(ctxt->vcpu, addr, err);
+
+       return ret;
+}
+
+/* allowed just for 8 bytes segments */
+static int write_segment_descriptor(struct x86_emulate_ctxt *ctxt,
+				    struct x86_emulate_ops *ops,
+				    u16 selector, struct desc_struct *desc)
+{
+	struct desc_ptr dt;
+	u16 index = selector >> 3;
+	u32 err;
+	ulong addr;
+	int ret;
+
+	get_descriptor_table_ptr(ctxt, ops, selector, &dt);
+
+	if (dt.size < index * 8 + 7) {
+		kvm_inject_gp(ctxt->vcpu, selector & 0xfffc);
+		return X86EMUL_PROPAGATE_FAULT;
+	}
+
+	addr = dt.address + index * 8;
+	ret = ops->write_std(addr, desc, sizeof *desc, ctxt->vcpu, &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT)
+		kvm_inject_page_fault(ctxt->vcpu, addr, err);
+
+	return ret;
+}
+
+static int load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
+				   struct x86_emulate_ops *ops,
+				   u16 selector, int seg)
+{
+	struct desc_struct seg_desc;
+	u8 dpl, rpl, cpl;
+	unsigned err_vec = GP_VECTOR;
+	u32 err_code = 0;
+	bool null_selector = !(selector & ~0x3); /* 0000-0003 are null */
+	int ret;
+
+	memset(&seg_desc, 0, sizeof seg_desc);
+
+	if ((seg <= VCPU_SREG_GS && ctxt->mode == X86EMUL_MODE_VM86)
+	    || ctxt->mode == X86EMUL_MODE_REAL) {
+		/* set real mode segment descriptor */
+		set_desc_base(&seg_desc, selector << 4);
+		set_desc_limit(&seg_desc, 0xffff);
+		seg_desc.type = 3;
+		seg_desc.p = 1;
+		seg_desc.s = 1;
+		goto load;
+	}
+
+	/* NULL selector is not valid for TR, CS and SS */
+	if ((seg == VCPU_SREG_CS || seg == VCPU_SREG_SS || seg == VCPU_SREG_TR)
+	    && null_selector)
+		goto exception;
+
+	/* TR should be in GDT only */
+	if (seg == VCPU_SREG_TR && (selector & (1 << 2)))
+		goto exception;
+
+	if (null_selector) /* for NULL selector skip all following checks */
+		goto load;
+
+	ret = read_segment_descriptor(ctxt, ops, selector, &seg_desc);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+
+	err_code = selector & 0xfffc;
+	err_vec = GP_VECTOR;
+
+	/* can't load system descriptor into segment selecor */
+	if (seg <= VCPU_SREG_GS && !seg_desc.s)
+		goto exception;
+
+	if (!seg_desc.p) {
+		err_vec = (seg == VCPU_SREG_SS) ? SS_VECTOR : NP_VECTOR;
+		goto exception;
+	}
+
+	rpl = selector & 3;
+	dpl = seg_desc.dpl;
+	cpl = ctxt->cpl;
+
+	switch (seg) {
+	case VCPU_SREG_SS:
+		/*
+		 * segment is not a writable data segment or segment
+		 * selector's RPL != CPL or segment selector's RPL != CPL
+		 */
+		if (rpl != cpl || (seg_desc.type & 0xa) != 0x2 || dpl != cpl)
+			goto exception;
+		break;
+	case VCPU_SREG_CS:
+		if (!(seg_desc.type & 8))
+			goto exception;
+
+		if (seg_desc.type & 4) {
+			/* conforming */
+			if (dpl > cpl)
+				goto exception;
+		} else {
+			/* nonconforming */
+			if (rpl > cpl || dpl != cpl)
+				goto exception;
+		}
+		/* CS(RPL) <- CPL */
+		selector = (selector & 0xfffc) | cpl;
+            break;
+	case VCPU_SREG_TR:
+		if (seg_desc.s || (seg_desc.type != 1 && seg_desc.type != 9))
+			goto exception;
+		break;
+	case VCPU_SREG_LDTR:
+		if (seg_desc.s || seg_desc.type != 2)
+			goto exception;
+		break;
+	default: /*  DS, ES, FS, or GS */
+		/*
+		 * segment is not a data or readable code segment or
+		 * ((segment is a data or nonconforming code segment)
+		 * and (both RPL and CPL > DPL))
+		 */
+		if ((seg_desc.type & 0xa) == 0x8 ||
+		    (((seg_desc.type & 0xc) != 0xc) &&
+		     (rpl > dpl && cpl > dpl)))
+			goto exception;
+		break;
+	}
+
+	if (seg_desc.s) {
+		/* mark segment as accessed */
+		seg_desc.type |= 1;
+		ret = write_segment_descriptor(ctxt, ops, selector, &seg_desc);
+		if (ret != X86EMUL_CONTINUE)
+			return ret;
+	}
+load:
+	ops->set_segment_selector(selector, seg, ctxt->vcpu);
+	ops->set_cached_descriptor(&seg_desc, seg, ctxt->vcpu);
+	return X86EMUL_CONTINUE;
+exception:
+	kvm_queue_exception_e(ctxt->vcpu, err_vec, err_code);
+	return X86EMUL_PROPAGATE_FAULT;
+}
+
 static inline void emulate_push(struct x86_emulate_ctxt *ctxt)
 {
 	struct decode_cache *c = &ctxt->decode;
@@ -1808,6 +2002,376 @@ static bool emulator_io_permited(struct x86_emulate_ctxt *ctxt,
 	return true;
 }
 
+static u32 get_cached_descriptor_base(struct x86_emulate_ctxt *ctxt,
+				      struct x86_emulate_ops *ops,
+				      int seg)
+{
+	struct desc_struct desc;
+	if (ops->get_cached_descriptor(&desc, seg, ctxt->vcpu))
+		return get_desc_base(&desc);
+	else
+		return ~0;
+}
+
+static void save_state_to_tss16(struct x86_emulate_ctxt *ctxt,
+				struct x86_emulate_ops *ops,
+				struct tss_segment_16 *tss)
+{
+	struct decode_cache *c = &ctxt->decode;
+
+	tss->ip = c->eip;
+	tss->flag = ctxt->eflags;
+	tss->ax = c->regs[VCPU_REGS_RAX];
+	tss->cx = c->regs[VCPU_REGS_RCX];
+	tss->dx = c->regs[VCPU_REGS_RDX];
+	tss->bx = c->regs[VCPU_REGS_RBX];
+	tss->sp = c->regs[VCPU_REGS_RSP];
+	tss->bp = c->regs[VCPU_REGS_RBP];
+	tss->si = c->regs[VCPU_REGS_RSI];
+	tss->di = c->regs[VCPU_REGS_RDI];
+
+	tss->es = ops->get_segment_selector(VCPU_SREG_ES, ctxt->vcpu);
+	tss->cs = ops->get_segment_selector(VCPU_SREG_CS, ctxt->vcpu);
+	tss->ss = ops->get_segment_selector(VCPU_SREG_SS, ctxt->vcpu);
+	tss->ds = ops->get_segment_selector(VCPU_SREG_DS, ctxt->vcpu);
+	tss->ldt = ops->get_segment_selector(VCPU_SREG_LDTR, ctxt->vcpu);
+}
+
+static int load_state_from_tss16(struct x86_emulate_ctxt *ctxt,
+				 struct x86_emulate_ops *ops,
+				 struct tss_segment_16 *tss)
+{
+	struct decode_cache *c = &ctxt->decode;
+	int ret;
+
+	c->eip = tss->ip;
+	ctxt->eflags = tss->flag | 2;
+	c->regs[VCPU_REGS_RAX] = tss->ax;
+	c->regs[VCPU_REGS_RCX] = tss->cx;
+	c->regs[VCPU_REGS_RDX] = tss->dx;
+	c->regs[VCPU_REGS_RBX] = tss->bx;
+	c->regs[VCPU_REGS_RSP] = tss->sp;
+	c->regs[VCPU_REGS_RBP] = tss->bp;
+	c->regs[VCPU_REGS_RSI] = tss->si;
+	c->regs[VCPU_REGS_RDI] = tss->di;
+
+	/*
+	 * SDM says that segment selectors are loaded before segment
+	 * descriptors
+	 */
+	ops->set_segment_selector(tss->ldt, VCPU_SREG_LDTR, ctxt->vcpu);
+	ops->set_segment_selector(tss->es, VCPU_SREG_ES, ctxt->vcpu);
+	ops->set_segment_selector(tss->cs, VCPU_SREG_CS, ctxt->vcpu);
+	ops->set_segment_selector(tss->ss, VCPU_SREG_SS, ctxt->vcpu);
+	ops->set_segment_selector(tss->ds, VCPU_SREG_DS, ctxt->vcpu);
+
+	/*
+	 * Now load segment descriptors. If fault happenes at this stage
+	 * it is handled in a context of new task
+	 */
+	ret = load_segment_descriptor(ctxt, ops, tss->ldt, VCPU_SREG_LDTR);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->es, VCPU_SREG_ES);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->cs, VCPU_SREG_CS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->ss, VCPU_SREG_SS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->ds, VCPU_SREG_DS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+
+	return X86EMUL_CONTINUE;
+}
+
+static int task_switch_16(struct x86_emulate_ctxt *ctxt,
+			  struct x86_emulate_ops *ops,
+			  u16 tss_selector, u16 old_tss_sel,
+			  ulong old_tss_base, struct desc_struct *new_desc)
+{
+	struct tss_segment_16 tss_seg;
+	int ret;
+	u32 err, new_tss_base = get_desc_base(new_desc);
+
+	ret = ops->read_std(old_tss_base, &tss_seg, sizeof tss_seg, ctxt->vcpu,
+			    &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT) {
+		/* FIXME: need to provide precise fault address */
+		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		return ret;
+	}
+
+	save_state_to_tss16(ctxt, ops, &tss_seg);
+
+	ret = ops->write_std(old_tss_base, &tss_seg, sizeof tss_seg, ctxt->vcpu,
+			     &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT) {
+		/* FIXME: need to provide precise fault address */
+		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		return ret;
+	}
+
+	ret = ops->read_std(new_tss_base, &tss_seg, sizeof tss_seg, ctxt->vcpu,
+			    &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT) {
+		/* FIXME: need to provide precise fault address */
+		kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+		return ret;
+	}
+
+	if (old_tss_sel != 0xffff) {
+		tss_seg.prev_task_link = old_tss_sel;
+
+		ret = ops->write_std(new_tss_base,
+				     &tss_seg.prev_task_link,
+				     sizeof tss_seg.prev_task_link,
+				     ctxt->vcpu, &err);
+		if (ret == X86EMUL_PROPAGATE_FAULT) {
+			/* FIXME: need to provide precise fault address */
+			kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+			return ret;
+		}
+	}
+
+	return load_state_from_tss16(ctxt, ops, &tss_seg);
+}
+
+static void save_state_to_tss32(struct x86_emulate_ctxt *ctxt,
+				struct x86_emulate_ops *ops,
+				struct tss_segment_32 *tss)
+{
+	struct decode_cache *c = &ctxt->decode;
+
+	tss->cr3 = ops->get_cr(3, ctxt->vcpu);
+	tss->eip = c->eip;
+	tss->eflags = ctxt->eflags;
+	tss->eax = c->regs[VCPU_REGS_RAX];
+	tss->ecx = c->regs[VCPU_REGS_RCX];
+	tss->edx = c->regs[VCPU_REGS_RDX];
+	tss->ebx = c->regs[VCPU_REGS_RBX];
+	tss->esp = c->regs[VCPU_REGS_RSP];
+	tss->ebp = c->regs[VCPU_REGS_RBP];
+	tss->esi = c->regs[VCPU_REGS_RSI];
+	tss->edi = c->regs[VCPU_REGS_RDI];
+
+	tss->es = ops->get_segment_selector(VCPU_SREG_ES, ctxt->vcpu);
+	tss->cs = ops->get_segment_selector(VCPU_SREG_CS, ctxt->vcpu);
+	tss->ss = ops->get_segment_selector(VCPU_SREG_SS, ctxt->vcpu);
+	tss->ds = ops->get_segment_selector(VCPU_SREG_DS, ctxt->vcpu);
+	tss->fs = ops->get_segment_selector(VCPU_SREG_FS, ctxt->vcpu);
+	tss->gs = ops->get_segment_selector(VCPU_SREG_GS, ctxt->vcpu);
+	tss->ldt_selector = ops->get_segment_selector(VCPU_SREG_LDTR, ctxt->vcpu);
+}
+
+static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt,
+				 struct x86_emulate_ops *ops,
+				 struct tss_segment_32 *tss)
+{
+	struct decode_cache *c = &ctxt->decode;
+	int ret;
+
+	ops->set_cr(3, tss->cr3, ctxt->vcpu);
+	c->eip = tss->eip;
+	ctxt->eflags = tss->eflags | 2;
+	c->regs[VCPU_REGS_RAX] = tss->eax;
+	c->regs[VCPU_REGS_RCX] = tss->ecx;
+	c->regs[VCPU_REGS_RDX] = tss->edx;
+	c->regs[VCPU_REGS_RBX] = tss->ebx;
+	c->regs[VCPU_REGS_RSP] = tss->esp;
+	c->regs[VCPU_REGS_RBP] = tss->ebp;
+	c->regs[VCPU_REGS_RSI] = tss->esi;
+	c->regs[VCPU_REGS_RDI] = tss->edi;
+
+	/*
+	 * SDM says that segment selectors are loaded before segment
+	 * descriptors
+	 */
+	ops->set_segment_selector(tss->ldt_selector, VCPU_SREG_LDTR, ctxt->vcpu);
+	ops->set_segment_selector(tss->es, VCPU_SREG_ES, ctxt->vcpu);
+	ops->set_segment_selector(tss->cs, VCPU_SREG_CS, ctxt->vcpu);
+	ops->set_segment_selector(tss->ss, VCPU_SREG_SS, ctxt->vcpu);
+	ops->set_segment_selector(tss->ds, VCPU_SREG_DS, ctxt->vcpu);
+	ops->set_segment_selector(tss->fs, VCPU_SREG_FS, ctxt->vcpu);
+	ops->set_segment_selector(tss->gs, VCPU_SREG_GS, ctxt->vcpu);
+
+	/*
+	 * Now load segment descriptors. If fault happenes at this stage
+	 * it is handled in a context of new task
+	 */
+	ret = load_segment_descriptor(ctxt, ops, tss->ldt_selector, VCPU_SREG_LDTR);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->es, VCPU_SREG_ES);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->cs, VCPU_SREG_CS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->ss, VCPU_SREG_SS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->ds, VCPU_SREG_DS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->fs, VCPU_SREG_FS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = load_segment_descriptor(ctxt, ops, tss->gs, VCPU_SREG_GS);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+
+	return X86EMUL_CONTINUE;
+}
+
+static int task_switch_32(struct x86_emulate_ctxt *ctxt,
+			  struct x86_emulate_ops *ops,
+			  u16 tss_selector, u16 old_tss_sel,
+			  ulong old_tss_base, struct desc_struct *new_desc)
+{
+	struct tss_segment_32 tss_seg;
+	int ret;
+	u32 err, new_tss_base = get_desc_base(new_desc);
+
+	ret = ops->read_std(old_tss_base, &tss_seg, sizeof tss_seg, ctxt->vcpu,
+			    &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT) {
+		/* FIXME: need to provide precise fault address */
+		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		return ret;
+	}
+
+	save_state_to_tss32(ctxt, ops, &tss_seg);
+
+	ret = ops->write_std(old_tss_base, &tss_seg, sizeof tss_seg, ctxt->vcpu,
+			     &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT) {
+		/* FIXME: need to provide precise fault address */
+		kvm_inject_page_fault(ctxt->vcpu, old_tss_base, err);
+		return ret;
+	}
+
+	ret = ops->read_std(new_tss_base, &tss_seg, sizeof tss_seg, ctxt->vcpu,
+			    &err);
+	if (ret == X86EMUL_PROPAGATE_FAULT) {
+		/* FIXME: need to provide precise fault address */
+		kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+		return ret;
+	}
+
+	if (old_tss_sel != 0xffff) {
+		tss_seg.prev_task_link = old_tss_sel;
+
+		ret = ops->write_std(new_tss_base,
+				     &tss_seg.prev_task_link,
+				     sizeof tss_seg.prev_task_link,
+				     ctxt->vcpu, &err);
+		if (ret == X86EMUL_PROPAGATE_FAULT) {
+			/* FIXME: need to provide precise fault address */
+			kvm_inject_page_fault(ctxt->vcpu, new_tss_base, err);
+			return ret;
+		}
+	}
+
+	return load_state_from_tss32(ctxt, ops, &tss_seg);
+}
+
+static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt,
+				    struct x86_emulate_ops *ops,
+				    u16 tss_selector, int reason)
+{
+	struct desc_struct curr_tss_desc, next_tss_desc;
+	int ret;
+	u16 old_tss_sel = ops->get_segment_selector(VCPU_SREG_TR, ctxt->vcpu);
+	ulong old_tss_base =
+		get_cached_descriptor_base(ctxt, ops, VCPU_SREG_TR);
+
+	/* FIXME: old_tss_base == ~0 ? */
+
+	ret = read_segment_descriptor(ctxt, ops, tss_selector, &next_tss_desc);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+	ret = read_segment_descriptor(ctxt, ops, old_tss_sel, &curr_tss_desc);
+	if (ret != X86EMUL_CONTINUE)
+		return ret;
+
+	/* FIXME: check that next_tss_desc is tss */
+
+	if (reason != TASK_SWITCH_IRET) {
+		if ((tss_selector & 3) > next_tss_desc.dpl ||
+		    ctxt->cpl > next_tss_desc.dpl) {
+			kvm_inject_gp(ctxt->vcpu, 0);
+			return X86EMUL_PROPAGATE_FAULT;
+		}
+	}
+
+	if (!next_tss_desc.p || desc_limit_scaled(&next_tss_desc) < 0x67) {
+		kvm_queue_exception_e(ctxt->vcpu, TS_VECTOR,
+				      tss_selector & 0xfffc);
+		return X86EMUL_PROPAGATE_FAULT;
+	}
+
+	if (reason == TASK_SWITCH_IRET || reason == TASK_SWITCH_JMP) {
+		curr_tss_desc.type &= ~(1 << 1); /* clear busy flag */
+		write_segment_descriptor(ctxt, ops, old_tss_sel,
+					 &curr_tss_desc);
+	}
+
+	if (reason == TASK_SWITCH_IRET)
+		ctxt->eflags = ctxt->eflags & ~X86_EFLAGS_NT;
+
+	/* set back link to prev task only if NT bit is set in eflags
+	   note that old_tss_sel is not used afetr this point */
+	if (reason != TASK_SWITCH_CALL && reason != TASK_SWITCH_GATE)
+		old_tss_sel = 0xffff;
+
+	if (next_tss_desc.type & 8)
+		ret = task_switch_32(ctxt, ops, tss_selector, old_tss_sel,
+				     old_tss_base, &next_tss_desc);
+	else
+		ret = task_switch_16(ctxt, ops, tss_selector, old_tss_sel,
+				     old_tss_base, &next_tss_desc);
+
+	if (reason == TASK_SWITCH_CALL || reason == TASK_SWITCH_GATE)
+		ctxt->eflags = ctxt->eflags | X86_EFLAGS_NT;
+
+	if (reason != TASK_SWITCH_IRET) {
+		next_tss_desc.type |= (1 << 1); /* set busy flag */
+		write_segment_descriptor(ctxt, ops, tss_selector,
+					 &next_tss_desc);
+	}
+
+	ops->set_cr(0,  ops->get_cr(0, ctxt->vcpu) | X86_CR0_TS, ctxt->vcpu);
+	ops->set_cached_descriptor(&next_tss_desc, VCPU_SREG_TR, ctxt->vcpu);
+	ops->set_segment_selector(tss_selector, VCPU_SREG_TR, ctxt->vcpu);
+
+	return ret;
+}
+
+int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
+			 struct x86_emulate_ops *ops,
+			 u16 tss_selector, int reason)
+{
+	struct decode_cache *c = &ctxt->decode;
+	int rc;
+
+	memset(c, 0, sizeof(struct decode_cache));
+	c->eip = ctxt->eip;
+	memcpy(c->regs, ctxt->vcpu->arch.regs, sizeof c->regs);
+
+	rc = emulator_do_task_switch(ctxt, ops, tss_selector, reason);
+
+	if (rc == X86EMUL_CONTINUE) {
+		memcpy(ctxt->vcpu->arch.regs, c->regs, sizeof c->regs);
+		kvm_rip_write(ctxt->vcpu, c->eip);
+	}
+
+	return rc;
+}
+
 int
 x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 17/24] KVM: x86 emulator: Use load_segment_descriptor() instead of kvm_load_segment_descriptor()
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (15 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 16/24] KVM: x86 emulator: Emulate task switch in emulator.c Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 18/24] KVM: Use task switch from emulator.c Gleb Natapov
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm


Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 094d17c..81ecf47 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1506,7 +1506,7 @@ static int emulate_pop_sreg(struct x86_emulate_ctxt *ctxt,
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	rc = kvm_load_segment_descriptor(ctxt->vcpu, (u16)selector, seg);
+	rc = load_segment_descriptor(ctxt, ops, (u16)selector, seg);
 	return rc;
 }
 
@@ -1681,7 +1681,7 @@ static int emulate_ret_far(struct x86_emulate_ctxt *ctxt,
 	rc = emulate_pop(ctxt, ops, &cs, c->op_bytes);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
-	rc = kvm_load_segment_descriptor(ctxt->vcpu, (u16)cs, VCPU_SREG_CS);
+	rc = load_segment_descriptor(ctxt, ops, (u16)cs, VCPU_SREG_CS);
 	return rc;
 }
 
@@ -2714,7 +2714,7 @@ special_insn:
 		if (c->modrm_reg == VCPU_SREG_SS)
 			toggle_interruptibility(ctxt, KVM_X86_SHADOW_INT_MOV_SS);
 
-		rc = kvm_load_segment_descriptor(ctxt->vcpu, sel, c->modrm_reg);
+		rc = load_segment_descriptor(ctxt, ops, sel, c->modrm_reg);
 
 		c->dst.type = OP_NONE;  /* Disable writeback. */
 		break;
@@ -2889,8 +2889,8 @@ special_insn:
 		goto jmp;
 	case 0xea: /* jmp far */
 	jump_far:
-		if (kvm_load_segment_descriptor(ctxt->vcpu, c->src2.val,
-						VCPU_SREG_CS))
+		if (load_segment_descriptor(ctxt, ops, c->src2.val,
+					    VCPU_SREG_CS))
 			goto done;
 
 		c->eip = c->src.val;
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 18/24] KVM: Use task switch from emulator.c
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (16 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 17/24] KVM: x86 emulator: Use load_segment_descriptor() instead of kvm_load_segment_descriptor() Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 19/24] KVM: x86 emulator: fix in/out emulation Gleb Natapov
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Remove old task switch code from x86.c

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/x86.c |  558 ++--------------------------------------------------
 1 files changed, 18 insertions(+), 540 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f89502d..5171696 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4798,553 +4798,31 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
-static void seg_desct_to_kvm_desct(struct desc_struct *seg_desc, u16 selector,
-				   struct kvm_segment *kvm_desct)
-{
-	kvm_desct->base = get_desc_base(seg_desc);
-	kvm_desct->limit = get_desc_limit(seg_desc);
-	if (seg_desc->g) {
-		kvm_desct->limit <<= 12;
-		kvm_desct->limit |= 0xfff;
-	}
-	kvm_desct->selector = selector;
-	kvm_desct->type = seg_desc->type;
-	kvm_desct->present = seg_desc->p;
-	kvm_desct->dpl = seg_desc->dpl;
-	kvm_desct->db = seg_desc->d;
-	kvm_desct->s = seg_desc->s;
-	kvm_desct->l = seg_desc->l;
-	kvm_desct->g = seg_desc->g;
-	kvm_desct->avl = seg_desc->avl;
-	if (!selector)
-		kvm_desct->unusable = 1;
-	else
-		kvm_desct->unusable = 0;
-	kvm_desct->padding = 0;
-}
-
-static void get_segment_descriptor_dtable(struct kvm_vcpu *vcpu,
-					  u16 selector,
-					  struct desc_ptr *dtable)
-{
-	if (selector & 1 << 2) {
-		struct kvm_segment kvm_seg;
-
-		kvm_get_segment(vcpu, &kvm_seg, VCPU_SREG_LDTR);
-
-		if (kvm_seg.unusable)
-			dtable->size = 0;
-		else
-			dtable->size = kvm_seg.limit;
-		dtable->address = kvm_seg.base;
-	}
-	else
-		kvm_x86_ops->get_gdt(vcpu, dtable);
-}
-
-/* allowed just for 8 bytes segments */
-static int load_guest_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
-					 struct desc_struct *seg_desc)
-{
-	struct desc_ptr dtable;
-	u16 index = selector >> 3;
-	int ret;
-	u32 err;
-	gva_t addr;
-
-	get_segment_descriptor_dtable(vcpu, selector, &dtable);
-
-	if (dtable.size < index * 8 + 7) {
-		kvm_queue_exception_e(vcpu, GP_VECTOR, selector & 0xfffc);
-		return X86EMUL_PROPAGATE_FAULT;
-	}
-	addr = dtable.address + index * 8;
-	ret = kvm_read_guest_virt_system(addr, seg_desc, sizeof(*seg_desc),
-					 vcpu,  &err);
-	if (ret == X86EMUL_PROPAGATE_FAULT)
-		kvm_inject_page_fault(vcpu, addr, err);
-
-       return ret;
-}
-
-/* allowed just for 8 bytes segments */
-static int save_guest_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
-					 struct desc_struct *seg_desc)
-{
-	struct desc_ptr dtable;
-	u16 index = selector >> 3;
-
-	get_segment_descriptor_dtable(vcpu, selector, &dtable);
-
-	if (dtable.size < index * 8 + 7)
-		return 1;
-	return kvm_write_guest_virt(dtable.address + index*8, seg_desc, sizeof(*seg_desc), vcpu, NULL);
-}
-
-static gpa_t get_tss_base_addr_write(struct kvm_vcpu *vcpu,
-			       struct desc_struct *seg_desc)
-{
-	u32 base_addr = get_desc_base(seg_desc);
-
-	return kvm_mmu_gva_to_gpa_write(vcpu, base_addr, NULL);
-}
-
-static gpa_t get_tss_base_addr_read(struct kvm_vcpu *vcpu,
-			     struct desc_struct *seg_desc)
-{
-	u32 base_addr = get_desc_base(seg_desc);
-
-	return kvm_mmu_gva_to_gpa_read(vcpu, base_addr, NULL);
-}
-
-static u16 get_segment_selector(struct kvm_vcpu *vcpu, int seg)
-{
-	struct kvm_segment kvm_seg;
-
-	kvm_get_segment(vcpu, &kvm_seg, seg);
-	return kvm_seg.selector;
-}
-
-static int kvm_load_realmode_segment(struct kvm_vcpu *vcpu, u16 selector, int seg)
-{
-	struct kvm_segment segvar = {
-		.base = selector << 4,
-		.limit = 0xffff,
-		.selector = selector,
-		.type = 3,
-		.present = 1,
-		.dpl = 3,
-		.db = 0,
-		.s = 1,
-		.l = 0,
-		.g = 0,
-		.avl = 0,
-		.unusable = 0,
-	};
-	kvm_x86_ops->set_segment(vcpu, &segvar, seg);
-	return X86EMUL_CONTINUE;
-}
-
-static int is_vm86_segment(struct kvm_vcpu *vcpu, int seg)
-{
-	return (seg != VCPU_SREG_LDTR) &&
-		(seg != VCPU_SREG_TR) &&
-		(kvm_get_rflags(vcpu) & X86_EFLAGS_VM);
-}
-
-int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, int seg)
-{
-	struct kvm_segment kvm_seg;
-	struct desc_struct seg_desc;
-	u8 dpl, rpl, cpl;
-	unsigned err_vec = GP_VECTOR;
-	u32 err_code = 0;
-	bool null_selector = !(selector & ~0x3); /* 0000-0003 are null */
-	int ret;
-
-	if (is_vm86_segment(vcpu, seg) || !is_protmode(vcpu))
-		return kvm_load_realmode_segment(vcpu, selector, seg);
-
-	/* NULL selector is not valid for TR, CS and SS */
-	if ((seg == VCPU_SREG_CS || seg == VCPU_SREG_SS || seg == VCPU_SREG_TR)
-	    && null_selector)
-		goto exception;
-
-	/* TR should be in GDT only */
-	if (seg == VCPU_SREG_TR && (selector & (1 << 2)))
-		goto exception;
-
-	ret = load_guest_segment_descriptor(vcpu, selector, &seg_desc);
-	if (ret)
-		return ret;
-
-	seg_desct_to_kvm_desct(&seg_desc, selector, &kvm_seg);
-
-	if (null_selector) { /* for NULL selector skip all following checks */
-		kvm_seg.unusable = 1;
-		goto load;
-	}
-
-	err_code = selector & 0xfffc;
-	err_vec = GP_VECTOR;
-
-	/* can't load system descriptor into segment selecor */
-	if (seg <= VCPU_SREG_GS && !kvm_seg.s)
-		goto exception;
-
-	if (!kvm_seg.present) {
-		err_vec = (seg == VCPU_SREG_SS) ? SS_VECTOR : NP_VECTOR;
-		goto exception;
-	}
-
-	rpl = selector & 3;
-	dpl = kvm_seg.dpl;
-	cpl = kvm_x86_ops->get_cpl(vcpu);
-
-	switch (seg) {
-	case VCPU_SREG_SS:
-		/*
-		 * segment is not a writable data segment or segment
-		 * selector's RPL != CPL or segment selector's RPL != CPL
-		 */
-		if (rpl != cpl || (kvm_seg.type & 0xa) != 0x2 || dpl != cpl)
-			goto exception;
-		break;
-	case VCPU_SREG_CS:
-		if (!(kvm_seg.type & 8))
-			goto exception;
-
-		if (kvm_seg.type & 4) {
-			/* conforming */
-			if (dpl > cpl)
-				goto exception;
-		} else {
-			/* nonconforming */
-			if (rpl > cpl || dpl != cpl)
-				goto exception;
-		}
-		/* CS(RPL) <- CPL */
-		selector = (selector & 0xfffc) | cpl;
-            break;
-	case VCPU_SREG_TR:
-		if (kvm_seg.s || (kvm_seg.type != 1 && kvm_seg.type != 9))
-			goto exception;
-		break;
-	case VCPU_SREG_LDTR:
-		if (kvm_seg.s || kvm_seg.type != 2)
-			goto exception;
-		break;
-	default: /*  DS, ES, FS, or GS */
-		/*
-		 * segment is not a data or readable code segment or
-		 * ((segment is a data or nonconforming code segment)
-		 * and (both RPL and CPL > DPL))
-		 */
-		if ((kvm_seg.type & 0xa) == 0x8 ||
-		    (((kvm_seg.type & 0xc) != 0xc) && (rpl > dpl && cpl > dpl)))
-			goto exception;
-		break;
-	}
-
-	if (!kvm_seg.unusable && kvm_seg.s) {
-		/* mark segment as accessed */
-		kvm_seg.type |= 1;
-		seg_desc.type |= 1;
-		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
-	}
-load:
-	kvm_set_segment(vcpu, &kvm_seg, seg);
-	return X86EMUL_CONTINUE;
-exception:
-	kvm_queue_exception_e(vcpu, err_vec, err_code);
-	return X86EMUL_PROPAGATE_FAULT;
-}
-
-static void save_state_to_tss32(struct kvm_vcpu *vcpu,
-				struct tss_segment_32 *tss)
-{
-	tss->cr3 = vcpu->arch.cr3;
-	tss->eip = kvm_rip_read(vcpu);
-	tss->eflags = kvm_get_rflags(vcpu);
-	tss->eax = kvm_register_read(vcpu, VCPU_REGS_RAX);
-	tss->ecx = kvm_register_read(vcpu, VCPU_REGS_RCX);
-	tss->edx = kvm_register_read(vcpu, VCPU_REGS_RDX);
-	tss->ebx = kvm_register_read(vcpu, VCPU_REGS_RBX);
-	tss->esp = kvm_register_read(vcpu, VCPU_REGS_RSP);
-	tss->ebp = kvm_register_read(vcpu, VCPU_REGS_RBP);
-	tss->esi = kvm_register_read(vcpu, VCPU_REGS_RSI);
-	tss->edi = kvm_register_read(vcpu, VCPU_REGS_RDI);
-	tss->es = get_segment_selector(vcpu, VCPU_SREG_ES);
-	tss->cs = get_segment_selector(vcpu, VCPU_SREG_CS);
-	tss->ss = get_segment_selector(vcpu, VCPU_SREG_SS);
-	tss->ds = get_segment_selector(vcpu, VCPU_SREG_DS);
-	tss->fs = get_segment_selector(vcpu, VCPU_SREG_FS);
-	tss->gs = get_segment_selector(vcpu, VCPU_SREG_GS);
-	tss->ldt_selector = get_segment_selector(vcpu, VCPU_SREG_LDTR);
-}
-
-static void kvm_load_segment_selector(struct kvm_vcpu *vcpu, u16 sel, int seg)
-{
-	struct kvm_segment kvm_seg;
-	kvm_get_segment(vcpu, &kvm_seg, seg);
-	kvm_seg.selector = sel;
-	kvm_set_segment(vcpu, &kvm_seg, seg);
-}
-
-static int load_state_from_tss32(struct kvm_vcpu *vcpu,
-				  struct tss_segment_32 *tss)
-{
-	kvm_set_cr3(vcpu, tss->cr3);
-
-	kvm_rip_write(vcpu, tss->eip);
-	kvm_set_rflags(vcpu, tss->eflags | 2);
-
-	kvm_register_write(vcpu, VCPU_REGS_RAX, tss->eax);
-	kvm_register_write(vcpu, VCPU_REGS_RCX, tss->ecx);
-	kvm_register_write(vcpu, VCPU_REGS_RDX, tss->edx);
-	kvm_register_write(vcpu, VCPU_REGS_RBX, tss->ebx);
-	kvm_register_write(vcpu, VCPU_REGS_RSP, tss->esp);
-	kvm_register_write(vcpu, VCPU_REGS_RBP, tss->ebp);
-	kvm_register_write(vcpu, VCPU_REGS_RSI, tss->esi);
-	kvm_register_write(vcpu, VCPU_REGS_RDI, tss->edi);
-
-	/*
-	 * SDM says that segment selectors are loaded before segment
-	 * descriptors
-	 */
-	kvm_load_segment_selector(vcpu, tss->ldt_selector, VCPU_SREG_LDTR);
-	kvm_load_segment_selector(vcpu, tss->es, VCPU_SREG_ES);
-	kvm_load_segment_selector(vcpu, tss->cs, VCPU_SREG_CS);
-	kvm_load_segment_selector(vcpu, tss->ss, VCPU_SREG_SS);
-	kvm_load_segment_selector(vcpu, tss->ds, VCPU_SREG_DS);
-	kvm_load_segment_selector(vcpu, tss->fs, VCPU_SREG_FS);
-	kvm_load_segment_selector(vcpu, tss->gs, VCPU_SREG_GS);
-
-	/*
-	 * Now load segment descriptors. If fault happenes at this stage
-	 * it is handled in a context of new task
-	 */
-	if (kvm_load_segment_descriptor(vcpu, tss->ldt_selector, VCPU_SREG_LDTR))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->es, VCPU_SREG_ES))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->cs, VCPU_SREG_CS))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->ss, VCPU_SREG_SS))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->ds, VCPU_SREG_DS))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->fs, VCPU_SREG_FS))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->gs, VCPU_SREG_GS))
-		return 1;
-	return 0;
-}
-
-static void save_state_to_tss16(struct kvm_vcpu *vcpu,
-				struct tss_segment_16 *tss)
-{
-	tss->ip = kvm_rip_read(vcpu);
-	tss->flag = kvm_get_rflags(vcpu);
-	tss->ax = kvm_register_read(vcpu, VCPU_REGS_RAX);
-	tss->cx = kvm_register_read(vcpu, VCPU_REGS_RCX);
-	tss->dx = kvm_register_read(vcpu, VCPU_REGS_RDX);
-	tss->bx = kvm_register_read(vcpu, VCPU_REGS_RBX);
-	tss->sp = kvm_register_read(vcpu, VCPU_REGS_RSP);
-	tss->bp = kvm_register_read(vcpu, VCPU_REGS_RBP);
-	tss->si = kvm_register_read(vcpu, VCPU_REGS_RSI);
-	tss->di = kvm_register_read(vcpu, VCPU_REGS_RDI);
-
-	tss->es = get_segment_selector(vcpu, VCPU_SREG_ES);
-	tss->cs = get_segment_selector(vcpu, VCPU_SREG_CS);
-	tss->ss = get_segment_selector(vcpu, VCPU_SREG_SS);
-	tss->ds = get_segment_selector(vcpu, VCPU_SREG_DS);
-	tss->ldt = get_segment_selector(vcpu, VCPU_SREG_LDTR);
-}
-
-static int load_state_from_tss16(struct kvm_vcpu *vcpu,
-				 struct tss_segment_16 *tss)
-{
-	kvm_rip_write(vcpu, tss->ip);
-	kvm_set_rflags(vcpu, tss->flag | 2);
-	kvm_register_write(vcpu, VCPU_REGS_RAX, tss->ax);
-	kvm_register_write(vcpu, VCPU_REGS_RCX, tss->cx);
-	kvm_register_write(vcpu, VCPU_REGS_RDX, tss->dx);
-	kvm_register_write(vcpu, VCPU_REGS_RBX, tss->bx);
-	kvm_register_write(vcpu, VCPU_REGS_RSP, tss->sp);
-	kvm_register_write(vcpu, VCPU_REGS_RBP, tss->bp);
-	kvm_register_write(vcpu, VCPU_REGS_RSI, tss->si);
-	kvm_register_write(vcpu, VCPU_REGS_RDI, tss->di);
-
-	/*
-	 * SDM says that segment selectors are loaded before segment
-	 * descriptors
-	 */
-	kvm_load_segment_selector(vcpu, tss->ldt, VCPU_SREG_LDTR);
-	kvm_load_segment_selector(vcpu, tss->es, VCPU_SREG_ES);
-	kvm_load_segment_selector(vcpu, tss->cs, VCPU_SREG_CS);
-	kvm_load_segment_selector(vcpu, tss->ss, VCPU_SREG_SS);
-	kvm_load_segment_selector(vcpu, tss->ds, VCPU_SREG_DS);
-
-	/*
-	 * Now load segment descriptors. If fault happenes at this stage
-	 * it is handled in a context of new task
-	 */
-	if (kvm_load_segment_descriptor(vcpu, tss->ldt, VCPU_SREG_LDTR))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->es, VCPU_SREG_ES))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->cs, VCPU_SREG_CS))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->ss, VCPU_SREG_SS))
-		return 1;
-
-	if (kvm_load_segment_descriptor(vcpu, tss->ds, VCPU_SREG_DS))
-		return 1;
-	return 0;
-}
-
-static int kvm_task_switch_16(struct kvm_vcpu *vcpu, u16 tss_selector,
-			      u16 old_tss_sel, u32 old_tss_base,
-			      struct desc_struct *nseg_desc)
-{
-	struct tss_segment_16 tss_segment_16;
-	int ret = 0;
-
-	if (kvm_read_guest(vcpu->kvm, old_tss_base, &tss_segment_16,
-			   sizeof tss_segment_16))
-		goto out;
-
-	save_state_to_tss16(vcpu, &tss_segment_16);
-
-	if (kvm_write_guest(vcpu->kvm, old_tss_base, &tss_segment_16,
-			    sizeof tss_segment_16))
-		goto out;
-
-	if (kvm_read_guest(vcpu->kvm, get_tss_base_addr_read(vcpu, nseg_desc),
-			   &tss_segment_16, sizeof tss_segment_16))
-		goto out;
-
-	if (old_tss_sel != 0xffff) {
-		tss_segment_16.prev_task_link = old_tss_sel;
-
-		if (kvm_write_guest(vcpu->kvm,
-				    get_tss_base_addr_write(vcpu, nseg_desc),
-				    &tss_segment_16.prev_task_link,
-				    sizeof tss_segment_16.prev_task_link))
-			goto out;
-	}
-
-	if (load_state_from_tss16(vcpu, &tss_segment_16))
-		goto out;
-
-	ret = 1;
-out:
-	return ret;
-}
-
-static int kvm_task_switch_32(struct kvm_vcpu *vcpu, u16 tss_selector,
-		       u16 old_tss_sel, u32 old_tss_base,
-		       struct desc_struct *nseg_desc)
-{
-	struct tss_segment_32 tss_segment_32;
-	int ret = 0;
-
-	if (kvm_read_guest(vcpu->kvm, old_tss_base, &tss_segment_32,
-			   sizeof tss_segment_32))
-		goto out;
-
-	save_state_to_tss32(vcpu, &tss_segment_32);
-
-	if (kvm_write_guest(vcpu->kvm, old_tss_base, &tss_segment_32,
-			    sizeof tss_segment_32))
-		goto out;
-
-	if (kvm_read_guest(vcpu->kvm, get_tss_base_addr_read(vcpu, nseg_desc),
-			   &tss_segment_32, sizeof tss_segment_32))
-		goto out;
-
-	if (old_tss_sel != 0xffff) {
-		tss_segment_32.prev_task_link = old_tss_sel;
-
-		if (kvm_write_guest(vcpu->kvm,
-				    get_tss_base_addr_write(vcpu, nseg_desc),
-				    &tss_segment_32.prev_task_link,
-				    sizeof tss_segment_32.prev_task_link))
-			goto out;
-	}
-
-	if (load_state_from_tss32(vcpu, &tss_segment_32))
-		goto out;
-
-	ret = 1;
-out:
-	return ret;
-}
-
 int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason)
 {
-	struct kvm_segment tr_seg;
-	struct desc_struct cseg_desc;
-	struct desc_struct nseg_desc;
-	int ret = 0;
-	u32 old_tss_base = get_segment_base(vcpu, VCPU_SREG_TR);
-	u16 old_tss_sel = get_segment_selector(vcpu, VCPU_SREG_TR);
-
-	old_tss_base = kvm_mmu_gva_to_gpa_write(vcpu, old_tss_base, NULL);
-
-	/* FIXME: Handle errors. Failure to read either TSS or their
-	 * descriptors should generate a pagefault.
-	 */
-	if (load_guest_segment_descriptor(vcpu, tss_selector, &nseg_desc))
-		goto out;
-
-	if (load_guest_segment_descriptor(vcpu, old_tss_sel, &cseg_desc))
-		goto out;
-
-	if (reason != TASK_SWITCH_IRET) {
-		int cpl;
-
-		cpl = kvm_x86_ops->get_cpl(vcpu);
-		if ((tss_selector & 3) > nseg_desc.dpl || cpl > nseg_desc.dpl) {
-			kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
-			return 1;
-		}
-	}
-
-	if (!nseg_desc.p || get_desc_limit(&nseg_desc) < 0x67) {
-		kvm_queue_exception_e(vcpu, TS_VECTOR, tss_selector & 0xfffc);
-		return 1;
-	}
-
-	if (reason == TASK_SWITCH_IRET || reason == TASK_SWITCH_JMP) {
-		cseg_desc.type &= ~(1 << 1); //clear the B flag
-		save_guest_segment_descriptor(vcpu, old_tss_sel, &cseg_desc);
-	}
-
-	if (reason == TASK_SWITCH_IRET) {
-		u32 eflags = kvm_get_rflags(vcpu);
-		kvm_set_rflags(vcpu, eflags & ~X86_EFLAGS_NT);
-	}
+	int cs_db, cs_l, ret;
+	cache_all_regs(vcpu);
 
-	/* set back link to prev task only if NT bit is set in eflags
-	   note that old_tss_sel is not used afetr this point */
-	if (reason != TASK_SWITCH_CALL && reason != TASK_SWITCH_GATE)
-		old_tss_sel = 0xffff;
+	kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
 
-	if (nseg_desc.type & 8)
-		ret = kvm_task_switch_32(vcpu, tss_selector, old_tss_sel,
-					 old_tss_base, &nseg_desc);
-	else
-		ret = kvm_task_switch_16(vcpu, tss_selector, old_tss_sel,
-					 old_tss_base, &nseg_desc);
+	vcpu->arch.emulate_ctxt.vcpu = vcpu;
+	vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
+	vcpu->arch.emulate_ctxt.cpl = kvm_x86_ops->get_cpl(vcpu);
+	vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
+	vcpu->arch.emulate_ctxt.mode =
+		(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
+		(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
+		? X86EMUL_MODE_VM86 : cs_l
+		? X86EMUL_MODE_PROT64 :	cs_db
+		? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
 
-	if (reason == TASK_SWITCH_CALL || reason == TASK_SWITCH_GATE) {
-		u32 eflags = kvm_get_rflags(vcpu);
-		kvm_set_rflags(vcpu, eflags | X86_EFLAGS_NT);
-	}
+	ret = emulator_task_switch(&vcpu->arch.emulate_ctxt, &emulate_ops,
+				   tss_selector, reason);
 
-	if (reason != TASK_SWITCH_IRET) {
-		nseg_desc.type |= (1 << 1);
-		save_guest_segment_descriptor(vcpu, tss_selector,
-					      &nseg_desc);
-	}
+	if (ret == X86EMUL_CONTINUE)
+		kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags);
 
-	kvm_x86_ops->set_cr0(vcpu, kvm_read_cr0(vcpu) | X86_CR0_TS);
-	seg_desct_to_kvm_desct(&nseg_desc, tss_selector, &tr_seg);
-	tr_seg.type = 11;
-	kvm_set_segment(vcpu, &tr_seg, VCPU_SREG_TR);
-out:
-	return ret;
+	return (ret != X86EMUL_CONTINUE);
 }
 EXPORT_SYMBOL_GPL(kvm_task_switch);
 
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 19/24] KVM: x86 emulator: fix in/out emulation.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (17 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 18/24] KVM: Use task switch from emulator.c Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:47   ` Avi Kivity
  2010-03-09 14:09 ` [PATCH 20/24] KVM: x86 emulator: Move string pio emulation into emulator.c Gleb Natapov
                   ` (4 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

in/out emulation is broken now. The breakage is different depending
on where IO device resides. If it is in userspace emulator reports
emulation failure since it incorrectly interprets kvm_emulate_pio()
return value. If IO device is in the kernel emulation of 'in' will do
nothing since kvm_emulate_pio() stores result directly into vcpu
registers, so emulator will overwrite result of emulation during
commit of shadowed register.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |    7 ++
 arch/x86/kvm/emulate.c             |   17 ++--
 arch/x86/kvm/svm.c                 |   22 +----
 arch/x86/kvm/vmx.c                 |   19 +---
 arch/x86/kvm/x86.c                 |  203 +++++++++++++++++++++---------------
 5 files changed, 139 insertions(+), 129 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 4268330..7d323d5 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -119,6 +119,13 @@ struct x86_emulate_ops {
 				const void *new,
 				unsigned int bytes,
 				struct kvm_vcpu *vcpu);
+
+	int (*pio_in_emulated)(int size, unsigned short port, void *val,
+			       unsigned int count, struct kvm_vcpu *vcpu);
+
+	int (*pio_out_emulated)(int size, unsigned short port, const void *val,
+				unsigned int count, struct kvm_vcpu *vcpu);
+
 	bool (*get_cached_descriptor)(struct desc_struct *desc,
 				      int seg, struct kvm_vcpu *vcpu);
 	void (*set_cached_descriptor)(struct desc_struct *desc,
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 81ecf47..0ec7b9b 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -208,12 +208,12 @@ static u32 opcode_table[256] = {
 	0, 0, 0, 0, 0, 0, 0, 0,
 	/* 0xE0 - 0xE7 */
 	0, 0, 0, 0,
-	ByteOp | SrcImmUByte, SrcImmUByte,
+	ByteOp | SrcImmUByte | DstAcc, SrcImmUByte | DstAcc,
 	ByteOp | SrcImmUByte, SrcImmUByte,
 	/* 0xE8 - 0xEF */
 	SrcImm | Stack, SrcImm | ImplicitOps,
 	SrcImmU | Src2Imm16 | No64, SrcImmByte | ImplicitOps,
-	SrcNone | ByteOp | ImplicitOps, SrcNone | ImplicitOps,
+	SrcNone | ByteOp | DstAcc, SrcNone | DstAcc,
 	SrcNone | ByteOp | ImplicitOps, SrcNone | ImplicitOps,
 	/* 0xF0 - 0xF7 */
 	0, 0, 0, 0,
@@ -2915,12 +2915,13 @@ special_insn:
 			kvm_inject_gp(ctxt->vcpu, 0);
 			goto done;
 		}
-		if (kvm_emulate_pio(ctxt->vcpu, io_dir_in,
-				   (c->d & ByteOp) ? 1 : c->op_bytes,
-				   port) != 0) {
-			c->eip = saved_eip;
-			goto cannot_emulate;
-		}
+		if (io_dir_in)
+			ops->pio_in_emulated((c->d & ByteOp) ? 1 : c->op_bytes,
+					     port, &c->dst.val, 1, ctxt->vcpu);
+		else
+			ops->pio_out_emulated((c->d & ByteOp) ? 1 : c->op_bytes,
+					      port, &c->regs[VCPU_REGS_RAX], 1,
+					      ctxt->vcpu);
 		break;
 	case 0xf4:              /* hlt */
 		ctxt->vcpu->arch.halt_request = 1;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index def4877..315e8a8 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1488,29 +1488,9 @@ static int shutdown_interception(struct vcpu_svm *svm)
 
 static int io_interception(struct vcpu_svm *svm)
 {
-	u32 io_info = svm->vmcb->control.exit_info_1; /* address size bug? */
-	int size, in, string;
-	unsigned port;
-
 	++svm->vcpu.stat.io_exits;
 
-	svm->next_rip = svm->vmcb->control.exit_info_2;
-
-	string = (io_info & SVM_IOIO_STR_MASK) != 0;
-
-	if (string) {
-		if (emulate_instruction(&svm->vcpu,
-					0, 0, 0) == EMULATE_DO_MMIO)
-			return 0;
-		return 1;
-	}
-
-	in = (io_info & SVM_IOIO_TYPE_MASK) != 0;
-	port = io_info >> 16;
-	size = (io_info & SVM_IOIO_SIZE_MASK) >> SVM_IOIO_SIZE_SHIFT;
-
-	skip_emulated_instruction(&svm->vcpu);
-	return kvm_emulate_pio(&svm->vcpu, in, size, port);
+	return !(emulate_instruction(&svm->vcpu, 0, 0, 0) == EMULATE_DO_MMIO);
 }
 
 static int nmi_interception(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ae3217d..7f33d8e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2974,26 +2974,9 @@ static int handle_triple_fault(struct kvm_vcpu *vcpu)
 
 static int handle_io(struct kvm_vcpu *vcpu)
 {
-	unsigned long exit_qualification;
-	int size, in, string;
-	unsigned port;
-
 	++vcpu->stat.io_exits;
-	exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-	string = (exit_qualification & 16) != 0;
 
-	if (string) {
-		if (emulate_instruction(vcpu, 0, 0, 0) == EMULATE_DO_MMIO)
-			return 0;
-		return 1;
-	}
-
-	size = (exit_qualification & 7) + 1;
-	in = (exit_qualification & 8) != 0;
-	port = exit_qualification >> 16;
-
-	skip_emulated_instruction(vcpu);
-	return kvm_emulate_pio(vcpu, in, size, port);
+	return !(emulate_instruction(vcpu, 0, 0, 0) == EMULATE_DO_MMIO);
 }
 
 static void
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5171696..7f6aa8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3378,6 +3378,86 @@ emul_write:
 	return emulator_write_emulated(addr, new, bytes, vcpu);
 }
 
+static int kernel_pio(struct kvm_vcpu *vcpu, void *pd)
+{
+	/* TODO: String I/O for in kernel device */
+	int r;
+
+	if (vcpu->arch.pio.in)
+		r = kvm_io_bus_read(vcpu->kvm, KVM_PIO_BUS, vcpu->arch.pio.port,
+				    vcpu->arch.pio.size, pd);
+	else
+		r = kvm_io_bus_write(vcpu->kvm, KVM_PIO_BUS,
+				     vcpu->arch.pio.port, vcpu->arch.pio.size,
+				     pd);
+	return r;
+}
+
+
+static int emulator_pio_in_emulated(int size, unsigned short port, void *val,
+				    unsigned int count, struct kvm_vcpu *vcpu)
+{
+	if (vcpu->arch.pio.cur_count)
+		goto data_avail;
+
+	trace_kvm_pio(1, port, size, 1);
+
+	vcpu->arch.pio.port = port;
+	vcpu->arch.pio.in = 1;
+	vcpu->arch.pio.string = 0;
+	vcpu->arch.pio.down = 0;
+	vcpu->arch.pio.rep = 0;
+	vcpu->arch.pio.count = vcpu->arch.pio.cur_count = count;
+	vcpu->arch.pio.size = size;
+
+	if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
+	data_avail:
+		memcpy(val, vcpu->arch.pio_data, size * count);
+		vcpu->arch.pio.cur_count = 0;
+		return 1;
+	}
+
+	vcpu->run->exit_reason = KVM_EXIT_IO;
+	vcpu->run->io.direction = KVM_EXIT_IO_IN;
+	vcpu->run->io.size = size;
+	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
+	vcpu->run->io.count = count;
+	vcpu->run->io.port = port;
+
+	return 0;
+}
+
+static int emulator_pio_out_emulated(int size, unsigned short port,
+				     const void *val, unsigned int count,
+				     struct kvm_vcpu *vcpu)
+{
+	trace_kvm_pio(0, port, size, 1);
+
+	vcpu->arch.pio.port = port;
+	vcpu->arch.pio.in = 0;
+	vcpu->arch.pio.string = 0;
+	vcpu->arch.pio.down = 0;
+	vcpu->arch.pio.rep = 0;
+	vcpu->arch.pio.count = vcpu->arch.pio.cur_count = count;
+	vcpu->arch.pio.size = size;
+
+	memcpy(vcpu->arch.pio_data, val, size * count);
+
+	if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
+		vcpu->arch.pio.cur_count = 0;
+		return 1;
+	}
+
+	vcpu->run->exit_reason = KVM_EXIT_IO;
+	vcpu->run->io.direction = KVM_EXIT_IO_OUT;
+	vcpu->run->io.size = size;
+	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
+	vcpu->run->io.count = count;
+	vcpu->run->io.port = port;
+
+	return 0;
+}
+
 static unsigned long get_segment_base(struct kvm_vcpu *vcpu, int seg)
 {
 	return kvm_x86_ops->get_segment_base(vcpu, seg);
@@ -3566,6 +3646,8 @@ static struct x86_emulate_ops emulate_ops = {
 	.read_emulated       = emulator_read_emulated,
 	.write_emulated      = emulator_write_emulated,
 	.cmpxchg_emulated    = emulator_cmpxchg_emulated,
+	.pio_in_emulated     = emulator_pio_in_emulated,
+	.pio_out_emulated    = emulator_pio_out_emulated,
 	.get_cached_descriptor = emulator_get_cached_descriptor,
 	.set_cached_descriptor = emulator_set_cached_descriptor,
 	.get_segment_selector = emulator_get_segment_selector,
@@ -3673,6 +3755,12 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 	if (vcpu->arch.pio.string)
 		return EMULATE_DO_MMIO;
 
+	if (vcpu->arch.pio.cur_count && !vcpu->arch.pio.string) {
+		if (!vcpu->arch.pio.in)
+			vcpu->arch.pio.cur_count = 0;
+		return EMULATE_DO_MMIO;
+	}
+
 	if (r || vcpu->mmio_is_write) {
 		run->exit_reason = KVM_EXIT_MMIO;
 		run->mmio.phys_addr = vcpu->mmio_phys_addr;
@@ -3729,43 +3817,36 @@ int complete_pio(struct kvm_vcpu *vcpu)
 	int r;
 	unsigned long val;
 
-	if (!io->string) {
-		if (io->in) {
-			val = kvm_register_read(vcpu, VCPU_REGS_RAX);
-			memcpy(&val, vcpu->arch.pio_data, io->size);
-			kvm_register_write(vcpu, VCPU_REGS_RAX, val);
-		}
-	} else {
-		if (io->in) {
-			r = pio_copy_data(vcpu);
-			if (r)
-				goto out;
-		}
+	if (io->in) {
+		r = pio_copy_data(vcpu);
+		if (r)
+			goto out;
+	}
 
-		delta = 1;
-		if (io->rep) {
-			delta *= io->cur_count;
-			/*
-			 * The size of the register should really depend on
-			 * current address size.
-			 */
-			val = kvm_register_read(vcpu, VCPU_REGS_RCX);
-			val -= delta;
-			kvm_register_write(vcpu, VCPU_REGS_RCX, val);
-		}
-		if (io->down)
-			delta = -delta;
-		delta *= io->size;
-		if (io->in) {
-			val = kvm_register_read(vcpu, VCPU_REGS_RDI);
-			val += delta;
-			kvm_register_write(vcpu, VCPU_REGS_RDI, val);
-		} else {
-			val = kvm_register_read(vcpu, VCPU_REGS_RSI);
-			val += delta;
-			kvm_register_write(vcpu, VCPU_REGS_RSI, val);
-		}
+	delta = 1;
+	if (io->rep) {
+		delta *= io->cur_count;
+		/*
+		 * The size of the register should really depend on
+		 * current address size.
+		 */
+		val = kvm_register_read(vcpu, VCPU_REGS_RCX);
+		val -= delta;
+		kvm_register_write(vcpu, VCPU_REGS_RCX, val);
+	}
+	if (io->down)
+		delta = -delta;
+	delta *= io->size;
+	if (io->in) {
+		val = kvm_register_read(vcpu, VCPU_REGS_RDI);
+		val += delta;
+		kvm_register_write(vcpu, VCPU_REGS_RDI, val);
+	} else {
+		val = kvm_register_read(vcpu, VCPU_REGS_RSI);
+		val += delta;
+		kvm_register_write(vcpu, VCPU_REGS_RSI, val);
 	}
+
 out:
 	io->count -= io->cur_count;
 	io->cur_count = 0;
@@ -3773,21 +3854,6 @@ out:
 	return 0;
 }
 
-static int kernel_pio(struct kvm_vcpu *vcpu, void *pd)
-{
-	/* TODO: String I/O for in kernel device */
-	int r;
-
-	if (vcpu->arch.pio.in)
-		r = kvm_io_bus_read(vcpu->kvm, KVM_PIO_BUS, vcpu->arch.pio.port,
-				    vcpu->arch.pio.size, pd);
-	else
-		r = kvm_io_bus_write(vcpu->kvm, KVM_PIO_BUS,
-				     vcpu->arch.pio.port, vcpu->arch.pio.size,
-				     pd);
-	return r;
-}
-
 static int pio_string_write(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pio_request *io = &vcpu->arch.pio;
@@ -3805,36 +3871,6 @@ static int pio_string_write(struct kvm_vcpu *vcpu)
 	return r;
 }
 
-int kvm_emulate_pio(struct kvm_vcpu *vcpu, int in, int size, unsigned port)
-{
-	unsigned long val;
-
-	trace_kvm_pio(!in, port, size, 1);
-
-	vcpu->run->exit_reason = KVM_EXIT_IO;
-	vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
-	vcpu->run->io.size = vcpu->arch.pio.size = size;
-	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
-	vcpu->run->io.count = vcpu->arch.pio.count = vcpu->arch.pio.cur_count = 1;
-	vcpu->run->io.port = vcpu->arch.pio.port = port;
-	vcpu->arch.pio.in = in;
-	vcpu->arch.pio.string = 0;
-	vcpu->arch.pio.down = 0;
-	vcpu->arch.pio.rep = 0;
-
-	if (!vcpu->arch.pio.in) {
-		val = kvm_register_read(vcpu, VCPU_REGS_RAX);
-		memcpy(vcpu->arch.pio_data, &val, 4);
-	}
-
-	if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
-		complete_pio(vcpu);
-		return 1;
-	}
-	return 0;
-}
-EXPORT_SYMBOL_GPL(kvm_emulate_pio);
-
 int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, int in,
 		  int size, unsigned long count, int down,
 		  gva_t address, int rep, unsigned port)
@@ -4627,9 +4663,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	if (vcpu->arch.pio.cur_count) {
 		vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-		r = complete_pio(vcpu);
+		if (!vcpu->arch.pio.string)
+			r = emulate_instruction(vcpu, 0, 0, EMULTYPE_NO_DECODE);
+		else
+			r = complete_pio(vcpu);
 		srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-		if (r)
+		if (r == EMULATE_DO_MMIO)
 			goto out;
 	}
 	if (vcpu->mmio_needed) {
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 20/24] KVM: x86 emulator: Move string pio emulation into emulator.c
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (18 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 19/24] KVM: x86 emulator: fix in/out emulation Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 21/24] KVM: x86 emulator: remove saved_eip Gleb Natapov
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Currently emulation is done outside of emulator so things like doing
ins/outs to/from mmio are broken it also makes it hard (if not impossible)
to implement single stepping in the future. The implementation in this
patch is not efficient since it exits to userspace for each IO while
previous implementation did 'ins' in batches. Further patch that
implements pio in string read ahead address this problem.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |    5 -
 arch/x86/kvm/emulate.c          |   61 ++++++------
 arch/x86/kvm/x86.c              |  204 +++------------------------------------
 3 files changed, 45 insertions(+), 225 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1e15a0a..8507b22 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -224,14 +224,9 @@ struct kvm_pv_mmu_op_buffer {
 
 struct kvm_pio_request {
 	unsigned long count;
-	int cur_count;
-	gva_t guest_gva;
 	int in;
 	int port;
 	int size;
-	int string;
-	int down;
-	int rep;
 };
 
 /*
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0ec7b9b..505dfba 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -151,8 +151,8 @@ static u32 opcode_table[256] = {
 	0, 0, 0, 0,
 	/* 0x68 - 0x6F */
 	SrcImm | Mov | Stack, 0, SrcImmByte | Mov | Stack, 0,
-	SrcNone  | ByteOp  | ImplicitOps, SrcNone  | ImplicitOps, /* insb, insw/insd */
-	SrcNone  | ByteOp  | ImplicitOps, SrcNone  | ImplicitOps, /* outsb, outsw/outsd */
+	DstMem | ByteOp | Mov | String, DstMem | Mov | String, /* insb, insw/insd */
+	SrcMem | ByteOp | ImplicitOps | String, SrcMem | ImplicitOps | String, /* outsb, outsw/outsd */
 	/* 0x70 - 0x77 */
 	SrcImmByte, SrcImmByte, SrcImmByte, SrcImmByte,
 	SrcImmByte, SrcImmByte, SrcImmByte, SrcImmByte,
@@ -2439,7 +2439,12 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 				goto done;
 			}
 		}
-		c->regs[VCPU_REGS_RCX]--;
+		if (c->src.type == OP_MEM)
+			memop = register_address(c, seg_override_base(ctxt, c),
+						 c->regs[VCPU_REGS_RSI]);
+		if (c->dst.type == OP_MEM)
+			memop = register_address(c, es_base(ctxt),
+						 c->regs[VCPU_REGS_RDI]);
 		c->eip = ctxt->eip;
 	}
 
@@ -2596,20 +2601,14 @@ special_insn:
 			kvm_inject_gp(ctxt->vcpu, 0);
 			goto done;
 		}
-		if (kvm_emulate_pio_string(ctxt->vcpu,
-				1,
-				(c->d & ByteOp) ? 1 : c->op_bytes,
-				c->rep_prefix ?
-				address_mask(c, c->regs[VCPU_REGS_RCX]) : 1,
-				(ctxt->eflags & EFLG_DF),
-				register_address(c, es_base(ctxt),
-						 c->regs[VCPU_REGS_RDI]),
-				c->rep_prefix,
-				c->regs[VCPU_REGS_RDX]) == 0) {
-			c->eip = saved_eip;
-			return -1;
-		}
-		return 0;
+		if (!ops->pio_in_emulated(c->dst.bytes, c->regs[VCPU_REGS_RDX],
+					  &c->dst.val, 1, ctxt->vcpu))
+			goto done; /* IO is needed, skip writeback */
+
+		register_address_increment(c, &c->regs[VCPU_REGS_RDI],
+					   (ctxt->eflags & EFLG_DF) ?
+					   -c->dst.bytes : c->dst.bytes);
+		break;
 	case 0x6e:		/* outsb */
 	case 0x6f:		/* outsw/outsd */
 		if (!emulator_io_permited(ctxt, ops, c->regs[VCPU_REGS_RDX],
@@ -2617,21 +2616,14 @@ special_insn:
 			kvm_inject_gp(ctxt->vcpu, 0);
 			goto done;
 		}
-		if (kvm_emulate_pio_string(ctxt->vcpu,
-				0,
-				(c->d & ByteOp) ? 1 : c->op_bytes,
-				c->rep_prefix ?
-				address_mask(c, c->regs[VCPU_REGS_RCX]) : 1,
-				(ctxt->eflags & EFLG_DF),
-					 register_address(c,
-					  seg_override_base(ctxt, c),
-						 c->regs[VCPU_REGS_RSI]),
-				c->rep_prefix,
-				c->regs[VCPU_REGS_RDX]) == 0) {
-			c->eip = saved_eip;
-			return -1;
-		}
-		return 0;
+		ops->pio_out_emulated(c->src.bytes, c->regs[VCPU_REGS_RDX],
+				      &c->src.val, 1, ctxt->vcpu);
+
+		register_address_increment(c, &c->regs[VCPU_REGS_RSI],
+					   (ctxt->eflags & EFLG_DF) ?
+					   -c->src.bytes : c->src.bytes);
+		c->dst.type = OP_NONE; /* nothing to writeback */
+		break;
 	case 0x70 ... 0x7f: /* jcc (short) */
 		if (test_cc(c->b, ctxt->eflags))
 			jmp_rel(c, c->src.val);
@@ -2977,6 +2969,11 @@ special_insn:
 	}
 
 writeback:
+	if (c->rep_prefix) {
+		c->regs[VCPU_REGS_RCX]--;
+		c->eip = ctxt->eip;
+	}
+
 	rc = writeback(ctxt, ops);
 	if (rc != X86EMUL_CONTINUE)
 		goto done;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7f6aa8d..b25ef4b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3169,18 +3169,17 @@ static int kvm_read_guest_virt_system(gva_t addr, void *val, unsigned int bytes,
 	return kvm_read_guest_virt_helper(addr, val, bytes, vcpu, 0, error);
 }
 
-static int kvm_write_guest_virt_helper(gva_t addr, void *val,
+static int kvm_write_guest_virt_system(gva_t addr, void *val,
 				       unsigned int bytes,
-				       struct kvm_vcpu *vcpu, u32 access,
+				       struct kvm_vcpu *vcpu,
 				       u32 *error)
 {
 	void *data = val;
 	int r = X86EMUL_CONTINUE;
 
-	access |= PFERR_WRITE_MASK;
-
 	while (bytes) {
-		gpa_t gpa =  vcpu->arch.mmu.gva_to_gpa(vcpu, addr, access, error);
+		gpa_t gpa =  vcpu->arch.mmu.gva_to_gpa(vcpu, addr,
+						       PFERR_WRITE_MASK, error);
 		unsigned offset = addr & (PAGE_SIZE-1);
 		unsigned towrite = min(bytes, (unsigned)PAGE_SIZE - offset);
 		int ret;
@@ -3203,20 +3202,6 @@ out:
 	return r;
 }
 
-static int kvm_write_guest_virt(gva_t addr, void *val, unsigned int bytes,
-				struct kvm_vcpu *vcpu, u32 *error)
-{
-	u32 access = (kvm_x86_ops->get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
-	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu, access, error);
-}
-
-static int kvm_write_guest_virt_system(gva_t addr, void *val,
-				       unsigned int bytes,
-				       struct kvm_vcpu *vcpu, u32 *error)
-{
-	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu, 0, error);
-}
-
 static int emulator_read_emulated(unsigned long addr,
 				  void *val,
 				  unsigned int bytes,
@@ -3397,23 +3382,20 @@ static int kernel_pio(struct kvm_vcpu *vcpu, void *pd)
 static int emulator_pio_in_emulated(int size, unsigned short port, void *val,
 				    unsigned int count, struct kvm_vcpu *vcpu)
 {
-	if (vcpu->arch.pio.cur_count)
+	if (vcpu->arch.pio.count)
 		goto data_avail;
 
 	trace_kvm_pio(1, port, size, 1);
 
 	vcpu->arch.pio.port = port;
 	vcpu->arch.pio.in = 1;
-	vcpu->arch.pio.string = 0;
-	vcpu->arch.pio.down = 0;
-	vcpu->arch.pio.rep = 0;
-	vcpu->arch.pio.count = vcpu->arch.pio.cur_count = count;
+	vcpu->arch.pio.count  = count;
 	vcpu->arch.pio.size = size;
 
 	if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
 	data_avail:
 		memcpy(val, vcpu->arch.pio_data, size * count);
-		vcpu->arch.pio.cur_count = 0;
+		vcpu->arch.pio.count = 0;
 		return 1;
 	}
 
@@ -3435,16 +3417,13 @@ static int emulator_pio_out_emulated(int size, unsigned short port,
 
 	vcpu->arch.pio.port = port;
 	vcpu->arch.pio.in = 0;
-	vcpu->arch.pio.string = 0;
-	vcpu->arch.pio.down = 0;
-	vcpu->arch.pio.rep = 0;
-	vcpu->arch.pio.count = vcpu->arch.pio.cur_count = count;
+	vcpu->arch.pio.count = count;
 	vcpu->arch.pio.size = size;
 
 	memcpy(vcpu->arch.pio_data, val, size * count);
 
 	if (!kernel_pio(vcpu, vcpu->arch.pio_data)) {
-		vcpu->arch.pio.cur_count = 0;
+		vcpu->arch.pio.count = 0;
 		return 1;
 	}
 
@@ -3685,7 +3664,6 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 	cache_all_regs(vcpu);
 
 	vcpu->mmio_is_write = 0;
-	vcpu->arch.pio.string = 0;
 
 	if (!(emulation_type & EMULTYPE_NO_DECODE)) {
 		int cs_db, cs_l;
@@ -3752,12 +3730,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 	if (r == 0)
 		kvm_x86_ops->set_interrupt_shadow(vcpu, shadow_mask);
 
-	if (vcpu->arch.pio.string)
-		return EMULATE_DO_MMIO;
-
-	if (vcpu->arch.pio.cur_count && !vcpu->arch.pio.string) {
+	if (vcpu->arch.pio.count) {
 		if (!vcpu->arch.pio.in)
-			vcpu->arch.pio.cur_count = 0;
+			vcpu->arch.pio.count = 0;
 		return EMULATE_DO_MMIO;
 	}
 
@@ -3790,152 +3765,6 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 }
 EXPORT_SYMBOL_GPL(emulate_instruction);
 
-static int pio_copy_data(struct kvm_vcpu *vcpu)
-{
-	void *p = vcpu->arch.pio_data;
-	gva_t q = vcpu->arch.pio.guest_gva;
-	unsigned bytes;
-	int ret;
-	u32 error_code;
-
-	bytes = vcpu->arch.pio.size * vcpu->arch.pio.cur_count;
-	if (vcpu->arch.pio.in)
-		ret = kvm_write_guest_virt(q, p, bytes, vcpu, &error_code);
-	else
-		ret = kvm_read_guest_virt(q, p, bytes, vcpu, &error_code);
-
-	if (ret == X86EMUL_PROPAGATE_FAULT)
-		kvm_inject_page_fault(vcpu, q, error_code);
-
-	return ret;
-}
-
-int complete_pio(struct kvm_vcpu *vcpu)
-{
-	struct kvm_pio_request *io = &vcpu->arch.pio;
-	long delta;
-	int r;
-	unsigned long val;
-
-	if (io->in) {
-		r = pio_copy_data(vcpu);
-		if (r)
-			goto out;
-	}
-
-	delta = 1;
-	if (io->rep) {
-		delta *= io->cur_count;
-		/*
-		 * The size of the register should really depend on
-		 * current address size.
-		 */
-		val = kvm_register_read(vcpu, VCPU_REGS_RCX);
-		val -= delta;
-		kvm_register_write(vcpu, VCPU_REGS_RCX, val);
-	}
-	if (io->down)
-		delta = -delta;
-	delta *= io->size;
-	if (io->in) {
-		val = kvm_register_read(vcpu, VCPU_REGS_RDI);
-		val += delta;
-		kvm_register_write(vcpu, VCPU_REGS_RDI, val);
-	} else {
-		val = kvm_register_read(vcpu, VCPU_REGS_RSI);
-		val += delta;
-		kvm_register_write(vcpu, VCPU_REGS_RSI, val);
-	}
-
-out:
-	io->count -= io->cur_count;
-	io->cur_count = 0;
-
-	return 0;
-}
-
-static int pio_string_write(struct kvm_vcpu *vcpu)
-{
-	struct kvm_pio_request *io = &vcpu->arch.pio;
-	void *pd = vcpu->arch.pio_data;
-	int i, r = 0;
-
-	for (i = 0; i < io->cur_count; i++) {
-		if (kvm_io_bus_write(vcpu->kvm, KVM_PIO_BUS,
-				     io->port, io->size, pd)) {
-			r = -EOPNOTSUPP;
-			break;
-		}
-		pd += io->size;
-	}
-	return r;
-}
-
-int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, int in,
-		  int size, unsigned long count, int down,
-		  gva_t address, int rep, unsigned port)
-{
-	unsigned now, in_page;
-	int ret = 0;
-
-	trace_kvm_pio(!in, port, size, count);
-
-	vcpu->run->exit_reason = KVM_EXIT_IO;
-	vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
-	vcpu->run->io.size = vcpu->arch.pio.size = size;
-	vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
-	vcpu->run->io.count = vcpu->arch.pio.count = vcpu->arch.pio.cur_count = count;
-	vcpu->run->io.port = vcpu->arch.pio.port = port;
-	vcpu->arch.pio.in = in;
-	vcpu->arch.pio.string = 1;
-	vcpu->arch.pio.down = down;
-	vcpu->arch.pio.rep = rep;
-
-	if (!count) {
-		kvm_x86_ops->skip_emulated_instruction(vcpu);
-		return 1;
-	}
-
-	if (!down)
-		in_page = PAGE_SIZE - offset_in_page(address);
-	else
-		in_page = offset_in_page(address) + size;
-	now = min(count, (unsigned long)in_page / size);
-	if (!now)
-		now = 1;
-	if (down) {
-		/*
-		 * String I/O in reverse.  Yuck.  Kill the guest, fix later.
-		 */
-		pr_unimpl(vcpu, "guest string pio down\n");
-		kvm_inject_gp(vcpu, 0);
-		return 1;
-	}
-	vcpu->run->io.count = now;
-	vcpu->arch.pio.cur_count = now;
-
-	if (vcpu->arch.pio.cur_count == vcpu->arch.pio.count)
-		kvm_x86_ops->skip_emulated_instruction(vcpu);
-
-	vcpu->arch.pio.guest_gva = address;
-
-	if (!vcpu->arch.pio.in) {
-		/* string PIO write */
-		ret = pio_copy_data(vcpu);
-		if (ret == X86EMUL_PROPAGATE_FAULT)
-			return 1;
-		if (ret == 0 && !pio_string_write(vcpu)) {
-			complete_pio(vcpu);
-			if (vcpu->arch.pio.count == 0)
-				ret = 1;
-		}
-	}
-	/* no string PIO read support yet */
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(kvm_emulate_pio_string);
-
 static void bounce_off(void *info)
 {
 	/* nothing */
@@ -4661,15 +4490,14 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	if (!irqchip_in_kernel(vcpu->kvm))
 		kvm_set_cr8(vcpu, kvm_run->cr8);
 
-	if (vcpu->arch.pio.cur_count) {
+	if (vcpu->arch.pio.count) {
 		vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-		if (!vcpu->arch.pio.string)
-			r = emulate_instruction(vcpu, 0, 0, EMULTYPE_NO_DECODE);
-		else
-			r = complete_pio(vcpu);
+		r = emulate_instruction(vcpu, 0, 0, EMULTYPE_NO_DECODE);
 		srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-		if (r == EMULATE_DO_MMIO)
+		if (r == EMULATE_DO_MMIO) {
+			r = 0;
 			goto out;
+		}
 	}
 	if (vcpu->mmio_needed) {
 		memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 21/24] KVM: x86 emulator: remove saved_eip
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (19 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 20/24] KVM: x86 emulator: Move string pio emulation into emulator.c Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest Gleb Natapov
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

c->eip is never written back in case of emulation failure, so no need to
set it to old value.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/emulate.c |    9 +--------
 1 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 505dfba..ba1ce61 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2377,7 +2377,6 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 {
 	unsigned long memop = 0;
 	u64 msr_data;
-	unsigned long saved_eip = 0;
 	struct decode_cache *c = &ctxt->decode;
 	unsigned int port;
 	int io_dir_in;
@@ -2391,7 +2390,6 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	 */
 
 	memcpy(c->regs, ctxt->vcpu->arch.regs, sizeof c->regs);
-	saved_eip = c->eip;
 
 	if (ctxt->mode == X86EMUL_MODE_PROT64 && (c->d & No64)) {
 		kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
@@ -2983,11 +2981,7 @@ writeback:
 	kvm_rip_write(ctxt->vcpu, c->eip);
 
 done:
-	if (rc == X86EMUL_UNHANDLEABLE) {
-		c->eip = saved_eip;
-		return -1;
-	}
-	return 0;
+	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
 
 twobyte_insn:
 	switch (c->b) {
@@ -3264,6 +3258,5 @@ twobyte_insn:
 
 cannot_emulate:
 	DPRINTF("Cannot emulate %02x\n", c->b);
-	c->eip = saved_eip;
 	return -1;
 }
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (20 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 21/24] KVM: x86 emulator: remove saved_eip Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:50   ` Avi Kivity
  2010-03-09 14:09 ` [PATCH 23/24] KVM: x86 emulator: introduce pio in string read ahead Gleb Natapov
  2010-03-09 14:09 ` [PATCH 24/24] KVM: small kvm_arch_vcpu_ioctl_run() cleanup Gleb Natapov
  23 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Currently when string instruction is only partially complete we go back
to a guest mode, guest tries to reexecute instruction and exits again
and at this point emulation continues. Avoid all of this by restarting
instruction without going back to a guest mode.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |    1 +
 arch/x86/kvm/emulate.c             |   22 ++++++++++++----------
 arch/x86/kvm/x86.c                 |   16 +++++++++++++++-
 3 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 7d323d5..f74b4ad 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -193,6 +193,7 @@ struct x86_emulate_ctxt {
 	/* interruptibility state, as a result of execution of STI or MOV SS */
 	int interruptibility;
 
+	bool restart; /* restart string instruction after writeback */
 	/* decode cache */
 	struct decode_cache decode;
 };
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index ba1ce61..76ed77d 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -925,8 +925,11 @@ x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 	int mode = ctxt->mode;
 	int def_op_bytes, def_ad_bytes, group;
 
-	/* Shadow copy of register state. Committed on successful emulation. */
 
+	/* we cannot decode insn before we complete previous rep insn */
+	WARN_ON(ctxt->restart);
+
+	/* Shadow copy of register state. Committed on successful emulation. */
 	memset(c, 0, sizeof(struct decode_cache));
 	c->eip = ctxt->eip;
 	ctxt->cs_base = seg_base(ctxt, VCPU_SREG_CS);
@@ -2412,8 +2415,11 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		memop = c->modrm_ea;
 
 	if (c->rep_prefix && (c->d & String)) {
+		ctxt->restart = true;
 		/* All REP prefixes have the same first termination condition */
 		if (c->regs[VCPU_REGS_RCX] == 0) {
+		string_done:
+			ctxt->restart = false;
 			kvm_rip_write(ctxt->vcpu, c->eip);
 			goto done;
 		}
@@ -2425,17 +2431,13 @@ x86_emulate_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops)
 		 * 	- if REPNE/REPNZ and ZF = 1 then done
 		 */
 		if ((c->b == 0xa6) || (c->b == 0xa7) ||
-				(c->b == 0xae) || (c->b == 0xaf)) {
+		    (c->b == 0xae) || (c->b == 0xaf)) {
 			if ((c->rep_prefix == REPE_PREFIX) &&
-				((ctxt->eflags & EFLG_ZF) == 0)) {
-					kvm_rip_write(ctxt->vcpu, c->eip);
-					goto done;
-			}
+			    ((ctxt->eflags & EFLG_ZF) == 0))
+				goto string_done;
 			if ((c->rep_prefix == REPNE_PREFIX) &&
-				((ctxt->eflags & EFLG_ZF) == EFLG_ZF)) {
-				kvm_rip_write(ctxt->vcpu, c->eip);
-				goto done;
-			}
+			    ((ctxt->eflags & EFLG_ZF) == EFLG_ZF))
+				goto string_done;
 		}
 		if (c->src.type == OP_MEM)
 			memop = register_address(c, seg_override_base(ctxt, c),
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b25ef4b..82379e1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3724,6 +3724,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 		return EMULATE_DONE;
 	}
 
+restart:
 	r = x86_emulate_insn(&vcpu->arch.emulate_ctxt, &emulate_ops);
 	shadow_mask = vcpu->arch.emulate_ctxt.interruptibility;
 
@@ -3746,7 +3747,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 
 	if (r) {
 		if (kvm_mmu_unprotect_page_virt(vcpu, cr2))
-			return EMULATE_DONE;
+			goto done;
 		if (!vcpu->mmio_needed) {
 			kvm_report_emulation_failure(vcpu, "mmio");
 			return EMULATE_FAIL;
@@ -3761,6 +3762,10 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 		return EMULATE_DO_MMIO;
 	}
 
+done:
+	if (vcpu->arch.emulate_ctxt.restart)
+		goto restart;
+
 	return EMULATE_DONE;
 }
 EXPORT_SYMBOL_GPL(emulate_instruction);
@@ -4516,6 +4521,15 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 			goto out;
 		}
 	}
+	if (vcpu->arch.emulate_ctxt.restart) {
+		vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+		r = emulate_instruction(vcpu, 0, 0, EMULTYPE_NO_DECODE);
+		srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
+		if (r == EMULATE_DO_MMIO) {
+			r = 0;
+			goto out;
+		}
+	}
 	if (kvm_run->exit_reason == KVM_EXIT_HYPERCALL)
 		kvm_register_write(vcpu, VCPU_REGS_RAX,
 				     kvm_run->hypercall.ret);
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 23/24] KVM: x86 emulator: introduce pio in string read ahead.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (21 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  2010-03-09 14:09 ` [PATCH 24/24] KVM: small kvm_arch_vcpu_ioctl_run() cleanup Gleb Natapov
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

To optimize "rep ins" instruction do IO in big chunks ahead of time
instead of doing it only when required during instruction emulation.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/include/asm/kvm_emulate.h |    7 +++++++
 arch/x86/kvm/emulate.c             |   34 ++++++++++++++++++++++++++++++----
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index f74b4ad..da7a711 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -150,6 +150,12 @@ struct fetch_cache {
 	unsigned long end;
 };
 
+struct read_cache {
+	u8 data[1024];
+	unsigned long pos;
+	unsigned long end;
+};
+
 struct decode_cache {
 	u8 twobyte;
 	u8 b;
@@ -177,6 +183,7 @@ struct decode_cache {
 	void *modrm_ptr;
 	unsigned long modrm_val;
 	struct fetch_cache fetch;
+	struct read_cache io_read;
 };
 
 struct x86_emulate_ctxt {
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 76ed77d..987be2a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1222,6 +1222,28 @@ done:
 	return (rc == X86EMUL_UNHANDLEABLE) ? -1 : 0;
 }
 
+static int pio_in_emulated(struct x86_emulate_ctxt *ctxt,
+			   struct x86_emulate_ops *ops,
+			   unsigned int size, unsigned short port,
+			   void *dest, unsigned int count)
+{
+	struct read_cache *mc = &ctxt->decode.io_read;
+
+	if (mc->pos == mc->end) { /* refill pio read ahead */
+		unsigned int n = sizeof(mc->data) / size;
+		n = min(n, count);
+		mc->pos = mc->end = 0;
+		if (!ops->pio_in_emulated(size, port, mc->data, n,
+					  ctxt->vcpu))
+			return 0;
+		mc->end = n * size;
+	}
+
+	memcpy(dest, mc->data + mc->pos, size);
+	mc->pos += size;
+	return 1;
+}
+
 static u32 desc_limit_scaled(struct desc_struct *desc)
 {
 	u32 limit = get_desc_limit(desc);
@@ -2601,8 +2623,11 @@ special_insn:
 			kvm_inject_gp(ctxt->vcpu, 0);
 			goto done;
 		}
-		if (!ops->pio_in_emulated(c->dst.bytes, c->regs[VCPU_REGS_RDX],
-					  &c->dst.val, 1, ctxt->vcpu))
+		if (c->rep_prefix)
+			ctxt->restart = true;
+		if (!pio_in_emulated(ctxt, ops, c->dst.bytes,
+				     c->regs[VCPU_REGS_RDX], &c->dst.val,
+				     c->rep_prefix ? c->regs[VCPU_REGS_RCX] : 1))
 			goto done; /* IO is needed, skip writeback */
 
 		register_address_increment(c, &c->regs[VCPU_REGS_RDI],
@@ -2908,8 +2933,9 @@ special_insn:
 			goto done;
 		}
 		if (io_dir_in)
-			ops->pio_in_emulated((c->d & ByteOp) ? 1 : c->op_bytes,
-					     port, &c->dst.val, 1, ctxt->vcpu);
+			pio_in_emulated(ctxt, ops,
+					(c->d & ByteOp) ? 1 : c->op_bytes,
+					port, &c->dst.val, 1);
 		else
 			ops->pio_out_emulated((c->d & ByteOp) ? 1 : c->op_bytes,
 					      port, &c->regs[VCPU_REGS_RAX], 1,
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 24/24] KVM: small kvm_arch_vcpu_ioctl_run() cleanup.
  2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
                   ` (22 preceding siblings ...)
  2010-03-09 14:09 ` [PATCH 23/24] KVM: x86 emulator: introduce pio in string read ahead Gleb Natapov
@ 2010-03-09 14:09 ` Gleb Natapov
  23 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:09 UTC (permalink / raw)
  To: kvm

Unify all conditions that get us back into emulator after returning from
userspace.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/x86.c |   32 ++++++--------------------------
 1 files changed, 6 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 82379e1..a2c728f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4495,33 +4495,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	if (!irqchip_in_kernel(vcpu->kvm))
 		kvm_set_cr8(vcpu, kvm_run->cr8);
 
-	if (vcpu->arch.pio.count) {
-		vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-		r = emulate_instruction(vcpu, 0, 0, EMULTYPE_NO_DECODE);
-		srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-		if (r == EMULATE_DO_MMIO) {
-			r = 0;
-			goto out;
+	if (vcpu->arch.pio.count || vcpu->mmio_needed ||
+	    vcpu->arch.emulate_ctxt.restart) {
+		if (vcpu->mmio_needed) {
+			memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
+			vcpu->mmio_read_completed = 1;
+			vcpu->mmio_needed = 0;
 		}
-	}
-	if (vcpu->mmio_needed) {
-		memcpy(vcpu->mmio_data, kvm_run->mmio.data, 8);
-		vcpu->mmio_read_completed = 1;
-		vcpu->mmio_needed = 0;
-
-		vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-		r = emulate_instruction(vcpu, vcpu->arch.mmio_fault_cr2, 0,
-					EMULTYPE_NO_DECODE);
-		srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-		if (r == EMULATE_DO_MMIO) {
-			/*
-			 * Read-modify-write.  Back to userspace.
-			 */
-			r = 0;
-			goto out;
-		}
-	}
-	if (vcpu->arch.emulate_ctxt.restart) {
 		vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
 		r = emulate_instruction(vcpu, 0, 0, EMULTYPE_NO_DECODE);
 		srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-- 
1.6.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 02/24] KVM: Provide callback to get/set control registers in emulator ops.
  2010-03-09 14:09 ` [PATCH 02/24] KVM: Provide callback to get/set control registers in emulator ops Gleb Natapov
@ 2010-03-09 14:18   ` Avi Kivity
  2010-03-09 14:24     ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 14:18 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> Use this callback instead of directly call kvm function. Also rename
> realmode_(set|get)_cr to emulator_(set|get)_cr since function has nothing
> to do with real mode.
>
>
> +	ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu);
> +	void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu);
>   };
>    

Note, passing a vcpu means we are still tightly coupled to kvm.  Can be 
fixed later.

> +static unsigned long emulator_get_cr(int cr, struct kvm_vcpu *vcpu)
> +{
> +	unsigned long value;
> +
> +	switch (cr) {
> +	case 0:
> +		value = kvm_read_cr0(vcpu);
> +		break;
> +	case 2:
> +		value = vcpu->arch.cr2;
> +		break;
> +	case 3:
> +		value = vcpu->arch.cr3;
> +		break;
> +	case 4:
> +		value = kvm_read_cr4(vcpu);
> +		break;
> +	case 8:
> +		value = kvm_get_cr8(vcpu);
> +		break;
> +	default:
> +		vcpu_printf(vcpu, "%s: unexpected cr %u\n", __func__, cr);
> +		return 0;
>    

This printk is triggerable by guest code (as the patch didn't introduce 
this, it can be fixed later).

The emulator should #UD on unrecognised control registers.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 02/24] KVM: Provide callback to get/set control registers in emulator ops.
  2010-03-09 14:18   ` Avi Kivity
@ 2010-03-09 14:24     ` Gleb Natapov
  0 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:24 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Mar 09, 2010 at 04:18:09PM +0200, Avi Kivity wrote:
> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >Use this callback instead of directly call kvm function. Also rename
> >realmode_(set|get)_cr to emulator_(set|get)_cr since function has nothing
> >to do with real mode.
> >
> >
> >+	ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu);
> >+	void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu);
> >  };
> 
> Note, passing a vcpu means we are still tightly coupled to kvm.  Can
> be fixed later.
> 
Yes, that is on my todo.

> >+static unsigned long emulator_get_cr(int cr, struct kvm_vcpu *vcpu)
> >+{
> >+	unsigned long value;
> >+
> >+	switch (cr) {
> >+	case 0:
> >+		value = kvm_read_cr0(vcpu);
> >+		break;
> >+	case 2:
> >+		value = vcpu->arch.cr2;
> >+		break;
> >+	case 3:
> >+		value = vcpu->arch.cr3;
> >+		break;
> >+	case 4:
> >+		value = kvm_read_cr4(vcpu);
> >+		break;
> >+	case 8:
> >+		value = kvm_get_cr8(vcpu);
> >+		break;
> >+	default:
> >+		vcpu_printf(vcpu, "%s: unexpected cr %u\n", __func__, cr);
> >+		return 0;
> 
> This printk is triggerable by guest code (as the patch didn't
> introduce this, it can be fixed later).
> 
> The emulator should #UD on unrecognised control registers.
inject #UD on access to non-existing CR patch does this.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 04/24] KVM: Provide current CPL as part of emulator context.
  2010-03-09 14:09 ` [PATCH 04/24] KVM: Provide current CPL as part of emulator context Gleb Natapov
@ 2010-03-09 14:24   ` Avi Kivity
  2010-03-09 14:27     ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 14:24 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> Eliminate the need to call back into KVM to get it from emulator.
>
>
> @@ -3499,6 +3499,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
>
>   		vcpu->arch.emulate_ctxt.vcpu = vcpu;
>   		vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
> +		vcpu->arch.emulate_ctxt.cpl = kvm_x86_ops->get_cpl(vcpu);
>   		vcpu->arch.emulate_ctxt.mode =
>   			(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
>   			(vcpu->arch.emulate_ctxt.eflags&  X86_EFLAGS_VM)
>    

This is an unconditional VMREAD, which is slow (extra slow if nested).  
Most common emulator ops do not need the cpl.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 04/24] KVM: Provide current CPL as part of emulator context.
  2010-03-09 14:24   ` Avi Kivity
@ 2010-03-09 14:27     ` Gleb Natapov
  0 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:27 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Mar 09, 2010 at 04:24:45PM +0200, Avi Kivity wrote:
> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >Eliminate the need to call back into KVM to get it from emulator.
> >
> >
> >@@ -3499,6 +3499,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
> >
> >  		vcpu->arch.emulate_ctxt.vcpu = vcpu;
> >  		vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
> >+		vcpu->arch.emulate_ctxt.cpl = kvm_x86_ops->get_cpl(vcpu);
> >  		vcpu->arch.emulate_ctxt.mode =
> >  			(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
> >  			(vcpu->arch.emulate_ctxt.eflags&  X86_EFLAGS_VM)
> 
> This is an unconditional VMREAD, which is slow (extra slow if
> nested).  Most common emulator ops do not need the cpl.
> 
Will have to make it one of x86_emulate_ops callback then.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation
  2010-03-09 14:09 ` [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation Gleb Natapov
@ 2010-03-09 14:27   ` Avi Kivity
  2010-03-09 14:33     ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 14:27 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> It is undefined and should generate #UD.
>
> Signed-off-by: Gleb Natapov<gleb@redhat.com>
> ---
>   arch/x86/kvm/emulate.c |    3 +++
>   1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 2df510b..1a32b78 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -2486,6 +2486,9 @@ twobyte_insn:
>   				    (c->src.val&  0x0f), ctxt->vcpu);
>   			c->dst.type = OP_NONE;
>   			break;
> +		case 5: /* not defined */
> +			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
> +			goto done;
>   		case 7: /* invlpg*/
>   			emulate_invlpg(ctxt->vcpu, memop);
>   			/* Disable writeback. */
>    

Why is this needed?  We can only get here if the guest tricks us 
(otherwise the #UD would go back to the guest, or rather, we'd trap it 
to see if it's a hypercall instruction, but not pass it on to the emulator).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation
  2010-03-09 14:27   ` Avi Kivity
@ 2010-03-09 14:33     ` Gleb Natapov
  2010-03-09 14:34       ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 14:33 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Mar 09, 2010 at 04:27:39PM +0200, Avi Kivity wrote:
> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >It is undefined and should generate #UD.
> >
> >Signed-off-by: Gleb Natapov<gleb@redhat.com>
> >---
> >  arch/x86/kvm/emulate.c |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> >
> >diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> >index 2df510b..1a32b78 100644
> >--- a/arch/x86/kvm/emulate.c
> >+++ b/arch/x86/kvm/emulate.c
> >@@ -2486,6 +2486,9 @@ twobyte_insn:
> >  				    (c->src.val&  0x0f), ctxt->vcpu);
> >  			c->dst.type = OP_NONE;
> >  			break;
> >+		case 5: /* not defined */
> >+			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
> >+			goto done;
> >  		case 7: /* invlpg*/
> >  			emulate_invlpg(ctxt->vcpu, memop);
> >  			/* Disable writeback. */
> 
> Why is this needed?  We can only get here if the guest tricks us
> (otherwise the #UD would go back to the guest, or rather, we'd trap
> it to see if it's a hypercall instruction, but not pass it on to the
> emulator).
> 
For completes. A lot of code we added recently is there only because guest
can trick us to enter emulator. Unfortunately we have to take suck tricks
into account. Without this patch if emulator gets here it will report failed
emulation.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation
  2010-03-09 14:33     ` Gleb Natapov
@ 2010-03-09 14:34       ` Avi Kivity
  0 siblings, 0 replies; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 14:34 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 04:33 PM, Gleb Natapov wrote:
> On Tue, Mar 09, 2010 at 04:27:39PM +0200, Avi Kivity wrote:
>    
>> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
>>      
>>> It is undefined and should generate #UD.
>>>
>>> Signed-off-by: Gleb Natapov<gleb@redhat.com>
>>> ---
>>>   arch/x86/kvm/emulate.c |    3 +++
>>>   1 files changed, 3 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>>> index 2df510b..1a32b78 100644
>>> --- a/arch/x86/kvm/emulate.c
>>> +++ b/arch/x86/kvm/emulate.c
>>> @@ -2486,6 +2486,9 @@ twobyte_insn:
>>>   				    (c->src.val&   0x0f), ctxt->vcpu);
>>>   			c->dst.type = OP_NONE;
>>>   			break;
>>> +		case 5: /* not defined */
>>> +			kvm_queue_exception(ctxt->vcpu, UD_VECTOR);
>>> +			goto done;
>>>   		case 7: /* invlpg*/
>>>   			emulate_invlpg(ctxt->vcpu, memop);
>>>   			/* Disable writeback. */
>>>        
>> Why is this needed?  We can only get here if the guest tricks us
>> (otherwise the #UD would go back to the guest, or rather, we'd trap
>> it to see if it's a hypercall instruction, but not pass it on to the
>> emulator).
>>
>>      
> For completes. A lot of code we added recently is there only because guest
> can trick us to enter emulator. Unfortunately we have to take suck tricks
> into account. Without this patch if emulator gets here it will report failed
> emulation.
>
>    

Okay.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator.
  2010-03-09 14:09 ` [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator Gleb Natapov
@ 2010-03-09 14:43   ` Avi Kivity
  2010-03-09 16:25     ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 14:43 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> Provide get_cached_descriptor(), set_cached_descriptor(),
> get_segment_selector(), set_segment_selector(), get_gdt(),
> write_std() callbacks.
>
> Signed-off-by: Gleb Natapov<gleb@redhat.com>
> ---
>   arch/x86/include/asm/kvm_emulate.h |   16 +++++
>   arch/x86/kvm/x86.c                 |  130 +++++++++++++++++++++++++++++++----
>   2 files changed, 131 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> index 032d02f..e881618 100644
> --- a/arch/x86/include/asm/kvm_emulate.h
> +++ b/arch/x86/include/asm/kvm_emulate.h
> @@ -63,6 +63,15 @@ struct x86_emulate_ops {
>   			unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
>
>   	/*
> +	 * write_std: Write bytes of standard (non-emulated/special) memory.
> +	 *            Used for descriptor writing.
> +	 *  @addr:  [IN ] Linear address to which to write.
> +	 *  @val:   [OUT] Value write to memory, zero-extended to 'u_long'.
> +	 *  @bytes: [IN ] Number of bytes to write to memory.
> +	 */
> +	int (*write_std)(unsigned long addr, void *val,
> +			 unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
>    

Descriptor writes need an atomic kvm_set_guest_bit(), no?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 19/24] KVM: x86 emulator: fix in/out emulation.
  2010-03-09 14:09 ` [PATCH 19/24] KVM: x86 emulator: fix in/out emulation Gleb Natapov
@ 2010-03-09 14:47   ` Avi Kivity
  2010-03-09 18:09     ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 14:47 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> in/out emulation is broken now. The breakage is different depending
> on where IO device resides. If it is in userspace emulator reports
> emulation failure since it incorrectly interprets kvm_emulate_pio()
> return value. If IO device is in the kernel emulation of 'in' will do
> nothing since kvm_emulate_pio() stores result directly into vcpu
> registers, so emulator will overwrite result of emulation during
> commit of shadowed register.
>
>
> index def4877..315e8a8 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1488,29 +1488,9 @@ static int shutdown_interception(struct vcpu_svm *svm)
>
>   static int io_interception(struct vcpu_svm *svm)
>   {
> -	u32 io_info = svm->vmcb->control.exit_info_1; /* address size bug? */
> -	int size, in, string;
> -	unsigned port;
> -
>   	++svm->vcpu.stat.io_exits;
>
> -	svm->next_rip = svm->vmcb->control.exit_info_2;
> -
> -	string = (io_info&  SVM_IOIO_STR_MASK) != 0;
> -
> -	if (string) {
> -		if (emulate_instruction(&svm->vcpu,
> -					0, 0, 0) == EMULATE_DO_MMIO)
> -			return 0;
> -		return 1;
> -	}
> -
> -	in = (io_info&  SVM_IOIO_TYPE_MASK) != 0;
> -	port = io_info>>  16;
> -	size = (io_info&  SVM_IOIO_SIZE_MASK)>>  SVM_IOIO_SIZE_SHIFT;
> -
> -	skip_emulated_instruction(&svm->vcpu);
> -	return kvm_emulate_pio(&svm->vcpu, in, size, port);
> +	return !(emulate_instruction(&svm->vcpu, 0, 0, 0) == EMULATE_DO_MMIO);
>   }
>    

We don't want to enter the emulator for non-string in/out.  Leftover 
test code?

>
>   static int nmi_interception(struct vcpu_svm *svm)
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index ae3217d..7f33d8e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2974,26 +2974,9 @@ static int handle_triple_fault(struct kvm_vcpu *vcpu)
>
>   static int handle_io(struct kvm_vcpu *vcpu)
>   {
> -	unsigned long exit_qualification;
> -	int size, in, string;
> -	unsigned port;
> -
>   	++vcpu->stat.io_exits;
> -	exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
> -	string = (exit_qualification&  16) != 0;
>
> -	if (string) {
> -		if (emulate_instruction(vcpu, 0, 0, 0) == EMULATE_DO_MMIO)
> -			return 0;
> -		return 1;
> -	}
> -
> -	size = (exit_qualification&  7) + 1;
> -	in = (exit_qualification&  8) != 0;
> -	port = exit_qualification>>  16;
> -
> -	skip_emulated_instruction(vcpu);
> -	return kvm_emulate_pio(vcpu, in, size, port);
> +	return !(emulate_instruction(vcpu, 0, 0, 0) == EMULATE_DO_MMIO);
>   }
>    

Ditto.


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-09 14:09 ` [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest Gleb Natapov
@ 2010-03-09 14:50   ` Avi Kivity
  2010-03-09 18:11     ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 14:50 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> Currently when string instruction is only partially complete we go back
> to a guest mode, guest tries to reexecute instruction and exits again
> and at this point emulation continues. Avoid all of this by restarting
> instruction without going back to a guest mode.
>    

What happens if rcx is really big?  Going back into the guest gave us a 
preemption point.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator.
  2010-03-09 14:43   ` Avi Kivity
@ 2010-03-09 16:25     ` Gleb Natapov
  2010-03-09 17:22       ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 16:25 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Mar 09, 2010 at 04:43:59PM +0200, Avi Kivity wrote:
> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >Provide get_cached_descriptor(), set_cached_descriptor(),
> >get_segment_selector(), set_segment_selector(), get_gdt(),
> >write_std() callbacks.
> >
> >Signed-off-by: Gleb Natapov<gleb@redhat.com>
> >---
> >  arch/x86/include/asm/kvm_emulate.h |   16 +++++
> >  arch/x86/kvm/x86.c                 |  130 +++++++++++++++++++++++++++++++----
> >  2 files changed, 131 insertions(+), 15 deletions(-)
> >
> >diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> >index 032d02f..e881618 100644
> >--- a/arch/x86/include/asm/kvm_emulate.h
> >+++ b/arch/x86/include/asm/kvm_emulate.h
> >@@ -63,6 +63,15 @@ struct x86_emulate_ops {
> >  			unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
> >
> >  	/*
> >+	 * write_std: Write bytes of standard (non-emulated/special) memory.
> >+	 *            Used for descriptor writing.
> >+	 *  @addr:  [IN ] Linear address to which to write.
> >+	 *  @val:   [OUT] Value write to memory, zero-extended to 'u_long'.
> >+	 *  @bytes: [IN ] Number of bytes to write to memory.
> >+	 */
> >+	int (*write_std)(unsigned long addr, void *val,
> >+			 unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
> 
> Descriptor writes need an atomic kvm_set_guest_bit(), no?
> 
It is? atomic against what? Current code just write whole descriptor
using write_std().

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator.
  2010-03-09 16:25     ` Gleb Natapov
@ 2010-03-09 17:22       ` Avi Kivity
  2010-03-09 17:57         ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-09 17:22 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 06:25 PM, Gleb Natapov wrote:
> On Tue, Mar 09, 2010 at 04:43:59PM +0200, Avi Kivity wrote:
>    
>> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
>>      
>>> Provide get_cached_descriptor(), set_cached_descriptor(),
>>> get_segment_selector(), set_segment_selector(), get_gdt(),
>>> write_std() callbacks.
>>>
>>> Signed-off-by: Gleb Natapov<gleb@redhat.com>
>>> ---
>>>   arch/x86/include/asm/kvm_emulate.h |   16 +++++
>>>   arch/x86/kvm/x86.c                 |  130 +++++++++++++++++++++++++++++++----
>>>   2 files changed, 131 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
>>> index 032d02f..e881618 100644
>>> --- a/arch/x86/include/asm/kvm_emulate.h
>>> +++ b/arch/x86/include/asm/kvm_emulate.h
>>> @@ -63,6 +63,15 @@ struct x86_emulate_ops {
>>>   			unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
>>>
>>>   	/*
>>> +	 * write_std: Write bytes of standard (non-emulated/special) memory.
>>> +	 *            Used for descriptor writing.
>>> +	 *  @addr:  [IN ] Linear address to which to write.
>>> +	 *  @val:   [OUT] Value write to memory, zero-extended to 'u_long'.
>>> +	 *  @bytes: [IN ] Number of bytes to write to memory.
>>> +	 */
>>> +	int (*write_std)(unsigned long addr, void *val,
>>> +			 unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
>>>        
>> Descriptor writes need an atomic kvm_set_guest_bit(), no?
>>
>>      
> It is? atomic against what? Current code just write whole descriptor
> using write_std().
>    

These are accessed bit changes, and are done atomically in the same way 
as a page table walk sets the accessed and dirty bit.  Presumably the 
atomic operation is to allow the kernel to scan segments and swap them 
out if they are not used.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator.
  2010-03-09 17:22       ` Avi Kivity
@ 2010-03-09 17:57         ` Gleb Natapov
  2010-03-10  9:11           ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 17:57 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Mar 09, 2010 at 07:22:51PM +0200, Avi Kivity wrote:
> On 03/09/2010 06:25 PM, Gleb Natapov wrote:
> >On Tue, Mar 09, 2010 at 04:43:59PM +0200, Avi Kivity wrote:
> >>On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >>>Provide get_cached_descriptor(), set_cached_descriptor(),
> >>>get_segment_selector(), set_segment_selector(), get_gdt(),
> >>>write_std() callbacks.
> >>>
> >>>Signed-off-by: Gleb Natapov<gleb@redhat.com>
> >>>---
> >>>  arch/x86/include/asm/kvm_emulate.h |   16 +++++
> >>>  arch/x86/kvm/x86.c                 |  130 +++++++++++++++++++++++++++++++----
> >>>  2 files changed, 131 insertions(+), 15 deletions(-)
> >>>
> >>>diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> >>>index 032d02f..e881618 100644
> >>>--- a/arch/x86/include/asm/kvm_emulate.h
> >>>+++ b/arch/x86/include/asm/kvm_emulate.h
> >>>@@ -63,6 +63,15 @@ struct x86_emulate_ops {
> >>>  			unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
> >>>
> >>>  	/*
> >>>+	 * write_std: Write bytes of standard (non-emulated/special) memory.
> >>>+	 *            Used for descriptor writing.
> >>>+	 *  @addr:  [IN ] Linear address to which to write.
> >>>+	 *  @val:   [OUT] Value write to memory, zero-extended to 'u_long'.
> >>>+	 *  @bytes: [IN ] Number of bytes to write to memory.
> >>>+	 */
> >>>+	int (*write_std)(unsigned long addr, void *val,
> >>>+			 unsigned int bytes, struct kvm_vcpu *vcpu, u32 *error);
> >>Descriptor writes need an atomic kvm_set_guest_bit(), no?
> >>
> >It is? atomic against what? Current code just write whole descriptor
> >using write_std().
> 
> These are accessed bit changes, and are done atomically in the same
> way as a page table walk sets the accessed and dirty bit.
> Presumably the atomic operation is to allow the kernel to scan
> segments and swap them out if they are not used.
> 
We can use cmpxchg callback for that, no?

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 19/24] KVM: x86 emulator: fix in/out emulation.
  2010-03-09 14:47   ` Avi Kivity
@ 2010-03-09 18:09     ` Gleb Natapov
  2010-03-10  9:12       ` Avi Kivity
  0 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 18:09 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Mar 09, 2010 at 04:47:24PM +0200, Avi Kivity wrote:
> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >in/out emulation is broken now. The breakage is different depending
> >on where IO device resides. If it is in userspace emulator reports
> >emulation failure since it incorrectly interprets kvm_emulate_pio()
> >return value. If IO device is in the kernel emulation of 'in' will do
> >nothing since kvm_emulate_pio() stores result directly into vcpu
> >registers, so emulator will overwrite result of emulation during
> >commit of shadowed register.
> >
> >
> >index def4877..315e8a8 100644
> >--- a/arch/x86/kvm/svm.c
> >+++ b/arch/x86/kvm/svm.c
> >@@ -1488,29 +1488,9 @@ static int shutdown_interception(struct vcpu_svm *svm)
> >
> >  static int io_interception(struct vcpu_svm *svm)
> >  {
> >-	u32 io_info = svm->vmcb->control.exit_info_1; /* address size bug? */
> >-	int size, in, string;
> >-	unsigned port;
> >-
> >  	++svm->vcpu.stat.io_exits;
> >
> >-	svm->next_rip = svm->vmcb->control.exit_info_2;
> >-
> >-	string = (io_info&  SVM_IOIO_STR_MASK) != 0;
> >-
> >-	if (string) {
> >-		if (emulate_instruction(&svm->vcpu,
> >-					0, 0, 0) == EMULATE_DO_MMIO)
> >-			return 0;
> >-		return 1;
> >-	}
> >-
> >-	in = (io_info&  SVM_IOIO_TYPE_MASK) != 0;
> >-	port = io_info>>  16;
> >-	size = (io_info&  SVM_IOIO_SIZE_MASK)>>  SVM_IOIO_SIZE_SHIFT;
> >-
> >-	skip_emulated_instruction(&svm->vcpu);
> >-	return kvm_emulate_pio(&svm->vcpu, in, size, port);
> >+	return !(emulate_instruction(&svm->vcpu, 0, 0, 0) == EMULATE_DO_MMIO);
> >  }
> 
> We don't want to enter the emulator for non-string in/out.  Leftover
> test code?
> 
No, unfortunately this is not leftover. I just don't see a way how we
can bypass emulator and still have emulator be able to emulate in/out
(for big real mode for instance). The problem is basically described in
the commit message. If we have function outside of emulator that does
in/out emulation on vcpu directly, then emulator can't  use it since
committing shadowed registers will overwrite the result of emulation.
Having two different emulations (one outside of emulator and another in
emulator) is also problematic since when userspace returns after IO exit
we don't know which emulation to continue. If we want to avoid
instruction decoding we can fill in emulation context from exit info as
if instruction was already decoded and call emulator.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-09 14:50   ` Avi Kivity
@ 2010-03-09 18:11     ` Gleb Natapov
  2010-03-10  2:30       ` Takuya Yoshikawa
  2010-03-10  9:13       ` Avi Kivity
  0 siblings, 2 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-09 18:11 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Tue, Mar 09, 2010 at 04:50:29PM +0200, Avi Kivity wrote:
> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >Currently when string instruction is only partially complete we go back
> >to a guest mode, guest tries to reexecute instruction and exits again
> >and at this point emulation continues. Avoid all of this by restarting
> >instruction without going back to a guest mode.
> 
> What happens if rcx is really big?  Going back into the guest gave
> us a preemption point.
> 
Two solutions. We can check if reschedule is required and yield cpu if
needed. Or we can enter guest from time to time.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-09 18:11     ` Gleb Natapov
@ 2010-03-10  2:30       ` Takuya Yoshikawa
  2010-03-10  9:06         ` Gleb Natapov
  2010-03-10  9:13       ` Avi Kivity
  1 sibling, 1 reply; 53+ messages in thread
From: Takuya Yoshikawa @ 2010-03-10  2:30 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm

Gleb Natapov wrote:
> On Tue, Mar 09, 2010 at 04:50:29PM +0200, Avi Kivity wrote:
>> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
>>> Currently when string instruction is only partially complete we go back
>>> to a guest mode, guest tries to reexecute instruction and exits again
>>> and at this point emulation continues. Avoid all of this by restarting
>>> instruction without going back to a guest mode.
>> What happens if rcx is really big?  Going back into the guest gave
>> us a preemption point.
>>
> Two solutions. We can check if reschedule is required and yield cpu if
> needed. Or we can enter guest from time to time.

One generic question: from the viewpoint of KVM's policy, is it OK to make
the semantics different from real CPUs?

Semantics, may be better to use other words, but I'm little bit worried that
the second solution may change something, not mentioning about bugs but some
behavior patterns depending on the "time to time".

> 
> --
> 			Gleb.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-10  2:30       ` Takuya Yoshikawa
@ 2010-03-10  9:06         ` Gleb Natapov
  2010-03-10  9:12           ` Takuya Yoshikawa
  0 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-10  9:06 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: Avi Kivity, kvm

On Wed, Mar 10, 2010 at 11:30:20AM +0900, Takuya Yoshikawa wrote:
> Gleb Natapov wrote:
> >On Tue, Mar 09, 2010 at 04:50:29PM +0200, Avi Kivity wrote:
> >>On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >>>Currently when string instruction is only partially complete we go back
> >>>to a guest mode, guest tries to reexecute instruction and exits again
> >>>and at this point emulation continues. Avoid all of this by restarting
> >>>instruction without going back to a guest mode.
> >>What happens if rcx is really big?  Going back into the guest gave
> >>us a preemption point.
> >>
> >Two solutions. We can check if reschedule is required and yield cpu if
> >needed. Or we can enter guest from time to time.
> 
> One generic question: from the viewpoint of KVM's policy, is it OK to make
> the semantics different from real CPUs?
> 
> Semantics, may be better to use other words, but I'm little bit worried that
> the second solution may change something, not mentioning about bugs but some
> behavior patterns depending on the "time to time".
> 
Entering guest from time to time will not change semantics of the
processor (if code is not modified under processor's feet at least).
Currently we reenter guest mode after each iteration of string
instruction for all instruction but ins/outs.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator.
  2010-03-09 17:57         ` Gleb Natapov
@ 2010-03-10  9:11           ` Avi Kivity
  0 siblings, 0 replies; 53+ messages in thread
From: Avi Kivity @ 2010-03-10  9:11 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 07:57 PM, Gleb Natapov wrote:
>>>>
>>>> Descriptor writes need an atomic kvm_set_guest_bit(), no?
>>>>
>>>>          
>>> It is? atomic against what? Current code just write whole descriptor
>>> using write_std().
>>>        
>> These are accessed bit changes, and are done atomically in the same
>> way as a page table walk sets the accessed and dirty bit.
>> Presumably the atomic operation is to allow the kernel to scan
>> segments and swap them out if they are not used.
>>
>>      
> We can use cmpxchg callback for that, no?
>
>    

Yes.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 19/24] KVM: x86 emulator: fix in/out emulation.
  2010-03-09 18:09     ` Gleb Natapov
@ 2010-03-10  9:12       ` Avi Kivity
  2010-03-10 14:41         ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Avi Kivity @ 2010-03-10  9:12 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 08:09 PM, Gleb Natapov wrote:
>
>> We don't want to enter the emulator for non-string in/out.  Leftover
>> test code?
>>
>>      
> No, unfortunately this is not leftover. I just don't see a way how we
> can bypass emulator and still have emulator be able to emulate in/out
> (for big real mode for instance). The problem is basically described in
> the commit message. If we have function outside of emulator that does
> in/out emulation on vcpu directly, then emulator can't  use it since
> committing shadowed registers will overwrite the result of emulation.
> Having two different emulations (one outside of emulator and another in
> emulator) is also problematic since when userspace returns after IO exit
> we don't know which emulation to continue. If we want to avoid
> instruction decoding we can fill in emulation context from exit info as
> if instruction was already decoded and call emulator.
>
>    

Alternatively, another entry point would be fine.  in/out is a fast path 
(used for virtio for example).

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-10  9:06         ` Gleb Natapov
@ 2010-03-10  9:12           ` Takuya Yoshikawa
  2010-03-10  9:14             ` Avi Kivity
  2010-03-10  9:15             ` Gleb Natapov
  0 siblings, 2 replies; 53+ messages in thread
From: Takuya Yoshikawa @ 2010-03-10  9:12 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm

Gleb Natapov wrote:
> On Wed, Mar 10, 2010 at 11:30:20AM +0900, Takuya Yoshikawa wrote:
>> Gleb Natapov wrote:
>>> On Tue, Mar 09, 2010 at 04:50:29PM +0200, Avi Kivity wrote:
>>>> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
>>>>> Currently when string instruction is only partially complete we go back
>>>>> to a guest mode, guest tries to reexecute instruction and exits again
>>>>> and at this point emulation continues. Avoid all of this by restarting
>>>>> instruction without going back to a guest mode.
>>>> What happens if rcx is really big?  Going back into the guest gave
>>>> us a preemption point.
>>>>
>>> Two solutions. We can check if reschedule is required and yield cpu if
>>> needed. Or we can enter guest from time to time.
>> One generic question: from the viewpoint of KVM's policy, is it OK to make
>> the semantics different from real CPUs?
>>
>> Semantics, may be better to use other words, but I'm little bit worried that
>> the second solution may change something, not mentioning about bugs but some
>> behavior patterns depending on the "time to time".
>>
> Entering guest from time to time will not change semantics of the
> processor (if code is not modified under processor's feet at least).
> Currently we reenter guest mode after each iteration of string
> instruction for all instruction but ins/outs.
> 

E.g., is there no chance that during the repetitions, in the middle of the
repetitions, page faults occur? If it can, without entering the guest, can
we handle it?
  -- I lack some basic assumptions?

> --
> 			Gleb.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-09 18:11     ` Gleb Natapov
  2010-03-10  2:30       ` Takuya Yoshikawa
@ 2010-03-10  9:13       ` Avi Kivity
  1 sibling, 0 replies; 53+ messages in thread
From: Avi Kivity @ 2010-03-10  9:13 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 03/09/2010 08:11 PM, Gleb Natapov wrote:
> On Tue, Mar 09, 2010 at 04:50:29PM +0200, Avi Kivity wrote:
>    
>> On 03/09/2010 04:09 PM, Gleb Natapov wrote:
>>      
>>> Currently when string instruction is only partially complete we go back
>>> to a guest mode, guest tries to reexecute instruction and exits again
>>> and at this point emulation continues. Avoid all of this by restarting
>>> instruction without going back to a guest mode.
>>>        
>> What happens if rcx is really big?  Going back into the guest gave
>> us a preemption point.
>>
>>      
> Two solutions. We can check if reschedule is required and yield cpu if
> needed. Or we can enter guest from time to time.
>    

I'd stick with the current solution, reentering the guest every page or so.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-10  9:12           ` Takuya Yoshikawa
@ 2010-03-10  9:14             ` Avi Kivity
  2010-03-10  9:15             ` Gleb Natapov
  1 sibling, 0 replies; 53+ messages in thread
From: Avi Kivity @ 2010-03-10  9:14 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: Gleb Natapov, kvm

On 03/10/2010 11:12 AM, Takuya Yoshikawa wrote:
>> Entering guest from time to time will not change semantics of the
>> processor (if code is not modified under processor's feet at least).
>> Currently we reenter guest mode after each iteration of string
>> instruction for all instruction but ins/outs.
>>
>
>
> E.g., is there no chance that during the repetitions, in the middle of 
> the
> repetitions, page faults occur? If it can, without entering the guest, 
> can
> we handle it?

Page faults can occur, and we need to handle them.

Another reason for reentering the guest is so that we can inject guest 
interrupts.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-10  9:12           ` Takuya Yoshikawa
  2010-03-10  9:14             ` Avi Kivity
@ 2010-03-10  9:15             ` Gleb Natapov
  2010-03-10 10:08               ` Takuya Yoshikawa
  1 sibling, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-10  9:15 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: Avi Kivity, kvm

On Wed, Mar 10, 2010 at 06:12:34PM +0900, Takuya Yoshikawa wrote:
> Gleb Natapov wrote:
> >On Wed, Mar 10, 2010 at 11:30:20AM +0900, Takuya Yoshikawa wrote:
> >>Gleb Natapov wrote:
> >>>On Tue, Mar 09, 2010 at 04:50:29PM +0200, Avi Kivity wrote:
> >>>>On 03/09/2010 04:09 PM, Gleb Natapov wrote:
> >>>>>Currently when string instruction is only partially complete we go back
> >>>>>to a guest mode, guest tries to reexecute instruction and exits again
> >>>>>and at this point emulation continues. Avoid all of this by restarting
> >>>>>instruction without going back to a guest mode.
> >>>>What happens if rcx is really big?  Going back into the guest gave
> >>>>us a preemption point.
> >>>>
> >>>Two solutions. We can check if reschedule is required and yield cpu if
> >>>needed. Or we can enter guest from time to time.
> >>One generic question: from the viewpoint of KVM's policy, is it OK to make
> >>the semantics different from real CPUs?
> >>
> >>Semantics, may be better to use other words, but I'm little bit worried that
> >>the second solution may change something, not mentioning about bugs but some
> >>behavior patterns depending on the "time to time".
> >>
> >Entering guest from time to time will not change semantics of the
> >processor (if code is not modified under processor's feet at least).
> >Currently we reenter guest mode after each iteration of string
> >instruction for all instruction but ins/outs.
> >
> 
> E.g., is there no chance that during the repetitions, in the middle of the
> repetitions, page faults occur? If it can, without entering the guest, can
> we handle it?
>  -- I lack some basic assumptions?
> 
If page fault occurs we inject it to the guest.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-10  9:15             ` Gleb Natapov
@ 2010-03-10 10:08               ` Takuya Yoshikawa
  2010-03-10 13:48                 ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Takuya Yoshikawa @ 2010-03-10 10:08 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm

Gleb Natapov wrote:
>>>>
>>> Entering guest from time to time will not change semantics of the
>>> processor (if code is not modified under processor's feet at least).
>>> Currently we reenter guest mode after each iteration of string
>>> instruction for all instruction but ins/outs.
>>>
>> E.g., is there no chance that during the repetitions, in the middle of the
>> repetitions, page faults occur? If it can, without entering the guest, can
>> we handle it?
>>  -- I lack some basic assumptions?
>>
> If page fault occurs we inject it to the guest.
> 

Oh, I maight fail to tell what I worried about.
Opposite, I mean, I worried about NOT reentering the guest case.

I know that current implementation with reentrance is OK.

To inject a page fault without reentering the guest, we need to add
some more hacks to the emulator IIUC.


Thanks,
   Takuya

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-10 10:08               ` Takuya Yoshikawa
@ 2010-03-10 13:48                 ` Gleb Natapov
  2010-03-11  9:58                   ` Takuya Yoshikawa
  0 siblings, 1 reply; 53+ messages in thread
From: Gleb Natapov @ 2010-03-10 13:48 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: Avi Kivity, kvm

On Wed, Mar 10, 2010 at 07:08:31PM +0900, Takuya Yoshikawa wrote:
> Gleb Natapov wrote:
> >>>>
> >>>Entering guest from time to time will not change semantics of the
> >>>processor (if code is not modified under processor's feet at least).
> >>>Currently we reenter guest mode after each iteration of string
> >>>instruction for all instruction but ins/outs.
> >>>
> >>E.g., is there no chance that during the repetitions, in the middle of the
> >>repetitions, page faults occur? If it can, without entering the guest, can
> >>we handle it?
> >> -- I lack some basic assumptions?
> >>
> >If page fault occurs we inject it to the guest.
> >
> 
> Oh, I maight fail to tell what I worried about.
> Opposite, I mean, I worried about NOT reentering the guest case.
> 
Are you thinking about something specific here? If we inject exceptions
when they occur and we inject interrupt when they arrive what problem do
you see? I guess this is how real CPU actually works. I doubt it
re-reads string instruction on each iteration.

> I know that current implementation with reentrance is OK.
Current implementation does not reenter guest on each iteration for pio
string, so currently we have both variants.

> 
> To inject a page fault without reentering the guest, we need to add
> some more hacks to the emulator IIUC.
> 
No, we just need to enter guest if exception happens. I see that this in
handled incorrectly in my current patch series.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 19/24] KVM: x86 emulator: fix in/out emulation.
  2010-03-10  9:12       ` Avi Kivity
@ 2010-03-10 14:41         ` Gleb Natapov
  0 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-10 14:41 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Wed, Mar 10, 2010 at 11:12:34AM +0200, Avi Kivity wrote:
> On 03/09/2010 08:09 PM, Gleb Natapov wrote:
> >
> >>We don't want to enter the emulator for non-string in/out.  Leftover
> >>test code?
> >>
> >No, unfortunately this is not leftover. I just don't see a way how we
> >can bypass emulator and still have emulator be able to emulate in/out
> >(for big real mode for instance). The problem is basically described in
> >the commit message. If we have function outside of emulator that does
> >in/out emulation on vcpu directly, then emulator can't  use it since
> >committing shadowed registers will overwrite the result of emulation.
> >Having two different emulations (one outside of emulator and another in
> >emulator) is also problematic since when userspace returns after IO exit
> >we don't know which emulation to continue. If we want to avoid
> >instruction decoding we can fill in emulation context from exit info as
> >if instruction was already decoded and call emulator.
> >
> 
> Alternatively, another entry point would be fine.  in/out is a fast
> path (used for virtio for example).
> 
You mean another entry point into emulator, not separate implementation
for emulated in/out and intercepted one. If yes this is what I mean by
"faking" decoding stage.

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-10 13:48                 ` Gleb Natapov
@ 2010-03-11  9:58                   ` Takuya Yoshikawa
  2010-03-11 10:07                     ` Gleb Natapov
  0 siblings, 1 reply; 53+ messages in thread
From: Takuya Yoshikawa @ 2010-03-11  9:58 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Avi Kivity, kvm

Gleb Natapov wrote:
> On Wed, Mar 10, 2010 at 07:08:31PM +0900, Takuya Yoshikawa wrote:
>> Gleb Natapov wrote:
>>>>> Entering guest from time to time will not change semantics of the
>>>>> processor (if code is not modified under processor's feet at least).
>>>>> Currently we reenter guest mode after each iteration of string
>>>>> instruction for all instruction but ins/outs.
>>>>>
>>>> E.g., is there no chance that during the repetitions, in the middle of the
>>>> repetitions, page faults occur? If it can, without entering the guest, can
>>>> we handle it?
>>>> -- I lack some basic assumptions?
>>>>
>>> If page fault occurs we inject it to the guest.
>>>
>> Oh, I maight fail to tell what I worried about.
>> Opposite, I mean, I worried about NOT reentering the guest case.
>>
> Are you thinking about something specific here? If we inject exceptions
Yes.

> when they occur and we inject interrupt when they arrive what problem do
> you see? I guess this is how real CPU actually works. I doubt it
> re-reads string instruction on each iteration.

No problem if we detect and inject page faults like that.

I just didn't so certain that when we encounter a page fault in the middle
of the repetitions(about rep specific case), if we can inject it, suspend the
repetition and enter the guest immediately like SDM Vol.2B says:

  "A repeating string operation can be suspended by an exception or interrupt.
   When this happens, the state of the registers is preserved to allow the string
   operation to be resumed upon a return from the exception or interrupt handler.
   ...
   This mechanism allows long string operations to proceed without affecting the
   interrupt response time of the system."

Ah, I might misunderstand that if we reenter the guest every time for rep,
page fault detection, not injection, can be done by the other ways easily,
by EXIT reason or something. Both ways may need the same thing, sorry.


Another concern I wrote was just about the dependencies between your
"time to time" criteria and SDM's "without affecting the interrupt response time".
This is just the problem of how we can determine the criteria appropriately.

>> I know that current implementation with reentrance is OK.
> Current implementation does not reenter guest on each iteration for pio
> string, so currently we have both variants.

I'm sorry, I was confused as if the current implementation already
included some of your patches.

> 
>> To inject a page fault without reentering the guest, we need to add
>> some more hacks to the emulator IIUC.
>>
> No, we just need to enter guest if exception happens. I see that this in
> handled incorrectly in my current patch series.

I was just not certain if the following condition(from SDM Vol.2B) is satisfied

   "The source and destination registers point to the next string elements
    to be operated on, the EIP register points to the string instruction,
    and the ECX register has the value it held following the last successful
    iteration of the instruction."

in the emulator's fault handling. I should have read your patch more closely.

Thanks,
   Takuya

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest.
  2010-03-11  9:58                   ` Takuya Yoshikawa
@ 2010-03-11 10:07                     ` Gleb Natapov
  0 siblings, 0 replies; 53+ messages in thread
From: Gleb Natapov @ 2010-03-11 10:07 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: Avi Kivity, kvm

On Thu, Mar 11, 2010 at 06:58:14PM +0900, Takuya Yoshikawa wrote:
> Gleb Natapov wrote:
> >On Wed, Mar 10, 2010 at 07:08:31PM +0900, Takuya Yoshikawa wrote:
> >>Gleb Natapov wrote:
> >>>>>Entering guest from time to time will not change semantics of the
> >>>>>processor (if code is not modified under processor's feet at least).
> >>>>>Currently we reenter guest mode after each iteration of string
> >>>>>instruction for all instruction but ins/outs.
> >>>>>
> >>>>E.g., is there no chance that during the repetitions, in the middle of the
> >>>>repetitions, page faults occur? If it can, without entering the guest, can
> >>>>we handle it?
> >>>>-- I lack some basic assumptions?
> >>>>
> >>>If page fault occurs we inject it to the guest.
> >>>
> >>Oh, I maight fail to tell what I worried about.
> >>Opposite, I mean, I worried about NOT reentering the guest case.
> >>
> >Are you thinking about something specific here? If we inject exceptions
> Yes.
> 
> >when they occur and we inject interrupt when they arrive what problem do
> >you see? I guess this is how real CPU actually works. I doubt it
> >re-reads string instruction on each iteration.
> 
> No problem if we detect and inject page faults like that.
> 
Yes, that part is missing from my patch.

> I just didn't so certain that when we encounter a page fault in the middle
> of the repetitions(about rep specific case), if we can inject it, suspend the
> repetition and enter the guest immediately like SDM Vol.2B says:
> 
>  "A repeating string operation can be suspended by an exception or interrupt.
>   When this happens, the state of the registers is preserved to allow the string
>   operation to be resumed upon a return from the exception or interrupt handler.
>   ...
>   This mechanism allows long string operations to proceed without affecting the
>   interrupt response time of the system."
> 
> Ah, I might misunderstand that if we reenter the guest every time for rep,
> page fault detection, not injection, can be done by the other ways easily,
> by EXIT reason or something. Both ways may need the same thing, sorry.
When instruction is emulated page fault detection is done by the
emulator itself. During guest entry the exception is injected. So all we
need to do in the emulator is to enter guest immediately when exception
condition is detected.

> 
> Another concern I wrote was just about the dependencies between your
> "time to time" criteria and SDM's "without affecting the interrupt response time".
> This is just the problem of how we can determine the criteria appropriately.
> 
We can reenter guest immediately if there is pending interrupt (we can't
do that with ins read ahead, but this optimization is non architectural anyway).

> >>I know that current implementation with reentrance is OK.
> >Current implementation does not reenter guest on each iteration for pio
> >string, so currently we have both variants.
> 
> I'm sorry, I was confused as if the current implementation already
> included some of your patches.
> 
It's independent from my patches. This is how string pio always worked.
Otherwise certain workloads are too slow.

> >
> >>To inject a page fault without reentering the guest, we need to add
> >>some more hacks to the emulator IIUC.
> >>
> >No, we just need to enter guest if exception happens. I see that this in
> >handled incorrectly in my current patch series.
> 
> I was just not certain if the following condition(from SDM Vol.2B) is satisfied
> 
>   "The source and destination registers point to the next string elements
>    to be operated on, the EIP register points to the string instruction,
>    and the ECX register has the value it held following the last successful
>    iteration of the instruction."
It is satisfied. Writeback is done on each iteration.

> 
> in the emulator's fault handling. I should have read your patch more closely.
> 
> Thanks,
>   Takuya

--
			Gleb.

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2010-03-11 10:08 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-09 14:08 [PATCH 00/24] [RFC] emulator cleanup Gleb Natapov
2010-03-09 14:08 ` [PATCH 01/24] KVM: Remove pointer to rflags from realmode_set_cr parameters Gleb Natapov
2010-03-09 14:09 ` [PATCH 02/24] KVM: Provide callback to get/set control registers in emulator ops Gleb Natapov
2010-03-09 14:18   ` Avi Kivity
2010-03-09 14:24     ` Gleb Natapov
2010-03-09 14:09 ` [PATCH 03/24] KVM: remove realmode_lmsw function Gleb Natapov
2010-03-09 14:09 ` [PATCH 04/24] KVM: Provide current CPL as part of emulator context Gleb Natapov
2010-03-09 14:24   ` Avi Kivity
2010-03-09 14:27     ` Gleb Natapov
2010-03-09 14:09 ` [PATCH 05/24] KVM: Provide current eip " Gleb Natapov
2010-03-09 14:09 ` [PATCH 06/24] KVM: x86 emulator: fix mov r/m, sreg emulation Gleb Natapov
2010-03-09 14:09 ` [PATCH 07/24] KVM: x86 emulator: fix 0f 01 /5 emulation Gleb Natapov
2010-03-09 14:27   ` Avi Kivity
2010-03-09 14:33     ` Gleb Natapov
2010-03-09 14:34       ` Avi Kivity
2010-03-09 14:09 ` [PATCH 08/24] KVM: x86 emulator: 0f (20|21|22|23) ignore mod bits Gleb Natapov
2010-03-09 14:09 ` [PATCH 09/24] KVM: x86 emulator: inject #UD on access to non-existing CR Gleb Natapov
2010-03-09 14:09 ` [PATCH 10/24] KVM: x86 emulator: fix mov dr to inject #UD when needed Gleb Natapov
2010-03-09 14:09 ` [PATCH 11/24] KVM: x86 emulator: fix return values of syscall/sysenter/sysexit emulations Gleb Natapov
2010-03-09 14:09 ` [PATCH 12/24] KVM: x86 emulator: do not call writeback if msr access fails Gleb Natapov
2010-03-09 14:09 ` [PATCH 13/24] KVM: x86 emulator: If LOCK prefix is used dest arg should be memory Gleb Natapov
2010-03-09 14:09 ` [PATCH 14/24] KVM: x86 emulator: cleanup grp3 return value Gleb Natapov
2010-03-09 14:09 ` [PATCH 15/24] KVM: x86 emulator: Provide more callbacks for x86 emulator Gleb Natapov
2010-03-09 14:43   ` Avi Kivity
2010-03-09 16:25     ` Gleb Natapov
2010-03-09 17:22       ` Avi Kivity
2010-03-09 17:57         ` Gleb Natapov
2010-03-10  9:11           ` Avi Kivity
2010-03-09 14:09 ` [PATCH 16/24] KVM: x86 emulator: Emulate task switch in emulator.c Gleb Natapov
2010-03-09 14:09 ` [PATCH 17/24] KVM: x86 emulator: Use load_segment_descriptor() instead of kvm_load_segment_descriptor() Gleb Natapov
2010-03-09 14:09 ` [PATCH 18/24] KVM: Use task switch from emulator.c Gleb Natapov
2010-03-09 14:09 ` [PATCH 19/24] KVM: x86 emulator: fix in/out emulation Gleb Natapov
2010-03-09 14:47   ` Avi Kivity
2010-03-09 18:09     ` Gleb Natapov
2010-03-10  9:12       ` Avi Kivity
2010-03-10 14:41         ` Gleb Natapov
2010-03-09 14:09 ` [PATCH 20/24] KVM: x86 emulator: Move string pio emulation into emulator.c Gleb Natapov
2010-03-09 14:09 ` [PATCH 21/24] KVM: x86 emulator: remove saved_eip Gleb Natapov
2010-03-09 14:09 ` [PATCH 22/24] KVM: x86 emulator: restart string instruction without going back to a guest Gleb Natapov
2010-03-09 14:50   ` Avi Kivity
2010-03-09 18:11     ` Gleb Natapov
2010-03-10  2:30       ` Takuya Yoshikawa
2010-03-10  9:06         ` Gleb Natapov
2010-03-10  9:12           ` Takuya Yoshikawa
2010-03-10  9:14             ` Avi Kivity
2010-03-10  9:15             ` Gleb Natapov
2010-03-10 10:08               ` Takuya Yoshikawa
2010-03-10 13:48                 ` Gleb Natapov
2010-03-11  9:58                   ` Takuya Yoshikawa
2010-03-11 10:07                     ` Gleb Natapov
2010-03-10  9:13       ` Avi Kivity
2010-03-09 14:09 ` [PATCH 23/24] KVM: x86 emulator: introduce pio in string read ahead Gleb Natapov
2010-03-09 14:09 ` [PATCH 24/24] KVM: small kvm_arch_vcpu_ioctl_run() cleanup Gleb Natapov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.