linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs
@ 2014-05-07 12:32 Nadav Amit
  2014-05-07 12:32 ` [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly Nadav Amit
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 12:32 UTC (permalink / raw)
  To: mtosatti, pbonzini, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel, Nadav Amit

This series of patches fixes various scenarios in which KVM does not follow x86
specifications.  Patches #4 and #5 are related; they reflect a new revision of
the previously submitted patch that dealt with the wrong masking of registers
in long-mode. Patch #3 is a follow-up to the previously sumbitted patch that
fixed the wrong reserved page table masks. Patches #3 and #5 were not tested in
a manner that actually checks the modified behavior. Not all the pathes in
patch #4 were tested.

Thanks for reviewing the patches.

Nadav Amit (5):
  KVM: x86: Emulator does not calculate address correctly
  KVM: vmx: handle_dr does not handle RSP correctly
  KVM: x86: Mark bit 7 in long-mode PDPTE according to 1GB pages support
  KVM: x86: Wrong register masking in 64-bit mode
  KVM: x86: Fix wrong masking on relative jump/call

 arch/x86/kvm/cpuid.h   |  7 +++++++
 arch/x86/kvm/emulate.c | 47 +++++++++++++++++++++++++++++------------------
 arch/x86/kvm/mmu.c     |  8 ++++++--
 arch/x86/kvm/vmx.c     |  2 +-
 4 files changed, 43 insertions(+), 21 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly
  2014-05-07 12:32 [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Nadav Amit
@ 2014-05-07 12:32 ` Nadav Amit
  2014-05-07 13:57   ` Paolo Bonzini
  2014-05-07 12:32 ` [PATCH 2/5] KVM: vmx: handle_dr does not handle RSP correctly Nadav Amit
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 12:32 UTC (permalink / raw)
  To: mtosatti, pbonzini, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel, Nadav Amit

In long-mode, when the address size is 4 bytes, the linear address is not
truncated as the emulator mistakenly does.  Instead, the offset within the
segment (the ea field) should be truncated according to the address size.

As Intel SDM says: "In 64-bit mode, the effective address components are added
and the effective address is truncated ... before adding the full 64-bit
segment base."

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
 arch/x86/kvm/emulate.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index e8a5840..743e8e3 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -631,7 +631,8 @@ static int __linearize(struct x86_emulate_ctxt *ctxt,
 	u16 sel;
 	unsigned cpl;
 
-	la = seg_base(ctxt, addr.seg) + addr.ea;
+	la = seg_base(ctxt, addr.seg) +
+	    (ctxt->ad_bytes == 8 ? addr.ea : (u32)addr.ea);
 	switch (ctxt->mode) {
 	case X86EMUL_MODE_PROT64:
 		if (((signed long)la << 16) >> 16 != la)
@@ -678,7 +679,7 @@ static int __linearize(struct x86_emulate_ctxt *ctxt,
 		}
 		break;
 	}
-	if (fetch ? ctxt->mode != X86EMUL_MODE_PROT64 : ctxt->ad_bytes != 8)
+	if (ctxt->mode != X86EMUL_MODE_PROT64)
 		la &= (u32)-1;
 	if (insn_aligned(ctxt, size) && ((la & (size - 1)) != 0))
 		return emulate_gp(ctxt, 0);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/5] KVM: vmx: handle_dr does not handle RSP correctly
  2014-05-07 12:32 [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Nadav Amit
  2014-05-07 12:32 ` [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly Nadav Amit
@ 2014-05-07 12:32 ` Nadav Amit
  2014-05-07 12:32 ` [PATCH 3/5] KVM: x86: Mark bit 7 in long-mode PDPTE according to 1GB pages support Nadav Amit
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 12:32 UTC (permalink / raw)
  To: mtosatti, pbonzini, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel, Nadav Amit

The RSP register is not automatically cached, causing mov DR instruction with
RSP to fail.  Instead the regular register accessing interface should be used.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 72b8012..0e2793f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5142,7 +5142,7 @@ static int handle_dr(struct kvm_vcpu *vcpu)
 			return 1;
 		kvm_register_write(vcpu, reg, val);
 	} else
-		if (kvm_set_dr(vcpu, dr, vcpu->arch.regs[reg]))
+		if (kvm_set_dr(vcpu, dr, kvm_register_read(vcpu, reg)))
 			return 1;
 
 	skip_emulated_instruction(vcpu);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/5] KVM: x86: Mark bit 7 in long-mode PDPTE according to 1GB pages support
  2014-05-07 12:32 [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Nadav Amit
  2014-05-07 12:32 ` [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly Nadav Amit
  2014-05-07 12:32 ` [PATCH 2/5] KVM: vmx: handle_dr does not handle RSP correctly Nadav Amit
@ 2014-05-07 12:32 ` Nadav Amit
  2014-05-07 12:32 ` [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode Nadav Amit
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 12:32 UTC (permalink / raw)
  To: mtosatti, pbonzini, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel, Nadav Amit

In long-mode, bit 7 in the PDPTE is not reserved only if 1GB pages are
supported by the CPU. Currently the bit is considered by KVM as always
reserved.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
 arch/x86/kvm/cpuid.h | 7 +++++++
 arch/x86/kvm/mmu.c   | 8 ++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index eeecbed..f908731 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -88,4 +88,11 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu *vcpu)
 	return best && (best->ecx & bit(X86_FEATURE_X2APIC));
 }
 
+static inline bool guest_cpuid_has_gbpages(struct kvm_vcpu *vcpu)
+{
+	struct kvm_cpuid_entry2 *best;
+
+	best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
+	return best && (best->edx & bit(X86_FEATURE_GBPAGES));
+}
 #endif
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 65f2400..9314678 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -22,6 +22,7 @@
 #include "mmu.h"
 #include "x86.h"
 #include "kvm_cache_regs.h"
+#include "cpuid.h"
 
 #include <linux/kvm_host.h>
 #include <linux/types.h>
@@ -3516,11 +3517,14 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 {
 	int maxphyaddr = cpuid_maxphyaddr(vcpu);
 	u64 exb_bit_rsvd = 0;
+	u64 gbpages_bit_rsvd = 0;
 
 	context->bad_mt_xwr = 0;
 
 	if (!context->nx)
 		exb_bit_rsvd = rsvd_bits(63, 63);
+	if (!guest_cpuid_has_gbpages(vcpu))
+		gbpages_bit_rsvd = rsvd_bits(7, 7);
 	switch (context->root_level) {
 	case PT32_ROOT_LEVEL:
 		/* no rsvd bits for 2 level 4K page table entries */
@@ -3557,14 +3561,14 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 		context->rsvd_bits_mask[0][3] = exb_bit_rsvd |
 			rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
 		context->rsvd_bits_mask[0][2] = exb_bit_rsvd |
-			rsvd_bits(maxphyaddr, 51) | rsvd_bits(7, 7);
+			gbpages_bit_rsvd | rsvd_bits(maxphyaddr, 51);
 		context->rsvd_bits_mask[0][1] = exb_bit_rsvd |
 			rsvd_bits(maxphyaddr, 51);
 		context->rsvd_bits_mask[0][0] = exb_bit_rsvd |
 			rsvd_bits(maxphyaddr, 51);
 		context->rsvd_bits_mask[1][3] = context->rsvd_bits_mask[0][3];
 		context->rsvd_bits_mask[1][2] = exb_bit_rsvd |
-			rsvd_bits(maxphyaddr, 51) |
+			gbpages_bit_rsvd | rsvd_bits(maxphyaddr, 51) |
 			rsvd_bits(13, 29);
 		context->rsvd_bits_mask[1][1] = exb_bit_rsvd |
 			rsvd_bits(maxphyaddr, 51) |
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode
  2014-05-07 12:32 [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Nadav Amit
                   ` (2 preceding siblings ...)
  2014-05-07 12:32 ` [PATCH 3/5] KVM: x86: Mark bit 7 in long-mode PDPTE according to 1GB pages support Nadav Amit
@ 2014-05-07 12:32 ` Nadav Amit
  2014-05-07 14:50   ` Bandan Das
  2014-05-07 15:52   ` Paolo Bonzini
  2014-05-07 12:32 ` [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call Nadav Amit
  2014-05-07 16:00 ` [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Paolo Bonzini
  5 siblings, 2 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 12:32 UTC (permalink / raw)
  To: mtosatti, pbonzini, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel, Nadav Amit

32-bit operations are zero extended in 64-bit mode. Currently, the code does
not handle them correctly and keeps the high bits. In 16-bit mode, the high
32-bits are kept intact.

In addition, although it is not well-documented, when address override prefix
is used with REP-string instruction, RCX high half is zeroed even if ECX was
zero on the first iteration (as if an assignment was performed to ECX).

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
 arch/x86/kvm/emulate.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 743e8e3..6833b41 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -434,9 +434,21 @@ static int emulator_check_intercept(struct x86_emulate_ctxt *ctxt,
 	return ctxt->ops->intercept(ctxt, &info, stage);
 }
 
-static void assign_masked(ulong *dest, ulong src, ulong mask)
+static void assign_masked(ulong *dest, ulong src, int bytes)
 {
-	*dest = (*dest & ~mask) | (src & mask);
+	switch (bytes) {
+	case 2:
+		*dest = (u16)src | (*dest & ~0xfffful);
+		break;
+	case 4:
+		*dest = (u32)src;
+		break;
+	case 8:
+		*dest = src;
+		break;
+	default:
+		BUG();
+	}
 }
 
 static inline unsigned long ad_mask(struct x86_emulate_ctxt *ctxt)
@@ -476,26 +488,20 @@ register_address(struct x86_emulate_ctxt *ctxt, unsigned long reg)
 	return address_mask(ctxt, reg);
 }
 
-static void masked_increment(ulong *reg, ulong mask, int inc)
+static void masked_increment(ulong *reg, int size, int inc)
 {
-	assign_masked(reg, *reg + inc, mask);
+	assign_masked(reg, *reg + inc, size);
 }
 
 static inline void
 register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, int inc)
 {
-	ulong mask;
-
-	if (ctxt->ad_bytes == sizeof(unsigned long))
-		mask = ~0UL;
-	else
-		mask = ad_mask(ctxt);
-	masked_increment(reg, mask, inc);
+	masked_increment(reg, ctxt->ad_bytes, inc);
 }
 
 static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
 {
-	masked_increment(reg_rmw(ctxt, VCPU_REGS_RSP), stack_mask(ctxt), inc);
+	masked_increment(reg_rmw(ctxt, VCPU_REGS_RSP), stack_size(ctxt), inc);
 }
 
 static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
@@ -1712,17 +1718,17 @@ static int em_enter(struct x86_emulate_ctxt *ctxt)
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 	assign_masked(reg_rmw(ctxt, VCPU_REGS_RBP), reg_read(ctxt, VCPU_REGS_RSP),
-		      stack_mask(ctxt));
+		      stack_size(ctxt));
 	assign_masked(reg_rmw(ctxt, VCPU_REGS_RSP),
 		      reg_read(ctxt, VCPU_REGS_RSP) - frame_size,
-		      stack_mask(ctxt));
+		      stack_size(ctxt));
 	return X86EMUL_CONTINUE;
 }
 
 static int em_leave(struct x86_emulate_ctxt *ctxt)
 {
 	assign_masked(reg_rmw(ctxt, VCPU_REGS_RSP), reg_read(ctxt, VCPU_REGS_RBP),
-		      stack_mask(ctxt));
+		      stack_size(ctxt));
 	return emulate_pop(ctxt, reg_rmw(ctxt, VCPU_REGS_RBP), ctxt->op_bytes);
 }
 
@@ -4570,6 +4576,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
 	if (ctxt->rep_prefix && (ctxt->d & String)) {
 		/* All REP prefixes have the same first termination condition */
 		if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
+			assign_masked(reg_write(ctxt, VCPU_REGS_RCX), 0,
+				      ctxt->ad_bytes);
 			ctxt->eip = ctxt->_eip;
 			goto done;
 		}
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call
  2014-05-07 12:32 [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Nadav Amit
                   ` (3 preceding siblings ...)
  2014-05-07 12:32 ` [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode Nadav Amit
@ 2014-05-07 12:32 ` Nadav Amit
  2014-05-07 14:43   ` Bandan Das
  2014-05-07 15:59   ` Paolo Bonzini
  2014-05-07 16:00 ` [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Paolo Bonzini
  5 siblings, 2 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 12:32 UTC (permalink / raw)
  To: mtosatti, pbonzini, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel, Nadav Amit

Relative jumps and calls do the masking according to the operand size, and not
according to the address size as the KVM emulator does today.  In 64-bit mode,
the resulting RIP is always 64-bit. Otherwise it is masked according to the
instruction operand-size. Note that when 16-bit address size is used, bits
63:32 are unmodified.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
 arch/x86/kvm/emulate.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 6833b41..e406705 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -506,7 +506,9 @@ static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
 
 static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
 {
-	register_address_increment(ctxt, &ctxt->_eip, rel);
+	/* 64-bit mode relative jumps are always 64-bit; otherwise mask */
+	int op_bytes = ctxt->mode == X86EMUL_MODE_PROT64 ? 8 : ctxt->op_bytes;
+	masked_increment(&ctxt->eip, op_bytes, rel);
 }
 
 static u32 desc_limit_scaled(struct desc_struct *desc)
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly
  2014-05-07 12:32 ` [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly Nadav Amit
@ 2014-05-07 13:57   ` Paolo Bonzini
  2014-05-07 15:21     ` Nadav Amit
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2014-05-07 13:57 UTC (permalink / raw)
  To: Nadav Amit, mtosatti, hpa; +Cc: gleb, tglx, mingo, x86, kvm, linux-kernel

Il 07/05/2014 14:32, Nadav Amit ha scritto:
> In long-mode, when the address size is 4 bytes, the linear address is not
> truncated as the emulator mistakenly does.  Instead, the offset within the
> segment (the ea field) should be truncated according to the address size.
>
> As Intel SDM says: "In 64-bit mode, the effective address components are added
> and the effective address is truncated ... before adding the full 64-bit
> segment base."
>
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
>  arch/x86/kvm/emulate.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index e8a5840..743e8e3 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -631,7 +631,8 @@ static int __linearize(struct x86_emulate_ctxt *ctxt,
>  	u16 sel;
>  	unsigned cpl;
>
> -	la = seg_base(ctxt, addr.seg) + addr.ea;
> +	la = seg_base(ctxt, addr.seg) +
> +	    (ctxt->ad_bytes == 8 ? addr.ea : (u32)addr.ea);

I think you need "fetch || ctxt->ad_bytes == 8" here.

Paolo

>  	switch (ctxt->mode) {
>  	case X86EMUL_MODE_PROT64:
>  		if (((signed long)la << 16) >> 16 != la)
> @@ -678,7 +679,7 @@ static int __linearize(struct x86_emulate_ctxt *ctxt,
>  		}
>  		break;
>  	}
> -	if (fetch ? ctxt->mode != X86EMUL_MODE_PROT64 : ctxt->ad_bytes != 8)
> +	if (ctxt->mode != X86EMUL_MODE_PROT64)
>  		la &= (u32)-1;
>  	if (insn_aligned(ctxt, size) && ((la & (size - 1)) != 0))
>  		return emulate_gp(ctxt, 0);
>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call
  2014-05-07 12:32 ` [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call Nadav Amit
@ 2014-05-07 14:43   ` Bandan Das
  2014-05-07 15:57     ` Nadav Amit
  2014-05-07 15:59   ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Bandan Das @ 2014-05-07 14:43 UTC (permalink / raw)
  To: Nadav Amit
  Cc: mtosatti, pbonzini, hpa, gleb, tglx, mingo, x86, kvm, linux-kernel

Nadav Amit <namit@cs.technion.ac.il> writes:

> Relative jumps and calls do the masking according to the operand size, and not
> according to the address size as the KVM emulator does today.  In 64-bit mode,
> the resulting RIP is always 64-bit. Otherwise it is masked according to the
> instruction operand-size. Note that when 16-bit address size is used, bits
> 63:32 are unmodified.
>
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
>  arch/x86/kvm/emulate.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 6833b41..e406705 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -506,7 +506,9 @@ static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
>  
>  static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
>  {
> -	register_address_increment(ctxt, &ctxt->_eip, rel);
> +	/* 64-bit mode relative jumps are always 64-bit; otherwise mask */
> +	int op_bytes = ctxt->mode == X86EMUL_MODE_PROT64 ? 8 : ctxt->op_bytes;

Just a nit, probably break this up for readability ?

> +	masked_increment(&ctxt->eip, op_bytes, rel);
>  }
>  
>  static u32 desc_limit_scaled(struct desc_struct *desc)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode
  2014-05-07 12:32 ` [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode Nadav Amit
@ 2014-05-07 14:50   ` Bandan Das
  2014-05-07 15:24     ` Paolo Bonzini
  2014-05-07 16:11     ` Nadav Amit
  2014-05-07 15:52   ` Paolo Bonzini
  1 sibling, 2 replies; 17+ messages in thread
From: Bandan Das @ 2014-05-07 14:50 UTC (permalink / raw)
  To: Nadav Amit
  Cc: mtosatti, pbonzini, hpa, gleb, tglx, mingo, x86, kvm, linux-kernel

Nadav Amit <namit@cs.technion.ac.il> writes:

> 32-bit operations are zero extended in 64-bit mode. Currently, the code does
> not handle them correctly and keeps the high bits. In 16-bit mode, the high
> 32-bits are kept intact.
>
> In addition, although it is not well-documented, when address override prefix

It would be helpful to have a pointer in the SDM especially for cases 
that are not well-documented.

> is used with REP-string instruction, RCX high half is zeroed even if ECX was
> zero on the first iteration (as if an assignment was performed to ECX).
>
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
>  arch/x86/kvm/emulate.c | 38 +++++++++++++++++++++++---------------
>  1 file changed, 23 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 743e8e3..6833b41 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -434,9 +434,21 @@ static int emulator_check_intercept(struct x86_emulate_ctxt *ctxt,
>  	return ctxt->ops->intercept(ctxt, &info, stage);
>  }
>  
> -static void assign_masked(ulong *dest, ulong src, ulong mask)
> +static void assign_masked(ulong *dest, ulong src, int bytes)
>  {
> -	*dest = (*dest & ~mask) | (src & mask);
> +	switch (bytes) {
> +	case 2:
> +		*dest = (u16)src | (*dest & ~0xfffful);
> +		break;
> +	case 4:
> +		*dest = (u32)src;
> +		break;
> +	case 8:
> +		*dest = src;
> +		break;
> +	default:
> +		BUG();

IIRC, Paolo mentioned that a WARN() is preferable. But I see
a lot other places where BUG() is called, maybe, he can confirm.

> +	}
>  }
>  
>  static inline unsigned long ad_mask(struct x86_emulate_ctxt *ctxt)
> @@ -476,26 +488,20 @@ register_address(struct x86_emulate_ctxt *ctxt, unsigned long reg)
>  	return address_mask(ctxt, reg);
>  }
>  
> -static void masked_increment(ulong *reg, ulong mask, int inc)
> +static void masked_increment(ulong *reg, int size, int inc)
>  {
> -	assign_masked(reg, *reg + inc, mask);
> +	assign_masked(reg, *reg + inc, size);
>  }
>  
>  static inline void
>  register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, int inc)
>  {
> -	ulong mask;
> -
> -	if (ctxt->ad_bytes == sizeof(unsigned long))
> -		mask = ~0UL;
> -	else
> -		mask = ad_mask(ctxt);
> -	masked_increment(reg, mask, inc);
> +	masked_increment(reg, ctxt->ad_bytes, inc);
>  }
>  
>  static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
>  {
> -	masked_increment(reg_rmw(ctxt, VCPU_REGS_RSP), stack_mask(ctxt), inc);
> +	masked_increment(reg_rmw(ctxt, VCPU_REGS_RSP), stack_size(ctxt), inc);
>  }
>  
>  static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
> @@ -1712,17 +1718,17 @@ static int em_enter(struct x86_emulate_ctxt *ctxt)
>  	if (rc != X86EMUL_CONTINUE)
>  		return rc;
>  	assign_masked(reg_rmw(ctxt, VCPU_REGS_RBP), reg_read(ctxt, VCPU_REGS_RSP),
> -		      stack_mask(ctxt));
> +		      stack_size(ctxt));
>  	assign_masked(reg_rmw(ctxt, VCPU_REGS_RSP),
>  		      reg_read(ctxt, VCPU_REGS_RSP) - frame_size,
> -		      stack_mask(ctxt));
> +		      stack_size(ctxt));
>  	return X86EMUL_CONTINUE;
>  }
>  
>  static int em_leave(struct x86_emulate_ctxt *ctxt)
>  {
>  	assign_masked(reg_rmw(ctxt, VCPU_REGS_RSP), reg_read(ctxt, VCPU_REGS_RBP),
> -		      stack_mask(ctxt));
> +		      stack_size(ctxt));
>  	return emulate_pop(ctxt, reg_rmw(ctxt, VCPU_REGS_RBP), ctxt->op_bytes);
>  }
>  
> @@ -4570,6 +4576,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
>  	if (ctxt->rep_prefix && (ctxt->d & String)) {
>  		/* All REP prefixes have the same first termination condition */
>  		if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
> +			assign_masked(reg_write(ctxt, VCPU_REGS_RCX), 0,
> +				      ctxt->ad_bytes);
>  			ctxt->eip = ctxt->_eip;
>  			goto done;
>  		}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly
  2014-05-07 13:57   ` Paolo Bonzini
@ 2014-05-07 15:21     ` Nadav Amit
  0 siblings, 0 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 15:21 UTC (permalink / raw)
  To: Paolo Bonzini, Nadav Amit, mtosatti, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel

On 5/7/14, 4:57 PM, Paolo Bonzini wrote:
> Il 07/05/2014 14:32, Nadav Amit ha scritto:
>> In long-mode, when the address size is 4 bytes, the linear address is not
>> truncated as the emulator mistakenly does.  Instead, the offset within
>> the
>> segment (the ea field) should be truncated according to the address size.
>>
>> As Intel SDM says: "In 64-bit mode, the effective address components
>> are added
>> and the effective address is truncated ... before adding the full 64-bit
>> segment base."
>>
>> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
>> ---
>>  arch/x86/kvm/emulate.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index e8a5840..743e8e3 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -631,7 +631,8 @@ static int __linearize(struct x86_emulate_ctxt *ctxt,
>>      u16 sel;
>>      unsigned cpl;
>>
>> -    la = seg_base(ctxt, addr.seg) + addr.ea;
>> +    la = seg_base(ctxt, addr.seg) +
>> +        (ctxt->ad_bytes == 8 ? addr.ea : (u32)addr.ea);
>
> I think you need "fetch || ctxt->ad_bytes == 8" here.
>
> Paolo
>
Yes. I did not test the fetch scenario.
I intend to do so soon to avoid such mistakes.

Nadav

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode
  2014-05-07 14:50   ` Bandan Das
@ 2014-05-07 15:24     ` Paolo Bonzini
  2014-05-07 16:11     ` Nadav Amit
  1 sibling, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2014-05-07 15:24 UTC (permalink / raw)
  To: Bandan Das, Nadav Amit
  Cc: mtosatti, hpa, gleb, tglx, mingo, x86, kvm, linux-kernel

Il 07/05/2014 16:50, Bandan Das ha scritto:
>> > +static void assign_masked(ulong *dest, ulong src, int bytes)
>> >  {
>> > -	*dest = (*dest & ~mask) | (src & mask);
>> > +	switch (bytes) {
>> > +	case 2:
>> > +		*dest = (u16)src | (*dest & ~0xfffful);
>> > +		break;
>> > +	case 4:
>> > +		*dest = (u32)src;
>> > +		break;
>> > +	case 8:
>> > +		*dest = src;
>> > +		break;
>> > +	default:
>> > +		BUG();
> IIRC, Paolo mentioned that a WARN() is preferable. But I see
> a lot other places where BUG() is called, maybe, he can confirm.
>

There is really no reason to crash the host for a misused API, so I do 
believe that a WARN() is preferable.

Paolo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode
  2014-05-07 12:32 ` [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode Nadav Amit
  2014-05-07 14:50   ` Bandan Das
@ 2014-05-07 15:52   ` Paolo Bonzini
  2014-05-08 12:27     ` Nadav Amit
  1 sibling, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2014-05-07 15:52 UTC (permalink / raw)
  To: Nadav Amit, mtosatti, hpa; +Cc: gleb, tglx, mingo, x86, kvm, linux-kernel

Il 07/05/2014 14:32, Nadav Amit ha scritto:
> 32-bit operations are zero extended in 64-bit mode. Currently, the code does
> not handle them correctly and keeps the high bits. In 16-bit mode, the high
> 32-bits are kept intact.
>
> In addition, although it is not well-documented, when address override prefix
> is used with REP-string instruction, RCX high half is zeroed even if ECX was
> zero on the first iteration (as if an assignment was performed to ECX).

Is this true even for REPZ and ZF=0 or REPNZ and ZF=1?

Paolo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call
  2014-05-07 14:43   ` Bandan Das
@ 2014-05-07 15:57     ` Nadav Amit
  0 siblings, 0 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 15:57 UTC (permalink / raw)
  To: Bandan Das, Nadav Amit
  Cc: mtosatti, pbonzini, hpa, gleb, tglx, mingo, x86, kvm, linux-kernel

On 5/7/14, 5:43 PM, Bandan Das wrote:
> Nadav Amit <namit@cs.technion.ac.il> writes:
>
>> Relative jumps and calls do the masking according to the operand size, and not
>> according to the address size as the KVM emulator does today.  In 64-bit mode,
>> the resulting RIP is always 64-bit. Otherwise it is masked according to the
>> instruction operand-size. Note that when 16-bit address size is used, bits
>> 63:32 are unmodified.
>>
>> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
>> ---
>>   arch/x86/kvm/emulate.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 6833b41..e406705 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -506,7 +506,9 @@ static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
>>
>>   static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
>>   {
>> -	register_address_increment(ctxt, &ctxt->_eip, rel);
>> +	/* 64-bit mode relative jumps are always 64-bit; otherwise mask */
>> +	int op_bytes = ctxt->mode == X86EMUL_MODE_PROT64 ? 8 : ctxt->op_bytes;
>
> Just a nit, probably break this up for readability ?
>
I will make it more readable on the next version.

Thanks,
Nadav

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call
  2014-05-07 12:32 ` [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call Nadav Amit
  2014-05-07 14:43   ` Bandan Das
@ 2014-05-07 15:59   ` Paolo Bonzini
  1 sibling, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2014-05-07 15:59 UTC (permalink / raw)
  To: Nadav Amit, mtosatti, hpa; +Cc: gleb, tglx, mingo, x86, kvm, linux-kernel

Il 07/05/2014 14:32, Nadav Amit ha scritto:
> Relative jumps and calls do the masking according to the operand size, and not
> according to the address size as the KVM emulator does today.  In 64-bit mode,
> the resulting RIP is always 64-bit. Otherwise it is masked according to the
> instruction operand-size. Note that when 16-bit address size is used, bits
> 63:32 are unmodified.

The SDM says "If the operand-size attribute is 16, the upper two bytes 
of the EIP register are cleared, resulting in a maximum instruction 
pointer size of 16 bits".  I'm not sure whether that should also imply 
that 63:32 are _not_ unmodified (because you do a 32-bit write not a 
16-bit one), but in any case it looks like masked_increment is not the 
right function.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs
  2014-05-07 12:32 [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Nadav Amit
                   ` (4 preceding siblings ...)
  2014-05-07 12:32 ` [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call Nadav Amit
@ 2014-05-07 16:00 ` Paolo Bonzini
  5 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2014-05-07 16:00 UTC (permalink / raw)
  To: Nadav Amit, mtosatti, hpa; +Cc: gleb, tglx, mingo, x86, kvm, linux-kernel

Il 07/05/2014 14:32, Nadav Amit ha scritto:
> This series of patches fixes various scenarios in which KVM does not follow x86
> specifications.  Patches #4 and #5 are related; they reflect a new revision of
> the previously submitted patch that dealt with the wrong masking of registers
> in long-mode. Patch #3 is a follow-up to the previously sumbitted patch that
> fixed the wrong reserved page table masks. Patches #3 and #5 were not tested in
> a manner that actually checks the modified behavior. Not all the pathes in
> patch #4 were tested.
>
> Thanks for reviewing the patches.
>
> Nadav Amit (5):
>   KVM: x86: Emulator does not calculate address correctly
>   KVM: vmx: handle_dr does not handle RSP correctly
>   KVM: x86: Mark bit 7 in long-mode PDPTE according to 1GB pages support
>   KVM: x86: Wrong register masking in 64-bit mode
>   KVM: x86: Fix wrong masking on relative jump/call
>
>  arch/x86/kvm/cpuid.h   |  7 +++++++
>  arch/x86/kvm/emulate.c | 47 +++++++++++++++++++++++++++++------------------
>  arch/x86/kvm/mmu.c     |  8 ++++++--
>  arch/x86/kvm/vmx.c     |  2 +-
>  4 files changed, 43 insertions(+), 21 deletions(-)
>

I'll apply patches 2 and 3.  Thanks for your work!

Paolo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode
  2014-05-07 14:50   ` Bandan Das
  2014-05-07 15:24     ` Paolo Bonzini
@ 2014-05-07 16:11     ` Nadav Amit
  1 sibling, 0 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-07 16:11 UTC (permalink / raw)
  To: Bandan Das, Nadav Amit
  Cc: mtosatti, pbonzini, hpa, gleb, tglx, mingo, x86, kvm, linux-kernel

On 5/7/14, 5:50 PM, Bandan Das wrote:
> Nadav Amit <namit@cs.technion.ac.il> writes:
>
>> 32-bit operations are zero extended in 64-bit mode. Currently, the code does
>> not handle them correctly and keeps the high bits. In 16-bit mode, the high
>> 32-bits are kept intact.
>>
>> In addition, although it is not well-documented, when address override prefix
>
> It would be helpful to have a pointer in the SDM especially for cases
> that are not well-documented.
>
>> is used with REP-string instruction, RCX high half is zeroed even if ECX was
>> zero on the first iteration (as if an assignment was performed to ECX).
>>
>> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
>>

First, as for the different masking/zero-extending behavior in different 
modes, I guess this behavior is documented in SDM Volume 1, Section 
3.3.7: "Address Calculations in 64-Bit Mode". It appears (from 
experiments on bare-metal) that it also regards ESI/EDI/ECX on 
REP-prefix instructions and I presume it regards ESP on stack operations 
and ECX on loop operations. I will ensure it soon.

Second, the behavior of zero-extending RCX even if ECX is initially zero 
was experienced on native environment (Intel SandyBridge), and we found 
no erratum regarding it. I actually found no documentation that 
describes this behavior. The term "not well-documented" is probably 
misleading, and will be clarified on v2.

Thanks,
Nadav


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode
  2014-05-07 15:52   ` Paolo Bonzini
@ 2014-05-08 12:27     ` Nadav Amit
  0 siblings, 0 replies; 17+ messages in thread
From: Nadav Amit @ 2014-05-08 12:27 UTC (permalink / raw)
  To: Paolo Bonzini, Nadav Amit, mtosatti, hpa
  Cc: gleb, tglx, mingo, x86, kvm, linux-kernel

On 5/7/14, 6:52 PM, Paolo Bonzini wrote:
> Il 07/05/2014 14:32, Nadav Amit ha scritto:
>> 32-bit operations are zero extended in 64-bit mode. Currently, the
>> code does
>> not handle them correctly and keeps the high bits. In 16-bit mode, the
>> high
>> 32-bits are kept intact.
>>
>> In addition, although it is not well-documented, when address override
>> prefix
>> is used with REP-string instruction, RCX high half is zeroed even if
>> ECX was
>> zero on the first iteration (as if an assignment was performed to ECX).
>
> Is this true even for REPZ and ZF=0 or REPNZ and ZF=1?
>
> Paolo
The REPZ and REPNZ condition is checked on the end of an iteration (see 
the REP instruction description on the SDM), so it does not matter. So 
even REPZ/RENZ would zero RCX high half.

This "feature" is not well-documented, but can be observed. Here is a 
small code you can try.

---
#include <stdio.h>
unsigned long src, dst;
int main()
{
	unsigned long long rsi, rdi, rcx;
	rcx = 0xffffffff00000000ull;
	rsi = (unsigned long long)&src | 0xffffffff00000000ull;
	rdi = (unsigned long long)&dst | 0xffffffff00000000ull;
	printf("before: rsi %llx rdi %llx rcx %llx\n", rsi, rdi ,rcx);
	asm volatile (  ".byte 0x67\n\t"
			"repne cmpsd\n\t"
			: "+S" (rsi), "+D" (rdi), "+c" (rcx)
			: : "memory", "cc" );
	printf("after: rsi %llx rdi %llx rcx %llx\n", rsi, rdi ,rcx);
	return 0;
}
---

Nadav


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-05-08 12:27 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-07 12:32 [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Nadav Amit
2014-05-07 12:32 ` [PATCH 1/5] KVM: x86: Emulator does not calculate address correctly Nadav Amit
2014-05-07 13:57   ` Paolo Bonzini
2014-05-07 15:21     ` Nadav Amit
2014-05-07 12:32 ` [PATCH 2/5] KVM: vmx: handle_dr does not handle RSP correctly Nadav Amit
2014-05-07 12:32 ` [PATCH 3/5] KVM: x86: Mark bit 7 in long-mode PDPTE according to 1GB pages support Nadav Amit
2014-05-07 12:32 ` [PATCH 4/5] KVM: x86: Wrong register masking in 64-bit mode Nadav Amit
2014-05-07 14:50   ` Bandan Das
2014-05-07 15:24     ` Paolo Bonzini
2014-05-07 16:11     ` Nadav Amit
2014-05-07 15:52   ` Paolo Bonzini
2014-05-08 12:27     ` Nadav Amit
2014-05-07 12:32 ` [PATCH 5/5] KVM: x86: Fix wrong masking on relative jump/call Nadav Amit
2014-05-07 14:43   ` Bandan Das
2014-05-07 15:57     ` Nadav Amit
2014-05-07 15:59   ` Paolo Bonzini
2014-05-07 16:00 ` [PATCH 0/5] KVM: x86: Fix exit handler and emulation bugs Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).