All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for 4.9 0/6] x86/emul: Fixes
@ 2017-03-31 19:50 Andrew Cooper
  2017-03-31 19:50 ` [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology Andrew Cooper
                   ` (5 more replies)
  0 siblings, 6 replies; 37+ messages in thread
From: Andrew Cooper @ 2017-03-31 19:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Julien Grall, Jan Beulich

This series started out as patchs 4 and 5, to aid the userspace fuzzing
harness, but ended up discovering the bug in patch 3, which is security
relevent.

Patch 3 is a must-fix for Xen 4.9, before the bug needs an XSA.  Patches 4-6
are nice-to-have.

Andrew Cooper (6):
  x86/hvm: Correct some address space terminology
  x86/hvm: Correct long mode predicate
  x86/hvm: Fix segmentation logic for system segments
  x86/svm: Introduce svm_emul_swint_injection()
  x86/emul: Drop swint_emulate infrastructure
  x86/emul: Require callers to provide LMA in the emulation context

 tools/fuzz/x86_instruction_emulator/fuzz-emul.c |  19 +-
 tools/tests/x86_emulator/test_x86_emulator.c    |   4 +
 xen/arch/x86/cpuid.c                            |   2 +-
 xen/arch/x86/hvm/emulate.c                      |  36 ++--
 xen/arch/x86/hvm/hvm.c                          |  53 ++++--
 xen/arch/x86/hvm/svm/svm.c                      | 142 +++++++++++++-
 xen/arch/x86/hvm/vmx/vmx.c                      |   6 +-
 xen/arch/x86/hvm/vmx/vvmx.c                     |   8 +-
 xen/arch/x86/mm.c                               |   4 +-
 xen/arch/x86/mm/hap/hap.c                       |   8 +-
 xen/arch/x86/mm/shadow/common.c                 |  27 +--
 xen/arch/x86/oprofile/backtrace.c               |   2 +-
 xen/arch/x86/traps.c                            |   1 +
 xen/arch/x86/x86_emulate/x86_emulate.c          | 238 ++++--------------------
 xen/arch/x86/x86_emulate/x86_emulate.h          |  56 +-----
 xen/include/asm-x86/hvm/hvm.h                   |  15 +-
 16 files changed, 285 insertions(+), 336 deletions(-)

-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology
  2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
@ 2017-03-31 19:50 ` Andrew Cooper
  2017-04-03  8:24   ` Paul Durrant
  2017-04-03  8:24   ` Jan Beulich
  2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 37+ messages in thread
From: Andrew Cooper @ 2017-03-31 19:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Julien Grall, Paul Durrant, Tim Deegan, Jan Beulich

The function hvm_translate_linear_addr() translates a virtual address to a
linear address, not a linear address to a physical address.  Correct its name.

Both hvm_translate_virtual_addr() and hvmemul_virtual_to_linear() return a
linear address, not a physical address.  Correct the parameters name.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Julien Grall <julien.grall@arm.com>
---
 xen/arch/x86/hvm/emulate.c      | 12 ++++++------
 xen/arch/x86/mm/shadow/common.c | 16 ++++++++--------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 2d92957..4073715 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -506,7 +506,7 @@ static int hvmemul_virtual_to_linear(
     unsigned long *reps,
     enum hvm_access_type access_type,
     struct hvm_emulate_ctxt *hvmemul_ctxt,
-    unsigned long *paddr)
+    unsigned long *linear)
 {
     struct segment_register *reg;
     int okay;
@@ -514,7 +514,7 @@ static int hvmemul_virtual_to_linear(
 
     if ( seg == x86_seg_none )
     {
-        *paddr = offset;
+        *linear = offset;
         return X86EMUL_OKAY;
     }
 
@@ -549,16 +549,16 @@ static int hvmemul_virtual_to_linear(
         okay = hvm_virtual_to_linear_addr(
             seg, reg, offset - (*reps - 1) * bytes_per_rep,
             *reps * bytes_per_rep, access_type,
-            hvmemul_ctxt->ctxt.addr_size, paddr);
-        *paddr += (*reps - 1) * bytes_per_rep;
+            hvmemul_ctxt->ctxt.addr_size, linear);
+        *linear += (*reps - 1) * bytes_per_rep;
         if ( hvmemul_ctxt->ctxt.addr_size != 64 )
-            *paddr = (uint32_t)*paddr;
+            *linear = (uint32_t)*linear;
     }
     else
     {
         okay = hvm_virtual_to_linear_addr(
             seg, reg, offset, *reps * bytes_per_rep, access_type,
-            hvmemul_ctxt->ctxt.addr_size, paddr);
+            hvmemul_ctxt->ctxt.addr_size, linear);
     }
 
     if ( okay )
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index d93f2ab..03cb24d 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -136,13 +136,13 @@ static struct segment_register *hvm_get_seg_reg(
     return seg_reg;
 }
 
-static int hvm_translate_linear_addr(
+static int hvm_translate_virtual_addr(
     enum x86_segment seg,
     unsigned long offset,
     unsigned int bytes,
     enum hvm_access_type access_type,
     struct sh_emulate_ctxt *sh_ctxt,
-    unsigned long *paddr)
+    unsigned long *linear)
 {
     const struct segment_register *reg;
     int okay;
@@ -152,7 +152,7 @@ static int hvm_translate_linear_addr(
         return -PTR_ERR(reg);
 
     okay = hvm_virtual_to_linear_addr(
-        seg, reg, offset, bytes, access_type, sh_ctxt->ctxt.addr_size, paddr);
+        seg, reg, offset, bytes, access_type, sh_ctxt->ctxt.addr_size, linear);
 
     if ( !okay )
     {
@@ -183,7 +183,7 @@ hvm_read(enum x86_segment seg,
     unsigned long addr;
     int rc;
 
-    rc = hvm_translate_linear_addr(
+    rc = hvm_translate_virtual_addr(
         seg, offset, bytes, access_type, sh_ctxt, &addr);
     if ( rc || !bytes )
         return rc;
@@ -265,7 +265,7 @@ hvm_emulate_write(enum x86_segment seg,
     if ( seg == x86_seg_ss )
         perfc_incr(shadow_fault_emulate_stack);
 
-    rc = hvm_translate_linear_addr(
+    rc = hvm_translate_virtual_addr(
         seg, offset, bytes, hvm_access_write, sh_ctxt, &addr);
     if ( rc || !bytes )
         return rc;
@@ -291,7 +291,7 @@ hvm_emulate_cmpxchg(enum x86_segment seg,
     if ( bytes > sizeof(long) )
         return X86EMUL_UNHANDLEABLE;
 
-    rc = hvm_translate_linear_addr(
+    rc = hvm_translate_virtual_addr(
         seg, offset, bytes, hvm_access_write, sh_ctxt, &addr);
     if ( rc )
         return rc;
@@ -345,7 +345,7 @@ const struct x86_emulate_ops *shadow_init_emulation(
     /* Attempt to prefetch whole instruction. */
     sh_ctxt->insn_buf_eip = regs->rip;
     sh_ctxt->insn_buf_bytes =
-        (!hvm_translate_linear_addr(
+        (!hvm_translate_virtual_addr(
             x86_seg_cs, regs->rip, sizeof(sh_ctxt->insn_buf),
             hvm_access_insn_fetch, sh_ctxt, &addr) &&
          !hvm_fetch_from_guest_linear(
@@ -374,7 +374,7 @@ void shadow_continue_emulation(struct sh_emulate_ctxt *sh_ctxt,
     {
         /* Prefetch more bytes. */
         sh_ctxt->insn_buf_bytes =
-            (!hvm_translate_linear_addr(
+            (!hvm_translate_virtual_addr(
                 x86_seg_cs, regs->rip, sizeof(sh_ctxt->insn_buf),
                 hvm_access_insn_fetch, sh_ctxt, &addr) &&
              !hvm_fetch_from_guest_linear(
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate
  2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
  2017-03-31 19:50 ` [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology Andrew Cooper
@ 2017-03-31 19:50 ` Andrew Cooper
  2017-04-03  8:26   ` Paul Durrant
                     ` (3 more replies)
  2017-03-31 19:50 ` [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments Andrew Cooper
                   ` (3 subsequent siblings)
  5 siblings, 4 replies; 37+ messages in thread
From: Andrew Cooper @ 2017-03-31 19:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Andrew Cooper,
	Tim Deegan, Julien Grall, Paul Durrant, Jun Nakajima,
	Boris Ostrovsky, Suravee Suthikulpanit

hvm_long_mode_enabled() tests for EFER.LMA, which is specifically different to
EFER.LME.

Rename it to match its behaviour, and have it strictly return a boolean value
(although all its callers already use it in implicitly-boolean contexts, so no
functional change).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Julien Grall <julien.grall@arm.com>
---
 xen/arch/x86/cpuid.c              |  2 +-
 xen/arch/x86/hvm/emulate.c        |  2 +-
 xen/arch/x86/hvm/hvm.c            | 10 +++++-----
 xen/arch/x86/hvm/svm/svm.c        |  6 +++---
 xen/arch/x86/hvm/vmx/vmx.c        |  6 +++---
 xen/arch/x86/hvm/vmx/vvmx.c       |  8 ++++----
 xen/arch/x86/mm/hap/hap.c         |  8 ++++----
 xen/arch/x86/mm/shadow/common.c   |  4 ++--
 xen/arch/x86/oprofile/backtrace.c |  2 +-
 xen/include/asm-x86/hvm/hvm.h     |  3 +--
 10 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 1c6a6c6..d359e09 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -911,7 +911,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
     case 0x80000001:
         /* SYSCALL is hidden outside of long mode on Intel. */
         if ( p->x86_vendor == X86_VENDOR_INTEL &&
-             is_hvm_domain(d) && !hvm_long_mode_enabled(v) )
+             is_hvm_domain(d) && !hvm_long_mode_active(v) )
             res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
 
     common_leaf1_adjustments:
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 4073715..3d084ca 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2051,7 +2051,7 @@ void hvm_emulate_init_per_insn(
     unsigned int pfec = PFEC_page_present;
     unsigned long addr;
 
-    if ( hvm_long_mode_enabled(curr) &&
+    if ( hvm_long_mode_active(curr) &&
          hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
         hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
     else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0282986..9f83cd8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2227,7 +2227,7 @@ int hvm_set_cr0(unsigned long value, bool_t may_defer)
         }
 
         /* When CR0.PG is cleared, LMA is cleared immediately. */
-        if ( hvm_long_mode_enabled(v) )
+        if ( hvm_long_mode_active(v) )
         {
             v->arch.hvm_vcpu.guest_efer &= ~EFER_LMA;
             hvm_update_guest_efer(v);
@@ -2321,7 +2321,7 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer)
 
     if ( !(value & X86_CR4_PAE) )
     {
-        if ( hvm_long_mode_enabled(v) )
+        if ( hvm_long_mode_active(v) )
         {
             HVM_DBG_LOG(DBG_LEVEL_1, "Guest cleared CR4.PAE while "
                         "EFER.LMA is set");
@@ -2332,7 +2332,7 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer)
     old_cr = v->arch.hvm_vcpu.guest_cr[4];
 
     if ( (value & X86_CR4_PCIDE) && !(old_cr & X86_CR4_PCIDE) &&
-         (!hvm_long_mode_enabled(v) ||
+         (!hvm_long_mode_active(v) ||
           (v->arch.hvm_vcpu.guest_cr[3] & 0xfff)) )
     {
         HVM_DBG_LOG(DBG_LEVEL_1, "Guest attempts to change CR4.PCIDE from "
@@ -3605,7 +3605,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
 
         if ( hvm_virtual_to_linear_addr(x86_seg_cs, cs, regs->rip,
                                         sizeof(sig), hvm_access_insn_fetch,
-                                        (hvm_long_mode_enabled(cur) &&
+                                        (hvm_long_mode_active(cur) &&
                                          cs->attr.fields.l) ? 64 :
                                         cs->attr.fields.db ? 32 : 16, &addr) &&
              (hvm_fetch_from_guest_linear(sig, addr, sizeof(sig),
@@ -3616,7 +3616,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
             regs->eflags &= ~X86_EFLAGS_RF;
 
             /* Zero the upper 32 bits of %rip if not in 64bit mode. */
-            if ( !(hvm_long_mode_enabled(cur) && cs->attr.fields.l) )
+            if ( !(hvm_long_mode_active(cur) && cs->attr.fields.l) )
                 regs->rip = regs->eip;
 
             add_taint(TAINT_HVM_FEP);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index b69789b..4d7e49f 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -516,7 +516,7 @@ static int svm_guest_x86_mode(struct vcpu *v)
         return 0;
     if ( unlikely(guest_cpu_user_regs()->eflags & X86_EFLAGS_VM) )
         return 1;
-    if ( hvm_long_mode_enabled(v) && likely(vmcb->cs.attr.fields.l) )
+    if ( hvm_long_mode_active(v) && likely(vmcb->cs.attr.fields.l) )
         return 8;
     return (likely(vmcb->cs.attr.fields.db) ? 4 : 2);
 }
@@ -2279,7 +2279,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 
     exit_reason = vmcb->exitcode;
 
-    if ( hvm_long_mode_enabled(v) )
+    if ( hvm_long_mode_active(v) )
         HVMTRACE_ND(VMEXIT64, vcpu_guestmode ? TRC_HVM_NESTEDFLAG : 0,
                     1/*cycles*/, 3, exit_reason,
                     regs->eip, regs->rip >> 32, 0, 0, 0);
@@ -2429,7 +2429,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         {
             if ( trace_will_trace_event(TRC_SHADOW) )
                 break;
-            if ( hvm_long_mode_enabled(v) )
+            if ( hvm_long_mode_active(v) )
                 HVMTRACE_LONG_2D(PF_XEN, regs->error_code, TRC_PAR_LONG(va));
             else
                 HVMTRACE_2D(PF_XEN, regs->error_code, va);
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d201956..b6526c9 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -611,7 +611,7 @@ int vmx_guest_x86_mode(struct vcpu *v)
     if ( unlikely(guest_cpu_user_regs()->eflags & X86_EFLAGS_VM) )
         return 1;
     __vmread(GUEST_CS_AR_BYTES, &cs_ar_bytes);
-    if ( hvm_long_mode_enabled(v) &&
+    if ( hvm_long_mode_active(v) &&
          likely(cs_ar_bytes & X86_SEG_AR_CS_LM_ACTIVE) )
         return 8;
     return (likely(cs_ar_bytes & X86_SEG_AR_DEF_OP_SIZE) ? 4 : 2);
@@ -3392,7 +3392,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
     __vmread(VM_EXIT_REASON, &exit_reason);
 
-    if ( hvm_long_mode_enabled(v) )
+    if ( hvm_long_mode_active(v) )
         HVMTRACE_ND(VMEXIT64, 0, 1/*cycles*/, 3, exit_reason,
                     regs->eip, regs->rip >> 32, 0, 0, 0);
     else
@@ -3632,7 +3632,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             {
                 if ( trace_will_trace_event(TRC_SHADOW) )
                     break;
-                if ( hvm_long_mode_enabled(v) )
+                if ( hvm_long_mode_active(v) )
                     HVMTRACE_LONG_2D(PF_XEN, regs->error_code,
                                      TRC_PAR_LONG(exit_qualification) );
                 else
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 09e4250..e9860f7 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -392,7 +392,7 @@ static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
     else if ( !nvmx_vcpu_in_vmx(v) )
         goto invalid_op;
 
-    if ( vmx_guest_x86_mode(v) < (hvm_long_mode_enabled(v) ? 8 : 2) )
+    if ( vmx_guest_x86_mode(v) < (hvm_long_mode_active(v) ? 8 : 2) )
         goto invalid_op;
     else if ( nestedhvm_vcpu_in_guestmode(v) )
         goto vmexit;
@@ -1154,13 +1154,13 @@ static void virtual_vmentry(struct cpu_user_regs *regs)
     /*
      * EFER handling:
      * hvm_set_efer won't work if CR0.PG = 1, so we change the value
-     * directly to make hvm_long_mode_enabled(v) work in L2.
+     * directly to make hvm_long_mode_active(v) work in L2.
      * An additional update_paging_modes is also needed if
      * there is 32/64 switch. v->arch.hvm_vcpu.guest_efer doesn't
      * need to be saved, since its value on vmexit is determined by
      * L1 exit_controls
      */
-    lm_l1 = !!hvm_long_mode_enabled(v);
+    lm_l1 = !!hvm_long_mode_active(v);
     lm_l2 = !!(get_vvmcs(v, VM_ENTRY_CONTROLS) & VM_ENTRY_IA32E_MODE);
 
     if ( lm_l2 )
@@ -1359,7 +1359,7 @@ static void virtual_vmexit(struct cpu_user_regs *regs)
     nvcpu->nv_vmexit_pending = 0;
     nvcpu->nv_vmswitch_in_progress = 1;
 
-    lm_l2 = !!hvm_long_mode_enabled(v);
+    lm_l2 = !!hvm_long_mode_active(v);
     lm_l1 = !!(get_vvmcs(v, VM_EXIT_CONTROLS) & VM_EXIT_IA32E_MODE);
 
     if ( lm_l1 )
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index a57b385..283d4b7 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -690,10 +690,10 @@ static void hap_update_cr3(struct vcpu *v, int do_locking)
 const struct paging_mode *
 hap_paging_get_mode(struct vcpu *v)
 {
-    return !hvm_paging_enabled(v)   ? &hap_paging_real_mode :
-        hvm_long_mode_enabled(v) ? &hap_paging_long_mode :
-        hvm_pae_enabled(v)       ? &hap_paging_pae_mode  :
-                                   &hap_paging_protected_mode;
+    return !hvm_paging_enabled(v) ? &hap_paging_real_mode :
+        hvm_long_mode_active(v)   ? &hap_paging_long_mode :
+        hvm_pae_enabled(v)        ? &hap_paging_pae_mode  :
+                                    &hap_paging_protected_mode;
 }
 
 static void hap_update_paging_modes(struct vcpu *v)
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 03cb24d..14a07dd 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -331,7 +331,7 @@ const struct x86_emulate_ops *shadow_init_emulation(
     creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
 
     /* Work out the emulation mode. */
-    if ( hvm_long_mode_enabled(v) && creg->attr.fields.l )
+    if ( hvm_long_mode_active(v) && creg->attr.fields.l )
     {
         sh_ctxt->ctxt.addr_size = sh_ctxt->ctxt.sp_size = 64;
     }
@@ -2921,7 +2921,7 @@ static void sh_update_paging_modes(struct vcpu *v)
             v->arch.guest_table = d->arch.paging.shadow.unpaged_pagetable;
             v->arch.paging.mode = &SHADOW_INTERNAL_NAME(sh_paging_mode, 2);
         }
-        else if ( hvm_long_mode_enabled(v) )
+        else if ( hvm_long_mode_active(v) )
         {
             // long mode guest...
             v->arch.paging.mode =
diff --git a/xen/arch/x86/oprofile/backtrace.c b/xen/arch/x86/oprofile/backtrace.c
index f0fbb42..316821f 100644
--- a/xen/arch/x86/oprofile/backtrace.c
+++ b/xen/arch/x86/oprofile/backtrace.c
@@ -47,7 +47,7 @@ dump_hypervisor_backtrace(struct vcpu *vcpu, const struct frame_head *head,
 static inline int is_32bit_vcpu(struct vcpu *vcpu)
 {
     if (is_hvm_vcpu(vcpu))
-        return !hvm_long_mode_enabled(vcpu);
+        return !hvm_long_mode_active(vcpu);
     else
         return is_pv_32bit_vcpu(vcpu);
 }
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c854183..49c8001 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -302,8 +302,7 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
 #define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
 #define hap_has_2mb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_2MB))
 
-#define hvm_long_mode_enabled(v) \
-    ((v)->arch.hvm_vcpu.guest_efer & EFER_LMA)
+#define hvm_long_mode_active(v) (!!((v)->arch.hvm_vcpu.guest_efer & EFER_LMA))
 
 enum hvm_intblk
 hvm_interrupt_blocked(struct vcpu *v, struct hvm_intack intack);
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
  2017-03-31 19:50 ` [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology Andrew Cooper
  2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
@ 2017-03-31 19:50 ` Andrew Cooper
  2017-04-03  8:31   ` Paul Durrant
  2017-04-03  9:13   ` Jan Beulich
  2017-03-31 19:50 ` [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection() Andrew Cooper
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 37+ messages in thread
From: Andrew Cooper @ 2017-03-31 19:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Julien Grall, Paul Durrant, Tim Deegan, Jan Beulich

c/s c785f759718 "x86/emul: Prepare to allow use of system segments for memory
references" made alterations to hvm_virtual_to_linear_addr() to allow for the
use of system segments.

However, the determination of which segmentation mode to use was based on the
current address size from emulation.  In particular, it is wrong for system
segment accesses while executing in a compatibility mode code segment.

Replace the addr_size parameter with a new segmentation mode enumeration (to
prevent this mistake from being made again), and introduce a new helper to
calculate the correct segmenation mode from current register state.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: Julien Grall <julien.grall@arm.com>

This is the same logical bug that caused XSA-196, but luckily hasn't yet been
in a released version of Xen.
---
 xen/arch/x86/hvm/emulate.c      | 17 ++++++++++------
 xen/arch/x86/hvm/hvm.c          | 45 +++++++++++++++++++++++++++++------------
 xen/arch/x86/mm/shadow/common.c |  5 ++++-
 xen/include/asm-x86/hvm/hvm.h   | 12 ++++++++++-
 4 files changed, 58 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 3d084ca..f578796 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -508,6 +508,7 @@ static int hvmemul_virtual_to_linear(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
     unsigned long *linear)
 {
+    enum hvm_segmentation_mode seg_mode;
     struct segment_register *reg;
     int okay;
     unsigned long max_reps = 4096;
@@ -518,6 +519,9 @@ static int hvmemul_virtual_to_linear(
         return X86EMUL_OKAY;
     }
 
+    seg_mode = hvm_seg_mode(
+        current, seg, hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt));
+
     /*
      * If introspection has been enabled for this domain, and we're emulating
      * becase a vm_reply asked us to (i.e. not doing regular IO) reps should
@@ -548,8 +552,7 @@ static int hvmemul_virtual_to_linear(
         ASSERT(offset >= ((*reps - 1) * bytes_per_rep));
         okay = hvm_virtual_to_linear_addr(
             seg, reg, offset - (*reps - 1) * bytes_per_rep,
-            *reps * bytes_per_rep, access_type,
-            hvmemul_ctxt->ctxt.addr_size, linear);
+            *reps * bytes_per_rep, access_type, seg_mode, linear);
         *linear += (*reps - 1) * bytes_per_rep;
         if ( hvmemul_ctxt->ctxt.addr_size != 64 )
             *linear = (uint32_t)*linear;
@@ -557,8 +560,8 @@ static int hvmemul_virtual_to_linear(
     else
     {
         okay = hvm_virtual_to_linear_addr(
-            seg, reg, offset, *reps * bytes_per_rep, access_type,
-            hvmemul_ctxt->ctxt.addr_size, linear);
+            seg, reg, offset, *reps * bytes_per_rep,
+            access_type, seg_mode, linear);
     }
 
     if ( okay )
@@ -2068,14 +2071,16 @@ void hvm_emulate_init_per_insn(
     hvmemul_ctxt->insn_buf_eip = hvmemul_ctxt->ctxt.regs->rip;
     if ( !insn_bytes )
     {
+        enum hvm_segmentation_mode seg_mode =
+            hvm_seg_mode(curr, x86_seg_cs, &hvmemul_ctxt->seg_reg[x86_seg_cs]);
+
         hvmemul_ctxt->insn_buf_bytes =
             hvm_get_insn_bytes(curr, hvmemul_ctxt->insn_buf) ?:
             (hvm_virtual_to_linear_addr(x86_seg_cs,
                                         &hvmemul_ctxt->seg_reg[x86_seg_cs],
                                         hvmemul_ctxt->insn_buf_eip,
                                         sizeof(hvmemul_ctxt->insn_buf),
-                                        hvm_access_insn_fetch,
-                                        hvmemul_ctxt->ctxt.addr_size,
+                                        hvm_access_insn_fetch, seg_mode,
                                         &addr) &&
              hvm_fetch_from_guest_linear(hvmemul_ctxt->insn_buf, addr,
                                          sizeof(hvmemul_ctxt->insn_buf),
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 9f83cd8..f250afb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer)
     return X86EMUL_OKAY;
 }
 
+enum hvm_segmentation_mode hvm_seg_mode(
+    const struct vcpu *v, enum x86_segment seg,
+    const struct segment_register *cs)
+{
+    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
+        return hvm_seg_mode_real;
+
+    if ( hvm_long_mode_active(v) &&
+         (is_x86_system_segment(seg) || cs->attr.fields.l) )
+        return hvm_seg_mode_long;
+
+    return hvm_seg_mode_prot;
+}
+
 bool_t hvm_virtual_to_linear_addr(
     enum x86_segment seg,
     const struct segment_register *reg,
     unsigned long offset,
     unsigned int bytes,
     enum hvm_access_type access_type,
-    unsigned int addr_size,
+    enum hvm_segmentation_mode seg_mode,
     unsigned long *linear_addr)
 {
     unsigned long addr = offset, last_byte;
@@ -2394,8 +2408,9 @@ bool_t hvm_virtual_to_linear_addr(
      */
     ASSERT(seg < x86_seg_none);
 
-    if ( !(current->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
+    switch ( seg_mode )
     {
+    case hvm_seg_mode_real:
         /*
          * REAL MODE: Don't bother with segment access checks.
          * Certain of them are not done in native real mode anyway.
@@ -2404,11 +2419,11 @@ bool_t hvm_virtual_to_linear_addr(
         last_byte = (uint32_t)addr + bytes - !!bytes;
         if ( last_byte < addr )
             goto out;
-    }
-    else if ( addr_size != 64 )
-    {
+        break;
+
+    case hvm_seg_mode_prot:
         /*
-         * COMPATIBILITY MODE: Apply segment checks and add base.
+         * PROTECTED/COMPATIBILITY MODE: Apply segment checks and add base.
          */
 
         /*
@@ -2454,9 +2469,9 @@ bool_t hvm_virtual_to_linear_addr(
         }
         else if ( (last_byte > reg->limit) || (last_byte < offset) )
             goto out; /* last byte is beyond limit or wraps 0xFFFFFFFF */
-    }
-    else
-    {
+        break;
+
+    case hvm_seg_mode_long:
         /*
          * User segments are always treated as present.  System segment may
          * not be, and also incur limit checks.
@@ -2476,6 +2491,11 @@ bool_t hvm_virtual_to_linear_addr(
         if ( !is_canonical_address(addr) || last_byte < addr ||
              !is_canonical_address(last_byte) )
             goto out;
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+        goto out;
     }
 
     /* All checks ok. */
@@ -3024,7 +3044,7 @@ void hvm_task_switch(
             sp = regs->sp -= opsz;
         if ( hvm_virtual_to_linear_addr(x86_seg_ss, &segr, sp, opsz,
                                         hvm_access_write,
-                                        16 << segr.attr.fields.db,
+                                        hvm_seg_mode_prot,
                                         &linear_addr) )
         {
             rc = hvm_copy_to_guest_linear(linear_addr, &errcode, opsz, 0,
@@ -3600,14 +3620,13 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
         const struct segment_register *cs = &ctxt.seg_reg[x86_seg_cs];
         uint32_t walk = (ctxt.seg_reg[x86_seg_ss].attr.fields.dpl == 3)
             ? PFEC_user_mode : 0;
+        enum hvm_segmentation_mode seg_mode = hvm_seg_mode(cur, x86_seg_cs, cs);
         unsigned long addr;
         char sig[5]; /* ud2; .ascii "xen" */
 
         if ( hvm_virtual_to_linear_addr(x86_seg_cs, cs, regs->rip,
                                         sizeof(sig), hvm_access_insn_fetch,
-                                        (hvm_long_mode_active(cur) &&
-                                         cs->attr.fields.l) ? 64 :
-                                        cs->attr.fields.db ? 32 : 16, &addr) &&
+                                        seg_mode, &addr) &&
              (hvm_fetch_from_guest_linear(sig, addr, sizeof(sig),
                                           walk, NULL) == HVMCOPY_okay) &&
              (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) )
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 14a07dd..551a7d3 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -144,6 +144,7 @@ static int hvm_translate_virtual_addr(
     struct sh_emulate_ctxt *sh_ctxt,
     unsigned long *linear)
 {
+    enum hvm_segmentation_mode seg_mode;
     const struct segment_register *reg;
     int okay;
 
@@ -151,8 +152,10 @@ static int hvm_translate_virtual_addr(
     if ( IS_ERR(reg) )
         return -PTR_ERR(reg);
 
+    seg_mode = hvm_seg_mode(current, seg, hvm_get_seg_reg(x86_seg_cs, sh_ctxt));
+
     okay = hvm_virtual_to_linear_addr(
-        seg, reg, offset, bytes, access_type, sh_ctxt->ctxt.addr_size, linear);
+        seg, reg, offset, bytes, access_type, seg_mode, linear);
 
     if ( !okay )
     {
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 49c8001..ed6994c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -460,6 +460,16 @@ void hvm_task_switch(
     uint16_t tss_sel, enum hvm_task_switch_reason taskswitch_reason,
     int32_t errcode);
 
+enum hvm_segmentation_mode {
+    hvm_seg_mode_real,
+    hvm_seg_mode_prot,
+    hvm_seg_mode_long,
+};
+
+enum hvm_segmentation_mode hvm_seg_mode(
+    const struct vcpu *v, enum x86_segment seg,
+    const struct segment_register *cs);
+
 enum hvm_access_type {
     hvm_access_insn_fetch,
     hvm_access_none,
@@ -472,7 +482,7 @@ bool_t hvm_virtual_to_linear_addr(
     unsigned long offset,
     unsigned int bytes,
     enum hvm_access_type access_type,
-    unsigned int addr_size,
+    enum hvm_segmentation_mode seg_mode,
     unsigned long *linear_addr);
 
 void *hvm_map_guest_frame_rw(unsigned long gfn, bool_t permanent,
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection()
  2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
                   ` (2 preceding siblings ...)
  2017-03-31 19:50 ` [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments Andrew Cooper
@ 2017-03-31 19:50 ` Andrew Cooper
  2017-04-03  9:30   ` Jan Beulich
  2017-04-03 14:04   ` Boris Ostrovsky
  2017-03-31 19:50 ` [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure Andrew Cooper
  2017-03-31 19:50 ` [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context Andrew Cooper
  5 siblings, 2 replies; 37+ messages in thread
From: Andrew Cooper @ 2017-03-31 19:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Boris Ostrovsky, Julien Grall,
	Suravee Suthikulpanit, Jan Beulich

Software events require emulation in some cases on AMD hardware.  Introduce
svm_emul_swint_injection() to perform this emulation if necessary in
svm_inject_event(), which will cope with any sources of event, rather than
just those coming from x86_emulate().

This logic mirrors inject_swint() in the x86 instruction emulator.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Julien Grall <julien.grall@arm.com>
---
 xen/arch/x86/hvm/svm/svm.c | 136 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 4d7e49f..6d77c7e 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1183,6 +1183,124 @@ static void svm_vcpu_destroy(struct vcpu *v)
     passive_domain_destroy(v);
 }
 
+/*
+ * Emulate enough of interrupt injection to cover the DPL check (omitted by
+ * hardware), and to work out whether it is safe to move %rip fowards for
+ * architectural trap vs fault semantics in the exception frame (which
+ * hardware won't cope with).
+ *
+ * The event parameter will be modified to a fault if necessary.
+ */
+static void svm_emul_swint_injection(struct x86_event *event)
+{
+    struct vcpu *curr = current;
+    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+    unsigned int trap = event->vector, type = event->type;
+    unsigned int fault = TRAP_gp_fault, ec = 0;
+
+    if ( vmcb->_cr0 & X86_CR0_PE ) /* TODO: support real-mode injection? */
+    {
+        pagefault_info_t pf;
+        struct segment_register idtr;
+        unsigned int idte_size, idte_offset;
+        unsigned long idte_linear_addr;
+        struct { uint32_t a, b, c, d; } idte = {};
+        bool lm = vmcb->_efer & EFER_LMA;
+        enum hvm_segmentation_mode seg_mode =
+            lm ? hvm_seg_mode_long : hvm_seg_mode_prot;
+        int rc;
+
+        idte_size = lm ? 16 : 8;
+        idte_offset = trap * idte_size;
+
+        /* ICEBP sets the External Event bit despite being an instruction. */
+        ec = (trap << 3) | X86_XEC_IDT |
+            (type == X86_EVENTTYPE_PRI_SW_EXCEPTION ? X86_XEC_EXT : 0);
+
+        /*
+         * TODO: This does not cover the v8086 mode with CR4.VME case
+         * correctly, but falls on the safe side from the point of view of a
+         * 32bit OS.  Someone with many TUITs can see about reading the TSS
+         * Software Interrupt Redirection bitmap.
+         */
+        if ( (regs->eflags & X86_EFLAGS_VM) &&
+             MASK_EXTR(regs->eflags, X86_EFLAGS_IOPL) != 3 )
+            goto raise_exception;
+
+        /*
+         * Read all 8/16 bytes so the idtr limit check is applied properly to
+         * this entry, even though don't look all the words read.
+         */
+        hvm_get_segment_register(curr, x86_seg_idtr, &idtr);
+        if ( !hvm_virtual_to_linear_addr(x86_seg_idtr, &idtr, idte_offset,
+                                         idte_size, hvm_access_read,
+                                         seg_mode, &idte_linear_addr) )
+            goto raise_exception;
+
+        rc = hvm_copy_from_guest_linear(&idte, idte_linear_addr, idte_size,
+                                        PFEC_implicit, &pf);
+        if ( rc )
+        {
+            if ( rc == HVMCOPY_bad_gva_to_gfn )
+            {
+                fault = TRAP_page_fault;
+                ec = pf.ec;
+                event->cr2 = pf.linear;
+            }
+
+            goto raise_exception;
+        }
+
+        /* This must be an interrupt, trap, or task gate. */
+        switch ( (idte.b >> 8) & 0x1f )
+            {
+            case SYS_DESC_irq_gate:
+            case SYS_DESC_trap_gate:
+                break;
+            case SYS_DESC_irq_gate16:
+            case SYS_DESC_trap_gate16:
+            case SYS_DESC_task_gate:
+                if ( !lm )
+                    break;
+                /* fall through */
+            default:
+                goto raise_exception;
+            }
+
+        /* The 64-bit high half's type must be zero. */
+        if ( idte.d & 0x1f00 )
+            goto raise_exception;
+
+        /* ICEBP counts as a hardware event, and bypasses the dpl check. */
+        if ( type != X86_EVENTTYPE_PRI_SW_EXCEPTION &&
+             vmcb->_cpl > ((idte.b >> 13) & 3) )
+                goto raise_exception;
+
+        /* Is this entry present? */
+        if ( !(idte.b & (1u << 15)) )
+        {
+            fault = TRAP_no_segment;
+            goto raise_exception;
+        }
+
+        /*
+         * Any further fault during injection will cause a double fault.  It
+         * is fine to leave this up to hardware, and software won't be in a
+         * position to care about the architectural correctness of %rip in the
+         * exception frame.
+         */
+        return;
+    }
+
+ raise_exception:
+    event->vector = fault;
+    event->type = X86_EVENTTYPE_HW_EXCEPTION;
+    event->insn_len = 0;
+    event->error_code = ec;
+}
+
 static void svm_inject_event(const struct x86_event *event)
 {
     struct vcpu *curr = current;
@@ -1191,6 +1309,24 @@ static void svm_inject_event(const struct x86_event *event)
     struct x86_event _event = *event;
     struct cpu_user_regs *regs = guest_cpu_user_regs();
 
+    /*
+     * For hardware lacking NRips support, and always for ICEBP instructions,
+     * the processor requires extra help to deliver software events.
+     *
+     * Xen must emulate enough of the event injection to be sure that a
+     * further fault shouldn't occur during delivery.  This covers the fact
+     * that hardware doesn't perform DPL checking on injection.
+     *
+     * Also, it accounts for proper positioning of %rip for an event with trap
+     * semantics (where %rip should point after the instruction) which suffers
+     * a fault during injection (at which point %rip should point at the
+     * instruction).
+     */
+    if ( event->type == X86_EVENTTYPE_PRI_SW_EXCEPTION ||
+         (!cpu_has_svm_nrips && (event->type == X86_EVENTTYPE_SW_INTERRUPT ||
+                                 event->type == X86_EVENTTYPE_SW_EXCEPTION)) )
+        svm_emul_swint_injection(&_event);
+
     switch ( _event.vector )
     {
     case TRAP_debug:
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure
  2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
                   ` (3 preceding siblings ...)
  2017-03-31 19:50 ` [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection() Andrew Cooper
@ 2017-03-31 19:50 ` Andrew Cooper
  2017-04-03  8:36   ` Paul Durrant
  2017-04-03  9:38   ` Jan Beulich
  2017-03-31 19:50 ` [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context Andrew Cooper
  5 siblings, 2 replies; 37+ messages in thread
From: Andrew Cooper @ 2017-03-31 19:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Julien Grall, Paul Durrant, Tim Deegan, Jan Beulich

With the SVM injection logic capable of doing its own emulation, there is no
need for this hardware-specific assistance in the common emulator.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: Julien Grall <julien.grall@arm.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c |  18 +--
 xen/arch/x86/hvm/emulate.c                      |   7 -
 xen/arch/x86/mm.c                               |   2 -
 xen/arch/x86/mm/shadow/common.c                 |   1 -
 xen/arch/x86/x86_emulate/x86_emulate.c          | 187 ++++--------------------
 xen/arch/x86/x86_emulate/x86_emulate.h          |  53 -------
 6 files changed, 30 insertions(+), 238 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 890642c..8488816 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -536,8 +536,7 @@ enum {
     HOOK_put_fpu,
     HOOK_invlpg,
     HOOK_vmfunc,
-    OPTION_swint_emulation, /* Two bits */
-    CANONICALIZE_rip = OPTION_swint_emulation + 2,
+    CANONICALIZE_rip,
     CANONICALIZE_rsp,
     CANONICALIZE_rbp
 };
@@ -577,19 +576,6 @@ static void disable_hooks(void)
     MAYBE_DISABLE_HOOK(invlpg);
 }
 
-static void set_swint_support(struct x86_emulate_ctxt *ctxt)
-{
-    unsigned int swint_opt = (input.options >> OPTION_swint_emulation) & 3;
-    static const enum x86_swint_emulation map[4] = {
-        x86_swint_emulate_none,
-        x86_swint_emulate_none,
-        x86_swint_emulate_icebp,
-        x86_swint_emulate_all
-    };
-
-    ctxt->swint_emulate = map[swint_opt];
-}
-
 /*
  * Constrain input to architecturally-possible states where
  * the emulator relies on these
@@ -693,8 +679,6 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
 
     disable_hooks();
 
-    set_swint_support(&ctxt);
-
     do {
         /* FIXME: Until we actually implement SIGFPE handling properly */
         setup_fpu_exception_handler();
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f578796..efac2e9 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2036,13 +2036,6 @@ void hvm_emulate_init_once(
     hvmemul_ctxt->ctxt.regs = regs;
     hvmemul_ctxt->ctxt.vendor = curr->domain->arch.cpuid->x86_vendor;
     hvmemul_ctxt->ctxt.force_writeback = true;
-
-    if ( cpu_has_vmx )
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
-    else if ( cpu_has_svm_nrips )
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
-    else
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
 }
 
 void hvm_emulate_init_per_insn(
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index be4e308..3918a37 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5412,7 +5412,6 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
             .vendor = d->arch.cpuid->x86_vendor,
             .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
             .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
-            .swint_emulate = x86_swint_emulate_none,
         },
     };
     int rc;
@@ -5567,7 +5566,6 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
         .vendor = v->domain->arch.cpuid->x86_vendor,
         .addr_size = addr_size,
         .sp_size = addr_size,
-        .swint_emulate = x86_swint_emulate_none,
         .data = &mmio_ro_ctxt
     };
     int rc;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 551a7d3..2fc796b 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -328,7 +328,6 @@ const struct x86_emulate_ops *shadow_init_emulation(
 
     sh_ctxt->ctxt.regs = regs;
     sh_ctxt->ctxt.vendor = v->domain->arch.cpuid->x86_vendor;
-    sh_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
 
     /* Segment cache initialisation. Primed with CS. */
     creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 7af8a42..7315ca6 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1999,142 +1999,6 @@ static bool umip_active(struct x86_emulate_ctxt *ctxt,
            (cr4 & X86_CR4_UMIP);
 }
 
-/* Inject a software interrupt/exception, emulating if needed. */
-static int inject_swint(enum x86_swint_type type,
-                        uint8_t vector, uint8_t insn_len,
-                        struct x86_emulate_ctxt *ctxt,
-                        const struct x86_emulate_ops *ops)
-{
-    int rc, error_code, fault_type = EXC_GP;
-
-    /*
-     * Without hardware support, injecting software interrupts/exceptions is
-     * problematic.
-     *
-     * All software methods of generating exceptions (other than BOUND) yield
-     * traps, so eip in the exception frame needs to point after the
-     * instruction, not at it.
-     *
-     * However, if injecting it as a hardware exception causes a fault during
-     * delivery, our adjustment of eip will cause the fault to be reported
-     * after the faulting instruction, not pointing to it.
-     *
-     * Therefore, eip can only safely be wound forwards if we are certain that
-     * injecting an equivalent hardware exception won't fault, which means
-     * emulating everything the processor would do on a control transfer.
-     *
-     * However, emulation of complete control transfers is very complicated.
-     * All we care about is that guest userspace cannot avoid the descriptor
-     * DPL check by using the Xen emulator, and successfully invoke DPL=0
-     * descriptors.
-     *
-     * Any OS which would further fault during injection is going to receive a
-     * double fault anyway, and won't be in a position to care that the
-     * faulting eip is incorrect.
-     */
-
-    if ( (ctxt->swint_emulate == x86_swint_emulate_all) ||
-         ((ctxt->swint_emulate == x86_swint_emulate_icebp) &&
-          (type == x86_swint_icebp)) )
-    {
-        if ( !in_realmode(ctxt, ops) )
-        {
-            unsigned int idte_size, idte_offset;
-            struct { uint32_t a, b, c, d; } idte = {};
-            int lm = in_longmode(ctxt, ops);
-
-            if ( lm < 0 )
-                return X86EMUL_UNHANDLEABLE;
-
-            idte_size = lm ? 16 : 8;
-            idte_offset = vector * idte_size;
-
-            /* icebp sets the External Event bit despite being an instruction. */
-            error_code = (vector << 3) | ECODE_IDT |
-                (type == x86_swint_icebp ? ECODE_EXT : 0);
-
-            /*
-             * TODO - this does not cover the v8086 mode with CR4.VME case
-             * correctly, but falls on the safe side from the point of view of
-             * a 32bit OS.  Someone with many TUITs can see about reading the
-             * TSS Software Interrupt Redirection bitmap.
-             */
-            if ( (ctxt->regs->eflags & X86_EFLAGS_VM) &&
-                 ((ctxt->regs->eflags & X86_EFLAGS_IOPL) != X86_EFLAGS_IOPL) )
-                goto raise_exn;
-
-            /*
-             * Read all 8/16 bytes so the idtr limit check is applied properly
-             * to this entry, even though we only end up looking at the 2nd
-             * word.
-             */
-            switch ( rc = ops->read(x86_seg_idtr, idte_offset,
-                                    &idte, idte_size, ctxt) )
-            {
-            case X86EMUL_OKAY:
-                break;
-
-            case X86EMUL_EXCEPTION:
-                if ( !ctxt->event_pending )
-                    goto raise_exn;
-                /* fallthrough */
-
-            default:
-                return rc;
-            }
-
-            /* This must be an interrupt, trap, or task gate. */
-#ifdef __XEN__
-            switch ( (idte.b >> 8) & 0x1f )
-            {
-            case SYS_DESC_irq_gate:
-            case SYS_DESC_trap_gate:
-                break;
-            case SYS_DESC_irq_gate16:
-            case SYS_DESC_trap_gate16:
-            case SYS_DESC_task_gate:
-                if ( !lm )
-                    break;
-                /* fall through */
-            default:
-                goto raise_exn;
-            }
-#endif
-
-            /* The 64-bit high half's type must be zero. */
-            if ( idte.d & 0x1f00 )
-                goto raise_exn;
-
-            /* icebp counts as a hardware event, and bypasses the dpl check. */
-            if ( type != x86_swint_icebp )
-            {
-                int cpl = get_cpl(ctxt, ops);
-
-                fail_if(cpl < 0);
-
-                if ( cpl > ((idte.b >> 13) & 3) )
-                    goto raise_exn;
-            }
-
-            /* Is this entry present? */
-            if ( !(idte.b & (1u << 15)) )
-            {
-                fault_type = EXC_NP;
-                goto raise_exn;
-            }
-        }
-    }
-
-    x86_emul_software_event(type, vector, insn_len, ctxt);
-    rc = X86EMUL_OKAY;
-
- done:
-    return rc;
-
- raise_exn:
-    generate_exception(fault_type, error_code);
-}
-
 static void adjust_bnd(struct x86_emulate_ctxt *ctxt,
                        const struct x86_emulate_ops *ops, enum vex_pfx pfx)
 {
@@ -3101,7 +2965,6 @@ x86_emulate(
     struct operand src = { .reg = PTR_POISON };
     struct operand dst = { .reg = PTR_POISON };
     unsigned long cr4;
-    enum x86_swint_type swint_type;
     struct fpu_insn_ctxt fic = { .type = X86EMUL_FPU_none, .exn_raised = -1 };
     struct x86_emulate_stub stub = {};
     DECLARE_ALIGNED(mmval_t, mmval);
@@ -4103,25 +3966,38 @@ x86_emulate(
             goto done;
         break;
 
-    case 0xcc: /* int3 */
-        src.val = EXC_BP;
-        swint_type = x86_swint_int3;
-        goto swint;
-
-    case 0xcd: /* int imm8 */
-        swint_type = x86_swint_int;
-    swint:
-        rc = inject_swint(swint_type, (uint8_t)src.val,
-                          _regs.r(ip) - ctxt->regs->r(ip),
-                          ctxt, ops) ? : X86EMUL_EXCEPTION;
-        goto done;
-
     case 0xce: /* into */
         if ( !(_regs.eflags & X86_EFLAGS_OF) )
             break;
-        src.val = EXC_OF;
-        swint_type = x86_swint_into;
-        goto swint;
+        /* Fallthrough */
+    case 0xcc: /* int3 */
+    case 0xcd: /* int imm8 */
+    case 0xf1: /* int1 (icebp) */
+        ASSERT(!ctxt->event_pending);
+        switch ( ctxt->opcode )
+        {
+        case 0xcc: /* int3 */
+            ctxt->event.vector = EXC_BP;
+            ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
+            break;
+        case 0xcd: /* int imm8 */
+            ctxt->event.vector = src.val;
+            ctxt->event.type = X86_EVENTTYPE_SW_INTERRUPT;
+            break;
+        case 0xce: /* into */
+            ctxt->event.vector = EXC_OF;
+            ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
+            break;
+        case 0xf1: /* icebp */
+            ctxt->event.vector = EXC_DB;
+            ctxt->event.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
+            break;
+        }
+        ctxt->event.error_code = X86_EVENT_NO_EC;
+        ctxt->event.insn_len = _regs.r(ip) - ctxt->regs->r(ip);
+        ctxt->event_pending = true;
+        rc = X86EMUL_EXCEPTION;
+        goto done;
 
     case 0xcf: /* iret */ {
         unsigned long sel, eip, eflags;
@@ -4782,11 +4658,6 @@ x86_emulate(
             goto done;
         break;
 
-    case 0xf1: /* int1 (icebp) */
-        src.val = EXC_DB;
-        swint_type = x86_swint_icebp;
-        goto swint;
-
     case 0xf4: /* hlt */
         generate_exception_if(!mode_ring0(), EXC_GP, 0);
         ctxt->retire.hlt = true;
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 9c5fcde..215adf6 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -60,27 +60,6 @@ static inline bool is_x86_system_segment(enum x86_segment seg)
     return seg >= x86_seg_tr && seg < x86_seg_none;
 }
 
-/* Classification of the types of software generated interrupts/exceptions. */
-enum x86_swint_type {
-    x86_swint_icebp, /* 0xf1 */
-    x86_swint_int3,  /* 0xcc */
-    x86_swint_into,  /* 0xce */
-    x86_swint_int,   /* 0xcd $n */
-};
-
-/*
- * How much help is required with software event injection?
- *
- * All software events return from x86_emulate() with X86EMUL_EXCEPTION and
- * fault-like semantics.  This just controls whether the emulator performs
- * presence/dpl/etc checks and possibly raises exceptions instead.
- */
-enum x86_swint_emulation {
-    x86_swint_emulate_none, /* Hardware supports all software injection properly */
-    x86_swint_emulate_icebp,/* Help needed with `icebp` (0xf1) */
-    x86_swint_emulate_all,  /* Help needed with all software events */
-};
-
 /*
  * x86 event types. This enumeration is valid for:
  *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
@@ -472,9 +451,6 @@ struct x86_emulate_ctxt
      * Input-only state:
      */
 
-    /* Software event injection support. */
-    enum x86_swint_emulation swint_emulate;
-
     /* CPU vendor (X86_VENDOR_UNKNOWN for "don't care") */
     unsigned char vendor;
 
@@ -699,35 +675,6 @@ static inline void x86_emul_pagefault(
     ctxt->event_pending = true;
 }
 
-static inline void x86_emul_software_event(
-    enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
-    struct x86_emulate_ctxt *ctxt)
-{
-    ASSERT(!ctxt->event_pending);
-
-    switch ( type )
-    {
-    case x86_swint_icebp:
-        ctxt->event.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
-        break;
-
-    case x86_swint_int3:
-    case x86_swint_into:
-        ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
-        break;
-
-    case x86_swint_int:
-        ctxt->event.type = X86_EVENTTYPE_SW_INTERRUPT;
-        break;
-    }
-
-    ctxt->event.vector = vector;
-    ctxt->event.error_code = X86_EVENT_NO_EC;
-    ctxt->event.insn_len = insn_len;
-
-    ctxt->event_pending = true;
-}
-
 static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
 {
     ctxt->event_pending = false;
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
                   ` (4 preceding siblings ...)
  2017-03-31 19:50 ` [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure Andrew Cooper
@ 2017-03-31 19:50 ` Andrew Cooper
  2017-04-03  8:40   ` Paul Durrant
                     ` (2 more replies)
  5 siblings, 3 replies; 37+ messages in thread
From: Andrew Cooper @ 2017-03-31 19:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Julien Grall, Paul Durrant, Tim Deegan, Jan Beulich

Long mode (or not) influences emulation behaviour in a number of cases.
Instead of reusing the ->read_msr() hook to obtain EFER.LMA, require callers
to provide it directly.

This simplifies all long mode checks during emulation to a simple boolean
read, removing embedded msr reads.  It also allows for the removal of a local
variable in the sysenter emulation block, and removes a latent bug in the
syscall emulation block where rc contains a non X86EMUL_* constant for a
period of time.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: Julien Grall <julien.grall@arm.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c |  1 +
 tools/tests/x86_emulator/test_x86_emulator.c    |  4 ++
 xen/arch/x86/hvm/emulate.c                      |  4 +-
 xen/arch/x86/mm.c                               |  2 +
 xen/arch/x86/mm/shadow/common.c                 |  5 +--
 xen/arch/x86/traps.c                            |  1 +
 xen/arch/x86/x86_emulate/x86_emulate.c          | 51 ++++++-------------------
 xen/arch/x86/x86_emulate/x86_emulate.h          |  3 ++
 8 files changed, 28 insertions(+), 43 deletions(-)

diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
index 8488816..2e42e54 100644
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -649,6 +649,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t size)
     struct cpu_user_regs regs = {};
     struct x86_emulate_ctxt ctxt = {
         .regs = &regs,
+        .lma = sizeof(void *) == 8,
         .addr_size = 8 * sizeof(void *),
         .sp_size = 8 * sizeof(void *),
     };
diff --git a/tools/tests/x86_emulator/test_x86_emulator.c b/tools/tests/x86_emulator/test_x86_emulator.c
index 5be8ddc..efeb175 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -319,6 +319,7 @@ int main(int argc, char **argv)
     ctxt.regs = &regs;
     ctxt.force_writeback = 0;
     ctxt.vendor    = X86_VENDOR_UNKNOWN;
+    ctxt.lma       = sizeof(void *) == 8;
     ctxt.addr_size = 8 * sizeof(void *);
     ctxt.sp_size   = 8 * sizeof(void *);
 
@@ -2922,6 +2923,7 @@ int main(int argc, char **argv)
     {
         decl_insn(vzeroupper);
 
+        ctxt.lma = false;
         ctxt.sp_size = ctxt.addr_size = 32;
 
         asm volatile ( "vxorps %xmm2, %xmm2, %xmm3\n"
@@ -2949,6 +2951,7 @@ int main(int argc, char **argv)
             goto fail;
         printf("okay\n");
 
+        ctxt.lma = true;
         ctxt.sp_size = ctxt.addr_size = 64;
     }
     else
@@ -3001,6 +3004,7 @@ int main(int argc, char **argv)
             continue;
 
         memcpy(res, blobs[j].code, blobs[j].size);
+        ctxt.lma = blobs[j].bitness == 64;
         ctxt.addr_size = ctxt.sp_size = blobs[j].bitness;
 
         if ( ctxt.addr_size == sizeof(void *) * CHAR_BIT )
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index efac2e9..65cb707 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2047,7 +2047,9 @@ void hvm_emulate_init_per_insn(
     unsigned int pfec = PFEC_page_present;
     unsigned long addr;
 
-    if ( hvm_long_mode_active(curr) &&
+    hvmemul_ctxt->ctxt.lma = hvm_long_mode_active(curr);
+
+    if ( hvmemul_ctxt->ctxt.lma &&
          hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
         hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
     else
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 3918a37..eda220d 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
         .ctxt = {
             .regs = regs,
             .vendor = d->arch.cpuid->x86_vendor,
+            .lma = true,
             .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
             .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
         },
@@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     struct x86_emulate_ctxt ctxt = {
         .regs = regs,
         .vendor = v->domain->arch.cpuid->x86_vendor,
+        .lma = true,
         .addr_size = addr_size,
         .sp_size = addr_size,
         .data = &mmio_ro_ctxt
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 2fc796b..e42d3fd 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -328,15 +328,14 @@ const struct x86_emulate_ops *shadow_init_emulation(
 
     sh_ctxt->ctxt.regs = regs;
     sh_ctxt->ctxt.vendor = v->domain->arch.cpuid->x86_vendor;
+    sh_ctxt->ctxt.lma = hvm_long_mode_active(v);
 
     /* Segment cache initialisation. Primed with CS. */
     creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
 
     /* Work out the emulation mode. */
-    if ( hvm_long_mode_active(v) && creg->attr.fields.l )
-    {
+    if ( sh_ctxt->ctxt.lma && creg->attr.fields.l )
         sh_ctxt->ctxt.addr_size = sh_ctxt->ctxt.sp_size = 64;
-    }
     else
     {
         sreg = hvm_get_seg_reg(x86_seg_ss, sh_ctxt);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 0d54baf..09dc2f1 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2966,6 +2966,7 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
     struct priv_op_ctxt ctxt = {
         .ctxt.regs = regs,
         .ctxt.vendor = currd->arch.cpuid->x86_vendor,
+        .ctxt.lma = true,
     };
     int rc;
     unsigned int eflags, ar;
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 7315ca6..cc354bc 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -968,11 +968,10 @@ do {                                                                    \
 
 #define validate_far_branch(cs, ip) ({                                  \
     if ( sizeof(ip) <= 4 ) {                                            \
-        ASSERT(in_longmode(ctxt, ops) <= 0);                            \
+        ASSERT(!ctxt->lma);                                             \
         generate_exception_if((ip) > (cs)->limit, EXC_GP, 0);           \
     } else                                                              \
-        generate_exception_if(in_longmode(ctxt, ops) &&                 \
-                              (cs)->attr.fields.l                       \
+        generate_exception_if(ctxt->lma && (cs)->attr.fields.l          \
                               ? !is_canonical_address(ip)               \
                               : (ip) > (cs)->limit, EXC_GP, 0);         \
 })
@@ -1630,20 +1629,6 @@ static bool vcpu_has(
 #endif
 
 static int
-in_longmode(
-    struct x86_emulate_ctxt *ctxt,
-    const struct x86_emulate_ops *ops)
-{
-    uint64_t efer;
-
-    if ( !ops->read_msr ||
-         unlikely(ops->read_msr(MSR_EFER, &efer, ctxt) != X86EMUL_OKAY) )
-        return -1;
-
-    return !!(efer & EFER_LMA);
-}
-
-static int
 realmode_load_seg(
     enum x86_segment seg,
     uint16_t sel,
@@ -1767,8 +1752,7 @@ protmode_load_seg(
          * Experimentally in long mode, the L and D bits are checked before
          * the Present bit.
          */
-        if ( in_longmode(ctxt, ops) &&
-             (desc.b & (1 << 21)) && (desc.b & (1 << 22)) )
+        if ( ctxt->lma && (desc.b & (1 << 21)) && (desc.b & (1 << 22)) )
             goto raise_exn;
         sel = (sel ^ rpl) | cpl;
         break;
@@ -1818,10 +1802,8 @@ protmode_load_seg(
 
     if ( !is_x86_user_segment(seg) )
     {
-        int lm = (desc.b & (1u << 12)) ? 0 : in_longmode(ctxt, ops);
+        bool lm = (desc.b & (1u << 12)) ? 0 : ctxt->lma;
 
-        if ( lm < 0 )
-            return X86EMUL_UNHANDLEABLE;
         if ( lm )
         {
             switch ( rc = ops->read(sel_seg, (sel & 0xfff8) + 8,
@@ -5195,7 +5177,7 @@ x86_emulate(
                 case 0x03: /* busy 16-bit TSS */
                 case 0x04: /* 16-bit call gate */
                 case 0x05: /* 16/32-bit task gate */
-                    if ( in_longmode(ctxt, ops) )
+                    if ( ctxt->lma )
                         break;
                     /* fall through */
                 case 0x02: /* LDT */
@@ -5242,7 +5224,7 @@ x86_emulate(
                 {
                 case 0x01: /* available 16-bit TSS */
                 case 0x03: /* busy 16-bit TSS */
-                    if ( in_longmode(ctxt, ops) )
+                    if ( ctxt->lma )
                         break;
                     /* fall through */
                 case 0x02: /* LDT */
@@ -5292,10 +5274,7 @@ x86_emulate(
         sreg.attr.bytes = 0xc93; /* G+DB+P+S+Data */
 
 #ifdef __x86_64__
-        rc = in_longmode(ctxt, ops);
-        if ( rc < 0 )
-            goto cannot_emulate;
-        if ( rc )
+        if ( ctxt->lma )
         {
             cs.attr.bytes = 0xa9b; /* L+DB+P+S+Code */
 
@@ -5732,9 +5711,7 @@ x86_emulate(
             dst.val = src.val;
         break;
 
-    case X86EMUL_OPC(0x0f, 0x34): /* sysenter */ {
-        int lm;
-
+    case X86EMUL_OPC(0x0f, 0x34): /* sysenter */
         vcpu_must_have(sep);
         generate_exception_if(mode_ring0(), EXC_GP, 0);
         generate_exception_if(!in_protmode(ctxt, ops), EXC_GP, 0);
@@ -5745,17 +5722,14 @@ x86_emulate(
             goto done;
 
         generate_exception_if(!(msr_val & 0xfffc), EXC_GP, 0);
-        lm = in_longmode(ctxt, ops);
-        if ( lm < 0 )
-            goto cannot_emulate;
 
         _regs.eflags &= ~(X86_EFLAGS_VM | X86_EFLAGS_IF | X86_EFLAGS_RF);
 
         cs.sel = msr_val & ~3; /* SELECTOR_RPL_MASK */
         cs.base = 0;   /* flat segment */
         cs.limit = ~0u;  /* 4GB limit */
-        cs.attr.bytes = lm ? 0xa9b  /* G+L+P+S+Code */
-                           : 0xc9b; /* G+DB+P+S+Code */
+        cs.attr.bytes = ctxt->lma ? 0xa9b  /* G+L+P+S+Code */
+                                  : 0xc9b; /* G+DB+P+S+Code */
 
         sreg.sel = cs.sel + 8;
         sreg.base = 0;   /* flat segment */
@@ -5770,16 +5744,15 @@ x86_emulate(
         if ( (rc = ops->read_msr(MSR_IA32_SYSENTER_EIP,
                                  &msr_val, ctxt)) != X86EMUL_OKAY )
             goto done;
-        _regs.r(ip) = lm ? msr_val : (uint32_t)msr_val;
+        _regs.r(ip) = ctxt->lma ? msr_val : (uint32_t)msr_val;
 
         if ( (rc = ops->read_msr(MSR_IA32_SYSENTER_ESP,
                                  &msr_val, ctxt)) != X86EMUL_OKAY )
             goto done;
-        _regs.r(sp) = lm ? msr_val : (uint32_t)msr_val;
+        _regs.r(sp) = ctxt->lma ? msr_val : (uint32_t)msr_val;
 
         singlestep = _regs.eflags & X86_EFLAGS_TF;
         break;
-    }
 
     case X86EMUL_OPC(0x0f, 0x35): /* sysexit */
         vcpu_must_have(sep);
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 215adf6..0479e30 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -457,6 +457,9 @@ struct x86_emulate_ctxt
     /* Set this if writes may have side effects. */
     bool force_writeback;
 
+    /* Long mode active? */
+    bool lma;
+
     /* Caller data that can be used by x86_emulate_ops' routines. */
     void *data;
 
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology
  2017-03-31 19:50 ` [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology Andrew Cooper
@ 2017-04-03  8:24   ` Paul Durrant
  2017-04-03  8:24   ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Paul Durrant @ 2017-04-03  8:24 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Julien Grall, Tim (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 31 March 2017 20:51
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Tim (Xen.org) <tim@xen.org>; Paul Durrant
> <Paul.Durrant@citrix.com>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH for 4.9 1/6] x86/hvm: Correct some address space
> terminology
> 
> The function hvm_translate_linear_addr() translates a virtual address to a
> linear address, not a linear address to a physical address.  Correct its name.
> 
> Both hvm_translate_virtual_addr() and hvmemul_virtual_to_linear() return
> a
> linear address, not a physical address.  Correct the parameters name.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/x86/hvm/emulate.c      | 12 ++++++------
>  xen/arch/x86/mm/shadow/common.c | 16 ++++++++--------
>  2 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 2d92957..4073715 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -506,7 +506,7 @@ static int hvmemul_virtual_to_linear(
>      unsigned long *reps,
>      enum hvm_access_type access_type,
>      struct hvm_emulate_ctxt *hvmemul_ctxt,
> -    unsigned long *paddr)
> +    unsigned long *linear)
>  {
>      struct segment_register *reg;
>      int okay;
> @@ -514,7 +514,7 @@ static int hvmemul_virtual_to_linear(
> 
>      if ( seg == x86_seg_none )
>      {
> -        *paddr = offset;
> +        *linear = offset;
>          return X86EMUL_OKAY;
>      }
> 
> @@ -549,16 +549,16 @@ static int hvmemul_virtual_to_linear(
>          okay = hvm_virtual_to_linear_addr(
>              seg, reg, offset - (*reps - 1) * bytes_per_rep,
>              *reps * bytes_per_rep, access_type,
> -            hvmemul_ctxt->ctxt.addr_size, paddr);
> -        *paddr += (*reps - 1) * bytes_per_rep;
> +            hvmemul_ctxt->ctxt.addr_size, linear);
> +        *linear += (*reps - 1) * bytes_per_rep;
>          if ( hvmemul_ctxt->ctxt.addr_size != 64 )
> -            *paddr = (uint32_t)*paddr;
> +            *linear = (uint32_t)*linear;
>      }
>      else
>      {
>          okay = hvm_virtual_to_linear_addr(
>              seg, reg, offset, *reps * bytes_per_rep, access_type,
> -            hvmemul_ctxt->ctxt.addr_size, paddr);
> +            hvmemul_ctxt->ctxt.addr_size, linear);
>      }
> 
>      if ( okay )
> diff --git a/xen/arch/x86/mm/shadow/common.c
> b/xen/arch/x86/mm/shadow/common.c
> index d93f2ab..03cb24d 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -136,13 +136,13 @@ static struct segment_register *hvm_get_seg_reg(
>      return seg_reg;
>  }
> 
> -static int hvm_translate_linear_addr(
> +static int hvm_translate_virtual_addr(
>      enum x86_segment seg,
>      unsigned long offset,
>      unsigned int bytes,
>      enum hvm_access_type access_type,
>      struct sh_emulate_ctxt *sh_ctxt,
> -    unsigned long *paddr)
> +    unsigned long *linear)
>  {
>      const struct segment_register *reg;
>      int okay;
> @@ -152,7 +152,7 @@ static int hvm_translate_linear_addr(
>          return -PTR_ERR(reg);
> 
>      okay = hvm_virtual_to_linear_addr(
> -        seg, reg, offset, bytes, access_type, sh_ctxt->ctxt.addr_size, paddr);
> +        seg, reg, offset, bytes, access_type, sh_ctxt->ctxt.addr_size, linear);
> 
>      if ( !okay )
>      {
> @@ -183,7 +183,7 @@ hvm_read(enum x86_segment seg,
>      unsigned long addr;
>      int rc;
> 
> -    rc = hvm_translate_linear_addr(
> +    rc = hvm_translate_virtual_addr(
>          seg, offset, bytes, access_type, sh_ctxt, &addr);
>      if ( rc || !bytes )
>          return rc;
> @@ -265,7 +265,7 @@ hvm_emulate_write(enum x86_segment seg,
>      if ( seg == x86_seg_ss )
>          perfc_incr(shadow_fault_emulate_stack);
> 
> -    rc = hvm_translate_linear_addr(
> +    rc = hvm_translate_virtual_addr(
>          seg, offset, bytes, hvm_access_write, sh_ctxt, &addr);
>      if ( rc || !bytes )
>          return rc;
> @@ -291,7 +291,7 @@ hvm_emulate_cmpxchg(enum x86_segment seg,
>      if ( bytes > sizeof(long) )
>          return X86EMUL_UNHANDLEABLE;
> 
> -    rc = hvm_translate_linear_addr(
> +    rc = hvm_translate_virtual_addr(
>          seg, offset, bytes, hvm_access_write, sh_ctxt, &addr);
>      if ( rc )
>          return rc;
> @@ -345,7 +345,7 @@ const struct x86_emulate_ops
> *shadow_init_emulation(
>      /* Attempt to prefetch whole instruction. */
>      sh_ctxt->insn_buf_eip = regs->rip;
>      sh_ctxt->insn_buf_bytes =
> -        (!hvm_translate_linear_addr(
> +        (!hvm_translate_virtual_addr(
>              x86_seg_cs, regs->rip, sizeof(sh_ctxt->insn_buf),
>              hvm_access_insn_fetch, sh_ctxt, &addr) &&
>           !hvm_fetch_from_guest_linear(
> @@ -374,7 +374,7 @@ void shadow_continue_emulation(struct
> sh_emulate_ctxt *sh_ctxt,
>      {
>          /* Prefetch more bytes. */
>          sh_ctxt->insn_buf_bytes =
> -            (!hvm_translate_linear_addr(
> +            (!hvm_translate_virtual_addr(
>                  x86_seg_cs, regs->rip, sizeof(sh_ctxt->insn_buf),
>                  hvm_access_insn_fetch, sh_ctxt, &addr) &&
>               !hvm_fetch_from_guest_linear(
> --
> 2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology
  2017-03-31 19:50 ` [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology Andrew Cooper
  2017-04-03  8:24   ` Paul Durrant
@ 2017-04-03  8:24   ` Jan Beulich
  2017-04-03 10:19     ` Andrew Cooper
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2017-04-03  8:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
> The function hvm_translate_linear_addr() translates a virtual address to a
> linear address, not a linear address to a physical address.  Correct its 
> name.
> 
> Both hvm_translate_virtual_addr() and hvmemul_virtual_to_linear() return a
> linear address, not a physical address.  Correct the parameters name.

I think the "p" in the names didn't stand for "physical", but for
"pointer", and I further think ...

> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -506,7 +506,7 @@ static int hvmemul_virtual_to_linear(
>      unsigned long *reps,
>      enum hvm_access_type access_type,
>      struct hvm_emulate_ctxt *hvmemul_ctxt,
> -    unsigned long *paddr)
> +    unsigned long *linear)

... retaining the "p" here and ...

> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -136,13 +136,13 @@ static struct segment_register *hvm_get_seg_reg(
>      return seg_reg;
>  }
>  
> -static int hvm_translate_linear_addr(
> +static int hvm_translate_virtual_addr(
>      enum x86_segment seg,
>      unsigned long offset,
>      unsigned int bytes,
>      enum hvm_access_type access_type,
>      struct sh_emulate_ctxt *sh_ctxt,
> -    unsigned long *paddr)
> +    unsigned long *linear)

... here would be useful (to distinguish the virtual address
represented by the pointer itself - in hypervisor space - from the
one the pointer points to, i.e. in guest space).

With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate
  2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
@ 2017-04-03  8:26   ` Paul Durrant
  2017-04-03  8:30   ` Jan Beulich
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Paul Durrant @ 2017-04-03  8:26 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 31 March 2017 20:51
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; George Dunlap <George.Dunlap@citrix.com>; Jun
> Nakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>;
> Boris Ostrovsky <boris.ostrovsky@oracle.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate
> 
> hvm_long_mode_enabled() tests for EFER.LMA, which is specifically
> different to
> EFER.LME.
> 
> Rename it to match its behaviour, and have it strictly return a boolean value
> (although all its callers already use it in implicitly-boolean contexts, so no
> functional change).
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Tim Deegan <tim@xen.org>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> CC: Julien Grall <julien.grall@arm.com>
> ---
>  xen/arch/x86/cpuid.c              |  2 +-
>  xen/arch/x86/hvm/emulate.c        |  2 +-
>  xen/arch/x86/hvm/hvm.c            | 10 +++++-----
>  xen/arch/x86/hvm/svm/svm.c        |  6 +++---
>  xen/arch/x86/hvm/vmx/vmx.c        |  6 +++---
>  xen/arch/x86/hvm/vmx/vvmx.c       |  8 ++++----
>  xen/arch/x86/mm/hap/hap.c         |  8 ++++----
>  xen/arch/x86/mm/shadow/common.c   |  4 ++--
>  xen/arch/x86/oprofile/backtrace.c |  2 +-
>  xen/include/asm-x86/hvm/hvm.h     |  3 +--
>  10 files changed, 25 insertions(+), 26 deletions(-)
> 
> diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> index 1c6a6c6..d359e09 100644
> --- a/xen/arch/x86/cpuid.c
> +++ b/xen/arch/x86/cpuid.c
> @@ -911,7 +911,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
>      case 0x80000001:
>          /* SYSCALL is hidden outside of long mode on Intel. */
>          if ( p->x86_vendor == X86_VENDOR_INTEL &&
> -             is_hvm_domain(d) && !hvm_long_mode_enabled(v) )
> +             is_hvm_domain(d) && !hvm_long_mode_active(v) )
>              res->d &= ~cpufeat_mask(X86_FEATURE_SYSCALL);
> 
>      common_leaf1_adjustments:
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 4073715..3d084ca 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -2051,7 +2051,7 @@ void hvm_emulate_init_per_insn(
>      unsigned int pfec = PFEC_page_present;
>      unsigned long addr;
> 
> -    if ( hvm_long_mode_enabled(curr) &&
> +    if ( hvm_long_mode_active(curr) &&
>           hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
>          hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
>      else
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 0282986..9f83cd8 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2227,7 +2227,7 @@ int hvm_set_cr0(unsigned long value, bool_t
> may_defer)
>          }
> 
>          /* When CR0.PG is cleared, LMA is cleared immediately. */
> -        if ( hvm_long_mode_enabled(v) )
> +        if ( hvm_long_mode_active(v) )
>          {
>              v->arch.hvm_vcpu.guest_efer &= ~EFER_LMA;
>              hvm_update_guest_efer(v);
> @@ -2321,7 +2321,7 @@ int hvm_set_cr4(unsigned long value, bool_t
> may_defer)
> 
>      if ( !(value & X86_CR4_PAE) )
>      {
> -        if ( hvm_long_mode_enabled(v) )
> +        if ( hvm_long_mode_active(v) )
>          {
>              HVM_DBG_LOG(DBG_LEVEL_1, "Guest cleared CR4.PAE while "
>                          "EFER.LMA is set");
> @@ -2332,7 +2332,7 @@ int hvm_set_cr4(unsigned long value, bool_t
> may_defer)
>      old_cr = v->arch.hvm_vcpu.guest_cr[4];
> 
>      if ( (value & X86_CR4_PCIDE) && !(old_cr & X86_CR4_PCIDE) &&
> -         (!hvm_long_mode_enabled(v) ||
> +         (!hvm_long_mode_active(v) ||
>            (v->arch.hvm_vcpu.guest_cr[3] & 0xfff)) )
>      {
>          HVM_DBG_LOG(DBG_LEVEL_1, "Guest attempts to change CR4.PCIDE
> from "
> @@ -3605,7 +3605,7 @@ void hvm_ud_intercept(struct cpu_user_regs
> *regs)
> 
>          if ( hvm_virtual_to_linear_addr(x86_seg_cs, cs, regs->rip,
>                                          sizeof(sig), hvm_access_insn_fetch,
> -                                        (hvm_long_mode_enabled(cur) &&
> +                                        (hvm_long_mode_active(cur) &&
>                                           cs->attr.fields.l) ? 64 :
>                                          cs->attr.fields.db ? 32 : 16, &addr) &&
>               (hvm_fetch_from_guest_linear(sig, addr, sizeof(sig),
> @@ -3616,7 +3616,7 @@ void hvm_ud_intercept(struct cpu_user_regs
> *regs)
>              regs->eflags &= ~X86_EFLAGS_RF;
> 
>              /* Zero the upper 32 bits of %rip if not in 64bit mode. */
> -            if ( !(hvm_long_mode_enabled(cur) && cs->attr.fields.l) )
> +            if ( !(hvm_long_mode_active(cur) && cs->attr.fields.l) )
>                  regs->rip = regs->eip;
> 
>              add_taint(TAINT_HVM_FEP);
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index b69789b..4d7e49f 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -516,7 +516,7 @@ static int svm_guest_x86_mode(struct vcpu *v)
>          return 0;
>      if ( unlikely(guest_cpu_user_regs()->eflags & X86_EFLAGS_VM) )
>          return 1;
> -    if ( hvm_long_mode_enabled(v) && likely(vmcb->cs.attr.fields.l) )
> +    if ( hvm_long_mode_active(v) && likely(vmcb->cs.attr.fields.l) )
>          return 8;
>      return (likely(vmcb->cs.attr.fields.db) ? 4 : 2);
>  }
> @@ -2279,7 +2279,7 @@ void svm_vmexit_handler(struct cpu_user_regs
> *regs)
> 
>      exit_reason = vmcb->exitcode;
> 
> -    if ( hvm_long_mode_enabled(v) )
> +    if ( hvm_long_mode_active(v) )
>          HVMTRACE_ND(VMEXIT64, vcpu_guestmode ? TRC_HVM_NESTEDFLAG
> : 0,
>                      1/*cycles*/, 3, exit_reason,
>                      regs->eip, regs->rip >> 32, 0, 0, 0);
> @@ -2429,7 +2429,7 @@ void svm_vmexit_handler(struct cpu_user_regs
> *regs)
>          {
>              if ( trace_will_trace_event(TRC_SHADOW) )
>                  break;
> -            if ( hvm_long_mode_enabled(v) )
> +            if ( hvm_long_mode_active(v) )
>                  HVMTRACE_LONG_2D(PF_XEN, regs->error_code,
> TRC_PAR_LONG(va));
>              else
>                  HVMTRACE_2D(PF_XEN, regs->error_code, va);
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index d201956..b6526c9 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -611,7 +611,7 @@ int vmx_guest_x86_mode(struct vcpu *v)
>      if ( unlikely(guest_cpu_user_regs()->eflags & X86_EFLAGS_VM) )
>          return 1;
>      __vmread(GUEST_CS_AR_BYTES, &cs_ar_bytes);
> -    if ( hvm_long_mode_enabled(v) &&
> +    if ( hvm_long_mode_active(v) &&
>           likely(cs_ar_bytes & X86_SEG_AR_CS_LM_ACTIVE) )
>          return 8;
>      return (likely(cs_ar_bytes & X86_SEG_AR_DEF_OP_SIZE) ? 4 : 2);
> @@ -3392,7 +3392,7 @@ void vmx_vmexit_handler(struct cpu_user_regs
> *regs)
> 
>      __vmread(VM_EXIT_REASON, &exit_reason);
> 
> -    if ( hvm_long_mode_enabled(v) )
> +    if ( hvm_long_mode_active(v) )
>          HVMTRACE_ND(VMEXIT64, 0, 1/*cycles*/, 3, exit_reason,
>                      regs->eip, regs->rip >> 32, 0, 0, 0);
>      else
> @@ -3632,7 +3632,7 @@ void vmx_vmexit_handler(struct cpu_user_regs
> *regs)
>              {
>                  if ( trace_will_trace_event(TRC_SHADOW) )
>                      break;
> -                if ( hvm_long_mode_enabled(v) )
> +                if ( hvm_long_mode_active(v) )
>                      HVMTRACE_LONG_2D(PF_XEN, regs->error_code,
>                                       TRC_PAR_LONG(exit_qualification) );
>                  else
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> b/xen/arch/x86/hvm/vmx/vvmx.c
> index 09e4250..e9860f7 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -392,7 +392,7 @@ static int vmx_inst_check_privilege(struct
> cpu_user_regs *regs, int vmxop_check)
>      else if ( !nvmx_vcpu_in_vmx(v) )
>          goto invalid_op;
> 
> -    if ( vmx_guest_x86_mode(v) < (hvm_long_mode_enabled(v) ? 8 : 2) )
> +    if ( vmx_guest_x86_mode(v) < (hvm_long_mode_active(v) ? 8 : 2) )
>          goto invalid_op;
>      else if ( nestedhvm_vcpu_in_guestmode(v) )
>          goto vmexit;
> @@ -1154,13 +1154,13 @@ static void virtual_vmentry(struct cpu_user_regs
> *regs)
>      /*
>       * EFER handling:
>       * hvm_set_efer won't work if CR0.PG = 1, so we change the value
> -     * directly to make hvm_long_mode_enabled(v) work in L2.
> +     * directly to make hvm_long_mode_active(v) work in L2.
>       * An additional update_paging_modes is also needed if
>       * there is 32/64 switch. v->arch.hvm_vcpu.guest_efer doesn't
>       * need to be saved, since its value on vmexit is determined by
>       * L1 exit_controls
>       */
> -    lm_l1 = !!hvm_long_mode_enabled(v);
> +    lm_l1 = !!hvm_long_mode_active(v);
>      lm_l2 = !!(get_vvmcs(v, VM_ENTRY_CONTROLS) &
> VM_ENTRY_IA32E_MODE);
> 
>      if ( lm_l2 )
> @@ -1359,7 +1359,7 @@ static void virtual_vmexit(struct cpu_user_regs
> *regs)
>      nvcpu->nv_vmexit_pending = 0;
>      nvcpu->nv_vmswitch_in_progress = 1;
> 
> -    lm_l2 = !!hvm_long_mode_enabled(v);
> +    lm_l2 = !!hvm_long_mode_active(v);
>      lm_l1 = !!(get_vvmcs(v, VM_EXIT_CONTROLS) & VM_EXIT_IA32E_MODE);
> 
>      if ( lm_l1 )
> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> index a57b385..283d4b7 100644
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -690,10 +690,10 @@ static void hap_update_cr3(struct vcpu *v, int
> do_locking)
>  const struct paging_mode *
>  hap_paging_get_mode(struct vcpu *v)
>  {
> -    return !hvm_paging_enabled(v)   ? &hap_paging_real_mode :
> -        hvm_long_mode_enabled(v) ? &hap_paging_long_mode :
> -        hvm_pae_enabled(v)       ? &hap_paging_pae_mode  :
> -                                   &hap_paging_protected_mode;
> +    return !hvm_paging_enabled(v) ? &hap_paging_real_mode :
> +        hvm_long_mode_active(v)   ? &hap_paging_long_mode :
> +        hvm_pae_enabled(v)        ? &hap_paging_pae_mode  :
> +                                    &hap_paging_protected_mode;
>  }
> 
>  static void hap_update_paging_modes(struct vcpu *v)
> diff --git a/xen/arch/x86/mm/shadow/common.c
> b/xen/arch/x86/mm/shadow/common.c
> index 03cb24d..14a07dd 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -331,7 +331,7 @@ const struct x86_emulate_ops
> *shadow_init_emulation(
>      creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
> 
>      /* Work out the emulation mode. */
> -    if ( hvm_long_mode_enabled(v) && creg->attr.fields.l )
> +    if ( hvm_long_mode_active(v) && creg->attr.fields.l )
>      {
>          sh_ctxt->ctxt.addr_size = sh_ctxt->ctxt.sp_size = 64;
>      }
> @@ -2921,7 +2921,7 @@ static void sh_update_paging_modes(struct vcpu
> *v)
>              v->arch.guest_table = d->arch.paging.shadow.unpaged_pagetable;
>              v->arch.paging.mode =
> &SHADOW_INTERNAL_NAME(sh_paging_mode, 2);
>          }
> -        else if ( hvm_long_mode_enabled(v) )
> +        else if ( hvm_long_mode_active(v) )
>          {
>              // long mode guest...
>              v->arch.paging.mode =
> diff --git a/xen/arch/x86/oprofile/backtrace.c
> b/xen/arch/x86/oprofile/backtrace.c
> index f0fbb42..316821f 100644
> --- a/xen/arch/x86/oprofile/backtrace.c
> +++ b/xen/arch/x86/oprofile/backtrace.c
> @@ -47,7 +47,7 @@ dump_hypervisor_backtrace(struct vcpu *vcpu, const
> struct frame_head *head,
>  static inline int is_32bit_vcpu(struct vcpu *vcpu)
>  {
>      if (is_hvm_vcpu(vcpu))
> -        return !hvm_long_mode_enabled(vcpu);
> +        return !hvm_long_mode_active(vcpu);
>      else
>          return is_pv_32bit_vcpu(vcpu);
>  }
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-
> x86/hvm/hvm.h
> index c854183..49c8001 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -302,8 +302,7 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d,
> uint8_t dest, uint8_t dest_mode);
>  #define hap_has_1gb (!!(hvm_funcs.hap_capabilities &
> HVM_HAP_SUPERPAGE_1GB))
>  #define hap_has_2mb (!!(hvm_funcs.hap_capabilities &
> HVM_HAP_SUPERPAGE_2MB))
> 
> -#define hvm_long_mode_enabled(v) \
> -    ((v)->arch.hvm_vcpu.guest_efer & EFER_LMA)
> +#define hvm_long_mode_active(v) (!!((v)->arch.hvm_vcpu.guest_efer &
> EFER_LMA))
> 
>  enum hvm_intblk
>  hvm_interrupt_blocked(struct vcpu *v, struct hvm_intack intack);
> --
> 2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate
  2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
  2017-04-03  8:26   ` Paul Durrant
@ 2017-04-03  8:30   ` Jan Beulich
  2017-04-03  8:50   ` George Dunlap
  2017-04-05  7:08   ` Tian, Kevin
  3 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2017-04-03  8:30 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Tim Deegan,
	Xen-devel, Julien Grall, Paul Durrant, Jun Nakajima,
	Boris Ostrovsky

>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
> @@ -1154,13 +1154,13 @@ static void virtual_vmentry(struct cpu_user_regs *regs)
>      /*
>       * EFER handling:
>       * hvm_set_efer won't work if CR0.PG = 1, so we change the value
> -     * directly to make hvm_long_mode_enabled(v) work in L2.
> +     * directly to make hvm_long_mode_active(v) work in L2.
>       * An additional update_paging_modes is also needed if
>       * there is 32/64 switch. v->arch.hvm_vcpu.guest_efer doesn't
>       * need to be saved, since its value on vmexit is determined by
>       * L1 exit_controls
>       */
> -    lm_l1 = !!hvm_long_mode_enabled(v);
> +    lm_l1 = !!hvm_long_mode_active(v);

Mind dropping the now stray !! here and ...

> @@ -1359,7 +1359,7 @@ static void virtual_vmexit(struct cpu_user_regs *regs)
>      nvcpu->nv_vmexit_pending = 0;
>      nvcpu->nv_vmswitch_in_progress = 1;
>  
> -    lm_l2 = !!hvm_long_mode_enabled(v);
> +    lm_l2 = !!hvm_long_mode_active(v);

... here?

> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -690,10 +690,10 @@ static void hap_update_cr3(struct vcpu *v, int do_locking)
>  const struct paging_mode *
>  hap_paging_get_mode(struct vcpu *v)
>  {
> -    return !hvm_paging_enabled(v)   ? &hap_paging_real_mode :
> -        hvm_long_mode_enabled(v) ? &hap_paging_long_mode :
> -        hvm_pae_enabled(v)       ? &hap_paging_pae_mode  :
> -                                   &hap_paging_protected_mode;
> +    return !hvm_paging_enabled(v) ? &hap_paging_real_mode :
> +        hvm_long_mode_active(v)   ? &hap_paging_long_mode :
> +        hvm_pae_enabled(v)        ? &hap_paging_pae_mode  :
> +                                    &hap_paging_protected_mode;

If you fiddle with alignment of fields here already, I think it would
be nice if you also made the control expressions align:

    return !hvm_paging_enabled(v)  ? &hap_paging_real_mode :
           hvm_long_mode_active(v) ? &hap_paging_long_mode :
           hvm_pae_enabled(v)      ? &hap_paging_pae_mode  :
                                     &hap_paging_protected_mode;

With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-03-31 19:50 ` [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments Andrew Cooper
@ 2017-04-03  8:31   ` Paul Durrant
  2017-04-03  9:13   ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Paul Durrant @ 2017-04-03  8:31 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Julien Grall, Tim (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 31 March 2017 20:51
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system
> segments
> 
> c/s c785f759718 "x86/emul: Prepare to allow use of system segments for
> memory
> references" made alterations to hvm_virtual_to_linear_addr() to allow for
> the
> use of system segments.
> 
> However, the determination of which segmentation mode to use was based
> on the
> current address size from emulation.  In particular, it is wrong for system
> segment accesses while executing in a compatibility mode code segment.
> 
> Replace the addr_size parameter with a new segmentation mode
> enumeration (to
> prevent this mistake from being made again), and introduce a new helper to
> calculate the correct segmenation mode from current register state.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Julien Grall <julien.grall@arm.com>
> 
> This is the same logical bug that caused XSA-196, but luckily hasn't yet been
> in a released version of Xen.
> ---
>  xen/arch/x86/hvm/emulate.c      | 17 ++++++++++------
>  xen/arch/x86/hvm/hvm.c          | 45 +++++++++++++++++++++++++++++----
> --------
>  xen/arch/x86/mm/shadow/common.c |  5 ++++-
>  xen/include/asm-x86/hvm/hvm.h   | 12 ++++++++++-
>  4 files changed, 58 insertions(+), 21 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 3d084ca..f578796 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -508,6 +508,7 @@ static int hvmemul_virtual_to_linear(
>      struct hvm_emulate_ctxt *hvmemul_ctxt,
>      unsigned long *linear)
>  {
> +    enum hvm_segmentation_mode seg_mode;
>      struct segment_register *reg;
>      int okay;
>      unsigned long max_reps = 4096;
> @@ -518,6 +519,9 @@ static int hvmemul_virtual_to_linear(
>          return X86EMUL_OKAY;
>      }
> 
> +    seg_mode = hvm_seg_mode(
> +        current, seg, hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt));
> +
>      /*
>       * If introspection has been enabled for this domain, and we're emulating
>       * becase a vm_reply asked us to (i.e. not doing regular IO) reps should
> @@ -548,8 +552,7 @@ static int hvmemul_virtual_to_linear(
>          ASSERT(offset >= ((*reps - 1) * bytes_per_rep));
>          okay = hvm_virtual_to_linear_addr(
>              seg, reg, offset - (*reps - 1) * bytes_per_rep,
> -            *reps * bytes_per_rep, access_type,
> -            hvmemul_ctxt->ctxt.addr_size, linear);
> +            *reps * bytes_per_rep, access_type, seg_mode, linear);
>          *linear += (*reps - 1) * bytes_per_rep;
>          if ( hvmemul_ctxt->ctxt.addr_size != 64 )
>              *linear = (uint32_t)*linear;
> @@ -557,8 +560,8 @@ static int hvmemul_virtual_to_linear(
>      else
>      {
>          okay = hvm_virtual_to_linear_addr(
> -            seg, reg, offset, *reps * bytes_per_rep, access_type,
> -            hvmemul_ctxt->ctxt.addr_size, linear);
> +            seg, reg, offset, *reps * bytes_per_rep,
> +            access_type, seg_mode, linear);
>      }
> 
>      if ( okay )
> @@ -2068,14 +2071,16 @@ void hvm_emulate_init_per_insn(
>      hvmemul_ctxt->insn_buf_eip = hvmemul_ctxt->ctxt.regs->rip;
>      if ( !insn_bytes )
>      {
> +        enum hvm_segmentation_mode seg_mode =
> +            hvm_seg_mode(curr, x86_seg_cs, &hvmemul_ctxt-
> >seg_reg[x86_seg_cs]);
> +
>          hvmemul_ctxt->insn_buf_bytes =
>              hvm_get_insn_bytes(curr, hvmemul_ctxt->insn_buf) ?:
>              (hvm_virtual_to_linear_addr(x86_seg_cs,
>                                          &hvmemul_ctxt->seg_reg[x86_seg_cs],
>                                          hvmemul_ctxt->insn_buf_eip,
>                                          sizeof(hvmemul_ctxt->insn_buf),
> -                                        hvm_access_insn_fetch,
> -                                        hvmemul_ctxt->ctxt.addr_size,
> +                                        hvm_access_insn_fetch, seg_mode,
>                                          &addr) &&
>               hvm_fetch_from_guest_linear(hvmemul_ctxt->insn_buf, addr,
>                                           sizeof(hvmemul_ctxt->insn_buf),
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 9f83cd8..f250afb 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t
> may_defer)
>      return X86EMUL_OKAY;
>  }
> 
> +enum hvm_segmentation_mode hvm_seg_mode(
> +    const struct vcpu *v, enum x86_segment seg,
> +    const struct segment_register *cs)
> +{
> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
> +        return hvm_seg_mode_real;
> +
> +    if ( hvm_long_mode_active(v) &&
> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
> +        return hvm_seg_mode_long;
> +
> +    return hvm_seg_mode_prot;
> +}
> +
>  bool_t hvm_virtual_to_linear_addr(
>      enum x86_segment seg,
>      const struct segment_register *reg,
>      unsigned long offset,
>      unsigned int bytes,
>      enum hvm_access_type access_type,
> -    unsigned int addr_size,
> +    enum hvm_segmentation_mode seg_mode,
>      unsigned long *linear_addr)
>  {
>      unsigned long addr = offset, last_byte;
> @@ -2394,8 +2408,9 @@ bool_t hvm_virtual_to_linear_addr(
>       */
>      ASSERT(seg < x86_seg_none);
> 
> -    if ( !(current->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
> +    switch ( seg_mode )
>      {
> +    case hvm_seg_mode_real:
>          /*
>           * REAL MODE: Don't bother with segment access checks.
>           * Certain of them are not done in native real mode anyway.
> @@ -2404,11 +2419,11 @@ bool_t hvm_virtual_to_linear_addr(
>          last_byte = (uint32_t)addr + bytes - !!bytes;
>          if ( last_byte < addr )
>              goto out;
> -    }
> -    else if ( addr_size != 64 )
> -    {
> +        break;
> +
> +    case hvm_seg_mode_prot:
>          /*
> -         * COMPATIBILITY MODE: Apply segment checks and add base.
> +         * PROTECTED/COMPATIBILITY MODE: Apply segment checks and add
> base.
>           */
> 
>          /*
> @@ -2454,9 +2469,9 @@ bool_t hvm_virtual_to_linear_addr(
>          }
>          else if ( (last_byte > reg->limit) || (last_byte < offset) )
>              goto out; /* last byte is beyond limit or wraps 0xFFFFFFFF */
> -    }
> -    else
> -    {
> +        break;
> +
> +    case hvm_seg_mode_long:
>          /*
>           * User segments are always treated as present.  System segment may
>           * not be, and also incur limit checks.
> @@ -2476,6 +2491,11 @@ bool_t hvm_virtual_to_linear_addr(
>          if ( !is_canonical_address(addr) || last_byte < addr ||
>               !is_canonical_address(last_byte) )
>              goto out;
> +        break;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +        goto out;
>      }
> 
>      /* All checks ok. */
> @@ -3024,7 +3044,7 @@ void hvm_task_switch(
>              sp = regs->sp -= opsz;
>          if ( hvm_virtual_to_linear_addr(x86_seg_ss, &segr, sp, opsz,
>                                          hvm_access_write,
> -                                        16 << segr.attr.fields.db,
> +                                        hvm_seg_mode_prot,
>                                          &linear_addr) )
>          {
>              rc = hvm_copy_to_guest_linear(linear_addr, &errcode, opsz, 0,
> @@ -3600,14 +3620,13 @@ void hvm_ud_intercept(struct cpu_user_regs
> *regs)
>          const struct segment_register *cs = &ctxt.seg_reg[x86_seg_cs];
>          uint32_t walk = (ctxt.seg_reg[x86_seg_ss].attr.fields.dpl == 3)
>              ? PFEC_user_mode : 0;
> +        enum hvm_segmentation_mode seg_mode = hvm_seg_mode(cur,
> x86_seg_cs, cs);
>          unsigned long addr;
>          char sig[5]; /* ud2; .ascii "xen" */
> 
>          if ( hvm_virtual_to_linear_addr(x86_seg_cs, cs, regs->rip,
>                                          sizeof(sig), hvm_access_insn_fetch,
> -                                        (hvm_long_mode_active(cur) &&
> -                                         cs->attr.fields.l) ? 64 :
> -                                        cs->attr.fields.db ? 32 : 16, &addr) &&
> +                                        seg_mode, &addr) &&
>               (hvm_fetch_from_guest_linear(sig, addr, sizeof(sig),
>                                            walk, NULL) == HVMCOPY_okay) &&
>               (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) )
> diff --git a/xen/arch/x86/mm/shadow/common.c
> b/xen/arch/x86/mm/shadow/common.c
> index 14a07dd..551a7d3 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -144,6 +144,7 @@ static int hvm_translate_virtual_addr(
>      struct sh_emulate_ctxt *sh_ctxt,
>      unsigned long *linear)
>  {
> +    enum hvm_segmentation_mode seg_mode;
>      const struct segment_register *reg;
>      int okay;
> 
> @@ -151,8 +152,10 @@ static int hvm_translate_virtual_addr(
>      if ( IS_ERR(reg) )
>          return -PTR_ERR(reg);
> 
> +    seg_mode = hvm_seg_mode(current, seg,
> hvm_get_seg_reg(x86_seg_cs, sh_ctxt));
> +
>      okay = hvm_virtual_to_linear_addr(
> -        seg, reg, offset, bytes, access_type, sh_ctxt->ctxt.addr_size, linear);
> +        seg, reg, offset, bytes, access_type, seg_mode, linear);
> 
>      if ( !okay )
>      {
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-
> x86/hvm/hvm.h
> index 49c8001..ed6994c 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -460,6 +460,16 @@ void hvm_task_switch(
>      uint16_t tss_sel, enum hvm_task_switch_reason taskswitch_reason,
>      int32_t errcode);
> 
> +enum hvm_segmentation_mode {
> +    hvm_seg_mode_real,
> +    hvm_seg_mode_prot,
> +    hvm_seg_mode_long,
> +};
> +
> +enum hvm_segmentation_mode hvm_seg_mode(
> +    const struct vcpu *v, enum x86_segment seg,
> +    const struct segment_register *cs);
> +
>  enum hvm_access_type {
>      hvm_access_insn_fetch,
>      hvm_access_none,
> @@ -472,7 +482,7 @@ bool_t hvm_virtual_to_linear_addr(
>      unsigned long offset,
>      unsigned int bytes,
>      enum hvm_access_type access_type,
> -    unsigned int addr_size,
> +    enum hvm_segmentation_mode seg_mode,
>      unsigned long *linear_addr);
> 
>  void *hvm_map_guest_frame_rw(unsigned long gfn, bool_t permanent,
> --
> 2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure
  2017-03-31 19:50 ` [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure Andrew Cooper
@ 2017-04-03  8:36   ` Paul Durrant
  2017-04-03  9:38   ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Paul Durrant @ 2017-04-03  8:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Julien Grall, Tim (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 31 March 2017 20:51
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure
> 
> With the SVM injection logic capable of doing its own emulation, there is no
> need for this hardware-specific assistance in the common emulator.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

emulate parts...

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Julien Grall <julien.grall@arm.com>
> ---
>  tools/fuzz/x86_instruction_emulator/fuzz-emul.c |  18 +--
>  xen/arch/x86/hvm/emulate.c                      |   7 -
>  xen/arch/x86/mm.c                               |   2 -
>  xen/arch/x86/mm/shadow/common.c                 |   1 -
>  xen/arch/x86/x86_emulate/x86_emulate.c          | 187 ++++--------------------
>  xen/arch/x86/x86_emulate/x86_emulate.h          |  53 -------
>  6 files changed, 30 insertions(+), 238 deletions(-)
> 
> diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> index 890642c..8488816 100644
> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> @@ -536,8 +536,7 @@ enum {
>      HOOK_put_fpu,
>      HOOK_invlpg,
>      HOOK_vmfunc,
> -    OPTION_swint_emulation, /* Two bits */
> -    CANONICALIZE_rip = OPTION_swint_emulation + 2,
> +    CANONICALIZE_rip,
>      CANONICALIZE_rsp,
>      CANONICALIZE_rbp
>  };
> @@ -577,19 +576,6 @@ static void disable_hooks(void)
>      MAYBE_DISABLE_HOOK(invlpg);
>  }
> 
> -static void set_swint_support(struct x86_emulate_ctxt *ctxt)
> -{
> -    unsigned int swint_opt = (input.options >> OPTION_swint_emulation) &
> 3;
> -    static const enum x86_swint_emulation map[4] = {
> -        x86_swint_emulate_none,
> -        x86_swint_emulate_none,
> -        x86_swint_emulate_icebp,
> -        x86_swint_emulate_all
> -    };
> -
> -    ctxt->swint_emulate = map[swint_opt];
> -}
> -
>  /*
>   * Constrain input to architecturally-possible states where
>   * the emulator relies on these
> @@ -693,8 +679,6 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p,
> size_t size)
> 
>      disable_hooks();
> 
> -    set_swint_support(&ctxt);
> -
>      do {
>          /* FIXME: Until we actually implement SIGFPE handling properly */
>          setup_fpu_exception_handler();
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index f578796..efac2e9 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -2036,13 +2036,6 @@ void hvm_emulate_init_once(
>      hvmemul_ctxt->ctxt.regs = regs;
>      hvmemul_ctxt->ctxt.vendor = curr->domain->arch.cpuid->x86_vendor;
>      hvmemul_ctxt->ctxt.force_writeback = true;
> -
> -    if ( cpu_has_vmx )
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> -    else if ( cpu_has_svm_nrips )
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
> -    else
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
>  }
> 
>  void hvm_emulate_init_per_insn(
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index be4e308..3918a37 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5412,7 +5412,6 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned
> long addr,
>              .vendor = d->arch.cpuid->x86_vendor,
>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
> -            .swint_emulate = x86_swint_emulate_none,
>          },
>      };
>      int rc;
> @@ -5567,7 +5566,6 @@ int mmio_ro_do_page_fault(struct vcpu *v,
> unsigned long addr,
>          .vendor = v->domain->arch.cpuid->x86_vendor,
>          .addr_size = addr_size,
>          .sp_size = addr_size,
> -        .swint_emulate = x86_swint_emulate_none,
>          .data = &mmio_ro_ctxt
>      };
>      int rc;
> diff --git a/xen/arch/x86/mm/shadow/common.c
> b/xen/arch/x86/mm/shadow/common.c
> index 551a7d3..2fc796b 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -328,7 +328,6 @@ const struct x86_emulate_ops
> *shadow_init_emulation(
> 
>      sh_ctxt->ctxt.regs = regs;
>      sh_ctxt->ctxt.vendor = v->domain->arch.cpuid->x86_vendor;
> -    sh_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> 
>      /* Segment cache initialisation. Primed with CS. */
>      creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 7af8a42..7315ca6 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1999,142 +1999,6 @@ static bool umip_active(struct x86_emulate_ctxt
> *ctxt,
>             (cr4 & X86_CR4_UMIP);
>  }
> 
> -/* Inject a software interrupt/exception, emulating if needed. */
> -static int inject_swint(enum x86_swint_type type,
> -                        uint8_t vector, uint8_t insn_len,
> -                        struct x86_emulate_ctxt *ctxt,
> -                        const struct x86_emulate_ops *ops)
> -{
> -    int rc, error_code, fault_type = EXC_GP;
> -
> -    /*
> -     * Without hardware support, injecting software interrupts/exceptions is
> -     * problematic.
> -     *
> -     * All software methods of generating exceptions (other than BOUND)
> yield
> -     * traps, so eip in the exception frame needs to point after the
> -     * instruction, not at it.
> -     *
> -     * However, if injecting it as a hardware exception causes a fault during
> -     * delivery, our adjustment of eip will cause the fault to be reported
> -     * after the faulting instruction, not pointing to it.
> -     *
> -     * Therefore, eip can only safely be wound forwards if we are certain that
> -     * injecting an equivalent hardware exception won't fault, which means
> -     * emulating everything the processor would do on a control transfer.
> -     *
> -     * However, emulation of complete control transfers is very complicated.
> -     * All we care about is that guest userspace cannot avoid the descriptor
> -     * DPL check by using the Xen emulator, and successfully invoke DPL=0
> -     * descriptors.
> -     *
> -     * Any OS which would further fault during injection is going to receive a
> -     * double fault anyway, and won't be in a position to care that the
> -     * faulting eip is incorrect.
> -     */
> -
> -    if ( (ctxt->swint_emulate == x86_swint_emulate_all) ||
> -         ((ctxt->swint_emulate == x86_swint_emulate_icebp) &&
> -          (type == x86_swint_icebp)) )
> -    {
> -        if ( !in_realmode(ctxt, ops) )
> -        {
> -            unsigned int idte_size, idte_offset;
> -            struct { uint32_t a, b, c, d; } idte = {};
> -            int lm = in_longmode(ctxt, ops);
> -
> -            if ( lm < 0 )
> -                return X86EMUL_UNHANDLEABLE;
> -
> -            idte_size = lm ? 16 : 8;
> -            idte_offset = vector * idte_size;
> -
> -            /* icebp sets the External Event bit despite being an instruction. */
> -            error_code = (vector << 3) | ECODE_IDT |
> -                (type == x86_swint_icebp ? ECODE_EXT : 0);
> -
> -            /*
> -             * TODO - this does not cover the v8086 mode with CR4.VME case
> -             * correctly, but falls on the safe side from the point of view of
> -             * a 32bit OS.  Someone with many TUITs can see about reading the
> -             * TSS Software Interrupt Redirection bitmap.
> -             */
> -            if ( (ctxt->regs->eflags & X86_EFLAGS_VM) &&
> -                 ((ctxt->regs->eflags & X86_EFLAGS_IOPL) != X86_EFLAGS_IOPL) )
> -                goto raise_exn;
> -
> -            /*
> -             * Read all 8/16 bytes so the idtr limit check is applied properly
> -             * to this entry, even though we only end up looking at the 2nd
> -             * word.
> -             */
> -            switch ( rc = ops->read(x86_seg_idtr, idte_offset,
> -                                    &idte, idte_size, ctxt) )
> -            {
> -            case X86EMUL_OKAY:
> -                break;
> -
> -            case X86EMUL_EXCEPTION:
> -                if ( !ctxt->event_pending )
> -                    goto raise_exn;
> -                /* fallthrough */
> -
> -            default:
> -                return rc;
> -            }
> -
> -            /* This must be an interrupt, trap, or task gate. */
> -#ifdef __XEN__
> -            switch ( (idte.b >> 8) & 0x1f )
> -            {
> -            case SYS_DESC_irq_gate:
> -            case SYS_DESC_trap_gate:
> -                break;
> -            case SYS_DESC_irq_gate16:
> -            case SYS_DESC_trap_gate16:
> -            case SYS_DESC_task_gate:
> -                if ( !lm )
> -                    break;
> -                /* fall through */
> -            default:
> -                goto raise_exn;
> -            }
> -#endif
> -
> -            /* The 64-bit high half's type must be zero. */
> -            if ( idte.d & 0x1f00 )
> -                goto raise_exn;
> -
> -            /* icebp counts as a hardware event, and bypasses the dpl check. */
> -            if ( type != x86_swint_icebp )
> -            {
> -                int cpl = get_cpl(ctxt, ops);
> -
> -                fail_if(cpl < 0);
> -
> -                if ( cpl > ((idte.b >> 13) & 3) )
> -                    goto raise_exn;
> -            }
> -
> -            /* Is this entry present? */
> -            if ( !(idte.b & (1u << 15)) )
> -            {
> -                fault_type = EXC_NP;
> -                goto raise_exn;
> -            }
> -        }
> -    }
> -
> -    x86_emul_software_event(type, vector, insn_len, ctxt);
> -    rc = X86EMUL_OKAY;
> -
> - done:
> -    return rc;
> -
> - raise_exn:
> -    generate_exception(fault_type, error_code);
> -}
> -
>  static void adjust_bnd(struct x86_emulate_ctxt *ctxt,
>                         const struct x86_emulate_ops *ops, enum vex_pfx pfx)
>  {
> @@ -3101,7 +2965,6 @@ x86_emulate(
>      struct operand src = { .reg = PTR_POISON };
>      struct operand dst = { .reg = PTR_POISON };
>      unsigned long cr4;
> -    enum x86_swint_type swint_type;
>      struct fpu_insn_ctxt fic = { .type = X86EMUL_FPU_none, .exn_raised = -1
> };
>      struct x86_emulate_stub stub = {};
>      DECLARE_ALIGNED(mmval_t, mmval);
> @@ -4103,25 +3966,38 @@ x86_emulate(
>              goto done;
>          break;
> 
> -    case 0xcc: /* int3 */
> -        src.val = EXC_BP;
> -        swint_type = x86_swint_int3;
> -        goto swint;
> -
> -    case 0xcd: /* int imm8 */
> -        swint_type = x86_swint_int;
> -    swint:
> -        rc = inject_swint(swint_type, (uint8_t)src.val,
> -                          _regs.r(ip) - ctxt->regs->r(ip),
> -                          ctxt, ops) ? : X86EMUL_EXCEPTION;
> -        goto done;
> -
>      case 0xce: /* into */
>          if ( !(_regs.eflags & X86_EFLAGS_OF) )
>              break;
> -        src.val = EXC_OF;
> -        swint_type = x86_swint_into;
> -        goto swint;
> +        /* Fallthrough */
> +    case 0xcc: /* int3 */
> +    case 0xcd: /* int imm8 */
> +    case 0xf1: /* int1 (icebp) */
> +        ASSERT(!ctxt->event_pending);
> +        switch ( ctxt->opcode )
> +        {
> +        case 0xcc: /* int3 */
> +            ctxt->event.vector = EXC_BP;
> +            ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
> +            break;
> +        case 0xcd: /* int imm8 */
> +            ctxt->event.vector = src.val;
> +            ctxt->event.type = X86_EVENTTYPE_SW_INTERRUPT;
> +            break;
> +        case 0xce: /* into */
> +            ctxt->event.vector = EXC_OF;
> +            ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
> +            break;
> +        case 0xf1: /* icebp */
> +            ctxt->event.vector = EXC_DB;
> +            ctxt->event.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
> +            break;
> +        }
> +        ctxt->event.error_code = X86_EVENT_NO_EC;
> +        ctxt->event.insn_len = _regs.r(ip) - ctxt->regs->r(ip);
> +        ctxt->event_pending = true;
> +        rc = X86EMUL_EXCEPTION;
> +        goto done;
> 
>      case 0xcf: /* iret */ {
>          unsigned long sel, eip, eflags;
> @@ -4782,11 +4658,6 @@ x86_emulate(
>              goto done;
>          break;
> 
> -    case 0xf1: /* int1 (icebp) */
> -        src.val = EXC_DB;
> -        swint_type = x86_swint_icebp;
> -        goto swint;
> -
>      case 0xf4: /* hlt */
>          generate_exception_if(!mode_ring0(), EXC_GP, 0);
>          ctxt->retire.hlt = true;
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h
> b/xen/arch/x86/x86_emulate/x86_emulate.h
> index 9c5fcde..215adf6 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -60,27 +60,6 @@ static inline bool is_x86_system_segment(enum
> x86_segment seg)
>      return seg >= x86_seg_tr && seg < x86_seg_none;
>  }
> 
> -/* Classification of the types of software generated interrupts/exceptions.
> */
> -enum x86_swint_type {
> -    x86_swint_icebp, /* 0xf1 */
> -    x86_swint_int3,  /* 0xcc */
> -    x86_swint_into,  /* 0xce */
> -    x86_swint_int,   /* 0xcd $n */
> -};
> -
> -/*
> - * How much help is required with software event injection?
> - *
> - * All software events return from x86_emulate() with
> X86EMUL_EXCEPTION and
> - * fault-like semantics.  This just controls whether the emulator performs
> - * presence/dpl/etc checks and possibly raises exceptions instead.
> - */
> -enum x86_swint_emulation {
> -    x86_swint_emulate_none, /* Hardware supports all software injection
> properly */
> -    x86_swint_emulate_icebp,/* Help needed with `icebp` (0xf1) */
> -    x86_swint_emulate_all,  /* Help needed with all software events */
> -};
> -
>  /*
>   * x86 event types. This enumeration is valid for:
>   *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
> @@ -472,9 +451,6 @@ struct x86_emulate_ctxt
>       * Input-only state:
>       */
> 
> -    /* Software event injection support. */
> -    enum x86_swint_emulation swint_emulate;
> -
>      /* CPU vendor (X86_VENDOR_UNKNOWN for "don't care") */
>      unsigned char vendor;
> 
> @@ -699,35 +675,6 @@ static inline void x86_emul_pagefault(
>      ctxt->event_pending = true;
>  }
> 
> -static inline void x86_emul_software_event(
> -    enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
> -    struct x86_emulate_ctxt *ctxt)
> -{
> -    ASSERT(!ctxt->event_pending);
> -
> -    switch ( type )
> -    {
> -    case x86_swint_icebp:
> -        ctxt->event.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
> -        break;
> -
> -    case x86_swint_int3:
> -    case x86_swint_into:
> -        ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
> -        break;
> -
> -    case x86_swint_int:
> -        ctxt->event.type = X86_EVENTTYPE_SW_INTERRUPT;
> -        break;
> -    }
> -
> -    ctxt->event.vector = vector;
> -    ctxt->event.error_code = X86_EVENT_NO_EC;
> -    ctxt->event.insn_len = insn_len;
> -
> -    ctxt->event_pending = true;
> -}
> -
>  static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
>  {
>      ctxt->event_pending = false;
> --
> 2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-03-31 19:50 ` [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context Andrew Cooper
@ 2017-04-03  8:40   ` Paul Durrant
  2017-04-03 10:10   ` Jan Beulich
  2017-04-05 16:07   ` Jan Beulich
  2 siblings, 0 replies; 37+ messages in thread
From: Paul Durrant @ 2017-04-03  8:40 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Julien Grall, Tim (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 31 March 2017 20:51
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>
> Subject: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the
> emulation context
> 
> Long mode (or not) influences emulation behaviour in a number of cases.
> Instead of reusing the ->read_msr() hook to obtain EFER.LMA, require callers
> to provide it directly.
> 
> This simplifies all long mode checks during emulation to a simple boolean
> read, removing embedded msr reads.  It also allows for the removal of a local
> variable in the sysenter emulation block, and removes a latent bug in the
> syscall emulation block where rc contains a non X86EMUL_* constant for a
> period of time.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

emulate parts...

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Julien Grall <julien.grall@arm.com>
> ---
>  tools/fuzz/x86_instruction_emulator/fuzz-emul.c |  1 +
>  tools/tests/x86_emulator/test_x86_emulator.c    |  4 ++
>  xen/arch/x86/hvm/emulate.c                      |  4 +-
>  xen/arch/x86/mm.c                               |  2 +
>  xen/arch/x86/mm/shadow/common.c                 |  5 +--
>  xen/arch/x86/traps.c                            |  1 +
>  xen/arch/x86/x86_emulate/x86_emulate.c          | 51 ++++++-------------------
>  xen/arch/x86/x86_emulate/x86_emulate.h          |  3 ++
>  8 files changed, 28 insertions(+), 43 deletions(-)
> 
> diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> index 8488816..2e42e54 100644
> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> @@ -649,6 +649,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p,
> size_t size)
>      struct cpu_user_regs regs = {};
>      struct x86_emulate_ctxt ctxt = {
>          .regs = &regs,
> +        .lma = sizeof(void *) == 8,
>          .addr_size = 8 * sizeof(void *),
>          .sp_size = 8 * sizeof(void *),
>      };
> diff --git a/tools/tests/x86_emulator/test_x86_emulator.c
> b/tools/tests/x86_emulator/test_x86_emulator.c
> index 5be8ddc..efeb175 100644
> --- a/tools/tests/x86_emulator/test_x86_emulator.c
> +++ b/tools/tests/x86_emulator/test_x86_emulator.c
> @@ -319,6 +319,7 @@ int main(int argc, char **argv)
>      ctxt.regs = &regs;
>      ctxt.force_writeback = 0;
>      ctxt.vendor    = X86_VENDOR_UNKNOWN;
> +    ctxt.lma       = sizeof(void *) == 8;
>      ctxt.addr_size = 8 * sizeof(void *);
>      ctxt.sp_size   = 8 * sizeof(void *);
> 
> @@ -2922,6 +2923,7 @@ int main(int argc, char **argv)
>      {
>          decl_insn(vzeroupper);
> 
> +        ctxt.lma = false;
>          ctxt.sp_size = ctxt.addr_size = 32;
> 
>          asm volatile ( "vxorps %xmm2, %xmm2, %xmm3\n"
> @@ -2949,6 +2951,7 @@ int main(int argc, char **argv)
>              goto fail;
>          printf("okay\n");
> 
> +        ctxt.lma = true;
>          ctxt.sp_size = ctxt.addr_size = 64;
>      }
>      else
> @@ -3001,6 +3004,7 @@ int main(int argc, char **argv)
>              continue;
> 
>          memcpy(res, blobs[j].code, blobs[j].size);
> +        ctxt.lma = blobs[j].bitness == 64;
>          ctxt.addr_size = ctxt.sp_size = blobs[j].bitness;
> 
>          if ( ctxt.addr_size == sizeof(void *) * CHAR_BIT )
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index efac2e9..65cb707 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -2047,7 +2047,9 @@ void hvm_emulate_init_per_insn(
>      unsigned int pfec = PFEC_page_present;
>      unsigned long addr;
> 
> -    if ( hvm_long_mode_active(curr) &&
> +    hvmemul_ctxt->ctxt.lma = hvm_long_mode_active(curr);
> +
> +    if ( hvmemul_ctxt->ctxt.lma &&
>           hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
>          hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
>      else
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 3918a37..eda220d 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned
> long addr,
>          .ctxt = {
>              .regs = regs,
>              .vendor = d->arch.cpuid->x86_vendor,
> +            .lma = true,
>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>          },
> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v,
> unsigned long addr,
>      struct x86_emulate_ctxt ctxt = {
>          .regs = regs,
>          .vendor = v->domain->arch.cpuid->x86_vendor,
> +        .lma = true,
>          .addr_size = addr_size,
>          .sp_size = addr_size,
>          .data = &mmio_ro_ctxt
> diff --git a/xen/arch/x86/mm/shadow/common.c
> b/xen/arch/x86/mm/shadow/common.c
> index 2fc796b..e42d3fd 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -328,15 +328,14 @@ const struct x86_emulate_ops
> *shadow_init_emulation(
> 
>      sh_ctxt->ctxt.regs = regs;
>      sh_ctxt->ctxt.vendor = v->domain->arch.cpuid->x86_vendor;
> +    sh_ctxt->ctxt.lma = hvm_long_mode_active(v);
> 
>      /* Segment cache initialisation. Primed with CS. */
>      creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
> 
>      /* Work out the emulation mode. */
> -    if ( hvm_long_mode_active(v) && creg->attr.fields.l )
> -    {
> +    if ( sh_ctxt->ctxt.lma && creg->attr.fields.l )
>          sh_ctxt->ctxt.addr_size = sh_ctxt->ctxt.sp_size = 64;
> -    }
>      else
>      {
>          sreg = hvm_get_seg_reg(x86_seg_ss, sh_ctxt);
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 0d54baf..09dc2f1 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -2966,6 +2966,7 @@ static int emulate_privileged_op(struct
> cpu_user_regs *regs)
>      struct priv_op_ctxt ctxt = {
>          .ctxt.regs = regs,
>          .ctxt.vendor = currd->arch.cpuid->x86_vendor,
> +        .ctxt.lma = true,
>      };
>      int rc;
>      unsigned int eflags, ar;
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 7315ca6..cc354bc 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -968,11 +968,10 @@ do {                                                                    \
> 
>  #define validate_far_branch(cs, ip) ({                                  \
>      if ( sizeof(ip) <= 4 ) {                                            \
> -        ASSERT(in_longmode(ctxt, ops) <= 0);                            \
> +        ASSERT(!ctxt->lma);                                             \
>          generate_exception_if((ip) > (cs)->limit, EXC_GP, 0);           \
>      } else                                                              \
> -        generate_exception_if(in_longmode(ctxt, ops) &&                 \
> -                              (cs)->attr.fields.l                       \
> +        generate_exception_if(ctxt->lma && (cs)->attr.fields.l          \
>                                ? !is_canonical_address(ip)               \
>                                : (ip) > (cs)->limit, EXC_GP, 0);         \
>  })
> @@ -1630,20 +1629,6 @@ static bool vcpu_has(
>  #endif
> 
>  static int
> -in_longmode(
> -    struct x86_emulate_ctxt *ctxt,
> -    const struct x86_emulate_ops *ops)
> -{
> -    uint64_t efer;
> -
> -    if ( !ops->read_msr ||
> -         unlikely(ops->read_msr(MSR_EFER, &efer, ctxt) != X86EMUL_OKAY) )
> -        return -1;
> -
> -    return !!(efer & EFER_LMA);
> -}
> -
> -static int
>  realmode_load_seg(
>      enum x86_segment seg,
>      uint16_t sel,
> @@ -1767,8 +1752,7 @@ protmode_load_seg(
>           * Experimentally in long mode, the L and D bits are checked before
>           * the Present bit.
>           */
> -        if ( in_longmode(ctxt, ops) &&
> -             (desc.b & (1 << 21)) && (desc.b & (1 << 22)) )
> +        if ( ctxt->lma && (desc.b & (1 << 21)) && (desc.b & (1 << 22)) )
>              goto raise_exn;
>          sel = (sel ^ rpl) | cpl;
>          break;
> @@ -1818,10 +1802,8 @@ protmode_load_seg(
> 
>      if ( !is_x86_user_segment(seg) )
>      {
> -        int lm = (desc.b & (1u << 12)) ? 0 : in_longmode(ctxt, ops);
> +        bool lm = (desc.b & (1u << 12)) ? 0 : ctxt->lma;
> 
> -        if ( lm < 0 )
> -            return X86EMUL_UNHANDLEABLE;
>          if ( lm )
>          {
>              switch ( rc = ops->read(sel_seg, (sel & 0xfff8) + 8,
> @@ -5195,7 +5177,7 @@ x86_emulate(
>                  case 0x03: /* busy 16-bit TSS */
>                  case 0x04: /* 16-bit call gate */
>                  case 0x05: /* 16/32-bit task gate */
> -                    if ( in_longmode(ctxt, ops) )
> +                    if ( ctxt->lma )
>                          break;
>                      /* fall through */
>                  case 0x02: /* LDT */
> @@ -5242,7 +5224,7 @@ x86_emulate(
>                  {
>                  case 0x01: /* available 16-bit TSS */
>                  case 0x03: /* busy 16-bit TSS */
> -                    if ( in_longmode(ctxt, ops) )
> +                    if ( ctxt->lma )
>                          break;
>                      /* fall through */
>                  case 0x02: /* LDT */
> @@ -5292,10 +5274,7 @@ x86_emulate(
>          sreg.attr.bytes = 0xc93; /* G+DB+P+S+Data */
> 
>  #ifdef __x86_64__
> -        rc = in_longmode(ctxt, ops);
> -        if ( rc < 0 )
> -            goto cannot_emulate;
> -        if ( rc )
> +        if ( ctxt->lma )
>          {
>              cs.attr.bytes = 0xa9b; /* L+DB+P+S+Code */
> 
> @@ -5732,9 +5711,7 @@ x86_emulate(
>              dst.val = src.val;
>          break;
> 
> -    case X86EMUL_OPC(0x0f, 0x34): /* sysenter */ {
> -        int lm;
> -
> +    case X86EMUL_OPC(0x0f, 0x34): /* sysenter */
>          vcpu_must_have(sep);
>          generate_exception_if(mode_ring0(), EXC_GP, 0);
>          generate_exception_if(!in_protmode(ctxt, ops), EXC_GP, 0);
> @@ -5745,17 +5722,14 @@ x86_emulate(
>              goto done;
> 
>          generate_exception_if(!(msr_val & 0xfffc), EXC_GP, 0);
> -        lm = in_longmode(ctxt, ops);
> -        if ( lm < 0 )
> -            goto cannot_emulate;
> 
>          _regs.eflags &= ~(X86_EFLAGS_VM | X86_EFLAGS_IF |
> X86_EFLAGS_RF);
> 
>          cs.sel = msr_val & ~3; /* SELECTOR_RPL_MASK */
>          cs.base = 0;   /* flat segment */
>          cs.limit = ~0u;  /* 4GB limit */
> -        cs.attr.bytes = lm ? 0xa9b  /* G+L+P+S+Code */
> -                           : 0xc9b; /* G+DB+P+S+Code */
> +        cs.attr.bytes = ctxt->lma ? 0xa9b  /* G+L+P+S+Code */
> +                                  : 0xc9b; /* G+DB+P+S+Code */
> 
>          sreg.sel = cs.sel + 8;
>          sreg.base = 0;   /* flat segment */
> @@ -5770,16 +5744,15 @@ x86_emulate(
>          if ( (rc = ops->read_msr(MSR_IA32_SYSENTER_EIP,
>                                   &msr_val, ctxt)) != X86EMUL_OKAY )
>              goto done;
> -        _regs.r(ip) = lm ? msr_val : (uint32_t)msr_val;
> +        _regs.r(ip) = ctxt->lma ? msr_val : (uint32_t)msr_val;
> 
>          if ( (rc = ops->read_msr(MSR_IA32_SYSENTER_ESP,
>                                   &msr_val, ctxt)) != X86EMUL_OKAY )
>              goto done;
> -        _regs.r(sp) = lm ? msr_val : (uint32_t)msr_val;
> +        _regs.r(sp) = ctxt->lma ? msr_val : (uint32_t)msr_val;
> 
>          singlestep = _regs.eflags & X86_EFLAGS_TF;
>          break;
> -    }
> 
>      case X86EMUL_OPC(0x0f, 0x35): /* sysexit */
>          vcpu_must_have(sep);
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h
> b/xen/arch/x86/x86_emulate/x86_emulate.h
> index 215adf6..0479e30 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -457,6 +457,9 @@ struct x86_emulate_ctxt
>      /* Set this if writes may have side effects. */
>      bool force_writeback;
> 
> +    /* Long mode active? */
> +    bool lma;
> +
>      /* Caller data that can be used by x86_emulate_ops' routines. */
>      void *data;
> 
> --
> 2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate
  2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
  2017-04-03  8:26   ` Paul Durrant
  2017-04-03  8:30   ` Jan Beulich
@ 2017-04-03  8:50   ` George Dunlap
  2017-04-05  7:08   ` Tian, Kevin
  3 siblings, 0 replies; 37+ messages in thread
From: George Dunlap @ 2017-04-03  8:50 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Tim Deegan, Julien Grall,
	Paul Durrant, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

On 31/03/17 20:50, Andrew Cooper wrote:
> hvm_long_mode_enabled() tests for EFER.LMA, which is specifically different to
> EFER.LME.
> 
> Rename it to match its behaviour, and have it strictly return a boolean value
> (although all its callers already use it in implicitly-boolean contexts, so no
> functional change).
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

MM bits:

Acked-by: George Dunlap <george.dunlap@citrix.com>

Again, thanks -- this bothered me when I was doing the AFL work as well. :-)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-03-31 19:50 ` [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments Andrew Cooper
  2017-04-03  8:31   ` Paul Durrant
@ 2017-04-03  9:13   ` Jan Beulich
  2017-04-03 14:27     ` Andrew Cooper
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2017-04-03  9:13 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer)
>      return X86EMUL_OKAY;
>  }
>  
> +enum hvm_segmentation_mode hvm_seg_mode(
> +    const struct vcpu *v, enum x86_segment seg,
> +    const struct segment_register *cs)

The inputs here are at least somewhat counterintuitive (for example,
from an abstract pov it is unexpected that the result depends on seg
and cs). At the very least I think the naming should make clear that
cs is not just some code segment, but the CS v has currently in use
(e.g. active_cs). Going further the question is whether having this
helper is really useful (and not perhaps inviting mis-use), and hence
whether the inputs passed here wouldn't better be passed directly
to hvm_virtual_to_linear_addr(), the more that the "seg" argument
is required to match up between the two calls.

> +{
> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
> +        return hvm_seg_mode_real;

What about VM86 mode?

> +    if ( hvm_long_mode_active(v) &&
> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
> +        return hvm_seg_mode_long;
> +
> +    return hvm_seg_mode_prot;

Did you verify this actually matching real hardware behavior? There
being obvious anomalies when compat ring-0 code executes
LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
operands, yet 32-bit/compat code would expect 6-byte ones, so
one of the two views is necessarily wrong, and whichever it is, it
introduces an inconsistency), I wouldn't take it for given that _all_
descriptor table accessing insns behave like they would from a
64-bit code segment (I nevertheless assume they do, but as I
can't see it written down anywhere, we shouldn't assume so,
considering how many other oddities there are in x86).

This question is also being supported by the SDM using the same
standard "Same exceptions as in protected mode" in the
respective insns' "Compatibility Mode Exceptions" sections, yet
the behavior above implies that #GP(0) might also result for
compat mode descriptor table accesses if the descriptor address
ends up being non-canonical. Interestingly enough the PM
doesn't separate exception specifications for 32-bit protected,
compat, and 64-bit modes.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection()
  2017-03-31 19:50 ` [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection() Andrew Cooper
@ 2017-04-03  9:30   ` Jan Beulich
  2017-04-03 14:04   ` Boris Ostrovsky
  1 sibling, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2017-04-03  9:30 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Julien Grall, Boris Ostrovsky, SuraveeSuthikulpanit, Xen-devel

>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
> +static void svm_emul_swint_injection(struct x86_event *event)
> +{
> +    struct vcpu *curr = current;
> +    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();

All three look like they can be const.

> +
> +    unsigned int trap = event->vector, type = event->type;

Stray blank line in the middle of declarations.

> +    unsigned int fault = TRAP_gp_fault, ec = 0;
> +
> +    if ( vmcb->_cr0 & X86_CR0_PE ) /* TODO: support real-mode injection? */

vmcb_get_cr0() (also for EFER and CPL below).

> +    {
> +        pagefault_info_t pf;
> +        struct segment_register idtr;
> +        unsigned int idte_size, idte_offset;
> +        unsigned long idte_linear_addr;
> +        struct { uint32_t a, b, c, d; } idte = {};
> +        bool lm = vmcb->_efer & EFER_LMA;
> +        enum hvm_segmentation_mode seg_mode =
> +            lm ? hvm_seg_mode_long : hvm_seg_mode_prot;

This open coding is, I think, another hint that the helper function in
the other patch would perhaps better be dropped.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure
  2017-03-31 19:50 ` [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure Andrew Cooper
  2017-04-03  8:36   ` Paul Durrant
@ 2017-04-03  9:38   ` Jan Beulich
  1 sibling, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2017-04-03  9:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
> With the SVM injection logic capable of doing its own emulation, there is no
> need for this hardware-specific assistance in the common emulator.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one remark below.

> ---
>  tools/fuzz/x86_instruction_emulator/fuzz-emul.c |  18 +--
>  xen/arch/x86/hvm/emulate.c                      |   7 -
>  xen/arch/x86/mm.c                               |   2 -
>  xen/arch/x86/mm/shadow/common.c                 |   1 -
>  xen/arch/x86/x86_emulate/x86_emulate.c          | 187 ++++--------------------
>  xen/arch/x86/x86_emulate/x86_emulate.h          |  53 -------
>  6 files changed, 30 insertions(+), 238 deletions(-)

I like this.

> @@ -4103,25 +3966,38 @@ x86_emulate(
>              goto done;
>          break;
>  
> -    case 0xcc: /* int3 */
> -        src.val = EXC_BP;
> -        swint_type = x86_swint_int3;
> -        goto swint;
> -
> -    case 0xcd: /* int imm8 */
> -        swint_type = x86_swint_int;
> -    swint:
> -        rc = inject_swint(swint_type, (uint8_t)src.val,
> -                          _regs.r(ip) - ctxt->regs->r(ip),
> -                          ctxt, ops) ? : X86EMUL_EXCEPTION;
> -        goto done;
> -
>      case 0xce: /* into */
>          if ( !(_regs.eflags & X86_EFLAGS_OF) )
>              break;
> -        src.val = EXC_OF;
> -        swint_type = x86_swint_into;
> -        goto swint;
> +        /* Fallthrough */
> +    case 0xcc: /* int3 */
> +    case 0xcd: /* int imm8 */
> +    case 0xf1: /* int1 (icebp) */
> +        ASSERT(!ctxt->event_pending);
> +        switch ( ctxt->opcode )
> +        {
> +        case 0xcc: /* int3 */
> +            ctxt->event.vector = EXC_BP;
> +            ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
> +            break;
> +        case 0xcd: /* int imm8 */
> +            ctxt->event.vector = src.val;

I think with our current naming and fields use it might be marginally
better to use imm1 here; the R-b applies to both cases, though.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-03-31 19:50 ` [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context Andrew Cooper
  2017-04-03  8:40   ` Paul Durrant
@ 2017-04-03 10:10   ` Jan Beulich
  2017-04-05 16:24     ` Andrew Cooper
  2017-04-05 16:07   ` Jan Beulich
  2 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2017-04-03 10:10 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>          .ctxt = {
>              .regs = regs,
>              .vendor = d->arch.cpuid->x86_vendor,
> +            .lma = true,
>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>          },
> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>      struct x86_emulate_ctxt ctxt = {
>          .regs = regs,
>          .vendor = v->domain->arch.cpuid->x86_vendor,
> +        .lma = true,

Hmm, both of these are correct from Xen's pov, but potentially
wrong from the guest's. Since system segments aren't being
dealt with here, I think this difference is benign, but I think it
warrants at least a comment. If we ever meant to emulate
LLDT, this would become at active problem, as the guest's view
on gate descriptor layout would differ from that resulting from
setting .lma to true here. Same for emulate_privileged_op() then.

> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -328,15 +328,14 @@ const struct x86_emulate_ops *shadow_init_emulation(
>  
>      sh_ctxt->ctxt.regs = regs;
>      sh_ctxt->ctxt.vendor = v->domain->arch.cpuid->x86_vendor;
> +    sh_ctxt->ctxt.lma = hvm_long_mode_active(v);
>  
>      /* Segment cache initialisation. Primed with CS. */
>      creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
>  
>      /* Work out the emulation mode. */
> -    if ( hvm_long_mode_active(v) && creg->attr.fields.l )
> -    {
> +    if ( sh_ctxt->ctxt.lma && creg->attr.fields.l )
>          sh_ctxt->ctxt.addr_size = sh_ctxt->ctxt.sp_size = 64;
> -    }
>      else
>      {
>          sreg = hvm_get_seg_reg(x86_seg_ss, sh_ctxt);

The change itself looks fine, but the absence of a PV counterpart
makes me wonder if there isn't breakage elsewhere. Is the
"if ( is_hvm_domain(d) )" immediately ahead of the call site of
this function an unintended leftover? Yet having gone through
sh_page_fault() twice, I still can't spot where PV would be
diverted such that it can't come here.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -457,6 +457,9 @@ struct x86_emulate_ctxt
>      /* Set this if writes may have side effects. */
>      bool force_writeback;
>  
> +    /* Long mode active? */
> +    bool lma;

I don't think this should be in the input only section, as an insn
can have the effect of clearing it.

Also should x86_emulate_wrapper() perhaps assert the field to
be false for 32-bit (test harness) builds?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology
  2017-04-03  8:24   ` Jan Beulich
@ 2017-04-03 10:19     ` Andrew Cooper
  2017-04-03 10:29       ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-03 10:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 03/04/17 09:24, Jan Beulich wrote:
>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>> The function hvm_translate_linear_addr() translates a virtual address to a
>> linear address, not a linear address to a physical address.  Correct its 
>> name.
>>
>> Both hvm_translate_virtual_addr() and hvmemul_virtual_to_linear() return a
>> linear address, not a physical address.  Correct the parameters name.
> I think the "p" in the names didn't stand for "physical", but for
> "pointer",

In which case it should have been p_addr,

As it stands, we have a paddr_t which very definitely is a physical address.

>  and I further think ...
>
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -506,7 +506,7 @@ static int hvmemul_virtual_to_linear(
>>      unsigned long *reps,
>>      enum hvm_access_type access_type,
>>      struct hvm_emulate_ctxt *hvmemul_ctxt,
>> -    unsigned long *paddr)
>> +    unsigned long *linear)
> ... retaining the "p" here and ...
>
>> --- a/xen/arch/x86/mm/shadow/common.c
>> +++ b/xen/arch/x86/mm/shadow/common.c
>> @@ -136,13 +136,13 @@ static struct segment_register *hvm_get_seg_reg(
>>      return seg_reg;
>>  }
>>  
>> -static int hvm_translate_linear_addr(
>> +static int hvm_translate_virtual_addr(
>>      enum x86_segment seg,
>>      unsigned long offset,
>>      unsigned int bytes,
>>      enum hvm_access_type access_type,
>>      struct sh_emulate_ctxt *sh_ctxt,
>> -    unsigned long *paddr)
>> +    unsigned long *linear)
> ... here would be useful (to distinguish the virtual address
> represented by the pointer itself - in hypervisor space - from the
> one the pointer points to, i.e. in guest space).

The virtual address representation is {seg, offset}, and Xen can't use
linear addresses.

I can't see any way to get them accidentally confused.

~Andrew

>
> With that
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> Jan
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology
  2017-04-03 10:19     ` Andrew Cooper
@ 2017-04-03 10:29       ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2017-04-03 10:29 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 03.04.17 at 12:19, <andrew.cooper3@citrix.com> wrote:
> On 03/04/17 09:24, Jan Beulich wrote:
>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/mm/shadow/common.c
>>> +++ b/xen/arch/x86/mm/shadow/common.c
>>> @@ -136,13 +136,13 @@ static struct segment_register *hvm_get_seg_reg(
>>>      return seg_reg;
>>>  }
>>>  
>>> -static int hvm_translate_linear_addr(
>>> +static int hvm_translate_virtual_addr(
>>>      enum x86_segment seg,
>>>      unsigned long offset,
>>>      unsigned int bytes,
>>>      enum hvm_access_type access_type,
>>>      struct sh_emulate_ctxt *sh_ctxt,
>>> -    unsigned long *paddr)
>>> +    unsigned long *linear)
>> ... here would be useful (to distinguish the virtual address
>> represented by the pointer itself - in hypervisor space - from the
>> one the pointer points to, i.e. in guest space).
> 
> The virtual address representation is {seg, offset}, and Xen can't use
> linear addresses.

Oops, I'm sorry, I meant linear, not virtual. And of course Xen uses
linear addresses (in hypervisor address space) all the time (due to
virtual == linear in 64-bit mode, leaving aside the few exceptions;
in particular we never pass around any {seg, offset} pairs to
represent hypervisor pointers). I.e. once the function has written
*linear we have two linear addresses here:
- "linear" itself represents one (hypervisor, pointer type)
- "*linear" holds another (guest, integral type)

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection()
  2017-03-31 19:50 ` [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection() Andrew Cooper
  2017-04-03  9:30   ` Jan Beulich
@ 2017-04-03 14:04   ` Boris Ostrovsky
  1 sibling, 0 replies; 37+ messages in thread
From: Boris Ostrovsky @ 2017-04-03 14:04 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Julien Grall, Suravee Suthikulpanit, Jan Beulich


> +static void svm_emul_swint_injection(struct x86_event *event)
> +{
> +    struct vcpu *curr = current;
> +    struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +
> +    unsigned int trap = event->vector, type = event->type;
> +    unsigned int fault = TRAP_gp_fault, ec = 0;
> +
> +    if ( vmcb->_cr0 & X86_CR0_PE ) /* TODO: support real-mode injection? */


if ( !(vmcb->_cr0 & X86_CR0_PE) )
    goto raise_exception;


will avoid extra indentation level;

> +    {
> +        pagefault_info_t pf;
> +        struct segment_register idtr;
> +        unsigned int idte_size, idte_offset;
> +        unsigned long idte_linear_addr;
> +        struct { uint32_t a, b, c, d; } idte = {};
> +        bool lm = vmcb->_efer & EFER_LMA;
> +        enum hvm_segmentation_mode seg_mode =
> +            lm ? hvm_seg_mode_long : hvm_seg_mode_prot;
> +        int rc;
> +
> +        idte_size = lm ? 16 : 8;
> +        idte_offset = trap * idte_size;
> +
> +        /* ICEBP sets the External Event bit despite being an instruction. */
> +        ec = (trap << 3) | X86_XEC_IDT |
> +            (type == X86_EVENTTYPE_PRI_SW_EXCEPTION ? X86_XEC_EXT : 0);
> +
> +        /*
> +         * TODO: This does not cover the v8086 mode with CR4.VME case
> +         * correctly, but falls on the safe side from the point of view of a
> +         * 32bit OS.  Someone with many TUITs can see about reading the TSS
> +         * Software Interrupt Redirection bitmap.
> +         */
> +        if ( (regs->eflags & X86_EFLAGS_VM) &&
> +             MASK_EXTR(regs->eflags, X86_EFLAGS_IOPL) != 3 )
> +            goto raise_exception;
> +
> +        /*
> +         * Read all 8/16 bytes so the idtr limit check is applied properly to
> +         * this entry, even though don't look all the words read.

I think some words are missing here.

-boris


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-04-03  9:13   ` Jan Beulich
@ 2017-04-03 14:27     ` Andrew Cooper
  2017-04-03 15:07       ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-03 14:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 03/04/17 10:13, Jan Beulich wrote:
>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t may_defer)
>>      return X86EMUL_OKAY;
>>  }
>>  
>> +enum hvm_segmentation_mode hvm_seg_mode(
>> +    const struct vcpu *v, enum x86_segment seg,
>> +    const struct segment_register *cs)
> The inputs here are at least somewhat counterintuitive (for example,
> from an abstract pov it is unexpected that the result depends on seg
> and cs). At the very least I think the naming should make clear that
> cs is not just some code segment, but the CS v has currently in use
> (e.g. active_cs). Going further the question is whether having this
> helper is really useful (and not perhaps inviting mis-use), and hence
> whether the inputs passed here wouldn't better be passed directly
> to hvm_virtual_to_linear_addr(), the more that the "seg" argument
> is required to match up between the two calls.

I purposefully wanted to avoid people opencoding the logic and getting
it wrong (looks like even I got it wrong).

I'm not convinced that passing the parameters individually is better.

>
>> +{
>> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
>> +        return hvm_seg_mode_real;
> What about VM86 mode?

Very good point.

>
>> +    if ( hvm_long_mode_active(v) &&
>> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
>> +        return hvm_seg_mode_long;
>> +
>> +    return hvm_seg_mode_prot;
> Did you verify this actually matching real hardware behavior? There
> being obvious anomalies when compat ring-0 code executes
> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
> operands, yet 32-bit/compat code would expect 6-byte ones, so
> one of the two views is necessarily wrong, and whichever it is, it
> introduces an inconsistency), I wouldn't take it for given that _all_
> descriptor table accessing insns behave like they would from a
> 64-bit code segment (I nevertheless assume they do, but as I
> can't see it written down anywhere, we shouldn't assume so,
> considering how many other oddities there are in x86).
>
> This question is also being supported by the SDM using the same
> standard "Same exceptions as in protected mode" in the
> respective insns' "Compatibility Mode Exceptions" sections, yet
> the behavior above implies that #GP(0) might also result for
> compat mode descriptor table accesses if the descriptor address
> ends up being non-canonical. Interestingly enough the PM
> doesn't separate exception specifications for 32-bit protected,
> compat, and 64-bit modes.

You are mistaken.

{L,S}{I,G}DT are {read,write}_segment_register() operations, using a
plain memory operand.

When we come to the segmentation check, it will be by default
%ds-relative, with size as calculated by op_bytes in the instruction
emulator.

There are no instructions which cause a direct system segment-relative
memory access.  All of them are implicit, such as `int $imm`, `mov $reg,
%sreg`.

Therefore, I think your expectations of the described behaviour are
correct, and that my code is correct and behaves in the way you describe.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-04-03 14:27     ` Andrew Cooper
@ 2017-04-03 15:07       ` Jan Beulich
  2017-04-03 15:42         ` Andrew Cooper
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2017-04-03 15:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 03.04.17 at 16:27, <andrew.cooper3@citrix.com> wrote:
> On 03/04/17 10:13, Jan Beulich wrote:
>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t 
> may_defer)
>>>      return X86EMUL_OKAY;
>>>  }
>>>  
>>> +enum hvm_segmentation_mode hvm_seg_mode(
>>> +    const struct vcpu *v, enum x86_segment seg,
>>> +    const struct segment_register *cs)
>> The inputs here are at least somewhat counterintuitive (for example,
>> from an abstract pov it is unexpected that the result depends on seg
>> and cs). At the very least I think the naming should make clear that
>> cs is not just some code segment, but the CS v has currently in use
>> (e.g. active_cs). Going further the question is whether having this
>> helper is really useful (and not perhaps inviting mis-use), and hence
>> whether the inputs passed here wouldn't better be passed directly
>> to hvm_virtual_to_linear_addr(), the more that the "seg" argument
>> is required to match up between the two calls.
> 
> I purposefully wanted to avoid people opencoding the logic and getting
> it wrong (looks like even I got it wrong).
> 
> I'm not convinced that passing the parameters individually is better.
> 
>>
>>> +{
>>> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
>>> +        return hvm_seg_mode_real;
>> What about VM86 mode?
> 
> Very good point.
> 
>>
>>> +    if ( hvm_long_mode_active(v) &&
>>> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
>>> +        return hvm_seg_mode_long;
>>> +
>>> +    return hvm_seg_mode_prot;
>> Did you verify this actually matching real hardware behavior? There
>> being obvious anomalies when compat ring-0 code executes
>> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
>> operands, yet 32-bit/compat code would expect 6-byte ones, so
>> one of the two views is necessarily wrong, and whichever it is, it
>> introduces an inconsistency), I wouldn't take it for given that _all_
>> descriptor table accessing insns behave like they would from a
>> 64-bit code segment (I nevertheless assume they do, but as I
>> can't see it written down anywhere, we shouldn't assume so,
>> considering how many other oddities there are in x86).
>>
>> This question is also being supported by the SDM using the same
>> standard "Same exceptions as in protected mode" in the
>> respective insns' "Compatibility Mode Exceptions" sections, yet
>> the behavior above implies that #GP(0) might also result for
>> compat mode descriptor table accesses if the descriptor address
>> ends up being non-canonical. Interestingly enough the PM
>> doesn't separate exception specifications for 32-bit protected,
>> compat, and 64-bit modes.
> 
> You are mistaken.
> 
> {L,S}{I,G}DT are {read,write}_segment_register() operations, using a
> plain memory operand.
> 
> When we come to the segmentation check, it will be by default
> %ds-relative, with size as calculated by op_bytes in the instruction
> emulator.

I think I didn't make myself clear then: I'm not at all talking about how
the memory access of these insns get carried out, I solely talk about
the size of their operands: In long mode to load IDTR or GDTR you'd
expect a 64-bit base and a 16-bit limit. Hence if _all_ system segment
register related insns worked consistently in long mode, the four
named insns would have to have 10-byte operands. I'm pretty sure
they don't though, so there is _one_ anomaly already. With that I
don't think we can rule out there being other anomalies, with this
not being talked about explicitly anywhere in the doc.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-04-03 15:07       ` Jan Beulich
@ 2017-04-03 15:42         ` Andrew Cooper
  2017-04-03 16:08           ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-03 15:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 03/04/17 16:07, Jan Beulich wrote:
>>>> On 03.04.17 at 16:27, <andrew.cooper3@citrix.com> wrote:
>> On 03/04/17 10:13, Jan Beulich wrote:
>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t 
>> may_defer)
>>>>      return X86EMUL_OKAY;
>>>>  }
>>>>  
>>>> +enum hvm_segmentation_mode hvm_seg_mode(
>>>> +    const struct vcpu *v, enum x86_segment seg,
>>>> +    const struct segment_register *cs)
>>> The inputs here are at least somewhat counterintuitive (for example,
>>> from an abstract pov it is unexpected that the result depends on seg
>>> and cs). At the very least I think the naming should make clear that
>>> cs is not just some code segment, but the CS v has currently in use
>>> (e.g. active_cs). Going further the question is whether having this
>>> helper is really useful (and not perhaps inviting mis-use), and hence
>>> whether the inputs passed here wouldn't better be passed directly
>>> to hvm_virtual_to_linear_addr(), the more that the "seg" argument
>>> is required to match up between the two calls.
>> I purposefully wanted to avoid people opencoding the logic and getting
>> it wrong (looks like even I got it wrong).
>>
>> I'm not convinced that passing the parameters individually is better.
>>
>>>> +{
>>>> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
>>>> +        return hvm_seg_mode_real;
>>> What about VM86 mode?
>> Very good point.
>>
>>>> +    if ( hvm_long_mode_active(v) &&
>>>> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
>>>> +        return hvm_seg_mode_long;
>>>> +
>>>> +    return hvm_seg_mode_prot;
>>> Did you verify this actually matching real hardware behavior? There
>>> being obvious anomalies when compat ring-0 code executes
>>> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
>>> operands, yet 32-bit/compat code would expect 6-byte ones, so
>>> one of the two views is necessarily wrong, and whichever it is, it
>>> introduces an inconsistency), I wouldn't take it for given that _all_
>>> descriptor table accessing insns behave like they would from a
>>> 64-bit code segment (I nevertheless assume they do, but as I
>>> can't see it written down anywhere, we shouldn't assume so,
>>> considering how many other oddities there are in x86).
>>>
>>> This question is also being supported by the SDM using the same
>>> standard "Same exceptions as in protected mode" in the
>>> respective insns' "Compatibility Mode Exceptions" sections, yet
>>> the behavior above implies that #GP(0) might also result for
>>> compat mode descriptor table accesses if the descriptor address
>>> ends up being non-canonical. Interestingly enough the PM
>>> doesn't separate exception specifications for 32-bit protected,
>>> compat, and 64-bit modes.
>> You are mistaken.
>>
>> {L,S}{I,G}DT are {read,write}_segment_register() operations, using a
>> plain memory operand.
>>
>> When we come to the segmentation check, it will be by default
>> %ds-relative, with size as calculated by op_bytes in the instruction
>> emulator.
> I think I didn't make myself clear then: I'm not at all talking about how
> the memory access of these insns get carried out, I solely talk about
> the size of their operands:

I still fail to see what the size of the operands have to do with the
choice of segmentation mode.

> In long mode to load IDTR or GDTR you'd expect a 64-bit base and a 16-bit limit.

Why?  I'd expect nothing of the sort, because 32bit compat segments are
purposefully designed to be no functional difference from regular 32bit
protected mode segments.  That includes not changing the behaviour of
instructions like this.

> Hence if _all_ system segment
> register related insns worked consistently in long mode, the four
> named insns would have to have 10-byte operands.

This isn't a valid expectation to draw.

>  I'm pretty sure
> they don't though, so there is _one_ anomaly already.

Indeed they don't.  In a compatibility mode segment, they have take a
6-byte operand, identically to 32bit mode.

> With that I don't think we can rule out there being other anomalies, with this
> not being talked about explicitly anywhere in the doc.

I don't think any of this is relevant to the correctness of this patch.

Without this fix, implicit accesses to system segments in a
compatibility mode segment will truncate the resulting linear address as
part of performing the segmentation calculations, which is obviously not
how real hardware behaves.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-04-03 15:42         ` Andrew Cooper
@ 2017-04-03 16:08           ` Jan Beulich
  2017-04-03 17:37             ` Andrew Cooper
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2017-04-03 16:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 03.04.17 at 17:42, <andrew.cooper3@citrix.com> wrote:
> On 03/04/17 16:07, Jan Beulich wrote:
>>>>> On 03.04.17 at 16:27, <andrew.cooper3@citrix.com> wrote:
>>> On 03/04/17 10:13, Jan Beulich wrote:
>>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t 
>>> may_defer)
>>>>>      return X86EMUL_OKAY;
>>>>>  }
>>>>>  
>>>>> +enum hvm_segmentation_mode hvm_seg_mode(
>>>>> +    const struct vcpu *v, enum x86_segment seg,
>>>>> +    const struct segment_register *cs)
>>>> The inputs here are at least somewhat counterintuitive (for example,
>>>> from an abstract pov it is unexpected that the result depends on seg
>>>> and cs). At the very least I think the naming should make clear that
>>>> cs is not just some code segment, but the CS v has currently in use
>>>> (e.g. active_cs). Going further the question is whether having this
>>>> helper is really useful (and not perhaps inviting mis-use), and hence
>>>> whether the inputs passed here wouldn't better be passed directly
>>>> to hvm_virtual_to_linear_addr(), the more that the "seg" argument
>>>> is required to match up between the two calls.
>>> I purposefully wanted to avoid people opencoding the logic and getting
>>> it wrong (looks like even I got it wrong).
>>>
>>> I'm not convinced that passing the parameters individually is better.
>>>
>>>>> +{
>>>>> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
>>>>> +        return hvm_seg_mode_real;
>>>> What about VM86 mode?
>>> Very good point.
>>>
>>>>> +    if ( hvm_long_mode_active(v) &&
>>>>> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
>>>>> +        return hvm_seg_mode_long;
>>>>> +
>>>>> +    return hvm_seg_mode_prot;
>>>> Did you verify this actually matching real hardware behavior? There
>>>> being obvious anomalies when compat ring-0 code executes
>>>> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
>>>> operands, yet 32-bit/compat code would expect 6-byte ones, so
>>>> one of the two views is necessarily wrong, and whichever it is, it
>>>> introduces an inconsistency), I wouldn't take it for given that _all_
>>>> descriptor table accessing insns behave like they would from a
>>>> 64-bit code segment (I nevertheless assume they do, but as I
>>>> can't see it written down anywhere, we shouldn't assume so,
>>>> considering how many other oddities there are in x86).
>>>>
>>>> This question is also being supported by the SDM using the same
>>>> standard "Same exceptions as in protected mode" in the
>>>> respective insns' "Compatibility Mode Exceptions" sections, yet
>>>> the behavior above implies that #GP(0) might also result for
>>>> compat mode descriptor table accesses if the descriptor address
>>>> ends up being non-canonical. Interestingly enough the PM
>>>> doesn't separate exception specifications for 32-bit protected,
>>>> compat, and 64-bit modes.
>>> You are mistaken.
>>>
>>> {L,S}{I,G}DT are {read,write}_segment_register() operations, using a
>>> plain memory operand.
>>>
>>> When we come to the segmentation check, it will be by default
>>> %ds-relative, with size as calculated by op_bytes in the instruction
>>> emulator.
>> I think I didn't make myself clear then: I'm not at all talking about how
>> the memory access of these insns get carried out, I solely talk about
>> the size of their operands:
> 
> I still fail to see what the size of the operands have to do with the
> choice of segmentation mode.
> 
>> In long mode to load IDTR or GDTR you'd expect a 64-bit base and a 16-bit 
> limit.
> 
> Why?  I'd expect nothing of the sort, because 32bit compat segments are
> purposefully designed to be no functional difference from regular 32bit
> protected mode segments.  That includes not changing the behaviour of
> instructions like this.

Well, one can of course take the position that ring-0 compat code
is simply a useless thing.

>> Hence if _all_ system segment
>> register related insns worked consistently in long mode, the four
>> named insns would have to have 10-byte operands.
> 
> This isn't a valid expectation to draw.
> 
>>  I'm pretty sure
>> they don't though, so there is _one_ anomaly already.
> 
> Indeed they don't.  In a compatibility mode segment, they have take a
> 6-byte operand, identically to 32bit mode.
> 
>> With that I don't think we can rule out there being other anomalies, with this
>> not being talked about explicitly anywhere in the doc.
> 
> I don't think any of this is relevant to the correctness of this patch.

I don't question the correctness; all I question is how far it is built
upon assumptions vs actually observed (short of documented)
behavior.

> Without this fix, implicit accesses to system segments in a
> compatibility mode segment will truncate the resulting linear address as
> part of performing the segmentation calculations, which is obviously not
> how real hardware behaves.

I understand this. But things are a little more complicated. Just
extend your line of thinking regarding IDTR/GDTR to LDTR and
TR: Above you suggest that the former two get loaded in a fully
32-bit mode compatible way. LTR and LLDT (as well as LAR and
LSL), however, access a descriptor table. 32-bit code would
expect an 8-byte descriptor to be read. Is compat mode code
then not allowed to make the same assumption? I think you've
made pretty much explicit that you'd expect a 16-byte descriptor
to be loaded in this case. So ring-0 compat code operates
identically to 32-bit mode in some respects, and identically to
64-bit code in others. Fundamentally I'd expect consistency here,
but along the lines of the usefulness remark higher up I simply
think that no-one had spent any thought on this when designing
x86-64; otherwise it would have been easy to simply disallow
ring-0 to enter compat mode, thus avoiding any inconsistencies.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-04-03 16:08           ` Jan Beulich
@ 2017-04-03 17:37             ` Andrew Cooper
  2017-04-04 10:18               ` Andrew Cooper
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-03 17:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 03/04/17 17:08, Jan Beulich wrote:
>>>> On 03.04.17 at 17:42, <andrew.cooper3@citrix.com> wrote:
>> On 03/04/17 16:07, Jan Beulich wrote:
>>>>>> On 03.04.17 at 16:27, <andrew.cooper3@citrix.com> wrote:
>>>> On 03/04/17 10:13, Jan Beulich wrote:
>>>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>>>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t 
>>>> may_defer)
>>>>>>      return X86EMUL_OKAY;
>>>>>>  }
>>>>>>  
>>>>>> +enum hvm_segmentation_mode hvm_seg_mode(
>>>>>> +    const struct vcpu *v, enum x86_segment seg,
>>>>>> +    const struct segment_register *cs)
>>>>> The inputs here are at least somewhat counterintuitive (for example,
>>>>> from an abstract pov it is unexpected that the result depends on seg
>>>>> and cs). At the very least I think the naming should make clear that
>>>>> cs is not just some code segment, but the CS v has currently in use
>>>>> (e.g. active_cs). Going further the question is whether having this
>>>>> helper is really useful (and not perhaps inviting mis-use), and hence
>>>>> whether the inputs passed here wouldn't better be passed directly
>>>>> to hvm_virtual_to_linear_addr(), the more that the "seg" argument
>>>>> is required to match up between the two calls.
>>>> I purposefully wanted to avoid people opencoding the logic and getting
>>>> it wrong (looks like even I got it wrong).
>>>>
>>>> I'm not convinced that passing the parameters individually is better.
>>>>
>>>>>> +{
>>>>>> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
>>>>>> +        return hvm_seg_mode_real;
>>>>> What about VM86 mode?
>>>> Very good point.
>>>>
>>>>>> +    if ( hvm_long_mode_active(v) &&
>>>>>> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
>>>>>> +        return hvm_seg_mode_long;
>>>>>> +
>>>>>> +    return hvm_seg_mode_prot;
>>>>> Did you verify this actually matching real hardware behavior? There
>>>>> being obvious anomalies when compat ring-0 code executes
>>>>> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
>>>>> operands, yet 32-bit/compat code would expect 6-byte ones, so
>>>>> one of the two views is necessarily wrong, and whichever it is, it
>>>>> introduces an inconsistency), I wouldn't take it for given that _all_
>>>>> descriptor table accessing insns behave like they would from a
>>>>> 64-bit code segment (I nevertheless assume they do, but as I
>>>>> can't see it written down anywhere, we shouldn't assume so,
>>>>> considering how many other oddities there are in x86).
>>>>>
>>>>> This question is also being supported by the SDM using the same
>>>>> standard "Same exceptions as in protected mode" in the
>>>>> respective insns' "Compatibility Mode Exceptions" sections, yet
>>>>> the behavior above implies that #GP(0) might also result for
>>>>> compat mode descriptor table accesses if the descriptor address
>>>>> ends up being non-canonical. Interestingly enough the PM
>>>>> doesn't separate exception specifications for 32-bit protected,
>>>>> compat, and 64-bit modes.
>>>> You are mistaken.
>>>>
>>>> {L,S}{I,G}DT are {read,write}_segment_register() operations, using a
>>>> plain memory operand.
>>>>
>>>> When we come to the segmentation check, it will be by default
>>>> %ds-relative, with size as calculated by op_bytes in the instruction
>>>> emulator.
>>> I think I didn't make myself clear then: I'm not at all talking about how
>>> the memory access of these insns get carried out, I solely talk about
>>> the size of their operands:
>> I still fail to see what the size of the operands have to do with the
>> choice of segmentation mode.
>>
>>> In long mode to load IDTR or GDTR you'd expect a 64-bit base and a 16-bit 
>> limit.
>>
>> Why?  I'd expect nothing of the sort, because 32bit compat segments are
>> purposefully designed to be no functional difference from regular 32bit
>> protected mode segments.  That includes not changing the behaviour of
>> instructions like this.
> Well, one can of course take the position that ring-0 compat code
> is simply a useless thing.

Compatibility mode segments exist for the purpose of making user code
continue to work.  I don't find it surprising that compatbility
supervisor segments have some rough corners.

>
>>> Hence if _all_ system segment
>>> register related insns worked consistently in long mode, the four
>>> named insns would have to have 10-byte operands.
>> This isn't a valid expectation to draw.
>>
>>>  I'm pretty sure
>>> they don't though, so there is _one_ anomaly already.
>> Indeed they don't.  In a compatibility mode segment, they have take a
>> 6-byte operand, identically to 32bit mode.
>>
>>> With that I don't think we can rule out there being other anomalies, with this
>>> not being talked about explicitly anywhere in the doc.
>> I don't think any of this is relevant to the correctness of this patch.
> I don't question the correctness; all I question is how far it is built
> upon assumptions vs actually observed (short of documented)
> behavior.

AMD Vol2 14.5 "Initialising Long Mode" is fairly clear on the subject. 
The layout and behaviour of the 4 system structures depend on

>
>> Without this fix, implicit accesses to system segments in a
>> compatibility mode segment will truncate the resulting linear address as
>> part of performing the segmentation calculations, which is obviously not
>> how real hardware behaves.
> I understand this. But things are a little more complicated. Just
> extend your line of thinking regarding IDTR/GDTR to LDTR and
> TR: Above you suggest that the former two get loaded in a fully
> 32-bit mode compatible way. LTR and LLDT (as well as LAR and
> LSL), however, access a descriptor table. 32-bit code would
> expect an 8-byte descriptor to be read. Is compat mode code
> then not allowed to make the same assumption?

Hmm - the wording of LTR/LLDT in both manuals states 64bit mode, not
long mode, so there is a decent chance that the compat behaviour is
identical.  Let me experiment.

~Andrew

>  I think you've
> made pretty much explicit that you'd expect a 16-byte descriptor
> to be loaded in this case. So ring-0 compat code operates
> identically to 32-bit mode in some respects, and identically to
> 64-bit code in others. Fundamentally I'd expect consistency here,
> but along the lines of the usefulness remark higher up I simply
> think that no-one had spent any thought on this when designing
> x86-64; otherwise it would have been easy to simply disallow
> ring-0 to enter compat mode, thus avoiding any inconsistencies.
>
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-04-03 17:37             ` Andrew Cooper
@ 2017-04-04 10:18               ` Andrew Cooper
  2017-04-04 10:32                 ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-04 10:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 03/04/17 18:37, Andrew Cooper wrote:
>>> Without this fix, implicit accesses to system segments in a
>>> compatibility mode segment will truncate the resulting linear address as
>>> part of performing the segmentation calculations, which is obviously not
>>> how real hardware behaves.
>> I understand this. But things are a little more complicated. Just
>> extend your line of thinking regarding IDTR/GDTR to LDTR and
>> TR: Above you suggest that the former two get loaded in a fully
>> 32-bit mode compatible way. LTR and LLDT (as well as LAR and
>> LSL), however, access a descriptor table. 32-bit code would
>> expect an 8-byte descriptor to be read. Is compat mode code
>> then not allowed to make the same assumption?
> Hmm - the wording of LTR/LLDT in both manuals states 64bit mode, not
> long mode, so there is a decent chance that the compat behaviour is
> identical.  Let me experiment.

In a compat mode segment, lldt/ltr operates almost identically to
protected mode.  They read 8-byte entries, and zero extends the base
field when loading the result into the segment cache.  However, in
compatibility mode, they are still subject to long mode restrictions. 
In particular, you can't attempt to load a 16bit TSS while in a compat
mode segment.

~Andrew


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
  2017-04-04 10:18               ` Andrew Cooper
@ 2017-04-04 10:32                 ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2017-04-04 10:32 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 04.04.17 at 12:18, <andrew.cooper3@citrix.com> wrote:
> On 03/04/17 18:37, Andrew Cooper wrote:
>>>> Without this fix, implicit accesses to system segments in a
>>>> compatibility mode segment will truncate the resulting linear address as
>>>> part of performing the segmentation calculations, which is obviously not
>>>> how real hardware behaves.
>>> I understand this. But things are a little more complicated. Just
>>> extend your line of thinking regarding IDTR/GDTR to LDTR and
>>> TR: Above you suggest that the former two get loaded in a fully
>>> 32-bit mode compatible way. LTR and LLDT (as well as LAR and
>>> LSL), however, access a descriptor table. 32-bit code would
>>> expect an 8-byte descriptor to be read. Is compat mode code
>>> then not allowed to make the same assumption?
>> Hmm - the wording of LTR/LLDT in both manuals states 64bit mode, not
>> long mode, so there is a decent chance that the compat behaviour is
>> identical.  Let me experiment.
> 
> In a compat mode segment, lldt/ltr operates almost identically to
> protected mode.  They read 8-byte entries, and zero extends the base
> field when loading the result into the segment cache.  However, in
> compatibility mode, they are still subject to long mode restrictions. 
> In particular, you can't attempt to load a 16bit TSS while in a compat
> mode segment.

Interesting, and sort of unexpected. Besides this likely meaning we
need to further adjust the emulator, this raises a question on call
gates then: What factor is it which determines whether a call gate
is an 8- or 16-byte one? Is this perhaps dependent on the L bit of
the code segment descriptor referred to by the gate's code selector?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate
  2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
                     ` (2 preceding siblings ...)
  2017-04-03  8:50   ` George Dunlap
@ 2017-04-05  7:08   ` Tian, Kevin
  3 siblings, 0 replies; 37+ messages in thread
From: Tian, Kevin @ 2017-04-05  7:08 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Jan Beulich, George Dunlap, Tim Deegan, Julien Grall,
	Paul Durrant, Nakajima, Jun, Boris Ostrovsky,
	Suravee Suthikulpanit

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Saturday, April 1, 2017 3:51 AM
> 
> hvm_long_mode_enabled() tests for EFER.LMA, which is specifically different
> to
> EFER.LME.
> 
> Rename it to match its behaviour, and have it strictly return a boolean value
> (although all its callers already use it in implicitly-boolean contexts, so no
> functional change).
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-03-31 19:50 ` [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context Andrew Cooper
  2017-04-03  8:40   ` Paul Durrant
  2017-04-03 10:10   ` Jan Beulich
@ 2017-04-05 16:07   ` Jan Beulich
  2 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2017-04-05 16:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
> @@ -1818,10 +1802,8 @@ protmode_load_seg(
>  
>      if ( !is_x86_user_segment(seg) )
>      {
> -        int lm = (desc.b & (1u << 12)) ? 0 : in_longmode(ctxt, ops);
> +        bool lm = (desc.b & (1u << 12)) ? 0 : ctxt->lma;

One more minor thing: You want to use false here.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-04-03 10:10   ` Jan Beulich
@ 2017-04-05 16:24     ` Andrew Cooper
  2017-04-06  6:58       ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-05 16:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 03/04/17 11:10, Jan Beulich wrote:
>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>          .ctxt = {
>>              .regs = regs,
>>              .vendor = d->arch.cpuid->x86_vendor,
>> +            .lma = true,
>>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>          },
>> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>>      struct x86_emulate_ctxt ctxt = {
>>          .regs = regs,
>>          .vendor = v->domain->arch.cpuid->x86_vendor,
>> +        .lma = true,
> Hmm, both of these are correct from Xen's pov, but potentially
> wrong from the guest's. Since system segments aren't being
> dealt with here, I think this difference is benign, but I think it
> warrants at least a comment. If we ever meant to emulate
> LLDT, this would become at active problem, as the guest's view
> on gate descriptor layout would differ from that resulting from
> setting .lma to true here. Same for emulate_privileged_op() then.

As discovered in the meantime, things like LLDT/LTR and call gates are
far more complicated.

Still, setting LMA to true here is the right thing to do, as it is an
accurate statement of processor state.  Despite the level of
compatibility for 32bit, a 32bit PV guest isn't entirely isolated from
the fact that Xen is 64bit.

>
>> --- a/xen/arch/x86/mm/shadow/common.c
>> +++ b/xen/arch/x86/mm/shadow/common.c
>> @@ -328,15 +328,14 @@ const struct x86_emulate_ops *shadow_init_emulation(
>>  
>>      sh_ctxt->ctxt.regs = regs;
>>      sh_ctxt->ctxt.vendor = v->domain->arch.cpuid->x86_vendor;
>> +    sh_ctxt->ctxt.lma = hvm_long_mode_active(v);
>>  
>>      /* Segment cache initialisation. Primed with CS. */
>>      creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
>>  
>>      /* Work out the emulation mode. */
>> -    if ( hvm_long_mode_active(v) && creg->attr.fields.l )
>> -    {
>> +    if ( sh_ctxt->ctxt.lma && creg->attr.fields.l )
>>          sh_ctxt->ctxt.addr_size = sh_ctxt->ctxt.sp_size = 64;
>> -    }
>>      else
>>      {
>>          sreg = hvm_get_seg_reg(x86_seg_ss, sh_ctxt);
> The change itself looks fine, but the absence of a PV counterpart
> makes me wonder if there isn't breakage elsewhere. Is the
> "if ( is_hvm_domain(d) )" immediately ahead of the call site of
> this function an unintended leftover? Yet having gone through
> sh_page_fault() twice, I still can't spot where PV would be
> diverted such that it can't come here.

PV guests getting here was excluded when autotranslate was ripped out,
and just out of context above, we ASSERT(is_hvm_vcpu(v));

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -457,6 +457,9 @@ struct x86_emulate_ctxt
>>      /* Set this if writes may have side effects. */
>>      bool force_writeback;
>>  
>> +    /* Long mode active? */
>> +    bool lma;
> I don't think this should be in the input only section, as an insn
> can have the effect of clearing it.
>
> Also should x86_emulate_wrapper() perhaps assert the field to
> be false for 32-bit (test harness) builds?

I'd prefer not to.  There is nothing in principle wrong with trying to
feed a 32bit compatibility segment through a 32bit build of the test
harness.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-04-05 16:24     ` Andrew Cooper
@ 2017-04-06  6:58       ` Jan Beulich
  2017-04-06  9:47         ` Andrew Cooper
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2017-04-06  6:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 05.04.17 at 18:24, <andrew.cooper3@citrix.com> wrote:
> On 03/04/17 11:10, Jan Beulich wrote:
>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>>          .ctxt = {
>>>              .regs = regs,
>>>              .vendor = d->arch.cpuid->x86_vendor,
>>> +            .lma = true,
>>>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>          },
>>> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>>>      struct x86_emulate_ctxt ctxt = {
>>>          .regs = regs,
>>>          .vendor = v->domain->arch.cpuid->x86_vendor,
>>> +        .lma = true,
>> Hmm, both of these are correct from Xen's pov, but potentially
>> wrong from the guest's. Since system segments aren't being
>> dealt with here, I think this difference is benign, but I think it
>> warrants at least a comment. If we ever meant to emulate
>> LLDT, this would become at active problem, as the guest's view
>> on gate descriptor layout would differ from that resulting from
>> setting .lma to true here. Same for emulate_privileged_op() then.
> 
> As discovered in the meantime, things like LLDT/LTR and call gates are
> far more complicated.
> 
> Still, setting LMA to true here is the right thing to do, as it is an
> accurate statement of processor state.  Despite the level of
> compatibility for 32bit, a 32bit PV guest isn't entirely isolated from
> the fact that Xen is 64bit.

Yes, but still call gates (which we don't currently handle in the
emulator itself) require 32-bit treatment for 32-bit guests, so
setting lma to true would still seem wrong. And the value of lma
is, as we now know, irrelevant for LDT and TSS descriptors (I'm
about to verify AMD behaves identically to Intel here).

>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>> @@ -457,6 +457,9 @@ struct x86_emulate_ctxt
>>>      /* Set this if writes may have side effects. */
>>>      bool force_writeback;
>>>  
>>> +    /* Long mode active? */
>>> +    bool lma;
>> I don't think this should be in the input only section, as an insn
>> can have the effect of clearing it.
>>
>> Also should x86_emulate_wrapper() perhaps assert the field to
>> be false for 32-bit (test harness) builds?
> 
> I'd prefer not to.  There is nothing in principle wrong with trying to
> feed a 32bit compatibility segment through a 32bit build of the test
> harness.

Actually I've just figured myself that such an assertion would better
be checking that mode_64bit() implies lma to be set.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-04-06  6:58       ` Jan Beulich
@ 2017-04-06  9:47         ` Andrew Cooper
  2017-04-06 14:14           ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-06  9:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 06/04/17 07:58, Jan Beulich wrote:
>>>> On 05.04.17 at 18:24, <andrew.cooper3@citrix.com> wrote:
>> On 03/04/17 11:10, Jan Beulich wrote:
>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>> --- a/xen/arch/x86/mm.c
>>>> +++ b/xen/arch/x86/mm.c
>>>> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>>>          .ctxt = {
>>>>              .regs = regs,
>>>>              .vendor = d->arch.cpuid->x86_vendor,
>>>> +            .lma = true,
>>>>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>          },
>>>> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>>>>      struct x86_emulate_ctxt ctxt = {
>>>>          .regs = regs,
>>>>          .vendor = v->domain->arch.cpuid->x86_vendor,
>>>> +        .lma = true,
>>> Hmm, both of these are correct from Xen's pov, but potentially
>>> wrong from the guest's. Since system segments aren't being
>>> dealt with here, I think this difference is benign, but I think it
>>> warrants at least a comment. If we ever meant to emulate
>>> LLDT, this would become at active problem, as the guest's view
>>> on gate descriptor layout would differ from that resulting from
>>> setting .lma to true here. Same for emulate_privileged_op() then.
>> As discovered in the meantime, things like LLDT/LTR and call gates are
>> far more complicated.
>>
>> Still, setting LMA to true here is the right thing to do, as it is an
>> accurate statement of processor state.  Despite the level of
>> compatibility for 32bit, a 32bit PV guest isn't entirely isolated from
>> the fact that Xen is 64bit.
> Yes, but still call gates (which we don't currently handle in the
> emulator itself) require 32-bit treatment for 32-bit guests, so
> setting lma to true would still seem wrong.

I thought you said that a compatibility mode `call $gate` still checked
the type in the high 8 bytes.

A 32bit PV guest therefore needs to be aware that it can't position call
gates adjacently, or it will suffer #GP[sel] for a failed typecheck.


Now, in this specific case we are in a position to cope, because either
way we end up in the gate op code, but if we wanted to override hardware
behaviour, it should be the validate function, which positively
identifies a far call/jmp, which should choose to override lma for the
purposes of faking up legacy mode behaviour.

>  And the value of lma
> is, as we now know, irrelevant for LDT and TSS descriptors (I'm
> about to verify AMD behaves identically to Intel here).

I checked AMD when I was testing this quirk.  The behaviour does appear
to be the same.

>
>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>>> @@ -457,6 +457,9 @@ struct x86_emulate_ctxt
>>>>      /* Set this if writes may have side effects. */
>>>>      bool force_writeback;
>>>>  
>>>> +    /* Long mode active? */
>>>> +    bool lma;
>>> I don't think this should be in the input only section, as an insn
>>> can have the effect of clearing it.
>>>
>>> Also should x86_emulate_wrapper() perhaps assert the field to
>>> be false for 32-bit (test harness) builds?
>> I'd prefer not to.  There is nothing in principle wrong with trying to
>> feed a 32bit compatibility segment through a 32bit build of the test
>> harness.
> Actually I've just figured myself that such an assertion would better
> be checking that mode_64bit() implies lma to be set.

Good point.  I have folded:

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c
b/xen/arch/x86/x86_emulate/x86_emulate.c
index 1d5a698..59c2da8 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -7895,7 +7895,12 @@ int x86_emulate_wrapper(
     const struct x86_emulate_ops *ops)
 {
     unsigned long orig_ip = ctxt->regs->r(ip);
-    int rc = x86_emulate(ctxt, ops);
+    int rc;
+
+    if ( mode_64bit() )
+        ASSERT(ctxt->lma);
+
+    rc = x86_emulate(ctxt, ops);
 
     /* Retire flags should only be set for successful instruction
emulation. */
     if ( rc != X86EMUL_OKAY )

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-04-06  9:47         ` Andrew Cooper
@ 2017-04-06 14:14           ` Jan Beulich
  2017-04-06 16:32             ` Andrew Cooper
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Beulich @ 2017-04-06 14:14 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 06.04.17 at 11:47, <andrew.cooper3@citrix.com> wrote:
> On 06/04/17 07:58, Jan Beulich wrote:
>>>>> On 05.04.17 at 18:24, <andrew.cooper3@citrix.com> wrote:
>>> On 03/04/17 11:10, Jan Beulich wrote:
>>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>>> --- a/xen/arch/x86/mm.c
>>>>> +++ b/xen/arch/x86/mm.c
>>>>> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long 
> addr,
>>>>>          .ctxt = {
>>>>>              .regs = regs,
>>>>>              .vendor = d->arch.cpuid->x86_vendor,
>>>>> +            .lma = true,
>>>>>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>>          },
>>>>> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long 
> addr,
>>>>>      struct x86_emulate_ctxt ctxt = {
>>>>>          .regs = regs,
>>>>>          .vendor = v->domain->arch.cpuid->x86_vendor,
>>>>> +        .lma = true,
>>>> Hmm, both of these are correct from Xen's pov, but potentially
>>>> wrong from the guest's. Since system segments aren't being
>>>> dealt with here, I think this difference is benign, but I think it
>>>> warrants at least a comment. If we ever meant to emulate
>>>> LLDT, this would become at active problem, as the guest's view
>>>> on gate descriptor layout would differ from that resulting from
>>>> setting .lma to true here. Same for emulate_privileged_op() then.
>>> As discovered in the meantime, things like LLDT/LTR and call gates are
>>> far more complicated.
>>>
>>> Still, setting LMA to true here is the right thing to do, as it is an
>>> accurate statement of processor state.  Despite the level of
>>> compatibility for 32bit, a 32bit PV guest isn't entirely isolated from
>>> the fact that Xen is 64bit.
>> Yes, but still call gates (which we don't currently handle in the
>> emulator itself) require 32-bit treatment for 32-bit guests, so
>> setting lma to true would still seem wrong.
> 
> I thought you said that a compatibility mode `call $gate` still checked
> the type in the high 8 bytes.

Right.

> A 32bit PV guest therefore needs to be aware that it can't position call
> gates adjacently, or it will suffer #GP[sel] for a failed typecheck.

That's not the conclusion I would draw. There is a reason we emulate
call gate accesses already now for 32-bit guests (just not via
x86_emulate()) - precisely to guarantee guests need _not_ be aware.

> Now, in this specific case we are in a position to cope, because either
> way we end up in the gate op code, but if we wanted to override hardware
> behaviour, it should be the validate function, which positively
> identifies a far call/jmp, which should choose to override lma for the
> purposes of faking up legacy mode behaviour.

I don't think the validate function should do any such overriding.
Specifically it shouldn't alter ctxt->lma, risking to surprise x86_emulate().

>>  And the value of lma
>> is, as we now know, irrelevant for LDT and TSS descriptors (I'm
>> about to verify AMD behaves identically to Intel here).
> 
> I checked AMD when I was testing this quirk.  The behaviour does appear
> to be the same.

Same here, except for the size of the reads done: AMD always
reads 16 bytes for system descriptors, whereas Intel suppresses
reading the high 8 bytes for other than call gates.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-04-06 14:14           ` Jan Beulich
@ 2017-04-06 16:32             ` Andrew Cooper
  2017-04-07  8:35               ` Jan Beulich
  0 siblings, 1 reply; 37+ messages in thread
From: Andrew Cooper @ 2017-04-06 16:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

On 06/04/17 15:14, Jan Beulich wrote:
>>>> On 06.04.17 at 11:47, <andrew.cooper3@citrix.com> wrote:
>> On 06/04/17 07:58, Jan Beulich wrote:
>>>>>> On 05.04.17 at 18:24, <andrew.cooper3@citrix.com> wrote:
>>>> On 03/04/17 11:10, Jan Beulich wrote:
>>>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>>>> --- a/xen/arch/x86/mm.c
>>>>>> +++ b/xen/arch/x86/mm.c
>>>>>> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long 
>> addr,
>>>>>>          .ctxt = {
>>>>>>              .regs = regs,
>>>>>>              .vendor = d->arch.cpuid->x86_vendor,
>>>>>> +            .lma = true,
>>>>>>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>>>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>>>          },
>>>>>> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long 
>> addr,
>>>>>>      struct x86_emulate_ctxt ctxt = {
>>>>>>          .regs = regs,
>>>>>>          .vendor = v->domain->arch.cpuid->x86_vendor,
>>>>>> +        .lma = true,
>>>>> Hmm, both of these are correct from Xen's pov, but potentially
>>>>> wrong from the guest's. Since system segments aren't being
>>>>> dealt with here, I think this difference is benign, but I think it
>>>>> warrants at least a comment. If we ever meant to emulate
>>>>> LLDT, this would become at active problem, as the guest's view
>>>>> on gate descriptor layout would differ from that resulting from
>>>>> setting .lma to true here. Same for emulate_privileged_op() then.
>>>> As discovered in the meantime, things like LLDT/LTR and call gates are
>>>> far more complicated.
>>>>
>>>> Still, setting LMA to true here is the right thing to do, as it is an
>>>> accurate statement of processor state.  Despite the level of
>>>> compatibility for 32bit, a 32bit PV guest isn't entirely isolated from
>>>> the fact that Xen is 64bit.
>>> Yes, but still call gates (which we don't currently handle in the
>>> emulator itself) require 32-bit treatment for 32-bit guests, so
>>> setting lma to true would still seem wrong.
>> I thought you said that a compatibility mode `call $gate` still checked
>> the type in the high 8 bytes.
> Right.
>
>> A 32bit PV guest therefore needs to be aware that it can't position call
>> gates adjacently, or it will suffer #GP[sel] for a failed typecheck.
> That's not the conclusion I would draw. There is a reason we emulate
> call gate accesses already now for 32-bit guests (just not via
> x86_emulate()) - precisely to guarantee guests need _not_ be aware.
>
>> Now, in this specific case we are in a position to cope, because either
>> way we end up in the gate op code, but if we wanted to override hardware
>> behaviour, it should be the validate function, which positively
>> identifies a far call/jmp, which should choose to override lma for the
>> purposes of faking up legacy mode behaviour.
> I don't think the validate function should do any such overriding.
> Specifically it shouldn't alter ctxt->lma, risking to surprise x86_emulate().

I have folded:

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d010aa3..96bc280 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5412,7 +5412,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned
long addr,
             .vendor = d->arch.cpuid->x86_vendor,
             .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
             .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
-            .lma = true,
+            .lma       = !is_pv_32bit_domain(d),
         },
     };
     int rc;
@@ -5567,7 +5567,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned
long addr,
         .vendor = v->domain->arch.cpuid->x86_vendor,
         .addr_size = addr_size,
         .sp_size = addr_size,
-        .lma = true,
+        .lma = !is_pv_32bit_vcpu(v),
         .data = &mmio_ro_ctxt
     };
     int rc;
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 09dc2f1..5b9bf21 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2966,7 +2966,7 @@ static int emulate_privileged_op(struct
cpu_user_regs *regs)
     struct priv_op_ctxt ctxt = {
         .ctxt.regs = regs,
         .ctxt.vendor = currd->arch.cpuid->x86_vendor,
-        .ctxt.lma = true,
+        .ctxt.lma = !is_pv_32bit_domain(currd),
     };
     int rc;
     unsigned int eflags, ar;


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context
  2017-04-06 16:32             ` Andrew Cooper
@ 2017-04-07  8:35               ` Jan Beulich
  0 siblings, 0 replies; 37+ messages in thread
From: Jan Beulich @ 2017-04-07  8:35 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Julien Grall, Paul Durrant, Tim Deegan, Xen-devel

>>> On 06.04.17 at 18:32, <andrew.cooper3@citrix.com> wrote:
> On 06/04/17 15:14, Jan Beulich wrote:
>>>>> On 06.04.17 at 11:47, <andrew.cooper3@citrix.com> wrote:
>>> On 06/04/17 07:58, Jan Beulich wrote:
>>>>>>> On 05.04.17 at 18:24, <andrew.cooper3@citrix.com> wrote:
>>>>> On 03/04/17 11:10, Jan Beulich wrote:
>>>>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>>>>> --- a/xen/arch/x86/mm.c
>>>>>>> +++ b/xen/arch/x86/mm.c
>>>>>>> @@ -5410,6 +5410,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long 
>>> addr,
>>>>>>>          .ctxt = {
>>>>>>>              .regs = regs,
>>>>>>>              .vendor = d->arch.cpuid->x86_vendor,
>>>>>>> +            .lma = true,
>>>>>>>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>>>>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>>>>>>>          },
>>>>>>> @@ -5564,6 +5565,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long 
> 
>>> addr,
>>>>>>>      struct x86_emulate_ctxt ctxt = {
>>>>>>>          .regs = regs,
>>>>>>>          .vendor = v->domain->arch.cpuid->x86_vendor,
>>>>>>> +        .lma = true,
>>>>>> Hmm, both of these are correct from Xen's pov, but potentially
>>>>>> wrong from the guest's. Since system segments aren't being
>>>>>> dealt with here, I think this difference is benign, but I think it
>>>>>> warrants at least a comment. If we ever meant to emulate
>>>>>> LLDT, this would become at active problem, as the guest's view
>>>>>> on gate descriptor layout would differ from that resulting from
>>>>>> setting .lma to true here. Same for emulate_privileged_op() then.
>>>>> As discovered in the meantime, things like LLDT/LTR and call gates are
>>>>> far more complicated.
>>>>>
>>>>> Still, setting LMA to true here is the right thing to do, as it is an
>>>>> accurate statement of processor state.  Despite the level of
>>>>> compatibility for 32bit, a 32bit PV guest isn't entirely isolated from
>>>>> the fact that Xen is 64bit.
>>>> Yes, but still call gates (which we don't currently handle in the
>>>> emulator itself) require 32-bit treatment for 32-bit guests, so
>>>> setting lma to true would still seem wrong.
>>> I thought you said that a compatibility mode `call $gate` still checked
>>> the type in the high 8 bytes.
>> Right.
>>
>>> A 32bit PV guest therefore needs to be aware that it can't position call
>>> gates adjacently, or it will suffer #GP[sel] for a failed typecheck.
>> That's not the conclusion I would draw. There is a reason we emulate
>> call gate accesses already now for 32-bit guests (just not via
>> x86_emulate()) - precisely to guarantee guests need _not_ be aware.
>>
>>> Now, in this specific case we are in a position to cope, because either
>>> way we end up in the gate op code, but if we wanted to override hardware
>>> behaviour, it should be the validate function, which positively
>>> identifies a far call/jmp, which should choose to override lma for the
>>> purposes of faking up legacy mode behaviour.
>> I don't think the validate function should do any such overriding.
>> Specifically it shouldn't alter ctxt->lma, risking to surprise x86_emulate().
> 
> I have folded:
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index d010aa3..96bc280 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5412,7 +5412,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned
> long addr,
>              .vendor = d->arch.cpuid->x86_vendor,
>              .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
>              .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
> -            .lma = true,
> +            .lma       = !is_pv_32bit_domain(d),
>          },
>      };
>      int rc;
> @@ -5567,7 +5567,7 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned
> long addr,
>          .vendor = v->domain->arch.cpuid->x86_vendor,
>          .addr_size = addr_size,
>          .sp_size = addr_size,
> -        .lma = true,
> +        .lma = !is_pv_32bit_vcpu(v),
>          .data = &mmio_ro_ctxt
>      };
>      int rc;
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 09dc2f1..5b9bf21 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -2966,7 +2966,7 @@ static int emulate_privileged_op(struct
> cpu_user_regs *regs)
>      struct priv_op_ctxt ctxt = {
>          .ctxt.regs = regs,
>          .ctxt.vendor = currd->arch.cpuid->x86_vendor,
> -        .ctxt.lma = true,
> +        .ctxt.lma = !is_pv_32bit_domain(currd),
>      };
>      int rc;
>      unsigned int eflags, ar;

With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2017-04-07  8:35 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
2017-03-31 19:50 ` [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology Andrew Cooper
2017-04-03  8:24   ` Paul Durrant
2017-04-03  8:24   ` Jan Beulich
2017-04-03 10:19     ` Andrew Cooper
2017-04-03 10:29       ` Jan Beulich
2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
2017-04-03  8:26   ` Paul Durrant
2017-04-03  8:30   ` Jan Beulich
2017-04-03  8:50   ` George Dunlap
2017-04-05  7:08   ` Tian, Kevin
2017-03-31 19:50 ` [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments Andrew Cooper
2017-04-03  8:31   ` Paul Durrant
2017-04-03  9:13   ` Jan Beulich
2017-04-03 14:27     ` Andrew Cooper
2017-04-03 15:07       ` Jan Beulich
2017-04-03 15:42         ` Andrew Cooper
2017-04-03 16:08           ` Jan Beulich
2017-04-03 17:37             ` Andrew Cooper
2017-04-04 10:18               ` Andrew Cooper
2017-04-04 10:32                 ` Jan Beulich
2017-03-31 19:50 ` [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection() Andrew Cooper
2017-04-03  9:30   ` Jan Beulich
2017-04-03 14:04   ` Boris Ostrovsky
2017-03-31 19:50 ` [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure Andrew Cooper
2017-04-03  8:36   ` Paul Durrant
2017-04-03  9:38   ` Jan Beulich
2017-03-31 19:50 ` [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context Andrew Cooper
2017-04-03  8:40   ` Paul Durrant
2017-04-03 10:10   ` Jan Beulich
2017-04-05 16:24     ` Andrew Cooper
2017-04-06  6:58       ` Jan Beulich
2017-04-06  9:47         ` Andrew Cooper
2017-04-06 14:14           ` Jan Beulich
2017-04-06 16:32             ` Andrew Cooper
2017-04-07  8:35               ` Jan Beulich
2017-04-05 16:07   ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.