* [PATCH for-4.9 v3 00/24] XSA-191 followup
@ 2016-11-30 13:50 Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 01/24] x86/shadow: Fix #PFs from emulated writes crossing a page boundary Andrew Cooper
                   ` (23 more replies)
  0 siblings, 24 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This is the set of changes required to fix some edge cases in XSA-191 which
were ultimately chosen not to go out in the security fix.  The main purpose
of this series is to fix emulation sufficiently to allow the final patch to
avoid open-coding all of the segmentation logic.

Changes from v2:

 * 5 new patches (7-11) fixing x86_emulate() not to return X86EMUL_EXCEPTION
   with trap semantics.
 * Adjustments to callers of x86_emulate() to cope with the fault semantics
   (see the sketch below).
 * Tweaks to the implementation of pv_inject_{event,page_fault,hw_exception}().
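
For illustration, the caller pattern which fault semantics imply (a minimal
sketch based on the hvm_emulate_ctxt fields touched later in the series, not
a verbatim quote of any one call site):

    rc = x86_emulate(&hvmemul_ctxt->ctxt, ops);

    switch ( rc )
    {
    case X86EMUL_OKAY:
        /* %eip has already been moved past the emulated instruction. */
        break;
    case X86EMUL_EXCEPTION:
        /*
         * Fault semantics: %eip still points at the instruction which
         * faulted, so the pending event can be injected directly.
         */
        if ( hvmemul_ctxt->exn_pending )
            hvm_inject_event(&hvmemul_ctxt->trap);
        break;
    }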

Andrew Cooper (24):
  x86/shadow: Fix #PFs from emulated writes crossing a page boundary
  x86/emul: Drop X86EMUL_CMPXCHG_FAILED
  x86/emul: Simplify emulation state setup
  x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  x86/pv: Implement pv_inject_{event,page_fault,hw_exception}()
  x86/emul: Clean up the naming of the retire union
  x86/emul: Correct the behaviour of pop %ss and interrupt shadowing
  x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour
  x86/emul: Always use fault semantics for software events
  x86/emul: Implement singlestep as a retire flag
  x86/emul: Remove opencoded exception generation
  x86/emul: Rework emulator event injection
  x86/vmx: Use hvm_{get,set}_segment_register() rather than vmx_{get,set}_segment_register()
  x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  x86/emul: Avoid raising faults behind the emulator's back
  x86/pv: Avoid raising faults behind the emulator's back
  x86/shadow: Avoid raising faults behind the emulator's back
  x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back
  x86/emul: Prepare to allow use of system segments for memory references
  x86/emul: Use system-segment relative memory accesses

 tools/tests/x86_emulator/test_x86_emulator.c |   1 +
 tools/tests/x86_emulator/x86_emulate.c       |   3 +
 xen/arch/x86/hvm/emulate.c                   | 147 ++++-------
 xen/arch/x86/hvm/hvm.c                       | 370 +++++++++++++++++++--------
 xen/arch/x86/hvm/io.c                        |   4 +-
 xen/arch/x86/hvm/nestedhvm.c                 |   2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c             |  13 +-
 xen/arch/x86/hvm/svm/svm.c                   | 144 +++++------
 xen/arch/x86/hvm/vmx/intr.c                  |   2 +-
 xen/arch/x86/hvm/vmx/realmode.c              |  16 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 109 ++++----
 xen/arch/x86/hvm/vmx/vvmx.c                  |  44 ++--
 xen/arch/x86/mm.c                            |  94 +++++--
 xen/arch/x86/mm/shadow/common.c              |  40 +--
 xen/arch/x86/mm/shadow/multi.c               |  57 ++++-
 xen/arch/x86/traps.c                         | 147 ++++++-----
 xen/arch/x86/x86_emulate/x86_emulate.c       | 357 +++++++++++++++-----------
 xen/arch/x86/x86_emulate/x86_emulate.h       | 219 +++++++++++++---
 xen/include/asm-x86/desc.h                   |   6 +
 xen/include/asm-x86/domain.h                 |  26 ++
 xen/include/asm-x86/hvm/emulate.h            |   3 -
 xen/include/asm-x86/hvm/hvm.h                |  86 +++----
 xen/include/asm-x86/hvm/support.h            |  42 ++-
 xen/include/asm-x86/hvm/svm/nestedsvm.h      |   6 +-
 xen/include/asm-x86/hvm/vcpu.h               |   2 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |   2 -
 xen/include/asm-x86/hvm/vmx/vvmx.h           |   4 +-
 xen/include/asm-x86/mm.h                     |   1 -
 28 files changed, 1190 insertions(+), 757 deletions(-)

-- 
2.1.4



* [PATCH v3 01/24] x86/shadow: Fix #PFs from emulated writes crossing a page boundary
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 02/24] x86/emul: Drop X86EMUL_CMPXCHG_FAILED Andrew Cooper
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

When translating the second frame of a write crossing a page boundary, mask
the linear address down to the page boundary.

This causes the correct %cr2 to be reported to the guest in the case that
the second frame suffers a pagefault during translation.
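
As a concrete example (illustrative numbers, assuming 4k pages), take a
4-byte write starting 2 bytes short of a page boundary:

    unsigned long vaddr = 0x1ffe, bytes = 4;
    unsigned long last = vaddr + bytes - 1;     /* 0x2001, in the 2nd page */
    unsigned long frame2 = last & PAGE_MASK;    /* 0x2000, page boundary */

If translating the second frame faults, %cr2 must report 0x2000 (the first
byte of the write which lives in the second page), not 0x2001.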

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2:
 * New
---
 xen/arch/x86/mm/shadow/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index ced2313..7e5b8b0 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1808,7 +1808,8 @@ void *sh_emulate_map_dest(struct vcpu *v, unsigned long vaddr,
     else
     {
         /* This write crosses a page boundary. Translate the second page. */
-        sh_ctxt->mfn[1] = emulate_gva_to_mfn(v, vaddr + bytes - 1, sh_ctxt);
+        sh_ctxt->mfn[1] = emulate_gva_to_mfn(
+            v, (vaddr + bytes - 1) & PAGE_MASK, sh_ctxt);
         if ( !mfn_valid(sh_ctxt->mfn[1]) )
             return ((mfn_x(sh_ctxt->mfn[1]) == BAD_GVA_TO_GFN) ?
                     MAPPING_EXCEPTION :
-- 
2.1.4



* [PATCH v3 02/24] x86/emul: Drop X86EMUL_CMPXCHG_FAILED
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 01/24] x86/shadow: Fix #PFs from emulated writes crossing a page boundary Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 03/24] x86/emul: Simplify emulation state setup Andrew Cooper
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

X86EMUL_CMPXCHG_FAILED was introduced in c/s d430aae25 in 2005.  Even at the
time it aliased what is now X86EMUL_RETRY (as well as what is now
X86EMUL_EXCEPTION).  I am not sure why the distinction was considered useful
at the time.

It is only used twice; there is no need to call it out differently from other
uses of X86EMUL_RETRY.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v2:
 * New
---
 xen/arch/x86/mm.c                      | 2 +-
 xen/arch/x86/mm/shadow/multi.c         | 2 +-
 xen/arch/x86/x86_emulate/x86_emulate.h | 2 --
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 03dcd71..5b0e9f3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5254,7 +5254,7 @@ static int ptwr_emulated_update(
         {
             unmap_domain_page(pl1e);
             put_page_from_l1e(nl1e, d);
-            return X86EMUL_CMPXCHG_FAILED;
+            return X86EMUL_RETRY;
         }
     }
     else
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index d70b1c6..9ee48a8 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4694,7 +4694,7 @@ sh_x86_emulate_cmpxchg(struct vcpu *v, unsigned long vaddr,
     }
 
     if ( prev != old )
-        rv = X86EMUL_CMPXCHG_FAILED;
+        rv = X86EMUL_RETRY;
 
     SHADOW_DEBUG(EMULATE, "va %#lx was %#lx expected %#lx"
                   " wanted %#lx now %#lx bytes %u\n",
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 993c576..ec824ce 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -109,8 +109,6 @@ struct __attribute__((__packed__)) segment_register {
 #define X86EMUL_EXCEPTION      2
  /* Retry the emulation for some reason. No state modified. */
 #define X86EMUL_RETRY          3
- /* (cmpxchg accessor): CMPXCHG failed. Maps to X86EMUL_RETRY in caller. */
-#define X86EMUL_CMPXCHG_FAILED 3
 
 /* FPU sub-types which may be requested via ->get_fpu(). */
 enum x86_emulate_fpu_type {
-- 
2.1.4



* [PATCH v3 03/24] x86/emul: Simplify emulation state setup
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 01/24] x86/shadow: Fix #PFs from emulated writes crossing a page boundary Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 02/24] x86/emul: Drop X86EMUL_CMPXCHG_FAILED Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-12-08  6:34   ` George Dunlap
  2016-11-30 13:50 ` [PATCH v3 04/24] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
                   ` (20 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: George Dunlap, Andrew Cooper

The current code to set up emulation state is ad-hoc and error prone.

 * Consistently zero all emulation state structures (see the sketch below).
 * Avoid explicitly initialising some state to 0.
 * Explicitly identify all input and output state in x86_emulate_ctxt.  This
   involves rearranging some fields.
 * Have x86_decode() explicitly initialise all output state at its start.
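
The C property the first two points rely on, illustrated with the ptwr
context from the diff below (a sketch; fields abbreviated):

    struct ptwr_emulate_ctxt ptwr_ctxt = {
        .ctxt = {
            .regs = regs,
            .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
        },
    };
    /*
     * Every field not named in the designated initialiser (e.g.
     * .ctxt.force_writeback) is guaranteed to be zeroed, which removes
     * the need for the previous ad-hoc field-by-field assignments.
     */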

While making the above changes, two minor tweaks:

 * Move the calculation of hvmemul_ctxt->ctxt.swint_emulate from
   _hvm_emulate_one() to hvm_emulate_init_once().  It doesn't need
   recalculating for each instruction.
 * Change force_writeback to being a boolean, to match its use.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
---
CC: George Dunlap <george.dunlap@eu.citrix.com>

v2:
 * Split x86_emulate_ctxt into three sections
---
 xen/arch/x86/hvm/emulate.c             | 28 +++++++++++++++-------------
 xen/arch/x86/mm.c                      | 14 ++++++++------
 xen/arch/x86/mm/shadow/common.c        |  4 ++--
 xen/arch/x86/x86_emulate/x86_emulate.c |  1 +
 xen/arch/x86/x86_emulate/x86_emulate.h | 32 ++++++++++++++++++++++----------
 5 files changed, 48 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f1f6e2f..3efeead 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1770,13 +1770,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 
     vio->mmio_retry = 0;
 
-    if ( cpu_has_vmx )
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
-    else if ( cpu_has_svm_nrips )
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
-    else
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
-
     rc = x86_emulate(&hvmemul_ctxt->ctxt, ops);
 
     if ( rc == X86EMUL_OKAY && vio->mmio_retry )
@@ -1947,14 +1940,23 @@ void hvm_emulate_init_once(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
     struct cpu_user_regs *regs)
 {
-    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(current);
-    hvmemul_ctxt->ctxt.regs = regs;
-    hvmemul_ctxt->ctxt.force_writeback = 1;
-    hvmemul_ctxt->seg_reg_accessed = 0;
-    hvmemul_ctxt->seg_reg_dirty = 0;
-    hvmemul_ctxt->set_context = 0;
+    struct vcpu *curr = current;
+
+    memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
+
+    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
+
+    hvmemul_ctxt->ctxt.regs = regs;
+    hvmemul_ctxt->ctxt.force_writeback = true;
+
+    if ( cpu_has_vmx )
+        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
+    else if ( cpu_has_svm_nrips )
+        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
+    else
+        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
 }
 
 void hvm_emulate_init_per_insn(
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5b0e9f3..d365f59 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5337,7 +5337,14 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
     struct domain *d = v->domain;
     struct page_info *page;
     l1_pgentry_t      pte;
-    struct ptwr_emulate_ctxt ptwr_ctxt;
+    struct ptwr_emulate_ctxt ptwr_ctxt = {
+        .ctxt = {
+            .regs = regs,
+            .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
+            .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
+            .swint_emulate = x86_swint_emulate_none,
+        },
+    };
     int rc;
 
     /* Attempt to read the PTE that maps the VA being accessed. */
@@ -5363,11 +5370,6 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
         goto bail;
     }
 
-    ptwr_ctxt.ctxt.regs = regs;
-    ptwr_ctxt.ctxt.force_writeback = 0;
-    ptwr_ctxt.ctxt.addr_size = ptwr_ctxt.ctxt.sp_size =
-        is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG;
-    ptwr_ctxt.ctxt.swint_emulate = x86_swint_emulate_none;
     ptwr_ctxt.cr2 = addr;
     ptwr_ctxt.pte = pte;
 
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 7e5b8b0..a4a3c4b 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -385,8 +385,9 @@ const struct x86_emulate_ops *shadow_init_emulation(
     struct vcpu *v = current;
     unsigned long addr;
 
+    memset(sh_ctxt, 0, sizeof(*sh_ctxt));
+
     sh_ctxt->ctxt.regs = regs;
-    sh_ctxt->ctxt.force_writeback = 0;
     sh_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
 
     if ( is_pv_vcpu(v) )
@@ -396,7 +397,6 @@ const struct x86_emulate_ops *shadow_init_emulation(
     }
 
     /* Segment cache initialisation. Primed with CS. */
-    sh_ctxt->valid_seg_regs = 0;
     creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
 
     /* Work out the emulation mode. */
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index d82e85d..532bd32 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1904,6 +1904,7 @@ x86_decode(
     state->regs = ctxt->regs;
     state->eip = ctxt->regs->eip;
 
+    /* Initialise output state in x86_emulate_ctxt */
     ctxt->retire.byte = 0;
 
     op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index ec824ce..ab566c0 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -410,6 +410,23 @@ struct cpu_user_regs;
 
 struct x86_emulate_ctxt
 {
+    /*
+     * Input-only state:
+     */
+
+    /* Software event injection support. */
+    enum x86_swint_emulation swint_emulate;
+
+    /* Set this if writes may have side effects. */
+    bool force_writeback;
+
+    /* Caller data that can be used by x86_emulate_ops' routines. */
+    void *data;
+
+    /*
+     * Input/output state:
+     */
+
     /* Register state before/after emulation. */
     struct cpu_user_regs *regs;
 
@@ -419,14 +436,12 @@ struct x86_emulate_ctxt
     /* Stack pointer width in bits (16, 32 or 64). */
     unsigned int sp_size;
 
-    /* Canonical opcode (see below). */
-    unsigned int opcode;
-
-    /* Software event injection support. */
-    enum x86_swint_emulation swint_emulate;
+    /*
+     * Output-only state:
+     */
 
-    /* Set this if writes may have side effects. */
-    uint8_t force_writeback;
+    /* Canonical opcode (see below) (valid only on X86EMUL_OKAY). */
+    unsigned int opcode;
 
     /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
     union {
@@ -437,9 +452,6 @@ struct x86_emulate_ctxt
         } flags;
         uint8_t byte;
     } retire;
-
-    /* Caller data that can be used by x86_emulate_ops' routines. */
-    void *data;
 };
 
 /*
-- 
2.1.4



* [PATCH v3 04/24] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (2 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 03/24] x86/emul: Simplify emulation state setup Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 05/24] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

The x86 emulator needs to gain an understanding of interrupts and exceptions
generated by its actions.  The naming choice is to match both the Intel and
AMD terms, and to avoid 'trap' specifically, as its architectural meaning (an
event reported after the instruction completes) differs from its usage here.

While making this change, make other changes for consistency:

 * Rename *_trap() infrastructure to *_event()
 * Rename trapnr/trap parameters to vector
 * Convert hvm_inject_hw_exception() and hvm_inject_page_fault() to being
   static inlines, as they are only thin wrappers around hvm_inject_event()

No functional change.
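
Typical usage after this change (a sketch mirroring the new static inline
wrappers in the hvm.h hunk below; the #GP(0) error code is illustrative):

    struct x86_event event = {
        .vector = TRAP_gp_fault,
        .type = X86_EVENTTYPE_HW_EXCEPTION,
        .error_code = 0,
    };

    hvm_inject_event(&event);

    /* or, equivalently, via the thin wrapper: */
    hvm_inject_hw_exception(TRAP_gp_fault, 0);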

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/hvm/emulate.c              |  6 +--
 xen/arch/x86/hvm/hvm.c                  | 33 ++++------------
 xen/arch/x86/hvm/io.c                   |  2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c        |  7 ++--
 xen/arch/x86/hvm/svm/svm.c              | 62 ++++++++++++++---------------
 xen/arch/x86/hvm/vmx/vmx.c              | 66 +++++++++++++++----------------
 xen/arch/x86/hvm/vmx/vvmx.c             | 11 +++---
 xen/arch/x86/x86_emulate/x86_emulate.c  | 11 ++++++
 xen/arch/x86/x86_emulate/x86_emulate.h  | 22 +++++++++++
 xen/include/asm-x86/hvm/emulate.h       |  2 +-
 xen/include/asm-x86/hvm/hvm.h           | 69 ++++++++++++++++-----------------
 xen/include/asm-x86/hvm/svm/nestedsvm.h |  6 +--
 xen/include/asm-x86/hvm/vcpu.h          |  2 +-
 xen/include/asm-x86/hvm/vmx/vvmx.h      |  4 +-
 14 files changed, 159 insertions(+), 144 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 3efeead..bb26d40 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1679,7 +1679,7 @@ static int hvmemul_invlpg(
          * violations, so squash them.
          */
         hvmemul_ctxt->exn_pending = 0;
-        hvmemul_ctxt->trap = (struct hvm_trap){};
+        hvmemul_ctxt->trap = (struct x86_event){};
         rc = X86EMUL_OKAY;
     }
 
@@ -1869,7 +1869,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
         break;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
-            hvm_inject_trap(&ctxt.trap);
+            hvm_inject_event(&ctxt.trap);
         /* fallthrough */
     default:
         hvm_emulate_writeback(&ctxt);
@@ -1929,7 +1929,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
         break;
     case X86EMUL_EXCEPTION:
         if ( ctx.exn_pending )
-            hvm_inject_trap(&ctx.trap);
+            hvm_inject_event(&ctx.trap);
         break;
     }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 25dc759..7b434aa 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -535,7 +535,7 @@ void hvm_do_resume(struct vcpu *v)
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
     {
-        hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
+        hvm_inject_event(&v->arch.hvm_vcpu.inject_trap);
         v->arch.hvm_vcpu.inject_trap.vector = -1;
     }
 }
@@ -1676,19 +1676,19 @@ void hvm_triple_fault(void)
     domain_shutdown(d, reason);
 }
 
-void hvm_inject_trap(const struct hvm_trap *trap)
+void hvm_inject_event(const struct x86_event *event)
 {
     struct vcpu *curr = current;
 
     if ( nestedhvm_enabled(curr->domain) &&
          !nestedhvm_vmswitch_in_progress(curr) &&
          nestedhvm_vcpu_in_guestmode(curr) &&
-         nhvm_vmcx_guest_intercepts_trap(
-             curr, trap->vector, trap->error_code) )
+         nhvm_vmcx_guest_intercepts_event(
+             curr, event->vector, event->error_code) )
     {
         enum nestedhvm_vmexits nsret;
 
-        nsret = nhvm_vcpu_vmexit_trap(curr, trap);
+        nsret = nhvm_vcpu_vmexit_event(curr, event);
 
         switch ( nsret )
         {
@@ -1704,26 +1704,7 @@ void hvm_inject_trap(const struct hvm_trap *trap)
         }
     }
 
-    hvm_funcs.inject_trap(trap);
-}
-
-void hvm_inject_hw_exception(unsigned int trapnr, int errcode)
-{
-    struct hvm_trap trap = {
-        .vector = trapnr,
-        .type = X86_EVENTTYPE_HW_EXCEPTION,
-        .error_code = errcode };
-    hvm_inject_trap(&trap);
-}
-
-void hvm_inject_page_fault(int errcode, unsigned long cr2)
-{
-    struct hvm_trap trap = {
-        .vector = TRAP_page_fault,
-        .type = X86_EVENTTYPE_HW_EXCEPTION,
-        .error_code = errcode,
-        .cr2 = cr2 };
-    hvm_inject_trap(&trap);
+    hvm_funcs.inject_event(event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -4096,7 +4077,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
         break;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
-            hvm_inject_trap(&ctxt.trap);
+            hvm_inject_event(&ctxt.trap);
         /* fall through */
     default:
         hvm_emulate_writeback(&ctxt);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 7305801..1279f68 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -103,7 +103,7 @@ int handle_mmio(void)
         return 0;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
-            hvm_inject_trap(&ctxt.trap);
+            hvm_inject_event(&ctxt.trap);
         break;
     default:
         break;
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index f9b38ab..b6b8526 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -821,7 +821,7 @@ nsvm_vcpu_vmexit_inject(struct vcpu *v, struct cpu_user_regs *regs,
 }
 
 int
-nsvm_vcpu_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap)
+nsvm_vcpu_vmexit_event(struct vcpu *v, const struct x86_event *trap)
 {
     ASSERT(vcpu_nestedhvm(v).nv_vvmcx != NULL);
 
@@ -994,10 +994,11 @@ nsvm_vmcb_guest_intercepts_exitcode(struct vcpu *v,
 }
 
 bool_t
-nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr, int errcode)
+nsvm_vmcb_guest_intercepts_event(
+    struct vcpu *v, unsigned int vector, int errcode)
 {
     return nsvm_vmcb_guest_intercepts_exitcode(v,
-        guest_cpu_user_regs(), VMEXIT_EXCEPTION_DE + trapnr);
+        guest_cpu_user_regs(), VMEXIT_EXCEPTION_DE + vector);
 }
 
 static int
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 37bd6c4..caab5ce 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1203,15 +1203,15 @@ static void svm_vcpu_destroy(struct vcpu *v)
     passive_domain_destroy(v);
 }
 
-static void svm_inject_trap(const struct hvm_trap *trap)
+static void svm_inject_event(const struct x86_event *event)
 {
     struct vcpu *curr = current;
     struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
-    eventinj_t event = vmcb->eventinj;
-    struct hvm_trap _trap = *trap;
+    eventinj_t eventinj = vmcb->eventinj;
+    struct x86_event _event = *event;
     const struct cpu_user_regs *regs = guest_cpu_user_regs();
 
-    switch ( _trap.vector )
+    switch ( _event.vector )
     {
     case TRAP_debug:
         if ( regs->eflags & X86_EFLAGS_TF )
@@ -1229,21 +1229,21 @@ static void svm_inject_trap(const struct hvm_trap *trap)
         }
     }
 
-    if ( unlikely(event.fields.v) &&
-         (event.fields.type == X86_EVENTTYPE_HW_EXCEPTION) )
+    if ( unlikely(eventinj.fields.v) &&
+         (eventinj.fields.type == X86_EVENTTYPE_HW_EXCEPTION) )
     {
-        _trap.vector = hvm_combine_hw_exceptions(
-            event.fields.vector, _trap.vector);
-        if ( _trap.vector == TRAP_double_fault )
-            _trap.error_code = 0;
+        _event.vector = hvm_combine_hw_exceptions(
+            eventinj.fields.vector, _event.vector);
+        if ( _event.vector == TRAP_double_fault )
+            _event.error_code = 0;
     }
 
-    event.bytes = 0;
-    event.fields.v = 1;
-    event.fields.vector = _trap.vector;
+    eventinj.bytes = 0;
+    eventinj.fields.v = 1;
+    eventinj.fields.vector = _event.vector;
 
     /* Refer to AMD Vol 2: System Programming, 15.20 Event Injection. */
-    switch ( _trap.type )
+    switch ( _event.type )
     {
     case X86_EVENTTYPE_SW_INTERRUPT: /* int $n */
         /*
@@ -1253,8 +1253,8 @@ static void svm_inject_trap(const struct hvm_trap *trap)
          * moved eip forward if appropriate.
          */
         if ( cpu_has_svm_nrips )
-            vmcb->nextrip = regs->eip + _trap.insn_len;
-        event.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
+            vmcb->nextrip = regs->eip + _event.insn_len;
+        eventinj.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
         break;
 
     case X86_EVENTTYPE_PRI_SW_EXCEPTION: /* icebp */
@@ -1265,7 +1265,7 @@ static void svm_inject_trap(const struct hvm_trap *trap)
          */
         if ( cpu_has_svm_nrips )
             vmcb->nextrip = regs->eip;
-        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
+        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
         break;
 
     case X86_EVENTTYPE_SW_EXCEPTION: /* int3, into */
@@ -1279,28 +1279,28 @@ static void svm_inject_trap(const struct hvm_trap *trap)
          * the correct faulting eip should a fault occur.
          */
         if ( cpu_has_svm_nrips )
-            vmcb->nextrip = regs->eip + _trap.insn_len;
-        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
+            vmcb->nextrip = regs->eip + _event.insn_len;
+        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
         break;
 
     default:
-        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
-        event.fields.ev = (_trap.error_code != HVM_DELIVER_NO_ERROR_CODE);
-        event.fields.errorcode = _trap.error_code;
+        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
+        eventinj.fields.ev = (_event.error_code != HVM_DELIVER_NO_ERROR_CODE);
+        eventinj.fields.errorcode = _event.error_code;
         break;
     }
 
-    vmcb->eventinj = event;
+    vmcb->eventinj = eventinj;
 
-    if ( _trap.vector == TRAP_page_fault )
+    if ( _event.vector == TRAP_page_fault )
     {
-        curr->arch.hvm_vcpu.guest_cr[2] = _trap.cr2;
-        vmcb_set_cr2(vmcb, _trap.cr2);
-        HVMTRACE_LONG_2D(PF_INJECT, _trap.error_code, TRC_PAR_LONG(_trap.cr2));
+        curr->arch.hvm_vcpu.guest_cr[2] = _event.cr2;
+        vmcb_set_cr2(vmcb, _event.cr2);
+        HVMTRACE_LONG_2D(PF_INJECT, _event.error_code, TRC_PAR_LONG(_event.cr2));
     }
     else
     {
-        HVMTRACE_2D(INJ_EXC, _trap.vector, _trap.error_code);
+        HVMTRACE_2D(INJ_EXC, _event.vector, _event.error_code);
     }
 }
 
@@ -2238,7 +2238,7 @@ static struct hvm_function_table __initdata svm_function_table = {
     .set_guest_pat        = svm_set_guest_pat,
     .get_guest_pat        = svm_get_guest_pat,
     .set_tsc_offset       = svm_set_tsc_offset,
-    .inject_trap          = svm_inject_trap,
+    .inject_event         = svm_inject_event,
     .init_hypercall_page  = svm_init_hypercall_page,
     .event_pending        = svm_event_pending,
     .invlpg               = svm_invlpg,
@@ -2253,9 +2253,9 @@ static struct hvm_function_table __initdata svm_function_table = {
     .nhvm_vcpu_initialise = nsvm_vcpu_initialise,
     .nhvm_vcpu_destroy = nsvm_vcpu_destroy,
     .nhvm_vcpu_reset = nsvm_vcpu_reset,
-    .nhvm_vcpu_vmexit_trap = nsvm_vcpu_vmexit_trap,
+    .nhvm_vcpu_vmexit_event = nsvm_vcpu_vmexit_event,
     .nhvm_vcpu_p2m_base = nsvm_vcpu_hostcr3,
-    .nhvm_vmcx_guest_intercepts_trap = nsvm_vmcb_guest_intercepts_trap,
+    .nhvm_vmcx_guest_intercepts_event = nsvm_vmcb_guest_intercepts_event,
     .nhvm_vmcx_hap_enabled = nsvm_vmcb_hap_enabled,
     .nhvm_intr_blocked = nsvm_intr_blocked,
     .nhvm_hap_walk_L1_p2m = nsvm_hap_walk_L1_p2m,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 7b2c50c..ed9b69b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1623,9 +1623,9 @@ void nvmx_enqueue_n2_exceptions(struct vcpu *v,
                  nvmx->intr.intr_info, nvmx->intr.error_code);
 }
 
-static int nvmx_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap)
+static int nvmx_vmexit_event(struct vcpu *v, const struct x86_event *event)
 {
-    nvmx_enqueue_n2_exceptions(v, trap->vector, trap->error_code,
+    nvmx_enqueue_n2_exceptions(v, event->vector, event->error_code,
                                hvm_intsrc_none);
     return NESTEDHVM_VMEXIT_DONE;
 }
@@ -1707,13 +1707,13 @@ void vmx_inject_nmi(void)
  *  - #DB is X86_EVENTTYPE_HW_EXCEPTION, except when generated by
  *    opcode 0xf1 (which is X86_EVENTTYPE_PRI_SW_EXCEPTION)
  */
-static void vmx_inject_trap(const struct hvm_trap *trap)
+static void vmx_inject_event(const struct x86_event *event)
 {
     unsigned long intr_info;
     struct vcpu *curr = current;
-    struct hvm_trap _trap = *trap;
+    struct x86_event _event = *event;
 
-    switch ( _trap.vector | -(_trap.type == X86_EVENTTYPE_SW_INTERRUPT) )
+    switch ( _event.vector | -(_event.type == X86_EVENTTYPE_SW_INTERRUPT) )
     {
     case TRAP_debug:
         if ( guest_cpu_user_regs()->eflags & X86_EFLAGS_TF )
@@ -1722,7 +1722,7 @@ static void vmx_inject_trap(const struct hvm_trap *trap)
             write_debugreg(6, read_debugreg(6) | DR_STEP);
         }
         if ( !nestedhvm_vcpu_in_guestmode(curr) ||
-             !nvmx_intercepts_exception(curr, TRAP_debug, _trap.error_code) )
+             !nvmx_intercepts_exception(curr, TRAP_debug, _event.error_code) )
         {
             unsigned long val;
 
@@ -1744,8 +1744,8 @@ static void vmx_inject_trap(const struct hvm_trap *trap)
         break;
 
     case TRAP_page_fault:
-        ASSERT(_trap.type == X86_EVENTTYPE_HW_EXCEPTION);
-        curr->arch.hvm_vcpu.guest_cr[2] = _trap.cr2;
+        ASSERT(_event.type == X86_EVENTTYPE_HW_EXCEPTION);
+        curr->arch.hvm_vcpu.guest_cr[2] = _event.cr2;
         break;
     }
 
@@ -1758,34 +1758,34 @@ static void vmx_inject_trap(const struct hvm_trap *trap)
          (MASK_EXTR(intr_info, INTR_INFO_INTR_TYPE_MASK) ==
           X86_EVENTTYPE_HW_EXCEPTION) )
     {
-        _trap.vector = hvm_combine_hw_exceptions(
-            (uint8_t)intr_info, _trap.vector);
-        if ( _trap.vector == TRAP_double_fault )
-            _trap.error_code = 0;
+        _event.vector = hvm_combine_hw_exceptions(
+            (uint8_t)intr_info, _event.vector);
+        if ( _event.vector == TRAP_double_fault )
+            _event.error_code = 0;
     }
 
-    if ( _trap.type >= X86_EVENTTYPE_SW_INTERRUPT )
-        __vmwrite(VM_ENTRY_INSTRUCTION_LEN, _trap.insn_len);
+    if ( _event.type >= X86_EVENTTYPE_SW_INTERRUPT )
+        __vmwrite(VM_ENTRY_INSTRUCTION_LEN, _event.insn_len);
 
     if ( nestedhvm_vcpu_in_guestmode(curr) &&
-         nvmx_intercepts_exception(curr, _trap.vector, _trap.error_code) )
+         nvmx_intercepts_exception(curr, _event.vector, _event.error_code) )
     {
         nvmx_enqueue_n2_exceptions (curr, 
             INTR_INFO_VALID_MASK |
-            MASK_INSR(_trap.type, INTR_INFO_INTR_TYPE_MASK) |
-            MASK_INSR(_trap.vector, INTR_INFO_VECTOR_MASK),
-            _trap.error_code, hvm_intsrc_none);
+            MASK_INSR(_event.type, INTR_INFO_INTR_TYPE_MASK) |
+            MASK_INSR(_event.vector, INTR_INFO_VECTOR_MASK),
+            _event.error_code, hvm_intsrc_none);
         return;
     }
     else
-        __vmx_inject_exception(_trap.vector, _trap.type, _trap.error_code);
+        __vmx_inject_exception(_event.vector, _event.type, _event.error_code);
 
-    if ( (_trap.vector == TRAP_page_fault) &&
-         (_trap.type == X86_EVENTTYPE_HW_EXCEPTION) )
-        HVMTRACE_LONG_2D(PF_INJECT, _trap.error_code,
+    if ( (_event.vector == TRAP_page_fault) &&
+         (_event.type == X86_EVENTTYPE_HW_EXCEPTION) )
+        HVMTRACE_LONG_2D(PF_INJECT, _event.error_code,
                          TRC_PAR_LONG(curr->arch.hvm_vcpu.guest_cr[2]));
     else
-        HVMTRACE_2D(INJ_EXC, _trap.vector, _trap.error_code);
+        HVMTRACE_2D(INJ_EXC, _event.vector, _event.error_code);
 }
 
 static int vmx_event_pending(struct vcpu *v)
@@ -2162,7 +2162,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .set_guest_pat        = vmx_set_guest_pat,
     .get_guest_pat        = vmx_get_guest_pat,
     .set_tsc_offset       = vmx_set_tsc_offset,
-    .inject_trap          = vmx_inject_trap,
+    .inject_event         = vmx_inject_event,
     .init_hypercall_page  = vmx_init_hypercall_page,
     .event_pending        = vmx_event_pending,
     .invlpg               = vmx_invlpg,
@@ -2182,8 +2182,8 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .nhvm_vcpu_reset      = nvmx_vcpu_reset,
     .nhvm_vcpu_p2m_base   = nvmx_vcpu_eptp_base,
     .nhvm_vmcx_hap_enabled = nvmx_ept_enabled,
-    .nhvm_vmcx_guest_intercepts_trap = nvmx_intercepts_exception,
-    .nhvm_vcpu_vmexit_trap = nvmx_vmexit_trap,
+    .nhvm_vmcx_guest_intercepts_event = nvmx_intercepts_exception,
+    .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
     .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
@@ -3201,7 +3201,7 @@ static int vmx_handle_eoi_write(void)
  */
 static void vmx_propagate_intr(unsigned long intr)
 {
-    struct hvm_trap trap = {
+    struct x86_event event = {
         .vector = MASK_EXTR(intr, INTR_INFO_VECTOR_MASK),
         .type = MASK_EXTR(intr, INTR_INFO_INTR_TYPE_MASK),
     };
@@ -3210,20 +3210,20 @@ static void vmx_propagate_intr(unsigned long intr)
     if ( intr & INTR_INFO_DELIVER_CODE_MASK )
     {
         __vmread(VM_EXIT_INTR_ERROR_CODE, &tmp);
-        trap.error_code = tmp;
+        event.error_code = tmp;
     }
     else
-        trap.error_code = HVM_DELIVER_NO_ERROR_CODE;
+        event.error_code = HVM_DELIVER_NO_ERROR_CODE;
 
-    if ( trap.type >= X86_EVENTTYPE_SW_INTERRUPT )
+    if ( event.type >= X86_EVENTTYPE_SW_INTERRUPT )
     {
         __vmread(VM_EXIT_INSTRUCTION_LEN, &tmp);
-        trap.insn_len = tmp;
+        event.insn_len = tmp;
     }
     else
-        trap.insn_len = 0;
+        event.insn_len = 0;
 
-    hvm_inject_trap(&trap);
+    hvm_inject_event(&event);
 }
 
 static void vmx_idtv_reinject(unsigned long idtv_info)
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index bed2e0a..b5837d4 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -491,18 +491,19 @@ static void vmreturn(struct cpu_user_regs *regs, enum vmx_ops_result ops_res)
     regs->eflags = eflags;
 }
 
-bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
-                                 int error_code)
+bool_t nvmx_intercepts_exception(
+    struct vcpu *v, unsigned int vector, int error_code)
 {
     u32 exception_bitmap, pfec_match=0, pfec_mask=0;
     int r;
 
-    ASSERT ( trap < 32 );
+    ASSERT(vector < 32);
 
     exception_bitmap = get_vvmcs(v, EXCEPTION_BITMAP);
-    r = exception_bitmap & (1 << trap) ? 1: 0;
+    r = exception_bitmap & (1 << vector) ? 1: 0;
 
-    if ( trap == TRAP_page_fault ) {
+    if ( vector == TRAP_page_fault )
+    {
         pfec_match = get_vvmcs(v, PAGE_FAULT_ERROR_CODE_MATCH);
         pfec_mask  = get_vvmcs(v, PAGE_FAULT_ERROR_CODE_MASK);
         if ( (error_code & pfec_mask) != pfec_match )
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 532bd32..9c28ed4 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -5451,6 +5451,17 @@ static void __init __maybe_unused build_assertions(void)
     BUILD_BUG_ON(x86_seg_ds != 3);
     BUILD_BUG_ON(x86_seg_fs != 4);
     BUILD_BUG_ON(x86_seg_gs != 5);
+
+    /*
+     * Check X86_EVENTTYPE_* against VMCB EVENTINJ and VMCS INTR_INFO type
+     * fields.
+     */
+    BUILD_BUG_ON(X86_EVENTTYPE_EXT_INTR != 0);
+    BUILD_BUG_ON(X86_EVENTTYPE_NMI != 2);
+    BUILD_BUG_ON(X86_EVENTTYPE_HW_EXCEPTION != 3);
+    BUILD_BUG_ON(X86_EVENTTYPE_SW_INTERRUPT != 4);
+    BUILD_BUG_ON(X86_EVENTTYPE_PRI_SW_EXCEPTION != 5);
+    BUILD_BUG_ON(X86_EVENTTYPE_SW_EXCEPTION != 6);
 }
 
 #ifdef __XEN__
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index ab566c0..54c532c 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -67,6 +67,28 @@ enum x86_swint_emulation {
     x86_swint_emulate_all,  /* Help needed with all software events */
 };
 
+/*
+ * x86 event types. This enumeration is valid for:
+ *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
+ *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
+ */
+enum x86_event_type {
+    X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
+    X86_EVENTTYPE_NMI = 2,          /* NMI */
+    X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
+    X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
+    X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
+    X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
+};
+
+struct x86_event {
+    int16_t       vector;
+    uint8_t       type;         /* X86_EVENTTYPE_* */
+    uint8_t       insn_len;     /* Instruction length */
+    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
+    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
+};
+
 /* 
  * Attribute for segment selector. This is a copy of bit 40:47 & 52:55 of the
  * segment descriptor. It happens to match the format of an AMD SVM VMCB.
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index d4186a2..3b7ec33 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -30,7 +30,7 @@ struct hvm_emulate_ctxt {
     unsigned long seg_reg_dirty;
 
     bool_t exn_pending;
-    struct hvm_trap trap;
+    struct x86_event trap;
 
     uint32_t intr_shadow;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 7e7462e..51a64f7 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -77,14 +77,6 @@ enum hvm_intblk {
 #define HVM_HAP_SUPERPAGE_2MB   0x00000001
 #define HVM_HAP_SUPERPAGE_1GB   0x00000002
 
-struct hvm_trap {
-    int16_t       vector;
-    uint8_t       type;         /* X86_EVENTTYPE_* */
-    uint8_t       insn_len;     /* Instruction length */
-    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
-    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
-};
-
 /*
  * The hardware virtual machine (HVM) interface abstracts away from the
  * x86/x86_64 CPU virtualization assist specifics. Currently this interface
@@ -152,7 +144,7 @@ struct hvm_function_table {
 
     void (*set_tsc_offset)(struct vcpu *v, u64 offset, u64 at_tsc);
 
-    void (*inject_trap)(const struct hvm_trap *trap);
+    void (*inject_event)(const struct x86_event *event);
 
     void (*init_hypercall_page)(struct domain *d, void *hypercall_page);
 
@@ -185,11 +177,10 @@ struct hvm_function_table {
     int (*nhvm_vcpu_initialise)(struct vcpu *v);
     void (*nhvm_vcpu_destroy)(struct vcpu *v);
     int (*nhvm_vcpu_reset)(struct vcpu *v);
-    int (*nhvm_vcpu_vmexit_trap)(struct vcpu *v, const struct hvm_trap *trap);
+    int (*nhvm_vcpu_vmexit_event)(struct vcpu *v, const struct x86_event *event);
     uint64_t (*nhvm_vcpu_p2m_base)(struct vcpu *v);
-    bool_t (*nhvm_vmcx_guest_intercepts_trap)(struct vcpu *v,
-                                              unsigned int trapnr,
-                                              int errcode);
+    bool_t (*nhvm_vmcx_guest_intercepts_event)(
+        struct vcpu *v, unsigned int vector, int errcode);
 
     bool_t (*nhvm_vmcx_hap_enabled)(struct vcpu *v);
 
@@ -419,9 +410,30 @@ void hvm_migrate_timers(struct vcpu *v);
 void hvm_do_resume(struct vcpu *v);
 void hvm_migrate_pirqs(struct vcpu *v);
 
-void hvm_inject_trap(const struct hvm_trap *trap);
-void hvm_inject_hw_exception(unsigned int trapnr, int errcode);
-void hvm_inject_page_fault(int errcode, unsigned long cr2);
+void hvm_inject_event(const struct x86_event *event);
+
+static inline void hvm_inject_hw_exception(unsigned int vector, int errcode)
+{
+    struct x86_event event = {
+        .vector = vector,
+        .type = X86_EVENTTYPE_HW_EXCEPTION,
+        .error_code = errcode,
+    };
+
+    hvm_inject_event(&event);
+}
+
+static inline void hvm_inject_page_fault(int errcode, unsigned long cr2)
+{
+    struct x86_event event = {
+        .vector = TRAP_page_fault,
+        .type = X86_EVENTTYPE_HW_EXCEPTION,
+        .error_code = errcode,
+        .cr2 = cr2,
+    };
+
+    hvm_inject_event(&event);
+}
 
 static inline int hvm_event_pending(struct vcpu *v)
 {
@@ -437,18 +449,6 @@ static inline int hvm_event_pending(struct vcpu *v)
                        (1U << TRAP_alignment_check) | \
                        (1U << TRAP_machine_check))
 
-/*
- * x86 event types. This enumeration is valid for:
- *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
- *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
- */
-#define X86_EVENTTYPE_EXT_INTR         0 /* external interrupt */
-#define X86_EVENTTYPE_NMI              2 /* NMI */
-#define X86_EVENTTYPE_HW_EXCEPTION     3 /* hardware exception */
-#define X86_EVENTTYPE_SW_INTERRUPT     4 /* software interrupt (CD nn) */
-#define X86_EVENTTYPE_PRI_SW_EXCEPTION 5 /* ICEBP (F1) */
-#define X86_EVENTTYPE_SW_EXCEPTION     6 /* INT3 (CC), INTO (CE) */
-
 int hvm_event_needs_reinjection(uint8_t type, uint8_t vector);
 
 uint8_t hvm_combine_hw_exceptions(uint8_t vec1, uint8_t vec2);
@@ -542,10 +542,10 @@ int hvm_x2apic_msr_write(struct vcpu *v, unsigned int msr, uint64_t msr_content)
 /* inject vmexit into l1 guest. l1 guest will see a VMEXIT due to
  * 'trapnr' exception.
  */ 
-static inline int nhvm_vcpu_vmexit_trap(struct vcpu *v,
-                                        const struct hvm_trap *trap)
+static inline int nhvm_vcpu_vmexit_event(
+    struct vcpu *v, const struct x86_event *event)
 {
-    return hvm_funcs.nhvm_vcpu_vmexit_trap(v, trap);
+    return hvm_funcs.nhvm_vcpu_vmexit_event(v, event);
 }
 
 /* returns l1 guest's cr3 that points to the page table used to
@@ -557,11 +557,10 @@ static inline uint64_t nhvm_vcpu_p2m_base(struct vcpu *v)
 }
 
 /* returns true, when l1 guest intercepts the specified trap */
-static inline bool_t nhvm_vmcx_guest_intercepts_trap(struct vcpu *v,
-                                                     unsigned int trap,
-                                                     int errcode)
+static inline bool_t nhvm_vmcx_guest_intercepts_event(
+    struct vcpu *v, unsigned int vector, int errcode)
 {
-    return hvm_funcs.nhvm_vmcx_guest_intercepts_trap(v, trap, errcode);
+    return hvm_funcs.nhvm_vmcx_guest_intercepts_event(v, vector, errcode);
 }
 
 /* returns true when l1 guest wants to use hap to run l2 guest */
diff --git a/xen/include/asm-x86/hvm/svm/nestedsvm.h b/xen/include/asm-x86/hvm/svm/nestedsvm.h
index 0dbc5ec..4b36c25 100644
--- a/xen/include/asm-x86/hvm/svm/nestedsvm.h
+++ b/xen/include/asm-x86/hvm/svm/nestedsvm.h
@@ -110,10 +110,10 @@ void nsvm_vcpu_destroy(struct vcpu *v);
 int nsvm_vcpu_initialise(struct vcpu *v);
 int nsvm_vcpu_reset(struct vcpu *v);
 int nsvm_vcpu_vmrun(struct vcpu *v, struct cpu_user_regs *regs);
-int nsvm_vcpu_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap);
+int nsvm_vcpu_vmexit_event(struct vcpu *v, const struct x86_event *event);
 uint64_t nsvm_vcpu_hostcr3(struct vcpu *v);
-bool_t nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr,
-                                       int errcode);
+bool_t nsvm_vmcb_guest_intercepts_event(
+    struct vcpu *v, unsigned int vector, int errcode);
 bool_t nsvm_vmcb_hap_enabled(struct vcpu *v);
 enum hvm_intblk nsvm_intr_blocked(struct vcpu *v);
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 84d9406..d485536 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -206,7 +206,7 @@ struct hvm_vcpu {
     void *fpu_exception_callback_arg;
 
     /* Pending hw/sw interrupt (.vector = -1 means nothing pending). */
-    struct hvm_trap     inject_trap;
+    struct x86_event     inject_trap;
 
     struct viridian_vcpu viridian;
 };
diff --git a/xen/include/asm-x86/hvm/vmx/vvmx.h b/xen/include/asm-x86/hvm/vmx/vvmx.h
index aca8b4b..ead586e 100644
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h
@@ -112,8 +112,8 @@ void nvmx_vcpu_destroy(struct vcpu *v);
 int nvmx_vcpu_reset(struct vcpu *v);
 uint64_t nvmx_vcpu_eptp_base(struct vcpu *v);
 enum hvm_intblk nvmx_intr_blocked(struct vcpu *v);
-bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
-                                 int error_code);
+bool_t nvmx_intercepts_exception(
+    struct vcpu *v, unsigned int vector, int error_code);
 void nvmx_domain_relinquish_resources(struct domain *d);
 
 bool_t nvmx_ept_enabled(struct vcpu *v);
-- 
2.1.4



* [PATCH v3 05/24] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (3 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 04/24] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 06/24] x86/pv: Implement pv_inject_{event,page_fault,hw_exception}() Andrew Cooper
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

and move it to live with the other x86_event infrastructure in x86_emulate.h.
Switch it and x86_event.error_code to being signed, matching the rest of the
code.
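
In effect, the series ends up with definitions along these lines (a sketch
consistent with the commit message; the actual x86_emulate.h hunk is listed
in the diffstat but not quoted here):

    #define X86_EVENT_NO_EC (-1)    /* No error code. */

    struct x86_event {
        int16_t       vector;
        uint8_t       type;         /* X86_EVENTTYPE_* */
        uint8_t       insn_len;     /* Instruction length */
        int32_t       error_code;   /* X86_EVENT_NO_EC if n/a */
        unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
    };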

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2:
 * Rebase over corrections to the use of HVM_DELIVER_NO_ERROR_CODE
---
 xen/arch/x86/hvm/emulate.c             |  5 ++---
 xen/arch/x86/hvm/hvm.c                 |  6 +++---
 xen/arch/x86/hvm/nestedhvm.c           |  2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c       |  6 +++---
 xen/arch/x86/hvm/svm/svm.c             | 20 ++++++++++----------
 xen/arch/x86/hvm/vmx/intr.c            |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c             | 25 +++++++++++++------------
 xen/arch/x86/hvm/vmx/vvmx.c            |  2 +-
 xen/arch/x86/x86_emulate/x86_emulate.h |  3 ++-
 xen/include/asm-x86/hvm/support.h      |  2 --
 10 files changed, 36 insertions(+), 37 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index bb26d40..bc259ec 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1609,7 +1609,7 @@ static int hvmemul_inject_sw_interrupt(
 
     hvmemul_ctxt->exn_pending = 1;
     hvmemul_ctxt->trap.vector = vector;
-    hvmemul_ctxt->trap.error_code = HVM_DELIVER_NO_ERROR_CODE;
+    hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
     hvmemul_ctxt->trap.insn_len = insn_len;
 
     return X86EMUL_OKAY;
@@ -1696,8 +1696,7 @@ static int hvmemul_vmfunc(
 
     rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
     if ( rc != X86EMUL_OKAY )
-        hvmemul_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE,
-                                    ctxt);
+        hvmemul_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC, ctxt);
 
     return rc;
 }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7b434aa..b950842 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -502,7 +502,7 @@ void hvm_do_resume(struct vcpu *v)
                 kind = EMUL_KIND_SET_CONTEXT_INSN;
 
             hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
-                                     HVM_DELIVER_NO_ERROR_CODE);
+                                     X86_EVENT_NO_EC);
 
             v->arch.vm_event->emulate_flags = 0;
         }
@@ -3054,7 +3054,7 @@ void hvm_task_switch(
     }
 
     if ( (tss.trace & 1) && !exn_raised )
-        hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 
  out:
     hvm_unmap_entry(optss_desc);
@@ -4073,7 +4073,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
     switch ( hvm_emulate_one(&ctxt) )
     {
     case X86EMUL_UNHANDLEABLE:
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
diff --git a/xen/arch/x86/hvm/nestedhvm.c b/xen/arch/x86/hvm/nestedhvm.c
index caad525..c4671d8 100644
--- a/xen/arch/x86/hvm/nestedhvm.c
+++ b/xen/arch/x86/hvm/nestedhvm.c
@@ -17,7 +17,7 @@
  */
 
 #include <asm/msr.h>
-#include <asm/hvm/support.h>	/* for HVM_DELIVER_NO_ERROR_CODE */
+#include <asm/hvm/support.h>
 #include <asm/hvm/hvm.h>
 #include <asm/p2m.h>    /* for struct p2m_domain */
 #include <asm/hvm/nestedhvm.h>
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index b6b8526..8c9b073 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -756,7 +756,7 @@ nsvm_vcpu_vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     default:
         gdprintk(XENLOG_ERR,
             "nsvm_vcpu_vmentry failed, injecting #UD\n");
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         /* Must happen after hvm_inject_hw_exception or it doesn't work right. */
         nv->nv_vmswitch_in_progress = 0;
         return 1;
@@ -1581,7 +1581,7 @@ void svm_vmexit_do_stgi(struct cpu_user_regs *regs, struct vcpu *v)
     unsigned int inst_len;
 
     if ( !nestedhvm_enabled(v->domain) ) {
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
@@ -1601,7 +1601,7 @@ void svm_vmexit_do_clgi(struct cpu_user_regs *regs, struct vcpu *v)
     vintr_t intr;
 
     if ( !nestedhvm_enabled(v->domain) ) {
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index caab5ce..912d871 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -89,7 +89,7 @@ static DEFINE_SPINLOCK(osvw_lock);
 static void svm_crash_or_fault(struct vcpu *v)
 {
     if ( vmcb_get_cpl(v->arch.hvm_svm.vmcb) )
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
     else
         domain_crash(v->domain);
 }
@@ -116,7 +116,7 @@ void __update_guest_eip(struct cpu_user_regs *regs, unsigned int inst_len)
     curr->arch.hvm_svm.vmcb->interrupt_shadow = 0;
 
     if ( regs->eflags & X86_EFLAGS_TF )
-        hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 }
 
 static void svm_cpu_down(void)
@@ -1285,7 +1285,7 @@ static void svm_inject_event(const struct x86_event *event)
 
     default:
         eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
-        eventinj.fields.ev = (_event.error_code != HVM_DELIVER_NO_ERROR_CODE);
+        eventinj.fields.ev = (_event.error_code != X86_EVENT_NO_EC);
         eventinj.fields.errorcode = _event.error_code;
         break;
     }
@@ -1553,7 +1553,7 @@ static void svm_fpu_dirty_intercept(void)
     {
        /* Check if l1 guest must make FPU ready for the l2 guest */
        if ( v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS )
-           hvm_inject_hw_exception(TRAP_no_device, HVM_DELIVER_NO_ERROR_CODE);
+           hvm_inject_hw_exception(TRAP_no_device, X86_EVENT_NO_EC);
        else
            vmcb_set_cr0(n1vmcb, vmcb_get_cr0(n1vmcb) & ~X86_CR0_TS);
        return;
@@ -2022,7 +2022,7 @@ svm_vmexit_do_vmrun(struct cpu_user_regs *regs,
     if ( !nsvm_efer_svm_enabled(v) )
     {
         gdprintk(XENLOG_ERR, "VMRUN: nestedhvm disabled, injecting #UD\n");
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
@@ -2077,7 +2077,7 @@ svm_vmexit_do_vmload(struct vmcb_struct *vmcb,
     if ( !nsvm_efer_svm_enabled(v) ) 
     {
         gdprintk(XENLOG_ERR, "VMLOAD: nestedhvm disabled, injecting #UD\n");
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
@@ -2113,7 +2113,7 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
     if ( !nsvm_efer_svm_enabled(v) ) 
     {
         gdprintk(XENLOG_ERR, "VMSAVE: nestedhvm disabled, injecting #UD\n");
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
@@ -2416,7 +2416,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 
     case VMEXIT_EXCEPTION_DB:
         if ( !v->domain->debugger_attached )
-            hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
         else
             domain_pause_for_debugger();
         break;
@@ -2604,7 +2604,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 
     case VMEXIT_MONITOR:
     case VMEXIT_MWAIT:
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
 
     case VMEXIT_VMRUN:
@@ -2623,7 +2623,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         svm_vmexit_do_clgi(regs, v);
         break;
     case VMEXIT_SKINIT:
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
 
     case VMEXIT_XSETBV:
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index 8fca08c..639a705 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -302,7 +302,7 @@ void vmx_intr_assist(void)
     }
     else if ( intack.source == hvm_intsrc_mce )
     {
-        hvm_inject_hw_exception(TRAP_machine_check, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_machine_check, X86_EVENT_NO_EC);
     }
     else if ( cpu_has_vmx_virtual_intr_delivery &&
               intack.source != hvm_intsrc_pic &&
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index ed9b69b..31f08d2 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1646,7 +1646,8 @@ static void __vmx_inject_exception(int trap, int type, int error_code)
     intr_fields = INTR_INFO_VALID_MASK |
                   MASK_INSR(type, INTR_INFO_INTR_TYPE_MASK) |
                   MASK_INSR(trap, INTR_INFO_VECTOR_MASK);
-    if ( error_code != HVM_DELIVER_NO_ERROR_CODE ) {
+    if ( error_code != X86_EVENT_NO_EC )
+    {
         __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
         intr_fields |= INTR_INFO_DELIVER_CODE_MASK;
     }
@@ -1671,12 +1672,12 @@ void vmx_inject_extint(int trap, uint8_t source)
                INTR_INFO_VALID_MASK |
                MASK_INSR(X86_EVENTTYPE_EXT_INTR, INTR_INFO_INTR_TYPE_MASK) |
                MASK_INSR(trap, INTR_INFO_VECTOR_MASK),
-               HVM_DELIVER_NO_ERROR_CODE, source);
+               X86_EVENT_NO_EC, source);
             return;
         }
     }
     __vmx_inject_exception(trap, X86_EVENTTYPE_EXT_INTR,
-                           HVM_DELIVER_NO_ERROR_CODE);
+                           X86_EVENT_NO_EC);
 }
 
 void vmx_inject_nmi(void)
@@ -1691,12 +1692,12 @@ void vmx_inject_nmi(void)
                INTR_INFO_VALID_MASK |
                MASK_INSR(X86_EVENTTYPE_NMI, INTR_INFO_INTR_TYPE_MASK) |
                MASK_INSR(TRAP_nmi, INTR_INFO_VECTOR_MASK),
-               HVM_DELIVER_NO_ERROR_CODE, hvm_intsrc_nmi);
+               X86_EVENT_NO_EC, hvm_intsrc_nmi);
             return;
         }
     }
     __vmx_inject_exception(2, X86_EVENTTYPE_NMI,
-                           HVM_DELIVER_NO_ERROR_CODE);
+                           X86_EVENT_NO_EC);
 }
 
 /*
@@ -2111,7 +2112,7 @@ static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
     vmx_vmcs_exit(v);
 
     hvm_inject_hw_exception(TRAP_virtualisation,
-                            HVM_DELIVER_NO_ERROR_CODE);
+                            X86_EVENT_NO_EC);
 
  out:
     hvm_unmap_guest_frame(veinfo, 0);
@@ -2387,7 +2388,7 @@ void update_guest_eip(void)
     }
 
     if ( regs->eflags & X86_EFLAGS_TF )
-        hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 }
 
 static void vmx_fpu_dirty_intercept(void)
@@ -2915,7 +2916,7 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 
         if ( (rc < 0) ||
              (msr_content && (vmx_add_host_load_msr(msr) < 0)) )
-            hvm_inject_hw_exception(TRAP_machine_check, HVM_DELIVER_NO_ERROR_CODE);
+            hvm_inject_hw_exception(TRAP_machine_check, X86_EVENT_NO_EC);
         else
             __vmwrite(GUEST_IA32_DEBUGCTL, msr_content);
 
@@ -3213,7 +3214,7 @@ static void vmx_propagate_intr(unsigned long intr)
         event.error_code = tmp;
     }
     else
-        event.error_code = HVM_DELIVER_NO_ERROR_CODE;
+        event.error_code = X86_EVENT_NO_EC;
 
     if ( event.type >= X86_EVENTTYPE_SW_INTERRUPT )
     {
@@ -3770,7 +3771,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
     case EXIT_REASON_VMFUNC:
         if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
-            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+            hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         else
             update_guest_eip();
         break;
@@ -3784,7 +3785,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
          * as far as vmexit.
          */
         WARN_ON(exit_reason == EXIT_REASON_GETSEC);
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
 
     case EXIT_REASON_TPR_BELOW_THRESHOLD:
@@ -3909,7 +3910,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             vmx_get_segment_register(v, x86_seg_ss, &ss);
             if ( ss.attr.fields.dpl )
                 hvm_inject_hw_exception(TRAP_invalid_op,
-                                        HVM_DELIVER_NO_ERROR_CODE);
+                                        X86_EVENT_NO_EC);
             else
                 domain_crash(v->domain);
         }
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index b5837d4..efaf54c 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -380,7 +380,7 @@ static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
     
 invalid_op:
     gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: invalid_op\n");
-    hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+    hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
     return X86EMUL_EXCEPTION;
 
 gp_fault:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 54c532c..b0f0304 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -80,12 +80,13 @@ enum x86_event_type {
     X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
     X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
 };
+#define X86_EVENT_NO_EC (-1)        /* No error code. */
 
 struct x86_event {
     int16_t       vector;
     uint8_t       type;         /* X86_EVENTTYPE_* */
     uint8_t       insn_len;     /* Instruction length */
-    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
+    int32_t       error_code;   /* X86_EVENT_NO_EC if n/a */
     unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
 };
 
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 2984abc..9938450 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -25,8 +25,6 @@
 #include <xen/hvm/save.h>
 #include <asm/processor.h>
 
-#define HVM_DELIVER_NO_ERROR_CODE  (~0U)
-
 #ifndef NDEBUG
 #define DBG_LEVEL_0                 (1 << 0)
 #define DBG_LEVEL_1                 (1 << 1)
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 06/24] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}()
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (4 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 05/24] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-12-01 10:06   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union Andrew Cooper
                   ` (17 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

To help with event injection improvements for the PV uses of x86_emulate(),
implement an event injection API which matches its hvm counterpart.

This starts by taking do_guest_trap() and modifying its calling API to that of
pv_inject_event(), subsequently implementing the former in terms of the
latter.

The existing propagate_page_fault() is fairly similar to
pv_inject_page_fault(), although it has a return value.  Only a single caller
makes use of the return value, and non-NULL is only returned if the passed cr2
is non-canonical.  Opencode this single case in
handle_gdt_ldt_mapping_fault(), allowing propagate_page_fault() to become
void.

The call to reserved_bit_page_fault() in propagate_page_fault() was
conceptually wrong to start with.  Complaining about reserved bits should be
part of handling the pagefault itself, not part of injecting a pagefault into
the guest.  It is therefore moved ahead of the injection call in
do_page_fault() to compensate.

The remaining #PF specific bits are moved into pv_inject_event(), and
pv_inject_page_fault() is implemented as a static inline wrapper.

No practical change from a guest's point of view.
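
As an illustration, a typical call-site conversion (taken from the ptwr
emulation below) looks as follows; note that the argument order is swapped
to match hvm_inject_page_fault():

    /* Old: address first, error code second. */
    propagate_page_fault(addr + bytes - rc, 0); /* read fault */

    /* New: error code first, address second. */
    pv_inject_page_fault(0, addr + bytes - rc); /* Read fault. */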

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
---
CC: Jan Beulich <JBeulich@suse.com>

v3:
 * Reposition reserved_bit_page_fault() handling
v2:
 * New
---
 xen/arch/x86/mm.c               |   5 +-
 xen/arch/x86/mm/shadow/common.c |   4 +-
 xen/arch/x86/traps.c            | 147 ++++++++++++++++++++--------------------
 xen/include/asm-x86/domain.h    |  26 +++++++
 xen/include/asm-x86/mm.h        |   1 -
 5 files changed, 104 insertions(+), 79 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d365f59..b7c7122 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5136,7 +5136,7 @@ static int ptwr_emulated_read(
     if ( !__addr_ok(addr) ||
          (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
     {
-        propagate_page_fault(addr + bytes - rc, 0); /* read fault */
+        pv_inject_page_fault(0, addr + bytes - rc); /* Read fault. */
         return X86EMUL_EXCEPTION;
     }
 
@@ -5177,7 +5177,8 @@ static int ptwr_emulated_update(
         addr &= ~(sizeof(paddr_t)-1);
         if ( (rc = copy_from_user(&full, (void *)addr, sizeof(paddr_t))) != 0 )
         {
-            propagate_page_fault(addr+sizeof(paddr_t)-rc, 0); /* read fault */
+            pv_inject_page_fault(0, /* Read fault. */
+                                 addr + sizeof(paddr_t) - rc);
             return X86EMUL_EXCEPTION;
         }
         /* Mask out bits provided by caller. */
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index a4a3c4b..f07803b 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -323,7 +323,7 @@ pv_emulate_read(enum x86_segment seg,
 
     if ( (rc = copy_from_user(p_data, (void *)offset, bytes)) != 0 )
     {
-        propagate_page_fault(offset + bytes - rc, 0); /* read fault */
+        pv_inject_page_fault(0, offset + bytes - rc); /* Read fault. */
         return X86EMUL_EXCEPTION;
     }
 
@@ -1723,7 +1723,7 @@ static mfn_t emulate_gva_to_mfn(struct vcpu *v, unsigned long vaddr,
         if ( is_hvm_vcpu(v) )
             hvm_inject_page_fault(pfec, vaddr);
         else
-            propagate_page_fault(vaddr, pfec);
+            pv_inject_page_fault(pfec, vaddr);
         return _mfn(BAD_GVA_TO_GFN);
     }
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index b464211..195d590 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -625,37 +625,75 @@ void fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote)
           (regs->eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
 }
 
-static void do_guest_trap(unsigned int trapnr,
-                          const struct cpu_user_regs *regs)
+void pv_inject_event(const struct x86_event *event)
 {
     struct vcpu *v = current;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
     struct trap_bounce *tb;
     const struct trap_info *ti;
+    const uint8_t vector = event->vector;
     const bool use_error_code =
-        ((trapnr < 32) && (TRAP_HAVE_EC & (1u << trapnr)));
+        ((vector < 32) && (TRAP_HAVE_EC & (1u << vector)));
+    unsigned int error_code = event->error_code;
 
-    trace_pv_trap(trapnr, regs->eip, use_error_code, regs->error_code);
+    ASSERT(vector == event->vector); /* Confirm no truncation. */
+    if ( use_error_code )
+        ASSERT(error_code != X86_EVENT_NO_EC);
+    else
+        ASSERT(error_code == X86_EVENT_NO_EC);
 
     tb = &v->arch.pv_vcpu.trap_bounce;
-    ti = &v->arch.pv_vcpu.trap_ctxt[trapnr];
+    ti = &v->arch.pv_vcpu.trap_ctxt[vector];
 
     tb->flags = TBF_EXCEPTION;
     tb->cs    = ti->cs;
     tb->eip   = ti->address;
 
+    if ( vector == TRAP_page_fault )
+    {
+        v->arch.pv_vcpu.ctrlreg[2] = event->cr2;
+        arch_set_cr2(v, event->cr2);
+
+        /* Re-set error_code.user flag appropriately for the guest. */
+        error_code &= ~PFEC_user_mode;
+        if ( !guest_kernel_mode(v, regs) )
+            error_code |= PFEC_user_mode;
+
+        trace_pv_page_fault(event->cr2, error_code);
+    }
+    else
+        trace_pv_trap(vector, regs->eip, use_error_code, error_code);
+
     if ( use_error_code )
     {
         tb->flags |= TBF_EXCEPTION_ERRCODE;
-        tb->error_code = regs->error_code;
+        tb->error_code = error_code;
     }
 
     if ( TI_GET_IF(ti) )
         tb->flags |= TBF_INTERRUPT;
 
     if ( unlikely(null_trap_bounce(v, tb)) )
+    {
         gprintk(XENLOG_WARNING,
                 "Unhandled %s fault/trap [#%d, ec=%04x]\n",
-                trapstr(trapnr), trapnr, regs->error_code);
+                trapstr(vector), vector, error_code);
+
+        if ( vector == TRAP_page_fault )
+            show_page_walk(event->cr2);
+    }
+}
+
+static inline void do_guest_trap(unsigned int trapnr,
+                                 const struct cpu_user_regs *regs)
+{
+    const struct x86_event event = {
+        .vector = trapnr,
+        .error_code = (((trapnr < 32) && (TRAP_HAVE_EC & (1u << trapnr)))
+                       ? regs->error_code : X86_EVENT_NO_EC),
+    };
+
+    pv_inject_event(&event);
 }
 
 static void instruction_done(
@@ -1289,7 +1327,7 @@ static int emulate_invalid_rdtscp(struct cpu_user_regs *regs)
     eip = regs->eip;
     if ( (rc = copy_from_user(opcode, (char *)eip, sizeof(opcode))) != 0 )
     {
-        propagate_page_fault(eip + sizeof(opcode) - rc, 0);
+        pv_inject_page_fault(0, eip + sizeof(opcode) - rc);
         return EXCRET_fault_fixed;
     }
     if ( memcmp(opcode, "\xf\x1\xf9", sizeof(opcode)) )
@@ -1310,7 +1348,7 @@ static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
     /* Check for forced emulation signature: ud2 ; .ascii "xen". */
     if ( (rc = copy_from_user(sig, (char *)eip, sizeof(sig))) != 0 )
     {
-        propagate_page_fault(eip + sizeof(sig) - rc, 0);
+        pv_inject_page_fault(0, eip + sizeof(sig) - rc);
         return EXCRET_fault_fixed;
     }
     if ( memcmp(sig, "\xf\xbxen", sizeof(sig)) )
@@ -1320,7 +1358,7 @@ static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
     /* We only emulate CPUID. */
     if ( ( rc = copy_from_user(instr, (char *)eip, sizeof(instr))) != 0 )
     {
-        propagate_page_fault(eip + sizeof(instr) - rc, 0);
+        pv_inject_page_fault(0, eip + sizeof(instr) - rc);
         return EXCRET_fault_fixed;
     }
     if ( memcmp(instr, "\xf\xa2", sizeof(instr)) )
@@ -1487,53 +1525,6 @@ static void reserved_bit_page_fault(
     show_execution_state(regs);
 }
 
-struct trap_bounce *propagate_page_fault(unsigned long addr, u16 error_code)
-{
-    struct trap_info *ti;
-    struct vcpu *v = current;
-    struct trap_bounce *tb = &v->arch.pv_vcpu.trap_bounce;
-
-    if ( unlikely(!is_canonical_address(addr)) )
-    {
-        ti = &v->arch.pv_vcpu.trap_ctxt[TRAP_gp_fault];
-        tb->flags      = TBF_EXCEPTION | TBF_EXCEPTION_ERRCODE;
-        tb->error_code = 0;
-        tb->cs         = ti->cs;
-        tb->eip        = ti->address;
-        if ( TI_GET_IF(ti) )
-            tb->flags |= TBF_INTERRUPT;
-        return tb;
-    }
-
-    v->arch.pv_vcpu.ctrlreg[2] = addr;
-    arch_set_cr2(v, addr);
-
-    /* Re-set error_code.user flag appropriately for the guest. */
-    error_code &= ~PFEC_user_mode;
-    if ( !guest_kernel_mode(v, guest_cpu_user_regs()) )
-        error_code |= PFEC_user_mode;
-
-    trace_pv_page_fault(addr, error_code);
-
-    ti = &v->arch.pv_vcpu.trap_ctxt[TRAP_page_fault];
-    tb->flags = TBF_EXCEPTION | TBF_EXCEPTION_ERRCODE;
-    tb->error_code = error_code;
-    tb->cs         = ti->cs;
-    tb->eip        = ti->address;
-    if ( TI_GET_IF(ti) )
-        tb->flags |= TBF_INTERRUPT;
-    if ( unlikely(null_trap_bounce(v, tb)) )
-    {
-        printk("%pv: unhandled page fault (ec=%04X)\n", v, error_code);
-        show_page_walk(addr);
-    }
-
-    if ( unlikely(error_code & PFEC_reserved_bit) )
-        reserved_bit_page_fault(addr, guest_cpu_user_regs());
-
-    return NULL;
-}
-
 static int handle_gdt_ldt_mapping_fault(
     unsigned long offset, struct cpu_user_regs *regs)
 {
@@ -1565,17 +1556,22 @@ static int handle_gdt_ldt_mapping_fault(
         }
         else
         {
-            struct trap_bounce *tb;
-
             /* In hypervisor mode? Leave it to the #PF handler to fix up. */
             if ( !guest_mode(regs) )
                 return 0;
-            /* In guest mode? Propagate fault to guest, with adjusted %cr2. */
-            tb = propagate_page_fault(curr->arch.pv_vcpu.ldt_base + offset,
-                                      regs->error_code);
-            if ( tb )
-                tb->error_code = (offset & ~(X86_XEC_EXT | X86_XEC_IDT)) |
-                                 X86_XEC_TI;
+
+            /* Access would have become non-canonical? Pass #GP[sel] back. */
+            if ( unlikely(!is_canonical_address(
+                              curr->arch.pv_vcpu.ldt_base + offset)) )
+            {
+                uint16_t ec = (offset & ~(X86_XEC_EXT | X86_XEC_IDT)) | X86_XEC_TI;
+
+                pv_inject_hw_exception(TRAP_gp_fault, ec);
+            }
+            else
+                /* else pass the #PF back, with adjusted %cr2. */
+                pv_inject_page_fault(regs->error_code,
+                                     curr->arch.pv_vcpu.ldt_base + offset);
         }
     }
     else
@@ -1858,7 +1854,10 @@ void do_page_fault(struct cpu_user_regs *regs)
             return;
     }
 
-    propagate_page_fault(addr, regs->error_code);
+    if ( unlikely(regs->error_code & PFEC_reserved_bit) )
+        reserved_bit_page_fault(addr, regs);
+
+    pv_inject_page_fault(regs->error_code, addr);
 }
 
 /*
@@ -2788,7 +2787,7 @@ int pv_emul_cpuid(unsigned int *eax, unsigned int *ebx, unsigned int *ecx,
         goto fail;                                                          \
     if ( (_rc = copy_from_user(&_x, (type *)_ptr, sizeof(_x))) != 0 )       \
     {                                                                       \
-        propagate_page_fault(_ptr + sizeof(_x) - _rc, 0);                   \
+        pv_inject_page_fault(0, _ptr + sizeof(_x) - _rc);                   \
         goto skip;                                                          \
     }                                                                       \
     (eip) += sizeof(_x); _x; })
@@ -2953,8 +2952,8 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
             if ( (rc = copy_to_user((void *)data_base + rd_ad(edi),
                                     &data, op_bytes)) != 0 )
             {
-                propagate_page_fault(data_base + rd_ad(edi) + op_bytes - rc,
-                                     PFEC_write_access);
+                pv_inject_page_fault(PFEC_write_access,
+                                     data_base + rd_ad(edi) + op_bytes - rc);
                 return EXCRET_fault_fixed;
             }
             wr_ad(edi, regs->edi + (int)((regs->eflags & X86_EFLAGS_DF)
@@ -2971,8 +2970,8 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
             if ( (rc = copy_from_user(&data, (void *)data_base + rd_ad(esi),
                                       op_bytes)) != 0 )
             {
-                propagate_page_fault(data_base + rd_ad(esi)
-                                     + op_bytes - rc, 0);
+                pv_inject_page_fault(0, data_base + rd_ad(esi)
+                                     + op_bytes - rc);
                 return EXCRET_fault_fixed;
             }
             guest_io_write(port, op_bytes, data, currd);
@@ -3529,8 +3528,8 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
             rc = __put_user(item, stkp); \
             if ( rc ) \
             { \
-                propagate_page_fault((unsigned long)(stkp + 1) - rc, \
-                                     PFEC_write_access); \
+                pv_inject_page_fault(PFEC_write_access, \
+                                     (unsigned long)(stkp + 1) - rc); \
                 return; \
             } \
         } while ( 0 )
@@ -3597,7 +3596,7 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
                     rc = __get_user(parm, ustkp);
                     if ( rc )
                     {
-                        propagate_page_fault((unsigned long)(ustkp + 1) - rc, 0);
+                        pv_inject_page_fault(0, (unsigned long)(ustkp + 1) - rc);
                         return;
                     }
                     push(parm);
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index f6a40eb..39cc658 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -8,6 +8,7 @@
 #include <asm/hvm/domain.h>
 #include <asm/e820.h>
 #include <asm/mce.h>
+#include <asm/x86_emulate.h>
 #include <public/vcpu.h>
 #include <public/hvm/hvm_info_table.h>
 
@@ -632,6 +633,31 @@ static inline void free_vcpu_guest_context(struct vcpu_guest_context *vgc)
 struct vcpu_hvm_context;
 int arch_set_info_hvm_guest(struct vcpu *v, const struct vcpu_hvm_context *ctx);
 
+void pv_inject_event(const struct x86_event *event);
+
+static inline void pv_inject_hw_exception(unsigned int vector, int errcode)
+{
+    const struct x86_event event = {
+        .vector = vector,
+        .type = X86_EVENTTYPE_HW_EXCEPTION,
+        .error_code = errcode,
+    };
+
+    pv_inject_event(&event);
+}
+
+static inline void pv_inject_page_fault(int errcode, unsigned long cr2)
+{
+    const struct x86_event event = {
+        .vector = TRAP_page_fault,
+        .type = X86_EVENTTYPE_HW_EXCEPTION,
+        .error_code = errcode,
+        .cr2 = cr2,
+    };
+
+    pv_inject_event(&event);
+}
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 1b4d1c3..a15029c 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -539,7 +539,6 @@ int new_guest_cr3(unsigned long pfn);
 void make_cr3(struct vcpu *v, unsigned long mfn);
 void update_cr3(struct vcpu *v);
 int vcpu_destroy_pagetables(struct vcpu *);
-struct trap_bounce *propagate_page_fault(unsigned long addr, u16 error_code);
 void *do_page_walk(struct vcpu *v, unsigned long addr);
 
 int __sync_local_execstate(void);
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (5 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 06/24] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}() Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:58   ` Paul Durrant
  2016-12-01 10:08   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing Andrew Cooper
                   ` (16 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

Rename 'byte' to 'raw', as the field being a single byte wide is an
implementation detail.  Make the bitfields part of an anonymous struct to
remove the .flags qualifier.  Change the types of the flags to bool, to
match their use.

No functional change.
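
For reference, the result (final hunk below) is the usual
anonymous-struct-in-a-union idiom: the individual flags are addressable
without a .flags qualifier, while the raw view lets all of them be read or
cleared at once:

    union {
        uint8_t raw;
        struct {
            bool hlt:1;
            bool mov_ss:1;
            bool sti:1;
        };
    } retire;

    ctxt->retire.raw = 0;    /* Clear all retirement state. */
    ctxt->retire.hlt = 1;    /* Set an individual flag. */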

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>

v3:
 * New
---
 xen/arch/x86/hvm/emulate.c             |  6 +++---
 xen/arch/x86/x86_emulate/x86_emulate.c | 10 +++++-----
 xen/arch/x86/x86_emulate/x86_emulate.h | 10 +++++-----
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index bc259ec..fe62500 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1791,13 +1791,13 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     new_intr_shadow = hvmemul_ctxt->intr_shadow;
 
     /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
-    if ( hvmemul_ctxt->ctxt.retire.flags.mov_ss )
+    if ( hvmemul_ctxt->ctxt.retire.mov_ss )
         new_intr_shadow ^= HVM_INTR_SHADOW_MOV_SS;
     else
         new_intr_shadow &= ~HVM_INTR_SHADOW_MOV_SS;
 
     /* STI instruction toggles STI shadow, else we just clear it. */
-    if ( hvmemul_ctxt->ctxt.retire.flags.sti )
+    if ( hvmemul_ctxt->ctxt.retire.sti )
         new_intr_shadow ^= HVM_INTR_SHADOW_STI;
     else
         new_intr_shadow &= ~HVM_INTR_SHADOW_STI;
@@ -1808,7 +1808,7 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
         hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
     }
 
-    if ( hvmemul_ctxt->ctxt.retire.flags.hlt &&
+    if ( hvmemul_ctxt->ctxt.retire.hlt &&
          !hvm_local_events_need_delivery(curr) )
     {
         hvm_hlt(regs->eflags);
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 9c28ed4..416812e 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1905,7 +1905,7 @@ x86_decode(
     state->eip = ctxt->regs->eip;
 
     /* Initialise output state in x86_emulate_ctxt */
-    ctxt->retire.byte = 0;
+    ctxt->retire.raw = 0;
 
     op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
     if ( op_bytes == 8 )
@@ -2668,7 +2668,7 @@ x86_emulate(
 
     case 0x17: /* pop %%ss */
         src.val = x86_seg_ss;
-        ctxt->retire.flags.mov_ss = 1;
+        ctxt->retire.mov_ss = 1;
         goto pop_seg;
 
     case 0x1e: /* push %%ds */
@@ -2996,7 +2996,7 @@ x86_emulate(
         if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 )
             goto done;
         if ( seg == x86_seg_ss )
-            ctxt->retire.flags.mov_ss = 1;
+            ctxt->retire.mov_ss = 1;
         dst.type = OP_NONE;
         break;
 
@@ -4033,7 +4033,7 @@ x86_emulate(
 
     case 0xf4: /* hlt */
         generate_exception_if(!mode_ring0(), EXC_GP, 0);
-        ctxt->retire.flags.hlt = 1;
+        ctxt->retire.hlt = 1;
         break;
 
     case 0xf5: /* cmc */
@@ -4247,7 +4247,7 @@ x86_emulate(
         if ( !(_regs.eflags & EFLG_IF) )
         {
             _regs.eflags |= EFLG_IF;
-            ctxt->retire.flags.sti = 1;
+            ctxt->retire.sti = 1;
         }
         break;
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index b0f0304..ef39601 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -468,12 +468,12 @@ struct x86_emulate_ctxt
 
     /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
     union {
+        uint8_t raw;
         struct {
-            uint8_t hlt:1;          /* Instruction HLTed. */
-            uint8_t mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
-            uint8_t sti:1;          /* Instruction sets STI irq shadow. */
-        } flags;
-        uint8_t byte;
+            bool hlt:1;          /* Instruction HLTed. */
+            bool mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
+            bool sti:1;          /* Instruction sets STI irq shadow. */
+        };
     } retire;
 };
 
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (6 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-12-01 10:18   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour Andrew Cooper
                   ` (15 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

The mov_ss retire flag should only be set once load_seg() has returned
success.  In particular, it should not be set if an exception occurred when
trying to load %ss.

_hvm_emulate_one(), currently the sole user of mov_ss, only considers it in
the case that x86_emulate() returns X86EMUL_OKAY, so this bug isn't actually
exposed to guests.
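
In outline, the fix moves the flag from before the segment load to after it:

    if ( (rc = load_seg(src.val, dst.val, 0, NULL, ctxt, ops)) != 0 )
        goto done;                 /* Exception path: flag remains clear. */
    if ( src.val == x86_seg_ss )
        ctxt->retire.mov_ss = 1;   /* Success path only. */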

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v3:
 * New
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 416812e..bacdee6 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2656,6 +2656,8 @@ x86_emulate(
                               &dst.val, op_bytes, ctxt, ops)) != 0 ||
              (rc = load_seg(src.val, dst.val, 0, NULL, ctxt, ops)) != 0 )
             goto done;
+        if ( src.val == x86_seg_ss )
+            ctxt->retire.mov_ss = 1;
         break;
 
     case 0x0e: /* push %%cs */
@@ -2668,7 +2670,6 @@ x86_emulate(
 
     case 0x17: /* pop %%ss */
         src.val = x86_seg_ss;
-        ctxt->retire.mov_ss = 1;
         goto pop_seg;
 
     case 0x1e: /* push %%ds */
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (7 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-12-01 10:40   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 10/24] x86/emul: Always use fault semantics for software events Andrew Cooper
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

In debug builds, confirm that some properties of x86_emulate()'s behaviour
actually hold.  The first property, fixed in a previous change, is that retire
flags are only ever set in the X86EMUL_OKAY case.

While adjusting the userspace test harness to cope with ASSERT() in
x86_emulate.h, fix a build problem introduced in c/s 122dd9575c7 "x86emul:
in_longmode() should not ignore ->read_msr() errors" by providing an
implementation of likely()/unlikely().
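
In outline, this uses the standard macro-redirection pattern: the header
defines the wrapper and redirects all callers to it, while x86_emulate.c
undoes the redirection before defining the real function:

    /* x86_emulate.h, debug builds only. */
    static inline int x86_emulate_wrapper(
        struct x86_emulate_ctxt *ctxt,
        const struct x86_emulate_ops *ops)
    {
        int rc = x86_emulate(ctxt, ops);   /* The real function. */

        /* Retire flags should only be set on successful emulation. */
        if ( rc != X86EMUL_OKAY )
            ASSERT(ctxt->retire.raw == 0);

        return rc;
    }
    #define x86_emulate x86_emulate_wrapper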

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>

v3:
 * New
---
 tools/tests/x86_emulator/test_x86_emulator.c |  1 +
 tools/tests/x86_emulator/x86_emulate.c       |  3 +++
 xen/arch/x86/x86_emulate/x86_emulate.c       |  5 +++++
 xen/arch/x86/x86_emulate/x86_emulate.h       | 25 +++++++++++++++++++++++++
 4 files changed, 34 insertions(+)

diff --git a/tools/tests/x86_emulator/test_x86_emulator.c b/tools/tests/x86_emulator/test_x86_emulator.c
index f255fef..b54fd11 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -1,3 +1,4 @@
+#include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
diff --git a/tools/tests/x86_emulator/x86_emulate.c b/tools/tests/x86_emulator/x86_emulate.c
index c46b7fc..3272867 100644
--- a/tools/tests/x86_emulator/x86_emulate.c
+++ b/tools/tests/x86_emulator/x86_emulate.c
@@ -50,4 +50,7 @@ typedef bool bool_t;
 #define __init
 #define __maybe_unused __attribute__((__unused__))
 
+#define likely(x)     __builtin_expect(!!(x),1)
+#define unlikely(x)   __builtin_expect(!!(x),0)
+
 #include "x86_emulate/x86_emulate.c"
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index bacdee6..e4643a3 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2404,6 +2404,11 @@ x86_decode(
 #undef insn_fetch_bytes
 #undef insn_fetch_type
 
+/* Undo DEBUG wrapper. */
+#ifdef x86_emulate
+#undef x86_emulate
+#endif
+
 int
 x86_emulate(
     struct x86_emulate_ctxt *ctxt,
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index ef39601..f84ced2 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -23,6 +23,10 @@
 #ifndef __X86_EMULATE_H__
 #define __X86_EMULATE_H__
 
+#ifndef ASSERT
+#define ASSERT assert
+#endif
+
 #define MAX_INST_LEN 15
 
 struct x86_emulate_ctxt;
@@ -554,6 +558,27 @@ x86_emulate(
     const struct x86_emulate_ops *ops);
 
 /*
+ * In debug builds, wrap x86_emulate() with some assertions about its expected
+ * behaviour.
+ */
+#ifndef NDEBUG
+static inline int x86_emulate_wrapper(
+    struct x86_emulate_ctxt *ctxt,
+    const struct x86_emulate_ops *ops)
+{
+    int rc = x86_emulate(ctxt, ops);
+
+    /* Retire flags should only be set for successful instruction emulation. */
+    if ( rc != X86EMUL_OKAY )
+        ASSERT(ctxt->retire.raw == 0);
+
+    return rc;
+}
+
+#define x86_emulate x86_emulate_wrapper
+#endif
+
+/*
  * Given the 'reg' portion of a ModRM byte, and a register block, return a
  * pointer into the block that addresses the relevant register.
  * @highbyte_regs specifies whether to decode AH,CH,DH,BH.
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 10/24] x86/emul: Always use fault semantics for software events
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (8 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 17:55   ` Boris Ostrovsky
  2016-12-01 10:53   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag Andrew Cooper
                   ` (13 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Boris Ostrovsky, Tim Deegan,
	Suravee Suthikulpanit, Jan Beulich

The common case already uses fault semantics out of x86_emulate(), as that is
how VT-x/SVM expect to inject the event (given suitable hardware support).

However, x86_emulate() returning X86EMUL_EXCEPTION while also having completed
a register writeback is problematic for callers.

Switch the logic to always use fault semantics, and leave svm_inject_event()
to fix up %eip if necessary.
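
For clarity: with fault semantics, the reported %eip points at the event
generating instruction, whereas trap semantics report the instruction
following it.  On hardware lacking NextRIP support the fixup is therefore
(in outline, matching the hunks below):

    if ( cpu_has_svm_nrips )
        /* Hardware can inject with trap semantics directly. */
        vmcb->nextrip = regs->eip + _event.insn_len;
    else
        /* Emulate trap semantics by moving %eip forwards by hand. */
        regs->eip += _event.insn_len;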

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

v3:
 * New
---
 xen/arch/x86/hvm/svm/svm.c             | 44 ++++++++++++++++++++--------------
 xen/arch/x86/x86_emulate/x86_emulate.c |  2 --
 xen/arch/x86/x86_emulate/x86_emulate.h |  8 ++++++-
 3 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 912d871..65eeab7 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1209,7 +1209,7 @@ static void svm_inject_event(const struct x86_event *event)
     struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
     eventinj_t eventinj = vmcb->eventinj;
     struct x86_event _event = *event;
-    const struct cpu_user_regs *regs = guest_cpu_user_regs();
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
 
     switch ( _event.vector )
     {
@@ -1242,27 +1242,38 @@ static void svm_inject_event(const struct x86_event *event)
     eventinj.fields.v = 1;
     eventinj.fields.vector = _event.vector;
 
-    /* Refer to AMD Vol 2: System Programming, 15.20 Event Injection. */
+    /*
+     * Refer to AMD Vol 2: System Programming, 15.20 Event Injection.
+     *
+     * On hardware lacking NextRIP support, and all hardware in the case of
+     * icebp, software events with trap semantics need emulating, so %eip in
+     * the trap frame points after the instruction.
+     *
+     * The x86 emulator (if requested by the x86_swint_emulate_* choice) will
+     * have performed checks such as presence/dpl/etc and believes that the
+     * event injection will succeed without faulting.
+     *
+     * The x86 emulator will always provide fault semantics for software
+     * events, with _event.insn_len set appropriately.  If the injection
+     * requires emulation, move %eip forwards at this point.
+     */
     switch ( _event.type )
     {
     case X86_EVENTTYPE_SW_INTERRUPT: /* int $n */
-        /*
-         * Software interrupts (type 4) cannot be properly injected if the
-         * processor doesn't support NextRIP.  Without NextRIP, the emulator
-         * will have performed DPL and presence checks for us, and will have
-         * moved eip forward if appropriate.
-         */
         if ( cpu_has_svm_nrips )
             vmcb->nextrip = regs->eip + _event.insn_len;
+        else
+            regs->eip += _event.insn_len;
         eventinj.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
         break;
 
     case X86_EVENTTYPE_PRI_SW_EXCEPTION: /* icebp */
         /*
-         * icebp's injection must always be emulated.  Software injection help
-         * in x86_emulate has moved eip forward, but NextRIP (if used) still
-         * needs setting or execution will resume from 0.
+         * icebp's injection must always be emulated, as hardware does not
+         * special case HW_EXCEPTION with vector 1 (#DB) as having trap
+         * semantics.
          */
+        regs->eip += _event.insn_len;
         if ( cpu_has_svm_nrips )
             vmcb->nextrip = regs->eip;
         eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
@@ -1270,16 +1281,13 @@ static void svm_inject_event(const struct x86_event *event)
 
     case X86_EVENTTYPE_SW_EXCEPTION: /* int3, into */
         /*
-         * The AMD manual states that .type=3 (HW exception), .vector=3 or 4,
-         * will perform DPL checks.  Experimentally, DPL and presence checks
-         * are indeed performed, even without NextRIP support.
-         *
-         * However without NextRIP support, the event injection still needs
-         * fully emulating to get the correct eip in the trap frame, yet get
-         * the correct faulting eip should a fault occur.
+         * Hardware special cases HW_EXCEPTION with vectors 3 and 4 as having
+         * trap semantics, and will perform DPL checks.
          */
         if ( cpu_has_svm_nrips )
             vmcb->nextrip = regs->eip + _event.insn_len;
+        else
+            regs->eip += _event.insn_len;
         eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
         break;
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index e4643a3..8a1f1f5 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1694,8 +1694,6 @@ static int inject_swint(enum x86_swint_type type,
                     goto raise_exn;
             }
         }
-
-        ctxt->regs->eip += insn_len;
     }
 
     rc = ops->inject_sw_interrupt(type, vector, insn_len, ctxt);
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index f84ced2..fc28976 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -64,7 +64,13 @@ enum x86_swint_type {
     x86_swint_int,   /* 0xcd $n */
 };
 
-/* How much help is required with software event injection? */
+/*
+ * How much help is required with software event injection?
+ *
+ * All software events return from x86_emulate() with X86EMUL_EXCEPTION and
+ * fault-like semantics.  This just controls whether the emulator performs
+ * presence/dpl/etc checks and possibly raises exceptions instead.
+ */
 enum x86_swint_emulation {
     x86_swint_emulate_none, /* Hardware supports all software injection properly */
     x86_swint_emulate_icebp,/* Help needed with `icebp` (0xf1) */
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (9 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 10/24] x86/emul: Always use fault semantics for software events Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 14:28   ` Paul Durrant
  2016-12-01 11:16   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 12/24] x86/emul: Remove opencoded exception generation Andrew Cooper
                   ` (12 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Paul Durrant, Tim Deegan, Jan Beulich

The behaviour of singlestep is to raise #DB after the instruction has been
completed, but implementing it with inject_hw_exception() causes x86_emulate()
to return X86EMUL_EXCEPTION, despite successfully completing execution of the
instruction, including register writeback.

Instead, use a retire flag to indicate singlestep, which causes x86_emulate()
to return X86EMUL_OKAY.

Update all callers of x86_emulate() to use the new retire flag.  This fixes
the behaviour of singlestep for shadow pagetable updates and mmcfg/mmio_ro
intercepts, which previously discarded the exception.

With this change, all uses of X86EMUL_EXCEPTION from x86_emulate() are
believed to have strictly fault semantics.
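
Callers now follow this pattern (in outline; PV callers use
pv_inject_hw_exception() instead):

    rc = x86_emulate(&ctxt, ops);

    /* Retire flags are only valid in the X86EMUL_OKAY case. */
    if ( rc == X86EMUL_OKAY && ctxt.retire.singlestep )
        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);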

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
CC: Paul Durrant <paul.durrant@citrix.com>

v3:
 * New
---
 xen/arch/x86/hvm/emulate.c             |  3 +++
 xen/arch/x86/mm.c                      | 11 ++++++++++-
 xen/arch/x86/mm/shadow/multi.c         | 21 ++++++++++++++++++++-
 xen/arch/x86/x86_emulate/x86_emulate.c |  9 ++++-----
 xen/arch/x86/x86_emulate/x86_emulate.h |  6 ++++++
 5 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index fe62500..91c79fa 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1788,6 +1788,9 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     if ( rc != X86EMUL_OKAY )
         return rc;
 
+    if ( hvmemul_ctxt->ctxt.retire.singlestep )
+        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
     new_intr_shadow = hvmemul_ctxt->intr_shadow;
 
     /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index b7c7122..231c7bf 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5382,6 +5382,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
     if ( rc == X86EMUL_UNHANDLEABLE )
         goto bail;
 
+    if ( ptwr_ctxt.ctxt.retire.singlestep )
+        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
     perfc_incr(ptwr_emulations);
     return EXCRET_fault_fixed;
 
@@ -5503,7 +5506,13 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     else
         rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
 
-    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
+    if ( rc == X86EMUL_UNHANDLEABLE )
+        return 0;
+
+    if ( ctxt.retire.singlestep )
+        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
+    return EXCRET_fault_fixed;
 }
 
 void *alloc_xen_pagetable(void)
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 9ee48a8..ddfb815 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3422,6 +3422,16 @@ static int sh_page_fault(struct vcpu *v,
         v->arch.paging.last_write_emul_ok = 0;
 #endif
 
+    if ( emul_ctxt.ctxt.retire.singlestep )
+    {
+        if ( is_hvm_vcpu(v) )
+            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+        else
+            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+
+        goto emulate_done;
+    }
+
 #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
     if ( r == X86EMUL_OKAY ) {
         int i, emulation_count=0;
@@ -3433,7 +3443,7 @@ static int sh_page_fault(struct vcpu *v,
             shadow_continue_emulation(&emul_ctxt, regs);
             v->arch.paging.last_write_was_pt = 0;
             r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
-            if ( r == X86EMUL_OKAY )
+            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
             {
                 emulation_count++;
                 if ( v->arch.paging.last_write_was_pt )
@@ -3449,6 +3459,15 @@ static int sh_page_fault(struct vcpu *v,
             {
                 perfc_incr(shadow_em_ex_fail);
                 TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED);
+
+                if ( emul_ctxt.ctxt.retire.singlestep )
+                {
+                    if ( is_hvm_vcpu(v) )
+                        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+                    else
+                        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+                }
+
                 break; /* Don't emulate again if we failed! */
             }
         }
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 8a1f1f5..0af532e 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2417,7 +2417,6 @@ x86_emulate(
     struct x86_emulate_state state;
     int rc;
     uint8_t b, d;
-    bool tf = ctxt->regs->eflags & EFLG_TF;
     struct operand src = { .reg = PTR_POISON };
     struct operand dst = { .reg = PTR_POISON };
     enum x86_swint_type swint_type;
@@ -5415,11 +5414,11 @@ x86_emulate(
     if ( !mode_64bit() )
         _regs.eip = (uint32_t)_regs.eip;
 
-    *ctxt->regs = _regs;
+    /* Was singlestepping active at the start of this instruction? */
+    if ( (rc == X86EMUL_OKAY) && (ctxt->regs->eflags & EFLG_TF) )
+        ctxt->retire.singlestep = true;
 
-    /* Inject #DB if single-step tracing was enabled at instruction start. */
-    if ( tf && (rc == X86EMUL_OKAY) && ops->inject_hw_exception )
-        rc = ops->inject_hw_exception(EXC_DB, -1, ctxt) ? : X86EMUL_EXCEPTION;
+    *ctxt->regs = _regs;
 
  done:
     _put_fpu();
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index fc28976..da8924b 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -483,6 +483,7 @@ struct x86_emulate_ctxt
             bool hlt:1;          /* Instruction HLTed. */
             bool mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
             bool sti:1;          /* Instruction sets STI irq shadow. */
+            bool singlestep:1;   /* Singlestepping was active. */
         };
     } retire;
 };
@@ -572,12 +573,17 @@ static inline int x86_emulate_wrapper(
     struct x86_emulate_ctxt *ctxt,
     const struct x86_emulate_ops *ops)
 {
+    unsigned long orig_eip = ctxt->regs->eip;
     int rc = x86_emulate(ctxt, ops);
 
     /* Retire flags should only be set for successful instruction emulation. */
     if ( rc != X86EMUL_OKAY )
         ASSERT(ctxt->retire.raw == 0);
 
+    /* All cases returning X86EMUL_EXCEPTION should have fault semantics. */
+    if ( rc == X86EMUL_EXCEPTION )
+        ASSERT(ctxt->regs->eip == orig_eip);
+
     return rc;
 }
 
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 12/24] x86/emul: Remove opencoded exception generation
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (10 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 13/24] x86/emul: Rework emulator event injection Andrew Cooper
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Introduce generate_exception() for unconditional exception generation, and
replace existing uses.  Both generate_exception() and generate_exception_if()
are updated to make their error code parameters optional, which removes the
use of the -1 sentinel.

The ioport_access_check() check loses the presence check for %tr, as the x86
architecture has no concept of a non-usable task register.

No functional change.
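
The optional error code works with a variadic macro parameter and a helper
providing the default; in outline:

    static inline int mkec(uint8_t e, int32_t ec, ...)
    {
        return (e < 32 && ((1u << e) & EXC_HAS_EC)) ? ec : X86_EVENT_NO_EC;
    }

    #define generate_exception(e, ec...) generate_exception_if(true, e, ##ec)

    generate_exception(EXC_UD);    /* mkec(EXC_UD, 0)    -> X86_EVENT_NO_EC */
    generate_exception(EXC_GP, 0); /* mkec(EXC_GP, 0, 0) -> 0               */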

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
v3:
 * Rebase over singlestepping changes
v2:
 * Brackets around &
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 193 +++++++++++++++++----------------
 1 file changed, 99 insertions(+), 94 deletions(-)

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 0af532e..6adfdbe 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -457,14 +457,20 @@ typedef union {
 #define EXC_BR  5
 #define EXC_UD  6
 #define EXC_NM  7
+#define EXC_DF  8
 #define EXC_TS 10
 #define EXC_NP 11
 #define EXC_SS 12
 #define EXC_GP 13
 #define EXC_PF 14
 #define EXC_MF 16
+#define EXC_AC 17
 #define EXC_XM 19
 
+#define EXC_HAS_EC                                                      \
+    ((1u << EXC_DF) | (1u << EXC_TS) | (1u << EXC_NP) |                 \
+     (1u << EXC_SS) | (1u << EXC_GP) | (1u << EXC_PF) | (1u << EXC_AC))
+
 /* Segment selector error code bits. */
 #define ECODE_EXT (1 << 0)
 #define ECODE_IDT (1 << 1)
@@ -667,14 +673,22 @@ do {                                                    \
     if ( rc ) goto done;                                \
 } while (0)
 
-#define generate_exception_if(p, e, ec)                                   \
+static inline int mkec(uint8_t e, int32_t ec, ...)
+{
+    return (e < 32 && ((1u << e) & EXC_HAS_EC)) ? ec : X86_EVENT_NO_EC;
+}
+
+#define generate_exception_if(p, e, ec...)                                \
 ({  if ( (p) ) {                                                          \
         fail_if(ops->inject_hw_exception == NULL);                        \
-        rc = ops->inject_hw_exception(e, ec, ctxt) ? : X86EMUL_EXCEPTION; \
+        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \
+            ? : X86EMUL_EXCEPTION;                                        \
         goto done;                                                        \
     }                                                                     \
 })
 
+#define generate_exception(e, ec...) generate_exception_if(true, e, ##ec)
+
 /*
  * Given byte has even parity (even number of 1s)? SDM Vol. 1 Sec. 3.4.3.1,
  * "Status Flags": EFLAGS.PF reflects parity of least-sig. byte of result only.
@@ -785,7 +799,7 @@ static int _get_fpu(
                 return rc;
             generate_exception_if(!(cr4 & ((type == X86EMUL_FPU_xmm)
                                            ? CR4_OSFXSR : CR4_OSXSAVE)),
-                                  EXC_UD, -1);
+                                  EXC_UD);
         }
 
         rc = ops->read_cr(0, &cr0, ctxt);
@@ -798,13 +812,13 @@ static int _get_fpu(
         }
         if ( cr0 & CR0_EM )
         {
-            generate_exception_if(type == X86EMUL_FPU_fpu, EXC_NM, -1);
-            generate_exception_if(type == X86EMUL_FPU_mmx, EXC_UD, -1);
-            generate_exception_if(type == X86EMUL_FPU_xmm, EXC_UD, -1);
+            generate_exception_if(type == X86EMUL_FPU_fpu, EXC_NM);
+            generate_exception_if(type == X86EMUL_FPU_mmx, EXC_UD);
+            generate_exception_if(type == X86EMUL_FPU_xmm, EXC_UD);
         }
         generate_exception_if((cr0 & CR0_TS) &&
                               (type != X86EMUL_FPU_wait || (cr0 & CR0_MP)),
-                              EXC_NM, -1);
+                              EXC_NM);
     }
 
  done:
@@ -832,7 +846,7 @@ do {                                                            \
             (_fic)->exn_raised = EXC_UD;                        \
     }                                                           \
     generate_exception_if((_fic)->exn_raised >= 0,              \
-                          (_fic)->exn_raised, -1);              \
+                          (_fic)->exn_raised);                  \
 } while (0)
 
 #define emulate_fpu_insn(_op)                           \
@@ -1167,11 +1181,9 @@ static int ioport_access_check(
     if ( (rc = ops->read_segment(x86_seg_tr, &tr, ctxt)) != 0 )
         return rc;
 
-    /* Ensure that the TSS is valid and has an io-bitmap-offset field. */
-    if ( !tr.attr.fields.p ||
-         ((tr.attr.fields.type & 0xd) != 0x9) ||
-         (tr.limit < 0x67) )
-        goto raise_exception;
+    /* Ensure the TSS has an io-bitmap-offset field. */
+    generate_exception_if(tr.attr.fields.type != 0xb ||
+                          tr.limit < 0x67, EXC_GP, 0);
 
     if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
                           &iobmp, 2, ctxt, ops)) )
@@ -1179,21 +1191,16 @@ static int ioport_access_check(
 
     /* Ensure TSS includes two bytes including byte containing first port. */
     iobmp += first_port / 8;
-    if ( tr.limit <= iobmp )
-        goto raise_exception;
+    generate_exception_if(tr.limit <= iobmp, EXC_GP, 0);
 
     if ( (rc = read_ulong(x86_seg_none, tr.base + iobmp,
                           &iobmp, 2, ctxt, ops)) )
         return rc;
-    if ( (iobmp & (((1<<bytes)-1) << (first_port&7))) != 0 )
-        goto raise_exception;
+    generate_exception_if(iobmp & (((1 << bytes) - 1) << (first_port & 7)),
+                          EXC_GP, 0);
 
  done:
     return rc;
-
- raise_exception:
-    fail_if(ops->inject_hw_exception == NULL);
-    return ops->inject_hw_exception(EXC_GP, 0, ctxt) ? : X86EMUL_EXCEPTION;
 }
 
 static bool_t
@@ -1262,7 +1269,7 @@ static bool_t vcpu_has(
 #define vcpu_has_rtm()   vcpu_has(0x00000007, EBX, 11, ctxt, ops)
 
 #define vcpu_must_have(leaf, reg, bit) \
-    generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD, -1)
+    generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD)
 #define vcpu_must_have_fpu()  vcpu_must_have(0x00000001, EDX, 0)
 #define vcpu_must_have_cmov() vcpu_must_have(0x00000001, EDX, 15)
 #define vcpu_must_have_mmx()  vcpu_must_have(0x00000001, EDX, 23)
@@ -1282,7 +1289,7 @@ static bool_t vcpu_has(
  * the actual operation.
  */
 #define host_and_vcpu_must_have(feat) ({ \
-    generate_exception_if(!cpu_has_##feat, EXC_UD, -1); \
+    generate_exception_if(!cpu_has_##feat, EXC_UD); \
     vcpu_must_have_##feat(); \
 })
 #else
@@ -1485,11 +1492,9 @@ protmode_load_seg(
     return X86EMUL_OKAY;
 
  raise_exn:
-    if ( ops->inject_hw_exception == NULL )
-        return X86EMUL_UNHANDLEABLE;
-    if ( (rc = ops->inject_hw_exception(fault_type, sel & 0xfffc, ctxt)) )
-        return rc;
-    return X86EMUL_EXCEPTION;
+    generate_exception(fault_type, sel & 0xfffc);
+ done:
+    return rc;
 }
 
 static int
@@ -1702,7 +1707,7 @@ static int inject_swint(enum x86_swint_type type,
     return rc;
 
  raise_exn:
-    return ops->inject_hw_exception(fault_type, error_code, ctxt);
+    generate_exception(fault_type, error_code);
 }
 
 int x86emul_unhandleable_rw(
@@ -1793,7 +1798,7 @@ x86_decode_onebyte(
 
     case 0x9a: /* call (far, absolute) */
     case 0xea: /* jmp (far, absolute) */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
 
         imm1 = insn_fetch_bytes(op_bytes);
         imm2 = insn_fetch_type(uint16_t);
@@ -2021,7 +2026,7 @@ x86_decode(
                 /* fall through */
             case 8:
                 /* VEX / XOP / EVEX */
-                generate_exception_if(rex_prefix || vex.pfx, EXC_UD, -1);
+                generate_exception_if(rex_prefix || vex.pfx, EXC_UD);
 
                 vex.raw[0] = modrm;
                 if ( b == 0xc5 )
@@ -2515,12 +2520,12 @@ x86_emulate(
             (ext != ext_0f ||
              (((b < 0x20) || (b > 0x23)) && /* MOV CRn/DRn */
               (b != 0xc7))),                /* CMPXCHG{8,16}B */
-            EXC_UD, -1);
+            EXC_UD);
         dst.type = OP_NONE;
         break;
 
     case DstReg:
-        generate_exception_if(lock_prefix, EXC_UD, -1);
+        generate_exception_if(lock_prefix, EXC_UD);
         dst.type = OP_REG;
         if ( d & ByteOp )
         {
@@ -2576,7 +2581,7 @@ x86_emulate(
         dst = ea;
         if ( dst.type == OP_REG )
         {
-            generate_exception_if(lock_prefix, EXC_UD, -1);
+            generate_exception_if(lock_prefix, EXC_UD);
             switch ( dst.bytes )
             {
             case 1: dst.val = *(uint8_t  *)dst.reg; break;
@@ -2593,7 +2598,7 @@ x86_emulate(
             dst.orig_val = dst.val;
         }
         else /* Lock prefix is allowed only on RMW instructions. */
-            generate_exception_if(lock_prefix, EXC_UD, -1);
+            generate_exception_if(lock_prefix, EXC_UD);
         break;
     }
 
@@ -2631,7 +2636,7 @@ x86_emulate(
         break;
 
     case 0x38 ... 0x3d: cmp: /* cmp */
-        generate_exception_if(lock_prefix, EXC_UD, -1);
+        generate_exception_if(lock_prefix, EXC_UD);
         emulate_2op_SrcV("cmp", src, dst, _regs.eflags);
         dst.type = OP_NONE;
         break;
@@ -2639,7 +2644,7 @@ x86_emulate(
     case 0x06: /* push %%es */
         src.val = x86_seg_es;
     push_seg:
-        generate_exception_if(mode_64bit() && !ext, EXC_UD, -1);
+        generate_exception_if(mode_64bit() && !ext, EXC_UD);
         fail_if(ops->read_segment == NULL);
         if ( (rc = ops->read_segment(src.val, &sreg, ctxt)) != 0 )
             goto done;
@@ -2649,7 +2654,7 @@ x86_emulate(
     case 0x07: /* pop %%es */
         src.val = x86_seg_es;
     pop_seg:
-        generate_exception_if(mode_64bit() && !ext, EXC_UD, -1);
+        generate_exception_if(mode_64bit() && !ext, EXC_UD);
         fail_if(ops->write_segment == NULL);
         /* 64-bit mode: POP defaults to a 64-bit operand. */
         if ( mode_64bit() && (op_bytes == 4) )
@@ -2687,7 +2692,7 @@ x86_emulate(
         uint8_t al = _regs.eax;
         unsigned long eflags = _regs.eflags;
 
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         _regs.eflags &= ~(EFLG_CF|EFLG_AF|EFLG_SF|EFLG_ZF|EFLG_PF);
         if ( ((al & 0x0f) > 9) || (eflags & EFLG_AF) )
         {
@@ -2709,7 +2714,7 @@ x86_emulate(
 
     case 0x37: /* aaa */
     case 0x3f: /* aas */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         _regs.eflags &= ~EFLG_CF;
         if ( ((uint8_t)_regs.eax > 9) || (_regs.eflags & EFLG_AF) )
         {
@@ -2753,7 +2758,7 @@ x86_emulate(
         unsigned long regs[] = {
             _regs.eax, _regs.ecx, _regs.edx, _regs.ebx,
             _regs.esp, _regs.ebp, _regs.esi, _regs.edi };
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         for ( i = 0; i < 8; i++ )
             if ( (rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes),
                                   &regs[i], op_bytes, ctxt)) != 0 )
@@ -2768,7 +2773,7 @@ x86_emulate(
             (unsigned long *)&_regs.ebp, (unsigned long *)&dummy_esp,
             (unsigned long *)&_regs.ebx, (unsigned long *)&_regs.edx,
             (unsigned long *)&_regs.ecx, (unsigned long *)&_regs.eax };
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         for ( i = 0; i < 8; i++ )
         {
             if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes),
@@ -2786,14 +2791,14 @@ x86_emulate(
         unsigned long src_val2;
         int lb, ub, idx;
         generate_exception_if(mode_64bit() || (src.type != OP_MEM),
-                              EXC_UD, -1);
+                              EXC_UD);
         if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
                               &src_val2, op_bytes, ctxt, ops)) )
             goto done;
         ub  = (op_bytes == 2) ? (int16_t)src_val2 : (int32_t)src_val2;
         lb  = (op_bytes == 2) ? (int16_t)src.val  : (int32_t)src.val;
         idx = (op_bytes == 2) ? (int16_t)dst.val  : (int32_t)dst.val;
-        generate_exception_if((idx < lb) || (idx > ub), EXC_BR, -1);
+        generate_exception_if((idx < lb) || (idx > ub), EXC_BR);
         dst.type = OP_NONE;
         break;
     }
@@ -2831,7 +2836,7 @@ x86_emulate(
                 _regs.eflags &= ~EFLG_ZF;
                 dst.type = OP_NONE;
             }
-            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
         }
         break;
 
@@ -2922,7 +2927,7 @@ x86_emulate(
         break;
 
     case 0x82: /* Grp1 (x86/32 only) */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
     case 0x80: case 0x81: case 0x83: /* Grp1 */
         switch ( modrm_reg & 7 )
         {
@@ -2973,7 +2978,7 @@ x86_emulate(
             dst.type = OP_NONE;
             break;
         }
-        generate_exception_if((modrm_reg & 7) != 0, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) != 0, EXC_UD);
     case 0x88 ... 0x8b: /* mov */
     case 0xa0 ... 0xa1: /* mov mem.offs,{%al,%ax,%eax,%rax} */
     case 0xa2 ... 0xa3: /* mov {%al,%ax,%eax,%rax},mem.offs */
@@ -2982,7 +2987,7 @@ x86_emulate(
 
     case 0x8c: /* mov Sreg,r/m */
         seg = modrm_reg & 7; /* REX.R is ignored. */
-        generate_exception_if(!is_x86_user_segment(seg), EXC_UD, -1);
+        generate_exception_if(!is_x86_user_segment(seg), EXC_UD);
     store_selector:
         fail_if(ops->read_segment == NULL);
         if ( (rc = ops->read_segment(seg, &sreg, ctxt)) != 0 )
@@ -2995,7 +3000,7 @@ x86_emulate(
     case 0x8e: /* mov r/m,Sreg */
         seg = modrm_reg & 7; /* REX.R is ignored. */
         generate_exception_if(!is_x86_user_segment(seg) ||
-                              seg == x86_seg_cs, EXC_UD, -1);
+                              seg == x86_seg_cs, EXC_UD);
         if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 )
             goto done;
         if ( seg == x86_seg_ss )
@@ -3004,12 +3009,12 @@ x86_emulate(
         break;
 
     case 0x8d: /* lea */
-        generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+        generate_exception_if(ea.type != OP_MEM, EXC_UD);
         dst.val = ea.mem.off;
         break;
 
     case 0x8f: /* pop (sole member of Grp1a) */
-        generate_exception_if((modrm_reg & 7) != 0, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) != 0, EXC_UD);
         /* 64-bit mode: POP defaults to a 64-bit operand. */
         if ( mode_64bit() && (dst.bytes == 4) )
             dst.bytes = 8;
@@ -3286,8 +3291,8 @@ x86_emulate(
         unsigned long sel;
         dst.val = x86_seg_es;
     les: /* dst.val identifies the segment */
-        generate_exception_if(mode_64bit() && !ext, EXC_UD, -1);
-        generate_exception_if(src.type != OP_MEM, EXC_UD, -1);
+        generate_exception_if(mode_64bit() && !ext, EXC_UD);
+        generate_exception_if(src.type != OP_MEM, EXC_UD);
         if ( (rc = read_ulong(src.mem.seg, src.mem.off + src.bytes,
                               &sel, 2, ctxt, ops)) != 0 )
             goto done;
@@ -3377,7 +3382,7 @@ x86_emulate(
         goto done;
 
     case 0xce: /* into */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         if ( !(_regs.eflags & EFLG_OF) )
             break;
         src.val = EXC_OF;
@@ -3419,7 +3424,7 @@ x86_emulate(
     case 0xd5: /* aad */ {
         unsigned int base = (uint8_t)src.val;
 
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         if ( b & 0x01 )
         {
             uint16_t ax = _regs.eax;
@@ -3430,7 +3435,7 @@ x86_emulate(
         {
             uint8_t al = _regs.eax;
 
-            generate_exception_if(!base, EXC_DE, -1);
+            generate_exception_if(!base, EXC_DE);
             *(uint16_t *)&_regs.eax = ((al / base) << 8) | (al % base);
         }
         _regs.eflags &= ~(EFLG_SF|EFLG_ZF|EFLG_PF);
@@ -3441,7 +3446,7 @@ x86_emulate(
     }
 
     case 0xd6: /* salc */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         *(uint8_t *)&_regs.eax = (_regs.eflags & EFLG_CF) ? 0xff : 0x00;
         break;
 
@@ -4049,7 +4054,7 @@ x86_emulate(
             unsigned long u[2], v;
 
         case 0 ... 1: /* test */
-            generate_exception_if(lock_prefix, EXC_UD, -1);
+            generate_exception_if(lock_prefix, EXC_UD);
             goto test;
         case 2: /* not */
             dst.val = ~dst.val;
@@ -4147,7 +4152,7 @@ x86_emulate(
                 v    = (uint8_t)src.val;
                 generate_exception_if(
                     div_dbl(u, v) || ((uint8_t)u[0] != (uint16_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (uint8_t)u[0];
                 ((uint8_t *)&_regs.eax)[1] = u[1];
                 break;
@@ -4157,7 +4162,7 @@ x86_emulate(
                 v    = (uint16_t)src.val;
                 generate_exception_if(
                     div_dbl(u, v) || ((uint16_t)u[0] != (uint32_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (uint16_t)u[0];
                 *(uint16_t *)&_regs.edx = u[1];
                 break;
@@ -4168,7 +4173,7 @@ x86_emulate(
                 v    = (uint32_t)src.val;
                 generate_exception_if(
                     div_dbl(u, v) || ((uint32_t)u[0] != u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val   = (uint32_t)u[0];
                 _regs.edx = (uint32_t)u[1];
                 break;
@@ -4177,7 +4182,7 @@ x86_emulate(
                 u[0] = _regs.eax;
                 u[1] = _regs.edx;
                 v    = src.val;
-                generate_exception_if(div_dbl(u, v), EXC_DE, -1);
+                generate_exception_if(div_dbl(u, v), EXC_DE);
                 dst.val   = u[0];
                 _regs.edx = u[1];
                 break;
@@ -4193,7 +4198,7 @@ x86_emulate(
                 v    = (int8_t)src.val;
                 generate_exception_if(
                     idiv_dbl(u, v) || ((int8_t)u[0] != (int16_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (int8_t)u[0];
                 ((int8_t *)&_regs.eax)[1] = u[1];
                 break;
@@ -4203,7 +4208,7 @@ x86_emulate(
                 v    = (int16_t)src.val;
                 generate_exception_if(
                     idiv_dbl(u, v) || ((int16_t)u[0] != (int32_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (int16_t)u[0];
                 *(int16_t *)&_regs.edx = u[1];
                 break;
@@ -4214,7 +4219,7 @@ x86_emulate(
                 v    = (int32_t)src.val;
                 generate_exception_if(
                     idiv_dbl(u, v) || ((int32_t)u[0] != u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val   = (int32_t)u[0];
                 _regs.edx = (uint32_t)u[1];
                 break;
@@ -4223,7 +4228,7 @@ x86_emulate(
                 u[0] = _regs.eax;
                 u[1] = _regs.edx;
                 v    = src.val;
-                generate_exception_if(idiv_dbl(u, v), EXC_DE, -1);
+                generate_exception_if(idiv_dbl(u, v), EXC_DE);
                 dst.val   = u[0];
                 _regs.edx = u[1];
                 break;
@@ -4263,7 +4268,7 @@ x86_emulate(
         break;
 
     case 0xfe: /* Grp4 */
-        generate_exception_if((modrm_reg & 7) >= 2, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) >= 2, EXC_UD);
     case 0xff: /* Grp5 */
         switch ( modrm_reg & 7 )
         {
@@ -4288,7 +4293,7 @@ x86_emulate(
             break;
         case 3: /* call (far, absolute indirect) */
         case 5: /* jmp (far, absolute indirect) */
-            generate_exception_if(src.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(src.type != OP_MEM, EXC_UD);
 
             if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
                                   &imm2, 2, ctxt, ops)) )
@@ -4300,13 +4305,13 @@ x86_emulate(
         case 6: /* push */
             goto push;
         case 7:
-            generate_exception_if(1, EXC_UD, -1);
+            generate_exception(EXC_UD);
         }
         break;
 
     case X86EMUL_OPC(0x0f, 0x00): /* Grp6 */
         seg = (modrm_reg & 1) ? x86_seg_tr : x86_seg_ldtr;
-        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
         switch ( modrm_reg & 6 )
         {
         case 0: /* sldt / str */
@@ -4318,7 +4323,7 @@ x86_emulate(
                 goto done;
             break;
         default:
-            generate_exception_if(true, EXC_UD, -1);
+            generate_exception_if(true, EXC_UD);
             break;
         }
         break;
@@ -4333,10 +4338,10 @@ x86_emulate(
         {
             unsigned long cr4;
 
-            generate_exception_if(vex.pfx, EXC_UD, -1);
+            generate_exception_if(vex.pfx, EXC_UD);
             if ( !ops->read_cr || ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
                 cr4 = 0;
-            generate_exception_if(!(cr4 & X86_CR4_OSXSAVE), EXC_UD, -1);
+            generate_exception_if(!(cr4 & X86_CR4_OSXSAVE), EXC_UD);
             generate_exception_if(!mode_ring0() ||
                                   handle_xsetbv(_regs._ecx,
                                                 _regs._eax | (_regs.rdx << 32)),
@@ -4347,28 +4352,28 @@ x86_emulate(
 
         case 0xd4: /* vmfunc */
             generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
-                                  EXC_UD, -1);
+                                  EXC_UD);
             fail_if(!ops->vmfunc);
             if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
                 goto done;
             goto no_writeback;
 
         case 0xd5: /* xend */
-            generate_exception_if(vex.pfx, EXC_UD, -1);
-            generate_exception_if(!vcpu_has_rtm(), EXC_UD, -1);
+            generate_exception_if(vex.pfx, EXC_UD);
+            generate_exception_if(!vcpu_has_rtm(), EXC_UD);
             generate_exception_if(vcpu_has_rtm(), EXC_GP, 0);
             break;
 
         case 0xd6: /* xtest */
-            generate_exception_if(vex.pfx, EXC_UD, -1);
+            generate_exception_if(vex.pfx, EXC_UD);
             generate_exception_if(!vcpu_has_rtm() && !vcpu_has_hle(),
-                                  EXC_UD, -1);
+                                  EXC_UD);
             /* Neither HLE nor RTM can be active when we get here. */
             _regs.eflags |= EFLG_ZF;
             goto no_writeback;
 
         case 0xdf: /* invlpga */
-            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
             fail_if(ops->invlpg == NULL);
             if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
@@ -4398,7 +4403,7 @@ x86_emulate(
                  ops->cpuid(&eax, &ebx, &dummy, &dummy, ctxt) == X86EMUL_OKAY )
                 limit = ((ebx >> 8) & 0xff) * 8;
             generate_exception_if(limit < sizeof(long) ||
-                                  (limit & (limit - 1)), EXC_UD, -1);
+                                  (limit & (limit - 1)), EXC_UD);
             base &= ~(limit - 1);
             if ( override_seg == -1 )
                 override_seg = x86_seg_ds;
@@ -4434,7 +4439,7 @@ x86_emulate(
         {
         case 0: /* sgdt */
         case 1: /* sidt */
-            generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(ea.type != OP_MEM, EXC_UD);
             generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0);
             fail_if(ops->read_segment == NULL);
             if ( (rc = ops->read_segment(seg, &sreg, ctxt)) )
@@ -4455,7 +4460,7 @@ x86_emulate(
         case 2: /* lgdt */
         case 3: /* lidt */
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(ea.type != OP_MEM, EXC_UD);
             fail_if(ops->write_segment == NULL);
             memset(&sreg, 0, sizeof(sreg));
             if ( (rc = read_ulong(ea.mem.seg, ea.mem.off+0,
@@ -4498,7 +4503,7 @@ x86_emulate(
             break;
         case 7: /* invlpg */
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(ea.type != OP_MEM, EXC_UD);
             fail_if(ops->invlpg == NULL);
             if ( (rc = ops->invlpg(ea.mem.seg, ea.mem.off, ctxt)) )
                 goto done;
@@ -4512,13 +4517,13 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0x05): /* syscall */ {
         uint64_t msr_content;
 
-        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
 
         /* Inject #UD if syscall/sysret are disabled. */
         fail_if(ops->read_msr == NULL);
         if ( (rc = ops->read_msr(MSR_EFER, &msr_content, ctxt)) != 0 )
             goto done;
-        generate_exception_if((msr_content & EFER_SCE) == 0, EXC_UD, -1);
+        generate_exception_if((msr_content & EFER_SCE) == 0, EXC_UD);
 
         if ( (rc = ops->read_msr(MSR_STAR, &msr_content, ctxt)) != 0 )
             goto done;
@@ -4587,7 +4592,7 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0x0b): /* ud2 */
     case X86EMUL_OPC(0x0f, 0xb9): /* ud1 */
     case X86EMUL_OPC(0x0f, 0xff): /* ud0 */
-        generate_exception_if(1, EXC_UD, -1);
+        generate_exception(EXC_UD);
 
     case X86EMUL_OPC(0x0f, 0x0d): /* GrpP (prefetch) */
     case X86EMUL_OPC(0x0f, 0x18): /* Grp16 (prefetch/nop) */
@@ -4706,7 +4711,7 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0x21): /* mov dr,reg */
     case X86EMUL_OPC(0x0f, 0x22): /* mov reg,cr */
     case X86EMUL_OPC(0x0f, 0x23): /* mov reg,dr */
-        generate_exception_if(ea.type != OP_REG, EXC_UD, -1);
+        generate_exception_if(ea.type != OP_REG, EXC_UD);
         generate_exception_if(!mode_ring0(), EXC_GP, 0);
         modrm_reg |= lock_prefix << 3;
         if ( b & 2 )
@@ -4942,11 +4947,11 @@ x86_emulate(
         switch ( b )
         {
         case 0x7e:
-            generate_exception_if(vex.l, EXC_UD, -1);
+            generate_exception_if(vex.l, EXC_UD);
             ea.bytes = op_bytes;
             break;
         case 0xd6:
-            generate_exception_if(vex.l, EXC_UD, -1);
+            generate_exception_if(vex.l, EXC_UD);
             ea.bytes = 8;
             break;
         }
@@ -5036,7 +5041,7 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0xad): /* shrd %%cl,r,r/m */ {
         uint8_t shift, width = dst.bytes << 3;
 
-        generate_exception_if(lock_prefix, EXC_UD, -1);
+        generate_exception_if(lock_prefix, EXC_UD);
         if ( b & 1 )
             shift = _regs.ecx;
         else
@@ -5151,7 +5156,7 @@ x86_emulate(
         case 5: goto bts;
         case 6: goto btr;
         case 7: goto btc;
-        default: generate_exception_if(1, EXC_UD, -1);
+        default: generate_exception(EXC_UD);
         }
         break;
 
@@ -5252,15 +5257,15 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0xc3): /* movnti */
         /* Ignore the non-temporal hint for now. */
         vcpu_must_have_sse2();
-        generate_exception_if(dst.bytes <= 2, EXC_UD, -1);
+        generate_exception_if(dst.bytes <= 2, EXC_UD);
         dst.val = src.val;
         break;
 
     case X86EMUL_OPC(0x0f, 0xc7): /* Grp9 (cmpxchg8b/cmpxchg16b) */ {
         unsigned long old[2], exp[2], new[2];
 
-        generate_exception_if((modrm_reg & 7) != 1, EXC_UD, -1);
-        generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) != 1, EXC_UD);
+        generate_exception_if(ea.type != OP_MEM, EXC_UD);
         if ( op_bytes == 8 )
             host_and_vcpu_must_have(cx16);
         op_bytes *= 2;
-- 
2.1.4



* [PATCH v3 13/24] x86/emul: Rework emulator event injection
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (11 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 12/24] x86/emul: Remove opencoded exception generation Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 14:26   ` Paul Durrant
                     ` (2 more replies)
  2016-11-30 13:50 ` [PATCH v3 14/24] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
                   ` (10 subsequent siblings)
  23 siblings, 3 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, Tim Deegan, Jan Beulich

The emulator needs to gain an understanding of interrupts and exceptions
generated by its actions.

Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
are visible to the emulator.  This removes the need for the
inject_{hw_exception,sw_interrupt}() hooks, which are dropped and replaced
with x86_emul_{hw_exception,software_event,reset_event}() instead.
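
As a sketch of the new API, a callback which previously called the
inject_hw_exception() hook now latches the event in the context and returns
X86EMUL_EXCEPTION (mirroring hvmemul_virtual_to_linear() below):

    /* Raise #GP(0) via the new helper rather than via a hook. */
    x86_emul_hw_exception(TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
    return X86EMUL_EXCEPTION;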

For exceptions raised by x86_emulate() itself (rather than its callbacks), the
shadow pagetable and PV uses of x86_emulate() previously failed with
X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks.

This behaviour has changed, and such cases will now return X86EMUL_EXCEPTION
with event_pending set.  Until the callers of x86_emulate() have been updated
to inject events back into the guest, divert the event_pending case back into
the X86EMUL_UNHANDLEABLE path to maintain the same guest-visible behaviour.
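
For illustration, the interim caller-side pattern (as added to
ptwr_do_page_fault(), mmio_ro_do_page_fault() and sh_page_fault() below) is:

    r = x86_emulate(&emul_ctxt.ctxt, emul_ops);

    /* Event injection isn't wired up for this caller yet; keep the old path. */
    if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
        r = X86EMUL_UNHANDLEABLE;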

No overall functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>

v3:
 * Rework how the event_pending case is currently handled
v2:
 * Change x86_emul_hw_exception()'s error_code parameter to be signed
 * Clarify how software interrupt injection happens.
 * More ASSERT()s and a description of how event_pending works without the
   inject_sw_interrupt() hook
---
 xen/arch/x86/hvm/emulate.c             | 81 ++++------------------------------
 xen/arch/x86/hvm/hvm.c                 |  4 +-
 xen/arch/x86/hvm/io.c                  |  4 +-
 xen/arch/x86/hvm/vmx/realmode.c        | 16 +++----
 xen/arch/x86/mm.c                      | 26 +++++++++++
 xen/arch/x86/mm/shadow/multi.c         | 17 +++++++
 xen/arch/x86/x86_emulate/x86_emulate.c | 12 +++--
 xen/arch/x86/x86_emulate/x86_emulate.h | 76 +++++++++++++++++++++++++------
 xen/include/asm-x86/hvm/emulate.h      |  3 --
 9 files changed, 132 insertions(+), 107 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 91c79fa..4b8c9a0 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -568,12 +568,9 @@ static int hvmemul_virtual_to_linear(
         return X86EMUL_UNHANDLEABLE;
 
     /* This is a singleton operation: fail it with an exception. */
-    hvmemul_ctxt->exn_pending = 1;
-    hvmemul_ctxt->trap.vector =
-        (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault;
-    hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
-    hvmemul_ctxt->trap.error_code = 0;
-    hvmemul_ctxt->trap.insn_len = 0;
+    x86_emul_hw_exception((seg == x86_seg_ss)
+                          ? TRAP_stack_error
+                          : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
     return X86EMUL_EXCEPTION;
 }
 
@@ -1562,59 +1559,6 @@ int hvmemul_cpuid(
     return X86EMUL_OKAY;
 }
 
-static int hvmemul_inject_hw_exception(
-    uint8_t vector,
-    int32_t error_code,
-    struct x86_emulate_ctxt *ctxt)
-{
-    struct hvm_emulate_ctxt *hvmemul_ctxt =
-        container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
-
-    hvmemul_ctxt->exn_pending = 1;
-    hvmemul_ctxt->trap.vector = vector;
-    hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
-    hvmemul_ctxt->trap.error_code = error_code;
-    hvmemul_ctxt->trap.insn_len = 0;
-
-    return X86EMUL_OKAY;
-}
-
-static int hvmemul_inject_sw_interrupt(
-    enum x86_swint_type type,
-    uint8_t vector,
-    uint8_t insn_len,
-    struct x86_emulate_ctxt *ctxt)
-{
-    struct hvm_emulate_ctxt *hvmemul_ctxt =
-        container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
-
-    switch ( type )
-    {
-    case x86_swint_icebp:
-        hvmemul_ctxt->trap.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
-        break;
-
-    case x86_swint_int3:
-    case x86_swint_into:
-        hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_EXCEPTION;
-        break;
-
-    case x86_swint_int:
-        hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_INTERRUPT;
-        break;
-
-    default:
-        return X86EMUL_UNHANDLEABLE;
-    }
-
-    hvmemul_ctxt->exn_pending = 1;
-    hvmemul_ctxt->trap.vector = vector;
-    hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
-    hvmemul_ctxt->trap.insn_len = insn_len;
-
-    return X86EMUL_OKAY;
-}
-
 static int hvmemul_get_fpu(
     void (*exception_callback)(void *, struct cpu_user_regs *),
     void *exception_callback_arg,
@@ -1678,8 +1622,7 @@ static int hvmemul_invlpg(
          * hvmemul_virtual_to_linear() raises exceptions for type/limit
          * violations, so squash them.
          */
-        hvmemul_ctxt->exn_pending = 0;
-        hvmemul_ctxt->trap = (struct x86_event){};
+        x86_emul_reset_event(ctxt);
         rc = X86EMUL_OKAY;
     }
 
@@ -1696,7 +1639,7 @@ static int hvmemul_vmfunc(
 
     rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
     if ( rc != X86EMUL_OKAY )
-        hvmemul_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC, ctxt);
+        x86_emul_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC, ctxt);
 
     return rc;
 }
@@ -1720,8 +1663,6 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .write_msr     = hvmemul_write_msr,
     .wbinvd        = hvmemul_wbinvd,
     .cpuid         = hvmemul_cpuid,
-    .inject_hw_exception = hvmemul_inject_hw_exception,
-    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
     .invlpg        = hvmemul_invlpg,
@@ -1747,8 +1688,6 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .write_msr     = hvmemul_write_msr_discard,
     .wbinvd        = hvmemul_wbinvd_discard,
     .cpuid         = hvmemul_cpuid,
-    .inject_hw_exception = hvmemul_inject_hw_exception,
-    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
     .invlpg        = hvmemul_invlpg,
@@ -1870,8 +1809,8 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
         hvm_dump_emulation_state(XENLOG_G_WARNING "MMCFG", &ctxt);
         break;
     case X86EMUL_EXCEPTION:
-        if ( ctxt.exn_pending )
-            hvm_inject_event(&ctxt.trap);
+        if ( ctxt.ctxt.event_pending )
+            hvm_inject_event(&ctxt.ctxt.event);
         /* fallthrough */
     default:
         hvm_emulate_writeback(&ctxt);
@@ -1930,8 +1869,8 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
         hvm_inject_hw_exception(trapnr, errcode);
         break;
     case X86EMUL_EXCEPTION:
-        if ( ctx.exn_pending )
-            hvm_inject_event(&ctx.trap);
+        if ( ctx.ctxt.event_pending )
+            hvm_inject_event(&ctx.ctxt.event);
         break;
     }
 
@@ -2006,8 +1945,6 @@ void hvm_emulate_init_per_insn(
         hvmemul_ctxt->insn_buf_bytes = insn_bytes;
         memcpy(hvmemul_ctxt->insn_buf, insn_buf, insn_bytes);
     }
-
-    hvmemul_ctxt->exn_pending = 0;
 }
 
 void hvm_emulate_writeback(
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b950842..ef83100 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4076,8 +4076,8 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
         hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
     case X86EMUL_EXCEPTION:
-        if ( ctxt.exn_pending )
-            hvm_inject_event(&ctxt.trap);
+        if ( ctxt.ctxt.event_pending )
+            hvm_inject_event(&ctxt.ctxt.event);
         /* fall through */
     default:
         hvm_emulate_writeback(&ctxt);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 1279f68..abb9d51 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -102,8 +102,8 @@ int handle_mmio(void)
         hvm_dump_emulation_state(XENLOG_G_WARNING "MMIO", &ctxt);
         return 0;
     case X86EMUL_EXCEPTION:
-        if ( ctxt.exn_pending )
-            hvm_inject_event(&ctxt.trap);
+        if ( ctxt.ctxt.event_pending )
+            hvm_inject_event(&ctxt.ctxt.event);
         break;
     default:
         break;
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 9002638..dc3ab44 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -122,7 +122,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 
     if ( rc == X86EMUL_EXCEPTION )
     {
-        if ( !hvmemul_ctxt->exn_pending )
+        if ( !hvmemul_ctxt->ctxt.event_pending )
         {
             unsigned long intr_info;
 
@@ -133,27 +133,27 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
                 gdprintk(XENLOG_ERR, "Exception pending but no info.\n");
                 goto fail;
             }
-            hvmemul_ctxt->trap.vector = (uint8_t)intr_info;
-            hvmemul_ctxt->trap.insn_len = 0;
+            hvmemul_ctxt->ctxt.event.vector = (uint8_t)intr_info;
+            hvmemul_ctxt->ctxt.event.insn_len = 0;
         }
 
         if ( unlikely(curr->domain->debugger_attached) &&
-             ((hvmemul_ctxt->trap.vector == TRAP_debug) ||
-              (hvmemul_ctxt->trap.vector == TRAP_int3)) )
+             ((hvmemul_ctxt->ctxt.event.vector == TRAP_debug) ||
+              (hvmemul_ctxt->ctxt.event.vector == TRAP_int3)) )
         {
             domain_pause_for_debugger();
         }
         else if ( curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE )
         {
             gdprintk(XENLOG_ERR, "Exception %02x in protected mode.\n",
-                     hvmemul_ctxt->trap.vector);
+                     hvmemul_ctxt->ctxt.event.vector);
             goto fail;
         }
         else
         {
             realmode_deliver_exception(
-                hvmemul_ctxt->trap.vector,
-                hvmemul_ctxt->trap.insn_len,
+                hvmemul_ctxt->ctxt.event.vector,
+                hvmemul_ctxt->ctxt.event.insn_len,
                 hvmemul_ctxt);
         }
     }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 231c7bf..5d59479 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5379,6 +5379,19 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
     page_unlock(page);
     put_page(page);
 
+    /*
+     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
+     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
+     * now set event_pending instead.  Exceptions raised behind the back of
+     * the emulator don't yet set event_pending.
+     *
+     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
+     * for no functional change from before.  Future patches will fix this
+     * properly.
+     */
+    if ( rc == X86EMUL_EXCEPTION && ptwr_ctxt.ctxt.event_pending )
+        rc = X86EMUL_UNHANDLEABLE;
+
     if ( rc == X86EMUL_UNHANDLEABLE )
         goto bail;
 
@@ -5506,6 +5519,19 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     else
         rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
 
+    /*
+     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
+     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
+     * now set event_pending instead.  Exceptions raised behind the back of
+     * the emulator don't yet set event_pending.
+     *
+     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
+     * for no functional change from before.  Future patches will fix this
+     * properly.
+     */
+    if ( rc == X86EMUL_EXCEPTION && ctxt.event_pending )
+        rc = X86EMUL_UNHANDLEABLE;
+
     if ( rc == X86EMUL_UNHANDLEABLE )
         return 0;
 
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index ddfb815..56c40f8 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3374,6 +3374,19 @@ static int sh_page_fault(struct vcpu *v,
     r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
 
     /*
+     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
+     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
+     * now set event_pending instead.  Exceptions raised behind the back of
+     * the emulator don't yet set event_pending.
+     *
+     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
+     * for no functional change from before.  Future patches will fix this
+     * properly.
+     */
+    if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
+        r = X86EMUL_UNHANDLEABLE;
+
+    /*
      * NB. We do not unshadow on X86EMUL_EXCEPTION. It's not clear that it
      * would be a good unshadow hint. If we *do* decide to unshadow-on-fault
      * then it must be 'failable': we cannot require the unshadow to succeed.
@@ -3443,6 +3456,10 @@ static int sh_page_fault(struct vcpu *v,
             shadow_continue_emulation(&emul_ctxt, regs);
             v->arch.paging.last_write_was_pt = 0;
             r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
+
+            if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
+                r = X86EMUL_UNHANDLEABLE;
+
             if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
             {
                 emulation_count++;
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 6adfdbe..0fb2c09 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -680,9 +680,8 @@ static inline int mkec(uint8_t e, int32_t ec, ...)
 
 #define generate_exception_if(p, e, ec...)                                \
 ({  if ( (p) ) {                                                          \
-        fail_if(ops->inject_hw_exception == NULL);                        \
-        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \
-            ? : X86EMUL_EXCEPTION;                                        \
+        x86_emul_hw_exception(e, mkec(e, ##ec, 0), ctxt);                 \
+        rc = X86EMUL_EXCEPTION;                                           \
         goto done;                                                        \
     }                                                                     \
 })
@@ -1604,9 +1603,6 @@ static int inject_swint(enum x86_swint_type type,
 {
     int rc, error_code, fault_type = EXC_GP;
 
-    fail_if(ops->inject_sw_interrupt == NULL);
-    fail_if(ops->inject_hw_exception == NULL);
-
     /*
      * Without hardware support, injecting software interrupts/exceptions is
      * problematic.
@@ -1701,7 +1697,8 @@ static int inject_swint(enum x86_swint_type type,
         }
     }
 
-    rc = ops->inject_sw_interrupt(type, vector, insn_len, ctxt);
+    x86_emul_software_event(type, vector, insn_len, ctxt);
+    rc = X86EMUL_OKAY;
 
  done:
     return rc;
@@ -1909,6 +1906,7 @@ x86_decode(
 
     /* Initialise output state in x86_emulate_ctxt */
     ctxt->retire.raw = 0;
+    x86_emul_reset_event(ctxt);
 
     op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
     if ( op_bytes == 8 )
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index da8924b..3c0b25d 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -396,19 +396,6 @@ struct x86_emulate_ops
         unsigned int *edx,
         struct x86_emulate_ctxt *ctxt);
 
-    /* inject_hw_exception */
-    int (*inject_hw_exception)(
-        uint8_t vector,
-        int32_t error_code,
-        struct x86_emulate_ctxt *ctxt);
-
-    /* inject_sw_interrupt */
-    int (*inject_sw_interrupt)(
-        enum x86_swint_type type,
-        uint8_t vector,
-        uint8_t insn_len,
-        struct x86_emulate_ctxt *ctxt);
-
     /*
      * get_fpu: Load emulated environment's FPU state onto processor.
      *  @exn_callback: On any FPU or SIMD exception, pass control to
@@ -486,6 +473,9 @@ struct x86_emulate_ctxt
             bool singlestep:1;   /* Singlestepping was active. */
         };
     } retire;
+
+    bool event_pending;
+    struct x86_event event;
 };
 
 /*
@@ -584,6 +574,19 @@ static inline int x86_emulate_wrapper(
     if ( rc == X86EMUL_EXCEPTION )
         ASSERT(ctxt->regs->eip == orig_eip);
 
+    /*
+     * TODO: Make this true:
+     *
+    ASSERT(ctxt->event_pending == (rc == X86EMUL_EXCEPTION));
+     *
+     * Some codepaths still raise exceptions behind the back of the
+     * emulator. (i.e. return X86EMUL_EXCEPTION but without
+     * event_pending being set).  In the meantime, use a slightly
+     * relaxed check...
+     */
+    if ( ctxt->event_pending )
+        ASSERT(rc == X86EMUL_EXCEPTION);
+
     return rc;
 }
 
@@ -633,4 +636,51 @@ void x86_emulate_free_state(struct x86_emulate_state *state);
 
 #endif
 
+static inline void x86_emul_hw_exception(
+    unsigned int vector, int error_code, struct x86_emulate_ctxt *ctxt)
+{
+    ASSERT(!ctxt->event_pending);
+
+    ctxt->event.vector = vector;
+    ctxt->event.type = X86_EVENTTYPE_HW_EXCEPTION;
+    ctxt->event.error_code = error_code;
+
+    ctxt->event_pending = true;
+}
+
+static inline void x86_emul_software_event(
+    enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
+    struct x86_emulate_ctxt *ctxt)
+{
+    ASSERT(!ctxt->event_pending);
+
+    switch ( type )
+    {
+    case x86_swint_icebp:
+        ctxt->event.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
+        break;
+
+    case x86_swint_int3:
+    case x86_swint_into:
+        ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
+        break;
+
+    case x86_swint_int:
+        ctxt->event.type = X86_EVENTTYPE_SW_INTERRUPT;
+        break;
+    }
+
+    ctxt->event.vector = vector;
+    ctxt->event.error_code = X86_EVENT_NO_EC;
+    ctxt->event.insn_len = insn_len;
+
+    ctxt->event_pending = true;
+}
+
+static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
+{
+    ctxt->event_pending = false;
+    ctxt->event = (struct x86_event){};
+}
+
 #endif /* __X86_EMULATE_H__ */
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index 3b7ec33..d64d834 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -29,9 +29,6 @@ struct hvm_emulate_ctxt {
     unsigned long seg_reg_accessed;
     unsigned long seg_reg_dirty;
 
-    bool_t exn_pending;
-    struct x86_event trap;
-
     uint32_t intr_shadow;
 
     bool_t set_context;
-- 
2.1.4



* [PATCH v3 14/24] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register()
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (12 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 13/24] x86/emul: Rework emulator event injection Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 15/24] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

No functional change at this point, but this is a prerequisite for forthcoming
functional changes.

Make vmx_get_segment_register() private to vmx.c, like all the other vendor
get/set functions.
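
Call sites outside vmx.c change mechanically, e.g. (a representative hunk
from below):

    -            vmx_get_segment_register(v, x86_seg_ss, &ss);
    +            hvm_get_segment_register(v, x86_seg_ss, &ss);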

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c        | 14 +++++++-------
 xen/arch/x86/hvm/vmx/vvmx.c       |  6 +++---
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 --
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 31f08d2..377c789 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -940,8 +940,8 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
         .fields = { .type = 0xb, .s = 0, .dpl = 0, .p = 1, .avl = 0,    \
                     .l = 0, .db = 0, .g = 0, .pad = 0 } }).bytes)
 
-void vmx_get_segment_register(struct vcpu *v, enum x86_segment seg,
-                              struct segment_register *reg)
+static void vmx_get_segment_register(struct vcpu *v, enum x86_segment seg,
+                                     struct segment_register *reg)
 {
     unsigned long attr = 0, sel = 0, limit;
 
@@ -1504,19 +1504,19 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
              * Need to read them all either way, as realmode reads can update
              * the saved values we'll use when returning to prot mode. */
             for ( s = 0; s < ARRAY_SIZE(reg); s++ )
-                vmx_get_segment_register(v, s, &reg[s]);
+                hvm_get_segment_register(v, s, &reg[s]);
             v->arch.hvm_vmx.vmx_realmode = realmode;
             
             if ( realmode )
             {
                 for ( s = 0; s < ARRAY_SIZE(reg); s++ )
-                    vmx_set_segment_register(v, s, &reg[s]);
+                    hvm_set_segment_register(v, s, &reg[s]);
             }
             else 
             {
                 for ( s = 0; s < ARRAY_SIZE(reg); s++ )
                     if ( !(v->arch.hvm_vmx.vm86_segment_mask & (1<<s)) )
-                        vmx_set_segment_register(
+                        hvm_set_segment_register(
                             v, s, &v->arch.hvm_vmx.vm86_saved_seg[s]);
             }
 
@@ -3907,7 +3907,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             gdprintk(XENLOG_WARNING, "Bad vmexit (reason %#lx)\n",
                      exit_reason);
 
-            vmx_get_segment_register(v, x86_seg_ss, &ss);
+            hvm_get_segment_register(v, x86_seg_ss, &ss);
             if ( ss.attr.fields.dpl )
                 hvm_inject_hw_exception(TRAP_invalid_op,
                                         X86_EVENT_NO_EC);
@@ -3939,7 +3939,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
         gprintk(XENLOG_WARNING, "Bad rIP %lx for mode %u\n", regs->rip, mode);
 
-        vmx_get_segment_register(v, x86_seg_ss, &ss);
+        hvm_get_segment_register(v, x86_seg_ss, &ss);
         if ( ss.attr.fields.dpl )
         {
             __vmread(VM_ENTRY_INTR_INFO, &intr_info);
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index efaf54c..bcc4a97 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -360,7 +360,7 @@ static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
     else if ( !vcpu_2_nvmx(v).vmxon_region_pa )
         goto invalid_op;
 
-    vmx_get_segment_register(v, x86_seg_cs, &cs);
+    hvm_get_segment_register(v, x86_seg_cs, &cs);
 
     if ( (regs->eflags & X86_EFLAGS_VM) ||
          (hvm_long_mode_enabled(v) && cs.attr.fields.l == 0) )
@@ -419,13 +419,13 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
 
         if ( hvm_long_mode_enabled(v) )
         {
-            vmx_get_segment_register(v, x86_seg_cs, &seg);
+            hvm_get_segment_register(v, x86_seg_cs, &seg);
             mode_64bit = seg.attr.fields.l;
         }
 
         if ( info.fields.segment > VMX_SREG_GS )
             goto gp_fault;
-        vmx_get_segment_register(v, sreg_to_index[info.fields.segment], &seg);
+        hvm_get_segment_register(v, sreg_to_index[info.fields.segment], &seg);
         seg_base = seg.base;
 
         base = info.fields.base_reg_invalid ? 0 :
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 4cdd9b1..0e5902d 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -550,8 +550,6 @@ static inline int __vmxon(u64 addr)
     return rc;
 }
 
-void vmx_get_segment_register(struct vcpu *, enum x86_segment,
-                              struct segment_register *);
 void vmx_inject_extint(int trap, uint8_t source);
 void vmx_inject_nmi(void);
 
-- 
2.1.4



* [PATCH v3 15/24] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (13 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 14/24] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 16/24] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Intel VT-x and AMD SVM provide access to the full segment descriptor cache via
fields in the VMCB/VMCS.  However, the bits which are actually checked by
hardware and preserved across vmentry/exit are inconsistent, and the vendor
accessor functions perform inconsistent modifications to the raw values.

Convert {svm,vmx}_{get,set}_segment_register() into raw accessors, and alter
hvm_{get,set}_segment_register() to cook the values consistently.  This allows
the common emulation code to better rely on finding architecturally-expected
values.

While moving the code performing the cooking, fix the %ss.db quirk.  A NULL
selector is indicated by .p being clear, not the value of the .type field.
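
In code terms, the quirk handling moves from keying on .type (the old SVM
logic) to keying on .p (as in the new hvm_get_segment_register() below):

    /* Old (svm.c): NULL %ss inferred from .type being 0. */
    if ( reg->attr.fields.type == 0 )
        reg->attr.fields.db = 0;

    /* New (hvm.c): a NULL selector is signalled by .p being clear. */
    if ( !reg->attr.fields.p )
        reg->attr.fields.db = 0;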

This does cause some functional changes, because the modifications are now
applied uniformly.  As a side effect, this fixes latent bugs:
vmx_set_segment_register() didn't correctly fix up .G for segments, and the
GDTR/IDTR limits were fixed up inconsistently.
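
The uniform .G handling in particular reduces to deriving granularity from
the limit alone (as both new hooks below do for present segments):

    /* Limits above 1MB can only be expressed with page granularity. */
    reg->attr.fields.g = !!(reg->limit >> 20);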

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
v2:
 * Clarify the change of the %ss.db quirk
 * Rework %tr typecheck logic
 * Swap a break for return following ASSERT_UNREACHABLE()
---
 xen/arch/x86/hvm/hvm.c        | 154 ++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c    |  20 +-----
 xen/arch/x86/hvm/vmx/vmx.c    |   6 +-
 xen/include/asm-x86/desc.h    |   6 ++
 xen/include/asm-x86/hvm/hvm.h |  17 ++---
 5 files changed, 167 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ef83100..bdfd94e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6051,6 +6051,160 @@ void hvm_domain_soft_reset(struct domain *d)
 }
 
 /*
+ * Segment caches in VMCB/VMCS are inconsistent about which bits are checked,
+ * important, and preserved across vmentry/exit.  Cook the values to make them
+ * closer to what is architecturally expected from entries in the segment
+ * cache.
+ */
+void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg)
+{
+    hvm_funcs.get_segment_register(v, seg, reg);
+
+    switch ( seg )
+    {
+    case x86_seg_ss:
+        /* SVM may retain %ss.DB when %ss is loaded with a NULL selector. */
+        if ( !reg->attr.fields.p )
+            reg->attr.fields.db = 0;
+        break;
+
+    case x86_seg_tr:
+        /*
+         * SVM doesn't track %tr.B. Architecturally, a loaded TSS segment will
+         * always be busy.
+         */
+        reg->attr.fields.type |= 0x2;
+
+        /*
+         * %cs and %tr are unconditionally present.  SVM ignores these present
+         * bits and will happily run without them set.
+         */
+    case x86_seg_cs:
+        reg->attr.fields.p = 1;
+        break;
+
+    case x86_seg_gdtr:
+    case x86_seg_idtr:
+        /*
+         * Treat GDTR/IDTR as being present system segments.  This avoids them
+         * needing special casing for segmentation checks.
+         */
+        reg->attr.bytes = 0x80;
+        break;
+
+    default: /* Avoid triggering -Werror=switch */
+        break;
+    }
+
+    if ( reg->attr.fields.p )
+    {
+        /*
+         * For segments which are present/usable, cook the system flag.  SVM
+         * ignores the S bit on all segments and will happily run with them in
+         * any state.
+         */
+        reg->attr.fields.s = is_x86_user_segment(seg);
+
+        /*
+         * SVM discards %cs.G on #VMEXIT.  Other user segments do have .G
+         * tracked, but Linux commit 80112c89ed87 "KVM: Synthesize G bit for
+         * all segments." indicates that this isn't necessarily the case when
+         * nested under ESXi.
+         *
+         * Unconditionally recalculate G.
+         */
+        reg->attr.fields.g = !!(reg->limit >> 20);
+
+        /*
+         * SVM doesn't track the Accessed flag.  It will always be set for
+         * usable user segments loaded into the descriptor cache.
+         */
+        if ( is_x86_user_segment(seg) )
+            reg->attr.fields.type |= 0x1;
+    }
+}
+
+void hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg)
+{
+    /* Set G to match the limit field.  VT-x cares, while SVM doesn't. */
+    if ( reg->attr.fields.p )
+        reg->attr.fields.g = !!(reg->limit >> 20);
+
+    switch ( seg )
+    {
+    case x86_seg_cs:
+        ASSERT(reg->attr.fields.p);                  /* Usable. */
+        ASSERT(reg->attr.fields.s);                  /* User segment. */
+        ASSERT((reg->base >> 32) == 0);              /* Upper bits clear. */
+        break;
+
+    case x86_seg_ss:
+        if ( reg->attr.fields.p )
+        {
+            ASSERT(reg->attr.fields.s);              /* User segment. */
+            ASSERT(!(reg->attr.fields.type & 0x8));  /* Data segment. */
+            ASSERT(reg->attr.fields.type & 0x2);     /* Writeable. */
+            ASSERT((reg->base >> 32) == 0);          /* Upper bits clear. */
+        }
+        break;
+
+    case x86_seg_ds:
+    case x86_seg_es:
+    case x86_seg_fs:
+    case x86_seg_gs:
+        if ( reg->attr.fields.p )
+        {
+            ASSERT(reg->attr.fields.s);              /* User segment. */
+
+            if ( reg->attr.fields.type & 0x8 )
+                ASSERT(reg->attr.fields.type & 0x2); /* Readable. */
+
+            if ( seg == x86_seg_fs || seg == x86_seg_gs )
+                ASSERT(is_canonical_address(reg->base));
+            else
+                ASSERT((reg->base >> 32) == 0);      /* Upper bits clear. */
+        }
+        break;
+
+    case x86_seg_tr:
+        ASSERT(reg->attr.fields.p);                  /* Usable. */
+        ASSERT(!reg->attr.fields.s);                 /* System segment. */
+        ASSERT(!(reg->sel & 0x4));                   /* !TI. */
+        if ( reg->attr.fields.type == SYS_DESC_tss_busy )
+            ASSERT(is_canonical_address(reg->base));
+        else if ( reg->attr.fields.type == SYS_DESC_tss16_busy )
+            ASSERT((reg->base >> 32) == 0);
+        else
+            ASSERT(!"%tr typecheck failure");
+        break;
+
+    case x86_seg_ldtr:
+        if ( reg->attr.fields.p )
+        {
+            ASSERT(!reg->attr.fields.s);             /* System segment. */
+            ASSERT(!(reg->sel & 0x4));               /* !TI. */
+            ASSERT(reg->attr.fields.type == SYS_DESC_ldt);
+            ASSERT(is_canonical_address(reg->base));
+        }
+        break;
+
+    case x86_seg_gdtr:
+    case x86_seg_idtr:
+        ASSERT(is_canonical_address(reg->base));
+        ASSERT((reg->limit >> 16) == 0);             /* Upper bits clear. */
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
+    hvm_funcs.set_segment_register(v, seg, reg);
+}
+
+/*
  * Local variables:
  * mode: C
  * c-file-style: "BSD"
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 65eeab7..2ea2c11 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -627,50 +627,34 @@ static void svm_get_segment_register(struct vcpu *v, enum x86_segment seg,
     {
     case x86_seg_cs:
         memcpy(reg, &vmcb->cs, sizeof(*reg));
-        reg->attr.fields.p = 1;
-        reg->attr.fields.g = reg->limit > 0xFFFFF;
         break;
     case x86_seg_ds:
         memcpy(reg, &vmcb->ds, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_es:
         memcpy(reg, &vmcb->es, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_fs:
         svm_sync_vmcb(v);
         memcpy(reg, &vmcb->fs, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_gs:
         svm_sync_vmcb(v);
         memcpy(reg, &vmcb->gs, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_ss:
         memcpy(reg, &vmcb->ss, sizeof(*reg));
         reg->attr.fields.dpl = vmcb->_cpl;
-        if ( reg->attr.fields.type == 0 )
-            reg->attr.fields.db = 0;
         break;
     case x86_seg_tr:
         svm_sync_vmcb(v);
         memcpy(reg, &vmcb->tr, sizeof(*reg));
-        reg->attr.fields.p = 1;
-        reg->attr.fields.type |= 0x2;
         break;
     case x86_seg_gdtr:
         memcpy(reg, &vmcb->gdtr, sizeof(*reg));
-        reg->attr.bytes = 0x80;
         break;
     case x86_seg_idtr:
         memcpy(reg, &vmcb->idtr, sizeof(*reg));
-        reg->attr.bytes = 0x80;
         break;
     case x86_seg_ldtr:
         svm_sync_vmcb(v);
@@ -740,11 +724,11 @@ static void svm_set_segment_register(struct vcpu *v, enum x86_segment seg,
         break;
     case x86_seg_gdtr:
         vmcb->gdtr.base = reg->base;
-        vmcb->gdtr.limit = (uint16_t)reg->limit;
+        vmcb->gdtr.limit = reg->limit;
         break;
     case x86_seg_idtr:
         vmcb->idtr.base = reg->base;
-        vmcb->idtr.limit = (uint16_t)reg->limit;
+        vmcb->idtr.limit = reg->limit;
         break;
     case x86_seg_ldtr:
         memcpy(&vmcb->ldtr, reg, sizeof(*reg));
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 377c789..004dad8 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1126,9 +1126,6 @@ static void vmx_set_segment_register(struct vcpu *v, enum x86_segment seg,
      */
     attr = (!(attr & (1u << 7)) << 16) | ((attr & 0xf00) << 4) | (attr & 0xff);
 
-    /* VMX has strict consistency requirement for flag G. */
-    attr |= !!(limit >> 20) << 15;
-
     vmx_vmcs_enter(v);
 
     switch ( seg )
@@ -1173,8 +1170,7 @@ static void vmx_set_segment_register(struct vcpu *v, enum x86_segment seg,
         __vmwrite(GUEST_TR_SELECTOR, sel);
         __vmwrite(GUEST_TR_LIMIT, limit);
         __vmwrite(GUEST_TR_BASE, base);
-        /* VMX checks that the busy flag (bit 1) is set. */
-        __vmwrite(GUEST_TR_AR_BYTES, attr | 2);
+        __vmwrite(GUEST_TR_AR_BYTES, attr);
         break;
     case x86_seg_gdtr:
         __vmwrite(GUEST_GDTR_LIMIT, limit);
diff --git a/xen/include/asm-x86/desc.h b/xen/include/asm-x86/desc.h
index 0e2d97f..da924bf 100644
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -89,7 +89,13 @@
 #ifndef __ASSEMBLY__
 
 /* System Descriptor types for GDT and IDT entries. */
+#define SYS_DESC_tss16_avail  1
 #define SYS_DESC_ldt          2
+#define SYS_DESC_tss16_busy   3
+#define SYS_DESC_call_gate16  4
+#define SYS_DESC_task_gate    5
+#define SYS_DESC_irq_gate16   6
+#define SYS_DESC_trap_gate16  7
 #define SYS_DESC_tss_avail    9
 #define SYS_DESC_tss_busy     11
 #define SYS_DESC_call_gate    12
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 51a64f7..b37b335 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -358,19 +358,10 @@ static inline void hvm_flush_guest_tlbs(void)
 void hvm_hypercall_page_initialise(struct domain *d,
                                    void *hypercall_page);
 
-static inline void
-hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
-                         struct segment_register *reg)
-{
-    hvm_funcs.get_segment_register(v, seg, reg);
-}
-
-static inline void
-hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
-                         struct segment_register *reg)
-{
-    hvm_funcs.set_segment_register(v, seg, reg);
-}
+void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg);
+void hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg);
 
 static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 {
-- 
2.1.4



* [PATCH v3 16/24] x86/emul: Avoid raising faults behind the emulators back
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (14 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 15/24] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 17/24] x86/pv: " Andrew Cooper
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

Introduce a new x86_emul_pagefault() similar to x86_emul_hw_exception(), and
use this instead of hvm_inject_page_fault() from emulation codepaths.
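
As a rough sketch (the callback and copy helper here are hypothetical, not
part of this patch), the intended usage from an emulation codepath is:

    static int example_read(unsigned long addr, void *p_data,
                            unsigned int bytes,
                            struct x86_emulate_ctxt *ctxt)
    {
        /* Stand-in for the real guest-access logic. */
        if ( example_copy_from_guest(p_data, addr, bytes) != 0 )
        {
            /* Record the fault in the context; it is injected later. */
            x86_emul_pagefault(0, addr, ctxt); /* Read fault. */
            return X86EMUL_EXCEPTION;
        }

        return X86EMUL_OKAY;
    }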

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
v2:
 * Change x86_emul_pagefault()'s error_code parameter to being signed
 * Split out shadow changes
---
 xen/arch/x86/hvm/emulate.c             |  4 ++--
 xen/arch/x86/x86_emulate/x86_emulate.h | 13 +++++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 4b8c9a0..614e182 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -459,7 +459,7 @@ static int hvmemul_linear_to_phys(
     {
         if ( pfec & (PFEC_page_paged | PFEC_page_shared) )
             return X86EMUL_RETRY;
-        hvm_inject_page_fault(pfec, addr);
+        x86_emul_pagefault(pfec, addr, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     }
 
@@ -483,7 +483,7 @@ static int hvmemul_linear_to_phys(
                 ASSERT(!reverse);
                 if ( npfn != gfn_x(INVALID_GFN) )
                     return X86EMUL_UNHANDLEABLE;
-                hvm_inject_page_fault(pfec, addr & PAGE_MASK);
+                x86_emul_pagefault(pfec, addr & PAGE_MASK, &hvmemul_ctxt->ctxt);
                 return X86EMUL_EXCEPTION;
             }
             *reps = done;
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 3c0b25d..8aa4b0b 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -648,6 +648,19 @@ static inline void x86_emul_hw_exception(
     ctxt->event_pending = true;
 }
 
+static inline void x86_emul_pagefault(
+    int error_code, unsigned long cr2, struct x86_emulate_ctxt *ctxt)
+{
+    ASSERT(!ctxt->event_pending);
+
+    ctxt->event.vector = 14; /* TRAP_page_fault */
+    ctxt->event.type = X86_EVENTTYPE_HW_EXCEPTION;
+    ctxt->event.error_code = error_code;
+    ctxt->event.cr2 = cr2;
+
+    ctxt->event_pending = true;
+}
+
 static inline void x86_emul_software_event(
     enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
     struct x86_emulate_ctxt *ctxt)
-- 
2.1.4



* [PATCH v3 17/24] x86/pv: Avoid raising faults behind the emulators back
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (15 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 16/24] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-12-01 11:50   ` Tim Deegan
  2016-12-01 12:57   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 18/24] x86/shadow: " Andrew Cooper
                   ` (6 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Tim Deegan, Jan Beulich

Use x86_emul_pagefault() rather than pv_inject_page_fault() to cause raised
pagefaults to be known to the emulator.  This requires altering the callers of
x86_emulate() to properly re-inject the event.
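
For illustration, the resulting pattern at a caller (a sketch, assuming
local "ctxt" and "ops" objects) is:

    rc = x86_emulate(&ctxt, &ops);

    if ( rc == X86EMUL_EXCEPTION && ctxt.event_pending )
        pv_inject_event(&ctxt.event); /* Replay the recorded fault. */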

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>

v3:
 * Split out #DB handling to an earlier part of the series
 * Don't raise #GP faults for unexpected events, but do return back to the
   guest.
v2:
 * New
---
 xen/arch/x86/mm.c | 104 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 65 insertions(+), 39 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5d59479..cdfa85e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5136,7 +5136,7 @@ static int ptwr_emulated_read(
     if ( !__addr_ok(addr) ||
          (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
     {
-        pv_inject_page_fault(0, addr + bytes - rc); /* Read fault. */
+        x86_emul_pagefault(0, addr + bytes - rc, ctxt);  /* Read fault. */
         return X86EMUL_EXCEPTION;
     }
 
@@ -5177,8 +5177,9 @@ static int ptwr_emulated_update(
         addr &= ~(sizeof(paddr_t)-1);
         if ( (rc = copy_from_user(&full, (void *)addr, sizeof(paddr_t))) != 0 )
         {
-            pv_inject_page_fault(0, /* Read fault. */
-                                 addr + sizeof(paddr_t) - rc);
+            x86_emul_pagefault(0, /* Read fault. */
+                               addr + sizeof(paddr_t) - rc,
+                               &ptwr_ctxt->ctxt);
             return X86EMUL_EXCEPTION;
         }
         /* Mask out bits provided by caller. */
@@ -5379,30 +5380,41 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
     page_unlock(page);
     put_page(page);
 
-    /*
-     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
-     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
-     * now set event_pending instead.  Exceptions raised behind the back of
-     * the emulator don't yet set event_pending.
-     *
-     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
-     * for no functional change from before.  Future patches will fix this
-     * properly.
-     */
-    if ( rc == X86EMUL_EXCEPTION && ptwr_ctxt.ctxt.event_pending )
-        rc = X86EMUL_UNHANDLEABLE;
+    /* More strict than x86_emulate_wrapper(), as this is now true for PV. */
+    ASSERT(ptwr_ctxt.ctxt.event_pending == (rc == X86EMUL_EXCEPTION));
 
-    if ( rc == X86EMUL_UNHANDLEABLE )
-        goto bail;
+    switch ( rc )
+    {
+    case X86EMUL_EXCEPTION:
+        /*
+         * This emulation only covers writes to pagetables which are marked
+         * read-only by Xen.  We tolerate #PF (from hitting an adjacent page).
+         * Anything else is an emulation bug, or a guest playing with the
+         * instruction stream under Xen's feet.
+         */
+        if ( ptwr_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+             ptwr_ctxt.ctxt.event.vector == TRAP_page_fault )
+            pv_inject_event(&ptwr_ctxt.ctxt.event);
+        else
+            gdprintk(XENLOG_WARNING,
+                     "Unexpected event (type %u, vector %#x) from emulation\n",
+                     ptwr_ctxt.ctxt.event.type, ptwr_ctxt.ctxt.event.vector);
+
+        /* Fallthrough */
+    case X86EMUL_OKAY:
 
-    if ( ptwr_ctxt.ctxt.retire.singlestep )
-        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+        if ( ptwr_ctxt.ctxt.retire.singlestep )
+            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 
-    perfc_incr(ptwr_emulations);
-    return EXCRET_fault_fixed;
+        /* Fallthrough */
+    case X86EMUL_RETRY:
+        perfc_incr(ptwr_emulations);
+        return EXCRET_fault_fixed;
 
  bail:
-    return 0;
+    default:
+        return 0;
+    }
 }
 
 /*************************
@@ -5519,26 +5531,40 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     else
         rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
 
-    /*
-     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
-     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
-     * now set event_pending instead.  Exceptions raised behind the back of
-     * the emulator don't yet set event_pending.
-     *
-     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
-     * for no functional change from before.  Future patches will fix this
-     * properly.
-     */
-    if ( rc == X86EMUL_EXCEPTION && ctxt.event_pending )
-        rc = X86EMUL_UNHANDLEABLE;
+    /* More strict than x86_emulate_wrapper(), as this is now true for PV. */
+    ASSERT(ctxt.event_pending == (rc == X86EMUL_EXCEPTION));
 
-    if ( rc == X86EMUL_UNHANDLEABLE )
-        return 0;
+    switch ( rc )
+    {
+    case X86EMUL_EXCEPTION:
+        /*
+         * This emulation only covers writes to MMCFG space or read-only MFNs.
+         * We tolerate #PF (from hitting an adjacent page).  Anything else is
+         * an emulation bug, or a guest playing with the instruction stream
+         * under Xen's feet.
+         */
+        if ( ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+             ctxt.event.vector == TRAP_page_fault )
+            pv_inject_event(&ctxt.event);
+        else
+            gdprintk(XENLOG_WARNING,
+                     "Unexpected event (type %u, vector %#x) from emulation\n",
+                     ctxt.event.type, ctxt.event.vector);
+
+        /* Fallthrough */
+    case X86EMUL_OKAY:
+
+        if ( ctxt.retire.singlestep )
+            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 
-    if ( ctxt.retire.singlestep )
-        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+        /* Fallthrough */
+    case X86EMUL_RETRY:
+        perfc_incr(ptwr_emulations);
+        return EXCRET_fault_fixed;
 
-    return EXCRET_fault_fixed;
+    default:
+        return 0;
+    }
 }
 
 void *alloc_xen_pagetable(void)
-- 
2.1.4



* [PATCH v3 18/24] x86/shadow: Avoid raising faults behind the emulators back
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (16 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 17/24] x86/pv: " Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-12-01 11:39   ` Tim Deegan
  2016-12-01 13:00   ` Jan Beulich
  2016-11-30 13:50 ` [PATCH v3 19/24] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
                   ` (5 subsequent siblings)
  23 siblings, 2 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Tim Deegan, Jan Beulich

Use x86_emul_{hw_exception,pagefault}() rather than
{pv,hvm}_inject_page_fault() and hvm_inject_hw_exception() to cause raised
faults to be known to the emulator.  This requires altering the callers of
x86_emulate() to properly re-inject the event.

While fixing this, fix the singlestep behaviour.  Previously, an otherwise
successful emulation would fail if singlestepping was active, as the emulator
couldn't raise #DB.  This is unreasonable from the point of view of the guest.

We therefore tolerate #PF/#GP/#SS and #DB being raised by the emulator, but
reject anything else as unexpected.
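
A sketch of the resulting filter (mirroring the hunk below; "event" stands
for emul_ctxt.ctxt.event):

    /* Vectors the shadow emulation path is prepared to re-inject. */
    const uint32_t tolerated = ((1u << TRAP_stack_error) |
                                (1u << TRAP_gp_fault) |
                                (1u << TRAP_page_fault));

    if ( event.type == X86_EVENTTYPE_HW_EXCEPTION &&
         event.vector < 32 && (tolerated & (1u << event.vector)) )
        /* re-inject via hvm_inject_event()/pv_inject_event() */;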

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>

v3:
 * Split out #DB handling to an earlier part of the series
 * Don't inject #GP faults for unexpected events, but do reenter the guest.
v2:
 * New
---
 xen/arch/x86/mm/shadow/common.c | 13 ++++++-------
 xen/arch/x86/mm/shadow/multi.c  | 39 ++++++++++++++++++++++++++++-----------
 2 files changed, 34 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index f07803b..e509cc1 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -162,8 +162,9 @@ static int hvm_translate_linear_addr(
 
     if ( !okay )
     {
-        hvm_inject_hw_exception(
-            (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault, 0);
+        x86_emul_hw_exception(
+            (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
+            0, &sh_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     }
 
@@ -323,7 +324,7 @@ pv_emulate_read(enum x86_segment seg,
 
     if ( (rc = copy_from_user(p_data, (void *)offset, bytes)) != 0 )
     {
-        pv_inject_page_fault(0, offset + bytes - rc); /* Read fault. */
+        x86_emul_pagefault(0, offset + bytes - rc, ctxt); /* Read fault. */
         return X86EMUL_EXCEPTION;
     }
 
@@ -1720,10 +1721,8 @@ static mfn_t emulate_gva_to_mfn(struct vcpu *v, unsigned long vaddr,
     gfn = paging_get_hostmode(v)->gva_to_gfn(v, NULL, vaddr, &pfec);
     if ( gfn == gfn_x(INVALID_GFN) )
     {
-        if ( is_hvm_vcpu(v) )
-            hvm_inject_page_fault(pfec, vaddr);
-        else
-            pv_inject_page_fault(pfec, vaddr);
+        x86_emul_pagefault(pfec, vaddr, &sh_ctxt->ctxt);
+
         return _mfn(BAD_GVA_TO_GFN);
     }
 
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 56c40f8..098b653 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3373,18 +3373,35 @@ static int sh_page_fault(struct vcpu *v,
 
     r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
 
-    /*
-     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
-     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
-     * now set event_pending instead.  Exceptions raised behind the back of
-     * the emulator don't yet set event_pending.
-     *
-     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
-     * for no functional change from before.  Future patches will fix this
-     * properly.
-     */
     if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
-        r = X86EMUL_UNHANDLEABLE;
+    {
+        /*
+         * This emulation covers writes to shadow pagetables.  We tolerate #PF
+         * (from hitting adjacent pages) and #GP/#SS (from segmentation
+         * errors).  Anything else is an emulation bug, or a guest playing
+         * with the instruction stream under Xen's feet.
+         */
+        if ( emul_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
+             (emul_ctxt.ctxt.event.vector < 32) &&
+             ((1u << emul_ctxt.ctxt.event.vector) &
+              ((1u << TRAP_stack_error) | (1u << TRAP_gp_fault) |
+               (1u << TRAP_page_fault))) )
+        {
+            if ( is_hvm_vcpu(v) )
+                hvm_inject_event(&emul_ctxt.ctxt.event);
+            else
+                pv_inject_event(&emul_ctxt.ctxt.event);
+
+            goto emulate_done;
+        }
+        else
+        {
+            SHADOW_PRINTK(
+                "Unexpected event (type %u, vector %#x) from emulation\n",
+                emul_ctxt.ctxt.event.type, emul_ctxt.ctxt.event.vector);
+            r = X86EMUL_UNHANDLEABLE;
+        }
+    }
 
     /*
      * NB. We do not unshadow on X86EMUL_EXCEPTION. It's not clear that it
-- 
2.1.4



* [PATCH v3 19/24] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (17 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 18/24] x86/shadow: " Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 20/24] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

which is filled with pagefault information should one occur.

No functional change.
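
A sketch of the extended calling convention (the fault handler named here
is hypothetical):

    pagefault_info_t pfinfo;

    if ( hvm_copy_from_guest_virt(buf, addr, size, 0, &pfinfo) ==
         HVMCOPY_bad_gva_to_gfn )
        /* pfinfo.linear and pfinfo.ec describe the faulting access. */
        example_record_fault(pfinfo.ec, pfinfo.linear);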

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/emulate.c        |  8 ++++---
 xen/arch/x86/hvm/hvm.c            | 49 +++++++++++++++++++++++++--------------
 xen/arch/x86/hvm/vmx/vvmx.c       |  9 ++++---
 xen/arch/x86/mm/shadow/common.c   |  5 ++--
 xen/include/asm-x86/hvm/support.h | 23 +++++++++++++-----
 5 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 614e182..41f689e 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -770,6 +770,7 @@ static int __hvmemul_read(
     struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
+    pagefault_info_t pfinfo;
     unsigned long addr, reps = 1;
     uint32_t pfec = PFEC_page_present;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
@@ -790,8 +791,8 @@ static int __hvmemul_read(
         pfec |= PFEC_user_mode;
 
     rc = ((access_type == hvm_access_insn_fetch) ?
-          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec) :
-          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec));
+          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo) :
+          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo));
 
     switch ( rc )
     {
@@ -878,6 +879,7 @@ static int hvmemul_write(
     struct hvm_emulate_ctxt *hvmemul_ctxt =
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
+    pagefault_info_t pfinfo;
     unsigned long addr, reps = 1;
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
@@ -896,7 +898,7 @@ static int hvmemul_write(
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
         pfec |= PFEC_user_mode;
 
-    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec);
+    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec, &pfinfo);
 
     switch ( rc )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bdfd94e..390f76d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2859,6 +2859,7 @@ void hvm_task_switch(
     struct desc_struct *optss_desc = NULL, *nptss_desc = NULL, tss_desc;
     bool_t otd_writable, ntd_writable;
     unsigned long eflags;
+    pagefault_info_t pfinfo;
     int exn_raised, rc;
     struct {
         u16 back_link,__blh;
@@ -2925,7 +2926,7 @@ void hvm_task_switch(
     }
 
     rc = hvm_copy_from_guest_virt(
-        &tss, prev_tr.base, sizeof(tss), PFEC_page_present);
+        &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
 
@@ -2963,12 +2964,12 @@ void hvm_task_switch(
                                 &tss.eip,
                                 offsetof(typeof(tss), trace) -
                                 offsetof(typeof(tss), eip),
-                                PFEC_page_present);
+                                PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
 
     rc = hvm_copy_from_guest_virt(
-        &tss, tr.base, sizeof(tss), PFEC_page_present);
+        &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     /*
      * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
      * functions knew we want RO access.
@@ -3008,7 +3009,8 @@ void hvm_task_switch(
         tss.back_link = prev_tr.sel;
 
         rc = hvm_copy_to_guest_virt(tr.base + offsetof(typeof(tss), back_link),
-                                    &tss.back_link, sizeof(tss.back_link), 0);
+                                    &tss.back_link, sizeof(tss.back_link), 0,
+                                    &pfinfo);
         if ( rc == HVMCOPY_bad_gva_to_gfn )
             exn_raised = 1;
         else if ( rc != HVMCOPY_okay )
@@ -3045,7 +3047,8 @@ void hvm_task_switch(
                                         16 << segr.attr.fields.db,
                                         &linear_addr) )
         {
-            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0);
+            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0,
+                                        &pfinfo);
             if ( rc == HVMCOPY_bad_gva_to_gfn )
                 exn_raised = 1;
             else if ( rc != HVMCOPY_okay )
@@ -3068,7 +3071,8 @@ void hvm_task_switch(
 #define HVMCOPY_phys       (0u<<2)
 #define HVMCOPY_virt       (1u<<2)
 static enum hvm_copy_result __hvm_copy(
-    void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec)
+    void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     struct vcpu *curr = current;
     unsigned long gfn;
@@ -3109,7 +3113,15 @@ static enum hvm_copy_result __hvm_copy(
                 if ( pfec & PFEC_page_shared )
                     return HVMCOPY_gfn_shared;
                 if ( flags & HVMCOPY_fault )
+                {
+                    if ( pfinfo )
+                    {
+                        pfinfo->linear = addr;
+                        pfinfo->ec = pfec;
+                    }
+
                     hvm_inject_page_fault(pfec, addr);
+                }
                 return HVMCOPY_bad_gva_to_gfn;
             }
             gpa |= (paddr_t)gfn << PAGE_SHIFT;
@@ -3279,7 +3291,7 @@ enum hvm_copy_result hvm_copy_to_guest_phys(
 {
     return __hvm_copy(buf, paddr, size,
                       HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0);
+                      0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_phys(
@@ -3287,31 +3299,34 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
 {
     return __hvm_copy(buf, paddr, size,
                       HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0);
+                      0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec)
+    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_write_access | pfec);
+                      PFEC_page_present | PFEC_write_access | pfec, pfinfo);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
-                      PFEC_page_present | pfec);
+                      PFEC_page_present | pfec, pfinfo);
 }
 
 enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_insn_fetch | pfec);
+                      PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
 }
 
 enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
@@ -3319,7 +3334,7 @@ enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_to_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_write_access | pfec);
+                      PFEC_page_present | PFEC_write_access | pfec, NULL);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
@@ -3327,7 +3342,7 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | pfec);
+                      PFEC_page_present | pfec, NULL);
 }
 
 enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
@@ -3335,7 +3350,7 @@ enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_insn_fetch | pfec);
+                      PFEC_page_present | PFEC_insn_fetch | pfec, NULL);
 }
 
 unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index bcc4a97..7342d12 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -396,6 +396,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
     struct vcpu *v = current;
     union vmx_inst_info info;
     struct segment_register seg;
+    pagefault_info_t pfinfo;
     unsigned long base, index, seg_base, disp, offset;
     int scale, size;
 
@@ -451,7 +452,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
             goto gp_fault;
 
         if ( poperandS != NULL &&
-             hvm_copy_from_guest_virt(poperandS, base, size, 0)
+             hvm_copy_from_guest_virt(poperandS, base, size, 0, &pfinfo)
                   != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         decode->mem = base;
@@ -1611,6 +1612,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
     struct vcpu *v = current;
     struct vmx_inst_decoded decode;
     struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    pagefault_info_t pfinfo;
     unsigned long gpa = 0;
     int rc;
 
@@ -1620,7 +1622,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
 
     gpa = nvcpu->nv_vvmcxaddr;
 
-    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0);
+    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0, &pfinfo);
     if ( rc != HVMCOPY_okay )
         return X86EMUL_EXCEPTION;
 
@@ -1679,6 +1681,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
     struct vmx_inst_decoded decode;
+    pagefault_info_t pfinfo;
     u64 value = 0;
     int rc;
 
@@ -1690,7 +1693,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
 
     switch ( decode.type ) {
     case VMX_INST_MEMREG_TYPE_MEMORY:
-        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0);
+        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0, &pfinfo);
         if ( rc != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         break;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index e509cc1..e8501ce 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -179,6 +179,7 @@ hvm_read(enum x86_segment seg,
          enum hvm_access_type access_type,
          struct sh_emulate_ctxt *sh_ctxt)
 {
+    pagefault_info_t pfinfo;
     unsigned long addr;
     int rc;
 
@@ -188,9 +189,9 @@ hvm_read(enum x86_segment seg,
         return rc;
 
     if ( access_type == hvm_access_insn_fetch )
-        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0);
+        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
     else
-        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0);
+        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
 
     switch ( rc )
     {
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 9938450..4aa5a36 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -83,16 +83,27 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
  *  HVMCOPY_bad_gfn_to_mfn: Some guest physical address did not map to
  *                          ordinary machine memory.
  *  HVMCOPY_bad_gva_to_gfn: Some guest virtual address did not have a valid
- *                          mapping to a guest physical address. In this case
- *                          a page fault exception is automatically queued
- *                          for injection into the current HVM VCPU.
+ *                          mapping to a guest physical address.  The
+ *                          pagefault_info_t structure will be filled in if
+ *                          provided, and a page fault exception is
+ *                          automatically queued for injection into the
+ *                          current HVM VCPU.
  */
+typedef struct pagefault_info
+{
+    unsigned long linear;
+    int ec;
+} pagefault_info_t;
+
 enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec);
+    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo);
 enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo);
 enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo);
 
 /*
  * As above (copy to/from a guest virtual address), but no fault is generated
-- 
2.1.4



* [PATCH v3 20/24] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (18 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 19/24] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 21/24] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

No functional change.
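
The removed _nofault wrappers reduce to passing a NULL pagefault_info_t
pointer, e.g. (sketch):

    /* Was: hvm_fetch_from_guest_virt_nofault(buf, addr, size, 0); */
    rc = hvm_fetch_from_guest_virt(buf, addr, size, 0, NULL);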

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
---
 xen/arch/x86/hvm/emulate.c        |  6 ++---
 xen/arch/x86/hvm/hvm.c            | 56 +++++++++------------------------------
 xen/arch/x86/mm/shadow/common.c   |  8 +++---
 xen/include/asm-x86/hvm/support.h | 11 --------
 4 files changed, 19 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 41f689e..321c5aa 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1937,9 +1937,9 @@ void hvm_emulate_init_per_insn(
                                         hvm_access_insn_fetch,
                                         hvmemul_ctxt->ctxt.addr_size,
                                         &addr) &&
-             hvm_fetch_from_guest_virt_nofault(hvmemul_ctxt->insn_buf, addr,
-                                               sizeof(hvmemul_ctxt->insn_buf),
-                                               pfec) == HVMCOPY_okay) ?
+             hvm_fetch_from_guest_virt(hvmemul_ctxt->insn_buf, addr,
+                                       sizeof(hvmemul_ctxt->insn_buf),
+                                       pfec, NULL) == HVMCOPY_okay) ?
             sizeof(hvmemul_ctxt->insn_buf) : 0;
     }
     else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 390f76d..5eae06a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3066,8 +3066,6 @@ void hvm_task_switch(
 
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
-#define HVMCOPY_no_fault   (0u<<1)
-#define HVMCOPY_fault      (1u<<1)
 #define HVMCOPY_phys       (0u<<2)
 #define HVMCOPY_virt       (1u<<2)
 static enum hvm_copy_result __hvm_copy(
@@ -3112,13 +3110,10 @@ static enum hvm_copy_result __hvm_copy(
                     return HVMCOPY_gfn_paged_out;
                 if ( pfec & PFEC_page_shared )
                     return HVMCOPY_gfn_shared;
-                if ( flags & HVMCOPY_fault )
+                if ( pfinfo )
                 {
-                    if ( pfinfo )
-                    {
-                        pfinfo->linear = addr;
-                        pfinfo->ec = pfec;
-                    }
+                    pfinfo->linear = addr;
+                    pfinfo->ec = pfec;
 
                     hvm_inject_page_fault(pfec, addr);
                 }
@@ -3290,16 +3285,14 @@ enum hvm_copy_result hvm_copy_to_guest_phys(
     paddr_t paddr, void *buf, int size)
 {
     return __hvm_copy(buf, paddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0, NULL);
+                      HVMCOPY_to_guest | HVMCOPY_phys, 0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_phys(
     void *buf, paddr_t paddr, int size)
 {
     return __hvm_copy(buf, paddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0, NULL);
+                      HVMCOPY_from_guest | HVMCOPY_phys, 0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_to_guest_virt(
@@ -3307,7 +3300,7 @@ enum hvm_copy_result hvm_copy_to_guest_virt(
     pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_virt,
+                      HVMCOPY_to_guest | HVMCOPY_virt,
                       PFEC_page_present | PFEC_write_access | pfec, pfinfo);
 }
 
@@ -3316,7 +3309,7 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
     pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
+                      HVMCOPY_from_guest | HVMCOPY_virt,
                       PFEC_page_present | pfec, pfinfo);
 }
 
@@ -3325,34 +3318,10 @@ enum hvm_copy_result hvm_fetch_from_guest_virt(
     pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
+                      HVMCOPY_from_guest | HVMCOPY_virt,
                       PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
 }
 
-enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec)
-{
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_write_access | pfec, NULL);
-}
-
-enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
-{
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | pfec, NULL);
-}
-
-enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
-{
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_insn_fetch | pfec, NULL);
-}
-
 unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
 {
     int rc;
@@ -3364,8 +3333,7 @@ unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
         return 0;
     }
 
-    rc = hvm_copy_to_guest_virt_nofault((unsigned long)to, (void *)from,
-                                        len, 0);
+    rc = hvm_copy_to_guest_virt((unsigned long)to, (void *)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_to_user() return code */
 }
 
@@ -3395,7 +3363,7 @@ unsigned long copy_from_user_hvm(void *to, const void *from, unsigned len)
         return 0;
     }
 
-    rc = hvm_copy_from_guest_virt_nofault(to, (unsigned long)from, len, 0);
+    rc = hvm_copy_from_guest_virt(to, (unsigned long)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_from_user() return code */
 }
 
@@ -4070,8 +4038,8 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
                                         (hvm_long_mode_enabled(cur) &&
                                          cs->attr.fields.l) ? 64 :
                                         cs->attr.fields.db ? 32 : 16, &addr) &&
-             (hvm_fetch_from_guest_virt_nofault(sig, addr, sizeof(sig),
-                                                walk) == HVMCOPY_okay) &&
+             (hvm_fetch_from_guest_virt(sig, addr, sizeof(sig),
+                                        walk, NULL) == HVMCOPY_okay) &&
              (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) )
         {
             regs->eip += sizeof(sig);
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index e8501ce..b659324 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -419,8 +419,8 @@ const struct x86_emulate_ops *shadow_init_emulation(
         (!hvm_translate_linear_addr(
             x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
             hvm_access_insn_fetch, sh_ctxt, &addr) &&
-         !hvm_fetch_from_guest_virt_nofault(
-             sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0))
+         !hvm_fetch_from_guest_virt(
+             sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
         ? sizeof(sh_ctxt->insn_buf) : 0;
 
     return &hvm_shadow_emulator_ops;
@@ -447,8 +447,8 @@ void shadow_continue_emulation(struct sh_emulate_ctxt *sh_ctxt,
                 (!hvm_translate_linear_addr(
                     x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
                     hvm_access_insn_fetch, sh_ctxt, &addr) &&
-                 !hvm_fetch_from_guest_virt_nofault(
-                     sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0))
+                 !hvm_fetch_from_guest_virt(
+                     sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
                 ? sizeof(sh_ctxt->insn_buf) : 0;
             sh_ctxt->insn_buf_eip = regs->eip;
         }
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 4aa5a36..114aa04 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -105,17 +105,6 @@ enum hvm_copy_result hvm_fetch_from_guest_virt(
     void *buf, unsigned long vaddr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
 
-/*
- * As above (copy to/from a guest virtual address), but no fault is generated
- * when HVMCOPY_bad_gva_to_gfn is returned.
- */
-enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec);
-enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
-enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
-
 #define HVM_HCALL_completed  0 /* hypercall completed - no further action */
 #define HVM_HCALL_preempted  1 /* hypercall preempted - re-execute VMCALL */
 #define HVM_HCALL_invalidate 2 /* invalidate ioemu-dm memory cache        */
-- 
2.1.4



* [PATCH v3 21/24] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (19 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 20/24] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

The functions use linear addresses, not virtual addresses, as no segmentation
is used.  (Lots of other code in Xen makes this mistake.)
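
(Conceptually: linear = segment base + offset, after which paging translates
the linear address to a physical one.  These functions consume the
post-segmentation linear address.)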

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
---
 xen/arch/x86/hvm/emulate.c        | 12 ++++----
 xen/arch/x86/hvm/hvm.c            | 60 +++++++++++++++++++--------------------
 xen/arch/x86/hvm/vmx/vvmx.c       |  6 ++--
 xen/arch/x86/mm/shadow/common.c   |  8 +++---
 xen/include/asm-x86/hvm/support.h | 14 ++++-----
 5 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 321c5aa..035b654 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -791,8 +791,8 @@ static int __hvmemul_read(
         pfec |= PFEC_user_mode;
 
     rc = ((access_type == hvm_access_insn_fetch) ?
-          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo) :
-          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo));
+          hvm_fetch_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo) :
+          hvm_copy_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo));
 
     switch ( rc )
     {
@@ -898,7 +898,7 @@ static int hvmemul_write(
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
         pfec |= PFEC_user_mode;
 
-    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec, &pfinfo);
+    rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo);
 
     switch ( rc )
     {
@@ -1937,9 +1937,9 @@ void hvm_emulate_init_per_insn(
                                         hvm_access_insn_fetch,
                                         hvmemul_ctxt->ctxt.addr_size,
                                         &addr) &&
-             hvm_fetch_from_guest_virt(hvmemul_ctxt->insn_buf, addr,
-                                       sizeof(hvmemul_ctxt->insn_buf),
-                                       pfec, NULL) == HVMCOPY_okay) ?
+             hvm_fetch_from_guest_linear(hvmemul_ctxt->insn_buf, addr,
+                                         sizeof(hvmemul_ctxt->insn_buf),
+                                         pfec, NULL) == HVMCOPY_okay) ?
             sizeof(hvmemul_ctxt->insn_buf) : 0;
     }
     else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5eae06a..37eaee2 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2925,7 +2925,7 @@ void hvm_task_switch(
         goto out;
     }
 
-    rc = hvm_copy_from_guest_virt(
+    rc = hvm_copy_from_guest_linear(
         &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
@@ -2960,15 +2960,15 @@ void hvm_task_switch(
     hvm_get_segment_register(v, x86_seg_ldtr, &segr);
     tss.ldt = segr.sel;
 
-    rc = hvm_copy_to_guest_virt(prev_tr.base + offsetof(typeof(tss), eip),
-                                &tss.eip,
-                                offsetof(typeof(tss), trace) -
-                                offsetof(typeof(tss), eip),
-                                PFEC_page_present, &pfinfo);
+    rc = hvm_copy_to_guest_linear(prev_tr.base + offsetof(typeof(tss), eip),
+                                  &tss.eip,
+                                  offsetof(typeof(tss), trace) -
+                                  offsetof(typeof(tss), eip),
+                                  PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
 
-    rc = hvm_copy_from_guest_virt(
+    rc = hvm_copy_from_guest_linear(
         &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     /*
      * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
@@ -3008,9 +3008,9 @@ void hvm_task_switch(
         regs->eflags |= X86_EFLAGS_NT;
         tss.back_link = prev_tr.sel;
 
-        rc = hvm_copy_to_guest_virt(tr.base + offsetof(typeof(tss), back_link),
-                                    &tss.back_link, sizeof(tss.back_link), 0,
-                                    &pfinfo);
+        rc = hvm_copy_to_guest_linear(tr.base + offsetof(typeof(tss), back_link),
+                                      &tss.back_link, sizeof(tss.back_link), 0,
+                                      &pfinfo);
         if ( rc == HVMCOPY_bad_gva_to_gfn )
             exn_raised = 1;
         else if ( rc != HVMCOPY_okay )
@@ -3047,8 +3047,8 @@ void hvm_task_switch(
                                         16 << segr.attr.fields.db,
                                         &linear_addr) )
         {
-            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0,
-                                        &pfinfo);
+            rc = hvm_copy_to_guest_linear(linear_addr, &errcode, opsz, 0,
+                                          &pfinfo);
             if ( rc == HVMCOPY_bad_gva_to_gfn )
                 exn_raised = 1;
             else if ( rc != HVMCOPY_okay )
@@ -3067,7 +3067,7 @@ void hvm_task_switch(
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
 #define HVMCOPY_phys       (0u<<2)
-#define HVMCOPY_virt       (1u<<2)
+#define HVMCOPY_linear     (1u<<2)
 static enum hvm_copy_result __hvm_copy(
     void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec,
     pagefault_info_t *pfinfo)
@@ -3101,7 +3101,7 @@ static enum hvm_copy_result __hvm_copy(
 
         count = min_t(int, PAGE_SIZE - gpa, todo);
 
-        if ( flags & HVMCOPY_virt )
+        if ( flags & HVMCOPY_linear )
         {
             gfn = paging_gva_to_gfn(curr, addr, &pfec);
             if ( gfn == gfn_x(INVALID_GFN) )
@@ -3295,30 +3295,30 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
                       HVMCOPY_from_guest | HVMCOPY_phys, 0, NULL);
 }
 
-enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_to_guest_linear(
+    unsigned long addr, void *buf, int size, uint32_t pfec,
     pagefault_info_t *pfinfo)
 {
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_virt,
+    return __hvm_copy(buf, addr, size,
+                      HVMCOPY_to_guest | HVMCOPY_linear,
                       PFEC_page_present | PFEC_write_access | pfec, pfinfo);
 }
 
-enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo)
 {
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_virt,
+    return __hvm_copy(buf, addr, size,
+                      HVMCOPY_from_guest | HVMCOPY_linear,
                       PFEC_page_present | pfec, pfinfo);
 }
 
-enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_fetch_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo)
 {
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_virt,
+    return __hvm_copy(buf, addr, size,
+                      HVMCOPY_from_guest | HVMCOPY_linear,
                       PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
 }
 
@@ -3333,7 +3333,7 @@ unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
         return 0;
     }
 
-    rc = hvm_copy_to_guest_virt((unsigned long)to, (void *)from, len, 0, NULL);
+    rc = hvm_copy_to_guest_linear((unsigned long)to, (void *)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_to_user() return code */
 }
 
@@ -3363,7 +3363,7 @@ unsigned long copy_from_user_hvm(void *to, const void *from, unsigned len)
         return 0;
     }
 
-    rc = hvm_copy_from_guest_virt(to, (unsigned long)from, len, 0, NULL);
+    rc = hvm_copy_from_guest_linear(to, (unsigned long)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_from_user() return code */
 }
 
@@ -4038,8 +4038,8 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
                                         (hvm_long_mode_enabled(cur) &&
                                          cs->attr.fields.l) ? 64 :
                                         cs->attr.fields.db ? 32 : 16, &addr) &&
-             (hvm_fetch_from_guest_virt(sig, addr, sizeof(sig),
-                                        walk, NULL) == HVMCOPY_okay) &&
+             (hvm_fetch_from_guest_linear(sig, addr, sizeof(sig),
+                                          walk, NULL) == HVMCOPY_okay) &&
              (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) )
         {
             regs->eip += sizeof(sig);
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 7342d12..fd7ea0a 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -452,7 +452,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
             goto gp_fault;
 
         if ( poperandS != NULL &&
-             hvm_copy_from_guest_virt(poperandS, base, size, 0, &pfinfo)
+             hvm_copy_from_guest_linear(poperandS, base, size, 0, &pfinfo)
                   != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         decode->mem = base;
@@ -1622,7 +1622,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
 
     gpa = nvcpu->nv_vvmcxaddr;
 
-    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0, &pfinfo);
+    rc = hvm_copy_to_guest_linear(decode.mem, &gpa, decode.len, 0, &pfinfo);
     if ( rc != HVMCOPY_okay )
         return X86EMUL_EXCEPTION;
 
@@ -1693,7 +1693,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
 
     switch ( decode.type ) {
     case VMX_INST_MEMREG_TYPE_MEMORY:
-        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0, &pfinfo);
+        rc = hvm_copy_to_guest_linear(decode.mem, &value, decode.len, 0, &pfinfo);
         if ( rc != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         break;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index b659324..0760e76 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -189,9 +189,9 @@ hvm_read(enum x86_segment seg,
         return rc;
 
     if ( access_type == hvm_access_insn_fetch )
-        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
+        rc = hvm_fetch_from_guest_linear(p_data, addr, bytes, 0, &pfinfo);
     else
-        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
+        rc = hvm_copy_from_guest_linear(p_data, addr, bytes, 0, &pfinfo);
 
     switch ( rc )
     {
@@ -419,7 +419,7 @@ const struct x86_emulate_ops *shadow_init_emulation(
         (!hvm_translate_linear_addr(
             x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
             hvm_access_insn_fetch, sh_ctxt, &addr) &&
-         !hvm_fetch_from_guest_virt(
+         !hvm_fetch_from_guest_linear(
              sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
         ? sizeof(sh_ctxt->insn_buf) : 0;
 
@@ -447,7 +447,7 @@ void shadow_continue_emulation(struct sh_emulate_ctxt *sh_ctxt,
                 (!hvm_translate_linear_addr(
                     x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
                     hvm_access_insn_fetch, sh_ctxt, &addr) &&
-                 !hvm_fetch_from_guest_virt(
+                 !hvm_fetch_from_guest_linear(
                      sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
                 ? sizeof(sh_ctxt->insn_buf) : 0;
             sh_ctxt->insn_buf_eip = regs->eip;
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 114aa04..78349f8 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -73,7 +73,7 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
     void *buf, paddr_t paddr, int size);
 
 /*
- * Copy to/from a guest virtual address. @pfec should include PFEC_user_mode
+ * Copy to/from a guest linear address. @pfec should include PFEC_user_mode
  * if emulating a user-mode access (CPL=3). All other flags in @pfec are
  * managed by the called function: it is therefore optional for the caller
  * to set them.
@@ -95,14 +95,14 @@ typedef struct pagefault_info
     int ec;
 } pagefault_info_t;
 
-enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_to_guest_linear(
+    unsigned long addr, void *buf, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
-enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
-enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_fetch_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
 
 #define HVM_HCALL_completed  0 /* hypercall completed - no further action */
-- 
2.1.4



* [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (20 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 21/24] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 14:29   ` Paul Durrant
  2016-11-30 13:50 ` [PATCH v3 23/24] x86/emul: Prepare to allow use of system segments for memory references Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 24/24] x86/emul: Use system-segment relative memory accesses Andrew Cooper
  23 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Paul Durrant

Drop the call to hvm_inject_page_fault() in __hvm_copy(), and require callers
to inject the pagefault themselves.
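
Illustratively, each caller now follows the pattern below (a sketch using the
hvm_task_switch() names from the hunks in this patch):

    rc = hvm_copy_from_guest_linear(&tss, prev_tr.base, sizeof(tss),
                                    PFEC_page_present, &pfinfo);
    if ( rc == HVMCOPY_bad_gva_to_gfn )
        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear); /* caller injects */
    if ( rc != HVMCOPY_okay )
        goto out;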

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Paul Durrant <paul.durrant@citrix.com>

v3:
 * Correct patch description
 * Fix rebasing error over previous TSS series
---
 xen/arch/x86/hvm/emulate.c        |  2 ++
 xen/arch/x86/hvm/hvm.c            | 14 ++++++++++++--
 xen/arch/x86/hvm/vmx/vvmx.c       | 20 +++++++++++++++-----
 xen/arch/x86/mm/shadow/common.c   |  1 +
 xen/include/asm-x86/hvm/support.h |  4 +---
 5 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 035b654..ccf3aa2 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -799,6 +799,7 @@ static int __hvmemul_read(
     case HVMCOPY_okay:
         break;
     case HVMCOPY_bad_gva_to_gfn:
+        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
         if ( access_type == hvm_access_insn_fetch )
@@ -905,6 +906,7 @@ static int hvmemul_write(
     case HVMCOPY_okay:
         break;
     case HVMCOPY_bad_gva_to_gfn:
+        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
         return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 37eaee2..3596f2c 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2927,6 +2927,8 @@ void hvm_task_switch(
 
     rc = hvm_copy_from_guest_linear(
         &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     if ( rc != HVMCOPY_okay )
         goto out;
 
@@ -2965,11 +2967,15 @@ void hvm_task_switch(
                                   offsetof(typeof(tss), trace) -
                                   offsetof(typeof(tss), eip),
                                   PFEC_page_present, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     if ( rc != HVMCOPY_okay )
         goto out;
 
     rc = hvm_copy_from_guest_linear(
         &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     /*
      * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
      * functions knew we want RO access.
@@ -3012,7 +3018,10 @@ void hvm_task_switch(
                                       &tss.back_link, sizeof(tss.back_link), 0,
                                       &pfinfo);
         if ( rc == HVMCOPY_bad_gva_to_gfn )
+        {
+            hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
             exn_raised = 1;
+        }
         else if ( rc != HVMCOPY_okay )
             goto out;
     }
@@ -3050,7 +3059,10 @@ void hvm_task_switch(
             rc = hvm_copy_to_guest_linear(linear_addr, &errcode, opsz, 0,
                                           &pfinfo);
             if ( rc == HVMCOPY_bad_gva_to_gfn )
+            {
+                hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
                 exn_raised = 1;
+            }
             else if ( rc != HVMCOPY_okay )
                 goto out;
         }
@@ -3114,8 +3126,6 @@ static enum hvm_copy_result __hvm_copy(
                 {
                     pfinfo->linear = addr;
                     pfinfo->ec = pfec;
-
-                    hvm_inject_page_fault(pfec, addr);
                 }
                 return HVMCOPY_bad_gva_to_gfn;
             }
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index fd7ea0a..e6e9ebd 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -396,7 +396,6 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
     struct vcpu *v = current;
     union vmx_inst_info info;
     struct segment_register seg;
-    pagefault_info_t pfinfo;
     unsigned long base, index, seg_base, disp, offset;
     int scale, size;
 
@@ -451,10 +450,17 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
               offset + size - 1 > seg.limit) )
             goto gp_fault;
 
-        if ( poperandS != NULL &&
-             hvm_copy_from_guest_linear(poperandS, base, size, 0, &pfinfo)
-                  != HVMCOPY_okay )
-            return X86EMUL_EXCEPTION;
+        if ( poperandS != NULL )
+        {
+            pagefault_info_t pfinfo;
+            int rc = hvm_copy_from_guest_linear(poperandS, base, size,
+                                                0, &pfinfo);
+
+            if ( rc == HVMCOPY_bad_gva_to_gfn )
+                hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
+            if ( rc != HVMCOPY_okay )
+                return X86EMUL_EXCEPTION;
+        }
         decode->mem = base;
         decode->len = size;
     }
@@ -1623,6 +1629,8 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
     gpa = nvcpu->nv_vvmcxaddr;
 
     rc = hvm_copy_to_guest_linear(decode.mem, &gpa, decode.len, 0, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     if ( rc != HVMCOPY_okay )
         return X86EMUL_EXCEPTION;
 
@@ -1694,6 +1702,8 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
     switch ( decode.type ) {
     case VMX_INST_MEMREG_TYPE_MEMORY:
         rc = hvm_copy_to_guest_linear(decode.mem, &value, decode.len, 0, &pfinfo);
+        if ( rc == HVMCOPY_bad_gva_to_gfn )
+            hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
         if ( rc != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         break;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 0760e76..fbe49e1 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -198,6 +198,7 @@ hvm_read(enum x86_segment seg,
     case HVMCOPY_okay:
         return X86EMUL_OKAY;
     case HVMCOPY_bad_gva_to_gfn:
+        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &sh_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
     case HVMCOPY_unhandleable:
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 78349f8..3d767d7 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -85,9 +85,7 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
  *  HVMCOPY_bad_gva_to_gfn: Some guest virtual address did not have a valid
  *                          mapping to a guest physical address.  The
  *                          pagefault_info_t structure will be filled in if
- *                          provided, and a page fault exception is
- *                          automatically queued for injection into the
- *                          current HVM VCPU.
+ *                          provided.
  */
 typedef struct pagefault_info
 {
-- 
2.1.4



* [PATCH v3 23/24] x86/emul: Prepare to allow use of system segments for memory references
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (21 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  2016-11-30 13:50 ` [PATCH v3 24/24] x86/emul: Use system-segment relative memory accesses Andrew Cooper
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

All system segments (GDT/IDT/LDT and TR) describe a linear address and limit,
and act similarly to user segments.  However, all current uses of these tables
in the emulator open-code the address calculations and limit checks.  In
particular, no care is taken for accesses which wrap around the 4GB or
non-canonical boundaries.

Alter hvm_virtual_to_linear_addr() to cope with performing segmentation checks
on system segments.  This involves restricting access checks in the 32bit case
to user segments only, and adding presence/limit checks in the 64bit case.

When suffering a segmentation fault for a system segment, return
X86EMUL_EXCEPTION but leave the fault injection to the caller.  The fault type
depends on the higher-level action being performed.
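
As a sketch of the resulting caller-side idiom (this exact shape is introduced
by the following patch): X86EMUL_EXCEPTION with no event recorded means a
system segment check failed, and the caller raises a suitable fault itself:

    switch ( rc = read_ulong(x86_seg_tr, 0x66, &iobmp, 2, ctxt, ops) )
    {
    case X86EMUL_OKAY:
        break;

    case X86EMUL_EXCEPTION:
        /* No event recorded: a system segment check failed; raise #GP. */
        generate_exception_if(!ctxt->event_pending, EXC_GP, 0);
        /* fallthrough */

    default:
        return rc;
    }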

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
---
 xen/arch/x86/hvm/emulate.c             | 14 ++++++++----
 xen/arch/x86/hvm/hvm.c                 | 40 ++++++++++++++++++++++------------
 xen/arch/x86/mm/shadow/common.c        | 12 +++++++---
 xen/arch/x86/x86_emulate/x86_emulate.h | 26 ++++++++++++++--------
 4 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ccf3aa2..d0a043b 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -567,10 +567,16 @@ static int hvmemul_virtual_to_linear(
     if ( *reps != 1 )
         return X86EMUL_UNHANDLEABLE;
 
-    /* This is a singleton operation: fail it with an exception. */
-    x86_emul_hw_exception((seg == x86_seg_ss)
-                          ? TRAP_stack_error
-                          : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
+    /*
+     * Leave exception injection to the caller for non-user segments: We
+     * neither know the exact error code to be used, nor can we easily
+     * determine the kind of exception (#GP or #TS) in that case.
+     */
+    if ( is_x86_user_segment(seg) )
+        x86_emul_hw_exception((seg == x86_seg_ss)
+                              ? TRAP_stack_error
+                              : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
+
     return X86EMUL_EXCEPTION;
 }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3596f2c..426edee 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2497,24 +2497,28 @@ bool_t hvm_virtual_to_linear_addr(
         if ( !reg->attr.fields.p )
             goto out;
 
-        switch ( access_type )
+        /* Read/write restrictions only exist for user segments. */
+        if ( reg->attr.fields.s )
         {
-        case hvm_access_read:
-            if ( (reg->attr.fields.type & 0xa) == 0x8 )
-                goto out; /* execute-only code segment */
-            break;
-        case hvm_access_write:
-            if ( (reg->attr.fields.type & 0xa) != 0x2 )
-                goto out; /* not a writable data segment */
-            break;
-        default:
-            break;
+            switch ( access_type )
+            {
+            case hvm_access_read:
+                if ( (reg->attr.fields.type & 0xa) == 0x8 )
+                    goto out; /* execute-only code segment */
+                break;
+            case hvm_access_write:
+                if ( (reg->attr.fields.type & 0xa) != 0x2 )
+                    goto out; /* not a writable data segment */
+                break;
+            default:
+                break;
+            }
         }
 
         last_byte = (uint32_t)offset + bytes - !!bytes;
 
         /* Is this a grows-down data segment? Special limit check if so. */
-        if ( (reg->attr.fields.type & 0xc) == 0x4 )
+        if ( reg->attr.fields.s && (reg->attr.fields.type & 0xc) == 0x4 )
         {
             /* Is upper limit 0xFFFF or 0xFFFFFFFF? */
             if ( !reg->attr.fields.db )
@@ -2530,10 +2534,18 @@ bool_t hvm_virtual_to_linear_addr(
     else
     {
         /*
-         * LONG MODE: FS and GS add segment base. Addresses must be canonical.
+         * User segments are always treated as present.  System segments may
+         * not be, and also incur limit checks.
          */
+        if ( is_x86_system_segment(seg) &&
+             (!reg->attr.fields.p || (offset + bytes - !!bytes) > reg->limit) )
+            goto out;
 
-        if ( (seg == x86_seg_fs) || (seg == x86_seg_gs) )
+        /*
+         * LONG MODE: FS, GS and system segments: add segment base. All
+         * addresses must be canonical.
+         */
+        if ( seg >= x86_seg_fs )
             addr += reg->base;
 
         last_byte = addr + bytes - !!bytes;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index fbe49e1..6c146f8 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -162,9 +162,15 @@ static int hvm_translate_linear_addr(
 
     if ( !okay )
     {
-        x86_emul_hw_exception(
-            (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
-            0, &sh_ctxt->ctxt);
+        /*
+         * Leave exception injection to the caller for non-user segments: We
+         * neither know the exact error code to be used, nor can we easily
+         * determine the kind of exception (#GP or #TS) in that case.
+         */
+        if ( is_x86_user_segment(seg) )
+            x86_emul_hw_exception(
+                (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
+                0, &sh_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     }
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 8aa4b0b..a7d3060 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -31,7 +31,11 @@
 
 struct x86_emulate_ctxt;
 
-/* Comprehensive enumeration of x86 segment registers. */
+/*
+ * Comprehensive enumeration of x86 segment registers.  Various bits of code
+ * rely on this order (general purpose before system, tr at the beginning of
+ * system).
+ */
 enum x86_segment {
     /* General purpose.  Matches the SReg3 encoding in opcode/ModRM bytes. */
     x86_seg_es,
@@ -40,21 +44,25 @@ enum x86_segment {
     x86_seg_ds,
     x86_seg_fs,
     x86_seg_gs,
-    /* System. */
+    /* System: Valid to use for implicit table references. */
     x86_seg_tr,
     x86_seg_ldtr,
     x86_seg_gdtr,
     x86_seg_idtr,
-    /*
-     * Dummy: used to emulate direct processor accesses to management
-     * structures (TSS, GDT, LDT, IDT, etc.) which use linear addressing
-     * (no segment component) and bypass usual segment- and page-level
-     * protection checks.
-     */
+    /* No Segment: For accesses which are already linear. */
     x86_seg_none
 };
 
-#define is_x86_user_segment(seg) ((unsigned)(seg) <= x86_seg_gs)
+static inline bool is_x86_user_segment(enum x86_segment seg)
+{
+    unsigned int idx = seg;
+
+    return idx <= x86_seg_gs;
+}
+static inline bool is_x86_system_segment(enum x86_segment seg)
+{
+    return seg >= x86_seg_tr && seg < x86_seg_none;
+}
 
 /* Classification of the types of software generated interrupts/exceptions. */
 enum x86_swint_type {
-- 
2.1.4



* [PATCH v3 24/24] x86/emul: Use system-segment relative memory accesses
  2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
                   ` (22 preceding siblings ...)
  2016-11-30 13:50 ` [PATCH v3 23/24] x86/emul: Prepare to allow use of system segments for memory references Andrew Cooper
@ 2016-11-30 13:50 ` Andrew Cooper
  23 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 13:50 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

With hvm_virtual_to_linear_addr() capable of doing proper system-segment
relative memory accesses, avoid open-coding the address and limit calculations
locally.

When a table spans the 4GB boundary (32bit) or the non-canonical boundary
(64bit), segmentation errors are now raised.  Previously, the use of
x86_seg_none caused segmentation checks to be skipped, so the linear address
was truncated through the pagewalk and could come out valid on the far side.
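
The shape of the conversion, taking the descriptor read in protmode_load_seg()
as an example (both forms appear in the hunks below):

    /* Before: open-coded limit check plus a linear (x86_seg_none) read. */
    if ( ((sel & 0xfff8) + 7) > desctab.limit )
        goto raise_exn;
    rc = ops->read(x86_seg_none, desctab.base + (sel & 0xfff8),
                   &desc, sizeof(desc), ctxt);

    /* After: a segment-relative read; segmentation does the checking. */
    rc = ops->read(sel_seg, sel & 0xfff8, &desc, sizeof(desc), ctxt);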

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
v2:
 * Shorten exception handling
 * Replace ->cmpxchg() assertion with proper exception handling
---
 xen/arch/x86/hvm/hvm.c                 |   8 +++
 xen/arch/x86/x86_emulate/x86_emulate.c | 123 +++++++++++++++++++++------------
 2 files changed, 85 insertions(+), 46 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 426edee..599363b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2470,6 +2470,14 @@ bool_t hvm_virtual_to_linear_addr(
     unsigned long addr = offset, last_byte;
     bool_t okay = 0;
 
+    /*
+     * These checks are for a memory access through an active segment.
+     *
+     * It is expected that the access rights of reg are suitable for seg (and
+     * that this is enforced at the point that seg is loaded).
+     */
+    ASSERT(seg < x86_seg_none);
+
     if ( !(current->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
     {
         /*
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 0fb2c09..c18adbe 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1181,20 +1181,36 @@ static int ioport_access_check(
         return rc;
 
     /* Ensure the TSS has an io-bitmap-offset field. */
-    generate_exception_if(tr.attr.fields.type != 0xb ||
-                          tr.limit < 0x67, EXC_GP, 0);
+    generate_exception_if(tr.attr.fields.type != 0xb, EXC_GP, 0);
 
-    if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
-                          &iobmp, 2, ctxt, ops)) )
+    switch ( rc = read_ulong(x86_seg_tr, 0x66, &iobmp, 2, ctxt, ops) )
+    {
+    case X86EMUL_OKAY:
+        break;
+
+    case X86EMUL_EXCEPTION:
+        generate_exception_if(!ctxt->event_pending, EXC_GP, 0);
+        /* fallthrough */
+
+    default:
         return rc;
+    }
 
-    /* Ensure TSS includes two bytes including byte containing first port. */
-    iobmp += first_port / 8;
-    generate_exception_if(tr.limit <= iobmp, EXC_GP, 0);
+    /* Read two bytes including byte containing first port. */
+    switch ( rc = read_ulong(x86_seg_tr, iobmp + first_port / 8,
+                             &iobmp, 2, ctxt, ops) )
+    {
+    case X86EMUL_OKAY:
+        break;
+
+    case X86EMUL_EXCEPTION:
+        generate_exception_if(!ctxt->event_pending, EXC_GP, 0);
+        /* fallthrough */
 
-    if ( (rc = read_ulong(x86_seg_none, tr.base + iobmp,
-                          &iobmp, 2, ctxt, ops)) )
+    default:
         return rc;
+    }
+
     generate_exception_if(iobmp & (((1 << bytes) - 1) << (first_port & 7)),
                           EXC_GP, 0);
 
@@ -1317,9 +1333,12 @@ realmode_load_seg(
     struct x86_emulate_ctxt *ctxt,
     const struct x86_emulate_ops *ops)
 {
-    int rc = ops->read_segment(seg, sreg, ctxt);
+    int rc;
+
+    if ( !ops->read_segment )
+        return X86EMUL_UNHANDLEABLE;
 
-    if ( !rc )
+    if ( (rc = ops->read_segment(seg, sreg, ctxt)) == X86EMUL_OKAY )
     {
         sreg->sel  = sel;
         sreg->base = (uint32_t)sel << 4;
@@ -1336,7 +1355,7 @@ protmode_load_seg(
     struct x86_emulate_ctxt *ctxt,
     const struct x86_emulate_ops *ops)
 {
-    struct segment_register desctab;
+    enum x86_segment sel_seg = (sel & 4) ? x86_seg_ldtr : x86_seg_gdtr;
     struct { uint32_t a, b; } desc;
     uint8_t dpl, rpl;
     int cpl = get_cpl(ctxt, ops);
@@ -1369,21 +1388,19 @@ protmode_load_seg(
     if ( !is_x86_user_segment(seg) && (sel & 4) )
         goto raise_exn;
 
-    if ( (rc = ops->read_segment((sel & 4) ? x86_seg_ldtr : x86_seg_gdtr,
-                                 &desctab, ctxt)) )
-        return rc;
-
-    /* Segment not valid for use (cooked meaning of .p)? */
-    if ( !desctab.attr.fields.p )
-        goto raise_exn;
+    switch ( rc = ops->read(sel_seg, sel & 0xfff8, &desc, sizeof(desc), ctxt) )
+    {
+    case X86EMUL_OKAY:
+        break;
 
-    /* Check against descriptor table limit. */
-    if ( ((sel & 0xfff8) + 7) > desctab.limit )
-        goto raise_exn;
+    case X86EMUL_EXCEPTION:
+        if ( !ctxt->event_pending )
+            goto raise_exn;
+        /* fallthrough */
 
-    if ( (rc = ops->read(x86_seg_none, desctab.base + (sel & 0xfff8),
-                         &desc, sizeof(desc), ctxt)) )
+    default:
         return rc;
+    }
 
     if ( !is_x86_user_segment(seg) )
     {
@@ -1471,9 +1488,20 @@ protmode_load_seg(
     {
         uint32_t new_desc_b = desc.b | a_flag;
 
-        if ( (rc = ops->cmpxchg(x86_seg_none, desctab.base + (sel & 0xfff8) + 4,
-                                &desc.b, &new_desc_b, 4, ctxt)) != 0 )
+        switch ( (rc = ops->cmpxchg(sel_seg, (sel & 0xfff8) + 4, &desc.b,
+                                    &new_desc_b, sizeof(desc.b), ctxt)) )
+        {
+        case X86EMUL_OKAY:
+            break;
+
+        case X86EMUL_EXCEPTION:
+            if ( !ctxt->event_pending )
+                goto raise_exn;
+            /* fallthrough */
+
+        default:
             return rc;
+        }
 
         /* Force the Accessed flag in our local copy. */
         desc.b = new_desc_b;
@@ -1507,8 +1535,7 @@ load_seg(
     struct segment_register reg;
     int rc;
 
-    if ( (ops->read_segment == NULL) ||
-         (ops->write_segment == NULL) )
+    if ( !ops->write_segment )
         return X86EMUL_UNHANDLEABLE;
 
     if ( !sreg )
@@ -1636,8 +1663,7 @@ static int inject_swint(enum x86_swint_type type,
         if ( !in_realmode(ctxt, ops) )
         {
             unsigned int idte_size, idte_offset;
-            struct segment_register idtr;
-            uint32_t idte_ctl;
+            struct { uint32_t a, b, c, d; } idte;
             int lm = in_longmode(ctxt, ops);
 
             if ( lm < 0 )
@@ -1660,24 +1686,30 @@ static int inject_swint(enum x86_swint_type type,
                  ((ctxt->regs->eflags & EFLG_IOPL) != EFLG_IOPL) )
                 goto raise_exn;
 
-            fail_if(ops->read_segment == NULL);
             fail_if(ops->read == NULL);
-            if ( (rc = ops->read_segment(x86_seg_idtr, &idtr, ctxt)) )
-                goto done;
-
-            if ( (idte_offset + idte_size - 1) > idtr.limit )
-                goto raise_exn;
 
             /*
-             * Should strictly speaking read all 8/16 bytes of an entry,
-             * but we currently only care about the dpl and present bits.
+             * Read all 8/16 bytes so the idtr limit check is applied properly
+             * to this entry, even though we only end up looking at the 2nd
+             * word.
              */
-            if ( (rc = ops->read(x86_seg_none, idtr.base + idte_offset + 4,
-                                 &idte_ctl, sizeof(idte_ctl), ctxt)) )
-                goto done;
+            switch ( rc = ops->read(x86_seg_idtr, idte_offset,
+                                    &idte, idte_size, ctxt) )
+            {
+            case X86EMUL_OKAY:
+                break;
+
+            case X86EMUL_EXCEPTION:
+                if ( !ctxt->event_pending )
+                    goto raise_exn;
+                /* fallthrough */
+
+            default:
+                return rc;
+            }
 
             /* Is this entry present? */
-            if ( !(idte_ctl & (1u << 15)) )
+            if ( !(idte.b & (1u << 15)) )
             {
                 fault_type = EXC_NP;
                 goto raise_exn;
@@ -1686,12 +1718,11 @@ static int inject_swint(enum x86_swint_type type,
             /* icebp counts as a hardware event, and bypasses the dpl check. */
             if ( type != x86_swint_icebp )
             {
-                struct segment_register ss;
+                int cpl = get_cpl(ctxt, ops);
 
-                if ( (rc = ops->read_segment(x86_seg_ss, &ss, ctxt)) )
-                    goto done;
+                fail_if(cpl < 0);
 
-                if ( ss.attr.fields.dpl > ((idte_ctl >> 13) & 3) )
+                if ( cpl > ((idte.b >> 13) & 3) )
                     goto raise_exn;
             }
         }
-- 
2.1.4



* Re: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
  2016-11-30 13:50 ` [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union Andrew Cooper
@ 2016-11-30 13:58   ` Paul Durrant
  2016-11-30 14:02     ` Andrew Cooper
  2016-12-01 10:08   ` Jan Beulich
  1 sibling, 1 reply; 59+ messages in thread
From: Paul Durrant @ 2016-11-30 13:58 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 30 November 2016 13:50
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
> 
> Rename byte to raw, as the field being a single byte long is an
> implementation
> detail.  Make the bitfields part of an anonymous struct to remove the .flags
> qualifier.  Change the types of the flags to being booleans, to match their
> use.
> 

Is it legitimate to use a bool in a bitfield? Also, anonymous unions are not part of C99 AFAIK... are we now stipulating something more recent?

  Paul

> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> 
> v3:
>  * New
> ---
>  xen/arch/x86/hvm/emulate.c             |  6 +++---
>  xen/arch/x86/x86_emulate/x86_emulate.c | 10 +++++-----
>  xen/arch/x86/x86_emulate/x86_emulate.h | 10 +++++-----
>  3 files changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index bc259ec..fe62500 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1791,13 +1791,13 @@ static int _hvm_emulate_one(struct
> hvm_emulate_ctxt *hvmemul_ctxt,
>      new_intr_shadow = hvmemul_ctxt->intr_shadow;
> 
>      /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
> -    if ( hvmemul_ctxt->ctxt.retire.flags.mov_ss )
> +    if ( hvmemul_ctxt->ctxt.retire.mov_ss )
>          new_intr_shadow ^= HVM_INTR_SHADOW_MOV_SS;
>      else
>          new_intr_shadow &= ~HVM_INTR_SHADOW_MOV_SS;
> 
>      /* STI instruction toggles STI shadow, else we just clear it. */
> -    if ( hvmemul_ctxt->ctxt.retire.flags.sti )
> +    if ( hvmemul_ctxt->ctxt.retire.sti )
>          new_intr_shadow ^= HVM_INTR_SHADOW_STI;
>      else
>          new_intr_shadow &= ~HVM_INTR_SHADOW_STI;
> @@ -1808,7 +1808,7 @@ static int _hvm_emulate_one(struct
> hvm_emulate_ctxt *hvmemul_ctxt,
>          hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
>      }
> 
> -    if ( hvmemul_ctxt->ctxt.retire.flags.hlt &&
> +    if ( hvmemul_ctxt->ctxt.retire.hlt &&
>           !hvm_local_events_need_delivery(curr) )
>      {
>          hvm_hlt(regs->eflags);
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 9c28ed4..416812e 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1905,7 +1905,7 @@ x86_decode(
>      state->eip = ctxt->regs->eip;
> 
>      /* Initialise output state in x86_emulate_ctxt */
> -    ctxt->retire.byte = 0;
> +    ctxt->retire.raw = 0;
> 
>      op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt-
> >addr_size/8;
>      if ( op_bytes == 8 )
> @@ -2668,7 +2668,7 @@ x86_emulate(
> 
>      case 0x17: /* pop %%ss */
>          src.val = x86_seg_ss;
> -        ctxt->retire.flags.mov_ss = 1;
> +        ctxt->retire.mov_ss = 1;
>          goto pop_seg;
> 
>      case 0x1e: /* push %%ds */
> @@ -2996,7 +2996,7 @@ x86_emulate(
>          if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 )
>              goto done;
>          if ( seg == x86_seg_ss )
> -            ctxt->retire.flags.mov_ss = 1;
> +            ctxt->retire.mov_ss = 1;
>          dst.type = OP_NONE;
>          break;
> 
> @@ -4033,7 +4033,7 @@ x86_emulate(
> 
>      case 0xf4: /* hlt */
>          generate_exception_if(!mode_ring0(), EXC_GP, 0);
> -        ctxt->retire.flags.hlt = 1;
> +        ctxt->retire.hlt = 1;
>          break;
> 
>      case 0xf5: /* cmc */
> @@ -4247,7 +4247,7 @@ x86_emulate(
>          if ( !(_regs.eflags & EFLG_IF) )
>          {
>              _regs.eflags |= EFLG_IF;
> -            ctxt->retire.flags.sti = 1;
> +            ctxt->retire.sti = 1;
>          }
>          break;
> 
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h
> b/xen/arch/x86/x86_emulate/x86_emulate.h
> index b0f0304..ef39601 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -468,12 +468,12 @@ struct x86_emulate_ctxt
> 
>      /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY).
> */
>      union {
> +        uint8_t raw;
>          struct {
> -            uint8_t hlt:1;          /* Instruction HLTed. */
> -            uint8_t mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
> -            uint8_t sti:1;          /* Instruction sets STI irq shadow. */
> -        } flags;
> -        uint8_t byte;
> +            bool hlt:1;          /* Instruction HLTed. */
> +            bool mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
> +            bool sti:1;          /* Instruction sets STI irq shadow. */
> +        };
>      } retire;
>  };
> 
> --
> 2.1.4



* Re: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
  2016-11-30 13:58   ` Paul Durrant
@ 2016-11-30 14:02     ` Andrew Cooper
  2016-11-30 14:05       ` Paul Durrant
  0 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-11-30 14:02 UTC (permalink / raw)
  To: Paul Durrant, Xen-devel; +Cc: Jan Beulich

On 30/11/16 13:58, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: 30 November 2016 13:50
>> To: Xen-devel <xen-devel@lists.xen.org>
>> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
>> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>
>> Subject: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
>>
>> Rename byte to raw, as the field being a single byte long is an
>> implementation
>> detail.  Make the bitfields part of an anonymous struct to remove the .flags
>> qualifier.  Change the types of the flags to being booleans, to match their
>> use.
>>
> Is it legitimate to use a bool in a bitfield?

Yes.  Why wouldn't it be?

> Also, anonymous unions are not part of C99 AFAIK... are we now stipulating something more recent?

We've used gnu99 for as long as I can remember, and we have other
examples of this pattern already in Xen.
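
For reference, a minimal standalone sketch of the construct under discussion
(flag names mirror the patch; builds with gcc -std=gnu99, where both features
are accepted):

    #include <stdbool.h>
    #include <stdint.h>

    union retire {
        uint8_t raw;
        struct {            /* anonymous struct: C11, GNU extension before */
            bool hlt:1;     /* _Bool bit-fields are permitted since C99 */
            bool mov_ss:1;
            bool sti:1;
        };
    };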

~Andrew


* Re: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
  2016-11-30 14:02     ` Andrew Cooper
@ 2016-11-30 14:05       ` Paul Durrant
  2016-11-30 16:43         ` Jan Beulich
  0 siblings, 1 reply; 59+ messages in thread
From: Paul Durrant @ 2016-11-30 14:05 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Jan Beulich

> -----Original Message-----
> From: Andrew Cooper
> Sent: 30 November 2016 14:02
> To: Paul Durrant <Paul.Durrant@citrix.com>; Xen-devel <xen-
> devel@lists.xen.org>
> Cc: Jan Beulich <JBeulich@suse.com>
> Subject: Re: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire
> union
> 
> On 30/11/16 13:58, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> Sent: 30 November 2016 13:50
> >> To: Xen-devel <xen-devel@lists.xen.org>
> >> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> >> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>
> >> Subject: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire
> union
> >>
> >> Rename byte to raw, as the field being a single byte long is an
> >> implementation
> >> detail.  Make the bitfields part of an anonymous struct to remove the
> .flags
> >> qualifier.  Change the types of the flags to being booleans, to match their
> >> use.
> >>
> > Is it legitimate to use a bool in a bitfield?
> 
> Yes.  Why wouldn't it be?
> 

They always used to be restricted to int or unsigned int. Looks like this was relaxed in C99.

> > Also, anonymous unions are not part of C99 AFAIK... are we now stipulating
> something more recent?
> 
> We used gnu99 for as long as I can remember, and we have other examples
> of this pattern already in Xen.
> 

If there's precedent then that's fine.

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ~Andrew


* Re: [PATCH v3 13/24] x86/emul: Rework emulator event injection
  2016-11-30 13:50 ` [PATCH v3 13/24] x86/emul: Rework emulator event injection Andrew Cooper
@ 2016-11-30 14:26   ` Paul Durrant
  2016-12-01 11:35   ` Tim Deegan
  2016-12-01 12:31   ` Jan Beulich
  2 siblings, 0 replies; 59+ messages in thread
From: Paul Durrant @ 2016-11-30 14:26 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Tim (Xen.org), George Dunlap, Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 30 November 2016 13:51
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; George Dunlap <George.Dunlap@citrix.com>
> Subject: [PATCH v3 13/24] x86/emul: Rework emulator event injection
> 
> The emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.
> 
> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt
> so they
> are visible to the emulator.  This removes the need for the
> inject_{hw_exception,sw_interrupt}() hooks, which are dropped and
> replaced
> with x86_emul_{hw_exception,software_event,reset_event}() instead.
> 
> For exceptions raised by x86_emulate() itself (rather than its callbacks), the
> shadow pagetable and PV uses of x86_emulate() previously failed with
> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks.
> 
> This behaviour has changed, and such cases will now return
> X86EMUL_EXCEPTION
> with event_pending set.  Until the callers of x86_emulate() have been
> updated
> to inject events back into the guest, divert the event_pending case back into
> the X86EMUL_UNHANDLEABLE path to maintain the same guest-visible
> behaviour.
> 
> No overall functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> CC: Tim Deegan <tim@xen.org>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> 
> v3:
>  * Rework how the event_pending case is currently handled
> v2:
>  * Change x86_emul_hw_exception()'s error_code parameter to being
> signed
>  * Clarify how software interrupt injection happens.
>  * More ASSERT()'s and description of how event_pending works without the
>    inject_sw_interrupt() hook
> ---
>  xen/arch/x86/hvm/emulate.c             | 81 ++++------------------------------
>  xen/arch/x86/hvm/hvm.c                 |  4 +-
>  xen/arch/x86/hvm/io.c                  |  4 +-
>  xen/arch/x86/hvm/vmx/realmode.c        | 16 +++----
>  xen/arch/x86/mm.c                      | 26 +++++++++++
>  xen/arch/x86/mm/shadow/multi.c         | 17 +++++++
>  xen/arch/x86/x86_emulate/x86_emulate.c | 12 +++--
>  xen/arch/x86/x86_emulate/x86_emulate.h | 76
> +++++++++++++++++++++++++------
>  xen/include/asm-x86/hvm/emulate.h      |  3 --
>  9 files changed, 132 insertions(+), 107 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 91c79fa..4b8c9a0 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -568,12 +568,9 @@ static int hvmemul_virtual_to_linear(
>          return X86EMUL_UNHANDLEABLE;
> 
>      /* This is a singleton operation: fail it with an exception. */
> -    hvmemul_ctxt->exn_pending = 1;
> -    hvmemul_ctxt->trap.vector =
> -        (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault;
> -    hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
> -    hvmemul_ctxt->trap.error_code = 0;
> -    hvmemul_ctxt->trap.insn_len = 0;
> +    x86_emul_hw_exception((seg == x86_seg_ss)
> +                          ? TRAP_stack_error
> +                          : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
>      return X86EMUL_EXCEPTION;
>  }
> 
> @@ -1562,59 +1559,6 @@ int hvmemul_cpuid(
>      return X86EMUL_OKAY;
>  }
> 
> -static int hvmemul_inject_hw_exception(
> -    uint8_t vector,
> -    int32_t error_code,
> -    struct x86_emulate_ctxt *ctxt)
> -{
> -    struct hvm_emulate_ctxt *hvmemul_ctxt =
> -        container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
> -
> -    hvmemul_ctxt->exn_pending = 1;
> -    hvmemul_ctxt->trap.vector = vector;
> -    hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
> -    hvmemul_ctxt->trap.error_code = error_code;
> -    hvmemul_ctxt->trap.insn_len = 0;
> -
> -    return X86EMUL_OKAY;
> -}
> -
> -static int hvmemul_inject_sw_interrupt(
> -    enum x86_swint_type type,
> -    uint8_t vector,
> -    uint8_t insn_len,
> -    struct x86_emulate_ctxt *ctxt)
> -{
> -    struct hvm_emulate_ctxt *hvmemul_ctxt =
> -        container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
> -
> -    switch ( type )
> -    {
> -    case x86_swint_icebp:
> -        hvmemul_ctxt->trap.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
> -        break;
> -
> -    case x86_swint_int3:
> -    case x86_swint_into:
> -        hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_EXCEPTION;
> -        break;
> -
> -    case x86_swint_int:
> -        hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_INTERRUPT;
> -        break;
> -
> -    default:
> -        return X86EMUL_UNHANDLEABLE;
> -    }
> -
> -    hvmemul_ctxt->exn_pending = 1;
> -    hvmemul_ctxt->trap.vector = vector;
> -    hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
> -    hvmemul_ctxt->trap.insn_len = insn_len;
> -
> -    return X86EMUL_OKAY;
> -}
> -
>  static int hvmemul_get_fpu(
>      void (*exception_callback)(void *, struct cpu_user_regs *),
>      void *exception_callback_arg,
> @@ -1678,8 +1622,7 @@ static int hvmemul_invlpg(
>           * hvmemul_virtual_to_linear() raises exceptions for type/limit
>           * violations, so squash them.
>           */
> -        hvmemul_ctxt->exn_pending = 0;
> -        hvmemul_ctxt->trap = (struct x86_event){};
> +        x86_emul_reset_event(ctxt);
>          rc = X86EMUL_OKAY;
>      }
> 
> @@ -1696,7 +1639,7 @@ static int hvmemul_vmfunc(
> 
>      rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
>      if ( rc != X86EMUL_OKAY )
> -        hvmemul_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC,
> ctxt);
> +        x86_emul_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC, ctxt);
> 
>      return rc;
>  }
> @@ -1720,8 +1663,6 @@ static const struct x86_emulate_ops
> hvm_emulate_ops = {
>      .write_msr     = hvmemul_write_msr,
>      .wbinvd        = hvmemul_wbinvd,
>      .cpuid         = hvmemul_cpuid,
> -    .inject_hw_exception = hvmemul_inject_hw_exception,
> -    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
>      .invlpg        = hvmemul_invlpg,
> @@ -1747,8 +1688,6 @@ static const struct x86_emulate_ops
> hvm_emulate_ops_no_write = {
>      .write_msr     = hvmemul_write_msr_discard,
>      .wbinvd        = hvmemul_wbinvd_discard,
>      .cpuid         = hvmemul_cpuid,
> -    .inject_hw_exception = hvmemul_inject_hw_exception,
> -    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
>      .get_fpu       = hvmemul_get_fpu,
>      .put_fpu       = hvmemul_put_fpu,
>      .invlpg        = hvmemul_invlpg,
> @@ -1870,8 +1809,8 @@ int hvm_emulate_one_mmio(unsigned long mfn,
> unsigned long gla)
>          hvm_dump_emulation_state(XENLOG_G_WARNING "MMCFG", &ctxt);
>          break;
>      case X86EMUL_EXCEPTION:
> -        if ( ctxt.exn_pending )
> -            hvm_inject_event(&ctxt.trap);
> +        if ( ctxt.ctxt.event_pending )
> +            hvm_inject_event(&ctxt.ctxt.event);
>          /* fallthrough */
>      default:
>          hvm_emulate_writeback(&ctxt);
> @@ -1930,8 +1869,8 @@ void hvm_emulate_one_vm_event(enum
> emul_kind kind, unsigned int trapnr,
>          hvm_inject_hw_exception(trapnr, errcode);
>          break;
>      case X86EMUL_EXCEPTION:
> -        if ( ctx.exn_pending )
> -            hvm_inject_event(&ctx.trap);
> +        if ( ctx.ctxt.event_pending )
> +            hvm_inject_event(&ctx.ctxt.event);
>          break;
>      }
> 
> @@ -2006,8 +1945,6 @@ void hvm_emulate_init_per_insn(
>          hvmemul_ctxt->insn_buf_bytes = insn_bytes;
>          memcpy(hvmemul_ctxt->insn_buf, insn_buf, insn_bytes);
>      }
> -
> -    hvmemul_ctxt->exn_pending = 0;
>  }
> 
>  void hvm_emulate_writeback(
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index b950842..ef83100 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4076,8 +4076,8 @@ void hvm_ud_intercept(struct cpu_user_regs
> *regs)
>          hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          break;
>      case X86EMUL_EXCEPTION:
> -        if ( ctxt.exn_pending )
> -            hvm_inject_event(&ctxt.trap);
> +        if ( ctxt.ctxt.event_pending )
> +            hvm_inject_event(&ctxt.ctxt.event);
>          /* fall through */
>      default:
>          hvm_emulate_writeback(&ctxt);
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index 1279f68..abb9d51 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -102,8 +102,8 @@ int handle_mmio(void)
>          hvm_dump_emulation_state(XENLOG_G_WARNING "MMIO", &ctxt);
>          return 0;
>      case X86EMUL_EXCEPTION:
> -        if ( ctxt.exn_pending )
> -            hvm_inject_event(&ctxt.trap);
> +        if ( ctxt.ctxt.event_pending )
> +            hvm_inject_event(&ctxt.ctxt.event);
>          break;
>      default:
>          break;
> diff --git a/xen/arch/x86/hvm/vmx/realmode.c
> b/xen/arch/x86/hvm/vmx/realmode.c
> index 9002638..dc3ab44 100644
> --- a/xen/arch/x86/hvm/vmx/realmode.c
> +++ b/xen/arch/x86/hvm/vmx/realmode.c
> @@ -122,7 +122,7 @@ void vmx_realmode_emulate_one(struct
> hvm_emulate_ctxt *hvmemul_ctxt)
> 
>      if ( rc == X86EMUL_EXCEPTION )
>      {
> -        if ( !hvmemul_ctxt->exn_pending )
> +        if ( !hvmemul_ctxt->ctxt.event_pending )
>          {
>              unsigned long intr_info;
> 
> @@ -133,27 +133,27 @@ void vmx_realmode_emulate_one(struct
> hvm_emulate_ctxt *hvmemul_ctxt)
>                  gdprintk(XENLOG_ERR, "Exception pending but no info.\n");
>                  goto fail;
>              }
> -            hvmemul_ctxt->trap.vector = (uint8_t)intr_info;
> -            hvmemul_ctxt->trap.insn_len = 0;
> +            hvmemul_ctxt->ctxt.event.vector = (uint8_t)intr_info;
> +            hvmemul_ctxt->ctxt.event.insn_len = 0;
>          }
> 
>          if ( unlikely(curr->domain->debugger_attached) &&
> -             ((hvmemul_ctxt->trap.vector == TRAP_debug) ||
> -              (hvmemul_ctxt->trap.vector == TRAP_int3)) )
> +             ((hvmemul_ctxt->ctxt.event.vector == TRAP_debug) ||
> +              (hvmemul_ctxt->ctxt.event.vector == TRAP_int3)) )
>          {
>              domain_pause_for_debugger();
>          }
>          else if ( curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE )
>          {
>              gdprintk(XENLOG_ERR, "Exception %02x in protected mode.\n",
> -                     hvmemul_ctxt->trap.vector);
> +                     hvmemul_ctxt->ctxt.event.vector);
>              goto fail;
>          }
>          else
>          {
>              realmode_deliver_exception(
> -                hvmemul_ctxt->trap.vector,
> -                hvmemul_ctxt->trap.insn_len,
> +                hvmemul_ctxt->ctxt.event.vector,
> +                hvmemul_ctxt->ctxt.event.insn_len,
>                  hvmemul_ctxt);
>          }
>      }
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 231c7bf..5d59479 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5379,6 +5379,19 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned
> long addr,
>      page_unlock(page);
>      put_page(page);
> 
> +    /*
> +     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
> +     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such
> exceptions
> +     * now set event_pending instead.  Exceptions raised behind the back of
> +     * the emulator don't yet set event_pending.
> +     *
> +     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE
> path,
> +     * for no functional change from before.  Future patches will fix this
> +     * properly.
> +     */
> +    if ( rc == X86EMUL_EXCEPTION && ptwr_ctxt.ctxt.event_pending )
> +        rc = X86EMUL_UNHANDLEABLE;
> +
>      if ( rc == X86EMUL_UNHANDLEABLE )
>          goto bail;
> 
> @@ -5506,6 +5519,19 @@ int mmio_ro_do_page_fault(struct vcpu *v,
> unsigned long addr,
>      else
>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
> 
> +    /*
> +     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
> +     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such
> exceptions
> +     * now set event_pending instead.  Exceptions raised behind the back of
> +     * the emulator don't yet set event_pending.
> +     *
> +     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE
> path,
> +     * for no functional change from before.  Future patches will fix this
> +     * properly.
> +     */
> +    if ( rc == X86EMUL_EXCEPTION && ctxt.event_pending )
> +        rc = X86EMUL_UNHANDLEABLE;
> +
>      if ( rc == X86EMUL_UNHANDLEABLE )
>          return 0;
> 
> diff --git a/xen/arch/x86/mm/shadow/multi.c
> b/xen/arch/x86/mm/shadow/multi.c
> index ddfb815..56c40f8 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3374,6 +3374,19 @@ static int sh_page_fault(struct vcpu *v,
>      r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
> 
>      /*
> +     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
> +     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such
> exceptions
> +     * now set event_pending instead.  Exceptions raised behind the back of
> +     * the emulator don't yet set event_pending.
> +     *
> +     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE
> path,
> +     * for no functional change from before.  Future patches will fix this
> +     * properly.
> +     */
> +    if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
> +        r = X86EMUL_UNHANDLEABLE;
> +
> +    /*
>       * NB. We do not unshadow on X86EMUL_EXCEPTION. It's not clear that it
>       * would be a good unshadow hint. If we *do* decide to unshadow-on-
> fault
>       * then it must be 'failable': we cannot require the unshadow to succeed.
> @@ -3443,6 +3456,10 @@ static int sh_page_fault(struct vcpu *v,
>              shadow_continue_emulation(&emul_ctxt, regs);
>              v->arch.paging.last_write_was_pt = 0;
>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
> +
> +            if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
> +                r = X86EMUL_UNHANDLEABLE;
> +
>              if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
>              {
>                  emulation_count++;
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 6adfdbe..0fb2c09 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -680,9 +680,8 @@ static inline int mkec(uint8_t e, int32_t ec, ...)
> 
>  #define generate_exception_if(p, e, ec...)                                \
>  ({  if ( (p) ) {                                                          \
> -        fail_if(ops->inject_hw_exception == NULL);                        \
> -        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \
> -            ? : X86EMUL_EXCEPTION;                                        \
> +        x86_emul_hw_exception(e, mkec(e, ##ec, 0), ctxt);                 \
> +        rc = X86EMUL_EXCEPTION;                                           \
>          goto done;                                                        \
>      }                                                                     \
>  })
> @@ -1604,9 +1603,6 @@ static int inject_swint(enum x86_swint_type type,
>  {
>      int rc, error_code, fault_type = EXC_GP;
> 
> -    fail_if(ops->inject_sw_interrupt == NULL);
> -    fail_if(ops->inject_hw_exception == NULL);
> -
>      /*
>       * Without hardware support, injecting software interrupts/exceptions is
>       * problematic.
> @@ -1701,7 +1697,8 @@ static int inject_swint(enum x86_swint_type type,
>          }
>      }
> 
> -    rc = ops->inject_sw_interrupt(type, vector, insn_len, ctxt);
> +    x86_emul_software_event(type, vector, insn_len, ctxt);
> +    rc = X86EMUL_OKAY;
> 
>   done:
>      return rc;
> @@ -1909,6 +1906,7 @@ x86_decode(
> 
>      /* Initialise output state in x86_emulate_ctxt */
>      ctxt->retire.raw = 0;
> +    x86_emul_reset_event(ctxt);
> 
>      op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt-
> >addr_size/8;
>      if ( op_bytes == 8 )
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h
> b/xen/arch/x86/x86_emulate/x86_emulate.h
> index da8924b..3c0b25d 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -396,19 +396,6 @@ struct x86_emulate_ops
>          unsigned int *edx,
>          struct x86_emulate_ctxt *ctxt);
> 
> -    /* inject_hw_exception */
> -    int (*inject_hw_exception)(
> -        uint8_t vector,
> -        int32_t error_code,
> -        struct x86_emulate_ctxt *ctxt);
> -
> -    /* inject_sw_interrupt */
> -    int (*inject_sw_interrupt)(
> -        enum x86_swint_type type,
> -        uint8_t vector,
> -        uint8_t insn_len,
> -        struct x86_emulate_ctxt *ctxt);
> -
>      /*
>       * get_fpu: Load emulated environment's FPU state onto processor.
>       *  @exn_callback: On any FPU or SIMD exception, pass control to
> @@ -486,6 +473,9 @@ struct x86_emulate_ctxt
>              bool singlestep:1;   /* Singlestepping was active. */
>          };
>      } retire;
> +
> +    bool event_pending;
> +    struct x86_event event;
>  };
> 
>  /*
> @@ -584,6 +574,19 @@ static inline int x86_emulate_wrapper(
>      if ( rc == X86EMUL_EXCEPTION )
>          ASSERT(ctxt->regs->eip == orig_eip);
> 
> +    /*
> +     * TODO: Make this true:
> +     *
> +    ASSERT(ctxt->event_pending == (rc == X86EMUL_EXCEPTION));
> +     *
> +     * Some codepaths still raise exceptions behind the back of the
> +     * emulator. (i.e. return X86EMUL_EXCEPTION but without
> +     * event_pending being set).  In the meantime, use a slightly
> +     * relaxed check...
> +     */
> +    if ( ctxt->event_pending )
> +        ASSERT(rc == X86EMUL_EXCEPTION);
> +
>      return rc;
>  }
> 
> @@ -633,4 +636,51 @@ void x86_emulate_free_state(struct
> x86_emulate_state *state);
> 
>  #endif
> 
> +static inline void x86_emul_hw_exception(
> +    unsigned int vector, int error_code, struct x86_emulate_ctxt *ctxt)
> +{
> +    ASSERT(!ctxt->event_pending);
> +
> +    ctxt->event.vector = vector;
> +    ctxt->event.type = X86_EVENTTYPE_HW_EXCEPTION;
> +    ctxt->event.error_code = error_code;
> +
> +    ctxt->event_pending = true;
> +}
> +
> +static inline void x86_emul_software_event(
> +    enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
> +    struct x86_emulate_ctxt *ctxt)
> +{
> +    ASSERT(!ctxt->event_pending);
> +
> +    switch ( type )
> +    {
> +    case x86_swint_icebp:
> +        ctxt->event.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
> +        break;
> +
> +    case x86_swint_int3:
> +    case x86_swint_into:
> +        ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
> +        break;
> +
> +    case x86_swint_int:
> +        ctxt->event.type = X86_EVENTTYPE_SW_INTERRUPT;
> +        break;
> +    }
> +
> +    ctxt->event.vector = vector;
> +    ctxt->event.error_code = X86_EVENT_NO_EC;
> +    ctxt->event.insn_len = insn_len;
> +
> +    ctxt->event_pending = true;
> +}
> +
> +static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
> +{
> +    ctxt->event_pending = false;
> +    ctxt->event = (struct x86_event){};
> +}
> +
>  #endif /* __X86_EMULATE_H__ */
> diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-
> x86/hvm/emulate.h
> index 3b7ec33..d64d834 100644
> --- a/xen/include/asm-x86/hvm/emulate.h
> +++ b/xen/include/asm-x86/hvm/emulate.h
> @@ -29,9 +29,6 @@ struct hvm_emulate_ctxt {
>      unsigned long seg_reg_accessed;
>      unsigned long seg_reg_dirty;
> 
> -    bool_t exn_pending;
> -    struct x86_event trap;
> -
>      uint32_t intr_shadow;
> 
>      bool_t set_context;
> --
> 2.1.4



* Re: [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
  2016-11-30 13:50 ` [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag Andrew Cooper
@ 2016-11-30 14:28   ` Paul Durrant
  2016-12-01 11:16   ` Jan Beulich
  1 sibling, 0 replies; 59+ messages in thread
From: Paul Durrant @ 2016-11-30 14:28 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Tim (Xen.org), Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 30 November 2016 13:50
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Tim (Xen.org) <tim@xen.org>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Subject: [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
> 
> The behaviour of singlestep is to raise #DB after the instruction has been
> completed, but implementing it with inject_hw_exception() causes
> x86_emulate()
> to return X86EMUL_EXCEPTION, despite succesfully completing execution of
> the
> instruction, including register writeback.
> 
> Instead, use a retire flag to indicate singlestep, which causes x86_emulate()
> to return X86EMUL_OKAY.
> 
> Update all callers of x86_emulate() to use the new retire flag.  This fixes
> the behaviour of singlestep for shadow pagetable updates and
> mmcfg/mmio_ro
> intercepts, which previously discarded the exception.
> 
> With this change, all uses of X86EMUL_EXCEPTION from x86_emulate() are
> believed to have strictly fault semantics.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> 
> v3:
>  * New
> ---
>  xen/arch/x86/hvm/emulate.c             |  3 +++
>  xen/arch/x86/mm.c                      | 11 ++++++++++-
>  xen/arch/x86/mm/shadow/multi.c         | 21 ++++++++++++++++++++-
>  xen/arch/x86/x86_emulate/x86_emulate.c |  9 ++++-----
>  xen/arch/x86/x86_emulate/x86_emulate.h |  6 ++++++
>  5 files changed, 43 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index fe62500..91c79fa 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1788,6 +1788,9 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
>      if ( rc != X86EMUL_OKAY )
>          return rc;
> 
> +    if ( hvmemul_ctxt->ctxt.retire.singlestep )
> +        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
>      new_intr_shadow = hvmemul_ctxt->intr_shadow;
> 
>      /* MOV-SS instruction toggles MOV-SS shadow, else we just clear it. */
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index b7c7122..231c7bf 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5382,6 +5382,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>      if ( rc == X86EMUL_UNHANDLEABLE )
>          goto bail;
> 
> +    if ( ptwr_ctxt.ctxt.retire.singlestep )
> +        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
>      perfc_incr(ptwr_emulations);
>      return EXCRET_fault_fixed;
> 
> @@ -5503,7 +5506,13 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>      else
>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
> 
> -    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
> +    if ( rc == X86EMUL_UNHANDLEABLE )
> +        return 0;
> +
> +    if ( ctxt.retire.singlestep )
> +        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
> +    return EXCRET_fault_fixed;
>  }
> 
>  void *alloc_xen_pagetable(void)
> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index 9ee48a8..ddfb815 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3422,6 +3422,16 @@ static int sh_page_fault(struct vcpu *v,
>          v->arch.paging.last_write_emul_ok = 0;
>  #endif
> 
> +    if ( emul_ctxt.ctxt.retire.singlestep )
> +    {
> +        if ( is_hvm_vcpu(v) )
> +            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +        else
> +            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
> +        goto emulate_done;
> +    }
> +
>  #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
>      if ( r == X86EMUL_OKAY ) {
>          int i, emulation_count=0;
> @@ -3433,7 +3443,7 @@ static int sh_page_fault(struct vcpu *v,
>              shadow_continue_emulation(&emul_ctxt, regs);
>              v->arch.paging.last_write_was_pt = 0;
>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
> -            if ( r == X86EMUL_OKAY )
> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
>              {
>                  emulation_count++;
>                  if ( v->arch.paging.last_write_was_pt )
> @@ -3449,6 +3459,15 @@ static int sh_page_fault(struct vcpu *v,
>              {
>                  perfc_incr(shadow_em_ex_fail);
> 
>                  TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED);
> +
> +                if ( emul_ctxt.ctxt.retire.singlestep )
> +                {
> +                    if ( is_hvm_vcpu(v) )
> +                        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +                    else
> +                        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +                }
> +
>                  break; /* Don't emulate again if we failed! */
>              }
>          }
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 8a1f1f5..0af532e 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -2417,7 +2417,6 @@ x86_emulate(
>      struct x86_emulate_state state;
>      int rc;
>      uint8_t b, d;
> -    bool tf = ctxt->regs->eflags & EFLG_TF;
>      struct operand src = { .reg = PTR_POISON };
>      struct operand dst = { .reg = PTR_POISON };
>      enum x86_swint_type swint_type;
> @@ -5415,11 +5414,11 @@ x86_emulate(
>      if ( !mode_64bit() )
>          _regs.eip = (uint32_t)_regs.eip;
> 
> -    *ctxt->regs = _regs;
> +    /* Was singlestepping active at the start of this instruction? */
> +    if ( (rc == X86EMUL_OKAY) && (ctxt->regs->eflags & EFLG_TF) )
> +        ctxt->retire.singlestep = true;
> 
> -    /* Inject #DB if single-step tracing was enabled at instruction start. */
> -    if ( tf && (rc == X86EMUL_OKAY) && ops->inject_hw_exception )
> -        rc = ops->inject_hw_exception(EXC_DB, -1, ctxt) ? :
> X86EMUL_EXCEPTION;
> +    *ctxt->regs = _regs;
> 
>   done:
>      _put_fpu();
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index fc28976..da8924b 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -483,6 +483,7 @@ struct x86_emulate_ctxt
>              bool hlt:1;          /* Instruction HLTed. */
>              bool mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
>              bool sti:1;          /* Instruction sets STI irq shadow. */
> +            bool singlestep:1;   /* Singlestepping was active. */
>          };
>      } retire;
>  };
> @@ -572,12 +573,17 @@ static inline int x86_emulate_wrapper(
>      struct x86_emulate_ctxt *ctxt,
>      const struct x86_emulate_ops *ops)
>  {
> +    unsigned long orig_eip = ctxt->regs->eip;
>      int rc = x86_emulate(ctxt, ops);
> 
>      /* Retire flags should only be set for successful instruction emulation. */
>      if ( rc != X86EMUL_OKAY )
>          ASSERT(ctxt->retire.raw == 0);
> 
> +    /* All cases returning X86EMUL_EXCEPTION should have fault semantics. */
> +    if ( rc == X86EMUL_EXCEPTION )
> +        ASSERT(ctxt->regs->eip == orig_eip);
> +
>      return rc;
>  }
> 
> --
> 2.1.4
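
Condensed from the hunks above, the observable difference for callers of
x86_emulate() (illustrative only; variable names abbreviated):

    /* Before: #DB was injected inside x86_emulate(), so an instruction
     * which fully retired still returned X86EMUL_EXCEPTION. */
    rc = ops->inject_hw_exception(EXC_DB, -1, ctxt) ? : X86EMUL_EXCEPTION;

    /* After: emulation returns a clean X86EMUL_OKAY plus a retire flag,
     * and the caller injects the #DB itself. */
    if ( rc == X86EMUL_OKAY && ctxt.retire.singlestep )
        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);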



* Re: [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back
  2016-11-30 13:50 ` [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
@ 2016-11-30 14:29   ` Paul Durrant
  0 siblings, 0 replies; 59+ messages in thread
From: Paul Durrant @ 2016-11-30 14:29 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 30 November 2016 13:51
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Subject: [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind
> the emulators back
> 
> Drop the call to hvm_inject_page_fault() in __hvm_copy(), and require
> callers to inject the pagefault themselves.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Tim Deegan <tim@xen.org>
> Acked-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
> CC: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> 
> v3:
>  * Correct patch description
>  * Fix rebasing error over previous TSS series
> ---
>  xen/arch/x86/hvm/emulate.c        |  2 ++
>  xen/arch/x86/hvm/hvm.c            | 14 ++++++++++++--
>  xen/arch/x86/hvm/vmx/vvmx.c       | 20 +++++++++++++++-----
>  xen/arch/x86/mm/shadow/common.c   |  1 +
>  xen/include/asm-x86/hvm/support.h |  4 +---
>  5 files changed, 31 insertions(+), 10 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 035b654..ccf3aa2 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -799,6 +799,7 @@ static int __hvmemul_read(
>      case HVMCOPY_okay:
>          break;
>      case HVMCOPY_bad_gva_to_gfn:
> +        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>          return X86EMUL_EXCEPTION;
>      case HVMCOPY_bad_gfn_to_mfn:
>          if ( access_type == hvm_access_insn_fetch )
> @@ -905,6 +906,7 @@ static int hvmemul_write(
>      case HVMCOPY_okay:
>          break;
>      case HVMCOPY_bad_gva_to_gfn:
> +        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
>          return X86EMUL_EXCEPTION;
>      case HVMCOPY_bad_gfn_to_mfn:
>          return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec,
>                                           hvmemul_ctxt, 0);
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 37eaee2..3596f2c 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2927,6 +2927,8 @@ void hvm_task_switch(
> 
>      rc = hvm_copy_from_guest_linear(
>          &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
> +    if ( rc == HVMCOPY_bad_gva_to_gfn )
> +        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>      if ( rc != HVMCOPY_okay )
>          goto out;
> 
> @@ -2965,11 +2967,15 @@ void hvm_task_switch(
>                                    offsetof(typeof(tss), trace) -
>                                    offsetof(typeof(tss), eip),
>                                    PFEC_page_present, &pfinfo);
> +    if ( rc == HVMCOPY_bad_gva_to_gfn )
> +        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>      if ( rc != HVMCOPY_okay )
>          goto out;
> 
>      rc = hvm_copy_from_guest_linear(
>          &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
> +    if ( rc == HVMCOPY_bad_gva_to_gfn )
> +        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>      /*
>       * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
>       * functions knew we want RO access.
> @@ -3012,7 +3018,10 @@ void hvm_task_switch(
>          rc = hvm_copy_to_guest_linear(tr.base + offsetof(typeof(tss), back_link),
>                                        &tss.back_link, sizeof(tss.back_link), 0,
>                                        &pfinfo);
>          if ( rc == HVMCOPY_bad_gva_to_gfn )
> +        {
> +            hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>              exn_raised = 1;
> +        }
>          else if ( rc != HVMCOPY_okay )
>              goto out;
>      }
> @@ -3050,7 +3059,10 @@ void hvm_task_switch(
>              rc = hvm_copy_to_guest_linear(linear_addr, &errcode, opsz, 0,
>                                            &pfinfo);
>              if ( rc == HVMCOPY_bad_gva_to_gfn )
> +            {
> +                hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>                  exn_raised = 1;
> +            }
>              else if ( rc != HVMCOPY_okay )
>                  goto out;
>          }
> @@ -3114,8 +3126,6 @@ static enum hvm_copy_result __hvm_copy(
>                  {
>                      pfinfo->linear = addr;
>                      pfinfo->ec = pfec;
> -
> -                    hvm_inject_page_fault(pfec, addr);
>                  }
>                  return HVMCOPY_bad_gva_to_gfn;
>              }
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index fd7ea0a..e6e9ebd 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -396,7 +396,6 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
>      struct vcpu *v = current;
>      union vmx_inst_info info;
>      struct segment_register seg;
> -    pagefault_info_t pfinfo;
>      unsigned long base, index, seg_base, disp, offset;
>      int scale, size;
> 
> @@ -451,10 +450,17 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
>                offset + size - 1 > seg.limit) )
>              goto gp_fault;
> 
> -        if ( poperandS != NULL &&
> -             hvm_copy_from_guest_linear(poperandS, base, size, 0, &pfinfo)
> -                  != HVMCOPY_okay )
> -            return X86EMUL_EXCEPTION;
> +        if ( poperandS != NULL )
> +        {
> +            pagefault_info_t pfinfo;
> +            int rc = hvm_copy_from_guest_linear(poperandS, base, size,
> +                                                0, &pfinfo);
> +
> +            if ( rc == HVMCOPY_bad_gva_to_gfn )
> +                hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
> +            if ( rc != HVMCOPY_okay )
> +                return X86EMUL_EXCEPTION;
> +        }
>          decode->mem = base;
>          decode->len = size;
>      }
> @@ -1623,6 +1629,8 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
>      gpa = nvcpu->nv_vvmcxaddr;
> 
>      rc = hvm_copy_to_guest_linear(decode.mem, &gpa, decode.len, 0, &pfinfo);
> +    if ( rc == HVMCOPY_bad_gva_to_gfn )
> +        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>      if ( rc != HVMCOPY_okay )
>          return X86EMUL_EXCEPTION;
> 
> @@ -1694,6 +1702,8 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
>      switch ( decode.type ) {
>      case VMX_INST_MEMREG_TYPE_MEMORY:
>          rc = hvm_copy_to_guest_linear(decode.mem, &value, decode.len, 0,
>                                        &pfinfo);
> +        if ( rc == HVMCOPY_bad_gva_to_gfn )
> +            hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>          if ( rc != HVMCOPY_okay )
>              return X86EMUL_EXCEPTION;
>          break;
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index 0760e76..fbe49e1 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -198,6 +198,7 @@ hvm_read(enum x86_segment seg,
>      case HVMCOPY_okay:
>          return X86EMUL_OKAY;
>      case HVMCOPY_bad_gva_to_gfn:
> +        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &sh_ctxt->ctxt);
>          return X86EMUL_EXCEPTION;
>      case HVMCOPY_bad_gfn_to_mfn:
>      case HVMCOPY_unhandleable:
> diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
> index 78349f8..3d767d7 100644
> --- a/xen/include/asm-x86/hvm/support.h
> +++ b/xen/include/asm-x86/hvm/support.h
> @@ -85,9 +85,7 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
>   *  HVMCOPY_bad_gva_to_gfn: Some guest virtual address did not have a valid
>   *                          mapping to a guest physical address.  The
>   *                          pagefault_info_t structure will be filled in if
> - *                          provided, and a page fault exception is
> - *                          automatically queued for injection into the
> - *                          current HVM VCPU.
> + *                          provided.
>   */
>  typedef struct pagefault_info
>  {
> --
> 2.1.4
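
The new calling convention, distilled from the hunks above (buf, addr and
size are placeholders):

    pagefault_info_t pfinfo;
    int rc = hvm_copy_from_guest_linear(buf, addr, size, 0, &pfinfo);

    if ( rc == HVMCOPY_bad_gva_to_gfn )
        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear); /* caller's job now */
    if ( rc != HVMCOPY_okay )
        return X86EMUL_EXCEPTION;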



* Re: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
  2016-11-30 14:05       ` Paul Durrant
@ 2016-11-30 16:43         ` Jan Beulich
  0 siblings, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-11-30 16:43 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Andrew Cooper, Xen-devel

>>> On 30.11.16 at 15:05, <Paul.Durrant@citrix.com> wrote:
>> From: Andrew Cooper
>> Sent: 30 November 2016 14:02
>> On 30/11/16 13:58, Paul Durrant wrote:
>> > Also, anonymous unions are not part of C99 AFAIK... are we now stipulating
>> > something more recent?
>> 
>> We used gnu99 for as long as I can remember, and we have other examples
>> of this pattern already in Xen.
>> 
> 
> If there's precedent then that's fine.

Tighter rules only apply for the public headers.

Jan
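
For reference, the gnu99 construct under discussion, as it appears in the
retire field quoted later in the thread (abridged):

    union {
        uint8_t raw;
        struct {                 /* anonymous struct inside the union */
            bool hlt:1;
            bool mov_ss:1;
        };
    } retire;

    retire.raw = 0;              /* whole-field access */
    retire.mov_ss = true;        /* direct member access, no .flags needed */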



* Re: [PATCH v3 10/24] x86/emul: Always use fault semantics for software events
  2016-11-30 13:50 ` [PATCH v3 10/24] x86/emul: Always use fault semantics for software events Andrew Cooper
@ 2016-11-30 17:55   ` Boris Ostrovsky
  2016-12-01 10:53   ` Jan Beulich
  1 sibling, 0 replies; 59+ messages in thread
From: Boris Ostrovsky @ 2016-11-30 17:55 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Tim Deegan, Suravee Suthikulpanit, Jan Beulich

On 11/30/2016 08:50 AM, Andrew Cooper wrote:
> The common case is already using fault semantics out of x86_emulate(), as that
> is how VT-x/SVM expects to inject the event (given suitable hardware support).
>
> However, x86_emulate() returning X86EMUL_EXCEPTION and also completing a
> register writeback is problematic for callers.
>
> Switch the logic to always using fault semantics, and leave svm_inject_trap()
> to fix up %eip if necessary.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Reviewed-by:  Boris Ostrovsky <boris.ostrovsky@oracle.com>




* Re: [PATCH v3 06/24] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}()
  2016-11-30 13:50 ` [PATCH v3 06/24] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}() Andrew Cooper
@ 2016-12-01 10:06   ` Jan Beulich
  0 siblings, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 10:06 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> To help with event injection improvements for the PV uses of x86_emulate(),
> implement an event injection API which matches its hvm counterpart.
> 
> This is started by taking do_guest_trap() and modifying its calling API to
> pv_inject_event(), subsequently implementing the former in terms of the
> latter.
> 
> The existing propagate_page_fault() is fairly similar to
> pv_inject_page_fault(), although it has a return value.  Only a single caller
> makes use of the return value, and non-NULL is only returned if the passed cr2
> is non-canonical.  Opencode this single case in
> handle_gdt_ldt_mapping_fault(), allowing propagate_page_fault() to become
> void.
> 
> The call to reserved_bit_page_fault() in propagate_page_fault() was
> conceptually wrong to start with.  Complaining about reserved bits should be
> part of handling the pagefault itself, not part of injecting a pagefault into
> the guest.  It is therefore moved ahead of the injection call in
> do_page_fault() to compensate.
> 
> The remaining #PF specific bits are moved into pv_inject_event(), and
> pv_inject_page_fault() is implemented as a static inline wrapper.
> 
> No practical change from a guest's point of view.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Tim Deegan <tim@xen.org>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -625,37 +625,75 @@ void fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote)
>            (regs->eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
>  }
>  
> -static void do_guest_trap(unsigned int trapnr,
> -                          const struct cpu_user_regs *regs)
> +void pv_inject_event(const struct x86_event *event)
>  {
>      struct vcpu *v = current;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
>      struct trap_bounce *tb;
>      const struct trap_info *ti;
> +    const uint8_t vector = event->vector;
>      const bool use_error_code =
> -        ((trapnr < 32) && (TRAP_HAVE_EC & (1u << trapnr)));
> +        ((vector < 32) && (TRAP_HAVE_EC & (1u << vector)));
> +    unsigned int error_code = event->error_code;
>  
> -    trace_pv_trap(trapnr, regs->eip, use_error_code, regs->error_code);
> +    ASSERT(vector == event->vector); /* Confirm no truncation. */
> +    if ( use_error_code )
> +        ASSERT(error_code != X86_EVENT_NO_EC);
> +    else
> +        ASSERT(error_code == X86_EVENT_NO_EC);
>  
>      tb = &v->arch.pv_vcpu.trap_bounce;
> -    ti = &v->arch.pv_vcpu.trap_ctxt[trapnr];
> +    ti = &v->arch.pv_vcpu.trap_ctxt[vector];
>  
>      tb->flags = TBF_EXCEPTION;
>      tb->cs    = ti->cs;
>      tb->eip   = ti->address;
>  
> +    if ( vector == TRAP_page_fault )
> +    {
> +        v->arch.pv_vcpu.ctrlreg[2] = event->cr2;
> +        arch_set_cr2(v, event->cr2);
> +
> +        /* Re-set error_code.user flag appropriately for the guest. */
> +        error_code &= ~PFEC_user_mode;
> +        if ( !guest_kernel_mode(v, regs) )
> +            error_code |= PFEC_user_mode;
> +
> +        trace_pv_page_fault(event->cr2, error_code);
> +    }
> +    else
> +        trace_pv_trap(vector, regs->eip, use_error_code, error_code);
> +
>      if ( use_error_code )
>      {
>          tb->flags |= TBF_EXCEPTION_ERRCODE;
> -        tb->error_code = regs->error_code;
> +        tb->error_code = error_code;
>      }
>  
>      if ( TI_GET_IF(ti) )
>          tb->flags |= TBF_INTERRUPT;
>  
>      if ( unlikely(null_trap_bounce(v, tb)) )
> +    {
>          gprintk(XENLOG_WARNING,
>                  "Unhandled %s fault/trap [#%d, ec=%04x]\n",
> -                trapstr(trapnr), trapnr, regs->error_code);
> +                trapstr(vector), vector, error_code);
> +
> +        if ( vector == TRAP_page_fault )
> +            show_page_walk(event->cr2);

Might be worth adding "&& !(regs->error_code & PFEC_reserved_bit)"
to the condition to avoid doing the page walk a second time, perhaps
with a brief comment explaining this.

Jan
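
A sketch of the guard Jan suggests (untested; the exact placement and the
assumption that the reserved-bit path already printed a walk are mine):

    if ( vector == TRAP_page_fault &&
         /* Avoid a second walk; reserved-bit #PFs were dumped earlier. */
         !(regs->error_code & PFEC_reserved_bit) )
        show_page_walk(event->cr2);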


* Re: [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union
  2016-11-30 13:50 ` [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union Andrew Cooper
  2016-11-30 13:58   ` Paul Durrant
@ 2016-12-01 10:08   ` Jan Beulich
  1 sibling, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 10:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> Rename byte to raw, as the field being a single byte long is an implementation
> detail.  Make the bitfields part of an anonymous struct to remove the .flags
> qualifier.  Change the types of the flags to being booleans, to match their
> use.

With that you should then also use true/false in assignments to
these fields. This taken care of,

> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing
  2016-11-30 13:50 ` [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing Andrew Cooper
@ 2016-12-01 10:18   ` Jan Beulich
  2016-12-01 10:51     ` Andrew Cooper
  0 siblings, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 10:18 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -2656,6 +2656,8 @@ x86_emulate(
>                                &dst.val, op_bytes, ctxt, ops)) != 0 ||
>               (rc = load_seg(src.val, dst.val, 0, NULL, ctxt, ops)) != 0 )
>              goto done;
> +        if ( src.val == x86_seg_ss )
> +            ctxt->retire.mov_ss = 1;
>          break;

While I don't mind it being done here (i.e. it can have my R-b as is),
wouldn't it be even better to put this into load_seg() itself?

Jan



* Re: [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour
  2016-11-30 13:50 ` [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour Andrew Cooper
@ 2016-12-01 10:40   ` Jan Beulich
  2016-12-01 10:58     ` Andrew Cooper
  0 siblings, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 10:40 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> In debug builds, confirm that some properties of x86_emulate()'s behaviour
> actually hold.  The first property, fixed in a previous change, is that retire
> flags are only ever set in the X86EMUL_OKAY case.
> 
> While adjusting the userspace test harness to cope with ASSERT() in
> x86_emulate.h, fix a build problem introduced in c/s 122dd9575c7 "x86emul:
> in_longmode() should not ignore ->read_msr() errors" by providing an
> implementation of likely()/unlikely().

Oh, I'm sorry for that one. When moving that patch ahead of about
50 other ones touching the emulator, I didn't notice I should have
pulled that addition out from another patch.

> --- a/tools/tests/x86_emulator/x86_emulate.c
> +++ b/tools/tests/x86_emulator/x86_emulate.c
> @@ -50,4 +50,7 @@ typedef bool bool_t;
>  #define __init
>  #define __maybe_unused __attribute__((__unused__))
>  
> +#define likely(x)     __builtin_expect(!!(x),1)
> +#define unlikely(x)   __builtin_expect(!!(x),0)

Please use true/false here and add blanks after the commas.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -2404,6 +2404,11 @@ x86_decode(
>  #undef insn_fetch_bytes
>  #undef insn_fetch_type
>  
> +/* Undo DEBUG wrapper. */
> +#ifdef x86_emulate
> +#undef x86_emulate
> +#endif

I don't see the need for the #ifdef here.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -23,6 +23,10 @@
>  #ifndef __X86_EMULATE_H__
>  #define __X86_EMULATE_H__
>  
> +#ifndef ASSERT
> +#define ASSERT assert
> +#endif

This doesn't seem to belong here (as there's nothing making sure
assert is defined), and duplicates an existing #define in the test
> harness's x86_emulate.c. I could agree to deleting that other one
and wrapping the one here with #ifndef __XEN__.

> @@ -554,6 +558,27 @@ x86_emulate(
>      const struct x86_emulate_ops *ops);
>  
>  /*
> + * In debug builds, wrap x86_emulate() with some assertions about its expected
> + * behaviour.
> + */
> +#ifndef NDEBUG

Mind swapping the order of comment and #ifndef, to make it more
reasonable to possibly add further things into this guarded block?

Jan



* Re: [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing
  2016-12-01 10:18   ` Jan Beulich
@ 2016-12-01 10:51     ` Andrew Cooper
  2016-12-01 11:19       ` Jan Beulich
  0 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-12-01 10:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 01/12/16 10:18, Jan Beulich wrote:
>>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -2656,6 +2656,8 @@ x86_emulate(
>>                                &dst.val, op_bytes, ctxt, ops)) != 0 ||
>>               (rc = load_seg(src.val, dst.val, 0, NULL, ctxt, ops)) != 0 )
>>              goto done;
>> +        if ( src.val == x86_seg_ss )
>> +            ctxt->retire.mov_ss = 1;
>>          break;
> While I don't mind it being done here (i.e. it can have my R-b as is),
> wouldn't it be even better to put this into load_seg() itself?

That would cause the mov_ss flag to be incorrectly set for `lss`.

~Andrew


* Re: [PATCH v3 10/24] x86/emul: Always use fault semantics for software events
  2016-11-30 13:50 ` [PATCH v3 10/24] x86/emul: Always use fault semantics for software events Andrew Cooper
  2016-11-30 17:55   ` Boris Ostrovsky
@ 2016-12-01 10:53   ` Jan Beulich
  2016-12-01 11:15     ` Andrew Cooper
  1 sibling, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 10:53 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Boris Ostrovsky, Tim Deegan, Suravee Suthikulpanit, Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> @@ -1242,27 +1242,38 @@ static void svm_inject_event(const struct x86_event *event)
>      eventinj.fields.v = 1;
>      eventinj.fields.vector = _event.vector;
>  
> -    /* Refer to AMD Vol 2: System Programming, 15.20 Event Injection. */
> +    /*
> +     * Refer to AMD Vol 2: System Programming, 15.20 Event Injection.
> +     *
> +     * On hardware lacking NextRIP support, and all hardware in the case of
> +     * icebp, software events with trap semantics need emulating, so %eip in
> +     * the trap frame points after the instruction.
> +     *
> +     * The x86 emulator (if requested by the x86_swint_emulate_* choice) will
> +     * have performed checks such as presence/dpl/etc and believes that the
> +     * event injection will succeed without faulting.
> +     *
> +     * The x86 emulator will always provide fault semantics for software
> +     * events, with _event.insn_len set appropriately.  If the injection
> +     * requires emulation, move %eip forwards at this point.
> +     */
>      switch ( _event.type )
>      {
>      case X86_EVENTTYPE_SW_INTERRUPT: /* int $n */
> -        /*
> -         * Software interrupts (type 4) cannot be properly injected if the
> -         * processor doesn't support NextRIP.  Without NextRIP, the emulator
> -         * will have performed DPL and presence checks for us, and will have
> -         * moved eip forward if appropriate.
> -         */
>          if ( cpu_has_svm_nrips )
>              vmcb->nextrip = regs->eip + _event.insn_len;
> +        else
> +            regs->eip += _event.insn_len;

Please use ->rip here and below (and perhaps also in the comment).

>          eventinj.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
>          break;
>  
>      case X86_EVENTTYPE_PRI_SW_EXCEPTION: /* icebp */
>          /*
> -         * icebp's injection must always be emulated.  Software injection help
> -         * in x86_emulate has moved eip forward, but NextRIP (if used) still
> -         * needs setting or execution will resume from 0.
> +         * icebp's injection must always be emulated, as hardware does not
> +         * special case HW_EXCEPTION with vector 1 (#DB) as having trap
> +         * semantics.
>           */
> +        regs->eip += _event.insn_len;
>          if ( cpu_has_svm_nrips )
>              vmcb->nextrip = regs->eip;
>          eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
> @@ -1270,16 +1281,13 @@ static void svm_inject_event(const struct x86_event *event)
>  
>      case X86_EVENTTYPE_SW_EXCEPTION: /* int3, into */
>          /*
> -         * The AMD manual states that .type=3 (HW exception), .vector=3 or 4,
> -         * will perform DPL checks.  Experimentally, DPL and presence checks
> -         * are indeed performed, even without NextRIP support.
> -         *
> -         * However without NextRIP support, the event injection still needs
> -         * fully emulating to get the correct eip in the trap frame, yet get
> -         * the correct faulting eip should a fault occur.
> +         * Hardware special cases HW_EXCEPTION with vectors 3 and 4 as having
> +         * trap semantics, and will perform DPL checks.
>           */
>          if ( cpu_has_svm_nrips )
>              vmcb->nextrip = regs->eip + _event.insn_len;
> +        else
> +            regs->eip += _event.insn_len;
>          eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
>          break;

By moving the addition to regs->rip here, you bypass x86_emulate()'s
zeroing of the upper 32 bits outside of 64-bit mode, so you now need
to replicate it below here.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1694,8 +1694,6 @@ static int inject_swint(enum x86_swint_type type,
>                      goto raise_exn;
>              }
>          }
> -
> -        ctxt->regs->eip += insn_len;
>      }
>  
>      rc = ops->inject_sw_interrupt(type, vector, insn_len, ctxt);

I think for the patch description to be correct, this call's return value
needs to be altered, for inject_swint() to return EXCEPTION when
OKAY comes back here (which iirc you do in a later patch when you
eliminate this hook).

> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -64,7 +64,13 @@ enum x86_swint_type {
>      x86_swint_int,   /* 0xcd $n */
>  };
>  
> -/* How much help is required with software event injection? */
> +/*
> + * How much help is required with software event injection?
> + *
> + * All software events return from x86_emulate() with X86EMUL_EXCEPTION and
> + * fault-like semantics.  This just controls whether the emulator performs
> + * presence/dpl/etc checks and possibly raises excepions instead.

exceptions

Jan


* Re: [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour
  2016-12-01 10:40   ` Jan Beulich
@ 2016-12-01 10:58     ` Andrew Cooper
  2016-12-01 11:21       ` Jan Beulich
  0 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-12-01 10:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 01/12/16 10:40, Jan Beulich wrote:
>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -2404,6 +2404,11 @@ x86_decode(
>>  #undef insn_fetch_bytes
>>  #undef insn_fetch_type
>>  
>> +/* Undo DEBUG wrapper. */
>> +#ifdef x86_emulate
>> +#undef x86_emulate
>> +#endif
> I don't see the need for the #ifdef here.

It will break the non-debug build if removed, as x86_emulate wouldn't be
a define.

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -23,6 +23,10 @@
>>  #ifndef __X86_EMULATE_H__
>>  #define __X86_EMULATE_H__
>>  
>> +#ifndef ASSERT
>> +#define ASSERT assert
>> +#endif
> This doesn't seem to belong here (as there's nothing making sure
> assert is defined), and duplicates an existing #define in the test
> harness's x86_emulate.c. I could agree to deleting that other one
> and wrapping the one here with #ifndef __XEN__.

Ok.

>
>> @@ -554,6 +558,27 @@ x86_emulate(
>>      const struct x86_emulate_ops *ops);
>>  
>>  /*
>> + * In debug builds, wrap x86_emulate() with some assertions about its expected
>> + * behaviour.
>> + */
>> +#ifndef NDEBUG
> Mind swapping the order of comment and #ifndef, to make it more
> reasonable to possibly add further things into this guarded block?

Ok.

~Andrew


* Re: [PATCH v3 10/24] x86/emul: Always use fault semantics for software events
  2016-12-01 10:53   ` Jan Beulich
@ 2016-12-01 11:15     ` Andrew Cooper
  2016-12-01 11:23       ` Jan Beulich
  0 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-12-01 11:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Boris Ostrovsky, Tim Deegan, Suravee Suthikulpanit, Xen-devel

On 01/12/16 10:53, Jan Beulich wrote:
>
>>          eventinj.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
>>          break;
>>  
>>      case X86_EVENTTYPE_PRI_SW_EXCEPTION: /* icebp */
>>          /*
>> -         * icebp's injection must always be emulated.  Software injection help
>> -         * in x86_emulate has moved eip forward, but NextRIP (if used) still
>> -         * needs setting or execution will resume from 0.
>> +         * icebp's injection must always be emulated, as hardware does not
>> +         * special case HW_EXCEPTION with vector 1 (#DB) as having trap
>> +         * semantics.
>>           */
>> +        regs->eip += _event.insn_len;
>>          if ( cpu_has_svm_nrips )
>>              vmcb->nextrip = regs->eip;
>>          eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
>> @@ -1270,16 +1281,13 @@ static void svm_inject_event(const struct x86_event *event)
>>  
>>      case X86_EVENTTYPE_SW_EXCEPTION: /* int3, into */
>>          /*
>> -         * The AMD manual states that .type=3 (HW exception), .vector=3 or 4,
>> -         * will perform DPL checks.  Experimentally, DPL and presence checks
>> -         * are indeed performed, even without NextRIP support.
>> -         *
>> -         * However without NextRIP support, the event injection still needs
>> -         * fully emulating to get the correct eip in the trap frame, yet get
>> -         * the correct faulting eip should a fault occur.
>> +         * Hardware special cases HW_EXCEPTION with vectors 3 and 4 as having
>> +         * trap semantics, and will perform DPL checks.
>>           */
>>          if ( cpu_has_svm_nrips )
>>              vmcb->nextrip = regs->eip + _event.insn_len;
>> +        else
>> +            regs->eip += _event.insn_len;
>>          eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
>>          break;
> By moving the addition to regs->rip here, you bypass x86_emulate()'s
> zeroing of the upper 32 bits outside of 64-bit mode, so you now need
> to replicate it below here.

Hmm - this was actually already broken with the nextrip adjustment.  I
will introduce a fixup at the end of this switch statement covering both
fields.
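
One possible shape for that common fixup (a sketch only; the mode
predicate is a placeholder, not the real check):

    /* After the switch: outside 64-bit mode, zero-extend both fields,
     * mirroring what x86_emulate() does for %eip. */
    if ( !in_64bit_mode )                    /* placeholder predicate */
    {
        regs->rip = (uint32_t)regs->rip;
        vmcb->nextrip = (uint32_t)vmcb->nextrip;
    }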

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -1694,8 +1694,6 @@ static int inject_swint(enum x86_swint_type type,
>>                      goto raise_exn;
>>              }
>>          }
>> -
>> -        ctxt->regs->eip += insn_len;
>>      }
>>  
>>      rc = ops->inject_sw_interrupt(type, vector, insn_len, ctxt);
> I think for the patch description to be correct, this call's return value
> needs to be altered, for inject_swint() to return EXCEPTION when
> OKAY comes back here (which iirc you do in a later patch when you
> eliminate this hook).

At the moment, the sole user is

    swint:
        rc = inject_swint(swint_type, (uint8_t)src.val,
                          _regs.eip - ctxt->regs->eip,
                          ctxt, ops) ? : X86EMUL_EXCEPTION;
        goto done;

so the description is correct.

We currently have a uniform pattern of the injection functions returning
OKAY, and being converted to EXCEPTION by the caller.  If this wants
fixing at all, it wants fixing later when switching to using
x86_emul_software_event().

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -64,7 +64,13 @@ enum x86_swint_type {
>>      x86_swint_int,   /* 0xcd $n */
>>  };
>>  
>> -/* How much help is required with software event injection? */
>> +/*
>> + * How much help is required with software event injection?
>> + *
>> + * All software events return from x86_emulate() with X86EMUL_EXCEPTION and
>> + * fault-like semantics.  This just controls whether the emulator performs
>> + * presence/dpl/etc checks and possibly raises excepions instead.
> exceptions

I had already noticed and fixed this.

~Andrew


* Re: [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
  2016-11-30 13:50 ` [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag Andrew Cooper
  2016-11-30 14:28   ` Paul Durrant
@ 2016-12-01 11:16   ` Jan Beulich
  2016-12-01 11:23     ` Andrew Cooper
  1 sibling, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 11:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Tim Deegan, Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> The behaviour of singlestep is to raise #DB after the instruction has been
> completed, but implementing it with inject_hw_exception() causes x86_emulate()
> to return X86EMUL_EXCEPTION, despite succesfully completing execution of the
> instruction, including register writeback.

Nice, I think that'll help simplify the privop patch a bit.

> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3422,6 +3422,16 @@ static int sh_page_fault(struct vcpu *v,
>          v->arch.paging.last_write_emul_ok = 0;
>  #endif
>  
> +    if ( emul_ctxt.ctxt.retire.singlestep )
> +    {
> +        if ( is_hvm_vcpu(v) )
> +            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +        else
> +            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +
> +        goto emulate_done;

This results in skipping the PAE special code (which I think is intended)
but also the trace_shadow_emulate(), which I don't think is wanted.

> @@ -3433,7 +3443,7 @@ static int sh_page_fault(struct vcpu *v,
>              shadow_continue_emulation(&emul_ctxt, regs);
>              v->arch.paging.last_write_was_pt = 0;
>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
> -            if ( r == X86EMUL_OKAY )
> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )

I think this wants to have a comment attached explaining why
a blanket check of all current and future retire flags here is the
right thing (or benign).

> @@ -3449,6 +3459,15 @@ static int sh_page_fault(struct vcpu *v,
>              {
>                  perfc_incr(shadow_em_ex_fail);
>                  TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED);
> +
> +                if ( emul_ctxt.ctxt.retire.singlestep )
> +                {
> +                    if ( is_hvm_vcpu(v) )
> +                        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +                    else
> +                        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +                }
> +
>                  break; /* Don't emulate again if we failed! */

This comment is now slightly stale.

> @@ -5415,11 +5414,11 @@ x86_emulate(
>      if ( !mode_64bit() )
>          _regs.eip = (uint32_t)_regs.eip;
>  
> -    *ctxt->regs = _regs;
>> +    /* Was singlestepping active at the start of this instruction? */
> +    if ( (rc == X86EMUL_OKAY) && (ctxt->regs->eflags & EFLG_TF) )
> +        ctxt->retire.singlestep = true;

Don't we need to avoid doing this when mov_ss is set? Or does the
hardware perhaps do the necessary deferring for us?

Jan



* Re: [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing
  2016-12-01 10:51     ` Andrew Cooper
@ 2016-12-01 11:19       ` Jan Beulich
  0 siblings, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 11:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 01.12.16 at 11:51, <andrew.cooper3@citrix.com> wrote:
> On 01/12/16 10:18, Jan Beulich wrote:
>>>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -2656,6 +2656,8 @@ x86_emulate(
>>>                                &dst.val, op_bytes, ctxt, ops)) != 0 ||
>>>               (rc = load_seg(src.val, dst.val, 0, NULL, ctxt, ops)) != 0 )
>>>              goto done;
>>> +        if ( src.val == x86_seg_ss )
>>> +            ctxt->retire.mov_ss = 1;
>>>          break;
>> While I don't mind it being done here (i.e. it can have my R-b as is),
>> wouldn't it be even better to put this into load_seg() itself?
> 
> That would cause the mov_ss flag to be incorrectly set for `lss`.

Oh, good point. So as said,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour
  2016-12-01 10:58     ` Andrew Cooper
@ 2016-12-01 11:21       ` Jan Beulich
  0 siblings, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 11:21 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 01.12.16 at 11:58, <andrew.cooper3@citrix.com> wrote:
> On 01/12/16 10:40, Jan Beulich wrote:
>>
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -2404,6 +2404,11 @@ x86_decode(
>>>  #undef insn_fetch_bytes
>>>  #undef insn_fetch_type
>>>  
>>> +/* Undo DEBUG wrapper. */
>>> +#ifdef x86_emulate
>>> +#undef x86_emulate
>>> +#endif
>> I don't see the need for the #ifdef here.
> 
> It will break the non-debug build if removed, as x86_emulate wouldn't be
> a define.

#undef for something that isn't a #define is well defined to do
nothing.

Jan
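
The rule Jan relies on, in miniature (per C99 6.10.3.5, an #undef of an
identifier not currently defined as a macro is simply ignored):

    #define x86_emulate x86_emulate_wrapper
    #undef  x86_emulate
    #undef  x86_emulate   /* no-op: x86_emulate is no longer defined */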



* Re: [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
  2016-12-01 11:16   ` Jan Beulich
@ 2016-12-01 11:23     ` Andrew Cooper
  2016-12-01 11:33       ` Tim Deegan
  2016-12-01 12:05       ` Jan Beulich
  0 siblings, 2 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-12-01 11:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Tim Deegan, Xen-devel

On 01/12/16 11:16, Jan Beulich wrote:
>>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
>> The behaviour of singlestep is to raise #DB after the instruction has been
>> completed, but implementing it with inject_hw_exception() causes x86_emulate()
>> to return X86EMUL_EXCEPTION, despite succesfully completing execution of the
>> instruction, including register writeback.
> Nice, I think that'll help simplify the privop patch a bit.
>
>> --- a/xen/arch/x86/mm/shadow/multi.c
>> +++ b/xen/arch/x86/mm/shadow/multi.c
>> @@ -3422,6 +3422,16 @@ static int sh_page_fault(struct vcpu *v,
>>          v->arch.paging.last_write_emul_ok = 0;
>>  #endif
>>  
>> +    if ( emul_ctxt.ctxt.retire.singlestep )
>> +    {
>> +        if ( is_hvm_vcpu(v) )
>> +            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>> +        else
>> +            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>> +
>> +        goto emulate_done;
> This results in skipping the PAE special code (which I think is intended)

Correct

> but also the trace_shadow_emulate(), which I don't think is wanted.

Hmm.  It is only the PAE case we want to skip.  Perhaps changing the PAE
entry condition to

diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index c45d260..28ff945 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3480,7 +3480,7 @@ static int sh_page_fault(struct vcpu *v,
     }
 
 #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
-    if ( r == X86EMUL_OKAY ) {
+    if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw ) {
         int i, emulation_count=0;
         this_cpu(trace_emulate_initial_va) = va;
         /* Emulate up to four extra instructions in the hope of catching

would be better, along with suitable comments and style fixes?

>
>> @@ -3433,7 +3443,7 @@ static int sh_page_fault(struct vcpu *v,
>>              shadow_continue_emulation(&emul_ctxt, regs);
>>              v->arch.paging.last_write_was_pt = 0;
>>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
>> -            if ( r == X86EMUL_OKAY )
>> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
> I think this wants to have a comment attached explaining why
> a blanket check of all current and future retire flags here is the
> right thing (or benign).

Ok.

>
>> @@ -3449,6 +3459,15 @@ static int sh_page_fault(struct vcpu *v,
>>              {
>>                  perfc_incr(shadow_em_ex_fail);
>>                  TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_EMULATION_LAST_FAILED);
>> +
>> +                if ( emul_ctxt.ctxt.retire.singlestep )
>> +                {
>> +                    if ( is_hvm_vcpu(v) )
>> +                        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>> +                    else
>> +                        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>> +                }
>> +
>>                  break; /* Don't emulate again if we failed! */
> This comment is now slightly stale.

"failed to find the second half of the write".  In combination with a
suitable comment in the hunk above, this should be fine as is.

>
>> @@ -5415,11 +5414,11 @@ x86_emulate(
>>      if ( !mode_64bit() )
>>          _regs.eip = (uint32_t)_regs.eip;
>>  
>> -    *ctxt->regs = _regs;
>> +    /* Was singlestepping active at the start of this instruction? */
>> +    if ( (rc == X86EMUL_OKAY) && (ctxt->regs->eflags & EFLG_TF) )
>> +        ctxt->retire.singlestep = true;
> Don't we need to avoid doing this when mov_ss is set? Or does the
> hardware perhaps do the necessary deferring for us?

I am currently reading up about this in the manual.  I need more coffee.

~Andrew
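
One plausible resolution, should the manual require deferring the trap
(a sketch; it hinges on the MOV-SS shadow also suppressing the
single-step #DB):

    /* Was singlestepping active at the start of this instruction? */
    if ( (rc == X86EMUL_OKAY) &&
         (ctxt->regs->eflags & EFLG_TF) &&
         !ctxt->retire.mov_ss )
        ctxt->retire.singlestep = true;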


* Re: [PATCH v3 10/24] x86/emul: Always use fault semantics for software events
  2016-12-01 11:15     ` Andrew Cooper
@ 2016-12-01 11:23       ` Jan Beulich
  0 siblings, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 11:23 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Boris Ostrovsky, Tim Deegan, Suravee Suthikulpanit, Xen-devel

>>> On 01.12.16 at 12:15, <andrew.cooper3@citrix.com> wrote:
> On 01/12/16 10:53, Jan Beulich wrote:
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -1694,8 +1694,6 @@ static int inject_swint(enum x86_swint_type type,
>>>                      goto raise_exn;
>>>              }
>>>          }
>>> -
>>> -        ctxt->regs->eip += insn_len;
>>>      }
>>>  
>>>      rc = ops->inject_sw_interrupt(type, vector, insn_len, ctxt);
>> I think for the patch description to be correct, this call's return value
>> needs to be altered, for inject_swint() to return EXCEPTION when
>> OKAY comes back here (which iirc you do in a later patch when you
>> eliminate this hook).
> 
> At the moment, the sole user is
> 
>     swint:
>         rc = inject_swint(swint_type, (uint8_t)src.val,
>                           _regs.eip - ctxt->regs->eip,
>                           ctxt, ops) ? : X86EMUL_EXCEPTION;
>         goto done;
> 
> so the description is correct.

Oh - I should have checked the caller. Sorry for the noise.

Jan



* Re: [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
  2016-12-01 11:23     ` Andrew Cooper
@ 2016-12-01 11:33       ` Tim Deegan
  2016-12-01 12:05       ` Jan Beulich
  1 sibling, 0 replies; 59+ messages in thread
From: Tim Deegan @ 2016-12-01 11:33 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Jan Beulich, Xen-devel

At 11:23 +0000 on 01 Dec (1480591394), Andrew Cooper wrote:
> Hmm.  It is only the PAE case we want to skip.  Perhaps changing the PAE
> entry condition to
> 
> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index c45d260..28ff945 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3480,7 +3480,7 @@ static int sh_page_fault(struct vcpu *v,
>      }
>  
>  #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
> -    if ( r == X86EMUL_OKAY ) {
> +    if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw ) {
>          int i, emulation_count=0;
>          this_cpu(trace_emulate_initial_va) = va;
>          /* Emulate up to four extra instructions in the hope of catching
> 
> would be better, along with suitable comments and style fixes?

That would be OK by me, and with that change,

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH v3 13/24] x86/emul: Rework emulator event injection
  2016-11-30 13:50 ` [PATCH v3 13/24] x86/emul: Rework emulator event injection Andrew Cooper
  2016-11-30 14:26   ` Paul Durrant
@ 2016-12-01 11:35   ` Tim Deegan
  2016-12-01 12:31   ` Jan Beulich
  2 siblings, 0 replies; 59+ messages in thread
From: Tim Deegan @ 2016-12-01 11:35 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Jan Beulich, Xen-devel

At 13:50 +0000 on 30 Nov (1480513830), Andrew Cooper wrote:
> The emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.
> 
> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
> are visible to the emulator.  This removes the need for the
> inject_{hw_exception,sw_interrupt}() hooks, which are dropped and replaced
> with x86_emul_{hw_exception,software_event,reset_event}() instead.
> 
> For exceptions raised by x86_emulate() itself (rather than its callbacks), the
> shadow pagetable and PV uses of x86_emulate() previously failed with
> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks.
> 
> This behaviour has changed, and such cases will now return X86EMUL_EXCEPTION
> with event_pending set.  Until the callers of x86_emulate() have been updated
> to inject events back into the guest, divert the event_pending case back into
> the X86EMUL_UNHANDLEABLE path to maintain the same guest-visible behaviour.
> 
> No overall functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH v3 18/24] x86/shadow: Avoid raising faults behind the emulators back
  2016-11-30 13:50 ` [PATCH v3 18/24] x86/shadow: " Andrew Cooper
@ 2016-12-01 11:39   ` Tim Deegan
  2016-12-01 11:40     ` Andrew Cooper
  2016-12-01 13:00   ` Jan Beulich
  1 sibling, 1 reply; 59+ messages in thread
From: Tim Deegan @ 2016-12-01 11:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Xen-devel

At 13:50 +0000 on 30 Nov (1480513835), Andrew Cooper wrote:
> Use x86_emul_{hw_exception,pagefault}() rather than
> {pv,hvm}_inject_page_fault() and hvm_inject_hw_exception() to cause raised
> faults to be known to the emulator.  This requires altering the callers of
> x86_emulate() to properly re-inject the event.
> 
> While fixing this, fix the singlestep behaviour.  Previously, an otherwise
> successful emulation would fail if singlestepping was active, as the emulator
> couldn't raise #DB.  This is unreasonable from the point of view of the guest.
> 
> We therefore tolerate #PF/#GP/#SS and #DB being raised by the emulator, but
> reject anything else as unexpected.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Please update the patch description to remove the bits about
singlestepping and #DB. With that,

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH v3 18/24] x86/shadow: Avoid raising faults behind the emulators back
  2016-12-01 11:39   ` Tim Deegan
@ 2016-12-01 11:40     ` Andrew Cooper
  0 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-12-01 11:40 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Jan Beulich, Xen-devel

On 01/12/16 11:39, Tim Deegan wrote:
> At 13:50 +0000 on 30 Nov (1480513835), Andrew Cooper wrote:
>> Use x86_emul_{hw_exception,pagefault}() rather than
>> {pv,hvm}_inject_page_fault() and hvm_inject_hw_exception() to cause raised
>> faults to be known to the emulator.  This requires altering the callers of
>> x86_emulate() to properly re-inject the event.
>>
>> While fixing this, fix the singlestep behaviour.  Previously, an otherwise
>> successful emulation would fail if singlestepping was active, as the emulator
>> couldn't raise #DB.  This is unreasonable from the point of view of the guest.
>>
>> We therefore tolerate #PF/#GP/#SS and #DB being raised by the emulator, but
>> reject anything else as unexpected.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Please update the patch description to remove the bits about
> singlestepping and #DB. With that,
>
> Acked-by: Tim Deegan <tim@xen.org>

Oh of course.  I managed to do that on the previous patch, but not this
one.  Sorry and thanks.


* Re: [PATCH v3 17/24] x86/pv: Avoid raising faults behind the emulators back
  2016-11-30 13:50 ` [PATCH v3 17/24] x86/pv: " Andrew Cooper
@ 2016-12-01 11:50   ` Tim Deegan
  2016-12-01 12:57   ` Jan Beulich
  1 sibling, 0 replies; 59+ messages in thread
From: Tim Deegan @ 2016-12-01 11:50 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Jan Beulich, Xen-devel

At 13:50 +0000 on 30 Nov (1480513834), Andrew Cooper wrote:
> Use x86_emul_pagefault() rather than pv_inject_page_fault() to cause raised
> pagefaults to be known to the emulator.  This requires altering the callers of
> x86_emulate() to properly re-inject the event.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag
  2016-12-01 11:23     ` Andrew Cooper
  2016-12-01 11:33       ` Tim Deegan
@ 2016-12-01 12:05       ` Jan Beulich
  1 sibling, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 12:05 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Tim Deegan, Xen-devel

>>> On 01.12.16 at 12:23, <andrew.cooper3@citrix.com> wrote:
> On 01/12/16 11:16, Jan Beulich wrote:
>>>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
>>> @@ -3422,6 +3422,16 @@ static int sh_page_fault(struct vcpu *v,
>>>          v->arch.paging.last_write_emul_ok = 0;
>>>  #endif
>>>  
>>> +    if ( emul_ctxt.ctxt.retire.singlestep )
>>> +    {
>>> +        if ( is_hvm_vcpu(v) )
>>> +            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>>> +        else
>>> +            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>>> +
>>> +        goto emulate_done;
>> This results in skipping the PAE special code (which I think is intended)
> 
> Correct
> 
>> but also the trace_shadow_emulate(), which I don't think is wanted.
> 
> Hmm.  It is only the PAE case we want to skip.  Perhaps changing the PAE
> entry condition to
> 
> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index c45d260..28ff945 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3480,7 +3480,7 @@ static int sh_page_fault(struct vcpu *v,
>      }
>  
>  #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
> -    if ( r == X86EMUL_OKAY ) {
> +    if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw ) {
>          int i, emulation_count=0;
>          this_cpu(trace_emulate_initial_va) = va;
>          /* Emulate up to four extra instructions in the hope of catching
> 
> would be better, along with suitable comments and style fixes?

Yes I think so (and I see Tim has said so too).
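
With the comments and style fixes folded in, the entry condition might
plausibly become (a sketch, not the final patch):

    #if GUEST_PAGING_LEVELS == 3 /* PAE guest */
        /*
         * Skip the batch-emulation heuristic if the emulator left
         * anything pending in the retire union (e.g. a singlestep
         * #DB), so that the pending event is delivered first.
         */
        if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.retire.raw )
        {
            /* existing extra-emulation loop, unchanged */
        }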

>>> @@ -5415,11 +5414,11 @@ x86_emulate(
>>>      if ( !mode_64bit() )
>>>          _regs.eip = (uint32_t)_regs.eip;
>>>  
>>> -    *ctxt->regs = _regs;
>>> +    /* Was singlestepping active at the start of this instruction? */
>>> +    if ( (rc == X86EMUL_OKAY) && (ctxt->regs->eflags & EFLG_TF) )
>>> +        ctxt->retire.singlestep = true;
>> Don't we need to avoid doing this when mov_ss is set? Or does the
>> hardware perhaps do the necessary deferring for us?
> 
> I am currently reading up about this in the manual.

Tell me if you find anything. All I have is my memory of good old
DOS days, where I recall single stepping over %ss loads always
surprised me (over time of course with a fading level of surprise)
by taking two steps. The only thing I can't tell for sure is whether
this maybe was a cute feature of the debugger (recognizing the
%ss load instructions).
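
(For concreteness, the deferral under discussion might look like the
following sketch, assuming the retire union's mov_ss flag is set by an
instruction which has just loaded %ss:

    /* Was singlestepping active at the start of this instruction? */
    if ( (rc == X86EMUL_OKAY) && (ctxt->regs->eflags & EFLG_TF) &&
         !ctxt->retire.mov_ss /* hardware suppresses the trap for an
                                 instruction loading %ss */ )
        ctxt->retire.singlestep = true;

i.e. mirroring the hardware behaviour rather than relying on it.)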

Jan



* Re: [PATCH v3 13/24] x86/emul: Rework emulator event injection
  2016-11-30 13:50 ` [PATCH v3 13/24] x86/emul: Rework emulator event injection Andrew Cooper
  2016-11-30 14:26   ` Paul Durrant
  2016-12-01 11:35   ` Tim Deegan
@ 2016-12-01 12:31   ` Jan Beulich
  2 siblings, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 12:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Tim Deegan, Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> The emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.
> 
> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
> are visible to the emulator.  This removes the need for the
> inject_{hw_exception,sw_interrupt}() hooks, which are dropped and replaced
> with x86_emul_{hw_exception,software_event,reset_event}() instead.
> 
> For exceptions raised by x86_emulate() itself (rather than its callbacks), the
> shadow pagetable and PV uses of x86_emulate() previously failed with
> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks.
> 
> This behaviour has changed, and such cases will now return X86EMUL_EXCEPTION
> with event_pending set.  Until the callers of x86_emulate() have been updated
> to inject events back into the guest, divert the event_pending case back into
> the X86EMUL_UNHANDLEABLE path to maintain the same guest-visible behaviour.
> 
> No overall functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH v3 17/24] x86/pv: Avoid raising faults behind the emulators back
  2016-11-30 13:50 ` [PATCH v3 17/24] x86/pv: " Andrew Cooper
  2016-12-01 11:50   ` Tim Deegan
@ 2016-12-01 12:57   ` Jan Beulich
  2016-12-01 13:12     ` Andrew Cooper
  1 sibling, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 12:57 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> @@ -5379,30 +5380,41 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>      page_unlock(page);
>      put_page(page);
>  
> -    /*
> -     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
> -     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
> -     * now set event_pending instead.  Exceptions raised behind the back of
> -     * the emulator don't yet set event_pending.
> -     *
> -     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
> -     * for no functional change from before.  Future patches will fix this
> -     * properly.
> -     */
> -    if ( rc == X86EMUL_EXCEPTION && ptwr_ctxt.ctxt.event_pending )
> -        rc = X86EMUL_UNHANDLEABLE;
> +    /* More strict than x86_emulate_wrapper(), as this is now true for PV. */
> +    ASSERT(ptwr_ctxt.ctxt.event_pending == (rc == X86EMUL_EXCEPTION));
>  
> -    if ( rc == X86EMUL_UNHANDLEABLE )
> -        goto bail;
> +    switch ( rc )
> +    {
> +    case X86EMUL_EXCEPTION:
> +        /*
> +         * This emulation only covers writes to pagetables which marked

It looks to me as if either the "which" wants to be dropped, or "are" /
"have been" be inserted after it.

> +         * read-only by Xen.  We tolerate #PF (from hitting an adjacent page).

Why "adjacent"? Aiui the only legitimate #PF here can be from a
page having got paged out by the guest by the time we get here, and that
could be either the page table page itself, or the page(s) holding
the instruction (which normally aren't adjacent at all).

> +         * Anything else is an emulation bug, or a guest playing with the
> +         * instruction stream under Xen's feet.
> +         */
> +        if ( ptwr_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
> +             ptwr_ctxt.ctxt.event.vector == TRAP_page_fault )
> +            pv_inject_event(&ptwr_ctxt.ctxt.event);
> +        else
> +            gdprintk(XENLOG_WARNING,
> +                     "Unexpected event (type %u, vector %#x) from emulation\n",
> +                     ptwr_ctxt.ctxt.event.type, ptwr_ctxt.ctxt.event.vector);
> +
> +        /* Fallthrough */
> +    case X86EMUL_OKAY:
>  
> -    if ( ptwr_ctxt.ctxt.retire.singlestep )
> -        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> +        if ( ptwr_ctxt.ctxt.retire.singlestep )
> +            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>  
> -    perfc_incr(ptwr_emulations);
> -    return EXCRET_fault_fixed;
> +        /* Fallthrough */
> +    case X86EMUL_RETRY:
> +        perfc_incr(ptwr_emulations);
> +        return EXCRET_fault_fixed;
>  
>   bail:
> -    return 0;
> +    default:
> +        return 0;
> +    }
>  }

I think omitting the default label would not only make the patch
slightly smaller, but also result in the bail label look less misplaced.

With at least the comment aspect above taken care of,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v3 18/24] x86/shadow: Avoid raising faults behind the emulators back
  2016-11-30 13:50 ` [PATCH v3 18/24] x86/shadow: " Andrew Cooper
  2016-12-01 11:39   ` Tim Deegan
@ 2016-12-01 13:00   ` Jan Beulich
  2016-12-01 13:15     ` Andrew Cooper
  1 sibling, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 13:00 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Xen-devel

>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3373,18 +3373,35 @@ static int sh_page_fault(struct vcpu *v,
>  
>      r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
>  
> -    /*
> -     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
> -     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
> -     * now set event_pending instead.  Exceptions raised behind the back of
> -     * the emulator don't yet set event_pending.
> -     *
> -     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
> -     * for no functional change from before.  Future patches will fix this
> -     * properly.
> -     */
>      if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
> -        r = X86EMUL_UNHANDLEABLE;
> +    {
> +        /*
> +         * This emulation covers writes to shadow pagetables.  We tolerate #PF
> +         * (from hitting adjacent pages) and #GP/#SS (from segmentation
> +         * errors).  Anything else is an emulation bug, or a guest playing
> +         * with the instruction stream under Xen's feet.
> +         */

Same comment here regarding "adjacent".

> +        if ( emul_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
> +             (emul_ctxt.ctxt.event.vector < 32) &&
> +             ((1u << emul_ctxt.ctxt.event.vector) &
> +              ((1u << TRAP_stack_error) | (1u << TRAP_gp_fault) |
> +               (1u << TRAP_page_fault))) )

May I suggest to also demand an error code of zero for #GP/#SS?

> +        {
> +            if ( is_hvm_vcpu(v) )

has_hvm_container_domain()?
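
Taking both suggestions together, the check might plausibly become (a
sketch; the error_code field name is assumed from struct x86_event,
and hvm_inject_event() is taken to be the HVM counterpart of
pv_inject_event() from earlier in the series):

    if ( emul_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
         (emul_ctxt.ctxt.event.vector == TRAP_page_fault ||
          ((emul_ctxt.ctxt.event.vector == TRAP_stack_error ||
            emul_ctxt.ctxt.event.vector == TRAP_gp_fault) &&
           emul_ctxt.ctxt.event.error_code == 0)) )
    {
        if ( has_hvm_container_domain(v->domain) )
            hvm_inject_event(&emul_ctxt.ctxt.event);
        else
            pv_inject_event(&emul_ctxt.ctxt.event);
    }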

Jan



* Re: [PATCH v3 17/24] x86/pv: Avoid raising faults behind the emulators back
  2016-12-01 12:57   ` Jan Beulich
@ 2016-12-01 13:12     ` Andrew Cooper
  2016-12-01 13:27       ` Jan Beulich
  0 siblings, 1 reply; 59+ messages in thread
From: Andrew Cooper @ 2016-12-01 13:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim Deegan, Xen-devel

On 01/12/16 12:57, Jan Beulich wrote:
>>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
>> @@ -5379,30 +5380,41 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>      page_unlock(page);
>>      put_page(page);
>>  
>> -    /*
>> -     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
>> -     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
>> -     * now set event_pending instead.  Exceptions raised behind the back of
>> -     * the emulator don't yet set event_pending.
>> -     *
>> -     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
>> -     * for no functional change from before.  Future patches will fix this
>> -     * properly.
>> -     */
>> -    if ( rc == X86EMUL_EXCEPTION && ptwr_ctxt.ctxt.event_pending )
>> -        rc = X86EMUL_UNHANDLEABLE;
>> +    /* More strict than x86_emulate_wrapper(), as this is now true for PV. */
>> +    ASSERT(ptwr_ctxt.ctxt.event_pending == (rc == X86EMUL_EXCEPTION));
>>  
>> -    if ( rc == X86EMUL_UNHANDLEABLE )
>> -        goto bail;
>> +    switch ( rc )
>> +    {
>> +    case X86EMUL_EXCEPTION:
>> +        /*
>> +         * This emulation only covers writes to pagetables which marked
> It looks to me as if either the "which" wants to be dropped, or "are" /
> "have been" be inserted after it.

I meant "which are".

>
>> +         * read-only by Xen.  We tolerate #PF (from hitting an adjacent page).
> Why "adjacent"? Aiui the only legitimate #PF here can be from a
> page having got paged out by the guest by the time here, and that
> could be either the page table page itself, or the page(s) holding
> the instruction (which normally aren't adjacent at all).

This is less clear-cut than the subsequent case, as non-aligned accesses
are disallowed, so simply misaligning a write across the page boundary
won't result in the emulator raising #PF.

One issue I have decided to defer is the behaviour of UNHANDLEABLE in
this case.  If the #PF we are servicing is caused by a misaligned write
to a l1e, instead of explicitly re-injecting #PF, we let the logic
continue searching for all other cases which may have caused a #PF.

It would be better to explicitly disallow the access by re-raising #PF
than returning UNHANDLEABLE, as it skips the rest of the pagefault handler.

>
>> +         * Anything else is an emulation bug, or a guest playing with the
>> +         * instruction stream under Xen's feet.
>> +         */
>> +        if ( ptwr_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
>> +             ptwr_ctxt.ctxt.event.vector == TRAP_page_fault )
>> +            pv_inject_event(&ptwr_ctxt.ctxt.event);
>> +        else
>> +            gdprintk(XENLOG_WARNING,
>> +                     "Unexpected event (type %u, vector %#x) from emulation\n",
>> +                     ptwr_ctxt.ctxt.event.type, ptwr_ctxt.ctxt.event.vector);
>> +
>> +        /* Fallthrough */
>> +    case X86EMUL_OKAY:
>>  
>> -    if ( ptwr_ctxt.ctxt.retire.singlestep )
>> -        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>> +        if ( ptwr_ctxt.ctxt.retire.singlestep )
>> +            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>>  
>> -    perfc_incr(ptwr_emulations);
>> -    return EXCRET_fault_fixed;
>> +        /* Fallthrough */
>> +    case X86EMUL_RETRY:
>> +        perfc_incr(ptwr_emulations);
>> +        return EXCRET_fault_fixed;
>>  
>>   bail:
>> -    return 0;
>> +    default:
>> +        return 0;
>> +    }
>>  }
> I think omitting the default label would not only make the patch
> slightly smaller, but also result in the bail label look less misplaced.

The default label is needed to cover the UNHANDLEABLE case.

~Andrew

>
> With at least the comment aspect above taken care of,
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> Jan
>



* Re: [PATCH v3 18/24] x86/shadow: Avoid raising faults behind the emulators back
  2016-12-01 13:00   ` Jan Beulich
@ 2016-12-01 13:15     ` Andrew Cooper
  0 siblings, 0 replies; 59+ messages in thread
From: Andrew Cooper @ 2016-12-01 13:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim Deegan, Xen-devel

On 01/12/16 13:00, Jan Beulich wrote:
>>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/mm/shadow/multi.c
>> +++ b/xen/arch/x86/mm/shadow/multi.c
>> @@ -3373,18 +3373,35 @@ static int sh_page_fault(struct vcpu *v,
>>  
>>      r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
>>  
>> -    /*
>> -     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
>> -     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
>> -     * now set event_pending instead.  Exceptions raised behind the back of
>> -     * the emulator don't yet set event_pending.
>> -     *
>> -     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
>> -     * for no functional change from before.  Future patches will fix this
>> -     * properly.
>> -     */
>>      if ( r == X86EMUL_EXCEPTION && emul_ctxt.ctxt.event_pending )
>> -        r = X86EMUL_UNHANDLEABLE;
>> +    {
>> +        /*
>> +         * This emulation covers writes to shadow pagetables.  We tolerate #PF
>> +         * (from hitting adjacent pages) and #GP/#SS (from segmentation
>> +         * errors).  Anything else is an emulation bug, or a guest playing
>> +         * with the instruction stream under Xen's feet.
>> +         */
> Same comment here regarding "adjacent".

In this case, the answer is different.  A misaligned write across the
end of a shadow pagetable may legitimately trigger a #PF.

>
>> +        if ( emul_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
>> +             (emul_ctxt.ctxt.event.vector < 32) &&
>> +             ((1u << emul_ctxt.ctxt.event.vector) &
>> +              ((1u << TRAP_stack_error) | (1u << TRAP_gp_fault) |
>> +               (1u << TRAP_page_fault))) )
> May I suggest to also demand an error code of zero for #GP/#SS?

Ok.

>
>> +        {
>> +            if ( is_hvm_vcpu(v) )
> has_hvm_container_domain()?

Very good point.  Will fix.

~Andrew


* Re: [PATCH v3 17/24] x86/pv: Avoid raising faults behind the emulators back
  2016-12-01 13:12     ` Andrew Cooper
@ 2016-12-01 13:27       ` Jan Beulich
  0 siblings, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2016-12-01 13:27 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Xen-devel

>>> On 01.12.16 at 14:12, <andrew.cooper3@citrix.com> wrote:
> On 01/12/16 12:57, Jan Beulich wrote:
>>>>> On 30.11.16 at 14:50, <andrew.cooper3@citrix.com> wrote:
>>> @@ -5379,30 +5380,41 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>>      page_unlock(page);
>>>      put_page(page);
>>>  
>>> -    /*
>>> -     * The previous lack of inject_{sw,hw}*() hooks caused exceptions raised
>>> -     * by the emulator itself to become X86EMUL_UNHANDLEABLE.  Such exceptions
>>> -     * now set event_pending instead.  Exceptions raised behind the back of
>>> -     * the emulator don't yet set event_pending.
>>> -     *
>>> -     * For now, cause such cases to return to the X86EMUL_UNHANDLEABLE path,
>>> -     * for no functional change from before.  Future patches will fix this
>>> -     * properly.
>>> -     */
>>> -    if ( rc == X86EMUL_EXCEPTION && ptwr_ctxt.ctxt.event_pending )
>>> -        rc = X86EMUL_UNHANDLEABLE;
>>> +    /* More strict than x86_emulate_wrapper(), as this is now true for PV. */
>>> +    ASSERT(ptwr_ctxt.ctxt.event_pending == (rc == X86EMUL_EXCEPTION));
>>>  
>>> -    if ( rc == X86EMUL_UNHANDLEABLE )
>>> -        goto bail;
>>> +    switch ( rc )
>>> +    {
>>> +    case X86EMUL_EXCEPTION:
>>> +        /*
>>> +         * This emulation only covers writes to pagetables which marked
>>> +         * read-only by Xen.  We tolerate #PF (from hitting an adjacent page).
>> Why "adjacent"? Aiui the only legitimate #PF here can be from a
>> page having got paged out by the guest by the time we get here, and that
>> could be either the page table page itself, or the page(s) holding
>> the instruction (which normally aren't adjacent at all).
> 
> This is less clear-cut than the subsequent case, as non-aligned accesses
> are disallowed, so simply misaligning a write across the page boundary
> won't result in the emulator raising #PF.

I don't understand - what does this have to do with possibly getting
#PF from fetching the insn?

> One issue I have decided to defer is the behaviour of UNHANDLEABLE in
> this case.  If the #PF we are servicing is caused by a misaligned write
> to a l1e, instead of explicitly re-injecting #PF, we let the logic
> continue searching for all other cases which may have caused a #PF.
> 
> It would be better to explicitly disallow the access by re-raising #PF
> than returning UNHANDLEABLE, as it skips the rest of the pagefault handler.

At the point of the check we don't know yet whether we're dealing
with a page table, so we need to give other handlers a chance.

>>> +         * Anything else is an emulation bug, or a guest playing with the
>>> +         * instruction stream under Xen's feet.
>>> +         */
>>> +        if ( ptwr_ctxt.ctxt.event.type == X86_EVENTTYPE_HW_EXCEPTION &&
>>> +             ptwr_ctxt.ctxt.event.vector == TRAP_page_fault )
>>> +            pv_inject_event(&ptwr_ctxt.ctxt.event);
>>> +        else
>>> +            gdprintk(XENLOG_WARNING,
>>> +                     "Unexpected event (type %u, vector %#x) from emulation\n",
>>> +                     ptwr_ctxt.ctxt.event.type, ptwr_ctxt.ctxt.event.vector);
>>> +
>>> +        /* Fallthrough */
>>> +    case X86EMUL_OKAY:
>>>  
>>> -    if ( ptwr_ctxt.ctxt.retire.singlestep )
>>> -        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>>> +        if ( ptwr_ctxt.ctxt.retire.singlestep )
>>> +            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>>>  
>>> -    perfc_incr(ptwr_emulations);
>>> -    return EXCRET_fault_fixed;
>>> +        /* Fallthrough */
>>> +    case X86EMUL_RETRY:
>>> +        perfc_incr(ptwr_emulations);
>>> +        return EXCRET_fault_fixed;
>>>  
>>>   bail:
>>> -    return 0;
>>> +    default:
>>> +        return 0;
>>> +    }
>>>  }
>> I think omitting the default label would not only make the patch
>> slightly smaller, but also result in the bail label look less misplaced.
> 
> The default label is needed to cover the UNHANDLEABLE case.

Certainly not - it can just fall out of the switch statement, reaching
the pre-existing "bail" label. I could see your point if rc was of an
enum type ...
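
I.e. the shape being suggested is (a sketch; rc is a plain int, so any
unlisted value such as X86EMUL_UNHANDLEABLE falls out of the switch):

    switch ( rc )
    {
    case X86EMUL_EXCEPTION:
        /* ... re-inject or warn, as above ... */
        /* Fallthrough */
    case X86EMUL_OKAY:
        if ( ptwr_ctxt.ctxt.retire.singlestep )
            pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
        /* Fallthrough */
    case X86EMUL_RETRY:
        perfc_incr(ptwr_emulations);
        return EXCRET_fault_fixed;
    }

    /* Still reached by the earlier "goto bail" sites. */
 bail:
    return 0;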

Jan


* Re: [PATCH v3 03/24] x86/emul: Simplfy emulation state setup
  2016-11-30 13:50 ` [PATCH v3 03/24] x86/emul: Simplfy emulation state setup Andrew Cooper
@ 2016-12-08  6:34   ` George Dunlap
  0 siblings, 0 replies; 59+ messages in thread
From: George Dunlap @ 2016-12-08  6:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Xen-devel


> On Nov 30, 2016, at 9:50 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> 
> The current code to set up emulation state is ad-hoc and error prone.
> 
> * Consistently zero all emulation state structures.
> * Avoid explicitly initialising some state to 0.
> * Explicitly identify all input and output state in x86_emulate_ctxt.  This
>   involves rearranging some fields.
> * Have x86_decode() explicitly initialise all output state at its start.
> 
> While making the above changes, two minor tweaks:
> 
> * Move the calculation of hvmemul_ctxt->ctxt.swint_emulate from
>   _hvm_emulate_one() to hvm_emulate_init_once().  It doesn't need
>   recalculating for each instruction.
> * Change force_writeback to being a boolean, to match its use.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Tim Deegan <tim@xen.org>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

Not sure this is still needed, but just in case:

Acked-by: George Dunlap <george.dunlap@citrix.com>

> ---
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> 
> v2:
> * Split x86_emulate_ctxt into three sections
> ---
> xen/arch/x86/hvm/emulate.c             | 28 +++++++++++++++-------------
> xen/arch/x86/mm.c                      | 14 ++++++++------
> xen/arch/x86/mm/shadow/common.c        |  4 ++--
> xen/arch/x86/x86_emulate/x86_emulate.c |  1 +
> xen/arch/x86/x86_emulate/x86_emulate.h | 32 ++++++++++++++++++++++----------
> 5 files changed, 48 insertions(+), 31 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index f1f6e2f..3efeead 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1770,13 +1770,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
> 
>     vio->mmio_retry = 0;
> 
> -    if ( cpu_has_vmx )
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> -    else if ( cpu_has_svm_nrips )
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
> -    else
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
> -
>     rc = x86_emulate(&hvmemul_ctxt->ctxt, ops);
> 
>     if ( rc == X86EMUL_OKAY && vio->mmio_retry )
> @@ -1947,14 +1940,23 @@ void hvm_emulate_init_once(
>     struct hvm_emulate_ctxt *hvmemul_ctxt,
>     struct cpu_user_regs *regs)
> {
> -    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(current);
> -    hvmemul_ctxt->ctxt.regs = regs;
> -    hvmemul_ctxt->ctxt.force_writeback = 1;
> -    hvmemul_ctxt->seg_reg_accessed = 0;
> -    hvmemul_ctxt->seg_reg_dirty = 0;
> -    hvmemul_ctxt->set_context = 0;
> +    struct vcpu *curr = current;
> +
> +    memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
> +
> +    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
>     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
>     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
> +
> +    hvmemul_ctxt->ctxt.regs = regs;
> +    hvmemul_ctxt->ctxt.force_writeback = true;
> +
> +    if ( cpu_has_vmx )
> +        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> +    else if ( cpu_has_svm_nrips )
> +        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
> +    else
> +        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
> }
> 
> void hvm_emulate_init_per_insn(
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 5b0e9f3..d365f59 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5337,7 +5337,14 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>     struct domain *d = v->domain;
>     struct page_info *page;
>     l1_pgentry_t      pte;
> -    struct ptwr_emulate_ctxt ptwr_ctxt;
> +    struct ptwr_emulate_ctxt ptwr_ctxt = {
> +        .ctxt = {
> +            .regs = regs,
> +            .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
> +            .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
> +            .swint_emulate = x86_swint_emulate_none,
> +        },
> +    };
>     int rc;
> 
>     /* Attempt to read the PTE that maps the VA being accessed. */
> @@ -5363,11 +5370,6 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>         goto bail;
>     }
> 
> -    ptwr_ctxt.ctxt.regs = regs;
> -    ptwr_ctxt.ctxt.force_writeback = 0;
> -    ptwr_ctxt.ctxt.addr_size = ptwr_ctxt.ctxt.sp_size =
> -        is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG;
> -    ptwr_ctxt.ctxt.swint_emulate = x86_swint_emulate_none;
>     ptwr_ctxt.cr2 = addr;
>     ptwr_ctxt.pte = pte;
> 
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index 7e5b8b0..a4a3c4b 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -385,8 +385,9 @@ const struct x86_emulate_ops *shadow_init_emulation(
>     struct vcpu *v = current;
>     unsigned long addr;
> 
> +    memset(sh_ctxt, 0, sizeof(*sh_ctxt));
> +
>     sh_ctxt->ctxt.regs = regs;
> -    sh_ctxt->ctxt.force_writeback = 0;
>     sh_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> 
>     if ( is_pv_vcpu(v) )
> @@ -396,7 +397,6 @@ const struct x86_emulate_ops *shadow_init_emulation(
>     }
> 
>     /* Segment cache initialisation. Primed with CS. */
> -    sh_ctxt->valid_seg_regs = 0;
>     creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
> 
>     /* Work out the emulation mode. */
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
> index d82e85d..532bd32 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1904,6 +1904,7 @@ x86_decode(
>     state->regs = ctxt->regs;
>     state->eip = ctxt->regs->eip;
> 
> +    /* Initialise output state in x86_emulate_ctxt */
>     ctxt->retire.byte = 0;
> 
>     op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index ec824ce..ab566c0 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -410,6 +410,23 @@ struct cpu_user_regs;
> 
> struct x86_emulate_ctxt
> {
> +    /*
> +     * Input-only state:
> +     */
> +
> +    /* Software event injection support. */
> +    enum x86_swint_emulation swint_emulate;
> +
> +    /* Set this if writes may have side effects. */
> +    bool force_writeback;
> +
> +    /* Caller data that can be used by x86_emulate_ops' routines. */
> +    void *data;
> +
> +    /*
> +     * Input/output state:
> +     */
> +
>     /* Register state before/after emulation. */
>     struct cpu_user_regs *regs;
> 
> @@ -419,14 +436,12 @@ struct x86_emulate_ctxt
>     /* Stack pointer width in bits (16, 32 or 64). */
>     unsigned int sp_size;
> 
> -    /* Canonical opcode (see below). */
> -    unsigned int opcode;
> -
> -    /* Software event injection support. */
> -    enum x86_swint_emulation swint_emulate;
> +    /*
> +     * Output-only state:
> +     */
> 
> -    /* Set this if writes may have side effects. */
> -    uint8_t force_writeback;
> +    /* Canonical opcode (see below) (valid only on X86EMUL_OKAY). */
> +    unsigned int opcode;
> 
>     /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
>     union {
> @@ -437,9 +452,6 @@ struct x86_emulate_ctxt
>         } flags;
>         uint8_t byte;
>     } retire;
> -
> -    /* Caller data that can be used by x86_emulate_ops' routines. */
> -    void *data;
> };
> 
> /*
> -- 
> 2.1.4
> 



Thread overview: 59+ messages
2016-11-30 13:50 [PATCH for-4.9 v3 00/24] XSA-191 followup Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 01/24] x86/shadow: Fix #PFs from emulated writes crossing a page boundary Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 02/24] x86/emul: Drop X86EMUL_CMPXCHG_FAILED Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 03/24] x86/emul: Simplfy emulation state setup Andrew Cooper
2016-12-08  6:34   ` George Dunlap
2016-11-30 13:50 ` [PATCH v3 04/24] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 05/24] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 06/24] x86/pv: Implement pv_inject_{event, page_fault, hw_exception}() Andrew Cooper
2016-12-01 10:06   ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 07/24] x86/emul: Clean up the naming of the retire union Andrew Cooper
2016-11-30 13:58   ` Paul Durrant
2016-11-30 14:02     ` Andrew Cooper
2016-11-30 14:05       ` Paul Durrant
2016-11-30 16:43         ` Jan Beulich
2016-12-01 10:08   ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 08/24] x86/emul: Correct the behaviour of pop %ss and interrupt shadowing Andrew Cooper
2016-12-01 10:18   ` Jan Beulich
2016-12-01 10:51     ` Andrew Cooper
2016-12-01 11:19       ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 09/24] x86/emul: Provide a wrapper to x86_emulate() to ASSERT() certain behaviour Andrew Cooper
2016-12-01 10:40   ` Jan Beulich
2016-12-01 10:58     ` Andrew Cooper
2016-12-01 11:21       ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 10/24] x86/emul: Always use fault semantics for software events Andrew Cooper
2016-11-30 17:55   ` Boris Ostrovsky
2016-12-01 10:53   ` Jan Beulich
2016-12-01 11:15     ` Andrew Cooper
2016-12-01 11:23       ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 11/24] x86/emul: Implement singlestep as a retire flag Andrew Cooper
2016-11-30 14:28   ` Paul Durrant
2016-12-01 11:16   ` Jan Beulich
2016-12-01 11:23     ` Andrew Cooper
2016-12-01 11:33       ` Tim Deegan
2016-12-01 12:05       ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 12/24] x86/emul: Remove opencoded exception generation Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 13/24] x86/emul: Rework emulator event injection Andrew Cooper
2016-11-30 14:26   ` Paul Durrant
2016-12-01 11:35   ` Tim Deegan
2016-12-01 12:31   ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 14/24] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 15/24] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 16/24] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 17/24] x86/pv: " Andrew Cooper
2016-12-01 11:50   ` Tim Deegan
2016-12-01 12:57   ` Jan Beulich
2016-12-01 13:12     ` Andrew Cooper
2016-12-01 13:27       ` Jan Beulich
2016-11-30 13:50 ` [PATCH v3 18/24] x86/shadow: " Andrew Cooper
2016-12-01 11:39   ` Tim Deegan
2016-12-01 11:40     ` Andrew Cooper
2016-12-01 13:00   ` Jan Beulich
2016-12-01 13:15     ` Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 19/24] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 20/24] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 21/24] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 22/24] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
2016-11-30 14:29   ` Paul Durrant
2016-11-30 13:50 ` [PATCH v3 23/24] x86/emul: Prepare to allow use of system segments for memory references Andrew Cooper
2016-11-30 13:50 ` [PATCH v3 24/24] x86/emul: Use system-segment relative memory accesses Andrew Cooper
