* [PATCH for-4.9 00/15] XSA-191 followup
@ 2016-11-23 15:38 Andrew Cooper
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
                   ` (14 more replies)
  0 siblings, 15 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper

This is partly RFC, as a lot of rebasing has happened recently and I haven't
yet run thorough tests.

This is the quantity of changes required to fix some edge cases in XSA-191
which were ultimately chosen not to go out in the security fix.  The main
purpose of this series is to fix emulation sufficiently to allow patch 15 to
avoid open-coding all of the segmentation logic.

Andrew Cooper (15):
  x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  x86/emul: Simplify emulation state setup
  x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  x86/emul: Remove opencoded exception generation
  x86/emul: Rework emulator event injection
  x86/vmx: Use hvm_{get,set}_segment_register() rather than vmx_{get,set}_segment_register()
  x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  x86/emul: Avoid raising faults behind the emulator's back
  x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  x86/hvm: Avoid __hvm_copy() raising #PF behind the emulator's back
  x86/hvm: Prepare to allow use of system segments for memory references
  x86/hvm: Use system-segment relative memory accesses

 tools/tests/x86_emulator/test_x86_emulator.c |   1 +
 xen/arch/x86/hvm/emulate.c                   | 238 +++++++----------
 xen/arch/x86/hvm/hvm.c                       | 366 +++++++++++++++++++--------
 xen/arch/x86/hvm/io.c                        |   6 +-
 xen/arch/x86/hvm/ioreq.c                     |   2 +-
 xen/arch/x86/hvm/nestedhvm.c                 |   2 +-
 xen/arch/x86/hvm/svm/emulate.c               |   4 +-
 xen/arch/x86/hvm/svm/nestedsvm.c             |  13 +-
 xen/arch/x86/hvm/svm/svm.c                   | 102 ++++----
 xen/arch/x86/hvm/vmx/intr.c                  |   2 +-
 xen/arch/x86/hvm/vmx/realmode.c              |  18 +-
 xen/arch/x86/hvm/vmx/vmx.c                   | 107 ++++----
 xen/arch/x86/hvm/vmx/vvmx.c                  |  44 ++--
 xen/arch/x86/mm.c                            |   8 +-
 xen/arch/x86/mm/shadow/common.c              |  29 ++-
 xen/arch/x86/mm/shadow/multi.c               |   4 +-
 xen/arch/x86/x86_emulate/x86_emulate.c       | 328 +++++++++++++-----------
 xen/arch/x86/x86_emulate/x86_emulate.h       | 149 +++++++++--
 xen/include/asm-x86/desc.h                   |   6 +
 xen/include/asm-x86/hvm/emulate.h            |   9 +-
 xen/include/asm-x86/hvm/hvm.h                |  86 +++----
 xen/include/asm-x86/hvm/support.h            |  42 ++-
 xen/include/asm-x86/hvm/svm/nestedsvm.h      |   6 +-
 xen/include/asm-x86/hvm/vcpu.h               |   2 +-
 xen/include/asm-x86/hvm/vmx/vmx.h            |   2 -
 xen/include/asm-x86/hvm/vmx/vvmx.h           |   4 +-
 26 files changed, 901 insertions(+), 679 deletions(-)

-- 
2.1.4



* [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 15:49   ` Paul Durrant
                     ` (4 more replies)
  2016-11-23 15:38 ` [PATCH 02/15] x86/emul: Simplify emulation state setup Andrew Cooper
                   ` (13 subsequent siblings)
  14 siblings, 5 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Wei Liu, Jan Beulich, Andrew Cooper, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit

 * Move hvm_emulate_init() to immediately after hvm_emulate_prepare(), as
   they are very closely related.
 * Rename hvm_emulate_prepare() to hvm_emulate_init_once() and
   hvm_emulate_init() to hvm_emulate_init_per_insn() to make it clearer how
   and when to use them; a usage sketch follows below.
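
As a reference for callers, the post-rename call pattern is the following
minimal sketch (mirroring the svm/emulate.c hunk in this patch; error
handling omitted):

    struct hvm_emulate_ctxt ctxt;

    /* Once per emulation context: caches %cs/%ss and the intr shadow. */
    hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());

    /* Before each emulated instruction; NULL/0 fetches from the guest. */
    hvm_emulate_init_per_insn(&ctxt, NULL, 0);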

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Wei Liu <wei.liu2@citrix.com>

As hvm_emulate_prepare() was new in 4.8, it would be a good idea to take this
patch to avoid future confusion on the stable-4.8 branch.
---
 xen/arch/x86/hvm/emulate.c        | 111 +++++++++++++++++++-------------------
 xen/arch/x86/hvm/hvm.c            |   2 +-
 xen/arch/x86/hvm/io.c             |   2 +-
 xen/arch/x86/hvm/ioreq.c          |   2 +-
 xen/arch/x86/hvm/svm/emulate.c    |   4 +-
 xen/arch/x86/hvm/vmx/realmode.c   |   2 +-
 xen/include/asm-x86/hvm/emulate.h |   6 ++-
 7 files changed, 66 insertions(+), 63 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index e9b8f8c..3ab0e8e 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1755,57 +1755,6 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .vmfunc        = hvmemul_vmfunc,
 };
 
-void hvm_emulate_init(
-    struct hvm_emulate_ctxt *hvmemul_ctxt,
-    const unsigned char *insn_buf,
-    unsigned int insn_bytes)
-{
-    struct vcpu *curr = current;
-    unsigned int pfec = PFEC_page_present;
-    unsigned long addr;
-
-    if ( hvm_long_mode_enabled(curr) &&
-         hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
-    {
-        hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
-    }
-    else
-    {
-        hvmemul_ctxt->ctxt.addr_size =
-            hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.db ? 32 : 16;
-        hvmemul_ctxt->ctxt.sp_size =
-            hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.db ? 32 : 16;
-    }
-
-    if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 )
-        pfec |= PFEC_user_mode;
-
-    hvmemul_ctxt->insn_buf_eip = hvmemul_ctxt->ctxt.regs->eip;
-    if ( !insn_bytes )
-    {
-        hvmemul_ctxt->insn_buf_bytes =
-            hvm_get_insn_bytes(curr, hvmemul_ctxt->insn_buf) ?:
-            (hvm_virtual_to_linear_addr(x86_seg_cs,
-                                        &hvmemul_ctxt->seg_reg[x86_seg_cs],
-                                        hvmemul_ctxt->insn_buf_eip,
-                                        sizeof(hvmemul_ctxt->insn_buf),
-                                        hvm_access_insn_fetch,
-                                        hvmemul_ctxt->ctxt.addr_size,
-                                        &addr) &&
-             hvm_fetch_from_guest_virt_nofault(hvmemul_ctxt->insn_buf, addr,
-                                               sizeof(hvmemul_ctxt->insn_buf),
-                                               pfec) == HVMCOPY_okay) ?
-            sizeof(hvmemul_ctxt->insn_buf) : 0;
-    }
-    else
-    {
-        hvmemul_ctxt->insn_buf_bytes = insn_bytes;
-        memcpy(hvmemul_ctxt->insn_buf, insn_buf, insn_bytes);
-    }
-
-    hvmemul_ctxt->exn_pending = 0;
-}
-
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     const struct x86_emulate_ops *ops)
 {
@@ -1815,7 +1764,8 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     int rc;
 
-    hvm_emulate_init(hvmemul_ctxt, vio->mmio_insn, vio->mmio_insn_bytes);
+    hvm_emulate_init_per_insn(hvmemul_ctxt, vio->mmio_insn,
+                              vio->mmio_insn_bytes);
 
     vio->mmio_retry = 0;
 
@@ -1915,7 +1865,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
     else
         ops = &hvm_ro_emulate_ops_mmio;
 
-    hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+    hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
     ctxt.ctxt.data = &mmio_ro_ctxt;
     rc = _hvm_emulate_one(&ctxt, ops);
     switch ( rc )
@@ -1940,7 +1890,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
     struct hvm_emulate_ctxt ctx = {{ 0 }};
     int rc;
 
-    hvm_emulate_prepare(&ctx, guest_cpu_user_regs());
+    hvm_emulate_init_once(&ctx, guest_cpu_user_regs());
 
     switch ( kind )
     {
@@ -1992,7 +1942,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
     hvm_emulate_writeback(&ctx);
 }
 
-void hvm_emulate_prepare(
+void hvm_emulate_init_once(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
     struct cpu_user_regs *regs)
 {
@@ -2006,6 +1956,57 @@ void hvm_emulate_prepare(
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
 }
 
+void hvm_emulate_init_per_insn(
+    struct hvm_emulate_ctxt *hvmemul_ctxt,
+    const unsigned char *insn_buf,
+    unsigned int insn_bytes)
+{
+    struct vcpu *curr = current;
+    unsigned int pfec = PFEC_page_present;
+    unsigned long addr;
+
+    if ( hvm_long_mode_enabled(curr) &&
+         hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
+    {
+        hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
+    }
+    else
+    {
+        hvmemul_ctxt->ctxt.addr_size =
+            hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.db ? 32 : 16;
+        hvmemul_ctxt->ctxt.sp_size =
+            hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.db ? 32 : 16;
+    }
+
+    if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 )
+        pfec |= PFEC_user_mode;
+
+    hvmemul_ctxt->insn_buf_eip = hvmemul_ctxt->ctxt.regs->eip;
+    if ( !insn_bytes )
+    {
+        hvmemul_ctxt->insn_buf_bytes =
+            hvm_get_insn_bytes(curr, hvmemul_ctxt->insn_buf) ?:
+            (hvm_virtual_to_linear_addr(x86_seg_cs,
+                                        &hvmemul_ctxt->seg_reg[x86_seg_cs],
+                                        hvmemul_ctxt->insn_buf_eip,
+                                        sizeof(hvmemul_ctxt->insn_buf),
+                                        hvm_access_insn_fetch,
+                                        hvmemul_ctxt->ctxt.addr_size,
+                                        &addr) &&
+             hvm_fetch_from_guest_virt_nofault(hvmemul_ctxt->insn_buf, addr,
+                                               sizeof(hvmemul_ctxt->insn_buf),
+                                               pfec) == HVMCOPY_okay) ?
+            sizeof(hvmemul_ctxt->insn_buf) : 0;
+    }
+    else
+    {
+        hvmemul_ctxt->insn_buf_bytes = insn_bytes;
+        memcpy(hvmemul_ctxt->insn_buf, insn_buf, insn_bytes);
+    }
+
+    hvmemul_ctxt->exn_pending = 0;
+}
+
 void hvm_emulate_writeback(
     struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f76dd90..25dc759 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4058,7 +4058,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
 {
     struct hvm_emulate_ctxt ctxt;
 
-    hvm_emulate_prepare(&ctxt, regs);
+    hvm_emulate_init_once(&ctxt, regs);
 
     if ( opt_hvm_fep )
     {
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 1e7a5f9..7305801 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -87,7 +87,7 @@ int handle_mmio(void)
 
     ASSERT(!is_pvh_vcpu(curr));
 
-    hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+    hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
 
     rc = hvm_emulate_one(&ctxt);
 
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index d2245e2..88071ab 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -167,7 +167,7 @@ bool_t handle_hvm_io_completion(struct vcpu *v)
     {
         struct hvm_emulate_ctxt ctxt;
 
-        hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+        hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
         vmx_realmode_emulate_one(&ctxt);
         hvm_emulate_writeback(&ctxt);
 
diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c
index a5545ea..9cdbe9e 100644
--- a/xen/arch/x86/hvm/svm/emulate.c
+++ b/xen/arch/x86/hvm/svm/emulate.c
@@ -107,8 +107,8 @@ int __get_instruction_length_from_list(struct vcpu *v,
 #endif
 
     ASSERT(v == current);
-    hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
-    hvm_emulate_init(&ctxt, NULL, 0);
+    hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
+    hvm_emulate_init_per_insn(&ctxt, NULL, 0);
     state = x86_decode_insn(&ctxt.ctxt, hvmemul_insn_fetch);
     if ( IS_ERR_OR_NULL(state) )
         return 0;
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index e83a61f..9002638 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -179,7 +179,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
     if ( intr_info & INTR_INFO_VALID_MASK )
         __vmwrite(VM_ENTRY_INTR_INFO, 0);
 
-    hvm_emulate_prepare(&hvmemul_ctxt, regs);
+    hvm_emulate_init_once(&hvmemul_ctxt, regs);
 
     /* Only deliver interrupts into emulated real mode. */
     if ( !(curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) &&
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index f610673..d4186a2 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -51,10 +51,12 @@ int hvm_emulate_one_no_write(
 void hvm_emulate_one_vm_event(enum emul_kind kind,
     unsigned int trapnr,
     unsigned int errcode);
-void hvm_emulate_prepare(
+/* Must be called once to set up hvmemul state. */
+void hvm_emulate_init_once(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
     struct cpu_user_regs *regs);
-void hvm_emulate_init(
+/* Must be called once before each instruction emulated. */
+void hvm_emulate_init_per_insn(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
     const unsigned char *insn_buf,
     unsigned int insn_bytes);
-- 
2.1.4



* [PATCH 02/15] x86/emul: Simplify emulation state setup
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 15:58   ` Paul Durrant
                     ` (2 more replies)
  2016-11-23 15:38 ` [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
                   ` (12 subsequent siblings)
  14 siblings, 3 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, Tim Deegan, Jan Beulich

The current code to set up emulation state is ad hoc and error-prone.

 * Consistently zero all emulation state structures.
 * Avoid explicitly initialising some state to 0.
 * Explicitly identify all input and output state in x86_emulate_ctxt.  This
   involves rearranging some fields.
 * Have x86_decode() explicitly initialise all output state at its start.

In addition, move the calculation of hvmemul_ctxt->ctxt.swint_emulate from
_hvm_emulate_one() to hvm_emulate_init_once(), as it doesn't need
recalculating for each instruction.
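
Concretely, the setup pattern this standardises on is the following sketch
(illustrative, not a verbatim hunk):

    struct hvm_emulate_ctxt ctxt;

    memset(&ctxt, 0, sizeof(ctxt));    /* all state starts zeroed */

    ctxt.ctxt.regs = regs;             /* inputs then filled in explicitly */
    ctxt.ctxt.force_writeback = true;

    /* Output state (ctxt.ctxt.opcode, ctxt.ctxt.retire) is initialised by
     * x86_decode() itself at its start. */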

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Paul Durrant <paul.durrant@citrix.com>
---
 xen/arch/x86/hvm/emulate.c             | 28 +++++++++++++++-------------
 xen/arch/x86/mm.c                      |  3 ++-
 xen/arch/x86/mm/shadow/common.c        |  4 ++--
 xen/arch/x86/x86_emulate/x86_emulate.c |  2 ++
 xen/arch/x86/x86_emulate/x86_emulate.h | 22 +++++++++++++++-------
 5 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 3ab0e8e..f24e211 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1769,13 +1769,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
 
     vio->mmio_retry = 0;
 
-    if ( cpu_has_vmx )
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
-    else if ( cpu_has_svm_nrips )
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
-    else
-        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
-
     rc = x86_emulate(&hvmemul_ctxt->ctxt, ops);
 
     if ( rc == X86EMUL_OKAY && vio->mmio_retry )
@@ -1946,14 +1939,23 @@ void hvm_emulate_init_once(
     struct hvm_emulate_ctxt *hvmemul_ctxt,
     struct cpu_user_regs *regs)
 {
-    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(current);
-    hvmemul_ctxt->ctxt.regs = regs;
-    hvmemul_ctxt->ctxt.force_writeback = 1;
-    hvmemul_ctxt->seg_reg_accessed = 0;
-    hvmemul_ctxt->seg_reg_dirty = 0;
-    hvmemul_ctxt->set_context = 0;
+    struct vcpu *curr = current;
+
+    memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
+
+    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
+
+    hvmemul_ctxt->ctxt.regs = regs;
+    hvmemul_ctxt->ctxt.force_writeback = true;
+
+    if ( cpu_has_vmx )
+        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
+    else if ( cpu_has_svm_nrips )
+        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
+    else
+        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
 }
 
 void hvm_emulate_init_per_insn(
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 03dcd71..9901f6f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5363,8 +5363,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
         goto bail;
     }
 
+    memset(&ptwr_ctxt, 0, sizeof(ptwr_ctxt));
+
     ptwr_ctxt.ctxt.regs = regs;
-    ptwr_ctxt.ctxt.force_writeback = 0;
     ptwr_ctxt.ctxt.addr_size = ptwr_ctxt.ctxt.sp_size =
         is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG;
     ptwr_ctxt.ctxt.swint_emulate = x86_swint_emulate_none;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index ced2313..6f6668d 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -385,8 +385,9 @@ const struct x86_emulate_ops *shadow_init_emulation(
     struct vcpu *v = current;
     unsigned long addr;
 
+    memset(sh_ctxt, 0, sizeof(*sh_ctxt));
+
     sh_ctxt->ctxt.regs = regs;
-    sh_ctxt->ctxt.force_writeback = 0;
     sh_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
 
     if ( is_pv_vcpu(v) )
@@ -396,7 +397,6 @@ const struct x86_emulate_ops *shadow_init_emulation(
     }
 
     /* Segment cache initialisation. Primed with CS. */
-    sh_ctxt->valid_seg_regs = 0;
     creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
 
     /* Work out the emulation mode. */
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 04f0dac..c5d9664 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1904,6 +1904,8 @@ x86_decode(
     state->regs = ctxt->regs;
     state->eip = ctxt->regs->eip;
 
+    /* Initialise output state in x86_emulate_ctxt */
+    ctxt->opcode = ~0u;
     ctxt->retire.byte = 0;
 
     op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 993c576..93b268e 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -412,6 +412,10 @@ struct cpu_user_regs;
 
 struct x86_emulate_ctxt
 {
+    /*
+     * Input state:
+     */
+
     /* Register state before/after emulation. */
     struct cpu_user_regs *regs;
 
@@ -421,14 +425,21 @@ struct x86_emulate_ctxt
     /* Stack pointer width in bits (16, 32 or 64). */
     unsigned int sp_size;
 
-    /* Canonical opcode (see below). */
-    unsigned int opcode;
-
     /* Software event injection support. */
     enum x86_swint_emulation swint_emulate;
 
     /* Set this if writes may have side effects. */
-    uint8_t force_writeback;
+    bool force_writeback;
+
+    /* Caller data that can be used by x86_emulate_ops' routines. */
+    void *data;
+
+    /*
+     * Output state:
+     */
+
+    /* Canonical opcode (see below). */
+    unsigned int opcode;
 
     /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
     union {
@@ -439,9 +450,6 @@ struct x86_emulate_ctxt
         } flags;
         uint8_t byte;
     } retire;
-
-    /* Caller data that can be used by x86_emulate_ops' routines. */
-    void *data;
 };
 
 /*
-- 
2.1.4



* [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
  2016-11-23 15:38 ` [PATCH 02/15] x86/emul: Simplify emulation state setup Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:12   ` Paul Durrant
                     ` (3 more replies)
  2016-11-23 15:38 ` [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
                   ` (11 subsequent siblings)
  14 siblings, 4 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit

The x86 emulator needs to gain an understanding of interrupts and exceptions
generated by its actions.  The naming choice is to match both the Intel and
AMD terms, and to avoid 'trap' specifically as it has an architectural meaning
different to its current usage.

While making this change, make other changes for consistency:

 * Rename *_trap() infrastructure to *_event()
 * Rename trapnr/trap parameters to vector
 * Convert hvm_inject_hw_exception() and hvm_inject_page_fault() to being
   static inlines, as they are only thin wrappers around hvm_inject_event()

No functional change.
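
For illustration, raising an event via the renamed infrastructure reads as
the sketch below (field values arbitrary; 'addr' stands in for a faulting
linear address):

    struct x86_event event = {
        .vector = TRAP_page_fault,
        .type = X86_EVENTTYPE_HW_EXCEPTION,
        .error_code = PFEC_page_present,
        .cr2 = addr,
    };

    hvm_inject_event(&event);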

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 xen/arch/x86/hvm/emulate.c              |  6 +--
 xen/arch/x86/hvm/hvm.c                  | 33 ++++------------
 xen/arch/x86/hvm/io.c                   |  2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c        |  7 ++--
 xen/arch/x86/hvm/svm/svm.c              | 62 ++++++++++++++---------------
 xen/arch/x86/hvm/vmx/vmx.c              | 66 +++++++++++++++----------------
 xen/arch/x86/hvm/vmx/vvmx.c             | 11 +++---
 xen/arch/x86/x86_emulate/x86_emulate.c  | 11 ++++++
 xen/arch/x86/x86_emulate/x86_emulate.h  | 22 +++++++++++
 xen/include/asm-x86/hvm/emulate.h       |  2 +-
 xen/include/asm-x86/hvm/hvm.h           | 69 ++++++++++++++++-----------------
 xen/include/asm-x86/hvm/svm/nestedsvm.h |  6 +--
 xen/include/asm-x86/hvm/vcpu.h          |  2 +-
 xen/include/asm-x86/hvm/vmx/vvmx.h      |  4 +-
 14 files changed, 159 insertions(+), 144 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f24e211..d0c3185 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1679,7 +1679,7 @@ static int hvmemul_invlpg(
          * violations, so squash them.
          */
         hvmemul_ctxt->exn_pending = 0;
-        hvmemul_ctxt->trap = (struct hvm_trap){};
+        hvmemul_ctxt->trap = (struct x86_event){};
         rc = X86EMUL_OKAY;
     }
 
@@ -1868,7 +1868,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
         break;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
-            hvm_inject_trap(&ctxt.trap);
+            hvm_inject_event(&ctxt.trap);
         /* fallthrough */
     default:
         hvm_emulate_writeback(&ctxt);
@@ -1928,7 +1928,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
         break;
     case X86EMUL_EXCEPTION:
         if ( ctx.exn_pending )
-            hvm_inject_trap(&ctx.trap);
+            hvm_inject_event(&ctx.trap);
         break;
     }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 25dc759..7b434aa 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -535,7 +535,7 @@ void hvm_do_resume(struct vcpu *v)
     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
     {
-        hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
+        hvm_inject_event(&v->arch.hvm_vcpu.inject_trap);
         v->arch.hvm_vcpu.inject_trap.vector = -1;
     }
 }
@@ -1676,19 +1676,19 @@ void hvm_triple_fault(void)
     domain_shutdown(d, reason);
 }
 
-void hvm_inject_trap(const struct hvm_trap *trap)
+void hvm_inject_event(const struct x86_event *event)
 {
     struct vcpu *curr = current;
 
     if ( nestedhvm_enabled(curr->domain) &&
          !nestedhvm_vmswitch_in_progress(curr) &&
          nestedhvm_vcpu_in_guestmode(curr) &&
-         nhvm_vmcx_guest_intercepts_trap(
-             curr, trap->vector, trap->error_code) )
+         nhvm_vmcx_guest_intercepts_event(
+             curr, event->vector, event->error_code) )
     {
         enum nestedhvm_vmexits nsret;
 
-        nsret = nhvm_vcpu_vmexit_trap(curr, trap);
+        nsret = nhvm_vcpu_vmexit_event(curr, event);
 
         switch ( nsret )
         {
@@ -1704,26 +1704,7 @@ void hvm_inject_trap(const struct hvm_trap *trap)
         }
     }
 
-    hvm_funcs.inject_trap(trap);
-}
-
-void hvm_inject_hw_exception(unsigned int trapnr, int errcode)
-{
-    struct hvm_trap trap = {
-        .vector = trapnr,
-        .type = X86_EVENTTYPE_HW_EXCEPTION,
-        .error_code = errcode };
-    hvm_inject_trap(&trap);
-}
-
-void hvm_inject_page_fault(int errcode, unsigned long cr2)
-{
-    struct hvm_trap trap = {
-        .vector = TRAP_page_fault,
-        .type = X86_EVENTTYPE_HW_EXCEPTION,
-        .error_code = errcode,
-        .cr2 = cr2 };
-    hvm_inject_trap(&trap);
+    hvm_funcs.inject_event(event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -4096,7 +4077,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
         break;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
-            hvm_inject_trap(&ctxt.trap);
+            hvm_inject_event(&ctxt.trap);
         /* fall through */
     default:
         hvm_emulate_writeback(&ctxt);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 7305801..1279f68 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -103,7 +103,7 @@ int handle_mmio(void)
         return 0;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
-            hvm_inject_trap(&ctxt.trap);
+            hvm_inject_event(&ctxt.trap);
         break;
     default:
         break;
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index f9b38ab..b6b8526 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -821,7 +821,7 @@ nsvm_vcpu_vmexit_inject(struct vcpu *v, struct cpu_user_regs *regs,
 }
 
 int
-nsvm_vcpu_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap)
+nsvm_vcpu_vmexit_event(struct vcpu *v, const struct x86_event *trap)
 {
     ASSERT(vcpu_nestedhvm(v).nv_vvmcx != NULL);
 
@@ -994,10 +994,11 @@ nsvm_vmcb_guest_intercepts_exitcode(struct vcpu *v,
 }
 
 bool_t
-nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr, int errcode)
+nsvm_vmcb_guest_intercepts_event(
+    struct vcpu *v, unsigned int vector, int errcode)
 {
     return nsvm_vmcb_guest_intercepts_exitcode(v,
-        guest_cpu_user_regs(), VMEXIT_EXCEPTION_DE + trapnr);
+        guest_cpu_user_regs(), VMEXIT_EXCEPTION_DE + vector);
 }
 
 static int
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index f737d8c..66eb30b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1203,15 +1203,15 @@ static void svm_vcpu_destroy(struct vcpu *v)
     passive_domain_destroy(v);
 }
 
-static void svm_inject_trap(const struct hvm_trap *trap)
+static void svm_inject_event(const struct x86_event *event)
 {
     struct vcpu *curr = current;
     struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
-    eventinj_t event = vmcb->eventinj;
-    struct hvm_trap _trap = *trap;
+    eventinj_t eventinj = vmcb->eventinj;
+    struct x86_event _event = *event;
     const struct cpu_user_regs *regs = guest_cpu_user_regs();
 
-    switch ( _trap.vector )
+    switch ( _event.vector )
     {
     case TRAP_debug:
         if ( regs->eflags & X86_EFLAGS_TF )
@@ -1229,21 +1229,21 @@ static void svm_inject_trap(const struct hvm_trap *trap)
         }
     }
 
-    if ( unlikely(event.fields.v) &&
-         (event.fields.type == X86_EVENTTYPE_HW_EXCEPTION) )
+    if ( unlikely(eventinj.fields.v) &&
+         (eventinj.fields.type == X86_EVENTTYPE_HW_EXCEPTION) )
     {
-        _trap.vector = hvm_combine_hw_exceptions(
-            event.fields.vector, _trap.vector);
-        if ( _trap.vector == TRAP_double_fault )
-            _trap.error_code = 0;
+        _event.vector = hvm_combine_hw_exceptions(
+            eventinj.fields.vector, _event.vector);
+        if ( _event.vector == TRAP_double_fault )
+            _event.error_code = 0;
     }
 
-    event.bytes = 0;
-    event.fields.v = 1;
-    event.fields.vector = _trap.vector;
+    eventinj.bytes = 0;
+    eventinj.fields.v = 1;
+    eventinj.fields.vector = _event.vector;
 
     /* Refer to AMD Vol 2: System Programming, 15.20 Event Injection. */
-    switch ( _trap.type )
+    switch ( _event.type )
     {
     case X86_EVENTTYPE_SW_INTERRUPT: /* int $n */
         /*
@@ -1253,8 +1253,8 @@ static void svm_inject_trap(const struct hvm_trap *trap)
          * moved eip forward if appropriate.
          */
         if ( cpu_has_svm_nrips )
-            vmcb->nextrip = regs->eip + _trap.insn_len;
-        event.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
+            vmcb->nextrip = regs->eip + _event.insn_len;
+        eventinj.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
         break;
 
     case X86_EVENTTYPE_PRI_SW_EXCEPTION: /* icebp */
@@ -1265,7 +1265,7 @@ static void svm_inject_trap(const struct hvm_trap *trap)
          */
         if ( cpu_has_svm_nrips )
             vmcb->nextrip = regs->eip;
-        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
+        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
         break;
 
     case X86_EVENTTYPE_SW_EXCEPTION: /* int3, into */
@@ -1279,28 +1279,28 @@ static void svm_inject_trap(const struct hvm_trap *trap)
          * the correct faulting eip should a fault occur.
          */
         if ( cpu_has_svm_nrips )
-            vmcb->nextrip = regs->eip + _trap.insn_len;
-        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
+            vmcb->nextrip = regs->eip + _event.insn_len;
+        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
         break;
 
     default:
-        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
-        event.fields.ev = (_trap.error_code != HVM_DELIVER_NO_ERROR_CODE);
-        event.fields.errorcode = _trap.error_code;
+        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
+        eventinj.fields.ev = (_event.error_code != HVM_DELIVER_NO_ERROR_CODE);
+        eventinj.fields.errorcode = _event.error_code;
         break;
     }
 
-    vmcb->eventinj = event;
+    vmcb->eventinj = eventinj;
 
-    if ( _trap.vector == TRAP_page_fault )
+    if ( _event.vector == TRAP_page_fault )
     {
-        curr->arch.hvm_vcpu.guest_cr[2] = _trap.cr2;
-        vmcb_set_cr2(vmcb, _trap.cr2);
-        HVMTRACE_LONG_2D(PF_INJECT, _trap.error_code, TRC_PAR_LONG(_trap.cr2));
+        curr->arch.hvm_vcpu.guest_cr[2] = _event.cr2;
+        vmcb_set_cr2(vmcb, _event.cr2);
+        HVMTRACE_LONG_2D(PF_INJECT, _event.error_code, TRC_PAR_LONG(_event.cr2));
     }
     else
     {
-        HVMTRACE_2D(INJ_EXC, _trap.vector, _trap.error_code);
+        HVMTRACE_2D(INJ_EXC, _event.vector, _event.error_code);
     }
 }
 
@@ -2250,7 +2250,7 @@ static struct hvm_function_table __initdata svm_function_table = {
     .set_guest_pat        = svm_set_guest_pat,
     .get_guest_pat        = svm_get_guest_pat,
     .set_tsc_offset       = svm_set_tsc_offset,
-    .inject_trap          = svm_inject_trap,
+    .inject_event         = svm_inject_event,
     .init_hypercall_page  = svm_init_hypercall_page,
     .event_pending        = svm_event_pending,
     .invlpg               = svm_invlpg,
@@ -2265,9 +2265,9 @@ static struct hvm_function_table __initdata svm_function_table = {
     .nhvm_vcpu_initialise = nsvm_vcpu_initialise,
     .nhvm_vcpu_destroy = nsvm_vcpu_destroy,
     .nhvm_vcpu_reset = nsvm_vcpu_reset,
-    .nhvm_vcpu_vmexit_trap = nsvm_vcpu_vmexit_trap,
+    .nhvm_vcpu_vmexit_event = nsvm_vcpu_vmexit_event,
     .nhvm_vcpu_p2m_base = nsvm_vcpu_hostcr3,
-    .nhvm_vmcx_guest_intercepts_trap = nsvm_vmcb_guest_intercepts_trap,
+    .nhvm_vmcx_guest_intercepts_event = nsvm_vmcb_guest_intercepts_event,
     .nhvm_vmcx_hap_enabled = nsvm_vmcb_hap_enabled,
     .nhvm_intr_blocked = nsvm_intr_blocked,
     .nhvm_hap_walk_L1_p2m = nsvm_hap_walk_L1_p2m,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 0a52624..b1d8a0b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1623,9 +1623,9 @@ void nvmx_enqueue_n2_exceptions(struct vcpu *v,
                  nvmx->intr.intr_info, nvmx->intr.error_code);
 }
 
-static int nvmx_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap)
+static int nvmx_vmexit_event(struct vcpu *v, const struct x86_event *event)
 {
-    nvmx_enqueue_n2_exceptions(v, trap->vector, trap->error_code,
+    nvmx_enqueue_n2_exceptions(v, event->vector, event->error_code,
                                hvm_intsrc_none);
     return NESTEDHVM_VMEXIT_DONE;
 }
@@ -1707,13 +1707,13 @@ void vmx_inject_nmi(void)
  *  - #DB is X86_EVENTTYPE_HW_EXCEPTION, except when generated by
  *    opcode 0xf1 (which is X86_EVENTTYPE_PRI_SW_EXCEPTION)
  */
-static void vmx_inject_trap(const struct hvm_trap *trap)
+static void vmx_inject_event(const struct x86_event *event)
 {
     unsigned long intr_info;
     struct vcpu *curr = current;
-    struct hvm_trap _trap = *trap;
+    struct x86_event _event = *event;
 
-    switch ( _trap.vector | -(_trap.type == X86_EVENTTYPE_SW_INTERRUPT) )
+    switch ( _event.vector | -(_event.type == X86_EVENTTYPE_SW_INTERRUPT) )
     {
     case TRAP_debug:
         if ( guest_cpu_user_regs()->eflags & X86_EFLAGS_TF )
@@ -1722,7 +1722,7 @@ static void vmx_inject_trap(const struct hvm_trap *trap)
             write_debugreg(6, read_debugreg(6) | DR_STEP);
         }
         if ( !nestedhvm_vcpu_in_guestmode(curr) ||
-             !nvmx_intercepts_exception(curr, TRAP_debug, _trap.error_code) )
+             !nvmx_intercepts_exception(curr, TRAP_debug, _event.error_code) )
         {
             unsigned long val;
 
@@ -1744,8 +1744,8 @@ static void vmx_inject_trap(const struct hvm_trap *trap)
         break;
 
     case TRAP_page_fault:
-        ASSERT(_trap.type == X86_EVENTTYPE_HW_EXCEPTION);
-        curr->arch.hvm_vcpu.guest_cr[2] = _trap.cr2;
+        ASSERT(_event.type == X86_EVENTTYPE_HW_EXCEPTION);
+        curr->arch.hvm_vcpu.guest_cr[2] = _event.cr2;
         break;
     }
 
@@ -1758,34 +1758,34 @@ static void vmx_inject_trap(const struct hvm_trap *trap)
          (MASK_EXTR(intr_info, INTR_INFO_INTR_TYPE_MASK) ==
           X86_EVENTTYPE_HW_EXCEPTION) )
     {
-        _trap.vector = hvm_combine_hw_exceptions(
-            (uint8_t)intr_info, _trap.vector);
-        if ( _trap.vector == TRAP_double_fault )
-            _trap.error_code = 0;
+        _event.vector = hvm_combine_hw_exceptions(
+            (uint8_t)intr_info, _event.vector);
+        if ( _event.vector == TRAP_double_fault )
+            _event.error_code = 0;
     }
 
-    if ( _trap.type >= X86_EVENTTYPE_SW_INTERRUPT )
-        __vmwrite(VM_ENTRY_INSTRUCTION_LEN, _trap.insn_len);
+    if ( _event.type >= X86_EVENTTYPE_SW_INTERRUPT )
+        __vmwrite(VM_ENTRY_INSTRUCTION_LEN, _event.insn_len);
 
     if ( nestedhvm_vcpu_in_guestmode(curr) &&
-         nvmx_intercepts_exception(curr, _trap.vector, _trap.error_code) )
+         nvmx_intercepts_exception(curr, _event.vector, _event.error_code) )
     {
         nvmx_enqueue_n2_exceptions (curr, 
             INTR_INFO_VALID_MASK |
-            MASK_INSR(_trap.type, INTR_INFO_INTR_TYPE_MASK) |
-            MASK_INSR(_trap.vector, INTR_INFO_VECTOR_MASK),
-            _trap.error_code, hvm_intsrc_none);
+            MASK_INSR(_event.type, INTR_INFO_INTR_TYPE_MASK) |
+            MASK_INSR(_event.vector, INTR_INFO_VECTOR_MASK),
+            _event.error_code, hvm_intsrc_none);
         return;
     }
     else
-        __vmx_inject_exception(_trap.vector, _trap.type, _trap.error_code);
+        __vmx_inject_exception(_event.vector, _event.type, _event.error_code);
 
-    if ( (_trap.vector == TRAP_page_fault) &&
-         (_trap.type == X86_EVENTTYPE_HW_EXCEPTION) )
-        HVMTRACE_LONG_2D(PF_INJECT, _trap.error_code,
+    if ( (_event.vector == TRAP_page_fault) &&
+         (_event.type == X86_EVENTTYPE_HW_EXCEPTION) )
+        HVMTRACE_LONG_2D(PF_INJECT, _event.error_code,
                          TRC_PAR_LONG(curr->arch.hvm_vcpu.guest_cr[2]));
     else
-        HVMTRACE_2D(INJ_EXC, _trap.vector, _trap.error_code);
+        HVMTRACE_2D(INJ_EXC, _event.vector, _event.error_code);
 }
 
 static int vmx_event_pending(struct vcpu *v)
@@ -2162,7 +2162,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .set_guest_pat        = vmx_set_guest_pat,
     .get_guest_pat        = vmx_get_guest_pat,
     .set_tsc_offset       = vmx_set_tsc_offset,
-    .inject_trap          = vmx_inject_trap,
+    .inject_event         = vmx_inject_event,
     .init_hypercall_page  = vmx_init_hypercall_page,
     .event_pending        = vmx_event_pending,
     .invlpg               = vmx_invlpg,
@@ -2182,8 +2182,8 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .nhvm_vcpu_reset      = nvmx_vcpu_reset,
     .nhvm_vcpu_p2m_base   = nvmx_vcpu_eptp_base,
     .nhvm_vmcx_hap_enabled = nvmx_ept_enabled,
-    .nhvm_vmcx_guest_intercepts_trap = nvmx_intercepts_exception,
-    .nhvm_vcpu_vmexit_trap = nvmx_vmexit_trap,
+    .nhvm_vmcx_guest_intercepts_event = nvmx_intercepts_exception,
+    .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
     .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
@@ -3201,7 +3201,7 @@ static int vmx_handle_eoi_write(void)
  */
 static void vmx_propagate_intr(unsigned long intr)
 {
-    struct hvm_trap trap = {
+    struct x86_event event = {
         .vector = MASK_EXTR(intr, INTR_INFO_VECTOR_MASK),
         .type = MASK_EXTR(intr, INTR_INFO_INTR_TYPE_MASK),
     };
@@ -3210,20 +3210,20 @@ static void vmx_propagate_intr(unsigned long intr)
     if ( intr & INTR_INFO_DELIVER_CODE_MASK )
     {
         __vmread(VM_EXIT_INTR_ERROR_CODE, &tmp);
-        trap.error_code = tmp;
+        event.error_code = tmp;
     }
     else
-        trap.error_code = HVM_DELIVER_NO_ERROR_CODE;
+        event.error_code = HVM_DELIVER_NO_ERROR_CODE;
 
-    if ( trap.type >= X86_EVENTTYPE_SW_INTERRUPT )
+    if ( event.type >= X86_EVENTTYPE_SW_INTERRUPT )
     {
         __vmread(VM_EXIT_INSTRUCTION_LEN, &tmp);
-        trap.insn_len = tmp;
+        event.insn_len = tmp;
     }
     else
-        trap.insn_len = 0;
+        event.insn_len = 0;
 
-    hvm_inject_trap(&trap);
+    hvm_inject_event(&event);
 }
 
 static void vmx_idtv_reinject(unsigned long idtv_info)
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index bed2e0a..b5837d4 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -491,18 +491,19 @@ static void vmreturn(struct cpu_user_regs *regs, enum vmx_ops_result ops_res)
     regs->eflags = eflags;
 }
 
-bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
-                                 int error_code)
+bool_t nvmx_intercepts_exception(
+    struct vcpu *v, unsigned int vector, int error_code)
 {
     u32 exception_bitmap, pfec_match=0, pfec_mask=0;
     int r;
 
-    ASSERT ( trap < 32 );
+    ASSERT(vector < 32);
 
     exception_bitmap = get_vvmcs(v, EXCEPTION_BITMAP);
-    r = exception_bitmap & (1 << trap) ? 1: 0;
+    r = exception_bitmap & (1 << vector) ? 1: 0;
 
-    if ( trap == TRAP_page_fault ) {
+    if ( vector == TRAP_page_fault )
+    {
         pfec_match = get_vvmcs(v, PAGE_FAULT_ERROR_CODE_MATCH);
         pfec_mask  = get_vvmcs(v, PAGE_FAULT_ERROR_CODE_MASK);
         if ( (error_code & pfec_mask) != pfec_match )
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index c5d9664..2e65322 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -5453,6 +5453,17 @@ static void __init __maybe_unused build_assertions(void)
     BUILD_BUG_ON(x86_seg_ds != 3);
     BUILD_BUG_ON(x86_seg_fs != 4);
     BUILD_BUG_ON(x86_seg_gs != 5);
+
+    /*
+     * Check X86_EVENTTYPE_* against VMCB EVENTINJ and VMCS INTR_INFO type
+     * fields.
+     */
+    BUILD_BUG_ON(X86_EVENTTYPE_EXT_INTR != 0);
+    BUILD_BUG_ON(X86_EVENTTYPE_NMI != 2);
+    BUILD_BUG_ON(X86_EVENTTYPE_HW_EXCEPTION != 3);
+    BUILD_BUG_ON(X86_EVENTTYPE_SW_INTERRUPT != 4);
+    BUILD_BUG_ON(X86_EVENTTYPE_PRI_SW_EXCEPTION != 5);
+    BUILD_BUG_ON(X86_EVENTTYPE_SW_EXCEPTION != 6);
 }
 
 #ifdef __XEN__
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 93b268e..b0d8e6f 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -67,6 +67,28 @@ enum x86_swint_emulation {
     x86_swint_emulate_all,  /* Help needed with all software events */
 };
 
+/*
+ * x86 event types. This enumeration is valid for:
+ *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
+ *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
+ */
+enum x86_event_type {
+    X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
+    X86_EVENTTYPE_NMI = 2,          /* NMI */
+    X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
+    X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
+    X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
+    X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
+};
+
+struct x86_event {
+    int16_t       vector;
+    uint8_t       type;         /* X86_EVENTTYPE_* */
+    uint8_t       insn_len;     /* Instruction length */
+    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
+    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
+};
+
 /* 
  * Attribute for segment selector. This is a copy of bit 40:47 & 52:55 of the
  * segment descriptor. It happens to match the format of an AMD SVM VMCB.
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index d4186a2..3b7ec33 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -30,7 +30,7 @@ struct hvm_emulate_ctxt {
     unsigned long seg_reg_dirty;
 
     bool_t exn_pending;
-    struct hvm_trap trap;
+    struct x86_event trap;
 
     uint32_t intr_shadow;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 7e7462e..51a64f7 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -77,14 +77,6 @@ enum hvm_intblk {
 #define HVM_HAP_SUPERPAGE_2MB   0x00000001
 #define HVM_HAP_SUPERPAGE_1GB   0x00000002
 
-struct hvm_trap {
-    int16_t       vector;
-    uint8_t       type;         /* X86_EVENTTYPE_* */
-    uint8_t       insn_len;     /* Instruction length */
-    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
-    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
-};
-
 /*
  * The hardware virtual machine (HVM) interface abstracts away from the
  * x86/x86_64 CPU virtualization assist specifics. Currently this interface
@@ -152,7 +144,7 @@ struct hvm_function_table {
 
     void (*set_tsc_offset)(struct vcpu *v, u64 offset, u64 at_tsc);
 
-    void (*inject_trap)(const struct hvm_trap *trap);
+    void (*inject_event)(const struct x86_event *event);
 
     void (*init_hypercall_page)(struct domain *d, void *hypercall_page);
 
@@ -185,11 +177,10 @@ struct hvm_function_table {
     int (*nhvm_vcpu_initialise)(struct vcpu *v);
     void (*nhvm_vcpu_destroy)(struct vcpu *v);
     int (*nhvm_vcpu_reset)(struct vcpu *v);
-    int (*nhvm_vcpu_vmexit_trap)(struct vcpu *v, const struct hvm_trap *trap);
+    int (*nhvm_vcpu_vmexit_event)(struct vcpu *v, const struct x86_event *event);
     uint64_t (*nhvm_vcpu_p2m_base)(struct vcpu *v);
-    bool_t (*nhvm_vmcx_guest_intercepts_trap)(struct vcpu *v,
-                                              unsigned int trapnr,
-                                              int errcode);
+    bool_t (*nhvm_vmcx_guest_intercepts_event)(
+        struct vcpu *v, unsigned int vector, int errcode);
 
     bool_t (*nhvm_vmcx_hap_enabled)(struct vcpu *v);
 
@@ -419,9 +410,30 @@ void hvm_migrate_timers(struct vcpu *v);
 void hvm_do_resume(struct vcpu *v);
 void hvm_migrate_pirqs(struct vcpu *v);
 
-void hvm_inject_trap(const struct hvm_trap *trap);
-void hvm_inject_hw_exception(unsigned int trapnr, int errcode);
-void hvm_inject_page_fault(int errcode, unsigned long cr2);
+void hvm_inject_event(const struct x86_event *event);
+
+static inline void hvm_inject_hw_exception(unsigned int vector, int errcode)
+{
+    struct x86_event event = {
+        .vector = vector,
+        .type = X86_EVENTTYPE_HW_EXCEPTION,
+        .error_code = errcode,
+    };
+
+    hvm_inject_event(&event);
+}
+
+static inline void hvm_inject_page_fault(int errcode, unsigned long cr2)
+{
+    struct x86_event event = {
+        .vector = TRAP_page_fault,
+        .type = X86_EVENTTYPE_HW_EXCEPTION,
+        .error_code = errcode,
+        .cr2 = cr2,
+    };
+
+    hvm_inject_event(&event);
+}
 
 static inline int hvm_event_pending(struct vcpu *v)
 {
@@ -437,18 +449,6 @@ static inline int hvm_event_pending(struct vcpu *v)
                        (1U << TRAP_alignment_check) | \
                        (1U << TRAP_machine_check))
 
-/*
- * x86 event types. This enumeration is valid for:
- *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
- *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
- */
-#define X86_EVENTTYPE_EXT_INTR         0 /* external interrupt */
-#define X86_EVENTTYPE_NMI              2 /* NMI */
-#define X86_EVENTTYPE_HW_EXCEPTION     3 /* hardware exception */
-#define X86_EVENTTYPE_SW_INTERRUPT     4 /* software interrupt (CD nn) */
-#define X86_EVENTTYPE_PRI_SW_EXCEPTION 5 /* ICEBP (F1) */
-#define X86_EVENTTYPE_SW_EXCEPTION     6 /* INT3 (CC), INTO (CE) */
-
 int hvm_event_needs_reinjection(uint8_t type, uint8_t vector);
 
 uint8_t hvm_combine_hw_exceptions(uint8_t vec1, uint8_t vec2);
@@ -542,10 +542,10 @@ int hvm_x2apic_msr_write(struct vcpu *v, unsigned int msr, uint64_t msr_content)
 /* inject vmexit into l1 guest. l1 guest will see a VMEXIT due to
  * 'trapnr' exception.
  */ 
-static inline int nhvm_vcpu_vmexit_trap(struct vcpu *v,
-                                        const struct hvm_trap *trap)
+static inline int nhvm_vcpu_vmexit_event(
+    struct vcpu *v, const struct x86_event *event)
 {
-    return hvm_funcs.nhvm_vcpu_vmexit_trap(v, trap);
+    return hvm_funcs.nhvm_vcpu_vmexit_event(v, event);
 }
 
 /* returns l1 guest's cr3 that points to the page table used to
@@ -557,11 +557,10 @@ static inline uint64_t nhvm_vcpu_p2m_base(struct vcpu *v)
 }
 
 /* returns true, when l1 guest intercepts the specified trap */
-static inline bool_t nhvm_vmcx_guest_intercepts_trap(struct vcpu *v,
-                                                     unsigned int trap,
-                                                     int errcode)
+static inline bool_t nhvm_vmcx_guest_intercepts_event(
+    struct vcpu *v, unsigned int vector, int errcode)
 {
-    return hvm_funcs.nhvm_vmcx_guest_intercepts_trap(v, trap, errcode);
+    return hvm_funcs.nhvm_vmcx_guest_intercepts_event(v, vector, errcode);
 }
 
 /* returns true when l1 guest wants to use hap to run l2 guest */
diff --git a/xen/include/asm-x86/hvm/svm/nestedsvm.h b/xen/include/asm-x86/hvm/svm/nestedsvm.h
index 0dbc5ec..4b36c25 100644
--- a/xen/include/asm-x86/hvm/svm/nestedsvm.h
+++ b/xen/include/asm-x86/hvm/svm/nestedsvm.h
@@ -110,10 +110,10 @@ void nsvm_vcpu_destroy(struct vcpu *v);
 int nsvm_vcpu_initialise(struct vcpu *v);
 int nsvm_vcpu_reset(struct vcpu *v);
 int nsvm_vcpu_vmrun(struct vcpu *v, struct cpu_user_regs *regs);
-int nsvm_vcpu_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap);
+int nsvm_vcpu_vmexit_event(struct vcpu *v, const struct x86_event *event);
 uint64_t nsvm_vcpu_hostcr3(struct vcpu *v);
-bool_t nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr,
-                                       int errcode);
+bool_t nsvm_vmcb_guest_intercepts_event(
+    struct vcpu *v, unsigned int vector, int errcode);
 bool_t nsvm_vmcb_hap_enabled(struct vcpu *v);
 enum hvm_intblk nsvm_intr_blocked(struct vcpu *v);
 
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 84d9406..d485536 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -206,7 +206,7 @@ struct hvm_vcpu {
     void *fpu_exception_callback_arg;
 
     /* Pending hw/sw interrupt (.vector = -1 means nothing pending). */
-    struct hvm_trap     inject_trap;
+    struct x86_event     inject_trap;
 
     struct viridian_vcpu viridian;
 };
diff --git a/xen/include/asm-x86/hvm/vmx/vvmx.h b/xen/include/asm-x86/hvm/vmx/vvmx.h
index aca8b4b..ead586e 100644
--- a/xen/include/asm-x86/hvm/vmx/vvmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vvmx.h
@@ -112,8 +112,8 @@ void nvmx_vcpu_destroy(struct vcpu *v);
 int nvmx_vcpu_reset(struct vcpu *v);
 uint64_t nvmx_vcpu_eptp_base(struct vcpu *v);
 enum hvm_intblk nvmx_intr_blocked(struct vcpu *v);
-bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
-                                 int error_code);
+bool_t nvmx_intercepts_exception(
+    struct vcpu *v, unsigned int vector, int error_code);
 void nvmx_domain_relinquish_resources(struct domain *d);
 
 bool_t nvmx_ept_enabled(struct vcpu *v);
-- 
2.1.4



* [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (2 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:20   ` Paul Durrant
                     ` (3 more replies)
  2016-11-23 15:38 ` [PATCH 05/15] x86/emul: Remove opencoded exception generation Andrew Cooper
                   ` (10 subsequent siblings)
  14 siblings, 4 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit

and move it to live with the other x86_event infrastructure in x86_emulate.h.
Switch it and x86_event.error_code to being signed, matching the rest of the
code.
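
A minimal sketch of the effect, assuming the sentinel is defined as -1
(consistent with the switch to a signed error_code; the helper below is
hypothetical, purely for illustration):

    #define X86_EVENT_NO_EC (-1) /* No error code. */

    /* With error_code now signed, "no error code" is an in-band sentinel. */
    static inline bool event_has_error_code(const struct x86_event *e)
    {
        return e->error_code != X86_EVENT_NO_EC;
    }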

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 xen/arch/x86/hvm/emulate.c             |  2 +-
 xen/arch/x86/hvm/hvm.c                 |  6 +++---
 xen/arch/x86/hvm/nestedhvm.c           |  2 +-
 xen/arch/x86/hvm/svm/nestedsvm.c       |  6 +++---
 xen/arch/x86/hvm/svm/svm.c             | 22 +++++++++++-----------
 xen/arch/x86/hvm/vmx/intr.c            |  2 +-
 xen/arch/x86/hvm/vmx/vmx.c             | 23 ++++++++++++-----------
 xen/arch/x86/hvm/vmx/vvmx.c            |  2 +-
 xen/arch/x86/x86_emulate/x86_emulate.h |  3 ++-
 xen/include/asm-x86/hvm/support.h      |  2 --
 10 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d0c3185..790e9c1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1609,7 +1609,7 @@ static int hvmemul_inject_sw_interrupt(
 
     hvmemul_ctxt->exn_pending = 1;
     hvmemul_ctxt->trap.vector = vector;
-    hvmemul_ctxt->trap.error_code = HVM_DELIVER_NO_ERROR_CODE;
+    hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
     hvmemul_ctxt->trap.insn_len = insn_len;
 
     return X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7b434aa..b950842 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -502,7 +502,7 @@ void hvm_do_resume(struct vcpu *v)
                 kind = EMUL_KIND_SET_CONTEXT_INSN;
 
             hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
-                                     HVM_DELIVER_NO_ERROR_CODE);
+                                     X86_EVENT_NO_EC);
 
             v->arch.vm_event->emulate_flags = 0;
         }
@@ -3054,7 +3054,7 @@ void hvm_task_switch(
     }
 
     if ( (tss.trace & 1) && !exn_raised )
-        hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 
  out:
     hvm_unmap_entry(optss_desc);
@@ -4073,7 +4073,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
     switch ( hvm_emulate_one(&ctxt) )
     {
     case X86EMUL_UNHANDLEABLE:
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
     case X86EMUL_EXCEPTION:
         if ( ctxt.exn_pending )
diff --git a/xen/arch/x86/hvm/nestedhvm.c b/xen/arch/x86/hvm/nestedhvm.c
index caad525..c4671d8 100644
--- a/xen/arch/x86/hvm/nestedhvm.c
+++ b/xen/arch/x86/hvm/nestedhvm.c
@@ -17,7 +17,7 @@
  */
 
 #include <asm/msr.h>
-#include <asm/hvm/support.h>	/* for HVM_DELIVER_NO_ERROR_CODE */
+#include <asm/hvm/support.h>
 #include <asm/hvm/hvm.h>
 #include <asm/p2m.h>    /* for struct p2m_domain */
 #include <asm/hvm/nestedhvm.h>
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index b6b8526..8c9b073 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -756,7 +756,7 @@ nsvm_vcpu_vmrun(struct vcpu *v, struct cpu_user_regs *regs)
     default:
         gdprintk(XENLOG_ERR,
             "nsvm_vcpu_vmentry failed, injecting #UD\n");
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         /* Must happen after hvm_inject_hw_exception or it doesn't work right. */
         nv->nv_vmswitch_in_progress = 0;
         return 1;
@@ -1581,7 +1581,7 @@ void svm_vmexit_do_stgi(struct cpu_user_regs *regs, struct vcpu *v)
     unsigned int inst_len;
 
     if ( !nestedhvm_enabled(v->domain) ) {
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
@@ -1601,7 +1601,7 @@ void svm_vmexit_do_clgi(struct cpu_user_regs *regs, struct vcpu *v)
     vintr_t intr;
 
     if ( !nestedhvm_enabled(v->domain) ) {
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 66eb30b..fb4fd0b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -89,7 +89,7 @@ static DEFINE_SPINLOCK(osvw_lock);
 static void svm_crash_or_fault(struct vcpu *v)
 {
     if ( vmcb_get_cpl(v->arch.hvm_svm.vmcb) )
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
     else
         domain_crash(v->domain);
 }
@@ -116,7 +116,7 @@ void __update_guest_eip(struct cpu_user_regs *regs, unsigned int inst_len)
     curr->arch.hvm_svm.vmcb->interrupt_shadow = 0;
 
     if ( regs->eflags & X86_EFLAGS_TF )
-        hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 }
 
 static void svm_cpu_down(void)
@@ -1285,7 +1285,7 @@ static void svm_inject_event(const struct x86_event *event)
 
     default:
         eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
-        eventinj.fields.ev = (_event.error_code != HVM_DELIVER_NO_ERROR_CODE);
+        eventinj.fields.ev = (_event.error_code != X86_EVENT_NO_EC);
         eventinj.fields.errorcode = _event.error_code;
         break;
     }
@@ -1553,7 +1553,7 @@ static void svm_fpu_dirty_intercept(void)
     {
        /* Check if l1 guest must make FPU ready for the l2 guest */
        if ( v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS )
-           hvm_inject_hw_exception(TRAP_no_device, HVM_DELIVER_NO_ERROR_CODE);
+           hvm_inject_hw_exception(TRAP_no_device, X86_EVENT_NO_EC);
        else
            vmcb_set_cr0(n1vmcb, vmcb_get_cr0(n1vmcb) & ~X86_CR0_TS);
        return;
@@ -2022,14 +2022,14 @@ svm_vmexit_do_vmrun(struct cpu_user_regs *regs,
     if ( !nsvm_efer_svm_enabled(v) )
     {
         gdprintk(XENLOG_ERR, "VMRUN: nestedhvm disabled, injecting #UD\n");
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
 
     if ( !nestedsvm_vmcb_map(v, vmcbaddr) )
     {
         gdprintk(XENLOG_ERR, "VMRUN: mapping vmcb failed, injecting #GP\n");
-        hvm_inject_hw_exception(TRAP_gp_fault, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_gp_fault, X86_EVENT_NO_EC);
         return;
     }
 
@@ -2101,7 +2101,7 @@ svm_vmexit_do_vmload(struct vmcb_struct *vmcb,
     return;
 
  inject:
-    hvm_inject_hw_exception(ret, HVM_DELIVER_NO_ERROR_CODE);
+    hvm_inject_hw_exception(ret, X86_EVENT_NO_EC);
     return;
 }
 
@@ -2139,7 +2139,7 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
     return;
 
  inject:
-    hvm_inject_hw_exception(ret, HVM_DELIVER_NO_ERROR_CODE);
+    hvm_inject_hw_exception(ret, X86_EVENT_NO_EC);
     return;
 }
 
@@ -2428,7 +2428,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 
     case VMEXIT_EXCEPTION_DB:
         if ( !v->domain->debugger_attached )
-            hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
         else
             domain_pause_for_debugger();
         break;
@@ -2616,7 +2616,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 
     case VMEXIT_MONITOR:
     case VMEXIT_MWAIT:
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
 
     case VMEXIT_VMRUN:
@@ -2635,7 +2635,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
         svm_vmexit_do_clgi(regs, v);
         break;
     case VMEXIT_SKINIT:
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
 
     case VMEXIT_XSETBV:
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index 8fca08c..639a705 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -302,7 +302,7 @@ void vmx_intr_assist(void)
     }
     else if ( intack.source == hvm_intsrc_mce )
     {
-        hvm_inject_hw_exception(TRAP_machine_check, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_machine_check, X86_EVENT_NO_EC);
     }
     else if ( cpu_has_vmx_virtual_intr_delivery &&
               intack.source != hvm_intsrc_pic &&
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b1d8a0b..eb7c902 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1646,7 +1646,8 @@ static void __vmx_inject_exception(int trap, int type, int error_code)
     intr_fields = INTR_INFO_VALID_MASK |
                   MASK_INSR(type, INTR_INFO_INTR_TYPE_MASK) |
                   MASK_INSR(trap, INTR_INFO_VECTOR_MASK);
-    if ( error_code != HVM_DELIVER_NO_ERROR_CODE ) {
+    if ( error_code != X86_EVENT_NO_EC )
+    {
         __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
         intr_fields |= INTR_INFO_DELIVER_CODE_MASK;
     }
@@ -1671,12 +1672,12 @@ void vmx_inject_extint(int trap, uint8_t source)
                INTR_INFO_VALID_MASK |
                MASK_INSR(X86_EVENTTYPE_EXT_INTR, INTR_INFO_INTR_TYPE_MASK) |
                MASK_INSR(trap, INTR_INFO_VECTOR_MASK),
-               HVM_DELIVER_NO_ERROR_CODE, source);
+               X86_EVENT_NO_EC, source);
             return;
         }
     }
     __vmx_inject_exception(trap, X86_EVENTTYPE_EXT_INTR,
-                           HVM_DELIVER_NO_ERROR_CODE);
+                           X86_EVENT_NO_EC);
 }
 
 void vmx_inject_nmi(void)
@@ -1691,12 +1692,12 @@ void vmx_inject_nmi(void)
                INTR_INFO_VALID_MASK |
                MASK_INSR(X86_EVENTTYPE_NMI, INTR_INFO_INTR_TYPE_MASK) |
                MASK_INSR(TRAP_nmi, INTR_INFO_VECTOR_MASK),
-               HVM_DELIVER_NO_ERROR_CODE, hvm_intsrc_nmi);
+               X86_EVENT_NO_EC, hvm_intsrc_nmi);
             return;
         }
     }
     __vmx_inject_exception(2, X86_EVENTTYPE_NMI,
-                           HVM_DELIVER_NO_ERROR_CODE);
+                           X86_EVENT_NO_EC);
 }
 
 /*
@@ -2111,7 +2112,7 @@ static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
     vmx_vmcs_exit(v);
 
     hvm_inject_hw_exception(TRAP_virtualisation,
-                            HVM_DELIVER_NO_ERROR_CODE);
+                            X86_EVENT_NO_EC);
 
  out:
     hvm_unmap_guest_frame(veinfo, 0);
@@ -2387,7 +2388,7 @@ void update_guest_eip(void)
     }
 
     if ( regs->eflags & X86_EFLAGS_TF )
-        hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 }
 
 static void vmx_fpu_dirty_intercept(void)
@@ -3213,7 +3214,7 @@ static void vmx_propagate_intr(unsigned long intr)
         event.error_code = tmp;
     }
     else
-        event.error_code = HVM_DELIVER_NO_ERROR_CODE;
+        event.error_code = X86_EVENT_NO_EC;
 
     if ( event.type >= X86_EVENTTYPE_SW_INTERRUPT )
     {
@@ -3770,7 +3771,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
     case EXIT_REASON_VMFUNC:
         if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
-            hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+            hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         else
             update_guest_eip();
         break;
@@ -3784,7 +3785,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
          * as far as vmexit.
          */
         WARN_ON(exit_reason == EXIT_REASON_GETSEC);
-        hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
 
     case EXIT_REASON_TPR_BELOW_THRESHOLD:
@@ -3909,7 +3910,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             vmx_get_segment_register(v, x86_seg_ss, &ss);
             if ( ss.attr.fields.dpl )
                 hvm_inject_hw_exception(TRAP_invalid_op,
-                                        HVM_DELIVER_NO_ERROR_CODE);
+                                        X86_EVENT_NO_EC);
             else
                 domain_crash(v->domain);
         }
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index b5837d4..efaf54c 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -380,7 +380,7 @@ static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
     
 invalid_op:
     gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: invalid_op\n");
-    hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+    hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
     return X86EMUL_EXCEPTION;
 
 gp_fault:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index b0d8e6f..9df083e 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -80,12 +80,13 @@ enum x86_event_type {
     X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
     X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
 };
+#define X86_EVENT_NO_EC (-1)        /* No error code. */
 
 struct x86_event {
     int16_t       vector;
     uint8_t       type;         /* X86_EVENTTYPE_* */
     uint8_t       insn_len;     /* Instruction length */
-    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
+    int32_t       error_code;   /* X86_EVENT_NO_EC if n/a */
     unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
 };
 
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 2984abc..9938450 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -25,8 +25,6 @@
 #include <xen/hvm/save.h>
 #include <asm/processor.h>
 
-#define HVM_DELIVER_NO_ERROR_CODE  (~0U)
-
 #ifndef NDEBUG
 #define DBG_LEVEL_0                 (1 << 0)
 #define DBG_LEVEL_1                 (1 << 1)
-- 
2.1.4



* [PATCH 05/15] x86/emul: Remove opencoded exception generation
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (3 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-24 14:31   ` Jan Beulich
  2016-11-23 15:38 ` [PATCH 06/15] x86/emul: Rework emulator event injection Andrew Cooper
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

Introduce generate_exception() for unconditional exception generation, and
replace existing uses.  Both generate_exception() and generate_exception_if()
are updated to make their error code parameters optional, which removes the
use of the -1 sentinel.
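
The optional parameter relies on GNU C named variadic macros together
with the comma-swallowing ##ec paste.  As a minimal self-contained
sketch of the idiom (simplified from the hunks below; the HAS_EC subset
here is illustrative, and the whole thing compiles with gcc):

    #include <stdio.h>

    /* Vectors which architecturally push an error code (subset). */
    #define EXC_UD  6
    #define EXC_GP 13
    #define EXC_PF 14
    #define HAS_EC(e) ((1u << (e)) & ((1u << EXC_GP) | (1u << EXC_PF)))

    /* The trailing 0 in the macro expansion acts as a default error
     * code; vectors without one always yield the -1 sentinel, playing
     * the role of X86_EVENT_NO_EC. */
    static inline int mkec(int e, int ec, ...)
    {
        return HAS_EC(e) ? ec : -1;
    }

    #define generate_exception_if(p, e, ec...)                        \
        do {                                                          \
            if ( p )                                                  \
                printf("vector %d, ec %d\n", e, mkec(e, ##ec, 0));    \
        } while ( 0 )

    int main(void)
    {
        generate_exception_if(1, EXC_GP, 0); /* vector 13, ec 0  */
        generate_exception_if(1, EXC_UD);    /* vector 6,  ec -1 */
        return 0;
    }

When the error code is omitted, ##ec deletes the preceding comma and
mkec() consumes the default 0 through its variadic tail; the HAS_EC
mask then decides between the real error code and the sentinel.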

ioport_access_check() loses the presence check for %tr, as the x86
architecture has no concept of a non-usable task register.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 196 +++++++++++++++++----------------
 1 file changed, 100 insertions(+), 96 deletions(-)

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 2e65322..f8271b3 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -457,14 +457,20 @@ typedef union {
 #define EXC_BR  5
 #define EXC_UD  6
 #define EXC_NM  7
+#define EXC_DF  8
 #define EXC_TS 10
 #define EXC_NP 11
 #define EXC_SS 12
 #define EXC_GP 13
 #define EXC_PF 14
 #define EXC_MF 16
+#define EXC_AC 17
 #define EXC_XM 19
 
+#define EXC_HAS_EC                                                      \
+    ((1u << EXC_DF) | (1u << EXC_TS) | (1u << EXC_NP) |                 \
+     (1u << EXC_SS) | (1u << EXC_GP) | (1u << EXC_PF) | (1u << EXC_AC))
+
 /* Segment selector error code bits. */
 #define ECODE_EXT (1 << 0)
 #define ECODE_IDT (1 << 1)
@@ -667,14 +673,22 @@ do {                                                    \
     if ( rc ) goto done;                                \
 } while (0)
 
-#define generate_exception_if(p, e, ec)                                   \
+static inline int mkec(uint8_t e, int32_t ec, ...)
+{
+    return (e < 32 && (1u << e) & EXC_HAS_EC) ? ec : X86_EVENT_NO_EC;
+}
+
+#define generate_exception_if(p, e, ec...)                                \
 ({  if ( (p) ) {                                                          \
         fail_if(ops->inject_hw_exception == NULL);                        \
-        rc = ops->inject_hw_exception(e, ec, ctxt) ? : X86EMUL_EXCEPTION; \
+        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \
+            ? : X86EMUL_EXCEPTION;                                        \
         goto done;                                                        \
     }                                                                     \
 })
 
+#define generate_exception(e, ec...) generate_exception_if(true, e, ##ec)
+
 /*
  * Given byte has even parity (even number of 1s)? SDM Vol. 1 Sec. 3.4.3.1,
  * "Status Flags": EFLAGS.PF reflects parity of least-sig. byte of result only.
@@ -785,7 +799,7 @@ static int _get_fpu(
                 return rc;
             generate_exception_if(!(cr4 & ((type == X86EMUL_FPU_xmm)
                                            ? CR4_OSFXSR : CR4_OSXSAVE)),
-                                  EXC_UD, -1);
+                                  EXC_UD);
         }
 
         rc = ops->read_cr(0, &cr0, ctxt);
@@ -798,13 +812,13 @@ static int _get_fpu(
         }
         if ( cr0 & CR0_EM )
         {
-            generate_exception_if(type == X86EMUL_FPU_fpu, EXC_NM, -1);
-            generate_exception_if(type == X86EMUL_FPU_mmx, EXC_UD, -1);
-            generate_exception_if(type == X86EMUL_FPU_xmm, EXC_UD, -1);
+            generate_exception_if(type == X86EMUL_FPU_fpu, EXC_NM);
+            generate_exception_if(type == X86EMUL_FPU_mmx, EXC_UD);
+            generate_exception_if(type == X86EMUL_FPU_xmm, EXC_UD);
         }
         generate_exception_if((cr0 & CR0_TS) &&
                               (type != X86EMUL_FPU_wait || (cr0 & CR0_MP)),
-                              EXC_NM, -1);
+                              EXC_NM);
     }
 
  done:
@@ -832,7 +846,7 @@ do {                                                            \
             (_fic)->exn_raised = EXC_UD;                        \
     }                                                           \
     generate_exception_if((_fic)->exn_raised >= 0,              \
-                          (_fic)->exn_raised, -1);              \
+                          (_fic)->exn_raised);                  \
 } while (0)
 
 #define emulate_fpu_insn(_op)                           \
@@ -1167,11 +1181,9 @@ static int ioport_access_check(
     if ( (rc = ops->read_segment(x86_seg_tr, &tr, ctxt)) != 0 )
         return rc;
 
-    /* Ensure that the TSS is valid and has an io-bitmap-offset field. */
-    if ( !tr.attr.fields.p ||
-         ((tr.attr.fields.type & 0xd) != 0x9) ||
-         (tr.limit < 0x67) )
-        goto raise_exception;
+    /* Ensure the TSS has an io-bitmap-offset field. */
+    generate_exception_if(tr.attr.fields.type != 0xb ||
+                          tr.limit < 0x67, EXC_GP, 0);
 
     if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
                           &iobmp, 2, ctxt, ops)) )
@@ -1179,21 +1191,16 @@ static int ioport_access_check(
 
     /* Ensure TSS includes two bytes including byte containing first port. */
     iobmp += first_port / 8;
-    if ( tr.limit <= iobmp )
-        goto raise_exception;
+    generate_exception_if(tr.limit <= iobmp, EXC_GP, 0);
 
     if ( (rc = read_ulong(x86_seg_none, tr.base + iobmp,
                           &iobmp, 2, ctxt, ops)) )
         return rc;
-    if ( (iobmp & (((1<<bytes)-1) << (first_port&7))) != 0 )
-        goto raise_exception;
+    generate_exception_if(iobmp & (((1 << bytes) - 1) << (first_port & 7)),
+                          EXC_GP, 0);
 
  done:
     return rc;
-
- raise_exception:
-    fail_if(ops->inject_hw_exception == NULL);
-    return ops->inject_hw_exception(EXC_GP, 0, ctxt) ? : X86EMUL_EXCEPTION;
 }
 
 static bool_t
@@ -1262,7 +1269,7 @@ static bool_t vcpu_has(
 #define vcpu_has_rtm()   vcpu_has(0x00000007, EBX, 11, ctxt, ops)
 
 #define vcpu_must_have(leaf, reg, bit) \
-    generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD, -1)
+    generate_exception_if(!vcpu_has(leaf, reg, bit, ctxt, ops), EXC_UD)
 #define vcpu_must_have_fpu()  vcpu_must_have(0x00000001, EDX, 0)
 #define vcpu_must_have_cmov() vcpu_must_have(0x00000001, EDX, 15)
 #define vcpu_must_have_mmx()  vcpu_must_have(0x00000001, EDX, 23)
@@ -1282,7 +1289,7 @@ static bool_t vcpu_has(
  * the actual operation.
  */
 #define host_and_vcpu_must_have(feat) ({ \
-    generate_exception_if(!cpu_has_##feat, EXC_UD, -1); \
+    generate_exception_if(!cpu_has_##feat, EXC_UD); \
     vcpu_must_have_##feat(); \
 })
 #else
@@ -1485,11 +1492,9 @@ protmode_load_seg(
     return X86EMUL_OKAY;
 
  raise_exn:
-    if ( ops->inject_hw_exception == NULL )
-        return X86EMUL_UNHANDLEABLE;
-    if ( (rc = ops->inject_hw_exception(fault_type, sel & 0xfffc, ctxt)) )
-        return rc;
-    return X86EMUL_EXCEPTION;
+    generate_exception(fault_type, sel & 0xfffc);
+ done:
+    return rc;
 }
 
 static int
@@ -1704,7 +1709,7 @@ static int inject_swint(enum x86_swint_type type,
     return rc;
 
  raise_exn:
-    return ops->inject_hw_exception(fault_type, error_code, ctxt);
+    generate_exception(fault_type, error_code);
 }
 
 int x86emul_unhandleable_rw(
@@ -1795,7 +1800,7 @@ x86_decode_onebyte(
 
     case 0x9a: /* call (far, absolute) */
     case 0xea: /* jmp (far, absolute) */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
 
         imm1 = insn_fetch_bytes(op_bytes);
         imm2 = insn_fetch_type(uint16_t);
@@ -2024,7 +2029,7 @@ x86_decode(
                 /* fall through */
             case 8:
                 /* VEX / XOP / EVEX */
-                generate_exception_if(rex_prefix || vex.pfx, EXC_UD, -1);
+                generate_exception_if(rex_prefix || vex.pfx, EXC_UD);
 
                 vex.raw[0] = modrm;
                 if ( b == 0xc5 )
@@ -2514,12 +2519,12 @@ x86_emulate(
             (ext != ext_0f ||
              (((b < 0x20) || (b > 0x23)) && /* MOV CRn/DRn */
               (b != 0xc7))),                /* CMPXCHG{8,16}B */
-            EXC_UD, -1);
+            EXC_UD);
         dst.type = OP_NONE;
         break;
 
     case DstReg:
-        generate_exception_if(lock_prefix, EXC_UD, -1);
+        generate_exception_if(lock_prefix, EXC_UD);
         dst.type = OP_REG;
         if ( d & ByteOp )
         {
@@ -2575,7 +2580,7 @@ x86_emulate(
         dst = ea;
         if ( dst.type == OP_REG )
         {
-            generate_exception_if(lock_prefix, EXC_UD, -1);
+            generate_exception_if(lock_prefix, EXC_UD);
             switch ( dst.bytes )
             {
             case 1: dst.val = *(uint8_t  *)dst.reg; break;
@@ -2592,7 +2597,7 @@ x86_emulate(
             dst.orig_val = dst.val;
         }
         else /* Lock prefix is allowed only on RMW instructions. */
-            generate_exception_if(lock_prefix, EXC_UD, -1);
+            generate_exception_if(lock_prefix, EXC_UD);
         break;
     }
 
@@ -2630,7 +2635,7 @@ x86_emulate(
         break;
 
     case 0x38 ... 0x3d: cmp: /* cmp */
-        generate_exception_if(lock_prefix, EXC_UD, -1);
+        generate_exception_if(lock_prefix, EXC_UD);
         emulate_2op_SrcV("cmp", src, dst, _regs.eflags);
         dst.type = OP_NONE;
         break;
@@ -2638,7 +2643,7 @@ x86_emulate(
     case 0x06: /* push %%es */
         src.val = x86_seg_es;
     push_seg:
-        generate_exception_if(mode_64bit() && !ext, EXC_UD, -1);
+        generate_exception_if(mode_64bit() && !ext, EXC_UD);
         fail_if(ops->read_segment == NULL);
         if ( (rc = ops->read_segment(src.val, &sreg, ctxt)) != 0 )
             goto done;
@@ -2648,7 +2653,7 @@ x86_emulate(
     case 0x07: /* pop %%es */
         src.val = x86_seg_es;
     pop_seg:
-        generate_exception_if(mode_64bit() && !ext, EXC_UD, -1);
+        generate_exception_if(mode_64bit() && !ext, EXC_UD);
         fail_if(ops->write_segment == NULL);
         /* 64-bit mode: POP defaults to a 64-bit operand. */
         if ( mode_64bit() && (op_bytes == 4) )
@@ -2685,7 +2690,7 @@ x86_emulate(
         uint8_t al = _regs.eax;
         unsigned long eflags = _regs.eflags;
 
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         _regs.eflags &= ~(EFLG_CF|EFLG_AF|EFLG_SF|EFLG_ZF|EFLG_PF);
         if ( ((al & 0x0f) > 9) || (eflags & EFLG_AF) )
         {
@@ -2707,7 +2712,7 @@ x86_emulate(
 
     case 0x37: /* aaa */
     case 0x3f: /* aas */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         _regs.eflags &= ~EFLG_CF;
         if ( ((uint8_t)_regs.eax > 9) || (_regs.eflags & EFLG_AF) )
         {
@@ -2751,7 +2756,7 @@ x86_emulate(
         unsigned long regs[] = {
             _regs.eax, _regs.ecx, _regs.edx, _regs.ebx,
             _regs.esp, _regs.ebp, _regs.esi, _regs.edi };
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         for ( i = 0; i < 8; i++ )
             if ( (rc = ops->write(x86_seg_ss, sp_pre_dec(op_bytes),
                                   &regs[i], op_bytes, ctxt)) != 0 )
@@ -2766,7 +2771,7 @@ x86_emulate(
             (unsigned long *)&_regs.ebp, (unsigned long *)&dummy_esp,
             (unsigned long *)&_regs.ebx, (unsigned long *)&_regs.edx,
             (unsigned long *)&_regs.ecx, (unsigned long *)&_regs.eax };
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         for ( i = 0; i < 8; i++ )
         {
             if ( (rc = read_ulong(x86_seg_ss, sp_post_inc(op_bytes),
@@ -2784,14 +2789,14 @@ x86_emulate(
         unsigned long src_val2;
         int lb, ub, idx;
         generate_exception_if(mode_64bit() || (src.type != OP_MEM),
-                              EXC_UD, -1);
+                              EXC_UD);
         if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
                               &src_val2, op_bytes, ctxt, ops)) )
             goto done;
         ub  = (op_bytes == 2) ? (int16_t)src_val2 : (int32_t)src_val2;
         lb  = (op_bytes == 2) ? (int16_t)src.val  : (int32_t)src.val;
         idx = (op_bytes == 2) ? (int16_t)dst.val  : (int32_t)dst.val;
-        generate_exception_if((idx < lb) || (idx > ub), EXC_BR, -1);
+        generate_exception_if((idx < lb) || (idx > ub), EXC_BR);
         dst.type = OP_NONE;
         break;
     }
@@ -2829,7 +2834,7 @@ x86_emulate(
                 _regs.eflags &= ~EFLG_ZF;
                 dst.type = OP_NONE;
             }
-            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
         }
         break;
 
@@ -2920,7 +2925,7 @@ x86_emulate(
         break;
 
     case 0x82: /* Grp1 (x86/32 only) */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
     case 0x80: case 0x81: case 0x83: /* Grp1 */
         switch ( modrm_reg & 7 )
         {
@@ -2971,7 +2976,7 @@ x86_emulate(
             dst.type = OP_NONE;
             break;
         }
-        generate_exception_if((modrm_reg & 7) != 0, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) != 0, EXC_UD);
     case 0x88 ... 0x8b: /* mov */
     case 0xa0 ... 0xa1: /* mov mem.offs,{%al,%ax,%eax,%rax} */
     case 0xa2 ... 0xa3: /* mov {%al,%ax,%eax,%rax},mem.offs */
@@ -2980,7 +2985,7 @@ x86_emulate(
 
     case 0x8c: /* mov Sreg,r/m */
         seg = modrm_reg & 7; /* REX.R is ignored. */
-        generate_exception_if(!is_x86_user_segment(seg), EXC_UD, -1);
+        generate_exception_if(!is_x86_user_segment(seg), EXC_UD);
     store_selector:
         fail_if(ops->read_segment == NULL);
         if ( (rc = ops->read_segment(seg, &sreg, ctxt)) != 0 )
@@ -2993,7 +2998,7 @@ x86_emulate(
     case 0x8e: /* mov r/m,Sreg */
         seg = modrm_reg & 7; /* REX.R is ignored. */
         generate_exception_if(!is_x86_user_segment(seg) ||
-                              seg == x86_seg_cs, EXC_UD, -1);
+                              seg == x86_seg_cs, EXC_UD);
         if ( (rc = load_seg(seg, src.val, 0, NULL, ctxt, ops)) != 0 )
             goto done;
         if ( seg == x86_seg_ss )
@@ -3002,12 +3007,12 @@ x86_emulate(
         break;
 
     case 0x8d: /* lea */
-        generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+        generate_exception_if(ea.type != OP_MEM, EXC_UD);
         dst.val = ea.mem.off;
         break;
 
     case 0x8f: /* pop (sole member of Grp1a) */
-        generate_exception_if((modrm_reg & 7) != 0, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) != 0, EXC_UD);
         /* 64-bit mode: POP defaults to a 64-bit operand. */
         if ( mode_64bit() && (dst.bytes == 4) )
             dst.bytes = 8;
@@ -3284,8 +3289,8 @@ x86_emulate(
         unsigned long sel;
         dst.val = x86_seg_es;
     les: /* dst.val identifies the segment */
-        generate_exception_if(mode_64bit() && !ext, EXC_UD, -1);
-        generate_exception_if(src.type != OP_MEM, EXC_UD, -1);
+        generate_exception_if(mode_64bit() && !ext, EXC_UD);
+        generate_exception_if(src.type != OP_MEM, EXC_UD);
         if ( (rc = read_ulong(src.mem.seg, src.mem.off + src.bytes,
                               &sel, 2, ctxt, ops)) != 0 )
             goto done;
@@ -3375,7 +3380,7 @@ x86_emulate(
         goto done;
 
     case 0xce: /* into */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         if ( !(_regs.eflags & EFLG_OF) )
             break;
         src.val = EXC_OF;
@@ -3417,7 +3422,7 @@ x86_emulate(
     case 0xd5: /* aad */ {
         unsigned int base = (uint8_t)src.val;
 
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         if ( b & 0x01 )
         {
             uint16_t ax = _regs.eax;
@@ -3428,7 +3433,7 @@ x86_emulate(
         {
             uint8_t al = _regs.eax;
 
-            generate_exception_if(!base, EXC_DE, -1);
+            generate_exception_if(!base, EXC_DE);
             *(uint16_t *)&_regs.eax = ((al / base) << 8) | (al % base);
         }
         _regs.eflags &= ~(EFLG_SF|EFLG_ZF|EFLG_PF);
@@ -3439,7 +3444,7 @@ x86_emulate(
     }
 
     case 0xd6: /* salc */
-        generate_exception_if(mode_64bit(), EXC_UD, -1);
+        generate_exception_if(mode_64bit(), EXC_UD);
         *(uint8_t *)&_regs.eax = (_regs.eflags & EFLG_CF) ? 0xff : 0x00;
         break;
 
@@ -4047,7 +4052,7 @@ x86_emulate(
             unsigned long u[2], v;
 
         case 0 ... 1: /* test */
-            generate_exception_if(lock_prefix, EXC_UD, -1);
+            generate_exception_if(lock_prefix, EXC_UD);
             goto test;
         case 2: /* not */
             dst.val = ~dst.val;
@@ -4145,7 +4150,7 @@ x86_emulate(
                 v    = (uint8_t)src.val;
                 generate_exception_if(
                     div_dbl(u, v) || ((uint8_t)u[0] != (uint16_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (uint8_t)u[0];
                 ((uint8_t *)&_regs.eax)[1] = u[1];
                 break;
@@ -4155,7 +4160,7 @@ x86_emulate(
                 v    = (uint16_t)src.val;
                 generate_exception_if(
                     div_dbl(u, v) || ((uint16_t)u[0] != (uint32_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (uint16_t)u[0];
                 *(uint16_t *)&_regs.edx = u[1];
                 break;
@@ -4166,7 +4171,7 @@ x86_emulate(
                 v    = (uint32_t)src.val;
                 generate_exception_if(
                     div_dbl(u, v) || ((uint32_t)u[0] != u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val   = (uint32_t)u[0];
                 _regs.edx = (uint32_t)u[1];
                 break;
@@ -4175,7 +4180,7 @@ x86_emulate(
                 u[0] = _regs.eax;
                 u[1] = _regs.edx;
                 v    = src.val;
-                generate_exception_if(div_dbl(u, v), EXC_DE, -1);
+                generate_exception_if(div_dbl(u, v), EXC_DE);
                 dst.val   = u[0];
                 _regs.edx = u[1];
                 break;
@@ -4191,7 +4196,7 @@ x86_emulate(
                 v    = (int8_t)src.val;
                 generate_exception_if(
                     idiv_dbl(u, v) || ((int8_t)u[0] != (int16_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (int8_t)u[0];
                 ((int8_t *)&_regs.eax)[1] = u[1];
                 break;
@@ -4201,7 +4206,7 @@ x86_emulate(
                 v    = (int16_t)src.val;
                 generate_exception_if(
                     idiv_dbl(u, v) || ((int16_t)u[0] != (int32_t)u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val = (int16_t)u[0];
                 *(int16_t *)&_regs.edx = u[1];
                 break;
@@ -4212,7 +4217,7 @@ x86_emulate(
                 v    = (int32_t)src.val;
                 generate_exception_if(
                     idiv_dbl(u, v) || ((int32_t)u[0] != u[0]),
-                    EXC_DE, -1);
+                    EXC_DE);
                 dst.val   = (int32_t)u[0];
                 _regs.edx = (uint32_t)u[1];
                 break;
@@ -4221,7 +4226,7 @@ x86_emulate(
                 u[0] = _regs.eax;
                 u[1] = _regs.edx;
                 v    = src.val;
-                generate_exception_if(idiv_dbl(u, v), EXC_DE, -1);
+                generate_exception_if(idiv_dbl(u, v), EXC_DE);
                 dst.val   = u[0];
                 _regs.edx = u[1];
                 break;
@@ -4261,7 +4266,7 @@ x86_emulate(
         break;
 
     case 0xfe: /* Grp4 */
-        generate_exception_if((modrm_reg & 7) >= 2, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) >= 2, EXC_UD);
     case 0xff: /* Grp5 */
         switch ( modrm_reg & 7 )
         {
@@ -4286,7 +4291,7 @@ x86_emulate(
             break;
         case 3: /* call (far, absolute indirect) */
         case 5: /* jmp (far, absolute indirect) */
-            generate_exception_if(src.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(src.type != OP_MEM, EXC_UD);
 
             if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
                                   &imm2, 2, ctxt, ops)) )
@@ -4298,13 +4303,13 @@ x86_emulate(
         case 6: /* push */
             goto push;
         case 7:
-            generate_exception_if(1, EXC_UD, -1);
+            generate_exception(EXC_UD);
         }
         break;
 
     case X86EMUL_OPC(0x0f, 0x00): /* Grp6 */
         seg = (modrm_reg & 1) ? x86_seg_tr : x86_seg_ldtr;
-        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
         switch ( modrm_reg & 6 )
         {
         case 0: /* sldt / str */
@@ -4316,7 +4321,7 @@ x86_emulate(
                 goto done;
             break;
         default:
-            generate_exception_if(true, EXC_UD, -1);
+            generate_exception_if(true, EXC_UD);
             break;
         }
         break;
@@ -4331,10 +4336,10 @@ x86_emulate(
         {
             unsigned long cr4;
 
-            generate_exception_if(vex.pfx, EXC_UD, -1);
+            generate_exception_if(vex.pfx, EXC_UD);
             if ( !ops->read_cr || ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
                 cr4 = 0;
-            generate_exception_if(!(cr4 & X86_CR4_OSXSAVE), EXC_UD, -1);
+            generate_exception_if(!(cr4 & X86_CR4_OSXSAVE), EXC_UD);
             generate_exception_if(!mode_ring0() ||
                                   handle_xsetbv(_regs._ecx,
                                                 _regs._eax | (_regs.rdx << 32)),
@@ -4345,28 +4350,28 @@ x86_emulate(
 
         case 0xd4: /* vmfunc */
             generate_exception_if(lock_prefix | rep_prefix() | (vex.pfx == vex_66),
-                                  EXC_UD, -1);
+                                  EXC_UD);
             fail_if(!ops->vmfunc);
             if ( (rc = ops->vmfunc(ctxt) != X86EMUL_OKAY) )
                 goto done;
             goto no_writeback;
 
         case 0xd5: /* xend */
-            generate_exception_if(vex.pfx, EXC_UD, -1);
-            generate_exception_if(!vcpu_has_rtm(), EXC_UD, -1);
+            generate_exception_if(vex.pfx, EXC_UD);
+            generate_exception_if(!vcpu_has_rtm(), EXC_UD);
             generate_exception_if(vcpu_has_rtm(), EXC_GP, 0);
             break;
 
         case 0xd6: /* xtest */
-            generate_exception_if(vex.pfx, EXC_UD, -1);
+            generate_exception_if(vex.pfx, EXC_UD);
             generate_exception_if(!vcpu_has_rtm() && !vcpu_has_hle(),
-                                  EXC_UD, -1);
+                                  EXC_UD);
             /* Neither HLE nor RTM can be active when we get here. */
             _regs.eflags |= EFLG_ZF;
             goto no_writeback;
 
         case 0xdf: /* invlpga */
-            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
             fail_if(ops->invlpg == NULL);
             if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.eax),
@@ -4396,7 +4401,7 @@ x86_emulate(
                  ops->cpuid(&eax, &ebx, &dummy, &dummy, ctxt) == X86EMUL_OKAY )
                 limit = ((ebx >> 8) & 0xff) * 8;
             generate_exception_if(limit < sizeof(long) ||
-                                  (limit & (limit - 1)), EXC_UD, -1);
+                                  (limit & (limit - 1)), EXC_UD);
             base &= ~(limit - 1);
             if ( override_seg == -1 )
                 override_seg = x86_seg_ds;
@@ -4432,7 +4437,7 @@ x86_emulate(
         {
         case 0: /* sgdt */
         case 1: /* sidt */
-            generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(ea.type != OP_MEM, EXC_UD);
             generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0);
             fail_if(ops->read_segment == NULL);
             if ( (rc = ops->read_segment(seg, &sreg, ctxt)) )
@@ -4453,7 +4458,7 @@ x86_emulate(
         case 2: /* lgdt */
         case 3: /* lidt */
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(ea.type != OP_MEM, EXC_UD);
             fail_if(ops->write_segment == NULL);
             memset(&sreg, 0, sizeof(sreg));
             if ( (rc = read_ulong(ea.mem.seg, ea.mem.off+0,
@@ -4496,7 +4501,7 @@ x86_emulate(
             break;
         case 7: /* invlpg */
             generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+            generate_exception_if(ea.type != OP_MEM, EXC_UD);
             fail_if(ops->invlpg == NULL);
             if ( (rc = ops->invlpg(ea.mem.seg, ea.mem.off, ctxt)) )
                 goto done;
@@ -4510,13 +4515,13 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0x05): /* syscall */ {
         uint64_t msr_content;
 
-        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD, -1);
+        generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
 
         /* Inject #UD if syscall/sysret are disabled. */
         fail_if(ops->read_msr == NULL);
         if ( (rc = ops->read_msr(MSR_EFER, &msr_content, ctxt)) != 0 )
             goto done;
-        generate_exception_if((msr_content & EFER_SCE) == 0, EXC_UD, -1);
+        generate_exception_if((msr_content & EFER_SCE) == 0, EXC_UD);
 
         if ( (rc = ops->read_msr(MSR_STAR, &msr_content, ctxt)) != 0 )
             goto done;
@@ -4585,7 +4590,7 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0x0b): /* ud2 */
     case X86EMUL_OPC(0x0f, 0xb9): /* ud1 */
     case X86EMUL_OPC(0x0f, 0xff): /* ud0 */
-        generate_exception_if(1, EXC_UD, -1);
+        generate_exception(EXC_UD);
 
     case X86EMUL_OPC(0x0f, 0x0d): /* GrpP (prefetch) */
     case X86EMUL_OPC(0x0f, 0x18): /* Grp16 (prefetch/nop) */
@@ -4704,7 +4709,7 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0x21): /* mov dr,reg */
     case X86EMUL_OPC(0x0f, 0x22): /* mov reg,cr */
     case X86EMUL_OPC(0x0f, 0x23): /* mov reg,dr */
-        generate_exception_if(ea.type != OP_REG, EXC_UD, -1);
+        generate_exception_if(ea.type != OP_REG, EXC_UD);
         generate_exception_if(!mode_ring0(), EXC_GP, 0);
         modrm_reg |= lock_prefix << 3;
         if ( b & 2 )
@@ -4941,11 +4946,11 @@ x86_emulate(
         switch ( b )
         {
         case 0x7e:
-            generate_exception_if(vex.l, EXC_UD, -1);
+            generate_exception_if(vex.l, EXC_UD);
             ea.bytes = op_bytes;
             break;
         case 0xd6:
-            generate_exception_if(vex.l, EXC_UD, -1);
+            generate_exception_if(vex.l, EXC_UD);
             ea.bytes = 8;
             break;
         }
@@ -5035,7 +5040,7 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0xad): /* shrd %%cl,r,r/m */ {
         uint8_t shift, width = dst.bytes << 3;
 
-        generate_exception_if(lock_prefix, EXC_UD, -1);
+        generate_exception_if(lock_prefix, EXC_UD);
         if ( b & 1 )
             shift = _regs.ecx;
         else
@@ -5150,7 +5155,7 @@ x86_emulate(
         case 5: goto bts;
         case 6: goto btr;
         case 7: goto btc;
-        default: generate_exception_if(1, EXC_UD, -1);
+        default: generate_exception(EXC_UD);
         }
         break;
 
@@ -5251,15 +5256,15 @@ x86_emulate(
     case X86EMUL_OPC(0x0f, 0xc3): /* movnti */
         /* Ignore the non-temporal hint for now. */
         vcpu_must_have_sse2();
-        generate_exception_if(dst.bytes <= 2, EXC_UD, -1);
+        generate_exception_if(dst.bytes <= 2, EXC_UD);
         dst.val = src.val;
         break;
 
     case X86EMUL_OPC(0x0f, 0xc7): /* Grp9 (cmpxchg8b/cmpxchg16b) */ {
         unsigned long old[2], exp[2], new[2];
 
-        generate_exception_if((modrm_reg & 7) != 1, EXC_UD, -1);
-        generate_exception_if(ea.type != OP_MEM, EXC_UD, -1);
+        generate_exception_if((modrm_reg & 7) != 1, EXC_UD);
+        generate_exception_if(ea.type != OP_MEM, EXC_UD);
         if ( op_bytes == 8 )
             host_and_vcpu_must_have(cx16);
         op_bytes *= 2;
@@ -5416,8 +5421,7 @@ x86_emulate(
     *ctxt->regs = _regs;
 
     /* Inject #DB if single-step tracing was enabled at instruction start. */
-    if ( tf && (rc == X86EMUL_OKAY) && ops->inject_hw_exception )
-        rc = ops->inject_hw_exception(EXC_DB, -1, ctxt) ? : X86EMUL_EXCEPTION;
+    generate_exception_if(tf && (rc == X86EMUL_OKAY), EXC_DB);
 
  done:
     _put_fpu();
-- 
2.1.4



* [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (4 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 05/15] x86/emul: Remove opencoded exception generation Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:19   ` Tim Deegan
                     ` (3 more replies)
  2016-11-23 15:38 ` [PATCH 07/15] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
                   ` (8 subsequent siblings)
  14 siblings, 4 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Andrew Cooper,
	Tim Deegan, Paul Durrant, Jun Nakajima, Boris Ostrovsky,
	Suravee Suthikulpanit

The emulator needs to gain an understanding of interrupts and exceptions
generated by its actions.

Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
are visible to the emulator.  This removes the need for the
inject_{hw,sw}_interrupt() hooks, which are dropped and replaced with
x86_emul_{hw_exception,software_event}().

The shadow pagetable and PV uses of x86_emulate() previously failed with
X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks, but this behaviour
has subtly changed.  Adjust the return value checking so that a pending
event falls back into the previous codepath.
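
The resulting contract can be modelled as a toy standalone program (all
names below are illustrative stand-ins, not Xen's): the emulator records
the event in its context instead of calling an injection hook, and each
caller decides afterwards whether to inject it or to treat it as a
failure.

    #include <stdbool.h>
    #include <stdio.h>

    struct x86_event { int vector; int error_code; };
    struct toy_ctxt  { bool event_pending; struct x86_event event; };

    enum { EMUL_OKAY, EMUL_EXCEPTION };

    /* Stands in for x86_emulate(): fault with #GP/0, recorded in the
     * context rather than delivered through a hook. */
    static int toy_emulate(struct toy_ctxt *ctxt)
    {
        ctxt->event = (struct x86_event){ .vector = 13, .error_code = 0 };
        ctxt->event_pending = true;
        return EMUL_EXCEPTION;
    }

    int main(void)
    {
        struct toy_ctxt ctxt = { 0 };

        /* HVM-style caller: inject the recorded event.  Shadow/PV-style
         * callers instead treat a pending event like UNHANDLEABLE. */
        if ( toy_emulate(&ctxt) == EMUL_EXCEPTION && ctxt.event_pending )
            printf("inject vector %d, ec %d\n",
                   ctxt.event.vector, ctxt.event.error_code);
        return 0;
    }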

No overall functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 tools/tests/x86_emulator/test_x86_emulator.c |  1 +
 xen/arch/x86/hvm/emulate.c                   | 81 ++++------------------------
 xen/arch/x86/hvm/hvm.c                       |  4 +-
 xen/arch/x86/hvm/io.c                        |  4 +-
 xen/arch/x86/hvm/vmx/realmode.c              | 16 +++---
 xen/arch/x86/mm.c                            |  5 +-
 xen/arch/x86/mm/shadow/multi.c               |  4 +-
 xen/arch/x86/x86_emulate/x86_emulate.c       | 12 ++---
 xen/arch/x86/x86_emulate/x86_emulate.h       | 67 ++++++++++++++++++-----
 xen/include/asm-x86/hvm/emulate.h            |  3 --
 10 files changed, 86 insertions(+), 111 deletions(-)

diff --git a/tools/tests/x86_emulator/test_x86_emulator.c b/tools/tests/x86_emulator/test_x86_emulator.c
index 948ee8d..146e15e 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -1,3 +1,4 @@
+#include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 790e9c1..c0fbde1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -568,12 +568,9 @@ static int hvmemul_virtual_to_linear(
         return X86EMUL_UNHANDLEABLE;
 
     /* This is a singleton operation: fail it with an exception. */
-    hvmemul_ctxt->exn_pending = 1;
-    hvmemul_ctxt->trap.vector =
-        (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault;
-    hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
-    hvmemul_ctxt->trap.error_code = 0;
-    hvmemul_ctxt->trap.insn_len = 0;
+    x86_emul_hw_exception((seg == x86_seg_ss)
+                          ? TRAP_stack_error
+                          : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
     return X86EMUL_EXCEPTION;
 }
 
@@ -1562,59 +1559,6 @@ int hvmemul_cpuid(
     return X86EMUL_OKAY;
 }
 
-static int hvmemul_inject_hw_exception(
-    uint8_t vector,
-    int32_t error_code,
-    struct x86_emulate_ctxt *ctxt)
-{
-    struct hvm_emulate_ctxt *hvmemul_ctxt =
-        container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
-
-    hvmemul_ctxt->exn_pending = 1;
-    hvmemul_ctxt->trap.vector = vector;
-    hvmemul_ctxt->trap.type = X86_EVENTTYPE_HW_EXCEPTION;
-    hvmemul_ctxt->trap.error_code = error_code;
-    hvmemul_ctxt->trap.insn_len = 0;
-
-    return X86EMUL_OKAY;
-}
-
-static int hvmemul_inject_sw_interrupt(
-    enum x86_swint_type type,
-    uint8_t vector,
-    uint8_t insn_len,
-    struct x86_emulate_ctxt *ctxt)
-{
-    struct hvm_emulate_ctxt *hvmemul_ctxt =
-        container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
-
-    switch ( type )
-    {
-    case x86_swint_icebp:
-        hvmemul_ctxt->trap.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
-        break;
-
-    case x86_swint_int3:
-    case x86_swint_into:
-        hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_EXCEPTION;
-        break;
-
-    case x86_swint_int:
-        hvmemul_ctxt->trap.type = X86_EVENTTYPE_SW_INTERRUPT;
-        break;
-
-    default:
-        return X86EMUL_UNHANDLEABLE;
-    }
-
-    hvmemul_ctxt->exn_pending = 1;
-    hvmemul_ctxt->trap.vector = vector;
-    hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
-    hvmemul_ctxt->trap.insn_len = insn_len;
-
-    return X86EMUL_OKAY;
-}
-
 static int hvmemul_get_fpu(
     void (*exception_callback)(void *, struct cpu_user_regs *),
     void *exception_callback_arg,
@@ -1678,8 +1622,7 @@ static int hvmemul_invlpg(
          * hvmemul_virtual_to_linear() raises exceptions for type/limit
          * violations, so squash them.
          */
-        hvmemul_ctxt->exn_pending = 0;
-        hvmemul_ctxt->trap = (struct x86_event){};
+        x86_emul_reset_event(ctxt);
         rc = X86EMUL_OKAY;
     }
 
@@ -1696,7 +1639,7 @@ static int hvmemul_vmfunc(
 
     rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
     if ( rc != X86EMUL_OKAY )
-        hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
+        x86_emul_hw_exception(TRAP_invalid_op, 0, ctxt);
 
     return rc;
 }
@@ -1720,8 +1663,6 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .write_msr     = hvmemul_write_msr,
     .wbinvd        = hvmemul_wbinvd,
     .cpuid         = hvmemul_cpuid,
-    .inject_hw_exception = hvmemul_inject_hw_exception,
-    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
     .invlpg        = hvmemul_invlpg,
@@ -1747,8 +1688,6 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
     .write_msr     = hvmemul_write_msr_discard,
     .wbinvd        = hvmemul_wbinvd_discard,
     .cpuid         = hvmemul_cpuid,
-    .inject_hw_exception = hvmemul_inject_hw_exception,
-    .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
     .invlpg        = hvmemul_invlpg,
@@ -1867,8 +1806,8 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
         hvm_dump_emulation_state(XENLOG_G_WARNING "MMCFG", &ctxt);
         break;
     case X86EMUL_EXCEPTION:
-        if ( ctxt.exn_pending )
-            hvm_inject_event(&ctxt.trap);
+        if ( ctxt.ctxt.event_pending )
+            hvm_inject_event(&ctxt.ctxt.event);
         /* fallthrough */
     default:
         hvm_emulate_writeback(&ctxt);
@@ -1927,8 +1866,8 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
         hvm_inject_hw_exception(trapnr, errcode);
         break;
     case X86EMUL_EXCEPTION:
-        if ( ctx.exn_pending )
-            hvm_inject_event(&ctx.trap);
+        if ( ctx.ctxt.event_pending )
+            hvm_inject_event(&ctx.ctxt.event);
         break;
     }
 
@@ -2005,8 +1944,6 @@ void hvm_emulate_init_per_insn(
         hvmemul_ctxt->insn_buf_bytes = insn_bytes;
         memcpy(hvmemul_ctxt->insn_buf, insn_buf, insn_bytes);
     }
-
-    hvmemul_ctxt->exn_pending = 0;
 }
 
 void hvm_emulate_writeback(
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b950842..ef83100 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4076,8 +4076,8 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
         hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         break;
     case X86EMUL_EXCEPTION:
-        if ( ctxt.exn_pending )
-            hvm_inject_event(&ctxt.trap);
+        if ( ctxt.ctxt.event_pending )
+            hvm_inject_event(&ctxt.ctxt.event);
         /* fall through */
     default:
         hvm_emulate_writeback(&ctxt);
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 1279f68..abb9d51 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -102,8 +102,8 @@ int handle_mmio(void)
         hvm_dump_emulation_state(XENLOG_G_WARNING "MMIO", &ctxt);
         return 0;
     case X86EMUL_EXCEPTION:
-        if ( ctxt.exn_pending )
-            hvm_inject_event(&ctxt.trap);
+        if ( ctxt.ctxt.event_pending )
+            hvm_inject_event(&ctxt.ctxt.event);
         break;
     default:
         break;
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index 9002638..dc3ab44 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -122,7 +122,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 
     if ( rc == X86EMUL_EXCEPTION )
     {
-        if ( !hvmemul_ctxt->exn_pending )
+        if ( !hvmemul_ctxt->ctxt.event_pending )
         {
             unsigned long intr_info;
 
@@ -133,27 +133,27 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
                 gdprintk(XENLOG_ERR, "Exception pending but no info.\n");
                 goto fail;
             }
-            hvmemul_ctxt->trap.vector = (uint8_t)intr_info;
-            hvmemul_ctxt->trap.insn_len = 0;
+            hvmemul_ctxt->ctxt.event.vector = (uint8_t)intr_info;
+            hvmemul_ctxt->ctxt.event.insn_len = 0;
         }
 
         if ( unlikely(curr->domain->debugger_attached) &&
-             ((hvmemul_ctxt->trap.vector == TRAP_debug) ||
-              (hvmemul_ctxt->trap.vector == TRAP_int3)) )
+             ((hvmemul_ctxt->ctxt.event.vector == TRAP_debug) ||
+              (hvmemul_ctxt->ctxt.event.vector == TRAP_int3)) )
         {
             domain_pause_for_debugger();
         }
         else if ( curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE )
         {
             gdprintk(XENLOG_ERR, "Exception %02x in protected mode.\n",
-                     hvmemul_ctxt->trap.vector);
+                     hvmemul_ctxt->ctxt.event.vector);
             goto fail;
         }
         else
         {
             realmode_deliver_exception(
-                hvmemul_ctxt->trap.vector,
-                hvmemul_ctxt->trap.insn_len,
+                hvmemul_ctxt->ctxt.event.vector,
+                hvmemul_ctxt->ctxt.event.insn_len,
                 hvmemul_ctxt);
         }
     }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 9901f6f..66ecdab 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5377,7 +5377,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
     page_unlock(page);
     put_page(page);
 
-    if ( rc == X86EMUL_UNHANDLEABLE )
+    if ( rc == X86EMUL_UNHANDLEABLE || ptwr_ctxt.ctxt.event_pending )
         goto bail;
 
     perfc_incr(ptwr_emulations);
@@ -5501,7 +5501,8 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
     else
         rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
 
-    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
+    return ((rc != X86EMUL_UNHANDLEABLE && !ctxt.event_pending)
+            ? EXCRET_fault_fixed : 0);
 }
 
 void *alloc_xen_pagetable(void)
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index d70b1c6..84cb6b6 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3378,7 +3378,7 @@ static int sh_page_fault(struct vcpu *v,
      * would be a good unshadow hint. If we *do* decide to unshadow-on-fault
      * then it must be 'failable': we cannot require the unshadow to succeed.
      */
-    if ( r == X86EMUL_UNHANDLEABLE )
+    if ( r == X86EMUL_UNHANDLEABLE || emul_ctxt.ctxt.event_pending )
     {
         perfc_incr(shadow_fault_emulate_failed);
 #if SHADOW_OPTIMIZATIONS & SHOPT_FAST_EMULATION
@@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
             shadow_continue_emulation(&emul_ctxt, regs);
             v->arch.paging.last_write_was_pt = 0;
             r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
-            if ( r == X86EMUL_OKAY )
+            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )
             {
                 emulation_count++;
                 if ( v->arch.paging.last_write_was_pt )
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index f8271b3..768a436 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -680,9 +680,8 @@ static inline int mkec(uint8_t e, int32_t ec, ...)
 
 #define generate_exception_if(p, e, ec...)                                \
 ({  if ( (p) ) {                                                          \
-        fail_if(ops->inject_hw_exception == NULL);                        \
-        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \
-            ? : X86EMUL_EXCEPTION;                                        \
+        x86_emul_hw_exception(e, mkec(e, ##ec, 0), ctxt);                 \
+        rc = X86EMUL_EXCEPTION;                                           \
         goto done;                                                        \
     }                                                                     \
 })
@@ -1604,9 +1603,6 @@ static int inject_swint(enum x86_swint_type type,
 {
     int rc, error_code, fault_type = EXC_GP;
 
-    fail_if(ops->inject_sw_interrupt == NULL);
-    fail_if(ops->inject_hw_exception == NULL);
-
     /*
      * Without hardware support, injecting software interrupts/exceptions is
      * problematic.
@@ -1703,7 +1699,8 @@ static int inject_swint(enum x86_swint_type type,
         ctxt->regs->eip += insn_len;
     }
 
-    rc = ops->inject_sw_interrupt(type, vector, insn_len, ctxt);
+    x86_emul_software_event(type, vector, insn_len, ctxt);
+    rc = X86EMUL_OKAY;
 
  done:
     return rc;
@@ -1912,6 +1909,7 @@ x86_decode(
     /* Initialise output state in x86_emulate_ctxt */
     ctxt->opcode = ~0u;
     ctxt->retire.byte = 0;
+    x86_emul_reset_event(ctxt);
 
     op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
     if ( op_bytes == 8 )
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index 9df083e..ddcd93c 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -388,19 +388,6 @@ struct x86_emulate_ops
         unsigned int *edx,
         struct x86_emulate_ctxt *ctxt);
 
-    /* inject_hw_exception */
-    int (*inject_hw_exception)(
-        uint8_t vector,
-        int32_t error_code,
-        struct x86_emulate_ctxt *ctxt);
-
-    /* inject_sw_interrupt */
-    int (*inject_sw_interrupt)(
-        enum x86_swint_type type,
-        uint8_t vector,
-        uint8_t insn_len,
-        struct x86_emulate_ctxt *ctxt);
-
     /*
      * get_fpu: Load emulated environment's FPU state onto processor.
      *  @exn_callback: On any FPU or SIMD exception, pass control to
@@ -473,6 +460,9 @@ struct x86_emulate_ctxt
         } flags;
         uint8_t byte;
     } retire;
+
+    bool event_pending;
+    struct x86_event event;
 };
 
 /*
@@ -594,4 +584,55 @@ void x86_emulate_free_state(struct x86_emulate_state *state);
 
 #endif
 
+#ifndef ASSERT
+#define ASSERT assert
+#endif
+
+static inline void x86_emul_hw_exception(
+    unsigned int vector, unsigned int error_code, struct x86_emulate_ctxt *ctxt)
+{
+    ASSERT(!ctxt->event_pending);
+
+    ctxt->event.vector = vector;
+    ctxt->event.type = X86_EVENTTYPE_HW_EXCEPTION;
+    ctxt->event.error_code = error_code;
+
+    ctxt->event_pending = true;
+}
+
+static inline void x86_emul_software_event(
+    enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
+    struct x86_emulate_ctxt *ctxt)
+{
+    ASSERT(!ctxt->event_pending);
+
+    switch ( type )
+    {
+    case x86_swint_icebp:
+        ctxt->event.type = X86_EVENTTYPE_PRI_SW_EXCEPTION;
+        break;
+
+    case x86_swint_int3:
+    case x86_swint_into:
+        ctxt->event.type = X86_EVENTTYPE_SW_EXCEPTION;
+        break;
+
+    case x86_swint_int:
+        ctxt->event.type = X86_EVENTTYPE_SW_INTERRUPT;
+        break;
+    }
+
+    ctxt->event.vector = vector;
+    ctxt->event.error_code = X86_EVENT_NO_EC;
+    ctxt->event.insn_len = insn_len;
+
+    ctxt->event_pending = true;
+}
+
+static inline void x86_emul_reset_event(struct x86_emulate_ctxt *ctxt)
+{
+    ctxt->event_pending = false;
+    ctxt->event = (struct x86_event){};
+}
+
 #endif /* __X86_EMULATE_H__ */
diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
index 3b7ec33..d64d834 100644
--- a/xen/include/asm-x86/hvm/emulate.h
+++ b/xen/include/asm-x86/hvm/emulate.h
@@ -29,9 +29,6 @@ struct hvm_emulate_ctxt {
     unsigned long seg_reg_accessed;
     unsigned long seg_reg_dirty;
 
-    bool_t exn_pending;
-    struct x86_event trap;
-
     uint32_t intr_shadow;
 
     bool_t set_context;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 91+ messages in thread
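
For readers tracking the interface change above: the inject_hw_exception()
and inject_sw_interrupt() hooks give way to a single event latched in the
context.  Below is a minimal standalone model of that latch (not Xen code;
the names merely mirror the patch), showing that at most one event may be
recorded per emulation, and that the caller, rather than a callback,
decides how to deliver it.

    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>

    struct x86_event { unsigned int vector; int error_code; };
    struct emul_ctxt { bool event_pending; struct x86_event event; };

    /* Analogue of x86_emul_hw_exception(): record the event, don't inject. */
    static void emul_hw_exception(struct emul_ctxt *ctxt,
                                  unsigned int vector, int error_code)
    {
        /* One event per emulation, as the ASSERT() in the patch enforces. */
        assert(!ctxt->event_pending);
        ctxt->event.vector = vector;
        ctxt->event.error_code = error_code;
        ctxt->event_pending = true;
    }

    int main(void)
    {
        struct emul_ctxt ctxt = { 0 };

        emul_hw_exception(&ctxt, 13 /* #GP */, 0);

        /* The caller consumes the latched event after emulation returns. */
        if ( ctxt.event_pending )
            printf("deliver vector %u, error code %d\n",
                   ctxt.event.vector, ctxt.event.error_code);

        return 0;
    }

Turning fault delivery into an explicit caller decision is what lets the
later patches in this series stop faults from being raised behind the
emulator's back.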

* [PATCH 07/15] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register()
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (5 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 06/15] x86/emul: Rework emulator event injection Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-24  6:20   ` Tian, Kevin
  2016-11-23 15:38 ` [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima

No functional change at this point, but this is a prerequisite for forthcoming
functional changes.

Make vmx_get_segment_register() private to vmx.c, like all the other vendor
get/set functions.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/vmx/vmx.c        | 14 +++++++-------
 xen/arch/x86/hvm/vmx/vvmx.c       |  6 +++---
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 --
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index eb7c902..29c6088 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -940,8 +940,8 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
         .fields = { .type = 0xb, .s = 0, .dpl = 0, .p = 1, .avl = 0,    \
                     .l = 0, .db = 0, .g = 0, .pad = 0 } }).bytes)
 
-void vmx_get_segment_register(struct vcpu *v, enum x86_segment seg,
-                              struct segment_register *reg)
+static void vmx_get_segment_register(struct vcpu *v, enum x86_segment seg,
+                                     struct segment_register *reg)
 {
     unsigned long attr = 0, sel = 0, limit;
 
@@ -1504,19 +1504,19 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
              * Need to read them all either way, as realmode reads can update
              * the saved values we'll use when returning to prot mode. */
             for ( s = 0; s < ARRAY_SIZE(reg); s++ )
-                vmx_get_segment_register(v, s, &reg[s]);
+                hvm_get_segment_register(v, s, &reg[s]);
             v->arch.hvm_vmx.vmx_realmode = realmode;
             
             if ( realmode )
             {
                 for ( s = 0; s < ARRAY_SIZE(reg); s++ )
-                    vmx_set_segment_register(v, s, &reg[s]);
+                    hvm_set_segment_register(v, s, &reg[s]);
             }
             else 
             {
                 for ( s = 0; s < ARRAY_SIZE(reg); s++ )
                     if ( !(v->arch.hvm_vmx.vm86_segment_mask & (1<<s)) )
-                        vmx_set_segment_register(
+                        hvm_set_segment_register(
                             v, s, &v->arch.hvm_vmx.vm86_saved_seg[s]);
             }
 
@@ -3907,7 +3907,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
             gdprintk(XENLOG_WARNING, "Bad vmexit (reason %#lx)\n",
                      exit_reason);
 
-            vmx_get_segment_register(v, x86_seg_ss, &ss);
+            hvm_get_segment_register(v, x86_seg_ss, &ss);
             if ( ss.attr.fields.dpl )
                 hvm_inject_hw_exception(TRAP_invalid_op,
                                         X86_EVENT_NO_EC);
@@ -3939,7 +3939,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 
         gprintk(XENLOG_WARNING, "Bad rIP %lx for mode %u\n", regs->rip, mode);
 
-        vmx_get_segment_register(v, x86_seg_ss, &ss);
+        hvm_get_segment_register(v, x86_seg_ss, &ss);
         if ( ss.attr.fields.dpl )
         {
             __vmread(VM_ENTRY_INTR_INFO, &intr_info);
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index efaf54c..bcc4a97 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -360,7 +360,7 @@ static int vmx_inst_check_privilege(struct cpu_user_regs *regs, int vmxop_check)
     else if ( !vcpu_2_nvmx(v).vmxon_region_pa )
         goto invalid_op;
 
-    vmx_get_segment_register(v, x86_seg_cs, &cs);
+    hvm_get_segment_register(v, x86_seg_cs, &cs);
 
     if ( (regs->eflags & X86_EFLAGS_VM) ||
          (hvm_long_mode_enabled(v) && cs.attr.fields.l == 0) )
@@ -419,13 +419,13 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
 
         if ( hvm_long_mode_enabled(v) )
         {
-            vmx_get_segment_register(v, x86_seg_cs, &seg);
+            hvm_get_segment_register(v, x86_seg_cs, &seg);
             mode_64bit = seg.attr.fields.l;
         }
 
         if ( info.fields.segment > VMX_SREG_GS )
             goto gp_fault;
-        vmx_get_segment_register(v, sreg_to_index[info.fields.segment], &seg);
+        hvm_get_segment_register(v, sreg_to_index[info.fields.segment], &seg);
         seg_base = seg.base;
 
         base = info.fields.base_reg_invalid ? 0 :
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 4cdd9b1..0e5902d 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -550,8 +550,6 @@ static inline int __vmxon(u64 addr)
     return rc;
 }
 
-void vmx_get_segment_register(struct vcpu *, enum x86_segment,
-                              struct segment_register *);
 void vmx_inject_extint(int trap, uint8_t source);
 void vmx_inject_nmi(void);
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (6 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 07/15] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 19:01   ` Boris Ostrovsky
                     ` (2 more replies)
  2016-11-23 15:38 ` [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
                   ` (6 subsequent siblings)
  14 siblings, 3 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Jun Nakajima,
	Boris Ostrovsky, Suravee Suthikulpanit

Intel VT-x and AMD SVM provide access to the full segment descriptor cache via
fields in the VMCB/VMCS.  However, the bits which are actually checked by
hardware and preserved across vmentry/exit are inconsistent, and the vendor
accessor functions perform inconsistent modification to the raw values.

Convert {svm,vmx}_{get,set}_segment_register() into raw accessors, and alter
hvm_{get,set}_segment_register() to cook the values consistently.  This allows
the common emulation code to better rely on finding architecturally-expected
values.

This does cause some functional changes, because the modifications are now
applied uniformly.  As a side effect, this fixes latent bugs:
vmx_set_segment_register() didn't correctly fix up .G for segments, and the
GDTR/IDTR limits were fixed up inconsistently.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 xen/arch/x86/hvm/hvm.c        | 151 ++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c    |  20 +-----
 xen/arch/x86/hvm/vmx/vmx.c    |   6 +-
 xen/include/asm-x86/desc.h    |   6 ++
 xen/include/asm-x86/hvm/hvm.h |  17 ++---
 5 files changed, 164 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ef83100..804cd88 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6051,6 +6051,157 @@ void hvm_domain_soft_reset(struct domain *d)
 }
 
 /*
+ * Segment caches in VMCB/VMCS are inconsistent about which bits are checked,
+ * important, and preserved across vmentry/exit.  Cook the values to make them
+ * closer to what is architecturally expected from entries in the segment
+ * cache.
+ */
+void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg)
+{
+    hvm_funcs.get_segment_register(v, seg, reg);
+
+    switch ( seg )
+    {
+    case x86_seg_ss:
+        /* SVM may retain %ss.DB when %ss is loaded with a NULL selector. */
+        if ( !reg->attr.fields.p )
+            reg->attr.fields.db = 0;
+        break;
+
+    case x86_seg_tr:
+        /*
+         * SVM doesn't track %tr.B. Architecturally, a loaded TSS segment will
+         * always be busy.
+         */
+        reg->attr.fields.type |= 0x2;
+
+        /*
+         * %cs and %tr are unconditionally present.  SVM ignores these present
+         * bits and will happily run without them set.
+         */
+    case x86_seg_cs:
+        reg->attr.fields.p = 1;
+        break;
+
+    case x86_seg_gdtr:
+    case x86_seg_idtr:
+        /*
+         * Treat GDTR/IDTR as being present system segments.  This avoids them
+         * needing special casing for segmentation checks.
+         */
+        reg->attr.bytes = 0x80;
+        break;
+
+    default: /* Avoid triggering -Werror=switch */
+        break;
+    }
+
+    if ( reg->attr.fields.p )
+    {
+        /*
+         * For segments which are present/usable, cook the system flag.  SVM
+         * ignores the S bit on all segments and will happily run with them in
+         * any state.
+         */
+        reg->attr.fields.s = is_x86_user_segment(seg);
+
+        /*
+         * SVM discards %cs.G on #VMEXIT.  Other user segments do have .G
+         * tracked, but Linux commit 80112c89ed87 "KVM: Synthesize G bit for
+         * all segments." indicates that this isn't necessarily the case when
+         * nested under ESXi.
+         *
+         * Unconditionally recalculate G.
+         */
+        reg->attr.fields.g = !!(reg->limit >> 20);
+
+        /*
+         * SVM doesn't track the Accessed flag.  It will always be set for
+         * usable user segments loaded into the descriptor cache.
+         */
+        if ( is_x86_user_segment(seg) )
+            reg->attr.fields.type |= 0x1;
+    }
+}
+
+void hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg)
+{
+    /* Set G to match the limit field.  VT-x cares, while SVM doesn't. */
+    if ( reg->attr.fields.p )
+        reg->attr.fields.g = !!(reg->limit >> 20);
+
+    switch ( seg )
+    {
+    case x86_seg_cs:
+        ASSERT(reg->attr.fields.p);                  /* Usable. */
+        ASSERT(reg->attr.fields.s);                  /* User segment. */
+        ASSERT((reg->base >> 32) == 0);              /* Upper bits clear. */
+        break;
+
+    case x86_seg_ss:
+        if ( reg->attr.fields.p )
+        {
+            ASSERT(reg->attr.fields.s);              /* User segment. */
+            ASSERT(!(reg->attr.fields.type & 0x8));  /* Data segment. */
+            ASSERT(reg->attr.fields.type & 0x2);     /* Writeable. */
+            ASSERT((reg->base >> 32) == 0);          /* Upper bits clear. */
+        }
+        break;
+
+    case x86_seg_ds:
+    case x86_seg_es:
+    case x86_seg_fs:
+    case x86_seg_gs:
+        if ( reg->attr.fields.p )
+        {
+            ASSERT(reg->attr.fields.s);              /* User segment. */
+
+            if ( reg->attr.fields.type & 0x8 )
+                ASSERT(reg->attr.fields.type & 0x2); /* Readable. */
+
+            if ( seg == x86_seg_fs || seg == x86_seg_gs )
+                ASSERT(is_canonical_address(reg->base));
+            else
+                ASSERT((reg->base >> 32) == 0);      /* Upper bits clear. */
+        }
+        break;
+
+    case x86_seg_tr:
+        ASSERT(reg->attr.fields.p);                  /* Usable. */
+        ASSERT(!reg->attr.fields.s);                 /* System segment. */
+        ASSERT(!(reg->sel & 0x4));                   /* !TI. */
+        ASSERT(reg->attr.fields.type == SYS_DESC_tss16_busy ||
+               reg->attr.fields.type == SYS_DESC_tss_busy);
+        ASSERT(is_canonical_address(reg->base));
+        break;
+
+    case x86_seg_ldtr:
+        if ( reg->attr.fields.p )
+        {
+            ASSERT(!reg->attr.fields.s);             /* System segment. */
+            ASSERT(!(reg->sel & 0x4));               /* !TI. */
+            ASSERT(reg->attr.fields.type == SYS_DESC_ldt);
+            ASSERT(is_canonical_address(reg->base));
+        }
+        break;
+
+    case x86_seg_gdtr:
+    case x86_seg_idtr:
+        ASSERT(is_canonical_address(reg->base));
+        ASSERT((reg->limit >> 16) == 0);             /* Upper bits clear. */
+        break;
+
+    default:
+        ASSERT_UNREACHABLE();
+        break;
+    }
+
+    hvm_funcs.set_segment_register(v, seg, reg);
+}
+
+/*
  * Local variables:
  * mode: C
  * c-file-style: "BSD"
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index fb4fd0b..c3c759f 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -627,50 +627,34 @@ static void svm_get_segment_register(struct vcpu *v, enum x86_segment seg,
     {
     case x86_seg_cs:
         memcpy(reg, &vmcb->cs, sizeof(*reg));
-        reg->attr.fields.p = 1;
-        reg->attr.fields.g = reg->limit > 0xFFFFF;
         break;
     case x86_seg_ds:
         memcpy(reg, &vmcb->ds, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_es:
         memcpy(reg, &vmcb->es, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_fs:
         svm_sync_vmcb(v);
         memcpy(reg, &vmcb->fs, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_gs:
         svm_sync_vmcb(v);
         memcpy(reg, &vmcb->gs, sizeof(*reg));
-        if ( reg->attr.fields.type != 0 )
-            reg->attr.fields.type |= 0x1;
         break;
     case x86_seg_ss:
         memcpy(reg, &vmcb->ss, sizeof(*reg));
         reg->attr.fields.dpl = vmcb->_cpl;
-        if ( reg->attr.fields.type == 0 )
-            reg->attr.fields.db = 0;
         break;
     case x86_seg_tr:
         svm_sync_vmcb(v);
         memcpy(reg, &vmcb->tr, sizeof(*reg));
-        reg->attr.fields.p = 1;
-        reg->attr.fields.type |= 0x2;
         break;
     case x86_seg_gdtr:
         memcpy(reg, &vmcb->gdtr, sizeof(*reg));
-        reg->attr.bytes = 0x80;
         break;
     case x86_seg_idtr:
         memcpy(reg, &vmcb->idtr, sizeof(*reg));
-        reg->attr.bytes = 0x80;
         break;
     case x86_seg_ldtr:
         svm_sync_vmcb(v);
@@ -740,11 +724,11 @@ static void svm_set_segment_register(struct vcpu *v, enum x86_segment seg,
         break;
     case x86_seg_gdtr:
         vmcb->gdtr.base = reg->base;
-        vmcb->gdtr.limit = (uint16_t)reg->limit;
+        vmcb->gdtr.limit = reg->limit;
         break;
     case x86_seg_idtr:
         vmcb->idtr.base = reg->base;
-        vmcb->idtr.limit = (uint16_t)reg->limit;
+        vmcb->idtr.limit = reg->limit;
         break;
     case x86_seg_ldtr:
         memcpy(&vmcb->ldtr, reg, sizeof(*reg));
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 29c6088..0fd4b5c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1126,9 +1126,6 @@ static void vmx_set_segment_register(struct vcpu *v, enum x86_segment seg,
      */
     attr = (!(attr & (1u << 7)) << 16) | ((attr & 0xf00) << 4) | (attr & 0xff);
 
-    /* VMX has strict consistency requirement for flag G. */
-    attr |= !!(limit >> 20) << 15;
-
     vmx_vmcs_enter(v);
 
     switch ( seg )
@@ -1173,8 +1170,7 @@ static void vmx_set_segment_register(struct vcpu *v, enum x86_segment seg,
         __vmwrite(GUEST_TR_SELECTOR, sel);
         __vmwrite(GUEST_TR_LIMIT, limit);
         __vmwrite(GUEST_TR_BASE, base);
-        /* VMX checks that the the busy flag (bit 1) is set. */
-        __vmwrite(GUEST_TR_AR_BYTES, attr | 2);
+        __vmwrite(GUEST_TR_AR_BYTES, attr);
         break;
     case x86_seg_gdtr:
         __vmwrite(GUEST_GDTR_LIMIT, limit);
diff --git a/xen/include/asm-x86/desc.h b/xen/include/asm-x86/desc.h
index 0e2d97f..da924bf 100644
--- a/xen/include/asm-x86/desc.h
+++ b/xen/include/asm-x86/desc.h
@@ -89,7 +89,13 @@
 #ifndef __ASSEMBLY__
 
 /* System Descriptor types for GDT and IDT entries. */
+#define SYS_DESC_tss16_avail  1
 #define SYS_DESC_ldt          2
+#define SYS_DESC_tss16_busy   3
+#define SYS_DESC_call_gate16  4
+#define SYS_DESC_task_gate    5
+#define SYS_DESC_irq_gate16   6
+#define SYS_DESC_trap_gate16  7
 #define SYS_DESC_tss_avail    9
 #define SYS_DESC_tss_busy     11
 #define SYS_DESC_call_gate    12
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 51a64f7..b37b335 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -358,19 +358,10 @@ static inline void hvm_flush_guest_tlbs(void)
 void hvm_hypercall_page_initialise(struct domain *d,
                                    void *hypercall_page);
 
-static inline void
-hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
-                         struct segment_register *reg)
-{
-    hvm_funcs.get_segment_register(v, seg, reg);
-}
-
-static inline void
-hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
-                         struct segment_register *reg)
-{
-    hvm_funcs.set_segment_register(v, seg, reg);
-}
+void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg);
+void hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
+                              struct segment_register *reg);
 
 static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 91+ messages in thread
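
The granularity recalculation in hvm_get_segment_register() above is worth
seeing in isolation.  Here is a compilable toy model (the struct is a
stand-in, not Xen's segment_register): a limit of 1MB or more cannot be
expressed with byte granularity, so G is derived purely from the limit,
exactly as reg->attr.fields.g = !!(reg->limit >> 20) does in the patch.

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct seg {
        uint32_t limit;
        bool p;            /* present/usable */
        bool g;            /* granularity */
    };

    /* Recalculate G unconditionally for present segments. */
    static void cook_g(struct seg *reg)
    {
        if ( reg->p )
            reg->g = !!(reg->limit >> 20);
    }

    int main(void)
    {
        struct seg flat  = { .limit = 0xffffffff, .p = true };
        struct seg small = { .limit = 0x000fffff, .p = true, .g = true };

        cook_g(&flat);
        cook_g(&small);

        assert(flat.g);    /* flat 4GB segment: page granular */
        assert(!small.g);  /* limit below 1MB: byte granular */
        return 0;
    }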

* [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (7 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:31   ` Tim Deegan
  2016-11-23 15:38 ` [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: George Dunlap, Andrew Cooper, Paul Durrant, Tim Deegan, Jan Beulich

Introduce a new x86_emul_pagefault() similar to x86_emul_hw_exception(), and
use this instead of hvm_inject_page_fault() from emulation codepaths.

Replace one hvm_inject_hw_exception() in the shadow emulation codepaths.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>

NOTE: this is a functional change for the shadow code, as a #PF which was
previously raised directly to the guest will now cause X86EMUL_UNHANDLEABLE.
It is my understanding after a discussion with Tim that this is ok, but I
haven't done extensive testing yet.
---
 xen/arch/x86/hvm/emulate.c             |  4 ++--
 xen/arch/x86/mm/shadow/common.c        |  5 +++--
 xen/arch/x86/x86_emulate/x86_emulate.h | 13 +++++++++++++
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index c0fbde1..3ebb200 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -459,7 +459,7 @@ static int hvmemul_linear_to_phys(
     {
         if ( pfec & (PFEC_page_paged | PFEC_page_shared) )
             return X86EMUL_RETRY;
-        hvm_inject_page_fault(pfec, addr);
+        x86_emul_pagefault(pfec, addr, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     }
 
@@ -483,7 +483,7 @@ static int hvmemul_linear_to_phys(
                 ASSERT(!reverse);
                 if ( npfn != gfn_x(INVALID_GFN) )
                     return X86EMUL_UNHANDLEABLE;
-                hvm_inject_page_fault(pfec, addr & PAGE_MASK);
+                x86_emul_pagefault(pfec, addr & PAGE_MASK, &hvmemul_ctxt->ctxt);
                 return X86EMUL_EXCEPTION;
             }
             *reps = done;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 6f6668d..c8b61b9 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -162,8 +162,9 @@ static int hvm_translate_linear_addr(
 
     if ( !okay )
     {
-        hvm_inject_hw_exception(
-            (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault, 0);
+        x86_emul_hw_exception(
+            (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
+            0, &sh_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     }
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index ddcd93c..cc26e9d 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -600,6 +600,19 @@ static inline void x86_emul_hw_exception(
     ctxt->event_pending = true;
 }
 
+static inline void x86_emul_pagefault(
+    unsigned int error_code, unsigned long cr2, struct x86_emulate_ctxt *ctxt)
+{
+    ASSERT(!ctxt->event_pending);
+
+    ctxt->event.vector = 14; /* TRAP_page_fault */
+    ctxt->event.type = X86_EVENTTYPE_HW_EXCEPTION;
+    ctxt->event.error_code = error_code;
+    ctxt->event.cr2 = cr2;
+
+    ctxt->event_pending = true;
+}
+
 static inline void x86_emul_software_event(
     enum x86_swint_type type, uint8_t vector, uint8_t insn_len,
     struct x86_emulate_ctxt *ctxt)
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 91+ messages in thread
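
As a usage sketch for x86_emul_pagefault() (illustrative only; is_write
and guest_cpl are hypothetical stand-ins for state which real call sites
such as hvmemul_linear_to_phys() already have to hand): the caller
composes the architectural #PF error code and latches the fault instead
of injecting it directly.

    uint32_t pfec = PFEC_page_present;  /* fault on a present mapping */

    if ( is_write )                     /* hypothetical: access is a write */
        pfec |= PFEC_write_access;
    if ( guest_cpl == 3 )               /* hypothetical: user-mode access */
        pfec |= PFEC_user_mode;

    x86_emul_pagefault(pfec, addr, &hvmemul_ctxt->ctxt);
    return X86EMUL_EXCEPTION;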

* [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (8 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:32   ` Tim Deegan
                     ` (2 more replies)
  2016-11-23 15:38 ` [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
                   ` (4 subsequent siblings)
  14 siblings, 3 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Kevin Tian, Paul Durrant, Tim Deegan, Jun Nakajima

which is filled with pagefault information should one occur.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/emulate.c        |  8 ++++---
 xen/arch/x86/hvm/hvm.c            | 49 +++++++++++++++++++++++++--------------
 xen/arch/x86/hvm/vmx/vvmx.c       |  9 ++++---
 xen/arch/x86/mm/shadow/common.c   |  5 ++--
 xen/include/asm-x86/hvm/support.h | 23 +++++++++++++-----
 5 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 3ebb200..e50ee24 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -770,6 +770,7 @@ static int __hvmemul_read(
     struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
+    pagefault_info_t pfinfo;
     unsigned long addr, reps = 1;
     uint32_t pfec = PFEC_page_present;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
@@ -790,8 +791,8 @@ static int __hvmemul_read(
         pfec |= PFEC_user_mode;
 
     rc = ((access_type == hvm_access_insn_fetch) ?
-          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec) :
-          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec));
+          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo) :
+          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo));
 
     switch ( rc )
     {
@@ -878,6 +879,7 @@ static int hvmemul_write(
     struct hvm_emulate_ctxt *hvmemul_ctxt =
         container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
     struct vcpu *curr = current;
+    pagefault_info_t pfinfo;
     unsigned long addr, reps = 1;
     uint32_t pfec = PFEC_page_present | PFEC_write_access;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
@@ -896,7 +898,7 @@ static int hvmemul_write(
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
         pfec |= PFEC_user_mode;
 
-    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec);
+    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec, &pfinfo);
 
     switch ( rc )
     {
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 804cd88..afba51f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2859,6 +2859,7 @@ void hvm_task_switch(
     struct desc_struct *optss_desc = NULL, *nptss_desc = NULL, tss_desc;
     bool_t otd_writable, ntd_writable;
     unsigned long eflags;
+    pagefault_info_t pfinfo;
     int exn_raised, rc;
     struct {
         u16 back_link,__blh;
@@ -2925,7 +2926,7 @@ void hvm_task_switch(
     }
 
     rc = hvm_copy_from_guest_virt(
-        &tss, prev_tr.base, sizeof(tss), PFEC_page_present);
+        &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
 
@@ -2963,12 +2964,12 @@ void hvm_task_switch(
                                 &tss.eip,
                                 offsetof(typeof(tss), trace) -
                                 offsetof(typeof(tss), eip),
-                                PFEC_page_present);
+                                PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
 
     rc = hvm_copy_from_guest_virt(
-        &tss, tr.base, sizeof(tss), PFEC_page_present);
+        &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     /*
      * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
      * functions knew we want RO access.
@@ -3008,7 +3009,8 @@ void hvm_task_switch(
         tss.back_link = prev_tr.sel;
 
         rc = hvm_copy_to_guest_virt(tr.base + offsetof(typeof(tss), back_link),
-                                    &tss.back_link, sizeof(tss.back_link), 0);
+                                    &tss.back_link, sizeof(tss.back_link), 0,
+                                    &pfinfo);
         if ( rc == HVMCOPY_bad_gva_to_gfn )
             exn_raised = 1;
         else if ( rc != HVMCOPY_okay )
@@ -3045,7 +3047,8 @@ void hvm_task_switch(
                                         16 << segr.attr.fields.db,
                                         &linear_addr) )
         {
-            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0);
+            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0,
+                                        &pfinfo);
             if ( rc == HVMCOPY_bad_gva_to_gfn )
                 exn_raised = 1;
             else if ( rc != HVMCOPY_okay )
@@ -3068,7 +3071,8 @@ void hvm_task_switch(
 #define HVMCOPY_phys       (0u<<2)
 #define HVMCOPY_virt       (1u<<2)
 static enum hvm_copy_result __hvm_copy(
-    void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec)
+    void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     struct vcpu *curr = current;
     unsigned long gfn;
@@ -3109,7 +3113,15 @@ static enum hvm_copy_result __hvm_copy(
                 if ( pfec & PFEC_page_shared )
                     return HVMCOPY_gfn_shared;
                 if ( flags & HVMCOPY_fault )
+                {
+                    if ( pfinfo )
+                    {
+                        pfinfo->linear = addr;
+                        pfinfo->ec = pfec;
+                    }
+
                     hvm_inject_page_fault(pfec, addr);
+                }
                 return HVMCOPY_bad_gva_to_gfn;
             }
             gpa |= (paddr_t)gfn << PAGE_SHIFT;
@@ -3279,7 +3291,7 @@ enum hvm_copy_result hvm_copy_to_guest_phys(
 {
     return __hvm_copy(buf, paddr, size,
                       HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0);
+                      0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_phys(
@@ -3287,31 +3299,34 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
 {
     return __hvm_copy(buf, paddr, size,
                       HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0);
+                      0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec)
+    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_write_access | pfec);
+                      PFEC_page_present | PFEC_write_access | pfec, pfinfo);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
-                      PFEC_page_present | pfec);
+                      PFEC_page_present | pfec, pfinfo);
 }
 
 enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_insn_fetch | pfec);
+                      PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
 }
 
 enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
@@ -3319,7 +3334,7 @@ enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_to_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_write_access | pfec);
+                      PFEC_page_present | PFEC_write_access | pfec, NULL);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
@@ -3327,7 +3342,7 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | pfec);
+                      PFEC_page_present | pfec, NULL);
 }
 
 enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
@@ -3335,7 +3350,7 @@ enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
 {
     return __hvm_copy(buf, vaddr, size,
                       HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_insn_fetch | pfec);
+                      PFEC_page_present | PFEC_insn_fetch | pfec, NULL);
 }
 
 unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index bcc4a97..7342d12 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -396,6 +396,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
     struct vcpu *v = current;
     union vmx_inst_info info;
     struct segment_register seg;
+    pagefault_info_t pfinfo;
     unsigned long base, index, seg_base, disp, offset;
     int scale, size;
 
@@ -451,7 +452,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
             goto gp_fault;
 
         if ( poperandS != NULL &&
-             hvm_copy_from_guest_virt(poperandS, base, size, 0)
+             hvm_copy_from_guest_virt(poperandS, base, size, 0, &pfinfo)
                   != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         decode->mem = base;
@@ -1611,6 +1612,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
     struct vcpu *v = current;
     struct vmx_inst_decoded decode;
     struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
+    pagefault_info_t pfinfo;
     unsigned long gpa = 0;
     int rc;
 
@@ -1620,7 +1622,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
 
     gpa = nvcpu->nv_vvmcxaddr;
 
-    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0);
+    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0, &pfinfo);
     if ( rc != HVMCOPY_okay )
         return X86EMUL_EXCEPTION;
 
@@ -1679,6 +1681,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
     struct vmx_inst_decoded decode;
+    pagefault_info_t pfinfo;
     u64 value = 0;
     int rc;
 
@@ -1690,7 +1693,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
 
     switch ( decode.type ) {
     case VMX_INST_MEMREG_TYPE_MEMORY:
-        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0);
+        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0, &pfinfo);
         if ( rc != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         break;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index c8b61b9..d28eae1 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -179,6 +179,7 @@ hvm_read(enum x86_segment seg,
          enum hvm_access_type access_type,
          struct sh_emulate_ctxt *sh_ctxt)
 {
+    pagefault_info_t pfinfo;
     unsigned long addr;
     int rc;
 
@@ -188,9 +189,9 @@ hvm_read(enum x86_segment seg,
         return rc;
 
     if ( access_type == hvm_access_insn_fetch )
-        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0);
+        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
     else
-        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0);
+        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
 
     switch ( rc )
     {
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 9938450..4aa5a36 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -83,16 +83,27 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
  *  HVMCOPY_bad_gfn_to_mfn: Some guest physical address did not map to
  *                          ordinary machine memory.
  *  HVMCOPY_bad_gva_to_gfn: Some guest virtual address did not have a valid
- *                          mapping to a guest physical address. In this case
- *                          a page fault exception is automatically queued
- *                          for injection into the current HVM VCPU.
+ *                          mapping to a guest physical address.  The
+ *                          pagefault_info_t structure will be filled in if
+ *                          provided, and a page fault exception is
+ *                          automatically queued for injection into the
+ *                          current HVM VCPU.
  */
+typedef struct pagefault_info
+{
+    unsigned long linear;
+    int ec;
+} pagefault_info_t;
+
 enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec);
+    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo);
 enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo);
 enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
+    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+    pagefault_info_t *pfinfo);
 
 /*
  * As above (copy to/from a guest virtual address), but no fault is generated
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 91+ messages in thread
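
A sketch of where the new out-parameter is heading (hypothetical caller;
at this point in the series __hvm_copy() still injects the fault itself):
an emulation path can recover the faulting linear address and error code
and forward them through the x86_emul_pagefault() latch from patch 09.

    pagefault_info_t pfinfo;
    enum hvm_copy_result rc;

    rc = hvm_copy_from_guest_virt(buf, addr, bytes, pfec, &pfinfo);
    if ( rc == HVMCOPY_bad_gva_to_gfn )
    {
        /* pfinfo.linear and pfinfo.ec were filled in by __hvm_copy(). */
        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
        return X86EMUL_EXCEPTION;
    }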

* [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (9 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:35   ` Tim Deegan
  2016-11-23 15:38 ` [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Paul Durrant, Tim Deegan

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
---
 xen/arch/x86/hvm/emulate.c        |  6 ++---
 xen/arch/x86/hvm/hvm.c            | 56 +++++++++------------------------------
 xen/arch/x86/mm/shadow/common.c   |  8 +++---
 xen/include/asm-x86/hvm/support.h | 11 --------
 4 files changed, 19 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index e50ee24..34c9d77 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1936,9 +1936,9 @@ void hvm_emulate_init_per_insn(
                                         hvm_access_insn_fetch,
                                         hvmemul_ctxt->ctxt.addr_size,
                                         &addr) &&
-             hvm_fetch_from_guest_virt_nofault(hvmemul_ctxt->insn_buf, addr,
-                                               sizeof(hvmemul_ctxt->insn_buf),
-                                               pfec) == HVMCOPY_okay) ?
+             hvm_fetch_from_guest_virt(hvmemul_ctxt->insn_buf, addr,
+                                       sizeof(hvmemul_ctxt->insn_buf),
+                                       pfec, NULL) == HVMCOPY_okay) ?
             sizeof(hvmemul_ctxt->insn_buf) : 0;
     }
     else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index afba51f..6dfe81b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3066,8 +3066,6 @@ void hvm_task_switch(
 
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
-#define HVMCOPY_no_fault   (0u<<1)
-#define HVMCOPY_fault      (1u<<1)
 #define HVMCOPY_phys       (0u<<2)
 #define HVMCOPY_virt       (1u<<2)
 static enum hvm_copy_result __hvm_copy(
@@ -3112,13 +3110,10 @@ static enum hvm_copy_result __hvm_copy(
                     return HVMCOPY_gfn_paged_out;
                 if ( pfec & PFEC_page_shared )
                     return HVMCOPY_gfn_shared;
-                if ( flags & HVMCOPY_fault )
+                if ( pfinfo )
                 {
-                    if ( pfinfo )
-                    {
-                        pfinfo->linear = addr;
-                        pfinfo->ec = pfec;
-                    }
+                    pfinfo->linear = addr;
+                    pfinfo->ec = pfec;
 
                     hvm_inject_page_fault(pfec, addr);
                 }
@@ -3290,16 +3285,14 @@ enum hvm_copy_result hvm_copy_to_guest_phys(
     paddr_t paddr, void *buf, int size)
 {
     return __hvm_copy(buf, paddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0, NULL);
+                      HVMCOPY_to_guest | HVMCOPY_phys, 0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_from_guest_phys(
     void *buf, paddr_t paddr, int size)
 {
     return __hvm_copy(buf, paddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_phys,
-                      0, NULL);
+                      HVMCOPY_from_guest | HVMCOPY_phys, 0, NULL);
 }
 
 enum hvm_copy_result hvm_copy_to_guest_virt(
@@ -3307,7 +3300,7 @@ enum hvm_copy_result hvm_copy_to_guest_virt(
     pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_virt,
+                      HVMCOPY_to_guest | HVMCOPY_virt,
                       PFEC_page_present | PFEC_write_access | pfec, pfinfo);
 }
 
@@ -3316,7 +3309,7 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
     pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
+                      HVMCOPY_from_guest | HVMCOPY_virt,
                       PFEC_page_present | pfec, pfinfo);
 }
 
@@ -3325,34 +3318,10 @@ enum hvm_copy_result hvm_fetch_from_guest_virt(
     pagefault_info_t *pfinfo)
 {
     return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
+                      HVMCOPY_from_guest | HVMCOPY_virt,
                       PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
 }
 
-enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec)
-{
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_write_access | pfec, NULL);
-}
-
-enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
-{
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | pfec, NULL);
-}
-
-enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec)
-{
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-                      PFEC_page_present | PFEC_insn_fetch | pfec, NULL);
-}
-
 unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
 {
     int rc;
@@ -3364,8 +3333,7 @@ unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
         return 0;
     }
 
-    rc = hvm_copy_to_guest_virt_nofault((unsigned long)to, (void *)from,
-                                        len, 0);
+    rc = hvm_copy_to_guest_virt((unsigned long)to, (void *)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_to_user() return code */
 }
 
@@ -3395,7 +3363,7 @@ unsigned long copy_from_user_hvm(void *to, const void *from, unsigned len)
         return 0;
     }
 
-    rc = hvm_copy_from_guest_virt_nofault(to, (unsigned long)from, len, 0);
+    rc = hvm_copy_from_guest_virt(to, (unsigned long)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_from_user() return code */
 }
 
@@ -4070,8 +4038,8 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
                                         (hvm_long_mode_enabled(cur) &&
                                          cs->attr.fields.l) ? 64 :
                                         cs->attr.fields.db ? 32 : 16, &addr) &&
-             (hvm_fetch_from_guest_virt_nofault(sig, addr, sizeof(sig),
-                                                walk) == HVMCOPY_okay) &&
+             (hvm_fetch_from_guest_virt(sig, addr, sizeof(sig),
+                                        walk, NULL) == HVMCOPY_okay) &&
              (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) )
         {
             regs->eip += sizeof(sig);
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index d28eae1..bfd76af 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -419,8 +419,8 @@ const struct x86_emulate_ops *shadow_init_emulation(
         (!hvm_translate_linear_addr(
             x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
             hvm_access_insn_fetch, sh_ctxt, &addr) &&
-         !hvm_fetch_from_guest_virt_nofault(
-             sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0))
+         !hvm_fetch_from_guest_virt(
+             sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
         ? sizeof(sh_ctxt->insn_buf) : 0;
 
     return &hvm_shadow_emulator_ops;
@@ -447,8 +447,8 @@ void shadow_continue_emulation(struct sh_emulate_ctxt *sh_ctxt,
                 (!hvm_translate_linear_addr(
                     x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
                     hvm_access_insn_fetch, sh_ctxt, &addr) &&
-                 !hvm_fetch_from_guest_virt_nofault(
-                     sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0))
+                 !hvm_fetch_from_guest_virt(
+                     sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
                 ? sizeof(sh_ctxt->insn_buf) : 0;
             sh_ctxt->insn_buf_eip = regs->eip;
         }
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 4aa5a36..114aa04 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -105,17 +105,6 @@ enum hvm_copy_result hvm_fetch_from_guest_virt(
     void *buf, unsigned long vaddr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
 
-/*
- * As above (copy to/from a guest virtual address), but no fault is generated
- * when HVMCOPY_bad_gva_to_gfn is returned.
- */
-enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec);
-enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
-enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec);
-
 #define HVM_HCALL_completed  0 /* hypercall completed - no further action */
 #define HVM_HCALL_preempted  1 /* hypercall preempted - re-execute VMCALL */
 #define HVM_HCALL_invalidate 2 /* invalidate ioemu-dm memory cache        */
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 91+ messages in thread
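
The net effect of this patch, sketched with buf, addr and size assumed in
scope at some caller: the three *_nofault() wrappers disappear because the
pfinfo argument now encodes the choice.

    pagefault_info_t pfinfo;
    enum hvm_copy_result rc;

    /* Fault-recording form: on a bad translation, pfinfo is filled in
     * and a page fault is queued for the guest. */
    rc = hvm_copy_from_guest_virt(buf, addr, size, 0, &pfinfo);

    /* NULL form: the old hvm_copy_from_guest_virt_nofault() behaviour.
     * The copy simply fails with HVMCOPY_bad_gva_to_gfn and nothing is
     * queued. */
    rc = hvm_copy_from_guest_virt(buf, addr, size, 0, NULL);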

* [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (10 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:35   ` Tim Deegan
                     ` (2 more replies)
  2016-11-23 15:38 ` [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
                   ` (2 subsequent siblings)
  14 siblings, 3 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim Deegan, Paul Durrant,
	Jun Nakajima

The functions use linear addresses, not virtual addresses, as no segmentation
is used.  (Lots of other code in Xen makes this mistake.)
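
To make the distinction concrete (standard x86 terminology, nothing
specific to this patch): in the naming this patch corrects, a "virtual"
address would be a segment-relative offset, while the linear address is
what segmentation produces before paging.  With flat segments based at 0
the two coincide, which is why the misnomer is so widespread.  A one-line
sketch:

    /* linear = segment base + segment-relative (virtual) offset. */
    static unsigned long virt_to_linear(unsigned long seg_base,
                                        unsigned long offset)
    {
        return seg_base + offset;
    }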

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/emulate.c        | 12 ++++----
 xen/arch/x86/hvm/hvm.c            | 60 +++++++++++++++++++--------------------
 xen/arch/x86/hvm/vmx/vvmx.c       |  6 ++--
 xen/arch/x86/mm/shadow/common.c   |  8 +++---
 xen/include/asm-x86/hvm/support.h | 14 ++++-----
 5 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 34c9d77..dd24979 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -791,8 +791,8 @@ static int __hvmemul_read(
         pfec |= PFEC_user_mode;
 
     rc = ((access_type == hvm_access_insn_fetch) ?
-          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo) :
-          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo));
+          hvm_fetch_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo) :
+          hvm_copy_from_guest_linear(p_data, addr, bytes, pfec, &pfinfo));
 
     switch ( rc )
     {
@@ -898,7 +898,7 @@ static int hvmemul_write(
          (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
         pfec |= PFEC_user_mode;
 
-    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec, &pfinfo);
+    rc = hvm_copy_to_guest_linear(addr, p_data, bytes, pfec, &pfinfo);
 
     switch ( rc )
     {
@@ -1936,9 +1936,9 @@ void hvm_emulate_init_per_insn(
                                         hvm_access_insn_fetch,
                                         hvmemul_ctxt->ctxt.addr_size,
                                         &addr) &&
-             hvm_fetch_from_guest_virt(hvmemul_ctxt->insn_buf, addr,
-                                       sizeof(hvmemul_ctxt->insn_buf),
-                                       pfec, NULL) == HVMCOPY_okay) ?
+             hvm_fetch_from_guest_linear(hvmemul_ctxt->insn_buf, addr,
+                                         sizeof(hvmemul_ctxt->insn_buf),
+                                         pfec, NULL) == HVMCOPY_okay) ?
             sizeof(hvmemul_ctxt->insn_buf) : 0;
     }
     else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6dfe81b..d0c4129 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2925,7 +2925,7 @@ void hvm_task_switch(
         goto out;
     }
 
-    rc = hvm_copy_from_guest_virt(
+    rc = hvm_copy_from_guest_linear(
         &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
@@ -2960,15 +2960,15 @@ void hvm_task_switch(
     hvm_get_segment_register(v, x86_seg_ldtr, &segr);
     tss.ldt = segr.sel;
 
-    rc = hvm_copy_to_guest_virt(prev_tr.base + offsetof(typeof(tss), eip),
-                                &tss.eip,
-                                offsetof(typeof(tss), trace) -
-                                offsetof(typeof(tss), eip),
-                                PFEC_page_present, &pfinfo);
+    rc = hvm_copy_to_guest_linear(prev_tr.base + offsetof(typeof(tss), eip),
+                                  &tss.eip,
+                                  offsetof(typeof(tss), trace) -
+                                  offsetof(typeof(tss), eip),
+                                  PFEC_page_present, &pfinfo);
     if ( rc != HVMCOPY_okay )
         goto out;
 
-    rc = hvm_copy_from_guest_virt(
+    rc = hvm_copy_from_guest_linear(
         &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
     /*
      * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
@@ -3008,9 +3008,9 @@ void hvm_task_switch(
         regs->eflags |= X86_EFLAGS_NT;
         tss.back_link = prev_tr.sel;
 
-        rc = hvm_copy_to_guest_virt(tr.base + offsetof(typeof(tss), back_link),
-                                    &tss.back_link, sizeof(tss.back_link), 0,
-                                    &pfinfo);
+        rc = hvm_copy_to_guest_linear(tr.base + offsetof(typeof(tss), back_link),
+                                      &tss.back_link, sizeof(tss.back_link), 0,
+                                      &pfinfo);
         if ( rc == HVMCOPY_bad_gva_to_gfn )
             exn_raised = 1;
         else if ( rc != HVMCOPY_okay )
@@ -3047,8 +3047,8 @@ void hvm_task_switch(
                                         16 << segr.attr.fields.db,
                                         &linear_addr) )
         {
-            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0,
-                                        &pfinfo);
+            rc = hvm_copy_to_guest_linear(linear_addr, &errcode, opsz, 0,
+                                          &pfinfo);
             if ( rc == HVMCOPY_bad_gva_to_gfn )
                 exn_raised = 1;
             else if ( rc != HVMCOPY_okay )
@@ -3067,7 +3067,7 @@ void hvm_task_switch(
 #define HVMCOPY_from_guest (0u<<0)
 #define HVMCOPY_to_guest   (1u<<0)
 #define HVMCOPY_phys       (0u<<2)
-#define HVMCOPY_virt       (1u<<2)
+#define HVMCOPY_linear     (1u<<2)
 static enum hvm_copy_result __hvm_copy(
     void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec,
     pagefault_info_t *pfinfo)
@@ -3101,7 +3101,7 @@ static enum hvm_copy_result __hvm_copy(
 
         count = min_t(int, PAGE_SIZE - gpa, todo);
 
-        if ( flags & HVMCOPY_virt )
+        if ( flags & HVMCOPY_linear )
         {
             gfn = paging_gva_to_gfn(curr, addr, &pfec);
             if ( gfn == gfn_x(INVALID_GFN) )
@@ -3295,30 +3295,30 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
                       HVMCOPY_from_guest | HVMCOPY_phys, 0, NULL);
 }
 
-enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_to_guest_linear(
+    unsigned long addr, void *buf, int size, uint32_t pfec,
     pagefault_info_t *pfinfo)
 {
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_to_guest | HVMCOPY_virt,
+    return __hvm_copy(buf, addr, size,
+                      HVMCOPY_to_guest | HVMCOPY_linear,
                       PFEC_page_present | PFEC_write_access | pfec, pfinfo);
 }
 
-enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo)
 {
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_virt,
+    return __hvm_copy(buf, addr, size,
+                      HVMCOPY_from_guest | HVMCOPY_linear,
                       PFEC_page_present | pfec, pfinfo);
 }
 
-enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_fetch_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo)
 {
-    return __hvm_copy(buf, vaddr, size,
-                      HVMCOPY_from_guest | HVMCOPY_virt,
+    return __hvm_copy(buf, addr, size,
+                      HVMCOPY_from_guest | HVMCOPY_linear,
                       PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
 }
 
@@ -3333,7 +3333,7 @@ unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
         return 0;
     }
 
-    rc = hvm_copy_to_guest_virt((unsigned long)to, (void *)from, len, 0, NULL);
+    rc = hvm_copy_to_guest_linear((unsigned long)to, (void *)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_to_user() return code */
 }
 
@@ -3363,7 +3363,7 @@ unsigned long copy_from_user_hvm(void *to, const void *from, unsigned len)
         return 0;
     }
 
-    rc = hvm_copy_from_guest_virt(to, (unsigned long)from, len, 0, NULL);
+    rc = hvm_copy_from_guest_linear(to, (unsigned long)from, len, 0, NULL);
     return rc ? len : 0; /* fake a copy_from_user() return code */
 }
 
@@ -4038,8 +4038,8 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
                                         (hvm_long_mode_enabled(cur) &&
                                          cs->attr.fields.l) ? 64 :
                                         cs->attr.fields.db ? 32 : 16, &addr) &&
-             (hvm_fetch_from_guest_virt(sig, addr, sizeof(sig),
-                                        walk, NULL) == HVMCOPY_okay) &&
+             (hvm_fetch_from_guest_linear(sig, addr, sizeof(sig),
+                                          walk, NULL) == HVMCOPY_okay) &&
              (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) )
         {
             regs->eip += sizeof(sig);
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 7342d12..fd7ea0a 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -452,7 +452,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
             goto gp_fault;
 
         if ( poperandS != NULL &&
-             hvm_copy_from_guest_virt(poperandS, base, size, 0, &pfinfo)
+             hvm_copy_from_guest_linear(poperandS, base, size, 0, &pfinfo)
                   != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         decode->mem = base;
@@ -1622,7 +1622,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
 
     gpa = nvcpu->nv_vvmcxaddr;
 
-    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0, &pfinfo);
+    rc = hvm_copy_to_guest_linear(decode.mem, &gpa, decode.len, 0, &pfinfo);
     if ( rc != HVMCOPY_okay )
         return X86EMUL_EXCEPTION;
 
@@ -1693,7 +1693,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
 
     switch ( decode.type ) {
     case VMX_INST_MEMREG_TYPE_MEMORY:
-        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0, &pfinfo);
+        rc = hvm_copy_to_guest_linear(decode.mem, &value, decode.len, 0, &pfinfo);
         if ( rc != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         break;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index bfd76af..afacd5f 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -189,9 +189,9 @@ hvm_read(enum x86_segment seg,
         return rc;
 
     if ( access_type == hvm_access_insn_fetch )
-        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
+        rc = hvm_fetch_from_guest_linear(p_data, addr, bytes, 0, &pfinfo);
     else
-        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
+        rc = hvm_copy_from_guest_linear(p_data, addr, bytes, 0, &pfinfo);
 
     switch ( rc )
     {
@@ -419,7 +419,7 @@ const struct x86_emulate_ops *shadow_init_emulation(
         (!hvm_translate_linear_addr(
             x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
             hvm_access_insn_fetch, sh_ctxt, &addr) &&
-         !hvm_fetch_from_guest_virt(
+         !hvm_fetch_from_guest_linear(
              sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
         ? sizeof(sh_ctxt->insn_buf) : 0;
 
@@ -447,7 +447,7 @@ void shadow_continue_emulation(struct sh_emulate_ctxt *sh_ctxt,
                 (!hvm_translate_linear_addr(
                     x86_seg_cs, regs->eip, sizeof(sh_ctxt->insn_buf),
                     hvm_access_insn_fetch, sh_ctxt, &addr) &&
-                 !hvm_fetch_from_guest_virt(
+                 !hvm_fetch_from_guest_linear(
                      sh_ctxt->insn_buf, addr, sizeof(sh_ctxt->insn_buf), 0, NULL))
                 ? sizeof(sh_ctxt->insn_buf) : 0;
             sh_ctxt->insn_buf_eip = regs->eip;
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 114aa04..78349f8 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -73,7 +73,7 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
     void *buf, paddr_t paddr, int size);
 
 /*
- * Copy to/from a guest virtual address. @pfec should include PFEC_user_mode
+ * Copy to/from a guest linear address. @pfec should include PFEC_user_mode
  * if emulating a user-mode access (CPL=3). All other flags in @pfec are
  * managed by the called function: it is therefore optional for the caller
  * to set them.
@@ -95,14 +95,14 @@ typedef struct pagefault_info
     int ec;
 } pagefault_info_t;
 
-enum hvm_copy_result hvm_copy_to_guest_virt(
-    unsigned long vaddr, void *buf, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_to_guest_linear(
+    unsigned long addr, void *buf, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
-enum hvm_copy_result hvm_copy_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_copy_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
-enum hvm_copy_result hvm_fetch_from_guest_virt(
-    void *buf, unsigned long vaddr, int size, uint32_t pfec,
+enum hvm_copy_result hvm_fetch_from_guest_linear(
+    void *buf, unsigned long addr, int size, uint32_t pfec,
     pagefault_info_t *pfinfo);
 
 #define HVM_HCALL_completed  0 /* hypercall completed - no further action */
-- 
2.1.4



* [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulator's back
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (11 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:18   ` Andrew Cooper
  2016-11-23 16:39   ` Tim Deegan
  2016-11-23 15:38 ` [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references Andrew Cooper
  2016-11-23 15:38 ` [PATCH 15/15] x86/hvm: Use system-segment relative memory accesses Andrew Cooper
  14 siblings, 2 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Tim Deegan, Paul Durrant,
	Jun Nakajima

Drop the call to hvm_inject_page_fault() in __hvm_copy(), and require callers
to inject the page fault themselves.

No functional change.
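
For illustration, the caller-side pattern this introduces (a minimal sketch
based on the hunks below; buf, addr and size stand for the caller's buffer,
linear address and length, and the error handling is abbreviated):

    pagefault_info_t pfinfo;
    enum hvm_copy_result rc;

    rc = hvm_copy_from_guest_linear(buf, addr, size, PFEC_page_present,
                                    &pfinfo);
    if ( rc == HVMCOPY_bad_gva_to_gfn )
        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
    if ( rc != HVMCOPY_okay )
        goto out;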

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Paul Durrant <paul.durrant@citrix.com>
CC: Tim Deegan <tim@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/emulate.c        |  2 ++
 xen/arch/x86/hvm/hvm.c            | 11 +++++++++--
 xen/arch/x86/hvm/vmx/vvmx.c       | 20 +++++++++++++++-----
 xen/arch/x86/mm/shadow/common.c   |  1 +
 xen/include/asm-x86/hvm/support.h |  4 +---
 5 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index dd24979..c248eca 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -799,6 +799,7 @@ static int __hvmemul_read(
     case HVMCOPY_okay:
         break;
     case HVMCOPY_bad_gva_to_gfn:
+        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
         if ( access_type == hvm_access_insn_fetch )
@@ -905,6 +906,7 @@ static int hvmemul_write(
     case HVMCOPY_okay:
         break;
     case HVMCOPY_bad_gva_to_gfn:
+        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
         return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index d0c4129..e1f2c9e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2927,6 +2927,8 @@ void hvm_task_switch(
 
     rc = hvm_copy_from_guest_linear(
         &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     if ( rc != HVMCOPY_okay )
         goto out;
 
@@ -2965,11 +2967,15 @@ void hvm_task_switch(
                                   offsetof(typeof(tss), trace) -
                                   offsetof(typeof(tss), eip),
                                   PFEC_page_present, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     if ( rc != HVMCOPY_okay )
         goto out;
 
     rc = hvm_copy_from_guest_linear(
         &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     /*
      * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
      * functions knew we want RO access.
@@ -3012,7 +3018,10 @@ void hvm_task_switch(
                                       &tss.back_link, sizeof(tss.back_link), 0,
                                       &pfinfo);
         if ( rc == HVMCOPY_bad_gva_to_gfn )
+        {
+            hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
             exn_raised = 1;
+        }
         else if ( rc != HVMCOPY_okay )
             goto out;
     }
@@ -3114,8 +3123,6 @@ static enum hvm_copy_result __hvm_copy(
                 {
                     pfinfo->linear = addr;
                     pfinfo->ec = pfec;
-
-                    hvm_inject_page_fault(pfec, addr);
                 }
                 return HVMCOPY_bad_gva_to_gfn;
             }
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index fd7ea0a..e6e9ebd 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -396,7 +396,6 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
     struct vcpu *v = current;
     union vmx_inst_info info;
     struct segment_register seg;
-    pagefault_info_t pfinfo;
     unsigned long base, index, seg_base, disp, offset;
     int scale, size;
 
@@ -451,10 +450,17 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
               offset + size - 1 > seg.limit) )
             goto gp_fault;
 
-        if ( poperandS != NULL &&
-             hvm_copy_from_guest_linear(poperandS, base, size, 0, &pfinfo)
-                  != HVMCOPY_okay )
-            return X86EMUL_EXCEPTION;
+        if ( poperandS != NULL )
+        {
+            pagefault_info_t pfinfo;
+            int rc = hvm_copy_from_guest_linear(poperandS, base, size,
+                                                0, &pfinfo);
+
+            if ( rc == HVMCOPY_bad_gva_to_gfn )
+                hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
+            if ( rc != HVMCOPY_okay )
+                return X86EMUL_EXCEPTION;
+        }
         decode->mem = base;
         decode->len = size;
     }
@@ -1623,6 +1629,8 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
     gpa = nvcpu->nv_vvmcxaddr;
 
     rc = hvm_copy_to_guest_linear(decode.mem, &gpa, decode.len, 0, &pfinfo);
+    if ( rc == HVMCOPY_bad_gva_to_gfn )
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
     if ( rc != HVMCOPY_okay )
         return X86EMUL_EXCEPTION;
 
@@ -1694,6 +1702,8 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
     switch ( decode.type ) {
     case VMX_INST_MEMREG_TYPE_MEMORY:
         rc = hvm_copy_to_guest_linear(decode.mem, &value, decode.len, 0, &pfinfo);
+        if ( rc == HVMCOPY_bad_gva_to_gfn )
+            hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
         if ( rc != HVMCOPY_okay )
             return X86EMUL_EXCEPTION;
         break;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index afacd5f..88d4642 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -198,6 +198,7 @@ hvm_read(enum x86_segment seg,
     case HVMCOPY_okay:
         return X86EMUL_OKAY;
     case HVMCOPY_bad_gva_to_gfn:
+        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
         return X86EMUL_EXCEPTION;
     case HVMCOPY_bad_gfn_to_mfn:
     case HVMCOPY_unhandleable:
diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
index 78349f8..3d767d7 100644
--- a/xen/include/asm-x86/hvm/support.h
+++ b/xen/include/asm-x86/hvm/support.h
@@ -85,9 +85,7 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
  *  HVMCOPY_bad_gva_to_gfn: Some guest virtual address did not have a valid
  *                          mapping to a guest physical address.  The
  *                          pagefault_info_t structure will be filled in if
- *                          provided, and a page fault exception is
- *                          automatically queued for injection into the
- *                          current HVM VCPU.
+ *                          provided.
  */
 typedef struct pagefault_info
 {
-- 
2.1.4



* [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (12 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulator's back Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-23 16:42   ` Paul Durrant
  2016-11-24 15:48   ` Jan Beulich
  2016-11-23 15:38 ` [PATCH 15/15] x86/hvm: Use system-segment relative memory accesses Andrew Cooper
  14 siblings, 2 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Paul Durrant, Jan Beulich

All system segments (GDT/IDT/LDT and TR) describe a linear address and limit,
and act similarly to user segments.  However, all current uses of these tables
in the emulator open-code the address calculations and limit checks.  In
particular, no care is taken for accesses which wrap around the 4GB or
non-canonical boundaries.

Alter hvm_virtual_to_linear_addr() to perform segmentation checks on system
segments as well.  This involves restricting the access checks in the 32bit
case to user segments only, and adding presence/limit checks in the 64bit
case.
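
As a sketch, the 64bit path gains a presence/limit check of this shape
(mirroring the hvm.c hunk below):

    if ( is_x86_system_segment(seg) &&
         (!reg->attr.fields.p || (offset + bytes - !!bytes) > reg->limit) )
        goto out;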

When suffering a segmentation fault for a system segment, return
X86EMUL_EXCEPTION but leave the fault injection to the caller, as the fault
type depends on the higher-level action being performed.
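
Callers consequently only have an exception queued for them automatically in
the user segment case, as in this sketch of the hvmemul_virtual_to_linear()
tail (taken from the emulate.c hunk below):

    if ( is_x86_user_segment(seg) )
        x86_emul_hw_exception((seg == x86_seg_ss)
                              ? TRAP_stack_error
                              : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);

    return X86EMUL_EXCEPTION;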

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Paul Durrant <paul.durrant@citrix.com>
---
 xen/arch/x86/hvm/emulate.c             | 14 ++++++++----
 xen/arch/x86/hvm/hvm.c                 | 40 ++++++++++++++++++++++------------
 xen/arch/x86/mm/shadow/common.c        | 12 +++++++---
 xen/arch/x86/x86_emulate/x86_emulate.h | 26 ++++++++++++++--------
 4 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index c248eca..3a7d1f3 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -567,10 +567,16 @@ static int hvmemul_virtual_to_linear(
     if ( *reps != 1 )
         return X86EMUL_UNHANDLEABLE;
 
-    /* This is a singleton operation: fail it with an exception. */
-    x86_emul_hw_exception((seg == x86_seg_ss)
-                          ? TRAP_stack_error
-                          : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
+    /*
+     * Leave exception injection to the caller for non-user segments: We
+     * neither know the exact error code to be used, nor can we easily
+     * determine the kind of exception (#GP or #TS) in that case.
+     */
+    if ( is_x86_user_segment(seg) )
+        x86_emul_hw_exception((seg == x86_seg_ss)
+                              ? TRAP_stack_error
+                              : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
+
     return X86EMUL_EXCEPTION;
 }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e1f2c9e..2bcef1f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2497,24 +2497,28 @@ bool_t hvm_virtual_to_linear_addr(
         if ( !reg->attr.fields.p )
             goto out;
 
-        switch ( access_type )
+        /* Read/write restrictions only exist for user segments. */
+        if ( reg->attr.fields.s )
         {
-        case hvm_access_read:
-            if ( (reg->attr.fields.type & 0xa) == 0x8 )
-                goto out; /* execute-only code segment */
-            break;
-        case hvm_access_write:
-            if ( (reg->attr.fields.type & 0xa) != 0x2 )
-                goto out; /* not a writable data segment */
-            break;
-        default:
-            break;
+            switch ( access_type )
+            {
+            case hvm_access_read:
+                if ( (reg->attr.fields.type & 0xa) == 0x8 )
+                    goto out; /* execute-only code segment */
+                break;
+            case hvm_access_write:
+                if ( (reg->attr.fields.type & 0xa) != 0x2 )
+                    goto out; /* not a writable data segment */
+                break;
+            default:
+                break;
+            }
         }
 
         last_byte = (uint32_t)offset + bytes - !!bytes;
 
         /* Is this a grows-down data segment? Special limit check if so. */
-        if ( (reg->attr.fields.type & 0xc) == 0x4 )
+        if ( reg->attr.fields.s && (reg->attr.fields.type & 0xc) == 0x4 )
         {
             /* Is upper limit 0xFFFF or 0xFFFFFFFF? */
             if ( !reg->attr.fields.db )
@@ -2530,10 +2534,18 @@ bool_t hvm_virtual_to_linear_addr(
     else
     {
         /*
-         * LONG MODE: FS and GS add segment base. Addresses must be canonical.
+         * User segments are always treated as present.  System segments may
+         * not be, and also incur limit checks.
          */
+        if ( is_x86_system_segment(seg) &&
+             (!reg->attr.fields.p || (offset + bytes - !!bytes) > reg->limit) )
+            goto out;
 
-        if ( (seg == x86_seg_fs) || (seg == x86_seg_gs) )
+        /*
+         * LONG MODE: FS, GS and system segments: add segment base. All
+         * addresses must be canonical.
+         */
+        if ( seg >= x86_seg_fs )
             addr += reg->base;
 
         last_byte = addr + bytes - !!bytes;
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 88d4642..954c157 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -162,9 +162,15 @@ static int hvm_translate_linear_addr(
 
     if ( !okay )
     {
-        x86_emul_hw_exception(
-            (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
-            0, &sh_ctxt->ctxt);
+        /*
+         * Leave exception injection to the caller for non-user segments: We
+         * neither know the exact error code to be used, nor can we easily
+         * determine the kind of exception (#GP or #TS) in that case.
+         */
+        if ( is_x86_user_segment(seg) )
+            x86_emul_hw_exception(
+                (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
+                0, &sh_ctxt->ctxt);
         return X86EMUL_EXCEPTION;
     }
 
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
index cc26e9d..4d18623 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.h
+++ b/xen/arch/x86/x86_emulate/x86_emulate.h
@@ -27,7 +27,11 @@
 
 struct x86_emulate_ctxt;
 
-/* Comprehensive enumeration of x86 segment registers. */
+/*
+ * Comprehensive enumeration of x86 segment registers.  Various bits of code
+ * rely on this order (general purpose before system, tr at the beginning of
+ * system).
+ */
 enum x86_segment {
     /* General purpose.  Matches the SReg3 encoding in opcode/ModRM bytes. */
     x86_seg_es,
@@ -36,21 +40,25 @@ enum x86_segment {
     x86_seg_ds,
     x86_seg_fs,
     x86_seg_gs,
-    /* System. */
+    /* System: Valid to use for implicit table references. */
     x86_seg_tr,
     x86_seg_ldtr,
     x86_seg_gdtr,
     x86_seg_idtr,
-    /*
-     * Dummy: used to emulate direct processor accesses to management
-     * structures (TSS, GDT, LDT, IDT, etc.) which use linear addressing
-     * (no segment component) and bypass usual segment- and page-level
-     * protection checks.
-     */
+    /* No Segment: For accesses which are already linear. */
     x86_seg_none
 };
 
-#define is_x86_user_segment(seg) ((unsigned)(seg) <= x86_seg_gs)
+static inline bool is_x86_user_segment(enum x86_segment seg)
+{
+    unsigned int idx = seg;
+
+    return idx <= x86_seg_gs;
+}
+static inline bool is_x86_system_segment(enum x86_segment seg)
+{
+    return seg >= x86_seg_tr && seg < x86_seg_none;
+}
 
 /* Classification of the types of software generated interrupts/exceptions. */
 enum x86_swint_type {
-- 
2.1.4



* [PATCH 15/15] x86/hvm: Use system-segment relative memory accesses
  2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
                   ` (13 preceding siblings ...)
  2016-11-23 15:38 ` [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references Andrew Cooper
@ 2016-11-23 15:38 ` Andrew Cooper
  2016-11-24 16:01   ` Jan Beulich
  14 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 15:38 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

With hvm_virtual_to_linear_addr() capable of doing proper system-segment
relative memory accesses, avoid open-coding the address and limit calculations
locally.

When a table spans the 4GB boundary (32bit) or non-canonical boundary (64bit),
segmentation errors are now raised.  Previously, the use of x86_seg_none
resulted in segmentation being skipped and the linear address being truncated
through the pagewalk, possibly coming out valid on the far side.
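
The shape of the conversion, using the descriptor read in protmode_load_seg()
as an example (a sketch; see the full hunks below):

    /* Before: linear address computed by hand, segmentation bypassed. */
    rc = ops->read(x86_seg_none, desctab.base + (sel & 0xfff8),
                   &desc, sizeof(desc), ctxt);

    /* After: offset interpreted relative to the descriptor table segment. */
    rc = ops->read(sel_seg, sel & 0xfff8, &desc, sizeof(desc), ctxt);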

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
---
 xen/arch/x86/hvm/hvm.c                 |   8 +++
 xen/arch/x86/x86_emulate/x86_emulate.c | 117 ++++++++++++++++++++-------------
 2 files changed, 79 insertions(+), 46 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 2bcef1f..fbdb8dd 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2470,6 +2470,14 @@ bool_t hvm_virtual_to_linear_addr(
     unsigned long addr = offset, last_byte;
     bool_t okay = 0;
 
+    /*
+     * These checks are for a memory access through an active segment.
+     *
+     * It is expected that the access rights of reg are suitable for seg (and
+     * that this is enforced at the point that seg is loaded).
+     */
+    ASSERT(seg < x86_seg_none);
+
     if ( !(current->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
     {
         /*
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 768a436..6f94593 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1181,20 +1181,38 @@ static int ioport_access_check(
         return rc;
 
     /* Ensure the TSS has an io-bitmap-offset field. */
-    generate_exception_if(tr.attr.fields.type != 0xb ||
-                          tr.limit < 0x67, EXC_GP, 0);
+    generate_exception_if(tr.attr.fields.type != 0xb, EXC_GP, 0);
 
-    if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
-                          &iobmp, 2, ctxt, ops)) )
+    switch ( rc = read_ulong(x86_seg_tr, 0x66, &iobmp, 2, ctxt, ops) )
+    {
+    case X86EMUL_OKAY:
+        break;
+
+    case X86EMUL_EXCEPTION:
+        if ( !ctxt->event_pending )
+            generate_exception_if(true, EXC_GP, 0);
+        /* fallthrough */
+
+    default:
         return rc;
+    }
 
-    /* Ensure TSS includes two bytes including byte containing first port. */
-    iobmp += first_port / 8;
-    generate_exception_if(tr.limit <= iobmp, EXC_GP, 0);
+    /* Read two bytes including byte containing first port. */
+    switch ( rc = read_ulong(x86_seg_tr, iobmp + first_port / 8,
+                             &iobmp, 2, ctxt, ops) )
+    {
+    case X86EMUL_OKAY:
+        break;
+
+    case X86EMUL_EXCEPTION:
+        if ( !ctxt->event_pending )
+            generate_exception_if(true, EXC_GP, 0);
+        /* fallthrough */
 
-    if ( (rc = read_ulong(x86_seg_none, tr.base + iobmp,
-                          &iobmp, 2, ctxt, ops)) )
+    default:
         return rc;
+    }
+
     generate_exception_if(iobmp & (((1 << bytes) - 1) << (first_port & 7)),
                           EXC_GP, 0);
 
@@ -1317,9 +1335,12 @@ realmode_load_seg(
     struct x86_emulate_ctxt *ctxt,
     const struct x86_emulate_ops *ops)
 {
-    int rc = ops->read_segment(seg, sreg, ctxt);
+    int rc;
+
+    if ( !ops->read_segment )
+        return X86EMUL_UNHANDLEABLE;
 
-    if ( !rc )
+    if ( (rc = ops->read_segment(seg, sreg, ctxt)) == X86EMUL_OKAY )
     {
         sreg->sel  = sel;
         sreg->base = (uint32_t)sel << 4;
@@ -1336,7 +1357,7 @@ protmode_load_seg(
     struct x86_emulate_ctxt *ctxt,
     const struct x86_emulate_ops *ops)
 {
-    struct segment_register desctab;
+    enum x86_segment sel_seg = (sel & 4) ? x86_seg_ldtr : x86_seg_gdtr;
     struct { uint32_t a, b; } desc;
     uint8_t dpl, rpl;
     int cpl = get_cpl(ctxt, ops);
@@ -1369,21 +1390,19 @@ protmode_load_seg(
     if ( !is_x86_user_segment(seg) && (sel & 4) )
         goto raise_exn;
 
-    if ( (rc = ops->read_segment((sel & 4) ? x86_seg_ldtr : x86_seg_gdtr,
-                                 &desctab, ctxt)) )
-        return rc;
-
-    /* Segment not valid for use (cooked meaning of .p)? */
-    if ( !desctab.attr.fields.p )
-        goto raise_exn;
+    switch ( rc = ops->read(sel_seg, sel & 0xfff8, &desc, sizeof(desc), ctxt) )
+    {
+    case X86EMUL_OKAY:
+        break;
 
-    /* Check against descriptor table limit. */
-    if ( ((sel & 0xfff8) + 7) > desctab.limit )
-        goto raise_exn;
+    case X86EMUL_EXCEPTION:
+        if ( !ctxt->event_pending )
+            goto raise_exn;
+        /* fallthrough */
 
-    if ( (rc = ops->read(x86_seg_none, desctab.base + (sel & 0xfff8),
-                         &desc, sizeof(desc), ctxt)) )
+    default:
         return rc;
+    }
 
     if ( !is_x86_user_segment(seg) )
     {
@@ -1471,9 +1490,12 @@ protmode_load_seg(
     {
         uint32_t new_desc_b = desc.b | a_flag;
 
-        if ( (rc = ops->cmpxchg(x86_seg_none, desctab.base + (sel & 0xfff8) + 4,
-                                &desc.b, &new_desc_b, 4, ctxt)) != 0 )
+        if ( (rc = ops->cmpxchg(sel_seg, (sel & 0xfff8) + 4, &desc.b,
+                                &new_desc_b, 4, ctxt)) != X86EMUL_OKAY )
+        {
+            ASSERT(rc != X86EMUL_EXCEPTION);
             return rc;
+        }
 
         /* Force the Accessed flag in our local copy. */
         desc.b = new_desc_b;
@@ -1507,8 +1529,7 @@ load_seg(
     struct segment_register reg;
     int rc;
 
-    if ( (ops->read_segment == NULL) ||
-         (ops->write_segment == NULL) )
+    if ( !ops->write_segment )
         return X86EMUL_UNHANDLEABLE;
 
     if ( !sreg )
@@ -1636,8 +1657,7 @@ static int inject_swint(enum x86_swint_type type,
         if ( !in_realmode(ctxt, ops) )
         {
             unsigned int idte_size, idte_offset;
-            struct segment_register idtr;
-            uint32_t idte_ctl;
+            struct { uint32_t a, b, c, d; } idte;
             int lm = in_longmode(ctxt, ops);
 
             if ( lm < 0 )
@@ -1660,24 +1680,30 @@ static int inject_swint(enum x86_swint_type type,
                  ((ctxt->regs->eflags & EFLG_IOPL) != EFLG_IOPL) )
                 goto raise_exn;
 
-            fail_if(ops->read_segment == NULL);
             fail_if(ops->read == NULL);
-            if ( (rc = ops->read_segment(x86_seg_idtr, &idtr, ctxt)) )
-                goto done;
-
-            if ( (idte_offset + idte_size - 1) > idtr.limit )
-                goto raise_exn;
 
             /*
-             * Should strictly speaking read all 8/16 bytes of an entry,
-             * but we currently only care about the dpl and present bits.
+             * Read all 8/16 bytes so the idtr limit check is applied properly
+             * to this entry, even though we only end up looking at the 2nd
+             * word.
              */
-            if ( (rc = ops->read(x86_seg_none, idtr.base + idte_offset + 4,
-                                 &idte_ctl, sizeof(idte_ctl), ctxt)) )
-                goto done;
+            switch ( rc = ops->read(x86_seg_idtr, idte_offset,
+                                    &idte, idte_size, ctxt) )
+            {
+            case X86EMUL_OKAY:
+                break;
+
+            case X86EMUL_EXCEPTION:
+                if ( !ctxt->event_pending )
+                    goto raise_exn;
+                /* fallthrough */
+
+            default:
+                return rc;
+            }
 
             /* Is this entry present? */
-            if ( !(idte_ctl & (1u << 15)) )
+            if ( !(idte.b & (1u << 15)) )
             {
                 fault_type = EXC_NP;
                 goto raise_exn;
@@ -1686,12 +1712,11 @@ static int inject_swint(enum x86_swint_type type,
             /* icebp counts as a hardware event, and bypasses the dpl check. */
             if ( type != x86_swint_icebp )
             {
-                struct segment_register ss;
+                int cpl = get_cpl(ctxt, ops);
 
-                if ( (rc = ops->read_segment(x86_seg_ss, &ss, ctxt)) )
-                    goto done;
+                fail_if(cpl < 0);
 
-                if ( ss.attr.fields.dpl > ((idte_ctl >> 13) & 3) )
+                if ( cpl > ((idte.b >> 13) & 3) )
                     goto raise_exn;
             }
         }
-- 
2.1.4



* Re: [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
@ 2016-11-23 15:49   ` Paul Durrant
  2016-11-23 15:53   ` Wei Liu
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 91+ messages in thread
From: Paul Durrant @ 2016-11-23 15:49 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Wei Liu, Jan Beulich, Andrew Cooper, Jun Nakajima,
	Boris Ostrovsky, Suravee Suthikulpanit

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 23 November 2016 15:39
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jun
> Nakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>;
> Boris Ostrovsky <boris.ostrovsky@oracle.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; Wei Liu <wei.liu2@citrix.com>
> Subject: [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and
> hvm_emulate_prepare() for clarity
> 
>  * Move hvm_emulate_init() to immediately after hvm_emulate_prepare(), as
>    they are very closely related.
>  * Rename hvm_emulate_prepare() to hvm_emulate_init_once() and
>    hvm_emulate_init() to hvm_emulate_init_per_insn() to make it clearer
>    how and when to use them.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> 
> As hvm_emulate_prepare() was new in 4.8, it would be a good idea to take this
> patch to avoid future confusion on the stable-4.8 branch.
> ---
>  xen/arch/x86/hvm/emulate.c        | 111 +++++++++++++++++++-------------------
>  xen/arch/x86/hvm/hvm.c            |   2 +-
>  xen/arch/x86/hvm/io.c             |   2 +-
>  xen/arch/x86/hvm/ioreq.c          |   2 +-
>  xen/arch/x86/hvm/svm/emulate.c    |   4 +-
>  xen/arch/x86/hvm/vmx/realmode.c   |   2 +-
>  xen/include/asm-x86/hvm/emulate.h |   6 ++-
>  7 files changed, 66 insertions(+), 63 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index e9b8f8c..3ab0e8e 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1755,57 +1755,6 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
>      .vmfunc        = hvmemul_vmfunc,
>  };
> 
> -void hvm_emulate_init(
> -    struct hvm_emulate_ctxt *hvmemul_ctxt,
> -    const unsigned char *insn_buf,
> -    unsigned int insn_bytes)
> -{
> -    struct vcpu *curr = current;
> -    unsigned int pfec = PFEC_page_present;
> -    unsigned long addr;
> -
> -    if ( hvm_long_mode_enabled(curr) &&
> -         hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
> -    {
> -        hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
> -    }
> -    else
> -    {
> -        hvmemul_ctxt->ctxt.addr_size =
> -            hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.db ? 32 : 16;
> -        hvmemul_ctxt->ctxt.sp_size =
> -            hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.db ? 32 : 16;
> -    }
> -
> -    if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 )
> -        pfec |= PFEC_user_mode;
> -
> -    hvmemul_ctxt->insn_buf_eip = hvmemul_ctxt->ctxt.regs->eip;
> -    if ( !insn_bytes )
> -    {
> -        hvmemul_ctxt->insn_buf_bytes =
> -            hvm_get_insn_bytes(curr, hvmemul_ctxt->insn_buf) ?:
> -            (hvm_virtual_to_linear_addr(x86_seg_cs,
> -                                        &hvmemul_ctxt->seg_reg[x86_seg_cs],
> -                                        hvmemul_ctxt->insn_buf_eip,
> -                                        sizeof(hvmemul_ctxt->insn_buf),
> -                                        hvm_access_insn_fetch,
> -                                        hvmemul_ctxt->ctxt.addr_size,
> -                                        &addr) &&
> -             hvm_fetch_from_guest_virt_nofault(hvmemul_ctxt->insn_buf, addr,
> -                                               sizeof(hvmemul_ctxt->insn_buf),
> -                                               pfec) == HVMCOPY_okay) ?
> -            sizeof(hvmemul_ctxt->insn_buf) : 0;
> -    }
> -    else
> -    {
> -        hvmemul_ctxt->insn_buf_bytes = insn_bytes;
> -        memcpy(hvmemul_ctxt->insn_buf, insn_buf, insn_bytes);
> -    }
> -
> -    hvmemul_ctxt->exn_pending = 0;
> -}
> -
>  static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
>      const struct x86_emulate_ops *ops)
>  {
> @@ -1815,7 +1764,8 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
>      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
>      int rc;
> 
> -    hvm_emulate_init(hvmemul_ctxt, vio->mmio_insn, vio->mmio_insn_bytes);
> +    hvm_emulate_init_per_insn(hvmemul_ctxt, vio->mmio_insn,
> +                              vio->mmio_insn_bytes);
> 
>      vio->mmio_retry = 0;
> 
> @@ -1915,7 +1865,7 @@ int hvm_emulate_one_mmio(unsigned long mfn, unsigned long gla)
>      else
>          ops = &hvm_ro_emulate_ops_mmio;
> 
> -    hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
> +    hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
>      ctxt.ctxt.data = &mmio_ro_ctxt;
>      rc = _hvm_emulate_one(&ctxt, ops);
>      switch ( rc )
> @@ -1940,7 +1890,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
>      struct hvm_emulate_ctxt ctx = {{ 0 }};
>      int rc;
> 
> -    hvm_emulate_prepare(&ctx, guest_cpu_user_regs());
> +    hvm_emulate_init_once(&ctx, guest_cpu_user_regs());
> 
>      switch ( kind )
>      {
> @@ -1992,7 +1942,7 @@ void hvm_emulate_one_vm_event(enum emul_kind kind, unsigned int trapnr,
>      hvm_emulate_writeback(&ctx);
>  }
> 
> -void hvm_emulate_prepare(
> +void hvm_emulate_init_once(
>      struct hvm_emulate_ctxt *hvmemul_ctxt,
>      struct cpu_user_regs *regs)
>  {
> @@ -2006,6 +1956,57 @@ void hvm_emulate_prepare(
>      hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
>  }
> 
> +void hvm_emulate_init_per_insn(
> +    struct hvm_emulate_ctxt *hvmemul_ctxt,
> +    const unsigned char *insn_buf,
> +    unsigned int insn_bytes)
> +{
> +    struct vcpu *curr = current;
> +    unsigned int pfec = PFEC_page_present;
> +    unsigned long addr;
> +
> +    if ( hvm_long_mode_enabled(curr) &&
> +         hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
> +    {
> +        hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
> +    }
> +    else
> +    {
> +        hvmemul_ctxt->ctxt.addr_size =
> +            hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.db ? 32 : 16;
> +        hvmemul_ctxt->ctxt.sp_size =
> +            hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.db ? 32 : 16;
> +    }
> +
> +    if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3 )
> +        pfec |= PFEC_user_mode;
> +
> +    hvmemul_ctxt->insn_buf_eip = hvmemul_ctxt->ctxt.regs->eip;
> +    if ( !insn_bytes )
> +    {
> +        hvmemul_ctxt->insn_buf_bytes =
> +            hvm_get_insn_bytes(curr, hvmemul_ctxt->insn_buf) ?:
> +            (hvm_virtual_to_linear_addr(x86_seg_cs,
> +                                        &hvmemul_ctxt->seg_reg[x86_seg_cs],
> +                                        hvmemul_ctxt->insn_buf_eip,
> +                                        sizeof(hvmemul_ctxt->insn_buf),
> +                                        hvm_access_insn_fetch,
> +                                        hvmemul_ctxt->ctxt.addr_size,
> +                                        &addr) &&
> +             hvm_fetch_from_guest_virt_nofault(hvmemul_ctxt->insn_buf, addr,
> +                                               sizeof(hvmemul_ctxt->insn_buf),
> +                                               pfec) == HVMCOPY_okay) ?
> +            sizeof(hvmemul_ctxt->insn_buf) : 0;
> +    }
> +    else
> +    {
> +        hvmemul_ctxt->insn_buf_bytes = insn_bytes;
> +        memcpy(hvmemul_ctxt->insn_buf, insn_buf, insn_bytes);
> +    }
> +
> +    hvmemul_ctxt->exn_pending = 0;
> +}
> +
>  void hvm_emulate_writeback(
>      struct hvm_emulate_ctxt *hvmemul_ctxt)
>  {
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index f76dd90..25dc759 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4058,7 +4058,7 @@ void hvm_ud_intercept(struct cpu_user_regs *regs)
>  {
>      struct hvm_emulate_ctxt ctxt;
> 
> -    hvm_emulate_prepare(&ctxt, regs);
> +    hvm_emulate_init_once(&ctxt, regs);
> 
>      if ( opt_hvm_fep )
>      {
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index 1e7a5f9..7305801 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -87,7 +87,7 @@ int handle_mmio(void)
> 
>      ASSERT(!is_pvh_vcpu(curr));
> 
> -    hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
> +    hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
> 
>      rc = hvm_emulate_one(&ctxt);
> 
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index d2245e2..88071ab 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -167,7 +167,7 @@ bool_t handle_hvm_io_completion(struct vcpu *v)
>      {
>          struct hvm_emulate_ctxt ctxt;
> 
> -        hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
> +        hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
>          vmx_realmode_emulate_one(&ctxt);
>          hvm_emulate_writeback(&ctxt);
> 
> diff --git a/xen/arch/x86/hvm/svm/emulate.c b/xen/arch/x86/hvm/svm/emulate.c
> index a5545ea..9cdbe9e 100644
> --- a/xen/arch/x86/hvm/svm/emulate.c
> +++ b/xen/arch/x86/hvm/svm/emulate.c
> @@ -107,8 +107,8 @@ int __get_instruction_length_from_list(struct vcpu *v,
>  #endif
> 
>      ASSERT(v == current);
> -    hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
> -    hvm_emulate_init(&ctxt, NULL, 0);
> +    hvm_emulate_init_once(&ctxt, guest_cpu_user_regs());
> +    hvm_emulate_init_per_insn(&ctxt, NULL, 0);
>      state = x86_decode_insn(&ctxt.ctxt, hvmemul_insn_fetch);
>      if ( IS_ERR_OR_NULL(state) )
>          return 0;
> diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
> index e83a61f..9002638 100644
> --- a/xen/arch/x86/hvm/vmx/realmode.c
> +++ b/xen/arch/x86/hvm/vmx/realmode.c
> @@ -179,7 +179,7 @@ void vmx_realmode(struct cpu_user_regs *regs)
>      if ( intr_info & INTR_INFO_VALID_MASK )
>          __vmwrite(VM_ENTRY_INTR_INFO, 0);
> 
> -    hvm_emulate_prepare(&hvmemul_ctxt, regs);
> +    hvm_emulate_init_once(&hvmemul_ctxt, regs);
> 
>      /* Only deliver interrupts into emulated real mode. */
>      if ( !(curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) &&
> diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-x86/hvm/emulate.h
> index f610673..d4186a2 100644
> --- a/xen/include/asm-x86/hvm/emulate.h
> +++ b/xen/include/asm-x86/hvm/emulate.h
> @@ -51,10 +51,12 @@ int hvm_emulate_one_no_write(
>  void hvm_emulate_one_vm_event(enum emul_kind kind,
>      unsigned int trapnr,
>      unsigned int errcode);
> -void hvm_emulate_prepare(
> +/* Must be called once to set up hvmemul state. */
> +void hvm_emulate_init_once(
>      struct hvm_emulate_ctxt *hvmemul_ctxt,
>      struct cpu_user_regs *regs);
> -void hvm_emulate_init(
> +/* Must be called once before each instruction emulated. */
> +void hvm_emulate_init_per_insn(
>      struct hvm_emulate_ctxt *hvmemul_ctxt,
>      const unsigned char *insn_buf,
>      unsigned int insn_bytes);
> --
> 2.1.4



* Re: [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
  2016-11-23 15:49   ` Paul Durrant
@ 2016-11-23 15:53   ` Wei Liu
  2016-11-23 16:40   ` Jan Beulich
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 91+ messages in thread
From: Wei Liu @ 2016-11-23 15:53 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Wei Liu, Jan Beulich, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit

On Wed, Nov 23, 2016 at 03:38:44PM +0000, Andrew Cooper wrote:
>  * Move hvm_emulate_init() to immediately after hvm_emulate_prepare(), as they
>    are very closely related.
>  * Rename hvm_emulate_prepare() to hvm_emulate_init_once() and
>    hvm_emulate_init() to hvm_emulate_init_per_insn() to make it clearer how
>    and when to use them.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH 02/15] x86/emul: Simplify emulation state setup
  2016-11-23 15:38 ` [PATCH 02/15] x86/emul: Simplify emulation state setup Andrew Cooper
@ 2016-11-23 15:58   ` Paul Durrant
  2016-11-23 16:01     ` Andrew Cooper
  2016-11-23 16:07   ` Tim Deegan
  2016-11-24 13:44   ` Jan Beulich
  2 siblings, 1 reply; 91+ messages in thread
From: Paul Durrant @ 2016-11-23 15:58 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Tim (Xen.org), George Dunlap, Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 23 November 2016 15:39
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Tim (Xen.org) <tim@xen.org>; George Dunlap
> <George.Dunlap@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH 02/15] x86/emul: Simplify emulation state setup
> 
> The current code to set up emulation state is ad-hoc and error prone.
> 
>  * Consistently zero all emulation state structures.
>  * Avoid explicitly initialising some state to 0.
>  * Explicitly identify all input and output state in x86_emulate_ctxt.  This
>    involves rearranging some fields.
>  * Have x86_decode() explicitly initialise all output state at its start.
> 
> In addition, move the calculation of hvmemul_ctxt->ctxt.swint_emulate from
> _hvm_emulate_one() to hvm_emulate_init_once(), as it doesn't need
> recalculating for each instruction.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Tim Deegan <tim@xen.org>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> ---
>  xen/arch/x86/hvm/emulate.c             | 28 +++++++++++++++-------------
>  xen/arch/x86/mm.c                      |  3 ++-
>  xen/arch/x86/mm/shadow/common.c        |  4 ++--
>  xen/arch/x86/x86_emulate/x86_emulate.c |  2 ++
>  xen/arch/x86/x86_emulate/x86_emulate.h | 22 +++++++++++++++-------
>  5 files changed, 36 insertions(+), 23 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 3ab0e8e..f24e211 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1769,13 +1769,6 @@ static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
> 
>      vio->mmio_retry = 0;
> 
> -    if ( cpu_has_vmx )
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> -    else if ( cpu_has_svm_nrips )
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
> -    else
> -        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
> -
>      rc = x86_emulate(&hvmemul_ctxt->ctxt, ops);
> 
>      if ( rc == X86EMUL_OKAY && vio->mmio_retry )
> @@ -1946,14 +1939,23 @@ void hvm_emulate_init_once(
>      struct hvm_emulate_ctxt *hvmemul_ctxt,
>      struct cpu_user_regs *regs)
>  {
> -    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(current);
> -    hvmemul_ctxt->ctxt.regs = regs;
> -    hvmemul_ctxt->ctxt.force_writeback = 1;
> -    hvmemul_ctxt->seg_reg_accessed = 0;
> -    hvmemul_ctxt->seg_reg_dirty = 0;
> -    hvmemul_ctxt->set_context = 0;
> +    struct vcpu *curr = current;
> +
> +    memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
> +
> +    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
>      hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
>      hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
> +
> +    hvmemul_ctxt->ctxt.regs = regs;
> +    hvmemul_ctxt->ctxt.force_writeback = true;
> +
> +    if ( cpu_has_vmx )
> +        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> +    else if ( cpu_has_svm_nrips )
> +        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_icebp;
> +    else
> +        hvmemul_ctxt->ctxt.swint_emulate = x86_swint_emulate_all;
>  }
> 
>  void hvm_emulate_init_per_insn(
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 03dcd71..9901f6f 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5363,8 +5363,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>          goto bail;
>      }
> 
> +    memset(&ptwr_ctxt, 0, sizeof(ptwr_ctxt));
> +
>      ptwr_ctxt.ctxt.regs = regs;
> -    ptwr_ctxt.ctxt.force_writeback = 0;
>      ptwr_ctxt.ctxt.addr_size = ptwr_ctxt.ctxt.sp_size =
>          is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG;
>      ptwr_ctxt.ctxt.swint_emulate = x86_swint_emulate_none;
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index ced2313..6f6668d 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -385,8 +385,9 @@ const struct x86_emulate_ops *shadow_init_emulation(
>      struct vcpu *v = current;
>      unsigned long addr;
> 
> +    memset(sh_ctxt, 0, sizeof(*sh_ctxt));
> +
>      sh_ctxt->ctxt.regs = regs;
> -    sh_ctxt->ctxt.force_writeback = 0;
>      sh_ctxt->ctxt.swint_emulate = x86_swint_emulate_none;
> 
>      if ( is_pv_vcpu(v) )
> @@ -396,7 +397,6 @@ const struct x86_emulate_ops *shadow_init_emulation(
>      }
> 
>      /* Segment cache initialisation. Primed with CS. */
> -    sh_ctxt->valid_seg_regs = 0;
>      creg = hvm_get_seg_reg(x86_seg_cs, sh_ctxt);
> 
>      /* Work out the emulation mode. */
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
> index 04f0dac..c5d9664 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1904,6 +1904,8 @@ x86_decode(
>      state->regs = ctxt->regs;
>      state->eip = ctxt->regs->eip;
> 
> +    /* Initialise output state in x86_emulate_ctxt */
> +    ctxt->opcode = ~0u;
>      ctxt->retire.byte = 0;

In the commit message you state that x86_decode() will "explicitly initialise all output state at its start". This doesn't seem to be all the output state. In fact you appear to be removing some initialization.

> 
>      op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index 993c576..93b268e 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -412,6 +412,10 @@ struct cpu_user_regs;
> 
>  struct x86_emulate_ctxt
>  {
> +    /*
> +     * Input state:
> +     */
> +
>      /* Register state before/after emulation. */
>      struct cpu_user_regs *regs;
> 
> @@ -421,14 +425,21 @@ struct x86_emulate_ctxt
>      /* Stack pointer width in bits (16, 32 or 64). */
>      unsigned int sp_size;
> 
> -    /* Canonical opcode (see below). */
> -    unsigned int opcode;
> -
>      /* Software event injection support. */
>      enum x86_swint_emulation swint_emulate;
> 
>      /* Set this if writes may have side effects. */
> -    uint8_t force_writeback;
> +    bool force_writeback;

Is this type change intentional? I assume it is, but you didn't call it out.

  Paul

> +
> +    /* Caller data that can be used by x86_emulate_ops' routines. */
> +    void *data;
> +
> +    /*
> +     * Output state:
> +     */
> +
> +    /* Canonical opcode (see below). */
> +    unsigned int opcode;
> 
>      /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
>      union {
> @@ -439,9 +450,6 @@ struct x86_emulate_ctxt
>          } flags;
>          uint8_t byte;
>      } retire;
> -
> -    /* Caller data that can be used by x86_emulate_ops' routines. */
> -    void *data;
>  };
> 
>  /*
> --
> 2.1.4



* Re: [PATCH 02/15] x86/emul: Simplify emulation state setup
  2016-11-23 15:58   ` Paul Durrant
@ 2016-11-23 16:01     ` Andrew Cooper
  2016-11-23 16:03       ` Paul Durrant
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:01 UTC (permalink / raw)
  To: Paul Durrant, Xen-devel; +Cc: Tim (Xen.org), George Dunlap, Jan Beulich

On 23/11/16 15:58, Paul Durrant wrote:
>> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
>> index 04f0dac..c5d9664 100644
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -1904,6 +1904,8 @@ x86_decode(
>>      state->regs = ctxt->regs;
>>      state->eip = ctxt->regs->eip;
>>
>> +    /* Initialise output state in x86_emulate_ctxt */
>> +    ctxt->opcode = ~0u;
>>      ctxt->retire.byte = 0;
> In the commit message you state that x86_decode() will "explicitly initalise all output state at its start". This doesn't seem to be all the output state. In fact you appear to be removing some initialization.

There are only two fields of output state, as delineated by the extra
comments in x86_emulate_ctxt.  Most of x86_emulate_ctxt is input state.
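
For reference, the output portion of the structure after this patch is just
the following (a trimmed sketch of the reordered struct):

    /*
     * Output state:
     */

    /* Canonical opcode (see below). */
    unsigned int opcode;

    /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
    union { ... } retire;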

>
>>      op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
>> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
>> index 993c576..93b268e 100644
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -412,6 +412,10 @@ struct cpu_user_regs;
>>
>>  struct x86_emulate_ctxt
>>  {
>> +    /*
>> +     * Input state:
>> +     */
>> +
>>      /* Register state before/after emulation. */
>>      struct cpu_user_regs *regs;
>>
>> @@ -421,14 +425,21 @@ struct x86_emulate_ctxt
>>      /* Stack pointer width in bits (16, 32 or 64). */
>>      unsigned int sp_size;
>>
>> -    /* Canonical opcode (see below). */
>> -    unsigned int opcode;
>> -
>>      /* Software event injection support. */
>>      enum x86_swint_emulation swint_emulate;
>>
>>      /* Set this if writes may have side effects. */
>> -    uint8_t force_writeback;
>> +    bool force_writeback;
> Is this type change intentional? I assume it is, but you didn't call it out.

Yes.  I thought I had it in the commit message, but will update for v2.

~Andrew

>
>   Paul
>
>> +
>> +    /* Caller data that can be used by x86_emulate_ops' routines. */
>> +    void *data;
>> +
>> +    /*
>> +     * Output state:
>> +     */
>> +
>> +    /* Canonical opcode (see below). */
>> +    unsigned int opcode;
>>
>>      /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
>>      union {
>> @@ -439,9 +450,6 @@ struct x86_emulate_ctxt
>>          } flags;
>>          uint8_t byte;
>>      } retire;
>> -
>> -    /* Caller data that can be used by x86_emulate_ops' routines. */
>> -    void *data;
>>  };
>>
>>  /*
>> --
>> 2.1.4



* Re: [PATCH 02/15] x86/emul: Simplify emulation state setup
  2016-11-23 16:01     ` Andrew Cooper
@ 2016-11-23 16:03       ` Paul Durrant
  0 siblings, 0 replies; 91+ messages in thread
From: Paul Durrant @ 2016-11-23 16:03 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Tim (Xen.org), George Dunlap, Jan Beulich

> -----Original Message-----
> From: Andrew Cooper
> Sent: 23 November 2016 16:01
> To: Paul Durrant <Paul.Durrant@citrix.com>; Xen-devel <xen-devel@lists.xen.org>
> Cc: Jan Beulich <JBeulich@suse.com>; Tim (Xen.org) <tim@xen.org>; George
> Dunlap <George.Dunlap@citrix.com>
> Subject: Re: [PATCH 02/15] x86/emul: Simplify emulation state setup
> 
> On 23/11/16 15:58, Paul Durrant wrote:
> >> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
> >> index 04f0dac..c5d9664 100644
> >> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> >> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> >> @@ -1904,6 +1904,8 @@ x86_decode(
> >>      state->regs = ctxt->regs;
> >>      state->eip = ctxt->regs->eip;
> >>
> >> +    /* Initialise output state in x86_emulate_ctxt */
> >> +    ctxt->opcode = ~0u;
> >>      ctxt->retire.byte = 0;
> > In the commit message you state that x86_decode() will "explicitly initialise
> > all output state at its start". This doesn't seem to be all the output state. In
> > fact you appear to be removing some initialization.
> 
> There are only two fields of output state, as delineated by the extra
> comments in x86_emulate_ctxt.  Most of x86_emulate_ctxt is input state.

D'oh, yes. Sorry, got confused by the field movements... my eyes were seeing '+' as '-' for some reason.

  Paul

> 
> >
> >>      op_bytes = def_op_bytes = ad_bytes = def_ad_bytes = ctxt->addr_size/8;
> >> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> >> index 993c576..93b268e 100644
> >> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> >> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> >> @@ -412,6 +412,10 @@ struct cpu_user_regs;
> >>
> >>  struct x86_emulate_ctxt
> >>  {
> >> +    /*
> >> +     * Input state:
> >> +     */
> >> +
> >>      /* Register state before/after emulation. */
> >>      struct cpu_user_regs *regs;
> >>
> >> @@ -421,14 +425,21 @@ struct x86_emulate_ctxt
> >>      /* Stack pointer width in bits (16, 32 or 64). */
> >>      unsigned int sp_size;
> >>
> >> -    /* Canonical opcode (see below). */
> >> -    unsigned int opcode;
> >> -
> >>      /* Software event injection support. */
> >>      enum x86_swint_emulation swint_emulate;
> >>
> >>      /* Set this if writes may have side effects. */
> >> -    uint8_t force_writeback;
> >> +    bool force_writeback;
> > Is this type change intentional? I assume it is, but you didn't call it out.
> 
> Yes.  I thought I had it in the commit message, but will update for v2.
> 
> ~Andrew
> 
> >
> >   Paul
> >
> >> +
> >> +    /* Caller data that can be used by x86_emulate_ops' routines. */
> >> +    void *data;
> >> +
> >> +    /*
> >> +     * Output state:
> >> +     */
> >> +
> >> +    /* Canonical opcode (see below). */
> >> +    unsigned int opcode;
> >>
> >>      /* Retirement state, set by the emulator (valid only on
> X86EMUL_OKAY).
> >> */
> >>      union {
> >> @@ -439,9 +450,6 @@ struct x86_emulate_ctxt
> >>          } flags;
> >>          uint8_t byte;
> >>      } retire;
> >> -
> >> -    /* Caller data that can be used by x86_emulate_ops' routines. */
> >> -    void *data;
> >>  };
> >>
> >>  /*
> >> --
> >> 2.1.4



* Re: [PATCH 02/15] x86/emul: Simplfy emulation state setup
  2016-11-23 15:38 ` [PATCH 02/15] x86/emul: Simplfy emulation state setup Andrew Cooper
  2016-11-23 15:58   ` Paul Durrant
@ 2016-11-23 16:07   ` Tim Deegan
  2016-11-24 13:44   ` Jan Beulich
  2 siblings, 0 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:07 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Jan Beulich, Xen-devel

At 15:38 +0000 on 23 Nov (1479915525), Andrew Cooper wrote:
> The current code to set up emulation state is ad-hoc and error prone.
> 
>  * Consistently zero all emulation state structures.
>  * Avoid explicitly initialising some state to 0.
>  * Explicitly identify all input and output state in x86_emulate_ctxt.  This
>    involves rearranging some fields.
>  * Have x86_decode() explicitly initialise all output state at its start.
> 
> In addition, move the calculation of hvmemul_ctxt->ctxt.swint_emulate from
> _hvm_emulate_one() to hvm_emulate_init_once(), as it doesn't need
> recalculating for each instruction.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Shadow code changes Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-23 15:38 ` [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
@ 2016-11-23 16:12   ` Paul Durrant
  2016-11-23 16:22     ` Andrew Cooper
  2016-11-23 16:59   ` Boris Ostrovsky
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 91+ messages in thread
From: Paul Durrant @ 2016-11-23 16:12 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Jun Nakajima,
	Boris Ostrovsky, Suravee Suthikulpanit

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 23 November 2016 15:39
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jun
> Nakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>;
> Boris Ostrovsky <boris.ostrovsky@oracle.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>
> Subject: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and
> move it into the emulation infrastructure
> 
> The x86 emulator needs to gain an understanding of interrupts and
> exceptions
> generated by its actions.  The naming choice is to match both the Intel and
> AMD terms, and to avoid 'trap' specifically as it has an architectural meaning
> different to its current usage.
> 
> While making this change, make other changes for consistency
> 
>  * Rename *_trap() infrastructure to *_event()
>  * Rename trapnr/trap parameters to vector
>  * Convert hvm_inject_hw_exception() and hvm_inject_page_fault() to
> being
>    static inlines, as they are only thin wrappers around hvm_inject_event()
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  xen/arch/x86/hvm/emulate.c              |  6 +--
>  xen/arch/x86/hvm/hvm.c                  | 33 ++++------------
>  xen/arch/x86/hvm/io.c                   |  2 +-
>  xen/arch/x86/hvm/svm/nestedsvm.c        |  7 ++--
>  xen/arch/x86/hvm/svm/svm.c              | 62 ++++++++++++++---------------
>  xen/arch/x86/hvm/vmx/vmx.c              | 66 +++++++++++++++----------------
>  xen/arch/x86/hvm/vmx/vvmx.c             | 11 +++---
>  xen/arch/x86/x86_emulate/x86_emulate.c  | 11 ++++++
>  xen/arch/x86/x86_emulate/x86_emulate.h  | 22 +++++++++++
>  xen/include/asm-x86/hvm/emulate.h       |  2 +-
>  xen/include/asm-x86/hvm/hvm.h           | 69 ++++++++++++++++--------------
> ---
>  xen/include/asm-x86/hvm/svm/nestedsvm.h |  6 +--
>  xen/include/asm-x86/hvm/vcpu.h          |  2 +-
>  xen/include/asm-x86/hvm/vmx/vvmx.h      |  4 +-
>  14 files changed, 159 insertions(+), 144 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index f24e211..d0c3185 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1679,7 +1679,7 @@ static int hvmemul_invlpg(
>           * violations, so squash them.
>           */
>          hvmemul_ctxt->exn_pending = 0;
> -        hvmemul_ctxt->trap = (struct hvm_trap){};
> +        hvmemul_ctxt->trap = (struct x86_event){};

Can't say I like that way of initializing but...

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

>          rc = X86EMUL_OKAY;
>      }
> 
> @@ -1868,7 +1868,7 @@ int hvm_emulate_one_mmio(unsigned long mfn,
> unsigned long gla)
>          break;
>      case X86EMUL_EXCEPTION:
>          if ( ctxt.exn_pending )
> -            hvm_inject_trap(&ctxt.trap);
> +            hvm_inject_event(&ctxt.trap);
>          /* fallthrough */
>      default:
>          hvm_emulate_writeback(&ctxt);
> @@ -1928,7 +1928,7 @@ void hvm_emulate_one_vm_event(enum
> emul_kind kind, unsigned int trapnr,
>          break;
>      case X86EMUL_EXCEPTION:
>          if ( ctx.exn_pending )
> -            hvm_inject_trap(&ctx.trap);
> +            hvm_inject_event(&ctx.trap);
>          break;
>      }
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 25dc759..7b434aa 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -535,7 +535,7 @@ void hvm_do_resume(struct vcpu *v)
>      /* Inject pending hw/sw trap */
>      if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
>      {
> -        hvm_inject_trap(&v->arch.hvm_vcpu.inject_trap);
> +        hvm_inject_event(&v->arch.hvm_vcpu.inject_trap);
>          v->arch.hvm_vcpu.inject_trap.vector = -1;
>      }
>  }
> @@ -1676,19 +1676,19 @@ void hvm_triple_fault(void)
>      domain_shutdown(d, reason);
>  }
> 
> -void hvm_inject_trap(const struct hvm_trap *trap)
> +void hvm_inject_event(const struct x86_event *event)
>  {
>      struct vcpu *curr = current;
> 
>      if ( nestedhvm_enabled(curr->domain) &&
>           !nestedhvm_vmswitch_in_progress(curr) &&
>           nestedhvm_vcpu_in_guestmode(curr) &&
> -         nhvm_vmcx_guest_intercepts_trap(
> -             curr, trap->vector, trap->error_code) )
> +         nhvm_vmcx_guest_intercepts_event(
> +             curr, event->vector, event->error_code) )
>      {
>          enum nestedhvm_vmexits nsret;
> 
> -        nsret = nhvm_vcpu_vmexit_trap(curr, trap);
> +        nsret = nhvm_vcpu_vmexit_event(curr, event);
> 
>          switch ( nsret )
>          {
> @@ -1704,26 +1704,7 @@ void hvm_inject_trap(const struct hvm_trap
> *trap)
>          }
>      }
> 
> -    hvm_funcs.inject_trap(trap);
> -}
> -
> -void hvm_inject_hw_exception(unsigned int trapnr, int errcode)
> -{
> -    struct hvm_trap trap = {
> -        .vector = trapnr,
> -        .type = X86_EVENTTYPE_HW_EXCEPTION,
> -        .error_code = errcode };
> -    hvm_inject_trap(&trap);
> -}
> -
> -void hvm_inject_page_fault(int errcode, unsigned long cr2)
> -{
> -    struct hvm_trap trap = {
> -        .vector = TRAP_page_fault,
> -        .type = X86_EVENTTYPE_HW_EXCEPTION,
> -        .error_code = errcode,
> -        .cr2 = cr2 };
> -    hvm_inject_trap(&trap);
> +    hvm_funcs.inject_event(event);
>  }
> 
>  int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
> @@ -4096,7 +4077,7 @@ void hvm_ud_intercept(struct cpu_user_regs
> *regs)
>          break;
>      case X86EMUL_EXCEPTION:
>          if ( ctxt.exn_pending )
> -            hvm_inject_trap(&ctxt.trap);
> +            hvm_inject_event(&ctxt.trap);
>          /* fall through */
>      default:
>          hvm_emulate_writeback(&ctxt);
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index 7305801..1279f68 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -103,7 +103,7 @@ int handle_mmio(void)
>          return 0;
>      case X86EMUL_EXCEPTION:
>          if ( ctxt.exn_pending )
> -            hvm_inject_trap(&ctxt.trap);
> +            hvm_inject_event(&ctxt.trap);
>          break;
>      default:
>          break;
> diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c
> b/xen/arch/x86/hvm/svm/nestedsvm.c
> index f9b38ab..b6b8526 100644
> --- a/xen/arch/x86/hvm/svm/nestedsvm.c
> +++ b/xen/arch/x86/hvm/svm/nestedsvm.c
> @@ -821,7 +821,7 @@ nsvm_vcpu_vmexit_inject(struct vcpu *v, struct
> cpu_user_regs *regs,
>  }
> 
>  int
> -nsvm_vcpu_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap)
> +nsvm_vcpu_vmexit_event(struct vcpu *v, const struct x86_event *trap)
>  {
>      ASSERT(vcpu_nestedhvm(v).nv_vvmcx != NULL);
> 
> @@ -994,10 +994,11 @@ nsvm_vmcb_guest_intercepts_exitcode(struct
> vcpu *v,
>  }
> 
>  bool_t
> -nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int trapnr, int
> errcode)
> +nsvm_vmcb_guest_intercepts_event(
> +    struct vcpu *v, unsigned int vector, int errcode)
>  {
>      return nsvm_vmcb_guest_intercepts_exitcode(v,
> -        guest_cpu_user_regs(), VMEXIT_EXCEPTION_DE + trapnr);
> +        guest_cpu_user_regs(), VMEXIT_EXCEPTION_DE + vector);
>  }
> 
>  static int
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index f737d8c..66eb30b 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1203,15 +1203,15 @@ static void svm_vcpu_destroy(struct vcpu *v)
>      passive_domain_destroy(v);
>  }
> 
> -static void svm_inject_trap(const struct hvm_trap *trap)
> +static void svm_inject_event(const struct x86_event *event)
>  {
>      struct vcpu *curr = current;
>      struct vmcb_struct *vmcb = curr->arch.hvm_svm.vmcb;
> -    eventinj_t event = vmcb->eventinj;
> -    struct hvm_trap _trap = *trap;
> +    eventinj_t eventinj = vmcb->eventinj;
> +    struct x86_event _event = *event;
>      const struct cpu_user_regs *regs = guest_cpu_user_regs();
> 
> -    switch ( _trap.vector )
> +    switch ( _event.vector )
>      {
>      case TRAP_debug:
>          if ( regs->eflags & X86_EFLAGS_TF )
> @@ -1229,21 +1229,21 @@ static void svm_inject_trap(const struct
> hvm_trap *trap)
>          }
>      }
> 
> -    if ( unlikely(event.fields.v) &&
> -         (event.fields.type == X86_EVENTTYPE_HW_EXCEPTION) )
> +    if ( unlikely(eventinj.fields.v) &&
> +         (eventinj.fields.type == X86_EVENTTYPE_HW_EXCEPTION) )
>      {
> -        _trap.vector = hvm_combine_hw_exceptions(
> -            event.fields.vector, _trap.vector);
> -        if ( _trap.vector == TRAP_double_fault )
> -            _trap.error_code = 0;
> +        _event.vector = hvm_combine_hw_exceptions(
> +            eventinj.fields.vector, _event.vector);
> +        if ( _event.vector == TRAP_double_fault )
> +            _event.error_code = 0;
>      }
> 
> -    event.bytes = 0;
> -    event.fields.v = 1;
> -    event.fields.vector = _trap.vector;
> +    eventinj.bytes = 0;
> +    eventinj.fields.v = 1;
> +    eventinj.fields.vector = _event.vector;
> 
>      /* Refer to AMD Vol 2: System Programming, 15.20 Event Injection. */
> -    switch ( _trap.type )
> +    switch ( _event.type )
>      {
>      case X86_EVENTTYPE_SW_INTERRUPT: /* int $n */
>          /*
> @@ -1253,8 +1253,8 @@ static void svm_inject_trap(const struct hvm_trap
> *trap)
>           * moved eip forward if appropriate.
>           */
>          if ( cpu_has_svm_nrips )
> -            vmcb->nextrip = regs->eip + _trap.insn_len;
> -        event.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
> +            vmcb->nextrip = regs->eip + _event.insn_len;
> +        eventinj.fields.type = X86_EVENTTYPE_SW_INTERRUPT;
>          break;
> 
>      case X86_EVENTTYPE_PRI_SW_EXCEPTION: /* icebp */
> @@ -1265,7 +1265,7 @@ static void svm_inject_trap(const struct hvm_trap
> *trap)
>           */
>          if ( cpu_has_svm_nrips )
>              vmcb->nextrip = regs->eip;
> -        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
> +        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
>          break;
> 
>      case X86_EVENTTYPE_SW_EXCEPTION: /* int3, into */
> @@ -1279,28 +1279,28 @@ static void svm_inject_trap(const struct
> hvm_trap *trap)
>           * the correct faulting eip should a fault occur.
>           */
>          if ( cpu_has_svm_nrips )
> -            vmcb->nextrip = regs->eip + _trap.insn_len;
> -        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
> +            vmcb->nextrip = regs->eip + _event.insn_len;
> +        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
>          break;
> 
>      default:
> -        event.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
> -        event.fields.ev = (_trap.error_code !=
> HVM_DELIVER_NO_ERROR_CODE);
> -        event.fields.errorcode = _trap.error_code;
> +        eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
> +        eventinj.fields.ev = (_event.error_code !=
> HVM_DELIVER_NO_ERROR_CODE);
> +        eventinj.fields.errorcode = _event.error_code;
>          break;
>      }
> 
> -    vmcb->eventinj = event;
> +    vmcb->eventinj = eventinj;
> 
> -    if ( _trap.vector == TRAP_page_fault )
> +    if ( _event.vector == TRAP_page_fault )
>      {
> -        curr->arch.hvm_vcpu.guest_cr[2] = _trap.cr2;
> -        vmcb_set_cr2(vmcb, _trap.cr2);
> -        HVMTRACE_LONG_2D(PF_INJECT, _trap.error_code,
> TRC_PAR_LONG(_trap.cr2));
> +        curr->arch.hvm_vcpu.guest_cr[2] = _event.cr2;
> +        vmcb_set_cr2(vmcb, _event.cr2);
> +        HVMTRACE_LONG_2D(PF_INJECT, _event.error_code,
> TRC_PAR_LONG(_event.cr2));
>      }
>      else
>      {
> -        HVMTRACE_2D(INJ_EXC, _trap.vector, _trap.error_code);
> +        HVMTRACE_2D(INJ_EXC, _event.vector, _event.error_code);
>      }
>  }
> 
> @@ -2250,7 +2250,7 @@ static struct hvm_function_table __initdata
> svm_function_table = {
>      .set_guest_pat        = svm_set_guest_pat,
>      .get_guest_pat        = svm_get_guest_pat,
>      .set_tsc_offset       = svm_set_tsc_offset,
> -    .inject_trap          = svm_inject_trap,
> +    .inject_event         = svm_inject_event,
>      .init_hypercall_page  = svm_init_hypercall_page,
>      .event_pending        = svm_event_pending,
>      .invlpg               = svm_invlpg,
> @@ -2265,9 +2265,9 @@ static struct hvm_function_table __initdata
> svm_function_table = {
>      .nhvm_vcpu_initialise = nsvm_vcpu_initialise,
>      .nhvm_vcpu_destroy = nsvm_vcpu_destroy,
>      .nhvm_vcpu_reset = nsvm_vcpu_reset,
> -    .nhvm_vcpu_vmexit_trap = nsvm_vcpu_vmexit_trap,
> +    .nhvm_vcpu_vmexit_event = nsvm_vcpu_vmexit_event,
>      .nhvm_vcpu_p2m_base = nsvm_vcpu_hostcr3,
> -    .nhvm_vmcx_guest_intercepts_trap =
> nsvm_vmcb_guest_intercepts_trap,
> +    .nhvm_vmcx_guest_intercepts_event =
> nsvm_vmcb_guest_intercepts_event,
>      .nhvm_vmcx_hap_enabled = nsvm_vmcb_hap_enabled,
>      .nhvm_intr_blocked = nsvm_intr_blocked,
>      .nhvm_hap_walk_L1_p2m = nsvm_hap_walk_L1_p2m,
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 0a52624..b1d8a0b 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1623,9 +1623,9 @@ void nvmx_enqueue_n2_exceptions(struct vcpu
> *v,
>                   nvmx->intr.intr_info, nvmx->intr.error_code);
>  }
> 
> -static int nvmx_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap)
> +static int nvmx_vmexit_event(struct vcpu *v, const struct x86_event
> *event)
>  {
> -    nvmx_enqueue_n2_exceptions(v, trap->vector, trap->error_code,
> +    nvmx_enqueue_n2_exceptions(v, event->vector, event->error_code,
>                                 hvm_intsrc_none);
>      return NESTEDHVM_VMEXIT_DONE;
>  }
> @@ -1707,13 +1707,13 @@ void vmx_inject_nmi(void)
>   *  - #DB is X86_EVENTTYPE_HW_EXCEPTION, except when generated by
>   *    opcode 0xf1 (which is X86_EVENTTYPE_PRI_SW_EXCEPTION)
>   */
> -static void vmx_inject_trap(const struct hvm_trap *trap)
> +static void vmx_inject_event(const struct x86_event *event)
>  {
>      unsigned long intr_info;
>      struct vcpu *curr = current;
> -    struct hvm_trap _trap = *trap;
> +    struct x86_event _event = *event;
> 
> -    switch ( _trap.vector | -(_trap.type == X86_EVENTTYPE_SW_INTERRUPT) )
> +    switch ( _event.vector | -(_event.type ==
> X86_EVENTTYPE_SW_INTERRUPT) )
>      {
>      case TRAP_debug:
>          if ( guest_cpu_user_regs()->eflags & X86_EFLAGS_TF )
> @@ -1722,7 +1722,7 @@ static void vmx_inject_trap(const struct hvm_trap
> *trap)
>              write_debugreg(6, read_debugreg(6) | DR_STEP);
>          }
>          if ( !nestedhvm_vcpu_in_guestmode(curr) ||
> -             !nvmx_intercepts_exception(curr, TRAP_debug, _trap.error_code) )
> +             !nvmx_intercepts_exception(curr, TRAP_debug, _event.error_code)
> )
>          {
>              unsigned long val;
> 
> @@ -1744,8 +1744,8 @@ static void vmx_inject_trap(const struct hvm_trap
> *trap)
>          break;
> 
>      case TRAP_page_fault:
> -        ASSERT(_trap.type == X86_EVENTTYPE_HW_EXCEPTION);
> -        curr->arch.hvm_vcpu.guest_cr[2] = _trap.cr2;
> +        ASSERT(_event.type == X86_EVENTTYPE_HW_EXCEPTION);
> +        curr->arch.hvm_vcpu.guest_cr[2] = _event.cr2;
>          break;
>      }
> 
> @@ -1758,34 +1758,34 @@ static void vmx_inject_trap(const struct
> hvm_trap *trap)
>           (MASK_EXTR(intr_info, INTR_INFO_INTR_TYPE_MASK) ==
>            X86_EVENTTYPE_HW_EXCEPTION) )
>      {
> -        _trap.vector = hvm_combine_hw_exceptions(
> -            (uint8_t)intr_info, _trap.vector);
> -        if ( _trap.vector == TRAP_double_fault )
> -            _trap.error_code = 0;
> +        _event.vector = hvm_combine_hw_exceptions(
> +            (uint8_t)intr_info, _event.vector);
> +        if ( _event.vector == TRAP_double_fault )
> +            _event.error_code = 0;
>      }
> 
> -    if ( _trap.type >= X86_EVENTTYPE_SW_INTERRUPT )
> -        __vmwrite(VM_ENTRY_INSTRUCTION_LEN, _trap.insn_len);
> +    if ( _event.type >= X86_EVENTTYPE_SW_INTERRUPT )
> +        __vmwrite(VM_ENTRY_INSTRUCTION_LEN, _event.insn_len);
> 
>      if ( nestedhvm_vcpu_in_guestmode(curr) &&
> -         nvmx_intercepts_exception(curr, _trap.vector, _trap.error_code) )
> +         nvmx_intercepts_exception(curr, _event.vector, _event.error_code) )
>      {
>          nvmx_enqueue_n2_exceptions (curr,
>              INTR_INFO_VALID_MASK |
> -            MASK_INSR(_trap.type, INTR_INFO_INTR_TYPE_MASK) |
> -            MASK_INSR(_trap.vector, INTR_INFO_VECTOR_MASK),
> -            _trap.error_code, hvm_intsrc_none);
> +            MASK_INSR(_event.type, INTR_INFO_INTR_TYPE_MASK) |
> +            MASK_INSR(_event.vector, INTR_INFO_VECTOR_MASK),
> +            _event.error_code, hvm_intsrc_none);
>          return;
>      }
>      else
> -        __vmx_inject_exception(_trap.vector, _trap.type, _trap.error_code);
> +        __vmx_inject_exception(_event.vector, _event.type,
> _event.error_code);
> 
> -    if ( (_trap.vector == TRAP_page_fault) &&
> -         (_trap.type == X86_EVENTTYPE_HW_EXCEPTION) )
> -        HVMTRACE_LONG_2D(PF_INJECT, _trap.error_code,
> +    if ( (_event.vector == TRAP_page_fault) &&
> +         (_event.type == X86_EVENTTYPE_HW_EXCEPTION) )
> +        HVMTRACE_LONG_2D(PF_INJECT, _event.error_code,
>                           TRC_PAR_LONG(curr->arch.hvm_vcpu.guest_cr[2]));
>      else
> -        HVMTRACE_2D(INJ_EXC, _trap.vector, _trap.error_code);
> +        HVMTRACE_2D(INJ_EXC, _event.vector, _event.error_code);
>  }
> 
>  static int vmx_event_pending(struct vcpu *v)
> @@ -2162,7 +2162,7 @@ static struct hvm_function_table __initdata
> vmx_function_table = {
>      .set_guest_pat        = vmx_set_guest_pat,
>      .get_guest_pat        = vmx_get_guest_pat,
>      .set_tsc_offset       = vmx_set_tsc_offset,
> -    .inject_trap          = vmx_inject_trap,
> +    .inject_event         = vmx_inject_event,
>      .init_hypercall_page  = vmx_init_hypercall_page,
>      .event_pending        = vmx_event_pending,
>      .invlpg               = vmx_invlpg,
> @@ -2182,8 +2182,8 @@ static struct hvm_function_table __initdata
> vmx_function_table = {
>      .nhvm_vcpu_reset      = nvmx_vcpu_reset,
>      .nhvm_vcpu_p2m_base   = nvmx_vcpu_eptp_base,
>      .nhvm_vmcx_hap_enabled = nvmx_ept_enabled,
> -    .nhvm_vmcx_guest_intercepts_trap = nvmx_intercepts_exception,
> -    .nhvm_vcpu_vmexit_trap = nvmx_vmexit_trap,
> +    .nhvm_vmcx_guest_intercepts_event = nvmx_intercepts_exception,
> +    .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
>      .nhvm_intr_blocked    = nvmx_intr_blocked,
>      .nhvm_domain_relinquish_resources =
> nvmx_domain_relinquish_resources,
>      .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
> @@ -3201,7 +3201,7 @@ static int vmx_handle_eoi_write(void)
>   */
>  static void vmx_propagate_intr(unsigned long intr)
>  {
> -    struct hvm_trap trap = {
> +    struct x86_event event = {
>          .vector = MASK_EXTR(intr, INTR_INFO_VECTOR_MASK),
>          .type = MASK_EXTR(intr, INTR_INFO_INTR_TYPE_MASK),
>      };
> @@ -3210,20 +3210,20 @@ static void vmx_propagate_intr(unsigned long
> intr)
>      if ( intr & INTR_INFO_DELIVER_CODE_MASK )
>      {
>          __vmread(VM_EXIT_INTR_ERROR_CODE, &tmp);
> -        trap.error_code = tmp;
> +        event.error_code = tmp;
>      }
>      else
> -        trap.error_code = HVM_DELIVER_NO_ERROR_CODE;
> +        event.error_code = HVM_DELIVER_NO_ERROR_CODE;
> 
> -    if ( trap.type >= X86_EVENTTYPE_SW_INTERRUPT )
> +    if ( event.type >= X86_EVENTTYPE_SW_INTERRUPT )
>      {
>          __vmread(VM_EXIT_INSTRUCTION_LEN, &tmp);
> -        trap.insn_len = tmp;
> +        event.insn_len = tmp;
>      }
>      else
> -        trap.insn_len = 0;
> +        event.insn_len = 0;
> 
> -    hvm_inject_trap(&trap);
> +    hvm_inject_event(&event);
>  }
> 
>  static void vmx_idtv_reinject(unsigned long idtv_info)
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> b/xen/arch/x86/hvm/vmx/vvmx.c
> index bed2e0a..b5837d4 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -491,18 +491,19 @@ static void vmreturn(struct cpu_user_regs *regs,
> enum vmx_ops_result ops_res)
>      regs->eflags = eflags;
>  }
> 
> -bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
> -                                 int error_code)
> +bool_t nvmx_intercepts_exception(
> +    struct vcpu *v, unsigned int vector, int error_code)
>  {
>      u32 exception_bitmap, pfec_match=0, pfec_mask=0;
>      int r;
> 
> -    ASSERT ( trap < 32 );
> +    ASSERT(vector < 32);
> 
>      exception_bitmap = get_vvmcs(v, EXCEPTION_BITMAP);
> -    r = exception_bitmap & (1 << trap) ? 1: 0;
> +    r = exception_bitmap & (1 << vector) ? 1: 0;
> 
> -    if ( trap == TRAP_page_fault ) {
> +    if ( vector == TRAP_page_fault )
> +    {
>          pfec_match = get_vvmcs(v, PAGE_FAULT_ERROR_CODE_MATCH);
>          pfec_mask  = get_vvmcs(v, PAGE_FAULT_ERROR_CODE_MASK);
>          if ( (error_code & pfec_mask) != pfec_match )
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c
> b/xen/arch/x86/x86_emulate/x86_emulate.c
> index c5d9664..2e65322 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -5453,6 +5453,17 @@ static void __init __maybe_unused
> build_assertions(void)
>      BUILD_BUG_ON(x86_seg_ds != 3);
>      BUILD_BUG_ON(x86_seg_fs != 4);
>      BUILD_BUG_ON(x86_seg_gs != 5);
> +
> +    /*
> +     * Check X86_EVENTTYPE_* against VMCB EVENTINJ and VMCS
> INTR_INFO type
> +     * fields.
> +     */
> +    BUILD_BUG_ON(X86_EVENTTYPE_EXT_INTR != 0);
> +    BUILD_BUG_ON(X86_EVENTTYPE_NMI != 2);
> +    BUILD_BUG_ON(X86_EVENTTYPE_HW_EXCEPTION != 3);
> +    BUILD_BUG_ON(X86_EVENTTYPE_SW_INTERRUPT != 4);
> +    BUILD_BUG_ON(X86_EVENTTYPE_PRI_SW_EXCEPTION != 5);
> +    BUILD_BUG_ON(X86_EVENTTYPE_SW_EXCEPTION != 6);
>  }
> 
>  #ifdef __XEN__
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h
> b/xen/arch/x86/x86_emulate/x86_emulate.h
> index 93b268e..b0d8e6f 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -67,6 +67,28 @@ enum x86_swint_emulation {
>      x86_swint_emulate_all,  /* Help needed with all software events */
>  };
> 
> +/*
> + * x86 event types. This enumeration is valid for:
> + *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
> + *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
> + */
> +enum x86_event_type {
> +    X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
> +    X86_EVENTTYPE_NMI = 2,          /* NMI */
> +    X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
> +    X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
> +    X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
> +    X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
> +};
> +
> +struct x86_event {
> +    int16_t       vector;
> +    uint8_t       type;         /* X86_EVENTTYPE_* */
> +    uint8_t       insn_len;     /* Instruction length */
> +    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
> +    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
> +};
> +
>  /*
>   * Attribute for segment selector. This is a copy of bit 40:47 & 52:55 of the
>   * segment descriptor. It happens to match the format of an AMD SVM
> VMCB.
> diff --git a/xen/include/asm-x86/hvm/emulate.h b/xen/include/asm-
> x86/hvm/emulate.h
> index d4186a2..3b7ec33 100644
> --- a/xen/include/asm-x86/hvm/emulate.h
> +++ b/xen/include/asm-x86/hvm/emulate.h
> @@ -30,7 +30,7 @@ struct hvm_emulate_ctxt {
>      unsigned long seg_reg_dirty;
> 
>      bool_t exn_pending;
> -    struct hvm_trap trap;
> +    struct x86_event trap;
> 
>      uint32_t intr_shadow;
> 
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-
> x86/hvm/hvm.h
> index 7e7462e..51a64f7 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -77,14 +77,6 @@ enum hvm_intblk {
>  #define HVM_HAP_SUPERPAGE_2MB   0x00000001
>  #define HVM_HAP_SUPERPAGE_1GB   0x00000002
> 
> -struct hvm_trap {
> -    int16_t       vector;
> -    uint8_t       type;         /* X86_EVENTTYPE_* */
> -    uint8_t       insn_len;     /* Instruction length */
> -    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
> -    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
> -};
> -
>  /*
>   * The hardware virtual machine (HVM) interface abstracts away from the
>   * x86/x86_64 CPU virtualization assist specifics. Currently this interface
> @@ -152,7 +144,7 @@ struct hvm_function_table {
> 
>      void (*set_tsc_offset)(struct vcpu *v, u64 offset, u64 at_tsc);
> 
> -    void (*inject_trap)(const struct hvm_trap *trap);
> +    void (*inject_event)(const struct x86_event *event);
> 
>      void (*init_hypercall_page)(struct domain *d, void *hypercall_page);
> 
> @@ -185,11 +177,10 @@ struct hvm_function_table {
>      int (*nhvm_vcpu_initialise)(struct vcpu *v);
>      void (*nhvm_vcpu_destroy)(struct vcpu *v);
>      int (*nhvm_vcpu_reset)(struct vcpu *v);
> -    int (*nhvm_vcpu_vmexit_trap)(struct vcpu *v, const struct hvm_trap
> *trap);
> +    int (*nhvm_vcpu_vmexit_event)(struct vcpu *v, const struct x86_event
> *event);
>      uint64_t (*nhvm_vcpu_p2m_base)(struct vcpu *v);
> -    bool_t (*nhvm_vmcx_guest_intercepts_trap)(struct vcpu *v,
> -                                              unsigned int trapnr,
> -                                              int errcode);
> +    bool_t (*nhvm_vmcx_guest_intercepts_event)(
> +        struct vcpu *v, unsigned int vector, int errcode);
> 
>      bool_t (*nhvm_vmcx_hap_enabled)(struct vcpu *v);
> 
> @@ -419,9 +410,30 @@ void hvm_migrate_timers(struct vcpu *v);
>  void hvm_do_resume(struct vcpu *v);
>  void hvm_migrate_pirqs(struct vcpu *v);
> 
> -void hvm_inject_trap(const struct hvm_trap *trap);
> -void hvm_inject_hw_exception(unsigned int trapnr, int errcode);
> -void hvm_inject_page_fault(int errcode, unsigned long cr2);
> +void hvm_inject_event(const struct x86_event *event);
> +
> +static inline void hvm_inject_hw_exception(unsigned int vector, int
> errcode)
> +{
> +    struct x86_event event = {
> +        .vector = vector,
> +        .type = X86_EVENTTYPE_HW_EXCEPTION,
> +        .error_code = errcode,
> +    };
> +
> +    hvm_inject_event(&event);
> +}
> +
> +static inline void hvm_inject_page_fault(int errcode, unsigned long cr2)
> +{
> +    struct x86_event event = {
> +        .vector = TRAP_page_fault,
> +        .type = X86_EVENTTYPE_HW_EXCEPTION,
> +        .error_code = errcode,
> +        .cr2 = cr2,
> +    };
> +
> +    hvm_inject_event(&event);
> +}
> 
>  static inline int hvm_event_pending(struct vcpu *v)
>  {
> @@ -437,18 +449,6 @@ static inline int hvm_event_pending(struct vcpu *v)
>                         (1U << TRAP_alignment_check) | \
>                         (1U << TRAP_machine_check))
> 
> -/*
> - * x86 event types. This enumeration is valid for:
> - *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
> - *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
> - */
> -#define X86_EVENTTYPE_EXT_INTR         0 /* external interrupt */
> -#define X86_EVENTTYPE_NMI              2 /* NMI */
> -#define X86_EVENTTYPE_HW_EXCEPTION     3 /* hardware exception */
> -#define X86_EVENTTYPE_SW_INTERRUPT     4 /* software interrupt (CD nn)
> */
> -#define X86_EVENTTYPE_PRI_SW_EXCEPTION 5 /* ICEBP (F1) */
> -#define X86_EVENTTYPE_SW_EXCEPTION     6 /* INT3 (CC), INTO (CE) */
> -
>  int hvm_event_needs_reinjection(uint8_t type, uint8_t vector);
> 
>  uint8_t hvm_combine_hw_exceptions(uint8_t vec1, uint8_t vec2);
> @@ -542,10 +542,10 @@ int hvm_x2apic_msr_write(struct vcpu *v,
> unsigned int msr, uint64_t msr_content)
>  /* inject vmexit into l1 guest. l1 guest will see a VMEXIT due to
>   * 'trapnr' exception.
>   */
> -static inline int nhvm_vcpu_vmexit_trap(struct vcpu *v,
> -                                        const struct hvm_trap *trap)
> +static inline int nhvm_vcpu_vmexit_event(
> +    struct vcpu *v, const struct x86_event *event)
>  {
> -    return hvm_funcs.nhvm_vcpu_vmexit_trap(v, trap);
> +    return hvm_funcs.nhvm_vcpu_vmexit_event(v, event);
>  }
> 
>  /* returns l1 guest's cr3 that points to the page table used to
> @@ -557,11 +557,10 @@ static inline uint64_t nhvm_vcpu_p2m_base(struct
> vcpu *v)
>  }
> 
>  /* returns true, when l1 guest intercepts the specified trap */
> -static inline bool_t nhvm_vmcx_guest_intercepts_trap(struct vcpu *v,
> -                                                     unsigned int trap,
> -                                                     int errcode)
> +static inline bool_t nhvm_vmcx_guest_intercepts_event(
> +    struct vcpu *v, unsigned int vector, int errcode)
>  {
> -    return hvm_funcs.nhvm_vmcx_guest_intercepts_trap(v, trap, errcode);
> +    return hvm_funcs.nhvm_vmcx_guest_intercepts_event(v, vector,
> errcode);
>  }
> 
>  /* returns true when l1 guest wants to use hap to run l2 guest */
> diff --git a/xen/include/asm-x86/hvm/svm/nestedsvm.h
> b/xen/include/asm-x86/hvm/svm/nestedsvm.h
> index 0dbc5ec..4b36c25 100644
> --- a/xen/include/asm-x86/hvm/svm/nestedsvm.h
> +++ b/xen/include/asm-x86/hvm/svm/nestedsvm.h
> @@ -110,10 +110,10 @@ void nsvm_vcpu_destroy(struct vcpu *v);
>  int nsvm_vcpu_initialise(struct vcpu *v);
>  int nsvm_vcpu_reset(struct vcpu *v);
>  int nsvm_vcpu_vmrun(struct vcpu *v, struct cpu_user_regs *regs);
> -int nsvm_vcpu_vmexit_trap(struct vcpu *v, const struct hvm_trap *trap);
> +int nsvm_vcpu_vmexit_event(struct vcpu *v, const struct x86_event
> *event);
>  uint64_t nsvm_vcpu_hostcr3(struct vcpu *v);
> -bool_t nsvm_vmcb_guest_intercepts_trap(struct vcpu *v, unsigned int
> trapnr,
> -                                       int errcode);
> +bool_t nsvm_vmcb_guest_intercepts_event(
> +    struct vcpu *v, unsigned int vector, int errcode);
>  bool_t nsvm_vmcb_hap_enabled(struct vcpu *v);
>  enum hvm_intblk nsvm_intr_blocked(struct vcpu *v);
> 
> diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-
> x86/hvm/vcpu.h
> index 84d9406..d485536 100644
> --- a/xen/include/asm-x86/hvm/vcpu.h
> +++ b/xen/include/asm-x86/hvm/vcpu.h
> @@ -206,7 +206,7 @@ struct hvm_vcpu {
>      void *fpu_exception_callback_arg;
> 
>      /* Pending hw/sw interrupt (.vector = -1 means nothing pending). */
> -    struct hvm_trap     inject_trap;
> +    struct x86_event     inject_trap;
> 
>      struct viridian_vcpu viridian;
>  };
> diff --git a/xen/include/asm-x86/hvm/vmx/vvmx.h b/xen/include/asm-
> x86/hvm/vmx/vvmx.h
> index aca8b4b..ead586e 100644
> --- a/xen/include/asm-x86/hvm/vmx/vvmx.h
> +++ b/xen/include/asm-x86/hvm/vmx/vvmx.h
> @@ -112,8 +112,8 @@ void nvmx_vcpu_destroy(struct vcpu *v);
>  int nvmx_vcpu_reset(struct vcpu *v);
>  uint64_t nvmx_vcpu_eptp_base(struct vcpu *v);
>  enum hvm_intblk nvmx_intr_blocked(struct vcpu *v);
> -bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
> -                                 int error_code);
> +bool_t nvmx_intercepts_exception(
> +    struct vcpu *v, unsigned int vector, int error_code);
>  void nvmx_domain_relinquish_resources(struct domain *d);
> 
>  bool_t nvmx_ept_enabled(struct vcpu *v);
> --
> 2.1.4



* Re: [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back
  2016-11-23 15:38 ` [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
@ 2016-11-23 16:18   ` Andrew Cooper
  2016-11-23 16:39   ` Tim Deegan
  1 sibling, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:18 UTC (permalink / raw)
  To: Xen-devel; +Cc: Kevin Tian, Paul Durrant, Tim Deegan, Jun Nakajima, Jan Beulich

On 23/11/16 15:38, Andrew Cooper wrote:
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index afacd5f..88d4642 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -198,6 +198,7 @@ hvm_read(enum x86_segment seg,
>      case HVMCOPY_okay:
>          return X86EMUL_OKAY;
>      case HVMCOPY_bad_gva_to_gfn:
> +        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>          return X86EMUL_EXCEPTION;
>      case HVMCOPY_bad_gfn_to_mfn:
>      case HVMCOPY_unhandleable:

I realise I have forgotten to adjust this change to being
x86_emul_pagefault().  emulate_gva_to_mfn() also needs modifying in a
similar vein.

~Andrew
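
For reference, a minimal sketch of what the x86_emul_pagefault() helper
mentioned above might look like, assuming the exn_pending/trap fields have
moved into struct x86_emulate_ctxt as patch 06 describes. The field and
parameter names here are inferred from the thread, not taken from the
final patch:

    static inline void x86_emul_pagefault(
        int error_code, unsigned long cr2, struct x86_emulate_ctxt *ctxt)
    {
        /* Record a #PF against the emulation context rather than
         * injecting it directly, so the caller decides whether and how
         * it reaches the guest. */
        ctxt->exn_pending = true;
        ctxt->trap = (struct x86_event){
            .vector = TRAP_page_fault,
            .type = X86_EVENTTYPE_HW_EXCEPTION,
            .error_code = error_code,
            .cr2 = cr2,
        };
    }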


* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 15:38 ` [PATCH 06/15] x86/emul: Rework emulator event injection Andrew Cooper
@ 2016-11-23 16:19   ` Tim Deegan
  2016-11-23 16:33     ` Jan Beulich
  2016-11-23 16:38     ` Andrew Cooper
  2016-11-23 17:56   ` Boris Ostrovsky
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:19 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit

Hi,

At 15:38 +0000 on 23 Nov (1479915529), Andrew Cooper wrote:
> The emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.
> 
> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
> are visible to the emulator.  This removes the need for the
> inject_{hw,sw}_interrupt() hooks, which are dropped and replaced with
> x86_emul_{hw_exception,software_event}() instead.
> 
> The shadow pagetable and PV uses of x86_emulate() previously failed with
> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks, but this behaviour
> has subtly changed.  Adjust the return value checking to cause a pending event
> to fall back into the previous codepath.
> 
> No overall functional change.

AIUI this does have a change in the shadow callers in the case where
the emulated instruction would inject an event.  Previously we would
have failed the emulation, perhaps unshadowed something, and returned
to the guest to retry.

Now the emulator records the event in the context struct, updates the
register state and returns success, so we'll return on the *next*
instruction.  I think that's OK, though.

Also, handle_mmio() and other callers of the emulator check for that
pending event and pass it to the hardware but you haven't added
anything in the shadow code to do that.  Does the event get dropped?

Tim.
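
For comparison, this is the caller-side pattern the non-shadow paths use
after emulation, as seen in the handle_mmio() hunk of patch 03; the
pending event only reaches the guest if the caller explicitly injects it:

    switch ( rc )
    {
    case X86EMUL_EXCEPTION:
        /* The emulator only records the event; it is lost unless the
         * caller hands it to the injection machinery like this. */
        if ( ctxt.exn_pending )
            hvm_inject_event(&ctxt.trap);
        break;
    default:
        break;
    }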


* Re: [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  2016-11-23 15:38 ` [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
@ 2016-11-23 16:20   ` Paul Durrant
  2016-11-23 17:05   ` Boris Ostrovsky
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 91+ messages in thread
From: Paul Durrant @ 2016-11-23 16:20 UTC (permalink / raw)
  To: Xen-devel
  Cc: Kevin Tian, Jan Beulich, Andrew Cooper, Jun Nakajima,
	Boris Ostrovsky, Suravee Suthikulpanit

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 23 November 2016 15:39
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>; Jun
> Nakajima <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>;
> Boris Ostrovsky <boris.ostrovsky@oracle.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>
> Subject: [PATCH 04/15] x86/emul: Rename
> HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
> 
> and move it to live with the other x86_event infrastructure in
> x86_emulate.h.
> Switch it and x86_event.error_code to being signed, matching the rest of the
> code.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  xen/arch/x86/hvm/emulate.c             |  2 +-
>  xen/arch/x86/hvm/hvm.c                 |  6 +++---
>  xen/arch/x86/hvm/nestedhvm.c           |  2 +-
>  xen/arch/x86/hvm/svm/nestedsvm.c       |  6 +++---
>  xen/arch/x86/hvm/svm/svm.c             | 22 +++++++++++-----------
>  xen/arch/x86/hvm/vmx/intr.c            |  2 +-
>  xen/arch/x86/hvm/vmx/vmx.c             | 23 ++++++++++++-----------
>  xen/arch/x86/hvm/vmx/vvmx.c            |  2 +-
>  xen/arch/x86/x86_emulate/x86_emulate.h |  3 ++-
>  xen/include/asm-x86/hvm/support.h      |  2 --
>  10 files changed, 35 insertions(+), 35 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index d0c3185..790e9c1 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -1609,7 +1609,7 @@ static int hvmemul_inject_sw_interrupt(
> 
>      hvmemul_ctxt->exn_pending = 1;
>      hvmemul_ctxt->trap.vector = vector;
> -    hvmemul_ctxt->trap.error_code = HVM_DELIVER_NO_ERROR_CODE;
> +    hvmemul_ctxt->trap.error_code = X86_EVENT_NO_EC;
>      hvmemul_ctxt->trap.insn_len = insn_len;
> 
>      return X86EMUL_OKAY;
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 7b434aa..b950842 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -502,7 +502,7 @@ void hvm_do_resume(struct vcpu *v)
>                  kind = EMUL_KIND_SET_CONTEXT_INSN;
> 
>              hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
> -                                     HVM_DELIVER_NO_ERROR_CODE);
> +                                     X86_EVENT_NO_EC);
> 
>              v->arch.vm_event->emulate_flags = 0;
>          }
> @@ -3054,7 +3054,7 @@ void hvm_task_switch(
>      }
> 
>      if ( (tss.trace & 1) && !exn_raised )
> -        hvm_inject_hw_exception(TRAP_debug,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
> 
>   out:
>      hvm_unmap_entry(optss_desc);
> @@ -4073,7 +4073,7 @@ void hvm_ud_intercept(struct cpu_user_regs
> *regs)
>      switch ( hvm_emulate_one(&ctxt) )
>      {
>      case X86EMUL_UNHANDLEABLE:
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          break;
>      case X86EMUL_EXCEPTION:
>          if ( ctxt.exn_pending )
> diff --git a/xen/arch/x86/hvm/nestedhvm.c
> b/xen/arch/x86/hvm/nestedhvm.c
> index caad525..c4671d8 100644
> --- a/xen/arch/x86/hvm/nestedhvm.c
> +++ b/xen/arch/x86/hvm/nestedhvm.c
> @@ -17,7 +17,7 @@
>   */
> 
>  #include <asm/msr.h>
> -#include <asm/hvm/support.h>	/* for
> HVM_DELIVER_NO_ERROR_CODE */
> +#include <asm/hvm/support.h>
>  #include <asm/hvm/hvm.h>
>  #include <asm/p2m.h>    /* for struct p2m_domain */
>  #include <asm/hvm/nestedhvm.h>
> diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c
> b/xen/arch/x86/hvm/svm/nestedsvm.c
> index b6b8526..8c9b073 100644
> --- a/xen/arch/x86/hvm/svm/nestedsvm.c
> +++ b/xen/arch/x86/hvm/svm/nestedsvm.c
> @@ -756,7 +756,7 @@ nsvm_vcpu_vmrun(struct vcpu *v, struct
> cpu_user_regs *regs)
>      default:
>          gdprintk(XENLOG_ERR,
>              "nsvm_vcpu_vmentry failed, injecting #UD\n");
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          /* Must happen after hvm_inject_hw_exception or it doesn't work
> right. */
>          nv->nv_vmswitch_in_progress = 0;
>          return 1;
> @@ -1581,7 +1581,7 @@ void svm_vmexit_do_stgi(struct cpu_user_regs
> *regs, struct vcpu *v)
>      unsigned int inst_len;
> 
>      if ( !nestedhvm_enabled(v->domain) ) {
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          return;
>      }
> 
> @@ -1601,7 +1601,7 @@ void svm_vmexit_do_clgi(struct cpu_user_regs
> *regs, struct vcpu *v)
>      vintr_t intr;
> 
>      if ( !nestedhvm_enabled(v->domain) ) {
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          return;
>      }
> 
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 66eb30b..fb4fd0b 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -89,7 +89,7 @@ static DEFINE_SPINLOCK(osvw_lock);
>  static void svm_crash_or_fault(struct vcpu *v)
>  {
>      if ( vmcb_get_cpl(v->arch.hvm_svm.vmcb) )
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>      else
>          domain_crash(v->domain);
>  }
> @@ -116,7 +116,7 @@ void __update_guest_eip(struct cpu_user_regs
> *regs, unsigned int inst_len)
>      curr->arch.hvm_svm.vmcb->interrupt_shadow = 0;
> 
>      if ( regs->eflags & X86_EFLAGS_TF )
> -        hvm_inject_hw_exception(TRAP_debug,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>  }
> 
>  static void svm_cpu_down(void)
> @@ -1285,7 +1285,7 @@ static void svm_inject_event(const struct
> x86_event *event)
> 
>      default:
>          eventinj.fields.type = X86_EVENTTYPE_HW_EXCEPTION;
> -        eventinj.fields.ev = (_event.error_code !=
> HVM_DELIVER_NO_ERROR_CODE);
> +        eventinj.fields.ev = (_event.error_code != X86_EVENT_NO_EC);
>          eventinj.fields.errorcode = _event.error_code;
>          break;
>      }
> @@ -1553,7 +1553,7 @@ static void svm_fpu_dirty_intercept(void)
>      {
>         /* Check if l1 guest must make FPU ready for the l2 guest */
>         if ( v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS )
> -           hvm_inject_hw_exception(TRAP_no_device,
> HVM_DELIVER_NO_ERROR_CODE);
> +           hvm_inject_hw_exception(TRAP_no_device, X86_EVENT_NO_EC);
>         else
>             vmcb_set_cr0(n1vmcb, vmcb_get_cr0(n1vmcb) & ~X86_CR0_TS);
>         return;
> @@ -2022,14 +2022,14 @@ svm_vmexit_do_vmrun(struct cpu_user_regs
> *regs,
>      if ( !nsvm_efer_svm_enabled(v) )
>      {
>          gdprintk(XENLOG_ERR, "VMRUN: nestedhvm disabled, injecting
> #UD\n");
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          return;
>      }
> 
>      if ( !nestedsvm_vmcb_map(v, vmcbaddr) )
>      {
>          gdprintk(XENLOG_ERR, "VMRUN: mapping vmcb failed, injecting
> #GP\n");
> -        hvm_inject_hw_exception(TRAP_gp_fault,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_gp_fault, X86_EVENT_NO_EC);
>          return;
>      }
> 
> @@ -2101,7 +2101,7 @@ svm_vmexit_do_vmload(struct vmcb_struct
> *vmcb,
>      return;
> 
>   inject:
> -    hvm_inject_hw_exception(ret, HVM_DELIVER_NO_ERROR_CODE);
> +    hvm_inject_hw_exception(ret, X86_EVENT_NO_EC);
>      return;
>  }
> 
> @@ -2139,7 +2139,7 @@ svm_vmexit_do_vmsave(struct vmcb_struct
> *vmcb,
>      return;
> 
>   inject:
> -    hvm_inject_hw_exception(ret, HVM_DELIVER_NO_ERROR_CODE);
> +    hvm_inject_hw_exception(ret, X86_EVENT_NO_EC);
>      return;
>  }
> 
> @@ -2428,7 +2428,7 @@ void svm_vmexit_handler(struct cpu_user_regs
> *regs)
> 
>      case VMEXIT_EXCEPTION_DB:
>          if ( !v->domain->debugger_attached )
> -            hvm_inject_hw_exception(TRAP_debug,
> HVM_DELIVER_NO_ERROR_CODE);
> +            hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>          else
>              domain_pause_for_debugger();
>          break;
> @@ -2616,7 +2616,7 @@ void svm_vmexit_handler(struct cpu_user_regs
> *regs)
> 
>      case VMEXIT_MONITOR:
>      case VMEXIT_MWAIT:
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          break;
> 
>      case VMEXIT_VMRUN:
> @@ -2635,7 +2635,7 @@ void svm_vmexit_handler(struct cpu_user_regs
> *regs)
>          svm_vmexit_do_clgi(regs, v);
>          break;
>      case VMEXIT_SKINIT:
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          break;
> 
>      case VMEXIT_XSETBV:
> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
> index 8fca08c..639a705 100644
> --- a/xen/arch/x86/hvm/vmx/intr.c
> +++ b/xen/arch/x86/hvm/vmx/intr.c
> @@ -302,7 +302,7 @@ void vmx_intr_assist(void)
>      }
>      else if ( intack.source == hvm_intsrc_mce )
>      {
> -        hvm_inject_hw_exception(TRAP_machine_check,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_machine_check,
> X86_EVENT_NO_EC);
>      }
>      else if ( cpu_has_vmx_virtual_intr_delivery &&
>                intack.source != hvm_intsrc_pic &&
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index b1d8a0b..eb7c902 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1646,7 +1646,8 @@ static void __vmx_inject_exception(int trap, int
> type, int error_code)
>      intr_fields = INTR_INFO_VALID_MASK |
>                    MASK_INSR(type, INTR_INFO_INTR_TYPE_MASK) |
>                    MASK_INSR(trap, INTR_INFO_VECTOR_MASK);
> -    if ( error_code != HVM_DELIVER_NO_ERROR_CODE ) {
> +    if ( error_code != X86_EVENT_NO_EC )
> +    {
>          __vmwrite(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
>          intr_fields |= INTR_INFO_DELIVER_CODE_MASK;
>      }
> @@ -1671,12 +1672,12 @@ void vmx_inject_extint(int trap, uint8_t source)
>                 INTR_INFO_VALID_MASK |
>                 MASK_INSR(X86_EVENTTYPE_EXT_INTR,
> INTR_INFO_INTR_TYPE_MASK) |
>                 MASK_INSR(trap, INTR_INFO_VECTOR_MASK),
> -               HVM_DELIVER_NO_ERROR_CODE, source);
> +               X86_EVENT_NO_EC, source);
>              return;
>          }
>      }
>      __vmx_inject_exception(trap, X86_EVENTTYPE_EXT_INTR,
> -                           HVM_DELIVER_NO_ERROR_CODE);
> +                           X86_EVENT_NO_EC);
>  }
> 
>  void vmx_inject_nmi(void)
> @@ -1691,12 +1692,12 @@ void vmx_inject_nmi(void)
>                 INTR_INFO_VALID_MASK |
>                 MASK_INSR(X86_EVENTTYPE_NMI, INTR_INFO_INTR_TYPE_MASK) |
>                 MASK_INSR(TRAP_nmi, INTR_INFO_VECTOR_MASK),
> -               HVM_DELIVER_NO_ERROR_CODE, hvm_intsrc_nmi);
> +               X86_EVENT_NO_EC, hvm_intsrc_nmi);
>              return;
>          }
>      }
>      __vmx_inject_exception(2, X86_EVENTTYPE_NMI,
> -                           HVM_DELIVER_NO_ERROR_CODE);
> +                           X86_EVENT_NO_EC);
>  }
> 
>  /*
> @@ -2111,7 +2112,7 @@ static bool_t vmx_vcpu_emulate_ve(struct vcpu
> *v)
>      vmx_vmcs_exit(v);
> 
>      hvm_inject_hw_exception(TRAP_virtualisation,
> -                            HVM_DELIVER_NO_ERROR_CODE);
> +                            X86_EVENT_NO_EC);
> 
>   out:
>      hvm_unmap_guest_frame(veinfo, 0);
> @@ -2387,7 +2388,7 @@ void update_guest_eip(void)
>      }
> 
>      if ( regs->eflags & X86_EFLAGS_TF )
> -        hvm_inject_hw_exception(TRAP_debug,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>  }
> 
>  static void vmx_fpu_dirty_intercept(void)
> @@ -3213,7 +3214,7 @@ static void vmx_propagate_intr(unsigned long intr)
>          event.error_code = tmp;
>      }
>      else
> -        event.error_code = HVM_DELIVER_NO_ERROR_CODE;
> +        event.error_code = X86_EVENT_NO_EC;
> 
>      if ( event.type >= X86_EVENTTYPE_SW_INTERRUPT )
>      {
> @@ -3770,7 +3771,7 @@ void vmx_vmexit_handler(struct cpu_user_regs
> *regs)
> 
>      case EXIT_REASON_VMFUNC:
>          if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
> -            hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +            hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          else
>              update_guest_eip();
>          break;
> @@ -3784,7 +3785,7 @@ void vmx_vmexit_handler(struct cpu_user_regs
> *regs)
>           * as far as vmexit.
>           */
>          WARN_ON(exit_reason == EXIT_REASON_GETSEC);
> -        hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +        hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>          break;
> 
>      case EXIT_REASON_TPR_BELOW_THRESHOLD:
> @@ -3909,7 +3910,7 @@ void vmx_vmexit_handler(struct cpu_user_regs
> *regs)
>              vmx_get_segment_register(v, x86_seg_ss, &ss);
>              if ( ss.attr.fields.dpl )
>                  hvm_inject_hw_exception(TRAP_invalid_op,
> -                                        HVM_DELIVER_NO_ERROR_CODE);
> +                                        X86_EVENT_NO_EC);
>              else
>                  domain_crash(v->domain);
>          }
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> b/xen/arch/x86/hvm/vmx/vvmx.c
> index b5837d4..efaf54c 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -380,7 +380,7 @@ static int vmx_inst_check_privilege(struct
> cpu_user_regs *regs, int vmxop_check)
> 
>  invalid_op:
>      gdprintk(XENLOG_ERR, "vmx_inst_check_privilege: invalid_op\n");
> -    hvm_inject_hw_exception(TRAP_invalid_op,
> HVM_DELIVER_NO_ERROR_CODE);
> +    hvm_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
>      return X86EMUL_EXCEPTION;
> 
>  gp_fault:
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h
> b/xen/arch/x86/x86_emulate/x86_emulate.h
> index b0d8e6f..9df083e 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -80,12 +80,13 @@ enum x86_event_type {
>      X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
>      X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
>  };
> +#define X86_EVENT_NO_EC (-1)        /* No error code. */
> 
>  struct x86_event {
>      int16_t       vector;
>      uint8_t       type;         /* X86_EVENTTYPE_* */
>      uint8_t       insn_len;     /* Instruction length */
> -    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
> +    int32_t       error_code;   /* X86_EVENT_NO_EC if n/a */
>      unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
>  };
> 
> diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-
> x86/hvm/support.h
> index 2984abc..9938450 100644
> --- a/xen/include/asm-x86/hvm/support.h
> +++ b/xen/include/asm-x86/hvm/support.h
> @@ -25,8 +25,6 @@
>  #include <xen/hvm/save.h>
>  #include <asm/processor.h>
> 
> -#define HVM_DELIVER_NO_ERROR_CODE  (~0U)
> -
>  #ifndef NDEBUG
>  #define DBG_LEVEL_0                 (1 << 0)
>  #define DBG_LEVEL_1                 (1 << 1)
> --
> 2.1.4
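
A small standalone illustration of the signedness change above, assuming
nothing beyond the quoted definitions: with the old unsigned field the ~0U
sentinel only matched -1 via implicit conversion, whereas the new signed
pair says what it means.

    #include <assert.h>
    #include <stdint.h>

    #define X86_EVENT_NO_EC (-1)        /* No error code. */

    int main(void)
    {
        uint32_t old_ec = ~0U;              /* old HVM_DELIVER_NO_ERROR_CODE */
        int32_t  new_ec = X86_EVENT_NO_EC;  /* new sentinel */

        /* Both hold, but the first relies on -1 converting to
         * UINT32_MAX under the usual arithmetic conversions. */
        assert(old_ec == (uint32_t)X86_EVENT_NO_EC);
        assert(new_ec == X86_EVENT_NO_EC);
        return 0;
    }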



* Re: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-23 16:12   ` Paul Durrant
@ 2016-11-23 16:22     ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:22 UTC (permalink / raw)
  To: Paul Durrant, Xen-devel
  Cc: Boris Ostrovsky, Suravee Suthikulpanit, Kevin Tian, Jun Nakajima,
	Jan Beulich

On 23/11/16 16:12, Paul Durrant wrote:
>> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
>> index f24e211..d0c3185 100644
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -1679,7 +1679,7 @@ static int hvmemul_invlpg(
>>           * violations, so squash them.
>>           */
>>          hvmemul_ctxt->exn_pending = 0;
>> -        hvmemul_ctxt->trap = (struct hvm_trap){};
>> +        hvmemul_ctxt->trap = (struct x86_event){};
> Can't say I like that way of initializing but...
>
> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

Think of it like memcpy/memset, but typesafe.  Either way, this is a
strict renaming patch; I wouldn't want to change the method of
initialisation here anyway.

~Andrew
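
To illustrate the type-safety point, a standalone sketch; the struct
mirrors the x86_event definition quoted earlier in the thread, and the
function is purely illustrative, not code from the series:

    #include <stdint.h>
    #include <string.h>

    struct x86_event {
        int16_t       vector;
        uint8_t       type;
        uint8_t       insn_len;
        int32_t       error_code;
        unsigned long cr2;
    };

    void clear_event(struct x86_event *ev)
    {
        /* memset() zeroes too, but it takes a void * and a byte count,
         * so a wrong sizeof or swapped arguments compile silently: */
        memset(ev, 0, sizeof(*ev));

        /* The compound-literal assignment zeroes the same bytes, but
         * the assignment itself is type-checked: the right-hand side
         * must really be a struct x86_event.  (Empty braces are a GNU
         * extension; strict C99 would spell it { 0 }.) */
        *ev = (struct x86_event){};
    }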


* Re: [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-23 15:38 ` [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
@ 2016-11-23 16:31   ` Tim Deegan
  2016-11-23 16:40     ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Jan Beulich, Xen-devel

At 15:38 +0000 on 23 Nov (1479915532), Andrew Cooper wrote:
> Introduce a new x86_emul_pagefault() similar to x86_emul_hw_exception(), and
> use this instead of hvm_inject_page_fault() from emulation codepaths.
> 
> Replace one hvm_inject_hw_exception() in the shadow emulation codepaths.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

> NOTE: this is a functional change for the shadow code, as a #PF previously
> raised properly with the guest will now cause X86EMUL_UNHANDLEABLE. It is my
> understanding after a discussion with Tim that this is ok, but I haven't done
> extensive testing yet.

Do you plan to?  I think this is indeed OK, but there may be some edge
case, e.g. an instruction that writes to both the current top-level
pagetable (which can't be unshadowed) and an unmapped virtual address.
That ought to raise #PF in the guest but might now spin retrying?  

Tim.


* Re: [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  2016-11-23 15:38 ` [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
@ 2016-11-23 16:32   ` Tim Deegan
  2016-11-23 16:36   ` Paul Durrant
  2016-11-24  6:25   ` Tian, Kevin
  2 siblings, 0 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:32 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Kevin Tian, Paul Durrant, Jun Nakajima, Xen-devel

At 15:38 +0000 on 23 Nov (1479915533), Andrew Cooper wrote:
> which is filled with pagefault information should one occur.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 16:19   ` Tim Deegan
@ 2016-11-23 16:33     ` Jan Beulich
  2016-11-23 16:43       ` Tim Deegan
  2016-11-23 16:38     ` Andrew Cooper
  1 sibling, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-23 16:33 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Andrew Cooper,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

>>> On 23.11.16 at 17:19, <tim@xen.org> wrote:
> Hi,
> 
> At 15:38 +0000 on 23 Nov (1479915529), Andrew Cooper wrote:
>> The emulator needs to gain an understanding of interrupts and exceptions
>> generated by its actions.
>> 
>> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
>> are visible to the emulator.  This removes the need for the
>> inject_{hw,sw}_interrupt() hooks, which are dropped and replaced with
>> x86_emul_{hw_exception,software_event}() instead.
>> 
>> The shadow pagetable and PV uses of x86_emulate() previously failed with
>> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks, but this behaviour
>> has subtly changed.  Adjust the return value checking to cause a pending event
>> to fall back into the previous codepath.
>> 
>> No overall functional change.
> 
> AIUI this does have a change in the shadow callers in the case where
> the emulated instruction would inject an event.  Previously we would
> have failed the emulation, perhaps unshadowed something, and returned
> to the guest to retry.
> 
> Now the emulator records the event in the context struct, updates the
> register state and returns success, so we'll return on the *next*
> instruction.  I think that's OK, though.

Not exactly - instead of success, X86EMUL_EXCEPTION is being
returned, which would suppress register updates. Also I don't
think continuing on the next instruction would be okay, as we'd
then basically have skipped the one having caused the (not
delivered) exception.
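
(To spell out the caller-side contract under discussion, roughly:

    rc = x86_emulate(&ctxt, ops);
    if ( rc == X86EMUL_EXCEPTION )
    {
        /*
         * Emulation bailed before the writeback phase: registers are
         * unchanged, and the pending event sits in the context struct.
         * The caller must either inject it and re-enter the guest at
         * the same instruction, or abandon the emulation.
         */
    }

- a sketch of the intended semantics, not actual code.)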

Jan



* Re: [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  2016-11-23 15:38 ` [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
@ 2016-11-23 16:35   ` Tim Deegan
  2016-11-23 16:38     ` Andrew Cooper
  2016-11-23 16:40     ` Tim Deegan
  0 siblings, 2 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:35 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Xen-devel

At 15:38 +0000 on 23 Nov (1479915534), Andrew Cooper wrote:
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Shouldn't this also update the comments to describe the new semantics
of hvm_copy_*()?

Tim.



* Re: [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  2016-11-23 15:38 ` [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
@ 2016-11-23 16:35   ` Tim Deegan
  2016-11-24  6:26   ` Tian, Kevin
  2016-11-24 15:41   ` Jan Beulich
  2 siblings, 0 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:35 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Paul Durrant, Jun Nakajima, Jan Beulich, Xen-devel

At 15:38 +0000 on 23 Nov (1479915535), Andrew Cooper wrote:
> The functions use linear addresses, not virtual addresses, as no segmentation
> is used.  (Lots of other code in Xen makes this mistake.)
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Tim Deegan <tim@xen.org>


* Re: [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  2016-11-23 15:38 ` [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
  2016-11-23 16:32   ` Tim Deegan
@ 2016-11-23 16:36   ` Paul Durrant
  2016-11-24  6:25   ` Tian, Kevin
  2 siblings, 0 replies; 91+ messages in thread
From: Paul Durrant @ 2016-11-23 16:36 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Kevin Tian, Tim (Xen.org), Jun Nakajima

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 23 November 2016 15:39
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>; Tim (Xen.org) <tim@xen.org>; Jun Nakajima
> <jun.nakajima@intel.com>; Kevin Tian <kevin.tian@intel.com>
> Subject: [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a
> pagefault_info pointer
> 
> which is filled with pagefault information should one occur.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Tim Deegan <tim@xen.org>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> ---
>  xen/arch/x86/hvm/emulate.c        |  8 ++++---
>  xen/arch/x86/hvm/hvm.c            | 49 +++++++++++++++++++++++++--------------
>  xen/arch/x86/hvm/vmx/vvmx.c       |  9 ++++---
>  xen/arch/x86/mm/shadow/common.c   |  5 ++--
>  xen/include/asm-x86/hvm/support.h | 23 +++++++++++++-----
>  5 files changed, 63 insertions(+), 31 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 3ebb200..e50ee24 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -770,6 +770,7 @@ static int __hvmemul_read(
>      struct hvm_emulate_ctxt *hvmemul_ctxt)
>  {
>      struct vcpu *curr = current;
> +    pagefault_info_t pfinfo;
>      unsigned long addr, reps = 1;
>      uint32_t pfec = PFEC_page_present;
>      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
> @@ -790,8 +791,8 @@ static int __hvmemul_read(
>          pfec |= PFEC_user_mode;
> 
>      rc = ((access_type == hvm_access_insn_fetch) ?
> -          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec) :
> -          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec));
> +          hvm_fetch_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo) :
> +          hvm_copy_from_guest_virt(p_data, addr, bytes, pfec, &pfinfo));
> 
>      switch ( rc )
>      {
> @@ -878,6 +879,7 @@ static int hvmemul_write(
>      struct hvm_emulate_ctxt *hvmemul_ctxt =
>          container_of(ctxt, struct hvm_emulate_ctxt, ctxt);
>      struct vcpu *curr = current;
> +    pagefault_info_t pfinfo;
>      unsigned long addr, reps = 1;
>      uint32_t pfec = PFEC_page_present | PFEC_write_access;
>      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
> @@ -896,7 +898,7 @@ static int hvmemul_write(
>           (hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.dpl == 3) )
>          pfec |= PFEC_user_mode;
> 
> -    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec);
> +    rc = hvm_copy_to_guest_virt(addr, p_data, bytes, pfec, &pfinfo);
> 
>      switch ( rc )
>      {
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 804cd88..afba51f 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2859,6 +2859,7 @@ void hvm_task_switch(
>      struct desc_struct *optss_desc = NULL, *nptss_desc = NULL, tss_desc;
>      bool_t otd_writable, ntd_writable;
>      unsigned long eflags;
> +    pagefault_info_t pfinfo;
>      int exn_raised, rc;
>      struct {
>          u16 back_link,__blh;
> @@ -2925,7 +2926,7 @@ void hvm_task_switch(
>      }
> 
>      rc = hvm_copy_from_guest_virt(
> -        &tss, prev_tr.base, sizeof(tss), PFEC_page_present);
> +        &tss, prev_tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
>      if ( rc != HVMCOPY_okay )
>          goto out;
> 
> @@ -2963,12 +2964,12 @@ void hvm_task_switch(
>                                  &tss.eip,
>                                  offsetof(typeof(tss), trace) -
>                                  offsetof(typeof(tss), eip),
> -                                PFEC_page_present);
> +                                PFEC_page_present, &pfinfo);
>      if ( rc != HVMCOPY_okay )
>          goto out;
> 
>      rc = hvm_copy_from_guest_virt(
> -        &tss, tr.base, sizeof(tss), PFEC_page_present);
> +        &tss, tr.base, sizeof(tss), PFEC_page_present, &pfinfo);
>      /*
>       * Note: The HVMCOPY_gfn_shared case could be optimised, if the callee
>       * functions knew we want RO access.
> @@ -3008,7 +3009,8 @@ void hvm_task_switch(
>          tss.back_link = prev_tr.sel;
> 
>          rc = hvm_copy_to_guest_virt(tr.base + offsetof(typeof(tss), back_link),
> -                                    &tss.back_link, sizeof(tss.back_link), 0);
> +                                    &tss.back_link, sizeof(tss.back_link), 0,
> +                                    &pfinfo);
>          if ( rc == HVMCOPY_bad_gva_to_gfn )
>              exn_raised = 1;
>          else if ( rc != HVMCOPY_okay )
> @@ -3045,7 +3047,8 @@ void hvm_task_switch(
>                                          16 << segr.attr.fields.db,
>                                          &linear_addr) )
>          {
> -            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0);
> +            rc = hvm_copy_to_guest_virt(linear_addr, &errcode, opsz, 0,
> +                                        &pfinfo);
>              if ( rc == HVMCOPY_bad_gva_to_gfn )
>                  exn_raised = 1;
>              else if ( rc != HVMCOPY_okay )
> @@ -3068,7 +3071,8 @@ void hvm_task_switch(
>  #define HVMCOPY_phys       (0u<<2)
>  #define HVMCOPY_virt       (1u<<2)
>  static enum hvm_copy_result __hvm_copy(
> -    void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec)
> +    void *buf, paddr_t addr, int size, unsigned int flags, uint32_t pfec,
> +    pagefault_info_t *pfinfo)
>  {
>      struct vcpu *curr = current;
>      unsigned long gfn;
> @@ -3109,7 +3113,15 @@ static enum hvm_copy_result __hvm_copy(
>                  if ( pfec & PFEC_page_shared )
>                      return HVMCOPY_gfn_shared;
>                  if ( flags & HVMCOPY_fault )
> +                {
> +                    if ( pfinfo )
> +                    {
> +                        pfinfo->linear = addr;
> +                        pfinfo->ec = pfec;
> +                    }
> +
>                      hvm_inject_page_fault(pfec, addr);
> +                }
>                  return HVMCOPY_bad_gva_to_gfn;
>              }
>              gpa |= (paddr_t)gfn << PAGE_SHIFT;
> @@ -3279,7 +3291,7 @@ enum hvm_copy_result hvm_copy_to_guest_phys(
>  {
>      return __hvm_copy(buf, paddr, size,
>                        HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_phys,
> -                      0);
> +                      0, NULL);
>  }
> 
>  enum hvm_copy_result hvm_copy_from_guest_phys(
> @@ -3287,31 +3299,34 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
>  {
>      return __hvm_copy(buf, paddr, size,
>                        HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_phys,
> -                      0);
> +                      0, NULL);
>  }
> 
>  enum hvm_copy_result hvm_copy_to_guest_virt(
> -    unsigned long vaddr, void *buf, int size, uint32_t pfec)
> +    unsigned long vaddr, void *buf, int size, uint32_t pfec,
> +    pagefault_info_t *pfinfo)
>  {
>      return __hvm_copy(buf, vaddr, size,
>                        HVMCOPY_to_guest | HVMCOPY_fault | HVMCOPY_virt,
> -                      PFEC_page_present | PFEC_write_access | pfec);
> +                      PFEC_page_present | PFEC_write_access | pfec, pfinfo);
>  }
> 
>  enum hvm_copy_result hvm_copy_from_guest_virt(
> -    void *buf, unsigned long vaddr, int size, uint32_t pfec)
> +    void *buf, unsigned long vaddr, int size, uint32_t pfec,
> +    pagefault_info_t *pfinfo)
>  {
>      return __hvm_copy(buf, vaddr, size,
>                        HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
> -                      PFEC_page_present | pfec);
> +                      PFEC_page_present | pfec, pfinfo);
>  }
> 
>  enum hvm_copy_result hvm_fetch_from_guest_virt(
> -    void *buf, unsigned long vaddr, int size, uint32_t pfec)
> +    void *buf, unsigned long vaddr, int size, uint32_t pfec,
> +    pagefault_info_t *pfinfo)
>  {
>      return __hvm_copy(buf, vaddr, size,
>                        HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
> -                      PFEC_page_present | PFEC_insn_fetch | pfec);
> +                      PFEC_page_present | PFEC_insn_fetch | pfec, pfinfo);
>  }
> 
>  enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
> @@ -3319,7 +3334,7 @@ enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
>  {
>      return __hvm_copy(buf, vaddr, size,
>                        HVMCOPY_to_guest | HVMCOPY_no_fault | HVMCOPY_virt,
> -                      PFEC_page_present | PFEC_write_access | pfec);
> +                      PFEC_page_present | PFEC_write_access | pfec, NULL);
>  }
> 
>  enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
> @@ -3327,7 +3342,7 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
>  {
>      return __hvm_copy(buf, vaddr, size,
>                        HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
> -                      PFEC_page_present | pfec);
> +                      PFEC_page_present | pfec, NULL);
>  }
> 
>  enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
> @@ -3335,7 +3350,7 @@ enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
>  {
>      return __hvm_copy(buf, vaddr, size,
>                        HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
> -                      PFEC_page_present | PFEC_insn_fetch | pfec);
> +                      PFEC_page_present | PFEC_insn_fetch | pfec, NULL);
>  }
> 
>  unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index bcc4a97..7342d12 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -396,6 +396,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
>      struct vcpu *v = current;
>      union vmx_inst_info info;
>      struct segment_register seg;
> +    pagefault_info_t pfinfo;
>      unsigned long base, index, seg_base, disp, offset;
>      int scale, size;
> 
> @@ -451,7 +452,7 @@ static int decode_vmx_inst(struct cpu_user_regs *regs,
>              goto gp_fault;
> 
>          if ( poperandS != NULL &&
> -             hvm_copy_from_guest_virt(poperandS, base, size, 0)
> +             hvm_copy_from_guest_virt(poperandS, base, size, 0, &pfinfo)
>                    != HVMCOPY_okay )
>              return X86EMUL_EXCEPTION;
>          decode->mem = base;
> @@ -1611,6 +1612,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
>      struct vcpu *v = current;
>      struct vmx_inst_decoded decode;
>      struct nestedvcpu *nvcpu = &vcpu_nestedhvm(v);
> +    pagefault_info_t pfinfo;
>      unsigned long gpa = 0;
>      int rc;
> 
> @@ -1620,7 +1622,7 @@ int nvmx_handle_vmptrst(struct cpu_user_regs *regs)
> 
>      gpa = nvcpu->nv_vvmcxaddr;
> 
> -    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0);
> +    rc = hvm_copy_to_guest_virt(decode.mem, &gpa, decode.len, 0, &pfinfo);
>      if ( rc != HVMCOPY_okay )
>          return X86EMUL_EXCEPTION;
> 
> @@ -1679,6 +1681,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
>  {
>      struct vcpu *v = current;
>      struct vmx_inst_decoded decode;
> +    pagefault_info_t pfinfo;
>      u64 value = 0;
>      int rc;
> 
> @@ -1690,7 +1693,7 @@ int nvmx_handle_vmread(struct cpu_user_regs *regs)
> 
>      switch ( decode.type ) {
>      case VMX_INST_MEMREG_TYPE_MEMORY:
> -        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0);
> +        rc = hvm_copy_to_guest_virt(decode.mem, &value, decode.len, 0, &pfinfo);
>          if ( rc != HVMCOPY_okay )
>              return X86EMUL_EXCEPTION;
>          break;
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index c8b61b9..d28eae1 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -179,6 +179,7 @@ hvm_read(enum x86_segment seg,
>           enum hvm_access_type access_type,
>           struct sh_emulate_ctxt *sh_ctxt)
>  {
> +    pagefault_info_t pfinfo;
>      unsigned long addr;
>      int rc;
> 
> @@ -188,9 +189,9 @@ hvm_read(enum x86_segment seg,
>          return rc;
> 
>      if ( access_type == hvm_access_insn_fetch )
> -        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0);
> +        rc = hvm_fetch_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
>      else
> -        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0);
> +        rc = hvm_copy_from_guest_virt(p_data, addr, bytes, 0, &pfinfo);
> 
>      switch ( rc )
>      {
> diff --git a/xen/include/asm-x86/hvm/support.h b/xen/include/asm-x86/hvm/support.h
> index 9938450..4aa5a36 100644
> --- a/xen/include/asm-x86/hvm/support.h
> +++ b/xen/include/asm-x86/hvm/support.h
> @@ -83,16 +83,27 @@ enum hvm_copy_result hvm_copy_from_guest_phys(
>   *  HVMCOPY_bad_gfn_to_mfn: Some guest physical address did not map to
>   *                          ordinary machine memory.
>   *  HVMCOPY_bad_gva_to_gfn: Some guest virtual address did not have a valid
> - *                          mapping to a guest physical address. In this case
> - *                          a page fault exception is automatically queued
> - *                          for injection into the current HVM VCPU.
> + *                          mapping to a guest physical address.  The
> + *                          pagefault_info_t structure will be filled in if
> + *                          provided, and a page fault exception is
> + *                          automatically queued for injection into the
> + *                          current HVM VCPU.
>   */
> +typedef struct pagefault_info
> +{
> +    unsigned long linear;
> +    int ec;
> +} pagefault_info_t;
> +
>  enum hvm_copy_result hvm_copy_to_guest_virt(
> -    unsigned long vaddr, void *buf, int size, uint32_t pfec);
> +    unsigned long vaddr, void *buf, int size, uint32_t pfec,
> +    pagefault_info_t *pfinfo);
>  enum hvm_copy_result hvm_copy_from_guest_virt(
> -    void *buf, unsigned long vaddr, int size, uint32_t pfec);
> +    void *buf, unsigned long vaddr, int size, uint32_t pfec,
> +    pagefault_info_t *pfinfo);
>  enum hvm_copy_result hvm_fetch_from_guest_virt(
> -    void *buf, unsigned long vaddr, int size, uint32_t pfec);
> +    void *buf, unsigned long vaddr, int size, uint32_t pfec,
> +    pagefault_info_t *pfinfo);
> 
>  /*
>   * As above (copy to/from a guest virtual address), but no fault is generated
> --
> 2.1.4
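
For reference, the calling convention after this patch, assembled from
the hunks above (a sketch, not a literal call site):

    pagefault_info_t pfinfo;
    enum hvm_copy_result rc;

    rc = hvm_copy_from_guest_virt(buf, addr, bytes, 0, &pfinfo);
    if ( rc == HVMCOPY_bad_gva_to_gfn )
    {
        /*
         * pfinfo.linear and pfinfo.ec describe the faulting access; a
         * #PF has also already been queued for injection at this point.
         */
    }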



* Re: [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  2016-11-23 16:35   ` Tim Deegan
@ 2016-11-23 16:38     ` Andrew Cooper
  2016-11-23 16:40     ` Tim Deegan
  1 sibling, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:38 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Paul Durrant, Xen-devel

On 23/11/16 16:35, Tim Deegan wrote:
> At 15:38 +0000 on 23 Nov (1479915534), Andrew Cooper wrote:
>> No functional change.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Shouldn't this also update the comments to describe the new semantics
> of hvm_copy_*()?

I couldn't find an easy way of phrasing it which didn't get deleted by
the subsequent patch.

~Andrew


* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 16:19   ` Tim Deegan
  2016-11-23 16:33     ` Jan Beulich
@ 2016-11-23 16:38     ` Andrew Cooper
  1 sibling, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:38 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky, Suravee Suthikulpanit


On 23/11/16 16:19, Tim Deegan wrote:
> Hi,
>
> At 15:38 +0000 on 23 Nov (1479915529), Andrew Cooper wrote:
>> The emulator needs to gain an understanding of interrupts and exceptions
>> generated by its actions.
>>
>> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
>> are visible to the emulator.  This removes the need for the
>> inject_{hw,sw}_interrupt() hooks, which are dropped and replaced with
>> x86_emul_{hw_exception,software_event}() instead.
>>
>> The shadow pagetable and PV uses of x86_emulate() previously failed with
>> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks, but this behaviour
>> has subtly changed.  Adjust the return value checking to cause a pending event
>> to fall back into the previous codepath.
>>
>> No overall functional change.
> AIUI this does have a change in the shadow callers in the case where
> the emulated instruction would inject an event.  Previously we would
> have failed the emulation, perhaps unshadowed something, and returned
> to the guest to retry.
> Now the emulator records the event in the context struct, updates the
> register state and returns success, so we'll return on the *next*
> instruction.  I think that's OK, though.

We are still passing X86EMUL_EXCEPTION back into the emulator, so
nothing changes immediately from that point of view.  It will still
"goto done" and skip the writeback phase.

> Also, handle_mmio() and other callers of the emulator check for that
> pending event and pass it to the hardware but you haven't added
> anything in the shadow code to do that.  Does the event get dropped?

Yes.  That was the intended purpose of these hunks:

diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index d70b1c6..84cb6b6 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3378,7 +3378,7 @@ static int sh_page_fault(struct vcpu *v,
      * would be a good unshadow hint. If we *do* decide to unshadow-on-fault
      * then it must be 'failable': we cannot require the unshadow to succeed.
      */
-    if ( r == X86EMUL_UNHANDLEABLE )
+    if ( r == X86EMUL_UNHANDLEABLE || emul_ctxt.ctxt.event_pending )
     {
         perfc_incr(shadow_fault_emulate_failed);
 #if SHADOW_OPTIMIZATIONS & SHOPT_FAST_EMULATION
@@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
             shadow_continue_emulation(&emul_ctxt, regs);
             v->arch.paging.last_write_was_pt = 0;
             r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
-            if ( r == X86EMUL_OKAY )
+            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )
             {
                 emulation_count++;
                 if ( v->arch.paging.last_write_was_pt )

To take the failure path any time an event is seen pending.

~Andrew


* Re: [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back
  2016-11-23 15:38 ` [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
  2016-11-23 16:18   ` Andrew Cooper
@ 2016-11-23 16:39   ` Tim Deegan
  2016-11-23 17:06     ` Andrew Cooper
  1 sibling, 1 reply; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:39 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Paul Durrant, Jun Nakajima, Jan Beulich, Xen-devel

At 15:38 +0000 on 23 Nov (1479915536), Andrew Cooper wrote:
> Drop the call to hvm_inject_page_fault() in __hvm_copy(), and require callers
> to inject the pagefault themselves.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

> index afacd5f..88d4642 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -198,6 +198,7 @@ hvm_read(enum x86_segment seg,
>      case HVMCOPY_okay:
>          return X86EMUL_OKAY;
>      case HVMCOPY_bad_gva_to_gfn:
> +        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>          return X86EMUL_EXCEPTION;
>      case HVMCOPY_bad_gfn_to_mfn:
>      case HVMCOPY_unhandleable:

Should this also be converted to pass the injection to the emulator
rather than injecting it directly?

Tim.


* Re: [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info
  2016-11-23 16:35   ` Tim Deegan
  2016-11-23 16:38     ` Andrew Cooper
@ 2016-11-23 16:40     ` Tim Deegan
  1 sibling, 0 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:40 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Xen-devel

At 16:35 +0000 on 23 Nov (1479918931), Tim Deegan wrote:
> At 15:38 +0000 on 23 Nov (1479915534), Andrew Cooper wrote:
> > No functional change.
> > 
> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> Shouldn't this also update the comments to describe the new semantics
> of hvm_copy_*()?

Right, now I see that this all goes away later in the series.  So, Ack.

Tim.


* Re: [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
  2016-11-23 15:49   ` Paul Durrant
  2016-11-23 15:53   ` Wei Liu
@ 2016-11-23 16:40   ` Jan Beulich
  2016-11-23 16:41   ` Boris Ostrovsky
  2016-11-24  6:16   ` Tian, Kevin
  4 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-23 16:40 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Wei Liu, Suravee Suthikulpanit, Xen-devel,
	Paul Durrant, Jun Nakajima, Boris Ostrovsky

 >>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> * Move hvm_emulate_init() to immediately hvm_emulate_prepare(), as they are
>    very closely related.
>  * Rename hvm_emulate_prepare() to hvm_emulate_init_once() and
>    hvm_emulate_init() to hvm_emulate_init_per_insn() to make it clearer how to
>    and when to use them.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one further cosmetic request:

> @@ -2006,6 +1956,57 @@ void hvm_emulate_prepare(
>      hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
>  }
>  
> +void hvm_emulate_init_per_insn(
> +    struct hvm_emulate_ctxt *hvmemul_ctxt,
> +    const unsigned char *insn_buf,
> +    unsigned int insn_bytes)
> +{
> +    struct vcpu *curr = current;
> +    unsigned int pfec = PFEC_page_present;
> +    unsigned long addr;
> +
> +    if ( hvm_long_mode_enabled(curr) &&
> +         hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
> +    {
> +        hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;
> +    }

Please consider dropping the stray braces here.
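
i.e. reduced to:

    if ( hvm_long_mode_enabled(curr) &&
         hvmemul_ctxt->seg_reg[x86_seg_cs].attr.fields.l )
        hvmemul_ctxt->ctxt.addr_size = hvmemul_ctxt->ctxt.sp_size = 64;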

Jan



* Re: [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-23 16:31   ` Tim Deegan
@ 2016-11-23 16:40     ` Andrew Cooper
  2016-11-23 16:50       ` Tim Deegan
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:40 UTC (permalink / raw)
  To: Tim Deegan; +Cc: George Dunlap, Paul Durrant, Jan Beulich, Xen-devel

On 23/11/16 16:31, Tim Deegan wrote:
> At 15:38 +0000 on 23 Nov (1479915532), Andrew Cooper wrote:
>> Introduce a new x86_emul_pagefault() similar to x86_emul_hw_exception(), and
>> use this instead of hvm_inject_page_fault() from emulation codepaths.
>>
>> Replace one hvm_inject_hw_exception() in the shadow emulation codepaths.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> NOTE: this is a functional change for the shadow code, as a #PF previously
>> raised properly with the guest will now cause X86EMUL_UNHANDLEABLE. It is my
>> understanding after a discussion with Tim that this is ok, but I haven't done
>> extensive testing yet.
> Do you plan to?  I think this is indeed OK, but there may be some edge
> case, e.g. an instruction that writes to both the current top-level
> pagetable (which can't be unshadowed) and an unmapped virtual address.
> That ought to raise #PF in the guest but might now spin retrying?

That is a devious corner case.  I take it you have been there before?

The more I think about these changes, the more I think that the shadow
code would be better off selectively looking for a pending event, injecting
pagefaults, but rejecting and retrying if any other event shows up.
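
Something along these lines at the end of sh_page_fault()'s emulation
path (a sketch only; field and function names approximate):

    if ( emul_ctxt.ctxt.event_pending )
    {
        if ( emul_ctxt.ctxt.event.vector == TRAP_page_fault )
            hvm_inject_event(&emul_ctxt.ctxt.event); /* pass #PF through */
        else
            r = X86EMUL_UNHANDLEABLE;   /* unshadow and retry otherwise */
    }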

~Andrew


* Re: [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  2016-11-23 16:41   ` Boris Ostrovsky
@ 2016-11-23 16:41     ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:41 UTC (permalink / raw)
  To: Boris Ostrovsky, Xen-devel
  Cc: Kevin Tian, Wei Liu, Jun Nakajima, Jan Beulich, Paul Durrant,
	Suravee Suthikulpanit

On 23/11/16 16:41, Boris Ostrovsky wrote:
> On 11/23/2016 10:38 AM, Andrew Cooper wrote:
>>  * Move hvm_emulate_init() to immediately hvm_emulate_prepare(), as they are
>>    very closely related.
>>  * Rename hvm_emulate_prepare() to hvm_emulate_init_once() and
>>    hvm_emulate_init() to hvm_emulate_init_per_insn() to make it clearer how to
>>    and when to use them.
>>
>> No functional change.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Paul Durrant <paul.durrant@citrix.com>
>> CC: Jun Nakajima <jun.nakajima@intel.com>
>> CC: Kevin Tian <kevin.tian@intel.com>
>> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>
> (although I am having trouble parsing the first bullet in the commit
> message)

Ah "to immediately after".  I will fix up the text.

~Andrew


* Re: [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
                     ` (2 preceding siblings ...)
  2016-11-23 16:40   ` Jan Beulich
@ 2016-11-23 16:41   ` Boris Ostrovsky
  2016-11-23 16:41     ` Andrew Cooper
  2016-11-24  6:16   ` Tian, Kevin
  4 siblings, 1 reply; 91+ messages in thread
From: Boris Ostrovsky @ 2016-11-23 16:41 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Kevin Tian, Wei Liu, Jun Nakajima, Jan Beulich, Paul Durrant,
	Suravee Suthikulpanit

On 11/23/2016 10:38 AM, Andrew Cooper wrote:
>  * Move hvm_emulate_init() to immediately hvm_emulate_prepare(), as they are
>    very closely related.
>  * Rename hvm_emulate_prepare() to hvm_emulate_init_once() and
>    hvm_emulate_init() to hvm_emulate_init_per_insn() to make it clearer how to
>    and when to use them.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> CC: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

(although I am having trouble parsing the first bullet in the commit
message)



* Re: [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references
  2016-11-23 15:38 ` [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references Andrew Cooper
@ 2016-11-23 16:42   ` Paul Durrant
  2016-11-24 15:48   ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Paul Durrant @ 2016-11-23 16:42 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 23 November 2016 15:39
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Jan Beulich
> <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH 14/15] x86/hvm: Prepare to allow use of system segments
> for memory references
> 
> All system segments (GDT/IDT/LDT and TR) describe a linear address and limit,
> and act similarly to user segments.  However all current uses of these tables
> in the emulator opencode the address calculations and limit checks.  In
> particular, no care is taken for accesses which wrap around the 4GB or
> non-canonical boundaries.
> 
> Alter hvm_virtual_to_linear_addr() to cope with performing segmentation checks
> on system segments.  This involves restricting access checks in the 32bit case
> to user segments only, and adding presence/limit checks in the 64bit case.
> 
> When suffering a segmentation fault for a system segment, return
> X86EMUL_EXCEPTION but leave the fault injection to the caller.  The fault type
> depends on the higher level action being performed.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <JBeulich@suse.com>
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
> ---
> CC: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
>  xen/arch/x86/hvm/emulate.c             | 14 ++++++++----
>  xen/arch/x86/hvm/hvm.c                 | 40 ++++++++++++++++++++++------------
>  xen/arch/x86/mm/shadow/common.c        | 12 +++++++---
>  xen/arch/x86/x86_emulate/x86_emulate.h | 26 ++++++++++++++--------
>  4 files changed, 62 insertions(+), 30 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index c248eca..3a7d1f3 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -567,10 +567,16 @@ static int hvmemul_virtual_to_linear(
>      if ( *reps != 1 )
>          return X86EMUL_UNHANDLEABLE;
> 
> -    /* This is a singleton operation: fail it with an exception. */
> -    x86_emul_hw_exception((seg == x86_seg_ss)
> -                          ? TRAP_stack_error
> -                          : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
> +    /*
> +     * Leave exception injection to the caller for non-user segments: We
> +     * neither know the exact error code to be used, nor can we easily
> +     * determine the kind of exception (#GP or #TS) in that case.
> +     */
> +    if ( is_x86_user_segment(seg) )
> +        x86_emul_hw_exception((seg == x86_seg_ss)
> +                              ? TRAP_stack_error
> +                              : TRAP_gp_fault, 0, &hvmemul_ctxt->ctxt);
> +
>      return X86EMUL_EXCEPTION;
>  }
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index e1f2c9e..2bcef1f 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2497,24 +2497,28 @@ bool_t hvm_virtual_to_linear_addr(
>          if ( !reg->attr.fields.p )
>              goto out;
> 
> -        switch ( access_type )
> +        /* Read/write restrictions only exist for user segments. */
> +        if ( reg->attr.fields.s )
>          {
> -        case hvm_access_read:
> -            if ( (reg->attr.fields.type & 0xa) == 0x8 )
> -                goto out; /* execute-only code segment */
> -            break;
> -        case hvm_access_write:
> -            if ( (reg->attr.fields.type & 0xa) != 0x2 )
> -                goto out; /* not a writable data segment */
> -            break;
> -        default:
> -            break;
> +            switch ( access_type )
> +            {
> +            case hvm_access_read:
> +                if ( (reg->attr.fields.type & 0xa) == 0x8 )
> +                    goto out; /* execute-only code segment */
> +                break;
> +            case hvm_access_write:
> +                if ( (reg->attr.fields.type & 0xa) != 0x2 )
> +                    goto out; /* not a writable data segment */
> +                break;
> +            default:
> +                break;
> +            }
>          }
> 
>          last_byte = (uint32_t)offset + bytes - !!bytes;
> 
>          /* Is this a grows-down data segment? Special limit check if so. */
> -        if ( (reg->attr.fields.type & 0xc) == 0x4 )
> +        if ( reg->attr.fields.s && (reg->attr.fields.type & 0xc) == 0x4 )
>          {
>              /* Is upper limit 0xFFFF or 0xFFFFFFFF? */
>              if ( !reg->attr.fields.db )
> @@ -2530,10 +2534,18 @@ bool_t hvm_virtual_to_linear_addr(
>      else
>      {
>          /*
> -         * LONG MODE: FS and GS add segment base. Addresses must be canonical.
> +         * User segments are always treated as present.  System segment may
> +         * not be, and also incur limit checks.
>           */
> +        if ( is_x86_system_segment(seg) &&
> +             (!reg->attr.fields.p || (offset + bytes - !!bytes) > reg->limit) )
> +            goto out;
> 
> -        if ( (seg == x86_seg_fs) || (seg == x86_seg_gs) )
> +        /*
> +         * LONG MODE: FS, GS and system segments: add segment base. All
> +         * addresses must be canonical.
> +         */
> +        if ( seg >= x86_seg_fs )
>              addr += reg->base;
> 
>          last_byte = addr + bytes - !!bytes;
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index 88d4642..954c157 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -162,9 +162,15 @@ static int hvm_translate_linear_addr(
> 
>      if ( !okay )
>      {
> -        x86_emul_hw_exception(
> -            (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
> -            0, &sh_ctxt->ctxt);
> +        /*
> +         * Leave exception injection to the caller for non-user segments: We
> +         * neither know the exact error code to be used, nor can we easily
> +         * determine the kind of exception (#GP or #TS) in that case.
> +         */
> +        if ( is_x86_user_segment(seg) )
> +            x86_emul_hw_exception(
> +                (seg == x86_seg_ss) ? TRAP_stack_error : TRAP_gp_fault,
> +                0, &sh_ctxt->ctxt);
>          return X86EMUL_EXCEPTION;
>      }
> 
> diff --git a/xen/arch/x86/x86_emulate/x86_emulate.h b/xen/arch/x86/x86_emulate/x86_emulate.h
> index cc26e9d..4d18623 100644
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -27,7 +27,11 @@
> 
>  struct x86_emulate_ctxt;
> 
> -/* Comprehensive enumeration of x86 segment registers. */
> +/*
> + * Comprehensive enumeration of x86 segment registers.  Various bits of code
> + * rely on this order (general purpose before system, tr at the beginning of
> + * system).
> + */
>  enum x86_segment {
>      /* General purpose.  Matches the SReg3 encoding in opcode/ModRM bytes. */
>      x86_seg_es,
> @@ -36,21 +40,25 @@ enum x86_segment {
>      x86_seg_ds,
>      x86_seg_fs,
>      x86_seg_gs,
> -    /* System. */
> +    /* System: Valid to use for implicit table references. */
>      x86_seg_tr,
>      x86_seg_ldtr,
>      x86_seg_gdtr,
>      x86_seg_idtr,
> -    /*
> -     * Dummy: used to emulate direct processor accesses to management
> -     * structures (TSS, GDT, LDT, IDT, etc.) which use linear addressing
> -     * (no segment component) and bypass usual segment- and page-level
> -     * protection checks.
> -     */
> +    /* No Segment: For accesses which are already linear. */
>      x86_seg_none
>  };
> 
> -#define is_x86_user_segment(seg) ((unsigned)(seg) <= x86_seg_gs)
> +static inline bool is_x86_user_segment(enum x86_segment seg)
> +{
> +    unsigned int idx = seg;
> +
> +    return idx <= x86_seg_gs;
> +}
> +static inline bool is_x86_system_segment(enum x86_segment seg)
> +{
> +    return seg >= x86_seg_tr && seg < x86_seg_none;
> +}
> 
>  /* Classification of the types of software generated interrupts/exceptions. */
>  enum x86_swint_type {
> --
> 2.1.4



* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 16:33     ` Jan Beulich
@ 2016-11-23 16:43       ` Tim Deegan
  0 siblings, 0 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Andrew Cooper,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

At 09:33 -0700 on 23 Nov (1479893609), Jan Beulich wrote:
> >>> On 23.11.16 at 17:19, <tim@xen.org> wrote:
> > Hi,
> > 
> > At 15:38 +0000 on 23 Nov (1479915529), Andrew Cooper wrote:
> >> The emulator needs to gain an understanding of interrupts and exceptions
> >> generated by its actions.
> >> 
> >> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
> >> are visible to the emulator.  This removes the need for the
> >> inject_{hw,sw}_interrupt() hooks, which are dropped and replaced with
> >> x86_emul_{hw_exception,software_event}() instead.
> >> 
> >> The shadow pagetable and PV uses of x86_emulate() previously failed with
> >> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks, but this behaviour
> >> has subtly changed.  Adjust the return value checking to cause a pending event
> >> to fall back into the previous codepath.
> >> 
> >> No overall functional change.
> > 
> > AIUI this does have a change in the shadow callers in the case where
> > the emulated instruction would inject an event.  Previously we would
> > have failed the emulation, perhaps unshadowed something, and returned
> > to the guest to retry.
> > 
> > Now the emulator records the event in the context struct, updates the
> > register state and returns success, so we'll return on the *next*
> > instruction.  I think that's OK, though.
> 
> Not exactly - instead of success, X86EMUL_EXCEPTION is being
> returned, which would suppress register updates.

Oh right.  In that case AFAICS neither invocation of x86_emulate() in
shadow/multi.c needs any adjustment.

Tim.


* Re: [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-23 16:40     ` Andrew Cooper
@ 2016-11-23 16:50       ` Tim Deegan
  2016-11-23 16:58         ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Tim Deegan @ 2016-11-23 16:50 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Jan Beulich, Xen-devel

At 16:40 +0000 on 23 Nov (1479919254), Andrew Cooper wrote:
> On 23/11/16 16:31, Tim Deegan wrote:
> > At 15:38 +0000 on 23 Nov (1479915532), Andrew Cooper wrote:
> >> Introduce a new x86_emul_pagefault() similar to x86_emul_hw_exception(), and
> >> use this instead of hvm_inject_page_fault() from emulation codepaths.
> >>
> >> Replace one hvm_inject_hw_exception() in the shadow emulation codepaths.
> >>
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >> NOTE: this is a functional change for the shadow code, as a #PF previously
> >> raised properly with the guest will now cause X86EMUL_UNHANDLEABLE. It is my
> >> understanding after a discussion with Tim that this is ok, but I haven't done
> >> extensive testing yet.
> > Do you plan to?  I think this is indeed OK, but there may be some edge
> > case, e.g. an instruction that writes to both the current top-level
> > pagetable (which can't be unshadowed) and an unmapped virtual address.
> > That ought to raise #PF in the guest but might now spin retrying?
> 
> That is a devious corner case.  I take it you have been there before?

In similar situations, yes. :)

> The more I think about these changes, the more I think that the shadow
>> code would be better off selectively looking for a pending event, injecting
> pagefaults, but rejecting and retrying if any other event shows up.

That sounds like a good idea, and seems like the smallest deviation
from the current behaviour.  It might also be OK to inject any event
that the emulator raises.  That's a bigger change but maybe a more
coherent end result?

Cheers,

Tim.


* Re: [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-23 16:50       ` Tim Deegan
@ 2016-11-23 16:58         ` Andrew Cooper
  2016-11-24 10:43           ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 16:58 UTC (permalink / raw)
  To: Tim Deegan; +Cc: George Dunlap, Paul Durrant, Jan Beulich, Xen-devel

On 23/11/16 16:50, Tim Deegan wrote:
> At 16:40 +0000 on 23 Nov (1479919254), Andrew Cooper wrote:
>> On 23/11/16 16:31, Tim Deegan wrote:
>>> At 15:38 +0000 on 23 Nov (1479915532), Andrew Cooper wrote:
>>>> Introduce a new x86_emul_pagefault() similar to x86_emul_hw_exception(), and
>>>> use this instead of hvm_inject_page_fault() from emulation codepaths.
>>>>
>>>> Replace one hvm_inject_hw_exception() in the shadow emulation codepaths.
>>>>
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> NOTE: this is a functional change for the shadow code, as a #PF previously
> >>>> raised properly with the guest will now cause X86EMUL_UNHANDLEABLE. It is my
> >>>> understanding after a discussion with Tim that this is ok, but I haven't done
> >>>> extensive testing yet.
>>> Do you plan to?  I think this is indeed OK, but there may be some edge
>>> case, e.g. an instruction that writes to both the current top-level
>>> pagetable (which can't be unshadowed) and an unmapped virtual address.
>>> That ought to raise #PF in the guest but might now spin retrying?
>> That is a devious corner case.  I take it you have been there before?
> In similar situations, yes. :)
>
>> The more I think about these changes, the more I think that the shadow
>> code would be better off selectively looking for a pending event, injecting
>> pagefaults, but rejecting and retrying if any other event shows up.
> That sounds like a good idea, and seems like the smallest deviation
> from the current behaviour.  It might also be OK to inject any event
> that the emulator raises.  That's a bigger change but maybe a more
> coherent end result?

Well - now that this isn't hidden in a security fix, I am less averse to
functional changes.

My only concern is that the previous lack of the ->inject_hw_exception()
hook cut off large chunks of functionality from the shadow and PV PT
emulation paths, and I am not sure opening this up in general is a good
idea.

Longterm the plan is to fully split the decode and emulate calls even
for external callers, at which point the pagetable code could check
that the instruction is a write which matches %cr2 before proceeding with
emulation.  Even then however, I am not sure it would be a good idea to
follow anything other than a pagefault which surfaces from emulation.
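
Roughly the following shape, with entirely hypothetical names, since none
of this interface exists yet:

    rc = x86_decode(&ctxt, ops);
    if ( rc == X86EMUL_OKAY &&
         insn_is_memory_write(&ctxt) &&       /* invented helper */
         insn_linear_dest(&ctxt) == cr2 )     /* invented helper */
        rc = x86_emulate(&ctxt, ops);
    else
        rc = X86EMUL_UNHANDLEABLE;            /* unshadow and retry */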

~Andrew


* Re: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-23 15:38 ` [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
  2016-11-23 16:12   ` Paul Durrant
@ 2016-11-23 16:59   ` Boris Ostrovsky
  2016-11-24  6:17   ` Tian, Kevin
  2016-11-24 13:56   ` Jan Beulich
  3 siblings, 0 replies; 91+ messages in thread
From: Boris Ostrovsky @ 2016-11-23 16:59 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Suravee Suthikulpanit, Kevin Tian, Paul Durrant, Jun Nakajima,
	Jan Beulich

On 11/23/2016 10:38 AM, Andrew Cooper wrote:
> The x86 emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.  The naming choice is to match both the Intel and
> AMD terms, and to avoid 'trap' specifically as it has an architectural meaning
> different to its current usage.
>
> While making this change, make other changes for consistency
>
>  * Rename *_trap() infrastructure to *_event()
>  * Rename trapnr/trap parameters to vector
>  * Convert hvm_inject_hw_exception() and hvm_inject_page_fault() to being
>    static inlines, as they are only thin wrappers around hvm_inject_event()
>
> No functional change.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>


Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>


* Re: [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  2016-11-23 15:38 ` [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
  2016-11-23 16:20   ` Paul Durrant
@ 2016-11-23 17:05   ` Boris Ostrovsky
  2016-11-24  6:18   ` Tian, Kevin
  2016-11-24 14:18   ` Jan Beulich
  3 siblings, 0 replies; 91+ messages in thread
From: Boris Ostrovsky @ 2016-11-23 17:05 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Suravee Suthikulpanit, Kevin Tian, Paul Durrant, Jun Nakajima,
	Jan Beulich

On 11/23/2016 10:38 AM, Andrew Cooper wrote:
> and move it to live with the other x86_event infrastructure in x86_emulate.h.
> Switch it and x86_event.error_code to being signed, matching the rest of the
> code.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>




* Re: [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back
  2016-11-23 16:39   ` Tim Deegan
@ 2016-11-23 17:06     ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 17:06 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Kevin Tian, Paul Durrant, Jun Nakajima, Jan Beulich, Xen-devel

On 23/11/16 16:39, Tim Deegan wrote:
> At 15:38 +0000 on 23 Nov (1479915536), Andrew Cooper wrote:
>> Drop the call to hvm_inject_page_fault() in __hvm_copy(), and require callers
>> to inject the pagefault themselves.
>>
>> No functional change.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> index afacd5f..88d4642 100644
>> --- a/xen/arch/x86/mm/shadow/common.c
>> +++ b/xen/arch/x86/mm/shadow/common.c
>> @@ -198,6 +198,7 @@ hvm_read(enum x86_segment seg,
>>      case HVMCOPY_okay:
>>          return X86EMUL_OKAY;
>>      case HVMCOPY_bad_gva_to_gfn:
>> +        hvm_inject_page_fault(pfinfo.ec, pfinfo.linear);
>>          return X86EMUL_EXCEPTION;
>>      case HVMCOPY_bad_gfn_to_mfn:
>>      case HVMCOPY_unhandleable:
> Should this also be converted to pass the injection to the emulator
> rather than injecting it directly?

Yes.

emulate_gva_to_mfn() also needs similar treatment, but will include some
PV pagetable fun as well.
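
For the hunk quoted above, the conversion would presumably end up as:

    case HVMCOPY_bad_gva_to_gfn:
        x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &sh_ctxt->ctxt);
        return X86EMUL_EXCEPTION;

(a sketch only - leaving the pending #PF in the emulation context for the
caller to deal with, instead of injecting it directly.)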

~Andrew


* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 15:38 ` [PATCH 06/15] x86/emul: Rework emulator event injection Andrew Cooper
  2016-11-23 16:19   ` Tim Deegan
@ 2016-11-23 17:56   ` Boris Ostrovsky
  2016-11-24  6:20   ` Tian, Kevin
  2016-11-24 14:53   ` Jan Beulich
  3 siblings, 0 replies; 91+ messages in thread
From: Boris Ostrovsky @ 2016-11-23 17:56 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Kevin Tian, Jan Beulich, George Dunlap, Tim Deegan, Paul Durrant,
	Jun Nakajima, Suravee Suthikulpanit

On 11/23/2016 10:38 AM, Andrew Cooper wrote:
> The emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.
>
> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
> are visible to the emulator.  This removes the need for the
> inject_{hw,sw}_interrupt() hooks, which are dropped and replaced with
> x86_emul_{hw_exception,software_event}() instead.
>
> The shadow pagetable and PV uses of x86_emulate() previously failed with
> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks, but this behaviour
> has subtly changed.  Adjust the return value checking to cause a pending event
> to fall back into the previous codepath.
>
> No overall functional change.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>


Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-23 15:38 ` [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
@ 2016-11-23 19:01   ` Boris Ostrovsky
  2016-11-23 19:28     ` Andrew Cooper
  2016-11-24  6:24   ` Tian, Kevin
  2016-11-24 15:25   ` Jan Beulich
  2 siblings, 1 reply; 91+ messages in thread
From: Boris Ostrovsky @ 2016-11-23 19:01 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Suravee Suthikulpanit, Kevin Tian, Jun Nakajima, Jan Beulich

On 11/23/2016 10:38 AM, Andrew Cooper wrote:
> Intel VT-x and AMD SVM provide access to the full segment descriptor cache via
> fields in the VMCB/VMCS.  However, the bits which are actually checked by
> hardware and preserved across vmentry/exit are inconsistent, and the vendor
> accessor functions perform inconsistent modification to the raw values.
>
> Convert {svm,vmx}_{get,set}_segment_register() into raw accessors, and alter
> hvm_{get,set}_segment_register() to cook the values consistently.  This allows
> the common emulation code to better rely on finding architecturally-expected
> values.
>
> This does cause some functional changes because of the modifications being
> applied uniformly.  A side effect of this fixes latent bugs where
> vmx_set_segment_register() didn't correctly fix up .G for segments, and
> inconsistent fixing up of the GDTR/IDTR limits.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  xen/arch/x86/hvm/hvm.c        | 151 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/svm/svm.c    |  20 +-----
>  xen/arch/x86/hvm/vmx/vmx.c    |   6 +-
>  xen/include/asm-x86/desc.h    |   6 ++
>  xen/include/asm-x86/hvm/hvm.h |  17 ++---
>  5 files changed, 164 insertions(+), 36 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ef83100..804cd88 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -6051,6 +6051,157 @@ void hvm_domain_soft_reset(struct domain *d)
>  }
>  
>  /*
> + * Segment caches in VMCB/VMCS are inconsistent about which bits are checked,
> + * important, and preserved across vmentry/exit.  Cook the values to make them
> + * closer to what is architecturally expected from entries in the segment
> + * cache.
> + */
> +void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
> +                              struct segment_register *reg)
> +{
> +    hvm_funcs.get_segment_register(v, seg, reg);
> +
> +    switch ( seg )
> +    {
> +    case x86_seg_ss:
> +        /* SVM may retain %ss.DB when %ss is loaded with a NULL selector. */
> +        if ( !reg->attr.fields.p )
> +            reg->attr.fields.db = 0;

The removed SVM code relied on the type field being zero to indicate a
NULL segment. Looking at the P bit is the correct way, and I think it's
worth mentioning this in the commit message.


> +}
> +
> +void hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
> +                              struct segment_register *reg)
> +{
> +    /* Set G to match the limit field.  VT-x cares, while SVM doesn't. */
> +    if ( reg->attr.fields.p )
> +        reg->attr.fields.g = !!(reg->limit >> 20);
> +
> +    switch ( seg )
> +    {
> +    case x86_seg_cs:
> +        ASSERT(reg->attr.fields.p);                  /* Usable. */
> +        ASSERT(reg->attr.fields.s);                  /* User segment. */
> +        ASSERT((reg->base >> 32) == 0);              /* Upper bits clear. */
> +        break;
> +
> +    case x86_seg_ss:
> +        if ( reg->attr.fields.p )
> +        {
> +            ASSERT(reg->attr.fields.s);              /* User segment. */
> +            ASSERT(!(reg->attr.fields.type & 0x8));  /* Data segment. */
> +            ASSERT(reg->attr.fields.type & 0x2);     /* Writeable. */
> +            ASSERT((reg->base >> 32) == 0);          /* Upper bits clear. */
> +        }
> +        break;

SVM requires attributes of any NULL segment to be zero. I don't know
about Intel but if it's the same then should we ASSERT this as well?
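The check in question might look like this (a sketch only, assuming the
attr.bytes union member covers all of the attribute bits):

    /* Hypothetical stricter check: NULL segments carry no attributes. */
    if ( !reg->attr.fields.p )
        ASSERT(reg->attr.bytes == 0);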


-boris


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-23 19:01   ` Boris Ostrovsky
@ 2016-11-23 19:28     ` Andrew Cooper
  2016-11-23 19:41       ` Boris Ostrovsky
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 19:28 UTC (permalink / raw)
  To: Boris Ostrovsky, Xen-devel
  Cc: Suravee Suthikulpanit, Kevin Tian, Jun Nakajima, Jan Beulich

On 23/11/16 19:01, Boris Ostrovsky wrote:
> On 11/23/2016 10:38 AM, Andrew Cooper wrote:
>> Intel VT-x and AMD SVM provide access to the full segment descriptor cache via
>> fields in the VMCB/VMCS.  However, the bits which are actually checked by
>> hardware and preserved across vmentry/exit are inconsistent, and the vendor
>> accessor functions perform inconsistent modification to the raw values.
>>
>> Convert {svm,vmx}_{get,set}_segment_register() into raw accessors, and alter
>> hvm_{get,set}_segment_register() to cook the values consistently.  This allows
>> the common emulation code to better rely on finding architecturally-expected
>> values.
>>
>> This does cause some functional changes because of the modifications being
>> applied uniformly.  A side effect of this fixes latent bugs where
>> vmx_set_segment_register() didn't correctly fix up .G for segments, and
>> inconsistent fixing up of the GDTR/IDTR limits.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Jun Nakajima <jun.nakajima@intel.com>
>> CC: Kevin Tian <kevin.tian@intel.com>
>> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>  xen/arch/x86/hvm/hvm.c        | 151 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/x86/hvm/svm/svm.c    |  20 +-----
>>  xen/arch/x86/hvm/vmx/vmx.c    |   6 +-
>>  xen/include/asm-x86/desc.h    |   6 ++
>>  xen/include/asm-x86/hvm/hvm.h |  17 ++---
>>  5 files changed, 164 insertions(+), 36 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index ef83100..804cd88 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -6051,6 +6051,157 @@ void hvm_domain_soft_reset(struct domain *d)
>>  }
>>  
>>  /*
>> + * Segment caches in VMCB/VMCS are inconsistent about which bits are checked,
>> + * important, and preserved across vmentry/exit.  Cook the values to make them
>> + * closer to what is architecturally expected from entries in the segment
>> + * cache.
>> + */
>> +void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
>> +                              struct segment_register *reg)
>> +{
>> +    hvm_funcs.get_segment_register(v, seg, reg);
>> +
>> +    switch ( seg )
>> +    {
>> +    case x86_seg_ss:
>> +        /* SVM may retain %ss.DB when %ss is loaded with a NULL selector. */
>> +        if ( !reg->attr.fields.p )
>> +            reg->attr.fields.db = 0;
> The removed SVM code relied on the type field being zero to indicate a
> NULL segment. Looking at the P bit is the correct way, and I think it's
> worth mentioning this in the commit message.

Oh yes.  As far as I can tell, that was simply broken before.

>
>
>> +}
>> +
>> +void hvm_set_segment_register(struct vcpu *v, enum x86_segment seg,
>> +                              struct segment_register *reg)
>> +{
>> +    /* Set G to match the limit field.  VT-x cares, while SVM doesn't. */
>> +    if ( reg->attr.fields.p )
>> +        reg->attr.fields.g = !!(reg->limit >> 20);
>> +
>> +    switch ( seg )
>> +    {
>> +    case x86_seg_cs:
>> +        ASSERT(reg->attr.fields.p);                  /* Usable. */
>> +        ASSERT(reg->attr.fields.s);                  /* User segment. */
>> +        ASSERT((reg->base >> 32) == 0);              /* Upper bits clear. */
>> +        break;
>> +
>> +    case x86_seg_ss:
>> +        if ( reg->attr.fields.p )
>> +        {
>> +            ASSERT(reg->attr.fields.s);              /* User segment. */
>> +            ASSERT(!(reg->attr.fields.type & 0x8));  /* Data segment. */
>> +            ASSERT(reg->attr.fields.type & 0x2);     /* Writeable. */
>> +            ASSERT((reg->base >> 32) == 0);          /* Upper bits clear. */
>> +        }
>> +        break;
> SVM requires attributes of any NULL segment to be zero.

Where is this claim made?  Vol2 recommends that the VMM clear all
attributes, but the wording of the previous paragraph suggests that the
attributes would be ignored in this case.  The %ss bug, and some
experimentation on my part, also indicate that they are ignored.

> I don't know about Intel but if it's the same then should we ASSERT this as well?

On Intel if unusable is set, all other bits are ignored.

However, the behaviour of both Intel and AMD is to occasionally set
upper attribute bits.  At some point I intend to make emulated segment
loading match d->arch.vendor's behaviour, at which point such an
ASSERT() would definitely trip.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-23 19:28     ` Andrew Cooper
@ 2016-11-23 19:41       ` Boris Ostrovsky
  2016-11-23 19:58         ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Boris Ostrovsky @ 2016-11-23 19:41 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Suravee Suthikulpanit, Kevin Tian, Jun Nakajima, Jan Beulich

On 11/23/2016 02:28 PM, Andrew Cooper wrote:
>
>> SVM requires attributes of any NULL segment to be zero.
> Where is this claim made?  Vol2 recommends that the VMM clear all
> attributes, but the wording of the previous paragraph suggests that the
> attributes would be ignored in this case.  The %ss bug, and some
> experimentation on my part, also indicate that they are ignored.

15.5.1 Basic Operation, Segment State in the VMCB:

The VMM should follow these rules when storing segment attributes into
the VMCB
* For NULL segments, set all attribute bits to zero; otherwise, write
the concatenation of bits 55:52 and 47:40 from the original 64-bit
(in-memory) segment descriptors.

I guess the preceding text is indeed unclear as to whether this is
somehow enforced (in which case I am not sure I see the point of having
this rule).
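A sketch of the rule as written (the helper name and packing layout
here are illustrative only):

/* Pack descriptor attribute bits 55:52 and 47:40 into the VMCB's 12-bit
 * segment attribute format; NULL segments get all attribute bits zero. */
static uint16_t vmcb_seg_attr(uint64_t desc, bool null_seg)
{
    if ( null_seg )
        return 0;

    return ((desc >> 40) & 0xff) | (((desc >> 52) & 0xf) << 8);
}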

-boris

>
>> I don't know about Intel but if it's the same then should we ASSERT this as well?
> On Intel if unusable is set, all other bits are ignored.
>
> However, the behaviour of both Intel and AMD is to occasionally set
> upper attribute bits.  At some point I intend to make emulated segment
> loading match d->arch.vendor's behaviour, at which point such an
> ASSERT() would definitely trip.
>
> ~Andrew



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-23 19:41       ` Boris Ostrovsky
@ 2016-11-23 19:58         ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-23 19:58 UTC (permalink / raw)
  To: Boris Ostrovsky, Xen-devel
  Cc: Suravee Suthikulpanit, Kevin Tian, Jun Nakajima, Jan Beulich

On 23/11/16 19:41, Boris Ostrovsky wrote:
> On 11/23/2016 02:28 PM, Andrew Cooper wrote:
>>> SVM requires attributes of any NULL segment to be zero.
>> Where is this claim made?  Vol2 recommends that the VMM clear all
>> attributes, but the wording of the previous paragraph suggests that the
>> attributes would be ignored in this case.  The %ss bug, and some
>> experimentation on my part, also indicate that they are ignored.
> 15.5.1 Basic Operation, Segment State in the VMCB:
>
> The VMM should follow these rules when storing segment attributes into
> the VMCB
> * For NULL segments, set all attribute bits to zero; otherwise, write
> the concatenation of bits 55:52 and 47:40 from the original 64-bit
> (in-memory) segment descriptors.
>
> I guess the preceding text is indeed unclear as to whether this is
> somehow enforced (in which case I am not sure I see the point of having
> this rule).

The quality of information in this regard is very poor.  Both Intel and
AMD expose the segment cache to hypervisors, without actually
documenting the behaviour and expectations.  I spent several weeks
during the development of XSA-191 trying to divine what actually
happens, especially in terms of what a guest can do to the segment cache
without suffering a VMEXIT.

One point I deliberately didn't highlight at the time in c/s 12bc22f791
is that such an emulated instruction, on AMD, discards the non-canonical
part of the base, and happily runs with a truncated address implicitly
sign extended.  If you happen to look back through my submissions, you
will find quite a few patches drip-feeding segmentation fixes and
improvements.  c/s ed00f17616 was another curious issue to stumble across.
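A sketch of the truncation described above (based on observed
behaviour, not on any documented guarantee):

    /* AMD appears to keep only the low 48 bits of a non-canonical base,
     * running with the result implicitly sign-extended from bit 47. */
    base = (uint64_t)((int64_t)(base << 16) >> 16);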

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity
  2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
                     ` (3 preceding siblings ...)
  2016-11-23 16:41   ` Boris Ostrovsky
@ 2016-11-24  6:16   ` Tian, Kevin
  4 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:16 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Wei Liu, Nakajima, Jun, Jan Beulich, Paul Durrant,
	Suravee Suthikulpanit, Boris Ostrovsky

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
>  * Move hvm_emulate_init() to immediately after hvm_emulate_prepare(), as they
>    very closely related.
>  * Rename hvm_emulate_prepare() to hvm_emulate_init_once() and
>    hvm_emulate_init() to hvm_emulate_init_per_insn() to make it clearer how to
>    and when to use them.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-23 15:38 ` [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
  2016-11-23 16:12   ` Paul Durrant
  2016-11-23 16:59   ` Boris Ostrovsky
@ 2016-11-24  6:17   ` Tian, Kevin
  2016-11-24 13:56   ` Jan Beulich
  3 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:17 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Suravee Suthikulpanit, Boris Ostrovsky, Paul Durrant, Nakajima,
	Jun, Jan Beulich

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
> The x86 emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.  The naming choice is to match both the Intel and
> AMD terms, and to avoid 'trap' specifically as it has an architectural meaning
> different to its current usage.
> 
> While making this change, make other changes for consistency
> 
>  * Rename *_trap() infrastructure to *_event()
>  * Rename trapnr/trap parameters to vector
>  * Convert hvm_inject_hw_exception() and hvm_inject_page_fault() to being
>    static inlines, as they are only thin wrappers around hvm_inject_event()
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  2016-11-23 15:38 ` [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
  2016-11-23 16:20   ` Paul Durrant
  2016-11-23 17:05   ` Boris Ostrovsky
@ 2016-11-24  6:18   ` Tian, Kevin
  2016-11-24 14:18   ` Jan Beulich
  3 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:18 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Suravee Suthikulpanit, Boris Ostrovsky, Paul Durrant, Nakajima,
	Jun, Jan Beulich

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
> and move it to live with the other x86_event infrastructure in x86_emulate.h.
> Switch it and x86_event.error_code to being signed, matching the rest of the
> code.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 15:38 ` [PATCH 06/15] x86/emul: Rework emulator event injection Andrew Cooper
  2016-11-23 16:19   ` Tim Deegan
  2016-11-23 17:56   ` Boris Ostrovsky
@ 2016-11-24  6:20   ` Tian, Kevin
  2016-11-24 14:53   ` Jan Beulich
  3 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:20 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Jan Beulich, George Dunlap, Tim Deegan, Paul Durrant, Nakajima,
	Jun, Boris Ostrovsky, Suravee Suthikulpanit

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
> The emulator needs to gain an understanding of interrupts and exceptions
> generated by its actions.
> 
> Move hvm_emulate_ctxt.{exn_pending,trap} into struct x86_emulate_ctxt so they
> are visible to the emulator.  This removes the need for the
> inject_{hw,sw}_interrupt() hooks, which are dropped and replaced with
> x86_emul_{hw_exception,software_event}() instead.
> 
> The shadow pagetable and PV uses of x86_emulate() previously failed with
> X86EMUL_UNHANDLEABLE due to the lack of inject_*() hooks, but this behaviour
> has subtly changed.  Adjust the return value checking to cause a pending event
> to fall back into the previous codepath.
> 
> No overall functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 07/15] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register()
  2016-11-23 15:38 ` [PATCH 07/15] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
@ 2016-11-24  6:20   ` Tian, Kevin
  0 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:20 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Nakajima, Jun

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
> No functional change at this point, but this is a prerequisite for forthcoming
> functional changes.
> 
> Make vmx_get_segment_register() private to vmx.c like all the other Vendor
> get/set functions.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
> ---
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> ---

Acked-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-23 15:38 ` [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
  2016-11-23 19:01   ` Boris Ostrovsky
@ 2016-11-24  6:24   ` Tian, Kevin
  2016-11-24 15:25   ` Jan Beulich
  2 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:24 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Suravee Suthikulpanit, Boris Ostrovsky, Nakajima, Jun, Jan Beulich

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
> Intel VT-x and AMD SVM provide access to the full segment descriptor cache via
> fields in the VMCB/VMCS.  However, the bits which are actually checked by
> hardware and preserved across vmentry/exit are inconsistent, and the vendor
> accessor functions perform inconsistent modification to the raw values.
> 
> Convert {svm,vmx}_{get,set}_segment_register() into raw accessors, and alter
> hvm_{get,set}_segment_register() to cook the values consistently.  This allows
> the common emulation code to better rely on finding architecturally-expected
> values.
> 
> This does cause some functional changes because of the modifications being
> applied uniformly.  A side effect of this fixes latent bugs where
> vmx_set_segment_register() didn't correctly fix up .G for segments, and
> inconsistent fixing up of the GDTR/IDTR limits.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer
  2016-11-23 15:38 ` [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
  2016-11-23 16:32   ` Tim Deegan
  2016-11-23 16:36   ` Paul Durrant
@ 2016-11-24  6:25   ` Tian, Kevin
  2 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:25 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Paul Durrant, Tim Deegan, Nakajima, Jun

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
> which is filled with pagefault information should one occur.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  2016-11-23 15:38 ` [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
  2016-11-23 16:35   ` Tim Deegan
@ 2016-11-24  6:26   ` Tian, Kevin
  2016-11-24 15:41   ` Jan Beulich
  2 siblings, 0 replies; 91+ messages in thread
From: Tian, Kevin @ 2016-11-24  6:26 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Paul Durrant, Tim Deegan, Nakajima, Jun, Jan Beulich

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, November 23, 2016 11:39 PM
> 
> The functions use linear addresses, not virtual addresses, as no segmentation
> is used.  (Lots of other code in Xen makes this mistake.)
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-23 16:58         ` Andrew Cooper
@ 2016-11-24 10:43           ` Jan Beulich
  2016-11-24 10:46             ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 10:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Tim Deegan, Xen-devel

>>> On 23.11.16 at 17:58, <andrew.cooper3@citrix.com> wrote:
> Longterm the plan is to fully split the decode and emulate calls even
> for external callers,

This is news to me - why would that be? My PV priv-op series will
add a hook called between decode and execute, but I don't see
why we should bother all callers with invoking decode and execute
separately.
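For reference, the two shapes under discussion might look like this (a
sketch with hypothetical external signatures; neither existed at the
time of writing):

    /* (a) Fully split decode and execute for external callers: */
    rc = x86_decode(&state, &ctxt, ops);
    if ( rc == X86EMUL_OKAY )
        rc = x86_emulate_state(&state, &ctxt, ops);

    /* (b) A single entry point, with an audit hook in between: */
    rc = x86_emulate(&ctxt, ops); /* ops->validate() called post-decode */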

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-24 10:43           ` Jan Beulich
@ 2016-11-24 10:46             ` Andrew Cooper
  2016-11-24 11:24               ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 10:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Paul Durrant, Tim Deegan, Xen-devel

On 24/11/16 10:43, Jan Beulich wrote:
>>>> On 23.11.16 at 17:58, <andrew.cooper3@citrix.com> wrote:
>> Longterm the plan is to fully split the decode and emulate calls even
>> for external callers,
> This is news to me - why would that be? My PV priv-op series will
> add a hook called between decode and execute, but I don't see
> why we should bother all callers with invoking decode and execute
> separately.

Having an audit hook is also fine.  It achieves the same overall effect.

Without seeing how you planned to do this, I was referring back to the
discussion we had at the April Hackathon.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back
  2016-11-24 10:46             ` Andrew Cooper
@ 2016-11-24 11:24               ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 11:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Tim Deegan, Xen-devel

>>> On 24.11.16 at 11:46, <andrew.cooper3@citrix.com> wrote:
> On 24/11/16 10:43, Jan Beulich wrote:
>>>>> On 23.11.16 at 17:58, <andrew.cooper3@citrix.com> wrote:
>>> Longterm the plan is to fully split the decode and emulate calls even
>>> for external callers,
>> This is news to me - why would that be? My PV priv-op series will
>> add a hook called between decode and execute, but I don't see
>> why we should bother all callers with invoking decode and execute
>> separately.
> 
> Having an audit hook is also fine.  It achieves the same overall effect.
> 
> Without seeing how you planned to do this, I was referring back to the
> discussion we had at the April Hackathon.

Oh, I see. We actually did briefly discuss the hook later (see e.g.
lists.xenproject.org/archives/html/xen-devel/2016-09/msg03367.html),
and it doesn't make much sense for me to post v3 without your
XSA-191 follow-up finished and in place.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 02/15] x86/emul: Simplfy emulation state setup
  2016-11-23 15:38 ` [PATCH 02/15] x86/emul: Simplfy emulation state setup Andrew Cooper
  2016-11-23 15:58   ` Paul Durrant
  2016-11-23 16:07   ` Tim Deegan
@ 2016-11-24 13:44   ` Jan Beulich
  2016-11-24 13:59     ` Andrew Cooper
  2 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 13:44 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Tim Deegan, Xen-devel

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5363,8 +5363,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>          goto bail;
>      }
>  
> +    memset(&ptwr_ctxt, 0, sizeof(ptwr_ctxt));
> +
>      ptwr_ctxt.ctxt.regs = regs;
> -    ptwr_ctxt.ctxt.force_writeback = 0;
>      ptwr_ctxt.ctxt.addr_size = ptwr_ctxt.ctxt.sp_size =
>          is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG;
>      ptwr_ctxt.ctxt.swint_emulate = x86_swint_emulate_none;

Mind using an explicit initializer instead, moving everything there that's
already known at function start?
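I.e. something along these lines (a sketch; field names as in the
existing code):

    struct ptwr_emulate_ctxt ptwr_ctxt = {
        .ctxt = {
            .regs = regs,
            .addr_size = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
            .sp_size   = is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG,
            .swint_emulate = x86_swint_emulate_none,
        },
    };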

> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1904,6 +1904,8 @@ x86_decode(
>      state->regs = ctxt->regs;
>      state->eip = ctxt->regs->eip;
>  
> +    /* Initialise output state in x86_emulate_ctxt */
> +    ctxt->opcode = ~0u;

I have to admit that I don't see the point of this.

> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -412,6 +412,10 @@ struct cpu_user_regs;
>  
>  struct x86_emulate_ctxt
>  {
> +    /*
> +     * Input state:
> +     */
> +
>      /* Register state before/after emulation. */
>      struct cpu_user_regs *regs;
>  
> @@ -421,14 +425,21 @@ struct x86_emulate_ctxt
>      /* Stack pointer width in bits (16, 32 or 64). */
>      unsigned int sp_size;
>  
> -    /* Canonical opcode (see below). */
> -    unsigned int opcode;
> -
>      /* Software event injection support. */
>      enum x86_swint_emulation swint_emulate;
>  
>      /* Set this if writes may have side effects. */
> -    uint8_t force_writeback;
> +    bool force_writeback;
> +
> +    /* Caller data that can be used by x86_emulate_ops' routines. */
> +    void *data;
> +
> +    /*
> +     * Output state:
> +     */
> +
> +    /* Canonical opcode (see below). */
> +    unsigned int opcode;
>  
>      /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
>      union {

Hmm, this separation looks to be correct for the current state of the
emulator, but I don't think it is architecturally so (and hence it would
become wrong in the course of us completing it): Both addr_size and
sp_size are not necessarily inputs only. Also keeping regs in the
input only group is slightly misleading - the pointer itself surely is input
only, but the pointed to data isn't.

So I'd suggest to have three groups: input, input/output, output,
even if for your purpose here you only want to tell apart fields which
need to be initialized before calling x86_emulate() / x86_decode()
(the first two groups) from those which don't (the last group).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-23 15:38 ` [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
                     ` (2 preceding siblings ...)
  2016-11-24  6:17   ` Tian, Kevin
@ 2016-11-24 13:56   ` Jan Beulich
  2016-11-24 14:42     ` Andrew Cooper
  3 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 13:56 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
> @@ -67,6 +67,28 @@ enum x86_swint_emulation {
>      x86_swint_emulate_all,  /* Help needed with all software events */
>  };
>  
> +/*
> + * x86 event types. This enumeration is valid for:
> + *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
> + *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
> + */
> +enum x86_event_type {
> +    X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
> +    X86_EVENTTYPE_NMI = 2,          /* NMI */
> +    X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
> +    X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
> +    X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
> +    X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
> +};
> +
> +struct x86_event {
> +    int16_t       vector;
> +    uint8_t       type;         /* X86_EVENTTYPE_* */

Do we perhaps want to make the compiler warn about possibly
incomplete switch statements, but making this an 8-bit field of
type enum x86_event_type? (That would perhaps imply making
vector and insn_len bitfields too; see also below.)

> +    uint8_t       insn_len;     /* Instruction length */
> +    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
> +    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
> +};

Also I have to admit I'm not really happy about the mixing of fixed
width and fundamental types. Can I talk you into using only the
latter?
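Taken together, the two suggestions might yield something like the
following (field widths are illustrative; enum bitfields rely on a GCC
extension):

struct x86_event {
    int                 vector:16;
    enum x86_event_type type:8;     /* switch() can now warn on gaps */
    unsigned int        insn_len:8; /* Instruction length */
    int                 error_code; /* X86_EVENT_NO_EC if n/a */
    unsigned long       cr2;        /* Only for TRAP_page_fault h/w exception */
};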

> --- a/xen/include/asm-x86/hvm/vmx/vvmx.h
> +++ b/xen/include/asm-x86/hvm/vmx/vvmx.h
> @@ -112,8 +112,8 @@ void nvmx_vcpu_destroy(struct vcpu *v);
>  int nvmx_vcpu_reset(struct vcpu *v);
>  uint64_t nvmx_vcpu_eptp_base(struct vcpu *v);
>  enum hvm_intblk nvmx_intr_blocked(struct vcpu *v);
> -bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
> -                                 int error_code);
> +bool_t nvmx_intercepts_exception(
> +    struct vcpu *v, unsigned int vector, int error_code);

This reformatting doesn't appear to be in line with other nearby
code. But iirc you've got an ack from the VMX side already...

Anyway, with or without the comments addressed,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 02/15] x86/emul: Simplfy emulation state setup
  2016-11-24 13:44   ` Jan Beulich
@ 2016-11-24 13:59     ` Andrew Cooper
  2016-11-24 14:18       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 13:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Paul Durrant, Tim Deegan, Xen-devel

On 24/11/16 13:44, Jan Beulich wrote:
>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5363,8 +5363,9 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>          goto bail;
>>      }
>>  
>> +    memset(&ptwr_ctxt, 0, sizeof(ptwr_ctxt));
>> +
>>      ptwr_ctxt.ctxt.regs = regs;
>> -    ptwr_ctxt.ctxt.force_writeback = 0;
>>      ptwr_ctxt.ctxt.addr_size = ptwr_ctxt.ctxt.sp_size =
>>          is_pv_32bit_domain(d) ? 32 : BITS_PER_LONG;
>>      ptwr_ctxt.ctxt.swint_emulate = x86_swint_emulate_none;
> Mind using an explicit initializer instead, moving everything there that's
> already known at function start?

Done.

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -1904,6 +1904,8 @@ x86_decode(
>>      state->regs = ctxt->regs;
>>      state->eip = ctxt->regs->eip;
>>  
>> +    /* Initialise output state in x86_emulate_ctxt */
>> +    ctxt->opcode = ~0u;
> I have to admit that I don't see the point of this.

There are early exit paths which leave it uninitialised.  An alternative
would be to explicitly document that it is only valid for X86EMUL_OKAY?

>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -412,6 +412,10 @@ struct cpu_user_regs;
>>  
>>  struct x86_emulate_ctxt
>>  {
>> +    /*
>> +     * Input state:
>> +     */
>> +
>>      /* Register state before/after emulation. */
>>      struct cpu_user_regs *regs;
>>  
>> @@ -421,14 +425,21 @@ struct x86_emulate_ctxt
>>      /* Stack pointer width in bits (16, 32 or 64). */
>>      unsigned int sp_size;
>>  
>> -    /* Canonical opcode (see below). */
>> -    unsigned int opcode;
>> -
>>      /* Software event injection support. */
>>      enum x86_swint_emulation swint_emulate;
>>  
>>      /* Set this if writes may have side effects. */
>> -    uint8_t force_writeback;
>> +    bool force_writeback;
>> +
>> +    /* Caller data that can be used by x86_emulate_ops' routines. */
>> +    void *data;
>> +
>> +    /*
>> +     * Output state:
>> +     */
>> +
>> +    /* Canonical opcode (see below). */
>> +    unsigned int opcode;
>>  
>>      /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
>>      union {
> Hmm, this separation looks to be correct for the current state of the
> emulator, but I don't think it is architecturally so (and hence it would
> become wrong in the course of us completing it): Both addr_size and
> sp_size are not necessarily inputs only. Also keeping regs in the
> input only group is slightly misleading - the pointer itself surely is input
> only, but the pointed to data isn't.
>
> So I'd suggest to have three groups: input, input/output, output,
> even if for your purpose here you only want to tell apart fields which
> need to be initialized before calling x86_emulate() / x86_decode()
> (the first two groups) from those which don't (the last group).

Right - proposed net result looks like:

struct x86_emulate_ctxt
{
    /*
     * Input-only state:
     */

    /* Software event injection support. */
    enum x86_swint_emulation swint_emulate;

    /* Set this if writes may have side effects. */
    bool force_writeback;

    /* Caller data that can be used by x86_emulate_ops' routines. */
    void *data;

    /*
     * Input/output state:
     */

    /* Register state before/after emulation. */
    struct cpu_user_regs *regs;

    /* Default address size in current execution mode (16, 32, or 64). */
    unsigned int addr_size;

    /* Stack pointer width in bits (16, 32 or 64). */
    unsigned int sp_size;

    /*
     * Output-only state:
     */

    /* Canonical opcode (see below) (valid only on X86EMUL_OKAY). */
    unsigned int opcode;

    /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
    union {
        struct {
            uint8_t hlt:1;          /* Instruction HLTed. */
            uint8_t mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
            uint8_t sti:1;          /* Instruction sets STI irq shadow. */
        } flags;
        uint8_t byte;
    } retire;
};

If that is ok?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 02/15] x86/emul: Simplfy emulation state setup
  2016-11-24 13:59     ` Andrew Cooper
@ 2016-11-24 14:18       ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 14:18 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: George Dunlap, Paul Durrant, Tim Deegan, Xen-devel

>>> On 24.11.16 at 14:59, <andrew.cooper3@citrix.com> wrote:
> On 24/11/16 13:44, Jan Beulich wrote:
>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -1904,6 +1904,8 @@ x86_decode(
>>>      state->regs = ctxt->regs;
>>>      state->eip = ctxt->regs->eip;
>>>  
>>> +    /* Initialise output state in x86_emulate_ctxt */
>>> +    ctxt->opcode = ~0u;
>> I have to admit that I don't see the point of this.
> 
> There are early exit paths which leave it uninitialised.  An alternative
> would be to explicitly document that it is only valid for X86EMUL_OKAY?

Well, to me that goes without saying, but I'm fine with it being added.

>> So I'd suggest to have three groups: input, input/output, output,
>> even if for your purpose here you only want to tell apart fields which
>> need to be initialized before calling x86_emulate() / x86_decode()
>> (the first two groups) from those which don't (the last group).
> 
> Right - proposed net result looks like:
> 
> struct x86_emulate_ctxt
> {
>     /*
>      * Input-only state:
>      */
> 
>     /* Software event injection support. */
>     enum x86_swint_emulation swint_emulate;
> 
>     /* Set this if writes may have side effects. */
>     bool force_writeback;
> 
>     /* Caller data that can be used by x86_emulate_ops' routines. */
>     void *data;
> 
>     /*
>      * Input/output state:
>      */
> 
>     /* Register state before/after emulation. */
>     struct cpu_user_regs *regs;
> 
>     /* Default address size in current execution mode (16, 32, or 64). */
>     unsigned int addr_size;
> 
>     /* Stack pointer width in bits (16, 32 or 64). */
>     unsigned int sp_size;
> 
>     /*
>      * Output-only state:
>      */
> 
>     /* Canonical opcode (see below) (valid only on X86EMUL_OKAY). */
>     unsigned int opcode;
> 
>     /* Retirement state, set by the emulator (valid only on X86EMUL_OKAY). */
>     union {
>         struct {
>             uint8_t hlt:1;          /* Instruction HLTed. */
>             uint8_t mov_ss:1;       /* Instruction sets MOV-SS irq shadow. */
>             uint8_t sti:1;          /* Instruction sets STI irq shadow. */
>         } flags;
>         uint8_t byte;
>     } retire;
> };
> 
> If that is ok?

Looks good, thanks. With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC
  2016-11-23 15:38 ` [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
                     ` (2 preceding siblings ...)
  2016-11-24  6:18   ` Tian, Kevin
@ 2016-11-24 14:18   ` Jan Beulich
  3 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 14:18 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> and move it to live with the other x86_event infrastructure in x86_emulate.h.
> Switch it and x86_event.error_code to being signed, matching the rest of the
> code.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Preferably rebased on top of the SVM patch just sent,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 05/15] x86/emul: Remove opencoded exception generation
  2016-11-23 15:38 ` [PATCH 05/15] x86/emul: Remove opencoded exception generation Andrew Cooper
@ 2016-11-24 14:31   ` Jan Beulich
  2016-11-24 16:24     ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 14:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> +static inline int mkec(uint8_t e, int32_t ec, ...)
> +{
> +    return (e < 32 && (1u << e) & EXC_HAS_EC) ? ec : X86_EVENT_NO_EC;

Please parenthesize the operands of &.
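I.e. (a sketch of the requested change):

    return (e < 32 && ((1u << e) & EXC_HAS_EC)) ? ec : X86_EVENT_NO_EC;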

> +}
> +
> +#define generate_exception_if(p, e, ec...)                                \
>  ({  if ( (p) ) {                                                          \
>          fail_if(ops->inject_hw_exception == NULL);                        \
> -        rc = ops->inject_hw_exception(e, ec, ctxt) ? : X86EMUL_EXCEPTION; \
> +        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \

Did you notice that with the 0 used here, ...

> @@ -1167,11 +1181,9 @@ static int ioport_access_check(
>      if ( (rc = ops->read_segment(x86_seg_tr, &tr, ctxt)) != 0 )
>          return rc;
>  
> -    /* Ensure that the TSS is valid and has an io-bitmap-offset field. */
> -    if ( !tr.attr.fields.p ||
> -         ((tr.attr.fields.type & 0xd) != 0x9) ||
> -         (tr.limit < 0x67) )
> -        goto raise_exception;
> +    /* Ensure the TSS has an io-bitmap-offset field. */
> +    generate_exception_if(tr.attr.fields.type != 0xb ||
> +                          tr.limit < 0x67, EXC_GP, 0);

... invocations like this one don't really need their error code
specified anymore either?
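With the trailing 0 supplied by the macro, such a caller could indeed
drop the explicit error code (sketch):

    /* mkec() supplies ec = 0 for vectors in EXC_HAS_EC: */
    generate_exception_if(tr.attr.fields.type != 0xb ||
                          tr.limit < 0x67, EXC_GP);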

With you having added my S-o-b (not really sure why), I'm not sure
it makes a whole lot of sense to give my R-b as well (but feel free
to add it).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-24 13:56   ` Jan Beulich
@ 2016-11-24 14:42     ` Andrew Cooper
  2016-11-24 14:57       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 14:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Suravee Suthikulpanit, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky

On 24/11/16 13:56, Jan Beulich wrote:
>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>> @@ -67,6 +67,28 @@ enum x86_swint_emulation {
>>      x86_swint_emulate_all,  /* Help needed with all software events */
>>  };
>>  
>> +/*
>> + * x86 event types. This enumeration is valid for:
>> + *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
>> + *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
>> + */
>> +enum x86_event_type {
>> +    X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
>> +    X86_EVENTTYPE_NMI = 2,          /* NMI */
>> +    X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
>> +    X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
>> +    X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
>> +    X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
>> +};
>> +
>> +struct x86_event {
>> +    int16_t       vector;
>> +    uint8_t       type;         /* X86_EVENTTYPE_* */
> Do we perhaps want to make the compiler warn about possibly
> incomplete switch statements, by making this an 8-bit field of
> type enum x86_event_type? (That would perhaps imply making
> vector and insn_len bitfields too; see also below.)
>
>> +    uint8_t       insn_len;     /* Instruction length */
>> +    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
>> +    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
>> +};
> Also I have to admit I'm not really happy about the mixing of fixed
> width and fundamental types. Can I talk you into using only the
> latter?

I am open to the idea of swapping things around, but wonder whether this
would be better done in a separate patch to avoid interfering with this
mechanical movement.

>
>> --- a/xen/include/asm-x86/hvm/vmx/vvmx.h
>> +++ b/xen/include/asm-x86/hvm/vmx/vvmx.h
>> @@ -112,8 +112,8 @@ void nvmx_vcpu_destroy(struct vcpu *v);
>>  int nvmx_vcpu_reset(struct vcpu *v);
>>  uint64_t nvmx_vcpu_eptp_base(struct vcpu *v);
>>  enum hvm_intblk nvmx_intr_blocked(struct vcpu *v);
>> -bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
>> -                                 int error_code);
>> +bool_t nvmx_intercepts_exception(
>> +    struct vcpu *v, unsigned int vector, int error_code);
> This reformatting doesn't appear to be in line with other nearby
> code. But iirc you've got an ack from the VMX side already...

The first version also had an int => unsigned int change for error_code.

Now, the only difference is trap => vector, but I would like to keep it
for consistency with the other changes.

~Andrew

>
> Anyway, with or without the comments addressed,
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> Jan
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-23 15:38 ` [PATCH 06/15] x86/emul: Rework emulator event injection Andrew Cooper
                     ` (2 preceding siblings ...)
  2016-11-24  6:20   ` Tian, Kevin
@ 2016-11-24 14:53   ` Jan Beulich
  2016-11-24 17:00     ` Andrew Cooper
  3 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 14:53 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Tim Deegan,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5377,7 +5377,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>      page_unlock(page);
>      put_page(page);
>  
> -    if ( rc == X86EMUL_UNHANDLEABLE )
> +    if ( rc == X86EMUL_UNHANDLEABLE || ptwr_ctxt.ctxt.event_pending )
>          goto bail;
>  
>      perfc_incr(ptwr_emulations);
> @@ -5501,7 +5501,8 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>      else
>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
>  
> -    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
> +    return ((rc != X86EMUL_UNHANDLEABLE && !ctxt.event_pending)
> +            ? EXCRET_fault_fixed : 0);
>  }

Wouldn't these two better be adjusted to check for OKAY and RETRY,
especially since, iirc, we had settled on it not (yet) being guaranteed
to see event_pending set whenever getting back EXCEPTION?
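I.e. keying the bail path off the return code alone (sketch):

    if ( rc != X86EMUL_OKAY && rc != X86EMUL_RETRY )
        goto bail;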

> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3378,7 +3378,7 @@ static int sh_page_fault(struct vcpu *v,
>       * would be a good unshadow hint. If we *do* decide to unshadow-on-fault
>       * then it must be 'failable': we cannot require the unshadow to succeed.
>       */
> -    if ( r == X86EMUL_UNHANDLEABLE )
> +    if ( r == X86EMUL_UNHANDLEABLE || emul_ctxt.ctxt.event_pending )

Same here then.

> @@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
>              shadow_continue_emulation(&emul_ctxt, regs);
>              v->arch.paging.last_write_was_pt = 0;
>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
> -            if ( r == X86EMUL_OKAY )
> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )

Aiui you need this for the swint case. But wouldn't you then need to
add similar checks in OKAY paths elsewhere? Or alternatively,
wouldn't it be better to have x86_emulate() return EXCEPTION also
for successful swint emulation (albeit that would likely require other
not very nice adjustments)?
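That alternative would let such callers keep a plain return-code check
(a sketch of the intended effect, not of the emulator-side adjustments):

    r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
    if ( r == X86EMUL_OKAY )
    {
        /* With swint surfacing as X86EMUL_EXCEPTION, a bare OKAY check
         * no longer risks overlooking a pending event. */
    }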

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure
  2016-11-24 14:42     ` Andrew Cooper
@ 2016-11-24 14:57       ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 14:57 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, Xen-devel, Paul Durrant,
	Jun Nakajima, Boris Ostrovsky

>>> On 24.11.16 at 15:42, <andrew.cooper3@citrix.com> wrote:
> On 24/11/16 13:56, Jan Beulich wrote:
>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>> @@ -67,6 +67,28 @@ enum x86_swint_emulation {
>>>      x86_swint_emulate_all,  /* Help needed with all software events */
>>>  };
>>>  
>>> +/*
>>> + * x86 event types. This enumeration is valid for:
>>> + *  Intel VMX: {VM_ENTRY,VM_EXIT,IDT_VECTORING}_INTR_INFO[10:8]
>>> + *  AMD SVM: eventinj[10:8] and exitintinfo[10:8] (types 0-4 only)
>>> + */
>>> +enum x86_event_type {
>>> +    X86_EVENTTYPE_EXT_INTR,         /* External interrupt */
>>> +    X86_EVENTTYPE_NMI = 2,          /* NMI */
>>> +    X86_EVENTTYPE_HW_EXCEPTION,     /* Hardware exception */
>>> +    X86_EVENTTYPE_SW_INTERRUPT,     /* Software interrupt (CD nn) */
>>> +    X86_EVENTTYPE_PRI_SW_EXCEPTION, /* ICEBP (F1) */
>>> +    X86_EVENTTYPE_SW_EXCEPTION,     /* INT3 (CC), INTO (CE) */
>>> +};
>>> +
>>> +struct x86_event {
>>> +    int16_t       vector;
>>> +    uint8_t       type;         /* X86_EVENTTYPE_* */
>> Do we perhaps want to make the compiler warn about possibly
>> incomplete switch statements, but making this an 8-bit field of
>> type enum x86_event_type? (That would perhaps imply making
>> vector and insn_len bitfields too; see also below.)
>>
>>> +    uint8_t       insn_len;     /* Instruction length */
>>> +    uint32_t      error_code;   /* HVM_DELIVER_NO_ERROR_CODE if n/a */
>>> +    unsigned long cr2;          /* Only for TRAP_page_fault h/w exception */
>>> +};
>> Also I have to admit I'm not really happy about the mixing of fixed
>> width and fundamental types. Can I talk you into using only the
>> latter?
> 
> I am open to the idea of swapping things around, but wonder whether this
> would be better done in a separate patch to avoid interfering with this
> mechanical movement.

Okay then.

>>> --- a/xen/include/asm-x86/hvm/vmx/vvmx.h
>>> +++ b/xen/include/asm-x86/hvm/vmx/vvmx.h
>>> @@ -112,8 +112,8 @@ void nvmx_vcpu_destroy(struct vcpu *v);
>>>  int nvmx_vcpu_reset(struct vcpu *v);
>>>  uint64_t nvmx_vcpu_eptp_base(struct vcpu *v);
>>>  enum hvm_intblk nvmx_intr_blocked(struct vcpu *v);
>>> -bool_t nvmx_intercepts_exception(struct vcpu *v, unsigned int trap,
>>> -                                 int error_code);
>>> +bool_t nvmx_intercepts_exception(
>>> +    struct vcpu *v, unsigned int vector, int error_code);
>> This reformatting doesn't appear to be in line with other nearby
>> code. But iirc you've got an ack from the VMX side already...
> 
> The first version also had an int => unsigned int change for error_code.
> 
> Now, the only difference is trap => vector, but I would like to keep it
> for consistency with the other changes.

Consistency? My comment was solely about the change in indentation
and breaking between lines, and there I think consistency with
surrounding code is more important than consistency with other
changes.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-23 15:38 ` [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
  2016-11-23 19:01   ` Boris Ostrovsky
  2016-11-24  6:24   ` Tian, Kevin
@ 2016-11-24 15:25   ` Jan Beulich
  2016-11-24 17:22     ` Andrew Cooper
  2 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 15:25 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Boris Ostrovsky, Kevin Tian, Jun Nakajima, Suravee Suthikulpanit,
	Xen-devel

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> +    case x86_seg_tr:
> +        ASSERT(reg->attr.fields.p);                  /* Usable. */
> +        ASSERT(!reg->attr.fields.s);                 /* System segment. */
> +        ASSERT(!(reg->sel & 0x4));                   /* !TI. */
> +        ASSERT(reg->attr.fields.type == SYS_DESC_tss16_busy ||
> +               reg->attr.fields.type == SYS_DESC_tss_busy);
> +        ASSERT(is_canonical_address(reg->base));
> +        break;

I can't help thinking that the slightly more strict

+        if ( reg->attr.fields.type == SYS_DESC_tss_busy )
+            ASSERT(is_canonical_address(reg->base));
+        else if ( reg->attr.fields.type == SYS_DESC_tss16_busy )
+            ASSERT(!(reg->base >> 32));
+        else
+            ASSERT_UNREACHABLE();

would be better, even if that's going to make TR checking look a
little different than the others (but we should leverage the
information we have).

> +    case x86_seg_ldtr:
> +        if ( reg->attr.fields.p )
> +        {
> +            ASSERT(!reg->attr.fields.s);             /* System segment. */
> +            ASSERT(!(reg->sel & 0x4));               /* !TI. */
> +            ASSERT(reg->attr.fields.type == SYS_DESC_ldt);
> +            ASSERT(is_canonical_address(reg->base));
> +        }
> +        break;
> +
> +    case x86_seg_gdtr:
> +    case x86_seg_idtr:
> +        ASSERT(is_canonical_address(reg->base));
> +        ASSERT((reg->limit >> 16) == 0);             /* Upper bits clear. */
> +        break;
> +
> +    default:
> +        ASSERT_UNREACHABLE();
> +        break;
> +    }

Didn't you agree to change this last "break" to "return"?

> --- a/xen/include/asm-x86/desc.h
> +++ b/xen/include/asm-x86/desc.h
> @@ -89,7 +89,13 @@
>  #ifndef __ASSEMBLY__
>  
>  /* System Descriptor types for GDT and IDT entries. */
> +#define SYS_DESC_tss16_avail  1
>  #define SYS_DESC_ldt          2
> +#define SYS_DESC_tss16_busy   3
> +#define SYS_DESC_call_gate16  4
> +#define SYS_DESC_task_gate    5
> +#define SYS_DESC_irq_gate16   6
> +#define SYS_DESC_trap_gate16  7
>  #define SYS_DESC_tss_avail    9
>  #define SYS_DESC_tss_busy     11
>  #define SYS_DESC_call_gate    12

Thanks for completing the set.

Regardless of how you decide on the two earlier remarks,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear()
  2016-11-23 15:38 ` [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
  2016-11-23 16:35   ` Tim Deegan
  2016-11-24  6:26   ` Tian, Kevin
@ 2016-11-24 15:41   ` Jan Beulich
  2 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 15:41 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Paul Durrant, Tim Deegan, Jun Nakajima, Xen-devel

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> The functions use linear addresses, not virtual addresses, as no segmentation
> is used.  (Lots of other code in Xen makes this mistake.)
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references
  2016-11-23 15:38 ` [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references Andrew Cooper
  2016-11-23 16:42   ` Paul Durrant
@ 2016-11-24 15:48   ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 15:48 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Paul Durrant, Xen-devel

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> All system segments (GDT/IDT/LDT and TR) describe a linear address and limit,
> and act similarly to user segments.  However all current uses of these tables
> in the emulator opencode the address calculations and limit checks.  In
> particular, no care is taken for accesses which wrap around the 4GB or
> non-canonical boundaries.
> 
> Alter hvm_virtual_to_linear_addr() to cope with performing segmentation checks
> on system segments.  This involves restricting access checks in the 32bit case
> to user segments only, and adding presence/limit checks in the 64bit case.
> 
> When suffering a segmentation fault for a system segment, return
> X86EMUL_EXCEPTION but leave the fault injection to the caller.  The fault type
> depends on the higher level action being performed.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <JBeulich@suse.com>

I think the code that this covered has been moved elsewhere,
so please use
Reviewed-by: Jan Beulich <jbeulich@suse.com>
here instead.

Jan



* Re: [PATCH 15/15] x86/hvm: Use system-segment relative memory accesses
  2016-11-23 15:38 ` [PATCH 15/15] x86/hvm: Use system-segment relative memory accesses Andrew Cooper
@ 2016-11-24 16:01   ` Jan Beulich
  2016-11-24 16:03     ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 16:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1181,20 +1181,38 @@ static int ioport_access_check(
>          return rc;
>  
>      /* Ensure the TSS has an io-bitmap-offset field. */
> -    generate_exception_if(tr.attr.fields.type != 0xb ||
> -                          tr.limit < 0x67, EXC_GP, 0);
> +    generate_exception_if(tr.attr.fields.type != 0xb, EXC_GP, 0);
>  
> -    if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
> -                          &iobmp, 2, ctxt, ops)) )
> +    switch ( rc = read_ulong(x86_seg_tr, 0x66, &iobmp, 2, ctxt, ops) )
> +    {
> +    case X86EMUL_OKAY:
> +        break;
> +
> +    case X86EMUL_EXCEPTION:
> +        if ( !ctxt->event_pending )
> +            generate_exception_if(true, EXC_GP, 0);

generate_exception_if(!ctxt->event_pending, EXC_GP, 0) ?

> @@ -1471,9 +1490,12 @@ protmode_load_seg(
>      {
>          uint32_t new_desc_b = desc.b | a_flag;
>  
> -        if ( (rc = ops->cmpxchg(x86_seg_none, desctab.base + (sel & 0xfff8) + 4,
> -                                &desc.b, &new_desc_b, 4, ctxt)) != 0 )
> +        if ( (rc = ops->cmpxchg(sel_seg, (sel & 0xfff8) + 4, &desc.b,
> +                                &new_desc_b, 4, ctxt) != X86EMUL_OKAY) )
> +        {
> +            ASSERT(rc != X86EMUL_EXCEPTION);

Hmm, now that I look at this again I don't think it's right: Why did
we think there can't be any exception here? What if the descriptor
table page is write protected? Or page tables have been changed
behind our back after the earlier read?

Jan



* Re: [PATCH 15/15] x86/hvm: Use system-segment relative memory accesses
  2016-11-24 16:01   ` Jan Beulich
@ 2016-11-24 16:03     ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 16:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 24/11/16 16:01, Jan Beulich wrote:
>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -1181,20 +1181,38 @@ static int ioport_access_check(
>>          return rc;
>>  
>>      /* Ensure the TSS has an io-bitmap-offset field. */
>> -    generate_exception_if(tr.attr.fields.type != 0xb ||
>> -                          tr.limit < 0x67, EXC_GP, 0);
>> +    generate_exception_if(tr.attr.fields.type != 0xb, EXC_GP, 0);
>>  
>> -    if ( (rc = read_ulong(x86_seg_none, tr.base + 0x66,
>> -                          &iobmp, 2, ctxt, ops)) )
>> +    switch ( rc = read_ulong(x86_seg_tr, 0x66, &iobmp, 2, ctxt, ops) )
>> +    {
>> +    case X86EMUL_OKAY:
>> +        break;
>> +
>> +    case X86EMUL_EXCEPTION:
>> +        if ( !ctxt->event_pending )
>> +            generate_exception_if(true, EXC_GP, 0);
> generate_exception_if(!ctxt->event_pending, EXC_GP, 0) ?

Already noticed and fixed in v2.

>
>> @@ -1471,9 +1490,12 @@ protmode_load_seg(
>>      {
>>          uint32_t new_desc_b = desc.b | a_flag;
>>  
>> -        if ( (rc = ops->cmpxchg(x86_seg_none, desctab.base + (sel & 0xfff8) + 4,
>> -                                &desc.b, &new_desc_b, 4, ctxt)) != 0 )
>> +        if ( (rc = ops->cmpxchg(sel_seg, (sel & 0xfff8) + 4, &desc.b,
>> +                                &new_desc_b, 4, ctxt) != X86EMUL_OKAY) )
>> +        {
>> +            ASSERT(rc != X86EMUL_EXCEPTION);
> Hmm, now that I look at this again I don't think it's right: Why did
> we think there can't be any exception here?

Hmm.  I can't remember either.

> What if the descriptor table page is write protected?

Architecturally, a #PF is raised.

> Or page tables have been changed behind our back after the earlier read?

Currently nothing, because ops->cmpxchg() doesn't have atomic or xchg
properties.

I will drop the assertion, because it is definitely wrong for the
pagefault case.
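
The failure path would then presumably just propagate rc, along the
lines of:

        if ( (rc = ops->cmpxchg(sel_seg, (sel & 0xfff8) + 4, &desc.b,
                                &new_desc_b, 4, ctxt)) != X86EMUL_OKAY )
            return rc; /* may legitimately be X86EMUL_EXCEPTION, e.g. #PF */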

~Andrew


* Re: [PATCH 05/15] x86/emul: Remove opencoded exception generation
  2016-11-24 14:31   ` Jan Beulich
@ 2016-11-24 16:24     ` Andrew Cooper
  2016-11-24 16:31       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 16:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 24/11/16 14:31, Jan Beulich wrote:
>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>> +static inline int mkec(uint8_t e, int32_t ec, ...)
>> +{
>> +    return (e < 32 && (1u << e) & EXC_HAS_EC) ? ec : X86_EVENT_NO_EC;
> Please parenthesize the operands of &.

Fixed.
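
Presumably now reading:

    static inline int mkec(uint8_t e, int32_t ec, ...)
    {
        return (e < 32 && ((1u << e) & EXC_HAS_EC)) ? ec : X86_EVENT_NO_EC;
    }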

>
>> +}
>> +
>> +#define generate_exception_if(p, e, ec...)                                \
>>  ({  if ( (p) ) {                                                          \
>>          fail_if(ops->inject_hw_exception == NULL);                        \
>> -        rc = ops->inject_hw_exception(e, ec, ctxt) ? : X86EMUL_EXCEPTION; \
>> +        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \
> Did you notice that with the 0 used here, ...
>
>> @@ -1167,11 +1181,9 @@ static int ioport_access_check(
>>      if ( (rc = ops->read_segment(x86_seg_tr, &tr, ctxt)) != 0 )
>>          return rc;
>>  
>> -    /* Ensure that the TSS is valid and has an io-bitmap-offset field. */
>> -    if ( !tr.attr.fields.p ||
>> -         ((tr.attr.fields.type & 0xd) != 0x9) ||
>> -         (tr.limit < 0x67) )
>> -        goto raise_exception;
>> +    /* Ensure the TSS has an io-bitmap-offset field. */
>> +    generate_exception_if(tr.attr.fields.type != 0xb ||
>> +                          tr.limit < 0x67, EXC_GP, 0);
> ... invocations like this one don't really need their error code
> specified anymore either?

Yes, but I chose not to visually remove the error code from EXC_GP.

I attempted to get the compiler to force the presence or absence of the
error code parameter based on (e & EXC_HAS_EC) but failed to get
something working.  I doubt the C preprocessor is sufficiently
expressive for this use.

>
> With you having added my S-o-b (not really sure why), I'm not sure
> it makes a whole lot of sense to give my R-b as well (but feel free
> to add it).

Your code was originally the few bits replacing opencoded "goto
raise_exception" with generate_exception_if(), but the patch has morphed
a long way since then.  I am happy to exchange your S-o-B for R-b if you
would prefer?

~Andrew


* Re: [PATCH 05/15] x86/emul: Remove opencoded exception generation
  2016-11-24 16:24     ` Andrew Cooper
@ 2016-11-24 16:31       ` Jan Beulich
  2016-11-24 17:04         ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 16:31 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel

>>> On 24.11.16 at 17:24, <andrew.cooper3@citrix.com> wrote:
> On 24/11/16 14:31, Jan Beulich wrote:
>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>> +#define generate_exception_if(p, e, ec...)                                \
>>>  ({  if ( (p) ) {                                                          \
>>>          fail_if(ops->inject_hw_exception == NULL);                        \
>>> -        rc = ops->inject_hw_exception(e, ec, ctxt) ? : X86EMUL_EXCEPTION; \
>>> +        rc = ops->inject_hw_exception(e, mkec(e, ##ec, 0), ctxt)          \
>> Did you notice that with the 0 used here, ...
>>
>>> @@ -1167,11 +1181,9 @@ static int ioport_access_check(
>>>      if ( (rc = ops->read_segment(x86_seg_tr, &tr, ctxt)) != 0 )
>>>          return rc;
>>>  
>>> -    /* Ensure that the TSS is valid and has an io-bitmap-offset field. */
>>> -    if ( !tr.attr.fields.p ||
>>> -         ((tr.attr.fields.type & 0xd) != 0x9) ||
>>> -         (tr.limit < 0x67) )
>>> -        goto raise_exception;
>>> +    /* Ensure the TSS has an io-bitmap-offset field. */
>>> +    generate_exception_if(tr.attr.fields.type != 0xb ||
>>> +                          tr.limit < 0x67, EXC_GP, 0);
>> ... invocations like this one don't really need their error code
>> specified anymore either?
> 
> Yes, but I chose not to visually remove the error code from EXC_GP.

Well, let's keep it that way then for now, but I think I will be tempted
to get rid of those zeros down the road.

> I attempted to get the compiler to force the presence or absence of the
> error code parameter based on (e & EXC_HAS_EC) but failed to get
> something working.  I doubt the C preprocessor is sufficiently
> expressive for this use.

Right, I had tried to think of something like that too when
working out how that mkec() could be made to work, but
couldn't think of anything. After all, our use of the ellipsis is
pretty much heading in the opposite direction (and that's part
of why I think dropping the zeros too would be a good idea,
because that will make it more visible when we actually care
about a _specific_ error code).

>> With you having added my S-o-b (not really sure why), I'm not sure
>> it makes a whole lot of sense to give my R-b as well (but feel free
>> to add it).
> 
> Your code was originally the few bits replacing opencoded "goto
> raise_exception" with generate_exception_if(), but the patch has morphed
> a long way since then.  I am happy to exchange your S-o-B for R-b if you
> would prefer?

Yes, it's basically all your work, so I think R-b is more appropriate.

Jan



* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-24 14:53   ` Jan Beulich
@ 2016-11-24 17:00     ` Andrew Cooper
  2016-11-24 17:08       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 17:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Tim Deegan,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

On 24/11/16 14:53, Jan Beulich wrote:
>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5377,7 +5377,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>      page_unlock(page);
>>      put_page(page);
>>  
>> -    if ( rc == X86EMUL_UNHANDLEABLE )
>> +    if ( rc == X86EMUL_UNHANDLEABLE || ptwr_ctxt.ctxt.event_pending )
>>          goto bail;
>>  
>>      perfc_incr(ptwr_emulations);
>> @@ -5501,7 +5501,8 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>>      else
>>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
>>  
>> -    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
>> +    return ((rc != X86EMUL_UNHANDLEABLE && !ctxt.event_pending)
>> +            ? EXCRET_fault_fixed : 0);
>>  }
> Wouldn't these two better be adjusted to check for OKAY and RETRY,
> the more that iirc we had settled on it not (yet) being guaranteed to
> see event_pending set whenever getting back EXCEPTION?

In this patch, the key point I am guarding against is that, without the
->inject_*() hooks, some actions which previously took a fail_if() path
now succeed and latch an event.

From that point of view, it doesn't matter how the event became pending,
but the fact that one is means that it was a codepath which would
previously have returned UNHANDLEABLE.


Later patches, which stop raising faults behind the back of the emulator,
are the ones where new consideration is needed towards the handling of
EXCEPTION/event_pending.  Following Tim's feedback, I have more work to
do in patch 9, as propagate_page_fault() raises #PF behind the back of
the emulator for PV guests.

In other words, I think this patch wants to stay like this, and a later
one should change to be more accommodating.

>> @@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
>>              shadow_continue_emulation(&emul_ctxt, regs);
>>              v->arch.paging.last_write_was_pt = 0;
>>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
>> -            if ( r == X86EMUL_OKAY )
>> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )
> Aiui you need this for the swint case.

Why?  software interrupts were never previously tolerated in shadow
emulation.

> But wouldn't you then need to add similar checks in OKAY paths elsewhere?

I don't see why I would.  Does my explanation above resolve your concern?

> Or alternatively, wouldn't it be better to have x86_emulate() return EXCEPTION also
> for successful swint emulation (albeit that would likely require other
> not very nice adjustments)?

That would indeed be nasty.  If we were to go down that route, it would
be better to swap X86EMUL_EXCEPTION for X86EMUL_EVENT, which explicitly
also includes traps where a register writeback has been completed.

~Andrew


* Re: [PATCH 05/15] x86/emul: Remove opencoded exception generation
  2016-11-24 16:31       ` Jan Beulich
@ 2016-11-24 17:04         ` Andrew Cooper
  0 siblings, 0 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 17:04 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Xen-devel

On 24/11/16 16:31, Jan Beulich wrote:
>
>>> With you having added my S-o-b (not really sure why), I'm not sure
>>> it makes a whole lot of sense to give my R-b as well (but feel free
>>> to add it).
>> Your code was originally the few bits replacing opencoded "goto
>> raise_exception" with generate_exception_if(), but the patch has morphed
>> a long way since then.  I am happy to exchange your S-o-B for R-b if you
>> would prefer?
> Yes, it's basically all your work, so I think R-b is more appropriate.

Ok done.

~Andrew


* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-24 17:00     ` Andrew Cooper
@ 2016-11-24 17:08       ` Jan Beulich
  2016-11-24 17:19         ` Andrew Cooper
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-24 17:08 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Tim Deegan,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

>>> On 24.11.16 at 18:00, <andrew.cooper3@citrix.com> wrote:
> On 24/11/16 14:53, Jan Beulich wrote:
>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -5377,7 +5377,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>>      page_unlock(page);
>>>      put_page(page);
>>>  
>>> -    if ( rc == X86EMUL_UNHANDLEABLE )
>>> +    if ( rc == X86EMUL_UNHANDLEABLE || ptwr_ctxt.ctxt.event_pending )
>>>          goto bail;
>>>  
>>>      perfc_incr(ptwr_emulations);
>>> @@ -5501,7 +5501,8 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>>>      else
>>>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
>>>  
>>> -    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
>>> +    return ((rc != X86EMUL_UNHANDLEABLE && !ctxt.event_pending)
>>> +            ? EXCRET_fault_fixed : 0);
>>>  }
>> Wouldn't these two better be adjusted to check for OKAY and RETRY,
>> the more that iirc we had settled on it not (yet) being guaranteed to
>> see event_pending set whenever getting back EXCEPTION?
> 
> In this patch, the key point I am guarding against is that, without the
> ->inject_*() hooks, some actions which previously took a fail_if() path
> now succeed and latch an event.
> 
> From that point of view, it doesn't matter how the event became pending,
> but the fact that one is means that it was a codepath which would
> previously have returned UNHANDLEABLE.
> 
> 
> Later patches, which stop raising faults behind the back of emulator,
> are the ones where new consideration is needed towards the handling of
> EXCEPTION/event_pending.  Following Tim's feedback, I have more work to
> do in patch 9, as propagate_page_fault() raises #PF behind the back of
> the emulator for PV guests.
> 
> In other words, I think this patch wants to stay like this, and a later
> one change to be better accommodating.

Okay.

>>> @@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
>>>              shadow_continue_emulation(&emul_ctxt, regs);
>>>              v->arch.paging.last_write_was_pt = 0;
>>>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
>>> -            if ( r == X86EMUL_OKAY )
>>> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )
>> Aiui you need this for the swint case.
> 
> Why?  software interrupts were never previously tolerated in shadow
> emulation.

Then why would you expect OKAY together with event_pending set?
I'm not saying swint handling needs to succeed here, but I can't see
anything else to cause that particular state to occur.

>> But wouldn't you then need to add similar checks in OKAY paths elsewhere?
> 
> I don't see why I would.  Does my explanation above resolve your concern?

I'm afraid not: On the same basis as above, code not expecting to
handle swint may now see OKAY together with event_pending set,
and would need to indicate failure to their callers just like you do in
sh_page_fault().

Jan



* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-24 17:08       ` Jan Beulich
@ 2016-11-24 17:19         ` Andrew Cooper
  2016-11-24 17:30           ` Tim Deegan
  2016-11-25  7:42           ` Jan Beulich
  0 siblings, 2 replies; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 17:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Tim Deegan,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

On 24/11/16 17:08, Jan Beulich wrote:
>>>> On 24.11.16 at 18:00, <andrew.cooper3@citrix.com> wrote:
>> On 24/11/16 14:53, Jan Beulich wrote:
>>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>>> --- a/xen/arch/x86/mm.c
>>>> +++ b/xen/arch/x86/mm.c
>>>> @@ -5377,7 +5377,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>>>      page_unlock(page);
>>>>      put_page(page);
>>>>  
>>>> -    if ( rc == X86EMUL_UNHANDLEABLE )
>>>> +    if ( rc == X86EMUL_UNHANDLEABLE || ptwr_ctxt.ctxt.event_pending )
>>>>          goto bail;
>>>>  
>>>>      perfc_incr(ptwr_emulations);
>>>> @@ -5501,7 +5501,8 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>>>>      else
>>>>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
>>>>  
>>>> -    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
>>>> +    return ((rc != X86EMUL_UNHANDLEABLE && !ctxt.event_pending)
>>>> +            ? EXCRET_fault_fixed : 0);
>>>>  }
>>> Wouldn't these two better be adjusted to check for OKAY and RETRY,
>>> the more that iirc we had settled on it not (yet) being guaranteed to
>>> see event_pending set whenever getting back EXCEPTION?
>> In this patch, the key point I am guarding against is that, without the
>> ->inject_*() hooks, some actions which previously took a fail_if() path
>> now succeed and latch an event.
>>
>> From that point of view, it doesn't matter how the event became pending,
>> but the fact that one is means that it was a codepath which would
>> previously have returned UNHANDLEABLE.
>>
>>
>> Later patches, which stop raising faults behind the back of the emulator,
>> are the ones where new consideration is needed towards the handling of
>> EXCEPTION/event_pending.  Following Tim's feedback, I have more work to
>> do in patch 9, as propagate_page_fault() raises #PF behind the back of
>> the emulator for PV guests.
>>
>> In other words, I think this patch wants to stay like this, and a later
>> one should change to be more accommodating.
> Okay.
>
>>>> @@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
>>>>              shadow_continue_emulation(&emul_ctxt, regs);
>>>>              v->arch.paging.last_write_was_pt = 0;
>>>>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
>>>> -            if ( r == X86EMUL_OKAY )
>>>> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )
>>> Aiui you need this for the swint case.
>> Why?  software interrupts were never previously tolerated in shadow
>> emulation.
> Then why would you expect OKAY together with event_pending set?
> I'm not saying swint handling needs to succeed here, but I can't see
> anything else to cause that particular state to occur.

Before this patch, a VM playing race conditions with the emulator could
cause this case to emulate 0xcc, which would fail because of the lack of
an ->inject_sw_interrupt() hook, and return X86EMUL_UNHANDLEABLE.

The changes in this patch now mean that the same case would properly
latch #BP, returning OKAY because it's a trap, not an exception.

By not explicitly failing the OKAY case with an event pending, we are
suddenly opening up rather more functionality than previously existed.

>
>>> But wouldn't you then need to add similar checks in OKAY paths elsewhere?
>> I don't see why I would.  Does my explanation above resolve your concern?
> I'm afraid not: On the same basis as above, code not expecting to
> handle swint may now see OKAY together with event_pending set,
> and would need to indicate failure to their callers just like you do in
> sh_page_fault().

That is my intent with the current code.  I have double checked it, and
it still looks correct.

~Andrew


* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-24 15:25   ` Jan Beulich
@ 2016-11-24 17:22     ` Andrew Cooper
  2016-11-25  7:45       ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 17:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Boris Ostrovsky, Kevin Tian, Jun Nakajima, Suravee Suthikulpanit,
	Xen-devel

On 24/11/16 15:25, Jan Beulich wrote:
>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>> +    case x86_seg_tr:
>> +        ASSERT(reg->attr.fields.p);                  /* Usable. */
>> +        ASSERT(!reg->attr.fields.s);                 /* System segment. */
>> +        ASSERT(!(reg->sel & 0x4));                   /* !TI. */
>> +        ASSERT(reg->attr.fields.type == SYS_DESC_tss16_busy ||
>> +               reg->attr.fields.type == SYS_DESC_tss_busy);
>> +        ASSERT(is_canonical_address(reg->base));
>> +        break;
> I can't help thinking that the slightly more strict
>
> +        if ( reg->attr.fields.type == SYS_DESC_tss_busy )
> +            ASSERT(is_canonical_address(reg->base));
> +        else if ( reg->attr.fields.type == SYS_DESC_tss16_busy )
> +            ASSERT(!(reg->base >> 32));
> +        else
> +            ASSERT_UNREACHABLE();
>
> would be better, even if that's going to make TR checking look a
> little different than the others (but we should leverage the
> information we have).

I still don't like the use of ASSERT_UNREACHABLE(); as the "you failed
the typecheck" case.

Would ASSERT(!"%tr typecheck failure") be acceptable?

>
>> +    case x86_seg_ldtr:
>> +        if ( reg->attr.fields.p )
>> +        {
>> +            ASSERT(!reg->attr.fields.s);             /* System segment. */
>> +            ASSERT(!(reg->sel & 0x4));               /* !TI. */
>> +            ASSERT(reg->attr.fields.type == SYS_DESC_ldt);
>> +            ASSERT(is_canonical_address(reg->base));
>> +        }
>> +        break;
>> +
>> +    case x86_seg_gdtr:
>> +    case x86_seg_idtr:
>> +        ASSERT(is_canonical_address(reg->base));
>> +        ASSERT((reg->limit >> 16) == 0);             /* Upper bits clear. */
>> +        break;
>> +
>> +    default:
>> +        ASSERT_UNREACHABLE();
>> +        break;
>> +    }
> Didn't you agree to change this last "break" to "return"?

Yes.  Sorry.  Fixed.

>
>> --- a/xen/include/asm-x86/desc.h
>> +++ b/xen/include/asm-x86/desc.h
>> @@ -89,7 +89,13 @@
>>  #ifndef __ASSEMBLY__
>>  
>>  /* System Descriptor types for GDT and IDT entries. */
>> +#define SYS_DESC_tss16_avail  1
>>  #define SYS_DESC_ldt          2
>> +#define SYS_DESC_tss16_busy   3
>> +#define SYS_DESC_call_gate16  4
>> +#define SYS_DESC_task_gate    5
>> +#define SYS_DESC_irq_gate16   6
>> +#define SYS_DESC_trap_gate16  7
>>  #define SYS_DESC_tss_avail    9
>>  #define SYS_DESC_tss_busy     11
>>  #define SYS_DESC_call_gate    12
> Thanks for completing the set.
>
> Regardless of how you decide on the two earlier remarks,
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> Jan
>



* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-24 17:19         ` Andrew Cooper
@ 2016-11-24 17:30           ` Tim Deegan
  2016-11-24 17:37             ` Andrew Cooper
  2016-11-25  7:42           ` Jan Beulich
  1 sibling, 1 reply; 91+ messages in thread
From: Tim Deegan @ 2016-11-24 17:30 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Jun Nakajima, Kevin Tian, Jan Beulich, George Dunlap, Xen-devel,
	Paul Durrant, Suravee Suthikulpanit, Boris Ostrovsky

At 17:19 +0000 on 24 Nov (1480007992), Andrew Cooper wrote:
> On 24/11/16 17:08, Jan Beulich wrote:
> >>>> On 24.11.16 at 18:00, <andrew.cooper3@citrix.com> wrote:
> >> On 24/11/16 14:53, Jan Beulich wrote:
> >>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> >>>> --- a/xen/arch/x86/mm.c
> >>>> +++ b/xen/arch/x86/mm.c
> >>>> @@ -5377,7 +5377,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
> >>>>      page_unlock(page);
> >>>>      put_page(page);
> >>>>  
> >>>> -    if ( rc == X86EMUL_UNHANDLEABLE )
> >>>> +    if ( rc == X86EMUL_UNHANDLEABLE || ptwr_ctxt.ctxt.event_pending )
> >>>>          goto bail;
> >>>>  
> >>>>      perfc_incr(ptwr_emulations);
> >>>> @@ -5501,7 +5501,8 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
> >>>>      else
> >>>>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
> >>>>  
> >>>> -    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
> >>>> +    return ((rc != X86EMUL_UNHANDLEABLE && !ctxt.event_pending)
> >>>> +            ? EXCRET_fault_fixed : 0);
> >>>>  }
> >>> Wouldn't these two better be adjusted to check for OKAY and RETRY,
> >>> the more that iirc we had settled on it not (yet) being guaranteed to
> >>> see event_pending set whenever getting back EXCEPTION?
> >> In this patch, the key point I am guarding against is that, without the
> >> ->inject_*() hooks, some actions which previously took a fail_if() path
> >> now succeed and latch an event.
> >>
> >> From that point of view, it doesn't matter how the event became pending,
> >> but the fact that one is means that it was a codepath which would
> >> previously have returned UNHANDLEABLE.
> >>
> >>
> >> Later patches, which stop raising faults behind the back of the emulator,
> >> are the ones where new consideration is needed towards the handling of
> >> EXCEPTION/event_pending.  Following Tim's feedback, I have more work to
> >> do in patch 9, as propagate_page_fault() raises #PF behind the back of
> >> the emulator for PV guests.
> >>
> >> In other words, I think this patch wants to stay like this, and a later
> >> one should change to be more accommodating.
> > Okay.
> >
> >>>> @@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
> >>>>              shadow_continue_emulation(&emul_ctxt, regs);
> >>>>              v->arch.paging.last_write_was_pt = 0;
> >>>>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
> >>>> -            if ( r == X86EMUL_OKAY )
> >>>> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )
> >>> Aiui you need this for the swint case.
> >> Why?  software interrupts were never previously tolerated in shadow
> >> emulation.
> > Then why would you expect OKAY together with event_pending set?
> > I'm not saying swint handling needs to succeed here, but I can't see
> > anything else to cause that particular state to occur.
> 
> Before this patch, a VM playing race conditions with the emulator could
> cause this case to emulate 0xcc, which would fail because of the lack of
> an ->inject_sw_interrupt() hook, and return X86EMUL_UNHANDLEABLE.
> 
> The changes in this patch now mean that the same case would properly
> latch #BP, returning OKAY because it's a trap, not an exception.
> 
> By not explicitly failing the OKAY case with an event pending, we are
> suddenly opening up rather more functionality than previously existed.
> 
> >
> >>> But wouldn't you then need to add similar checks in OKAY paths elsewhere?
> >> I don't see why I would.  Does my explanation above resolve your concern?
> > I'm afraid not: On the same basis as above, code not expecting to
> > handle swint may now see OKAY together with event_pending set,
> > and would need to indicate failure to their callers just like you do in
> > sh_page_fault().
> 
> That is my intent with the current code.  I have double checked it, and
> it still looks correct.

So is that not the case I was worried about, where the emulator
updates register state but we then drop the expected event on the
floor?

Tim.


* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-24 17:30           ` Tim Deegan
@ 2016-11-24 17:37             ` Andrew Cooper
  2016-11-25  7:25               ` Jan Beulich
  0 siblings, 1 reply; 91+ messages in thread
From: Andrew Cooper @ 2016-11-24 17:37 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Jun Nakajima, Kevin Tian, Jan Beulich, George Dunlap, Xen-devel,
	Paul Durrant, Suravee Suthikulpanit, Boris Ostrovsky

On 24/11/16 17:30, Tim Deegan wrote:
> At 17:19 +0000 on 24 Nov (1480007992), Andrew Cooper wrote:
>> On 24/11/16 17:08, Jan Beulich wrote:
>>>>>> On 24.11.16 at 18:00, <andrew.cooper3@citrix.com> wrote:
>>>> On 24/11/16 14:53, Jan Beulich wrote:
>>>>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>>>>> --- a/xen/arch/x86/mm.c
>>>>>> +++ b/xen/arch/x86/mm.c
>>>>>> @@ -5377,7 +5377,7 @@ int ptwr_do_page_fault(struct vcpu *v, unsigned long addr,
>>>>>>      page_unlock(page);
>>>>>>      put_page(page);
>>>>>>  
>>>>>> -    if ( rc == X86EMUL_UNHANDLEABLE )
>>>>>> +    if ( rc == X86EMUL_UNHANDLEABLE || ptwr_ctxt.ctxt.event_pending )
>>>>>>          goto bail;
>>>>>>  
>>>>>>      perfc_incr(ptwr_emulations);
>>>>>> @@ -5501,7 +5501,8 @@ int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
>>>>>>      else
>>>>>>          rc = x86_emulate(&ctxt, &mmio_ro_emulate_ops);
>>>>>>  
>>>>>> -    return rc != X86EMUL_UNHANDLEABLE ? EXCRET_fault_fixed : 0;
>>>>>> +    return ((rc != X86EMUL_UNHANDLEABLE && !ctxt.event_pending)
>>>>>> +            ? EXCRET_fault_fixed : 0);
>>>>>>  }
>>>>> Wouldn't these two better be adjusted to check for OKAY and RETRY,
>>>>> the more that iirc we had settled on it not (yet) being guaranteed to
>>>>> see event_pending set whenever getting back EXCEPTION?
>>>> In this patch, the key point I am guarding against is that, without the
>>>> ->inject_*() hooks, some actions which previously took a fail_if() path
>>>> now succeed and latch an event.
>>>>
>>>> From that point of view, it doesn't matter how the event became pending,
>>>> but the fact that one is means that it was a codepath which would
>>>> previously have returned UNHANDLEABLE.
>>>>
>>>>
>>>> Later patches, which stop raising faults behind the back of emulator,
>>>> are the ones where new consideration is needed towards the handling of
>>>> EXCEPTION/event_pending.  Following Tim's feedback, I have more work to
>>>> do in patch 9, as propagate_page_fault() raises #PF behind the back of
>>>> the emulator for PV guests.
>>>>
>>>> In other words, I think this patch wants to stay like this, and a later
>>>> one change to be better accommodating.
>>> Okay.
>>>
>>>>>> @@ -3433,7 +3433,7 @@ static int sh_page_fault(struct vcpu *v,
>>>>>>              shadow_continue_emulation(&emul_ctxt, regs);
>>>>>>              v->arch.paging.last_write_was_pt = 0;
>>>>>>              r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
>>>>>> -            if ( r == X86EMUL_OKAY )
>>>>>> +            if ( r == X86EMUL_OKAY && !emul_ctxt.ctxt.event_pending )
>>>>> Aiui you need this for the swint case.
>>>> Why?  software interrupts were never previously tolerated in shadow
>>>> emulation.
>>> Then why would you expect OKAY together with event_pending set?
>>> I'm not saying swint handling needs to succeed here, but I can't see
>>> anything else to cause that particular state to occur.
>> Before this patch, a VM playing race conditions with the emulator could
>> cause this case to emulate 0xcc, which would fail because of the lack of
>> ->inject_sw_interrupt() hook, and return X86_UNHANDLEABLE.
>>
>> The changes in this patch now mean that the same case would properly
>> latch #BP, returning OKAY because its a trap not an exception.
>>
>> By not explicitly failing the OKAY case with an event pending, we are
>> suddenly opening up rather more functionality than previously existed.
>>
>>>>> But wouldn't you then need to add similar checks in OKAY paths elsewhere?
>>>> I don't see why I would.  Does my explanation above resolve your concern?
>>> I'm afraid not: On the same basis as above, code not expecting to
>>> handle swint may now see OKAY together with event_pending set,
>>> and would need to indicate failure to their callers just like you do in
>>> sh_page_fault().
>> That is my intent with the current code.  I have double checked it, and
>> it still looks correct.
> So is that not the case I was worried about, where the emulator
> updates register state but we then drop the expected event on the
> floor?

Oh right, yes.  Sorry for being dense.

As an interim between now and getting a proper audit hook, would a bool
permit_traps in x86_emulate_ctxt suffice?
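
Something along these lines (a sketch; everything except the proposed
permit_traps field itself is illustrative):

    /* In x86_emulate(), on the paths currently returning X86EMUL_OKAY: */
    if ( ctxt->event_pending && !ctxt->permit_traps )
        return X86EMUL_UNHANDLEABLE;

so that callers which don't opt in keep the current behaviour of failing
the emulation when an event has been latched.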

~Andrew


* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-24 17:37             ` Andrew Cooper
@ 2016-11-25  7:25               ` Jan Beulich
  2016-11-25  9:41                 ` Tim Deegan
  0 siblings, 1 reply; 91+ messages in thread
From: Jan Beulich @ 2016-11-25  7:25 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Tim Deegan,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

>>> On 24.11.16 at 18:37, <andrew.cooper3@citrix.com> wrote:
> As an interim between now and getting a proper audit hook, would a bool
> permit_traps in x86_emulate_ctxt suffice?

That's one option; the other would be to do away with only the
exception injection hook for now, and keep the swint one. That's
effectively the same as the boolean flag, just with less overall
code churn (and keeping the hook would not undermine this
series' goal of eliminating event injection behind the back of the
emulator).

Jan



* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-24 17:19         ` Andrew Cooper
  2016-11-24 17:30           ` Tim Deegan
@ 2016-11-25  7:42           ` Jan Beulich
  1 sibling, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-25  7:42 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Tim Deegan,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

>>> On 24.11.16 at 18:19, <andrew.cooper3@citrix.com> wrote:
> On 24/11/16 17:08, Jan Beulich wrote:
>>>>> On 24.11.16 at 18:00, <andrew.cooper3@citrix.com> wrote:
>>>> But wouldn't you then need to add similar checks in OKAY paths elsewhere?
>>> I don't see why I would.  Does my explanation above resolve your concern?
>> I'm afraid not: On the same basis as above, code not expecting to
>> handle swint may now see OKAY together with event_pending set,
>> and would need to indicate failure to their callers just like you do in
>> sh_page_fault().
> 
> That is my intent with the current code.  I have double checked it, and
> it still looks correct.

Then what about the handling immediately after the x86_emulate()
invocation in _hvm_emulate_one()? Apart from that I've now also
convinced myself that you handle all existing callers.

Jan



* Re: [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS
  2016-11-24 17:22     ` Andrew Cooper
@ 2016-11-25  7:45       ` Jan Beulich
  0 siblings, 0 replies; 91+ messages in thread
From: Jan Beulich @ 2016-11-25  7:45 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Boris Ostrovsky, Kevin Tian, Jun Nakajima, Suravee Suthikulpanit,
	Xen-devel

>>> On 24.11.16 at 18:22, <andrew.cooper3@citrix.com> wrote:
> On 24/11/16 15:25, Jan Beulich wrote:
>>>>> On 23.11.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>> +    case x86_seg_tr:
>>> +        ASSERT(reg->attr.fields.p);                  /* Usable. */
>>> +        ASSERT(!reg->attr.fields.s);                 /* System segment. */
>>> +        ASSERT(!(reg->sel & 0x4));                   /* !TI. */
>>> +        ASSERT(reg->attr.fields.type == SYS_DESC_tss16_busy ||
>>> +               reg->attr.fields.type == SYS_DESC_tss_busy);
>>> +        ASSERT(is_canonical_address(reg->base));
>>> +        break;
>> I can't help thinking that the slightly more strict
>>
>> +        if ( reg->attr.fields.type == SYS_DESC_tss_busy )
>> +            ASSERT(is_canonical_address(reg->base));
>> +        else if ( reg->attr.fields.type == SYS_DESC_tss16_busy )
>> +            ASSERT(!(reg->base >> 32));
>> +        else
>> +            ASSERT_UNREACHABLE();
>>
>> would be better, even if that's going to make TR checking look a
>> little different than the others (but we should leverage the
>> information we have).
> 
> I still don't like the use of ASSERT_UNREACHABLE(); as the "you failed
> the typecheck" case.
> 
> Would ASSERT(!"%tr typecheck failure") be acceptable?

Sure.

Jan



* Re: [PATCH 06/15] x86/emul: Rework emulator event injection
  2016-11-25  7:25               ` Jan Beulich
@ 2016-11-25  9:41                 ` Tim Deegan
  0 siblings, 0 replies; 91+ messages in thread
From: Tim Deegan @ 2016-11-25  9:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Suravee Suthikulpanit, George Dunlap, Andrew Cooper,
	Xen-devel, Paul Durrant, Jun Nakajima, Boris Ostrovsky

At 00:25 -0700 on 25 Nov (1480033543), Jan Beulich wrote:
> >>> On 24.11.16 at 18:37, <andrew.cooper3@citrix.com> wrote:
> > As an interim between now and getting a proper audit hook, would a bool
> > permit_traps in x86_emulate_ctxt suffice?
> 
> That's one option; the other would be to do away with only the
> exception injection hook for now, and keep the swint one. That's
> effectively the same as the boolean flag, just with less overall
> code churn (and keeping the hook would not undermine this
> series' goal of eliminating event injection behind the back of the
> emulator).

I'd be OK with either of those.  Or indeed with
 - adjusting the emulator so it always returns a non-OK value when it
   needs the caller to do something to make the state consistent; or
 - keeping the injection hook but moving it to the end of the
   emulation, as a sort of write-back of the emulator's internal event
   state to the guest.  That would let the emulator DTRT about
   tracking events internally but still avoid any risk that the caller
   forgets to inject the event.
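
The latter might look roughly like this at the tail of x86_emulate()
(the hook and field names here are hypothetical):

    if ( rc == X86EMUL_OKAY && ctxt->event_pending && ops->inject_event )
        rc = ops->inject_event(&ctxt->event, ctxt);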

Tim.



Thread overview: 91+ messages
2016-11-23 15:38 [PATCH for-4.9 00/15] XSA-191 followup Andrew Cooper
2016-11-23 15:38 ` [PATCH 01/15] x86/hvm: Rename hvm_emulate_init() and hvm_emulate_prepare() for clarity Andrew Cooper
2016-11-23 15:49   ` Paul Durrant
2016-11-23 15:53   ` Wei Liu
2016-11-23 16:40   ` Jan Beulich
2016-11-23 16:41   ` Boris Ostrovsky
2016-11-23 16:41     ` Andrew Cooper
2016-11-24  6:16   ` Tian, Kevin
2016-11-23 15:38 ` [PATCH 02/15] x86/emul: Simplfy emulation state setup Andrew Cooper
2016-11-23 15:58   ` Paul Durrant
2016-11-23 16:01     ` Andrew Cooper
2016-11-23 16:03       ` Paul Durrant
2016-11-23 16:07   ` Tim Deegan
2016-11-24 13:44   ` Jan Beulich
2016-11-24 13:59     ` Andrew Cooper
2016-11-24 14:18       ` Jan Beulich
2016-11-23 15:38 ` [PATCH 03/15] x86/emul: Rename hvm_trap to x86_event and move it into the emulation infrastructure Andrew Cooper
2016-11-23 16:12   ` Paul Durrant
2016-11-23 16:22     ` Andrew Cooper
2016-11-23 16:59   ` Boris Ostrovsky
2016-11-24  6:17   ` Tian, Kevin
2016-11-24 13:56   ` Jan Beulich
2016-11-24 14:42     ` Andrew Cooper
2016-11-24 14:57       ` Jan Beulich
2016-11-23 15:38 ` [PATCH 04/15] x86/emul: Rename HVM_DELIVER_NO_ERROR_CODE to X86_EVENT_NO_EC Andrew Cooper
2016-11-23 16:20   ` Paul Durrant
2016-11-23 17:05   ` Boris Ostrovsky
2016-11-24  6:18   ` Tian, Kevin
2016-11-24 14:18   ` Jan Beulich
2016-11-23 15:38 ` [PATCH 05/15] x86/emul: Remove opencoded exception generation Andrew Cooper
2016-11-24 14:31   ` Jan Beulich
2016-11-24 16:24     ` Andrew Cooper
2016-11-24 16:31       ` Jan Beulich
2016-11-24 17:04         ` Andrew Cooper
2016-11-23 15:38 ` [PATCH 06/15] x86/emul: Rework emulator event injection Andrew Cooper
2016-11-23 16:19   ` Tim Deegan
2016-11-23 16:33     ` Jan Beulich
2016-11-23 16:43       ` Tim Deegan
2016-11-23 16:38     ` Andrew Cooper
2016-11-23 17:56   ` Boris Ostrovsky
2016-11-24  6:20   ` Tian, Kevin
2016-11-24 14:53   ` Jan Beulich
2016-11-24 17:00     ` Andrew Cooper
2016-11-24 17:08       ` Jan Beulich
2016-11-24 17:19         ` Andrew Cooper
2016-11-24 17:30           ` Tim Deegan
2016-11-24 17:37             ` Andrew Cooper
2016-11-25  7:25               ` Jan Beulich
2016-11-25  9:41                 ` Tim Deegan
2016-11-25  7:42           ` Jan Beulich
2016-11-23 15:38 ` [PATCH 07/15] x86/vmx: Use hvm_{get, set}_segment_register() rather than vmx_{get, set}_segment_register() Andrew Cooper
2016-11-24  6:20   ` Tian, Kevin
2016-11-23 15:38 ` [PATCH 08/15] x86/hvm: Reposition the modification of raw segment data from the VMCB/VMCS Andrew Cooper
2016-11-23 19:01   ` Boris Ostrovsky
2016-11-23 19:28     ` Andrew Cooper
2016-11-23 19:41       ` Boris Ostrovsky
2016-11-23 19:58         ` Andrew Cooper
2016-11-24  6:24   ` Tian, Kevin
2016-11-24 15:25   ` Jan Beulich
2016-11-24 17:22     ` Andrew Cooper
2016-11-25  7:45       ` Jan Beulich
2016-11-23 15:38 ` [PATCH 09/15] x86/emul: Avoid raising faults behind the emulators back Andrew Cooper
2016-11-23 16:31   ` Tim Deegan
2016-11-23 16:40     ` Andrew Cooper
2016-11-23 16:50       ` Tim Deegan
2016-11-23 16:58         ` Andrew Cooper
2016-11-24 10:43           ` Jan Beulich
2016-11-24 10:46             ` Andrew Cooper
2016-11-24 11:24               ` Jan Beulich
2016-11-23 15:38 ` [PATCH 10/15] x86/hvm: Extend the hvm_copy_*() API with a pagefault_info pointer Andrew Cooper
2016-11-23 16:32   ` Tim Deegan
2016-11-23 16:36   ` Paul Durrant
2016-11-24  6:25   ` Tian, Kevin
2016-11-23 15:38 ` [PATCH 11/15] x86/hvm: Reimplement hvm_copy_*_nofault() in terms of no pagefault_info Andrew Cooper
2016-11-23 16:35   ` Tim Deegan
2016-11-23 16:38     ` Andrew Cooper
2016-11-23 16:40     ` Tim Deegan
2016-11-23 15:38 ` [PATCH 12/15] x86/hvm: Rename hvm_copy_*_guest_virt() to hvm_copy_*_guest_linear() Andrew Cooper
2016-11-23 16:35   ` Tim Deegan
2016-11-24  6:26   ` Tian, Kevin
2016-11-24 15:41   ` Jan Beulich
2016-11-23 15:38 ` [PATCH 13/15] x86/hvm: Avoid __hvm_copy() raising #PF behind the emulators back Andrew Cooper
2016-11-23 16:18   ` Andrew Cooper
2016-11-23 16:39   ` Tim Deegan
2016-11-23 17:06     ` Andrew Cooper
2016-11-23 15:38 ` [PATCH 14/15] x86/hvm: Prepare to allow use of system segments for memory references Andrew Cooper
2016-11-23 16:42   ` Paul Durrant
2016-11-24 15:48   ` Jan Beulich
2016-11-23 15:38 ` [PATCH 15/15] x86/hvm: Use system-segment relative memory accesses Andrew Cooper
2016-11-24 16:01   ` Jan Beulich
2016-11-24 16:03     ` Andrew Cooper
