* [PATCH v4 00/27] x86: refactor trap handling code
@ 2017-06-08 17:11 Wei Liu
  2017-06-08 17:11 ` [PATCH v4 01/27] x86: factor out common PV emulation code Wei Liu
                   ` (26 more replies)
  0 siblings, 27 replies; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

V4 of this series, rebased on top of staging.

 git://xenbits.xen.org/people/liuw/xen.git wip.move-traps-v4

Wei Liu (27):
  x86: factor out common PV emulation code
  x86: move PV privileged instruction emulation code
  x86: move PV gate op emulation code
  x86: move PV invalid op emulation code
  x86/traps: remove now unused inclusion of emulate.h
  x86: clean up PV emulation code
  x86: move do_set_trap_table to pv/traps.c
  x86: move some misc PV hypercalls to misc-hypercalls.c
  x86/traps: move pv_inject_event to pv/traps.c
  x86/traps: move set_guest_{machine,nmi}_trapbounce
  x86/traps: move {un,}register_guest_nmi_callback
  x86/traps: move guest_has_trap_callback to pv/traps.c
  x86: move toggle_guest_mode to pv/domain.c
  x86: move do_iret to pv/iret.c
  x86: move callback_op code to pv/callback.c
  x86/traps: factor out pv_trap_init
  x86/traps: move some PV specific functions and struct to pv/traps.c
  x86/traps: move init_int80_direct_trap to pv/traps.c
  x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c
  x86: move hypercall_page_initialise_ring1_kernel
  x86: move compat_set_trap_table alongside the non-compat variant
  x86: move compat_iret alongside its non-compat variant
  x86: move the compat callback ops next to the non-compat variant
  x86: move compat_show_guest_stack near its non-compat variant
  x86: remove the now empty x86_64/compat/traps.c
  x86: fix a coding style issue in asm-x86/traps.h
  x86: clean up traps.c

 xen/arch/x86/pv/Makefile                 |    8 +
 xen/arch/x86/pv/callback.c               |  299 ++++
 xen/arch/x86/pv/domain.c                 |   30 +
 xen/arch/x86/pv/emul-gate-op.c           |  439 ++++++
 xen/arch/x86/pv/emul-inv-op.c            |  123 ++
 xen/arch/x86/pv/emul-priv-op.c           | 1418 +++++++++++++++++
 xen/arch/x86/pv/emulate.c                |   98 ++
 xen/arch/x86/pv/emulate.h                |   10 +
 xen/arch/x86/{x86_64 => pv}/gpr_switch.S |    0
 xen/arch/x86/pv/hypercall.c              |   67 +
 xen/arch/x86/pv/iret.c                   |  192 +++
 xen/arch/x86/pv/misc-hypercalls.c        |   78 +
 xen/arch/x86/pv/traps.c                  |  370 +++++
 xen/arch/x86/traps.c                     | 2497 +++---------------------------
 xen/arch/x86/x86_64/Makefile             |    1 -
 xen/arch/x86/x86_64/compat/traps.c       |  416 -----
 xen/arch/x86/x86_64/traps.c              |  286 ----
 xen/include/asm-x86/hypercall.h          |    2 +
 xen/include/asm-x86/processor.h          |    3 -
 xen/include/asm-x86/pv/traps.h           |   56 +
 xen/include/asm-x86/traps.h              |   24 +-
 21 files changed, 3382 insertions(+), 3035 deletions(-)
 create mode 100644 xen/arch/x86/pv/callback.c
 create mode 100644 xen/arch/x86/pv/emul-gate-op.c
 create mode 100644 xen/arch/x86/pv/emul-inv-op.c
 create mode 100644 xen/arch/x86/pv/emul-priv-op.c
 create mode 100644 xen/arch/x86/pv/emulate.c
 create mode 100644 xen/arch/x86/pv/emulate.h
 rename xen/arch/x86/{x86_64 => pv}/gpr_switch.S (100%)
 create mode 100644 xen/arch/x86/pv/iret.c
 create mode 100644 xen/arch/x86/pv/misc-hypercalls.c
 delete mode 100644 xen/arch/x86/x86_64/compat/traps.c
 create mode 100644 xen/include/asm-x86/pv/traps.h

-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 01/27] x86: factor out common PV emulation code
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-20 16:00   ` Jan Beulich
  2017-06-08 17:11 ` [PATCH v4 02/27] x86: move PV privileged instruction " Wei Liu
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

We're going to split the PV emulation code into several files. This patch
extracts the functions they will all need into a dedicated file.

The functions are now prefixed with "pv_emul_" and exported via a
local header file.

While at it, change bool_t to bool.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/Makefile  |  1 +
 xen/arch/x86/pv/emulate.c | 98 +++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/pv/emulate.h | 10 +++++
 xen/arch/x86/traps.c      | 95 ++++++++-------------------------------------
 4 files changed, 126 insertions(+), 78 deletions(-)
 create mode 100644 xen/arch/x86/pv/emulate.c
 create mode 100644 xen/arch/x86/pv/emulate.h

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 489a9f59cb..564202cbb7 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -3,3 +3,4 @@ obj-y += traps.o
 
 obj-bin-y += dom0_build.init.o
 obj-y += domain.o
+obj-y += emulate.o
diff --git a/xen/arch/x86/pv/emulate.c b/xen/arch/x86/pv/emulate.c
new file mode 100644
index 0000000000..5750c7699b
--- /dev/null
+++ b/xen/arch/x86/pv/emulate.c
@@ -0,0 +1,98 @@
+/******************************************************************************
+ * arch/x86/pv/emulate.c
+ *
+ * Common PV emulation code
+ *
+ * Modifications to Linux original are copyright (c) 2002-2004, K A Fraser
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/guest_access.h>
+
+#include <asm/debugreg.h>
+
+#include "emulate.h"
+
+int pv_emul_read_descriptor(unsigned int sel, const struct vcpu *v,
+                            unsigned long *base, unsigned long *limit,
+                            unsigned int *ar, bool insn_fetch)
+{
+    struct desc_struct desc;
+
+    if ( sel < 4)
+        desc.b = desc.a = 0;
+    else if ( __get_user(desc,
+                         (const struct desc_struct *)(!(sel & 4)
+                                                      ? GDT_VIRT_START(v)
+                                                      : LDT_VIRT_START(v))
+                         + (sel >> 3)) )
+        return 0;
+    if ( !insn_fetch )
+        desc.b &= ~_SEGMENT_L;
+
+    *ar = desc.b & 0x00f0ff00;
+    if ( !(desc.b & _SEGMENT_L) )
+    {
+        *base = ((desc.a >> 16) + ((desc.b & 0xff) << 16) +
+                 (desc.b & 0xff000000));
+        *limit = (desc.a & 0xffff) | (desc.b & 0x000f0000);
+        if ( desc.b & _SEGMENT_G )
+            *limit = ((*limit + 1) << 12) - 1;
+#ifndef NDEBUG
+        if ( sel > 3 )
+        {
+            unsigned int a, l;
+            unsigned char valid;
+
+            asm volatile (
+                "larl %2,%0 ; setz %1"
+                : "=r" (a), "=qm" (valid) : "rm" (sel));
+            BUG_ON(valid && ((a & 0x00f0ff00) != *ar));
+            asm volatile (
+                "lsll %2,%0 ; setz %1"
+                : "=r" (l), "=qm" (valid) : "rm" (sel));
+            BUG_ON(valid && (l != *limit));
+        }
+#endif
+    }
+    else
+    {
+        *base = 0UL;
+        *limit = ~0UL;
+    }
+
+    return 1;
+}
+
+void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip)
+{
+    regs->rip = rip;
+    regs->eflags &= ~X86_EFLAGS_RF;
+    if ( regs->eflags & X86_EFLAGS_TF )
+    {
+        current->arch.debugreg[6] |= DR_STEP | DR_STATUS_RESERVED_ONE;
+        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+    }
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/pv/emulate.h b/xen/arch/x86/pv/emulate.h
new file mode 100644
index 0000000000..b2b1192d48
--- /dev/null
+++ b/xen/arch/x86/pv/emulate.h
@@ -0,0 +1,10 @@
+#ifndef __PV_EMULATE_H__
+#define __PV_EMULATE_H__
+
+int pv_emul_read_descriptor(unsigned int sel, const struct vcpu *v,
+                            unsigned long *base, unsigned long *limit,
+                            unsigned int *ar, bool insn_fetch);
+
+void pv_emul_instruction_done(struct cpu_user_regs *regs, unsigned long rip);
+
+#endif /* __PV_EMULATE_H__ */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 52e740f11f..dcc48f9860 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -78,6 +78,8 @@
 #include <asm/cpuid.h>
 #include <xsm/xsm.h>
 
+#include "pv/emulate.h"
+
 /*
  * opt_nmi: one of 'ignore', 'dom0', or 'fatal'.
  *  fatal:  Xen prints diagnostic message and then hangs.
@@ -694,17 +696,6 @@ void pv_inject_event(const struct x86_event *event)
     }
 }
 
-static void instruction_done(struct cpu_user_regs *regs, unsigned long rip)
-{
-    regs->rip = rip;
-    regs->eflags &= ~X86_EFLAGS_RF;
-    if ( regs->eflags & X86_EFLAGS_TF )
-    {
-        current->arch.debugreg[6] |= DR_STEP | DR_STATUS_RESERVED_ONE;
-        pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
-    }
-}
-
 static unsigned int check_guest_io_breakpoint(struct vcpu *v,
     unsigned int port, unsigned int len)
 {
@@ -1027,7 +1018,7 @@ static int emulate_invalid_rdtscp(struct cpu_user_regs *regs)
         return 0;
     eip += sizeof(opcode);
     pv_soft_rdtsc(v, regs, 1);
-    instruction_done(regs, eip);
+    pv_emul_instruction_done(regs, eip);
     return EXCRET_fault_fixed;
 }
 
@@ -1075,7 +1066,7 @@ static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
     regs->rcx = res.c;
     regs->rdx = res.d;
 
-    instruction_done(regs, eip);
+    pv_emul_instruction_done(regs, eip);
 
     trace_trap_one_addr(TRC_PV_FORCED_INVALID_OP, regs->rip);
 
@@ -1623,60 +1614,6 @@ long do_fpu_taskswitch(int set)
     return 0;
 }
 
-static int read_descriptor(unsigned int sel,
-                           const struct vcpu *v,
-                           unsigned long *base,
-                           unsigned long *limit,
-                           unsigned int *ar,
-                           bool_t insn_fetch)
-{
-    struct desc_struct desc;
-
-    if ( sel < 4)
-        desc.b = desc.a = 0;
-    else if ( __get_user(desc,
-                         (const struct desc_struct *)(!(sel & 4)
-                                                      ? GDT_VIRT_START(v)
-                                                      : LDT_VIRT_START(v))
-                         + (sel >> 3)) )
-        return 0;
-    if ( !insn_fetch )
-        desc.b &= ~_SEGMENT_L;
-
-    *ar = desc.b & 0x00f0ff00;
-    if ( !(desc.b & _SEGMENT_L) )
-    {
-        *base = ((desc.a >> 16) + ((desc.b & 0xff) << 16) +
-                 (desc.b & 0xff000000));
-        *limit = (desc.a & 0xffff) | (desc.b & 0x000f0000);
-        if ( desc.b & _SEGMENT_G )
-            *limit = ((*limit + 1) << 12) - 1;
-#ifndef NDEBUG
-        if ( sel > 3 )
-        {
-            unsigned int a, l;
-            unsigned char valid;
-
-            asm volatile (
-                "larl %2,%0 ; setz %1"
-                : "=r" (a), "=qm" (valid) : "rm" (sel));
-            BUG_ON(valid && ((a & 0x00f0ff00) != *ar));
-            asm volatile (
-                "lsll %2,%0 ; setz %1"
-                : "=r" (l), "=qm" (valid) : "rm" (sel));
-            BUG_ON(valid && (l != *limit));
-        }
-#endif
-    }
-    else
-    {
-        *base = 0UL;
-        *limit = ~0UL;
-    }
-
-    return 1;
-}
-
 static int read_gate_descriptor(unsigned int gate_sel,
                                 const struct vcpu *v,
                                 unsigned int *sel,
@@ -1841,7 +1778,8 @@ static int priv_op_read_segment(enum x86_segment seg,
         default: return X86EMUL_UNHANDLEABLE;
         }
 
-        if ( !read_descriptor(sel, current, &reg->base, &limit, &ar, 0) )
+        if ( !pv_emul_read_descriptor(sel, current, &reg->base,
+                                      &limit, &ar, 0) )
             return X86EMUL_UNHANDLEABLE;
 
         reg->limit = limit;
@@ -2987,8 +2925,8 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
     int rc;
     unsigned int eflags, ar;
 
-    if ( !read_descriptor(regs->cs, curr, &ctxt.cs.base, &ctxt.cs.limit,
-                          &ar, 1) ||
+    if ( !pv_emul_read_descriptor(regs->cs, curr, &ctxt.cs.base,
+                                  &ctxt.cs.limit, &ar, 1) ||
          !(ar & _SEGMENT_S) ||
          !(ar & _SEGMENT_P) ||
          !(ar & _SEGMENT_CODE) )
@@ -3110,7 +3048,7 @@ static int gate_op_read(
         unsigned int ar;
 
         ASSERT(!goc->insn_fetch);
-        if ( !read_descriptor(sel, current, &addr, &limit, &ar, 0) ||
+        if ( !pv_emul_read_descriptor(sel, current, &addr, &limit, &ar, 0) ||
              !(ar & _SEGMENT_S) ||
              !(ar & _SEGMENT_P) ||
              ((ar & _SEGMENT_CODE) && !(ar & _SEGMENT_WR)) )
@@ -3170,8 +3108,8 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
      * Decode instruction (and perhaps operand) to determine RPL,
      * whether this is a jump or a call, and the call return offset.
      */
-    if ( !read_descriptor(regs->cs, v, &ctxt.cs.base, &ctxt.cs.limit,
-                          &ar, 0) ||
+    if ( !pv_emul_read_descriptor(regs->cs, v, &ctxt.cs.base, &ctxt.cs.limit,
+                                  &ar, 0) ||
          !(ar & _SEGMENT_S) ||
          !(ar & _SEGMENT_P) ||
          !(ar & _SEGMENT_CODE) )
@@ -3243,7 +3181,7 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
         return;
     }
 
-    if ( !read_descriptor(sel, v, &base, &limit, &ar, 0) ||
+    if ( !pv_emul_read_descriptor(sel, v, &base, &limit, &ar, 0) ||
          !(ar & _SEGMENT_S) ||
          !(ar & _SEGMENT_CODE) ||
          (!jump || (ar & _SEGMENT_EC) ?
@@ -3293,7 +3231,7 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
             esp = v->arch.pv_vcpu.kernel_sp;
             ss = v->arch.pv_vcpu.kernel_ss;
             if ( (ss & 3) != (sel & 3) ||
-                 !read_descriptor(ss, v, &base, &limit, &ar, 0) ||
+                 !pv_emul_read_descriptor(ss, v, &base, &limit, &ar, 0) ||
                  ((ar >> 13) & 3) != (sel & 3) ||
                  !(ar & _SEGMENT_S) ||
                  (ar & _SEGMENT_CODE) ||
@@ -3320,7 +3258,8 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
             {
                 const unsigned int *ustkp;
 
-                if ( !read_descriptor(regs->ss, v, &base, &limit, &ar, 0) ||
+                if ( !pv_emul_read_descriptor(regs->ss, v, &base,
+                                              &limit, &ar, 0) ||
                      ((ar >> 13) & 3) != (regs->cs & 3) ||
                      !(ar & _SEGMENT_S) ||
                      (ar & _SEGMENT_CODE) ||
@@ -3354,7 +3293,7 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
             sel |= (regs->cs & 3);
             esp = regs->rsp;
             ss = regs->ss;
-            if ( !read_descriptor(ss, v, &base, &limit, &ar, 0) ||
+            if ( !pv_emul_read_descriptor(ss, v, &base, &limit, &ar, 0) ||
                  ((ar >> 13) & 3) != (sel & 3) )
             {
                 pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
@@ -3382,7 +3321,7 @@ static void emulate_gate_op(struct cpu_user_regs *regs)
         sel |= (regs->cs & 3);
 
     regs->cs = sel;
-    instruction_done(regs, off);
+    pv_emul_instruction_done(regs, off);
 }
 
 void do_general_protection(struct cpu_user_regs *regs)
-- 
2.11.0




* [PATCH v4 02/27] x86: move PV privileged instruction emulation code
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
  2017-06-08 17:11 ` [PATCH v4 01/27] x86: factor out common PV emulation code Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-20 16:03   ` Jan Beulich
  2017-06-08 17:11 ` [PATCH v4 03/27] x86: move PV gate op " Wei Liu
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/emul-priv-op.c. Prefix emulate_privileged_op with
pv_ and export it via pv/traps.h.

Also move gpr_switch.S since it is used by the privileged instruction
emulation code only.

Code motion only, except for the rename. Cleanup etc. will come later.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/Makefile                 |    2 +
 xen/arch/x86/pv/emul-priv-op.c           | 1414 ++++++++++++++++++++++++++++++
 xen/arch/x86/{x86_64 => pv}/gpr_switch.S |    0
 xen/arch/x86/traps.c                     | 1360 +---------------------------
 xen/arch/x86/x86_64/Makefile             |    1 -
 xen/include/asm-x86/pv/traps.h           |   46 +
 6 files changed, 1464 insertions(+), 1359 deletions(-)
 create mode 100644 xen/arch/x86/pv/emul-priv-op.c
 rename xen/arch/x86/{x86_64 => pv}/gpr_switch.S (100%)
 create mode 100644 xen/include/asm-x86/pv/traps.h

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 564202cbb7..e48c460680 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -4,3 +4,5 @@ obj-y += traps.o
 obj-bin-y += dom0_build.init.o
 obj-y += domain.o
 obj-y += emulate.o
+obj-y += emul-priv-op.o
+obj-bin-y += gpr_switch.o
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
new file mode 100644
index 0000000000..fd5fd74bd1
--- /dev/null
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -0,0 +1,1414 @@
+/******************************************************************************
+ * arch/x86/pv/emul-priv-op.c
+ *
+ * Emulate privileged instructions for PV guests
+ *
+ * Modifications to Linux original are copyright (c) 2002-2004, K A Fraser
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/errno.h>
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/iocap.h>
+#include <xen/spinlock.h>
+#include <xen/trace.h>
+
+#include <asm/apic.h>
+#include <asm/debugreg.h>
+#include <asm/hpet.h>
+#include <asm/hypercall.h>
+#include <asm/mc146818rtc.h>
+#include <asm/p2m.h>
+#include <asm/pv/traps.h>
+#include <asm/shared.h>
+#include <asm/traps.h>
+#include <asm/x86_emulate.h>
+
+#include <xsm/xsm.h>
+
+#include "../x86_64/mmconfig.h"
+#include "emulate.h"
+
+/***********************
+ * I/O emulation support
+ */
+
+struct priv_op_ctxt {
+    struct x86_emulate_ctxt ctxt;
+    struct {
+        unsigned long base, limit;
+    } cs;
+    char *io_emul_stub;
+    unsigned int bpmatch;
+    unsigned int tsc;
+#define TSC_BASE 1
+#define TSC_AUX 2
+};
+
+/* I/O emulation support. Helper routines for, and type of, the stack stub.*/
+void host_to_guest_gpr_switch(struct cpu_user_regs *);
+unsigned long guest_to_host_gpr_switch(unsigned long);
+
+void (*pv_post_outb_hook)(unsigned int port, u8 value);
+
+typedef void io_emul_stub_t(struct cpu_user_regs *);
+
+static io_emul_stub_t *io_emul_stub_setup(struct priv_op_ctxt *ctxt, u8 opcode,
+                                          unsigned int port, unsigned int bytes)
+{
+    if ( !ctxt->io_emul_stub )
+        ctxt->io_emul_stub = map_domain_page(_mfn(this_cpu(stubs.mfn))) +
+                                             (this_cpu(stubs.addr) &
+                                              ~PAGE_MASK) +
+                                             STUB_BUF_SIZE / 2;
+
+    /* movq $host_to_guest_gpr_switch,%rcx */
+    ctxt->io_emul_stub[0] = 0x48;
+    ctxt->io_emul_stub[1] = 0xb9;
+    *(void **)&ctxt->io_emul_stub[2] = (void *)host_to_guest_gpr_switch;
+    /* callq *%rcx */
+    ctxt->io_emul_stub[10] = 0xff;
+    ctxt->io_emul_stub[11] = 0xd1;
+    /* data16 or nop */
+    ctxt->io_emul_stub[12] = (bytes != 2) ? 0x90 : 0x66;
+    /* <io-access opcode> */
+    ctxt->io_emul_stub[13] = opcode;
+    /* imm8 or nop */
+    ctxt->io_emul_stub[14] = !(opcode & 8) ? port : 0x90;
+    /* ret (jumps to guest_to_host_gpr_switch) */
+    ctxt->io_emul_stub[15] = 0xc3;
+    BUILD_BUG_ON(STUB_BUF_SIZE / 2 < 16);
+
+    if ( ioemul_handle_quirk )
+        ioemul_handle_quirk(opcode, &ctxt->io_emul_stub[12], ctxt->ctxt.regs);
+
+    /* Handy function-typed pointer to the stub. */
+    return (void *)(this_cpu(stubs.addr) + STUB_BUF_SIZE / 2);
+}
+
+
+/* Perform IOPL check between the vcpu's shadowed IOPL, and the assumed cpl. */
+static bool_t iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs)
+{
+    unsigned int cpl = guest_kernel_mode(v, regs) ?
+        (VM_ASSIST(v->domain, architectural_iopl) ? 0 : 1) : 3;
+
+    ASSERT((v->arch.pv_vcpu.iopl & ~X86_EFLAGS_IOPL) == 0);
+
+    return IOPL(cpl) <= v->arch.pv_vcpu.iopl;
+}
+
+/* Has the guest requested sufficient permission for this I/O access? */
+static int guest_io_okay(
+    unsigned int port, unsigned int bytes,
+    struct vcpu *v, struct cpu_user_regs *regs)
+{
+    /* If in user mode, switch to kernel mode just to read I/O bitmap. */
+    int user_mode = !(v->arch.flags & TF_kernel_mode);
+#define TOGGLE_MODE() if ( user_mode ) toggle_guest_mode(v)
+
+    if ( iopl_ok(v, regs) )
+        return 1;
+
+    if ( v->arch.pv_vcpu.iobmp_limit > (port + bytes) )
+    {
+        union { uint8_t bytes[2]; uint16_t mask; } x;
+
+        /*
+         * Grab permission bytes from guest space. Inaccessible bytes are
+         * read as 0xff (no access allowed).
+         */
+        TOGGLE_MODE();
+        switch ( __copy_from_guest_offset(x.bytes, v->arch.pv_vcpu.iobmp,
+                                          port>>3, 2) )
+        {
+        default: x.bytes[0] = ~0;
+            /* fallthrough */
+        case 1:  x.bytes[1] = ~0;
+            /* fallthrough */
+        case 0:  break;
+        }
+        TOGGLE_MODE();
+
+        if ( (x.mask & (((1<<bytes)-1) << (port&7))) == 0 )
+            return 1;
+    }
+
+    return 0;
+}
+
+/* Has the administrator granted sufficient permission for this I/O access? */
+static bool_t admin_io_okay(unsigned int port, unsigned int bytes,
+                            const struct domain *d)
+{
+    /*
+     * Port 0xcf8 (CONFIG_ADDRESS) is only visible for DWORD accesses.
+     * We never permit direct access to that register.
+     */
+    if ( (port == 0xcf8) && (bytes == 4) )
+        return 0;
+
+    /* We also never permit direct access to the RTC/CMOS registers. */
+    if ( ((port & ~1) == RTC_PORT(0)) )
+        return 0;
+
+    return ioports_access_permitted(d, port, port + bytes - 1);
+}
+
+static bool_t pci_cfg_ok(struct domain *currd, unsigned int start,
+                         unsigned int size, uint32_t *write)
+{
+    uint32_t machine_bdf;
+
+    if ( !is_hardware_domain(currd) )
+        return 0;
+
+    if ( !CF8_ENABLED(currd->arch.pci_cf8) )
+        return 1;
+
+    machine_bdf = CF8_BDF(currd->arch.pci_cf8);
+    if ( write )
+    {
+        const unsigned long *ro_map = pci_get_ro_map(0);
+
+        if ( ro_map && test_bit(machine_bdf, ro_map) )
+            return 0;
+    }
+    start |= CF8_ADDR_LO(currd->arch.pci_cf8);
+    /* AMD extended configuration space access? */
+    if ( CF8_ADDR_HI(currd->arch.pci_cf8) &&
+         boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
+         boot_cpu_data.x86 >= 0x10 && boot_cpu_data.x86 <= 0x17 )
+    {
+        uint64_t msr_val;
+
+        if ( rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) )
+            return 0;
+        if ( msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT) )
+            start |= CF8_ADDR_HI(currd->arch.pci_cf8);
+    }
+
+    return !write ?
+           xsm_pci_config_permission(XSM_HOOK, currd, machine_bdf,
+                                     start, start + size - 1, 0) == 0 :
+           pci_conf_write_intercept(0, machine_bdf, start, size, write) >= 0;
+}
+
+uint32_t guest_io_read(unsigned int port, unsigned int bytes,
+                       struct domain *currd)
+{
+    uint32_t data = 0;
+    unsigned int shift = 0;
+
+    if ( admin_io_okay(port, bytes, currd) )
+    {
+        switch ( bytes )
+        {
+        case 1: return inb(port);
+        case 2: return inw(port);
+        case 4: return inl(port);
+        }
+    }
+
+    while ( bytes != 0 )
+    {
+        unsigned int size = 1;
+        uint32_t sub_data = ~0;
+
+        if ( (port == 0x42) || (port == 0x43) || (port == 0x61) )
+        {
+            sub_data = pv_pit_handler(port, 0, 0);
+        }
+        else if ( port == RTC_PORT(0) )
+        {
+            sub_data = currd->arch.cmos_idx;
+        }
+        else if ( (port == RTC_PORT(1)) &&
+                  ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        {
+            unsigned long flags;
+
+            spin_lock_irqsave(&rtc_lock, flags);
+            outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
+            sub_data = inb(RTC_PORT(1));
+            spin_unlock_irqrestore(&rtc_lock, flags);
+        }
+        else if ( (port == 0xcf8) && (bytes == 4) )
+        {
+            size = 4;
+            sub_data = currd->arch.pci_cf8;
+        }
+        else if ( (port & 0xfffc) == 0xcfc )
+        {
+            size = min(bytes, 4 - (port & 3));
+            if ( size == 3 )
+                size = 2;
+            if ( pci_cfg_ok(currd, port & 3, size, NULL) )
+                sub_data = pci_conf_read(currd->arch.pci_cf8, port & 3, size);
+        }
+
+        if ( size == 4 )
+            return sub_data;
+
+        data |= (sub_data & ((1u << (size * 8)) - 1)) << shift;
+        shift += size * 8;
+        port += size;
+        bytes -= size;
+    }
+
+    return data;
+}
+
+static unsigned int check_guest_io_breakpoint(struct vcpu *v,
+    unsigned int port, unsigned int len)
+{
+    unsigned int width, i, match = 0;
+    unsigned long start;
+
+    if ( !(v->arch.debugreg[5]) ||
+         !(v->arch.pv_vcpu.ctrlreg[4] & X86_CR4_DE) )
+        return 0;
+
+    for ( i = 0; i < 4; i++ )
+    {
+        if ( !(v->arch.debugreg[5] &
+               (3 << (i * DR_ENABLE_SIZE))) )
+            continue;
+
+        start = v->arch.debugreg[i];
+        width = 0;
+
+        switch ( (v->arch.debugreg[7] >>
+                  (DR_CONTROL_SHIFT + i * DR_CONTROL_SIZE)) & 0xc )
+        {
+        case DR_LEN_1: width = 1; break;
+        case DR_LEN_2: width = 2; break;
+        case DR_LEN_4: width = 4; break;
+        case DR_LEN_8: width = 8; break;
+        }
+
+        if ( (start < (port + len)) && ((start + width) > port) )
+            match |= 1 << i;
+    }
+
+    return match;
+}
+
+static int priv_op_read_io(unsigned int port, unsigned int bytes,
+                           unsigned long *val, struct x86_emulate_ctxt *ctxt)
+{
+    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
+    struct vcpu *curr = current;
+    struct domain *currd = current->domain;
+
+    /* INS must not come here. */
+    ASSERT((ctxt->opcode & ~9) == 0xe4);
+
+    if ( !guest_io_okay(port, bytes, curr, ctxt->regs) )
+        return X86EMUL_UNHANDLEABLE;
+
+    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes);
+
+    if ( admin_io_okay(port, bytes, currd) )
+    {
+        io_emul_stub_t *io_emul =
+            io_emul_stub_setup(poc, ctxt->opcode, port, bytes);
+
+        mark_regs_dirty(ctxt->regs);
+        io_emul(ctxt->regs);
+        return X86EMUL_DONE;
+    }
+
+    *val = guest_io_read(port, bytes, currd);
+
+    return X86EMUL_OKAY;
+}
+
+void guest_io_write(unsigned int port, unsigned int bytes, uint32_t data,
+                    struct domain *currd)
+{
+    if ( admin_io_okay(port, bytes, currd) )
+    {
+        switch ( bytes ) {
+        case 1:
+            outb((uint8_t)data, port);
+            if ( pv_post_outb_hook )
+                pv_post_outb_hook(port, (uint8_t)data);
+            break;
+        case 2:
+            outw((uint16_t)data, port);
+            break;
+        case 4:
+            outl(data, port);
+            break;
+        }
+        return;
+    }
+
+    while ( bytes != 0 )
+    {
+        unsigned int size = 1;
+
+        if ( (port == 0x42) || (port == 0x43) || (port == 0x61) )
+        {
+            pv_pit_handler(port, (uint8_t)data, 1);
+        }
+        else if ( port == RTC_PORT(0) )
+        {
+            currd->arch.cmos_idx = data;
+        }
+        else if ( (port == RTC_PORT(1)) &&
+                  ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
+        {
+            unsigned long flags;
+
+            if ( pv_rtc_handler )
+                pv_rtc_handler(currd->arch.cmos_idx & 0x7f, data);
+            spin_lock_irqsave(&rtc_lock, flags);
+            outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
+            outb(data, RTC_PORT(1));
+            spin_unlock_irqrestore(&rtc_lock, flags);
+        }
+        else if ( (port == 0xcf8) && (bytes == 4) )
+        {
+            size = 4;
+            currd->arch.pci_cf8 = data;
+        }
+        else if ( (port & 0xfffc) == 0xcfc )
+        {
+            size = min(bytes, 4 - (port & 3));
+            if ( size == 3 )
+                size = 2;
+            if ( pci_cfg_ok(currd, port & 3, size, &data) )
+                pci_conf_write(currd->arch.pci_cf8, port & 3, size, data);
+        }
+
+        if ( size == 4 )
+            return;
+
+        port += size;
+        bytes -= size;
+        data >>= size * 8;
+    }
+}
+
+static int priv_op_write_io(unsigned int port, unsigned int bytes,
+                            unsigned long val, struct x86_emulate_ctxt *ctxt)
+{
+    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
+    struct vcpu *curr = current;
+    struct domain *currd = current->domain;
+
+    /* OUTS must not come here. */
+    ASSERT((ctxt->opcode & ~9) == 0xe6);
+
+    if ( !guest_io_okay(port, bytes, curr, ctxt->regs) )
+        return X86EMUL_UNHANDLEABLE;
+
+    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes);
+
+    if ( admin_io_okay(port, bytes, currd) )
+    {
+        io_emul_stub_t *io_emul =
+            io_emul_stub_setup(poc, ctxt->opcode, port, bytes);
+
+        mark_regs_dirty(ctxt->regs);
+        io_emul(ctxt->regs);
+        if ( (bytes == 1) && pv_post_outb_hook )
+            pv_post_outb_hook(port, val);
+        return X86EMUL_DONE;
+    }
+
+    guest_io_write(port, bytes, val, currd);
+
+    return X86EMUL_OKAY;
+}
+
+static int priv_op_read_segment(enum x86_segment seg,
+                                struct segment_register *reg,
+                                struct x86_emulate_ctxt *ctxt)
+{
+    /* Check if this is an attempt to access the I/O bitmap. */
+    if ( seg == x86_seg_tr )
+    {
+        switch ( ctxt->opcode )
+        {
+        case 0x6c ... 0x6f: /* ins / outs */
+        case 0xe4 ... 0xe7: /* in / out (immediate port) */
+        case 0xec ... 0xef: /* in / out (port in %dx) */
+            /* Defer the check to priv_op_{read,write}_io(). */
+            return X86EMUL_DONE;
+        }
+    }
+
+    if ( ctxt->addr_size < 64 )
+    {
+        unsigned long limit;
+        unsigned int sel, ar;
+
+        switch ( seg )
+        {
+        case x86_seg_cs: sel = ctxt->regs->cs; break;
+        case x86_seg_ds: sel = read_sreg(ds);  break;
+        case x86_seg_es: sel = read_sreg(es);  break;
+        case x86_seg_fs: sel = read_sreg(fs);  break;
+        case x86_seg_gs: sel = read_sreg(gs);  break;
+        case x86_seg_ss: sel = ctxt->regs->ss; break;
+        default: return X86EMUL_UNHANDLEABLE;
+        }
+
+        if ( !pv_emul_read_descriptor(sel, current, &reg->base,
+                                      &limit, &ar, 0) )
+            return X86EMUL_UNHANDLEABLE;
+
+        reg->limit = limit;
+        reg->attr.bytes = ar >> 8;
+    }
+    else
+    {
+        switch ( seg )
+        {
+        default:
+            if ( !is_x86_user_segment(seg) )
+                return X86EMUL_UNHANDLEABLE;
+            reg->base = 0;
+            break;
+        case x86_seg_fs:
+            reg->base = rdfsbase();
+            break;
+        case x86_seg_gs:
+            reg->base = rdgsbase();
+            break;
+        }
+
+        reg->limit = ~0U;
+
+        reg->attr.bytes = 0;
+        reg->attr.fields.type = _SEGMENT_WR >> 8;
+        if ( seg == x86_seg_cs )
+        {
+            reg->attr.fields.type |= _SEGMENT_CODE >> 8;
+            reg->attr.fields.l = 1;
+        }
+        else
+            reg->attr.fields.db = 1;
+        reg->attr.fields.s   = 1;
+        reg->attr.fields.dpl = 3;
+        reg->attr.fields.p   = 1;
+        reg->attr.fields.g   = 1;
+    }
+
+    /*
+     * For x86_emulate.c's mode_ring0() to work, fake a DPL of zero.
+     * Also do this for non-conforming code segments, for consistency.

+     */
+    if ( (seg == x86_seg_ss ||
+          (seg == x86_seg_cs &&
+           !(reg->attr.fields.type & (_SEGMENT_EC >> 8)))) &&
+         guest_kernel_mode(current, ctxt->regs) )
+        reg->attr.fields.dpl = 0;
+
+    return X86EMUL_OKAY;
+}
+
+static int pv_emul_virt_to_linear(unsigned long base, unsigned long offset,
+                                  unsigned int bytes, unsigned long limit,
+                                  enum x86_segment seg,
+                                  struct x86_emulate_ctxt *ctxt,
+                                  unsigned long *addr)
+{
+    int rc = X86EMUL_OKAY;
+
+    *addr = base + offset;
+
+    if ( ctxt->addr_size < 64 )
+    {
+        if ( limit < bytes - 1 || offset > limit - bytes + 1 )
+            rc = X86EMUL_EXCEPTION;
+        *addr = (uint32_t)*addr;
+    }
+    else if ( !__addr_ok(*addr) )
+        rc = X86EMUL_EXCEPTION;
+
+    if ( unlikely(rc == X86EMUL_EXCEPTION) )
+        x86_emul_hw_exception(seg != x86_seg_ss ? TRAP_gp_fault
+                                                : TRAP_stack_error,
+                              0, ctxt);
+
+    return rc;
+}
+
+static int priv_op_rep_ins(uint16_t port,
+                           enum x86_segment seg, unsigned long offset,
+                           unsigned int bytes_per_rep, unsigned long *reps,
+                           struct x86_emulate_ctxt *ctxt)
+{
+    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
+    struct vcpu *curr = current;
+    struct domain *currd = current->domain;
+    unsigned long goal = *reps;
+    struct segment_register sreg;
+    int rc;
+
+    ASSERT(seg == x86_seg_es);
+
+    *reps = 0;
+
+    if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) )
+        return X86EMUL_UNHANDLEABLE;
+
+    rc = priv_op_read_segment(x86_seg_es, &sreg, ctxt);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    if ( !sreg.attr.fields.p )
+        return X86EMUL_UNHANDLEABLE;
+    if ( !sreg.attr.fields.s ||
+         (sreg.attr.fields.type & (_SEGMENT_CODE >> 8)) ||
+         !(sreg.attr.fields.type & (_SEGMENT_WR >> 8)) )
+    {
+        x86_emul_hw_exception(TRAP_gp_fault, 0, ctxt);
+        return X86EMUL_EXCEPTION;
+    }
+
+    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep);
+
+    while ( *reps < goal )
+    {
+        unsigned int data = guest_io_read(port, bytes_per_rep, currd);
+        unsigned long addr;
+
+        rc = pv_emul_virt_to_linear(sreg.base, offset, bytes_per_rep,
+                                    sreg.limit, x86_seg_es, ctxt, &addr);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+
+        if ( (rc = __copy_to_user((void *)addr, &data, bytes_per_rep)) != 0 )
+        {
+            x86_emul_pagefault(PFEC_write_access,
+                               addr + bytes_per_rep - rc, ctxt);
+            return X86EMUL_EXCEPTION;
+        }
+
+        ++*reps;
+
+        if ( poc->bpmatch || hypercall_preempt_check() )
+            break;
+
+        /* x86_emulate() clips the repetition count to ensure we don't wrap. */
+        if ( unlikely(ctxt->regs->eflags & X86_EFLAGS_DF) )
+            offset -= bytes_per_rep;
+        else
+            offset += bytes_per_rep;
+    }
+
+    return X86EMUL_OKAY;
+}
+
+static int priv_op_rep_outs(enum x86_segment seg, unsigned long offset,
+                            uint16_t port,
+                            unsigned int bytes_per_rep, unsigned long *reps,
+                            struct x86_emulate_ctxt *ctxt)
+{
+    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
+    struct vcpu *curr = current;
+    struct domain *currd = current->domain;
+    unsigned long goal = *reps;
+    struct segment_register sreg;
+    int rc;
+
+    *reps = 0;
+
+    if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) )
+        return X86EMUL_UNHANDLEABLE;
+
+    rc = priv_op_read_segment(seg, &sreg, ctxt);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    if ( !sreg.attr.fields.p )
+        return X86EMUL_UNHANDLEABLE;
+    if ( !sreg.attr.fields.s ||
+         ((sreg.attr.fields.type & (_SEGMENT_CODE >> 8)) &&
+          !(sreg.attr.fields.type & (_SEGMENT_WR >> 8))) )
+    {
+        x86_emul_hw_exception(seg != x86_seg_ss ? TRAP_gp_fault
+                                                : TRAP_stack_error,
+                              0, ctxt);
+        return X86EMUL_EXCEPTION;
+    }
+
+    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep);
+
+    while ( *reps < goal )
+    {
+        unsigned int data = 0;
+        unsigned long addr;
+
+        rc = pv_emul_virt_to_linear(sreg.base, offset, bytes_per_rep,
+                                    sreg.limit, seg, ctxt, &addr);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+
+        if ( (rc = __copy_from_user(&data, (void *)addr, bytes_per_rep)) != 0 )
+        {
+            x86_emul_pagefault(0, addr + bytes_per_rep - rc, ctxt);
+            return X86EMUL_EXCEPTION;
+        }
+
+        guest_io_write(port, bytes_per_rep, data, currd);
+
+        ++*reps;
+
+        if ( poc->bpmatch || hypercall_preempt_check() )
+            break;
+
+        /* x86_emulate() clips the repetition count to ensure we don't wrap. */
+        if ( unlikely(ctxt->regs->eflags & X86_EFLAGS_DF) )
+            offset -= bytes_per_rep;
+        else
+            offset += bytes_per_rep;
+    }
+
+    return X86EMUL_OKAY;
+}
+
+static int priv_op_read_cr(unsigned int reg, unsigned long *val,
+                           struct x86_emulate_ctxt *ctxt)
+{
+    const struct vcpu *curr = current;
+
+    switch ( reg )
+    {
+    case 0: /* Read CR0 */
+        *val = (read_cr0() & ~X86_CR0_TS) | curr->arch.pv_vcpu.ctrlreg[0];
+        return X86EMUL_OKAY;
+
+    case 2: /* Read CR2 */
+    case 4: /* Read CR4 */
+        *val = curr->arch.pv_vcpu.ctrlreg[reg];
+        return X86EMUL_OKAY;
+
+    case 3: /* Read CR3 */
+    {
+        const struct domain *currd = curr->domain;
+        unsigned long mfn;
+
+        if ( !is_pv_32bit_domain(currd) )
+        {
+            mfn = pagetable_get_pfn(curr->arch.guest_table);
+            *val = xen_pfn_to_cr3(mfn_to_gmfn(currd, mfn));
+        }
+        else
+        {
+            l4_pgentry_t *pl4e =
+                map_domain_page(_mfn(pagetable_get_pfn(curr->arch.guest_table)));
+
+            mfn = l4e_get_pfn(*pl4e);
+            unmap_domain_page(pl4e);
+            *val = compat_pfn_to_cr3(mfn_to_gmfn(currd, mfn));
+        }
+        /* PTs should not be shared */
+        BUG_ON(page_get_owner(mfn_to_page(mfn)) == dom_cow);
+        return X86EMUL_OKAY;
+    }
+    }
+
+    return X86EMUL_UNHANDLEABLE;
+}
+
+static int priv_op_write_cr(unsigned int reg, unsigned long val,
+                            struct x86_emulate_ctxt *ctxt)
+{
+    struct vcpu *curr = current;
+
+    switch ( reg )
+    {
+    case 0: /* Write CR0 */
+        if ( (val ^ read_cr0()) & ~X86_CR0_TS )
+        {
+            gdprintk(XENLOG_WARNING,
+                     "Attempt to change unmodifiable CR0 flags\n");
+            break;
+        }
+        do_fpu_taskswitch(!!(val & X86_CR0_TS));
+        return X86EMUL_OKAY;
+
+    case 2: /* Write CR2 */
+        curr->arch.pv_vcpu.ctrlreg[2] = val;
+        arch_set_cr2(curr, val);
+        return X86EMUL_OKAY;
+
+    case 3: /* Write CR3 */
+    {
+        struct domain *currd = curr->domain;
+        unsigned long gfn;
+        struct page_info *page;
+        int rc;
+
+        gfn = !is_pv_32bit_domain(currd)
+              ? xen_cr3_to_pfn(val) : compat_cr3_to_pfn(val);
+        page = get_page_from_gfn(currd, gfn, NULL, P2M_ALLOC);
+        if ( !page )
+            break;
+        rc = new_guest_cr3(page_to_mfn(page));
+        put_page(page);
+
+        switch ( rc )
+        {
+        case 0:
+            return X86EMUL_OKAY;
+        case -ERESTART: /* retry after preemption */
+            return X86EMUL_RETRY;
+        }
+        break;
+    }
+
+    case 4: /* Write CR4 */
+        curr->arch.pv_vcpu.ctrlreg[4] = pv_guest_cr4_fixup(curr, val);
+        write_cr4(pv_guest_cr4_to_real_cr4(curr));
+        ctxt_switch_levelling(curr);
+        return X86EMUL_OKAY;
+    }
+
+    return X86EMUL_UNHANDLEABLE;
+}
+
+static int priv_op_read_dr(unsigned int reg, unsigned long *val,
+                           struct x86_emulate_ctxt *ctxt)
+{
+    unsigned long res = do_get_debugreg(reg);
+
+    if ( IS_ERR_VALUE(res) )
+        return X86EMUL_UNHANDLEABLE;
+
+    *val = res;
+
+    return X86EMUL_OKAY;
+}
+
+static int priv_op_write_dr(unsigned int reg, unsigned long val,
+                            struct x86_emulate_ctxt *ctxt)
+{
+    return do_set_debugreg(reg, val) == 0
+           ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
+}
+
+static inline uint64_t guest_misc_enable(uint64_t val)
+{
+    val &= ~(MSR_IA32_MISC_ENABLE_PERF_AVAIL |
+             MSR_IA32_MISC_ENABLE_MONITOR_ENABLE);
+    val |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL |
+           MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |
+           MSR_IA32_MISC_ENABLE_XTPR_DISABLE;
+    return val;
+}
+
+static inline bool is_cpufreq_controller(const struct domain *d)
+{
+    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
+            is_hardware_domain(d));
+}
+
+static int priv_op_read_msr(unsigned int reg, uint64_t *val,
+                            struct x86_emulate_ctxt *ctxt)
+{
+    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
+    const struct vcpu *curr = current;
+    const struct domain *currd = curr->domain;
+    bool vpmu_msr = false;
+
+    switch ( reg )
+    {
+        int rc;
+
+    case MSR_FS_BASE:
+        if ( is_pv_32bit_domain(currd) )
+            break;
+        *val = cpu_has_fsgsbase ? __rdfsbase() : curr->arch.pv_vcpu.fs_base;
+        return X86EMUL_OKAY;
+
+    case MSR_GS_BASE:
+        if ( is_pv_32bit_domain(currd) )
+            break;
+        *val = cpu_has_fsgsbase ? __rdgsbase()
+                                : curr->arch.pv_vcpu.gs_base_kernel;
+        return X86EMUL_OKAY;
+
+    case MSR_SHADOW_GS_BASE:
+        if ( is_pv_32bit_domain(currd) )
+            break;
+        *val = curr->arch.pv_vcpu.gs_base_user;
+        return X86EMUL_OKAY;
+
+    /*
+     * In order to fully retain original behavior, defer calling
+     * pv_soft_rdtsc() until after emulation. This may want/need to be
+     * reconsidered.
+     */
+    case MSR_IA32_TSC:
+        poc->tsc |= TSC_BASE;
+        goto normal;
+
+    case MSR_TSC_AUX:
+        poc->tsc |= TSC_AUX;
+        if ( cpu_has_rdtscp )
+            goto normal;
+        *val = 0;
+        return X86EMUL_OKAY;
+
+    case MSR_EFER:
+        *val = read_efer();
+        if ( is_pv_32bit_domain(currd) )
+            *val &= ~(EFER_LME | EFER_LMA | EFER_LMSLE);
+        return X86EMUL_OKAY;
+
+    case MSR_K7_FID_VID_CTL:
+    case MSR_K7_FID_VID_STATUS:
+    case MSR_K8_PSTATE_LIMIT:
+    case MSR_K8_PSTATE_CTRL:
+    case MSR_K8_PSTATE_STATUS:
+    case MSR_K8_PSTATE0:
+    case MSR_K8_PSTATE1:
+    case MSR_K8_PSTATE2:
+    case MSR_K8_PSTATE3:
+    case MSR_K8_PSTATE4:
+    case MSR_K8_PSTATE5:
+    case MSR_K8_PSTATE6:
+    case MSR_K8_PSTATE7:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
+            break;
+        if ( unlikely(is_cpufreq_controller(currd)) )
+            goto normal;
+        *val = 0;
+        return X86EMUL_OKAY;
+
+    case MSR_IA32_UCODE_REV:
+        BUILD_BUG_ON(MSR_IA32_UCODE_REV != MSR_AMD_PATCHLEVEL);
+        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+        {
+            if ( wrmsr_safe(MSR_IA32_UCODE_REV, 0) )
+                break;
+            /* As documented in the SDM: Do a CPUID 1 here */
+            cpuid_eax(1);
+        }
+        goto normal;
+
+    case MSR_IA32_MISC_ENABLE:
+        if ( rdmsr_safe(reg, *val) )
+            break;
+        *val = guest_misc_enable(*val);
+        return X86EMUL_OKAY;
+
+    case MSR_AMD64_DR0_ADDRESS_MASK:
+        if ( !boot_cpu_has(X86_FEATURE_DBEXT) )
+            break;
+        *val = curr->arch.pv_vcpu.dr_mask[0];
+        return X86EMUL_OKAY;
+
+    case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
+        if ( !boot_cpu_has(X86_FEATURE_DBEXT) )
+            break;
+        *val = curr->arch.pv_vcpu.dr_mask[reg - MSR_AMD64_DR1_ADDRESS_MASK + 1];
+        return X86EMUL_OKAY;
+
+    case MSR_IA32_PERF_CAPABILITIES:
+        /* No extra capabilities are supported. */
+        *val = 0;
+        return X86EMUL_OKAY;
+
+    case MSR_INTEL_PLATFORM_INFO:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+             rdmsr_safe(MSR_INTEL_PLATFORM_INFO, *val) )
+            break;
+        *val = 0;
+        if ( this_cpu(cpuid_faulting_enabled) )
+            *val |= MSR_PLATFORM_INFO_CPUID_FAULTING;
+        return X86EMUL_OKAY;
+
+    case MSR_INTEL_MISC_FEATURES_ENABLES:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+             rdmsr_safe(MSR_INTEL_MISC_FEATURES_ENABLES, *val) )
+            break;
+        *val = 0;
+        if ( curr->arch.cpuid_faulting )
+            *val |= MSR_MISC_FEATURES_CPUID_FAULTING;
+        return X86EMUL_OKAY;
+
+    case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
+    case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(3):
+    case MSR_CORE_PERF_FIXED_CTR0...MSR_CORE_PERF_FIXED_CTR2:
+    case MSR_CORE_PERF_FIXED_CTR_CTRL...MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+        {
+            vpmu_msr = true;
+            /* fall through */
+    case MSR_AMD_FAM15H_EVNTSEL0...MSR_AMD_FAM15H_PERFCTR5:
+    case MSR_K7_EVNTSEL0...MSR_K7_PERFCTR3:
+            if ( vpmu_msr || (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
+            {
+                if ( vpmu_do_rdmsr(reg, val) )
+                    break;
+                return X86EMUL_OKAY;
+            }
+        }
+        /* fall through */
+    default:
+        if ( rdmsr_hypervisor_regs(reg, val) )
+            return X86EMUL_OKAY;
+
+        rc = vmce_rdmsr(reg, val);
+        if ( rc < 0 )
+            break;
+        if ( rc )
+            return X86EMUL_OKAY;
+        /* fall through */
+    normal:
+        /* Everyone can read the MSR space. */
+        /* gdprintk(XENLOG_WARNING, "Domain attempted RDMSR %08x\n", reg); */
+        if ( rdmsr_safe(reg, *val) )
+            break;
+        return X86EMUL_OKAY;
+    }
+
+    return X86EMUL_UNHANDLEABLE;
+}
+
+static int priv_op_write_msr(unsigned int reg, uint64_t val,
+                             struct x86_emulate_ctxt *ctxt)
+{
+    struct vcpu *curr = current;
+    const struct domain *currd = curr->domain;
+    bool vpmu_msr = false;
+
+    switch ( reg )
+    {
+        uint64_t temp;
+        int rc;
+
+    case MSR_FS_BASE:
+        if ( is_pv_32bit_domain(currd) || !is_canonical_address(val) )
+            break;
+        wrfsbase(val);
+        curr->arch.pv_vcpu.fs_base = val;
+        return X86EMUL_OKAY;
+
+    case MSR_GS_BASE:
+        if ( is_pv_32bit_domain(currd) || !is_canonical_address(val) )
+            break;
+        wrgsbase(val);
+        curr->arch.pv_vcpu.gs_base_kernel = val;
+        return X86EMUL_OKAY;
+
+    case MSR_SHADOW_GS_BASE:
+        if ( is_pv_32bit_domain(currd) || !is_canonical_address(val) )
+            break;
+        wrmsrl(MSR_SHADOW_GS_BASE, val);
+        curr->arch.pv_vcpu.gs_base_user = val;
+        return X86EMUL_OKAY;
+
+    case MSR_K7_FID_VID_STATUS:
+    case MSR_K7_FID_VID_CTL:
+    case MSR_K8_PSTATE_LIMIT:
+    case MSR_K8_PSTATE_CTRL:
+    case MSR_K8_PSTATE_STATUS:
+    case MSR_K8_PSTATE0:
+    case MSR_K8_PSTATE1:
+    case MSR_K8_PSTATE2:
+    case MSR_K8_PSTATE3:
+    case MSR_K8_PSTATE4:
+    case MSR_K8_PSTATE5:
+    case MSR_K8_PSTATE6:
+    case MSR_K8_PSTATE7:
+    case MSR_K8_HWCR:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
+            break;
+        if ( likely(!is_cpufreq_controller(currd)) ||
+             wrmsr_safe(reg, val) == 0 )
+            return X86EMUL_OKAY;
+        break;
+
+    case MSR_AMD64_NB_CFG:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
+             boot_cpu_data.x86 < 0x10 || boot_cpu_data.x86 > 0x17 )
+            break;
+        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
+            return X86EMUL_OKAY;
+        if ( (rdmsr_safe(MSR_AMD64_NB_CFG, temp) != 0) ||
+             ((val ^ temp) & ~(1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
+            goto invalid;
+        if ( wrmsr_safe(MSR_AMD64_NB_CFG, val) == 0 )
+            return X86EMUL_OKAY;
+        break;
+
+    case MSR_FAM10H_MMIO_CONF_BASE:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
+             boot_cpu_data.x86 < 0x10 || boot_cpu_data.x86 > 0x17 )
+            break;
+        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
+            return X86EMUL_OKAY;
+        if ( rdmsr_safe(MSR_FAM10H_MMIO_CONF_BASE, temp) != 0 )
+            break;
+        if ( (pci_probe & PCI_PROBE_MASK) == PCI_PROBE_MMCONF ?
+             temp != val :
+             ((temp ^ val) &
+              ~(FAM10H_MMIO_CONF_ENABLE |
+                (FAM10H_MMIO_CONF_BUSRANGE_MASK <<
+                 FAM10H_MMIO_CONF_BUSRANGE_SHIFT) |
+                ((u64)FAM10H_MMIO_CONF_BASE_MASK <<
+                 FAM10H_MMIO_CONF_BASE_SHIFT))) )
+            goto invalid;
+        if ( wrmsr_safe(MSR_FAM10H_MMIO_CONF_BASE, val) == 0 )
+            return X86EMUL_OKAY;
+        break;
+
+    case MSR_IA32_UCODE_REV:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
+            break;
+        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
+            return X86EMUL_OKAY;
+        if ( rdmsr_safe(reg, temp) )
+            break;
+        if ( val )
+            goto invalid;
+        return X86EMUL_OKAY;
+
+    case MSR_IA32_MISC_ENABLE:
+        if ( rdmsr_safe(reg, temp) )
+            break;
+        if ( val != guest_misc_enable(temp) )
+            goto invalid;
+        return X86EMUL_OKAY;
+
+    case MSR_IA32_MPERF:
+    case MSR_IA32_APERF:
+        if ( (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) &&
+             (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) )
+            break;
+        if ( likely(!is_cpufreq_controller(currd)) ||
+             wrmsr_safe(reg, val) == 0 )
+            return X86EMUL_OKAY;
+        break;
+
+    case MSR_IA32_PERF_CTL:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
+            break;
+        if ( likely(!is_cpufreq_controller(currd)) ||
+             wrmsr_safe(reg, val) == 0 )
+            return X86EMUL_OKAY;
+        break;
+
+    case MSR_IA32_THERM_CONTROL:
+    case MSR_IA32_ENERGY_PERF_BIAS:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
+            break;
+        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) ||
+             wrmsr_safe(reg, val) == 0 )
+            return X86EMUL_OKAY;
+        break;
+
+    case MSR_AMD64_DR0_ADDRESS_MASK:
+        if ( !boot_cpu_has(X86_FEATURE_DBEXT) || (val >> 32) )
+            break;
+        curr->arch.pv_vcpu.dr_mask[0] = val;
+        if ( curr->arch.debugreg[7] & DR7_ACTIVE_MASK )
+            wrmsrl(MSR_AMD64_DR0_ADDRESS_MASK, val);
+        return X86EMUL_OKAY;
+
+    case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
+        if ( !boot_cpu_has(X86_FEATURE_DBEXT) || (val >> 32) )
+            break;
+        curr->arch.pv_vcpu.dr_mask[reg - MSR_AMD64_DR1_ADDRESS_MASK + 1] = val;
+        if ( curr->arch.debugreg[7] & DR7_ACTIVE_MASK )
+            wrmsrl(reg, val);
+        return X86EMUL_OKAY;
+
+    case MSR_INTEL_PLATFORM_INFO:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+             val || rdmsr_safe(MSR_INTEL_PLATFORM_INFO, val) )
+            break;
+        return X86EMUL_OKAY;
+
+    case MSR_INTEL_MISC_FEATURES_ENABLES:
+        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+             (val & ~MSR_MISC_FEATURES_CPUID_FAULTING) ||
+             rdmsr_safe(MSR_INTEL_MISC_FEATURES_ENABLES, temp) )
+            break;
+        if ( (val & MSR_MISC_FEATURES_CPUID_FAULTING) &&
+             !this_cpu(cpuid_faulting_enabled) )
+            break;
+        curr->arch.cpuid_faulting = !!(val & MSR_MISC_FEATURES_CPUID_FAULTING);
+        return X86EMUL_OKAY;
+
+    case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
+    case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(3):
+    case MSR_CORE_PERF_FIXED_CTR0...MSR_CORE_PERF_FIXED_CTR2:
+    case MSR_CORE_PERF_FIXED_CTR_CTRL...MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+        {
+            vpmu_msr = true;
+    case MSR_AMD_FAM15H_EVNTSEL0...MSR_AMD_FAM15H_PERFCTR5:
+    case MSR_K7_EVNTSEL0...MSR_K7_PERFCTR3:
+            if ( vpmu_msr || (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
+            {
+                if ( (vpmu_mode & XENPMU_MODE_ALL) &&
+                     !is_hardware_domain(currd) )
+                    return X86EMUL_OKAY;
+
+                if ( vpmu_do_wrmsr(reg, val, 0) )
+                    break;
+                return X86EMUL_OKAY;
+            }
+        }
+        /* fall through */
+    default:
+        if ( wrmsr_hypervisor_regs(reg, val) == 1 )
+            return X86EMUL_OKAY;
+
+        rc = vmce_wrmsr(reg, val);
+        if ( rc < 0 )
+            break;
+        if ( rc )
+            return X86EMUL_OKAY;
+
+        if ( (rdmsr_safe(reg, temp) != 0) || (val != temp) )
+    invalid:
+            gdprintk(XENLOG_WARNING,
+                     "Domain attempted WRMSR %08x from 0x%016"PRIx64" to 0x%016"PRIx64"\n",
+                     reg, temp, val);
+        return X86EMUL_OKAY;
+    }
+
+    return X86EMUL_UNHANDLEABLE;
+}
+
+static int priv_op_wbinvd(struct x86_emulate_ctxt *ctxt)
+{
+    /* Ignore the instruction if unprivileged. */
+    if ( !cache_flush_permitted(current->domain) )
+        /*
+         * Non-physdev domain attempted WBINVD; ignore for now since
+         * newer linux uses this in some start-of-day timing loops.
+         */
+        ;
+    else
+        wbinvd();
+
+    return X86EMUL_OKAY;
+}
+
+int pv_emul_cpuid(uint32_t leaf, uint32_t subleaf,
+                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt)
+{
+    guest_cpuid(current, leaf, subleaf, res);
+
+    return X86EMUL_OKAY;
+}
+
+static int priv_op_validate(const struct x86_emulate_state *state,
+                            struct x86_emulate_ctxt *ctxt)
+{
+    switch ( ctxt->opcode )
+    {
+    case 0x6c ... 0x6f: /* ins / outs */
+    case 0xe4 ... 0xe7: /* in / out (immediate port) */
+    case 0xec ... 0xef: /* in / out (port in %dx) */
+    case X86EMUL_OPC(0x0f, 0x06): /* clts */
+    case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */
+    case X86EMUL_OPC(0x0f, 0x20) ...
+         X86EMUL_OPC(0x0f, 0x23): /* mov to/from cr/dr */
+    case X86EMUL_OPC(0x0f, 0x30): /* wrmsr */
+    case X86EMUL_OPC(0x0f, 0x31): /* rdtsc */
+    case X86EMUL_OPC(0x0f, 0x32): /* rdmsr */
+    case X86EMUL_OPC(0x0f, 0xa2): /* cpuid */
+        return X86EMUL_OKAY;
+
+    case 0xfa: case 0xfb: /* cli / sti */
+        if ( !iopl_ok(current, ctxt->regs) )
+            break;
+        /*
+         * This is just too dangerous to allow, in my opinion. Consider if the
+         * caller then tries to reenable interrupts using POPF: we can't trap
+         * that and we'll end up with hard-to-debug lockups. Fast & loose will
+         * do for us. :-)
+        vcpu_info(current, evtchn_upcall_mask) = (ctxt->opcode == 0xfa);
+         */
+        return X86EMUL_DONE;
+
+    case X86EMUL_OPC(0x0f, 0x01):
+    {
+        unsigned int modrm_rm, modrm_reg;
+
+        if ( x86_insn_modrm(state, &modrm_rm, &modrm_reg) != 3 ||
+             (modrm_rm & 7) != 1 )
+            break;
+        switch ( modrm_reg & 7 )
+        {
+        case 2: /* xsetbv */
+        case 7: /* rdtscp */
+            return X86EMUL_OKAY;
+        }
+        break;
+    }
+    }
+
+    return X86EMUL_UNHANDLEABLE;
+}
+
+static int priv_op_insn_fetch(enum x86_segment seg,
+                              unsigned long offset,
+                              void *p_data,
+                              unsigned int bytes,
+                              struct x86_emulate_ctxt *ctxt)
+{
+    const struct priv_op_ctxt *poc =
+        container_of(ctxt, struct priv_op_ctxt, ctxt);
+    unsigned int rc;
+    unsigned long addr = poc->cs.base + offset;
+
+    ASSERT(seg == x86_seg_cs);
+
+    /* We don't mean to emulate any branches. */
+    if ( !bytes )
+        return X86EMUL_UNHANDLEABLE;
+
+    rc = pv_emul_virt_to_linear(poc->cs.base, offset, bytes, poc->cs.limit,
+                                x86_seg_cs, ctxt, &addr);
+    if ( rc != X86EMUL_OKAY )
+        return rc;
+
+    if ( (rc = __copy_from_user(p_data, (void *)addr, bytes)) != 0 )
+    {
+        /*
+         * TODO: This should report PFEC_insn_fetch when goc->insn_fetch &&
+         * cpu_has_nx, but we'd then need a "fetch" variant of
+         * __copy_from_user() respecting NX, SMEP, and protection keys.
+         */
+        x86_emul_pagefault(0, addr + bytes - rc, ctxt);
+        return X86EMUL_EXCEPTION;
+    }
+
+    return X86EMUL_OKAY;
+}
+
+static const struct x86_emulate_ops priv_op_ops = {
+    .insn_fetch          = priv_op_insn_fetch,
+    .read                = x86emul_unhandleable_rw,
+    .validate            = priv_op_validate,
+    .read_io             = priv_op_read_io,
+    .write_io            = priv_op_write_io,
+    .rep_ins             = priv_op_rep_ins,
+    .rep_outs            = priv_op_rep_outs,
+    .read_segment        = priv_op_read_segment,
+    .read_cr             = priv_op_read_cr,
+    .write_cr            = priv_op_write_cr,
+    .read_dr             = priv_op_read_dr,
+    .write_dr            = priv_op_write_dr,
+    .read_msr            = priv_op_read_msr,
+    .write_msr           = priv_op_write_msr,
+    .cpuid               = pv_emul_cpuid,
+    .wbinvd              = priv_op_wbinvd,
+};
+
+int pv_emulate_privileged_op(struct cpu_user_regs *regs)
+{
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+    struct priv_op_ctxt ctxt = {
+        .ctxt.regs = regs,
+        .ctxt.vendor = currd->arch.cpuid->x86_vendor,
+        .ctxt.lma = !is_pv_32bit_domain(currd),
+    };
+    int rc;
+    unsigned int eflags, ar;
+
+    if ( !pv_emul_read_descriptor(regs->cs, curr, &ctxt.cs.base,
+                                  &ctxt.cs.limit, &ar, 1) ||
+         !(ar & _SEGMENT_S) ||
+         !(ar & _SEGMENT_P) ||
+         !(ar & _SEGMENT_CODE) )
+        return 0;
+
+    /* Mirror virtualized state into EFLAGS. */
+    ASSERT(regs->eflags & X86_EFLAGS_IF);
+    if ( vcpu_info(curr, evtchn_upcall_mask) )
+        regs->eflags &= ~X86_EFLAGS_IF;
+    else
+        regs->eflags |= X86_EFLAGS_IF;
+    ASSERT(!(regs->eflags & X86_EFLAGS_IOPL));
+    regs->eflags |= curr->arch.pv_vcpu.iopl;
+    eflags = regs->eflags;
+
+    ctxt.ctxt.addr_size = ar & _SEGMENT_L ? 64 : ar & _SEGMENT_DB ? 32 : 16;
+    /* Leave zero in ctxt.ctxt.sp_size, as it's not needed. */
+    rc = x86_emulate(&ctxt.ctxt, &priv_op_ops);
+
+    if ( ctxt.io_emul_stub )
+        unmap_domain_page(ctxt.io_emul_stub);
+
+    /*
+     * Un-mirror virtualized state from EFLAGS.
+     * Nothing we allow to be emulated can change anything other than the
+     * arithmetic bits, and the resume flag.
+     */
+    ASSERT(!((regs->eflags ^ eflags) &
+             ~(X86_EFLAGS_RF | X86_EFLAGS_ARITH_MASK)));
+    regs->eflags |= X86_EFLAGS_IF;
+    regs->eflags &= ~X86_EFLAGS_IOPL;
+
+    switch ( rc )
+    {
+    case X86EMUL_OKAY:
+        if ( ctxt.tsc & TSC_BASE )
+        {
+            if ( ctxt.tsc & TSC_AUX )
+                pv_soft_rdtsc(curr, regs, 1);
+            else if ( currd->arch.vtsc )
+                pv_soft_rdtsc(curr, regs, 0);
+            else
+                msr_split(regs, rdtsc());
+        }
+
+        if ( ctxt.ctxt.retire.singlestep )
+            ctxt.bpmatch |= DR_STEP;
+        if ( ctxt.bpmatch )
+        {
+            curr->arch.debugreg[6] |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE;
+            if ( !(curr->arch.pv_vcpu.trap_bounce.flags & TBF_EXCEPTION) )
+                pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
+        }
+        /* fall through */
+    case X86EMUL_RETRY:
+        return EXCRET_fault_fixed;
+
+    case X86EMUL_EXCEPTION:
+        pv_inject_event(&ctxt.ctxt.event);
+        return EXCRET_fault_fixed;
+    }
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/x86_64/gpr_switch.S b/xen/arch/x86/pv/gpr_switch.S
similarity index 100%
rename from xen/arch/x86/x86_64/gpr_switch.S
rename to xen/arch/x86/pv/gpr_switch.S
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index dcc48f9860..32cab71444 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -77,6 +77,7 @@
 #include <public/arch-x86/cpuid.h>
 #include <asm/cpuid.h>
 #include <xsm/xsm.h>
+#include <asm/pv/traps.h>
 
 #include "pv/emulate.h"
 
@@ -696,41 +697,6 @@ void pv_inject_event(const struct x86_event *event)
     }
 }
 
-static unsigned int check_guest_io_breakpoint(struct vcpu *v,
-    unsigned int port, unsigned int len)
-{
-    unsigned int width, i, match = 0;
-    unsigned long start;
-
-    if ( !(v->arch.debugreg[5]) ||
-         !(v->arch.pv_vcpu.ctrlreg[4] & X86_CR4_DE) )
-        return 0;
-
-    for ( i = 0; i < 4; i++ )
-    {
-        if ( !(v->arch.debugreg[5] &
-               (3 << (i * DR_ENABLE_SIZE))) )
-            continue;
-
-        start = v->arch.debugreg[i];
-        width = 0;
-
-        switch ( (v->arch.debugreg[7] >>
-                  (DR_CONTROL_SHIFT + i * DR_CONTROL_SIZE)) & 0xc )
-        {
-        case DR_LEN_1: width = 1; break;
-        case DR_LEN_2: width = 2; break;
-        case DR_LEN_4: width = 4; break;
-        case DR_LEN_8: width = 8; break;
-        }
-
-        if ( (start < (port + len)) && ((start + width) > port) )
-            match |= 1 << i;
-    }
-
-    return match;
-}
-
 /*
  * Called from asm to set up the MCE trapbounce info.
  * Returns 0 if no callback is set up, else 1.
@@ -1670,1328 +1636,6 @@ static int read_gate_descriptor(unsigned int gate_sel,
     return 1;
 }
 
-static int pv_emul_virt_to_linear(unsigned long base, unsigned long offset,
-                                  unsigned int bytes, unsigned long limit,
-                                  enum x86_segment seg,
-                                  struct x86_emulate_ctxt *ctxt,
-                                  unsigned long *addr)
-{
-    int rc = X86EMUL_OKAY;
-
-    *addr = base + offset;
-
-    if ( ctxt->addr_size < 64 )
-    {
-        if ( limit < bytes - 1 || offset > limit - bytes + 1 )
-            rc = X86EMUL_EXCEPTION;
-        *addr = (uint32_t)*addr;
-    }
-    else if ( !__addr_ok(*addr) )
-        rc = X86EMUL_EXCEPTION;
-
-    if ( unlikely(rc == X86EMUL_EXCEPTION) )
-        x86_emul_hw_exception(seg != x86_seg_ss ? TRAP_gp_fault
-                                                : TRAP_stack_error,
-                              0, ctxt);
-
-    return rc;
-}
-
-struct priv_op_ctxt {
-    struct x86_emulate_ctxt ctxt;
-    struct {
-        unsigned long base, limit;
-    } cs;
-    char *io_emul_stub;
-    unsigned int bpmatch;
-    unsigned int tsc;
-#define TSC_BASE 1
-#define TSC_AUX 2
-};
-
-static int priv_op_insn_fetch(enum x86_segment seg,
-                              unsigned long offset,
-                              void *p_data,
-                              unsigned int bytes,
-                              struct x86_emulate_ctxt *ctxt)
-{
-    const struct priv_op_ctxt *poc =
-        container_of(ctxt, struct priv_op_ctxt, ctxt);
-    unsigned int rc;
-    unsigned long addr = poc->cs.base + offset;
-
-    ASSERT(seg == x86_seg_cs);
-
-    /* We don't mean to emulate any branches. */
-    if ( !bytes )
-        return X86EMUL_UNHANDLEABLE;
-
-    rc = pv_emul_virt_to_linear(poc->cs.base, offset, bytes, poc->cs.limit,
-                                x86_seg_cs, ctxt, &addr);
-    if ( rc != X86EMUL_OKAY )
-        return rc;
-
-    if ( (rc = __copy_from_user(p_data, (void *)addr, bytes)) != 0 )
-    {
-        /*
-         * TODO: This should report PFEC_insn_fetch when goc->insn_fetch &&
-         * cpu_has_nx, but we'd then need a "fetch" variant of
-         * __copy_from_user() respecting NX, SMEP, and protection keys.
-         */
-        x86_emul_pagefault(0, addr + bytes - rc, ctxt);
-        return X86EMUL_EXCEPTION;
-    }
-
-    return X86EMUL_OKAY;
-}
-
-static int priv_op_read_segment(enum x86_segment seg,
-                                struct segment_register *reg,
-                                struct x86_emulate_ctxt *ctxt)
-{
-    /* Check if this is an attempt to access the I/O bitmap. */
-    if ( seg == x86_seg_tr )
-    {
-        switch ( ctxt->opcode )
-        {
-        case 0x6c ... 0x6f: /* ins / outs */
-        case 0xe4 ... 0xe7: /* in / out (immediate port) */
-        case 0xec ... 0xef: /* in / out (port in %dx) */
-            /* Defer the check to priv_op_{read,write}_io(). */
-            return X86EMUL_DONE;
-        }
-    }
-
-    if ( ctxt->addr_size < 64 )
-    {
-        unsigned long limit;
-        unsigned int sel, ar;
-
-        switch ( seg )
-        {
-        case x86_seg_cs: sel = ctxt->regs->cs; break;
-        case x86_seg_ds: sel = read_sreg(ds);  break;
-        case x86_seg_es: sel = read_sreg(es);  break;
-        case x86_seg_fs: sel = read_sreg(fs);  break;
-        case x86_seg_gs: sel = read_sreg(gs);  break;
-        case x86_seg_ss: sel = ctxt->regs->ss; break;
-        default: return X86EMUL_UNHANDLEABLE;
-        }
-
-        if ( !pv_emul_read_descriptor(sel, current, &reg->base,
-                                      &limit, &ar, 0) )
-            return X86EMUL_UNHANDLEABLE;
-
-        reg->limit = limit;
-        reg->attr.bytes = ar >> 8;
-    }
-    else
-    {
-        switch ( seg )
-        {
-        default:
-            if ( !is_x86_user_segment(seg) )
-                return X86EMUL_UNHANDLEABLE;
-            reg->base = 0;
-            break;
-        case x86_seg_fs:
-            reg->base = rdfsbase();
-            break;
-        case x86_seg_gs:
-            reg->base = rdgsbase();
-            break;
-        }
-
-        reg->limit = ~0U;
-
-        reg->attr.bytes = 0;
-        reg->attr.fields.type = _SEGMENT_WR >> 8;
-        if ( seg == x86_seg_cs )
-        {
-            reg->attr.fields.type |= _SEGMENT_CODE >> 8;
-            reg->attr.fields.l = 1;
-        }
-        else
-            reg->attr.fields.db = 1;
-        reg->attr.fields.s   = 1;
-        reg->attr.fields.dpl = 3;
-        reg->attr.fields.p   = 1;
-        reg->attr.fields.g   = 1;
-    }
-
-    /*
-     * For x86_emulate.c's mode_ring0() to work, fake a DPL of zero.
-     * Also do this for consistency for non-conforming code segments.
-     */
-    if ( (seg == x86_seg_ss ||
-          (seg == x86_seg_cs &&
-           !(reg->attr.fields.type & (_SEGMENT_EC >> 8)))) &&
-         guest_kernel_mode(current, ctxt->regs) )
-        reg->attr.fields.dpl = 0;
-
-    return X86EMUL_OKAY;
-}
-
-/* Perform IOPL check between the vcpu's shadowed IOPL, and the assumed cpl. */
-static bool_t iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs)
-{
-    unsigned int cpl = guest_kernel_mode(v, regs) ?
-        (VM_ASSIST(v->domain, architectural_iopl) ? 0 : 1) : 3;
-
-    ASSERT((v->arch.pv_vcpu.iopl & ~X86_EFLAGS_IOPL) == 0);
-
-    return IOPL(cpl) <= v->arch.pv_vcpu.iopl;
-}
-
-/* Has the guest requested sufficient permission for this I/O access? */
-static int guest_io_okay(
-    unsigned int port, unsigned int bytes,
-    struct vcpu *v, struct cpu_user_regs *regs)
-{
-    /* If in user mode, switch to kernel mode just to read I/O bitmap. */
-    int user_mode = !(v->arch.flags & TF_kernel_mode);
-#define TOGGLE_MODE() if ( user_mode ) toggle_guest_mode(v)
-
-    if ( iopl_ok(v, regs) )
-        return 1;
-
-    if ( v->arch.pv_vcpu.iobmp_limit > (port + bytes) )
-    {
-        union { uint8_t bytes[2]; uint16_t mask; } x;
-
-        /*
-         * Grab permission bytes from guest space. Inaccessible bytes are
-         * read as 0xff (no access allowed).
-         */
-        TOGGLE_MODE();
-        switch ( __copy_from_guest_offset(x.bytes, v->arch.pv_vcpu.iobmp,
-                                          port>>3, 2) )
-        {
-        default: x.bytes[0] = ~0;
-            /* fallthrough */
-        case 1:  x.bytes[1] = ~0;
-            /* fallthrough */
-        case 0:  break;
-        }
-        TOGGLE_MODE();
-
-        if ( (x.mask & (((1<<bytes)-1) << (port&7))) == 0 )
-            return 1;
-    }
-
-    return 0;
-}
-
-/* Has the administrator granted sufficient permission for this I/O access? */
-static bool_t admin_io_okay(unsigned int port, unsigned int bytes,
-                            const struct domain *d)
-{
-    /*
-     * Port 0xcf8 (CONFIG_ADDRESS) is only visible for DWORD accesses.
-     * We never permit direct access to that register.
-     */
-    if ( (port == 0xcf8) && (bytes == 4) )
-        return 0;
-
-    /* We also never permit direct access to the RTC/CMOS registers. */
-    if ( ((port & ~1) == RTC_PORT(0)) )
-        return 0;
-
-    return ioports_access_permitted(d, port, port + bytes - 1);
-}
-
-static bool_t pci_cfg_ok(struct domain *currd, unsigned int start,
-                         unsigned int size, uint32_t *write)
-{
-    uint32_t machine_bdf;
-
-    if ( !is_hardware_domain(currd) )
-        return 0;
-
-    if ( !CF8_ENABLED(currd->arch.pci_cf8) )
-        return 1;
-
-    machine_bdf = CF8_BDF(currd->arch.pci_cf8);
-    if ( write )
-    {
-        const unsigned long *ro_map = pci_get_ro_map(0);
-
-        if ( ro_map && test_bit(machine_bdf, ro_map) )
-            return 0;
-    }
-    start |= CF8_ADDR_LO(currd->arch.pci_cf8);
-    /* AMD extended configuration space access? */
-    if ( CF8_ADDR_HI(currd->arch.pci_cf8) &&
-         boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
-         boot_cpu_data.x86 >= 0x10 && boot_cpu_data.x86 <= 0x17 )
-    {
-        uint64_t msr_val;
-
-        if ( rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) )
-            return 0;
-        if ( msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT) )
-            start |= CF8_ADDR_HI(currd->arch.pci_cf8);
-    }
-
-    return !write ?
-           xsm_pci_config_permission(XSM_HOOK, currd, machine_bdf,
-                                     start, start + size - 1, 0) == 0 :
-           pci_conf_write_intercept(0, machine_bdf, start, size, write) >= 0;
-}
-
-uint32_t guest_io_read(unsigned int port, unsigned int bytes,
-                       struct domain *currd)
-{
-    uint32_t data = 0;
-    unsigned int shift = 0;
-
-    if ( admin_io_okay(port, bytes, currd) )
-    {
-        switch ( bytes )
-        {
-        case 1: return inb(port);
-        case 2: return inw(port);
-        case 4: return inl(port);
-        }
-    }
-
-    while ( bytes != 0 )
-    {
-        unsigned int size = 1;
-        uint32_t sub_data = ~0;
-
-        if ( (port == 0x42) || (port == 0x43) || (port == 0x61) )
-        {
-            sub_data = pv_pit_handler(port, 0, 0);
-        }
-        else if ( port == RTC_PORT(0) )
-        {
-            sub_data = currd->arch.cmos_idx;
-        }
-        else if ( (port == RTC_PORT(1)) &&
-                  ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
-        {
-            unsigned long flags;
-
-            spin_lock_irqsave(&rtc_lock, flags);
-            outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-            sub_data = inb(RTC_PORT(1));
-            spin_unlock_irqrestore(&rtc_lock, flags);
-        }
-        else if ( (port == 0xcf8) && (bytes == 4) )
-        {
-            size = 4;
-            sub_data = currd->arch.pci_cf8;
-        }
-        else if ( (port & 0xfffc) == 0xcfc )
-        {
-            size = min(bytes, 4 - (port & 3));
-            if ( size == 3 )
-                size = 2;
-            if ( pci_cfg_ok(currd, port & 3, size, NULL) )
-                sub_data = pci_conf_read(currd->arch.pci_cf8, port & 3, size);
-        }
-
-        if ( size == 4 )
-            return sub_data;
-
-        data |= (sub_data & ((1u << (size * 8)) - 1)) << shift;
-        shift += size * 8;
-        port += size;
-        bytes -= size;
-    }
-
-    return data;
-}
-
-void guest_io_write(unsigned int port, unsigned int bytes, uint32_t data,
-                    struct domain *currd)
-{
-    if ( admin_io_okay(port, bytes, currd) )
-    {
-        switch ( bytes ) {
-        case 1:
-            outb((uint8_t)data, port);
-            if ( pv_post_outb_hook )
-                pv_post_outb_hook(port, (uint8_t)data);
-            break;
-        case 2:
-            outw((uint16_t)data, port);
-            break;
-        case 4:
-            outl(data, port);
-            break;
-        }
-        return;
-    }
-
-    while ( bytes != 0 )
-    {
-        unsigned int size = 1;
-
-        if ( (port == 0x42) || (port == 0x43) || (port == 0x61) )
-        {
-            pv_pit_handler(port, (uint8_t)data, 1);
-        }
-        else if ( port == RTC_PORT(0) )
-        {
-            currd->arch.cmos_idx = data;
-        }
-        else if ( (port == RTC_PORT(1)) &&
-                  ioports_access_permitted(currd, RTC_PORT(0), RTC_PORT(1)) )
-        {
-            unsigned long flags;
-
-            if ( pv_rtc_handler )
-                pv_rtc_handler(currd->arch.cmos_idx & 0x7f, data);
-            spin_lock_irqsave(&rtc_lock, flags);
-            outb(currd->arch.cmos_idx & 0x7f, RTC_PORT(0));
-            outb(data, RTC_PORT(1));
-            spin_unlock_irqrestore(&rtc_lock, flags);
-        }
-        else if ( (port == 0xcf8) && (bytes == 4) )
-        {
-            size = 4;
-            currd->arch.pci_cf8 = data;
-        }
-        else if ( (port & 0xfffc) == 0xcfc )
-        {
-            size = min(bytes, 4 - (port & 3));
-            if ( size == 3 )
-                size = 2;
-            if ( pci_cfg_ok(currd, port & 3, size, &data) )
-                pci_conf_write(currd->arch.pci_cf8, port & 3, size, data);
-        }
-
-        if ( size == 4 )
-            return;
-
-        port += size;
-        bytes -= size;
-        data >>= size * 8;
-    }
-}
-
-/* I/O emulation support. Helper routines for, and type of, the stack stub.*/
-void host_to_guest_gpr_switch(struct cpu_user_regs *);
-unsigned long guest_to_host_gpr_switch(unsigned long);
-
-void (*pv_post_outb_hook)(unsigned int port, u8 value);
-
-typedef void io_emul_stub_t(struct cpu_user_regs *);
-
-static io_emul_stub_t *io_emul_stub_setup(struct priv_op_ctxt *ctxt, u8 opcode,
-                                          unsigned int port, unsigned int bytes)
-{
-    if ( !ctxt->io_emul_stub )
-        ctxt->io_emul_stub = map_domain_page(_mfn(this_cpu(stubs.mfn))) +
-                                             (this_cpu(stubs.addr) &
-                                              ~PAGE_MASK) +
-                                             STUB_BUF_SIZE / 2;
-
-    /* movq $host_to_guest_gpr_switch,%rcx */
-    ctxt->io_emul_stub[0] = 0x48;
-    ctxt->io_emul_stub[1] = 0xb9;
-    *(void **)&ctxt->io_emul_stub[2] = (void *)host_to_guest_gpr_switch;
-    /* callq *%rcx */
-    ctxt->io_emul_stub[10] = 0xff;
-    ctxt->io_emul_stub[11] = 0xd1;
-    /* data16 or nop */
-    ctxt->io_emul_stub[12] = (bytes != 2) ? 0x90 : 0x66;
-    /* <io-access opcode> */
-    ctxt->io_emul_stub[13] = opcode;
-    /* imm8 or nop */
-    ctxt->io_emul_stub[14] = !(opcode & 8) ? port : 0x90;
-    /* ret (jumps to guest_to_host_gpr_switch) */
-    ctxt->io_emul_stub[15] = 0xc3;
-    BUILD_BUG_ON(STUB_BUF_SIZE / 2 < 16);
-
-    if ( ioemul_handle_quirk )
-        ioemul_handle_quirk(opcode, &ctxt->io_emul_stub[12], ctxt->ctxt.regs);
-
-    /* Handy function-typed pointer to the stub. */
-    return (void *)(this_cpu(stubs.addr) + STUB_BUF_SIZE / 2);
-}
-
-static int priv_op_read_io(unsigned int port, unsigned int bytes,
-                           unsigned long *val, struct x86_emulate_ctxt *ctxt)
-{
-    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
-    struct vcpu *curr = current;
-    struct domain *currd = current->domain;
-
-    /* INS must not come here. */
-    ASSERT((ctxt->opcode & ~9) == 0xe4);
-
-    if ( !guest_io_okay(port, bytes, curr, ctxt->regs) )
-        return X86EMUL_UNHANDLEABLE;
-
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes);
-
-    if ( admin_io_okay(port, bytes, currd) )
-    {
-        io_emul_stub_t *io_emul =
-            io_emul_stub_setup(poc, ctxt->opcode, port, bytes);
-
-        mark_regs_dirty(ctxt->regs);
-        io_emul(ctxt->regs);
-        return X86EMUL_DONE;
-    }
-
-    *val = guest_io_read(port, bytes, currd);
-
-    return X86EMUL_OKAY;
-}
-
-static int priv_op_write_io(unsigned int port, unsigned int bytes,
-                            unsigned long val, struct x86_emulate_ctxt *ctxt)
-{
-    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
-    struct vcpu *curr = current;
-    struct domain *currd = current->domain;
-
-    /* OUTS must not come here. */
-    ASSERT((ctxt->opcode & ~9) == 0xe6);
-
-    if ( !guest_io_okay(port, bytes, curr, ctxt->regs) )
-        return X86EMUL_UNHANDLEABLE;
-
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes);
-
-    if ( admin_io_okay(port, bytes, currd) )
-    {
-        io_emul_stub_t *io_emul =
-            io_emul_stub_setup(poc, ctxt->opcode, port, bytes);
-
-        mark_regs_dirty(ctxt->regs);
-        io_emul(ctxt->regs);
-        if ( (bytes == 1) && pv_post_outb_hook )
-            pv_post_outb_hook(port, val);
-        return X86EMUL_DONE;
-    }
-
-    guest_io_write(port, bytes, val, currd);
-
-    return X86EMUL_OKAY;
-}
-
-static int priv_op_rep_ins(uint16_t port,
-                           enum x86_segment seg, unsigned long offset,
-                           unsigned int bytes_per_rep, unsigned long *reps,
-                           struct x86_emulate_ctxt *ctxt)
-{
-    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
-    struct vcpu *curr = current;
-    struct domain *currd = current->domain;
-    unsigned long goal = *reps;
-    struct segment_register sreg;
-    int rc;
-
-    ASSERT(seg == x86_seg_es);
-
-    *reps = 0;
-
-    if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) )
-        return X86EMUL_UNHANDLEABLE;
-
-    rc = priv_op_read_segment(x86_seg_es, &sreg, ctxt);
-    if ( rc != X86EMUL_OKAY )
-        return rc;
-
-    if ( !sreg.attr.fields.p )
-        return X86EMUL_UNHANDLEABLE;
-    if ( !sreg.attr.fields.s ||
-         (sreg.attr.fields.type & (_SEGMENT_CODE >> 8)) ||
-         !(sreg.attr.fields.type & (_SEGMENT_WR >> 8)) )
-    {
-        x86_emul_hw_exception(TRAP_gp_fault, 0, ctxt);
-        return X86EMUL_EXCEPTION;
-    }
-
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep);
-
-    while ( *reps < goal )
-    {
-        unsigned int data = guest_io_read(port, bytes_per_rep, currd);
-        unsigned long addr;
-
-        rc = pv_emul_virt_to_linear(sreg.base, offset, bytes_per_rep,
-                                    sreg.limit, x86_seg_es, ctxt, &addr);
-        if ( rc != X86EMUL_OKAY )
-            return rc;
-
-        if ( (rc = __copy_to_user((void *)addr, &data, bytes_per_rep)) != 0 )
-        {
-            x86_emul_pagefault(PFEC_write_access,
-                               addr + bytes_per_rep - rc, ctxt);
-            return X86EMUL_EXCEPTION;
-        }
-
-        ++*reps;
-
-        if ( poc->bpmatch || hypercall_preempt_check() )
-            break;
-
-        /* x86_emulate() clips the repetition count to ensure we don't wrap. */
-        if ( unlikely(ctxt->regs->eflags & X86_EFLAGS_DF) )
-            offset -= bytes_per_rep;
-        else
-            offset += bytes_per_rep;
-    }
-
-    return X86EMUL_OKAY;
-}
-
-static int priv_op_rep_outs(enum x86_segment seg, unsigned long offset,
-                            uint16_t port,
-                            unsigned int bytes_per_rep, unsigned long *reps,
-                            struct x86_emulate_ctxt *ctxt)
-{
-    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
-    struct vcpu *curr = current;
-    struct domain *currd = current->domain;
-    unsigned long goal = *reps;
-    struct segment_register sreg;
-    int rc;
-
-    *reps = 0;
-
-    if ( !guest_io_okay(port, bytes_per_rep, curr, ctxt->regs) )
-        return X86EMUL_UNHANDLEABLE;
-
-    rc = priv_op_read_segment(seg, &sreg, ctxt);
-    if ( rc != X86EMUL_OKAY )
-        return rc;
-
-    if ( !sreg.attr.fields.p )
-        return X86EMUL_UNHANDLEABLE;
-    if ( !sreg.attr.fields.s ||
-         ((sreg.attr.fields.type & (_SEGMENT_CODE >> 8)) &&
-          !(sreg.attr.fields.type & (_SEGMENT_WR >> 8))) )
-    {
-        x86_emul_hw_exception(seg != x86_seg_ss ? TRAP_gp_fault
-                                                : TRAP_stack_error,
-                              0, ctxt);
-        return X86EMUL_EXCEPTION;
-    }
-
-    poc->bpmatch = check_guest_io_breakpoint(curr, port, bytes_per_rep);
-
-    while ( *reps < goal )
-    {
-        unsigned int data = 0;
-        unsigned long addr;
-
-        rc = pv_emul_virt_to_linear(sreg.base, offset, bytes_per_rep,
-                                    sreg.limit, seg, ctxt, &addr);
-        if ( rc != X86EMUL_OKAY )
-            return rc;
-
-        if ( (rc = __copy_from_user(&data, (void *)addr, bytes_per_rep)) != 0 )
-        {
-            x86_emul_pagefault(0, addr + bytes_per_rep - rc, ctxt);
-            return X86EMUL_EXCEPTION;
-        }
-
-        guest_io_write(port, bytes_per_rep, data, currd);
-
-        ++*reps;
-
-        if ( poc->bpmatch || hypercall_preempt_check() )
-            break;
-
-        /* x86_emulate() clips the repetition count to ensure we don't wrap. */
-        if ( unlikely(ctxt->regs->eflags & X86_EFLAGS_DF) )
-            offset -= bytes_per_rep;
-        else
-            offset += bytes_per_rep;
-    }
-
-    return X86EMUL_OKAY;
-}
-
-static int priv_op_read_cr(unsigned int reg, unsigned long *val,
-                           struct x86_emulate_ctxt *ctxt)
-{
-    const struct vcpu *curr = current;
-
-    switch ( reg )
-    {
-    case 0: /* Read CR0 */
-        *val = (read_cr0() & ~X86_CR0_TS) | curr->arch.pv_vcpu.ctrlreg[0];
-        return X86EMUL_OKAY;
-
-    case 2: /* Read CR2 */
-    case 4: /* Read CR4 */
-        *val = curr->arch.pv_vcpu.ctrlreg[reg];
-        return X86EMUL_OKAY;
-
-    case 3: /* Read CR3 */
-    {
-        const struct domain *currd = curr->domain;
-        unsigned long mfn;
-
-        if ( !is_pv_32bit_domain(currd) )
-        {
-            mfn = pagetable_get_pfn(curr->arch.guest_table);
-            *val = xen_pfn_to_cr3(mfn_to_gmfn(currd, mfn));
-        }
-        else
-        {
-            l4_pgentry_t *pl4e =
-                map_domain_page(_mfn(pagetable_get_pfn(curr->arch.guest_table)));
-
-            mfn = l4e_get_pfn(*pl4e);
-            unmap_domain_page(pl4e);
-            *val = compat_pfn_to_cr3(mfn_to_gmfn(currd, mfn));
-        }
-        /* PTs should not be shared */
-        BUG_ON(page_get_owner(mfn_to_page(mfn)) == dom_cow);
-        return X86EMUL_OKAY;
-    }
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-}
-
-static int priv_op_write_cr(unsigned int reg, unsigned long val,
-                            struct x86_emulate_ctxt *ctxt)
-{
-    struct vcpu *curr = current;
-
-    switch ( reg )
-    {
-    case 0: /* Write CR0 */
-        if ( (val ^ read_cr0()) & ~X86_CR0_TS )
-        {
-            gdprintk(XENLOG_WARNING,
-                    "Attempt to change unmodifiable CR0 flags\n");
-            break;
-        }
-        do_fpu_taskswitch(!!(val & X86_CR0_TS));
-        return X86EMUL_OKAY;
-
-    case 2: /* Write CR2 */
-        curr->arch.pv_vcpu.ctrlreg[2] = val;
-        arch_set_cr2(curr, val);
-        return X86EMUL_OKAY;
-
-    case 3: /* Write CR3 */
-    {
-        struct domain *currd = curr->domain;
-        unsigned long gfn;
-        struct page_info *page;
-        int rc;
-
-        gfn = !is_pv_32bit_domain(currd)
-              ? xen_cr3_to_pfn(val) : compat_cr3_to_pfn(val);
-        page = get_page_from_gfn(currd, gfn, NULL, P2M_ALLOC);
-        if ( !page )
-            break;
-        rc = new_guest_cr3(page_to_mfn(page));
-        put_page(page);
-
-        switch ( rc )
-        {
-        case 0:
-            return X86EMUL_OKAY;
-        case -ERESTART: /* retry after preemption */
-            return X86EMUL_RETRY;
-        }
-        break;
-    }
-
-    case 4: /* Write CR4 */
-        curr->arch.pv_vcpu.ctrlreg[4] = pv_guest_cr4_fixup(curr, val);
-        write_cr4(pv_guest_cr4_to_real_cr4(curr));
-        ctxt_switch_levelling(curr);
-        return X86EMUL_OKAY;
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-}
-
-static int priv_op_read_dr(unsigned int reg, unsigned long *val,
-                           struct x86_emulate_ctxt *ctxt)
-{
-    unsigned long res = do_get_debugreg(reg);
-
-    if ( IS_ERR_VALUE(res) )
-        return X86EMUL_UNHANDLEABLE;
-
-    *val = res;
-
-    return X86EMUL_OKAY;
-}
-
-static int priv_op_write_dr(unsigned int reg, unsigned long val,
-                            struct x86_emulate_ctxt *ctxt)
-{
-    return do_set_debugreg(reg, val) == 0
-           ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
-}
-
-static inline uint64_t guest_misc_enable(uint64_t val)
-{
-    val &= ~(MSR_IA32_MISC_ENABLE_PERF_AVAIL |
-             MSR_IA32_MISC_ENABLE_MONITOR_ENABLE);
-    val |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL |
-           MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |
-           MSR_IA32_MISC_ENABLE_XTPR_DISABLE;
-    return val;
-}
-
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
-static int priv_op_read_msr(unsigned int reg, uint64_t *val,
-                            struct x86_emulate_ctxt *ctxt)
-{
-    struct priv_op_ctxt *poc = container_of(ctxt, struct priv_op_ctxt, ctxt);
-    const struct vcpu *curr = current;
-    const struct domain *currd = curr->domain;
-    bool vpmu_msr = false;
-
-    switch ( reg )
-    {
-        int rc;
-
-    case MSR_FS_BASE:
-        if ( is_pv_32bit_domain(currd) )
-            break;
-        *val = cpu_has_fsgsbase ? __rdfsbase() : curr->arch.pv_vcpu.fs_base;
-        return X86EMUL_OKAY;
-
-    case MSR_GS_BASE:
-        if ( is_pv_32bit_domain(currd) )
-            break;
-        *val = cpu_has_fsgsbase ? __rdgsbase()
-                                : curr->arch.pv_vcpu.gs_base_kernel;
-        return X86EMUL_OKAY;
-
-    case MSR_SHADOW_GS_BASE:
-        if ( is_pv_32bit_domain(currd) )
-            break;
-        *val = curr->arch.pv_vcpu.gs_base_user;
-        return X86EMUL_OKAY;
-
-    /*
-     * In order to fully retain original behavior, defer calling
-     * pv_soft_rdtsc() until after emulation. This may want/need to be
-     * reconsidered.
-     */
-    case MSR_IA32_TSC:
-        poc->tsc |= TSC_BASE;
-        goto normal;
-
-    case MSR_TSC_AUX:
-        poc->tsc |= TSC_AUX;
-        if ( cpu_has_rdtscp )
-            goto normal;
-        *val = 0;
-        return X86EMUL_OKAY;
-
-    case MSR_EFER:
-        *val = read_efer();
-        if ( is_pv_32bit_domain(currd) )
-            *val &= ~(EFER_LME | EFER_LMA | EFER_LMSLE);
-        return X86EMUL_OKAY;
-
-    case MSR_K7_FID_VID_CTL:
-    case MSR_K7_FID_VID_STATUS:
-    case MSR_K8_PSTATE_LIMIT:
-    case MSR_K8_PSTATE_CTRL:
-    case MSR_K8_PSTATE_STATUS:
-    case MSR_K8_PSTATE0:
-    case MSR_K8_PSTATE1:
-    case MSR_K8_PSTATE2:
-    case MSR_K8_PSTATE3:
-    case MSR_K8_PSTATE4:
-    case MSR_K8_PSTATE5:
-    case MSR_K8_PSTATE6:
-    case MSR_K8_PSTATE7:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
-            break;
-        if ( unlikely(is_cpufreq_controller(currd)) )
-            goto normal;
-        *val = 0;
-        return X86EMUL_OKAY;
-
-    case MSR_IA32_UCODE_REV:
-        BUILD_BUG_ON(MSR_IA32_UCODE_REV != MSR_AMD_PATCHLEVEL);
-        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        {
-            if ( wrmsr_safe(MSR_IA32_UCODE_REV, 0) )
-                break;
-            /* As documented in the SDM: Do a CPUID 1 here */
-            cpuid_eax(1);
-        }
-        goto normal;
-
-    case MSR_IA32_MISC_ENABLE:
-        if ( rdmsr_safe(reg, *val) )
-            break;
-        *val = guest_misc_enable(*val);
-        return X86EMUL_OKAY;
-
-    case MSR_AMD64_DR0_ADDRESS_MASK:
-        if ( !boot_cpu_has(X86_FEATURE_DBEXT) )
-            break;
-        *val = curr->arch.pv_vcpu.dr_mask[0];
-        return X86EMUL_OKAY;
-
-    case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
-        if ( !boot_cpu_has(X86_FEATURE_DBEXT) )
-            break;
-        *val = curr->arch.pv_vcpu.dr_mask[reg - MSR_AMD64_DR1_ADDRESS_MASK + 1];
-        return X86EMUL_OKAY;
-
-    case MSR_IA32_PERF_CAPABILITIES:
-        /* No extra capabilities are supported. */
-        *val = 0;
-        return X86EMUL_OKAY;
-
-    case MSR_INTEL_PLATFORM_INFO:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
-             rdmsr_safe(MSR_INTEL_PLATFORM_INFO, *val) )
-            break;
-        *val = 0;
-        if ( this_cpu(cpuid_faulting_enabled) )
-            *val |= MSR_PLATFORM_INFO_CPUID_FAULTING;
-        return X86EMUL_OKAY;
-
-    case MSR_INTEL_MISC_FEATURES_ENABLES:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
-             rdmsr_safe(MSR_INTEL_MISC_FEATURES_ENABLES, *val) )
-            break;
-        *val = 0;
-        if ( curr->arch.cpuid_faulting )
-            *val |= MSR_MISC_FEATURES_CPUID_FAULTING;
-        return X86EMUL_OKAY;
-
-    case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
-    case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(3):
-    case MSR_CORE_PERF_FIXED_CTR0...MSR_CORE_PERF_FIXED_CTR2:
-    case MSR_CORE_PERF_FIXED_CTR_CTRL...MSR_CORE_PERF_GLOBAL_OVF_CTRL:
-        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        {
-            vpmu_msr = true;
-            /* fall through */
-    case MSR_AMD_FAM15H_EVNTSEL0...MSR_AMD_FAM15H_PERFCTR5:
-    case MSR_K7_EVNTSEL0...MSR_K7_PERFCTR3:
-            if ( vpmu_msr || (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
-            {
-                if ( vpmu_do_rdmsr(reg, val) )
-                    break;
-                return X86EMUL_OKAY;
-            }
-        }
-        /* fall through */
-    default:
-        if ( rdmsr_hypervisor_regs(reg, val) )
-            return X86EMUL_OKAY;
-
-        rc = vmce_rdmsr(reg, val);
-        if ( rc < 0 )
-            break;
-        if ( rc )
-            return X86EMUL_OKAY;
-        /* fall through */
-    normal:
-        /* Everyone can read the MSR space. */
-        /* gdprintk(XENLOG_WARNING, "Domain attempted RDMSR %08x\n", reg); */
-        if ( rdmsr_safe(reg, *val) )
-            break;
-        return X86EMUL_OKAY;
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-}
-
-#include "x86_64/mmconfig.h"
-
-static int priv_op_write_msr(unsigned int reg, uint64_t val,
-                             struct x86_emulate_ctxt *ctxt)
-{
-    struct vcpu *curr = current;
-    const struct domain *currd = curr->domain;
-    bool vpmu_msr = false;
-
-    switch ( reg )
-    {
-        uint64_t temp;
-        int rc;
-
-    case MSR_FS_BASE:
-        if ( is_pv_32bit_domain(currd) || !is_canonical_address(val) )
-            break;
-        wrfsbase(val);
-        curr->arch.pv_vcpu.fs_base = val;
-        return X86EMUL_OKAY;
-
-    case MSR_GS_BASE:
-        if ( is_pv_32bit_domain(currd) || !is_canonical_address(val) )
-            break;
-        wrgsbase(val);
-        curr->arch.pv_vcpu.gs_base_kernel = val;
-        return X86EMUL_OKAY;
-
-    case MSR_SHADOW_GS_BASE:
-        if ( is_pv_32bit_domain(currd) || !is_canonical_address(val) )
-            break;
-        wrmsrl(MSR_SHADOW_GS_BASE, val);
-        curr->arch.pv_vcpu.gs_base_user = val;
-        return X86EMUL_OKAY;
-
-    case MSR_K7_FID_VID_STATUS:
-    case MSR_K7_FID_VID_CTL:
-    case MSR_K8_PSTATE_LIMIT:
-    case MSR_K8_PSTATE_CTRL:
-    case MSR_K8_PSTATE_STATUS:
-    case MSR_K8_PSTATE0:
-    case MSR_K8_PSTATE1:
-    case MSR_K8_PSTATE2:
-    case MSR_K8_PSTATE3:
-    case MSR_K8_PSTATE4:
-    case MSR_K8_PSTATE5:
-    case MSR_K8_PSTATE6:
-    case MSR_K8_PSTATE7:
-    case MSR_K8_HWCR:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
-    case MSR_AMD64_NB_CFG:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
-             boot_cpu_data.x86 < 0x10 || boot_cpu_data.x86 > 0x17 )
-            break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
-            return X86EMUL_OKAY;
-        if ( (rdmsr_safe(MSR_AMD64_NB_CFG, temp) != 0) ||
-             ((val ^ temp) & ~(1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
-            goto invalid;
-        if ( wrmsr_safe(MSR_AMD64_NB_CFG, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
-    case MSR_FAM10H_MMIO_CONF_BASE:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
-             boot_cpu_data.x86 < 0x10 || boot_cpu_data.x86 > 0x17 )
-            break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
-            return X86EMUL_OKAY;
-        if ( rdmsr_safe(MSR_FAM10H_MMIO_CONF_BASE, temp) != 0 )
-            break;
-        if ( (pci_probe & PCI_PROBE_MASK) == PCI_PROBE_MMCONF ?
-             temp != val :
-             ((temp ^ val) &
-              ~(FAM10H_MMIO_CONF_ENABLE |
-                (FAM10H_MMIO_CONF_BUSRANGE_MASK <<
-                 FAM10H_MMIO_CONF_BUSRANGE_SHIFT) |
-                ((u64)FAM10H_MMIO_CONF_BASE_MASK <<
-                 FAM10H_MMIO_CONF_BASE_SHIFT))) )
-            goto invalid;
-        if ( wrmsr_safe(MSR_FAM10H_MMIO_CONF_BASE, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
-    case MSR_IA32_UCODE_REV:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) )
-            return X86EMUL_OKAY;
-        if ( rdmsr_safe(reg, temp) )
-            break;
-        if ( val )
-            goto invalid;
-        return X86EMUL_OKAY;
-
-    case MSR_IA32_MISC_ENABLE:
-        if ( rdmsr_safe(reg, temp) )
-            break;
-        if ( val != guest_misc_enable(temp) )
-            goto invalid;
-        return X86EMUL_OKAY;
-
-    case MSR_IA32_MPERF:
-    case MSR_IA32_APERF:
-        if ( (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) &&
-             (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
-    case MSR_IA32_THERM_CONTROL:
-    case MSR_IA32_ENERGY_PERF_BIAS:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( !is_hardware_domain(currd) || !is_pinned_vcpu(curr) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
-    case MSR_AMD64_DR0_ADDRESS_MASK:
-        if ( !boot_cpu_has(X86_FEATURE_DBEXT) || (val >> 32) )
-            break;
-        curr->arch.pv_vcpu.dr_mask[0] = val;
-        if ( curr->arch.debugreg[7] & DR7_ACTIVE_MASK )
-            wrmsrl(MSR_AMD64_DR0_ADDRESS_MASK, val);
-        return X86EMUL_OKAY;
-
-    case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
-        if ( !boot_cpu_has(X86_FEATURE_DBEXT) || (val >> 32) )
-            break;
-        curr->arch.pv_vcpu.dr_mask[reg - MSR_AMD64_DR1_ADDRESS_MASK + 1] = val;
-        if ( curr->arch.debugreg[7] & DR7_ACTIVE_MASK )
-            wrmsrl(reg, val);
-        return X86EMUL_OKAY;
-
-    case MSR_INTEL_PLATFORM_INFO:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
-             val || rdmsr_safe(MSR_INTEL_PLATFORM_INFO, val) )
-            break;
-        return X86EMUL_OKAY;
-
-    case MSR_INTEL_MISC_FEATURES_ENABLES:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
-             (val & ~MSR_MISC_FEATURES_CPUID_FAULTING) ||
-             rdmsr_safe(MSR_INTEL_MISC_FEATURES_ENABLES, temp) )
-            break;
-        if ( (val & MSR_MISC_FEATURES_CPUID_FAULTING) &&
-             !this_cpu(cpuid_faulting_enabled) )
-            break;
-        curr->arch.cpuid_faulting = !!(val & MSR_MISC_FEATURES_CPUID_FAULTING);
-        return X86EMUL_OKAY;
-
-    case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
-    case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(3):
-    case MSR_CORE_PERF_FIXED_CTR0...MSR_CORE_PERF_FIXED_CTR2:
-    case MSR_CORE_PERF_FIXED_CTR_CTRL...MSR_CORE_PERF_GLOBAL_OVF_CTRL:
-        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        {
-            vpmu_msr = true;
-    case MSR_AMD_FAM15H_EVNTSEL0...MSR_AMD_FAM15H_PERFCTR5:
-    case MSR_K7_EVNTSEL0...MSR_K7_PERFCTR3:
-            if ( vpmu_msr || (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
-            {
-                if ( (vpmu_mode & XENPMU_MODE_ALL) &&
-                     !is_hardware_domain(currd) )
-                    return X86EMUL_OKAY;
-
-                if ( vpmu_do_wrmsr(reg, val, 0) )
-                    break;
-                return X86EMUL_OKAY;
-            }
-        }
-        /* fall through */
-    default:
-        if ( wrmsr_hypervisor_regs(reg, val) == 1 )
-            return X86EMUL_OKAY;
-
-        rc = vmce_wrmsr(reg, val);
-        if ( rc < 0 )
-            break;
-        if ( rc )
-            return X86EMUL_OKAY;
-
-        if ( (rdmsr_safe(reg, temp) != 0) || (val != temp) )
-    invalid:
-            gdprintk(XENLOG_WARNING,
-                     "Domain attempted WRMSR %08x from 0x%016"PRIx64" to 0x%016"PRIx64"\n",
-                     reg, temp, val);
-        return X86EMUL_OKAY;
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-}
-
-static int priv_op_wbinvd(struct x86_emulate_ctxt *ctxt)
-{
-    /* Ignore the instruction if unprivileged. */
-    if ( !cache_flush_permitted(current->domain) )
-        /*
-         * Non-physdev domain attempted WBINVD; ignore for now since
-         * newer linux uses this in some start-of-day timing loops.
-         */
-        ;
-    else
-        wbinvd();
-
-    return X86EMUL_OKAY;
-}
-
-int pv_emul_cpuid(uint32_t leaf, uint32_t subleaf,
-                  struct cpuid_leaf *res, struct x86_emulate_ctxt *ctxt)
-{
-    guest_cpuid(current, leaf, subleaf, res);
-
-    return X86EMUL_OKAY;
-}
-
-static int priv_op_validate(const struct x86_emulate_state *state,
-                            struct x86_emulate_ctxt *ctxt)
-{
-    switch ( ctxt->opcode )
-    {
-    case 0x6c ... 0x6f: /* ins / outs */
-    case 0xe4 ... 0xe7: /* in / out (immediate port) */
-    case 0xec ... 0xef: /* in / out (port in %dx) */
-    case X86EMUL_OPC(0x0f, 0x06): /* clts */
-    case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */
-    case X86EMUL_OPC(0x0f, 0x20) ...
-         X86EMUL_OPC(0x0f, 0x23): /* mov to/from cr/dr */
-    case X86EMUL_OPC(0x0f, 0x30): /* wrmsr */
-    case X86EMUL_OPC(0x0f, 0x31): /* rdtsc */
-    case X86EMUL_OPC(0x0f, 0x32): /* rdmsr */
-    case X86EMUL_OPC(0x0f, 0xa2): /* cpuid */
-        return X86EMUL_OKAY;
-
-    case 0xfa: case 0xfb: /* cli / sti */
-        if ( !iopl_ok(current, ctxt->regs) )
-            break;
-        /*
-         * This is just too dangerous to allow, in my opinion. Consider if the
-         * caller then tries to reenable interrupts using POPF: we can't trap
-         * that and we'll end up with hard-to-debug lockups. Fast & loose will
-         * do for us. :-)
-        vcpu_info(current, evtchn_upcall_mask) = (ctxt->opcode == 0xfa);
-         */
-        return X86EMUL_DONE;
-
-    case X86EMUL_OPC(0x0f, 0x01):
-    {
-        unsigned int modrm_rm, modrm_reg;
-
-        if ( x86_insn_modrm(state, &modrm_rm, &modrm_reg) != 3 ||
-             (modrm_rm & 7) != 1 )
-            break;
-        switch ( modrm_reg & 7 )
-        {
-        case 2: /* xsetbv */
-        case 7: /* rdtscp */
-            return X86EMUL_OKAY;
-        }
-        break;
-    }
-    }
-
-    return X86EMUL_UNHANDLEABLE;
-}
-
-static const struct x86_emulate_ops priv_op_ops = {
-    .insn_fetch          = priv_op_insn_fetch,
-    .read                = x86emul_unhandleable_rw,
-    .validate            = priv_op_validate,
-    .read_io             = priv_op_read_io,
-    .write_io            = priv_op_write_io,
-    .rep_ins             = priv_op_rep_ins,
-    .rep_outs            = priv_op_rep_outs,
-    .read_segment        = priv_op_read_segment,
-    .read_cr             = priv_op_read_cr,
-    .write_cr            = priv_op_write_cr,
-    .read_dr             = priv_op_read_dr,
-    .write_dr            = priv_op_write_dr,
-    .read_msr            = priv_op_read_msr,
-    .write_msr           = priv_op_write_msr,
-    .cpuid               = pv_emul_cpuid,
-    .wbinvd              = priv_op_wbinvd,
-};
-
-static int emulate_privileged_op(struct cpu_user_regs *regs)
-{
-    struct vcpu *curr = current;
-    struct domain *currd = curr->domain;
-    struct priv_op_ctxt ctxt = {
-        .ctxt.regs = regs,
-        .ctxt.vendor = currd->arch.cpuid->x86_vendor,
-        .ctxt.lma = !is_pv_32bit_domain(currd),
-    };
-    int rc;
-    unsigned int eflags, ar;
-
-    if ( !pv_emul_read_descriptor(regs->cs, curr, &ctxt.cs.base,
-                                  &ctxt.cs.limit, &ar, 1) ||
-         !(ar & _SEGMENT_S) ||
-         !(ar & _SEGMENT_P) ||
-         !(ar & _SEGMENT_CODE) )
-        return 0;
-
-    /* Mirror virtualized state into EFLAGS. */
-    ASSERT(regs->eflags & X86_EFLAGS_IF);
-    if ( vcpu_info(curr, evtchn_upcall_mask) )
-        regs->eflags &= ~X86_EFLAGS_IF;
-    else
-        regs->eflags |= X86_EFLAGS_IF;
-    ASSERT(!(regs->eflags & X86_EFLAGS_IOPL));
-    regs->eflags |= curr->arch.pv_vcpu.iopl;
-    eflags = regs->eflags;
-
-    ctxt.ctxt.addr_size = ar & _SEGMENT_L ? 64 : ar & _SEGMENT_DB ? 32 : 16;
-    /* Leave zero in ctxt.ctxt.sp_size, as it's not needed. */
-    rc = x86_emulate(&ctxt.ctxt, &priv_op_ops);
-
-    if ( ctxt.io_emul_stub )
-        unmap_domain_page(ctxt.io_emul_stub);
-
-    /*
-     * Un-mirror virtualized state from EFLAGS.
-     * Nothing we allow to be emulated can change anything other than the
-     * arithmetic bits, and the resume flag.
-     */
-    ASSERT(!((regs->eflags ^ eflags) &
-             ~(X86_EFLAGS_RF | X86_EFLAGS_ARITH_MASK)));
-    regs->eflags |= X86_EFLAGS_IF;
-    regs->eflags &= ~X86_EFLAGS_IOPL;
-
-    switch ( rc )
-    {
-    case X86EMUL_OKAY:
-        if ( ctxt.tsc & TSC_BASE )
-        {
-            if ( ctxt.tsc & TSC_AUX )
-                pv_soft_rdtsc(curr, regs, 1);
-            else if ( currd->arch.vtsc )
-                pv_soft_rdtsc(curr, regs, 0);
-            else
-                msr_split(regs, rdtsc());
-        }
-
-        if ( ctxt.ctxt.retire.singlestep )
-            ctxt.bpmatch |= DR_STEP;
-        if ( ctxt.bpmatch )
-        {
-            curr->arch.debugreg[6] |= ctxt.bpmatch | DR_STATUS_RESERVED_ONE;
-            if ( !(curr->arch.pv_vcpu.trap_bounce.flags & TBF_EXCEPTION) )
-                pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
-        }
-        /* fall through */
-    case X86EMUL_RETRY:
-        return EXCRET_fault_fixed;
-
-    case X86EMUL_EXCEPTION:
-        pv_inject_event(&ctxt.ctxt.event);
-        return EXCRET_fault_fixed;
-    }
-
-    return 0;
-}
-
 static inline int check_stack_limit(unsigned int ar, unsigned int limit,
                                     unsigned int esp, unsigned int decr)
 {
@@ -3380,7 +2024,7 @@ void do_general_protection(struct cpu_user_regs *regs)
 
     /* Emulate some simple privileged and I/O instructions. */
     if ( (regs->error_code == 0) &&
-         emulate_privileged_op(regs) )
+         pv_emulate_privileged_op(regs) )
     {
         trace_trap_one_addr(TRC_PV_EMULATE_PRIVOP, regs->rip);
         return;
diff --git a/xen/arch/x86/x86_64/Makefile b/xen/arch/x86/x86_64/Makefile
index d8815e78b0..f336a6ae65 100644
--- a/xen/arch/x86/x86_64/Makefile
+++ b/xen/arch/x86/x86_64/Makefile
@@ -1,7 +1,6 @@
 subdir-y += compat
 
 obj-bin-y += entry.o
-obj-bin-y += gpr_switch.o
 obj-y += traps.o
 obj-$(CONFIG_KEXEC) += machine_kexec.o
 obj-y += pci.o
diff --git a/xen/include/asm-x86/pv/traps.h b/xen/include/asm-x86/pv/traps.h
new file mode 100644
index 0000000000..5aeb061551
--- /dev/null
+++ b/xen/include/asm-x86/pv/traps.h
@@ -0,0 +1,46 @@
+/*
+ * pv/traps.h
+ *
+ * PV guest traps interface definitions
+ *
+ * Copyright (C) 2017 Wei Liu <wei.liu2@citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __X86_PV_TRAPS_H__
+#define __X86_PV_TRAPS_H__
+
+#ifdef CONFIG_PV
+
+#include <public/xen.h>
+
+int pv_emulate_privileged_op(struct cpu_user_regs *regs);
+
+#else  /* !CONFIG_PV */
+
+static inline int pv_emulate_privileged_op(struct cpu_user_regs *regs) { return 0; }
+
+#endif /* CONFIG_PV */
+
+#endif /* __X86_PV_TRAPS_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH v4 03/27] x86: move PV gate op emulation code
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
  2017-06-08 17:11 ` [PATCH v4 01/27] x86: factor out common PV emulation code Wei Liu
  2017-06-08 17:11 ` [PATCH v4 02/27] x86: move PV privileged instruction " Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-20 16:05   ` Jan Beulich
  2017-06-08 17:11 ` [PATCH v4 04/27] x86: move PV invalid " Wei Liu
                   ` (23 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/emul-gate-op.c. Prefix emulate_gate_op with pv_
and export it via pv/traps.h.

Pure code motion except for the rename.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/Makefile       |   1 +
 xen/arch/x86/pv/emul-gate-op.c | 440 +++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c           | 390 +-----------------------------------
 xen/include/asm-x86/pv/traps.h |   2 +
 4 files changed, 444 insertions(+), 389 deletions(-)
 create mode 100644 xen/arch/x86/pv/emul-gate-op.c

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index e48c460680..1f6fbd3f5c 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -4,5 +4,6 @@ obj-y += traps.o
 obj-bin-y += dom0_build.init.o
 obj-y += domain.o
 obj-y += emulate.o
+obj-y += emul-gate-op.o
 obj-y += emul-priv-op.o
 obj-bin-y += gpr_switch.o
diff --git a/xen/arch/x86/pv/emul-gate-op.c b/xen/arch/x86/pv/emul-gate-op.c
new file mode 100644
index 0000000000..97a4b31a56
--- /dev/null
+++ b/xen/arch/x86/pv/emul-gate-op.c
@@ -0,0 +1,440 @@
+/******************************************************************************
+ * arch/x86/pv/emul-gate-op.c
+ *
+ * Emulate gate op for PV guests
+ *
+ * Modifications to Linux original are copyright (c) 2002-2004, K A Fraser
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/errno.h>
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/iocap.h>
+#include <xen/spinlock.h>
+#include <xen/trace.h>
+
+#include <asm/apic.h>
+#include <asm/debugreg.h>
+#include <asm/hpet.h>
+#include <asm/hypercall.h>
+#include <asm/mc146818rtc.h>
+#include <asm/p2m.h>
+#include <asm/pv/traps.h>
+#include <asm/shared.h>
+#include <asm/traps.h>
+#include <asm/x86_emulate.h>
+
+#include <xsm/xsm.h>
+
+#include "emulate.h"
+
+static int read_gate_descriptor(unsigned int gate_sel,
+                                const struct vcpu *v,
+                                unsigned int *sel,
+                                unsigned long *off,
+                                unsigned int *ar)
+{
+    struct desc_struct desc;
+    const struct desc_struct *pdesc;
+
+
+    pdesc = (const struct desc_struct *)
+        (!(gate_sel & 4) ? GDT_VIRT_START(v) : LDT_VIRT_START(v))
+        + (gate_sel >> 3);
+    if ( (gate_sel < 4) ||
+         ((gate_sel >= FIRST_RESERVED_GDT_BYTE) && !(gate_sel & 4)) ||
+         __get_user(desc, pdesc) )
+        return 0;
+
+    *sel = (desc.a >> 16) & 0x0000fffc;
+    *off = (desc.a & 0x0000ffff) | (desc.b & 0xffff0000);
+    *ar = desc.b & 0x0000ffff;
+
+    /*
+     * check_descriptor() clears the DPL field and stores the
+     * guest requested DPL in the selector's RPL field.
+     */
+    if ( *ar & _SEGMENT_DPL )
+        return 0;
+    *ar |= (desc.a >> (16 - 13)) & _SEGMENT_DPL;
+
+    if ( !is_pv_32bit_vcpu(v) )
+    {
+        if ( (*ar & 0x1f00) != 0x0c00 ||
+             (gate_sel >= FIRST_RESERVED_GDT_BYTE - 8 && !(gate_sel & 4)) ||
+             __get_user(desc, pdesc + 1) ||
+             (desc.b & 0x1f00) )
+            return 0;
+
+        *off |= (unsigned long)desc.a << 32;
+        return 1;
+    }
+
+    switch ( *ar & 0x1f00 )
+    {
+    case 0x0400:
+        *off &= 0xffff;
+        break;
+    case 0x0c00:
+        break;
+    default:
+        return 0;
+    }
+
+    return 1;
+}
+
+static inline int check_stack_limit(unsigned int ar, unsigned int limit,
+                                    unsigned int esp, unsigned int decr)
+{
+    return (((esp - decr) < (esp - 1)) &&
+            (!(ar & _SEGMENT_EC) ? (esp - 1) <= limit : (esp - decr) > limit));
+}
+
+struct gate_op_ctxt {
+    struct x86_emulate_ctxt ctxt;
+    struct {
+        unsigned long base, limit;
+    } cs;
+    bool insn_fetch;
+};
+
+static int gate_op_read(
+    enum x86_segment seg,
+    unsigned long offset,
+    void *p_data,
+    unsigned int bytes,
+    struct x86_emulate_ctxt *ctxt)
+{
+    const struct gate_op_ctxt *goc =
+        container_of(ctxt, struct gate_op_ctxt, ctxt);
+    unsigned int rc = bytes, sel = 0;
+    unsigned long addr = offset, limit = 0;
+
+    switch ( seg )
+    {
+    case x86_seg_cs:
+        addr += goc->cs.base;
+        limit = goc->cs.limit;
+        break;
+    case x86_seg_ds:
+        sel = read_sreg(ds);
+        break;
+    case x86_seg_es:
+        sel = read_sreg(es);
+        break;
+    case x86_seg_fs:
+        sel = read_sreg(fs);
+        break;
+    case x86_seg_gs:
+        sel = read_sreg(gs);
+        break;
+    case x86_seg_ss:
+        sel = ctxt->regs->ss;
+        break;
+    default:
+        return X86EMUL_UNHANDLEABLE;
+    }
+    if ( sel )
+    {
+        unsigned int ar;
+
+        ASSERT(!goc->insn_fetch);
+        if ( !pv_emul_read_descriptor(sel, current, &addr, &limit, &ar, 0) ||
+             !(ar & _SEGMENT_S) ||
+             !(ar & _SEGMENT_P) ||
+             ((ar & _SEGMENT_CODE) && !(ar & _SEGMENT_WR)) )
+            return X86EMUL_UNHANDLEABLE;
+        addr += offset;
+    }
+    else if ( seg != x86_seg_cs )
+        return X86EMUL_UNHANDLEABLE;
+
+    /* We don't mean to emulate any branches. */
+    if ( limit < bytes - 1 || offset > limit - bytes + 1 )
+        return X86EMUL_UNHANDLEABLE;
+
+    addr = (uint32_t)addr;
+
+    if ( (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
+    {
+        /*
+         * TODO: This should report PFEC_insn_fetch when goc->insn_fetch &&
+         * cpu_has_nx, but we'd then need a "fetch" variant of
+         * __copy_from_user() respecting NX, SMEP, and protection keys.
+         */
+        x86_emul_pagefault(0, addr + bytes - rc, ctxt);
+        return X86EMUL_EXCEPTION;
+    }
+
+    return X86EMUL_OKAY;
+}
+
+void pv_emulate_gate_op(struct cpu_user_regs *regs)
+{
+    struct vcpu *v = current;
+    unsigned int sel, ar, dpl, nparm, insn_len;
+    struct gate_op_ctxt ctxt = { .ctxt.regs = regs, .insn_fetch = true };
+    struct x86_emulate_state *state;
+    unsigned long off, base, limit;
+    uint16_t opnd_sel = 0;
+    int jump = -1, rc = X86EMUL_OKAY;
+
+    /* Check whether this fault is due to the use of a call gate. */
+    if ( !read_gate_descriptor(regs->error_code, v, &sel, &off, &ar) ||
+         (((ar >> 13) & 3) < (regs->cs & 3)) ||
+         ((ar & _SEGMENT_TYPE) != 0xc00) )
+    {
+        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+        return;
+    }
+    if ( !(ar & _SEGMENT_P) )
+    {
+        pv_inject_hw_exception(TRAP_no_segment, regs->error_code);
+        return;
+    }
+    dpl = (ar >> 13) & 3;
+    nparm = ar & 0x1f;
+
+    /*
+     * Decode instruction (and perhaps operand) to determine RPL,
+     * whether this is a jump or a call, and the call return offset.
+     */
+    if ( !pv_emul_read_descriptor(regs->cs, v, &ctxt.cs.base, &ctxt.cs.limit,
+                                  &ar, 0) ||
+         !(ar & _SEGMENT_S) ||
+         !(ar & _SEGMENT_P) ||
+         !(ar & _SEGMENT_CODE) )
+    {
+        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+        return;
+    }
+
+    ctxt.ctxt.addr_size = ar & _SEGMENT_DB ? 32 : 16;
+    /* Leave zero in ctxt.ctxt.sp_size, as it's not needed for decoding. */
+    state = x86_decode_insn(&ctxt.ctxt, gate_op_read);
+    ctxt.insn_fetch = false;
+    if ( IS_ERR_OR_NULL(state) )
+    {
+        if ( PTR_ERR(state) == -X86EMUL_EXCEPTION )
+            pv_inject_event(&ctxt.ctxt.event);
+        else
+            pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+        return;
+    }
+
+    switch ( ctxt.ctxt.opcode )
+    {
+        unsigned int modrm_345;
+
+    case 0xea:
+        ++jump;
+        /* fall through */
+    case 0x9a:
+        ++jump;
+        opnd_sel = x86_insn_immediate(state, 1);
+        break;
+    case 0xff:
+        if ( x86_insn_modrm(state, NULL, &modrm_345) >= 3 )
+            break;
+        switch ( modrm_345 & 7 )
+        {
+            enum x86_segment seg;
+
+        case 5:
+            ++jump;
+            /* fall through */
+        case 3:
+            ++jump;
+            base = x86_insn_operand_ea(state, &seg);
+            rc = gate_op_read(seg,
+                              base + (x86_insn_opsize(state) >> 3),
+                              &opnd_sel, sizeof(opnd_sel), &ctxt.ctxt);
+            break;
+        }
+        break;
+    }
+
+    insn_len = x86_insn_length(state, &ctxt.ctxt);
+    x86_emulate_free_state(state);
+
+    if ( rc == X86EMUL_EXCEPTION )
+    {
+        pv_inject_event(&ctxt.ctxt.event);
+        return;
+    }
+
+    if ( rc != X86EMUL_OKAY ||
+         jump < 0 ||
+         (opnd_sel & ~3) != regs->error_code ||
+         dpl < (opnd_sel & 3) )
+    {
+        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+        return;
+    }
+
+    if ( !pv_emul_read_descriptor(sel, v, &base, &limit, &ar, 0) ||
+         !(ar & _SEGMENT_S) ||
+         !(ar & _SEGMENT_CODE) ||
+         (!jump || (ar & _SEGMENT_EC) ?
+          ((ar >> 13) & 3) > (regs->cs & 3) :
+          ((ar >> 13) & 3) != (regs->cs & 3)) )
+    {
+        pv_inject_hw_exception(TRAP_gp_fault, sel);
+        return;
+    }
+    if ( !(ar & _SEGMENT_P) )
+    {
+        pv_inject_hw_exception(TRAP_no_segment, sel);
+        return;
+    }
+    if ( off > limit )
+    {
+        pv_inject_hw_exception(TRAP_gp_fault, 0);
+        return;
+    }
+
+    if ( !jump )
+    {
+        unsigned int ss, esp, *stkp;
+        int rc;
+#define push(item) do \
+        { \
+            --stkp; \
+            esp -= 4; \
+            rc = __put_user(item, stkp); \
+            if ( rc ) \
+            { \
+                pv_inject_page_fault(PFEC_write_access, \
+                                     (unsigned long)(stkp + 1) - rc); \
+                return; \
+            } \
+        } while ( 0 )
+
+        if ( ((ar >> 13) & 3) < (regs->cs & 3) )
+        {
+            sel |= (ar >> 13) & 3;
+            /* Inner stack known only for kernel ring. */
+            if ( (sel & 3) != GUEST_KERNEL_RPL(v->domain) )
+            {
+                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+                return;
+            }
+            esp = v->arch.pv_vcpu.kernel_sp;
+            ss = v->arch.pv_vcpu.kernel_ss;
+            if ( (ss & 3) != (sel & 3) ||
+                 !pv_emul_read_descriptor(ss, v, &base, &limit, &ar, 0) ||
+                 ((ar >> 13) & 3) != (sel & 3) ||
+                 !(ar & _SEGMENT_S) ||
+                 (ar & _SEGMENT_CODE) ||
+                 !(ar & _SEGMENT_WR) )
+            {
+                pv_inject_hw_exception(TRAP_invalid_tss, ss & ~3);
+                return;
+            }
+            if ( !(ar & _SEGMENT_P) ||
+                 !check_stack_limit(ar, limit, esp, (4 + nparm) * 4) )
+            {
+                pv_inject_hw_exception(TRAP_stack_error, ss & ~3);
+                return;
+            }
+            stkp = (unsigned int *)(unsigned long)((unsigned int)base + esp);
+            if ( !compat_access_ok(stkp - 4 - nparm, 16 + nparm * 4) )
+            {
+                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+                return;
+            }
+            push(regs->ss);
+            push(regs->rsp);
+            if ( nparm )
+            {
+                const unsigned int *ustkp;
+
+                if ( !pv_emul_read_descriptor(regs->ss, v, &base,
+                                              &limit, &ar, 0) ||
+                     ((ar >> 13) & 3) != (regs->cs & 3) ||
+                     !(ar & _SEGMENT_S) ||
+                     (ar & _SEGMENT_CODE) ||
+                     !(ar & _SEGMENT_WR) ||
+                     !check_stack_limit(ar, limit, esp + nparm * 4, nparm * 4) )
+                    return pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+                ustkp = (unsigned int *)(unsigned long)
+                        ((unsigned int)base + regs->esp + nparm * 4);
+                if ( !compat_access_ok(ustkp - nparm, 0 + nparm * 4) )
+                {
+                    pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+                    return;
+                }
+                do
+                {
+                    unsigned int parm;
+
+                    --ustkp;
+                    rc = __get_user(parm, ustkp);
+                    if ( rc )
+                    {
+                        pv_inject_page_fault(0, (unsigned long)(ustkp + 1) - rc);
+                        return;
+                    }
+                    push(parm);
+                } while ( --nparm );
+            }
+        }
+        else
+        {
+            sel |= (regs->cs & 3);
+            esp = regs->rsp;
+            ss = regs->ss;
+            if ( !pv_emul_read_descriptor(ss, v, &base, &limit, &ar, 0) ||
+                 ((ar >> 13) & 3) != (sel & 3) )
+            {
+                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+                return;
+            }
+            if ( !check_stack_limit(ar, limit, esp, 2 * 4) )
+            {
+                pv_inject_hw_exception(TRAP_stack_error, 0);
+                return;
+            }
+            stkp = (unsigned int *)(unsigned long)((unsigned int)base + esp);
+            if ( !compat_access_ok(stkp - 2, 2 * 4) )
+            {
+                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+                return;
+            }
+        }
+        push(regs->cs);
+        push(regs->rip + insn_len);
+#undef push
+        regs->rsp = esp;
+        regs->ss = ss;
+    }
+    else
+        sel |= (regs->cs & 3);
+
+    regs->cs = sel;
+    pv_emul_instruction_done(regs, off);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 32cab71444..7b781f17db 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1580,394 +1580,6 @@ long do_fpu_taskswitch(int set)
     return 0;
 }
 
-static int read_gate_descriptor(unsigned int gate_sel,
-                                const struct vcpu *v,
-                                unsigned int *sel,
-                                unsigned long *off,
-                                unsigned int *ar)
-{
-    struct desc_struct desc;
-    const struct desc_struct *pdesc;
-
-
-    pdesc = (const struct desc_struct *)
-        (!(gate_sel & 4) ? GDT_VIRT_START(v) : LDT_VIRT_START(v))
-        + (gate_sel >> 3);
-    if ( (gate_sel < 4) ||
-         ((gate_sel >= FIRST_RESERVED_GDT_BYTE) && !(gate_sel & 4)) ||
-         __get_user(desc, pdesc) )
-        return 0;
-
-    *sel = (desc.a >> 16) & 0x0000fffc;
-    *off = (desc.a & 0x0000ffff) | (desc.b & 0xffff0000);
-    *ar = desc.b & 0x0000ffff;
-
-    /*
-     * check_descriptor() clears the DPL field and stores the
-     * guest requested DPL in the selector's RPL field.
-     */
-    if ( *ar & _SEGMENT_DPL )
-        return 0;
-    *ar |= (desc.a >> (16 - 13)) & _SEGMENT_DPL;
-
-    if ( !is_pv_32bit_vcpu(v) )
-    {
-        if ( (*ar & 0x1f00) != 0x0c00 ||
-             (gate_sel >= FIRST_RESERVED_GDT_BYTE - 8 && !(gate_sel & 4)) ||
-             __get_user(desc, pdesc + 1) ||
-             (desc.b & 0x1f00) )
-            return 0;
-
-        *off |= (unsigned long)desc.a << 32;
-        return 1;
-    }
-
-    switch ( *ar & 0x1f00 )
-    {
-    case 0x0400:
-        *off &= 0xffff;
-        break;
-    case 0x0c00:
-        break;
-    default:
-        return 0;
-    }
-
-    return 1;
-}
-
-static inline int check_stack_limit(unsigned int ar, unsigned int limit,
-                                    unsigned int esp, unsigned int decr)
-{
-    return (((esp - decr) < (esp - 1)) &&
-            (!(ar & _SEGMENT_EC) ? (esp - 1) <= limit : (esp - decr) > limit));
-}
-
-struct gate_op_ctxt {
-    struct x86_emulate_ctxt ctxt;
-    struct {
-        unsigned long base, limit;
-    } cs;
-    bool insn_fetch;
-};
-
-static int gate_op_read(
-    enum x86_segment seg,
-    unsigned long offset,
-    void *p_data,
-    unsigned int bytes,
-    struct x86_emulate_ctxt *ctxt)
-{
-    const struct gate_op_ctxt *goc =
-        container_of(ctxt, struct gate_op_ctxt, ctxt);
-    unsigned int rc = bytes, sel = 0;
-    unsigned long addr = offset, limit = 0;
-
-    switch ( seg )
-    {
-    case x86_seg_cs:
-        addr += goc->cs.base;
-        limit = goc->cs.limit;
-        break;
-    case x86_seg_ds:
-        sel = read_sreg(ds);
-        break;
-    case x86_seg_es:
-        sel = read_sreg(es);
-        break;
-    case x86_seg_fs:
-        sel = read_sreg(fs);
-        break;
-    case x86_seg_gs:
-        sel = read_sreg(gs);
-        break;
-    case x86_seg_ss:
-        sel = ctxt->regs->ss;
-        break;
-    default:
-        return X86EMUL_UNHANDLEABLE;
-    }
-    if ( sel )
-    {
-        unsigned int ar;
-
-        ASSERT(!goc->insn_fetch);
-        if ( !pv_emul_read_descriptor(sel, current, &addr, &limit, &ar, 0) ||
-             !(ar & _SEGMENT_S) ||
-             !(ar & _SEGMENT_P) ||
-             ((ar & _SEGMENT_CODE) && !(ar & _SEGMENT_WR)) )
-            return X86EMUL_UNHANDLEABLE;
-        addr += offset;
-    }
-    else if ( seg != x86_seg_cs )
-        return X86EMUL_UNHANDLEABLE;
-
-    /* We don't mean to emulate any branches. */
-    if ( limit < bytes - 1 || offset > limit - bytes + 1 )
-        return X86EMUL_UNHANDLEABLE;
-
-    addr = (uint32_t)addr;
-
-    if ( (rc = __copy_from_user(p_data, (void *)addr, bytes)) )
-    {
-        /*
-         * TODO: This should report PFEC_insn_fetch when goc->insn_fetch &&
-         * cpu_has_nx, but we'd then need a "fetch" variant of
-         * __copy_from_user() respecting NX, SMEP, and protection keys.
-         */
-        x86_emul_pagefault(0, addr + bytes - rc, ctxt);
-        return X86EMUL_EXCEPTION;
-    }
-
-    return X86EMUL_OKAY;
-}
-
-static void emulate_gate_op(struct cpu_user_regs *regs)
-{
-    struct vcpu *v = current;
-    unsigned int sel, ar, dpl, nparm, insn_len;
-    struct gate_op_ctxt ctxt = { .ctxt.regs = regs, .insn_fetch = true };
-    struct x86_emulate_state *state;
-    unsigned long off, base, limit;
-    uint16_t opnd_sel = 0;
-    int jump = -1, rc = X86EMUL_OKAY;
-
-    /* Check whether this fault is due to the use of a call gate. */
-    if ( !read_gate_descriptor(regs->error_code, v, &sel, &off, &ar) ||
-         (((ar >> 13) & 3) < (regs->cs & 3)) ||
-         ((ar & _SEGMENT_TYPE) != 0xc00) )
-    {
-        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-        return;
-    }
-    if ( !(ar & _SEGMENT_P) )
-    {
-        pv_inject_hw_exception(TRAP_no_segment, regs->error_code);
-        return;
-    }
-    dpl = (ar >> 13) & 3;
-    nparm = ar & 0x1f;
-
-    /*
-     * Decode instruction (and perhaps operand) to determine RPL,
-     * whether this is a jump or a call, and the call return offset.
-     */
-    if ( !pv_emul_read_descriptor(regs->cs, v, &ctxt.cs.base, &ctxt.cs.limit,
-                                  &ar, 0) ||
-         !(ar & _SEGMENT_S) ||
-         !(ar & _SEGMENT_P) ||
-         !(ar & _SEGMENT_CODE) )
-    {
-        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-        return;
-    }
-
-    ctxt.ctxt.addr_size = ar & _SEGMENT_DB ? 32 : 16;
-    /* Leave zero in ctxt.ctxt.sp_size, as it's not needed for decoding. */
-    state = x86_decode_insn(&ctxt.ctxt, gate_op_read);
-    ctxt.insn_fetch = false;
-    if ( IS_ERR_OR_NULL(state) )
-    {
-        if ( PTR_ERR(state) == -X86EMUL_EXCEPTION )
-            pv_inject_event(&ctxt.ctxt.event);
-        else
-            pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-        return;
-    }
-
-    switch ( ctxt.ctxt.opcode )
-    {
-        unsigned int modrm_345;
-
-    case 0xea:
-        ++jump;
-        /* fall through */
-    case 0x9a:
-        ++jump;
-        opnd_sel = x86_insn_immediate(state, 1);
-        break;
-    case 0xff:
-        if ( x86_insn_modrm(state, NULL, &modrm_345) >= 3 )
-            break;
-        switch ( modrm_345 & 7 )
-        {
-            enum x86_segment seg;
-
-        case 5:
-            ++jump;
-            /* fall through */
-        case 3:
-            ++jump;
-            base = x86_insn_operand_ea(state, &seg);
-            rc = gate_op_read(seg,
-                              base + (x86_insn_opsize(state) >> 3),
-                              &opnd_sel, sizeof(opnd_sel), &ctxt.ctxt);
-            break;
-        }
-        break;
-    }
-
-    insn_len = x86_insn_length(state, &ctxt.ctxt);
-    x86_emulate_free_state(state);
-
-    if ( rc == X86EMUL_EXCEPTION )
-    {
-        pv_inject_event(&ctxt.ctxt.event);
-        return;
-    }
-
-    if ( rc != X86EMUL_OKAY ||
-         jump < 0 ||
-         (opnd_sel & ~3) != regs->error_code ||
-         dpl < (opnd_sel & 3) )
-    {
-        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-        return;
-    }
-
-    if ( !pv_emul_read_descriptor(sel, v, &base, &limit, &ar, 0) ||
-         !(ar & _SEGMENT_S) ||
-         !(ar & _SEGMENT_CODE) ||
-         (!jump || (ar & _SEGMENT_EC) ?
-          ((ar >> 13) & 3) > (regs->cs & 3) :
-          ((ar >> 13) & 3) != (regs->cs & 3)) )
-    {
-        pv_inject_hw_exception(TRAP_gp_fault, sel);
-        return;
-    }
-    if ( !(ar & _SEGMENT_P) )
-    {
-        pv_inject_hw_exception(TRAP_no_segment, sel);
-        return;
-    }
-    if ( off > limit )
-    {
-        pv_inject_hw_exception(TRAP_gp_fault, 0);
-        return;
-    }
-
-    if ( !jump )
-    {
-        unsigned int ss, esp, *stkp;
-        int rc;
-#define push(item) do \
-        { \
-            --stkp; \
-            esp -= 4; \
-            rc = __put_user(item, stkp); \
-            if ( rc ) \
-            { \
-                pv_inject_page_fault(PFEC_write_access, \
-                                     (unsigned long)(stkp + 1) - rc); \
-                return; \
-            } \
-        } while ( 0 )
-
-        if ( ((ar >> 13) & 3) < (regs->cs & 3) )
-        {
-            sel |= (ar >> 13) & 3;
-            /* Inner stack known only for kernel ring. */
-            if ( (sel & 3) != GUEST_KERNEL_RPL(v->domain) )
-            {
-                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-                return;
-            }
-            esp = v->arch.pv_vcpu.kernel_sp;
-            ss = v->arch.pv_vcpu.kernel_ss;
-            if ( (ss & 3) != (sel & 3) ||
-                 !pv_emul_read_descriptor(ss, v, &base, &limit, &ar, 0) ||
-                 ((ar >> 13) & 3) != (sel & 3) ||
-                 !(ar & _SEGMENT_S) ||
-                 (ar & _SEGMENT_CODE) ||
-                 !(ar & _SEGMENT_WR) )
-            {
-                pv_inject_hw_exception(TRAP_invalid_tss, ss & ~3);
-                return;
-            }
-            if ( !(ar & _SEGMENT_P) ||
-                 !check_stack_limit(ar, limit, esp, (4 + nparm) * 4) )
-            {
-                pv_inject_hw_exception(TRAP_stack_error, ss & ~3);
-                return;
-            }
-            stkp = (unsigned int *)(unsigned long)((unsigned int)base + esp);
-            if ( !compat_access_ok(stkp - 4 - nparm, 16 + nparm * 4) )
-            {
-                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-                return;
-            }
-            push(regs->ss);
-            push(regs->rsp);
-            if ( nparm )
-            {
-                const unsigned int *ustkp;
-
-                if ( !pv_emul_read_descriptor(regs->ss, v, &base,
-                                              &limit, &ar, 0) ||
-                     ((ar >> 13) & 3) != (regs->cs & 3) ||
-                     !(ar & _SEGMENT_S) ||
-                     (ar & _SEGMENT_CODE) ||
-                     !(ar & _SEGMENT_WR) ||
-                     !check_stack_limit(ar, limit, esp + nparm * 4, nparm * 4) )
-                    return pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-                ustkp = (unsigned int *)(unsigned long)
-                        ((unsigned int)base + regs->esp + nparm * 4);
-                if ( !compat_access_ok(ustkp - nparm, 0 + nparm * 4) )
-                {
-                    pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-                    return;
-                }
-                do
-                {
-                    unsigned int parm;
-
-                    --ustkp;
-                    rc = __get_user(parm, ustkp);
-                    if ( rc )
-                    {
-                        pv_inject_page_fault(0, (unsigned long)(ustkp + 1) - rc);
-                        return;
-                    }
-                    push(parm);
-                } while ( --nparm );
-            }
-        }
-        else
-        {
-            sel |= (regs->cs & 3);
-            esp = regs->rsp;
-            ss = regs->ss;
-            if ( !pv_emul_read_descriptor(ss, v, &base, &limit, &ar, 0) ||
-                 ((ar >> 13) & 3) != (sel & 3) )
-            {
-                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-                return;
-            }
-            if ( !check_stack_limit(ar, limit, esp, 2 * 4) )
-            {
-                pv_inject_hw_exception(TRAP_stack_error, 0);
-                return;
-            }
-            stkp = (unsigned int *)(unsigned long)((unsigned int)base + esp);
-            if ( !compat_access_ok(stkp - 2, 2 * 4) )
-            {
-                pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-                return;
-            }
-        }
-        push(regs->cs);
-        push(regs->rip + insn_len);
-#undef push
-        regs->rsp = esp;
-        regs->ss = ss;
-    }
-    else
-        sel |= (regs->cs & 3);
-
-    regs->cs = sel;
-    pv_emul_instruction_done(regs, off);
-}
-
 void do_general_protection(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
@@ -2018,7 +1630,7 @@ void do_general_protection(struct cpu_user_regs *regs)
     }
     else if ( is_pv_32bit_vcpu(v) && regs->error_code )
     {
-        emulate_gate_op(regs);
+        pv_emulate_gate_op(regs);
         return;
     }
 
diff --git a/xen/include/asm-x86/pv/traps.h b/xen/include/asm-x86/pv/traps.h
index 5aeb061551..3f3bab4d8c 100644
--- a/xen/include/asm-x86/pv/traps.h
+++ b/xen/include/asm-x86/pv/traps.h
@@ -26,10 +26,12 @@
 #include <public/xen.h>
 
 int pv_emulate_privileged_op(struct cpu_user_regs *regs);
+void pv_emulate_gate_op(struct cpu_user_regs *regs);
 
 #else  /* !CONFIG_PV */
 
 int pv_emulate_privileged_op(struct cpu_user_regs *regs) { return 0; }
+void pv_emulate_gate_op(struct cpu_user_regs *regs) {}
 
 #endif /* CONFIG_PV */
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (2 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 03/27] x86: move PV gate op " Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-20 16:21   ` Jan Beulich
  2017-06-08 17:11 ` [PATCH v4 05/27] x86/traps: remove now unused inclusion of emulate.h Wei Liu
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Move the code to pv/emul-inv-op.c. Prefix emulate_* functions with pv_
and export them via pv/traps.h.

Pure code motion except for the rename.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/Makefile       |   1 +
 xen/arch/x86/pv/emul-inv-op.c  | 123 +++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c           |  75 +------------------------
 xen/include/asm-x86/pv/traps.h |   4 ++
 4 files changed, 130 insertions(+), 73 deletions(-)
 create mode 100644 xen/arch/x86/pv/emul-inv-op.c

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 1f6fbd3f5c..42ca64dc9e 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -5,5 +5,6 @@ obj-bin-y += dom0_build.init.o
 obj-y += domain.o
 obj-y += emulate.o
 obj-y += emul-gate-op.o
+obj-y += emul-inv-op.o
 obj-y += emul-priv-op.o
 obj-bin-y += gpr_switch.o
diff --git a/xen/arch/x86/pv/emul-inv-op.c b/xen/arch/x86/pv/emul-inv-op.c
new file mode 100644
index 0000000000..6a731c6049
--- /dev/null
+++ b/xen/arch/x86/pv/emul-inv-op.c
@@ -0,0 +1,123 @@
+/******************************************************************************
+ * arch/x86/pv/emul-inv-op.c
+ *
+ * Emulate invalid op for PV guests
+ *
+ * Modifications to Linux original are copyright (c) 2002-2004, K A Fraser
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/errno.h>
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/iocap.h>
+#include <xen/spinlock.h>
+#include <xen/trace.h>
+
+#include <asm/apic.h>
+#include <asm/debugreg.h>
+#include <asm/hpet.h>
+#include <asm/hypercall.h>
+#include <asm/mc146818rtc.h>
+#include <asm/p2m.h>
+#include <asm/pv/traps.h>
+#include <asm/shared.h>
+#include <asm/traps.h>
+#include <asm/x86_emulate.h>
+
+#include <xsm/xsm.h>
+
+#include "emulate.h"
+
+int pv_emulate_invalid_rdtscp(struct cpu_user_regs *regs)
+{
+    char opcode[3];
+    unsigned long eip, rc;
+    struct vcpu *v = current;
+
+    eip = regs->rip;
+    if ( (rc = copy_from_user(opcode, (char *)eip, sizeof(opcode))) != 0 )
+    {
+        pv_inject_page_fault(0, eip + sizeof(opcode) - rc);
+        return EXCRET_fault_fixed;
+    }
+    if ( memcmp(opcode, "\xf\x1\xf9", sizeof(opcode)) )
+        return 0;
+    eip += sizeof(opcode);
+    pv_soft_rdtsc(v, regs, 1);
+    pv_emul_instruction_done(regs, eip);
+    return EXCRET_fault_fixed;
+}
+
+int pv_emulate_forced_invalid_op(struct cpu_user_regs *regs)
+{
+    char sig[5], instr[2];
+    unsigned long eip, rc;
+    struct cpuid_leaf res;
+
+    eip = regs->rip;
+
+    /* Check for forced emulation signature: ud2 ; .ascii "xen". */
+    if ( (rc = copy_from_user(sig, (char *)eip, sizeof(sig))) != 0 )
+    {
+        pv_inject_page_fault(0, eip + sizeof(sig) - rc);
+        return EXCRET_fault_fixed;
+    }
+    if ( memcmp(sig, "\xf\xbxen", sizeof(sig)) )
+        return 0;
+    eip += sizeof(sig);
+
+    /* We only emulate CPUID. */
+    if ( ( rc = copy_from_user(instr, (char *)eip, sizeof(instr))) != 0 )
+    {
+        pv_inject_page_fault(0, eip + sizeof(instr) - rc);
+        return EXCRET_fault_fixed;
+    }
+    if ( memcmp(instr, "\xf\xa2", sizeof(instr)) )
+        return 0;
+
+    /* If cpuid faulting is enabled and CPL>0 inject a #GP in place of #UD. */
+    if ( current->arch.cpuid_faulting && !guest_kernel_mode(current, regs) )
+    {
+        regs->rip = eip;
+        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+        return EXCRET_fault_fixed;
+    }
+
+    eip += sizeof(instr);
+
+    guest_cpuid(current, regs->eax, regs->ecx, &res);
+
+    regs->rax = res.a;
+    regs->rbx = res.b;
+    regs->rcx = res.c;
+    regs->rdx = res.d;
+
+    pv_emul_instruction_done(regs, eip);
+
+    trace_trap_one_addr(TRC_PV_FORCED_INVALID_OP, regs->rip);
+
+    return EXCRET_fault_fixed;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 7b781f17db..ff25f679f5 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -968,77 +968,6 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
     }
 }
 
-static int emulate_invalid_rdtscp(struct cpu_user_regs *regs)
-{
-    char opcode[3];
-    unsigned long eip, rc;
-    struct vcpu *v = current;
-
-    eip = regs->rip;
-    if ( (rc = copy_from_user(opcode, (char *)eip, sizeof(opcode))) != 0 )
-    {
-        pv_inject_page_fault(0, eip + sizeof(opcode) - rc);
-        return EXCRET_fault_fixed;
-    }
-    if ( memcmp(opcode, "\xf\x1\xf9", sizeof(opcode)) )
-        return 0;
-    eip += sizeof(opcode);
-    pv_soft_rdtsc(v, regs, 1);
-    pv_emul_instruction_done(regs, eip);
-    return EXCRET_fault_fixed;
-}
-
-static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
-{
-    char sig[5], instr[2];
-    unsigned long eip, rc;
-    struct cpuid_leaf res;
-
-    eip = regs->rip;
-
-    /* Check for forced emulation signature: ud2 ; .ascii "xen". */
-    if ( (rc = copy_from_user(sig, (char *)eip, sizeof(sig))) != 0 )
-    {
-        pv_inject_page_fault(0, eip + sizeof(sig) - rc);
-        return EXCRET_fault_fixed;
-    }
-    if ( memcmp(sig, "\xf\xbxen", sizeof(sig)) )
-        return 0;
-    eip += sizeof(sig);
-
-    /* We only emulate CPUID. */
-    if ( ( rc = copy_from_user(instr, (char *)eip, sizeof(instr))) != 0 )
-    {
-        pv_inject_page_fault(0, eip + sizeof(instr) - rc);
-        return EXCRET_fault_fixed;
-    }
-    if ( memcmp(instr, "\xf\xa2", sizeof(instr)) )
-        return 0;
-
-    /* If cpuid faulting is enabled and CPL>0 inject a #GP in place of #UD. */
-    if ( current->arch.cpuid_faulting && !guest_kernel_mode(current, regs) )
-    {
-        regs->rip = eip;
-        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-        return EXCRET_fault_fixed;
-    }
-
-    eip += sizeof(instr);
-
-    guest_cpuid(current, regs->eax, regs->ecx, &res);
-
-    regs->rax = res.a;
-    regs->rbx = res.b;
-    regs->rcx = res.c;
-    regs->rdx = res.d;
-
-    pv_emul_instruction_done(regs, eip);
-
-    trace_trap_one_addr(TRC_PV_FORCED_INVALID_OP, regs->rip);
-
-    return EXCRET_fault_fixed;
-}
-
 void do_invalid_op(struct cpu_user_regs *regs)
 {
     const struct bug_frame *bug = NULL;
@@ -1053,8 +982,8 @@ void do_invalid_op(struct cpu_user_regs *regs)
 
     if ( likely(guest_mode(regs)) )
     {
-        if ( !emulate_invalid_rdtscp(regs) &&
-             !emulate_forced_invalid_op(regs) )
+        if ( !pv_emulate_invalid_rdtscp(regs) &&
+             !pv_emulate_forced_invalid_op(regs) )
             pv_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
diff --git a/xen/include/asm-x86/pv/traps.h b/xen/include/asm-x86/pv/traps.h
index 3f3bab4d8c..a4af69e486 100644
--- a/xen/include/asm-x86/pv/traps.h
+++ b/xen/include/asm-x86/pv/traps.h
@@ -27,11 +27,15 @@
 
 int pv_emulate_privileged_op(struct cpu_user_regs *regs);
 void pv_emulate_gate_op(struct cpu_user_regs *regs);
+int pv_emulate_invalid_rdtscp(struct cpu_user_regs *regs);
+int pv_emulate_forced_invalid_op(struct cpu_user_regs *regs);
 
 #else  /* !CONFIG_PV */
 
 int pv_emulate_privileged_op(struct cpu_user_regs *regs) { return 0; }
 void pv_emulate_gate_op(struct cpu_user_regs *regs) {}
+int pv_emulate_invalid_rdtscp(struct cpu_user_regs *regs) { return 0; }
+int pv_emulate_forced_invalid_op(struct cpu_user_regs *regs) { return 0; }
 
 #endif /* CONFIG_PV */
 
-- 
2.11.0



* [PATCH v4 05/27] x86/traps: remove now unused inclusion of emulate.h
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (3 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 04/27] x86: move PV invalid " Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-20 16:21   ` Jan Beulich
  2017-06-08 17:11 ` [PATCH v4 06/27] x86: clean up PV emulation code Wei Liu
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
Can be squashed into the previous patch. Kept separate in case further
code churn is required.
---
 xen/arch/x86/traps.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index ff25f679f5..b5642b0f9a 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -79,8 +79,6 @@
 #include <xsm/xsm.h>
 #include <asm/pv/traps.h>
 
-#include "pv/emulate.h"
-
 /*
  * opt_nmi: one of 'ignore', 'dom0', or 'fatal'.
  *  fatal:  Xen prints diagnostic message and then hangs.
-- 
2.11.0



* [PATCH v4 06/27] x86: clean up PV emulation code
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (4 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 05/27] x86/traps: remove now unused inclusion of emulate.h Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 10:56   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c Wei Liu
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Replace bool_t with bool. Fix coding style issues: add spaces around
binary operators, use an unsigned literal (1u) when shifting, and
eliminate the TOGGLE_MODE macro.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/pv/emul-gate-op.c |  5 ++-
 xen/arch/x86/pv/emul-priv-op.c | 82 ++++++++++++++++++++++--------------------
 2 files changed, 45 insertions(+), 42 deletions(-)

diff --git a/xen/arch/x86/pv/emul-gate-op.c b/xen/arch/x86/pv/emul-gate-op.c
index 97a4b31a56..cdf3b308f2 100644
--- a/xen/arch/x86/pv/emul-gate-op.c
+++ b/xen/arch/x86/pv/emul-gate-op.c
@@ -50,7 +50,6 @@ static int read_gate_descriptor(unsigned int gate_sel,
     struct desc_struct desc;
     const struct desc_struct *pdesc;
 
-
     pdesc = (const struct desc_struct *)
         (!(gate_sel & 4) ? GDT_VIRT_START(v) : LDT_VIRT_START(v))
         + (gate_sel >> 3);
@@ -97,8 +96,8 @@ static int read_gate_descriptor(unsigned int gate_sel,
     return 1;
 }
 
-static inline int check_stack_limit(unsigned int ar, unsigned int limit,
-                                    unsigned int esp, unsigned int decr)
+static inline bool check_stack_limit(unsigned int ar, unsigned int limit,
+                                     unsigned int esp, unsigned int decr)
 {
     return (((esp - decr) < (esp - 1)) &&
             (!(ar & _SEGMENT_EC) ? (esp - 1) <= limit : (esp - decr) > limit));
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index fd5fd74bd1..85185b6b29 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -58,7 +58,7 @@ struct priv_op_ctxt {
 #define TSC_AUX 2
 };
 
-/* I/O emulation support. Helper routines for, and type of, the stack stub.*/
+/* I/O emulation support. Helper routines for, and type of, the stack stub. */
 void host_to_guest_gpr_switch(struct cpu_user_regs *);
 unsigned long guest_to_host_gpr_switch(unsigned long);
 
@@ -101,7 +101,7 @@ static io_emul_stub_t *io_emul_stub_setup(struct priv_op_ctxt *ctxt, u8 opcode,
 
 
 /* Perform IOPL check between the vcpu's shadowed IOPL, and the assumed cpl. */
-static bool_t iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs)
+static bool iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs)
 {
     unsigned int cpl = guest_kernel_mode(v, regs) ?
         (VM_ASSIST(v->domain, architectural_iopl) ? 0 : 1) : 3;
@@ -112,16 +112,14 @@ static bool_t iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs)
 }
 
 /* Has the guest requested sufficient permission for this I/O access? */
-static int guest_io_okay(
-    unsigned int port, unsigned int bytes,
-    struct vcpu *v, struct cpu_user_regs *regs)
+static bool guest_io_okay(unsigned int port, unsigned int bytes,
+                          struct vcpu *v, struct cpu_user_regs *regs)
 {
     /* If in user mode, switch to kernel mode just to read I/O bitmap. */
-    int user_mode = !(v->arch.flags & TF_kernel_mode);
-#define TOGGLE_MODE() if ( user_mode ) toggle_guest_mode(v)
+    const bool user_mode = !(v->arch.flags & TF_kernel_mode);
 
     if ( iopl_ok(v, regs) )
-        return 1;
+        return true;
 
     if ( v->arch.pv_vcpu.iobmp_limit > (port + bytes) )
     {
@@ -131,7 +129,9 @@ static int guest_io_okay(
          * Grab permission bytes from guest space. Inaccessible bytes are
          * read as 0xff (no access allowed).
          */
-        TOGGLE_MODE();
+        if ( user_mode )
+            toggle_guest_mode(v);
+
         switch ( __copy_from_guest_offset(x.bytes, v->arch.pv_vcpu.iobmp,
                                           port>>3, 2) )
         {
@@ -141,43 +141,45 @@ static int guest_io_okay(
             /* fallthrough */
         case 0:  break;
         }
-        TOGGLE_MODE();
 
-        if ( (x.mask & (((1<<bytes)-1) << (port&7))) == 0 )
-            return 1;
+        if ( user_mode )
+            toggle_guest_mode(v);
+
+        if ( (x.mask & (((1 << bytes) - 1) << (port & 7))) == 0 )
+            return true;
     }
 
-    return 0;
+    return false;
 }
 
 /* Has the administrator granted sufficient permission for this I/O access? */
-static bool_t admin_io_okay(unsigned int port, unsigned int bytes,
-                            const struct domain *d)
+static bool admin_io_okay(unsigned int port, unsigned int bytes,
+                          const struct domain *d)
 {
     /*
      * Port 0xcf8 (CONFIG_ADDRESS) is only visible for DWORD accesses.
      * We never permit direct access to that register.
      */
     if ( (port == 0xcf8) && (bytes == 4) )
-        return 0;
+        return false;
 
     /* We also never permit direct access to the RTC/CMOS registers. */
     if ( ((port & ~1) == RTC_PORT(0)) )
-        return 0;
+        return false;
 
     return ioports_access_permitted(d, port, port + bytes - 1);
 }
 
-static bool_t pci_cfg_ok(struct domain *currd, unsigned int start,
-                         unsigned int size, uint32_t *write)
+static bool pci_cfg_ok(struct domain *currd, unsigned int start,
+                       unsigned int size, uint32_t *write)
 {
     uint32_t machine_bdf;
 
     if ( !is_hardware_domain(currd) )
-        return 0;
+        return false;
 
     if ( !CF8_ENABLED(currd->arch.pci_cf8) )
-        return 1;
+        return true;
 
     machine_bdf = CF8_BDF(currd->arch.pci_cf8);
     if ( write )
@@ -185,7 +187,7 @@ static bool_t pci_cfg_ok(struct domain *currd, unsigned int start,
         const unsigned long *ro_map = pci_get_ro_map(0);
 
         if ( ro_map && test_bit(machine_bdf, ro_map) )
-            return 0;
+            return false;
     }
     start |= CF8_ADDR_LO(currd->arch.pci_cf8);
     /* AMD extended configuration space access? */
@@ -196,7 +198,7 @@ static bool_t pci_cfg_ok(struct domain *currd, unsigned int start,
         uint64_t msr_val;
 
         if ( rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) )
-            return 0;
+            return false;
         if ( msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT) )
             start |= CF8_ADDR_HI(currd->arch.pci_cf8);
     }
@@ -273,7 +275,8 @@ uint32_t guest_io_read(unsigned int port, unsigned int bytes,
 }
 
 static unsigned int check_guest_io_breakpoint(struct vcpu *v,
-    unsigned int port, unsigned int len)
+                                              unsigned int port,
+                                              unsigned int len)
 {
     unsigned int width, i, match = 0;
     unsigned long start;
@@ -301,7 +304,7 @@ static unsigned int check_guest_io_breakpoint(struct vcpu *v,
         }
 
         if ( (start < (port + len)) && ((start + width) > port) )
-            match |= 1 << i;
+            match |= 1u << i;
     }
 
     return match;
@@ -342,7 +345,8 @@ void guest_io_write(unsigned int port, unsigned int bytes, uint32_t data,
 {
     if ( admin_io_okay(port, bytes, currd) )
     {
-        switch ( bytes ) {
+        switch ( bytes )
+        {
         case 1:
             outb((uint8_t)data, port);
             if ( pv_post_outb_hook )
@@ -741,7 +745,7 @@ static int priv_op_write_cr(unsigned int reg, unsigned long val,
         if ( (val ^ read_cr0()) & ~X86_CR0_TS )
         {
             gdprintk(XENLOG_WARNING,
-                    "Attempt to change unmodifiable CR0 flags\n");
+                     "Attempt to change unmodifiable CR0 flags\n");
             break;
         }
         do_fpu_taskswitch(!!(val & X86_CR0_TS));
@@ -948,16 +952,16 @@ static int priv_op_read_msr(unsigned int reg, uint64_t *val,
             *val |= MSR_MISC_FEATURES_CPUID_FAULTING;
         return X86EMUL_OKAY;
 
-    case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
-    case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(3):
-    case MSR_CORE_PERF_FIXED_CTR0...MSR_CORE_PERF_FIXED_CTR2:
-    case MSR_CORE_PERF_FIXED_CTR_CTRL...MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+    case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7):
+    case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3):
+    case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2:
+    case MSR_CORE_PERF_FIXED_CTR_CTRL ... MSR_CORE_PERF_GLOBAL_OVF_CTRL:
         if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
         {
             vpmu_msr = true;
             /* fall through */
-    case MSR_AMD_FAM15H_EVNTSEL0...MSR_AMD_FAM15H_PERFCTR5:
-    case MSR_K7_EVNTSEL0...MSR_K7_PERFCTR3:
+    case MSR_AMD_FAM15H_EVNTSEL0 ... MSR_AMD_FAM15H_PERFCTR5:
+    case MSR_K7_EVNTSEL0 ... MSR_K7_PERFCTR3:
             if ( vpmu_msr || (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
             {
                 if ( vpmu_do_rdmsr(reg, val) )
@@ -1153,15 +1157,15 @@ static int priv_op_write_msr(unsigned int reg, uint64_t val,
         curr->arch.cpuid_faulting = !!(val & MSR_MISC_FEATURES_CPUID_FAULTING);
         return X86EMUL_OKAY;
 
-    case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
-    case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(3):
-    case MSR_CORE_PERF_FIXED_CTR0...MSR_CORE_PERF_FIXED_CTR2:
-    case MSR_CORE_PERF_FIXED_CTR_CTRL...MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+    case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7):
+    case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3):
+    case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2:
+    case MSR_CORE_PERF_FIXED_CTR_CTRL ... MSR_CORE_PERF_GLOBAL_OVF_CTRL:
         if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
         {
             vpmu_msr = true;
-    case MSR_AMD_FAM15H_EVNTSEL0...MSR_AMD_FAM15H_PERFCTR5:
-    case MSR_K7_EVNTSEL0...MSR_K7_PERFCTR3:
+    case MSR_AMD_FAM15H_EVNTSEL0 ... MSR_AMD_FAM15H_PERFCTR5:
+    case MSR_K7_EVNTSEL0 ... MSR_K7_PERFCTR3:
             if ( vpmu_msr || (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
             {
                 if ( (vpmu_mode & XENPMU_MODE_ALL) &&
-- 
2.11.0



* [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (5 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 06/27] x86: clean up PV emulation code Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 11:00   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 08/27] x86: move some misc PV hypercalls to misc-hypercalls.c Wei Liu
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c    | 49 ----------------------------------------------
 2 files changed, 52 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index 51125a8d86..6e69f2ad58 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -19,7 +19,10 @@
  * Copyright (c) 2017 Citrix Systems Ltd.
  */
 
+#include <xen/event.h>
+#include <xen/guest_access.h>
 #include <xen/hypercall.h>
+#include <xen/sched.h>
 
 #include <asm/apic.h>
 
@@ -31,6 +34,55 @@ void do_entry_int82(struct cpu_user_regs *regs)
     pv_hypercall(regs);
 }
 
+long do_set_trap_table(XEN_GUEST_HANDLE_PARAM(const_trap_info_t) traps)
+{
+    struct trap_info cur;
+    struct vcpu *curr = current;
+    struct trap_info *dst = curr->arch.pv_vcpu.trap_ctxt;
+    long rc = 0;
+
+    /* If no table is presented then clear the entire virtual IDT. */
+    if ( guest_handle_is_null(traps) )
+    {
+        memset(dst, 0, NR_VECTORS * sizeof(*dst));
+        init_int80_direct_trap(curr);
+        return 0;
+    }
+
+    for ( ; ; )
+    {
+        if ( copy_from_guest(&cur, traps, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        if ( cur.address == 0 )
+            break;
+
+        if ( !is_canonical_address(cur.address) )
+            return -EINVAL;
+
+        fixup_guest_code_selector(curr->domain, cur.cs);
+
+        memcpy(&dst[cur.vector], &cur, sizeof(cur));
+
+        if ( cur.vector == 0x80 )
+            init_int80_direct_trap(curr);
+
+        guest_handle_add_offset(traps, 1);
+
+        if ( hypercall_preempt_check() )
+        {
+            rc = hypercall_create_continuation(
+                __HYPERVISOR_set_trap_table, "h", traps);
+            break;
+        }
+    }
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index b5642b0f9a..440aad182b 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2126,55 +2126,6 @@ int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
 }
 
 
-long do_set_trap_table(XEN_GUEST_HANDLE_PARAM(const_trap_info_t) traps)
-{
-    struct trap_info cur;
-    struct vcpu *curr = current;
-    struct trap_info *dst = curr->arch.pv_vcpu.trap_ctxt;
-    long rc = 0;
-
-    /* If no table is presented then clear the entire virtual IDT. */
-    if ( guest_handle_is_null(traps) )
-    {
-        memset(dst, 0, NR_VECTORS * sizeof(*dst));
-        init_int80_direct_trap(curr);
-        return 0;
-    }
-
-    for ( ; ; )
-    {
-        if ( copy_from_guest(&cur, traps, 1) )
-        {
-            rc = -EFAULT;
-            break;
-        }
-
-        if ( cur.address == 0 )
-            break;
-
-        if ( !is_canonical_address(cur.address) )
-            return -EINVAL;
-
-        fixup_guest_code_selector(curr->domain, cur.cs);
-
-        memcpy(&dst[cur.vector], &cur, sizeof(cur));
-
-        if ( cur.vector == 0x80 )
-            init_int80_direct_trap(curr);
-
-        guest_handle_add_offset(traps, 1);
-
-        if ( hypercall_preempt_check() )
-        {
-            rc = hypercall_create_continuation(
-                __HYPERVISOR_set_trap_table, "h", traps);
-            break;
-        }
-    }
-
-    return rc;
-}
-
 void activate_debugregs(const struct vcpu *curr)
 {
     ASSERT(curr == current);
-- 
2.11.0


* [PATCH v4 08/27] x86: move some misc PV hypercalls to misc-hypercalls.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (6 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 11:02   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 09/27] x86/traps: move pv_inject_event to pv/traps.c Wei Liu
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/Makefile          |  1 +
 xen/arch/x86/pv/misc-hypercalls.c | 78 +++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c              | 44 ----------------------
 3 files changed, 79 insertions(+), 44 deletions(-)
 create mode 100644 xen/arch/x86/pv/misc-hypercalls.c

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 42ca64dc9e..939ea60bea 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -8,3 +8,4 @@ obj-y += emul-gate-op.o
 obj-y += emul-inv-op.o
 obj-y += emul-priv-op.o
 obj-bin-y += gpr_switch.o
+obj-y += misc-hypercalls.o
diff --git a/xen/arch/x86/pv/misc-hypercalls.c b/xen/arch/x86/pv/misc-hypercalls.c
new file mode 100644
index 0000000000..5862130697
--- /dev/null
+++ b/xen/arch/x86/pv/misc-hypercalls.c
@@ -0,0 +1,78 @@
+/******************************************************************************
+ * arch/x86/pv/misc-hypercalls.c
+ *
+ * Misc hypercall handlers
+ *
+ * Modifications to Linux original are copyright (c) 2002-2004, K A Fraser
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/hypercall.h>
+
+#include <asm/debugreg.h>
+
+long do_set_debugreg(int reg, unsigned long value)
+{
+    return set_debugreg(current, reg, value);
+}
+
+unsigned long do_get_debugreg(int reg)
+{
+    struct vcpu *curr = current;
+
+    switch ( reg )
+    {
+    case 0 ... 3:
+    case 6:
+        return curr->arch.debugreg[reg];
+    case 7:
+        return (curr->arch.debugreg[7] |
+                curr->arch.debugreg[5]);
+    case 4 ... 5:
+        return ((curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_DE) ?
+                curr->arch.debugreg[reg + 2] : 0);
+    }
+
+    return -EINVAL;
+}
+
+long do_fpu_taskswitch(int set)
+{
+    struct vcpu *v = current;
+
+    if ( set )
+    {
+        v->arch.pv_vcpu.ctrlreg[0] |= X86_CR0_TS;
+        stts();
+    }
+    else
+    {
+        v->arch.pv_vcpu.ctrlreg[0] &= ~X86_CR0_TS;
+        if ( v->fpu_dirtied )
+            clts();
+    }
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 440aad182b..6923c2ef01 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1488,25 +1488,6 @@ void __init do_early_page_fault(struct cpu_user_regs *regs)
     }
 }
 
-long do_fpu_taskswitch(int set)
-{
-    struct vcpu *v = current;
-
-    if ( set )
-    {
-        v->arch.pv_vcpu.ctrlreg[0] |= X86_CR0_TS;
-        stts();
-    }
-    else
-    {
-        v->arch.pv_vcpu.ctrlreg[0] &= ~X86_CR0_TS;
-        if ( v->fpu_dirtied )
-            clts();
-    }
-
-    return 0;
-}
-
 void do_general_protection(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
@@ -2249,31 +2230,6 @@ long set_debugreg(struct vcpu *v, unsigned int reg, unsigned long value)
     return 0;
 }
 
-long do_set_debugreg(int reg, unsigned long value)
-{
-    return set_debugreg(current, reg, value);
-}
-
-unsigned long do_get_debugreg(int reg)
-{
-    struct vcpu *curr = current;
-
-    switch ( reg )
-    {
-    case 0 ... 3:
-    case 6:
-        return curr->arch.debugreg[reg];
-    case 7:
-        return (curr->arch.debugreg[7] |
-                curr->arch.debugreg[5]);
-    case 4 ... 5:
-        return ((curr->arch.pv_vcpu.ctrlreg[4] & X86_CR4_DE) ?
-                curr->arch.debugreg[reg + 2] : 0);
-    }
-
-    return -EINVAL;
-}
-
 void asm_domain_crash_synchronous(unsigned long addr)
 {
     /*
-- 
2.11.0


* [PATCH v4 09/27] x86/traps: move pv_inject_event to pv/traps.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (7 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 08/27] x86: move some misc PV hypercalls to misc-hypercalls.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 11:04   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 10/27] x86/traps: move set_guest_{machine,nmi}_trapbounce Wei Liu
                   ` (17 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Take the opportunity to rename "v" to "curr".

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c    | 69 ----------------------------------------------
 2 files changed, 73 insertions(+), 69 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index 6e69f2ad58..ec7ff1040b 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -22,9 +22,13 @@
 #include <xen/event.h>
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
+#include <xen/lib.h>
 #include <xen/sched.h>
+#include <xen/trace.h>
 
 #include <asm/apic.h>
+#include <asm/shared.h>
+#include <asm/traps.h>
 
 void do_entry_int82(struct cpu_user_regs *regs)
 {
@@ -83,6 +87,75 @@ long do_set_trap_table(XEN_GUEST_HANDLE_PARAM(const_trap_info_t) traps)
     return rc;
 }
 
+void pv_inject_event(const struct x86_event *event)
+{
+    struct vcpu *curr = current;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    struct trap_bounce *tb;
+    const struct trap_info *ti;
+    const uint8_t vector = event->vector;
+    unsigned int error_code = event->error_code;
+    bool use_error_code;
+
+    ASSERT(vector == event->vector); /* Confirm no truncation. */
+    if ( event->type == X86_EVENTTYPE_HW_EXCEPTION )
+    {
+        ASSERT(vector < 32);
+        use_error_code = TRAP_HAVE_EC & (1u << vector);
+    }
+    else
+    {
+        ASSERT(event->type == X86_EVENTTYPE_SW_INTERRUPT);
+        use_error_code = false;
+    }
+    if ( use_error_code )
+        ASSERT(error_code != X86_EVENT_NO_EC);
+    else
+        ASSERT(error_code == X86_EVENT_NO_EC);
+
+    tb = &curr->arch.pv_vcpu.trap_bounce;
+    ti = &curr->arch.pv_vcpu.trap_ctxt[vector];
+
+    tb->flags = TBF_EXCEPTION;
+    tb->cs    = ti->cs;
+    tb->eip   = ti->address;
+
+    if ( event->type == X86_EVENTTYPE_HW_EXCEPTION &&
+         vector == TRAP_page_fault )
+    {
+        curr->arch.pv_vcpu.ctrlreg[2] = event->cr2;
+        arch_set_cr2(curr, event->cr2);
+
+        /* Re-set error_code.user flag appropriately for the guest. */
+        error_code &= ~PFEC_user_mode;
+        if ( !guest_kernel_mode(curr, regs) )
+            error_code |= PFEC_user_mode;
+
+        trace_pv_page_fault(event->cr2, error_code);
+    }
+    else
+        trace_pv_trap(vector, regs->rip, use_error_code, error_code);
+
+    if ( use_error_code )
+    {
+        tb->flags |= TBF_EXCEPTION_ERRCODE;
+        tb->error_code = error_code;
+    }
+
+    if ( TI_GET_IF(ti) )
+        tb->flags |= TBF_INTERRUPT;
+
+    if ( unlikely(null_trap_bounce(curr, tb)) )
+    {
+        gprintk(XENLOG_WARNING,
+                "Unhandled %s fault/trap [#%d, ec=%04x]\n",
+                trapstr(vector), vector, error_code);
+
+        if ( vector == TRAP_page_fault )
+            show_page_walk(event->cr2);
+    }
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 6923c2ef01..6abfb62c0c 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -626,75 +626,6 @@ void fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote)
           (regs->eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
 }
 
-void pv_inject_event(const struct x86_event *event)
-{
-    struct vcpu *v = current;
-    struct cpu_user_regs *regs = guest_cpu_user_regs();
-    struct trap_bounce *tb;
-    const struct trap_info *ti;
-    const uint8_t vector = event->vector;
-    unsigned int error_code = event->error_code;
-    bool use_error_code;
-
-    ASSERT(vector == event->vector); /* Confirm no truncation. */
-    if ( event->type == X86_EVENTTYPE_HW_EXCEPTION )
-    {
-        ASSERT(vector < 32);
-        use_error_code = TRAP_HAVE_EC & (1u << vector);
-    }
-    else
-    {
-        ASSERT(event->type == X86_EVENTTYPE_SW_INTERRUPT);
-        use_error_code = false;
-    }
-    if ( use_error_code )
-        ASSERT(error_code != X86_EVENT_NO_EC);
-    else
-        ASSERT(error_code == X86_EVENT_NO_EC);
-
-    tb = &v->arch.pv_vcpu.trap_bounce;
-    ti = &v->arch.pv_vcpu.trap_ctxt[vector];
-
-    tb->flags = TBF_EXCEPTION;
-    tb->cs    = ti->cs;
-    tb->eip   = ti->address;
-
-    if ( event->type == X86_EVENTTYPE_HW_EXCEPTION &&
-         vector == TRAP_page_fault )
-    {
-        v->arch.pv_vcpu.ctrlreg[2] = event->cr2;
-        arch_set_cr2(v, event->cr2);
-
-        /* Re-set error_code.user flag appropriately for the guest. */
-        error_code &= ~PFEC_user_mode;
-        if ( !guest_kernel_mode(v, regs) )
-            error_code |= PFEC_user_mode;
-
-        trace_pv_page_fault(event->cr2, error_code);
-    }
-    else
-        trace_pv_trap(vector, regs->rip, use_error_code, error_code);
-
-    if ( use_error_code )
-    {
-        tb->flags |= TBF_EXCEPTION_ERRCODE;
-        tb->error_code = error_code;
-    }
-
-    if ( TI_GET_IF(ti) )
-        tb->flags |= TBF_INTERRUPT;
-
-    if ( unlikely(null_trap_bounce(v, tb)) )
-    {
-        gprintk(XENLOG_WARNING,
-                "Unhandled %s fault/trap [#%d, ec=%04x]\n",
-                trapstr(vector), vector, error_code);
-
-        if ( vector == TRAP_page_fault )
-            show_page_walk(event->cr2);
-    }
-}
-
 /*
  * Called from asm to set up the MCE trapbounce info.
  * Returns 0 if no callback is set up, else 1.
-- 
2.11.0


* [PATCH v4 10/27] x86/traps: move set_guest_{machine,nmi}_trapbounce
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (8 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 09/27] x86/traps: move pv_inject_event to pv/traps.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 11:05   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 11/27] x86/traps: move {un,}register_guest_nmi_callback Wei Liu
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Take the opportunity to change their return type to bool, and to
rename "v" to "curr".

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c | 27 +++++++++++++++++++++++++++
 xen/arch/x86/traps.c    | 27 ---------------------------
 2 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index ec7ff1040b..e374cd73b4 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -156,6 +156,33 @@ void pv_inject_event(const struct x86_event *event)
     }
 }
 
+/*
+ * Called from asm to set up the MCE trapbounce info.
+ * Returns false if no callback is set up, else true.
+ */
+bool set_guest_machinecheck_trapbounce(void)
+{
+    struct vcpu *curr = current;
+    struct trap_bounce *tb = &curr->arch.pv_vcpu.trap_bounce;
+
+    pv_inject_hw_exception(TRAP_machine_check, X86_EVENT_NO_EC);
+    tb->flags &= ~TBF_EXCEPTION; /* not needed for MCE delivery path */
+    return !null_trap_bounce(curr, tb);
+}
+
+/*
+ * Called from asm to set up the NMI trapbounce info.
+ * Returns false if no callback is set up, else true.
+ */
+bool set_guest_nmi_trapbounce(void)
+{
+    struct vcpu *curr = current;
+    struct trap_bounce *tb = &curr->arch.pv_vcpu.trap_bounce;
+    pv_inject_hw_exception(TRAP_nmi, X86_EVENT_NO_EC);
+    tb->flags &= ~TBF_EXCEPTION; /* not needed for NMI delivery path */
+    return !null_trap_bounce(curr, tb);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 6abfb62c0c..013de702ad 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -626,33 +626,6 @@ void fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote)
           (regs->eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
 }
 
-/*
- * Called from asm to set up the MCE trapbounce info.
- * Returns 0 if no callback is set up, else 1.
- */
-int set_guest_machinecheck_trapbounce(void)
-{
-    struct vcpu *v = current;
-    struct trap_bounce *tb = &v->arch.pv_vcpu.trap_bounce;
- 
-    pv_inject_hw_exception(TRAP_machine_check, X86_EVENT_NO_EC);
-    tb->flags &= ~TBF_EXCEPTION; /* not needed for MCE delivery path */
-    return !null_trap_bounce(v, tb);
-}
-
-/*
- * Called from asm to set up the NMI trapbounce info.
- * Returns 0 if no callback is set up, else 1.
- */
-int set_guest_nmi_trapbounce(void)
-{
-    struct vcpu *v = current;
-    struct trap_bounce *tb = &v->arch.pv_vcpu.trap_bounce;
-    pv_inject_hw_exception(TRAP_nmi, X86_EVENT_NO_EC);
-    tb->flags &= ~TBF_EXCEPTION; /* not needed for NMI delivery path */
-    return !null_trap_bounce(v, tb);
-}
-
 void do_reserved_trap(struct cpu_user_regs *regs)
 {
     unsigned int trapnr = regs->entry_vector;
-- 
2.11.0


* [PATCH v4 11/27] x86/traps: move {un,}register_guest_nmi_callback
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (9 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 10/27] x86/traps: move set_guest_{machine,nmi}_trapbounce Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 11:38   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 12/27] x86/traps: move guest_has_trap_callback to pv/traps.c Wei Liu
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Take the opportunity to rename "v" to "curr".

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c | 36 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c    | 36 ------------------------------------
 2 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index e374cd73b4..d0e651616d 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -183,6 +183,42 @@ bool set_guest_nmi_trapbounce(void)
     return !null_trap_bounce(curr, tb);
 }
 
+long register_guest_nmi_callback(unsigned long address)
+{
+    struct vcpu *curr = current;
+    struct domain *d = curr->domain;
+    struct trap_info *t = &curr->arch.pv_vcpu.trap_ctxt[TRAP_nmi];
+
+    if ( !is_canonical_address(address) )
+        return -EINVAL;
+
+    t->vector  = TRAP_nmi;
+    t->flags   = 0;
+    t->cs      = (is_pv_32bit_domain(d) ?
+                  FLAT_COMPAT_KERNEL_CS : FLAT_KERNEL_CS);
+    t->address = address;
+    TI_SET_IF(t, 1);
+
+    /*
+     * If no handler was registered we can 'lose the NMI edge'. Re-assert it
+     * now.
+     */
+    if ( (curr->vcpu_id == 0) && (arch_get_nmi_reason(d) != 0) )
+        curr->nmi_pending = 1;
+
+    return 0;
+}
+
+long unregister_guest_nmi_callback(void)
+{
+    struct vcpu *curr = current;
+    struct trap_info *t = &curr->arch.pv_vcpu.trap_ctxt[TRAP_nmi];
+
+    memset(t, 0, sizeof(*t));
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 013de702ad..babb476097 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1909,42 +1909,6 @@ void __init trap_init(void)
     open_softirq(PCI_SERR_SOFTIRQ, pci_serr_softirq);
 }
 
-long register_guest_nmi_callback(unsigned long address)
-{
-    struct vcpu *v = current;
-    struct domain *d = v->domain;
-    struct trap_info *t = &v->arch.pv_vcpu.trap_ctxt[TRAP_nmi];
-
-    if ( !is_canonical_address(address) )
-        return -EINVAL;
-
-    t->vector  = TRAP_nmi;
-    t->flags   = 0;
-    t->cs      = (is_pv_32bit_domain(d) ?
-                  FLAT_COMPAT_KERNEL_CS : FLAT_KERNEL_CS);
-    t->address = address;
-    TI_SET_IF(t, 1);
-
-    /*
-     * If no handler was registered we can 'lose the NMI edge'. Re-assert it
-     * now.
-     */
-    if ( (v->vcpu_id == 0) && (arch_get_nmi_reason(d) != 0) )
-        v->nmi_pending = 1;
-
-    return 0;
-}
-
-long unregister_guest_nmi_callback(void)
-{
-    struct vcpu *v = current;
-    struct trap_info *t = &v->arch.pv_vcpu.trap_ctxt[TRAP_nmi];
-
-    memset(t, 0, sizeof(*t));
-
-    return 0;
-}
-
 int guest_has_trap_callback(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
 {
     struct vcpu *v;
-- 
2.11.0


* [PATCH v4 12/27] x86/traps: move guest_has_trap_callback to pv/traps.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (10 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 11/27] x86/traps: move {un,}register_guest_nmi_callback Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:01   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 13/27] x86: move toggle_guest_mode to pv/domain.c Wei Liu
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Take the chance to constify pointers, replace uint16_t with unsigned
int, etc.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c     | 18 ++++++++++++++++++
 xen/arch/x86/traps.c        | 18 ------------------
 xen/include/asm-x86/traps.h |  6 +++---
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index d0e651616d..be215df57a 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -219,6 +219,24 @@ long unregister_guest_nmi_callback(void)
     return 0;
 }
 
+bool guest_has_trap_callback(const struct domain *d, unsigned int vcpuid,
+                             unsigned int trap_nr)
+{
+    const struct vcpu *v;
+    const struct trap_info *t;
+
+    BUG_ON(d == NULL);
+    BUG_ON(vcpuid >= d->max_vcpus);
+
+    /* Sanity check - XXX should be more fine grained. */
+    BUG_ON(trap_nr >= NR_VECTORS);
+
+    v = d->vcpu[vcpuid];
+    t = &v->arch.pv_vcpu.trap_ctxt[trap_nr];
+
+    return t->address;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index babb476097..8861dfd332 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1909,24 +1909,6 @@ void __init trap_init(void)
     open_softirq(PCI_SERR_SOFTIRQ, pci_serr_softirq);
 }
 
-int guest_has_trap_callback(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
-{
-    struct vcpu *v;
-    struct trap_info *t;
-
-    BUG_ON(d == NULL);
-    BUG_ON(vcpuid >= d->max_vcpus);
-
-    /* Sanity check - XXX should be more fine grained. */
-    BUG_ON(trap_nr >= NR_VECTORS);
-
-    v = d->vcpu[vcpuid];
-    t = &v->arch.pv_vcpu.trap_ctxt[trap_nr];
-
-    return (t->address != 0);
-}
-
-
 int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
 {
     struct vcpu *v;
diff --git a/xen/include/asm-x86/traps.h b/xen/include/asm-x86/traps.h
index f1d2513e6b..26625ce5a6 100644
--- a/xen/include/asm-x86/traps.h
+++ b/xen/include/asm-x86/traps.h
@@ -32,10 +32,10 @@ void async_exception_cleanup(struct vcpu *);
 /**
  * guest_has_trap_callback
  *
- * returns true (non-zero) if guest registered a trap handler
+ * returns true if guest registered a trap handler
  */
-extern int guest_has_trap_callback(struct domain *d, uint16_t vcpuid,
-				unsigned int trap_nr);
+bool guest_has_trap_callback(const struct domain *d, unsigned int vcpuid,
+                             unsigned int trap_nr);
 
 /**
  * send_guest_trap
-- 
2.11.0


* [PATCH v4 13/27] x86: move toggle_guest_mode to pv/domain.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (11 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 12/27] x86/traps: move guest_has_trap_callback to pv/traps.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:10   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 14/27] x86: move do_iret to pv/iret.c Wei Liu
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/domain.c    | 30 ++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/traps.c | 30 ------------------------------
 2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 1c0c040ca0..0f3f0e85d6 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -213,6 +213,36 @@ int pv_domain_initialise(struct domain *d, unsigned int domcr_flags,
     return rc;
 }
 
+void toggle_guest_mode(struct vcpu *v)
+{
+    if ( is_pv_32bit_vcpu(v) )
+        return;
+    if ( cpu_has_fsgsbase )
+    {
+        if ( v->arch.flags & TF_kernel_mode )
+            v->arch.pv_vcpu.gs_base_kernel = __rdgsbase();
+        else
+            v->arch.pv_vcpu.gs_base_user = __rdgsbase();
+    }
+    v->arch.flags ^= TF_kernel_mode;
+    asm volatile ( "swapgs" );
+    update_cr3(v);
+    /* Don't flush user global mappings from the TLB. Don't tick TLB clock. */
+    asm volatile ( "mov %0, %%cr3" : : "r" (v->arch.cr3) : "memory" );
+
+    if ( !(v->arch.flags & TF_kernel_mode) )
+        return;
+
+    if ( v->arch.pv_vcpu.need_update_runstate_area &&
+         update_runstate_area(v) )
+        v->arch.pv_vcpu.need_update_runstate_area = 0;
+
+    if ( v->arch.pv_vcpu.pending_system_time.version &&
+         update_secondary_system_time(v,
+                                      &v->arch.pv_vcpu.pending_system_time) )
+        v->arch.pv_vcpu.pending_system_time.version = 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 78f410517c..36b694c605 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -254,36 +254,6 @@ void do_double_fault(struct cpu_user_regs *regs)
     panic("DOUBLE FAULT -- system shutdown");
 }
 
-void toggle_guest_mode(struct vcpu *v)
-{
-    if ( is_pv_32bit_vcpu(v) )
-        return;
-    if ( cpu_has_fsgsbase )
-    {
-        if ( v->arch.flags & TF_kernel_mode )
-            v->arch.pv_vcpu.gs_base_kernel = __rdgsbase();
-        else
-            v->arch.pv_vcpu.gs_base_user = __rdgsbase();
-    }
-    v->arch.flags ^= TF_kernel_mode;
-    asm volatile ( "swapgs" );
-    update_cr3(v);
-    /* Don't flush user global mappings from the TLB. Don't tick TLB clock. */
-    asm volatile ( "mov %0, %%cr3" : : "r" (v->arch.cr3) : "memory" );
-
-    if ( !(v->arch.flags & TF_kernel_mode) )
-        return;
-
-    if ( v->arch.pv_vcpu.need_update_runstate_area &&
-         update_runstate_area(v) )
-        v->arch.pv_vcpu.need_update_runstate_area = 0;
-
-    if ( v->arch.pv_vcpu.pending_system_time.version &&
-         update_secondary_system_time(v,
-                                      &v->arch.pv_vcpu.pending_system_time) )
-        v->arch.pv_vcpu.pending_system_time.version = 0;
-}
-
 unsigned long do_iret(void)
 {
     struct cpu_user_regs *regs = guest_cpu_user_regs();
-- 
2.11.0


* [PATCH v4 14/27] x86: move do_iret to pv/iret.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (12 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 13/27] x86: move toggle_guest_mode to pv/domain.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:12   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 15/27] x86: move callback_op code to pv/callback.c Wei Liu
                   ` (12 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
There is no copyright header in the original file. Use the default
one?
---
 xen/arch/x86/pv/Makefile    |  1 +
 xen/arch/x86/pv/iret.c      | 72 +++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/traps.c | 56 -----------------------------------
 3 files changed, 73 insertions(+), 56 deletions(-)
 create mode 100644 xen/arch/x86/pv/iret.c

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 939ea60bea..7e3da332d8 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -8,4 +8,5 @@ obj-y += emul-gate-op.o
 obj-y += emul-inv-op.o
 obj-y += emul-priv-op.o
 obj-bin-y += gpr_switch.o
+obj-y += iret.o
 obj-y += misc-hypercalls.o
diff --git a/xen/arch/x86/pv/iret.c b/xen/arch/x86/pv/iret.c
new file mode 100644
index 0000000000..358ae7cf08
--- /dev/null
+++ b/xen/arch/x86/pv/iret.c
@@ -0,0 +1,72 @@
+#include <xen/guest_access.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+
+#include <asm/current.h>
+#include <asm/traps.h>
+
+unsigned long do_iret(void)
+{
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    struct iret_context iret_saved;
+    struct vcpu *v = current;
+
+    if ( unlikely(copy_from_user(&iret_saved, (void *)regs->rsp,
+                                 sizeof(iret_saved))) )
+    {
+        gprintk(XENLOG_ERR,
+                "Fault while reading IRET context from guest stack\n");
+        goto exit_and_crash;
+    }
+
+    /* Returning to user mode? */
+    if ( (iret_saved.cs & 3) == 3 )
+    {
+        if ( unlikely(pagetable_is_null(v->arch.guest_table_user)) )
+        {
+            gprintk(XENLOG_ERR,
+                    "Guest switching to user mode with no user page tables\n");
+            goto exit_and_crash;
+        }
+        toggle_guest_mode(v);
+    }
+
+    if ( VM_ASSIST(v->domain, architectural_iopl) )
+        v->arch.pv_vcpu.iopl = iret_saved.rflags & X86_EFLAGS_IOPL;
+
+    regs->rip    = iret_saved.rip;
+    regs->cs     = iret_saved.cs | 3; /* force guest privilege */
+    regs->rflags = ((iret_saved.rflags & ~(X86_EFLAGS_IOPL|X86_EFLAGS_VM))
+                    | X86_EFLAGS_IF);
+    regs->rsp    = iret_saved.rsp;
+    regs->ss     = iret_saved.ss | 3; /* force guest privilege */
+
+    if ( !(iret_saved.flags & VGCF_in_syscall) )
+    {
+        regs->entry_vector &= ~TRAP_syscall;
+        regs->r11 = iret_saved.r11;
+        regs->rcx = iret_saved.rcx;
+    }
+
+    /* Restore upcall mask from supplied EFLAGS.IF. */
+    vcpu_info(v, evtchn_upcall_mask) = !(iret_saved.rflags & X86_EFLAGS_IF);
+
+    async_exception_cleanup(v);
+
+    /* Saved %rax gets written back to regs->rax in entry.S. */
+    return iret_saved.rax;
+
+ exit_and_crash:
+    domain_crash(v->domain);
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 36b694c605..4641bc6d06 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -254,62 +254,6 @@ void do_double_fault(struct cpu_user_regs *regs)
     panic("DOUBLE FAULT -- system shutdown");
 }
 
-unsigned long do_iret(void)
-{
-    struct cpu_user_regs *regs = guest_cpu_user_regs();
-    struct iret_context iret_saved;
-    struct vcpu *v = current;
-
-    if ( unlikely(copy_from_user(&iret_saved, (void *)regs->rsp,
-                                 sizeof(iret_saved))) )
-    {
-        gprintk(XENLOG_ERR,
-                "Fault while reading IRET context from guest stack\n");
-        goto exit_and_crash;
-    }
-
-    /* Returning to user mode? */
-    if ( (iret_saved.cs & 3) == 3 )
-    {
-        if ( unlikely(pagetable_is_null(v->arch.guest_table_user)) )
-        {
-            gprintk(XENLOG_ERR,
-                    "Guest switching to user mode with no user page tables\n");
-            goto exit_and_crash;
-        }
-        toggle_guest_mode(v);
-    }
-
-    if ( VM_ASSIST(v->domain, architectural_iopl) )
-        v->arch.pv_vcpu.iopl = iret_saved.rflags & X86_EFLAGS_IOPL;
-
-    regs->rip    = iret_saved.rip;
-    regs->cs     = iret_saved.cs | 3; /* force guest privilege */
-    regs->rflags = ((iret_saved.rflags & ~(X86_EFLAGS_IOPL|X86_EFLAGS_VM))
-                    | X86_EFLAGS_IF);
-    regs->rsp    = iret_saved.rsp;
-    regs->ss     = iret_saved.ss | 3; /* force guest privilege */
-
-    if ( !(iret_saved.flags & VGCF_in_syscall) )
-    {
-        regs->entry_vector &= ~TRAP_syscall;
-        regs->r11 = iret_saved.r11;
-        regs->rcx = iret_saved.rcx;
-    }
-
-    /* Restore upcall mask from supplied EFLAGS.IF. */
-    vcpu_info(v, evtchn_upcall_mask) = !(iret_saved.rflags & X86_EFLAGS_IF);
-
-    async_exception_cleanup(v);
-
-    /* Saved %rax gets written back to regs->rax in entry.S. */
-    return iret_saved.rax;
-
- exit_and_crash:
-    domain_crash(v->domain);
-    return 0;
-}
-
 static unsigned int write_stub_trampoline(
     unsigned char *stub, unsigned long stub_va,
     unsigned long stack_bottom, unsigned long target_va)
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 72+ messages in thread
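
A note for reviewers: the security-relevant part of the moved do_iret() is the sanitisation of the guest-supplied frame — %cs/%ss are forced to RPL 3 and IOPL/VM are stripped from rflags while IF is forced on. A minimal standalone sketch of those fixups (EFLAGS bit values are the architectural ones; the helper names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define X86_EFLAGS_IF   0x00000200u  /* bit 9 */
#define X86_EFLAGS_IOPL 0x00003000u  /* bits 12-13 */
#define X86_EFLAGS_VM   0x00020000u  /* bit 17 */

/* Mirrors the rflags fixup in do_iret(): the guest may not raise its
 * IOPL or enter VM86 mode, and interrupts are always re-enabled. */
static uint64_t sanitise_rflags(uint64_t guest_rflags)
{
    return (guest_rflags & ~(uint64_t)(X86_EFLAGS_IOPL | X86_EFLAGS_VM))
           | X86_EFLAGS_IF;
}

/* Mirrors the "force guest privilege" fixup applied to %cs and %ss. */
static unsigned int force_guest_rpl(unsigned int sel)
{
    return sel | 3;
}
```

Either fixup alone would be insufficient: a frame with RPL 0 selectors or IOPL 3 would let the guest kernel escalate on the real IRET.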

* [PATCH v4 15/27] x86: move callback_op code to pv/callback.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (13 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 14/27] x86: move do_iret to pv/iret.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-08 17:11 ` [PATCH v4 16/27] x86/traps: factor out pv_trap_init Wei Liu
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/Makefile    |   1 +
 xen/arch/x86/pv/callback.c  | 157 ++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/traps.c | 148 -----------------------------------------
 3 files changed, 158 insertions(+), 148 deletions(-)
 create mode 100644 xen/arch/x86/pv/callback.c

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 7e3da332d8..bd1a7081fc 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -1,6 +1,7 @@
 obj-y += hypercall.o
 obj-y += traps.o
 
+obj-y += callback.o
 obj-bin-y += dom0_build.init.o
 obj-y += domain.o
 obj-y += emulate.o
diff --git a/xen/arch/x86/pv/callback.c b/xen/arch/x86/pv/callback.c
new file mode 100644
index 0000000000..dbd602c89d
--- /dev/null
+++ b/xen/arch/x86/pv/callback.c
@@ -0,0 +1,157 @@
+#include <xen/guest_access.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+
+#include <asm/current.h>
+#include <asm/nmi.h>
+#include <asm/traps.h>
+
+#include <public/callback.h>
+
+static long register_guest_callback(struct callback_register *reg)
+{
+    long ret = 0;
+    struct vcpu *v = current;
+
+    if ( !is_canonical_address(reg->address) )
+        return -EINVAL;
+
+    switch ( reg->type )
+    {
+    case CALLBACKTYPE_event:
+        v->arch.pv_vcpu.event_callback_eip    = reg->address;
+        break;
+
+    case CALLBACKTYPE_failsafe:
+        v->arch.pv_vcpu.failsafe_callback_eip = reg->address;
+        if ( reg->flags & CALLBACKF_mask_events )
+            set_bit(_VGCF_failsafe_disables_events,
+                    &v->arch.vgc_flags);
+        else
+            clear_bit(_VGCF_failsafe_disables_events,
+                      &v->arch.vgc_flags);
+        break;
+
+    case CALLBACKTYPE_syscall:
+        v->arch.pv_vcpu.syscall_callback_eip  = reg->address;
+        if ( reg->flags & CALLBACKF_mask_events )
+            set_bit(_VGCF_syscall_disables_events,
+                    &v->arch.vgc_flags);
+        else
+            clear_bit(_VGCF_syscall_disables_events,
+                      &v->arch.vgc_flags);
+        break;
+
+    case CALLBACKTYPE_syscall32:
+        v->arch.pv_vcpu.syscall32_callback_eip = reg->address;
+        v->arch.pv_vcpu.syscall32_disables_events =
+            !!(reg->flags & CALLBACKF_mask_events);
+        break;
+
+    case CALLBACKTYPE_sysenter:
+        v->arch.pv_vcpu.sysenter_callback_eip = reg->address;
+        v->arch.pv_vcpu.sysenter_disables_events =
+            !!(reg->flags & CALLBACKF_mask_events);
+        break;
+
+    case CALLBACKTYPE_nmi:
+        ret = register_guest_nmi_callback(reg->address);
+        break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+static long unregister_guest_callback(struct callback_unregister *unreg)
+{
+    long ret;
+
+    switch ( unreg->type )
+    {
+    case CALLBACKTYPE_event:
+    case CALLBACKTYPE_failsafe:
+    case CALLBACKTYPE_syscall:
+    case CALLBACKTYPE_syscall32:
+    case CALLBACKTYPE_sysenter:
+        ret = -EINVAL;
+        break;
+
+    case CALLBACKTYPE_nmi:
+        ret = unregister_guest_nmi_callback();
+        break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+
+long do_callback_op(int cmd, XEN_GUEST_HANDLE_PARAM(const_void) arg)
+{
+    long ret;
+
+    switch ( cmd )
+    {
+    case CALLBACKOP_register:
+    {
+        struct callback_register reg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&reg, arg, 1) )
+            break;
+
+        ret = register_guest_callback(&reg);
+    }
+    break;
+
+    case CALLBACKOP_unregister:
+    {
+        struct callback_unregister unreg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&unreg, arg, 1) )
+            break;
+
+        ret = unregister_guest_callback(&unreg);
+    }
+    break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+long do_set_callbacks(unsigned long event_address,
+                      unsigned long failsafe_address,
+                      unsigned long syscall_address)
+{
+    struct callback_register event = {
+        .type = CALLBACKTYPE_event,
+        .address = event_address,
+    };
+    struct callback_register failsafe = {
+        .type = CALLBACKTYPE_failsafe,
+        .address = failsafe_address,
+    };
+    struct callback_register syscall = {
+        .type = CALLBACKTYPE_syscall,
+        .address = syscall_address,
+    };
+
+    register_guest_callback(&event);
+    register_guest_callback(&failsafe);
+    register_guest_callback(&syscall);
+
+    return 0;
+}
+
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 4641bc6d06..1a8beb8068 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -23,7 +23,6 @@
 #include <asm/shared.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/support.h>
-#include <public/callback.h>
 
 
 static void print_xen_info(void)
@@ -350,153 +349,6 @@ void init_int80_direct_trap(struct vcpu *v)
         tb->flags = TBF_EXCEPTION | (TI_GET_IF(ti) ? TBF_INTERRUPT : 0);
 }
 
-static long register_guest_callback(struct callback_register *reg)
-{
-    long ret = 0;
-    struct vcpu *v = current;
-
-    if ( !is_canonical_address(reg->address) )
-        return -EINVAL;
-
-    switch ( reg->type )
-    {
-    case CALLBACKTYPE_event:
-        v->arch.pv_vcpu.event_callback_eip    = reg->address;
-        break;
-
-    case CALLBACKTYPE_failsafe:
-        v->arch.pv_vcpu.failsafe_callback_eip = reg->address;
-        if ( reg->flags & CALLBACKF_mask_events )
-            set_bit(_VGCF_failsafe_disables_events,
-                    &v->arch.vgc_flags);
-        else
-            clear_bit(_VGCF_failsafe_disables_events,
-                      &v->arch.vgc_flags);
-        break;
-
-    case CALLBACKTYPE_syscall:
-        v->arch.pv_vcpu.syscall_callback_eip  = reg->address;
-        if ( reg->flags & CALLBACKF_mask_events )
-            set_bit(_VGCF_syscall_disables_events,
-                    &v->arch.vgc_flags);
-        else
-            clear_bit(_VGCF_syscall_disables_events,
-                      &v->arch.vgc_flags);
-        break;
-
-    case CALLBACKTYPE_syscall32:
-        v->arch.pv_vcpu.syscall32_callback_eip = reg->address;
-        v->arch.pv_vcpu.syscall32_disables_events =
-            !!(reg->flags & CALLBACKF_mask_events);
-        break;
-
-    case CALLBACKTYPE_sysenter:
-        v->arch.pv_vcpu.sysenter_callback_eip = reg->address;
-        v->arch.pv_vcpu.sysenter_disables_events =
-            !!(reg->flags & CALLBACKF_mask_events);
-        break;
-
-    case CALLBACKTYPE_nmi:
-        ret = register_guest_nmi_callback(reg->address);
-        break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-static long unregister_guest_callback(struct callback_unregister *unreg)
-{
-    long ret;
-
-    switch ( unreg->type )
-    {
-    case CALLBACKTYPE_event:
-    case CALLBACKTYPE_failsafe:
-    case CALLBACKTYPE_syscall:
-    case CALLBACKTYPE_syscall32:
-    case CALLBACKTYPE_sysenter:
-        ret = -EINVAL;
-        break;
-
-    case CALLBACKTYPE_nmi:
-        ret = unregister_guest_nmi_callback();
-        break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-
-long do_callback_op(int cmd, XEN_GUEST_HANDLE_PARAM(const_void) arg)
-{
-    long ret;
-
-    switch ( cmd )
-    {
-    case CALLBACKOP_register:
-    {
-        struct callback_register reg;
-
-        ret = -EFAULT;
-        if ( copy_from_guest(&reg, arg, 1) )
-            break;
-
-        ret = register_guest_callback(&reg);
-    }
-    break;
-
-    case CALLBACKOP_unregister:
-    {
-        struct callback_unregister unreg;
-
-        ret = -EFAULT;
-        if ( copy_from_guest(&unreg, arg, 1) )
-            break;
-
-        ret = unregister_guest_callback(&unreg);
-    }
-    break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-long do_set_callbacks(unsigned long event_address,
-                      unsigned long failsafe_address,
-                      unsigned long syscall_address)
-{
-    struct callback_register event = {
-        .type = CALLBACKTYPE_event,
-        .address = event_address,
-    };
-    struct callback_register failsafe = {
-        .type = CALLBACKTYPE_failsafe,
-        .address = failsafe_address,
-    };
-    struct callback_register syscall = {
-        .type = CALLBACKTYPE_syscall,
-        .address = syscall_address,
-    };
-
-    register_guest_callback(&event);
-    register_guest_callback(&failsafe);
-    register_guest_callback(&syscall);
-
-    return 0;
-}
-
 static void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
 {
     char *p;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread
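
One detail of the moved code worth noting is the asymmetry between register and unregister: every callback type can be (re)registered, but only the NMI callback can be unregistered — the others are rejected with -EINVAL once set. A toy standalone version of that dispatch shape (the CALLBACKTYPE_* names follow public/callback.h; toy_unregister() is invented for illustration):

```c
#include <assert.h>
#include <errno.h>

enum { CALLBACKTYPE_event, CALLBACKTYPE_failsafe, CALLBACKTYPE_syscall,
       CALLBACKTYPE_syscall32, CALLBACKTYPE_sysenter, CALLBACKTYPE_nmi };

/* Illustrative mirror of unregister_guest_callback(): only the NMI
 * callback can be torn down; everything else stays registered. */
static long toy_unregister(int type)
{
    switch ( type )
    {
    case CALLBACKTYPE_event:
    case CALLBACKTYPE_failsafe:
    case CALLBACKTYPE_syscall:
    case CALLBACKTYPE_syscall32:
    case CALLBACKTYPE_sysenter:
        return -EINVAL;

    case CALLBACKTYPE_nmi:
        return 0;  /* stand-in for unregister_guest_nmi_callback() */

    default:
        return -ENOSYS;
    }
}
```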

* [PATCH v4 16/27] x86/traps: factor out pv_trap_init
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (14 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 15/27] x86: move callback_op code to pv/callback.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:31   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 17/27] x86/traps: move some PV specific functions and struct to pv/traps.c Wei Liu
                   ` (10 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/traps.c           | 22 ++++++++++++++--------
 xen/include/asm-x86/pv/traps.h |  4 ++++
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 8861dfd332..29a83994bd 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1871,14 +1871,8 @@ void __init init_idt_traps(void)
     this_cpu(compat_gdt_table) = boot_cpu_compat_gdt_table;
 }
 
-extern void (*const autogen_entrypoints[NR_VECTORS])(void);
-void __init trap_init(void)
+void __init pv_trap_init(void)
 {
-    unsigned int vector;
-
-    /* Replace early pagefault with real pagefault handler. */
-    set_intr_gate(TRAP_page_fault, &page_fault);
-
     /* The 32-on-64 hypercall vector is only accessible from ring 1. */
     _set_gate(idt_table + HYPERCALL_VECTOR,
               SYS_DESC_trap_gate, 1, entry_int82);
@@ -1886,6 +1880,19 @@ void __init trap_init(void)
     /* Fast trap for int80 (faster than taking the #GP-fixup path). */
     _set_gate(idt_table + 0x80, SYS_DESC_trap_gate, 3, &int80_direct_trap);
 
+    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
+}
+
+extern void (*const autogen_entrypoints[NR_VECTORS])(void);
+void __init trap_init(void)
+{
+    unsigned int vector;
+
+    pv_trap_init();
+
+    /* Replace early pagefault with real pagefault handler. */
+    set_intr_gate(TRAP_page_fault, &page_fault);
+
     for ( vector = 0; vector < NR_VECTORS; ++vector )
     {
         if ( autogen_entrypoints[vector] )
@@ -1905,7 +1912,6 @@ void __init trap_init(void)
 
     cpu_init();
 
-    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
     open_softirq(PCI_SERR_SOFTIRQ, pci_serr_softirq);
 }
 
diff --git a/xen/include/asm-x86/pv/traps.h b/xen/include/asm-x86/pv/traps.h
index a4af69e486..426c8f6216 100644
--- a/xen/include/asm-x86/pv/traps.h
+++ b/xen/include/asm-x86/pv/traps.h
@@ -25,6 +25,8 @@
 
 #include <public/xen.h>
 
+void pv_trap_init(void);
+
 int pv_emulate_privileged_op(struct cpu_user_regs *regs);
 void pv_emulate_gate_op(struct cpu_user_regs *regs);
 int pv_emulate_invalid_rdtscp(struct cpu_user_regs *regs);
@@ -32,6 +34,8 @@ int pv_emulate_forced_invalid_op(struct cpu_user_regs *regs);
 
 #else  /* !CONFIG_PV */
 
+void pv_trap_init(void) {}
+
 int pv_emulate_privileged_op(struct cpu_user_regs *regs) { return 0; }
 void pv_emulate_gate_op(struct cpu_user_regs *regs) {}
 int pv_emulate_invalid_rdtscp(struct cpu_user_regs *regs) { return 0; }
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread
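
The two _set_gate() calls moved here differ only in DPL: the 32-on-64 hypercall vector (int82) is reachable from ring 1, while the int80 fast path is reachable from ring 3. As a reminder of where DPL lives, here is a hypothetical helper that builds the present/DPL/type attribute bits of an x86 gate descriptor (standard descriptor layout; the function is for illustration only, not Xen code):

```c
#include <assert.h>
#include <stdint.h>

/* Attribute word of an IDT descriptor: P (bit 15), DPL (bits 13-14),
 * gate type (bits 8-11). A DPL below the caller's CPL makes a software
 * "int" to that vector fault with #GP instead of entering the gate. */
static uint16_t gate_attrs(unsigned int type, unsigned int dpl)
{
    return (uint16_t)(0x8000u | (dpl << 13) | (type << 8));
}
```

With a 64-bit trap gate type of 0xf, DPL 1 and DPL 3 give visibly different attribute words, matching the ring-1 vs ring-3 accessibility described above.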

* [PATCH v4 17/27] x86/traps: move some PV specific functions and struct to pv/traps.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (15 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 16/27] x86/traps: factor out pv_trap_init Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:36   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 18/27] x86/traps: move init_int80_direct_trap " Wei Liu
                   ` (9 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

These functions reference one another, so they need to be moved at the
same time. Also move struct softirq_trap, because it is only used in
this one place.

Fix some coding style issues while moving.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c     | 105 ++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c        |  93 ---------------------------------------
 xen/include/asm-x86/traps.h |   6 ---
 3 files changed, 105 insertions(+), 99 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index be215df57a..0c1600d886 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -237,6 +237,111 @@ bool guest_has_trap_callback(const struct domain *d, unsigned int vcpuid,
     return t->address;
 }
 
+struct softirq_trap {
+    struct domain *domain;  /* domain to inject trap */
+    struct vcpu *vcpu;      /* vcpu to inject trap */
+    int processor;          /* physical cpu to inject trap */
+};
+static DEFINE_PER_CPU(struct softirq_trap, softirq_trap);
+
+static void nmi_mce_softirq(void)
+{
+    int cpu = smp_processor_id();
+    struct softirq_trap *st = &per_cpu(softirq_trap, cpu);
+
+    BUG_ON(st->vcpu == NULL);
+
+    /*
+     * Set the tmp value unconditionally, so that
+     * the check in the iret hypercall works.
+     */
+    cpumask_copy(st->vcpu->cpu_hard_affinity_tmp,
+                 st->vcpu->cpu_hard_affinity);
+
+    if ( (cpu != st->processor) ||
+         (st->processor != st->vcpu->processor) )
+    {
+        /*
+         * We are on a different physical cpu.
+         * Make sure to wakeup the vcpu on the
+         * specified processor.
+         */
+        vcpu_set_hard_affinity(st->vcpu, cpumask_of(st->processor));
+
+        /* Affinity is restored in the iret hypercall. */
+    }
+
+    /*
+     * Only used to defer wakeup of domain/vcpu to
+     * a safe (non-NMI/MCE) context.
+     */
+    vcpu_kick(st->vcpu);
+    st->vcpu = NULL;
+}
+
+void __init pv_trap_init(void)
+{
+    /* The 32-on-64 hypercall vector is only accessible from ring 1. */
+    _set_gate(idt_table + HYPERCALL_VECTOR,
+              SYS_DESC_trap_gate, 1, entry_int82);
+
+    /* Fast trap for int80 (faster than taking the #GP-fixup path). */
+    _set_gate(idt_table + 0x80, SYS_DESC_trap_gate, 3, &int80_direct_trap);
+
+    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
+}
+
+int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
+{
+    struct vcpu *v;
+    struct softirq_trap *st = &per_cpu(softirq_trap, smp_processor_id());
+
+    BUG_ON(d == NULL);
+    BUG_ON(vcpuid >= d->max_vcpus);
+    v = d->vcpu[vcpuid];
+
+    switch ( trap_nr )
+    {
+    case TRAP_nmi:
+        if ( cmpxchgptr(&st->vcpu, NULL, v) )
+            return -EBUSY;
+        if ( !test_and_set_bool(v->nmi_pending) )
+        {
+            st->domain = d;
+            st->processor = v->processor;
+
+            /* not safe to wake up a vcpu here */
+            raise_softirq(NMI_MCE_SOFTIRQ);
+            return 0;
+        }
+        st->vcpu = NULL;
+        break;
+
+    case TRAP_machine_check:
+        if ( cmpxchgptr(&st->vcpu, NULL, v) )
+            return -EBUSY;
+
+        /*
+         * We are called by the machine check (exception or polling) handlers
+         * on the physical CPU that reported a machine check error.
+         */
+        if ( !test_and_set_bool(v->mce_pending) )
+        {
+            st->domain = d;
+            st->processor = v->processor;
+
+            /* not safe to wake up a vcpu here */
+            raise_softirq(NMI_MCE_SOFTIRQ);
+            return 0;
+        }
+        st->vcpu = NULL;
+        break;
+    }
+
+    /* delivery failed */
+    return -EIO;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 29a83994bd..287503cd56 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1477,39 +1477,6 @@ void do_general_protection(struct cpu_user_regs *regs)
     panic("GENERAL PROTECTION FAULT\n[error_code=%04x]", regs->error_code);
 }
 
-static DEFINE_PER_CPU(struct softirq_trap, softirq_trap);
-
-static void nmi_mce_softirq(void)
-{
-    int cpu = smp_processor_id();
-    struct softirq_trap *st = &per_cpu(softirq_trap, cpu);
-
-    BUG_ON(st->vcpu == NULL);
-
-    /* Set the tmp value unconditionally, so that
-     * the check in the iret hypercall works. */
-    cpumask_copy(st->vcpu->cpu_hard_affinity_tmp,
-                 st->vcpu->cpu_hard_affinity);
-
-    if ((cpu != st->processor)
-       || (st->processor != st->vcpu->processor))
-    {
-        /* We are on a different physical cpu.
-         * Make sure to wakeup the vcpu on the
-         * specified processor.
-         */
-        vcpu_set_hard_affinity(st->vcpu, cpumask_of(st->processor));
-
-        /* Affinity is restored in the iret hypercall. */
-    }
-
-    /* Only used to defer wakeup of domain/vcpu to
-     * a safe (non-NMI/MCE) context.
-     */
-    vcpu_kick(st->vcpu);
-    st->vcpu = NULL;
-}
-
 static void pci_serr_softirq(void)
 {
     printk("\n\nNMI - PCI system error (SERR)\n");
@@ -1871,18 +1838,6 @@ void __init init_idt_traps(void)
     this_cpu(compat_gdt_table) = boot_cpu_compat_gdt_table;
 }
 
-void __init pv_trap_init(void)
-{
-    /* The 32-on-64 hypercall vector is only accessible from ring 1. */
-    _set_gate(idt_table + HYPERCALL_VECTOR,
-              SYS_DESC_trap_gate, 1, entry_int82);
-
-    /* Fast trap for int80 (faster than taking the #GP-fixup path). */
-    _set_gate(idt_table + 0x80, SYS_DESC_trap_gate, 3, &int80_direct_trap);
-
-    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
-}
-
 extern void (*const autogen_entrypoints[NR_VECTORS])(void);
 void __init trap_init(void)
 {
@@ -1915,54 +1870,6 @@ void __init trap_init(void)
     open_softirq(PCI_SERR_SOFTIRQ, pci_serr_softirq);
 }
 
-int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
-{
-    struct vcpu *v;
-    struct softirq_trap *st = &per_cpu(softirq_trap, smp_processor_id());
-
-    BUG_ON(d == NULL);
-    BUG_ON(vcpuid >= d->max_vcpus);
-    v = d->vcpu[vcpuid];
-
-    switch (trap_nr) {
-    case TRAP_nmi:
-        if ( cmpxchgptr(&st->vcpu, NULL, v) )
-            return -EBUSY;
-        if ( !test_and_set_bool(v->nmi_pending) ) {
-               st->domain = d;
-               st->processor = v->processor;
-
-               /* not safe to wake up a vcpu here */
-               raise_softirq(NMI_MCE_SOFTIRQ);
-               return 0;
-        }
-        st->vcpu = NULL;
-        break;
-
-    case TRAP_machine_check:
-        if ( cmpxchgptr(&st->vcpu, NULL, v) )
-            return -EBUSY;
-
-        /* We are called by the machine check (exception or polling) handlers
-         * on the physical CPU that reported a machine check error. */
-
-        if ( !test_and_set_bool(v->mce_pending) ) {
-                st->domain = d;
-                st->processor = v->processor;
-
-                /* not safe to wake up a vcpu here */
-                raise_softirq(NMI_MCE_SOFTIRQ);
-                return 0;
-        }
-        st->vcpu = NULL;
-        break;
-    }
-
-    /* delivery failed */
-    return -EIO;
-}
-
-
 void activate_debugregs(const struct vcpu *curr)
 {
     ASSERT(curr == current);
diff --git a/xen/include/asm-x86/traps.h b/xen/include/asm-x86/traps.h
index 26625ce5a6..8cf6105d8d 100644
--- a/xen/include/asm-x86/traps.h
+++ b/xen/include/asm-x86/traps.h
@@ -19,12 +19,6 @@
 #ifndef ASM_TRAP_H
 #define ASM_TRAP_H
 
-struct softirq_trap {
-	struct domain *domain;  /* domain to inject trap */
-	struct vcpu *vcpu;	/* vcpu to inject trap */
-	int processor;		/* physical cpu to inject trap */
-};
-
 struct cpu_user_regs;
 
 void async_exception_cleanup(struct vcpu *);
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v4 18/27] x86/traps: move init_int80_direct_trap to pv/traps.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (16 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 17/27] x86/traps: move some PV specific functions and struct to pv/traps.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:37   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c Wei Liu
                   ` (8 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/pv/traps.c     | 14 ++++++++++++++
 xen/arch/x86/x86_64/traps.c | 14 --------------
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index 0c1600d886..f2556c7e4a 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -342,6 +342,20 @@ int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
     return -EIO;
 }
 
+void init_int80_direct_trap(struct vcpu *v)
+{
+    struct trap_info *ti = &v->arch.pv_vcpu.trap_ctxt[0x80];
+    struct trap_bounce *tb = &v->arch.pv_vcpu.int80_bounce;
+
+    tb->cs    = ti->cs;
+    tb->eip   = ti->address;
+
+    if ( null_trap_bounce(v, tb) )
+        tb->flags = 0;
+    else
+        tb->flags = TBF_EXCEPTION | (TI_GET_IF(ti) ? TBF_INTERRUPT : 0);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 1a8beb8068..d15c9023e8 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -335,20 +335,6 @@ void subarch_percpu_traps_init(void)
     wrmsrl(MSR_SYSCALL_MASK, XEN_SYSCALL_MASK);
 }
 
-void init_int80_direct_trap(struct vcpu *v)
-{
-    struct trap_info *ti = &v->arch.pv_vcpu.trap_ctxt[0x80];
-    struct trap_bounce *tb = &v->arch.pv_vcpu.int80_bounce;
-
-    tb->cs    = ti->cs;
-    tb->eip   = ti->address;
-
-    if ( null_trap_bounce(v, tb) )
-        tb->flags = 0;
-    else
-        tb->flags = TBF_EXCEPTION | (TI_GET_IF(ti) ? TBF_INTERRUPT : 0);
-}
-
 static void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
 {
     char *p;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread
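
For context on the moved function: the bounce flags collapse to zero when the guest has no int80 handler installed (null_trap_bounce()), and otherwise combine "deliver as exception" with an optional "mask events" bit taken from the trap_info IF flag. A hypothetical standalone mirror of that computation (the TBF_* values here are assumptions for illustration, not taken from the Xen headers):

```c
#include <assert.h>

#define TBF_EXCEPTION 1u  /* assumed value, for illustration */
#define TBF_INTERRUPT 8u  /* assumed value, for illustration */

/* Mirrors the flag selection in init_int80_direct_trap(): no handler
 * means no bounce; a handler bounces as an exception, masking event
 * delivery iff the guest set the TI interrupt flag. */
static unsigned int bounce_flags(int null_bounce, int ti_if)
{
    if ( null_bounce )
        return 0;

    return TBF_EXCEPTION | (ti_if ? TBF_INTERRUPT : 0);
}
```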

* [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (17 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 18/27] x86/traps: move init_int80_direct_trap " Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:41   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel Wei Liu
                   ` (7 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/hypercall.c     | 36 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/traps.c     | 36 ------------------------------------
 xen/include/asm-x86/hypercall.h |  1 +
 3 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 7c5e5a629d..287340e774 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -255,6 +255,42 @@ enum mc_disposition arch_do_multicall_call(struct mc_state *state)
              ? mc_continue : mc_preempt;
 }
 
+void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
+{
+    char *p;
+    int i;
+
+    /* Fill in all the transfer points with template machine code. */
+    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
+    {
+        if ( i == __HYPERVISOR_iret )
+            continue;
+
+        p = (char *)(hypercall_page + (i * 32));
+        *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
+        *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
+        *(u8  *)(p+ 3) = 0xb8;    /* mov  $<i>,%eax */
+        *(u32 *)(p+ 4) = i;
+        *(u16 *)(p+ 8) = 0x050f;  /* syscall */
+        *(u16 *)(p+10) = 0x5b41;  /* pop  %r11 */
+        *(u8  *)(p+12) = 0x59;    /* pop  %rcx */
+        *(u8  *)(p+13) = 0xc3;    /* ret */
+    }
+
+    /*
+     * HYPERVISOR_iret is special because it doesn't return and expects a
+     * special stack frame. Guests jump at this transfer point instead of
+     * calling it.
+     */
+    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
+    *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
+    *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
+    *(u8  *)(p+ 3) = 0x50;    /* push %rax */
+    *(u8  *)(p+ 4) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
+    *(u32 *)(p+ 5) = __HYPERVISOR_iret;
+    *(u16 *)(p+ 9) = 0x050f;  /* syscall */
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index d15c9023e8..79bfc4d3f0 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -335,42 +335,6 @@ void subarch_percpu_traps_init(void)
     wrmsrl(MSR_SYSCALL_MASK, XEN_SYSCALL_MASK);
 }
 
-static void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
-{
-    char *p;
-    int i;
-
-    /* Fill in all the transfer points with template machine code. */
-    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
-    {
-        if ( i == __HYPERVISOR_iret )
-            continue;
-
-        p = (char *)(hypercall_page + (i * 32));
-        *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
-        *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
-        *(u8  *)(p+ 3) = 0xb8;    /* mov  $<i>,%eax */
-        *(u32 *)(p+ 4) = i;
-        *(u16 *)(p+ 8) = 0x050f;  /* syscall */
-        *(u16 *)(p+10) = 0x5b41;  /* pop  %r11 */
-        *(u8  *)(p+12) = 0x59;    /* pop  %rcx */
-        *(u8  *)(p+13) = 0xc3;    /* ret */
-    }
-
-    /*
-     * HYPERVISOR_iret is special because it doesn't return and expects a
-     * special stack frame. Guests jump at this transfer point instead of
-     * calling it.
-     */
-    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
-    *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
-    *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
-    *(u8  *)(p+ 3) = 0x50;    /* push %rax */
-    *(u8  *)(p+ 4) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
-    *(u32 *)(p+ 5) = __HYPERVISOR_iret;
-    *(u16 *)(p+ 9) = 0x050f;  /* syscall */
-}
-
 #include "compat/traps.c"
 
 void hypercall_page_initialise(struct domain *d, void *hypercall_page)
diff --git a/xen/include/asm-x86/hypercall.h b/xen/include/asm-x86/hypercall.h
index cfbcefe52f..5631cf2694 100644
--- a/xen/include/asm-x86/hypercall.h
+++ b/xen/include/asm-x86/hypercall.h
@@ -26,6 +26,7 @@ typedef struct {
 extern const hypercall_args_t hypercall_args_table[NR_hypercalls];
 
 void pv_hypercall(struct cpu_user_regs *regs);
+void hypercall_page_initialise_ring3_kernel(void *hypercall_page);
 
 /*
  * Both do_mmuext_op() and do_mmu_update():
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 72+ messages in thread
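
For readers decoding the template in the moved function: each 32-byte slot receives the byte sequence for push %rcx; push %r11; mov $i,%eax; syscall; pop %r11; pop %rcx; ret, with the u16 stores landing little-endian (e.g. 0x5341 becomes bytes 41 53, "push %r11"). A hypothetical standalone re-encoding of the same bytes, written byte-by-byte to make the instruction boundaries explicit:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same 14 bytes hypercall_page_initialise_ring3_kernel() writes for
 * hypercall number i, spelled out one byte at a time. */
static void write_stub(uint8_t *p, uint32_t i)
{
    p[0]  = 0x51;                        /* push %rcx        */
    p[1]  = 0x41; p[2] = 0x53;           /* push %r11        */
    p[3]  = 0xb8; memcpy(p + 4, &i, 4);  /* mov  $i,%eax     */
    p[8]  = 0x0f; p[9] = 0x05;           /* syscall          */
    p[10] = 0x41; p[11] = 0x5b;          /* pop  %r11        */
    p[12] = 0x59;                        /* pop  %rcx        */
    p[13] = 0xc3;                        /* ret              */
}

/* Helper (for checking only): return byte <off> of the stub for i. */
static uint8_t stub_byte(uint32_t i, unsigned int off)
{
    uint8_t buf[32] = { 0 };

    write_stub(buf, i);
    return buf[off];
}
```

The %rcx/%r11 save/restore exists because the syscall instruction itself clobbers both registers (return RIP and RFLAGS respectively).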

* [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (18 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:41   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 21/27] x86: move compat_set_trap_table along side the non-compat variant Wei Liu
                   ` (6 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/hypercall.c        | 31 +++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/compat/traps.c | 31 -------------------------------
 xen/include/asm-x86/hypercall.h    |  1 +
 3 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index 287340e774..58dc0f5c32 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -291,6 +291,37 @@ void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
     *(u16 *)(p+ 9) = 0x050f;  /* syscall */
 }
 
+void hypercall_page_initialise_ring1_kernel(void *hypercall_page)
+{
+    char *p;
+    int i;
+
+    /* Fill in all the transfer points with template machine code. */
+
+    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
+    {
+        if ( i == __HYPERVISOR_iret )
+            continue;
+
+        p = (char *)(hypercall_page + (i * 32));
+        *(u8  *)(p+ 0) = 0xb8;    /* mov  $<i>,%eax */
+        *(u32 *)(p+ 1) = i;
+        *(u16 *)(p+ 5) = (HYPERCALL_VECTOR << 8) | 0xcd; /* int  $xx */
+        *(u8  *)(p+ 7) = 0xc3;    /* ret */
+    }
+
+    /*
+     * HYPERVISOR_iret is special because it doesn't return and expects a
+     * special stack frame. Guests jump at this transfer point instead of
+     * calling it.
+     */
+    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
+    *(u8  *)(p+ 0) = 0x50;    /* push %eax */
+    *(u8  *)(p+ 1) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
+    *(u32 *)(p+ 2) = __HYPERVISOR_iret;
+    *(u16 *)(p+ 6) = (HYPERCALL_VECTOR << 8) | 0xcd; /* int  $xx */
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/compat/traps.c b/xen/arch/x86/x86_64/compat/traps.c
index 1751ec67e8..f485299c88 100644
--- a/xen/arch/x86/x86_64/compat/traps.c
+++ b/xen/arch/x86/x86_64/compat/traps.c
@@ -374,37 +374,6 @@ int compat_set_trap_table(XEN_GUEST_HANDLE(trap_info_compat_t) traps)
     return rc;
 }
 
-static void hypercall_page_initialise_ring1_kernel(void *hypercall_page)
-{
-    char *p;
-    int i;
-
-    /* Fill in all the transfer points with template machine code. */
-
-    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
-    {
-        if ( i == __HYPERVISOR_iret )
-            continue;
-
-        p = (char *)(hypercall_page + (i * 32));
-        *(u8  *)(p+ 0) = 0xb8;    /* mov  $<i>,%eax */
-        *(u32 *)(p+ 1) = i;
-        *(u16 *)(p+ 5) = (HYPERCALL_VECTOR << 8) | 0xcd; /* int  $xx */
-        *(u8  *)(p+ 7) = 0xc3;    /* ret */
-    }
-
-    /*
-     * HYPERVISOR_iret is special because it doesn't return and expects a
-     * special stack frame. Guests jump at this transfer point instead of
-     * calling it.
-     */
-    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
-    *(u8  *)(p+ 0) = 0x50;    /* push %eax */
-    *(u8  *)(p+ 1) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
-    *(u32 *)(p+ 2) = __HYPERVISOR_iret;
-    *(u16 *)(p+ 6) = (HYPERCALL_VECTOR << 8) | 0xcd; /* int  $xx */
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/hypercall.h b/xen/include/asm-x86/hypercall.h
index 5631cf2694..3eb4a8db89 100644
--- a/xen/include/asm-x86/hypercall.h
+++ b/xen/include/asm-x86/hypercall.h
@@ -27,6 +27,7 @@ extern const hypercall_args_t hypercall_args_table[NR_hypercalls];
 
 void pv_hypercall(struct cpu_user_regs *regs);
 void hypercall_page_initialise_ring3_kernel(void *hypercall_page);
+void hypercall_page_initialise_ring1_kernel(void *hypercall_page);
 
 /*
  * Both do_mmuext_op() and do_mmu_update():
-- 
2.11.0

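The ring1 stub template moved by this patch can be reproduced outside Xen to make the byte layout concrete. The sketch below is illustrative, not Xen source; `HYPERCALL_VECTOR` (0x82) and `__HYPERVISOR_iret` (23) are values assumed from Xen's public ABI, not taken from this patch.

```python
# Sketch (not Xen source): rebuild the 32-byte ring1 hypercall stubs that
# hypercall_page_initialise_ring1_kernel() writes, so the encoding is easy
# to inspect.  HYPERCALL_VECTOR (0x82) and __HYPERVISOR_iret (23) are
# assumed values from Xen's public ABI.
import struct

PAGE_SIZE = 4096
HYPERCALL_VECTOR = 0x82   # assumed: Xen's "int $0x82" hypercall gate
HYPERVISOR_IRET = 23      # assumed: __HYPERVISOR_iret's hypercall number

def ring1_stub(i):
    """Ordinary transfer point: mov $<i>,%eax; int $0x82; ret."""
    code = b"\xb8" + struct.pack("<I", i)        # mov  $<i>,%eax
    code += bytes([0xcd, HYPERCALL_VECTOR])      # int  $0x82
    code += b"\xc3"                              # ret
    return code.ljust(32, b"\x00")               # each stub is 32 bytes

def ring1_iret_stub():
    """HYPERVISOR_iret: push %eax first; no ret, since it never returns."""
    code = b"\x50"                               # push %eax
    code += b"\xb8" + struct.pack("<I", HYPERVISOR_IRET)
    code += bytes([0xcd, HYPERCALL_VECTOR])      # int  $0x82
    return code.ljust(32, b"\x00")

page = b"".join(ring1_iret_stub() if i == HYPERVISOR_IRET else ring1_stub(i)
                for i in range(PAGE_SIZE // 32))
```

On little-endian x86 the u16 store `(HYPERCALL_VECTOR << 8) | 0xcd` lands in memory as the byte pair `0xcd 0x82`, which is why the sketch emits the opcode byte before the vector.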


* [PATCH v4 21/27] x86: move compat_set_trap_table alongside the non-compat variant
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (19 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:43   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 22/27] x86: move compat_iret alongside its " Wei Liu
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/traps.c            | 45 ++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/compat/traps.c | 45 --------------------------------------
 2 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index f2556c7e4a..3dcb3f1877 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -87,6 +87,51 @@ long do_set_trap_table(XEN_GUEST_HANDLE_PARAM(const_trap_info_t) traps)
     return rc;
 }
 
+int compat_set_trap_table(XEN_GUEST_HANDLE(trap_info_compat_t) traps)
+{
+    struct compat_trap_info cur;
+    struct trap_info *dst = current->arch.pv_vcpu.trap_ctxt;
+    long rc = 0;
+
+    /* If no table is presented then clear the entire virtual IDT. */
+    if ( guest_handle_is_null(traps) )
+    {
+        memset(dst, 0, NR_VECTORS * sizeof(*dst));
+        init_int80_direct_trap(current);
+        return 0;
+    }
+
+    for ( ; ; )
+    {
+        if ( copy_from_guest(&cur, traps, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        if ( cur.address == 0 )
+            break;
+
+        fixup_guest_code_selector(current->domain, cur.cs);
+
+        XLAT_trap_info(dst + cur.vector, &cur);
+
+        if ( cur.vector == 0x80 )
+            init_int80_direct_trap(current);
+
+        guest_handle_add_offset(traps, 1);
+
+        if ( hypercall_preempt_check() )
+        {
+            rc = hypercall_create_continuation(
+                __HYPERVISOR_set_trap_table, "h", traps);
+            break;
+        }
+    }
+
+    return rc;
+}
+
 void pv_inject_event(const struct x86_event *event)
 {
     struct vcpu *curr = current;
diff --git a/xen/arch/x86/x86_64/compat/traps.c b/xen/arch/x86/x86_64/compat/traps.c
index f485299c88..add4af3403 100644
--- a/xen/arch/x86/x86_64/compat/traps.c
+++ b/xen/arch/x86/x86_64/compat/traps.c
@@ -329,51 +329,6 @@ long compat_set_callbacks(unsigned long event_selector,
     return 0;
 }
 
-int compat_set_trap_table(XEN_GUEST_HANDLE(trap_info_compat_t) traps)
-{
-    struct compat_trap_info cur;
-    struct trap_info *dst = current->arch.pv_vcpu.trap_ctxt;
-    long rc = 0;
-
-    /* If no table is presented then clear the entire virtual IDT. */
-    if ( guest_handle_is_null(traps) )
-    {
-        memset(dst, 0, NR_VECTORS * sizeof(*dst));
-        init_int80_direct_trap(current);
-        return 0;
-    }
-
-    for ( ; ; )
-    {
-        if ( copy_from_guest(&cur, traps, 1) )
-        {
-            rc = -EFAULT;
-            break;
-        }
-
-        if ( cur.address == 0 )
-            break;
-
-        fixup_guest_code_selector(current->domain, cur.cs);
-
-        XLAT_trap_info(dst + cur.vector, &cur);
-
-        if ( cur.vector == 0x80 )
-            init_int80_direct_trap(current);
-
-        guest_handle_add_offset(traps, 1);
-
-        if ( hypercall_preempt_check() )
-        {
-            rc = hypercall_create_continuation(
-                __HYPERVISOR_set_trap_table, "h", traps);
-            break;
-        }
-    }
-
-    return rc;
-}
-
 /*
  * Local variables:
  * mode: C
-- 
2.11.0

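The preempt-and-continue shape of compat_set_trap_table() can be sketched as follows. This is an illustrative model, not a Xen interface: the dict-based entry format, the `preempt_pending` hook and the `CONTINUATION` sentinel are stand-ins for `copy_from_guest()`, `hypercall_preempt_check()` and `hypercall_create_continuation()`.

```python
# Sketch (not Xen code) of the preemptible loop in compat_set_trap_table():
# consume one entry at a time, advance the guest handle after each, and bail
# out to a continuation when preemption is pending.
CONTINUATION = object()   # stands in for hypercall_create_continuation()

def set_trap_table(entries, virtual_idt, preempt_pending=lambda: False):
    """Return (rc, remaining): remaining is what a continuation resumes with."""
    if entries is None:                     # NULL handle clears the virtual IDT
        virtual_idt.clear()
        return 0, None
    while entries:
        cur = entries[0]                    # copy_from_guest(&cur, traps, 1)
        if cur["address"] == 0:             # zero address terminates the table
            break
        virtual_idt[cur["vector"]] = cur    # XLAT_trap_info(dst + cur.vector, ..)
        entries = entries[1:]               # guest_handle_add_offset(traps, 1)
        if preempt_pending():
            return CONTINUATION, entries
    return 0, None
```

Because the handle is advanced before the preemption check, a continuation always restarts at the first unprocessed entry, so no entry is applied twice.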


* [PATCH v4 22/27] x86: move compat_iret alongside its non-compat variant
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (20 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 21/27] x86: move compat_set_trap_table alongside the non-compat variant Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 12:44   ` Andrew Cooper
  2017-06-08 17:11 ` [PATCH v4 23/27] x86: move the compat callback ops next to the " Wei Liu
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/iret.c             | 120 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/compat/traps.c | 120 -------------------------------------
 2 files changed, 120 insertions(+), 120 deletions(-)

diff --git a/xen/arch/x86/pv/iret.c b/xen/arch/x86/pv/iret.c
index 358ae7cf08..013e619b3f 100644
--- a/xen/arch/x86/pv/iret.c
+++ b/xen/arch/x86/pv/iret.c
@@ -61,6 +61,126 @@ unsigned long do_iret(void)
     return 0;
 }
 
+unsigned int compat_iret(void)
+{
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    struct vcpu *v = current;
+    u32 eflags;
+
+    /* Trim stack pointer to 32 bits. */
+    regs->rsp = (u32)regs->rsp;
+
+    /* Restore EAX (clobbered by hypercall). */
+    if ( unlikely(__get_user(regs->eax, (u32 *)regs->rsp)) )
+    {
+        domain_crash(v->domain);
+        return 0;
+    }
+
+    /* Restore CS and EIP. */
+    if ( unlikely(__get_user(regs->eip, (u32 *)regs->rsp + 1)) ||
+        unlikely(__get_user(regs->cs, (u32 *)regs->rsp + 2)) )
+    {
+        domain_crash(v->domain);
+        return 0;
+    }
+
+    /*
+     * Fix up and restore EFLAGS. We fix up in a local staging area
+     * to avoid firing the BUG_ON(IOPL) check in arch_get_info_guest.
+     */
+    if ( unlikely(__get_user(eflags, (u32 *)regs->rsp + 3)) )
+    {
+        domain_crash(v->domain);
+        return 0;
+    }
+
+    if ( VM_ASSIST(v->domain, architectural_iopl) )
+        v->arch.pv_vcpu.iopl = eflags & X86_EFLAGS_IOPL;
+
+    regs->eflags = (eflags & ~X86_EFLAGS_IOPL) | X86_EFLAGS_IF;
+
+    if ( unlikely(eflags & X86_EFLAGS_VM) )
+    {
+        /*
+         * Cannot return to VM86 mode: inject a GP fault instead. Note that
+         * the GP fault is reported on the first VM86 mode instruction, not on
+         * the IRET (which is why we can simply leave the stack frame as-is
+         * (except for perhaps having to copy it), which in turn seems better
+         * than teaching create_bounce_frame() to needlessly deal with vm86
+         * mode frames).
+         */
+        const struct trap_info *ti;
+        u32 x, ksp = v->arch.pv_vcpu.kernel_sp - 40;
+        unsigned int i;
+        int rc = 0;
+
+        gdprintk(XENLOG_ERR, "VM86 mode unavailable (ksp:%08X->%08X)\n",
+                 regs->esp, ksp);
+        if ( ksp < regs->esp )
+        {
+            for (i = 1; i < 10; ++i)
+            {
+                rc |= __get_user(x, (u32 *)regs->rsp + i);
+                rc |= __put_user(x, (u32 *)(unsigned long)ksp + i);
+            }
+        }
+        else if ( ksp > regs->esp )
+        {
+            for ( i = 9; i > 0; --i )
+            {
+                rc |= __get_user(x, (u32 *)regs->rsp + i);
+                rc |= __put_user(x, (u32 *)(unsigned long)ksp + i);
+            }
+        }
+        if ( rc )
+        {
+            domain_crash(v->domain);
+            return 0;
+        }
+        regs->esp = ksp;
+        regs->ss = v->arch.pv_vcpu.kernel_ss;
+
+        ti = &v->arch.pv_vcpu.trap_ctxt[TRAP_gp_fault];
+        if ( TI_GET_IF(ti) )
+            eflags &= ~X86_EFLAGS_IF;
+        regs->eflags &= ~(X86_EFLAGS_VM|X86_EFLAGS_RF|
+                          X86_EFLAGS_NT|X86_EFLAGS_TF);
+        if ( unlikely(__put_user(0, (u32 *)regs->rsp)) )
+        {
+            domain_crash(v->domain);
+            return 0;
+        }
+        regs->eip = ti->address;
+        regs->cs = ti->cs;
+    }
+    else if ( unlikely(ring_0(regs)) )
+    {
+        domain_crash(v->domain);
+        return 0;
+    }
+    else if ( ring_1(regs) )
+        regs->esp += 16;
+    /* Return to ring 2/3: restore ESP and SS. */
+    else if ( __get_user(regs->ss, (u32 *)regs->rsp + 5) ||
+              __get_user(regs->esp, (u32 *)regs->rsp + 4) )
+    {
+        domain_crash(v->domain);
+        return 0;
+    }
+
+    /* Restore upcall mask from supplied EFLAGS.IF. */
+    vcpu_info(v, evtchn_upcall_mask) = !(eflags & X86_EFLAGS_IF);
+
+    async_exception_cleanup(v);
+
+    /*
+     * The hypercall exit path will overwrite EAX with this return
+     * value.
+     */
+    return regs->eax;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/x86_64/compat/traps.c b/xen/arch/x86/x86_64/compat/traps.c
index add4af3403..df691f0ae3 100644
--- a/xen/arch/x86/x86_64/compat/traps.c
+++ b/xen/arch/x86/x86_64/compat/traps.c
@@ -66,126 +66,6 @@ void compat_show_guest_stack(struct vcpu *v, const struct cpu_user_regs *regs,
     printk("\n");
 }
 
-unsigned int compat_iret(void)
-{
-    struct cpu_user_regs *regs = guest_cpu_user_regs();
-    struct vcpu *v = current;
-    u32 eflags;
-
-    /* Trim stack pointer to 32 bits. */
-    regs->rsp = (u32)regs->rsp;
-
-    /* Restore EAX (clobbered by hypercall). */
-    if ( unlikely(__get_user(regs->eax, (u32 *)regs->rsp)) )
-    {
-        domain_crash(v->domain);
-        return 0;
-    }
-
-    /* Restore CS and EIP. */
-    if ( unlikely(__get_user(regs->eip, (u32 *)regs->rsp + 1)) ||
-        unlikely(__get_user(regs->cs, (u32 *)regs->rsp + 2)) )
-    {
-        domain_crash(v->domain);
-        return 0;
-    }
-
-    /*
-     * Fix up and restore EFLAGS. We fix up in a local staging area
-     * to avoid firing the BUG_ON(IOPL) check in arch_get_info_guest.
-     */
-    if ( unlikely(__get_user(eflags, (u32 *)regs->rsp + 3)) )
-    {
-        domain_crash(v->domain);
-        return 0;
-    }
-
-    if ( VM_ASSIST(v->domain, architectural_iopl) )
-        v->arch.pv_vcpu.iopl = eflags & X86_EFLAGS_IOPL;
-
-    regs->eflags = (eflags & ~X86_EFLAGS_IOPL) | X86_EFLAGS_IF;
-
-    if ( unlikely(eflags & X86_EFLAGS_VM) )
-    {
-        /*
-         * Cannot return to VM86 mode: inject a GP fault instead. Note that
-         * the GP fault is reported on the first VM86 mode instruction, not on
-         * the IRET (which is why we can simply leave the stack frame as-is
-         * (except for perhaps having to copy it), which in turn seems better
-         * than teaching create_bounce_frame() to needlessly deal with vm86
-         * mode frames).
-         */
-        const struct trap_info *ti;
-        u32 x, ksp = v->arch.pv_vcpu.kernel_sp - 40;
-        unsigned int i;
-        int rc = 0;
-
-        gdprintk(XENLOG_ERR, "VM86 mode unavailable (ksp:%08X->%08X)\n",
-                 regs->esp, ksp);
-        if ( ksp < regs->esp )
-        {
-            for (i = 1; i < 10; ++i)
-            {
-                rc |= __get_user(x, (u32 *)regs->rsp + i);
-                rc |= __put_user(x, (u32 *)(unsigned long)ksp + i);
-            }
-        }
-        else if ( ksp > regs->esp )
-        {
-            for ( i = 9; i > 0; --i )
-            {
-                rc |= __get_user(x, (u32 *)regs->rsp + i);
-                rc |= __put_user(x, (u32 *)(unsigned long)ksp + i);
-            }
-        }
-        if ( rc )
-        {
-            domain_crash(v->domain);
-            return 0;
-        }
-        regs->esp = ksp;
-        regs->ss = v->arch.pv_vcpu.kernel_ss;
-
-        ti = &v->arch.pv_vcpu.trap_ctxt[TRAP_gp_fault];
-        if ( TI_GET_IF(ti) )
-            eflags &= ~X86_EFLAGS_IF;
-        regs->eflags &= ~(X86_EFLAGS_VM|X86_EFLAGS_RF|
-                          X86_EFLAGS_NT|X86_EFLAGS_TF);
-        if ( unlikely(__put_user(0, (u32 *)regs->rsp)) )
-        {
-            domain_crash(v->domain);
-            return 0;
-        }
-        regs->eip = ti->address;
-        regs->cs = ti->cs;
-    }
-    else if ( unlikely(ring_0(regs)) )
-    {
-        domain_crash(v->domain);
-        return 0;
-    }
-    else if ( ring_1(regs) )
-        regs->esp += 16;
-    /* Return to ring 2/3: restore ESP and SS. */
-    else if ( __get_user(regs->ss, (u32 *)regs->rsp + 5) ||
-              __get_user(regs->esp, (u32 *)regs->rsp + 4) )
-    {
-        domain_crash(v->domain);
-        return 0;
-    }
-
-    /* Restore upcall mask from supplied EFLAGS.IF. */
-    vcpu_info(v, evtchn_upcall_mask) = !(eflags & X86_EFLAGS_IF);
-
-    async_exception_cleanup(v);
-
-    /*
-     * The hypercall exit path will overwrite EAX with this return
-     * value.
-     */
-    return regs->eax;
-}
-
 static long compat_register_guest_callback(
     struct compat_callback_register *reg)
 {
-- 
2.11.0

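The guest stack frame compat_iret() consumes can be summarised by its u32 word offsets, which are read straight from the `__get_user()` calls in the moved code. The parser below is an illustrative sketch, not a Xen interface; `X86_EFLAGS_IF` (0x200) is the architectural IF bit.

```python
# Sketch: the 32-bit frame compat_iret() pops, as u32 word offsets from the
# (trimmed) guest %esp.  Offsets mirror the __get_user() calls in the patch.
X86_EFLAGS_IF = 0x00000200

FRAME = ("eax", "eip", "cs", "eflags", "esp", "ss")   # words 0..5

def parse_compat_iret_frame(stack_words):
    """Map the first six u32s on the guest stack to register names."""
    if len(stack_words) < len(FRAME):
        raise ValueError("truncated frame")  # real code would domain_crash()
    regs = dict(zip(FRAME, stack_words))
    # The upcall mask is restored from the supplied EFLAGS.IF, inverted.
    regs["evtchn_upcall_mask"] = not (regs["eflags"] & X86_EFLAGS_IF)
    return regs
```

In the real code the `esp`/`ss` words are only fetched on a return to ring 2/3 (a ring 1 return just skips them with `regs->esp += 16`); the sketch reads all six unconditionally for clarity.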


* [PATCH v4 23/27] x86: move the compat callback ops next to the non-compat variant
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (21 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 22/27] x86: move compat_iret alongside its " Wei Liu
@ 2017-06-08 17:11 ` Wei Liu
  2017-06-23 13:40   ` Jan Beulich
  2017-06-08 17:12 ` [PATCH v4 24/27] x86: move compat_show_guest_stack near its " Wei Liu
                   ` (3 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:11 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/callback.c         | 142 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/compat/traps.c | 143 -------------------------------------
 2 files changed, 142 insertions(+), 143 deletions(-)

diff --git a/xen/arch/x86/pv/callback.c b/xen/arch/x86/pv/callback.c
index dbd602c89d..00981d0f47 100644
--- a/xen/arch/x86/pv/callback.c
+++ b/xen/arch/x86/pv/callback.c
@@ -1,6 +1,7 @@
 #include <xen/guest_access.h>
 #include <xen/lib.h>
 #include <xen/sched.h>
+#include <compat/callback.h>
 
 #include <asm/current.h>
 #include <asm/nmi.h>
@@ -155,3 +156,144 @@ long do_set_callbacks(unsigned long event_address,
     return 0;
 }
 
+static long compat_register_guest_callback(struct compat_callback_register *reg)
+{
+    long ret = 0;
+    struct vcpu *v = current;
+
+    fixup_guest_code_selector(v->domain, reg->address.cs);
+
+    switch ( reg->type )
+    {
+    case CALLBACKTYPE_event:
+        v->arch.pv_vcpu.event_callback_cs     = reg->address.cs;
+        v->arch.pv_vcpu.event_callback_eip    = reg->address.eip;
+        break;
+
+    case CALLBACKTYPE_failsafe:
+        v->arch.pv_vcpu.failsafe_callback_cs  = reg->address.cs;
+        v->arch.pv_vcpu.failsafe_callback_eip = reg->address.eip;
+        if ( reg->flags & CALLBACKF_mask_events )
+            set_bit(_VGCF_failsafe_disables_events,
+                    &v->arch.vgc_flags);
+        else
+            clear_bit(_VGCF_failsafe_disables_events,
+                      &v->arch.vgc_flags);
+        break;
+
+    case CALLBACKTYPE_syscall32:
+        v->arch.pv_vcpu.syscall32_callback_cs     = reg->address.cs;
+        v->arch.pv_vcpu.syscall32_callback_eip    = reg->address.eip;
+        v->arch.pv_vcpu.syscall32_disables_events =
+            (reg->flags & CALLBACKF_mask_events) != 0;
+        break;
+
+    case CALLBACKTYPE_sysenter:
+        v->arch.pv_vcpu.sysenter_callback_cs     = reg->address.cs;
+        v->arch.pv_vcpu.sysenter_callback_eip    = reg->address.eip;
+        v->arch.pv_vcpu.sysenter_disables_events =
+            (reg->flags & CALLBACKF_mask_events) != 0;
+        break;
+
+    case CALLBACKTYPE_nmi:
+        ret = register_guest_nmi_callback(reg->address.eip);
+        break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+static long compat_unregister_guest_callback(
+    struct compat_callback_unregister *unreg)
+{
+    long ret;
+
+    switch ( unreg->type )
+    {
+    case CALLBACKTYPE_event:
+    case CALLBACKTYPE_failsafe:
+    case CALLBACKTYPE_syscall32:
+    case CALLBACKTYPE_sysenter:
+        ret = -EINVAL;
+        break;
+
+    case CALLBACKTYPE_nmi:
+        ret = unregister_guest_nmi_callback();
+        break;
+
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
+
+
+long compat_callback_op(int cmd, XEN_GUEST_HANDLE(void) arg)
+{
+    long ret;
+
+    switch ( cmd )
+    {
+    case CALLBACKOP_register:
+    {
+        struct compat_callback_register reg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&reg, arg, 1) )
+            break;
+
+        ret = compat_register_guest_callback(&reg);
+    }
+    break;
+
+    case CALLBACKOP_unregister:
+    {
+        struct compat_callback_unregister unreg;
+
+        ret = -EFAULT;
+        if ( copy_from_guest(&unreg, arg, 1) )
+            break;
+
+        ret = compat_unregister_guest_callback(&unreg);
+    }
+    break;
+
+    default:
+        ret = -EINVAL;
+        break;
+    }
+
+    return ret;
+}
+
+long compat_set_callbacks(unsigned long event_selector,
+                          unsigned long event_address,
+                          unsigned long failsafe_selector,
+                          unsigned long failsafe_address)
+{
+    struct compat_callback_register event = {
+        .type = CALLBACKTYPE_event,
+        .address = {
+            .cs = event_selector,
+            .eip = event_address
+        }
+    };
+    struct compat_callback_register failsafe = {
+        .type = CALLBACKTYPE_failsafe,
+        .address = {
+            .cs = failsafe_selector,
+            .eip = failsafe_address
+        }
+    };
+
+    compat_register_guest_callback(&event);
+    compat_register_guest_callback(&failsafe);
+
+    return 0;
+}
diff --git a/xen/arch/x86/x86_64/compat/traps.c b/xen/arch/x86/x86_64/compat/traps.c
index df691f0ae3..6e146a62a7 100644
--- a/xen/arch/x86/x86_64/compat/traps.c
+++ b/xen/arch/x86/x86_64/compat/traps.c
@@ -66,149 +66,6 @@ void compat_show_guest_stack(struct vcpu *v, const struct cpu_user_regs *regs,
     printk("\n");
 }
 
-static long compat_register_guest_callback(
-    struct compat_callback_register *reg)
-{
-    long ret = 0;
-    struct vcpu *v = current;
-
-    fixup_guest_code_selector(v->domain, reg->address.cs);
-
-    switch ( reg->type )
-    {
-    case CALLBACKTYPE_event:
-        v->arch.pv_vcpu.event_callback_cs     = reg->address.cs;
-        v->arch.pv_vcpu.event_callback_eip    = reg->address.eip;
-        break;
-
-    case CALLBACKTYPE_failsafe:
-        v->arch.pv_vcpu.failsafe_callback_cs  = reg->address.cs;
-        v->arch.pv_vcpu.failsafe_callback_eip = reg->address.eip;
-        if ( reg->flags & CALLBACKF_mask_events )
-            set_bit(_VGCF_failsafe_disables_events,
-                    &v->arch.vgc_flags);
-        else
-            clear_bit(_VGCF_failsafe_disables_events,
-                      &v->arch.vgc_flags);
-        break;
-
-    case CALLBACKTYPE_syscall32:
-        v->arch.pv_vcpu.syscall32_callback_cs     = reg->address.cs;
-        v->arch.pv_vcpu.syscall32_callback_eip    = reg->address.eip;
-        v->arch.pv_vcpu.syscall32_disables_events =
-            (reg->flags & CALLBACKF_mask_events) != 0;
-        break;
-
-    case CALLBACKTYPE_sysenter:
-        v->arch.pv_vcpu.sysenter_callback_cs     = reg->address.cs;
-        v->arch.pv_vcpu.sysenter_callback_eip    = reg->address.eip;
-        v->arch.pv_vcpu.sysenter_disables_events =
-            (reg->flags & CALLBACKF_mask_events) != 0;
-        break;
-
-    case CALLBACKTYPE_nmi:
-        ret = register_guest_nmi_callback(reg->address.eip);
-        break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-static long compat_unregister_guest_callback(
-    struct compat_callback_unregister *unreg)
-{
-    long ret;
-
-    switch ( unreg->type )
-    {
-    case CALLBACKTYPE_event:
-    case CALLBACKTYPE_failsafe:
-    case CALLBACKTYPE_syscall32:
-    case CALLBACKTYPE_sysenter:
-        ret = -EINVAL;
-        break;
-
-    case CALLBACKTYPE_nmi:
-        ret = unregister_guest_nmi_callback();
-        break;
-
-    default:
-        ret = -ENOSYS;
-        break;
-    }
-
-    return ret;
-}
-
-
-long compat_callback_op(int cmd, XEN_GUEST_HANDLE(void) arg)
-{
-    long ret;
-
-    switch ( cmd )
-    {
-    case CALLBACKOP_register:
-    {
-        struct compat_callback_register reg;
-
-        ret = -EFAULT;
-        if ( copy_from_guest(&reg, arg, 1) )
-            break;
-
-        ret = compat_register_guest_callback(&reg);
-    }
-    break;
-
-    case CALLBACKOP_unregister:
-    {
-        struct compat_callback_unregister unreg;
-
-        ret = -EFAULT;
-        if ( copy_from_guest(&unreg, arg, 1) )
-            break;
-
-        ret = compat_unregister_guest_callback(&unreg);
-    }
-    break;
-
-    default:
-        ret = -EINVAL;
-        break;
-    }
-
-    return ret;
-}
-
-long compat_set_callbacks(unsigned long event_selector,
-                          unsigned long event_address,
-                          unsigned long failsafe_selector,
-                          unsigned long failsafe_address)
-{
-    struct compat_callback_register event = {
-        .type = CALLBACKTYPE_event,
-        .address = {
-            .cs = event_selector,
-            .eip = event_address
-        }
-    };
-    struct compat_callback_register failsafe = {
-        .type = CALLBACKTYPE_failsafe,
-        .address = {
-            .cs = failsafe_selector,
-            .eip = failsafe_address
-        }
-    };
-
-    compat_register_guest_callback(&event);
-    compat_register_guest_callback(&failsafe);
-
-    return 0;
-}
-
 /*
  * Local variables:
  * mode: C
-- 
2.11.0



* [PATCH v4 24/27] x86: move compat_show_guest_stack near its non-compat variant
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (22 preceding siblings ...)
  2017-06-08 17:11 ` [PATCH v4 23/27] x86: move the compat callback ops next to the " Wei Liu
@ 2017-06-08 17:12 ` Wei Liu
  2017-06-23 12:47   ` Andrew Cooper
  2017-06-08 17:12 ` [PATCH v4 25/27] x86: remove the now empty x86_64/compat/traps.c Wei Liu
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:12 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Also make it static and remove its declaration from the header.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/traps.c               | 64 ++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/x86_64/compat/traps.c | 63 -------------------------------------
 xen/include/asm-x86/processor.h    |  3 --
 3 files changed, 64 insertions(+), 66 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 287503cd56..0cedd5159b 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -186,6 +186,70 @@ static void show_code(const struct cpu_user_regs *regs)
     printk("\n");
 }
 
+static void compat_show_guest_stack(struct vcpu *v,
+                                    const struct cpu_user_regs *regs,
+                                    int debug_stack_lines)
+{
+    unsigned int i, *stack, addr, mask = STACK_SIZE;
+
+    stack = (unsigned int *)(unsigned long)regs->esp;
+    printk("Guest stack trace from esp=%08lx:\n ", (unsigned long)stack);
+
+    if ( !__compat_access_ok(v->domain, stack, sizeof(*stack)) )
+    {
+        printk("Guest-inaccessible memory.\n");
+        return;
+    }
+
+    if ( v != current )
+    {
+        struct vcpu *vcpu;
+        unsigned long mfn;
+
+        ASSERT(guest_kernel_mode(v, regs));
+        mfn = read_cr3() >> PAGE_SHIFT;
+        for_each_vcpu( v->domain, vcpu )
+            if ( pagetable_get_pfn(vcpu->arch.guest_table) == mfn )
+                break;
+        if ( !vcpu )
+        {
+            stack = do_page_walk(v, (unsigned long)stack);
+            if ( (unsigned long)stack < PAGE_SIZE )
+            {
+                printk("Inaccessible guest memory.\n");
+                return;
+            }
+            mask = PAGE_SIZE;
+        }
+    }
+
+    for ( i = 0; i < debug_stack_lines * 8; i++ )
+    {
+        if ( (((long)stack - 1) ^ ((long)(stack + 1) - 1)) & mask )
+            break;
+        if ( __get_user(addr, stack) )
+        {
+            if ( i != 0 )
+                printk("\n    ");
+            printk("Fault while accessing guest memory.");
+            i = 1;
+            break;
+        }
+        if ( (i != 0) && ((i % 8) == 0) )
+            printk("\n ");
+        printk(" %08x", addr);
+        stack++;
+    }
+    if ( mask == PAGE_SIZE )
+    {
+        BUILD_BUG_ON(PAGE_SIZE == STACK_SIZE);
+        unmap_domain_page(stack);
+    }
+    if ( i == 0 )
+        printk("Stack empty.");
+    printk("\n");
+}
+
 static void show_guest_stack(struct vcpu *v, const struct cpu_user_regs *regs)
 {
     int i;
diff --git a/xen/arch/x86/x86_64/compat/traps.c b/xen/arch/x86/x86_64/compat/traps.c
index 6e146a62a7..18cd2c017c 100644
--- a/xen/arch/x86/x86_64/compat/traps.c
+++ b/xen/arch/x86/x86_64/compat/traps.c
@@ -3,69 +3,6 @@
 #include <compat/callback.h>
 #include <compat/arch-x86_32.h>
 
-void compat_show_guest_stack(struct vcpu *v, const struct cpu_user_regs *regs,
-                             int debug_stack_lines)
-{
-    unsigned int i, *stack, addr, mask = STACK_SIZE;
-
-    stack = (unsigned int *)(unsigned long)regs->esp;
-    printk("Guest stack trace from esp=%08lx:\n ", (unsigned long)stack);
-
-    if ( !__compat_access_ok(v->domain, stack, sizeof(*stack)) )
-    {
-        printk("Guest-inaccessible memory.\n");
-        return;
-    }
-
-    if ( v != current )
-    {
-        struct vcpu *vcpu;
-        unsigned long mfn;
-
-        ASSERT(guest_kernel_mode(v, regs));
-        mfn = read_cr3() >> PAGE_SHIFT;
-        for_each_vcpu( v->domain, vcpu )
-            if ( pagetable_get_pfn(vcpu->arch.guest_table) == mfn )
-                break;
-        if ( !vcpu )
-        {
-            stack = do_page_walk(v, (unsigned long)stack);
-            if ( (unsigned long)stack < PAGE_SIZE )
-            {
-                printk("Inaccessible guest memory.\n");
-                return;
-            }
-            mask = PAGE_SIZE;
-        }
-    }
-
-    for ( i = 0; i < debug_stack_lines * 8; i++ )
-    {
-        if ( (((long)stack - 1) ^ ((long)(stack + 1) - 1)) & mask )
-            break;
-        if ( __get_user(addr, stack) )
-        {
-            if ( i != 0 )
-                printk("\n    ");
-            printk("Fault while accessing guest memory.");
-            i = 1;
-            break;
-        }
-        if ( (i != 0) && ((i % 8) == 0) )
-            printk("\n ");
-        printk(" %08x", addr);
-        stack++;
-    }
-    if ( mask == PAGE_SIZE )
-    {
-        BUILD_BUG_ON(PAGE_SIZE == STACK_SIZE);
-        unmap_domain_page(stack);
-    }
-    if ( i == 0 )
-        printk("Stack empty.");
-    printk("\n");
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 6a335d3a61..5bf56b45e1 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -480,9 +480,6 @@ void show_execution_state(const struct cpu_user_regs *regs);
 void show_page_walk(unsigned long addr);
 void noreturn fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote);
 
-void compat_show_guest_stack(struct vcpu *v,
-                             const struct cpu_user_regs *regs, int lines);
-
 extern void mtrr_ap_init(void);
 extern void mtrr_bp_init(void);
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH v4 25/27] x86: remove the now empty x86_64/compat/traps.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (23 preceding siblings ...)
  2017-06-08 17:12 ` [PATCH v4 24/27] x86: move compat_show_guest_statck near its " Wei Liu
@ 2017-06-08 17:12 ` Wei Liu
  2017-06-23 12:47   ` Andrew Cooper
  2017-06-08 17:12 ` [PATCH v4 26/27] x86: fix coding a style issue in asm-x86/traps.h Wei Liu
  2017-06-08 17:12 ` [PATCH v4 27/27] x86: clean up traps.c Wei Liu
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:12 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/x86_64/compat/traps.c | 14 --------------
 xen/arch/x86/x86_64/traps.c        |  2 --
 2 files changed, 16 deletions(-)
 delete mode 100644 xen/arch/x86/x86_64/compat/traps.c

diff --git a/xen/arch/x86/x86_64/compat/traps.c b/xen/arch/x86/x86_64/compat/traps.c
deleted file mode 100644
index 18cd2c017c..0000000000
--- a/xen/arch/x86/x86_64/compat/traps.c
+++ /dev/null
@@ -1,14 +0,0 @@
-#include <xen/event.h>
-#include <asm/regs.h>
-#include <compat/callback.h>
-#include <compat/arch-x86_32.h>
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 79bfc4d3f0..a15231ca0c 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -335,8 +335,6 @@ void subarch_percpu_traps_init(void)
     wrmsrl(MSR_SYSCALL_MASK, XEN_SYSCALL_MASK);
 }
 
-#include "compat/traps.c"
-
 void hypercall_page_initialise(struct domain *d, void *hypercall_page)
 {
     memset(hypercall_page, 0xCC, PAGE_SIZE);
-- 
2.11.0



* [PATCH v4 26/27] x86: fix coding a style issue in asm-x86/traps.h
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (24 preceding siblings ...)
  2017-06-08 17:12 ` [PATCH v4 25/27] x86: remove the now empty x86_64/compat/traps.c Wei Liu
@ 2017-06-08 17:12 ` Wei Liu
  2017-06-23 12:48   ` Andrew Cooper
  2017-06-08 17:12 ` [PATCH v4 27/27] x86: clean up traps.c Wei Liu
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:12 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

And add an emacs block.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/include/asm-x86/traps.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-x86/traps.h b/xen/include/asm-x86/traps.h
index 8cf6105d8d..1ac6718257 100644
--- a/xen/include/asm-x86/traps.h
+++ b/xen/include/asm-x86/traps.h
@@ -38,7 +38,7 @@ bool guest_has_trap_callback(const struct domain *d, unsigned int vcpuid,
  * return 0 on successful delivery
  */
 extern int send_guest_trap(struct domain *d, uint16_t vcpuid,
-				unsigned int trap_nr);
+                           unsigned int trap_nr);
 
 uint32_t guest_io_read(unsigned int port, unsigned int bytes,
                        struct domain *);
@@ -48,3 +48,13 @@ void guest_io_write(unsigned int port, unsigned int bytes, uint32_t data,
 const char *trapstr(unsigned int trapnr);
 
 #endif /* ASM_TRAP_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0



* [PATCH v4 27/27] x86: clean up traps.c
  2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
                   ` (25 preceding siblings ...)
  2017-06-08 17:12 ` [PATCH v4 26/27] x86: fix coding a style issue in asm-x86/traps.h Wei Liu
@ 2017-06-08 17:12 ` Wei Liu
  2017-06-23 12:50   ` Andrew Cooper
  26 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-08 17:12 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich

Replace bool_t with bool. Delete trailing whitespace. Fix some
coding style issues.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/traps.c | 77 +++++++++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 0cedd5159b..e568586573 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1,18 +1,18 @@
 /******************************************************************************
  * arch/x86/traps.c
- * 
+ *
  * Modifications to Linux original are copyright (c) 2002-2004, K A Fraser
- * 
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
  * (at your option) any later version.
- * 
+ *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
- * 
+ *
  * You should have received a copy of the GNU General Public License
  * along with this program; If not, see <http://www.gnu.org/licenses/>.
  */
@@ -112,7 +112,7 @@ void (*ioemul_handle_quirk)(
 static int debug_stack_lines = 20;
 integer_param("debug_stack_lines", debug_stack_lines);
 
-static bool_t opt_ler;
+static bool opt_ler;
 boolean_param("ler", opt_ler);
 
 #define stack_words_per_line 4
@@ -591,7 +591,7 @@ void vcpu_show_execution_state(struct vcpu *v)
 }
 
 static cpumask_t show_state_mask;
-static bool_t opt_show_all;
+static bool opt_show_all;
 boolean_param("async-show-all", opt_show_all);
 
 static int nmi_show_execution_state(const struct cpu_user_regs *regs, int cpu)
@@ -602,8 +602,8 @@ static int nmi_show_execution_state(const struct cpu_user_regs *regs, int cpu)
     if ( opt_show_all )
         show_execution_state(regs);
     else
-        printk(XENLOG_ERR "CPU%d @ %04x:%08lx (%pS)\n", cpu, regs->cs, regs->rip,
-               guest_mode(regs) ? _p(regs->rip) : NULL);
+        printk(XENLOG_ERR "CPU%d @ %04x:%08lx (%pS)\n", cpu, regs->cs,
+               regs->rip, guest_mode(regs) ? _p(regs->rip) : NULL);
     cpumask_clear_cpu(cpu, &show_state_mask);
 
     return 1;
@@ -628,7 +628,7 @@ const char *trapstr(unsigned int trapnr)
  * are disabled). In such situations we can't do much that is safe. We try to
  * print out some tracing and then we just spin.
  */
-void fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote)
+void fatal_trap(const struct cpu_user_regs *regs, bool show_remote)
 {
     static DEFINE_PER_CPU(char, depth);
     unsigned int trapnr = regs->entry_vector;
@@ -1081,8 +1081,8 @@ void do_int3(struct cpu_user_regs *regs)
     pv_inject_hw_exception(TRAP_int3, X86_EVENT_NO_EC);
 }
 
-static void reserved_bit_page_fault(
-    unsigned long addr, struct cpu_user_regs *regs)
+static void reserved_bit_page_fault(unsigned long addr,
+                                    struct cpu_user_regs *regs)
 {
     printk("%pv: reserved bit in page table (ec=%04X)\n",
            current, regs->error_code);
@@ -1090,8 +1090,8 @@ static void reserved_bit_page_fault(
     show_execution_state(regs);
 }
 
-static int handle_gdt_ldt_mapping_fault(
-    unsigned long offset, struct cpu_user_regs *regs)
+static int handle_gdt_ldt_mapping_fault(unsigned long offset,
+                                        struct cpu_user_regs *regs)
 {
     struct vcpu *curr = current;
     /* Which vcpu's area did we fault in, and is it in the ldt sub-area? */
@@ -1159,8 +1159,8 @@ enum pf_type {
     spurious_fault
 };
 
-static enum pf_type __page_fault_type(
-    unsigned long addr, const struct cpu_user_regs *regs)
+static enum pf_type __page_fault_type(unsigned long addr,
+                                      const struct cpu_user_regs *regs)
 {
     unsigned long mfn, cr3 = read_cr3();
     l4_pgentry_t l4e, *l4t;
@@ -1266,8 +1266,8 @@ leaf:
     return spurious_fault;
 }
 
-static enum pf_type spurious_page_fault(
-    unsigned long addr, const struct cpu_user_regs *regs)
+static enum pf_type spurious_page_fault(unsigned long addr,
+                                        const struct cpu_user_regs *regs)
 {
     unsigned long flags;
     enum pf_type pf_type;
@@ -1376,7 +1376,8 @@ void do_page_fault(struct cpu_user_regs *regs)
         if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
         {
             console_start_sync();
-            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' : 'A');
+            printk("Xen SM%cP violation\n",
+                   (pf_type == smep_fault) ? 'E' : 'A');
             fatal_trap(regs, 0);
         }
 
@@ -1426,9 +1427,9 @@ void do_page_fault(struct cpu_user_regs *regs)
 
 /*
  * Early #PF handler to print CR2, error code, and stack.
- * 
+ *
  * We also deal with spurious faults here, even though they should never happen
- * during early boot (an issue was seen once, but was most likely a hardware 
+ * during early boot (an issue was seen once, but was most likely a hardware
  * problem).
  */
 void __init do_early_page_fault(struct cpu_user_regs *regs)
@@ -1472,7 +1473,7 @@ void do_general_protection(struct cpu_user_regs *regs)
 
     /*
      * Cunning trick to allow arbitrary "INT n" handling.
-     * 
+     *
      * We set DPL == 0 on all vectors in the IDT. This prevents any INT <n>
      * instruction from trapping to the appropriate vector, when that might not
      * be expected by Xen or the guest OS. For example, that entry might be for
@@ -1480,12 +1481,12 @@ void do_general_protection(struct cpu_user_regs *regs)
      * expect an error code on the stack (which a software trap never
      * provides), or might be a hardware interrupt handler that doesn't like
      * being called spuriously.
-     * 
+     *
      * Instead, a GPF occurs with the faulting IDT vector in the error code.
-     * Bit 1 is set to indicate that an IDT entry caused the fault. Bit 0 is 
+     * Bit 1 is set to indicate that an IDT entry caused the fault. Bit 0 is
      * clear (which got already checked above) to indicate that it's a software
      * fault, not a hardware one.
-     * 
+     *
      * NOTE: Vectors 3 and 4 are dealt with from their own handler. This is
      * okay because they can only be triggered by an explicit DPL-checked
      * instruction. The DPL specified by the guest OS for these vectors is NOT
@@ -1631,7 +1632,8 @@ static void io_check_error(const struct cpu_user_regs *regs)
     outb((inb(0x61) & 0x07) | 0x00, 0x61); /* enable IOCK */
 }
 
-static void unknown_nmi_error(const struct cpu_user_regs *regs, unsigned char reason)
+static void unknown_nmi_error(const struct cpu_user_regs *regs,
+                              unsigned char reason)
 {
     switch ( opt_nmi[0] )
     {
@@ -1651,14 +1653,14 @@ static int dummy_nmi_callback(const struct cpu_user_regs *regs, int cpu)
 {
     return 0;
 }
- 
+
 static nmi_callback_t *nmi_callback = dummy_nmi_callback;
 
 void do_nmi(const struct cpu_user_regs *regs)
 {
     unsigned int cpu = smp_processor_id();
     unsigned char reason;
-    bool_t handle_unknown = 0;
+    bool handle_unknown = false;
 
     ++nmi_count(cpu);
 
@@ -1667,7 +1669,7 @@ void do_nmi(const struct cpu_user_regs *regs)
 
     if ( (nmi_watchdog == NMI_NONE) ||
          (!nmi_watchdog_tick(regs) && watchdog_force) )
-        handle_unknown = 1;
+        handle_unknown = true;
 
     /* Only the BSP gets external NMIs from the system. */
     if ( cpu == 0 )
@@ -1787,7 +1789,8 @@ void do_debug(struct cpu_user_regs *regs)
     return;
 }
 
-static void __init noinline __set_intr_gate(unsigned int n, uint32_t dpl, void *addr)
+static void __init noinline __set_intr_gate(unsigned int n,
+                                            uint32_t dpl, void *addr)
 {
     _set_gate(&idt_table[n], SYS_DESC_irq_gate, dpl, addr);
 }
@@ -1968,28 +1971,28 @@ long set_debugreg(struct vcpu *v, unsigned int reg, unsigned long value)
 
     switch ( reg )
     {
-    case 0: 
+    case 0:
         if ( !access_ok(value, sizeof(long)) )
             return -EPERM;
-        if ( v == curr ) 
+        if ( v == curr )
             write_debugreg(0, value);
         break;
-    case 1: 
+    case 1:
         if ( !access_ok(value, sizeof(long)) )
             return -EPERM;
-        if ( v == curr ) 
+        if ( v == curr )
             write_debugreg(1, value);
         break;
-    case 2: 
+    case 2:
         if ( !access_ok(value, sizeof(long)) )
             return -EPERM;
-        if ( v == curr ) 
+        if ( v == curr )
             write_debugreg(2, value);
         break;
     case 3:
         if ( !access_ok(value, sizeof(long)) )
             return -EPERM;
-        if ( v == curr ) 
+        if ( v == curr )
             write_debugreg(3, value);
         break;
     case 6:
@@ -1999,7 +2002,7 @@ long set_debugreg(struct vcpu *v, unsigned int reg, unsigned long value)
          */
         value &= ~DR_STATUS_RESERVED_ZERO; /* reserved bits => 0 */
         value |=  DR_STATUS_RESERVED_ONE;  /* reserved bits => 1 */
-        if ( v == curr ) 
+        if ( v == curr )
             write_debugreg(6, value);
         break;
     case 7:
-- 
2.11.0



* Re: [PATCH v4 01/27] x86: factor out common PV emulation code
  2017-06-08 17:11 ` [PATCH v4 01/27] x86: factor out common PV emulation code Wei Liu
@ 2017-06-20 16:00   ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2017-06-20 16:00 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> We're going to split PV emulation code into several files. This patch
> extracts the functions needed by them into a dedicated file.
> 
> The functions are now prefixed with "pv_emul_" and exported via a
> local header file.
> 
> While at it, change bool_t to bool.

On the basis that beyond this the two functions are simply being
moved, ...

> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v4 02/27] x86: move PV privileged instruction emulation code
  2017-06-08 17:11 ` [PATCH v4 02/27] x86: move PV privileged instruction " Wei Liu
@ 2017-06-20 16:03   ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2017-06-20 16:03 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> Move the code to pv/emul-priv-op.c. Prefix emulate_privileged_op with
> pv_ and export it via pv/traps.h.
> 
> Also move gpr_switch.S since it is used by the privileged instruction
> emulation code only.
> 
> Code motion only except for the rename. Cleanup etc will come later.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH v4 03/27] x86: move PV gate op emulation code
  2017-06-08 17:11 ` [PATCH v4 03/27] x86: move PV gate op " Wei Liu
@ 2017-06-20 16:05   ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2017-06-20 16:05 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> --- a/xen/include/asm-x86/pv/traps.h
> +++ b/xen/include/asm-x86/pv/traps.h
> @@ -26,10 +26,12 @@
>  #include <public/xen.h>
>  
>  int pv_emulate_privileged_op(struct cpu_user_regs *regs);
> +void pv_emulate_gate_op(struct cpu_user_regs *regs);
>  
>  #else  /* !CONFIG_PV */
>  
>  int pv_emulate_privileged_op(struct cpu_user_regs *regs) { return 0; }
> +void pv_emulate_gate_op(struct cpu_user_regs *regs) {}

Missing "inline" (also applies to patch 2 as I've just noticed)? With
that corrected,
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-08 17:11 ` [PATCH v4 04/27] x86: move PV invalid " Wei Liu
@ 2017-06-20 16:21   ` Jan Beulich
  2017-06-20 16:25     ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2017-06-20 16:21 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> @@ -1053,8 +982,8 @@ void do_invalid_op(struct cpu_user_regs *regs)
>  
>      if ( likely(guest_mode(regs)) )
>      {
> -        if ( !emulate_invalid_rdtscp(regs) &&
> -             !emulate_forced_invalid_op(regs) )
> +        if ( !pv_emulate_invalid_rdtscp(regs) &&
> +             !pv_emulate_forced_invalid_op(regs) )

I wonder if the first couldn't be called by the second, making it
unnecessary to export both. Or maybe have a wrapper
pv_emulate_invalid_op() around both.

Jan



* Re: [PATCH v4 05/27] x86/traps: remove now unused inclusion of emulate.h
  2017-06-08 17:11 ` [PATCH v4 05/27] x86/traps: remove now unused inclusion of emulate.h Wei Liu
@ 2017-06-20 16:21   ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2017-06-20 16:21 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-20 16:21   ` Jan Beulich
@ 2017-06-20 16:25     ` Wei Liu
  2017-06-21  6:15       ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-20 16:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, Xen-devel

On Tue, Jun 20, 2017 at 10:21:27AM -0600, Jan Beulich wrote:
> >>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> > @@ -1053,8 +982,8 @@ void do_invalid_op(struct cpu_user_regs *regs)
> >  
> >      if ( likely(guest_mode(regs)) )
> >      {
> > -        if ( !emulate_invalid_rdtscp(regs) &&
> > -             !emulate_forced_invalid_op(regs) )
> > +        if ( !pv_emulate_invalid_rdtscp(regs) &&
> > +             !pv_emulate_forced_invalid_op(regs) )
> 
> I wonder if the first couldn't be called by the second, making it
> unnecessary to export both. Or maybe have a wrapper
> pv_emulate_invalid_op() around both.
> 

Do you want me to refactor and move code in the same patch? Wouldn't
that make it hard for you to review?

I can submit a follow-up patch to do what you ask for.


* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-20 16:25     ` Wei Liu
@ 2017-06-21  6:15       ` Jan Beulich
  2017-06-21  8:57         ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2017-06-21  6:15 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 20.06.17 at 18:25, <wei.liu2@citrix.com> wrote:
> On Tue, Jun 20, 2017 at 10:21:27AM -0600, Jan Beulich wrote:
>> >>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
>> > @@ -1053,8 +982,8 @@ void do_invalid_op(struct cpu_user_regs *regs)
>> >  
>> >      if ( likely(guest_mode(regs)) )
>> >      {
>> > -        if ( !emulate_invalid_rdtscp(regs) &&
>> > -             !emulate_forced_invalid_op(regs) )
>> > +        if ( !pv_emulate_invalid_rdtscp(regs) &&
>> > +             !pv_emulate_forced_invalid_op(regs) )
>> 
>> I wonder if the first couldn't be called by the second, making it
>> unnecessary to export both. Or maybe have a wrapper
>> pv_emulate_invalid_op() around both.
>> 
> 
> Do you want me to refactor and move code in the same patch? Wouldn't
> that make it hard for you to review?

Why - especially in the wrapper variant you'd move both functions
unchanged (perhaps even with the names left as they are), and
merely add the wrapper (and of course use it in the code fragment
above). That'll make review rather simple, as you'll still be able to
state that you left both existing functions unchanged.

Jan



* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-21  6:15       ` Jan Beulich
@ 2017-06-21  8:57         ` Wei Liu
  2017-06-21  9:09           ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-21  8:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, Xen-devel

On Wed, Jun 21, 2017 at 12:15:46AM -0600, Jan Beulich wrote:
> >>> On 20.06.17 at 18:25, <wei.liu2@citrix.com> wrote:
> > On Tue, Jun 20, 2017 at 10:21:27AM -0600, Jan Beulich wrote:
> >> >>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> >> > @@ -1053,8 +982,8 @@ void do_invalid_op(struct cpu_user_regs *regs)
> >> >  
> >> >      if ( likely(guest_mode(regs)) )
> >> >      {
> >> > -        if ( !emulate_invalid_rdtscp(regs) &&
> >> > -             !emulate_forced_invalid_op(regs) )
> >> > +        if ( !pv_emulate_invalid_rdtscp(regs) &&
> >> > +             !pv_emulate_forced_invalid_op(regs) )
> >> 
> >> I wonder if the first couldn't be called by the second, making it
> >> unnecessary to export both. Or maybe have a wrapper
> >> pv_emulate_invalid_op() around both.
> >> 
> > 
> > Do you want me to refactor and move code in the same patch? Wouldn't
> > that make it hard for you to review?
> 
> Why - especially in the wrapper variant you'd move both functions
> unchanged (perhaps even with the names left as they are), and
> merely add the wrapper (and of course use it in the code fragment
> above). That'll make review rather simple, as you'll still be able to
> state that you left both existing functions unchanged.

OK

---8<---
From 50dfe1fe116c28a3953f0b72acc7b1dee4136e2b Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Mon, 5 Jun 2017 13:07:16 +0100
Subject: [PATCH] x86: move PV invalid op emulation code

Move the code to pv/emul-inv-op.c. Both functions are unchanged.
Provide pv_emulate_invalid_op and use it in traps.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/pv/Makefile       |   1 +
 xen/arch/x86/pv/emul-inv-op.c  | 128 +++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/traps.c           |  74 +-----------------------
 xen/include/asm-x86/pv/traps.h |   2 +
 4 files changed, 132 insertions(+), 73 deletions(-)
 create mode 100644 xen/arch/x86/pv/emul-inv-op.c

diff --git a/xen/arch/x86/pv/Makefile b/xen/arch/x86/pv/Makefile
index 1f6fbd3f5c..42ca64dc9e 100644
--- a/xen/arch/x86/pv/Makefile
+++ b/xen/arch/x86/pv/Makefile
@@ -5,5 +5,6 @@ obj-bin-y += dom0_build.init.o
 obj-y += domain.o
 obj-y += emulate.o
 obj-y += emul-gate-op.o
+obj-y += emul-inv-op.o
 obj-y += emul-priv-op.o
 obj-bin-y += gpr_switch.o
diff --git a/xen/arch/x86/pv/emul-inv-op.c b/xen/arch/x86/pv/emul-inv-op.c
new file mode 100644
index 0000000000..a1c56da171
--- /dev/null
+++ b/xen/arch/x86/pv/emul-inv-op.c
@@ -0,0 +1,128 @@
+/******************************************************************************
+ * arch/x86/pv/emul-inv-op.c
+ *
+ * Emulate invalid op for PV guests
+ *
+ * Modifications to Linux original are copyright (c) 2002-2004, K A Fraser
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/errno.h>
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/iocap.h>
+#include <xen/spinlock.h>
+#include <xen/trace.h>
+
+#include <asm/apic.h>
+#include <asm/debugreg.h>
+#include <asm/hpet.h>
+#include <asm/hypercall.h>
+#include <asm/mc146818rtc.h>
+#include <asm/p2m.h>
+#include <asm/pv/traps.h>
+#include <asm/shared.h>
+#include <asm/traps.h>
+#include <asm/x86_emulate.h>
+
+#include <xsm/xsm.h>
+
+#include "emulate.h"
+
+static int emulate_invalid_rdtscp(struct cpu_user_regs *regs)
+{
+    char opcode[3];
+    unsigned long eip, rc;
+    struct vcpu *v = current;
+
+    eip = regs->rip;
+    if ( (rc = copy_from_user(opcode, (char *)eip, sizeof(opcode))) != 0 )
+    {
+        pv_inject_page_fault(0, eip + sizeof(opcode) - rc);
+        return EXCRET_fault_fixed;
+    }
+    if ( memcmp(opcode, "\xf\x1\xf9", sizeof(opcode)) )
+        return 0;
+    eip += sizeof(opcode);
+    pv_soft_rdtsc(v, regs, 1);
+    pv_emul_instruction_done(regs, eip);
+    return EXCRET_fault_fixed;
+}
+
+static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
+{
+    char sig[5], instr[2];
+    unsigned long eip, rc;
+    struct cpuid_leaf res;
+
+    eip = regs->rip;
+
+    /* Check for forced emulation signature: ud2 ; .ascii "xen". */
+    if ( (rc = copy_from_user(sig, (char *)eip, sizeof(sig))) != 0 )
+    {
+        pv_inject_page_fault(0, eip + sizeof(sig) - rc);
+        return EXCRET_fault_fixed;
+    }
+    if ( memcmp(sig, "\xf\xbxen", sizeof(sig)) )
+        return 0;
+    eip += sizeof(sig);
+
+    /* We only emulate CPUID. */
+    if ( ( rc = copy_from_user(instr, (char *)eip, sizeof(instr))) != 0 )
+    {
+        pv_inject_page_fault(0, eip + sizeof(instr) - rc);
+        return EXCRET_fault_fixed;
+    }
+    if ( memcmp(instr, "\xf\xa2", sizeof(instr)) )
+        return 0;
+
+    /* If cpuid faulting is enabled and CPL>0 inject a #GP in place of #UD. */
+    if ( current->arch.cpuid_faulting && !guest_kernel_mode(current, regs) )
+    {
+        regs->rip = eip;
+        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
+        return EXCRET_fault_fixed;
+    }
+
+    eip += sizeof(instr);
+
+    guest_cpuid(current, regs->eax, regs->ecx, &res);
+
+    regs->rax = res.a;
+    regs->rbx = res.b;
+    regs->rcx = res.c;
+    regs->rdx = res.d;
+
+    pv_emul_instruction_done(regs, eip);
+
+    trace_trap_one_addr(TRC_PV_FORCED_INVALID_OP, regs->rip);
+
+    return EXCRET_fault_fixed;
+}
+
+int pv_emulate_invalid_op(struct cpu_user_regs *regs)
+{
+    return !emulate_invalid_rdtscp(regs) && !emulate_forced_invalid_op(regs);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 7b781f17db..88dfd464e7 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -968,77 +968,6 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
     }
 }
 
-static int emulate_invalid_rdtscp(struct cpu_user_regs *regs)
-{
-    char opcode[3];
-    unsigned long eip, rc;
-    struct vcpu *v = current;
-
-    eip = regs->rip;
-    if ( (rc = copy_from_user(opcode, (char *)eip, sizeof(opcode))) != 0 )
-    {
-        pv_inject_page_fault(0, eip + sizeof(opcode) - rc);
-        return EXCRET_fault_fixed;
-    }
-    if ( memcmp(opcode, "\xf\x1\xf9", sizeof(opcode)) )
-        return 0;
-    eip += sizeof(opcode);
-    pv_soft_rdtsc(v, regs, 1);
-    pv_emul_instruction_done(regs, eip);
-    return EXCRET_fault_fixed;
-}
-
-static int emulate_forced_invalid_op(struct cpu_user_regs *regs)
-{
-    char sig[5], instr[2];
-    unsigned long eip, rc;
-    struct cpuid_leaf res;
-
-    eip = regs->rip;
-
-    /* Check for forced emulation signature: ud2 ; .ascii "xen". */
-    if ( (rc = copy_from_user(sig, (char *)eip, sizeof(sig))) != 0 )
-    {
-        pv_inject_page_fault(0, eip + sizeof(sig) - rc);
-        return EXCRET_fault_fixed;
-    }
-    if ( memcmp(sig, "\xf\xbxen", sizeof(sig)) )
-        return 0;
-    eip += sizeof(sig);
-
-    /* We only emulate CPUID. */
-    if ( ( rc = copy_from_user(instr, (char *)eip, sizeof(instr))) != 0 )
-    {
-        pv_inject_page_fault(0, eip + sizeof(instr) - rc);
-        return EXCRET_fault_fixed;
-    }
-    if ( memcmp(instr, "\xf\xa2", sizeof(instr)) )
-        return 0;
-
-    /* If cpuid faulting is enabled and CPL>0 inject a #GP in place of #UD. */
-    if ( current->arch.cpuid_faulting && !guest_kernel_mode(current, regs) )
-    {
-        regs->rip = eip;
-        pv_inject_hw_exception(TRAP_gp_fault, regs->error_code);
-        return EXCRET_fault_fixed;
-    }
-
-    eip += sizeof(instr);
-
-    guest_cpuid(current, regs->eax, regs->ecx, &res);
-
-    regs->rax = res.a;
-    regs->rbx = res.b;
-    regs->rcx = res.c;
-    regs->rdx = res.d;
-
-    pv_emul_instruction_done(regs, eip);
-
-    trace_trap_one_addr(TRC_PV_FORCED_INVALID_OP, regs->rip);
-
-    return EXCRET_fault_fixed;
-}
-
 void do_invalid_op(struct cpu_user_regs *regs)
 {
     const struct bug_frame *bug = NULL;
@@ -1053,8 +982,7 @@ void do_invalid_op(struct cpu_user_regs *regs)
 
     if ( likely(guest_mode(regs)) )
     {
-        if ( !emulate_invalid_rdtscp(regs) &&
-             !emulate_forced_invalid_op(regs) )
+        if ( pv_emulate_invalid_op(regs) )
             pv_inject_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC);
         return;
     }
diff --git a/xen/include/asm-x86/pv/traps.h b/xen/include/asm-x86/pv/traps.h
index b1b6b1d0ad..458028a94b 100644
--- a/xen/include/asm-x86/pv/traps.h
+++ b/xen/include/asm-x86/pv/traps.h
@@ -27,11 +27,13 @@
 
 int pv_emulate_privileged_op(struct cpu_user_regs *regs);
 void pv_emulate_gate_op(struct cpu_user_regs *regs);
+int pv_emulate_invalid_op(struct cpu_user_regs *regs);
 
 #else  /* !CONFIG_PV */
 
 static inline int pv_emulate_privileged_op(struct cpu_user_regs *regs) { return 0; }
 static inline void pv_emulate_gate_op(struct cpu_user_regs *regs) {}
+static int pv_emulate_invalid_op(struct cpu_user_regs *regs) { return 0; }
 
 #endif /* CONFIG_PV */
 
-- 
2.11.0



* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-21  8:57         ` Wei Liu
@ 2017-06-21  9:09           ` Jan Beulich
  2017-06-21  9:14             ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2017-06-21  9:09 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 21.06.17 at 10:57, <wei.liu2@citrix.com> wrote:
> +int pv_emulate_invalid_op(struct cpu_user_regs *regs)
> +{
> +    return !emulate_invalid_rdtscp(regs) && !emulate_forced_invalid_op(regs);
> +}

This way you want to make the function return bool. Alternatively
you would want to preserve the EXCRET_* return value here, and
handle it accordingly in the caller.
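
As a compile-checkable illustration of the bool variant under discussion (the
two helper bodies below are hypothetical stand-ins, not Xen's real emulation
code, which operates on struct cpu_user_regs):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins: in Xen, each helper returns a non-zero
 * EXCRET_* value when it successfully emulated the instruction,
 * and 0 when it did not. */
static int emulate_invalid_rdtscp(const int *regs) { return *regs == 1; }
static int emulate_forced_invalid_op(const int *regs) { return *regs == 2; }

/* The bool shape: true means neither helper handled the #UD, so the
 * caller should inject the exception into the guest. */
static bool pv_emulate_invalid_op(const int *regs)
{
    return !emulate_invalid_rdtscp(regs) && !emulate_forced_invalid_op(regs);
}
```

With this shape the caller's "if ( pv_emulate_invalid_op(regs) )
pv_inject_hw_exception(...)" reads naturally, at the cost of collapsing the
EXCRET_* distinction mentioned above.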

Jan



* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-21  9:09           ` Jan Beulich
@ 2017-06-21  9:14             ` Wei Liu
  2017-06-21  9:26               ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-21  9:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, Xen-devel

On Wed, Jun 21, 2017 at 03:09:41AM -0600, Jan Beulich wrote:
> >>> On 21.06.17 at 10:57, <wei.liu2@citrix.com> wrote:
> > +int pv_emulate_invalid_op(struct cpu_user_regs *regs)
> > +{
> > +    return !emulate_invalid_rdtscp(regs) && !emulate_forced_invalid_op(regs);
> > +}
> 
> This way you want to make the function return bool. Alternatively
> you would want to preserve the EXCRET_* return value here, and
> handle it accordingly in the caller.
> 

I will just make it return bool. Do you want me to send another version?


* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-21  9:14             ` Wei Liu
@ 2017-06-21  9:26               ` Jan Beulich
  2017-06-21  9:29                 ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Jan Beulich @ 2017-06-21  9:26 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 21.06.17 at 11:14, <wei.liu2@citrix.com> wrote:
> On Wed, Jun 21, 2017 at 03:09:41AM -0600, Jan Beulich wrote:
>> >>> On 21.06.17 at 10:57, <wei.liu2@citrix.com> wrote:
>> > +int pv_emulate_invalid_op(struct cpu_user_regs *regs)
>> > +{
>> > +    return !emulate_invalid_rdtscp(regs) && !emulate_forced_invalid_op(regs);
>> > +}
>> 
>> This way you want to make the function return bool. Alternatively
>> you would want to preserve the EXCRET_* return value here, and
>> handle it accordingly in the caller.
>> 
> 
> I will just make it return bool. Do you want me to send another version?

No need to, I think - feel free to add my ack.

Jan



* Re: [PATCH v4 04/27] x86: move PV invalid op emulation code
  2017-06-21  9:26               ` Jan Beulich
@ 2017-06-21  9:29                 ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2017-06-21  9:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, Xen-devel

On Wed, Jun 21, 2017 at 03:26:12AM -0600, Jan Beulich wrote:
> >>> On 21.06.17 at 11:14, <wei.liu2@citrix.com> wrote:
> > On Wed, Jun 21, 2017 at 03:09:41AM -0600, Jan Beulich wrote:
> >> >>> On 21.06.17 at 10:57, <wei.liu2@citrix.com> wrote:
> >> > +int pv_emulate_invalid_op(struct cpu_user_regs *regs)
> >> > +{
> >> > +    return !emulate_invalid_rdtscp(regs) && !emulate_forced_invalid_op(regs);
> >> > +}
> >> 
> >> This way you want to make the function return bool. Alternatively
> >> you would want to preserve the EXCRET_* return value here, and
> >> handle it accordingly in the caller.
> >> 
> > 
> > I will just make it return bool. Do you want me to send another version?
> 
> No need to I think - feel free to add my ack.
> 

Cool. Thank you.


* Re: [PATCH v4 06/27] x86: clean up PV emulation code
  2017-06-08 17:11 ` [PATCH v4 06/27] x86: clean up PV emulation code Wei Liu
@ 2017-06-23 10:56   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 10:56 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Replace bool_t with bool. Fix coding style issues. Add spaces around
> binary ops. Use 1U for shifting. Eliminate TOGGLE_MODE.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, fwiw


* Re: [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c
  2017-06-08 17:11 ` [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c Wei Liu
@ 2017-06-23 11:00   ` Andrew Cooper
  2017-06-23 13:59     ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 11:00 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

I'd suggest folding this into the next patch, and putting the hypercall
in misc-hypercalls.c

Despite its name, this hypercall is just setting up state in the vcpu.

~Andrew


* Re: [PATCH v4 08/27] x86: move some misc PV hypercalls to misc-hypercalls.c
  2017-06-08 17:11 ` [PATCH v4 08/27] x86: move some misc PV hypercalls to misc-hypercalls.c Wei Liu
@ 2017-06-23 11:02   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 11:02 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 09/27] x86/traps: move pv_inject_event to pv/traps.c
  2017-06-08 17:11 ` [PATCH v4 09/27] x86/traps: move pv_inject_event to pv/traps.c Wei Liu
@ 2017-06-23 11:04   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 11:04 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Take the opportunity to rename "v" to "curr".
>
> No functional change.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 10/27] x86/traps: move set_guest_{machine, nmi}_trapbounce
  2017-06-08 17:11 ` [PATCH v4 10/27] x86/traps: move set_guest_{machine, nmi}_trapbounce Wei Liu
@ 2017-06-23 11:05   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 11:05 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Take the opportunity to change their return type to bool. And rename
> "v" to "curr".
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/pv/traps.c | 27 +++++++++++++++++++++++++++
>  xen/arch/x86/traps.c    | 27 ---------------------------
>  2 files changed, 27 insertions(+), 27 deletions(-)
>
> diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
> index ec7ff1040b..e374cd73b4 100644
> --- a/xen/arch/x86/pv/traps.c
> +++ b/xen/arch/x86/pv/traps.c
> @@ -156,6 +156,33 @@ void pv_inject_event(const struct x86_event *event)
>      }
>  }
>  
> +/*
> + * Called from asm to set up the MCE trapbounce info.
> + * Returns false if no callback is set up, else true.
> + */
> +bool set_guest_machinecheck_trapbounce(void)
> +{
> +    struct vcpu *curr = current;
> +    struct trap_bounce *tb = &curr->arch.pv_vcpu.trap_bounce;
> +
> +    pv_inject_hw_exception(TRAP_machine_check, X86_EVENT_NO_EC);
> +    tb->flags &= ~TBF_EXCEPTION; /* not needed for MCE delivery path */

As we are fixing style, newline.

> +    return !null_trap_bounce(curr, tb);
> +}
> +
> +/*
> + * Called from asm to set up the NMI trapbounce info.
> + * Returns false if no callback is set up, else true.
> + */
> +bool set_guest_nmi_trapbounce(void)
> +{
> +    struct vcpu *curr = current;
> +    struct trap_bounce *tb = &curr->arch.pv_vcpu.trap_bounce;

newline.

> +    pv_inject_hw_exception(TRAP_nmi, X86_EVENT_NO_EC);
> +    tb->flags &= ~TBF_EXCEPTION; /* not needed for NMI delivery path */

and newline.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> +    return !null_trap_bounce(curr, tb);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 6abfb62c0c..013de702ad 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -626,33 +626,6 @@ void fatal_trap(const struct cpu_user_regs *regs, bool_t show_remote)
>            (regs->eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
>  }
>  
> -/*
> - * Called from asm to set up the MCE trapbounce info.
> - * Returns 0 if no callback is set up, else 1.
> - */
> -int set_guest_machinecheck_trapbounce(void)
> -{
> -    struct vcpu *v = current;
> -    struct trap_bounce *tb = &v->arch.pv_vcpu.trap_bounce;
> - 
> -    pv_inject_hw_exception(TRAP_machine_check, X86_EVENT_NO_EC);
> -    tb->flags &= ~TBF_EXCEPTION; /* not needed for MCE delivery path */
> -    return !null_trap_bounce(v, tb);
> -}
> -
> -/*
> - * Called from asm to set up the NMI trapbounce info.
> - * Returns 0 if no callback is set up, else 1.
> - */
> -int set_guest_nmi_trapbounce(void)
> -{
> -    struct vcpu *v = current;
> -    struct trap_bounce *tb = &v->arch.pv_vcpu.trap_bounce;
> -    pv_inject_hw_exception(TRAP_nmi, X86_EVENT_NO_EC);
> -    tb->flags &= ~TBF_EXCEPTION; /* not needed for NMI delivery path */
> -    return !null_trap_bounce(v, tb);
> -}
> -
>  void do_reserved_trap(struct cpu_user_regs *regs)
>  {
>      unsigned int trapnr = regs->entry_vector;



* Re: [PATCH v4 11/27] x86:/traps: move {un, }register_guest_nmi_callback
  2017-06-08 17:11 ` [PATCH v4 11/27] x86:/traps: move {un, }register_guest_nmi_callback Wei Liu
@ 2017-06-23 11:38   ` Andrew Cooper
  2017-06-23 12:19     ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 11:38 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Take the opportunity to rename "v" to "curr".
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

misc-hypercalls.c.  Again, this is just a hypercall handler stashing state.

~Andrew


* Re: [PATCH v4 12/27] x86/traps: move guest_has_trap_callback to pv/traps.c
  2017-06-08 17:11 ` [PATCH v4 12/27] x86/traps: move guest_has_trap_callback to pv/traps.c Wei Liu
@ 2017-06-23 12:01   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:01 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> diff --git a/xen/include/asm-x86/traps.h b/xen/include/asm-x86/traps.h
> index f1d2513e6b..26625ce5a6 100644
> --- a/xen/include/asm-x86/traps.h
> +++ b/xen/include/asm-x86/traps.h
> @@ -32,10 +32,10 @@ void async_exception_cleanup(struct vcpu *);
>  /**
>   * guest_has_trap_callback
>   *
> - * returns true (non-zero) if guest registered a trap handler
> + * returns true if guest registered a trap handler
>   */
> -extern int guest_has_trap_callback(struct domain *d, uint16_t vcpuid,
> -				unsigned int trap_nr);
> +bool guest_has_trap_callback(const struct domain *d, unsigned int vcpuid,
> +                             unsigned int trap_nr);

IMO, it would be better to reduce this to:

static inline bool pv_callback_registered(const struct vcpu *v,
                                          uint8_t vector)
{
    return v->arch.pv_vcpu.trap_ctxt[vector].address;
}

and adjust its single caller to match.  inject_vmce() already has a
struct vcpu in its hand at the call point, so we can lose all the range
checking.
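
A self-contained sketch of the proposed helper; the struct layout here is a
minimal hypothetical stand-in for the relevant slice of Xen's struct vcpu
(the real trap_ctxt lives under arch.pv_vcpu):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in types, sized for the 256 possible vectors. */
struct trap_info_stub { unsigned long address; };
struct vcpu_stub {
    struct trap_info_stub trap_ctxt[256];
};

/* A guest callback for `vector` is registered iff its trap_ctxt entry
 * holds a non-zero handler address. */
static bool pv_callback_registered(const struct vcpu_stub *v, uint8_t vector)
{
    return v->trap_ctxt[vector].address;
}
```

Because a uint8_t vector can never index outside trap_ctxt[256], the caller's
explicit range checks become unnecessary, which is the point being made above.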

~Andrew


* Re: [PATCH v4 13/27] x86: move toggle_guest_mode to pv/domain.c
  2017-06-08 17:11 ` [PATCH v4 13/27] x86: move toggle_guest_mode to pv/domain.c Wei Liu
@ 2017-06-23 12:10   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:10 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/pv/domain.c    | 30 ++++++++++++++++++++++++++++++
>  xen/arch/x86/x86_64/traps.c | 30 ------------------------------
>  2 files changed, 30 insertions(+), 30 deletions(-)
>
> diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
> index 1c0c040ca0..0f3f0e85d6 100644
> --- a/xen/arch/x86/pv/domain.c
> +++ b/xen/arch/x86/pv/domain.c
> @@ -213,6 +213,36 @@ int pv_domain_initialise(struct domain *d, unsigned int domcr_flags,
>      return rc;
>  }
>  
> +void toggle_guest_mode(struct vcpu *v)
> +{
> +    if ( is_pv_32bit_vcpu(v) )
> +        return;

newline.

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 14/27] x86: move do_iret to pv/iret.c
  2017-06-08 17:11 ` [PATCH v4 14/27] x86: move do_iret to pv/iret.c Wei Liu
@ 2017-06-23 12:12   ` Andrew Cooper
  2017-06-23 14:17     ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:12 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
> There is no copyright header in the original file. Use the default
> one?

It should at least gain a basic GPLv2 header.

Otherwise, LGTM.

~Andrew


* Re: [PATCH v4 11/27] x86:/traps: move {un, }register_guest_nmi_callback
  2017-06-23 11:38   ` Andrew Cooper
@ 2017-06-23 12:19     ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:19 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 23/06/17 12:38, Andrew Cooper wrote:
> On 08/06/17 18:11, Wei Liu wrote:
>> Take the opportunity to rename "v" to "curr".
>>
>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> misc-hypercalls.c.  Again, this is just a hypercall handler stashing state.

Actually, on reviewing a later patch, we can do better here.

do_nmi_op() is one of the callsites of these helpers.  It is in
common/kernel.c, but only wired up into the hypercall table for x86 PV
guests.

It would be better to move do_nmi_op() to being x86-specific, at which
point you can drop the ARM stubs, and make all the helpers local to
misc-hypercalls.c

~Andrew


* Re: [PATCH v4 16/27] x86/traps: factor out pv_trap_init
  2017-06-08 17:11 ` [PATCH v4 16/27] x86/traps: factor out pv_trap_init Wei Liu
@ 2017-06-23 12:31   ` Andrew Cooper
  2017-06-23 13:55     ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:31 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/traps.c           | 22 ++++++++++++++--------
>  xen/include/asm-x86/pv/traps.h |  4 ++++
>  2 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 8861dfd332..29a83994bd 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1871,14 +1871,8 @@ void __init init_idt_traps(void)
>      this_cpu(compat_gdt_table) = boot_cpu_compat_gdt_table;
>  }
>  
> -extern void (*const autogen_entrypoints[NR_VECTORS])(void);
> -void __init trap_init(void)
> +void __init pv_trap_init(void)
>  {
> -    unsigned int vector;
> -
> -    /* Replace early pagefault with real pagefault handler. */
> -    set_intr_gate(TRAP_page_fault, &page_fault);
> -
>      /* The 32-on-64 hypercall vector is only accessible from ring 1. */
>      _set_gate(idt_table + HYPERCALL_VECTOR,
>                SYS_DESC_trap_gate, 1, entry_int82);
> @@ -1886,6 +1880,19 @@ void __init trap_init(void)
>      /* Fast trap for int80 (faster than taking the #GP-fixup path). */
>      _set_gate(idt_table + 0x80, SYS_DESC_trap_gate, 3, &int80_direct_trap);
>  
> +    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
> +}
> +
> +extern void (*const autogen_entrypoints[NR_VECTORS])(void);
> +void __init trap_init(void)
> +{
> +    unsigned int vector;
> +
> +    pv_trap_init();

This call should be at the end of trap_init(), but I guess you hit the
assertion?

The !CONFIG_PV case will similarly hit the assertion.

For !CONFIG_PV, 0x80 and 0x82 would best become general interrupt
handlers, which means you need to tweak entry.S autogen_stubs, and also
tweak init_irq_data()

~Andrew

> +
> +    /* Replace early pagefault with real pagefault handler. */
> +    set_intr_gate(TRAP_page_fault, &page_fault);
> +
>      for ( vector = 0; vector < NR_VECTORS; ++vector )
>      {
>          if ( autogen_entrypoints[vector] )
> @@ -1905,7 +1912,6 @@ void __init trap_init(void)
>  
>      cpu_init();
>  
> -    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
>      open_softirq(PCI_SERR_SOFTIRQ, pci_serr_softirq);
>  }
>  
> diff --git a/xen/include/asm-x86/pv/traps.h b/xen/include/asm-x86/pv/traps.h
> index a4af69e486..426c8f6216 100644
> --- a/xen/include/asm-x86/pv/traps.h
> +++ b/xen/include/asm-x86/pv/traps.h
> @@ -25,6 +25,8 @@
>  
>  #include <public/xen.h>
>  
> +void pv_trap_init(void);
> +
>  int pv_emulate_privileged_op(struct cpu_user_regs *regs);
>  void pv_emulate_gate_op(struct cpu_user_regs *regs);
>  int pv_emulate_invalid_rdtscp(struct cpu_user_regs *regs);
> @@ -32,6 +34,8 @@ int pv_emulate_forced_invalid_op(struct cpu_user_regs *regs);
>  
>  #else  /* !CONFIG_PV */
>  
> +void pv_trap_init(void) {}
> +
>  int pv_emulate_privileged_op(struct cpu_user_regs *regs) { return 0; }
>  void pv_emulate_gate_op(struct cpu_user_regs *regs) {}
>  int pv_emulate_invalid_rdtscp(struct cpu_user_regs *regs) { return 0; }



* Re: [PATCH v4 17/27] x86/traps: move some PV specific functions and struct to pv/traps.c
  2017-06-08 17:11 ` [PATCH v4 17/27] x86/traps: move some PV specific functions and struct to pv/traps.c Wei Liu
@ 2017-06-23 12:36   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:36 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> +void __init pv_trap_init(void)
> +{
> +    /* The 32-on-64 hypercall vector is only accessible from ring 1. */
> +    _set_gate(idt_table + HYPERCALL_VECTOR,
> +              SYS_DESC_trap_gate, 1, entry_int82);
> +
> +    /* Fast trap for int80 (faster than taking the #GP-fixup path). */
> +    _set_gate(idt_table + 0x80, SYS_DESC_trap_gate, 3, &int80_direct_trap);
> +
> +    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
> +}
> +
> +int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)

Just like guest_has_trap_callback(), there is only a single caller and
it has a struct vcpu in its hand.

I'd recommend breaking out a cleanup patch which switches this to:

int pv_raise_interrupt(struct vcpu *v, uint8_t vector);

~Andrew


* Re: [PATCH v4 18/27] x86/traps: move init_int80_direct_trap to pv/traps.c
  2017-06-08 17:11 ` [PATCH v4 18/27] x86/traps: move init_int80_direct_trap " Wei Liu
@ 2017-06-23 12:37   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:37 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/arch/x86/pv/traps.c     | 14 ++++++++++++++
>  xen/arch/x86/x86_64/traps.c | 14 --------------
>  2 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
> index 0c1600d886..f2556c7e4a 100644
> --- a/xen/arch/x86/pv/traps.c
> +++ b/xen/arch/x86/pv/traps.c
> @@ -342,6 +342,20 @@ int send_guest_trap(struct domain *d, uint16_t vcpuid, unsigned int trap_nr)
>      return -EIO;
>  }
>  
> +void init_int80_direct_trap(struct vcpu *v)
> +{
> +    struct trap_info *ti = &v->arch.pv_vcpu.trap_ctxt[0x80];
> +    struct trap_bounce *tb = &v->arch.pv_vcpu.int80_bounce;
> +
> +    tb->cs    = ti->cs;
> +    tb->eip   = ti->address;

Mind reducing the spaces here?  Otherwise,
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c
  2017-06-08 17:11 ` [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c Wei Liu
@ 2017-06-23 12:41   ` Andrew Cooper
  2017-06-23 14:49     ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:41 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/pv/hypercall.c     | 36 ++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/x86_64/traps.c     | 36 ------------------------------------
>  xen/include/asm-x86/hypercall.h |  1 +
>  3 files changed, 37 insertions(+), 36 deletions(-)
>
> diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
> index 7c5e5a629d..287340e774 100644
> --- a/xen/arch/x86/pv/hypercall.c
> +++ b/xen/arch/x86/pv/hypercall.c
> @@ -255,6 +255,42 @@ enum mc_disposition arch_do_multicall_call(struct mc_state *state)
>               ? mc_continue : mc_preempt;
>  }
>  
> +void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
> +{
> +    char *p;
> +    int i;

void *p = hypercall_page;
unsigned int i;

> +
> +    /* Fill in all the transfer points with template machine code. */
> +    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
> +    {

i++, p += 32

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
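
A standalone sketch of the suggested loop shape (PAGE_SIZE/32 transfer points,
with the write pointer advanced in the loop header instead of recomputed from
i).  The 0xcc filler is a placeholder, not Xen's real per-hypercall template
bytes, and a plain unsigned char * stands in for Xen's void * arithmetic
(a GCC extension):

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Fill every 32-byte transfer point of the page with a stub. */
static void fill_hypercall_page(void *hypercall_page)
{
    unsigned char *p = hypercall_page;
    unsigned int i;

    for ( i = 0; i < (PAGE_SIZE / 32); i++, p += 32 )
        memset(p, 0xcc, 32);   /* template machine code goes here */
}
```
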


* Re: [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel
  2017-06-08 17:11 ` [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel Wei Liu
@ 2017-06-23 12:41   ` Andrew Cooper
  2017-06-23 13:56     ` Wei Liu
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:41 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Same review suggestions as the previous patch.

~Andrew


* Re: [PATCH v4 21/27] x86: move compat_set_trap_table along side the non-compat variant
  2017-06-08 17:11 ` [PATCH v4 21/27] x86: move compat_set_trap_table along side the non-compat variant Wei Liu
@ 2017-06-23 12:43   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:43 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  xen/arch/x86/pv/traps.c            | 45 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/x86_64/compat/traps.c | 45 --------------------------------------
>  2 files changed, 45 insertions(+), 45 deletions(-)
>
> diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
> index f2556c7e4a..3dcb3f1877 100644
> --- a/xen/arch/x86/pv/traps.c
> +++ b/xen/arch/x86/pv/traps.c
> @@ -87,6 +87,51 @@ long do_set_trap_table(XEN_GUEST_HANDLE_PARAM(const_trap_info_t) traps)
>      return rc;
>  }
>  
> +int compat_set_trap_table(XEN_GUEST_HANDLE(trap_info_compat_t) traps)
> +{

struct vcpu *curr = current;

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 22/27] x86: move compat_iret along side its non-compat variant
  2017-06-08 17:11 ` [PATCH v4 22/27] x86: move compat_iret along side its " Wei Liu
@ 2017-06-23 12:44   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:44 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:11, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 24/27] x86: move compat_show_guest_statck near its non-compat variant
  2017-06-08 17:12 ` [PATCH v4 24/27] x86: move compat_show_guest_statck near its " Wei Liu
@ 2017-06-23 12:47   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:47 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:12, Wei Liu wrote:
> And make it static; remove the declaration in the header.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 25/27] x86: remove the now empty x86_64/compat/traps.c
  2017-06-08 17:12 ` [PATCH v4 25/27] x86: remove the now empty x86_64/compat/traps.c Wei Liu
@ 2017-06-23 12:47   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:47 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:12, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 26/27] x86: fix coding a style issue in asm-x86/traps.h
  2017-06-08 17:12 ` [PATCH v4 26/27] x86: fix coding a style issue in asm-x86/traps.h Wei Liu
@ 2017-06-23 12:48   ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:48 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:12, Wei Liu wrote:
> And add an emacs block.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 27/27] x86: clean up traps.c
  2017-06-08 17:12 ` [PATCH v4 27/27] x86: clean up traps.c Wei Liu
@ 2017-06-23 12:50   ` Andrew Cooper
  2017-06-23 13:45     ` Jan Beulich
  0 siblings, 1 reply; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 12:50 UTC (permalink / raw)
  To: Wei Liu, Xen-devel; +Cc: Jan Beulich

On 08/06/17 18:12, Wei Liu wrote:
> @@ -1081,8 +1081,8 @@ void do_int3(struct cpu_user_regs *regs)
>      pv_inject_hw_exception(TRAP_int3, X86_EVENT_NO_EC);
>  }
>  
> -static void reserved_bit_page_fault(
> -    unsigned long addr, struct cpu_user_regs *regs)
> +static void reserved_bit_page_fault(unsigned long addr,
> +                                    struct cpu_user_regs *regs)

Why are these prototypes changing?  For this case, it doesn't matter,
but the former is necessary if any of them gain more parameters.

~Andrew


* Re: [PATCH v4 23/27] x86: move the compat callback ops next to the non-compat variant
  2017-06-08 17:11 ` [PATCH v4 23/27] x86: move the compat callback ops next to the " Wei Liu
@ 2017-06-23 13:40   ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2017-06-23 13:40 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, Xen-devel

>>> On 08.06.17 at 19:11, <wei.liu2@citrix.com> wrote:
> +}
> +
> +
> +long compat_callback_op(int cmd, XEN_GUEST_HANDLE(void) arg)

Please avoid double blank lines like above. And preferably, please,
just like you did in earlier patches, use "curr" instead of "v". Then
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan



* Re: [PATCH v4 27/27] x86: clean up traps.c
  2017-06-23 12:50   ` Andrew Cooper
@ 2017-06-23 13:45     ` Jan Beulich
  0 siblings, 0 replies; 72+ messages in thread
From: Jan Beulich @ 2017-06-23 13:45 UTC (permalink / raw)
  To: Andrew Cooper, Wei Liu; +Cc: Xen-devel

>>> On 23.06.17 at 14:50, <andrew.cooper3@citrix.com> wrote:
> On 08/06/17 18:12, Wei Liu wrote:
>> @@ -1081,8 +1081,8 @@ void do_int3(struct cpu_user_regs *regs)
>>      pv_inject_hw_exception(TRAP_int3, X86_EVENT_NO_EC);
>>  }
>>  
>> -static void reserved_bit_page_fault(
>> -    unsigned long addr, struct cpu_user_regs *regs)
>> +static void reserved_bit_page_fault(unsigned long addr,
>> +                                    struct cpu_user_regs *regs)
> 
> Why are these prototypes changing?  For this case, it doesn't matter,
> but the former is necessary if any of them gain more parameters.

I think we should use a consistent format, and while originally there
may have been quite a few cases of what is being removed here,
over time we've been switching towards what is being put in
place here. I don't see why adding more parameters would be in
conflict with this. If not even a single parameter declaration fits
on a line, then imo that is a good indication that the function
name is too long.

Jan

* Re: [PATCH v4 16/27] x86/traps: factor out pv_trap_init
  2017-06-23 12:31   ` Andrew Cooper
@ 2017-06-23 13:55     ` Wei Liu
  0 siblings, 0 replies; 72+ messages in thread
From: Wei Liu @ 2017-06-23 13:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Jan Beulich

On Fri, Jun 23, 2017 at 01:31:22PM +0100, Andrew Cooper wrote:
> On 08/06/17 18:11, Wei Liu wrote:
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  xen/arch/x86/traps.c           | 22 ++++++++++++++--------
> >  xen/include/asm-x86/pv/traps.h |  4 ++++
> >  2 files changed, 18 insertions(+), 8 deletions(-)
> >
> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> > index 8861dfd332..29a83994bd 100644
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -1871,14 +1871,8 @@ void __init init_idt_traps(void)
> >      this_cpu(compat_gdt_table) = boot_cpu_compat_gdt_table;
> >  }
> >  
> > -extern void (*const autogen_entrypoints[NR_VECTORS])(void);
> > -void __init trap_init(void)
> > +void __init pv_trap_init(void)
> >  {
> > -    unsigned int vector;
> > -
> > -    /* Replace early pagefault with real pagefault handler. */
> > -    set_intr_gate(TRAP_page_fault, &page_fault);
> > -
> >      /* The 32-on-64 hypercall vector is only accessible from ring 1. */
> >      _set_gate(idt_table + HYPERCALL_VECTOR,
> >                SYS_DESC_trap_gate, 1, entry_int82);
> > @@ -1886,6 +1880,19 @@ void __init trap_init(void)
> >      /* Fast trap for int80 (faster than taking the #GP-fixup path). */
> >      _set_gate(idt_table + 0x80, SYS_DESC_trap_gate, 3, &int80_direct_trap);
> >  
> > +    open_softirq(NMI_MCE_SOFTIRQ, nmi_mce_softirq);
> > +}
> > +
> > +extern void (*const autogen_entrypoints[NR_VECTORS])(void);
> > +void __init trap_init(void)
> > +{
> > +    unsigned int vector;
> > +
> > +    pv_trap_init();
> 
> This call should be at the end of trap_init(), but I guess you hit the
> assertion?
> 

Yes, that would hit the assertion.

> The !CONFIG_PV case will similarly hit the assertion.
> 
> For !CONFIG_PV, 0x80 and 0x82 would best become general interrupt
> handlers, which means you need to tweak entry.S autogen_stubs, and also
> tweak init_irq_data()
> 

Good point. I will tweak both locations.


* Re: [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel
  2017-06-23 12:41   ` Andrew Cooper
@ 2017-06-23 13:56     ` Wei Liu
  2017-06-23 13:56       ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-23 13:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Jan Beulich

On Fri, Jun 23, 2017 at 01:41:59PM +0100, Andrew Cooper wrote:
> On 08/06/17 18:11, Wei Liu wrote:
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> 
> Same review suggestions as the previous patch.
> 

And then your Rb applies?

> ~Andrew


* Re: [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel
  2017-06-23 13:56     ` Wei Liu
@ 2017-06-23 13:56       ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 13:56 UTC (permalink / raw)
  To: Wei Liu; +Cc: Xen-devel, Jan Beulich

On 23/06/17 14:56, Wei Liu wrote:
> On Fri, Jun 23, 2017 at 01:41:59PM +0100, Andrew Cooper wrote:
>> On 08/06/17 18:11, Wei Liu wrote:
>>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>> Same review suggestions as the previous patch.
>>
> And then your Rb applies?

Yes.

~Andrew


* Re: [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c
  2017-06-23 11:00   ` Andrew Cooper
@ 2017-06-23 13:59     ` Wei Liu
  2017-06-23 13:59       ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-23 13:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Jan Beulich

On Fri, Jun 23, 2017 at 12:00:35PM +0100, Andrew Cooper wrote:
> On 08/06/17 18:11, Wei Liu wrote:
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> 
> I'd suggest folding this into the next patch, and putting the hypercall
> in misc-hypercalls.c
> 
> Despite its name, this hypercall is just setting up state in the vcpu.
> 

This is trivial to do. But if I do it, does your Rb in the next patch
apply to the new version?

> ~Andrew


* Re: [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c
  2017-06-23 13:59     ` Wei Liu
@ 2017-06-23 13:59       ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 13:59 UTC (permalink / raw)
  To: Wei Liu; +Cc: Xen-devel, Jan Beulich

On 23/06/17 14:59, Wei Liu wrote:
> On Fri, Jun 23, 2017 at 12:00:35PM +0100, Andrew Cooper wrote:
>> On 08/06/17 18:11, Wei Liu wrote:
>>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>> I'd suggest folding this into the next patch, and putting the hypercall
>> in misc-hypercalls.c
>>
>> Despite its name, this hypercall is just setting up state in the vcpu.
>>
> This is trivial to do. But if I do it, does your Rb in the next patch
> apply to the new version?

Yes.

~Andrew


* Re: [PATCH v4 14/27] x86: move do_iret to pv/iret.c
  2017-06-23 12:12   ` Andrew Cooper
@ 2017-06-23 14:17     ` Wei Liu
  2017-06-23 14:17       ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-23 14:17 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Jan Beulich

On Fri, Jun 23, 2017 at 01:12:15PM +0100, Andrew Cooper wrote:
> On 08/06/17 18:11, Wei Liu wrote:
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> > There is no copyright header in the original file. Use the default
> > one?
> 
> It should at least gain a basic GPLv2 header.
> 
> Otherwise, LGTM.
> 

I'm going to insert the following license from CONTRIBUTING:

/*
 * pv/iret.c
 *
 * iret hypercall handling code
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms and conditions of the GNU General Public
 * License, version 2, as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public
 * License along with this program; If not, see
 * <http://www.gnu.org/licenses/>.
 */


Can I have your ack or rb?


* Re: [PATCH v4 14/27] x86: move do_iret to pv/iret.c
  2017-06-23 14:17     ` Wei Liu
@ 2017-06-23 14:17       ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 14:17 UTC (permalink / raw)
  To: Wei Liu; +Cc: Xen-devel, Jan Beulich

On 23/06/17 15:17, Wei Liu wrote:
> On Fri, Jun 23, 2017 at 01:12:15PM +0100, Andrew Cooper wrote:
>> On 08/06/17 18:11, Wei Liu wrote:
>>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>>> ---
>>> There is no copyright header in the original file. Use the default
>>> one?
>> It should at least gain a basic GPLv2 header.
>>
>> Otherwise, LGTM.
>>
> I'm going to insert the following license from CONTRIBUTING:
>
> /*
>  * pv/iret.c
>  *
>  * iret hypercall handling code
>  *
>  * This program is free software; you can redistribute it and/or
>  * modify it under the terms and conditions of the GNU General Public
>  * License, version 2, as published by the Free Software Foundation.
>  *
>  * This program is distributed in the hope that it will be useful,
>  * but WITHOUT ANY WARRANTY; without even the implied warranty of
>  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>  * General Public License for more details.
>  *
>  * You should have received a copy of the GNU General Public
>  * License along with this program; If not, see
>  * <http://www.gnu.org/licenses/>.
>  */
>
>
> Can I have your ack or rb?

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>




* Re: [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c
  2017-06-23 12:41   ` Andrew Cooper
@ 2017-06-23 14:49     ` Wei Liu
  2017-06-23 14:53       ` Andrew Cooper
  0 siblings, 1 reply; 72+ messages in thread
From: Wei Liu @ 2017-06-23 14:49 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Xen-devel, Wei Liu, Jan Beulich

On Fri, Jun 23, 2017 at 01:41:29PM +0100, Andrew Cooper wrote:
> On 08/06/17 18:11, Wei Liu wrote:
> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  xen/arch/x86/pv/hypercall.c     | 36 ++++++++++++++++++++++++++++++++++++
> >  xen/arch/x86/x86_64/traps.c     | 36 ------------------------------------
> >  xen/include/asm-x86/hypercall.h |  1 +
> >  3 files changed, 37 insertions(+), 36 deletions(-)
> >
> > diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
> > index 7c5e5a629d..287340e774 100644
> > --- a/xen/arch/x86/pv/hypercall.c
> > +++ b/xen/arch/x86/pv/hypercall.c
> > @@ -255,6 +255,42 @@ enum mc_disposition arch_do_multicall_call(struct mc_state *state)
> >               ? mc_continue : mc_preempt;
> >  }
> >  
> > +void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
> > +{
> > +    char *p;
> > +    int i;
> 
> void *p = hypercall_page;
> unsigned int i;
> 
> > +
> > +    /* Fill in all the transfer points with template machine code. */
> > +    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
> > +    {
> 
> i++, p += 32

Wait, I think pointer arithmetic is not allowed on void* ?

> 
> Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c
  2017-06-23 14:49     ` Wei Liu
@ 2017-06-23 14:53       ` Andrew Cooper
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Cooper @ 2017-06-23 14:53 UTC (permalink / raw)
  To: Wei Liu; +Cc: Xen-devel, Jan Beulich

On 23/06/17 15:49, Wei Liu wrote:
> On Fri, Jun 23, 2017 at 01:41:29PM +0100, Andrew Cooper wrote:
>> On 08/06/17 18:11, Wei Liu wrote:
>>> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
>>> ---
>>>  xen/arch/x86/pv/hypercall.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>  xen/arch/x86/x86_64/traps.c     | 36 ------------------------------------
>>>  xen/include/asm-x86/hypercall.h |  1 +
>>>  3 files changed, 37 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
>>> index 7c5e5a629d..287340e774 100644
>>> --- a/xen/arch/x86/pv/hypercall.c
>>> +++ b/xen/arch/x86/pv/hypercall.c
>>> @@ -255,6 +255,42 @@ enum mc_disposition arch_do_multicall_call(struct mc_state *state)
>>>               ? mc_continue : mc_preempt;
>>>  }
>>>  
>>> +void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
>>> +{
>>> +    char *p;
>>> +    int i;
>> void *p = hypercall_page;
>> unsigned int i;
>>
>>> +
>>> +    /* Fill in all the transfer points with template machine code. */
>>> +    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
>>> +    {
>> i++, p += 32
> Wait, I think pointer arithmetic is not allowed on void* ?

void pointer arithmetic is a GCCism, which we use in plenty of other
locations.

~Andrew


end of thread, other threads:[~2017-06-23 14:53 UTC | newest]

Thread overview: 72+ messages
2017-06-08 17:11 [PATCH v4 00/27] x86: refactor trap handling code Wei Liu
2017-06-08 17:11 ` [PATCH v4 01/27] x86: factor out common PV emulation code Wei Liu
2017-06-20 16:00   ` Jan Beulich
2017-06-08 17:11 ` [PATCH v4 02/27] x86: move PV privileged instruction " Wei Liu
2017-06-20 16:03   ` Jan Beulich
2017-06-08 17:11 ` [PATCH v4 03/27] x86: move PV gate op " Wei Liu
2017-06-20 16:05   ` Jan Beulich
2017-06-08 17:11 ` [PATCH v4 04/27] x86: move PV invalid " Wei Liu
2017-06-20 16:21   ` Jan Beulich
2017-06-20 16:25     ` Wei Liu
2017-06-21  6:15       ` Jan Beulich
2017-06-21  8:57         ` Wei Liu
2017-06-21  9:09           ` Jan Beulich
2017-06-21  9:14             ` Wei Liu
2017-06-21  9:26               ` Jan Beulich
2017-06-21  9:29                 ` Wei Liu
2017-06-08 17:11 ` [PATCH v4 05/27] x86/traps: remove now unused inclusion of emulate.h Wei Liu
2017-06-20 16:21   ` Jan Beulich
2017-06-08 17:11 ` [PATCH v4 06/27] x86: clean up PV emulation code Wei Liu
2017-06-23 10:56   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 07/27] x86: move do_set_trap_table to pv/traps.c Wei Liu
2017-06-23 11:00   ` Andrew Cooper
2017-06-23 13:59     ` Wei Liu
2017-06-23 13:59       ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 08/27] x86: move some misc PV hypercalls to misc-hypercalls.c Wei Liu
2017-06-23 11:02   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 09/27] x86/traps: move pv_inject_event to pv/traps.c Wei Liu
2017-06-23 11:04   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 10/27] x86/traps: move set_guest_{machine, nmi}_trapbounce Wei Liu
2017-06-23 11:05   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 11/27] x86:/traps: move {un, }register_guest_nmi_callback Wei Liu
2017-06-23 11:38   ` Andrew Cooper
2017-06-23 12:19     ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 12/27] x86/traps: move guest_has_trap_callback to pv/traps.c Wei Liu
2017-06-23 12:01   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 13/27] x86: move toggle_guest_mode to pv/domain.c Wei Liu
2017-06-23 12:10   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 14/27] x86: move do_iret to pv/iret.c Wei Liu
2017-06-23 12:12   ` Andrew Cooper
2017-06-23 14:17     ` Wei Liu
2017-06-23 14:17       ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 15/27] x86: move callback_op code to pv/callback.c Wei Liu
2017-06-08 17:11 ` [PATCH v4 16/27] x86/traps: factor out pv_trap_init Wei Liu
2017-06-23 12:31   ` Andrew Cooper
2017-06-23 13:55     ` Wei Liu
2017-06-08 17:11 ` [PATCH v4 17/27] x86/traps: move some PV specific functions and struct to pv/traps.c Wei Liu
2017-06-23 12:36   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 18/27] x86/traps: move init_int80_direct_trap " Wei Liu
2017-06-23 12:37   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 19/27] x86: move hypercall_page_initialise_ring3_kernel to pv/hypercall.c Wei Liu
2017-06-23 12:41   ` Andrew Cooper
2017-06-23 14:49     ` Wei Liu
2017-06-23 14:53       ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 20/27] x86: move hypercall_page_initialise_ring1_kernel Wei Liu
2017-06-23 12:41   ` Andrew Cooper
2017-06-23 13:56     ` Wei Liu
2017-06-23 13:56       ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 21/27] x86: move compat_set_trap_table along side the non-compat variant Wei Liu
2017-06-23 12:43   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 22/27] x86: move compat_iret along side its " Wei Liu
2017-06-23 12:44   ` Andrew Cooper
2017-06-08 17:11 ` [PATCH v4 23/27] x86: move the compat callback ops next to the " Wei Liu
2017-06-23 13:40   ` Jan Beulich
2017-06-08 17:12 ` [PATCH v4 24/27] x86: move compat_show_guest_statck near its " Wei Liu
2017-06-23 12:47   ` Andrew Cooper
2017-06-08 17:12 ` [PATCH v4 25/27] x86: remove the now empty x86_64/compat/traps.c Wei Liu
2017-06-23 12:47   ` Andrew Cooper
2017-06-08 17:12 ` [PATCH v4 26/27] x86: fix coding a style issue in asm-x86/traps.h Wei Liu
2017-06-23 12:48   ` Andrew Cooper
2017-06-08 17:12 ` [PATCH v4 27/27] x86: clean up traps.c Wei Liu
2017-06-23 12:50   ` Andrew Cooper
2017-06-23 13:45     ` Jan Beulich
