* [PATCH v3 0/9] x86: indirect call overhead reduction
@ 2018-09-11 13:26 Jan Beulich
  2018-09-11 13:32 ` [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
                   ` (12 more replies)
  0 siblings, 13 replies; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:26 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

While indirect calls have always been more expensive than direct ones,
their cost has further increased with the Spectre v2 mitigations. In a
number of cases we simply pointlessly use them in the first place. In
many other cases the indirection solely exists to abstract from e.g.
vendor specific hardware details, and hence the pointers used never
change once set. Here we can use alternatives patching to get rid of
the indirection.
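
To illustrate, here's a minimal sketch (not taken from any of the
patches; the hook structure and names are made up) of what such a
conversion looks like at a use site, using the alternative_vcall()
machinery patch 1 introduces:

struct vendor_ops {
    void (*flush_cache)(void);   /* set once during CPU setup */
};
extern struct vendor_ops vendor_ops;

void flush(void)
{
    /* Before: an indirect call, expensive with retpolines in place. */
    vendor_ops.flush_cache();
}

void flush_patched(void)
{
    /*
     * After: still compiled as an indirect call, but rewritten by
     * apply_alternatives() into a direct "call <target>" once the
     * pointer's value is known.
     */
    alternative_vcall(vendor_ops.flush_cache);
}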

From patch 2 onwards there are dependencies on earlier, yet to be
reviewed patches ("x86/alternatives: fully leverage automatic NOP
filling" as well as the "x86: improve PDX <-> PFN and alike translations"
series at the very least). I nevertheless wanted to enable a first round
of review of the series, especially since some of the patches (not just
the initial ones) could perhaps be taken irrespective of those
dependencies. The first two of the three genapic patches, otoh, are
entirely independent and could go in right away afaict, if they were
ack-ed.

Further areas where indirect calls could be eliminated (and that I've put
on my todo list in case the general concept here is deemed reasonable)
are IOMMU, vPMU, and XSM. For some of these the ARM side would need
dealing with as well - I'm not sure whether replacing indirect calls by
direct ones is worthwhile there; if not, the wrappers would simply need
to become plain function invocations in the ARM case (as was already
asked for in the IOMMU case).
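
For the IOMMU case such a wrapper might - purely as a sketch, with the
macro name and shape made up here - look like:

#ifdef CONFIG_X86
/* x86: patchable call site, converted to a direct call at boot. */
# define iommu_vcall(ops, fn, args...) \
    alternative_vcall((ops)->fn, ## args)
#else
/* ARM: plain function invocation through the unchanged hook. */
# define iommu_vcall(ops, fn, args...) (ops)->fn(args)
#endif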

1: x86: infrastructure to allow converting certain indirect calls to direct ones
2: x86/HVM: patch indirect calls through hvm_funcs to direct ones
3: x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
4: x86: patch ctxt_switch_masking() indirect call to direct one
5: x86/genapic: remove indirection from genapic hook accesses
6: x86/genapic: patch indirect calls to direct ones
7: x86/cpuidle: patch some indirect calls to direct ones
8: cpufreq: convert to a single post-init driver (hooks) instance
9: cpufreq: patch target() indirect call to direct one

Besides some re-basing only patch 1 has really changed from v2.

Jan




* [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
@ 2018-09-11 13:32 ` Jan Beulich
  2018-09-21 10:49   ` Wei Liu
  2018-09-11 13:32 ` [PATCH v3 2/9] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

In a number of cases the targets of indirect calls get determined once
at boot time. In such cases we can replace those calls with direct ones
via our alternative instruction patching mechanism.

Some of the targets (in particular the hvm_funcs ones) get established
only in pre-SMP initcalls, making a second pass through the alternative
patching code necessary. Therefore that code needs some adjustments
beyond just recognizing the new special pattern.

Note that patching such sites more than once is not supported (and the
supplied macros also don't provide any means to do so).
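
For reference, the core of the conversion logic amounts to the following
standalone sketch (simplified from the _apply_alternatives() hunk below;
the deferral/UD2 handling and all error paths are omitted, and the
function name is made up):

#include <stdint.h>
#include <stdbool.h>
#include <string.h>

/*
 * orig points at a 6-byte RIP-relative indirect CALL/JMP (ff /2 or
 * ff /4); buf receives the 5-byte direct CALL/JMP replacing it.
 */
static bool convert_branch(const uint8_t *orig, uint8_t buf[5])
{
    /* Displacement of the memory operand, relative to the next insn. */
    int32_t disp = *(const int32_t *)(orig + 2);
    /* The function pointer itself lives at orig + 6 + disp. */
    const uint8_t *dest = *(const uint8_t *const *)(orig + 6 + disp);
    intptr_t rel;
    int32_t rel32;

    if ( !dest )
        return false;                /* target not established yet */

    buf[0] = orig[1] == 0x15 ? 0xe8  /* indirect CALL -> direct CALL */
                             : 0xe9; /* indirect JMP  -> direct JMP  */
    rel = dest - (orig + 5);         /* rel32 counts from the next insn */
    if ( rel != (int32_t)rel )
        return false;                /* target out of +/-2Gb reach */
    rel32 = rel;
    memcpy(buf + 1, &rel32, sizeof(rel32));
    return true;
}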

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Use "X" constraint instead of "g" in alternative_callN(). Pre-
    calculate values to be put into local register variables.
v2: Introduce and use count_va_arg(). Don't omit middle operand from
    ?: in ALT_CALL_ARG(). Re-base.

--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -178,8 +178,9 @@ text_poke(void *addr, const void *opcode
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
  */
-void init_or_livepatch apply_alternatives(struct alt_instr *start,
-                                          struct alt_instr *end)
+static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
+                                                  struct alt_instr *end,
+                                                  bool force)
 {
     struct alt_instr *a, *base;
 
@@ -218,6 +219,13 @@ void init_or_livepatch apply_alternative
         if ( ALT_ORIG_PTR(base) != orig )
             base = a;
 
+        /* Skip patch sites already handled during the first pass. */
+        if ( a->priv )
+        {
+            ASSERT(force);
+            continue;
+        }
+
         /* If there is no replacement to make, see about optimising the nops. */
         if ( !boot_cpu_has(a->cpuid) )
         {
@@ -225,7 +233,7 @@ void init_or_livepatch apply_alternative
             if ( base->priv )
                 continue;
 
-            base->priv = 1;
+            a->priv = 1;
 
             /* Nothing useful to do? */
             if ( toolchain_nops_are_ideal || a->pad_len <= 1 )
@@ -236,20 +244,74 @@ void init_or_livepatch apply_alternative
             continue;
         }
 
-        base->priv = 1;
-
         memcpy(buf, repl, a->repl_len);
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
-            *(int32_t *)(buf + 1) += repl - orig;
+        {
+            /*
+             * Detect the special case of indirect-to-direct branch patching:
+             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
+             *   checked above),
+             * - replacement's displacement is -5 (pointing back at the very
+             *   insn, which makes no sense in a real replacement insn),
+             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
+             *   using RIP-relative addressing.
+             * Some function targets may not be available when we come here
+             * the first time. Defer patching of those until the post-presmp-
+             * initcalls re-invocation. If at that point the target pointer is
+             * still NULL, insert "UD2; UD0" (for ease of recognition) instead
+             * of CALL/JMP.
+             */
+            if ( a->cpuid == X86_FEATURE_ALWAYS &&
+                 *(int32_t *)(buf + 1) == -5 &&
+                 a->orig_len >= 6 &&
+                 orig[0] == 0xff &&
+                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
+            {
+                long disp = *(int32_t *)(orig + 2);
+                const uint8_t *dest = *(void **)(orig + 6 + disp);
+
+                if ( dest )
+                {
+                    disp = dest - (orig + 5);
+                    ASSERT(disp == (int32_t)disp);
+                    *(int32_t *)(buf + 1) = disp;
+                }
+                else if ( force )
+                {
+                    buf[0] = 0x0f;
+                    buf[1] = 0x0b;
+                    buf[2] = 0x0f;
+                    buf[3] = 0xff;
+                    buf[4] = 0xff;
+                }
+                else
+                    continue;
+            }
+            else if ( force && system_state < SYS_STATE_active )
+                ASSERT_UNREACHABLE();
+            else
+                *(int32_t *)(buf + 1) += repl - orig;
+        }
+        else if ( force && system_state < SYS_STATE_active  )
+            ASSERT_UNREACHABLE();
+
+        a->priv = 1;
 
         add_nops(buf + a->repl_len, total_len - a->repl_len);
         text_poke(orig, buf, total_len);
     }
 }
 
-static bool __initdata alt_done;
+void init_or_livepatch apply_alternatives(struct alt_instr *start,
+                                          struct alt_instr *end)
+{
+    _apply_alternatives(start, end, true);
+}
+
+static unsigned int __initdata alt_todo;
+static unsigned int __initdata alt_done;
 
 /*
  * At boot time, we patch alternatives in NMI context.  This means that the
@@ -264,7 +326,7 @@ static int __init nmi_apply_alternatives
      * More than one NMI may occur between the two set_nmi_callback() below.
      * We only need to apply alternatives once.
      */
-    if ( !alt_done )
+    if ( !(alt_done & alt_todo) )
     {
         unsigned long cr0;
 
@@ -273,11 +335,12 @@ static int __init nmi_apply_alternatives
         /* Disable WP to allow patching read-only pages. */
         write_cr0(cr0 & ~X86_CR0_WP);
 
-        apply_alternatives(__alt_instructions, __alt_instructions_end);
+        _apply_alternatives(__alt_instructions, __alt_instructions_end,
+                            alt_done);
 
         write_cr0(cr0);
 
-        alt_done = true;
+        alt_done |= alt_todo;
     }
 
     return 1;
@@ -287,13 +350,11 @@ static int __init nmi_apply_alternatives
  * This routine is called with local interrupt disabled and used during
  * bootup.
  */
-void __init alternative_instructions(void)
+static void __init _alternative_instructions(bool force)
 {
     unsigned int i;
     nmi_callback_t *saved_nmi_callback;
 
-    arch_init_ideal_nops();
-
     /*
      * Don't stop machine check exceptions while patching.
      * MCEs only happen when something got corrupted and in this
@@ -306,6 +367,10 @@ void __init alternative_instructions(voi
      */
     ASSERT(!local_irq_is_enabled());
 
+    /* Set what operation to perform /before/ setting the callback. */
+    alt_todo = 1u << force;
+    barrier();
+
     /*
      * As soon as the callback is set up, the next NMI will trigger patching,
      * even an NMI ahead of our explicit self-NMI.
@@ -321,11 +386,24 @@ void __init alternative_instructions(voi
      * cover the (hopefully never) async case, poll alt_done for up to one
      * second.
      */
-    for ( i = 0; !ACCESS_ONCE(alt_done) && i < 1000; ++i )
+    for ( i = 0; !(ACCESS_ONCE(alt_done) & alt_todo) && i < 1000; ++i )
         mdelay(1);
 
-    if ( !ACCESS_ONCE(alt_done) )
+    if ( !(ACCESS_ONCE(alt_done) & alt_todo) )
         panic("Timed out waiting for alternatives self-NMI to hit\n");
 
     set_nmi_callback(saved_nmi_callback);
 }
+
+void __init alternative_instructions(void)
+{
+    arch_init_ideal_nops();
+    _alternative_instructions(false);
+}
+
+void __init alternative_branches(void)
+{
+    local_irq_disable();
+    _alternative_instructions(true);
+    local_irq_enable();
+}
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1637,6 +1637,8 @@ void __init noreturn __start_xen(unsigne
 
     do_presmp_initcalls();
 
+    alternative_branches();
+
     /*
      * NB: when running as a PV shim VCPUOP_up/down is wired to the shim
      * physical cpu_add/remove functions, so launch the guest with only
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -4,8 +4,8 @@
 #ifdef __ASSEMBLY__
 #include <asm/alternative-asm.h>
 #else
+#include <xen/lib.h>
 #include <xen/stringify.h>
-#include <xen/types.h>
 #include <asm/asm-macros.h>
 
 struct __packed alt_instr {
@@ -26,6 +26,7 @@ extern void add_nops(void *insns, unsign
 /* Similar to alternative_instructions except it can be run with IRQs enabled. */
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
+extern void alternative_branches(void);
 
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
@@ -149,6 +150,233 @@ extern void alternative_instructions(voi
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
+/*
+ * Machinery to allow converting indirect to direct calls, when the called
+ * function is determined once at boot and later never changed.
+ */
+
+#define ALT_CALL_arg1 "rdi"
+#define ALT_CALL_arg2 "rsi"
+#define ALT_CALL_arg3 "rdx"
+#define ALT_CALL_arg4 "rcx"
+#define ALT_CALL_arg5 "r8"
+#define ALT_CALL_arg6 "r9"
+
+#define ALT_CALL_ARG(arg, n) \
+    register typeof((arg) ? (arg) : 0) a ## n ## _ \
+    asm ( ALT_CALL_arg ## n ) = (arg)
+#define ALT_CALL_NO_ARG(n) \
+    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n )
+
+#define ALT_CALL_NO_ARG6 ALT_CALL_NO_ARG(6)
+#define ALT_CALL_NO_ARG5 ALT_CALL_NO_ARG(5); ALT_CALL_NO_ARG6
+#define ALT_CALL_NO_ARG4 ALT_CALL_NO_ARG(4); ALT_CALL_NO_ARG5
+#define ALT_CALL_NO_ARG3 ALT_CALL_NO_ARG(3); ALT_CALL_NO_ARG4
+#define ALT_CALL_NO_ARG2 ALT_CALL_NO_ARG(2); ALT_CALL_NO_ARG3
+#define ALT_CALL_NO_ARG1 ALT_CALL_NO_ARG(1); ALT_CALL_NO_ARG2
+
+/*
+ * Unfortunately ALT_CALL_NO_ARG() above can't use a fake initializer (to
+ * suppress "uninitialized variable" warnings), as various versions of gcc
+ * older than 8.1 fall on the nose in various ways with that (always because
+ * of some other construct elsewhere in the same function needing to use the
+ * same hard register). Otherwise the asm() below could uniformly use "+r"
+ * output constraints, making unnecessary all these ALT_CALL<n>_OUT macros.
+ */
+#define ALT_CALL0_OUT "=r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL1_OUT "+r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL2_OUT "+r" (a1_), "+r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL3_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL4_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL5_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "=r" (a6_)
+#define ALT_CALL6_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "+r" (a6_)
+
+#define alternative_callN(n, rettype, func) ({                     \
+    rettype ret_;                                                  \
+    register unsigned long r10_ asm("r10");                        \
+    register unsigned long r11_ asm("r11");                        \
+    asm volatile (__stringify(ALTERNATIVE "call *%c[addr](%%rip)", \
+                                          "call .",                \
+                                          X86_FEATURE_ALWAYS)      \
+                  : ALT_CALL ## n ## _OUT, "=a" (ret_),            \
+                    "=r" (r10_), "=r" (r11_)                       \
+                  : [addr] "i" (&(func)), "g" (func)               \
+                  : "memory" );                                    \
+    ret_;                                                          \
+})
+
+#define alternative_vcall0(func) ({             \
+    ALT_CALL_NO_ARG1;                           \
+    ((void)alternative_callN(0, int, func));    \
+})
+
+#define alternative_call0(func) ({              \
+    ALT_CALL_NO_ARG1;                           \
+    alternative_callN(0, typeof(func()), func); \
+})
+
+#define alternative_vcall1(func, arg) ({           \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    (void)sizeof(func(arg));                       \
+    (void)alternative_callN(1, int, func);         \
+})
+
+#define alternative_call1(func, arg) ({            \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    alternative_callN(1, typeof(func(arg)), func); \
+})
+
+#define alternative_vcall2(func, arg1, arg2) ({           \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    (void)sizeof(func(arg1, arg2));                       \
+    (void)alternative_callN(2, int, func);                \
+})
+
+#define alternative_call2(func, arg1, arg2) ({            \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    alternative_callN(2, typeof(func(arg1, arg2)), func); \
+})
+
+#define alternative_vcall3(func, arg1, arg2, arg3) ({    \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    (void)sizeof(func(arg1, arg2, arg3));                \
+    (void)alternative_callN(3, int, func);               \
+})
+
+#define alternative_call3(func, arg1, arg2, arg3) ({     \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    alternative_callN(3, typeof(func(arg1, arg2, arg3)), \
+                      func);                             \
+})
+
+#define alternative_vcall4(func, arg1, arg2, arg3, arg4) ({ \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    (void)sizeof(func(arg1, arg2, arg3, arg4));             \
+    (void)alternative_callN(4, int, func);                  \
+})
+
+#define alternative_call4(func, arg1, arg2, arg3, arg4) ({  \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    alternative_callN(4, typeof(func(arg1, arg2,            \
+                                     arg3, arg4)),          \
+                      func);                                \
+})
+
+#define alternative_vcall5(func, arg1, arg2, arg3, arg4, arg5) ({ \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5));             \
+    (void)alternative_callN(5, int, func);                        \
+})
+
+#define alternative_call5(func, arg1, arg2, arg3, arg4, arg5) ({  \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    alternative_callN(5, typeof(func(arg1, arg2, arg3,            \
+                                     arg4, arg5)),                \
+                      func);                                      \
+})
+
+#define alternative_vcall6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5, arg6));             \
+    (void)alternative_callN(6, int, func);                              \
+})
+
+#define alternative_call6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({  \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    alternative_callN(6, typeof(func(arg1, arg2, arg3,                  \
+                                     arg4, arg5, arg6)),                \
+                      func);                                             \
+})
+
+#define alternative_vcall__(nr) alternative_vcall ## nr
+#define alternative_call__(nr)  alternative_call ## nr
+
+#define alternative_vcall_(nr) alternative_vcall__(nr)
+#define alternative_call_(nr)  alternative_call__(nr)
+
+#define alternative_vcall(func, args...) \
+    alternative_vcall_(count_va_arg(args))(func, ## args)
+
+#define alternative_call(func, args...) \
+    alternative_call_(count_va_arg(args))(func, ## args)
+
 #endif /*  !__ASSEMBLY__  */
 
 #endif /* __X86_ALTERNATIVE_H__ */
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -66,6 +66,10 @@
 
 #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
 
+#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
+#define count_va_arg(args...) \
+    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+
 struct domain;
 
 void cmdline_parse(const char *cmdline);
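
As an aside, count_va_arg() may deserve spelling out. The arguments
supplied shift the fixed 8...0 list to the right, so whichever number
lands in the x slot of count_va_arg_() is the argument count (manual
expansion, for illustration only):

count_va_arg(a, b, c)
    /* -> count_va_arg_(., a, b, c, 8, 7, 6, 5, 4, 3, 2, 1, 0) -> 3 */
count_va_arg()
    /* -> count_va_arg_(., 8, 7, 6, 5, 4, 3, 2, 1, 0) -> 0
     * (", ##" swallows the comma when no arguments are passed) */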





* [PATCH v3 2/9] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
  2018-09-11 13:32 ` [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2018-09-11 13:32 ` Jan Beulich
  2018-09-21 10:50   ` Wei Liu
  2018-09-11 13:33 ` [PATCH v3 3/9] x86/HVM: patch vINTR " Jan Beulich
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu

This intentionally doesn't touch hooks used rarely (or not at all)
during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
nor the nested, VM event, and altp2m ones (they can all be done later,
if so desired). Virtual interrupt delivery ones will be dealt with in a
subsequent patch.
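
The rule followed throughout is simple: hooks whose return value gets
consumed go through alternative_call(), while void invocations use
alternative_vcall(), e.g. (taken from the hunks below):

    alternative_vcall(hvm_funcs.wbinvd_intercept);

    if ( !alternative_call(hvm_funcs.get_guest_pat, v, guest_pat) )
        *guest_pat = v->arch.hvm.pat_cr;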

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Re-base.
v2: Drop open-coded numbers from macro invocations. Re-base.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2104,7 +2104,7 @@ static int hvmemul_write_msr(
 static int hvmemul_wbinvd(
     struct x86_emulate_ctxt *ctxt)
 {
-    hvm_funcs.wbinvd_intercept();
+    alternative_vcall(hvm_funcs.wbinvd_intercept);
     return X86EMUL_OKAY;
 }
 
@@ -2122,7 +2122,7 @@ static int hvmemul_get_fpu(
     struct vcpu *curr = current;
 
     if ( !curr->fpu_dirtied )
-        hvm_funcs.fpu_dirty_intercept();
+        alternative_vcall(hvm_funcs.fpu_dirty_intercept);
     else if ( type == X86EMUL_FPU_fpu )
     {
         const typeof(curr->arch.xsave_area->fpu_sse) *fpu_ctxt =
@@ -2239,7 +2239,7 @@ static void hvmemul_put_fpu(
         {
             curr->fpu_dirtied = false;
             stts();
-            hvm_funcs.fpu_leave(curr);
+            alternative_vcall(hvm_funcs.fpu_leave, curr);
         }
     }
 }
@@ -2401,7 +2401,8 @@ static int _hvm_emulate_one(struct hvm_e
     if ( hvmemul_ctxt->intr_shadow != new_intr_shadow )
     {
         hvmemul_ctxt->intr_shadow = new_intr_shadow;
-        hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
+        alternative_vcall(hvm_funcs.set_interrupt_shadow,
+                          curr, new_intr_shadow);
     }
 
     if ( hvmemul_ctxt->ctxt.retire.hlt &&
@@ -2538,7 +2539,8 @@ void hvm_emulate_init_once(
 
     memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
 
-    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
+    hvmemul_ctxt->intr_shadow =
+        alternative_call(hvm_funcs.get_interrupt_shadow, curr);
     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -272,12 +272,12 @@ void hvm_set_rdtsc_exiting(struct domain
     struct vcpu *v;
 
     for_each_vcpu ( d, v )
-        hvm_funcs.set_rdtsc_exiting(v, enable);
+        alternative_vcall(hvm_funcs.set_rdtsc_exiting, v, enable);
 }
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
 {
-    if ( !hvm_funcs.get_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.get_guest_pat, v, guest_pat) )
         *guest_pat = v->arch.hvm.pat_cr;
 }
 
@@ -302,7 +302,7 @@ int hvm_set_guest_pat(struct vcpu *v, u6
             return 0;
         }
 
-    if ( !hvm_funcs.set_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.set_guest_pat, v, guest_pat) )
         v->arch.hvm.pat_cr = guest_pat;
 
     return 1;
@@ -342,7 +342,7 @@ bool hvm_set_guest_bndcfgs(struct vcpu *
             /* nothing, best effort only */;
     }
 
-    return hvm_funcs.set_guest_bndcfgs(v, val);
+    return alternative_call(hvm_funcs.set_guest_bndcfgs, v, val);
 }
 
 /*
@@ -500,7 +500,8 @@ void hvm_migrate_pirqs(struct vcpu *v)
 static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
 {
     info->cr2 = v->arch.hvm.guest_cr[2];
-    return hvm_funcs.get_pending_event(v, info);
+
+    return alternative_call(hvm_funcs.get_pending_event, v, info);
 }
 
 void hvm_do_resume(struct vcpu *v)
@@ -1674,7 +1675,7 @@ void hvm_inject_event(const struct x86_e
         }
     }
 
-    hvm_funcs.inject_event(event);
+    alternative_vcall(hvm_funcs.inject_event, event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -2261,7 +2262,7 @@ int hvm_set_cr0(unsigned long value, boo
          (!rangeset_is_empty(d->iomem_caps) ||
           !rangeset_is_empty(d->arch.ioport_caps) ||
           has_arch_pdevs(d)) )
-        hvm_funcs.handle_cd(v, value);
+        alternative_vcall(hvm_funcs.handle_cd, v, value);
 
     hvm_update_cr(v, 0, value);
 
@@ -3500,7 +3501,8 @@ int hvm_msr_read_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_read_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_read_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3660,7 +3662,8 @@ int hvm_msr_write_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_write_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_write_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3852,7 +3855,7 @@ void hvm_hypercall_page_initialise(struc
                                    void *hypercall_page)
 {
     hvm_latch_shinfo_size(d);
-    hvm_funcs.init_hypercall_page(d, hypercall_page);
+    alternative_vcall(hvm_funcs.init_hypercall_page, d, hypercall_page);
 }
 
 void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
@@ -4987,7 +4990,7 @@ void hvm_domain_soft_reset(struct domain
 void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
                               struct segment_register *reg)
 {
-    hvm_funcs.get_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.get_segment_register, v, seg, reg);
 
     switch ( seg )
     {
@@ -5133,7 +5136,7 @@ void hvm_set_segment_register(struct vcp
         return;
     }
 
-    hvm_funcs.set_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.set_segment_register, v, seg, reg);
 }
 
 /*
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -380,42 +380,42 @@ static inline int
 hvm_guest_x86_mode(struct vcpu *v)
 {
     ASSERT(v == current);
-    return hvm_funcs.guest_x86_mode(v);
+    return alternative_call(hvm_funcs.guest_x86_mode, v);
 }
 
 static inline void
 hvm_update_host_cr3(struct vcpu *v)
 {
     if ( hvm_funcs.update_host_cr3 )
-        hvm_funcs.update_host_cr3(v);
+        alternative_vcall(hvm_funcs.update_host_cr3, v);
 }
 
 static inline void hvm_update_guest_cr(struct vcpu *v, unsigned int cr)
 {
-    hvm_funcs.update_guest_cr(v, cr, 0);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, cr, 0);
 }
 
 static inline void hvm_update_guest_cr3(struct vcpu *v, bool noflush)
 {
     unsigned int flags = noflush ? HVM_UPDATE_GUEST_CR3_NOFLUSH : 0;
 
-    hvm_funcs.update_guest_cr(v, 3, flags);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, 3, flags);
 }
 
 static inline void hvm_update_guest_efer(struct vcpu *v)
 {
-    hvm_funcs.update_guest_efer(v);
+    alternative_vcall(hvm_funcs.update_guest_efer, v);
 }
 
 static inline void hvm_cpuid_policy_changed(struct vcpu *v)
 {
-    hvm_funcs.cpuid_policy_changed(v);
+    alternative_vcall(hvm_funcs.cpuid_policy_changed, v);
 }
 
 static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset,
                                       uint64_t at_tsc)
 {
-    hvm_funcs.set_tsc_offset(v, offset, at_tsc);
+    alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);
 }
 
 /*
@@ -432,18 +432,18 @@ static inline void hvm_flush_guest_tlbs(
 static inline unsigned int
 hvm_get_cpl(struct vcpu *v)
 {
-    return hvm_funcs.get_cpl(v);
+    return alternative_call(hvm_funcs.get_cpl, v);
 }
 
 static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 {
-    return hvm_funcs.get_shadow_gs_base(v);
+    return alternative_call(hvm_funcs.get_shadow_gs_base, v);
 }
 
 static inline bool hvm_get_guest_bndcfgs(struct vcpu *v, u64 *val)
 {
     return hvm_funcs.get_guest_bndcfgs &&
-           hvm_funcs.get_guest_bndcfgs(v, val);
+           alternative_call(hvm_funcs.get_guest_bndcfgs, v, val);
 }
 
 #define has_hvm_params(d) \
@@ -500,12 +500,12 @@ static inline void hvm_inject_page_fault
 
 static inline int hvm_event_pending(struct vcpu *v)
 {
-    return hvm_funcs.event_pending(v);
+    return alternative_call(hvm_funcs.event_pending, v);
 }
 
 static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
 {
-    hvm_funcs.invlpg(v, linear);
+    alternative_vcall(hvm_funcs.invlpg, v, linear);
 }
 
 /* These bits in CR4 are owned by the host. */
@@ -530,13 +530,14 @@ static inline void hvm_cpu_down(void)
 
 static inline unsigned int hvm_get_insn_bytes(struct vcpu *v, uint8_t *buf)
 {
-    return (hvm_funcs.get_insn_bytes ? hvm_funcs.get_insn_bytes(v, buf) : 0);
+    return (hvm_funcs.get_insn_bytes
+            ? alternative_call(hvm_funcs.get_insn_bytes, v, buf) : 0);
 }
 
 static inline void hvm_set_info_guest(struct vcpu *v)
 {
     if ( hvm_funcs.set_info_guest )
-        return hvm_funcs.set_info_guest(v);
+        alternative_vcall(hvm_funcs.set_info_guest, v);
 }
 
 static inline void hvm_invalidate_regs_fields(struct cpu_user_regs *regs)





* [PATCH v3 3/9] x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
  2018-09-11 13:32 ` [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
  2018-09-11 13:32 ` [PATCH v3 2/9] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
@ 2018-09-11 13:33 ` Jan Beulich
  2018-09-21 10:50   ` Wei Liu
  2018-09-11 13:34 ` [PATCH v3 4/9] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Kevin Tian, Wei Liu, Jun Nakajima

While not strictly necessary, change the VMX initialization logic to
update the function table in start_vmx() from NULL rather than to NULL,
to make it more obvious that we won't ever change an already (explicitly)
initialized function pointer.
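
Condensed from the hunks below, the resulting pattern is (sketch only;
just one of the hooks is shown):

static struct hvm_function_table __initdata vmx_function_table = {
    /* .sync_pir_to_irr (and the other vINTR hooks) left NULL here. */
};

const struct hvm_function_table * __init start_vmx(void)
{
    if ( cpu_has_vmx_posted_intr_processing )
        vmx_function_table.sync_pir_to_irr = vmx_sync_pir_to_irr;
    /*
     * Hooks still NULL when alternative_branches() runs stay NULL, and
     * their (guarded) call sites then simply don't invoke anything.
     */
    return &vmx_function_table;
}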

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -111,10 +111,15 @@ static void vlapic_clear_irr(int vector,
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_IRR]);
 }
 
-static int vlapic_find_highest_irr(struct vlapic *vlapic)
+static void sync_pir_to_irr(struct vcpu *v)
 {
     if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(vlapic_vcpu(vlapic));
+        alternative_vcall(hvm_funcs.sync_pir_to_irr, v);
+}
+
+static int vlapic_find_highest_irr(struct vlapic *vlapic)
+{
+    sync_pir_to_irr(vlapic_vcpu(vlapic));
 
     return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
 }
@@ -143,7 +148,7 @@ bool vlapic_test_irq(const struct vlapic
         return false;
 
     if ( hvm_funcs.test_pir &&
-         hvm_funcs.test_pir(const_vlapic_vcpu(vlapic), vec) )
+         alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) )
         return true;
 
     return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
@@ -165,10 +170,10 @@ void vlapic_set_irq(struct vlapic *vlapi
         vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
 
     if ( hvm_funcs.update_eoi_exit_bitmap )
-        hvm_funcs.update_eoi_exit_bitmap(target, vec, trig);
+        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
 
     if ( hvm_funcs.deliver_posted_intr )
-        hvm_funcs.deliver_posted_intr(target, vec);
+        alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
     else if ( !vlapic_test_and_set_irr(vec, vlapic) )
         vcpu_kick(target);
 }
@@ -448,7 +453,7 @@ void vlapic_EOI_set(struct vlapic *vlapi
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]);
 
     if ( hvm_funcs.handle_eoi )
-        hvm_funcs.handle_eoi(vector);
+        alternative_vcall(hvm_funcs.handle_eoi, vector);
 
     vlapic_handle_EOI(vlapic, vector);
 
@@ -1429,8 +1434,7 @@ static int lapic_save_regs(struct domain
 
     for_each_vcpu ( d, v )
     {
-        if ( hvm_funcs.sync_pir_to_irr )
-            hvm_funcs.sync_pir_to_irr(v);
+        sync_pir_to_irr(v);
 
         s = vcpu_vlapic(v);
         if ( (rc = hvm_save_entry(LAPIC_REGS, v->vcpu_id, h, s->regs)) != 0 )
@@ -1531,7 +1535,8 @@ static int lapic_load_regs(struct domain
         lapic_load_fixup(s);
 
     if ( hvm_funcs.process_isr )
-        hvm_funcs.process_isr(vlapic_find_highest_isr(s), v);
+        alternative_vcall(hvm_funcs.process_isr,
+                           vlapic_find_highest_isr(s), v);
 
     vlapic_adjust_i8259_target(d);
     lapic_rearm(s);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2336,12 +2336,6 @@ static struct hvm_function_table __initd
     .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
-    .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
-    .process_isr          = vmx_process_isr,
-    .deliver_posted_intr  = vmx_deliver_posted_intr,
-    .sync_pir_to_irr      = vmx_sync_pir_to_irr,
-    .test_pir             = vmx_test_pir,
-    .handle_eoi           = vmx_handle_eoi,
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .enable_msr_interception = vmx_enable_msr_interception,
     .is_singlestep_supported = vmx_is_singlestep_supported,
@@ -2469,26 +2463,23 @@ const struct hvm_function_table * __init
         setup_ept_dump();
     }
 
-    if ( !cpu_has_vmx_virtual_intr_delivery )
+    if ( cpu_has_vmx_virtual_intr_delivery )
     {
-        vmx_function_table.update_eoi_exit_bitmap = NULL;
-        vmx_function_table.process_isr = NULL;
-        vmx_function_table.handle_eoi = NULL;
-    }
-    else
+        vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
+        vmx_function_table.process_isr = vmx_process_isr;
+        vmx_function_table.handle_eoi = vmx_handle_eoi;
         vmx_function_table.virtual_intr_delivery_enabled = true;
+    }
 
     if ( cpu_has_vmx_posted_intr_processing )
     {
         alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
         if ( iommu_intpost )
             alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
-    }
-    else
-    {
-        vmx_function_table.deliver_posted_intr = NULL;
-        vmx_function_table.sync_pir_to_irr = NULL;
-        vmx_function_table.test_pir = NULL;
+
+        vmx_function_table.deliver_posted_intr = vmx_deliver_posted_intr;
+        vmx_function_table.sync_pir_to_irr     = vmx_sync_pir_to_irr;
+        vmx_function_table.test_pir            = vmx_test_pir;
     }
 
     if ( cpu_has_vmx_tsc_scaling )





* [PATCH v3 4/9] x86: patch ctxt_switch_masking() indirect call to direct one
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (2 preceding siblings ...)
  2018-09-11 13:33 ` [PATCH v3 3/9] x86/HVM: patch vINTR " Jan Beulich
@ 2018-09-11 13:34 ` Jan Beulich
  2018-09-21 10:51   ` Wei Liu
  2018-09-11 13:35 ` [PATCH v3 5/9] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:34 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Drop open-coded number from macro invocation.

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -184,7 +184,7 @@ void ctxt_switch_levelling(const struct
 	}
 
 	if (ctxt_switch_masking)
-		ctxt_switch_masking(next);
+		alternative_vcall(ctxt_switch_masking, next);
 }
 
 bool_t opt_cpu_info;




* [PATCH v3 5/9] x86/genapic: remove indirection from genapic hook accesses
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (3 preceding siblings ...)
  2018-09-11 13:34 ` [PATCH v3 4/9] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
@ 2018-09-11 13:35 ` Jan Beulich
  2018-09-21 10:53   ` Wei Liu
  2018-09-11 13:35 ` [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones Jan Beulich
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

Instead of loading a pointer at each use site, have a single runtime
instance of struct genapic, copying into it from the individual
instances. This way the individual instances can also be moved to .init
(apic_probe[] gets adjusted at this occasion as well).
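
The essence of the change, as a sketch (the helper name is made up; see
the hunks below for the real thing):

struct genapic __read_mostly genapic;      /* single runtime instance */

static void __init apic_select(const struct genapic *chosen)
{
    genapic = *chosen;  /* structure copy; templates can live in .init */
}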

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -943,8 +943,8 @@ void __init x2apic_bsp_setup(void)
 
     force_iommu = 1;
 
-    genapic = apic_x2apic_probe();
-    printk("Switched to APIC driver %s.\n", genapic->name);
+    genapic = *apic_x2apic_probe();
+    printk("Switched to APIC driver %s.\n", genapic.name);
 
     if ( !x2apic_enabled )
     {
--- a/xen/arch/x86/genapic/bigsmp.c
+++ b/xen/arch/x86/genapic/bigsmp.c
@@ -42,7 +42,7 @@ static __init int probe_bigsmp(void)
 	return def_to_bigsmp;
 } 
 
-const struct genapic apic_bigsmp = {
+const struct genapic __initconstrel apic_bigsmp = {
 	APIC_INIT("bigsmp", probe_bigsmp),
 	GENAPIC_PHYS
 };
--- a/xen/arch/x86/genapic/default.c
+++ b/xen/arch/x86/genapic/default.c
@@ -20,7 +20,7 @@ static __init int probe_default(void)
 	return 1;
 } 
 
-const struct genapic apic_default = {
+const struct genapic __initconstrel apic_default = {
 	APIC_INIT("default", probe_default),
 	GENAPIC_FLAT
 };
--- a/xen/arch/x86/genapic/probe.c
+++ b/xen/arch/x86/genapic/probe.c
@@ -15,11 +15,9 @@
 #include <asm/mach-generic/mach_apic.h>
 #include <asm/setup.h>
 
-extern const struct genapic apic_bigsmp;
+struct genapic __read_mostly genapic;
 
-const struct genapic *__read_mostly genapic;
-
-const struct genapic *apic_probe[] __initdata = {
+const struct genapic *const __initconstrel apic_probe[] = {
 	&apic_bigsmp, 
 	&apic_default,	/* must be last */
 	NULL,
@@ -36,11 +34,11 @@ void __init generic_bigsmp_probe(void)
 	 * - we find more than 8 CPUs in acpi LAPIC listing with xAPIC support
 	 */
 
-	if (!cmdline_apic && genapic == &apic_default)
+	if (!cmdline_apic && genapic.name == apic_default.name)
 		if (apic_bigsmp.probe()) {
-			genapic = &apic_bigsmp;
+			genapic = apic_bigsmp;
 			printk(KERN_INFO "Overriding APIC driver with %s\n",
-			       genapic->name);
+			       genapic.name);
 		}
 }
 
@@ -50,7 +48,7 @@ static int __init genapic_apic_force(con
 
 	for (i = 0; apic_probe[i]; i++)
 		if (!strcmp(apic_probe[i]->name, str)) {
-			genapic = apic_probe[i];
+			genapic = *apic_probe[i];
 			rc = 0;
 		}
 
@@ -66,18 +64,18 @@ void __init generic_apic_probe(void)
 	record_boot_APIC_mode();
 
 	check_x2apic_preenabled();
-	cmdline_apic = changed = (genapic != NULL);
+	cmdline_apic = changed = !!genapic.name;
 
 	for (i = 0; !changed && apic_probe[i]; i++) { 
 		if (apic_probe[i]->probe()) {
 			changed = 1;
-			genapic = apic_probe[i];
+			genapic = *apic_probe[i];
 		} 
 	}
 	if (!changed) 
-		genapic = &apic_default;
+		genapic = apic_default;
 
-	printk(KERN_INFO "Using APIC driver %s\n", genapic->name);
+	printk(KERN_INFO "Using APIC driver %s\n", genapic.name);
 } 
 
 /* These functions can switch the APIC even after the initial ->probe() */
@@ -88,9 +86,9 @@ int __init mps_oem_check(struct mp_confi
 	for (i = 0; apic_probe[i]; ++i) { 
 		if (apic_probe[i]->mps_oem_check(mpc,oem,productid)) { 
 			if (!cmdline_apic) {
-				genapic = apic_probe[i];
+				genapic = *apic_probe[i];
 				printk(KERN_INFO "Switched to APIC driver `%s'.\n", 
-				       genapic->name);
+				       genapic.name);
 			}
 			return 1;
 		} 
@@ -104,9 +102,9 @@ int __init acpi_madt_oem_check(char *oem
 	for (i = 0; apic_probe[i]; ++i) { 
 		if (apic_probe[i]->acpi_madt_oem_check(oem_id, oem_table_id)) { 
 			if (!cmdline_apic) {
-				genapic = apic_probe[i];
+				genapic = *apic_probe[i];
 				printk(KERN_INFO "Switched to APIC driver `%s'.\n", 
-				       genapic->name);
+				       genapic.name);
 			}
 			return 1;
 		} 
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -163,7 +163,7 @@ static void send_IPI_mask_x2apic_cluster
     local_irq_restore(flags);
 }
 
-static const struct genapic apic_x2apic_phys = {
+static const struct genapic __initconstrel apic_x2apic_phys = {
     APIC_INIT("x2apic_phys", NULL),
     .int_delivery_mode = dest_Fixed,
     .int_dest_mode = 0 /* physical delivery */,
@@ -175,7 +175,7 @@ static const struct genapic apic_x2apic_
     .send_IPI_self = send_IPI_self_x2apic
 };
 
-static const struct genapic apic_x2apic_cluster = {
+static const struct genapic __initconstrel apic_x2apic_cluster = {
     APIC_INIT("x2apic_cluster", NULL),
     .int_delivery_mode = dest_LowestPrio,
     .int_dest_mode = 1 /* logical delivery */,
@@ -259,6 +259,6 @@ void __init check_x2apic_preenabled(void
     {
         printk("x2APIC mode is already enabled by BIOS.\n");
         x2apic_enabled = 1;
-        genapic = apic_x2apic_probe();
+        genapic = *apic_x2apic_probe();
     }
 }
--- a/xen/arch/x86/mpparse.c
+++ b/xen/arch/x86/mpparse.c
@@ -162,7 +162,8 @@ static int MP_processor_info_x(struct mp
 		return -ENOSPC;
 	}
 
-	if (num_processors >= 8 && hotplug && genapic == &apic_default) {
+	if (num_processors >= 8 && hotplug
+	    && genapic.name == apic_default.name) {
 		printk(KERN_WARNING "WARNING: CPUs limit of 8 reached."
 			" Processor ignored.\n");
 		return -ENOSPC;
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -29,12 +29,12 @@
 
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
-    genapic->send_IPI_mask(mask, vector);
+    genapic.send_IPI_mask(mask, vector);
 }
 
 void send_IPI_self(int vector)
 {
-    genapic->send_IPI_self(vector);
+    genapic.send_IPI_self(vector);
 }
 
 /*
--- a/xen/include/asm-x86/genapic.h
+++ b/xen/include/asm-x86/genapic.h
@@ -47,8 +47,9 @@ struct genapic {
 	APICFUNC(mps_oem_check), \
 	APICFUNC(acpi_madt_oem_check)
 
-extern const struct genapic *genapic;
+extern struct genapic genapic;
 extern const struct genapic apic_default;
+extern const struct genapic apic_bigsmp;
 
 void send_IPI_self_legacy(uint8_t vector);
 
--- a/xen/include/asm-x86/mach-generic/mach_apic.h
+++ b/xen/include/asm-x86/mach-generic/mach_apic.h
@@ -10,13 +10,13 @@
 #define esr_disable (0)
 
 /* The following are dependent on APIC delivery mode (logical vs. physical). */
-#define INT_DELIVERY_MODE (genapic->int_delivery_mode)
-#define INT_DEST_MODE (genapic->int_dest_mode)
+#define INT_DELIVERY_MODE (genapic.int_delivery_mode)
+#define INT_DEST_MODE (genapic.int_dest_mode)
 #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
-#define init_apic_ldr (genapic->init_apic_ldr)
-#define clustered_apic_check (genapic->clustered_apic_check) 
-#define cpu_mask_to_apicid (genapic->cpu_mask_to_apicid)
-#define vector_allocation_cpumask(cpu) (genapic->vector_allocation_cpumask(cpu))
+#define init_apic_ldr (genapic.init_apic_ldr)
+#define clustered_apic_check (genapic.clustered_apic_check)
+#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
+#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
 
 static inline void enable_apic_mode(void)
 {





* [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (4 preceding siblings ...)
  2018-09-11 13:35 ` [PATCH v3 5/9] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
@ 2018-09-11 13:35 ` Jan Beulich
  2018-09-21 11:03   ` Wei Liu
  2018-09-21 13:55   ` Wei Liu
  2018-09-11 13:35 ` [PATCH v3 7/9] x86/cpuidle: patch some " Jan Beulich
                   ` (6 subsequent siblings)
  12 siblings, 2 replies; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

For (I hope) obvious reasons only the ones used at runtime get
converted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -29,12 +29,12 @@
 
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
-    genapic.send_IPI_mask(mask, vector);
+    alternative_vcall(genapic.send_IPI_mask, mask, vector);
 }
 
 void send_IPI_self(int vector)
 {
-    genapic.send_IPI_self(vector);
+    alternative_vcall(genapic.send_IPI_self, vector);
 }
 
 /*
--- a/xen/include/asm-x86/mach-generic/mach_apic.h
+++ b/xen/include/asm-x86/mach-generic/mach_apic.h
@@ -15,8 +15,18 @@
 #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
 #define init_apic_ldr (genapic.init_apic_ldr)
 #define clustered_apic_check (genapic.clustered_apic_check)
-#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
-#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
+#define cpu_mask_to_apicid(mask) ({ \
+	/* \
+	 * There are a number of places where the address of a local variable \
+	 * gets passed here. The use of ?: in alternative_call<N>() triggers an \
+	 * "address of ... is always true" warning in such a case with at least \
+	 * gcc 7 and 8. Hence the seemingly pointless local variable here. \
+	 */ \
+	const cpumask_t *m_ = (mask); \
+	alternative_call(genapic.cpu_mask_to_apicid, m_); \
+})
+#define vector_allocation_cpumask(cpu) \
+	alternative_call(genapic.vector_allocation_cpumask, cpu)
 
 static inline void enable_apic_mode(void)
 {






* [PATCH v3 7/9] x86/cpuidle: patch some indirect calls to direct ones
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (5 preceding siblings ...)
  2018-09-11 13:35 ` [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones Jan Beulich
@ 2018-09-11 13:35 ` Jan Beulich
  2018-09-21 14:01   ` Wei Liu
  2018-09-11 13:37 ` [PATCH v3 8/9] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:35 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

For now only the hooks used while entering/exiting idle states are
converted. Additionally pm_idle{,_save} and lapic_timer_{on,off} can't
be converted, as they may get established rather late (when Dom0 is
already active).

Note that for patching to be deferred until after the pre-SMP initcalls
(from where cpuidle_init_cpu() runs the first time) the pointers need to
start out as NULL.
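
I.e. the arrangement is (condensed sketch of the cpuidle_init_cpu()
hunk below):

uint64_t (*__read_mostly cpuidle_get_tick)(void);   /* NULL at boot */

int cpuidle_init_cpu(unsigned int cpu)
{
    /*
     * First executed from a pre-SMP initcall - too late for the boot
     * time patching pass (which hence defers these call sites), but in
     * time for the alternative_branches() pass.
     */
    if ( !cpuidle_get_tick )
        cpuidle_get_tick = boot_cpu_has(X86_FEATURE_NONSTOP_TSC)
                           ? get_stime_tick : get_acpi_pm_tick;
    return 0;
}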

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -102,8 +102,6 @@ bool lapic_timer_init(void)
     return true;
 }
 
-static uint64_t (*__read_mostly tick_to_ns)(uint64_t) = acpi_pm_tick_to_ns;
-
 void (*__read_mostly pm_idle_save)(void);
 unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER - 1;
 integer_param("max_cstate", max_cstate);
@@ -289,9 +287,9 @@ static uint64_t acpi_pm_ticks_elapsed(ui
         return ((0xFFFFFFFF - t1) + t2 +1);
 }
 
-uint64_t (*__read_mostly cpuidle_get_tick)(void) = get_acpi_pm_tick;
-static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t)
-    = acpi_pm_ticks_elapsed;
+uint64_t (*__read_mostly cpuidle_get_tick)(void);
+static uint64_t (*__read_mostly tick_to_ns)(uint64_t);
+static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t);
 
 static void print_acpi_power(uint32_t cpu, struct acpi_processor_power *power)
 {
@@ -547,7 +545,7 @@ void update_idle_stats(struct acpi_proce
                        struct acpi_processor_cx *cx,
                        uint64_t before, uint64_t after)
 {
-    int64_t sleep_ticks = ticks_elapsed(before, after);
+    int64_t sleep_ticks = alternative_call(ticks_elapsed, before, after);
     /* Interrupts are disabled */
 
     spin_lock(&power->stat_lock);
@@ -555,7 +553,8 @@ void update_idle_stats(struct acpi_proce
     cx->usage++;
     if ( sleep_ticks > 0 )
     {
-        power->last_residency = tick_to_ns(sleep_ticks) / 1000UL;
+        power->last_residency = alternative_call(tick_to_ns, sleep_ticks) /
+                                1000UL;
         cx->time += sleep_ticks;
     }
     power->last_state = &power->states[0];
@@ -635,7 +634,7 @@ static void acpi_processor_idle(void)
         if ( cx->type == ACPI_STATE_C1 || local_apic_timer_c2_ok )
         {
             /* Get start time (ticks) */
-            t1 = cpuidle_get_tick();
+            t1 = alternative_call(cpuidle_get_tick);
             /* Trace cpu idle entry */
             TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -644,7 +643,7 @@ static void acpi_processor_idle(void)
             /* Invoke C2 */
             acpi_idle_do_entry(cx);
             /* Get end time (ticks) */
-            t2 = cpuidle_get_tick();
+            t2 = alternative_call(cpuidle_get_tick);
             trace_exit_reason(irq_traced);
             /* Trace cpu idle exit */
             TRACE_6D(TRC_PM_IDLE_EXIT, cx->idx, t2,
@@ -666,7 +665,7 @@ static void acpi_processor_idle(void)
         lapic_timer_off();
 
         /* Get start time (ticks) */
-        t1 = cpuidle_get_tick();
+        t1 = alternative_call(cpuidle_get_tick);
         /* Trace cpu idle entry */
         TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -717,7 +716,7 @@ static void acpi_processor_idle(void)
         }
 
         /* Get end time (ticks) */
-        t2 = cpuidle_get_tick();
+        t2 = alternative_call(cpuidle_get_tick);
 
         /* recovering TSC */
         cstate_restore_tsc();
@@ -827,11 +826,20 @@ int cpuidle_init_cpu(unsigned int cpu)
     {
         unsigned int i;
 
-        if ( cpu == 0 && boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+        if ( cpu == 0 && system_state < SYS_STATE_active )
         {
-            cpuidle_get_tick = get_stime_tick;
-            ticks_elapsed = stime_ticks_elapsed;
-            tick_to_ns = stime_tick_to_ns;
+            if ( boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+            {
+                cpuidle_get_tick = get_stime_tick;
+                ticks_elapsed = stime_ticks_elapsed;
+                tick_to_ns = stime_tick_to_ns;
+            }
+            else
+            {
+                cpuidle_get_tick = get_acpi_pm_tick;
+                ticks_elapsed = acpi_pm_ticks_elapsed;
+                tick_to_ns = acpi_pm_tick_to_ns;
+            }
         }
 
         acpi_power = xzalloc(struct acpi_processor_power);
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -778,7 +778,7 @@ static void mwait_idle(void)
 	if (!(lapic_timer_reliable_states & (1 << cstate)))
 		lapic_timer_off();
 
-	before = cpuidle_get_tick();
+	before = alternative_call(cpuidle_get_tick);
 	TRACE_4D(TRC_PM_IDLE_ENTRY, cx->type, before, exp, pred);
 
 	update_last_cx_stat(power, cx, before);
@@ -786,7 +786,7 @@ static void mwait_idle(void)
 	if (cpu_is_haltable(cpu))
 		mwait_idle_with_hints(eax, MWAIT_ECX_INTERRUPT_BREAK);
 
-	after = cpuidle_get_tick();
+	after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();
 	trace_exit_reason(irq_traced);
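
For readers new to the series, a minimal sketch of the call-site
pattern used above (hook and function names are invented for
illustration; the real macro comes from patch 1):

    /* A hook established once at boot, then invoked via patched sites. */
    static uint64_t (*__read_mostly my_tick_hook)(void);

    static void example_idle_path(void)
    {
        /*
         * Initially this emits an indirect CALL through my_tick_hook;
         * the alternatives (re-)patching pass rewrites the site into a
         * direct CALL once the pointer has been established.
         */
        uint64_t t = alternative_call(my_tick_hook);
        (void)t;
    }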





* [PATCH v3 8/9] cpufreq: convert to a single post-init driver (hooks) instance
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (6 preceding siblings ...)
  2018-09-11 13:35 ` [PATCH v3 7/9] x86/cpuidle: patch some " Jan Beulich
@ 2018-09-11 13:37 ` Jan Beulich
  2018-09-21 14:06   ` Wei Liu
  2018-09-11 13:37 ` [PATCH v3 9/9] cpufreq: patch target() indirect call to direct one Jan Beulich
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

This reduces the post-init memory footprint, eliminates a pointless
level of indirection at the use sites, and allows for subsequent
alternatives call patching.

Take the opportunity and also add a name to the PowerNow! instance.
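
In before/after terms the conversion amounts to this sketch (not the
literal code, which follows below):

    /* Before: a pointer to the registered driver; every use chases it. */
    struct cpufreq_driver *cpufreq_driver;
    ret = cpufreq_driver->init(policy);  /* load pointer, then indirect call */

    /* After: a single writable instance, filled once at registration. */
    struct cpufreq_driver cpufreq_driver;
    cpufreq_driver = *driver_data;       /* done in cpufreq_register_driver() */
    ret = cpufreq_driver.init(policy);   /* hook fetched straight from the object */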

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -53,8 +53,6 @@ enum {
 
 struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
 
-static struct cpufreq_driver acpi_cpufreq_driver;
-
 static bool __read_mostly acpi_pstate_strict;
 boolean_param("acpi_pstate_strict", acpi_pstate_strict);
 
@@ -355,7 +353,7 @@ static void feature_detect(void *info)
     if ( cpu_has_aperfmperf )
     {
         policy->aperf_mperf = 1;
-        acpi_cpufreq_driver.getavg = get_measured_perf;
+        cpufreq_driver.getavg = get_measured_perf;
     }
 
     eax = cpuid_eax(6);
@@ -593,7 +591,7 @@ acpi_cpufreq_cpu_init(struct cpufreq_pol
         policy->cur = acpi_cpufreq_guess_freq(data, policy->cpu);
         break;
     case ACPI_ADR_SPACE_FIXED_HARDWARE:
-        acpi_cpufreq_driver.get = get_cur_freq_on_cpu;
+        cpufreq_driver.get = get_cur_freq_on_cpu;
         policy->cur = get_cur_freq_on_cpu(cpu);
         break;
     default:
@@ -635,7 +633,7 @@ static int acpi_cpufreq_cpu_exit(struct
     return 0;
 }
 
-static struct cpufreq_driver acpi_cpufreq_driver = {
+static const struct cpufreq_driver __initconstrel acpi_cpufreq_driver = {
     .name   = "acpi-cpufreq",
     .verify = acpi_cpufreq_verify,
     .target = acpi_cpufreq_target,
@@ -656,7 +654,7 @@ static int __init cpufreq_driver_init(vo
 
     return ret;
 }
-__initcall(cpufreq_driver_init);
+presmp_initcall(cpufreq_driver_init);
 
 int cpufreq_cpu_init(unsigned int cpuid)
 {
--- a/xen/arch/x86/acpi/cpufreq/powernow.c
+++ b/xen/arch/x86/acpi/cpufreq/powernow.c
@@ -52,8 +52,6 @@
 
 #define ARCH_CPU_FLAG_RESUME	1
 
-static struct cpufreq_driver powernow_cpufreq_driver;
-
 static void transition_pstate(void *pstate)
 {
     wrmsrl(MSR_PSTATE_CTRL, *(unsigned int *)pstate);
@@ -215,7 +213,7 @@ static void feature_detect(void *info)
     if ( cpu_has_aperfmperf )
     {
         policy->aperf_mperf = 1;
-        powernow_cpufreq_driver.getavg = get_measured_perf;
+        cpufreq_driver.getavg = get_measured_perf;
     }
 
     edx = cpuid_edx(CPUID_FREQ_VOLT_CAPABILITIES);
@@ -347,7 +345,8 @@ static int powernow_cpufreq_cpu_exit(str
     return 0;
 }
 
-static struct cpufreq_driver powernow_cpufreq_driver = {
+static const struct cpufreq_driver __initconstrel powernow_cpufreq_driver = {
+    .name   = "powernow",
     .verify = powernow_cpufreq_verify,
     .target = powernow_cpufreq_target,
     .init   = powernow_cpufreq_cpu_init,
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -64,7 +64,7 @@ int do_get_pm_info(struct xen_sysctl_get
     case PMSTAT_PX:
         if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
             return -ENODEV;
-        if ( !cpufreq_driver )
+        if ( !cpufreq_driver.init )
             return -ENODEV;
         if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
             return -EINVAL;
@@ -255,16 +255,16 @@ static int get_cpufreq_para(struct xen_s
         return ret;
 
     op->u.get_para.cpuinfo_cur_freq =
-        cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
+        cpufreq_driver.get ? cpufreq_driver.get(op->cpuid) : policy->cur;
     op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
     op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
     op->u.get_para.scaling_cur_freq = policy->cur;
     op->u.get_para.scaling_max_freq = policy->max;
     op->u.get_para.scaling_min_freq = policy->min;
 
-    if ( cpufreq_driver->name[0] )
+    if ( cpufreq_driver.name[0] )
         strlcpy(op->u.get_para.scaling_driver, 
-            cpufreq_driver->name, CPUFREQ_NAME_LEN);
+            cpufreq_driver.name, CPUFREQ_NAME_LEN);
     else
         strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
 
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -172,7 +172,7 @@ int cpufreq_add_cpu(unsigned int cpu)
     if ( !(perf->init & XEN_PX_INIT) )
         return -EINVAL;
 
-    if (!cpufreq_driver)
+    if (!cpufreq_driver.init)
         return 0;
 
     if (per_cpu(cpufreq_cpu_policy, cpu))
@@ -239,7 +239,7 @@ int cpufreq_add_cpu(unsigned int cpu)
         policy->cpu = cpu;
         per_cpu(cpufreq_cpu_policy, cpu) = policy;
 
-        ret = cpufreq_driver->init(policy);
+        ret = cpufreq_driver.init(policy);
         if (ret) {
             free_cpumask_var(policy->cpus);
             xfree(policy);
@@ -298,7 +298,7 @@ err1:
     cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
     if (cpumask_empty(policy->cpus)) {
-        cpufreq_driver->exit(policy);
+        cpufreq_driver.exit(policy);
         free_cpumask_var(policy->cpus);
         xfree(policy);
     }
@@ -362,7 +362,7 @@ int cpufreq_del_cpu(unsigned int cpu)
     cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
     if (cpumask_empty(policy->cpus)) {
-        cpufreq_driver->exit(policy);
+        cpufreq_driver.exit(policy);
         free_cpumask_var(policy->cpus);
         xfree(policy);
     }
@@ -663,17 +663,17 @@ static int __init cpufreq_presmp_init(vo
 }
 presmp_initcall(cpufreq_presmp_init);
 
-int __init cpufreq_register_driver(struct cpufreq_driver *driver_data)
+int __init cpufreq_register_driver(const struct cpufreq_driver *driver_data)
 {
    if ( !driver_data || !driver_data->init ||
         !driver_data->verify || !driver_data->exit ||
         (!driver_data->target == !driver_data->setpolicy) )
         return -EINVAL;
 
-    if ( cpufreq_driver )
+    if ( cpufreq_driver.init )
         return -EBUSY;
 
-    cpufreq_driver = driver_data;
+    cpufreq_driver = *driver_data;
 
     return 0;
 }
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -31,7 +31,7 @@
 #include <acpi/cpufreq/cpufreq.h>
 #include <public/sysctl.h>
 
-struct cpufreq_driver   *cpufreq_driver;
+struct cpufreq_driver __read_mostly cpufreq_driver;
 struct processor_pminfo *__read_mostly processor_pminfo[NR_CPUS];
 DEFINE_PER_CPU_READ_MOSTLY(struct cpufreq_policy *, cpufreq_cpu_policy);
 
@@ -360,11 +360,11 @@ int __cpufreq_driver_target(struct cpufr
 {
     int retval = -EINVAL;
 
-    if (cpu_online(policy->cpu) && cpufreq_driver->target)
+    if (cpu_online(policy->cpu) && cpufreq_driver.target)
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver->target(policy, target_freq, relation);
+        retval = cpufreq_driver.target(policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }
@@ -380,9 +380,9 @@ int cpufreq_driver_getavg(unsigned int c
     if (!cpu_online(cpu) || !(policy = per_cpu(cpufreq_cpu_policy, cpu)))
         return 0;
 
-    if (cpufreq_driver->getavg)
+    if (cpufreq_driver.getavg)
     {
-        freq_avg = cpufreq_driver->getavg(cpu, flag);
+        freq_avg = cpufreq_driver.getavg(cpu, flag);
         if (freq_avg > 0)
             return freq_avg;
     }
@@ -412,9 +412,9 @@ int cpufreq_update_turbo(int cpuid, int
         return 0;
 
     policy->turbo = new_state;
-    if (cpufreq_driver->update)
+    if (cpufreq_driver.update)
     {
-        ret = cpufreq_driver->update(cpuid, policy);
+        ret = cpufreq_driver.update(cpuid, policy);
         if (ret)
             policy->turbo = curr_state;
     }
@@ -450,15 +450,15 @@ int __cpufreq_set_policy(struct cpufreq_
         return -EINVAL;
 
     /* verify the cpu speed can be set within this limit */
-    ret = cpufreq_driver->verify(policy);
+    ret = cpufreq_driver.verify(policy);
     if (ret)
         return ret;
 
     data->min = policy->min;
     data->max = policy->max;
     data->limits = policy->limits;
-    if (cpufreq_driver->setpolicy)
-        return cpufreq_driver->setpolicy(data);
+    if (cpufreq_driver.setpolicy)
+        return cpufreq_driver.setpolicy(data);
 
     if (policy->governor != data->governor) {
         /* save old, working values */
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -153,7 +153,7 @@ __cpufreq_governor(struct cpufreq_policy
 #define CPUFREQ_RELATION_H 1  /* highest frequency below or at target */
 
 struct cpufreq_driver {
-    char   name[CPUFREQ_NAME_LEN];
+    const char *name;
     int    (*init)(struct cpufreq_policy *policy);
     int    (*verify)(struct cpufreq_policy *policy);
     int    (*setpolicy)(struct cpufreq_policy *policy);
@@ -166,9 +166,9 @@ struct cpufreq_driver {
     int    (*exit)(struct cpufreq_policy *policy);
 };
 
-extern struct cpufreq_driver *cpufreq_driver;
+extern struct cpufreq_driver cpufreq_driver;
 
-int cpufreq_register_driver(struct cpufreq_driver *);
+int cpufreq_register_driver(const struct cpufreq_driver *);
 
 static __inline__
 void cpufreq_verify_within_limits(struct cpufreq_policy *policy,





* [PATCH v3 9/9] cpufreq: patch target() indirect call to direct one
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (7 preceding siblings ...)
  2018-09-11 13:37 ` [PATCH v3 8/9] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
@ 2018-09-11 13:37 ` Jan Beulich
  2018-09-21 14:06   ` Wei Liu
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-11 13:37 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

This looks to be the only frequently executed hook; don't bother
patching any other ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -364,7 +364,8 @@ int __cpufreq_driver_target(struct cpufr
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver.target(policy, target_freq, relation);
+        retval = alternative_call(cpufreq_driver.target,
+                                  policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }




* Re: [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-09-11 13:32 ` [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2018-09-21 10:49   ` Wei Liu
  2018-09-21 11:47     ` Jan Beulich
  0 siblings, 1 reply; 119+ messages in thread
From: Wei Liu @ 2018-09-21 10:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On Tue, Sep 11, 2018 at 07:32:04AM -0600, Jan Beulich wrote:
> In a number of cases the targets of indirect calls get determined once
> at boot time. In such cases we can replace those calls with direct ones
> via our alternative instruction patching mechanism.
> 
> Some of the targets (in particular the hvm_funcs ones) get established
> only in pre-SMP initcalls, making necessary a second pass through the
> alternative patching code. Therefore some adjustments beyond the
> recognition of the new special pattern are necessary there.
> 
> Note that patching such sites more than once is not supported (and the
> supplied macros also don't provide any means to do so).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v3: Use "X" constraint instead of "g" in alternative_callN(). Pre-
>     calculate values to be put into local register variables.
> v2: Introduce and use count_va_arg(). Don't omit middle operand from
>     ?: in ALT_CALL_ARG(). Re-base.
> 
> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -178,8 +178,9 @@ text_poke(void *addr, const void *opcode
>   * APs have less capabilities than the boot processor are not handled.
>   * Tough. Make sure you disable such features by hand.
>   */
> -void init_or_livepatch apply_alternatives(struct alt_instr *start,
> -                                          struct alt_instr *end)
> +static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
> +                                                  struct alt_instr *end,
> +                                                  bool force)
>  {
>      struct alt_instr *a, *base;
>  
> @@ -218,6 +219,13 @@ void init_or_livepatch apply_alternative

I think you need to fix the comment before this if statement. At the
very least you're now using two ->priv fields to make decisions on patching.

Also I wonder why you keep base, since ...

>          if ( ALT_ORIG_PTR(base) != orig )
>              base = a;
>  
> +        /* Skip patch sites already handled during the first pass. */
> +        if ( a->priv )
> +        {
> +            ASSERT(force);
> +            continue;
> +        }
> +
>          /* If there is no replacement to make, see about optimising the nops. */
>          if ( !boot_cpu_has(a->cpuid) )
>          {
> @@ -225,7 +233,7 @@ void init_or_livepatch apply_alternative
>              if ( base->priv )
>                  continue;

... base is guaranteed to be a at this point; furthermore, there is
already a check to skip patching added in this patch.

>  
> -            base->priv = 1;
> +            a->priv = 1;
>  
>              /* Nothing useful to do? */
>              if ( toolchain_nops_are_ideal || a->pad_len <= 1 )
> @@ -236,20 +244,74 @@ void init_or_livepatch apply_alternative
>              continue;
>          }
>  
> -        base->priv = 1;
> -
>          memcpy(buf, repl, a->repl_len);
>  
>          /* 0xe8/0xe9 are relative branches; fix the offset. */
>          if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
> -            *(int32_t *)(buf + 1) += repl - orig;
> +        {
> +            /*
> +             * Detect the special case of indirect-to-direct branch patching:
> +             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
> +             *   checked above),
> +             * - replacement's displacement is -5 (pointing back at the very
> +             *   insn, which makes no sense in a real replacement insn),
> +             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
> +             *   using RIP-relative addressing.
> +             * Some function targets may not be available when we come here
> +             * the first time. Defer patching of those until the post-presmp-
> +             * initcalls re-invocation. If at that point the target pointer is
> +             * still NULL, insert "UD2; UD0" (for ease of recognition) instead
> +             * of CALL/JMP.
> +             */
> +            if ( a->cpuid == X86_FEATURE_ALWAYS &&
> +                 *(int32_t *)(buf + 1) == -5 &&
> +                 a->orig_len >= 6 &&
> +                 orig[0] == 0xff &&
> +                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
> +            {
> +                long disp = *(int32_t *)(orig + 2);
> +                const uint8_t *dest = *(void **)(orig + 6 + disp);
> +
> +                if ( dest )
> +                {
> +                    disp = dest - (orig + 5);
> +                    ASSERT(disp == (int32_t)disp);
> +                    *(int32_t *)(buf + 1) = disp;
> +                }
> +                else if ( force )
> +                {
> +                    buf[0] = 0x0f;
> +                    buf[1] = 0x0b;
> +                    buf[2] = 0x0f;
> +                    buf[3] = 0xff;
> +                    buf[4] = 0xff;

I think these are opcodes for "UD2; UD0". Please add a comment for them.
Having to go through the SDM to figure out what they are isn't nice.
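
Something along these lines would already help (byte decode per the
SDM; just a suggestion):

    buf[0] = 0x0f;    /* UD2: 0F 0B */
    buf[1] = 0x0b;
    buf[2] = 0x0f;    /* UD0: 0F FF /r, with 0xFF as the ModRM byte */
    buf[3] = 0xff;
    buf[4] = 0xff;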

At this point I also think the name "force" is not very good. What/who
is forced here? Why not use a more descriptive name like "post_init" or
"system_active"?

The rest of the patch looks fine to me.

Wei.


* Re: [PATCH v3 2/9] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-09-11 13:32 ` [PATCH v3 2/9] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
@ 2018-09-21 10:50   ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 10:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Andrew Cooper

On Tue, Sep 11, 2018 at 07:32:55AM -0600, Jan Beulich wrote:
> This is intentionally not touching hooks used rarely (or not at all)
> during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
> as well as nested, VM event, and altp2m ones (they can all be done
> later, if so desired). Virtual Interrupt delivery ones will be dealt
> with in a subsequent patch.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v3 3/9] x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
  2018-09-11 13:33 ` [PATCH v3 3/9] x86/HVM: patch vINTR " Jan Beulich
@ 2018-09-21 10:50   ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 10:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Kevin Tian, Wei Liu, Jun Nakajima, Andrew Cooper

On Tue, Sep 11, 2018 at 07:33:45AM -0600, Jan Beulich wrote:
> While not strictly necessary, change the VMX initialization logic to
> update the function table in start_vmx() from NULL rather than to NULL,
> to make it more obvious that we won't ever change an already (explicitly)
> initialized function pointer.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Acked-by: Kevin Tian <kevin.tian@intel.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v3 4/9] x86: patch ctxt_switch_masking() indirect call to direct one
  2018-09-11 13:34 ` [PATCH v3 4/9] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
@ 2018-09-21 10:51   ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 10:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Tue, Sep 11, 2018 at 07:34:15AM -0600, Jan Beulich wrote:
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

> ---
> v2: Drop open-coded number from macro invocation.
> 
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -184,7 +184,7 @@ void ctxt_switch_levelling(const struct
>  	}
>  
>  	if (ctxt_switch_masking)
> -		ctxt_switch_masking(next);
> +		alternative_vcall(ctxt_switch_masking, next);
>  }
>  
>  bool_t opt_cpu_info;
> 
> 


* Re: [PATCH v3 5/9] x86/genapic: remove indirection from genapic hook accesses
  2018-09-11 13:35 ` [PATCH v3 5/9] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
@ 2018-09-21 10:53   ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 10:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Tue, Sep 11, 2018 at 07:35:04AM -0600, Jan Beulich wrote:
> Instead of loading a pointer at each use site, have a single runtime
> instance of struct genapic, copying into it from the individual
> instances. The individual instances can this way also be moved to .init
> (also adjust apic_probe[] at this occasion).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones
  2018-09-11 13:35 ` [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones Jan Beulich
@ 2018-09-21 11:03   ` Wei Liu
  2018-09-21 11:53     ` Jan Beulich
  2018-09-21 13:55   ` Wei Liu
  1 sibling, 1 reply; 119+ messages in thread
From: Wei Liu @ 2018-09-21 11:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Tue, Sep 11, 2018 at 07:35:33AM -0600, Jan Beulich wrote:
> For (I hope) obvious reasons only the ones used at runtime get
> converted.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Drop open-coded numbers from macro invocations.
> 
> --- a/xen/arch/x86/smp.c
> +++ b/xen/arch/x86/smp.c
> @@ -29,12 +29,12 @@
>  
>  void send_IPI_mask(const cpumask_t *mask, int vector)
>  {
> -    genapic.send_IPI_mask(mask, vector);
> +    alternative_vcall(genapic.send_IPI_mask, mask, vector);
>  }
>  
>  void send_IPI_self(int vector)
>  {
> -    genapic.send_IPI_self(vector);
> +    alternative_vcall(genapic.send_IPI_self, vector);
>  }
>  
>  /*
> --- a/xen/include/asm-x86/mach-generic/mach_apic.h
> +++ b/xen/include/asm-x86/mach-generic/mach_apic.h
> @@ -15,8 +15,18 @@
>  #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
>  #define init_apic_ldr (genapic.init_apic_ldr)
>  #define clustered_apic_check (genapic.clustered_apic_check)
> -#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
> -#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
> +#define cpu_mask_to_apicid(mask) ({ \
> +	/* \
> +	 * There are a number of places where the address of a local variable \
> +	 * gets passed here. The use of ?: in alternative_call<N>() triggers an \
> +	 * "address of ... is always true" warning in such a case with at least \
> +	 * gcc 7 and 8. Hence the seemingly pointless local variable here. \
> +	 */ \

Is this still needed given you have brought back the middle operand in ?: in
patch 1?

Wei.

> +	const cpumask_t *m_ = (mask); \
> +	alternative_call(genapic.cpu_mask_to_apicid, m_); \
> +})
> +#define vector_allocation_cpumask(cpu) \
> +	alternative_call(genapic.vector_allocation_cpumask, cpu)
>  
>  static inline void enable_apic_mode(void)
>  {
> 
> 
> 
> 


* Re: [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-09-21 10:49   ` Wei Liu
@ 2018-09-21 11:47     ` Jan Beulich
  2018-09-21 13:48       ` Wei Liu
  0 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-21 11:47 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 21.09.18 at 12:49, <wei.liu2@citrix.com> wrote:
> On Tue, Sep 11, 2018 at 07:32:04AM -0600, Jan Beulich wrote:
>> @@ -218,6 +219,13 @@ void init_or_livepatch apply_alternative
> 
> I think you need to fix the comment before this if statement. At the
> very least you're now using two ->priv fields to make decisions on patching.

I've been considering this, but even a very close look didn't turn up
anything I could do to this comment to improve it. Suggestions
welcome.

> Also I wonder why you keep base, since ...
> 
>>          if ( ALT_ORIG_PTR(base) != orig )
>>              base = a;
>>  
>> +        /* Skip patch sites already handled during the first pass. */
>> +        if ( a->priv )
>> +        {
>> +            ASSERT(force);
>> +            continue;
>> +        }
>> +
>>          /* If there is no replacement to make, see about optimising the nops. */
>>          if ( !boot_cpu_has(a->cpuid) )
>>          {
>> @@ -225,7 +233,7 @@ void init_or_livepatch apply_alternative
>>              if ( base->priv )
>>                  continue;
> 
> ... base is guaranteed to be a at this point, furthermore there is
> already a check to skip patching added in this patch.

Why would base equal a here?

> -            base->priv = 1;
> +            a->priv = 1;

This communicates from one pass to the next: Previously it was
sufficient to set ->priv on only the first of a group of patches for
the same site. This is no longer the case with the multi-pass
approach - we need to keep a record for every entry, such that we
won't touch again in pass 2 what pass 1 has already dealt with.

With base and a not necessarily equal, I think the second half of
your statement becomes irrelevant (as we may be looking at
different entries' ->priv there and here). I agree this could perhaps
be written slightly differently; personally I find it easier to prove
correct in this shape than if we e.g. relied on base->priv to
necessarily be set in pass 2 when we process a non-"primary"
entry. The patch description forbids certain combinations of
patches, but I think the code should nevertheless have as few
latent bugs in this regard as possible.

>> @@ -236,20 +244,74 @@ void init_or_livepatch apply_alternative
>>              continue;
>>          }
>>  
>> -        base->priv = 1;
>> -
>>          memcpy(buf, repl, a->repl_len);
>>  
>>          /* 0xe8/0xe9 are relative branches; fix the offset. */
>>          if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
>> -            *(int32_t *)(buf + 1) += repl - orig;
>> +        {
>> +            /*
>> +             * Detect the special case of indirect-to-direct branch patching:
>> +             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
>> +             *   checked above),
>> +             * - replacement's displacement is -5 (pointing back at the very
>> +             *   insn, which makes no sense in a real replacement insn),
>> +             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
>> +             *   using RIP-relative addressing.
>> +             * Some function targets may not be available when we come here
>> +             * the first time. Defer patching of those until the post-presmp-
>> +             * initcalls re-invocation. If at that point the target pointer is
>> +             * still NULL, insert "UD2; UD0" (for ease of recognition) instead
>> +             * of CALL/JMP.
>> +             */
>> +            if ( a->cpuid == X86_FEATURE_ALWAYS &&
>> +                 *(int32_t *)(buf + 1) == -5 &&
>> +                 a->orig_len >= 6 &&
>> +                 orig[0] == 0xff &&
>> +                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
>> +            {
>> +                long disp = *(int32_t *)(orig + 2);
>> +                const uint8_t *dest = *(void **)(orig + 6 + disp);
>> +
>> +                if ( dest )
>> +                {
>> +                    disp = dest - (orig + 5);
>> +                    ASSERT(disp == (int32_t)disp);
>> +                    *(int32_t *)(buf + 1) = disp;
>> +                }
>> +                else if ( force )
>> +                {
>> +                    buf[0] = 0x0f;
>> +                    buf[1] = 0x0b;
>> +                    buf[2] = 0x0f;
>> +                    buf[3] = 0xff;
>> +                    buf[4] = 0xff;
> 
> I think these are opcodes for "UD2; UD0". Please add a comment for them.
> Having to go through the SDM to figure out what they are isn't nice.

Well, I'm saying so in the relatively big comment ahead of this block of
code. I don't want to say the same thing twice.

> At this point I also think the name "force" is not very good. What/who
> is forced here? Why not use a more descriptive name like "post_init" or
> "system_active"?

_Patching_ is being forced here, i.e. even if we still can't find a non-NULL
pointer, we still patch the site. I'm certainly open for suggestions, but
I don't really like either of the two suggestions you make any better than
the current "force". The next best option I had been thinking about back
then was to pass in a number, to identify the stage / phase / pass we're in.
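
Spelled out as a sketch (the symbol names below are assumed for
illustration, not taken verbatim from the patch), the intended flow
is:

    /* Early pass: sites whose branch destination is still NULL get skipped. */
    _apply_alternatives(__alt_instructions, __alt_instructions_end, false);

    /* ... pre-SMP initcalls run, establishing e.g. hvm_funcs ... */

    /* Final pass: patching is forced; still-NULL destinations get "UD2; UD0". */
    _apply_alternatives(__alt_instructions, __alt_instructions_end, true);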

Jan



* Re: [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones
  2018-09-21 11:03   ` Wei Liu
@ 2018-09-21 11:53     ` Jan Beulich
  0 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-09-21 11:53 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, xen-devel

>>> On 21.09.18 at 13:03, <wei.liu2@citrix.com> wrote:
> On Tue, Sep 11, 2018 at 07:35:33AM -0600, Jan Beulich wrote:
>> --- a/xen/include/asm-x86/mach-generic/mach_apic.h
>> +++ b/xen/include/asm-x86/mach-generic/mach_apic.h
>> @@ -15,8 +15,18 @@
>>  #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
>>  #define init_apic_ldr (genapic.init_apic_ldr)
>>  #define clustered_apic_check (genapic.clustered_apic_check)
>> -#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
>> -#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
>> +#define cpu_mask_to_apicid(mask) ({ \
>> +	/* \
>> +	 * There are a number of places where the address of a local variable \
>> +	 * gets passed here. The use of ?: in alternative_call<N>() triggers an \
>> +	 * "address of ... is always true" warning in such a case with at least \
>> +	 * gcc 7 and 8. Hence the seemingly pointless local variable here. \
>> +	 */ \
> 
> Is this still needed given you have brought back the middle operand in ?: in
> patch 1?

Yes, unfortunately. The omitted middle operand silenced a warning
when (arg) was of type bool. This is unrelated to the compiler recognizing
that the address of a local variable is never going to be NULL.
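
As a standalone illustration of that "address of ... is always true"
case (not Xen code; the function names are made up):

    #include <stddef.h>

    int f(const int *p);

    int g(void)
    {
        int local;
        /*
         * gcc 7/8 warn here that the address of 'local' will always
         * evaluate as 'true', since &local can never be NULL in the
         * ?: condition generated by alternative_call<N>().
         */
        return f(&local ? &local : NULL);
    }

    /*
     * The macro's "const cpumask_t *m_ = (mask);" sidesteps this: m_ is
     * an ordinary pointer variable, not an address-of expression.
     */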

Jan




* Re: [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-09-21 11:47     ` Jan Beulich
@ 2018-09-21 13:48       ` Wei Liu
  2018-09-21 15:26         ` Jan Beulich
  0 siblings, 1 reply; 119+ messages in thread
From: Wei Liu @ 2018-09-21 13:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On Fri, Sep 21, 2018 at 05:47:54AM -0600, Jan Beulich wrote:
> >>> On 21.09.18 at 12:49, <wei.liu2@citrix.com> wrote:
> > On Tue, Sep 11, 2018 at 07:32:04AM -0600, Jan Beulich wrote:
> >> @@ -218,6 +219,13 @@ void init_or_livepatch apply_alternative
> > 
> > I think you need to fix the comment before this if statement. At the
> > very least you're now using two ->priv fields to make decisions on patching.
> 
> I've been considering this, but even a very close look didn't turn up
> anything I could do to this comment to improve it. Suggestions
> welcome.

Just remove the sentence about using a single ->priv field?

> 
> > Also I wonder why you keep base, since ...
> > 
> >>          if ( ALT_ORIG_PTR(base) != orig )
> >>              base = a;
> >>  
> >> +        /* Skip patch sites already handled during the first pass. */
> >> +        if ( a->priv )
> >> +        {
> >> +            ASSERT(force);
> >> +            continue;
> >> +        }
> >> +
> >>          /* If there is no replacement to make, see about optimising the nops. */
> >>          if ( !boot_cpu_has(a->cpuid) )
> >>          {
> >> @@ -225,7 +233,7 @@ void init_or_livepatch apply_alternative
> >>              if ( base->priv )
> >>                  continue;
> > 
> > ... base is guaranteed to be a at this point; furthermore, there is
> > already a check to skip patching added in this patch.
> 
> Why would base equal a here?

No, they aren't necessarily equal. I have misread the code.

Your code is fine as-is.

> >> @@ -236,20 +244,74 @@ void init_or_livepatch apply_alternative
> >>              continue;
> >>          }
> >>  
> >> -        base->priv = 1;
> >> -
> >>          memcpy(buf, repl, a->repl_len);
> >>  
> >>          /* 0xe8/0xe9 are relative branches; fix the offset. */
> >>          if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
> >> -            *(int32_t *)(buf + 1) += repl - orig;
> >> +        {
> >> +            /*
> >> +             * Detect the special case of indirect-to-direct branch patching:
> >> +             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
> >> +             *   checked above),
> >> +             * - replacement's displacement is -5 (pointing back at the very
> >> +             *   insn, which makes no sense in a real replacement insn),
> >> +             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
> >> +             *   using RIP-relative addressing.
> >> +             * Some function targets may not be available when we come here
> >> +             * the first time. Defer patching of those until the post-presmp-
> >> +             * initcalls re-invocation. If at that point the target pointer is
> >> +             * still NULL, insert "UD2; UD0" (for ease of recognition) instead
> >> +             * of CALL/JMP.
> >> +             */
> >> +            if ( a->cpuid == X86_FEATURE_ALWAYS &&
> >> +                 *(int32_t *)(buf + 1) == -5 &&
> >> +                 a->orig_len >= 6 &&
> >> +                 orig[0] == 0xff &&
> >> +                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
> >> +            {
> >> +                long disp = *(int32_t *)(orig + 2);
> >> +                const uint8_t *dest = *(void **)(orig + 6 + disp);
> >> +
> >> +                if ( dest )
> >> +                {
> >> +                    disp = dest - (orig + 5);
> >> +                    ASSERT(disp == (int32_t)disp);
> >> +                    *(int32_t *)(buf + 1) = disp;
> >> +                }
> >> +                else if ( force )
> >> +                {
> >> +                    buf[0] = 0x0f;
> >> +                    buf[1] = 0x0b;
> >> +                    buf[2] = 0x0f;
> >> +                    buf[3] = 0xff;
> >> +                    buf[4] = 0xff;
> > 
> > I think these are opcodes for "UD2; UD0". Please add a comment for them.
> > Having to go through the SDM to figure out what they are isn't nice.
> 
> Well, I'm saying so in the relatively big comment ahead of this block of
> code. I don't want to say the same thing twice.

It is all fine when one is rather familiar with the code and x86-isms,
but it is rather difficult for a casual reader when you refer to
"target" in the comment but "dest" in the code. The lack of a comment
on what "force" means also doesn't help.

> 
> > At this point I also think the name "force" is not very good. What/who
> > is forced here? Why not use a more descriptive name like "post_init" or
> > "system_active"?
> 
> _Patching_ is being forced here, i.e. even if we still can't find a non-NULL
> pointer, we still patch the site. I'm certainly open for suggestions, but
> I don't really like either of the two suggestions you make any better than
> the current "force". The next best option I had been thinking about back
> then was to pass in a number, to identify the stage / phase / pass we're in.

I had to reverse-engineer when force is supposed to be true. It would
help a lot if you added a comment regarding "force" at the beginning of
the function.

Wei.

> 
> Jan
> 


* Re: [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones
  2018-09-11 13:35 ` [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones Jan Beulich
  2018-09-21 11:03   ` Wei Liu
@ 2018-09-21 13:55   ` Wei Liu
  1 sibling, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 13:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Tue, Sep 11, 2018 at 07:35:33AM -0600, Jan Beulich wrote:
> For (I hope) obvious reasons only the ones used at runtime get
> converted.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v3 7/9] x86/cpuidle: patch some indirect calls to direct ones
  2018-09-11 13:35 ` [PATCH v3 7/9] x86/cpuidle: patch some " Jan Beulich
@ 2018-09-21 14:01   ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 14:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Tue, Sep 11, 2018 at 07:35:58AM -0600, Jan Beulich wrote:
> For now only the ones used during entering/exiting of idle states are
> converted. Additionally pm_idle{,_save} and lapic_timer_{on,off} can't
> be converted, as they may get established rather late (when Dom0 is
> already active).
> 
> Note that for patching to be deferred until after the pre-SMP initcalls
> (from where cpuidle_init_cpu() runs the first time) the pointers need to
> start out as NULL.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v3 8/9] cpufreq: convert to a single post-init driver (hooks) instance
  2018-09-11 13:37 ` [PATCH v3 8/9] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
@ 2018-09-21 14:06   ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 14:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On Tue, Sep 11, 2018 at 07:37:00AM -0600, Jan Beulich wrote:
> This reduces the post-init memory footprint, eliminates a pointless
> level of indirection at the use sites, and allows for subsequent
> alternatives call patching.
> 
> Take the opportunity and also add a name to the PowerNow! instance.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v3 9/9] cpufreq: patch target() indirect call to direct one
  2018-09-11 13:37 ` [PATCH v3 9/9] cpufreq: patch target() indirect call to direct one Jan Beulich
@ 2018-09-21 14:06   ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-21 14:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On Tue, Sep 11, 2018 at 07:37:37AM -0600, Jan Beulich wrote:
> This looks to be the only frequently executed hook; don't bother
> patching any other ones.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-09-21 13:48       ` Wei Liu
@ 2018-09-21 15:26         ` Jan Beulich
  2018-09-26 11:06           ` Wei Liu
  0 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-09-21 15:26 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 21.09.18 at 15:48, <wei.liu2@citrix.com> wrote:
> On Fri, Sep 21, 2018 at 05:47:54AM -0600, Jan Beulich wrote:
>> >>> On 21.09.18 at 12:49, <wei.liu2@citrix.com> wrote:
>> > On Tue, Sep 11, 2018 at 07:32:04AM -0600, Jan Beulich wrote:
>> >> @@ -218,6 +219,13 @@ void init_or_livepatch apply_alternative
>> > 
>> > I think you need to fix the comment before this if statement. At the
> > very least you're now using two ->priv fields to make decisions on patching.
>> 
>> I've been considering this, but even a very close look didn't turn up
>> anything I could do to this comment to improve it. Suggestions
>> welcome.
> 
> Just remove the sentence about using a single ->priv field?

That would go too far. But I'll make it "for some of our patching decisions".

>> >> @@ -236,20 +244,74 @@ void init_or_livepatch apply_alternative
>> >>              continue;
>> >>          }
>> >>  
>> >> -        base->priv = 1;
>> >> -
>> >>          memcpy(buf, repl, a->repl_len);
>> >>  
>> >>          /* 0xe8/0xe9 are relative branches; fix the offset. */
>> >>          if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
>> >> -            *(int32_t *)(buf + 1) += repl - orig;
>> >> +        {
>> >> +            /*
>> >> +             * Detect the special case of indirect-to-direct branch patching:
>> >> +             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
>> >> +             *   checked above),
>> >> +             * - replacement's displacement is -5 (pointing back at the very
>> >> +             *   insn, which makes no sense in a real replacement insn),
>> >> +             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
>> >> +             *   using RIP-relative addressing.
>> >> +             * Some function targets may not be available when we come here
>> >> +             * the first time. Defer patching of those until the post-presmp-
>> >> +             * initcalls re-invocation. If at that point the target pointer is
>> >> +             * still NULL, insert "UD2; UD0" (for ease of recognition) instead
>> >> +             * of CALL/JMP.
>> >> +             */
>> >> +            if ( a->cpuid == X86_FEATURE_ALWAYS &&
>> >> +                 *(int32_t *)(buf + 1) == -5 &&
>> >> +                 a->orig_len >= 6 &&
>> >> +                 orig[0] == 0xff &&
>> >> +                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
>> >> +            {
>> >> +                long disp = *(int32_t *)(orig + 2);
>> >> +                const uint8_t *dest = *(void **)(orig + 6 + disp);
>> >> +
>> >> +                if ( dest )
>> >> +                {
>> >> +                    disp = dest - (orig + 5);
>> >> +                    ASSERT(disp == (int32_t)disp);
>> >> +                    *(int32_t *)(buf + 1) = disp;
>> >> +                }
>> >> +                else if ( force )
>> >> +                {
>> >> +                    buf[0] = 0x0f;
>> >> +                    buf[1] = 0x0b;
>> >> +                    buf[2] = 0x0f;
>> >> +                    buf[3] = 0xff;
>> >> +                    buf[4] = 0xff;
>> > 
>> > I think these are opcodes for "UD2; UD0". Please add a comment for them.
> >> > Having to go through the SDM to figure out what they are isn't nice.
>> 
>> Well, I'm saying so in the relatively big comment ahead of this block of
>> code. I don't want to say the same thing twice.
> 
> It is all fine when one is rather familiar with the code and x86-isms,
> but it is rather difficult for a casual reader when you refer to
> "target" in the comment but "dest" in the code.

Would "function pointers" / "branch destinations" (or both) in the
comment be better?

> The lack of a comment on what "force" means also doesn't help.
> 
>> 
>> > At this point I also think the name "force" is not very good. What/who
>> > is forced here? Why not use a more descriptive name like "post_init" or
>> > "system_active"?
>> 
>> _Patching_ is being forced here, i.e. even if we still can't find a non-NULL
>> pointer, we still patch the site. I'm certainly open for suggestions, but
>> I don't really like either of the two suggestions you make any better than
>> the current "force". The next best option I had been thinking about back
>> then was to pass in a number, to identify the stage / phase / pass we're in.
> 
> I had to reverse-engineer when force is supposed to be true. It would
> help a lot if you added a comment regarding "force" at the beginning of
> the function.

Will do.

Jan



* Re: [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-09-21 15:26         ` Jan Beulich
@ 2018-09-26 11:06           ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-09-26 11:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On Fri, Sep 21, 2018 at 09:26:27AM -0600, Jan Beulich wrote:
> >>> On 21.09.18 at 15:48, <wei.liu2@citrix.com> wrote:
> > On Fri, Sep 21, 2018 at 05:47:54AM -0600, Jan Beulich wrote:
> >> >>> On 21.09.18 at 12:49, <wei.liu2@citrix.com> wrote:
> >> > On Tue, Sep 11, 2018 at 07:32:04AM -0600, Jan Beulich wrote:
> >> >> @@ -218,6 +219,13 @@ void init_or_livepatch apply_alternative
> >> > 
> >> > I think you need to fix the comment before this if statement. At the
> >> > very least you're now using two ->priv fields to make decisions on patching.
> >> 
> >> I've been considering this, but even a very close look didn't turn up
> >> anything I could do to this comment to improve it. Suggestions
> >> welcome.
> > 
> > Just remove the sentence about using a single ->priv field?
> 
> That would go too far. But I'll make it "for some of our patching decisions".

Fair enough.

> 
> >> >> @@ -236,20 +244,74 @@ void init_or_livepatch apply_alternative
> >> >>              continue;
> >> >>          }
> >> >>  
> >> >> -        base->priv = 1;
> >> >> -
> >> >>          memcpy(buf, repl, a->repl_len);
> >> >>  
> >> >>          /* 0xe8/0xe9 are relative branches; fix the offset. */
> >> >>          if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
> >> >> -            *(int32_t *)(buf + 1) += repl - orig;
> >> >> +        {
> >> >> +            /*
> >> >> +             * Detect the special case of indirect-to-direct branch patching:
> >> >> +             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
> >> >> +             *   checked above),
> >> >> +             * - replacement's displacement is -5 (pointing back at the very
> >> >> +             *   insn, which makes no sense in a real replacement insn),
> >> >> +             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
> >> >> +             *   using RIP-relative addressing.
> >> >> +             * Some function targets may not be available when we come here
> >> >> +             * the first time. Defer patching of those until the post-presmp-
> >> >> +             * initcalls re-invocation. If at that point the target pointer is
> >> >> +             * still NULL, insert "UD2; UD0" (for ease of recognition) instead
> >> >> +             * of CALL/JMP.
> >> >> +             */
> >> >> +            if ( a->cpuid == X86_FEATURE_ALWAYS &&
> >> >> +                 *(int32_t *)(buf + 1) == -5 &&
> >> >> +                 a->orig_len >= 6 &&
> >> >> +                 orig[0] == 0xff &&
> >> >> +                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
> >> >> +            {
> >> >> +                long disp = *(int32_t *)(orig + 2);
> >> >> +                const uint8_t *dest = *(void **)(orig + 6 + disp);
> >> >> +
> >> >> +                if ( dest )
> >> >> +                {
> >> >> +                    disp = dest - (orig + 5);
> >> >> +                    ASSERT(disp == (int32_t)disp);
> >> >> +                    *(int32_t *)(buf + 1) = disp;
> >> >> +                }
> >> >> +                else if ( force )
> >> >> +                {
> >> >> +                    buf[0] = 0x0f;
> >> >> +                    buf[1] = 0x0b;
> >> >> +                    buf[2] = 0x0f;
> >> >> +                    buf[3] = 0xff;
> >> >> +                    buf[4] = 0xff;
> >> > 
> >> > I think these are opcodes for "UD2; UD0". Please add a comment for them.
> >> > Having to go through the SDM to figure out what they are isn't nice.
> >> 
> >> Well, I'm saying so in the relatively big comment ahead of this block of
> >> code. I don't want to say the same thing twice.
> > 
> > It is all fine when one is rather familiar with the code and x86-isms,
> > but it is rather difficult for a casual reader when you refer to
> > "target" in the comment but "dest" in the code.
> 
> Would "function pointers" / "branch destinations" (or both) in the
> comment be better?

I think "branch destination" is better because it matches "dest" in
code.

> 
> > The lack of a comment on what "force" means also doesn't help.
> > 
> >> 
> >> > At this point I also think the name "force" is not very good. What/who
> >> > is forced here? Why not use a more descriptive name like "post_init" or
> >> > "system_active"?
> >> 
> >> _Patching_ is being forced here, i.e. even if we still can't find a non-NULL
> >> pointer, we still patch the site. I'm certainly open for suggestions, but
> >> I don't really like either of the two suggestions you make any better than
> >> the current "force". The next best option I had been thinking about back
> >> then was to pass in a number, to identify the stage / phase / pass we're in.
> > 
> > I had to reverse-engineer when force is supposed to be true. It would
> > help a lot if you added a comment regarding "force" at the beginning of
> > the function.
> 
> Will do.

Thanks, that would certainly help.

Wei.

> 
> Jan
> 


* [PATCH v4 00/12] x86: indirect call overhead reduction
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (8 preceding siblings ...)
  2018-09-11 13:37 ` [PATCH v3 9/9] cpufreq: patch target() indirect call to direct one Jan Beulich
@ 2018-10-02 10:09 ` Jan Beulich
  2018-10-02 10:12   ` [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
                     ` (11 more replies)
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                   ` (2 subsequent siblings)
  12 siblings, 12 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

While indirect calls have always been more expensive than direct ones,
their cost has further increased with the Spectre v2 mitigations. In a
number of cases we simply pointlessly use them in the first place. In
many other cases the indirection solely exists to abstract from e.g.
vendor specific hardware details, and hence the pointers used never
change once set. Here we can use alternatives patching to get rid of
the indirection.

From patch 2 onwards dependencies exist on earlier, yet to be reviewed
patches ("x86/alternatives: fully leverage automatic NOP filling" as well
as the "x86: improve PDX <-> PFN and alike translations" series at the
very least). I nevertheless wanted to enable a first round of review of
the series, the more that some of the patches (not just initial ones)
could perhaps be taken irrespective of those dependencies. The first
two of the three genapic patches, otoh, are entirely independent and
could go in right away afaict if they were ack-ed.

Further areas where indirect calls could be eliminated (and that I've put
on my todo list in case the general concept here is deemed reasonable)
are IOMMU, vPMU, and XSM. For some of these, the ARM side would
need dealing with as well - I'm not sure whether replacing indirect calls
by direct ones is worthwhile there as well; if not, the wrappers would
simply need to become function invocations in the ARM case (as was
already asked for in the IOMMU case).

01: x86: infrastructure to allow converting certain indirect calls to direct ones
02: x86/HVM: patch indirect calls through hvm_funcs to direct ones
03: x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
04: x86: patch ctxt_switch_masking() indirect call to direct one
05: x86/genapic: remove indirection from genapic hook accesses
06: x86/genapic: patch indirect calls to direct ones
07: x86/cpuidle: patch some indirect calls to direct ones
08: cpufreq: convert to a single post-init driver (hooks) instance
09: cpufreq: patch target() indirect call to direct one
10: IOMMU: introduce IOMMU_MIXED config option
11: IOMMU: remove indirection from certain IOMMU hook accesses
12: IOMMU: patch certain indirect calls to direct ones

Besides some re-basing, only patch 1 has some comment improvements
compared to v2, and patches 10 and onwards are new.

Jan





* [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
@ 2018-10-02 10:12   ` Jan Beulich
  2018-10-02 13:21     ` Andrew Cooper
                       ` (2 more replies)
  2018-10-02 10:12   ` [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
                     ` (10 subsequent siblings)
  11 siblings, 3 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:12 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

In a number of cases the targets of indirect calls get determined once
at boot time. In such cases we can replace those calls with direct ones
via our alternative instruction patching mechanism.

Some of the targets (in particular the hvm_funcs ones) get established
only in pre-SMP initcalls, making a second pass through the alternative
patching code necessary. Therefore some adjustments beyond the
recognition of the new special pattern are needed there.

Note that patching such sites more than once is not supported (and the
supplied macros also don't provide any means to do so).
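
To illustrate the intended use of the new machinery, a minimal sketch
(hypothetical hook table, not taken from the patch; the call sites
assume the usual surrounding declarations):

    struct ops {
        unsigned int (*get_level)(const struct vcpu *v);
        void (*flush)(const void *p, unsigned int cnt);
    };
    extern struct ops ops;   /* pointers established once during boot */

    /* Indirect calls through the table ... */
    level = ops.get_level(v);
    ops.flush(buf, cnt);

    /* ... become patchable call sites; value-returning hooks use
     * alternative_call(), void ones alternative_vcall(): */
    level = alternative_call(ops.get_level, v);
    alternative_vcall(ops.flush, buf, cnt);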

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: Extend / adjust comments.
v3: Use "X" constraint instead of "g" in alternative_callN(). Pre-
    calculate values to be put into local register variables.
v2: Introduce and use count_va_arg(). Don't omit middle operand from
    ?: in ALT_CALL_ARG(). Re-base.

--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -177,9 +177,14 @@ text_poke(void *addr, const void *opcode
  * self modifying code. This implies that asymmetric systems where
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
+ *
+ * The caller will set the "force" argument to true for the final
+ * invocation, such that no CALLs/JMPs to NULL pointers will be left
+ * around. See also the further comment below.
  */
-void init_or_livepatch apply_alternatives(struct alt_instr *start,
-                                          struct alt_instr *end)
+static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
+                                                  struct alt_instr *end,
+                                                  bool force)
 {
     struct alt_instr *a, *base;
 
@@ -208,9 +213,10 @@ void init_or_livepatch apply_alternative
         /*
          * Detect sequences of alt_instr's patching the same origin site, and
          * keep base pointing at the first alt_instr entry.  This is so we can
-         * refer to a single ->priv field for patching decisions.  We
-         * deliberately use the alt_instr itself rather than a local variable
-         * in case we end up making multiple passes.
+         * refer to a single ->priv field for some of our patching decisions,
+         * in particular the NOP optimization. We deliberately use the alt_instr
+         * itself rather than a local variable in case we end up making multiple
+         * passes.
          *
          * ->priv being nonzero means that the origin site has already been
          * modified, and we shouldn't try to optimise the nops again.
@@ -218,6 +224,13 @@ void init_or_livepatch apply_alternative
         if ( ALT_ORIG_PTR(base) != orig )
             base = a;
 
+        /* Skip patch sites already handled during the first pass. */
+        if ( a->priv )
+        {
+            ASSERT(force);
+            continue;
+        }
+
         /* If there is no replacement to make, see about optimising the nops. */
         if ( !boot_cpu_has(a->cpuid) )
         {
@@ -225,7 +238,7 @@ void init_or_livepatch apply_alternative
             if ( base->priv )
                 continue;
 
-            base->priv = 1;
+            a->priv = 1;
 
             /* Nothing useful to do? */
             if ( toolchain_nops_are_ideal || a->pad_len <= 1 )
@@ -236,20 +249,74 @@ void init_or_livepatch apply_alternative
             continue;
         }
 
-        base->priv = 1;
-
         memcpy(buf, repl, a->repl_len);
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
-            *(int32_t *)(buf + 1) += repl - orig;
+        {
+            /*
+             * Detect the special case of indirect-to-direct branch patching:
+             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
+             *   checked above),
+             * - replacement's displacement is -5 (pointing back at the very
+             *   insn, which makes no sense in a real replacement insn),
+             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
+             *   using RIP-relative addressing.
+             * Some branch destinations may still be NULL when we come here
+             * the first time. Defer patching of those until the post-presmp-
+             * initcalls re-invocation (with force set to true). If at that
+             * point the branch destination is still NULL, insert "UD2; UD0"
+             * (for ease of recognition) instead of CALL/JMP.
+             */
+            if ( a->cpuid == X86_FEATURE_ALWAYS &&
+                 *(int32_t *)(buf + 1) == -5 &&
+                 a->orig_len >= 6 &&
+                 orig[0] == 0xff &&
+                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
+            {
+                long disp = *(int32_t *)(orig + 2);
+                const uint8_t *dest = *(void **)(orig + 6 + disp);
+
+                if ( dest )
+                {
+                    disp = dest - (orig + 5);
+                    ASSERT(disp == (int32_t)disp);
+                    *(int32_t *)(buf + 1) = disp;
+                }
+                else if ( force )
+                {
+                    buf[0] = 0x0f;
+                    buf[1] = 0x0b;
+                    buf[2] = 0x0f;
+                    buf[3] = 0xff;
+                    buf[4] = 0xff;
+                }
+                else
+                    continue;
+            }
+            else if ( force && system_state < SYS_STATE_active )
+                ASSERT_UNREACHABLE();
+            else
+                *(int32_t *)(buf + 1) += repl - orig;
+        }
+        else if ( force && system_state < SYS_STATE_active  )
+            ASSERT_UNREACHABLE();
+
+        a->priv = 1;
 
         add_nops(buf + a->repl_len, total_len - a->repl_len);
         text_poke(orig, buf, total_len);
     }
 }
 
-static bool __initdata alt_done;
+void init_or_livepatch apply_alternatives(struct alt_instr *start,
+                                          struct alt_instr *end)
+{
+    _apply_alternatives(start, end, true);
+}
+
+static unsigned int __initdata alt_todo;
+static unsigned int __initdata alt_done;
 
 /*
  * At boot time, we patch alternatives in NMI context.  This means that the
@@ -264,7 +331,7 @@ static int __init nmi_apply_alternatives
      * More than one NMI may occur between the two set_nmi_callback() below.
      * We only need to apply alternatives once.
      */
-    if ( !alt_done )
+    if ( !(alt_done & alt_todo) )
     {
         unsigned long cr0;
 
@@ -273,11 +340,12 @@ static int __init nmi_apply_alternatives
         /* Disable WP to allow patching read-only pages. */
         write_cr0(cr0 & ~X86_CR0_WP);
 
-        apply_alternatives(__alt_instructions, __alt_instructions_end);
+        _apply_alternatives(__alt_instructions, __alt_instructions_end,
+                            alt_done);
 
         write_cr0(cr0);
 
-        alt_done = true;
+        alt_done |= alt_todo;
     }
 
     return 1;
@@ -287,13 +355,11 @@ static int __init nmi_apply_alternatives
  * This routine is called with local interrupt disabled and used during
  * bootup.
  */
-void __init alternative_instructions(void)
+static void __init _alternative_instructions(bool force)
 {
     unsigned int i;
     nmi_callback_t *saved_nmi_callback;
 
-    arch_init_ideal_nops();
-
     /*
      * Don't stop machine check exceptions while patching.
      * MCEs only happen when something got corrupted and in this
@@ -306,6 +372,10 @@ void __init alternative_instructions(voi
      */
     ASSERT(!local_irq_is_enabled());
 
+    /* Set what operation to perform /before/ setting the callback. */
+    alt_todo = 1u << force;
+    barrier();
+
     /*
      * As soon as the callback is set up, the next NMI will trigger patching,
      * even an NMI ahead of our explicit self-NMI.
@@ -321,11 +391,24 @@ void __init alternative_instructions(voi
      * cover the (hopefully never) async case, poll alt_done for up to one
      * second.
      */
-    for ( i = 0; !ACCESS_ONCE(alt_done) && i < 1000; ++i )
+    for ( i = 0; !(ACCESS_ONCE(alt_done) & alt_todo) && i < 1000; ++i )
         mdelay(1);
 
-    if ( !ACCESS_ONCE(alt_done) )
+    if ( !(ACCESS_ONCE(alt_done) & alt_todo) )
         panic("Timed out waiting for alternatives self-NMI to hit\n");
 
     set_nmi_callback(saved_nmi_callback);
 }
+
+void __init alternative_instructions(void)
+{
+    arch_init_ideal_nops();
+    _alternative_instructions(false);
+}
+
+void __init alternative_branches(void)
+{
+    local_irq_disable();
+    _alternative_instructions(true);
+    local_irq_enable();
+}
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1644,6 +1644,8 @@ void __init noreturn __start_xen(unsigne
 
     do_presmp_initcalls();
 
+    alternative_branches();
+
     /*
      * NB: when running as a PV shim VCPUOP_up/down is wired to the shim
      * physical cpu_add/remove functions, so launch the guest with only
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -4,8 +4,8 @@
 #ifdef __ASSEMBLY__
 #include <asm/alternative-asm.h>
 #else
+#include <xen/lib.h>
 #include <xen/stringify.h>
-#include <xen/types.h>
 #include <asm/asm-macros.h>
 
 struct __packed alt_instr {
@@ -26,6 +26,7 @@ extern void add_nops(void *insns, unsign
 /* Similar to alternative_instructions except it can be run with IRQs enabled. */
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
+extern void alternative_branches(void);
 
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
@@ -149,6 +150,233 @@ extern void alternative_instructions(voi
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
+/*
+ * Machinery to allow converting indirect to direct calls, when the called
+ * function is determined once at boot and later never changed.
+ */
+
+#define ALT_CALL_arg1 "rdi"
+#define ALT_CALL_arg2 "rsi"
+#define ALT_CALL_arg3 "rdx"
+#define ALT_CALL_arg4 "rcx"
+#define ALT_CALL_arg5 "r8"
+#define ALT_CALL_arg6 "r9"
+
+#define ALT_CALL_ARG(arg, n) \
+    register typeof((arg) ? (arg) : 0) a ## n ## _ \
+    asm ( ALT_CALL_arg ## n ) = (arg)
+#define ALT_CALL_NO_ARG(n) \
+    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n )
+
+#define ALT_CALL_NO_ARG6 ALT_CALL_NO_ARG(6)
+#define ALT_CALL_NO_ARG5 ALT_CALL_NO_ARG(5); ALT_CALL_NO_ARG6
+#define ALT_CALL_NO_ARG4 ALT_CALL_NO_ARG(4); ALT_CALL_NO_ARG5
+#define ALT_CALL_NO_ARG3 ALT_CALL_NO_ARG(3); ALT_CALL_NO_ARG4
+#define ALT_CALL_NO_ARG2 ALT_CALL_NO_ARG(2); ALT_CALL_NO_ARG3
+#define ALT_CALL_NO_ARG1 ALT_CALL_NO_ARG(1); ALT_CALL_NO_ARG2
+
+/*
+ * Unfortunately ALT_CALL_NO_ARG() above can't use a fake initializer (to
+ * suppress "uninitialized variable" warnings), as various versions of gcc
+ * older than 8.1 fall on the nose in various ways with that (always because
+ * of some other construct elsewhere in the same function needing to use the
+ * same hard register). Otherwise the asm() below could uniformly use "+r"
+ * output constraints, making unnecessary all these ALT_CALL<n>_OUT macros.
+ */
+#define ALT_CALL0_OUT "=r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL1_OUT "+r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL2_OUT "+r" (a1_), "+r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL3_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL4_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL5_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "=r" (a6_)
+#define ALT_CALL6_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "+r" (a6_)
+
+#define alternative_callN(n, rettype, func) ({                     \
+    rettype ret_;                                                  \
+    register unsigned long r10_ asm("r10");                        \
+    register unsigned long r11_ asm("r11");                        \
+    asm volatile (__stringify(ALTERNATIVE "call *%c[addr](%%rip)", \
+                                          "call .",                \
+                                          X86_FEATURE_ALWAYS)      \
+                  : ALT_CALL ## n ## _OUT, "=a" (ret_),            \
+                    "=r" (r10_), "=r" (r11_)                       \
+                  : [addr] "i" (&(func)), "g" (func)               \
+                  : "memory" );                                    \
+    ret_;                                                          \
+})
+
+#define alternative_vcall0(func) ({             \
+    ALT_CALL_NO_ARG1;                           \
+    ((void)alternative_callN(0, int, func));    \
+})
+
+#define alternative_call0(func) ({              \
+    ALT_CALL_NO_ARG1;                           \
+    alternative_callN(0, typeof(func()), func); \
+})
+
+#define alternative_vcall1(func, arg) ({           \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    (void)sizeof(func(arg));                       \
+    (void)alternative_callN(1, int, func);         \
+})
+
+#define alternative_call1(func, arg) ({            \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    alternative_callN(1, typeof(func(arg)), func); \
+})
+
+#define alternative_vcall2(func, arg1, arg2) ({           \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    (void)sizeof(func(arg1, arg2));                       \
+    (void)alternative_callN(2, int, func);                \
+})
+
+#define alternative_call2(func, arg1, arg2) ({            \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    alternative_callN(2, typeof(func(arg1, arg2)), func); \
+})
+
+#define alternative_vcall3(func, arg1, arg2, arg3) ({    \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    (void)sizeof(func(arg1, arg2, arg3));                \
+    (void)alternative_callN(3, int, func);               \
+})
+
+#define alternative_call3(func, arg1, arg2, arg3) ({     \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    alternative_callN(3, typeof(func(arg1, arg2, arg3)), \
+                      func);                             \
+})
+
+#define alternative_vcall4(func, arg1, arg2, arg3, arg4) ({ \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    (void)sizeof(func(arg1, arg2, arg3, arg4));             \
+    (void)alternative_callN(4, int, func);                  \
+})
+
+#define alternative_call4(func, arg1, arg2, arg3, arg4) ({  \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    alternative_callN(4, typeof(func(arg1, arg2,            \
+                                     arg3, arg4)),          \
+                      func);                                \
+})
+
+#define alternative_vcall5(func, arg1, arg2, arg3, arg4, arg5) ({ \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5));             \
+    (void)alternative_callN(5, int, func);                        \
+})
+
+#define alternative_call5(func, arg1, arg2, arg3, arg4, arg5) ({  \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    alternative_callN(5, typeof(func(arg1, arg2, arg3,            \
+                                     arg4, arg5)),                \
+                      func);                                     \
+})
+
+#define alternative_vcall6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5, arg6));             \
+    (void)alternative_callN(6, int, func);                              \
+})
+
+#define alternative_call6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({  \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    alternative_callN(6, typeof(func(arg1, arg2, arg3,                  \
+                                     arg4, arg5, arg6)),                \
+                      func);                                            \
+})
+
+#define alternative_vcall__(nr) alternative_vcall ## nr
+#define alternative_call__(nr)  alternative_call ## nr
+
+#define alternative_vcall_(nr) alternative_vcall__(nr)
+#define alternative_call_(nr)  alternative_call__(nr)
+
+#define alternative_vcall(func, args...) \
+    alternative_vcall_(count_va_arg(args))(func, ## args)
+
+#define alternative_call(func, args...) \
+    alternative_call_(count_va_arg(args))(func, ## args)
+
 #endif /*  !__ASSEMBLY__  */
 
 #endif /* __X86_ALTERNATIVE_H__ */
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -66,6 +66,10 @@
 
 #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
 
+#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
+#define count_va_arg(args...) \
+    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+
 struct domain;
 
 void cmdline_parse(const char *cmdline);
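
As an aside, count_va_arg() can be tried in isolation; a standalone
illustration (not part of the patch; gcc-specific, due to the named
variadic macro and the ",##" comma deletion):

    #include <stdio.h>

    #define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
    #define count_va_arg(args...) \
        count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)

    int main(void)
    {
        /* Prints "0 1 3": every argument passed shifts the descending
         * number list right by one slot, so parameter x ends up bound
         * to the argument count. */
        printf("%d %d %d\n", count_va_arg(), count_va_arg(a),
               count_va_arg(a, b, c));
        return 0;
    }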




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
  2018-10-02 10:12   ` [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2018-10-02 10:12   ` Jan Beulich
  2018-10-02 13:18     ` Paul Durrant
  2018-10-03 18:55     ` Andrew Cooper
  2018-10-02 10:13   ` [PATCH v4 03/12] x86/HVM: patch vINTR " Jan Beulich
                     ` (9 subsequent siblings)
  11 siblings, 2 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu

This is intentionally not touching hooks used rarely (or not at all)
during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
as well as nested, VM event, and altp2m ones (they can all be done
later, if so desired). Virtual Interrupt delivery ones will be dealt
with in a subsequent patch.
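
One recurring pattern in the hunks below (sketch): hooks which may
legitimately be absent keep their NULL check, and only the call itself
gets patched, e.g.

    if ( hvm_funcs.update_host_cr3 )
        alternative_vcall(hvm_funcs.update_host_cr3, v);

The patching machinery itself tolerates NULL destinations (such sites
get deferred, and eventually turned into "UD2; UD0" on the final pass),
but the "do nothing when the hook is absent" semantics still require
the explicit guard.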

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v3: Re-base.
v2: Drop open-coded numbers from macro invocations. Re-base.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2104,7 +2104,7 @@ static int hvmemul_write_msr(
 static int hvmemul_wbinvd(
     struct x86_emulate_ctxt *ctxt)
 {
-    hvm_funcs.wbinvd_intercept();
+    alternative_vcall(hvm_funcs.wbinvd_intercept);
     return X86EMUL_OKAY;
 }
 
@@ -2122,7 +2122,7 @@ static int hvmemul_get_fpu(
     struct vcpu *curr = current;
 
     if ( !curr->fpu_dirtied )
-        hvm_funcs.fpu_dirty_intercept();
+        alternative_vcall(hvm_funcs.fpu_dirty_intercept);
     else if ( type == X86EMUL_FPU_fpu )
     {
         const typeof(curr->arch.xsave_area->fpu_sse) *fpu_ctxt =
@@ -2239,7 +2239,7 @@ static void hvmemul_put_fpu(
         {
             curr->fpu_dirtied = false;
             stts();
-            hvm_funcs.fpu_leave(curr);
+            alternative_vcall(hvm_funcs.fpu_leave, curr);
         }
     }
 }
@@ -2401,7 +2401,8 @@ static int _hvm_emulate_one(struct hvm_e
     if ( hvmemul_ctxt->intr_shadow != new_intr_shadow )
     {
         hvmemul_ctxt->intr_shadow = new_intr_shadow;
-        hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
+        alternative_vcall(hvm_funcs.set_interrupt_shadow,
+                          curr, new_intr_shadow);
     }
 
     if ( hvmemul_ctxt->ctxt.retire.hlt &&
@@ -2538,7 +2539,8 @@ void hvm_emulate_init_once(
 
     memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
 
-    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
+    hvmemul_ctxt->intr_shadow =
+        alternative_call(hvm_funcs.get_interrupt_shadow, curr);
     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -272,12 +272,12 @@ void hvm_set_rdtsc_exiting(struct domain
     struct vcpu *v;
 
     for_each_vcpu ( d, v )
-        hvm_funcs.set_rdtsc_exiting(v, enable);
+        alternative_vcall(hvm_funcs.set_rdtsc_exiting, v, enable);
 }
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
 {
-    if ( !hvm_funcs.get_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.get_guest_pat, v, guest_pat) )
         *guest_pat = v->arch.hvm.pat_cr;
 }
 
@@ -302,7 +302,7 @@ int hvm_set_guest_pat(struct vcpu *v, u6
             return 0;
         }
 
-    if ( !hvm_funcs.set_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.set_guest_pat, v, guest_pat) )
         v->arch.hvm.pat_cr = guest_pat;
 
     return 1;
@@ -342,7 +342,7 @@ bool hvm_set_guest_bndcfgs(struct vcpu *
             /* nothing, best effort only */;
     }
 
-    return hvm_funcs.set_guest_bndcfgs(v, val);
+    return alternative_call(hvm_funcs.set_guest_bndcfgs, v, val);
 }
 
 /*
@@ -500,7 +500,8 @@ void hvm_migrate_pirqs(struct vcpu *v)
 static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
 {
     info->cr2 = v->arch.hvm.guest_cr[2];
-    return hvm_funcs.get_pending_event(v, info);
+
+    return alternative_call(hvm_funcs.get_pending_event, v, info);
 }
 
 void hvm_do_resume(struct vcpu *v)
@@ -1651,7 +1652,7 @@ void hvm_inject_event(const struct x86_e
         }
     }
 
-    hvm_funcs.inject_event(event);
+    alternative_vcall(hvm_funcs.inject_event, event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -2238,7 +2239,7 @@ int hvm_set_cr0(unsigned long value, boo
          (!rangeset_is_empty(d->iomem_caps) ||
           !rangeset_is_empty(d->arch.ioport_caps) ||
           has_arch_pdevs(d)) )
-        hvm_funcs.handle_cd(v, value);
+        alternative_vcall(hvm_funcs.handle_cd, v, value);
 
     hvm_update_cr(v, 0, value);
 
@@ -3477,7 +3478,8 @@ int hvm_msr_read_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_read_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_read_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3637,7 +3639,8 @@ int hvm_msr_write_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_write_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_write_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3829,7 +3832,7 @@ void hvm_hypercall_page_initialise(struc
                                    void *hypercall_page)
 {
     hvm_latch_shinfo_size(d);
-    hvm_funcs.init_hypercall_page(d, hypercall_page);
+    alternative_vcall(hvm_funcs.init_hypercall_page, d, hypercall_page);
 }
 
 void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
@@ -5004,7 +5007,7 @@ void hvm_domain_soft_reset(struct domain
 void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
                               struct segment_register *reg)
 {
-    hvm_funcs.get_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.get_segment_register, v, seg, reg);
 
     switch ( seg )
     {
@@ -5150,7 +5153,7 @@ void hvm_set_segment_register(struct vcp
         return;
     }
 
-    hvm_funcs.set_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.set_segment_register, v, seg, reg);
 }
 
 /*
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -383,42 +383,42 @@ static inline int
 hvm_guest_x86_mode(struct vcpu *v)
 {
     ASSERT(v == current);
-    return hvm_funcs.guest_x86_mode(v);
+    return alternative_call(hvm_funcs.guest_x86_mode, v);
 }
 
 static inline void
 hvm_update_host_cr3(struct vcpu *v)
 {
     if ( hvm_funcs.update_host_cr3 )
-        hvm_funcs.update_host_cr3(v);
+        alternative_vcall(hvm_funcs.update_host_cr3, v);
 }
 
 static inline void hvm_update_guest_cr(struct vcpu *v, unsigned int cr)
 {
-    hvm_funcs.update_guest_cr(v, cr, 0);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, cr, 0);
 }
 
 static inline void hvm_update_guest_cr3(struct vcpu *v, bool noflush)
 {
     unsigned int flags = noflush ? HVM_UPDATE_GUEST_CR3_NOFLUSH : 0;
 
-    hvm_funcs.update_guest_cr(v, 3, flags);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, 3, flags);
 }
 
 static inline void hvm_update_guest_efer(struct vcpu *v)
 {
-    hvm_funcs.update_guest_efer(v);
+    alternative_vcall(hvm_funcs.update_guest_efer, v);
 }
 
 static inline void hvm_cpuid_policy_changed(struct vcpu *v)
 {
-    hvm_funcs.cpuid_policy_changed(v);
+    alternative_vcall(hvm_funcs.cpuid_policy_changed, v);
 }
 
 static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset,
                                       uint64_t at_tsc)
 {
-    hvm_funcs.set_tsc_offset(v, offset, at_tsc);
+    alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);
 }
 
 /*
@@ -435,18 +435,18 @@ static inline void hvm_flush_guest_tlbs(
 static inline unsigned int
 hvm_get_cpl(struct vcpu *v)
 {
-    return hvm_funcs.get_cpl(v);
+    return alternative_call(hvm_funcs.get_cpl, v);
 }
 
 static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 {
-    return hvm_funcs.get_shadow_gs_base(v);
+    return alternative_call(hvm_funcs.get_shadow_gs_base, v);
 }
 
 static inline bool hvm_get_guest_bndcfgs(struct vcpu *v, u64 *val)
 {
     return hvm_funcs.get_guest_bndcfgs &&
-           hvm_funcs.get_guest_bndcfgs(v, val);
+           alternative_call(hvm_funcs.get_guest_bndcfgs, v, val);
 }
 
 #define has_hvm_params(d) \
@@ -503,12 +503,12 @@ static inline void hvm_inject_page_fault
 
 static inline int hvm_event_pending(struct vcpu *v)
 {
-    return hvm_funcs.event_pending(v);
+    return alternative_call(hvm_funcs.event_pending, v);
 }
 
 static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
 {
-    hvm_funcs.invlpg(v, linear);
+    alternative_vcall(hvm_funcs.invlpg, v, linear);
 }
 
 /* These bits in CR4 are owned by the host. */
@@ -533,13 +533,14 @@ static inline void hvm_cpu_down(void)
 
 static inline unsigned int hvm_get_insn_bytes(struct vcpu *v, uint8_t *buf)
 {
-    return (hvm_funcs.get_insn_bytes ? hvm_funcs.get_insn_bytes(v, buf) : 0);
+    return (hvm_funcs.get_insn_bytes
+            ? alternative_call(hvm_funcs.get_insn_bytes, v, buf) : 0);
 }
 
 static inline void hvm_set_info_guest(struct vcpu *v)
 {
     if ( hvm_funcs.set_info_guest )
-        return hvm_funcs.set_info_guest(v);
+        alternative_vcall(hvm_funcs.set_info_guest, v);
 }
 
 static inline void hvm_invalidate_regs_fields(struct cpu_user_regs *regs)




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v4 03/12] x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
  2018-10-02 10:12   ` [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
  2018-10-02 10:12   ` [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
@ 2018-10-02 10:13   ` Jan Beulich
  2018-10-03 19:01     ` Andrew Cooper
  2018-10-02 10:13   ` [PATCH v4 04/12] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
                     ` (8 subsequent siblings)
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:13 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

While not strictly necessary, change the VMX initialization logic to
update the function table in start_vmx() from NULL rather than to NULL,
to make it more obvious that we won't ever change an already (explicitly)
initialized function pointer.
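
In pattern form (generic names, not the actual hooks):

    /* Old: table starts fully populated; hooks get taken away again
     * when the hardware lacks support. */
    if ( !cpu_has_some_feature )
        table.hook = NULL;

    /* New: table starts out with the hook NULL; it only ever gets
     * filled in when usable. */
    if ( cpu_has_some_feature )
        table.hook = impl_hook;

This lines up with the infrastructure patch's requirement that such
pointers be set (at most) once.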

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v4: Re-base.
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -111,10 +111,15 @@ static void vlapic_clear_irr(int vector,
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_IRR]);
 }
 
-static int vlapic_find_highest_irr(struct vlapic *vlapic)
+static void sync_pir_to_irr(struct vcpu *v)
 {
     if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(vlapic_vcpu(vlapic));
+        alternative_vcall(hvm_funcs.sync_pir_to_irr, v);
+}
+
+static int vlapic_find_highest_irr(struct vlapic *vlapic)
+{
+    sync_pir_to_irr(vlapic_vcpu(vlapic));
 
     return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
 }
@@ -143,7 +148,7 @@ bool vlapic_test_irq(const struct vlapic
         return false;
 
     if ( hvm_funcs.test_pir &&
-         hvm_funcs.test_pir(const_vlapic_vcpu(vlapic), vec) )
+         alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) )
         return true;
 
     return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
@@ -165,10 +170,10 @@ void vlapic_set_irq(struct vlapic *vlapi
         vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
 
     if ( hvm_funcs.update_eoi_exit_bitmap )
-        hvm_funcs.update_eoi_exit_bitmap(target, vec, trig);
+        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
 
     if ( hvm_funcs.deliver_posted_intr )
-        hvm_funcs.deliver_posted_intr(target, vec);
+        alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
     else if ( !vlapic_test_and_set_irr(vec, vlapic) )
         vcpu_kick(target);
 }
@@ -448,7 +453,7 @@ void vlapic_EOI_set(struct vlapic *vlapi
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]);
 
     if ( hvm_funcs.handle_eoi )
-        hvm_funcs.handle_eoi(vector);
+        alternative_vcall(hvm_funcs.handle_eoi, vector);
 
     vlapic_handle_EOI(vlapic, vector);
 
@@ -1412,8 +1417,7 @@ static int lapic_save_regs(struct vcpu *
     if ( !has_vlapic(v->domain) )
         return 0;
 
-    if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(v);
+    sync_pir_to_irr(v);
 
     return hvm_save_entry(LAPIC_REGS, v->vcpu_id, h, vcpu_vlapic(v)->regs);
 }
@@ -1509,7 +1513,8 @@ static int lapic_load_regs(struct domain
         lapic_load_fixup(s);
 
     if ( hvm_funcs.process_isr )
-        hvm_funcs.process_isr(vlapic_find_highest_isr(s), v);
+        alternative_vcall(hvm_funcs.process_isr,
+                          vlapic_find_highest_isr(s), v);
 
     vlapic_adjust_i8259_target(d);
     lapic_rearm(s);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2336,12 +2336,6 @@ static struct hvm_function_table __initd
     .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
-    .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
-    .process_isr          = vmx_process_isr,
-    .deliver_posted_intr  = vmx_deliver_posted_intr,
-    .sync_pir_to_irr      = vmx_sync_pir_to_irr,
-    .test_pir             = vmx_test_pir,
-    .handle_eoi           = vmx_handle_eoi,
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .enable_msr_interception = vmx_enable_msr_interception,
     .is_singlestep_supported = vmx_is_singlestep_supported,
@@ -2469,26 +2463,23 @@ const struct hvm_function_table * __init
         setup_ept_dump();
     }
 
-    if ( !cpu_has_vmx_virtual_intr_delivery )
+    if ( cpu_has_vmx_virtual_intr_delivery )
     {
-        vmx_function_table.update_eoi_exit_bitmap = NULL;
-        vmx_function_table.process_isr = NULL;
-        vmx_function_table.handle_eoi = NULL;
-    }
-    else
+        vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
+        vmx_function_table.process_isr = vmx_process_isr;
+        vmx_function_table.handle_eoi = vmx_handle_eoi;
         vmx_function_table.virtual_intr_delivery_enabled = true;
+    }
 
     if ( cpu_has_vmx_posted_intr_processing )
     {
         alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
         if ( iommu_intpost )
             alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
-    }
-    else
-    {
-        vmx_function_table.deliver_posted_intr = NULL;
-        vmx_function_table.sync_pir_to_irr = NULL;
-        vmx_function_table.test_pir = NULL;
+
+        vmx_function_table.deliver_posted_intr = vmx_deliver_posted_intr;
+        vmx_function_table.sync_pir_to_irr     = vmx_sync_pir_to_irr;
+        vmx_function_table.test_pir            = vmx_test_pir;
     }
 
     if ( cpu_has_vmx_tsc_scaling )




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v4 04/12] x86: patch ctxt_switch_masking() indirect call to direct one
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (2 preceding siblings ...)
  2018-10-02 10:13   ` [PATCH v4 03/12] x86/HVM: patch vINTR " Jan Beulich
@ 2018-10-02 10:13   ` Jan Beulich
  2018-10-03 19:01     ` Andrew Cooper
  2018-10-02 10:14   ` [PATCH v4 05/12] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
                     ` (7 subsequent siblings)
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:13 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v2: Drop open-coded number from macro invocation.

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -184,7 +184,7 @@ void ctxt_switch_levelling(const struct
 	}
 
 	if (ctxt_switch_masking)
-		ctxt_switch_masking(next);
+		alternative_vcall(ctxt_switch_masking, next);
 }
 
 bool_t opt_cpu_info;



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v4 05/12] x86/genapic: remove indirection from genapic hook accesses
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (3 preceding siblings ...)
  2018-10-02 10:13   ` [PATCH v4 04/12] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
@ 2018-10-02 10:14   ` Jan Beulich
  2018-10-03 19:04     ` Andrew Cooper
  2018-10-02 10:14   ` [PATCH v4 06/12] x86/genapic: patch indirect calls to direct ones Jan Beulich
                     ` (6 subsequent siblings)
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

Instead of loading a pointer at each use site, have a single runtime
instance of struct genapic, copying into it from the individual
instances. This way the individual instances can also be moved to .init
(apic_probe[] gets adjusted accordingly at this occasion).
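
One consequence of keeping only a copy at runtime (visible in the
probe.c and mpparse.c hunks below): pointer comparisons against a
template no longer work, so identity checks compare the name field
instead, e.g.

	/* genapic is a copy, so "== &apic_default" can't be used any
	 * more; the shared string literal pointer identifies the
	 * active template: */
	if (!cmdline_apic && genapic.name == apic_default.name)
		...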

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>

--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -943,8 +943,8 @@ void __init x2apic_bsp_setup(void)
 
     force_iommu = 1;
 
-    genapic = apic_x2apic_probe();
-    printk("Switched to APIC driver %s.\n", genapic->name);
+    genapic = *apic_x2apic_probe();
+    printk("Switched to APIC driver %s.\n", genapic.name);
 
     if ( !x2apic_enabled )
     {
--- a/xen/arch/x86/genapic/bigsmp.c
+++ b/xen/arch/x86/genapic/bigsmp.c
@@ -42,7 +42,7 @@ static __init int probe_bigsmp(void)
 	return def_to_bigsmp;
 } 
 
-const struct genapic apic_bigsmp = {
+const struct genapic __initconstrel apic_bigsmp = {
 	APIC_INIT("bigsmp", probe_bigsmp),
 	GENAPIC_PHYS
 };
--- a/xen/arch/x86/genapic/default.c
+++ b/xen/arch/x86/genapic/default.c
@@ -20,7 +20,7 @@ static __init int probe_default(void)
 	return 1;
 } 
 
-const struct genapic apic_default = {
+const struct genapic __initconstrel apic_default = {
 	APIC_INIT("default", probe_default),
 	GENAPIC_FLAT
 };
--- a/xen/arch/x86/genapic/probe.c
+++ b/xen/arch/x86/genapic/probe.c
@@ -15,11 +15,9 @@
 #include <asm/mach-generic/mach_apic.h>
 #include <asm/setup.h>
 
-extern const struct genapic apic_bigsmp;
+struct genapic __read_mostly genapic;
 
-const struct genapic *__read_mostly genapic;
-
-const struct genapic *apic_probe[] __initdata = {
+const struct genapic *const __initconstrel apic_probe[] = {
 	&apic_bigsmp, 
 	&apic_default,	/* must be last */
 	NULL,
@@ -36,11 +34,11 @@ void __init generic_bigsmp_probe(void)
 	 * - we find more than 8 CPUs in acpi LAPIC listing with xAPIC support
 	 */
 
-	if (!cmdline_apic && genapic == &apic_default)
+	if (!cmdline_apic && genapic.name == apic_default.name)
 		if (apic_bigsmp.probe()) {
-			genapic = &apic_bigsmp;
+			genapic = apic_bigsmp;
 			printk(KERN_INFO "Overriding APIC driver with %s\n",
-			       genapic->name);
+			       genapic.name);
 		}
 }
 
@@ -50,7 +48,7 @@ static int __init genapic_apic_force(con
 
 	for (i = 0; apic_probe[i]; i++)
 		if (!strcmp(apic_probe[i]->name, str)) {
-			genapic = apic_probe[i];
+			genapic = *apic_probe[i];
 			rc = 0;
 		}
 
@@ -66,18 +64,18 @@ void __init generic_apic_probe(void)
 	record_boot_APIC_mode();
 
 	check_x2apic_preenabled();
-	cmdline_apic = changed = (genapic != NULL);
+	cmdline_apic = changed = !!genapic.name;
 
 	for (i = 0; !changed && apic_probe[i]; i++) { 
 		if (apic_probe[i]->probe()) {
 			changed = 1;
-			genapic = apic_probe[i];
+			genapic = *apic_probe[i];
 		} 
 	}
 	if (!changed) 
-		genapic = &apic_default;
+		genapic = apic_default;
 
-	printk(KERN_INFO "Using APIC driver %s\n", genapic->name);
+	printk(KERN_INFO "Using APIC driver %s\n", genapic.name);
 } 
 
 /* These functions can switch the APIC even after the initial ->probe() */
@@ -88,9 +86,9 @@ int __init mps_oem_check(struct mp_confi
 	for (i = 0; apic_probe[i]; ++i) { 
 		if (apic_probe[i]->mps_oem_check(mpc,oem,productid)) { 
 			if (!cmdline_apic) {
-				genapic = apic_probe[i];
+				genapic = *apic_probe[i];
 				printk(KERN_INFO "Switched to APIC driver `%s'.\n", 
-				       genapic->name);
+				       genapic.name);
 			}
 			return 1;
 		} 
@@ -104,9 +102,9 @@ int __init acpi_madt_oem_check(char *oem
 	for (i = 0; apic_probe[i]; ++i) { 
 		if (apic_probe[i]->acpi_madt_oem_check(oem_id, oem_table_id)) { 
 			if (!cmdline_apic) {
-				genapic = apic_probe[i];
+				genapic = *apic_probe[i];
 				printk(KERN_INFO "Switched to APIC driver `%s'.\n", 
-				       genapic->name);
+				       genapic.name);
 			}
 			return 1;
 		} 
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -163,7 +163,7 @@ static void send_IPI_mask_x2apic_cluster
     local_irq_restore(flags);
 }
 
-static const struct genapic apic_x2apic_phys = {
+static const struct genapic __initconstrel apic_x2apic_phys = {
     APIC_INIT("x2apic_phys", NULL),
     .int_delivery_mode = dest_Fixed,
     .int_dest_mode = 0 /* physical delivery */,
@@ -175,7 +175,7 @@ static const struct genapic apic_x2apic_
     .send_IPI_self = send_IPI_self_x2apic
 };
 
-static const struct genapic apic_x2apic_cluster = {
+static const struct genapic __initconstrel apic_x2apic_cluster = {
     APIC_INIT("x2apic_cluster", NULL),
     .int_delivery_mode = dest_LowestPrio,
     .int_dest_mode = 1 /* logical delivery */,
@@ -259,6 +259,6 @@ void __init check_x2apic_preenabled(void
     {
         printk("x2APIC mode is already enabled by BIOS.\n");
         x2apic_enabled = 1;
-        genapic = apic_x2apic_probe();
+        genapic = *apic_x2apic_probe();
     }
 }
--- a/xen/arch/x86/mpparse.c
+++ b/xen/arch/x86/mpparse.c
@@ -162,7 +162,8 @@ static int MP_processor_info_x(struct mp
 		return -ENOSPC;
 	}
 
-	if (num_processors >= 8 && hotplug && genapic == &apic_default) {
+	if (num_processors >= 8 && hotplug
+	    && genapic.name == apic_default.name) {
 		printk(KERN_WARNING "WARNING: CPUs limit of 8 reached."
 			" Processor ignored.\n");
 		return -ENOSPC;
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -29,12 +29,12 @@
 
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
-    genapic->send_IPI_mask(mask, vector);
+    genapic.send_IPI_mask(mask, vector);
 }
 
 void send_IPI_self(int vector)
 {
-    genapic->send_IPI_self(vector);
+    genapic.send_IPI_self(vector);
 }
 
 /*
--- a/xen/include/asm-x86/genapic.h
+++ b/xen/include/asm-x86/genapic.h
@@ -47,8 +47,9 @@ struct genapic {
 	APICFUNC(mps_oem_check), \
 	APICFUNC(acpi_madt_oem_check)
 
-extern const struct genapic *genapic;
+extern struct genapic genapic;
 extern const struct genapic apic_default;
+extern const struct genapic apic_bigsmp;
 
 void send_IPI_self_legacy(uint8_t vector);
 
--- a/xen/include/asm-x86/mach-generic/mach_apic.h
+++ b/xen/include/asm-x86/mach-generic/mach_apic.h
@@ -10,13 +10,13 @@
 #define esr_disable (0)
 
 /* The following are dependent on APIC delivery mode (logical vs. physical). */
-#define INT_DELIVERY_MODE (genapic->int_delivery_mode)
-#define INT_DEST_MODE (genapic->int_dest_mode)
+#define INT_DELIVERY_MODE (genapic.int_delivery_mode)
+#define INT_DEST_MODE (genapic.int_dest_mode)
 #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
-#define init_apic_ldr (genapic->init_apic_ldr)
-#define clustered_apic_check (genapic->clustered_apic_check) 
-#define cpu_mask_to_apicid (genapic->cpu_mask_to_apicid)
-#define vector_allocation_cpumask(cpu) (genapic->vector_allocation_cpumask(cpu))
+#define init_apic_ldr (genapic.init_apic_ldr)
+#define clustered_apic_check (genapic.clustered_apic_check)
+#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
+#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
 
 static inline void enable_apic_mode(void)
 {




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v4 06/12] x86/genapic: patch indirect calls to direct ones
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (4 preceding siblings ...)
  2018-10-02 10:14   ` [PATCH v4 05/12] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
@ 2018-10-02 10:14   ` Jan Beulich
  2018-10-03 19:07     ` Andrew Cooper
  2018-10-02 10:15   ` [PATCH v4 07/12] x86/cpuidle: patch some " Jan Beulich
                     ` (5 subsequent siblings)
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

For (I hope) obvious reasons, only the ones used at runtime get
converted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -29,12 +29,12 @@
 
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
-    genapic.send_IPI_mask(mask, vector);
+    alternative_vcall(genapic.send_IPI_mask, mask, vector);
 }
 
 void send_IPI_self(int vector)
 {
-    genapic.send_IPI_self(vector);
+    alternative_vcall(genapic.send_IPI_self, vector);
 }
 
 /*
--- a/xen/include/asm-x86/mach-generic/mach_apic.h
+++ b/xen/include/asm-x86/mach-generic/mach_apic.h
@@ -15,8 +15,18 @@
 #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
 #define init_apic_ldr (genapic.init_apic_ldr)
 #define clustered_apic_check (genapic.clustered_apic_check)
-#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
-#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
+#define cpu_mask_to_apicid(mask) ({ \
+	/* \
+	 * There are a number of places where the address of a local variable \
+	 * gets passed here. The use of ?: in alternative_call<N>() triggers an \
+	 * "address of ... is always true" warning in such a case with at least \
+	 * gcc 7 and 8. Hence the seemingly pointless local variable here. \
+	 */ \
+	const cpumask_t *m_ = (mask); \
+	alternative_call(genapic.cpu_mask_to_apicid, m_); \
+})
+#define vector_allocation_cpumask(cpu) \
+	alternative_call(genapic.vector_allocation_cpumask, cpu)
 
 static inline void enable_apic_mode(void)
 {





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v4 07/12] x86/cpuidle: patch some indirect calls to direct ones
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (5 preceding siblings ...)
  2018-10-02 10:14   ` [PATCH v4 06/12] x86/genapic: patch indirect calls to direct ones Jan Beulich
@ 2018-10-02 10:15   ` Jan Beulich
  2018-10-04 10:35     ` Andrew Cooper
  2018-10-02 10:16   ` [PATCH v4 08/12] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
                     ` (4 subsequent siblings)
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:15 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

For now only the ones used when entering/exiting idle states are
converted. Additionally, pm_idle{,_save} and lapic_timer_{on,off} can't
be converted, as they may get established rather late (when Dom0 is
already active).

Note that for patching to be deferred until after the pre-SMP initcalls
(from where cpuidle_init_cpu() runs the first time) the pointers need to
start out as NULL.
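
A sketch of the resulting two-pass behaviour for one of the hooks:

    uint64_t (*__read_mostly cpuidle_get_tick)(void); /* NULL at build time */

    /* Call sites compile to a patchable indirect CALL: */
    t1 = alternative_call(cpuidle_get_tick);

On the first patching pass the destination is still NULL, so the site
gets skipped; cpuidle_init_cpu() then sets the pointer from a pre-SMP
initcall, and alternative_branches() patches in the direct CALL on the
second pass.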

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -102,8 +102,6 @@ bool lapic_timer_init(void)
     return true;
 }
 
-static uint64_t (*__read_mostly tick_to_ns)(uint64_t) = acpi_pm_tick_to_ns;
-
 void (*__read_mostly pm_idle_save)(void);
 unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER - 1;
 integer_param("max_cstate", max_cstate);
@@ -289,9 +287,9 @@ static uint64_t acpi_pm_ticks_elapsed(ui
         return ((0xFFFFFFFF - t1) + t2 +1);
 }
 
-uint64_t (*__read_mostly cpuidle_get_tick)(void) = get_acpi_pm_tick;
-static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t)
-    = acpi_pm_ticks_elapsed;
+uint64_t (*__read_mostly cpuidle_get_tick)(void);
+static uint64_t (*__read_mostly tick_to_ns)(uint64_t);
+static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t);
 
 static void print_acpi_power(uint32_t cpu, struct acpi_processor_power *power)
 {
@@ -547,7 +545,7 @@ void update_idle_stats(struct acpi_proce
                        struct acpi_processor_cx *cx,
                        uint64_t before, uint64_t after)
 {
-    int64_t sleep_ticks = ticks_elapsed(before, after);
+    int64_t sleep_ticks = alternative_call(ticks_elapsed, before, after);
     /* Interrupts are disabled */
 
     spin_lock(&power->stat_lock);
@@ -555,7 +553,8 @@ void update_idle_stats(struct acpi_proce
     cx->usage++;
     if ( sleep_ticks > 0 )
     {
-        power->last_residency = tick_to_ns(sleep_ticks) / 1000UL;
+        power->last_residency = alternative_call(tick_to_ns, sleep_ticks) /
+                                1000UL;
         cx->time += sleep_ticks;
     }
     power->last_state = &power->states[0];
@@ -635,7 +634,7 @@ static void acpi_processor_idle(void)
         if ( cx->type == ACPI_STATE_C1 || local_apic_timer_c2_ok )
         {
             /* Get start time (ticks) */
-            t1 = cpuidle_get_tick();
+            t1 = alternative_call(cpuidle_get_tick);
             /* Trace cpu idle entry */
             TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -644,7 +643,7 @@ static void acpi_processor_idle(void)
             /* Invoke C2 */
             acpi_idle_do_entry(cx);
             /* Get end time (ticks) */
-            t2 = cpuidle_get_tick();
+            t2 = alternative_call(cpuidle_get_tick);
             trace_exit_reason(irq_traced);
             /* Trace cpu idle exit */
             TRACE_6D(TRC_PM_IDLE_EXIT, cx->idx, t2,
@@ -666,7 +665,7 @@ static void acpi_processor_idle(void)
         lapic_timer_off();
 
         /* Get start time (ticks) */
-        t1 = cpuidle_get_tick();
+        t1 = alternative_call(cpuidle_get_tick);
         /* Trace cpu idle entry */
         TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -717,7 +716,7 @@ static void acpi_processor_idle(void)
         }
 
         /* Get end time (ticks) */
-        t2 = cpuidle_get_tick();
+        t2 = alternative_call(cpuidle_get_tick);
 
         /* recovering TSC */
         cstate_restore_tsc();
@@ -827,11 +826,20 @@ int cpuidle_init_cpu(unsigned int cpu)
     {
         unsigned int i;
 
-        if ( cpu == 0 && boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+        if ( cpu == 0 && system_state < SYS_STATE_active )
         {
-            cpuidle_get_tick = get_stime_tick;
-            ticks_elapsed = stime_ticks_elapsed;
-            tick_to_ns = stime_tick_to_ns;
+            if ( boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+            {
+                cpuidle_get_tick = get_stime_tick;
+                ticks_elapsed = stime_ticks_elapsed;
+                tick_to_ns = stime_tick_to_ns;
+            }
+            else
+            {
+                cpuidle_get_tick = get_acpi_pm_tick;
+                ticks_elapsed = acpi_pm_ticks_elapsed;
+                tick_to_ns = acpi_pm_tick_to_ns;
+            }
         }
 
         acpi_power = xzalloc(struct acpi_processor_power);
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -778,7 +778,7 @@ static void mwait_idle(void)
 	if (!(lapic_timer_reliable_states & (1 << cstate)))
 		lapic_timer_off();
 
-	before = cpuidle_get_tick();
+	before = alternative_call(cpuidle_get_tick);
 	TRACE_4D(TRC_PM_IDLE_ENTRY, cx->type, before, exp, pred);
 
 	update_last_cx_stat(power, cx, before);
@@ -786,7 +786,7 @@ static void mwait_idle(void)
 	if (cpu_is_haltable(cpu))
 		mwait_idle_with_hints(eax, MWAIT_ECX_INTERRUPT_BREAK);
 
-	after = cpuidle_get_tick();
+	after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();
 	trace_exit_reason(irq_traced);




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v4 08/12] cpufreq: convert to a single post-init driver (hooks) instance
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (6 preceding siblings ...)
  2018-10-02 10:15   ` [PATCH v4 07/12] x86/cpuidle: patch some " Jan Beulich
@ 2018-10-02 10:16   ` Jan Beulich
  2018-10-04 10:36     ` Andrew Cooper
  2018-10-02 10:16   ` [PATCH v4 09/12] cpufreq: patch target() indirect call to direct one Jan Beulich
                     ` (3 subsequent siblings)
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:16 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

This reduces the post-init memory footprint, eliminates a pointless
level of indirection at the use sites, and allows for subsequent
alternatives call patching.

Take the opportunity and also add a name to the PowerNow! instance.
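
To illustrate the resulting pattern, here is a minimal userspace sketch
(illustrative names only, not the Xen code): registration copies the
driver's const hook table into the one writable instance, so the
per-driver objects can live in init-only memory and use sites lose a
pointer dereference.

#include <stdio.h>

struct driver {
    const char *name;
    int (*init)(unsigned int cpu);
};

/* The single post-init instance; registered drivers can stay const. */
static struct driver active_driver;

static int register_driver(const struct driver *drv)
{
    if ( !drv || !drv->init )
        return -1;
    if ( active_driver.init )      /* a driver was already registered */
        return -2;
    active_driver = *drv;          /* copy the hooks, drop the pointer */
    return 0;
}

static int toy_init(unsigned int cpu)
{
    return printf("init cpu%u\n", cpu);
}

int main(void)
{
    static const struct driver toy = { .name = "toy", .init = toy_init };

    register_driver(&toy);
    active_driver.init(0);         /* use site: one indirection less */
    return 0;
}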

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v2: New.

--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -53,8 +53,6 @@ enum {
 
 struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
 
-static struct cpufreq_driver acpi_cpufreq_driver;
-
 static bool __read_mostly acpi_pstate_strict;
 boolean_param("acpi_pstate_strict", acpi_pstate_strict);
 
@@ -355,7 +353,7 @@ static void feature_detect(void *info)
     if ( cpu_has_aperfmperf )
     {
         policy->aperf_mperf = 1;
-        acpi_cpufreq_driver.getavg = get_measured_perf;
+        cpufreq_driver.getavg = get_measured_perf;
     }
 
     eax = cpuid_eax(6);
@@ -593,7 +591,7 @@ acpi_cpufreq_cpu_init(struct cpufreq_pol
         policy->cur = acpi_cpufreq_guess_freq(data, policy->cpu);
         break;
     case ACPI_ADR_SPACE_FIXED_HARDWARE:
-        acpi_cpufreq_driver.get = get_cur_freq_on_cpu;
+        cpufreq_driver.get = get_cur_freq_on_cpu;
         policy->cur = get_cur_freq_on_cpu(cpu);
         break;
     default:
@@ -635,7 +633,7 @@ static int acpi_cpufreq_cpu_exit(struct
     return 0;
 }
 
-static struct cpufreq_driver acpi_cpufreq_driver = {
+static const struct cpufreq_driver __initconstrel acpi_cpufreq_driver = {
     .name   = "acpi-cpufreq",
     .verify = acpi_cpufreq_verify,
     .target = acpi_cpufreq_target,
@@ -656,7 +654,7 @@ static int __init cpufreq_driver_init(vo
 
     return ret;
 }
-__initcall(cpufreq_driver_init);
+presmp_initcall(cpufreq_driver_init);
 
 int cpufreq_cpu_init(unsigned int cpuid)
 {
--- a/xen/arch/x86/acpi/cpufreq/powernow.c
+++ b/xen/arch/x86/acpi/cpufreq/powernow.c
@@ -52,8 +52,6 @@
 
 #define ARCH_CPU_FLAG_RESUME	1
 
-static struct cpufreq_driver powernow_cpufreq_driver;
-
 static void transition_pstate(void *pstate)
 {
     wrmsrl(MSR_PSTATE_CTRL, *(unsigned int *)pstate);
@@ -215,7 +213,7 @@ static void feature_detect(void *info)
     if ( cpu_has_aperfmperf )
     {
         policy->aperf_mperf = 1;
-        powernow_cpufreq_driver.getavg = get_measured_perf;
+        cpufreq_driver.getavg = get_measured_perf;
     }
 
     edx = cpuid_edx(CPUID_FREQ_VOLT_CAPABILITIES);
@@ -347,7 +345,8 @@ static int powernow_cpufreq_cpu_exit(str
     return 0;
 }
 
-static struct cpufreq_driver powernow_cpufreq_driver = {
+static const struct cpufreq_driver __initconstrel powernow_cpufreq_driver = {
+    .name   = "powernow",
     .verify = powernow_cpufreq_verify,
     .target = powernow_cpufreq_target,
     .init   = powernow_cpufreq_cpu_init,
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -64,7 +64,7 @@ int do_get_pm_info(struct xen_sysctl_get
     case PMSTAT_PX:
         if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
             return -ENODEV;
-        if ( !cpufreq_driver )
+        if ( !cpufreq_driver.init )
             return -ENODEV;
         if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
             return -EINVAL;
@@ -255,16 +255,16 @@ static int get_cpufreq_para(struct xen_s
         return ret;
 
     op->u.get_para.cpuinfo_cur_freq =
-        cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
+        cpufreq_driver.get ? cpufreq_driver.get(op->cpuid) : policy->cur;
     op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
     op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
     op->u.get_para.scaling_cur_freq = policy->cur;
     op->u.get_para.scaling_max_freq = policy->max;
     op->u.get_para.scaling_min_freq = policy->min;
 
-    if ( cpufreq_driver->name[0] )
+    if ( cpufreq_driver.name[0] )
         strlcpy(op->u.get_para.scaling_driver, 
-            cpufreq_driver->name, CPUFREQ_NAME_LEN);
+            cpufreq_driver.name, CPUFREQ_NAME_LEN);
     else
         strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
 
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -172,7 +172,7 @@ int cpufreq_add_cpu(unsigned int cpu)
     if ( !(perf->init & XEN_PX_INIT) )
         return -EINVAL;
 
-    if (!cpufreq_driver)
+    if (!cpufreq_driver.init)
         return 0;
 
     if (per_cpu(cpufreq_cpu_policy, cpu))
@@ -239,7 +239,7 @@ int cpufreq_add_cpu(unsigned int cpu)
         policy->cpu = cpu;
         per_cpu(cpufreq_cpu_policy, cpu) = policy;
 
-        ret = cpufreq_driver->init(policy);
+        ret = cpufreq_driver.init(policy);
         if (ret) {
             free_cpumask_var(policy->cpus);
             xfree(policy);
@@ -298,7 +298,7 @@ err1:
     cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
     if (cpumask_empty(policy->cpus)) {
-        cpufreq_driver->exit(policy);
+        cpufreq_driver.exit(policy);
         free_cpumask_var(policy->cpus);
         xfree(policy);
     }
@@ -362,7 +362,7 @@ int cpufreq_del_cpu(unsigned int cpu)
     cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
     if (cpumask_empty(policy->cpus)) {
-        cpufreq_driver->exit(policy);
+        cpufreq_driver.exit(policy);
         free_cpumask_var(policy->cpus);
         xfree(policy);
     }
@@ -663,17 +663,17 @@ static int __init cpufreq_presmp_init(vo
 }
 presmp_initcall(cpufreq_presmp_init);
 
-int __init cpufreq_register_driver(struct cpufreq_driver *driver_data)
+int __init cpufreq_register_driver(const struct cpufreq_driver *driver_data)
 {
    if ( !driver_data || !driver_data->init ||
         !driver_data->verify || !driver_data->exit ||
         (!driver_data->target == !driver_data->setpolicy) )
         return -EINVAL;
 
-    if ( cpufreq_driver )
+    if ( cpufreq_driver.init )
         return -EBUSY;
 
-    cpufreq_driver = driver_data;
+    cpufreq_driver = *driver_data;
 
     return 0;
 }
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -31,7 +31,7 @@
 #include <acpi/cpufreq/cpufreq.h>
 #include <public/sysctl.h>
 
-struct cpufreq_driver   *cpufreq_driver;
+struct cpufreq_driver __read_mostly cpufreq_driver;
 struct processor_pminfo *__read_mostly processor_pminfo[NR_CPUS];
 DEFINE_PER_CPU_READ_MOSTLY(struct cpufreq_policy *, cpufreq_cpu_policy);
 
@@ -360,11 +360,11 @@ int __cpufreq_driver_target(struct cpufr
 {
     int retval = -EINVAL;
 
-    if (cpu_online(policy->cpu) && cpufreq_driver->target)
+    if (cpu_online(policy->cpu) && cpufreq_driver.target)
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver->target(policy, target_freq, relation);
+        retval = cpufreq_driver.target(policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }
@@ -380,9 +380,9 @@ int cpufreq_driver_getavg(unsigned int c
     if (!cpu_online(cpu) || !(policy = per_cpu(cpufreq_cpu_policy, cpu)))
         return 0;
 
-    if (cpufreq_driver->getavg)
+    if (cpufreq_driver.getavg)
     {
-        freq_avg = cpufreq_driver->getavg(cpu, flag);
+        freq_avg = cpufreq_driver.getavg(cpu, flag);
         if (freq_avg > 0)
             return freq_avg;
     }
@@ -412,9 +412,9 @@ int cpufreq_update_turbo(int cpuid, int
         return 0;
 
     policy->turbo = new_state;
-    if (cpufreq_driver->update)
+    if (cpufreq_driver.update)
     {
-        ret = cpufreq_driver->update(cpuid, policy);
+        ret = cpufreq_driver.update(cpuid, policy);
         if (ret)
             policy->turbo = curr_state;
     }
@@ -450,15 +450,15 @@ int __cpufreq_set_policy(struct cpufreq_
         return -EINVAL;
 
     /* verify the cpu speed can be set within this limit */
-    ret = cpufreq_driver->verify(policy);
+    ret = cpufreq_driver.verify(policy);
     if (ret)
         return ret;
 
     data->min = policy->min;
     data->max = policy->max;
     data->limits = policy->limits;
-    if (cpufreq_driver->setpolicy)
-        return cpufreq_driver->setpolicy(data);
+    if (cpufreq_driver.setpolicy)
+        return cpufreq_driver.setpolicy(data);
 
     if (policy->governor != data->governor) {
         /* save old, working values */
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -153,7 +153,7 @@ __cpufreq_governor(struct cpufreq_policy
 #define CPUFREQ_RELATION_H 1  /* highest frequency below or at target */
 
 struct cpufreq_driver {
-    char   name[CPUFREQ_NAME_LEN];
+    const char *name;
     int    (*init)(struct cpufreq_policy *policy);
     int    (*verify)(struct cpufreq_policy *policy);
     int    (*setpolicy)(struct cpufreq_policy *policy);
@@ -166,9 +166,9 @@ struct cpufreq_driver {
     int    (*exit)(struct cpufreq_policy *policy);
 };
 
-extern struct cpufreq_driver *cpufreq_driver;
+extern struct cpufreq_driver cpufreq_driver;
 
-int cpufreq_register_driver(struct cpufreq_driver *);
+int cpufreq_register_driver(const struct cpufreq_driver *);
 
 static __inline__
 void cpufreq_verify_within_limits(struct cpufreq_policy *policy,





* [PATCH v4 09/12] cpufreq: patch target() indirect call to direct one
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (7 preceding siblings ...)
  2018-10-02 10:16   ` [PATCH v4 08/12] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
@ 2018-10-02 10:16   ` Jan Beulich
  2018-10-04 10:36     ` Andrew Cooper
  2018-10-02 10:18   ` [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option Jan Beulich
                     ` (2 subsequent siblings)
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:16 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

This looks to be the only frequently executed hook; don't bother
patching any other ones.
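
As a standalone toy of what the conversion buys at the call site
(made-up names, and of course not the real patching machinery), given a
hook that never changes once set during boot:

#include <stdio.h>

struct drv {
    int (*target)(unsigned int freq);
};

static int acpi_target(unsigned int freq)
{
    return printf("-> %u kHz\n", freq);
}

static struct drv driver = { .target = acpi_target };

int main(void)
{
    /* Today: an indirect call, made expensive by retpolines. */
    driver.target(1000000);

    /* What alternative_call() rewrites the site into at patch time: */
    acpi_target(1000000);
    return 0;
}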

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v2: New.

--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -364,7 +364,8 @@ int __cpufreq_driver_target(struct cpufr
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver.target(policy, target_freq, relation);
+        retval = alternative_call(cpufreq_driver.target,
+                                  policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }




* [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (8 preceding siblings ...)
  2018-10-02 10:16   ` [PATCH v4 09/12] cpufreq: patch target() indirect call to direct one Jan Beulich
@ 2018-10-02 10:18   ` Jan Beulich
  2018-10-02 10:38     ` Julien Grall
  2018-10-02 10:18   ` [PATCH v4 11/12] IOMMU: remove indirection from certain IOMMU hook accesses Jan Beulich
  2018-10-02 10:19   ` [PATCH v4 12/12] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  11 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:18 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

ARM is intended to gain support for heterogeneous IOMMUs on a single
system. This not only disallows boot time replacement of respective
indirect calls (handling of which is the main goal of the introduction
here), but more generally disallows calls using the iommu_ops() return
value directly - all such calls need to have means (commonly a domain
pointer) to know the targeted IOMMU.

Disallow all hooks lacking such context for the time being, which in
effect is some dead code elimination for ARM. Once extended suitably,
individual hooks can be moved out of their guards again in the future.
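
A trivial sketch of the guarding this refers to (toy code, assuming the
Kconfig symbol is selected for Arm as the hunk below arranges): hooks
lacking a domain pointer simply drop out of the build.

#include <stdio.h>

/* Selected by Arm's Kconfig in the hunk below; never set for x86. */
/* #define CONFIG_IOMMU_MIXED 1 */

#ifndef CONFIG_IOMMU_MIXED
/* A context-free hook: only usable while a single IOMMU kind exists. */
static void iommu_suspend_all(void)
{
    puts("suspending the (single) IOMMU");
}
#endif

int main(void)
{
#ifndef CONFIG_IOMMU_MIXED
    iommu_suspend_all();
#endif
    return 0;
}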

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: New.

--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -19,6 +19,7 @@ config ARM
 	select HAS_DEVICE_TREE
 	select HAS_PASSTHROUGH
 	select HAS_PDX
+	select IOMMU_MIXED
 
 config ARCH_DEFCONFIG
 	string
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -938,7 +938,7 @@ static int construct_memop_from_reservat
     return 0;
 }
 
-#ifdef CONFIG_HAS_PASSTHROUGH
+#if defined(CONFIG_HAS_PASSTHROUGH) && !defined(CONFIG_IOMMU_MIXED)
 struct get_reserved_device_memory {
     struct xen_reserved_device_memory_map map;
     unsigned int used_entries;
@@ -1550,7 +1550,7 @@ long do_memory_op(unsigned long cmd, XEN
         break;
     }
 
-#ifdef CONFIG_HAS_PASSTHROUGH
+#if defined(CONFIG_HAS_PASSTHROUGH) && !defined(CONFIG_IOMMU_MIXED)
     case XENMEM_reserved_device_memory_map:
     {
         struct get_reserved_device_memory grdm;
--- a/xen/drivers/passthrough/Kconfig
+++ b/xen/drivers/passthrough/Kconfig
@@ -2,6 +2,9 @@
 config HAS_PASSTHROUGH
 	bool
 
+config IOMMU_MIXED
+	bool
+
 if ARM
 config ARM_SMMU
 	bool "ARM SMMUv1 and v2 driver"
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -77,9 +77,11 @@ bool_t __read_mostly amd_iommu_perdev_in
 
 DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
 
+#ifndef CONFIG_IOMMU_MIXED
 DEFINE_SPINLOCK(iommu_pt_cleanup_lock);
 PAGE_LIST_HEAD(iommu_pt_cleanup_list);
 static struct tasklet iommu_pt_cleanup_tasklet;
+#endif
 
 static int __init parse_iommu_param(const char *s)
 {
@@ -246,7 +248,9 @@ void iommu_teardown(struct domain *d)
 
     d->need_iommu = 0;
     hd->platform_ops->teardown(d);
+#ifndef CONFIG_IOMMU_MIXED
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
+#endif
 }
 
 int iommu_construct(struct domain *d)
@@ -332,6 +336,7 @@ int iommu_unmap_page(struct domain *d, u
     return rc;
 }
 
+#ifndef CONFIG_IOMMU_MIXED
 static void iommu_free_pagetables(unsigned long unused)
 {
     do {
@@ -348,6 +353,7 @@ static void iommu_free_pagetables(unsign
     tasklet_schedule_on_cpu(&iommu_pt_cleanup_tasklet,
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
+#endif
 
 int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
                       unsigned int page_count)
@@ -433,12 +439,15 @@ int __init iommu_setup(void)
                iommu_hwdom_passthrough ? "Passthrough" :
                iommu_hwdom_strict ? "Strict" : "Relaxed");
         printk("Interrupt remapping %sabled\n", iommu_intremap ? "en" : "dis");
+#ifndef CONFIG_IOMMU_MIXED
         tasklet_init(&iommu_pt_cleanup_tasklet, iommu_free_pagetables, 0);
+#endif
     }
 
     return rc;
 }
 
+#ifndef CONFIG_IOMMU_MIXED
 int iommu_suspend()
 {
     if ( iommu_enabled )
@@ -453,27 +462,6 @@ void iommu_resume()
         iommu_get_ops()->resume();
 }
 
-int iommu_do_domctl(
-    struct xen_domctl *domctl, struct domain *d,
-    XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
-{
-    int ret = -ENODEV;
-
-    if ( !iommu_enabled )
-        return -ENOSYS;
-
-#ifdef CONFIG_HAS_PCI
-    ret = iommu_do_pci_domctl(domctl, d, u_domctl);
-#endif
-
-#ifdef CONFIG_HAS_DEVICE_TREE
-    if ( ret == -ENODEV )
-        ret = iommu_do_dt_domctl(domctl, d, u_domctl);
-#endif
-
-    return ret;
-}
-
 void iommu_share_p2m_table(struct domain* d)
 {
     if ( iommu_enabled && iommu_use_hap_pt(d) )
@@ -500,6 +488,28 @@ int iommu_get_reserved_device_memory(iom
 
     return ops->get_reserved_device_memory(func, ctxt);
 }
+#endif
+
+int iommu_do_domctl(
+    struct xen_domctl *domctl, struct domain *d,
+    XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
+{
+    int ret = -ENODEV;
+
+    if ( !iommu_enabled )
+        return -ENOSYS;
+
+#ifdef CONFIG_HAS_PCI
+    ret = iommu_do_pci_domctl(domctl, d, u_domctl);
+#endif
+
+#ifdef CONFIG_HAS_DEVICE_TREE
+    if ( ret == -ENODEV )
+        ret = iommu_do_dt_domctl(domctl, d, u_domctl);
+#endif
+
+    return ret;
+}
 
 bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature)
 {
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -147,7 +147,7 @@ struct iommu_ops {
     int (*assign_device)(struct domain *, u8 devfn, device_t *dev, u32 flag);
     int (*reassign_device)(struct domain *s, struct domain *t,
                            u8 devfn, device_t *dev);
-#ifdef CONFIG_HAS_PCI
+#if defined(CONFIG_HAS_PCI) && !defined(CONFIG_IOMMU_MIXED)
     int (*get_device_group_id)(u16 seg, u8 bus, u8 devfn);
     int (*update_ire_from_msi)(struct msi_desc *msi_desc, struct msi_msg *msg);
     void (*read_msi_from_ire)(struct msi_desc *msi_desc, struct msi_msg *msg);
@@ -157,6 +157,7 @@ struct iommu_ops {
     int __must_check (*map_page)(struct domain *d, unsigned long gfn,
                                  unsigned long mfn, unsigned int flags);
     int __must_check (*unmap_page)(struct domain *d, unsigned long gfn);
+#ifndef CONFIG_IOMMU_MIXED
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
@@ -167,10 +168,11 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
+    int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
+#endif
     int __must_check (*iotlb_flush)(struct domain *d, unsigned long gfn,
                                     unsigned int page_count);
     int __must_check (*iotlb_flush_all)(struct domain *d);
-    int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
 };
 





* [PATCH v4 11/12] IOMMU: remove indirection from certain IOMMU hook accesses
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (9 preceding siblings ...)
  2018-10-02 10:18   ` [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option Jan Beulich
@ 2018-10-02 10:18   ` Jan Beulich
  2018-10-02 10:19   ` [PATCH v4 12/12] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  11 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:18 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

In !IOMMU_MIXED mode there's no need to go through an extra level of
indirection. In order to limit code churn, call sites using struct
domain_iommu's platform_ops don't get touched here, however.
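
A minimal userspace sketch of the pattern (names invented for the
example, not the Xen code): the detected vendor's hook table is copied
once into a single global instance, so use sites no longer go through a
per-call vendor switch.

#include <stdio.h>

struct ops {
    int (*init)(void);
};

static int intel_init(void) { return puts("VT-d"); }
static int amd_init(void)   { return puts("AMD-Vi"); }

static const struct ops intel_ops = { .init = intel_init };
static const struct ops amd_ops   = { .init = amd_init };

/* Filled exactly once during vendor detection. */
static struct ops iommu_ops;

static void detect(int is_intel)
{
    iommu_ops = is_intel ? intel_ops : amd_ops;
}

int main(void)
{
    detect(1);
    iommu_ops.init();   /* no per-call vendor switch any more */
    return 0;
}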

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: New.

--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -29,6 +29,8 @@
 
 static bool_t __read_mostly init_done;
 
+static const struct iommu_ops amd_iommu_ops;
+
 struct amd_iommu *find_iommu_for_device(int seg, int bdf)
 {
     struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(seg);
@@ -182,6 +184,8 @@ int __init amd_iov_detect(void)
         return -ENODEV;
     }
 
+    iommu_ops = amd_iommu_ops;
+
     if ( amd_iommu_init() != 0 )
     {
         printk("AMD-Vi: Error initialization\n");
@@ -607,7 +611,7 @@ static void amd_dump_p2m_table(struct do
     amd_dump_p2m_table_level(hd->arch.root_table, hd->arch.paging_mode, 0, 0);
 }
 
-const struct iommu_ops amd_iommu_ops = {
+static const struct iommu_ops __initconstrel amd_iommu_ops = {
     .init = amd_iommu_domain_init,
     .hwdom_init = amd_iommu_hwdom_init,
     .add_device = amd_iommu_add_device,
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -78,6 +78,8 @@ bool_t __read_mostly amd_iommu_perdev_in
 DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
 
 #ifndef CONFIG_IOMMU_MIXED
+struct iommu_ops iommu_ops;
+
 DEFINE_SPINLOCK(iommu_pt_cleanup_lock);
 PAGE_LIST_HEAD(iommu_pt_cleanup_list);
 static struct tasklet iommu_pt_cleanup_tasklet;
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -27,6 +27,7 @@
 
 struct pci_ats_dev;
 extern bool_t rwbf_quirk;
+extern const struct iommu_ops intel_iommu_ops;
 
 void print_iommu_regs(struct acpi_drhd_unit *drhd);
 void print_vtd_entries(struct iommu *iommu, int bus, int devfn, u64 gmfn);
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -897,6 +897,8 @@ int iommu_enable_x2apic_IR(void)
     else if ( !x2apic_enabled )
         return -EOPNOTSUPP;
 
+    iommu_ops = intel_iommu_ops;
+
     for_each_drhd_unit ( drhd )
     {
         iommu = drhd->iommu;
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2251,6 +2251,8 @@ int __init intel_vtd_setup(void)
         goto error;
     }
 
+    iommu_ops = intel_iommu_ops;
+
     /* We enable the following features only if they are supported by all VT-d
      * engines: Snoop Control, DMA passthrough, Queued Invalidation, Interrupt
      * Remapping, and Posted Interrupt
@@ -2650,7 +2652,7 @@ static void vtd_dump_p2m_table(struct do
     vtd_dump_p2m_table_level(hd->arch.pgd_maddr, agaw_to_level(hd->arch.agaw), 0, 0);
 }
 
-const struct iommu_ops intel_iommu_ops = {
+const struct iommu_ops __initconstrel intel_iommu_ops = {
     .init = intel_iommu_domain_init,
     .hwdom_init = intel_iommu_hwdom_init,
     .add_device = intel_iommu_add_device,
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -44,26 +44,9 @@ struct arch_iommu
     struct guest_iommu *g_iommu;
 };
 
-extern const struct iommu_ops intel_iommu_ops;
-extern const struct iommu_ops amd_iommu_ops;
 int intel_vtd_setup(void);
 int amd_iov_detect(void);
 
-static inline const struct iommu_ops *iommu_get_ops(void)
-{
-    switch ( boot_cpu_data.x86_vendor )
-    {
-    case X86_VENDOR_INTEL:
-        return &intel_iommu_ops;
-    case X86_VENDOR_AMD:
-        return &amd_iommu_ops;
-    }
-
-    BUG();
-
-    return NULL;
-}
-
 static inline int iommu_hardware_setup(void)
 {
     switch ( boot_cpu_data.x86_vendor )
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -176,6 +176,16 @@ struct iommu_ops {
     void (*dump_p2m_table)(struct domain *d);
 };
 
+#ifndef CONFIG_IOMMU_MIXED
+extern struct iommu_ops iommu_ops;
+
+static inline const struct iommu_ops *iommu_get_ops(void)
+{
+    BUG_ON(!iommu_ops.init);
+    return &iommu_ops;
+}
+#endif
+
 int __must_check iommu_suspend(void);
 void iommu_resume(void);
 void iommu_crash_shutdown(void);





* [PATCH v4 12/12] IOMMU: patch certain indirect calls to direct ones
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
                     ` (10 preceding siblings ...)
  2018-10-02 10:18   ` [PATCH v4 11/12] IOMMU: remove indirection from certain IOMMU hook accesses Jan Beulich
@ 2018-10-02 10:19   ` Jan Beulich
  11 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:19 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

This is intentionally not touching hooks used rarely (or not at all)
during the lifetime of a VM.
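
For reference, a compilable toy of the macro shape introduced below.
The IOMMU_MIXED form is shown as in the patch; the !IOMMU_MIXED (x86)
form, which expands to the alternative call patching machinery instead,
is only hinted at in a comment.

#include <stdio.h>

struct ops {
    void (*flush)(int count);
};

static void my_flush(int count)
{
    printf("flush %d pages\n", count);
}

static const struct ops platform_ops = { .flush = my_flush };

/* IOMMU_MIXED form, as in the hunk below: a plain indirect call. */
#define iommu_vcall(ops, fn, args...) ((ops)->fn(args))

/* The !IOMMU_MIXED form would instead expand to
 * alternative_vcall(iommu_ops.fn, ## args). */

int main(void)
{
    iommu_vcall(&platform_ops, flush, 3);
    return 0;
}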

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: New.

--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -228,7 +228,8 @@ void __hwdom_init iommu_hwdom_init(struc
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            ret = iommu_call(hd->platform_ops, map_page,
+                             d, gfn, mfn, mapping);
             if ( !rc )
                 rc = ret;
 
@@ -300,7 +301,7 @@ int iommu_map_page(struct domain *d, uns
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = iommu_call(hd->platform_ops, map_page, d, gfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -323,7 +324,7 @@ int iommu_unmap_page(struct domain *d, u
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, gfn);
+    rc = iommu_call(hd->platform_ops, unmap_page, d, gfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -349,7 +350,7 @@ static void iommu_free_pagetables(unsign
         spin_unlock(&iommu_pt_cleanup_lock);
         if ( !pg )
             return;
-        iommu_get_ops()->free_page_table(pg);
+        iommu_vcall(iommu_get_ops(), free_page_table, pg);
     } while ( !softirq_pending(smp_processor_id()) );
 
     tasklet_schedule_on_cpu(&iommu_pt_cleanup_tasklet,
@@ -366,7 +367,7 @@ int iommu_iotlb_flush(struct domain *d,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, gfn, page_count);
+    rc = iommu_call(hd->platform_ops, iotlb_flush, d, gfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -389,7 +390,7 @@ int iommu_iotlb_flush_all(struct domain
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush_all(d);
+    rc = iommu_call(hd->platform_ops, iotlb_flush_all, d);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1301,14 +1301,14 @@ int iommu_update_ire_from_msi(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     return iommu_intremap
-           ? iommu_get_ops()->update_ire_from_msi(msi_desc, msg) : 0;
+           ? iommu_call(iommu_ops, update_ire_from_msi, msi_desc, msg) : 0;
 }
 
 void iommu_read_msi_from_ire(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     if ( iommu_intremap )
-        iommu_get_ops()->read_msi_from_ire(msi_desc, msg);
+        iommu_vcall(iommu_ops, read_msi_from_ire, msi_desc, msg);
 }
 
 static int iommu_add_device(struct pci_dev *pdev)
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -26,14 +26,12 @@
 void iommu_update_ire_from_apic(
     unsigned int apic, unsigned int reg, unsigned int value)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    ops->update_ire_from_apic(apic, reg, value);
+    iommu_vcall(iommu_ops, update_ire_from_apic, apic, reg, value);
 }
 
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    return ops->read_apic_from_ire(apic, reg);
+    return iommu_call(iommu_ops, read_apic_from_ire, apic, reg);
 }
 
 int __init iommu_setup_hpet_msi(struct msi_desc *msi)
@@ -44,7 +42,6 @@ int __init iommu_setup_hpet_msi(struct m
 
 int arch_iommu_populate_page_table(struct domain *d)
 {
-    const struct domain_iommu *hd = dom_iommu(d);
     struct page_info *page;
     int rc = 0, n = 0;
 
@@ -68,9 +65,8 @@ int arch_iommu_populate_page_table(struc
             {
                 ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
                 BUG_ON(SHARED_M2P(gfn));
-                rc = hd->platform_ops->map_page(d, gfn, mfn,
-                                                IOMMUF_readable |
-                                                IOMMUF_writable);
+                rc = iommu_call(iommu_ops, map_page, d, gfn, mfn,
+                                IOMMUF_readable | IOMMUF_writable);
             }
             if ( rc )
             {
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -176,9 +176,17 @@ struct iommu_ops {
     void (*dump_p2m_table)(struct domain *d);
 };
 
-#ifndef CONFIG_IOMMU_MIXED
+#ifdef CONFIG_IOMMU_MIXED
+# define iommu_call(ops, fn, args...) ((ops)->fn(args))
+# define iommu_vcall iommu_call
+#else
+# include <asm/alternative.h>
+
 extern struct iommu_ops iommu_ops;
 
+# define iommu_call(ops, fn, args...)  alternative_call(iommu_ops.fn, ## args)
+# define iommu_vcall(ops, fn, args...) alternative_vcall(iommu_ops.fn, ## args)
+
 static inline const struct iommu_ops *iommu_get_ops(void)
 {
     BUG_ON(!iommu_ops.init);





* Re: [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-10-02 10:18   ` [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option Jan Beulich
@ 2018-10-02 10:38     ` Julien Grall
  2018-10-02 10:42       ` Jan Beulich
  2018-11-06 15:48       ` Jan Beulich
  0 siblings, 2 replies; 119+ messages in thread
From: Julien Grall @ 2018-10-02 10:38 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan

Hi,

On 02/10/2018 11:18, Jan Beulich wrote:
> ARM is intended to gain support for heterogeneous IOMMUs on a single
> system. This not only disallows boot time replacement of respective
> indirect calls (handling of which is the main goal of the introduction
> here), but more generally disallows calls using the iommu_ops() return
> value directly - all such calls need to have means (commonly a domain
> pointer) to know the targeted IOMMU.
> 
> Disallow all hooks lacking such context for the time being, which in
> effect is some dead code elimination for ARM. Once extended suitably,
> individual hooks can be moved out of their guards again in the future.

While in theory it is possible to have a platform with heterogeneous
IOMMUs, I don't see such support coming to Xen in the foreseeable
future. Note that even Linux does not support such a case.

This patch is going to make it more complicated to unshare page tables,
as we would now need to take care of the mixed case. So I would rather
not set IOMMU_MIXED on Arm until we have a use case for it.

Cheers,


-- 
Julien Grall


* Re: [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-10-02 10:38     ` Julien Grall
@ 2018-10-02 10:42       ` Jan Beulich
  2018-10-02 11:00         ` Julien Grall
  2018-11-06 15:48       ` Jan Beulich
  1 sibling, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 10:42 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 02.10.18 at 12:38, <julien.grall@arm.com> wrote:
> On 02/10/2018 11:18, Jan Beulich wrote:
>> ARM is intended to gain support for heterogeneous IOMMUs on a single
>> system. This not only disallows boot time replacement of respective
>> indirect calls (handling of which is the main goal of the introduction
>> here), but more generally disallows calls using the iommu_ops() return
>> value directly - all such calls need to have means (commonly a domain
>> pointer) to know the targeted IOMMU.
>> 
>> Disallow all hooks lacking such context for the time being, which in
>> effect is some dead code elimination for ARM. Once extended suitably,
>> individual hooks can be moved out of their guards again in the future.
> 
> While in theory it is possible to have a platform with heterogeneous
> IOMMUs, I don't see such support coming to Xen in the foreseeable
> future. Note that even Linux does not support such a case.
>
> This patch is going to make it more complicated to unshare page tables,
> as we would now need to take care of the mixed case. So I would rather
> not set IOMMU_MIXED on Arm until we have a use case for it.

Interesting. In essence this is the exact opposite of what you told
me when I inquired about indirect call patching of the IOMMU code.
The sole purpose of this new option is to have a key off of which I
can tell whether to use patchable indirect calls or plain ones.

I'm also not following how this change would complicate anything
for you. There's effectively no change for ARM, except for some
dead code not getting built anymore.

Jan




* Re: [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-10-02 10:42       ` Jan Beulich
@ 2018-10-02 11:00         ` Julien Grall
  2018-10-02 11:58           ` Jan Beulich
  0 siblings, 1 reply; 119+ messages in thread
From: Julien Grall @ 2018-10-02 11:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

Hi,

On 02/10/2018 11:42, Jan Beulich wrote:
>>>> On 02.10.18 at 12:38, <julien.grall@arm.com> wrote:
>> On 02/10/2018 11:18, Jan Beulich wrote:
>>> ARM is intended to gain support for heterogeneous IOMMUs on a single
>>> system. This not only disallows boot time replacement of respective
>>> indirect calls (handling of which is the main goal of the introduction
>>> here), but more generally disallows calls using the iommu_ops() return
>>> value directly - all such calls need to have means (commonly a domain
>>> pointer) to know the targeted IOMMU.
>>>
>>> Disallow all hooks lacking such context for the time being, which in
>>> effect is some dead code elimination for ARM. Once extended suitably,
>>> individual hooks can be moved out of their guards again in the future.
>>
>> While in theory it is possible to have a platform with heterogeneous
>> IOMMUs, I don't see such support coming to Xen in the foreseeable
>> future. Note that even Linux does not support such a case.
>>
>> This patch is going to make it more complicated to unshare page tables,
>> as we would now need to take care of the mixed case. So I would rather
>> not set IOMMU_MIXED on Arm until we have a use case for it.
> 
> Interesting. In essence this is the exact opposite of what you told
> me when I inquired about indirect call patching of the IOMMU code.
> The sole purpose of this new option is to have a key off of which I
> can tell whether to use patchable indirect calls or plain ones.

I don't think I am saying the opposite. I opened the door for the
mixed use case. That does not mean I want to see half of the helpers
dropped on Arm because you want them to be patchable. There are other
ways to do it.

> 
> I'm also not following how this change would complicate anything
> for you. There's effectively no change for ARM, except for some
> dead code not getting built anymore.

Well, with your patch free_page_table, resume & co are under
!IOMMU_MIXED, so they are unusable on Arm until we rework the code to
handle the mixed case. More likely, those callbacks will be necessary
on Arm before the mixed case. So I don't think this patch as such is
what we want for Arm at the moment.

I would much prefer it if you dropped IOMMU_MIXED and implemented
iommu_{v,}call on Arm as iommu_ops->fn(...) (i.e. no alternatives for
now). Calls would then still not be patchable on Arm, yet we would
still have access to all IOMMU functions.
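
(For reference, that plain shape is the one patch 12 of the series
already carries for the IOMMU_MIXED case:

 # define iommu_call(ops, fn, args...) ((ops)->fn(args))
 # define iommu_vcall iommu_call

so only the !IOMMU_MIXED expansion would differ.)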

Cheers,

-- 
Julien Grall


* Re: [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-10-02 11:00         ` Julien Grall
@ 2018-10-02 11:58           ` Jan Beulich
  2018-10-02 12:58             ` Julien Grall
  0 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 11:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 02.10.18 at 13:00, <julien.grall@arm.com> wrote:
> On 02/10/2018 11:42, Jan Beulich wrote:
>>>>> On 02.10.18 at 12:38, <julien.grall@arm.com> wrote:
>>> On 02/10/2018 11:18, Jan Beulich wrote:
>>>> ARM is intended to gain support for heterogeneous IOMMUs on a single
>>>> system. This not only disallows boot time replacement of respective
>>>> indirect calls (handling of which is the main goal of the introduction
>>>> here), but more generally disallows calls using the iommu_ops() return
>>>> value directly - all such calls need to have means (commonly a domain
>>>> pointer) to know the targeted IOMMU.
>>>>
>>>> Disallow all hooks lacking such context for the time being, which in
>>>> effect is some dead code elimination for ARM. Once extended suitably,
>>>> individual hooks can be moved out of their guards again in the future.
>>>
>>> While in theory it is possible to have a platform with heterogeneous
>>> IOMMUs, I don't see such support coming to Xen in the foreseeable
>>> future. Note that even Linux does not support such a case.
>>>
>>> This patch is going to make it more complicated to unshare page tables,
>>> as we would now need to take care of the mixed case. So I would rather
>>> not set IOMMU_MIXED on Arm until we have a use case for it.
>> 
>> Interesting. In essence this is the exact opposite of what you told
>> me when I inquired about indirect call patching of the IOMMU code.
>> The sole purpose of this new option is to have a key off of which I
>> can tell whether to use patchable indirect calls or plain ones.
> 
> I don't think I am saying the opposite. I opened the door for the
> mixed use case. That does not mean I want to see half of the helpers
> dropped on Arm because you want them to be patchable. There are other
> ways to do it.

I drop no helpers (or really hooks); I merely hide ones which can't
(currently) be used in mixed-IOMMU environments. And the mixed-IOMMU
case is what you used to justify requesting that indirect calls here
not be patched for ARM.

>> I'm also not following how this change would complicate anything
>> for you. There's effectively no change for ARM, except for some
>> dead code not getting built anymore.
> 
> Well, with your patch free_page_table, resume & co are under
> !IOMMU_MIXED, so they are unusable on Arm until we rework the code to
> handle the mixed case. More likely, those callbacks will be necessary
> on Arm before the mixed case. So I don't think this patch as such is
> what we want for Arm at the moment.
> 
> I would much prefer it if you dropped IOMMU_MIXED and implemented
> iommu_{v,}call on Arm as iommu_ops->fn(...) (i.e. no alternatives for
> now). Calls would then still not be patchable on Arm, yet we would
> still have access to all IOMMU functions.

Well, that's what I would have done if you hadn't brought up the
mixed-IOMMU case - clearly, without being able to test, I wouldn't
have dared to try and implement patching of indirect calls for ARM.

I can certainly drop that patch, but in the end it means you've
made me do more work than would have been needed to reach
the immediate goal I have.

Jan




* Re: [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-10-02 11:58           ` Jan Beulich
@ 2018-10-02 12:58             ` Julien Grall
  0 siblings, 0 replies; 119+ messages in thread
From: Julien Grall @ 2018-10-02 12:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

Hi Jan,

On 02/10/2018 12:58, Jan Beulich wrote:
> Well, that's what I would have done if you hadn't brought up the
> mixed-IOMMU case - clearly, without being able to test, I wouldn't
> have dared to try and implement patching of indirect calls for ARM.
> 
> I can certainly drop that patch, but in the end it means you've
> made me do more work than would have been needed to reach
> the immediate goal I have.

I have never asked you to re-arrange the code for Arm. I only asked to 
avoid patching indirect calls for Arm and explained why I wanted to 
avoid it. Sorry for the misunderstanding.

Cheers,

-- 
Julien Grall


* Re: [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-10-02 10:12   ` [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
@ 2018-10-02 13:18     ` Paul Durrant
  2018-10-03 18:55     ` Andrew Cooper
  1 sibling, 0 replies; 119+ messages in thread
From: Paul Durrant @ 2018-10-02 13:18 UTC (permalink / raw)
  To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper, Wei Liu

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 02 October 2018 11:13
> To: xen-devel <xen-devel@lists.xenproject.org>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>; Wei Liu <wei.liu2@citrix.com>
> Subject: [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs
> to direct ones
> 
> This is intentionally not touching hooks used rarely (or not at all)
> during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
> as well as nested, VM event, and altp2m ones (they can all be done
> later, if so desired). Virtual Interrupt delivery ones will be dealt
> with in a subsequent patch.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> ---
> v3: Re-base.
> v2: Drop open-coded numbers from macro invocations. Re-base.
> 
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -2104,7 +2104,7 @@ static int hvmemul_write_msr(
>  static int hvmemul_wbinvd(
>      struct x86_emulate_ctxt *ctxt)
>  {
> -    hvm_funcs.wbinvd_intercept();
> +    alternative_vcall(hvm_funcs.wbinvd_intercept);
>      return X86EMUL_OKAY;
>  }
> 
> @@ -2122,7 +2122,7 @@ static int hvmemul_get_fpu(
>      struct vcpu *curr = current;
> 
>      if ( !curr->fpu_dirtied )
> -        hvm_funcs.fpu_dirty_intercept();
> +        alternative_vcall(hvm_funcs.fpu_dirty_intercept);
>      else if ( type == X86EMUL_FPU_fpu )
>      {
>          const typeof(curr->arch.xsave_area->fpu_sse) *fpu_ctxt =
> @@ -2239,7 +2239,7 @@ static void hvmemul_put_fpu(
>          {
>              curr->fpu_dirtied = false;
>              stts();
> -            hvm_funcs.fpu_leave(curr);
> +            alternative_vcall(hvm_funcs.fpu_leave, curr);
>          }
>      }
>  }
> @@ -2401,7 +2401,8 @@ static int _hvm_emulate_one(struct hvm_e
>      if ( hvmemul_ctxt->intr_shadow != new_intr_shadow )
>      {
>          hvmemul_ctxt->intr_shadow = new_intr_shadow;
> -        hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
> +        alternative_vcall(hvm_funcs.set_interrupt_shadow,
> +                          curr, new_intr_shadow);
>      }
> 
>      if ( hvmemul_ctxt->ctxt.retire.hlt &&
> @@ -2538,7 +2539,8 @@ void hvm_emulate_init_once(
> 
>      memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
> 
> -    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
> +    hvmemul_ctxt->intr_shadow =
> +        alternative_call(hvm_funcs.get_interrupt_shadow, curr);
>      hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
>      hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
> 
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -272,12 +272,12 @@ void hvm_set_rdtsc_exiting(struct domain
>      struct vcpu *v;
> 
>      for_each_vcpu ( d, v )
> -        hvm_funcs.set_rdtsc_exiting(v, enable);
> +        alternative_vcall(hvm_funcs.set_rdtsc_exiting, v, enable);
>  }
> 
>  void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
>  {
> -    if ( !hvm_funcs.get_guest_pat(v, guest_pat) )
> +    if ( !alternative_call(hvm_funcs.get_guest_pat, v, guest_pat) )
>          *guest_pat = v->arch.hvm.pat_cr;
>  }
> 
> @@ -302,7 +302,7 @@ int hvm_set_guest_pat(struct vcpu *v, u6
>              return 0;
>          }
> 
> -    if ( !hvm_funcs.set_guest_pat(v, guest_pat) )
> +    if ( !alternative_call(hvm_funcs.set_guest_pat, v, guest_pat) )
>          v->arch.hvm.pat_cr = guest_pat;
> 
>      return 1;
> @@ -342,7 +342,7 @@ bool hvm_set_guest_bndcfgs(struct vcpu *
>              /* nothing, best effort only */;
>      }
> 
> -    return hvm_funcs.set_guest_bndcfgs(v, val);
> +    return alternative_call(hvm_funcs.set_guest_bndcfgs, v, val);
>  }
> 
>  /*
> @@ -500,7 +500,8 @@ void hvm_migrate_pirqs(struct vcpu *v)
>  static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
>  {
>      info->cr2 = v->arch.hvm.guest_cr[2];
> -    return hvm_funcs.get_pending_event(v, info);
> +
> +    return alternative_call(hvm_funcs.get_pending_event, v, info);
>  }
> 
>  void hvm_do_resume(struct vcpu *v)
> @@ -1651,7 +1652,7 @@ void hvm_inject_event(const struct x86_e
>          }
>      }
> 
> -    hvm_funcs.inject_event(event);
> +    alternative_vcall(hvm_funcs.inject_event, event);
>  }
> 
>  int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
> @@ -2238,7 +2239,7 @@ int hvm_set_cr0(unsigned long value, boo
>           (!rangeset_is_empty(d->iomem_caps) ||
>            !rangeset_is_empty(d->arch.ioport_caps) ||
>            has_arch_pdevs(d)) )
> -        hvm_funcs.handle_cd(v, value);
> +        alternative_vcall(hvm_funcs.handle_cd, v, value);
> 
>      hvm_update_cr(v, 0, value);
> 
> @@ -3477,7 +3478,8 @@ int hvm_msr_read_intercept(unsigned int
>              goto gp_fault;
>          /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
>          ret = ((ret == 0)
> -               ? hvm_funcs.msr_read_intercept(msr, msr_content)
> +               ? alternative_call(hvm_funcs.msr_read_intercept,
> +                                  msr, msr_content)
>                 : X86EMUL_OKAY);
>          break;
>      }
> @@ -3637,7 +3639,8 @@ int hvm_msr_write_intercept(unsigned int
>              goto gp_fault;
>          /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
>          ret = ((ret == 0)
> -               ? hvm_funcs.msr_write_intercept(msr, msr_content)
> +               ? alternative_call(hvm_funcs.msr_write_intercept,
> +                                  msr, msr_content)
>                 : X86EMUL_OKAY);
>          break;
>      }
> @@ -3829,7 +3832,7 @@ void hvm_hypercall_page_initialise(struc
>                                     void *hypercall_page)
>  {
>      hvm_latch_shinfo_size(d);
> -    hvm_funcs.init_hypercall_page(d, hypercall_page);
> +    alternative_vcall(hvm_funcs.init_hypercall_page, d, hypercall_page);
>  }
> 
>  void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
> @@ -5004,7 +5007,7 @@ void hvm_domain_soft_reset(struct domain
>  void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
>                                struct segment_register *reg)
>  {
> -    hvm_funcs.get_segment_register(v, seg, reg);
> +    alternative_vcall(hvm_funcs.get_segment_register, v, seg, reg);
> 
>      switch ( seg )
>      {
> @@ -5150,7 +5153,7 @@ void hvm_set_segment_register(struct vcp
>          return;
>      }
> 
> -    hvm_funcs.set_segment_register(v, seg, reg);
> +    alternative_vcall(hvm_funcs.set_segment_register, v, seg, reg);
>  }
> 
>  /*
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -383,42 +383,42 @@ static inline int
>  hvm_guest_x86_mode(struct vcpu *v)
>  {
>      ASSERT(v == current);
> -    return hvm_funcs.guest_x86_mode(v);
> +    return alternative_call(hvm_funcs.guest_x86_mode, v);
>  }
> 
>  static inline void
>  hvm_update_host_cr3(struct vcpu *v)
>  {
>      if ( hvm_funcs.update_host_cr3 )
> -        hvm_funcs.update_host_cr3(v);
> +        alternative_vcall(hvm_funcs.update_host_cr3, v);
>  }
> 
>  static inline void hvm_update_guest_cr(struct vcpu *v, unsigned int cr)
>  {
> -    hvm_funcs.update_guest_cr(v, cr, 0);
> +    alternative_vcall(hvm_funcs.update_guest_cr, v, cr, 0);
>  }
> 
>  static inline void hvm_update_guest_cr3(struct vcpu *v, bool noflush)
>  {
>      unsigned int flags = noflush ? HVM_UPDATE_GUEST_CR3_NOFLUSH : 0;
> 
> -    hvm_funcs.update_guest_cr(v, 3, flags);
> +    alternative_vcall(hvm_funcs.update_guest_cr, v, 3, flags);
>  }
> 
>  static inline void hvm_update_guest_efer(struct vcpu *v)
>  {
> -    hvm_funcs.update_guest_efer(v);
> +    alternative_vcall(hvm_funcs.update_guest_efer, v);
>  }
> 
>  static inline void hvm_cpuid_policy_changed(struct vcpu *v)
>  {
> -    hvm_funcs.cpuid_policy_changed(v);
> +    alternative_vcall(hvm_funcs.cpuid_policy_changed, v);
>  }
> 
>  static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset,
>                                        uint64_t at_tsc)
>  {
> -    hvm_funcs.set_tsc_offset(v, offset, at_tsc);
> +    alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);
>  }
> 
>  /*
> @@ -435,18 +435,18 @@ static inline void hvm_flush_guest_tlbs(
>  static inline unsigned int
>  hvm_get_cpl(struct vcpu *v)
>  {
> -    return hvm_funcs.get_cpl(v);
> +    return alternative_call(hvm_funcs.get_cpl, v);
>  }
> 
>  static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
>  {
> -    return hvm_funcs.get_shadow_gs_base(v);
> +    return alternative_call(hvm_funcs.get_shadow_gs_base, v);
>  }
> 
>  static inline bool hvm_get_guest_bndcfgs(struct vcpu *v, u64 *val)
>  {
>      return hvm_funcs.get_guest_bndcfgs &&
> -           hvm_funcs.get_guest_bndcfgs(v, val);
> +           alternative_call(hvm_funcs.get_guest_bndcfgs, v, val);
>  }
> 
>  #define has_hvm_params(d) \
> @@ -503,12 +503,12 @@ static inline void hvm_inject_page_fault
> 
>  static inline int hvm_event_pending(struct vcpu *v)
>  {
> -    return hvm_funcs.event_pending(v);
> +    return alternative_call(hvm_funcs.event_pending, v);
>  }
> 
>  static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
>  {
> -    hvm_funcs.invlpg(v, linear);
> +    alternative_vcall(hvm_funcs.invlpg, v, linear);
>  }
> 
>  /* These bits in CR4 are owned by the host. */
> @@ -533,13 +533,14 @@ static inline void hvm_cpu_down(void)
> 
>  static inline unsigned int hvm_get_insn_bytes(struct vcpu *v, uint8_t *buf)
>  {
> -    return (hvm_funcs.get_insn_bytes ? hvm_funcs.get_insn_bytes(v, buf) : 0);
> +    return (hvm_funcs.get_insn_bytes
> +            ? alternative_call(hvm_funcs.get_insn_bytes, v, buf) : 0);
>  }
> 
>  static inline void hvm_set_info_guest(struct vcpu *v)
>  {
>      if ( hvm_funcs.set_info_guest )
> -        return hvm_funcs.set_info_guest(v);
> +        alternative_vcall(hvm_funcs.set_info_guest, v);
>  }
> 
>  static inline void hvm_invalidate_regs_fields(struct cpu_user_regs *regs)
> 
> 



* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 10:12   ` [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2018-10-02 13:21     ` Andrew Cooper
  2018-10-02 13:28       ` Julien Grall
  2018-10-02 14:06       ` Jan Beulich
  2018-10-02 13:55     ` Wei Liu
  2018-10-03 18:38     ` Andrew Cooper
  2 siblings, 2 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-02 13:21 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall

On 02/10/18 11:12, Jan Beulich wrote:
> --- a/xen/include/xen/lib.h
> +++ b/xen/include/xen/lib.h
> @@ -66,6 +66,10 @@
>  
>  #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>  
> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
> +#define count_va_arg(args...) \
> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)

This particular bit of review split out for obvious reasons.

We already have __count_args() in the ARM SMCCC infrastructure.  Please
can we dedup that (broken out into a separate patch) rather than
introducing a competing version.

The ARM version is buggy.  It is off-by-two in the base case, and
doesn't compile if fewer than two parameters are passed.  This version
functions correctly, but should be named with a plural.
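
For reference, a quick expansion sketch of why this version gets the
base case right (names as in the quoted hunk):

  count_va_arg(a, b, c)
    -> count_va_arg_(., a, b, c, 8, 7, 6, 5, 4, 3, 2, 1, 0) -> 3
  count_va_arg()
    -> count_va_arg_(., 8, 7, 6, 5, 4, 3, 2, 1, 0) -> 0

i.e. the ", ##" swallows the preceding comma when the argument list is
empty, so 'x' lands on the trailing 0.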

~Andrew


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 13:21     ` Andrew Cooper
@ 2018-10-02 13:28       ` Julien Grall
  2018-10-02 13:35         ` Andrew Cooper
  2018-10-02 14:06       ` Jan Beulich
  1 sibling, 1 reply; 119+ messages in thread
From: Julien Grall @ 2018-10-02 13:28 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson

Hi Andrew,

On 02/10/2018 14:21, Andrew Cooper wrote:
> On 02/10/18 11:12, Jan Beulich wrote:
>> --- a/xen/include/xen/lib.h
>> +++ b/xen/include/xen/lib.h
>> @@ -66,6 +66,10 @@
>>   
>>   #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>   
>> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>> +#define count_va_arg(args...) \
>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
> 
> This particular bit of review split out for obvious reasons.
> 
> We already have __count_args() in the ARM SMCCC infrastructure.  Please
> can we dedup that (broken out into a separate patch) rather than
> introducing a competing version.
> 
> The ARM version is buggy.  It is off-by-two in the base case, and
> doesn't compile if fewer than two parameters are passed.  This version
> functions correctly, but should be named with a plural.

The ARM version is *not* buggy. What the ARM version counts is the number
of parameters for the SMCCC function, *not* the number of parameters for
the call.

This matches the SMCCC where the first parameter is always the function 
ID, the last parameter is always the result structure.

The code will end up being less readable using a generic count. I am
planning to Nack anything trying to use a generic count in the SMCCC code.

Cheers,

-- 
Julien Grall


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 13:28       ` Julien Grall
@ 2018-10-02 13:35         ` Andrew Cooper
  2018-10-02 13:36           ` Julien Grall
  0 siblings, 1 reply; 119+ messages in thread
From: Andrew Cooper @ 2018-10-02 13:35 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson

On 02/10/18 14:28, Julien Grall wrote:
> Hi Andrew,
>
> On 02/10/2018 14:21, Andrew Cooper wrote:
>> On 02/10/18 11:12, Jan Beulich wrote:
>>> --- a/xen/include/xen/lib.h
>>> +++ b/xen/include/xen/lib.h
>>> @@ -66,6 +66,10 @@
>>>     #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>>   +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>>> +#define count_va_arg(args...) \
>>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
>>
>> This particular bit of review split out for obvious reasons.
>>
>> We already have __count_args() in the ARM SMCCC infrastructure.  Please
>> can we dedup that (broken out into a separate patch) rather than
>> introducing a competing version.
>>
>> The ARM version is buggy.  It is off-by-two in the base case, and
>> doesn't compile if fewer than two parameters are passed.  This version
>> functions correctly, but should be named with a plural.
>
> The ARM version is *not* buggy. What the ARM version counts is the
> number of parameters for the SMCCC function, *not* the number of
> parameters for the call.
>
> This matches the SMCCC where the first parameter is always the
> function ID, the last parameter is always the result structure.
>
> The code will end up being less readable using a generic count. I am
> planning to Nack anything trying to use a generic count in the SMCCC
> code.

Yes it is buggy, because it doesn't behave as the name suggests.

If you mean for the infrastructure to have SMCCC-specific behaviour, the
macros should be named suitably, to avoid interfering with common code.

~Andrew


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 13:35         ` Andrew Cooper
@ 2018-10-02 13:36           ` Julien Grall
  0 siblings, 0 replies; 119+ messages in thread
From: Julien Grall @ 2018-10-02 13:36 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson



On 02/10/2018 14:35, Andrew Cooper wrote:
> On 02/10/18 14:28, Julien Grall wrote:
>> Hi Andrew,
>>
>> On 02/10/2018 14:21, Andrew Cooper wrote:
>>> On 02/10/18 11:12, Jan Beulich wrote:
>>>> --- a/xen/include/xen/lib.h
>>>> +++ b/xen/include/xen/lib.h
>>>> @@ -66,6 +66,10 @@
>>>>      #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>>>    +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>>>> +#define count_va_arg(args...) \
>>>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
>>>
>>> This particular bit of review split out for obvious reasons.
>>>
>>> We already have __count_args() in the ARM SMCCC infrastructure.  Please
>>> can we dedup that (broken out into a separate patch) rather than
>>> introducing a competing version.
>>>
>>> The ARM version is buggy.  It is off-by-two in the base case, and
>>> doesn't compile if fewer than two parameters are passed.  This version
>>> functions correctly, but should be named with a plural.
>>
>> The ARM version is *not* buggy. What the ARM version counts is the
>> number of parameters for the SMCCC function, *not* the number of
>> parameters for the call.
>>
>> This matches the SMCCC where the first parameter is always the
>> function ID, the last parameter is always the result structure.
>>
>> The code will end up being less readable using a generic count. I am
>> planning to Nack anything trying to use a generic count in the SMCCC
>> code.
> 
> Yes it is buggy, because it doesn't behave as the name suggests.
> 
> If you mean the infrastructure to have SMCCC specific behaviour, the
> macros should be named suitably, to avoid interfering with common code.

Well, the code comes from Linux. Feel free to send a patch to fix the
name in Linux and Xen.

Cheers,

-- 
Julien Grall


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 10:12   ` [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
  2018-10-02 13:21     ` Andrew Cooper
@ 2018-10-02 13:55     ` Wei Liu
  2018-10-02 14:08       ` Jan Beulich
  2018-10-03 18:38     ` Andrew Cooper
  2 siblings, 1 reply; 119+ messages in thread
From: Wei Liu @ 2018-10-02 13:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On Tue, Oct 02, 2018 at 04:12:08AM -0600, Jan Beulich wrote:
> In a number of cases the targets of indirect calls get determined once
> at boot time. In such cases we can replace those calls with direct ones
> via our alternative instruction patching mechanism.
> 
> Some of the targets (in particular the hvm_funcs ones) get established
> only in pre-SMP initcalls, making necessary a second pass through the
> alternative patching code. Therefore some adjustments beyond the
> recognition of the new special pattern are necessary there.
> 
> Note that patching such sites more than once is not supported (and the
> supplied macros also don't provide any means to do so).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v4: Extend / adjust comments.

Thanks, this makes it much easier to reason about the code.

I know there are comments about a macro, but they don't really affect
the meat of this patch, so whether the macro is split out to a
prerequisite patch or not:

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 13:21     ` Andrew Cooper
  2018-10-02 13:28       ` Julien Grall
@ 2018-10-02 14:06       ` Jan Beulich
  2018-10-02 14:23         ` Andrew Cooper
  1 sibling, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 14:06 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 02.10.18 at 15:21, <andrew.cooper3@citrix.com> wrote:
> On 02/10/18 11:12, Jan Beulich wrote:
>> --- a/xen/include/xen/lib.h
>> +++ b/xen/include/xen/lib.h
>> @@ -66,6 +66,10 @@
>>  
>>  #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>  
>> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>> +#define count_va_arg(args...) \
>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
> 
> This particular bit of review split out for obvious reasons.
> 
> We already have __count_args() in the ARM SMCCC infrastructure.  Please
> can we dedup that (broken out into a separate patch) rather than
> introducing a competing version.
> 
> The ARM version is buggy.  It is off-by-two in the base case, and
> doesn't compile if fewer than two parameters are passed.

If you had followed earlier discussion, you'd have known up front
Julien's reaction. It is for that reason that I'm not trying to fiddle
with the ARM code in this regard, despite agreeing with you that
at the very least it _looks_ buggy.

> This version functions correctly, but should be named with a plural.

Why plural? Nothing in stdarg.h uses plural (including the header
file name itself).

Jan




* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 13:55     ` Wei Liu
@ 2018-10-02 14:08       ` Jan Beulich
  0 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 14:08 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 02.10.18 at 15:55, <wei.liu2@citrix.com> wrote:
> On Tue, Oct 02, 2018 at 04:12:08AM -0600, Jan Beulich wrote:
>> In a number of cases the targets of indirect calls get determined once
>> at boot time. In such cases we can replace those calls with direct ones
>> via our alternative instruction patching mechanism.
>> 
>> Some of the targets (in particular the hvm_funcs ones) get established
>> only in pre-SMP initcalls, making necessary a second pass through the
>> alternative patching code. Therefore some adjustments beyond the
>> recognition of the new special pattern are necessary there.
>> 
>> Note that patching such sites more than once is not supported (and the
>> supplied macros also don't provide any means to do so).
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v4: Extend / adjust comments.
> 
> Thanks, this makes it much easier to reason about the code.
> 
> I know there are comments about a macro, but they don't really affect
> the meat of this patch, so whether the macro is split out to a
> prerequisite patch or not:
> 
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Thanks, this is much appreciated especially in that context.

Jan




* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 14:06       ` Jan Beulich
@ 2018-10-02 14:23         ` Andrew Cooper
  2018-10-02 14:43           ` Jan Beulich
  0 siblings, 1 reply; 119+ messages in thread
From: Andrew Cooper @ 2018-10-02 14:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

On 02/10/18 15:06, Jan Beulich wrote:
>>>> On 02.10.18 at 15:21, <andrew.cooper3@citrix.com> wrote:
>> On 02/10/18 11:12, Jan Beulich wrote:
>>> --- a/xen/include/xen/lib.h
>>> +++ b/xen/include/xen/lib.h
>>> @@ -66,6 +66,10 @@
>>>  
>>>  #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>>  
>>> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>>> +#define count_va_arg(args...) \
>>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
>> This particular bit of review split out for obvious reasons.
>>
>> We already have __count_args() in the ARM SMCCC infrastructure.  Please
>> can we dedup that (broken out into a separate patch) rather than
>> introducing a competing version.
>>
>> The ARM version is buggy.  It is off-by-two in the base case, and
>> doesn't compile if fewer than two parameters are passed.
> If you had followed earlier discussion, you'd have known up front
> Julien's reaction. It is for that reason that I'm not trying to fiddle
> with the ARM code in this regard, despite agreeing with you that
> at the very least it _looks_ buggy.
>
>> This version functions correctly, but should be named with a plural.
> Why plural? Nothing in stdarg.h uses plural (including the header
> file name itself).

What has stdarg.h got to do with anything here?  (Irrespective, the name
of the header file alone is the only thing which is grammatically
questionable.)

count_va_args() should be plural for exactly the same reason that you
named its parameter as 'args'.

~Andrew


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 14:23         ` Andrew Cooper
@ 2018-10-02 14:43           ` Jan Beulich
  2018-10-02 15:40             ` Andrew Cooper
  0 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 14:43 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 02.10.18 at 16:23, <andrew.cooper3@citrix.com> wrote:
> On 02/10/18 15:06, Jan Beulich wrote:
>>>>> On 02.10.18 at 15:21, <andrew.cooper3@citrix.com> wrote:
>>> On 02/10/18 11:12, Jan Beulich wrote:
>>>> --- a/xen/include/xen/lib.h
>>>> +++ b/xen/include/xen/lib.h
>>>> @@ -66,6 +66,10 @@
>>>>  
>>>>  #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>>>  
>>>> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>>>> +#define count_va_arg(args...) \
>>>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
>>> This particular bit of review split out for obvious reasons.
>>>
>>> We already have __count_args() in the ARM SMCCC infrastructure.  Please
>>> can we dedup that (broken out into a separate patch) rather than
>>> introducing a competing version.
>>>
>>> The ARM version is buggy.  It is off-by-two in the base case, and
>>> doesn't compile if fewer than two parameters are passed.
>> If you had followed earlier discussion, you'd have known up front
>> Julien's reaction. It is for that reason that I'm not trying to fiddle
>> with the ARM code in this regard, despite agreeing with you that
>> at the very least it _looks_ buggy.
>>
>>> This version functions correctly, but should be named with a plural.
>> Why plural? Nothing in stdarg.h uses plural (including the header
>> file name itself).
> 
> What has stdarg.h got to do with anything here?  (Irrespective, the name
> of the header file alone is the only thing which is grammatically
> questionable.)

And the identifier va_arg as well as the __builtin_va_arg it resolves
to are fine? It is precisely their naming that I've used to decide how
to name the macro here.

> count_va_args() should be plural for exactly the same reason that you
> named its parameter as 'args'.

That's your personal opinion. I very much think that the naming
here should not in any way block the series (and even less so when
the series has been out for review for almost 3 months, and through
a couple of iterations), as imo it is well within the bounds of what is
reasonable to let the submitter decide. (All of this is not to say that
I wouldn't make the change, if that's the only way to get the series
in, but it would be very reluctantly.)

The reason to name the arguments (in their entirety) "args" is, I
hope, quite obvious, and unrelated to the macro's name.

Jan




* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 14:43           ` Jan Beulich
@ 2018-10-02 15:40             ` Andrew Cooper
  2018-10-02 16:06               ` Jan Beulich
  0 siblings, 1 reply; 119+ messages in thread
From: Andrew Cooper @ 2018-10-02 15:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

On 02/10/18 15:43, Jan Beulich wrote:
>>>> On 02.10.18 at 16:23, <andrew.cooper3@citrix.com> wrote:
>> On 02/10/18 15:06, Jan Beulich wrote:
>>>>>> On 02.10.18 at 15:21, <andrew.cooper3@citrix.com> wrote:
>>>> On 02/10/18 11:12, Jan Beulich wrote:
>>>>> --- a/xen/include/xen/lib.h
>>>>> +++ b/xen/include/xen/lib.h
>>>>> @@ -66,6 +66,10 @@
>>>>>  
>>>>>  #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>>>>  
>>>>> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>>>>> +#define count_va_arg(args...) \
>>>>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
>>>> This particular bit of review split out for obvious reasons.
>>>>
>>>> We already have __count_args() in the ARM SMCCC infrastructure.  Please
>>>> can we dedup that (broken out into a separate patch) rather than
>>>> introducing a competing version.
>>>>
>>>> The ARM version is buggy.  It is off-by-two in the base case, and
>>>> doesn't compile if fewer than two parameters are passed.
>>> If you had followed earlier discussion, you'd have known up front
>>> Julien's reaction. It is for that reason that I'm not trying to fiddle
>>> with the ARM code in this regard, despite agreeing with you that
>>> at the very least it _looks_ buggy.
>>>
>>>> This version functions correctly, but should be named with a plural.
>>> Why plural? Nothing in stdarg.h uses plural (including the header
>>> file name itself).
>> What has stdarg.h got to do with anything here?  (Irrespective, the name
>> of the header file alone is the only thing which is grammatically
>> questionable.)
> And the identifier va_arg as well as the __builtin_va_arg it resolves
> to are fine? It is precisely their naming that I've used to decide how
> to name the macro here.

Yes, because that helper has the purpose of giving you one single argument.

>
>> count_va_args() should be plural for exactly the same reason that you
>> named its parameter as 'args'.
> That's your personal opinion.

It is plain grammar.  "count_arg" is only correct when the answer is 1.

>  I very much think that the naming
> here should not in any way block the series (and even less so when
> the series has been out for review for almost 3 months, and through
> a couple of iterations), as imo it is well within the bounds of what is
> reasonable to let the submitter decide. (All of this is not to say that
> I wouldn't make the change, if that's the only way to get the series
> in, but it would be very reluctantly.)

Do you realise how hypocritical this statement is?  You frequently
insist on naming changes and hold up series because of it.  Perhaps the
best example recently is bfn/dfn, where bfn is a term which has been
used for 3 years at conferences and on xen-devel without objection so far.

You cannot expect reviews of your code to be held to a different
standard than you hold others to.

As for 3 months, I'm sorry that this series hasn't yet reached the top
of my priority list, but you, better than most, know exactly what has
been taking up all of our time during that period.  I'm getting through
my review backlog as fast as I can, but it doesn't preempt higher
priority tasks.  (As for today, review is happening while one of my slow
servers reboots...)

~Andrew


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 15:40             ` Andrew Cooper
@ 2018-10-02 16:06               ` Jan Beulich
  0 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-02 16:06 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 02.10.18 at 17:40, <andrew.cooper3@citrix.com> wrote:
> On 02/10/18 15:43, Jan Beulich wrote:
>>>>> On 02.10.18 at 16:23, <andrew.cooper3@citrix.com> wrote:
>>> On 02/10/18 15:06, Jan Beulich wrote:
>>>>>>> On 02.10.18 at 15:21, <andrew.cooper3@citrix.com> wrote:
>>>>> On 02/10/18 11:12, Jan Beulich wrote:
>>>>>> --- a/xen/include/xen/lib.h
>>>>>> +++ b/xen/include/xen/lib.h
>>>>>> @@ -66,6 +66,10 @@
>>>>>>  
>>>>>>  #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>>>>>>  
>>>>>> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
>>>>>> +#define count_va_arg(args...) \
>>>>>> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
>>>>> This particular bit of review split out for obvious reasons.
>>>>>
>>>>> We already have __count_args() in the ARM SMCCC infrastructure.  Please
>>>>> can we dedup that (broken out into a separate patch) rather than
>>>>> introducing a competing version.
>>>>>
>>>>> The ARM version is buggy.  It is off-by-two in the base case, and
>>>>> doesn't compile if fewer than two parameters are passed.
>>>> If you had followed earlier discussion, you'd have known up front
>>>> Julien's reaction. It is for that reason that I'm not trying to fiddle
>>>> with the ARM code in this regard, despite agreeing with you that
>>>> at the very least it _looks_ buggy.
>>>>
>>>>> This version functions correctly, but should be named with a plural.
>>>> Why plural? Nothing in stdarg.h uses plural (including the header
>>>> file name itself).
>>> What has stdarg.h got to do with anything here?  (Irrespective, the name
>>> of the header file alone is the only thing which is grammatically
>>> questionable.)
>> And the identifier va_arg as well as the __builtin_va_arg it resolves
>> to are fine? It is precisely their naming that I've used to decide how
>> to name the macro here.
> 
> Yes, because that helper has the purpose of giving you one single argument.
> 
>>
>>> count_va_args() should be plural for exactly the same reason that you
>>> named its parameter as 'args'.
>> That's your personal opinion.
> 
> It is plain grammar.  "count_arg" is only correct when the answer is 1.
> 
>>  I very much think that the naming
>> here should not in any way block the series (and even less so when
>> the series has been out for review for almost 3 months, and through
>> a couple of iterations), as imo it is well within the bounds of what is
>> reasonable to let the submitter decide. (All of this is not to say that
>> I wouldn't make the change, if that's the only way to get the series
>> in, but it would be very reluctantly.)
> 
> Do you realise how hypocritical this statement is?  You frequently
> insist on naming changes and hold up series because of it.  Perhaps the
> best example recently is bfn/dfn, where bfn is a term which has been
> used for 3 years at conferences and on xen-devel without objection so far.

I know I did bring up this ambiguity of the term "bus" long before.

Also note the (significant imo) difference between a singular / plural
controversy, and one over possible ambiguities (which may make
already hard to understand code even harder to understand).

> You cannot expect reviews of your code to be held to a different
> standard than you hold others to.

And I don't, or at least I try not to.

> As for 3 months, I'm sorry that this series hasn't yet reached the top
> of my priority list, but you, better than most, know exactly what has
> been taking up all of our time during that period.  I'm getting through
> my review backlog as fast as I can, but it doesn't preempt higher
> priority tasks.  (As for today, review is happing while one of my slow
> servers reboots...)

This is all understood, but this particular series (and at least one other
one), at least large parts of it, has been reviewed by others, so you
_could_ (but of course you don't have to) rely on those other reviews.

I hope, however, that you also understand that this almost-no-progress
situation is frustrating here. For patches submitted by others, I can
stand in for you when you're buried in higher priority tasks, but this does
not work for my own patches.

Jan



* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-02 10:12   ` [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
  2018-10-02 13:21     ` Andrew Cooper
  2018-10-02 13:55     ` Wei Liu
@ 2018-10-03 18:38     ` Andrew Cooper
  2018-10-05 12:39       ` Andrew Cooper
  2018-10-29 11:01       ` Jan Beulich
  2 siblings, 2 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-03 18:38 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall

On 02/10/18 11:12, Jan Beulich wrote:
> In a number of cases the targets of indirect calls get determined once
> at boot time. In such cases we can replace those calls with direct ones
> via our alternative instruction patching mechanism.
>
> Some of the targets (in particular the hvm_funcs ones) get established
> only in pre-SMP initcalls, making necessary a second pass through the
> alternative patching code. Therefore some adjustments beyond the
> recognition of the new special pattern are necessary there.
>
> Note that patching such sites more than once is not supported (and the
> supplied macros also don't provide any means to do so).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewing just the code generation at this point.

See the Linux source code for ASM_CALL_CONSTRAINT.  There is a potential
code generation issue if you've got a call instruction inside an asm
block if you don't list the stack pointer as a clobbered output.
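
Roughly, the Linux-side workaround looks like this (a sketch based on
arch/x86/include/asm/asm.h there, not a drop-in for Xen; "hook" is just
a placeholder):

  register unsigned long current_stack_pointer asm("rsp");
  #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)

  /* then, in any asm() containing a call instruction: */
  asm volatile ("call *%[fn]"
                : ASM_CALL_CONSTRAINT
                : [fn] "r" (hook)
                : "memory");

Making %rsp an in/out operand stops the compiler from scheduling the
asm before the containing function's frame has been set up.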

Next, with Clang, there seems to be a bug causing the function
pointer to be spilled onto the stack:

ffff82d08026e990 <foo2>:
ffff82d08026e990:       50                      push   %rax
ffff82d08026e991:       48 8b 05 40 bc 20 00    mov    0x20bc40(%rip),%rax        # ffff82d08047a5d8 <hvm_funcs+0x130>
ffff82d08026e998:       48 89 04 24             mov    %rax,(%rsp)
ffff82d08026e99c:       ff 15 36 bc 20 00       callq  *0x20bc36(%rip)        # ffff82d08047a5d8 <hvm_funcs+0x130>
ffff82d08026e9a2:       31 c0                   xor    %eax,%eax
ffff82d08026e9a4:       59                      pop    %rcx
ffff82d08026e9a5:       c3                      retq   
ffff82d08026e9a6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
ffff82d08026e9ad:       00 00 00 


I'm not quite sure what is going on here, and the binary does boot, but
the code gen is definitely not correct.  Given this and the GCC bugs
you've found leading to the NO_ARG infrastructure, how about dropping
all the compatibility hacks, and making the infrastructure fall back to
a regular compiler-inserted function pointer call?

I think it is entirely reasonable to require people wanting to use this
optimised infrastructure to be using new-enough compilers, and it would
avoid the need to carry compatibility hacks for broken compilers.
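
Something along these lines, say (just a sketch; the guard name is
invented):

  #ifdef HAVE_PATCHABLE_CALLS   /* hypothetical toolchain capability check */
  # define alternative_call(f, args...)  /* annotated, patchable call */
  #else
  # define alternative_call(f, args...)  (f)(args)  /* plain indirect call */
  #endif

so older compilers would simply keep today's behaviour, unpatched.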

Next, the ASM'd calls aren't SYSV-ABI compliant.

extern void bar(void);

int foo1(void)
{
    hvm_funcs.wbinvd_intercept();
    return 0;
}

int foo2(void)
{
    alternative_vcall(hvm_funcs.wbinvd_intercept);
    return 0;
}

int bar1(void)
{
    bar();
    return 0;
}

ffff82d08026e1e0 <foo1>:
ffff82d08026e1e0:       48 83 ec 08             sub    $0x8,%rsp
ffff82d08026e1e4:       48 8b 05 c5 49 1d 00    mov    0x1d49c5(%rip),%rax        # ffff82d080442bb0 <hvm_funcs+0x130>
ffff82d08026e1eb:       e8 30 2d 0f 00          callq  ffff82d080360f20 <__x86_indirect_thunk_rax>
ffff82d08026e1f0:       31 c0                   xor    %eax,%eax
ffff82d08026e1f2:       48 83 c4 08             add    $0x8,%rsp
ffff82d08026e1f6:       c3                      retq   
ffff82d08026e1f7:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
ffff82d08026e1fe:       00 00 

ffff82d08026e200 <foo2>:
ffff82d08026e200:       ff 15 aa 49 1d 00       callq  *0x1d49aa(%rip)        # ffff82d080442bb0 <hvm_funcs+0x130>
ffff82d08026e206:       31 c0                   xor    %eax,%eax
ffff82d08026e208:       c3                      retq   
ffff82d08026e209:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

ffff82d08026e210 <bar1>:
ffff82d08026e210:       48 83 ec 08             sub    $0x8,%rsp
ffff82d08026e214:       e8 17 18 01 00          callq  ffff82d08027fa30 <bar>
ffff82d08026e219:       31 c0                   xor    %eax,%eax
ffff82d08026e21b:       48 83 c4 08             add    $0x8,%rsp
ffff82d08026e21f:       c3                      retq   

foo2, which uses alternative_vcall(), should be subtracting 8 from the
stack pointer before the emitted call instruction.  I can't find any set
of constraints which causes the stack to be set up correctly.
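
For reference, the psABI requires %rsp to be 16-byte aligned at the
call boundary, so a compliant sequence here would look like

    sub    $0x8,%rsp        /* re-align: function entry leaves %rsp at 8 mod 16 */
    callq  *hvm_funcs+0x130(%rip)
    add    $0x8,%rsp

which is exactly what the compiler emits for foo1/bar1 above.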

Finally, this series doesn't link with the default Debian toolchain.

andrewcoop@andrewcoop:/local/xen.git/xen$ ld --version
GNU ld (GNU Binutils for Debian) 2.25

andrewcoop@andrewcoop:/local/xen.git/xen$ make -s build -j8 XEN_TARGET_ARCH=x86_64 KCONFIG_CONFIG=.config-release
 __  __            _  _    _ ____                     _        _     _      
 \ \/ /___ _ __   | || |  / |___ \    _   _ _ __  ___| |_ __ _| |__ | | ___ 
  \  // _ \ '_ \  | || |_ | | __) |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
  /  \  __/ | | | |__   _|| |/ __/|__| |_| | | | \__ \ || (_| | |_) | |  __/
 /_/\_\___|_| |_|    |_|(_)_|_____|   \__,_|_| |_|___/\__\__,_|_.__/|_|\___|
                                                                            
prelink.o:(.debug_aranges+0x3c94): relocation truncated to fit: R_X86_64_32 against `.debug_info'
prelink.o:(.debug_info+0x225fa): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x22b57): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x1b92da): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x21e976): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x21ec31): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x21f03b): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x2b2ac3): relocation truncated to fit: R_X86_64_32 against `.debug_loc'
prelink.o:(.debug_info+0x2b37f6): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x448fab): relocation truncated to fit: R_X86_64_32 against `.debug_str'
prelink.o:(.debug_info+0x44b856): additional relocation overflows omitted from the output
ld: prelink.o: access beyond end of merged section (6617683)
ld: prelink.o: access beyond end of merged section (6617630)
ld: prelink.o: access beyond end of merged section (6617579)
ld: prelink.o: access beyond end of merged section (6617558)
ld: prelink.o: access beyond end of merged section (6617544)
ld: prelink.o: access beyond end of merged section (6617605)
ld: prelink.o: access beyond end of merged section (6617718)
ld: prelink.o: access beyond end of merged section (6617570)
ld: prelink.o: access beyond end of merged section (6617665)
ld: prelink.o: access beyond end of merged section (6617671)
ld: prelink.o: access beyond end of merged section (6617624)
ld: prelink.o: access beyond end of merged section (6617748)
ld: prelink.o: access beyond end of merged section (6617771)
ld: prelink.o: access beyond end of merged section (6617592)
ld: prelink.o: access beyond end of merged section (6617635)
ld: prelink.o: access beyond end of merged section (6617652)
ld: prelink.o: access beyond end of merged section (6617766)
ld: prelink.o: access beyond end of merged section (6617742)
ld: prelink.o(.debug_info+0xc962ed): reloc against `.debug_loc': error 2
Makefile:134: recipe for target '/local/xen.git/xen/xen-syms' failed
make[2]: *** [/local/xen.git/xen/xen-syms] Error 1
make[2]: *** Waiting for unfinished jobs....
/local/xen.git/xen/.xen.efi.0s.S: Assembler messages:
/local/xen.git/xen/.xen.efi.0s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
/local/xen.git/xen/.xen.efi.0s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
/local/xen.git/xen/.xen.efi.0s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
/local/xen.git/xen/.xen.efi.0s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
/local/xen.git/xen/.xen.efi.0s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
/local/xen.git/xen/.xen.efi.0s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
/local/xen.git/xen/.xen.efi.0s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
/local/xen.git/xen/.xen.efi.0s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
/local/xen.git/xen/.xen.efi.0s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172
/local/xen.git/xen/.xen.efi.1s.S: Assembler messages:
/local/xen.git/xen/.xen.efi.1s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
/local/xen.git/xen/.xen.efi.1s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
/local/xen.git/xen/.xen.efi.1s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
/local/xen.git/xen/.xen.efi.1s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
/local/xen.git/xen/.xen.efi.1s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
/local/xen.git/xen/.xen.efi.1s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
/local/xen.git/xen/.xen.efi.1s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
/local/xen.git/xen/.xen.efi.1s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
/local/xen.git/xen/.xen.efi.1s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172
Makefile:136: recipe for target '/local/xen.git/xen/xen' failed
make[1]: *** [/local/xen.git/xen/xen] Error 2
Makefile:45: recipe for target 'build' failed
make: *** [build] Error 2

Using LD 2.30 built from source is fine, but I'm not sure exactly what
is going on here.

~Andrew

[-- Attachment #2: .config-release --]
[-- Type: text/plain, Size: 1964 bytes --]

#
# Automatically generated file; DO NOT EDIT.
# Xen/x86 4.12-unstable Configuration
#
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"

#
# Architecture Features
#
CONFIG_NR_CPUS=256
CONFIG_PV=y
CONFIG_PV_LINEAR_PT=y
CONFIG_HVM=y
CONFIG_SHADOW_PAGING=y
# CONFIG_BIGMEM is not set
# CONFIG_HVM_FEP is not set
CONFIG_TBOOT=y
CONFIG_XEN_GUEST=y
CONFIG_PVH_GUEST=y
CONFIG_PV_SHIM=y
# CONFIG_PV_SHIM_EXCLUSIVE is not set

#
# Common Features
#
CONFIG_COMPAT=y
CONFIG_CORE_PARKING=y
CONFIG_HAS_ALTERNATIVE=y
CONFIG_HAS_EX_TABLE=y
CONFIG_MEM_ACCESS_ALWAYS_ON=y
CONFIG_MEM_ACCESS=y
CONFIG_HAS_MEM_PAGING=y
CONFIG_HAS_MEM_SHARING=y
CONFIG_HAS_PDX=y
CONFIG_HAS_UBSAN=y
CONFIG_HAS_KEXEC=y
CONFIG_HAS_GDBSX=y
CONFIG_HAS_IOPORTS=y
CONFIG_NEEDS_LIBELF=y
CONFIG_KEXEC=y
CONFIG_TMEM=y
CONFIG_XENOPROF=y
# CONFIG_XSM is not set

#
# Schedulers
#
CONFIG_SCHED_CREDIT=y
CONFIG_SCHED_CREDIT2=y
CONFIG_SCHED_RTDS=y
CONFIG_SCHED_ARINC653=y
CONFIG_SCHED_NULL=y
CONFIG_SCHED_CREDIT_DEFAULT=y
# CONFIG_SCHED_CREDIT2_DEFAULT is not set
# CONFIG_SCHED_RTDS_DEFAULT is not set
# CONFIG_SCHED_ARINC653_DEFAULT is not set
# CONFIG_SCHED_NULL_DEFAULT is not set
CONFIG_SCHED_DEFAULT="credit"
CONFIG_CRYPTO=y
CONFIG_LIVEPATCH=y
CONFIG_FAST_SYMBOL_LOOKUP=y
CONFIG_CMDLINE=""

#
# Device Drivers
#
CONFIG_ACPI=y
CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
CONFIG_NUMA=y
CONFIG_HAS_NS16550=y
CONFIG_HAS_EHCI=y
CONFIG_HAS_CPUFREQ=y
CONFIG_HAS_PASSTHROUGH=y
CONFIG_HAS_PCI=y
CONFIG_VIDEO=y
CONFIG_VGA=y
CONFIG_HAS_VPCI=y

#
# Deprecated Functionality
#
# CONFIG_PV_LDT_PAGING is not set
CONFIG_DEFCONFIG_LIST="arch/x86/configs/x86_64_defconfig"
CONFIG_ARCH_SUPPORTS_INT128=y

#
# Debugging Options
#
# CONFIG_DEBUG is not set
# CONFIG_CRASH_DEBUG is not set
CONFIG_DEBUG_INFO=y
# CONFIG_FRAME_POINTER is not set
# CONFIG_LOCK_PROFILE is not set
# CONFIG_PERF_COUNTERS is not set
# CONFIG_VERBOSE_DEBUG is not set
# CONFIG_SCRUB_DEBUG is not set
# CONFIG_UBSAN is not set


* Re: [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-10-02 10:12   ` [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
  2018-10-02 13:18     ` Paul Durrant
@ 2018-10-03 18:55     ` Andrew Cooper
  2018-10-04 10:19       ` Jan Beulich
  1 sibling, 1 reply; 119+ messages in thread
From: Andrew Cooper @ 2018-10-03 18:55 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu

On 02/10/18 11:12, Jan Beulich wrote:
> This is intentionally not touching hooks used rarely (or not at all)
> during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
> as well as nested, VM event, and altp2m ones (they can all be done
> later, if so desired). Virtual Interrupt delivery ones will be dealt
> with in a subsequent patch.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

It is a shame that we don't have a variation such as cond_alt_vcall()
which nops out the entire call when the function pointer is NULL, but I
can't think of any sane way of trying to make that happen.

~Andrew


* Re: [PATCH v4 03/12] x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
  2018-10-02 10:13   ` [PATCH v4 03/12] x86/HVM: patch vINTR " Jan Beulich
@ 2018-10-03 19:01     ` Andrew Cooper
  0 siblings, 0 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-03 19:01 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu

On 02/10/18 11:13, Jan Beulich wrote:
> @@ -1509,7 +1513,8 @@ static int lapic_load_regs(struct domain
>          lapic_load_fixup(s);
>  
>      if ( hvm_funcs.process_isr )
> -        hvm_funcs.process_isr(vlapic_find_highest_isr(s), v);
> +        alternative_vcall(hvm_funcs.process_isr,
> +                           vlapic_find_highest_isr(s), v);

Alignment.

Other than this, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 04/12] x86: patch ctxt_switch_masking() indirect call to direct one
  2018-10-02 10:13   ` [PATCH v4 04/12] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
@ 2018-10-03 19:01     ` Andrew Cooper
  0 siblings, 0 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-03 19:01 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu

On 02/10/18 11:13, Jan Beulich wrote:
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 05/12] x86/genapic: remove indirection from genapic hook accesses
  2018-10-02 10:14   ` [PATCH v4 05/12] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
@ 2018-10-03 19:04     ` Andrew Cooper
  0 siblings, 0 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-03 19:04 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu

On 02/10/18 11:14, Jan Beulich wrote:
> Instead of loading a pointer at each use site, have a single runtime
> instance of struct genapic, copying into it from the individual
> instances. The individual instances can then also be moved to .init
> (also adjusting apic_probe[] on this occasion).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
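
(Roughly, the shape of the change, as a sketch rather than the literal
patch:

    struct genapic genapic;            /* the single runtime instance */

    void __init generic_apic_probe(void)
    {
        /* ... pick apic_probe[i] as before ... */
        genapic = *apic_probe[i];      /* copy once at boot */
    }

with use sites then reading genapic.<hook> directly, so the per-flavour
instances can live in .init.)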


* Re: [PATCH v4 06/12] x86/genapic: patch indirect calls to direct ones
  2018-10-02 10:14   ` [PATCH v4 06/12] x86/genapic: patch indirect calls to direct ones Jan Beulich
@ 2018-10-03 19:07     ` Andrew Cooper
  0 siblings, 0 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-03 19:07 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu

On 02/10/18 11:14, Jan Beulich wrote:
> For (I hope) obvious reasons only the ones used at runtime get
> converted.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-10-03 18:55     ` Andrew Cooper
@ 2018-10-04 10:19       ` Jan Beulich
  0 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-04 10:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu

>>> On 03.10.18 at 20:55, <andrew.cooper3@citrix.com> wrote:
> On 02/10/18 11:12, Jan Beulich wrote:
>> This is intentionally not touching hooks used rarely (or not at all)
>> during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
>> as well as nested, VM event, and altp2m ones (they can all be done
>> later, if so desired). Virtual Interrupt delivery ones will be dealt
>> with in a subsequent patch.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> 
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

> It is a shame that we don't have a variation such as cond_alt_vcall()
> which nops out the entire call when the function pointer is NULL, but I
> can't think of any sane way of trying to make that happen.

I think this could be made to work, e.g. by further utilizing special values
of the displacement of the CALL insn (out of the non-sensible ones we
currently use only -5; arguably using -4 ... -1 would be liable to
conflict with not entirely dumb disassemblers, which may imply an
instruction boundary at the target of any CALL/JMP without special
casing such bogus values).

If we thought this was a worthwhile avenue to explore, non-void
calls could be patched this way too, as long as the replacement
"return" value is a compile time constant (i.e. we'd have a compile
time "MOV $<value>, %eax" to patch in). We'd merely have to
sort out where to place this alternative replacement code.
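
Roughly, the patch site could then go from

    e8 xx xx xx xx    call   <hook>           /* pointer established */

to either

    b8 vv vv vv vv    mov    $<value>,%eax    /* constant "return" value */

or a 5-byte NOP for the void / NULL-hook case (again, only a sketch of
the idea).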

Jan




* Re: [PATCH v4 07/12] x86/cpuidle: patch some indirect calls to direct ones
  2018-10-02 10:15   ` [PATCH v4 07/12] x86/cpuidle: patch some " Jan Beulich
@ 2018-10-04 10:35     ` Andrew Cooper
  0 siblings, 0 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-04 10:35 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu

On 02/10/18 11:15, Jan Beulich wrote:
> For now only the ones used during entering/exiting of idle states are
> converted. Additionally pm_idle{,_save} and lapic_timer_{on,off} can't
> be converted, as they may get established rather late (when Dom0 is
> already active).
>
> Note that for patching to be deferred until after the pre-SMP initcalls
> (from where cpuidle_init_cpu() runs the first time) the pointers need to
> start out as NULL.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 08/12] cpufreq: convert to a single post-init driver (hooks) instance
  2018-10-02 10:16   ` [PATCH v4 08/12] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
@ 2018-10-04 10:36     ` Andrew Cooper
  0 siblings, 0 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-04 10:36 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall

On 02/10/18 11:16, Jan Beulich wrote:
> This reduces the post-init memory footprint, eliminates a pointless
> level of indirection at the use sites, and allows for subsequent
> alternatives call patching.
>
> Take the opportunity and also add a name to the PowerNow! instance.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 09/12] cpufreq: patch target() indirect call to direct one
  2018-10-02 10:16   ` [PATCH v4 09/12] cpufreq: patch target() indirect call to direct one Jan Beulich
@ 2018-10-04 10:36     ` Andrew Cooper
  0 siblings, 0 replies; 119+ messages in thread
From: Andrew Cooper @ 2018-10-04 10:36 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall

On 02/10/18 11:16, Jan Beulich wrote:
> This looks to be the only frequently executed hook; don't bother
> patching any other ones.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-03 18:38     ` Andrew Cooper
@ 2018-10-05 12:39       ` Andrew Cooper
  2018-10-05 13:43         ` Jan Beulich
  2018-10-29 11:01       ` Jan Beulich
  1 sibling, 1 reply; 119+ messages in thread
From: Andrew Cooper @ 2018-10-05 12:39 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Ian Jackson, Tim Deegan, Julien Grall

On 03/10/18 19:38, Andrew Cooper wrote:
> Finally, this series doesn't link with the default Debian toolchain.
>
> andrewcoop@andrewcoop:/local/xen.git/xen$ ld --version
> GNU ld (GNU Binutils for Debian) 2.25
>
> andrewcoop@andrewcoop:/local/xen.git/xen$ make -s build -j8 XEN_TARGET_ARCH=x86_64 KCONFIG_CONFIG=.config-release
>  __  __            _  _    _ ____                     _        _     _      
>  \ \/ /___ _ __   | || |  / |___ \    _   _ _ __  ___| |_ __ _| |__ | | ___ 
>   \  // _ \ '_ \  | || |_ | | __) |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
>   /  \  __/ | | | |__   _|| |/ __/|__| |_| | | | \__ \ || (_| | |_) | |  __/
>  /_/\_\___|_| |_|    |_|(_)_|_____|   \__,_|_| |_|___/\__\__,_|_.__/|_|\___|
>                                                                             
> prelink.o:(.debug_aranges+0x3c94): relocation truncated to fit: R_X86_64_32 against `.debug_info'
> prelink.o:(.debug_info+0x225fa): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x22b57): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x1b92da): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x21e976): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x21ec31): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x21f03b): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x2b2ac3): relocation truncated to fit: R_X86_64_32 against `.debug_loc'
> prelink.o:(.debug_info+0x2b37f6): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x448fab): relocation truncated to fit: R_X86_64_32 against `.debug_str'
> prelink.o:(.debug_info+0x44b856): additional relocation overflows omitted from the output
> ld: prelink.o: access beyond end of merged section (6617683)
> ld: prelink.o: access beyond end of merged section (6617630)
> ld: prelink.o: access beyond end of merged section (6617579)
> ld: prelink.o: access beyond end of merged section (6617558)
> ld: prelink.o: access beyond end of merged section (6617544)
> ld: prelink.o: access beyond end of merged section (6617605)
> ld: prelink.o: access beyond end of merged section (6617718)
> ld: prelink.o: access beyond end of merged section (6617570)
> ld: prelink.o: access beyond end of merged section (6617665)
> ld: prelink.o: access beyond end of merged section (6617671)
> ld: prelink.o: access beyond end of merged section (6617624)
> ld: prelink.o: access beyond end of merged section (6617748)
> ld: prelink.o: access beyond end of merged section (6617771)
> ld: prelink.o: access beyond end of merged section (6617592)
> ld: prelink.o: access beyond end of merged section (6617635)
> ld: prelink.o: access beyond end of merged section (6617652)
> ld: prelink.o: access beyond end of merged section (6617766)
> ld: prelink.o: access beyond end of merged section (6617742)
> ld: prelink.o(.debug_info+0xc962ed): reloc against `.debug_loc': error 2
> Makefile:134: recipe for target '/local/xen.git/xen/xen-syms' failed
> make[2]: *** [/local/xen.git/xen/xen-syms] Error 1
> make[2]: *** Waiting for unfinished jobs....
> /local/xen.git/xen/.xen.efi.0s.S: Assembler messages:
> /local/xen.git/xen/.xen.efi.0s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
> /local/xen.git/xen/.xen.efi.0s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
> /local/xen.git/xen/.xen.efi.0s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
> /local/xen.git/xen/.xen.efi.0s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
> /local/xen.git/xen/.xen.efi.0s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
> /local/xen.git/xen/.xen.efi.0s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
> /local/xen.git/xen/.xen.efi.0s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
> /local/xen.git/xen/.xen.efi.0s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
> /local/xen.git/xen/.xen.efi.0s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172
> /local/xen.git/xen/.xen.efi.1s.S: Assembler messages:
> /local/xen.git/xen/.xen.efi.1s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
> /local/xen.git/xen/.xen.efi.1s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
> /local/xen.git/xen/.xen.efi.1s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
> /local/xen.git/xen/.xen.efi.1s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
> /local/xen.git/xen/.xen.efi.1s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
> /local/xen.git/xen/.xen.efi.1s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
> /local/xen.git/xen/.xen.efi.1s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
> /local/xen.git/xen/.xen.efi.1s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
> /local/xen.git/xen/.xen.efi.1s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172
> Makefile:136: recipe for target '/local/xen.git/xen/xen' failed
> make[1]: *** [/local/xen.git/xen/xen] Error 2
> Makefile:45: recipe for target 'build' failed
> make: *** [build] Error 2
>
> Using LD 2.30 built from source is fine, but I'm not sure exactly what
> is going on here.

Actually, I've just encountered this failure to link on staging as well,
so it is clearly not related to this series.  Sorry for the noise (but
I'm still none the wiser as to what is actually broken).

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-05 12:39       ` Andrew Cooper
@ 2018-10-05 13:43         ` Jan Beulich
  2018-10-05 14:49           ` Andrew Cooper
  0 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-10-05 13:43 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 05.10.18 at 14:39, <andrew.cooper3@citrix.com> wrote:
> On 03/10/18 19:38, Andrew Cooper wrote:
>> Finally, this series doesn't link with the default Debian toolchain.
>>
>> andrewcoop@andrewcoop:/local/xen.git/xen$ ld --version
>> GNU ld (GNU Binutils for Debian) 2.25
>>
>> andrewcoop@andrewcoop:/local/xen.git/xen$ make -s build -j8 XEN_TARGET_ARCH=x86_64 KCONFIG_CONFIG=.config-release
>>  __  __            _  _    _ ____                     _        _     _      
>>  \ \/ /___ _ __   | || |  / |___ \    _   _ _ __  ___| |_ __ _| |__ | | ___ 
>>   \  // _ \ '_ \  | || |_ | | __) |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
>>   /  \  __/ | | | |__   _|| |/ __/|__| |_| | | | \__ \ || (_| | |_) | |  __/
>>  /_/\_\___|_| |_|    |_|(_)_|_____|   \__,_|_| |_|___/\__\__,_|_.__/|_|\___|
>>                                                                             
>> prelink.o:(.debug_aranges+0x3c94): relocation truncated to fit: R_X86_64_32 against `.debug_info'
>> prelink.o:(.debug_info+0x225fa): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x22b57): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x1b92da): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x21e976): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x21ec31): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x21f03b): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x2b2ac3): relocation truncated to fit: R_X86_64_32 against `.debug_loc'
>> prelink.o:(.debug_info+0x2b37f6): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x448fab): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>> prelink.o:(.debug_info+0x44b856): additional relocation overflows omitted from the output
>> ld: prelink.o: access beyond end of merged section (6617683)
>> ld: prelink.o: access beyond end of merged section (6617630)
>> ld: prelink.o: access beyond end of merged section (6617579)
>> ld: prelink.o: access beyond end of merged section (6617558)
>> ld: prelink.o: access beyond end of merged section (6617544)
>> ld: prelink.o: access beyond end of merged section (6617605)
>> ld: prelink.o: access beyond end of merged section (6617718)
>> ld: prelink.o: access beyond end of merged section (6617570)
>> ld: prelink.o: access beyond end of merged section (6617665)
>> ld: prelink.o: access beyond end of merged section (6617671)
>> ld: prelink.o: access beyond end of merged section (6617624)
>> ld: prelink.o: access beyond end of merged section (6617748)
>> ld: prelink.o: access beyond end of merged section (6617771)
>> ld: prelink.o: access beyond end of merged section (6617592)
>> ld: prelink.o: access beyond end of merged section (6617635)
>> ld: prelink.o: access beyond end of merged section (6617652)
>> ld: prelink.o: access beyond end of merged section (6617766)
>> ld: prelink.o: access beyond end of merged section (6617742)

I've happened to run into something along the lines of one or both of
the above when using a gas that produces compressed debug sections
together with an objcopy which doesn't know about such sections (iirc
older objcopy silently dropped some section flag in this case).

>> ld: prelink.o(.debug_info+0xc962ed): reloc against `.debug_loc': error 2
>> Makefile:134: recipe for target '/local/xen.git/xen/xen-syms' failed
>> make[2]: *** [/local/xen.git/xen/xen-syms] Error 1
>> make[2]: *** Waiting for unfinished jobs....
>> /local/xen.git/xen/.xen.efi.0s.S: Assembler messages:
>> /local/xen.git/xen/.xen.efi.0s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
>> /local/xen.git/xen/.xen.efi.0s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
>> /local/xen.git/xen/.xen.efi.0s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
>> /local/xen.git/xen/.xen.efi.0s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
>> /local/xen.git/xen/.xen.efi.0s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
>> /local/xen.git/xen/.xen.efi.0s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
>> /local/xen.git/xen/.xen.efi.0s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
>> /local/xen.git/xen/.xen.efi.0s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
>> /local/xen.git/xen/.xen.efi.0s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172
>> /local/xen.git/xen/.xen.efi.1s.S: Assembler messages:
>> /local/xen.git/xen/.xen.efi.1s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
>> /local/xen.git/xen/.xen.efi.1s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
>> /local/xen.git/xen/.xen.efi.1s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
>> /local/xen.git/xen/.xen.efi.1s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
>> /local/xen.git/xen/.xen.efi.1s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
>> /local/xen.git/xen/.xen.efi.1s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
>> /local/xen.git/xen/.xen.efi.1s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
>> /local/xen.git/xen/.xen.efi.1s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
>> /local/xen.git/xen/.xen.efi.1s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172

These are just warnings, and I vaguely recall looking into this once (I
don't see these myself, so I suppose I had asked you or someone
else to send me some object files), finding that gas is right to warn,
but the effect on the embedded symbol table should be benign. What
I don't understand though is why I've never seen these myself.

>> Makefile:136: recipe for target '/local/xen.git/xen/xen' failed
>> make[1]: *** [/local/xen.git/xen/xen] Error 2
>> Makefile:45: recipe for target 'build' failed
>> make: *** [build] Error 2
>>
>> Using LD 2.30 built from source is fine, but I'm not sure exactly what
>> is going on here.
> 
> Actually, I've just encountered this failure to link on staging as well,
> so it is clearly not related to this series.  Sorry for the noise (but
> I'm still none the wiser as to what is actually broken).

Could you try 2.31.1? Of course I did use 2.30 until 2.31 went out,
and still didn't see this. Yet then again my variant is not exactly
vanilla.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-05 13:43         ` Jan Beulich
@ 2018-10-05 14:49           ` Andrew Cooper
  2018-10-05 15:05             ` Jan Beulich
  0 siblings, 1 reply; 119+ messages in thread
From: Andrew Cooper @ 2018-10-05 14:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

On 05/10/18 14:43, Jan Beulich wrote:
>>>> On 05.10.18 at 14:39, <andrew.cooper3@citrix.com> wrote:
>> On 03/10/18 19:38, Andrew Cooper wrote:
>>> Finally, this series doesn't link with the default Debian toolchain.
>>>
>>> andrewcoop@andrewcoop:/local/xen.git/xen$ ld --version
>>> GNU ld (GNU Binutils for Debian) 2.25
>>>
>>> andrewcoop@andrewcoop:/local/xen.git/xen$ make -s build -j8 XEN_TARGET_ARCH=x86_64 KCONFIG_CONFIG=.config-release
>>>  __  __            _  _    _ ____                     _        _     _      
>>>  \ \/ /___ _ __   | || |  / |___ \    _   _ _ __  ___| |_ __ _| |__ | | ___ 
>>>   \  // _ \ '_ \  | || |_ | | __) |__| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
>>>   /  \  __/ | | | |__   _|| |/ __/|__| |_| | | | \__ \ || (_| | |_) | |  __/
>>>  /_/\_\___|_| |_|    |_|(_)_|_____|   \__,_|_| |_|___/\__\__,_|_.__/|_|\___|
>>>                                                                             
>>> prelink.o:(.debug_aranges+0x3c94): relocation truncated to fit: R_X86_64_32 against `.debug_info'
>>> prelink.o:(.debug_info+0x225fa): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x22b57): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x1b92da): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x21e976): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x21ec31): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x21f03b): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x2b2ac3): relocation truncated to fit: R_X86_64_32 against `.debug_loc'
>>> prelink.o:(.debug_info+0x2b37f6): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x448fab): relocation truncated to fit: R_X86_64_32 against `.debug_str'
>>> prelink.o:(.debug_info+0x44b856): additional relocation overflows omitted from the output
>>> ld: prelink.o: access beyond end of merged section (6617683)
>>> ld: prelink.o: access beyond end of merged section (6617630)
>>> ld: prelink.o: access beyond end of merged section (6617579)
>>> ld: prelink.o: access beyond end of merged section (6617558)
>>> ld: prelink.o: access beyond end of merged section (6617544)
>>> ld: prelink.o: access beyond end of merged section (6617605)
>>> ld: prelink.o: access beyond end of merged section (6617718)
>>> ld: prelink.o: access beyond end of merged section (6617570)
>>> ld: prelink.o: access beyond end of merged section (6617665)
>>> ld: prelink.o: access beyond end of merged section (6617671)
>>> ld: prelink.o: access beyond end of merged section (6617624)
>>> ld: prelink.o: access beyond end of merged section (6617748)
>>> ld: prelink.o: access beyond end of merged section (6617771)
>>> ld: prelink.o: access beyond end of merged section (6617592)
>>> ld: prelink.o: access beyond end of merged section (6617635)
>>> ld: prelink.o: access beyond end of merged section (6617652)
>>> ld: prelink.o: access beyond end of merged section (6617766)
>>> ld: prelink.o: access beyond end of merged section (6617742)
> I've happened to run into something along the lines of one or both of
> the above when using a gas that produces compressed debug sections
> together with an objcopy which doesn't know about such sections (iirc
> older objcopy silently dropped some section flag in this case).

I've tracked the issues down to Olaf's DEBUG_INFO patch.  Having DEBUG_INFO
set when DEBUG is clear (which is the default for release builds)
results in this breakage.

Another diagnostic has come to light:

ld: Dwarf Error: found dwarf version '0', this reader only handles
version 2, 3 and 4 information.
prelink.o:(.debug_info+0x1bcee4): relocation truncated to fit:
R_X86_64_32 against `.debug_str'


>
>>> ld: prelink.o(.debug_info+0xc962ed): reloc against `.debug_loc': error 2
>>> Makefile:134: recipe for target '/local/xen.git/xen/xen-syms' failed
>>> make[2]: *** [/local/xen.git/xen/xen-syms] Error 1
>>> make[2]: *** Waiting for unfinished jobs....
>>> /local/xen.git/xen/.xen.efi.0s.S: Assembler messages:
>>> /local/xen.git/xen/.xen.efi.0s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
>>> /local/xen.git/xen/.xen.efi.0s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
>>> /local/xen.git/xen/.xen.efi.0s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
>>> /local/xen.git/xen/.xen.efi.0s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
>>> /local/xen.git/xen/.xen.efi.0s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
>>> /local/xen.git/xen/.xen.efi.0s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
>>> /local/xen.git/xen/.xen.efi.0s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
>>> /local/xen.git/xen/.xen.efi.0s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
>>> /local/xen.git/xen/.xen.efi.0s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172
>>> /local/xen.git/xen/.xen.efi.1s.S: Assembler messages:
>>> /local/xen.git/xen/.xen.efi.1s.S:21: Warning: value 0x7d2f80000544 truncated to 0x80000544
>>> /local/xen.git/xen/.xen.efi.1s.S:22: Warning: value 0x7d2f800008dc truncated to 0x800008dc
>>> /local/xen.git/xen/.xen.efi.1s.S:23: Warning: value 0x7d2f800008de truncated to 0x800008de
>>> /local/xen.git/xen/.xen.efi.1s.S:24: Warning: value 0x7d2f800008e3 truncated to 0x800008e3
>>> /local/xen.git/xen/.xen.efi.1s.S:25: Warning: value 0x7d2f80001086 truncated to 0x80001086
>>> /local/xen.git/xen/.xen.efi.1s.S:26: Warning: value 0x7d2f8000108a truncated to 0x8000108a
>>> /local/xen.git/xen/.xen.efi.1s.S:27: Warning: value 0x7d2f8000108e truncated to 0x8000108e
>>> /local/xen.git/xen/.xen.efi.1s.S:28: Warning: value 0x7d2f800010dc truncated to 0x800010dc
>>> /local/xen.git/xen/.xen.efi.1s.S:29: Warning: value 0x7d2f80001172 truncated to 0x80001172
> I've happened to run into something along the lines of one or both of
> the above when using a gas that produces compressed debug sections
> together with an objcopy which doesn't know about such sections (iirc
> older objcopy silently dropped some section flag in this case).

This happens when you've got CONFIG_LIVEPATCH enabled and use the
default debian toolchain.

>
>>> Makefile:136: recipe for target '/local/xen.git/xen/xen' failed
>>> make[1]: *** [/local/xen.git/xen/xen] Error 2
>>> Makefile:45: recipe for target 'build' failed
>>> make: *** [build] Error 2
>>>
>>> Using LD 2.30 built from source is fine, but I'm not sure exactly what
>>> is going on here.
>> Actually, I've just encountered this failure to link on staging as well,
>> so it is clearly not related to this series.  Sorry for the noise (but
>> I'm still none the wiser as to what is actually broken).
> Could you try 2.31.1? Of course I did use 2.30 until 2.31 went out,
> and still didn't see this. Yet then again my variant is not exactly
> vanilla.

I could (when I've got time to recompile my non-default compilers), but
what is this going to demonstrate?  I wouldn't expect 2.31 to be broken
when 2.30 works fine.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-05 14:49           ` Andrew Cooper
@ 2018-10-05 15:05             ` Jan Beulich
  0 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-05 15:05 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 05.10.18 at 16:49, <andrew.cooper3@citrix.com> wrote:
> On 05/10/18 14:43, Jan Beulich wrote:
>>>>> On 05.10.18 at 14:39, <andrew.cooper3@citrix.com> wrote:
>>> On 03/10/18 19:38, Andrew Cooper wrote:
>>>> Makefile:136: recipe for target '/local/xen.git/xen/xen' failed
>>>> make[1]: *** [/local/xen.git/xen/xen] Error 2
>>>> Makefile:45: recipe for target 'build' failed
>>>> make: *** [build] Error 2
>>>>
>>>> Using LD 2.30 built from source is fine, but I'm not sure exactly what
>>>> is going on here.
>>> Actually, I've just encountered this failure to link on staging as well,
>>> so it is clearly not related to this series.  Sorry for the noise (but
>>> I'm still non-the-wiser as to what is actually broken).
>> Could you try 2.31.1? Of course I did use 2.30 until 2.31 went out,
>> and still didn't see this. Yet then again my variant is not exactly
>> vanilla.
> 
> I could (when I've got time to recompile my non-default compilers), but
> what is this going to demonstrate?  I wouldn't expect 2.31 to be broken
> when 2.30 works fine.

Oh, sorry - I mis-read your original statement.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-10-03 18:38     ` Andrew Cooper
  2018-10-05 12:39       ` Andrew Cooper
@ 2018-10-29 11:01       ` Jan Beulich
  1 sibling, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-10-29 11:01 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall, xen-devel

>>> On 03.10.18 at 20:38, <andrew.cooper3@citrix.com> wrote:
> Reviewing just the code generation at this point.

Thanks for having found the time.

> See the Linux source code for ASM_CALL_CONSTRAINT.  There is a potential
> code generation issue if you've got a call instruction inside an asm
> block if you don't list the stack pointer as a clobbered output.

I'll look into this, but to be honest I don't always trust the Linux
folks in such regards. Plus doesn't this consideration come a little
late, seeing that we already have inline asm()-s inserting CALLs
(see e.g. asm-x86/guest/hypercall.h)?
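
(For reference, the Linux construct being referred to is roughly the
following - an illustrative sketch only. Tying %rsp into an asm()'s
outputs keeps the compiler from scheduling the contained CALL before the
function's frame is set up:)

register unsigned long current_stack_pointer asm("rsp");
#define ASM_CALL_CONSTRAINT , "+r" (current_stack_pointer)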

> Next, with Clang, there seems to be a bug causing the function
> pointer to be spilled onto the stack
> 
> ffff82d08026e990 <foo2>:
> ffff82d08026e990:       50                      push   %rax
> ffff82d08026e991:       48 8b 05 40 bc 20 00    mov    0x20bc40(%rip),%rax   
>      # ffff82d08047a5d8 <hvm_funcs+0x130>
> ffff82d08026e998:       48 89 04 24             mov    %rax,(%rsp)
> ffff82d08026e99c:       ff 15 36 bc 20 00       callq  *0x20bc36(%rip)       
>  # ffff82d08047a5d8 <hvm_funcs+0x130>
> ffff82d08026e9a2:       31 c0                   xor    %eax,%eax
> ffff82d08026e9a4:       59                      pop    %rcx
> ffff82d08026e9a5:       c3                      retq   
> ffff82d08026e9a6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
> ffff82d08026e9ad:       00 00 00 
> 
> 
> I'm not quite sure what is going on here, and the binary does boot, but
> the code gen is definitely not correct.

What's not correct here? There's just some pointless extra code
the compiler produces.

>  Given this and the GCC bugs
> you've found leading to the NO_ARG infrastructure, how about dropping
> all the compatibility hacks, and making the infrastructure fall back to
> a regular compiler-inserted function pointer call?

Well, if there were multiple issues left, I'd view this as an option.
But there's just one ugly workaround in this version, in
"x86/genapic: patch indirect calls to direct ones", and that even affects
gcc 8. So on the whole I'd rather not go this route.

> Next, the ASM'd calls aren't SYSV-ABI compliant.
> 
> extern void bar(void);
> 
> int foo1(void)
> {
>     hvm_funcs.wbinvd_intercept();
>     return 0;
> }
> 
> int foo2(void)
> {
>     alternative_vcall(hvm_funcs.wbinvd_intercept);
>     return 0;
> }
> 
> int bar1(void)
> {
>     bar();
>     return 0;
> }
> 
> ffff82d08026e1e0 <foo1>:
> ffff82d08026e1e0:       48 83 ec 08             sub    $0x8,%rsp
> ffff82d08026e1e4:       48 8b 05 c5 49 1d 00    mov    0x1d49c5(%rip),%rax   
>      # ffff82d080442bb0 <hvm_funcs+0x130>
> ffff82d08026e1eb:       e8 30 2d 0f 00          callq  ffff82d080360f20 
> <__x86_indirect_thunk_rax>
> ffff82d08026e1f0:       31 c0                   xor    %eax,%eax
> ffff82d08026e1f2:       48 83 c4 08             add    $0x8,%rsp
> ffff82d08026e1f6:       c3                      retq   
> ffff82d08026e1f7:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
> ffff82d08026e1fe:       00 00 
> 
> ffff82d08026e200 <foo2>:
> ffff82d08026e200:       ff 15 aa 49 1d 00       callq  *0x1d49aa(%rip)       
>  # ffff82d080442bb0 <hvm_funcs+0x130>
> ffff82d08026e206:       31 c0                   xor    %eax,%eax
> ffff82d08026e208:       c3                      retq   
> ffff82d08026e209:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> 
> ffff82d08026e210 <bar1>:
> ffff82d08026e210:       48 83 ec 08             sub    $0x8,%rsp
> ffff82d08026e214:       e8 17 18 01 00          callq  ffff82d08027fa30 <bar>
> ffff82d08026e219:       31 c0                   xor    %eax,%eax
> ffff82d08026e21b:       48 83 c4 08             add    $0x8,%rsp
> ffff82d08026e21f:       c3                      retq   
> 
> foo2 which uses alternative_vcall() should be subtracting 8 from the
> stack pointer before the emitted call instruction.  I can't find any set
> of constraints which causes the stack to be set up correctly.

I think I have an idea how to arrange for this (adding an artificial
extra constraint referencing a local variable requiring 16-byte
alignment; ideally the variable would itself be zero size), but I've
yet to try it out. The question though is whether we really need this
ABI compliance here. Recall that for years we've been running
with mis-aligned stacks, and only EFI runtime call issues required
us to address this. There are no EFI runtime calls behind any of
the (to be) patched calls.
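
(An untested sketch of that idea - zero size as suggested, though the
member may need to be non-zero in practice, and whether gcc then also
keeps the outgoing stack aligned at the CALL is exactly what remains to
be tried out:)

void example(void (*fn)(void))
{
    struct { char c[0]; } __attribute__((__aligned__(16))) align_;

    asm volatile ( "call *%[fn]"
                   : "+m" (align_)
                   : [fn] "r" (fn)
                   : "memory" );
}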

I was in fact considering the opposite thing: Use (on new enough
gcc) the command line option to reduce stack alignment to 8 bytes
in general, enforcing 16-byte alignment only where we really need
it.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-10-02 10:38     ` Julien Grall
  2018-10-02 10:42       ` Jan Beulich
@ 2018-11-06 15:48       ` Jan Beulich
  2018-11-07 18:01         ` Julien Grall
  1 sibling, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-11-06 15:48 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 02.10.18 at 12:38, <julien.grall@arm.com> wrote:
> On 02/10/2018 11:18, Jan Beulich wrote:
>> ARM is intended to gain support for heterogeneous IOMMUs on a single
>> system. This not only disallows boot time replacement of respective
>> indirect calls (handling of which is the main goal of the introduction
>> here), but more generally disallows calls using the iommu_ops() return
>> value directly - all such calls need to have means (commonly a domain
>> pointer) to know the targeted IOMMU.
>> 
>> Disallow all hooks lacking such context for the time being, which in
>> effect is some dead code elimination for ARM. Once extended suitably,
>> individual hooks can be moved out of their guards again in the
>> future.
> 
> While in theory it is possible to have a platform with heterogeneous
> IOMMUs, I don't see such support coming in Xen for the foreseeable
> future. Note that even Linux does not support such a case.
> 
> This patch is going to make it more complicated to unshare page-tables,
> as now we would need to take care of the mixed case. So I would rather
> not set IOMMU_MIXED on Arm until we have a use case for it.

So if I drop this here, how would you want iommu_get_ops()
to be handled on Arm (patch 11)? Right now I mean to leave it
alone, but it could also be switched to the (new) x86 way (though
that would then perhaps make mixed mode introduction more
difficult down the road), allowing us to get away with fewer
#ifdef-s.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option
  2018-11-06 15:48       ` Jan Beulich
@ 2018-11-07 18:01         ` Julien Grall
  0 siblings, 0 replies; 119+ messages in thread
From: Julien Grall @ 2018-11-07 18:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

Hi Jan,

On 06/11/2018 15:48, Jan Beulich wrote:
>>>> On 02.10.18 at 12:38, <julien.grall@arm.com> wrote:
>> On 02/10/2018 11:18, Jan Beulich wrote:
>>> ARM is intended to gain support for heterogeneous IOMMUs on a single
>>> system. This not only disallows boot time replacement of respective
>>> indirect calls (handling of which is the main goal of the introduction
>>> here), but more generally disallows calls using the iommu_ops() return
>>> value directly - all such calls need to have means (commonly a domain
>>> pointer) to know the targeted IOMMU.
>>>
>>> Disallow all hooks lacking such context for the time being, which in
>>> effect is some dead code elimination for ARM. Once extended suitably,
>>> individual hooks can be moved out of their guards again in the
>>> future.
>>
>> While in theory it is possible to have a platform with heterogeneous
>> IOMMUs, I don't see such support coming in Xen for the foreseeable
>> future. Note that even Linux does not support such a case.
>>
>> This patch is going to make it more complicated to unshare page-tables,
>> as now we would need to take care of the mixed case. So I would rather
>> not set IOMMU_MIXED on Arm until we have a use case for it.
> 
> So if I drop this here, how would you want iommu_get_ops()
> to be handled on Arm (patch 11)? Right now I mean to leave it
> alone, but it could also be switched to the (new) x86 way (though
> that would then perhaps make mixed mode introduction more
> difficult down the road), allowing us to get away with fewer
> #ifdef-s.

I would introduce the iommu_get_ops for x86 in asm-x86/iommu.h.
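
I.e. something along these lines (just a sketch, assuming x86 retains a
single boot-time-established ops instance):

/* asm-x86/iommu.h */
extern struct iommu_ops iommu_ops;

static inline const struct iommu_ops *iommu_get_ops(void)
{
    return &iommu_ops;
}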

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 00/13] x86: indirect call overhead reduction
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (9 preceding siblings ...)
  2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
@ 2018-11-08 15:56 ` Jan Beulich
  2018-11-08 16:05   ` [PATCH v5 01/13] x86: reduce general stack alignment to 8 Jan Beulich
                     ` (12 more replies)
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
       [not found] ` <5C07F49D0200000000101036@prv1-mh.provo.novell.com>
  12 siblings, 13 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 15:56 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

While indirect calls have always been more expensive than direct ones,
their cost has further increased with the Spectre v2 mitigations. In a
number of cases we simply pointlessly use them in the first place. In
many other cases the indirection solely exists to abstract from e.g.
vendor specific hardware details, and hence the pointers used never
change once set. Here we can use alternatives patching to get rid of
the indirection.

Further areas where indirect calls could be eliminated (and that I've put
on my todo list in case the general concept here is deemed reasonable)
are vPMU and XSM. For the latter, the Arm side would need dealing
with as well - I'm not sure whether replacing indirect calls by direct ones
is worthwhile there; if not, the wrappers would simply need to become
function invocations in the Arm case (as is already done in the IOMMU
case).
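
(To illustrate the two wrapper flavours meant here, with a made-up XSM
hook - hypothetical code, not part of this series:)

#ifdef CONFIG_X86
# define xsm_call(fn, args...) alternative_vcall(xsm_ops.fn, ## args)
#else
# define xsm_call(fn, args...) xsm_ops.fn(args)
#endif

where xsm_ops would then need to be a single structure instance rather
than a pointer, so the patching machinery can find the member slot.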

01: x86: reduce general stack alignment to 8
02: x86: clone Linux'es ASM_CALL_CONSTRAINT
03: x86: infrastructure to allow converting certain indirect calls to direct ones
04: x86/HVM: patch indirect calls through hvm_funcs to direct ones
05: x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
06: x86: patch ctxt_switch_masking() indirect call to direct one
07: x86/genapic: patch indirect calls to direct ones
08: x86/cpuidle: patch some indirect calls to direct ones
09: cpufreq: convert to a single post-init driver (hooks) instance
10: cpufreq: patch target() indirect call to direct one
11: IOMMU: move inclusion point of asm/iommu.h
12: IOMMU: remove indirection from certain IOMMU hook accesses
13: IOMMU: patch certain indirect calls to direct ones

Patches 1 and 2 are new in v5, addressing v4 remarks on what is now
patch 3. The IOMMU patches required quite some re-work, and patch
11 is also new (the former patch at that place in the series was dropped
as requested).

Jan





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 01/13] x86: reduce general stack alignment to 8
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
@ 2018-11-08 16:05   ` Jan Beulich
  2018-11-29 14:54     ` Wei Liu
  2018-11-29 17:44     ` Wei Liu
  2018-11-08 16:06   ` [PATCH v5 02/13] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
                     ` (11 subsequent siblings)
  12 siblings, 2 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

We don't need bigger alignment except when calling EFI boot or runtime
services functions (and we don't guarantee that either, as explained
close to the top of xen/common/efi/runtime.c in the struct efi_rs_state
declaration). Hence if the compiler supports reducing stack alignment
from the ABI compatible 16 bytes (gcc 7 and newer), do so wherever
possible.

The EFI case itself is largely dealt with already (actually forcing
32-byte alignment) as a result of commit f6b7fedc89 ("x86/EFI: meet
further spec requirements for runtime calls"). However, as explained in
the description of that earlier change, without using
-mincoming-stack-boundary=3 (which we don't want) we still have to make
the compiler assume 16-byte stack boundaries for CUs making EFI calls in
order to keep the compiler from aligning the stack, but then placing an
odd number of 8-byte objects on it, resulting in a mis-aligned outgoing
stack.

This as a side effect yields some code size reduction, since for a
number of sufficiently simple non-leaf functions the stack adjustment
(by 8, when there are no local stack variables at all) gets dropped
altogether. I notice exceptions though, for example in guest_cpuid(),
where in a release build gcc 8.2 now decides to set up a frame pointer
(without ever using %rbp); I consider this a compiler quirk which we
should leave to the compiler folks to address eventually.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v5: New.

--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -51,6 +51,11 @@ CFLAGS += -DCONFIG_INDIRECT_THUNK
 export CONFIG_INDIRECT_THUNK=y
 endif
 
+# If supported by the compiler, reduce stack alignment to 8 bytes. But allow
+# this to be overridden elsewhere.
+$(call cc-option-add,CFLAGS-stack-boundary,CC,-mpreferred-stack-boundary=3)
+CFLAGS += $(CFLAGS-stack-boundary)
+
 # Set up the assembler include path properly for older toolchains.
 CFLAGS += -Wa,-I$(BASEDIR)/include
 
--- a/xen/arch/x86/efi/Makefile
+++ b/xen/arch/x86/efi/Makefile
@@ -5,7 +5,11 @@ CFLAGS += -fshort-wchar
 
 boot.init.o: buildid.o
 
+EFIOBJ := boot.init.o compat.o runtime.o
+
+$(EFIOBJ): CFLAGS-stack-boundary := -mpreferred-stack-boundary=4
+
 obj-y := stub.o
-obj-$(XEN_BUILD_EFI) := boot.init.o compat.o relocs-dummy.o runtime.o
+obj-$(XEN_BUILD_EFI) := $(EFIOBJ) relocs-dummy.o
 extra-$(XEN_BUILD_EFI) += buildid.o
 nocov-$(XEN_BUILD_EFI) += stub.o





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 02/13] x86: clone Linux'es ASM_CALL_CONSTRAINT
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
  2018-11-08 16:05   ` [PATCH v5 01/13] x86: reduce general stack alignment to 8 Jan Beulich
@ 2018-11-08 16:06   ` Jan Beulich
  2018-11-29 17:13     ` Wei Liu
  2018-11-08 16:08   ` [PATCH v5 03/13] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
                     ` (10 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

While we don't mean to run their objtool over our generated code, it
still seems desirable to avoid calls to further functions before a
function's frame pointer is set up.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v5: New.

--- a/xen/arch/x86/efi/stub.c
+++ b/xen/arch/x86/efi/stub.c
@@ -34,10 +34,11 @@ void __init noreturn efi_multiboot2(EFI_
      * not be directly supported by C compiler.
      */
     asm volatile(
-    "    call *%3                     \n"
+    "    call *%[outstr]              \n"
     "0:  hlt                          \n"
     "    jmp  0b                      \n"
-       : "+c" (StdErr), "=d" (StdErr) : "1" (err), "rm" (StdErr->OutputString)
+       : "+c" (StdErr), "=d" (StdErr) ASM_CALL_CONSTRAINT
+       : "1" (err), [outstr] "rm" (StdErr->OutputString)
        : "rax", "r8", "r9", "r10", "r11", "memory");
 
     unreachable();
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -168,7 +168,7 @@ static int __init stub_selftest(void)
                        "jmp .Lret%=\n\t"
                        ".popsection\n\t"
                        _ASM_EXTABLE(.Lret%=, .Lfix%=)
-                       : [exn] "+m" (res)
+                       : [exn] "+m" (res) ASM_CALL_CONSTRAINT
                        : [stb] "r" (addr), "a" (tests[i].rax));
 
         if ( res.raw != tests[i].res.raw )
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1044,7 +1044,8 @@ static inline int mkec(uint8_t e, int32_
                    "jmp .Lret%=\n\t"                                    \
                    ".popsection\n\t"                                    \
                    _ASM_EXTABLE(.Lret%=, .Lfix%=)                       \
-                   : [exn] "+g" (stub_exn.info), constraints,           \
+                   : [exn] "+g" (stub_exn.info) ASM_CALL_CONSTRAINT,    \
+                     constraints,                                       \
                      [stub] "r" (stub.func),                            \
                      "m" (*(uint8_t(*)[MAX_INST_LEN + 1])stub.ptr) );   \
     if ( unlikely(~stub_exn.info.raw) )                                 \
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -25,6 +25,19 @@ asm ( "\t.equ CONFIG_INDIRECT_THUNK, "
 
 #ifndef __ASSEMBLY__
 void ret_from_intr(void);
+
+/*
+ * This output constraint should be used for any inline asm which has a "call"
+ * instruction.  Otherwise the asm may be inserted before the frame pointer
+ * gets set up by the containing function.
+ */
+#ifdef CONFIG_FRAME_POINTER
+register unsigned long current_stack_pointer asm("rsp");
+# define ASM_CALL_CONSTRAINT , "+r" (current_stack_pointer)
+#else
+# define ASM_CALL_CONSTRAINT
+#endif
+
 #endif
 
 #ifndef NDEBUG
--- a/xen/include/asm-x86/guest/hypercall.h
+++ b/xen/include/asm-x86/guest/hypercall.h
@@ -40,7 +40,7 @@
         long res, tmp__;                                                \
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
-            : "=a" (res), "=D" (tmp__)                                  \
+            : "=a" (res), "=D" (tmp__) ASM_CALL_CONSTRAINT              \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1))                                          \
             : "memory" );                                               \
@@ -53,6 +53,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__)                    \
+              ASM_CALL_CONSTRAINT                                       \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2))                        \
             : "memory" );                                               \
@@ -65,6 +66,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__), "=d" (tmp__)      \
+              ASM_CALL_CONSTRAINT                                       \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2)), "3" ((long)(a3))      \
             : "memory" );                                               \
@@ -78,7 +80,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__), "=d" (tmp__),     \
-              "=&r" (tmp__)                                             \
+              "=&r" (tmp__) ASM_CALL_CONSTRAINT                         \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2)), "3" ((long)(a3)),     \
               "4" (_a4)                                                 \




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 03/13] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
  2018-11-08 16:05   ` [PATCH v5 01/13] x86: reduce general stack alignment to 8 Jan Beulich
  2018-11-08 16:06   ` [PATCH v5 02/13] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
@ 2018-11-08 16:08   ` Jan Beulich
  2018-11-12 10:36     ` Jan Beulich
  2018-11-08 16:09   ` [PATCH v5 04/13] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
                     ` (9 subsequent siblings)
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:08 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

In a number of cases the targets of indirect calls get determined once
at boot time. In such cases we can replace those calls with direct ones
via our alternative instruction patching mechanism.

Some of the targets (in particular the hvm_funcs ones) get established
only in pre-SMP initcalls, making necessary a second pass through the
alternative patching code. Therefore some adjustments beyond the
recognition of the new special pattern are necessary there.

Note that patching such sites more than once is not supported (and the
supplied macros also don't provide any means to do so).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v5: Use ASM_CALL_CONSTRAINT.
v4: Extend / adjust comments.
v3: Use "X" constraint instead of "g" in alternative_callN(). Pre-
    calculate values to be put into local register variables.
v2: Introduce and use count_va_arg(). Don't omit middle operand from
    ?: in ALT_CALL_ARG(). Re-base.
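
(For illustration, a converted call site - as in the example discussed
earlier in the thread - then simply reads

    alternative_vcall(hvm_funcs.wbinvd_intercept);

which initially emits an indirect CALL through hvm_funcs plus the
patching metadata, and gets rewritten to a direct CALL once the pointer
has been established.)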

--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -177,9 +177,14 @@ text_poke(void *addr, const void *opcode
  * self modifying code. This implies that asymmetric systems where
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
+ *
+ * The caller will set the "force" argument to true for the final
+ * invocation, such that no CALLs/JMPs to NULL pointers will be left
+ * around. See also the further comment below.
  */
-void init_or_livepatch apply_alternatives(struct alt_instr *start,
-                                          struct alt_instr *end)
+static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
+                                                  struct alt_instr *end,
+                                                  bool force)
 {
     struct alt_instr *a, *base;
 
@@ -208,9 +213,10 @@ void init_or_livepatch apply_alternative
         /*
          * Detect sequences of alt_instr's patching the same origin site, and
          * keep base pointing at the first alt_instr entry.  This is so we can
-         * refer to a single ->priv field for patching decisions.  We
-         * deliberately use the alt_instr itself rather than a local variable
-         * in case we end up making multiple passes.
+         * refer to a single ->priv field for some of our patching decisions,
+         * in particular the NOP optimization. We deliberately use the alt_instr
+         * itself rather than a local variable in case we end up making multiple
+         * passes.
          *
          * ->priv being nonzero means that the origin site has already been
          * modified, and we shouldn't try to optimise the nops again.
@@ -218,6 +224,13 @@ void init_or_livepatch apply_alternative
         if ( ALT_ORIG_PTR(base) != orig )
             base = a;
 
+        /* Skip patch sites already handled during the first pass. */
+        if ( a->priv )
+        {
+            ASSERT(force);
+            continue;
+        }
+
         /* If there is no replacement to make, see about optimising the nops. */
         if ( !boot_cpu_has(a->cpuid) )
         {
@@ -225,7 +238,7 @@ void init_or_livepatch apply_alternative
             if ( base->priv )
                 continue;
 
-            base->priv = 1;
+            a->priv = 1;
 
             /* Nothing useful to do? */
             if ( toolchain_nops_are_ideal || a->pad_len <= 1 )
@@ -236,20 +249,74 @@ void init_or_livepatch apply_alternative
             continue;
         }
 
-        base->priv = 1;
-
         memcpy(buf, repl, a->repl_len);
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
-            *(int32_t *)(buf + 1) += repl - orig;
+        {
+            /*
+             * Detect the special case of indirect-to-direct branch patching:
+             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
+             *   checked above),
+             * - replacement's displacement is -5 (pointing back at the very
+             *   insn, which makes no sense in a real replacement insn),
+             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
+             *   using RIP-relative addressing.
+             * Some branch destinations may still be NULL when we come here
+             * the first time. Defer patching of those until the post-presmp-
+             * initcalls re-invocation (with force set to true). If at that
+             * point the branch destination is still NULL, insert "UD2; UD0"
+             * (for ease of recognition) instead of CALL/JMP.
+             */
+            if ( a->cpuid == X86_FEATURE_ALWAYS &&
+                 *(int32_t *)(buf + 1) == -5 &&
+                 a->orig_len >= 6 &&
+                 orig[0] == 0xff &&
+                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
+            {
+                long disp = *(int32_t *)(orig + 2);
+                const uint8_t *dest = *(void **)(orig + 6 + disp);
+
+                if ( dest )
+                {
+                    disp = dest - (orig + 5);
+                    ASSERT(disp == (int32_t)disp);
+                    *(int32_t *)(buf + 1) = disp;
+                }
+                else if ( force )
+                {
+                    buf[0] = 0x0f;
+                    buf[1] = 0x0b;
+                    buf[2] = 0x0f;
+                    buf[3] = 0xff;
+                    buf[4] = 0xff;
+                }
+                else
+                    continue;
+            }
+            else if ( force && system_state < SYS_STATE_active )
+                ASSERT_UNREACHABLE();
+            else
+                *(int32_t *)(buf + 1) += repl - orig;
+        }
+        else if ( force && system_state < SYS_STATE_active )
+            ASSERT_UNREACHABLE();
+
+        a->priv = 1;
 
         add_nops(buf + a->repl_len, total_len - a->repl_len);
         text_poke(orig, buf, total_len);
     }
 }
 
-static bool __initdata alt_done;
+void init_or_livepatch apply_alternatives(struct alt_instr *start,
+                                          struct alt_instr *end)
+{
+    _apply_alternatives(start, end, true);
+}
+
+static unsigned int __initdata alt_todo;
+static unsigned int __initdata alt_done;
 
 /*
  * At boot time, we patch alternatives in NMI context.  This means that the
@@ -264,7 +331,7 @@ static int __init nmi_apply_alternatives
      * More than one NMI may occur between the two set_nmi_callback() below.
      * We only need to apply alternatives once.
      */
-    if ( !alt_done )
+    if ( !(alt_done & alt_todo) )
     {
         unsigned long cr0;
 
@@ -273,11 +340,12 @@ static int __init nmi_apply_alternatives
         /* Disable WP to allow patching read-only pages. */
         write_cr0(cr0 & ~X86_CR0_WP);
 
-        apply_alternatives(__alt_instructions, __alt_instructions_end);
+        _apply_alternatives(__alt_instructions, __alt_instructions_end,
+                            alt_done);
 
         write_cr0(cr0);
 
-        alt_done = true;
+        alt_done |= alt_todo;
     }
 
     return 1;
@@ -287,13 +355,11 @@ static int __init nmi_apply_alternatives
  * This routine is called with local interrupt disabled and used during
  * bootup.
  */
-void __init alternative_instructions(void)
+static void __init _alternative_instructions(bool force)
 {
     unsigned int i;
     nmi_callback_t *saved_nmi_callback;
 
-    arch_init_ideal_nops();
-
     /*
      * Don't stop machine check exceptions while patching.
      * MCEs only happen when something got corrupted and in this
@@ -306,6 +372,10 @@ void __init alternative_instructions(voi
      */
     ASSERT(!local_irq_is_enabled());
 
+    /* Set what operation to perform /before/ setting the callback. */
+    alt_todo = 1u << force;
+    barrier();
+
     /*
      * As soon as the callback is set up, the next NMI will trigger patching,
      * even an NMI ahead of our explicit self-NMI.
@@ -321,11 +391,24 @@ void __init alternative_instructions(voi
      * cover the (hopefully never) async case, poll alt_done for up to one
      * second.
      */
-    for ( i = 0; !ACCESS_ONCE(alt_done) && i < 1000; ++i )
+    for ( i = 0; !(ACCESS_ONCE(alt_done) & alt_todo) && i < 1000; ++i )
         mdelay(1);
 
-    if ( !ACCESS_ONCE(alt_done) )
+    if ( !(ACCESS_ONCE(alt_done) & alt_todo) )
         panic("Timed out waiting for alternatives self-NMI to hit\n");
 
     set_nmi_callback(saved_nmi_callback);
 }
+
+void __init alternative_instructions(void)
+{
+    arch_init_ideal_nops();
+    _alternative_instructions(false);
+}
+
+void __init alternative_branches(void)
+{
+    local_irq_disable();
+    _alternative_instructions(true);
+    local_irq_enable();
+}
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1647,6 +1647,8 @@ void __init noreturn __start_xen(unsigne
 
     do_presmp_initcalls();
 
+    alternative_branches();
+
     /*
      * NB: when running as a PV shim VCPUOP_up/down is wired to the shim
      * physical cpu_add/remove functions, so launch the guest with only
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -4,8 +4,8 @@
 #ifdef __ASSEMBLY__
 #include <asm/alternative-asm.h>
 #else
+#include <xen/lib.h>
 #include <xen/stringify.h>
-#include <xen/types.h>
 #include <asm/asm-macros.h>
 
 struct __packed alt_instr {
@@ -26,6 +26,7 @@ extern void add_nops(void *insns, unsign
 /* Similar to alternative_instructions except it can be run with IRQs enabled. */
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
+extern void alternative_branches(void);
 
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
@@ -149,6 +150,233 @@ extern void alternative_instructions(voi
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
+/*
+ * Machinery to allow converting indirect to direct calls, when the called
+ * function is determined once at boot and later never changed.
+ */
+
+#define ALT_CALL_arg1 "rdi"
+#define ALT_CALL_arg2 "rsi"
+#define ALT_CALL_arg3 "rdx"
+#define ALT_CALL_arg4 "rcx"
+#define ALT_CALL_arg5 "r8"
+#define ALT_CALL_arg6 "r9"
+
+#define ALT_CALL_ARG(arg, n) \
+    register typeof((arg) ? (arg) : 0) a ## n ## _ \
+    asm ( ALT_CALL_arg ## n ) = (arg)
+#define ALT_CALL_NO_ARG(n) \
+    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n )
+
+#define ALT_CALL_NO_ARG6 ALT_CALL_NO_ARG(6)
+#define ALT_CALL_NO_ARG5 ALT_CALL_NO_ARG(5); ALT_CALL_NO_ARG6
+#define ALT_CALL_NO_ARG4 ALT_CALL_NO_ARG(4); ALT_CALL_NO_ARG5
+#define ALT_CALL_NO_ARG3 ALT_CALL_NO_ARG(3); ALT_CALL_NO_ARG4
+#define ALT_CALL_NO_ARG2 ALT_CALL_NO_ARG(2); ALT_CALL_NO_ARG3
+#define ALT_CALL_NO_ARG1 ALT_CALL_NO_ARG(1); ALT_CALL_NO_ARG2
+
+/*
+ * Unfortunately ALT_CALL_NO_ARG() above can't use a fake initializer (to
+ * suppress "uninitialized variable" warnings), as various versions of gcc
+ * older than 8.1 fall on the nose in various ways with that (always because
+ * of some other construct elsewhere in the same function needing to use the
+ * same hard register). Otherwise the asm() below could uniformly use "+r"
+ * output constraints, making unnecessary all these ALT_CALL<n>_OUT macros.
+ */
+#define ALT_CALL0_OUT "=r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL1_OUT "+r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL2_OUT "+r" (a1_), "+r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL3_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL4_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL5_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "=r" (a6_)
+#define ALT_CALL6_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "+r" (a6_)
+
+#define alternative_callN(n, rettype, func) ({                     \
+    rettype ret_;                                                  \
+    register unsigned long r10_ asm("r10");                        \
+    register unsigned long r11_ asm("r11");                        \
+    asm volatile (__stringify(ALTERNATIVE "call *%c[addr](%%rip)", \
+                                          "call .",                \
+                                          X86_FEATURE_ALWAYS)      \
+                  : ALT_CALL ## n ## _OUT, "=a" (ret_),            \
+                    "=r" (r10_), "=r" (r11_) ASM_CALL_CONSTRAINT   \
+                  : [addr] "i" (&(func)), "g" (func)               \
+                  : "memory" );                                    \
+    ret_;                                                          \
+})
+
+#define alternative_vcall0(func) ({             \
+    ALT_CALL_NO_ARG1;                           \
+    ((void)alternative_callN(0, int, func));    \
+})
+
+#define alternative_call0(func) ({              \
+    ALT_CALL_NO_ARG1;                           \
+    alternative_callN(0, typeof(func()), func); \
+})
+
+#define alternative_vcall1(func, arg) ({           \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    (void)sizeof(func(arg));                       \
+    (void)alternative_callN(1, int, func);         \
+})
+
+#define alternative_call1(func, arg) ({            \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    alternative_callN(1, typeof(func(arg)), func); \
+})
+
+#define alternative_vcall2(func, arg1, arg2) ({           \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    (void)sizeof(func(arg1, arg2));                       \
+    (void)alternative_callN(2, int, func);                \
+})
+
+#define alternative_call2(func, arg1, arg2) ({            \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    alternative_callN(2, typeof(func(arg1, arg2)), func); \
+})
+
+#define alternative_vcall3(func, arg1, arg2, arg3) ({    \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    (void)sizeof(func(arg1, arg2, arg3));                \
+    (void)alternative_callN(3, int, func);               \
+})
+
+#define alternative_call3(func, arg1, arg2, arg3) ({     \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    alternative_callN(3, typeof(func(arg1, arg2, arg3)), \
+                      func);                             \
+})
+
+#define alternative_vcall4(func, arg1, arg2, arg3, arg4) ({ \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    (void)sizeof(func(arg1, arg2, arg3, arg4));             \
+    (void)alternative_callN(4, int, func);                  \
+})
+
+#define alternative_call4(func, arg1, arg2, arg3, arg4) ({  \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    alternative_callN(4, typeof(func(arg1, arg2,            \
+                                     arg3, arg4)),          \
+                      func);                                \
+})
+
+#define alternative_vcall5(func, arg1, arg2, arg3, arg4, arg5) ({ \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5));             \
+    (void)alternative_callN(5, int, func);                        \
+})
+
+#define alternative_call5(func, arg1, arg2, arg3, arg4, arg5) ({  \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    alternative_callN(5, typeof(func(arg1, arg2, arg3,            \
+                                     arg4, arg5)),                \
+                      func);                                      \
+})
+
+#define alternative_vcall6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5, arg6));             \
+    (void)alternative_callN(6, int, func);                              \
+})
+
+#define alternative_call6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({  \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    alternative_callN(6, typeof(func(arg1, arg2, arg3,                  \
+                                     arg4, arg5, arg6)),                \
+                      func);                                            \
+})
+
+#define alternative_vcall__(nr) alternative_vcall ## nr
+#define alternative_call__(nr)  alternative_call ## nr
+
+#define alternative_vcall_(nr) alternative_vcall__(nr)
+#define alternative_call_(nr)  alternative_call__(nr)
+
+#define alternative_vcall(func, args...) \
+    alternative_vcall_(count_va_arg(args))(func, ## args)
+
+#define alternative_call(func, args...) \
+    alternative_call_(count_va_arg(args))(func, ## args)
+
 #endif /*  !__ASSEMBLY__  */
 
 #endif /* __X86_ALTERNATIVE_H__ */
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -66,6 +66,10 @@
 
 #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
 
+#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
+#define count_va_arg(args...) \
+    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+
 struct domain;
 
 void cmdline_parse(const char *cmdline);
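
For illustration (a sketch, not part of either hunk): count_va_arg()
works by sliding its arguments in front of a descending number list, so
that the surviving selector names the argument count. This is what lets
the top-level wrappers pick the matching per-arity macro:

    /* alternative_vcall(func, arg) expands to
     * alternative_vcall_(count_va_arg(arg))(func, arg); here
     * count_va_arg(arg) becomes
     * count_va_arg_(., arg, 8, 7, 6, 5, 4, 3, 2, 1, 0), whose 'x'
     * slot (the ninth argument after the dot) now holds 1, yielding:
     */
    alternative_vcall1(func, arg);

    /* With no arguments, the ', ##' swallows the comma and 'x' is 0: */
    alternative_vcall0(func);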





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 04/13] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (2 preceding siblings ...)
  2018-11-08 16:08   ` [PATCH v5 03/13] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2018-11-08 16:09   ` Jan Beulich
  2018-11-08 16:09   ` [PATCH v5 05/13] x86/HVM: patch vINTR " Jan Beulich
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

This intentionally doesn't touch hooks used rarely (or not at all)
during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
nor the nested, VM event, and altp2m ones (they can all be done later,
if so desired). The virtual interrupt delivery ones will be dealt with
in a subsequent patch.
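
Value-returning hooks become alternative_call() and void ones
alternative_vcall(). A representative pair, taken from the hunks
below:

    /* void hook: */
    hvm_funcs.set_tsc_offset(v, offset, at_tsc);               /* old */
    alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);

    /* value-returning hook: */
    return hvm_funcs.get_cpl(v);                               /* old */
    return alternative_call(hvm_funcs.get_cpl, v);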

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v3: Re-base.
v2: Drop open-coded numbers from macro invocations. Re-base.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2103,7 +2103,7 @@ static int hvmemul_write_msr(
 static int hvmemul_wbinvd(
     struct x86_emulate_ctxt *ctxt)
 {
-    hvm_funcs.wbinvd_intercept();
+    alternative_vcall(hvm_funcs.wbinvd_intercept);
     return X86EMUL_OKAY;
 }
 
@@ -2121,7 +2121,7 @@ static int hvmemul_get_fpu(
     struct vcpu *curr = current;
 
     if ( !curr->fpu_dirtied )
-        hvm_funcs.fpu_dirty_intercept();
+        alternative_vcall(hvm_funcs.fpu_dirty_intercept);
     else if ( type == X86EMUL_FPU_fpu )
     {
         const typeof(curr->arch.xsave_area->fpu_sse) *fpu_ctxt =
@@ -2238,7 +2238,7 @@ static void hvmemul_put_fpu(
         {
             curr->fpu_dirtied = false;
             stts();
-            hvm_funcs.fpu_leave(curr);
+            alternative_vcall(hvm_funcs.fpu_leave, curr);
         }
     }
 }
@@ -2400,7 +2400,8 @@ static int _hvm_emulate_one(struct hvm_e
     if ( hvmemul_ctxt->intr_shadow != new_intr_shadow )
     {
         hvmemul_ctxt->intr_shadow = new_intr_shadow;
-        hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
+        alternative_vcall(hvm_funcs.set_interrupt_shadow,
+                          curr, new_intr_shadow);
     }
 
     if ( hvmemul_ctxt->ctxt.retire.hlt &&
@@ -2537,7 +2538,8 @@ void hvm_emulate_init_once(
 
     memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
 
-    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
+    hvmemul_ctxt->intr_shadow =
+        alternative_call(hvm_funcs.get_interrupt_shadow, curr);
     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -272,12 +272,12 @@ void hvm_set_rdtsc_exiting(struct domain
     struct vcpu *v;
 
     for_each_vcpu ( d, v )
-        hvm_funcs.set_rdtsc_exiting(v, enable);
+        alternative_vcall(hvm_funcs.set_rdtsc_exiting, v, enable);
 }
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
 {
-    if ( !hvm_funcs.get_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.get_guest_pat, v, guest_pat) )
         *guest_pat = v->arch.hvm.pat_cr;
 }
 
@@ -302,7 +302,7 @@ int hvm_set_guest_pat(struct vcpu *v, u6
             return 0;
         }
 
-    if ( !hvm_funcs.set_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.set_guest_pat, v, guest_pat) )
         v->arch.hvm.pat_cr = guest_pat;
 
     return 1;
@@ -342,7 +342,7 @@ bool hvm_set_guest_bndcfgs(struct vcpu *
             /* nothing, best effort only */;
     }
 
-    return hvm_funcs.set_guest_bndcfgs(v, val);
+    return alternative_call(hvm_funcs.set_guest_bndcfgs, v, val);
 }
 
 /*
@@ -500,7 +500,8 @@ void hvm_migrate_pirqs(struct vcpu *v)
 static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
 {
     info->cr2 = v->arch.hvm.guest_cr[2];
-    return hvm_funcs.get_pending_event(v, info);
+
+    return alternative_call(hvm_funcs.get_pending_event, v, info);
 }
 
 void hvm_do_resume(struct vcpu *v)
@@ -1659,7 +1660,7 @@ void hvm_inject_event(const struct x86_e
         }
     }
 
-    hvm_funcs.inject_event(event);
+    alternative_vcall(hvm_funcs.inject_event, event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -2246,7 +2247,7 @@ int hvm_set_cr0(unsigned long value, boo
          (!rangeset_is_empty(d->iomem_caps) ||
           !rangeset_is_empty(d->arch.ioport_caps) ||
           has_arch_pdevs(d)) )
-        hvm_funcs.handle_cd(v, value);
+        alternative_vcall(hvm_funcs.handle_cd, v, value);
 
     hvm_update_cr(v, 0, value);
 
@@ -3473,7 +3474,8 @@ int hvm_msr_read_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_read_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_read_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3633,7 +3635,8 @@ int hvm_msr_write_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_write_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_write_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3825,7 +3828,7 @@ void hvm_hypercall_page_initialise(struc
                                    void *hypercall_page)
 {
     hvm_latch_shinfo_size(d);
-    hvm_funcs.init_hypercall_page(d, hypercall_page);
+    alternative_vcall(hvm_funcs.init_hypercall_page, d, hypercall_page);
 }
 
 void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
@@ -5031,7 +5034,7 @@ void hvm_domain_soft_reset(struct domain
 void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
                               struct segment_register *reg)
 {
-    hvm_funcs.get_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.get_segment_register, v, seg, reg);
 
     switch ( seg )
     {
@@ -5177,7 +5180,7 @@ void hvm_set_segment_register(struct vcp
         return;
     }
 
-    hvm_funcs.set_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.set_segment_register, v, seg, reg);
 }
 
 /*
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -388,42 +388,42 @@ static inline int
 hvm_guest_x86_mode(struct vcpu *v)
 {
     ASSERT(v == current);
-    return hvm_funcs.guest_x86_mode(v);
+    return alternative_call(hvm_funcs.guest_x86_mode, v);
 }
 
 static inline void
 hvm_update_host_cr3(struct vcpu *v)
 {
     if ( hvm_funcs.update_host_cr3 )
-        hvm_funcs.update_host_cr3(v);
+        alternative_vcall(hvm_funcs.update_host_cr3, v);
 }
 
 static inline void hvm_update_guest_cr(struct vcpu *v, unsigned int cr)
 {
-    hvm_funcs.update_guest_cr(v, cr, 0);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, cr, 0);
 }
 
 static inline void hvm_update_guest_cr3(struct vcpu *v, bool noflush)
 {
     unsigned int flags = noflush ? HVM_UPDATE_GUEST_CR3_NOFLUSH : 0;
 
-    hvm_funcs.update_guest_cr(v, 3, flags);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, 3, flags);
 }
 
 static inline void hvm_update_guest_efer(struct vcpu *v)
 {
-    hvm_funcs.update_guest_efer(v);
+    alternative_vcall(hvm_funcs.update_guest_efer, v);
 }
 
 static inline void hvm_cpuid_policy_changed(struct vcpu *v)
 {
-    hvm_funcs.cpuid_policy_changed(v);
+    alternative_vcall(hvm_funcs.cpuid_policy_changed, v);
 }
 
 static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset,
                                       uint64_t at_tsc)
 {
-    hvm_funcs.set_tsc_offset(v, offset, at_tsc);
+    alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);
 }
 
 /*
@@ -440,18 +440,18 @@ static inline void hvm_flush_guest_tlbs(
 static inline unsigned int
 hvm_get_cpl(struct vcpu *v)
 {
-    return hvm_funcs.get_cpl(v);
+    return alternative_call(hvm_funcs.get_cpl, v);
 }
 
 static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 {
-    return hvm_funcs.get_shadow_gs_base(v);
+    return alternative_call(hvm_funcs.get_shadow_gs_base, v);
 }
 
 static inline bool hvm_get_guest_bndcfgs(struct vcpu *v, u64 *val)
 {
     return hvm_funcs.get_guest_bndcfgs &&
-           hvm_funcs.get_guest_bndcfgs(v, val);
+           alternative_call(hvm_funcs.get_guest_bndcfgs, v, val);
 }
 
 #define has_hvm_params(d) \
@@ -508,12 +508,12 @@ static inline void hvm_inject_page_fault
 
 static inline int hvm_event_pending(struct vcpu *v)
 {
-    return hvm_funcs.event_pending(v);
+    return alternative_call(hvm_funcs.event_pending, v);
 }
 
 static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
 {
-    hvm_funcs.invlpg(v, linear);
+    alternative_vcall(hvm_funcs.invlpg, v, linear);
 }
 
 /* These bits in CR4 are owned by the host. */
@@ -538,13 +538,14 @@ static inline void hvm_cpu_down(void)
 
 static inline unsigned int hvm_get_insn_bytes(struct vcpu *v, uint8_t *buf)
 {
-    return (hvm_funcs.get_insn_bytes ? hvm_funcs.get_insn_bytes(v, buf) : 0);
+    return (hvm_funcs.get_insn_bytes
+            ? alternative_call(hvm_funcs.get_insn_bytes, v, buf) : 0);
 }
 
 static inline void hvm_set_info_guest(struct vcpu *v)
 {
     if ( hvm_funcs.set_info_guest )
-        return hvm_funcs.set_info_guest(v);
+        alternative_vcall(hvm_funcs.set_info_guest, v);
 }
 
 static inline void hvm_invalidate_regs_fields(struct cpu_user_regs *regs)





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 05/13] x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (3 preceding siblings ...)
  2018-11-08 16:09   ` [PATCH v5 04/13] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
@ 2018-11-08 16:09   ` Jan Beulich
  2018-11-08 16:10   ` [PATCH v5 06/13] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

While not strictly necessary, change the VMX initialization logic to
update the function table in start_vmx() from NULL rather than to NULL,
to make it more obvious that we won't ever change an already (explicitly)
initialized function pointer.
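
As all of these vINTR hooks are optional, their NULL checks remain;
only the call itself gets patched, e.g. (from the vlapic.c hunks
below):

    if ( hvm_funcs.handle_eoi )
        alternative_vcall(hvm_funcs.handle_eoi, vector,
                          vlapic_find_highest_isr(vlapic));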

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v5: Fix indentation.
v4: Re-base.
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -111,10 +111,15 @@ static void vlapic_clear_irr(int vector,
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_IRR]);
 }
 
-static int vlapic_find_highest_irr(struct vlapic *vlapic)
+static void sync_pir_to_irr(struct vcpu *v)
 {
     if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(vlapic_vcpu(vlapic));
+        alternative_vcall(hvm_funcs.sync_pir_to_irr, v);
+}
+
+static int vlapic_find_highest_irr(struct vlapic *vlapic)
+{
+    sync_pir_to_irr(vlapic_vcpu(vlapic));
 
     return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
 }
@@ -143,7 +148,7 @@ bool vlapic_test_irq(const struct vlapic
         return false;
 
     if ( hvm_funcs.test_pir &&
-         hvm_funcs.test_pir(const_vlapic_vcpu(vlapic), vec) )
+         alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) )
         return true;
 
     return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
@@ -165,10 +170,10 @@ void vlapic_set_irq(struct vlapic *vlapi
         vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
 
     if ( hvm_funcs.update_eoi_exit_bitmap )
-        hvm_funcs.update_eoi_exit_bitmap(target, vec, trig);
+        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
 
     if ( hvm_funcs.deliver_posted_intr )
-        hvm_funcs.deliver_posted_intr(target, vec);
+        alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
     else if ( !vlapic_test_and_set_irr(vec, vlapic) )
         vcpu_kick(target);
 }
@@ -448,7 +453,8 @@ void vlapic_EOI_set(struct vlapic *vlapi
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]);
 
     if ( hvm_funcs.handle_eoi )
-        hvm_funcs.handle_eoi(vector, vlapic_find_highest_isr(vlapic));
+        alternative_vcall(hvm_funcs.handle_eoi, vector,
+                          vlapic_find_highest_isr(vlapic));
 
     vlapic_handle_EOI(vlapic, vector);
 
@@ -1412,8 +1418,7 @@ static int lapic_save_regs(struct vcpu *
     if ( !has_vlapic(v->domain) )
         return 0;
 
-    if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(v);
+    sync_pir_to_irr(v);
 
     return hvm_save_entry(LAPIC_REGS, v->vcpu_id, h, vcpu_vlapic(v)->regs);
 }
@@ -1509,7 +1514,8 @@ static int lapic_load_regs(struct domain
         lapic_load_fixup(s);
 
     if ( hvm_funcs.process_isr )
-        hvm_funcs.process_isr(vlapic_find_highest_isr(s), v);
+        alternative_vcall(hvm_funcs.process_isr,
+                          vlapic_find_highest_isr(s), v);
 
     vlapic_adjust_i8259_target(d);
     lapic_rearm(s);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2340,12 +2340,6 @@ static struct hvm_function_table __initd
     .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
-    .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
-    .process_isr          = vmx_process_isr,
-    .deliver_posted_intr  = vmx_deliver_posted_intr,
-    .sync_pir_to_irr      = vmx_sync_pir_to_irr,
-    .test_pir             = vmx_test_pir,
-    .handle_eoi           = vmx_handle_eoi,
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .enable_msr_interception = vmx_enable_msr_interception,
     .is_singlestep_supported = vmx_is_singlestep_supported,
@@ -2473,26 +2467,23 @@ const struct hvm_function_table * __init
         setup_ept_dump();
     }
 
-    if ( !cpu_has_vmx_virtual_intr_delivery )
+    if ( cpu_has_vmx_virtual_intr_delivery )
     {
-        vmx_function_table.update_eoi_exit_bitmap = NULL;
-        vmx_function_table.process_isr = NULL;
-        vmx_function_table.handle_eoi = NULL;
-    }
-    else
+        vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
+        vmx_function_table.process_isr = vmx_process_isr;
+        vmx_function_table.handle_eoi = vmx_handle_eoi;
         vmx_function_table.virtual_intr_delivery_enabled = true;
+    }
 
     if ( cpu_has_vmx_posted_intr_processing )
     {
         alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
         if ( iommu_intpost )
             alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
-    }
-    else
-    {
-        vmx_function_table.deliver_posted_intr = NULL;
-        vmx_function_table.sync_pir_to_irr = NULL;
-        vmx_function_table.test_pir = NULL;
+
+        vmx_function_table.deliver_posted_intr = vmx_deliver_posted_intr;
+        vmx_function_table.sync_pir_to_irr     = vmx_sync_pir_to_irr;
+        vmx_function_table.test_pir            = vmx_test_pir;
     }
 
     if ( cpu_has_vmx_tsc_scaling )





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 06/13] x86: patch ctxt_switch_masking() indirect call to direct one
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (4 preceding siblings ...)
  2018-11-08 16:09   ` [PATCH v5 05/13] x86/HVM: patch vINTR " Jan Beulich
@ 2018-11-08 16:10   ` Jan Beulich
  2018-11-08 16:11   ` [PATCH v5 07/13] x86/genapic: patch indirect calls to direct ones Jan Beulich
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded number from macro invocation.

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -185,7 +185,7 @@ void ctxt_switch_levelling(const struct
 	}
 
 	if (ctxt_switch_masking)
-		ctxt_switch_masking(next);
+		alternative_vcall(ctxt_switch_masking, next);
 }
 
 bool_t opt_cpu_info;




^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 07/13] x86/genapic: patch indirect calls to direct ones
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (5 preceding siblings ...)
  2018-11-08 16:10   ` [PATCH v5 06/13] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
@ 2018-11-08 16:11   ` Jan Beulich
  2018-11-08 16:11   ` [PATCH v5 08/13] x86/cpuidle: patch some " Jan Beulich
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

For (I hope) obvious reasons, only the hooks used at runtime get
converted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -29,12 +29,12 @@
 
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
-    genapic.send_IPI_mask(mask, vector);
+    alternative_vcall(genapic.send_IPI_mask, mask, vector);
 }
 
 void send_IPI_self(int vector)
 {
-    genapic.send_IPI_self(vector);
+    alternative_vcall(genapic.send_IPI_self, vector);
 }
 
 /*
--- a/xen/include/asm-x86/mach-generic/mach_apic.h
+++ b/xen/include/asm-x86/mach-generic/mach_apic.h
@@ -15,8 +15,18 @@
 #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
 #define init_apic_ldr (genapic.init_apic_ldr)
 #define clustered_apic_check (genapic.clustered_apic_check)
-#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
-#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
+#define cpu_mask_to_apicid(mask) ({ \
+	/* \
+	 * There are a number of places where the address of a local variable \
+	 * gets passed here. The use of ?: in alternative_call<N>() triggers an \
+	 * "address of ... is always true" warning in such a case with at least \
+	 * gcc 7 and 8. Hence the seemingly pointless local variable here. \
+	 */ \
+	const cpumask_t *m_ = (mask); \
+	alternative_call(genapic.cpu_mask_to_apicid, m_); \
+})
+#define vector_allocation_cpumask(cpu) \
+	alternative_call(genapic.vector_allocation_cpumask, cpu)
 
 static inline void enable_apic_mode(void)
 {






^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 08/13] x86/cpuidle: patch some indirect calls to direct ones
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (6 preceding siblings ...)
  2018-11-08 16:11   ` [PATCH v5 07/13] x86/genapic: patch indirect calls to direct ones Jan Beulich
@ 2018-11-08 16:11   ` Jan Beulich
  2018-11-08 16:12   ` [PATCH v5 09/13] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

For now, only the hooks used while entering/exiting idle states are
converted. Additionally, pm_idle{,_save} and lapic_timer_{on,off} can't
be converted, as they may get established rather late (when Dom0 is
already active).

Note that for patching to be deferred until after the pre-SMP initcalls
(from where cpuidle_init_cpu() runs the first time), the pointers need
to start out as NULL; see the sketch below.
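
A sketch of the resulting arrangement (illustration only; it assumes
the call-patching infrastructure leaves sites whose pointer is still
NULL to the later patching pass):

    /* Deliberately no initializers - see cpuidle_init_cpu(): */
    uint64_t (*__read_mostly cpuidle_get_tick)(void);
    static uint64_t (*__read_mostly tick_to_ns)(uint64_t);

    /* Selected once, from the first pre-SMP cpuidle_init_cpu() run: */
    cpuidle_get_tick = boot_cpu_has(X86_FEATURE_NONSTOP_TSC)
                       ? get_stime_tick : get_acpi_pm_tick;

    /* Use sites then become patchable direct calls: */
    t1 = alternative_call(cpuidle_get_tick);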

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -102,8 +102,6 @@ bool lapic_timer_init(void)
     return true;
 }
 
-static uint64_t (*__read_mostly tick_to_ns)(uint64_t) = acpi_pm_tick_to_ns;
-
 void (*__read_mostly pm_idle_save)(void);
 unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER - 1;
 integer_param("max_cstate", max_cstate);
@@ -289,9 +287,9 @@ static uint64_t acpi_pm_ticks_elapsed(ui
         return ((0xFFFFFFFF - t1) + t2 +1);
 }
 
-uint64_t (*__read_mostly cpuidle_get_tick)(void) = get_acpi_pm_tick;
-static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t)
-    = acpi_pm_ticks_elapsed;
+uint64_t (*__read_mostly cpuidle_get_tick)(void);
+static uint64_t (*__read_mostly tick_to_ns)(uint64_t);
+static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t);
 
 static void print_acpi_power(uint32_t cpu, struct acpi_processor_power *power)
 {
@@ -547,7 +545,7 @@ void update_idle_stats(struct acpi_proce
                        struct acpi_processor_cx *cx,
                        uint64_t before, uint64_t after)
 {
-    int64_t sleep_ticks = ticks_elapsed(before, after);
+    int64_t sleep_ticks = alternative_call(ticks_elapsed, before, after);
     /* Interrupts are disabled */
 
     spin_lock(&power->stat_lock);
@@ -555,7 +553,8 @@ void update_idle_stats(struct acpi_proce
     cx->usage++;
     if ( sleep_ticks > 0 )
     {
-        power->last_residency = tick_to_ns(sleep_ticks) / 1000UL;
+        power->last_residency = alternative_call(tick_to_ns, sleep_ticks) /
+                                1000UL;
         cx->time += sleep_ticks;
     }
     power->last_state = &power->states[0];
@@ -635,7 +634,7 @@ static void acpi_processor_idle(void)
         if ( cx->type == ACPI_STATE_C1 || local_apic_timer_c2_ok )
         {
             /* Get start time (ticks) */
-            t1 = cpuidle_get_tick();
+            t1 = alternative_call(cpuidle_get_tick);
             /* Trace cpu idle entry */
             TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -644,7 +643,7 @@ static void acpi_processor_idle(void)
             /* Invoke C2 */
             acpi_idle_do_entry(cx);
             /* Get end time (ticks) */
-            t2 = cpuidle_get_tick();
+            t2 = alternative_call(cpuidle_get_tick);
             trace_exit_reason(irq_traced);
             /* Trace cpu idle exit */
             TRACE_6D(TRC_PM_IDLE_EXIT, cx->idx, t2,
@@ -666,7 +665,7 @@ static void acpi_processor_idle(void)
         lapic_timer_off();
 
         /* Get start time (ticks) */
-        t1 = cpuidle_get_tick();
+        t1 = alternative_call(cpuidle_get_tick);
         /* Trace cpu idle entry */
         TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -717,7 +716,7 @@ static void acpi_processor_idle(void)
         }
 
         /* Get end time (ticks) */
-        t2 = cpuidle_get_tick();
+        t2 = alternative_call(cpuidle_get_tick);
 
         /* recovering TSC */
         cstate_restore_tsc();
@@ -827,11 +826,20 @@ int cpuidle_init_cpu(unsigned int cpu)
     {
         unsigned int i;
 
-        if ( cpu == 0 && boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+        if ( cpu == 0 && system_state < SYS_STATE_active )
         {
-            cpuidle_get_tick = get_stime_tick;
-            ticks_elapsed = stime_ticks_elapsed;
-            tick_to_ns = stime_tick_to_ns;
+            if ( boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+            {
+                cpuidle_get_tick = get_stime_tick;
+                ticks_elapsed = stime_ticks_elapsed;
+                tick_to_ns = stime_tick_to_ns;
+            }
+            else
+            {
+                cpuidle_get_tick = get_acpi_pm_tick;
+                ticks_elapsed = acpi_pm_ticks_elapsed;
+                tick_to_ns = acpi_pm_tick_to_ns;
+            }
         }
 
         acpi_power = xzalloc(struct acpi_processor_power);
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -778,7 +778,7 @@ static void mwait_idle(void)
 	if (!(lapic_timer_reliable_states & (1 << cstate)))
 		lapic_timer_off();
 
-	before = cpuidle_get_tick();
+	before = alternative_call(cpuidle_get_tick);
 	TRACE_4D(TRC_PM_IDLE_ENTRY, cx->type, before, exp, pred);
 
 	update_last_cx_stat(power, cx, before);
@@ -786,7 +786,7 @@ static void mwait_idle(void)
 	if (cpu_is_haltable(cpu))
 		mwait_idle_with_hints(eax, MWAIT_ECX_INTERRUPT_BREAK);
 
-	after = cpuidle_get_tick();
+	after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();
 	trace_exit_reason(irq_traced);





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 09/13] cpufreq: convert to a single post-init driver (hooks) instance
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (7 preceding siblings ...)
  2018-11-08 16:11   ` [PATCH v5 08/13] x86/cpuidle: patch some " Jan Beulich
@ 2018-11-08 16:12   ` Jan Beulich
  2018-11-08 16:13   ` [PATCH v5 10/13] cpufreq: patch target() indirect call to direct one Jan Beulich
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu

This reduces the post-init memory footprint, eliminates a pointless
level of indirection at the use sites, and allows for subsequent
alternatives call patching.

Take the opportunity and also add a name to the PowerNow! instance.
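
The shape of the change, in short (cf. the hunks below): registration
copies the hooks into a single global instance, and "is a driver
registered?" checks switch from the pointer to the .init hook:

    struct cpufreq_driver __read_mostly cpufreq_driver;

    int __init cpufreq_register_driver(const struct cpufreq_driver *driver_data)
    {
        if ( !driver_data || !driver_data->init )
            return -EINVAL;

        if ( cpufreq_driver.init )      /* was: if ( cpufreq_driver ) */
            return -EBUSY;

        cpufreq_driver = *driver_data;  /* one-shot copy of the hooks */

        return 0;
    }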

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: New.

--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -53,8 +53,6 @@ enum {
 
 struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
 
-static struct cpufreq_driver acpi_cpufreq_driver;
-
 static bool __read_mostly acpi_pstate_strict;
 boolean_param("acpi_pstate_strict", acpi_pstate_strict);
 
@@ -355,7 +353,7 @@ static void feature_detect(void *info)
     if ( cpu_has_aperfmperf )
     {
         policy->aperf_mperf = 1;
-        acpi_cpufreq_driver.getavg = get_measured_perf;
+        cpufreq_driver.getavg = get_measured_perf;
     }
 
     eax = cpuid_eax(6);
@@ -593,7 +591,7 @@ acpi_cpufreq_cpu_init(struct cpufreq_pol
         policy->cur = acpi_cpufreq_guess_freq(data, policy->cpu);
         break;
     case ACPI_ADR_SPACE_FIXED_HARDWARE:
-        acpi_cpufreq_driver.get = get_cur_freq_on_cpu;
+        cpufreq_driver.get = get_cur_freq_on_cpu;
         policy->cur = get_cur_freq_on_cpu(cpu);
         break;
     default:
@@ -635,7 +633,7 @@ static int acpi_cpufreq_cpu_exit(struct
     return 0;
 }
 
-static struct cpufreq_driver acpi_cpufreq_driver = {
+static const struct cpufreq_driver __initconstrel acpi_cpufreq_driver = {
     .name   = "acpi-cpufreq",
     .verify = acpi_cpufreq_verify,
     .target = acpi_cpufreq_target,
@@ -656,7 +654,7 @@ static int __init cpufreq_driver_init(vo
 
     return ret;
 }
-__initcall(cpufreq_driver_init);
+presmp_initcall(cpufreq_driver_init);
 
 int cpufreq_cpu_init(unsigned int cpuid)
 {
--- a/xen/arch/x86/acpi/cpufreq/powernow.c
+++ b/xen/arch/x86/acpi/cpufreq/powernow.c
@@ -52,8 +52,6 @@
 
 #define ARCH_CPU_FLAG_RESUME	1
 
-static struct cpufreq_driver powernow_cpufreq_driver;
-
 static void transition_pstate(void *pstate)
 {
     wrmsrl(MSR_PSTATE_CTRL, *(unsigned int *)pstate);
@@ -215,7 +213,7 @@ static void feature_detect(void *info)
     if ( cpu_has_aperfmperf )
     {
         policy->aperf_mperf = 1;
-        powernow_cpufreq_driver.getavg = get_measured_perf;
+        cpufreq_driver.getavg = get_measured_perf;
     }
 
     edx = cpuid_edx(CPUID_FREQ_VOLT_CAPABILITIES);
@@ -347,7 +345,8 @@ static int powernow_cpufreq_cpu_exit(str
     return 0;
 }
 
-static struct cpufreq_driver powernow_cpufreq_driver = {
+static const struct cpufreq_driver __initconstrel powernow_cpufreq_driver = {
+    .name   = "powernow",
     .verify = powernow_cpufreq_verify,
     .target = powernow_cpufreq_target,
     .init   = powernow_cpufreq_cpu_init,
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -64,7 +64,7 @@ int do_get_pm_info(struct xen_sysctl_get
     case PMSTAT_PX:
         if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
             return -ENODEV;
-        if ( !cpufreq_driver )
+        if ( !cpufreq_driver.init )
             return -ENODEV;
         if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
             return -EINVAL;
@@ -255,16 +255,16 @@ static int get_cpufreq_para(struct xen_s
         return ret;
 
     op->u.get_para.cpuinfo_cur_freq =
-        cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
+        cpufreq_driver.get ? cpufreq_driver.get(op->cpuid) : policy->cur;
     op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
     op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
     op->u.get_para.scaling_cur_freq = policy->cur;
     op->u.get_para.scaling_max_freq = policy->max;
     op->u.get_para.scaling_min_freq = policy->min;
 
-    if ( cpufreq_driver->name[0] )
+    if ( cpufreq_driver.name[0] )
         strlcpy(op->u.get_para.scaling_driver, 
-            cpufreq_driver->name, CPUFREQ_NAME_LEN);
+            cpufreq_driver.name, CPUFREQ_NAME_LEN);
     else
         strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
 
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -172,7 +172,7 @@ int cpufreq_add_cpu(unsigned int cpu)
     if ( !(perf->init & XEN_PX_INIT) )
         return -EINVAL;
 
-    if (!cpufreq_driver)
+    if (!cpufreq_driver.init)
         return 0;
 
     if (per_cpu(cpufreq_cpu_policy, cpu))
@@ -239,7 +239,7 @@ int cpufreq_add_cpu(unsigned int cpu)
         policy->cpu = cpu;
         per_cpu(cpufreq_cpu_policy, cpu) = policy;
 
-        ret = cpufreq_driver->init(policy);
+        ret = cpufreq_driver.init(policy);
         if (ret) {
             free_cpumask_var(policy->cpus);
             xfree(policy);
@@ -298,7 +298,7 @@ err1:
     cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
     if (cpumask_empty(policy->cpus)) {
-        cpufreq_driver->exit(policy);
+        cpufreq_driver.exit(policy);
         free_cpumask_var(policy->cpus);
         xfree(policy);
     }
@@ -362,7 +362,7 @@ int cpufreq_del_cpu(unsigned int cpu)
     cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
     if (cpumask_empty(policy->cpus)) {
-        cpufreq_driver->exit(policy);
+        cpufreq_driver.exit(policy);
         free_cpumask_var(policy->cpus);
         xfree(policy);
     }
@@ -663,17 +663,17 @@ static int __init cpufreq_presmp_init(vo
 }
 presmp_initcall(cpufreq_presmp_init);
 
-int __init cpufreq_register_driver(struct cpufreq_driver *driver_data)
+int __init cpufreq_register_driver(const struct cpufreq_driver *driver_data)
 {
    if ( !driver_data || !driver_data->init ||
         !driver_data->verify || !driver_data->exit ||
         (!driver_data->target == !driver_data->setpolicy) )
         return -EINVAL;
 
-    if ( cpufreq_driver )
+    if ( cpufreq_driver.init )
         return -EBUSY;
 
-    cpufreq_driver = driver_data;
+    cpufreq_driver = *driver_data;
 
     return 0;
 }
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -31,7 +31,7 @@
 #include <acpi/cpufreq/cpufreq.h>
 #include <public/sysctl.h>
 
-struct cpufreq_driver   *cpufreq_driver;
+struct cpufreq_driver __read_mostly cpufreq_driver;
 struct processor_pminfo *__read_mostly processor_pminfo[NR_CPUS];
 DEFINE_PER_CPU_READ_MOSTLY(struct cpufreq_policy *, cpufreq_cpu_policy);
 
@@ -360,11 +360,11 @@ int __cpufreq_driver_target(struct cpufr
 {
     int retval = -EINVAL;
 
-    if (cpu_online(policy->cpu) && cpufreq_driver->target)
+    if (cpu_online(policy->cpu) && cpufreq_driver.target)
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver->target(policy, target_freq, relation);
+        retval = cpufreq_driver.target(policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }
@@ -380,9 +380,9 @@ int cpufreq_driver_getavg(unsigned int c
     if (!cpu_online(cpu) || !(policy = per_cpu(cpufreq_cpu_policy, cpu)))
         return 0;
 
-    if (cpufreq_driver->getavg)
+    if (cpufreq_driver.getavg)
     {
-        freq_avg = cpufreq_driver->getavg(cpu, flag);
+        freq_avg = cpufreq_driver.getavg(cpu, flag);
         if (freq_avg > 0)
             return freq_avg;
     }
@@ -412,9 +412,9 @@ int cpufreq_update_turbo(int cpuid, int
         return 0;
 
     policy->turbo = new_state;
-    if (cpufreq_driver->update)
+    if (cpufreq_driver.update)
     {
-        ret = cpufreq_driver->update(cpuid, policy);
+        ret = cpufreq_driver.update(cpuid, policy);
         if (ret)
             policy->turbo = curr_state;
     }
@@ -450,15 +450,15 @@ int __cpufreq_set_policy(struct cpufreq_
         return -EINVAL;
 
     /* verify the cpu speed can be set within this limit */
-    ret = cpufreq_driver->verify(policy);
+    ret = cpufreq_driver.verify(policy);
     if (ret)
         return ret;
 
     data->min = policy->min;
     data->max = policy->max;
     data->limits = policy->limits;
-    if (cpufreq_driver->setpolicy)
-        return cpufreq_driver->setpolicy(data);
+    if (cpufreq_driver.setpolicy)
+        return cpufreq_driver.setpolicy(data);
 
     if (policy->governor != data->governor) {
         /* save old, working values */
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -153,7 +153,7 @@ __cpufreq_governor(struct cpufreq_policy
 #define CPUFREQ_RELATION_H 1  /* highest frequency below or at target */
 
 struct cpufreq_driver {
-    char   name[CPUFREQ_NAME_LEN];
+    const char *name;
     int    (*init)(struct cpufreq_policy *policy);
     int    (*verify)(struct cpufreq_policy *policy);
     int    (*setpolicy)(struct cpufreq_policy *policy);
@@ -166,9 +166,9 @@ struct cpufreq_driver {
     int    (*exit)(struct cpufreq_policy *policy);
 };
 
-extern struct cpufreq_driver *cpufreq_driver;
+extern struct cpufreq_driver cpufreq_driver;
 
-int cpufreq_register_driver(struct cpufreq_driver *);
+int cpufreq_register_driver(const struct cpufreq_driver *);
 
 static __inline__
 void cpufreq_verify_within_limits(struct cpufreq_policy *policy,





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 10/13] cpufreq: patch target() indirect call to direct one
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (8 preceding siblings ...)
  2018-11-08 16:12   ` [PATCH v5 09/13] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
@ 2018-11-08 16:13   ` Jan Beulich
  2018-11-08 16:14   ` [PATCH v5 11/13] IOMMU: move inclusion point of asm/iommu.h Jan Beulich
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:13 UTC (permalink / raw)
  To: xen-devel, Jan Beulich; +Cc: Andrew Cooper, Wei Liu

This looks to be the only frequently executed hook; don't bother
patching any other ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: New.

--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -364,7 +364,8 @@ int __cpufreq_driver_target(struct cpufr
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver.target(policy, target_freq, relation);
+        retval = alternative_call(cpufreq_driver.target,
+                                  policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }




^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 11/13] IOMMU: move inclusion point of asm/iommu.h
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (9 preceding siblings ...)
  2018-11-08 16:13   ` [PATCH v5 10/13] cpufreq: patch target() indirect call to direct one Jan Beulich
@ 2018-11-08 16:14   ` Jan Beulich
  2018-11-12 11:55     ` Julien Grall
  2018-11-08 16:16   ` [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses Jan Beulich
  2018-11-08 16:17   ` [PATCH v5 13/13] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

In preparation for allowing inline functions in asm/iommu.h to
de-reference struct iommu_ops, move the inclusion downwards past the
declaration of that structure. This in turn requires moving the
struct domain_iommu declaration, as it requires struct arch_iommu to be
fully declared beforehand.
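
The resulting layout of xen/iommu.h, schematically:

    struct iommu_ops {
        /* ... hook pointers ... */
    };

    #include <asm/iommu.h>    /* inline helpers here may now
                                 dereference struct iommu_ops */

    struct domain_iommu {
        struct arch_iommu arch;     /* needs asm/iommu.h */
        const struct iommu_ops *platform_ops;
        /* ... */
    };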

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v5: New.

--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -28,7 +28,6 @@
 #include <public/hvm/ioreq.h>
 #include <public/domctl.h>
 #include <asm/device.h>
-#include <asm/iommu.h>
 
 TYPE_SAFE(uint64_t, dfn);
 #define PRI_dfn     PRIx64
@@ -103,42 +102,6 @@ enum iommu_feature
 
 bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature);
 
-enum iommu_status
-{
-    IOMMU_STATUS_disabled,
-    IOMMU_STATUS_initializing,
-    IOMMU_STATUS_initialized
-};
-
-struct domain_iommu {
-    struct arch_iommu arch;
-
-    /* iommu_ops */
-    const struct iommu_ops *platform_ops;
-
-#ifdef CONFIG_HAS_DEVICE_TREE
-    /* List of DT devices assigned to this domain */
-    struct list_head dt_devices;
-#endif
-
-    /* Features supported by the IOMMU */
-    DECLARE_BITMAP(features, IOMMU_FEAT_count);
-
-    /* Status of guest IOMMU mappings */
-    enum iommu_status status;
-
-    /*
-     * Does the guest reqire mappings to be synchonized, to maintain
-     * the default dfn == pfn map. (See comment on dfn at the top of
-     * include/xen/mm.h).
-     */
-    bool need_sync;
-};
-
-#define dom_iommu(d)              (&(d)->iommu)
-#define iommu_set_feature(d, f)   set_bit(f, dom_iommu(d)->features)
-#define iommu_clear_feature(d, f) clear_bit(f, dom_iommu(d)->features)
-
 #ifdef CONFIG_HAS_PCI
 struct pirq;
 int hvm_do_IRQ_dpci(struct domain *, struct pirq *);
@@ -230,6 +193,44 @@ struct iommu_ops {
     void (*dump_p2m_table)(struct domain *d);
 };
 
+#include <asm/iommu.h>
+
+enum iommu_status
+{
+    IOMMU_STATUS_disabled,
+    IOMMU_STATUS_initializing,
+    IOMMU_STATUS_initialized
+};
+
+struct domain_iommu {
+    struct arch_iommu arch;
+
+    /* iommu_ops */
+    const struct iommu_ops *platform_ops;
+
+#ifdef CONFIG_HAS_DEVICE_TREE
+    /* List of DT devices assigned to this domain */
+    struct list_head dt_devices;
+#endif
+
+    /* Features supported by the IOMMU */
+    DECLARE_BITMAP(features, IOMMU_FEAT_count);
+
+    /* Status of guest IOMMU mappings */
+    enum iommu_status status;
+
+    /*
+     * Does the guest reqire mappings to be synchonized, to maintain
+     * the default dfn == pfn map. (See comment on dfn at the top of
+     * include/xen/mm.h).
+     */
+    bool need_sync;
+};
+
+#define dom_iommu(d)              (&(d)->iommu)
+#define iommu_set_feature(d, f)   set_bit(f, dom_iommu(d)->features)
+#define iommu_clear_feature(d, f) clear_bit(f, dom_iommu(d)->features)
+
 int __must_check iommu_suspend(void);
 void iommu_resume(void);
 void iommu_crash_shutdown(void);






^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (10 preceding siblings ...)
  2018-11-08 16:14   ` [PATCH v5 11/13] IOMMU: move inclusion point of asm/iommu.h Jan Beulich
@ 2018-11-08 16:16   ` Jan Beulich
  2018-11-14  3:25     ` Tian, Kevin
  2018-11-14 17:16     ` Woods, Brian
  2018-11-08 16:17   ` [PATCH v5 13/13] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  12 siblings, 2 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:16 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Kevin Tian, Brian Woods, Wei Liu, Suravee Suthikulpanit

There's no need to go through an extra level of indirection. To limit
code churn, however, call sites using struct domain_iommu's platform_ops
are left untouched here.
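
The net effect (cf. the hunks below): the per-vendor const tables
become __initconstrel and get copied into a single mutable instance
during vendor setup, so iommu_get_ops() no longer needs a vendor
switch:

    struct iommu_ops iommu_ops;     /* single post-init instance */

    /* in intel_vtd_setup() resp. amd_iov_detect(): */
    iommu_ops = intel_iommu_ops;

    static inline const struct iommu_ops *iommu_get_ops(void)
    {
        BUG_ON(!iommu_ops.init);    /* some driver must have registered */
        return &iommu_ops;
    }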

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v5: Re-base over dropped IOMMU_MIXED patch.
v4: New.

--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -29,6 +29,8 @@
 
 static bool_t __read_mostly init_done;
 
+static const struct iommu_ops amd_iommu_ops;
+
 struct amd_iommu *find_iommu_for_device(int seg, int bdf)
 {
     struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(seg);
@@ -182,6 +184,8 @@ int __init amd_iov_detect(void)
         return -ENODEV;
     }
 
+    iommu_ops = amd_iommu_ops;
+
     if ( amd_iommu_init() != 0 )
     {
         printk("AMD-Vi: Error initialization\n");
@@ -566,7 +570,7 @@ static void amd_dump_p2m_table(struct do
     amd_dump_p2m_table_level(hd->arch.root_table, hd->arch.paging_mode, 0, 0);
 }
 
-const struct iommu_ops amd_iommu_ops = {
+static const struct iommu_ops __initconstrel amd_iommu_ops = {
     .init = amd_iommu_domain_init,
     .hwdom_init = amd_iommu_hwdom_init,
     .add_device = amd_iommu_add_device,
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -27,6 +27,7 @@
 
 struct pci_ats_dev;
 extern bool_t rwbf_quirk;
+extern const struct iommu_ops intel_iommu_ops;
 
 void print_iommu_regs(struct acpi_drhd_unit *drhd);
 void print_vtd_entries(struct iommu *iommu, int bus, int devfn, u64 gmfn);
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -897,6 +897,8 @@ int iommu_enable_x2apic_IR(void)
     else if ( !x2apic_enabled )
         return -EOPNOTSUPP;
 
+    iommu_ops = intel_iommu_ops;
+
     for_each_drhd_unit ( drhd )
     {
         iommu = drhd->iommu;
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2299,6 +2299,8 @@ int __init intel_vtd_setup(void)
         goto error;
     }
 
+    iommu_ops = intel_iommu_ops;
+
     /* We enable the following features only if they are supported by all VT-d
      * engines: Snoop Control, DMA passthrough, Queued Invalidation, Interrupt
      * Remapping, and Posted Interrupt
@@ -2698,7 +2700,7 @@ static void vtd_dump_p2m_table(struct do
     vtd_dump_p2m_table_level(hd->arch.pgd_maddr, agaw_to_level(hd->arch.agaw), 0, 0);
 }
 
-const struct iommu_ops intel_iommu_ops = {
+const struct iommu_ops __initconstrel intel_iommu_ops = {
     .init = intel_iommu_domain_init,
     .hwdom_init = intel_iommu_hwdom_init,
     .add_device = intel_iommu_add_device,
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -23,6 +23,8 @@
 #include <asm/hvm/io.h>
 #include <asm/setup.h>
 
+struct iommu_ops iommu_ops;
+
 void iommu_update_ire_from_apic(
     unsigned int apic, unsigned int reg, unsigned int value)
 {
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -56,24 +56,15 @@ struct arch_iommu
     struct guest_iommu *g_iommu;
 };
 
-extern const struct iommu_ops intel_iommu_ops;
-extern const struct iommu_ops amd_iommu_ops;
 int intel_vtd_setup(void);
 int amd_iov_detect(void);
 
+extern struct iommu_ops iommu_ops;
+
 static inline const struct iommu_ops *iommu_get_ops(void)
 {
-    switch ( boot_cpu_data.x86_vendor )
-    {
-    case X86_VENDOR_INTEL:
-        return &intel_iommu_ops;
-    case X86_VENDOR_AMD:
-        return &amd_iommu_ops;
-    }
-
-    BUG();
-
-    return NULL;
+    BUG_ON(!iommu_ops.init);
+    return &iommu_ops;
 }
 
 static inline int iommu_hardware_setup(void)






^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v5 13/13] IOMMU: patch certain indirect calls to direct ones
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
                     ` (11 preceding siblings ...)
  2018-11-08 16:16   ` [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses Jan Beulich
@ 2018-11-08 16:17   ` Jan Beulich
  2018-11-29 14:49     ` Wei Liu
  12 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-11-08 16:17 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

This intentionally doesn't touch hooks used rarely (or not at all)
during the lifetime of a VM, unless one happens to sit on an error path
next to a call which gets changed (in which case I think the error
path had better remain consistent with the respective main path).
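
The iommu_call() / iommu_vcall() accessors mirror alternative_call()
and alternative_vcall() (see the asm-x86/iommu.h hunk below). Their
generic fallback is cut off in this archive, but presumably degrades
to a plain indirect call, along the lines of:

    #ifndef iommu_call    /* e.g. debug builds */
    # define iommu_call(ops, fn, args...)  ((ops)->fn(args))
    # define iommu_vcall iommu_call
    #endif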

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v5: Re-base over type-safe changes and dropped IOMMU_MIXED patch. Also
    patch the new lookup_page() hook.
v4: New.

--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -226,8 +226,8 @@ void __hwdom_init iommu_hwdom_init(struc
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, _dfn(dfn), _mfn(mfn),
-                                             mapping);
+            ret = iommu_call(hd->platform_ops, map_page,
+                             d, _dfn(dfn), _mfn(mfn), mapping);
             if ( !rc )
                 rc = ret;
 
@@ -313,7 +313,7 @@ int iommu_map_page(struct domain *d, dfn
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, dfn, mfn, flags);
+    rc = iommu_call(hd->platform_ops, map_page, d, dfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -336,7 +336,7 @@ int iommu_unmap_page(struct domain *d, d
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, dfn);
+    rc = iommu_call(hd->platform_ops, unmap_page, d, dfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -359,7 +359,7 @@ int iommu_lookup_page(struct domain *d,
     if ( !iommu_enabled || !hd->platform_ops )
         return -EOPNOTSUPP;
 
-    return hd->platform_ops->lookup_page(d, dfn, mfn, flags);
+    return iommu_call(hd->platform_ops, lookup_page, d, dfn, mfn, flags);
 }
 
 static void iommu_free_pagetables(unsigned long unused)
@@ -372,7 +372,7 @@ static void iommu_free_pagetables(unsign
         spin_unlock(&iommu_pt_cleanup_lock);
         if ( !pg )
             return;
-        iommu_get_ops()->free_page_table(pg);
+        iommu_vcall(iommu_get_ops(), free_page_table, pg);
     } while ( !softirq_pending(smp_processor_id()) );
 
     tasklet_schedule_on_cpu(&iommu_pt_cleanup_tasklet,
@@ -387,7 +387,7 @@ int iommu_iotlb_flush(struct domain *d,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count);
+    rc = iommu_call(hd->platform_ops, iotlb_flush, d, dfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -410,7 +410,7 @@ int iommu_iotlb_flush_all(struct domain
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush_all(d);
+    rc = iommu_call(hd->platform_ops, iotlb_flush_all, d);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1301,14 +1301,14 @@ int iommu_update_ire_from_msi(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     return iommu_intremap
-           ? iommu_get_ops()->update_ire_from_msi(msi_desc, msg) : 0;
+           ? iommu_call(&iommu_ops, update_ire_from_msi, msi_desc, msg) : 0;
 }
 
 void iommu_read_msi_from_ire(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     if ( iommu_intremap )
-        iommu_get_ops()->read_msi_from_ire(msi_desc, msg);
+        iommu_vcall(&iommu_ops, read_msi_from_ire, msi_desc, msg);
 }
 
 static int iommu_add_device(struct pci_dev *pdev)
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -28,14 +28,12 @@ struct iommu_ops iommu_ops;
 void iommu_update_ire_from_apic(
     unsigned int apic, unsigned int reg, unsigned int value)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    ops->update_ire_from_apic(apic, reg, value);
+    iommu_vcall(&iommu_ops, update_ire_from_apic, apic, reg, value);
 }
 
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    return ops->read_apic_from_ire(apic, reg);
+    return iommu_call(&iommu_ops, read_apic_from_ire, apic, reg);
 }
 
 int __init iommu_setup_hpet_msi(struct msi_desc *msi)
@@ -46,7 +44,6 @@ int __init iommu_setup_hpet_msi(struct m
 
 int arch_iommu_populate_page_table(struct domain *d)
 {
-    const struct domain_iommu *hd = dom_iommu(d);
     struct page_info *page;
     int rc = 0, n = 0;
 
@@ -68,9 +65,8 @@ int arch_iommu_populate_page_table(struc
             {
                 ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
                 BUG_ON(SHARED_M2P(gfn));
-                rc = hd->platform_ops->map_page(d, _dfn(gfn), _mfn(mfn),
-                                                IOMMUF_readable |
-                                                IOMMUF_writable);
+                rc = iommu_call(&iommu_ops, map_page, d, _dfn(gfn), _mfn(mfn),
+                                IOMMUF_readable | IOMMUF_writable);
             }
             if ( rc )
             {
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -61,6 +61,12 @@ int amd_iov_detect(void);
 
 extern struct iommu_ops iommu_ops;
 
+#ifdef NDEBUG
+# include <asm/alternative.h>
+# define iommu_call(ops, fn, args...)  alternative_call(iommu_ops.fn, ## args)
+# define iommu_vcall(ops, fn, args...) alternative_vcall(iommu_ops.fn, ## args)
+#endif
+
 static inline const struct iommu_ops *iommu_get_ops(void)
 {
     BUG_ON(!iommu_ops.init);
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -195,6 +195,11 @@ struct iommu_ops {
 
 #include <asm/iommu.h>
 
+#ifndef iommu_call
+# define iommu_call(ops, fn, args...) ((ops)->fn(args))
+# define iommu_vcall iommu_call
+#endif
+
 enum iommu_status
 {
     IOMMU_STATUS_disabled,
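
(For illustration, a self-contained sketch of the debug-build fallback
added to xen/include/xen/iommu.h above, where iommu_call() degenerates
to a plain indirect call; the ops structure and call site below are
hypothetical stand-ins, not the real Xen types.)

#include <stdio.h>

struct demo_iommu_ops {
    int (*map_page)(unsigned long dfn, unsigned long mfn,
                    unsigned int flags);
};

/* The !NDEBUG fallback: same shape as the patched variant, but just an
 * indirect call through the ops pointer (GNU named-variadic macro). */
#define demo_call(ops, fn, args...)  ((ops)->fn(args))
#define demo_vcall                   demo_call

static int demo_map_page(unsigned long dfn, unsigned long mfn,
                         unsigned int flags)
{
    printf("map dfn=%lu mfn=%lu flags=%#x\n", dfn, mfn, flags);
    return 0;
}

int main(void)
{
    struct demo_iommu_ops ops = { .map_page = demo_map_page };

    /* Reads like the converted call sites in the hunks above. */
    return demo_call(&ops, map_page, 1UL, 2UL, 3U);
}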





* Re: [PATCH v5 03/13] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-11-08 16:08   ` [PATCH v5 03/13] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2018-11-12 10:36     ` Jan Beulich
  0 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-12 10:36 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Julien Grall

>>> On 08.11.18 at 17:08, <JBeulich@suse.com> wrote:
> --- a/xen/include/xen/lib.h
> +++ b/xen/include/xen/lib.h
> @@ -66,6 +66,10 @@
>  
>  #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
>  
> +#define count_va_arg_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
> +#define count_va_arg(args...) \
> +    count_va_arg_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)

I've just noticed that I forgot to rename this to count_args(),
which then I hope is no longer controversial as a name.

Jan
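
(A worked expansion of the counting trick, for reference:

    count_va_arg(a, b, c)
      -> count_va_arg_(., a, b, c, 8, 7, 6, 5, 4, 3, 2, 1, 0)

Binding the parameters of count_va_arg_(dot, a1..a8, x, ...) gives
dot=., a1=a, a2=b, a3=c, a4=8, a5=7, a6=6, a7=5, a8=4 and x=3: the
caller's arguments shift the descending number list so that whatever
lands in slot "x" is the argument count. With no arguments at all,
", ## args" swallows the comma and x picks up the trailing 0.)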




* Re: [PATCH v5 11/13] IOMMU: move inclusion point of asm/iommu.h
  2018-11-08 16:14   ` [PATCH v5 11/13] IOMMU: move inclusion point of asm/iommu.h Jan Beulich
@ 2018-11-12 11:55     ` Julien Grall
  0 siblings, 0 replies; 119+ messages in thread
From: Julien Grall @ 2018-11-12 11:55 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan

Hi Jan,

On 11/8/18 4:14 PM, Jan Beulich wrote:
> In preparation of allowing inline functions in asm/iommu.h to
> de-reference struct iommu_ops, move the inclusion downwards past
> the declaration of that structure. This in turn requires moving the
> struct domain_iommu declaration, as it requires struct arch_iommu to be
> fully declared beforehand.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Julien Grall <julien.grall@arm.com>

Cheers,

-- 
Julien Grall


* Re: [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses
  2018-11-08 16:16   ` [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses Jan Beulich
@ 2018-11-14  3:25     ` Tian, Kevin
  2018-11-14 17:16     ` Woods, Brian
  1 sibling, 0 replies; 119+ messages in thread
From: Tian, Kevin @ 2018-11-14  3:25 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Andrew Cooper, Brian Woods, Wei Liu, Suravee Suthikulpanit

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Friday, November 9, 2018 12:16 AM
> 
> There's no need to go through an extra level of indirection. In order to
> limit code churn, call sites using struct domain_iommu's platform_ops
> don't get touched here, however.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses
  2018-11-08 16:16   ` [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses Jan Beulich
  2018-11-14  3:25     ` Tian, Kevin
@ 2018-11-14 17:16     ` Woods, Brian
  1 sibling, 0 replies; 119+ messages in thread
From: Woods, Brian @ 2018-11-14 17:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Wei Liu, Andrew Cooper, Suthikulpanit, Suravee,
	xen-devel, Woods, Brian

On Thu, Nov 08, 2018 at 09:16:12AM -0700, Jan Beulich wrote:
> There's no need to go through an extra level of indirection. In order to
> limit code churn, call sites using struct domain_iommu's platform_ops
> don't get touched here, however.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Brian Woods <brian.woods@amd.com>


* Re: [PATCH v5 13/13] IOMMU: patch certain indirect calls to direct ones
  2018-11-08 16:17   ` [PATCH v5 13/13] IOMMU: patch certain indirect calls to direct ones Jan Beulich
@ 2018-11-29 14:49     ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-11-29 14:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

On Thu, Nov 08, 2018 at 09:17:06AM -0700, Jan Beulich wrote:
> This is intentionally not touching hooks used rarely (or not at all)
> during the lifetime of a VM, unless perhaps sitting on an error path
> next to a call which gets changed (in which case I think the error
> path had better remain consistent with the respective main path).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* Re: [PATCH v5 01/13] x86: reduce general stack alignment to 8
  2018-11-08 16:05   ` [PATCH v5 01/13] x86: reduce general stack alignment to 8 Jan Beulich
@ 2018-11-29 14:54     ` Wei Liu
  2018-11-29 15:03       ` Jan Beulich
  2018-11-29 17:44     ` Wei Liu
  1 sibling, 1 reply; 119+ messages in thread
From: Wei Liu @ 2018-11-29 14:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Thu, Nov 08, 2018 at 09:05:45AM -0700, Jan Beulich wrote:
> We don't need bigger alignment except when calling EFI boot or runtime
> services functions (and we don't guarantee that either, as explained
> close to the top of xen/common/efi/runtime.c in the struct efi_rs_state
> declaration). Hence if the compiler supports reducing stack alignment
> from the ABI compatible 16 bytes (gcc 7 and newer), do so wherever
> possible.
> 
> The EFI case itself is largely dealt with already (actually forcing
> 32-byte alignment) as a result of commit f6b7fedc89 ("x86/EFI: meet
> further spec requirements for runtime calls"). However, as explained in
> the description of that earlier change, without using
> -mincoming-stack-boundary=3 (which we don't want) we still have to make
> the compiler assume 16-byte stack boundaries for CUs making EFI calls in
> order to keep the compiler from aligning the stack, but then placing an
> odd number of 8-byte objects on it, resulting in a mis-aligned outgoing
> stack.
> 
> This as a side effect yields some code size reduction, since for a
> number of sufficiently simple non-leaf functions the stack adjustment
> (by 8, when there are no local stack variables at all) gets dropped
> altogether. I notice exceptions though, for example in guest_cpuid(),
> where in a release build gcc 8.2 now decides to set up a frame pointer
> (without ever using %rbp); I consider this a compiler quirk which we
> should leave to the compiler folks to address eventually.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

The code does what it says it does, that's all I can say without having
gone through the EFI spec.

Since you're the EFI maintainer, you have the final say on this matter.
Not sure if you're expecting anything else from Andrew or me.

Wei.


* Re: [PATCH v5 01/13] x86: reduce general stack alignment to 8
  2018-11-29 14:54     ` Wei Liu
@ 2018-11-29 15:03       ` Jan Beulich
  0 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-11-29 15:03 UTC (permalink / raw)
  To: Andrew Cooper, Wei Liu; +Cc: xen-devel

>>> On 29.11.18 at 15:54, <wei.liu2@citrix.com> wrote:
> On Thu, Nov 08, 2018 at 09:05:45AM -0700, Jan Beulich wrote:
>> We don't need bigger alignment except when calling EFI boot or runtime
>> services functions (and we don't guarantee that either, as explained
>> close to the top of xen/common/efi/runtime.c in the struct efi_rs_state
>> declaration). Hence if the compiler supports reducing stack alignment
>> from the ABI compatible 16 bytes (gcc 7 and newer), do so wherever
>> possible.
>> 
>> The EFI case itself is largely dealt with already (actually forcing
>> 32-byte alignment) as a result of commit f6b7fedc89 ("x86/EFI: meet
>> further spec requirements for runtime calls"). However, as explained in
>> the description of that earlier change, without using
>> -mincoming-stack-boundary=3 (which we don't want) we still have to make
>> the compiler assume 16-byte stack boundaries for CUs making EFI calls in
>> order to keep the compiler from aligning the stack, but then placing an
>> odd number of 8-byte objects on it, resulting in a mis-aligned outgoing
>> stack.
>> 
>> This as a side effect yields some code size reduction, since for a
>> number of sufficiently simple non-leaf functions the stack adjustment
>> (by 8, when there are no local stack variables at all) gets dropped
>> altogether. I notice exceptions though, for example in guest_cpuid(),
>> where in a release build gcc 8.2 now decides to set up a frame pointer
>> (without ever using %rbp); I consider this a compiler quirk which we
>> should leave to the compiler folks to address eventually.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> The code does what it says it does, that's all I can say without having
> gone through the EFI spec.
> 
> Since you're the EFI maintainer, you have the final say on this matter.
> Not sure if you're expecting anything else from Andrew or me.

Well, since the change affects all x86 code, not just the EFI pieces,
an ack allowing me to commit this would be nice.

Jan




* Re: [PATCH v5 02/13] x86: clone Linux'es ASM_CALL_CONSTRAINT
  2018-11-08 16:06   ` [PATCH v5 02/13] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
@ 2018-11-29 17:13     ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-11-29 17:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Thu, Nov 08, 2018 at 09:06:15AM -0700, Jan Beulich wrote:
> While we don't mean to run their objtool over our generated code, it
> still seems desirable to avoid calls to further functions before a
> function's frame pointer is set up.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

On a related note I think making use of Linux's objtool on Xen might be
beneficial.


* Re: [PATCH v5 01/13] x86: reduce general stack alignment to 8
  2018-11-08 16:05   ` [PATCH v5 01/13] x86: reduce general stack alignment to 8 Jan Beulich
  2018-11-29 14:54     ` Wei Liu
@ 2018-11-29 17:44     ` Wei Liu
  2018-11-30  9:03       ` Jan Beulich
  1 sibling, 1 reply; 119+ messages in thread
From: Wei Liu @ 2018-11-29 17:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Andrew Cooper

On Thu, Nov 08, 2018 at 09:05:45AM -0700, Jan Beulich wrote:
> We don't need bigger alignment except when calling EFI boot or runtime
> services functions (and we don't guarantee that either, as explained
> close to the top of xen/common/efi/runtime.c in the struct efi_rs_state
> declaration). Hence if the compiler supports reducing stack alignment
> from the ABI compatible 16 bytes (gcc 7 and newer), do so wherever
> possible.
> 
> The EFI case itself is largely dealt with already (actually forcing
> 32-byte alignment) as a result of commit f6b7fedc89 ("x86/EFI: meet
> further spec requirements for runtime calls"). However, as explained in
> the description of that earlier change, without using
> -mincoming-stack-boundary=3 (which we don't want) we still have to make
> the compiler assume 16-byte stack boundaries for CUs making EFI calls in
> order to keep the compiler from aligning the stack, but then placing an
> odd number of 8-byte objects on it, resulting in a mis-aligned outgoing
> stack.
> 
> This as a side effect yields some code size reduction, since for a
> number of sufficiently simple non-leaf functions the stack adjustment
> (by 8, when there are no local stack variables at all) gets dropped
> altogether. I notice exceptions though, for example in guest_cpuid(),
> where in a release build gcc 8.2 now decides to set up a frame pointer
> (without ever using %rbp); I consider this a compiler quirk which we
> should leave to the compiler folks to address eventually.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v5: New.
> 
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -51,6 +51,11 @@ CFLAGS += -DCONFIG_INDIRECT_THUNK
>  export CONFIG_INDIRECT_THUNK=y
>  endif
>  
> +# If supported by the compiler, reduce stack alignment to 8 bytes. But allow
> +# this to be overridden elsewhere.
> +$(call cc-option-add,CFLAGS-stack-boundary,CC,-mpreferred-stack-boundary=3)
> +CFLAGS += $(CFLAGS-stack-boundary)
> +
>  # Set up the assembler include path properly for older toolchains.
>  CFLAGS += -Wa,-I$(BASEDIR)/include
>  
> --- a/xen/arch/x86/efi/Makefile
> +++ b/xen/arch/x86/efi/Makefile
> @@ -5,7 +5,11 @@ CFLAGS += -fshort-wchar
>  
>  boot.init.o: buildid.o
>  
> +EFIOBJ := boot.init.o compat.o runtime.o
> +
> +$(EFIOBJ): CFLAGS-stack-boundary := -mpreferred-stack-boundary=4

From gcc's manual on -mincoming-stack-boundary:

"Thus calling a function compiled with a higher preferred stack boundary
from a function compiled with a lower preferred stack boundary most
likely misaligns the stack." 

I notice runtime.o now has stack alignment of 2^4 while the rest of xen
has 2^3.

There is at least one example (efi_get_time) that could misalign the
stack. Is that okay?

Wei.

> +
>  obj-y := stub.o
> -obj-$(XEN_BUILD_EFI) := boot.init.o compat.o relocs-dummy.o runtime.o
> +obj-$(XEN_BUILD_EFI) := $(EFIOBJ) relocs-dummy.o
>  extra-$(XEN_BUILD_EFI) += buildid.o
>  nocov-$(XEN_BUILD_EFI) += stub.o
> 
> 
> 
> 


* Re: [PATCH v5 01/13] x86: reduce general stack alignment to 8
  2018-11-29 17:44     ` Wei Liu
@ 2018-11-30  9:03       ` Jan Beulich
  2018-12-03 11:29         ` Wei Liu
  0 siblings, 1 reply; 119+ messages in thread
From: Jan Beulich @ 2018-11-30  9:03 UTC (permalink / raw)
  To: Wei Liu; +Cc: Andrew Cooper, xen-devel

>>> On 29.11.18 at 18:44, <wei.liu2@citrix.com> wrote:
> On Thu, Nov 08, 2018 at 09:05:45AM -0700, Jan Beulich wrote:
>> --- a/xen/arch/x86/efi/Makefile
>> +++ b/xen/arch/x86/efi/Makefile
>> @@ -5,7 +5,11 @@ CFLAGS += -fshort-wchar
>>  
>>  boot.init.o: buildid.o
>>  
>> +EFIOBJ := boot.init.o compat.o runtime.o
>> +
>> +$(EFIOBJ): CFLAGS-stack-boundary := -mpreferred-stack-boundary=4
> 
> From gcc's manual on -mincoming-stack-boundary:
> 
> "Thus calling a function compiled with a higher preferred stack boundary
> from a function compiled with a lower preferred stack boundary most
> likely misaligns the stack." 
> 
> I notice runtime.o now has stack alignment of 2^4 while the rest of xen
> has 2^3.
> 
> There is at least one example (efi_get_time) that could misalign the
> stack. Is that okay?

It would not be okay if the runtime call machinery didn't force
32-byte alignment of the stack. See the declaration of struct
efi_rs_state, an instance of which gets put on the stack of every
function making runtime calls. Also note how this is no different
from prior to this change, as explained by the comment in that
structure declaration, except that instead of always running on
a reliably mis-aligned stack we will now run on a mixture (hence
the code [and stack] size savings mentioned in the description).

Jan
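
(An editorial sketch of the forcing mechanism referred to here: a local
object whose alignment exceeds the preferred stack boundary makes gcc
realign the stack on function entry, whatever alignment the caller
provided. Names are hypothetical; the real structure is struct
efi_rs_state in xen/common/efi/runtime.c.)

/* Even when built with -mpreferred-stack-boundary=3, gcc emits an
 * "and $-32, %rsp" style realignment for this function, because the
 * local below demands 32-byte alignment. */
struct demo_rs_state {
    unsigned long cr3 __attribute__((aligned(32)));
};

unsigned long demo_runtime_call(unsigned long (*rt_fn)(unsigned long *))
{
    struct demo_rs_state state;

    /* Passing the address out keeps the object (and with it the
     * realignment) from being optimised away. */
    return rt_fn(&state.cr3);
}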




* Re: [PATCH v5 01/13] x86: reduce general stack alignment to 8
  2018-11-30  9:03       ` Jan Beulich
@ 2018-12-03 11:29         ` Wei Liu
  0 siblings, 0 replies; 119+ messages in thread
From: Wei Liu @ 2018-12-03 11:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel

On Fri, Nov 30, 2018 at 02:03:29AM -0700, Jan Beulich wrote:
> >>> On 29.11.18 at 18:44, <wei.liu2@citrix.com> wrote:
> > On Thu, Nov 08, 2018 at 09:05:45AM -0700, Jan Beulich wrote:
> >> --- a/xen/arch/x86/efi/Makefile
> >> +++ b/xen/arch/x86/efi/Makefile
> >> @@ -5,7 +5,11 @@ CFLAGS += -fshort-wchar
> >>  
> >>  boot.init.o: buildid.o
> >>  
> >> +EFIOBJ := boot.init.o compat.o runtime.o
> >> +
> >> +$(EFIOBJ): CFLAGS-stack-boundary := -mpreferred-stack-boundary=4
> > 
> > From gcc's manual on -mincoming-stack-boundary:
> > 
> > "Thus calling a function compiled with a higher preferred stack boundary
> > from a function compiled with a lower preferred stack boundary most
> > likely misaligns the stack." 
> > 
> > I notice runtime.o now has stack alignment of 2^4 while the rest of xen
> > has 2^3.
> > 
> > There is at least one example (efi_get_time) that could misalign the
> > stack. Is that okay?
> 
> It would not be okay if the runtime call machinery didn't force
> 32-byte alignment of the stack. See the declaration of struct
> efi_rs_state, an instance of which gets put on the stack of every
> function making runtime calls. Also note how this is no different
> from prior to this change, as explained by the comment in that
> structure declaration, except that instead of always running on
> a reliably mis-aligned stack we will now run on a mixture (hence
> the code [and stack] size savings mentioned in the description).

OK.

Reviewed-by: Wei Liu <wei.liu2@citrix.com>


* [PATCH v6 00/10] x86: indirect call overhead reduction
  2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
                   ` (10 preceding siblings ...)
  2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
@ 2018-12-05 15:54 ` Jan Beulich
  2018-12-05 16:02   ` [PATCH v6 01/10] x86: reduce general stack alignment to 8 Jan Beulich
                     ` (9 more replies)
       [not found] ` <5C07F49D0200000000101036@prv1-mh.provo.novell.com>
  12 siblings, 10 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 15:54 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

While indirect calls have always been more expensive than direct ones,
their cost has further increased with the Spectre v2 mitigations. In a
number of cases we simply pointlessly use them in the first place. In
many other cases the indirection solely exists to abstract from e.g.
vendor specific hardware details, and hence the pointers used never
change once set. Here we can use alternatives patching to get rid of
the indirection.

Further areas where indirect calls could be eliminated (and that I've put
on my todo list in case the general concept here is deemed reasonable)
are vPMU and XSM. For the latter, the Arm side would need dealing
with as well - I'm not sure whether replacing indirect calls by direct ones
is worthwhile there; if not, the wrappers would simply need to become
function invocations in the Arm case (as is already done in the IOMMU
case).

01: x86: reduce general stack alignment to 8
02: x86: clone Linux'es ASM_CALL_CONSTRAINT
03: x86: infrastructure to allow converting certain indirect calls to direct ones
04: x86/HVM: patch indirect calls through hvm_funcs to direct ones
05: x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
06: x86: patch ctxt_switch_masking() indirect call to direct one
07: x86/genapic: patch indirect calls to direct ones
08: x86/cpuidle: patch some indirect calls to direct ones
09: cpufreq: patch target() indirect call to direct one
10: IOMMU: patch certain indirect calls to direct ones

v6: Just some re-basing and minor tweaks, plus review tags.

Jan







* [PATCH v6 01/10] x86: reduce general stack alignment to 8
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
@ 2018-12-05 16:02   ` Jan Beulich
  2018-12-05 16:02   ` [PATCH v6 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

We don't need bigger alignment except when calling EFI boot or runtime
services functions (and we don't guarantee that either, as explained
close to the top of xen/common/efi/runtime.c in the struct efi_rs_state
declaration). Hence if the compiler supports reducing stack alignment
from the ABI compatible 16 bytes (gcc 7 and newer), do so wherever
possible.

The EFI case itself is largely dealt with already (actually forcing
32-byte alignment) as a result of commit f6b7fedc89 ("x86/EFI: meet
further spec requirements for runtime calls"). However, as explained in
the description of that earlier change, without using
-mincoming-stack-boundary=3 (which we don't want) we still have to make
the compiler assume 16-byte stack boundaries for CUs making EFI calls in
order to keep the compiler from aligning the stack, but then placing an
odd number of 8-byte objects on it, resulting in a mis-aligned outgoing
stack.

This as a side effect yields some code size reduction, since for a
number of sufficiently simple non-leaf functions the stack adjustment
(by 8, when there are no local stack variables at all) gets dropped
altogether. I notice exceptions though, for example in guest_cpuid(),
where in a release build gcc 8.2 now decides to set up a frame pointer
(without ever using %rbp); I consider this a compiler quirk which we
should leave to the compiler folks to address eventually.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v5: New.

--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -52,6 +52,11 @@ CFLAGS += -fno-jump-tables
 export CONFIG_INDIRECT_THUNK=y
 endif
 
+# If supported by the compiler, reduce stack alignment to 8 bytes. But allow
+# this to be overridden elsewhere.
+$(call cc-option-add,CFLAGS-stack-boundary,CC,-mpreferred-stack-boundary=3)
+CFLAGS += $(CFLAGS-stack-boundary)
+
 # Set up the assembler include path properly for older toolchains.
 CFLAGS += -Wa,-I$(BASEDIR)/include
 
--- a/xen/arch/x86/efi/Makefile
+++ b/xen/arch/x86/efi/Makefile
@@ -5,7 +5,11 @@ CFLAGS += -fshort-wchar
 
 boot.init.o: buildid.o
 
+EFIOBJ := boot.init.o compat.o runtime.o
+
+$(EFIOBJ): CFLAGS-stack-boundary := -mpreferred-stack-boundary=4
+
 obj-y := stub.o
-obj-$(XEN_BUILD_EFI) := boot.init.o compat.o relocs-dummy.o runtime.o
+obj-$(XEN_BUILD_EFI) := $(EFIOBJ) relocs-dummy.o
 extra-$(XEN_BUILD_EFI) += buildid.o
 nocov-$(XEN_BUILD_EFI) += stub.o






* [PATCH v6 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
  2018-12-05 16:02   ` [PATCH v6 01/10] x86: reduce general stack alignment to 8 Jan Beulich
@ 2018-12-05 16:02   ` Jan Beulich
  2018-12-05 16:03   ` [PATCH v6 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

While we don't mean to run their objtool over our generated code, it
still seems desirable to avoid calls to further functions before a
function's frame pointer is set up.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v6: Fix build issue with old gcc.
v5: New.

--- a/xen/arch/x86/efi/stub.c
+++ b/xen/arch/x86/efi/stub.c
@@ -2,8 +2,9 @@
 #include <xen/errno.h>
 #include <xen/init.h>
 #include <xen/lib.h>
-#include <asm/page.h>
+#include <asm/asm_defns.h>
 #include <asm/efibind.h>
+#include <asm/page.h>
 #include <efi/efidef.h>
 #include <efi/eficapsule.h>
 #include <efi/eficon.h>
@@ -34,10 +35,11 @@ void __init noreturn efi_multiboot2(EFI_
      * not be directly supported by C compiler.
      */
     asm volatile(
-    "    call *%3                     \n"
+    "    call *%[outstr]              \n"
     "0:  hlt                          \n"
     "    jmp  0b                      \n"
-       : "+c" (StdErr), "=d" (StdErr) : "1" (err), "rm" (StdErr->OutputString)
+       : "+c" (StdErr), "=d" (StdErr) ASM_CALL_CONSTRAINT
+       : "1" (err), [outstr] "rm" (StdErr->OutputString)
        : "rax", "r8", "r9", "r10", "r11", "memory");
 
     unreachable();
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -168,7 +168,7 @@ static int __init stub_selftest(void)
                        "jmp .Lret%=\n\t"
                        ".popsection\n\t"
                        _ASM_EXTABLE(.Lret%=, .Lfix%=)
-                       : [exn] "+m" (res)
+                       : [exn] "+m" (res) ASM_CALL_CONSTRAINT
                        : [stb] "r" (addr), "a" (tests[i].rax));
 
         if ( res.raw != tests[i].res.raw )
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1101,7 +1101,8 @@ static inline int mkec(uint8_t e, int32_
                    "jmp .Lret%=\n\t"                                    \
                    ".popsection\n\t"                                    \
                    _ASM_EXTABLE(.Lret%=, .Lfix%=)                       \
-                   : [exn] "+g" (stub_exn.info), constraints,           \
+                   : [exn] "+g" (stub_exn.info) ASM_CALL_CONSTRAINT,    \
+                     constraints,                                       \
                      [stub] "r" (stub.func),                            \
                      "m" (*(uint8_t(*)[MAX_INST_LEN + 1])stub.ptr) );   \
     if ( unlikely(~stub_exn.info.raw) )                                 \
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -25,6 +25,19 @@ asm ( "\t.equ CONFIG_INDIRECT_THUNK, "
 
 #ifndef __ASSEMBLY__
 void ret_from_intr(void);
+
+/*
+ * This output constraint should be used for any inline asm which has a "call"
+ * instruction.  Otherwise the asm may be inserted before the frame pointer
+ * gets set up by the containing function.
+ */
+#ifdef CONFIG_FRAME_POINTER
+register unsigned long current_stack_pointer asm("rsp");
+# define ASM_CALL_CONSTRAINT , "+r" (current_stack_pointer)
+#else
+# define ASM_CALL_CONSTRAINT
+#endif
+
 #endif
 
 #ifndef NDEBUG
--- a/xen/include/asm-x86/guest/hypercall.h
+++ b/xen/include/asm-x86/guest/hypercall.h
@@ -40,7 +40,7 @@
         long res, tmp__;                                                \
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
-            : "=a" (res), "=D" (tmp__)                                  \
+            : "=a" (res), "=D" (tmp__) ASM_CALL_CONSTRAINT              \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1))                                          \
             : "memory" );                                               \
@@ -53,6 +53,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__)                    \
+              ASM_CALL_CONSTRAINT                                       \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2))                        \
             : "memory" );                                               \
@@ -65,6 +66,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__), "=d" (tmp__)      \
+              ASM_CALL_CONSTRAINT                                       \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2)), "3" ((long)(a3))      \
             : "memory" );                                               \
@@ -78,7 +80,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__), "=d" (tmp__),     \
-              "=&r" (tmp__)                                             \
+              "=&r" (tmp__) ASM_CALL_CONSTRAINT                         \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2)), "3" ((long)(a3)),     \
               "4" (_a4)                                                 \





* [PATCH v6 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
  2018-12-05 16:02   ` [PATCH v6 01/10] x86: reduce general stack alignment to 8 Jan Beulich
  2018-12-05 16:02   ` [PATCH v6 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
@ 2018-12-05 16:03   ` Jan Beulich
  2018-12-05 16:04   ` [PATCH v6 04/10] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:03 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

In a number of cases the targets of indirect calls get determined once
at boot time. In such cases we can replace those calls with direct ones
via our alternative instruction patching mechanism.

Some of the targets (in particular the hvm_funcs ones) get established
only in pre-SMP initcalls, necessitating a second pass through the
alternative patching code. Therefore some adjustments beyond the
recognition of the new special pattern are needed there.

Note that patching such sites more than once is not supported (and the
supplied macros also don't provide any means to do so).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v6: count_va_arg() -> count_args().
v5: Use ASM_CALL_CONSTRAINT.
v4: Extend / adjust comments.
v3: Use "X" constraint instead of "g" in alternative_callN(). Pre-
    calculate values to be put into local register variables.
v2: Introduce and use count_va_arg(). Don't omit middle operand from
    ?: in ALT_CALL_ARG(). Re-base.

--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -177,9 +177,14 @@ text_poke(void *addr, const void *opcode
  * self modifying code. This implies that asymmetric systems where
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
+ *
+ * The caller will set the "force" argument to true for the final
+ * invocation, such that no CALLs/JMPs to NULL pointers will be left
+ * around. See also the further comment below.
  */
-void init_or_livepatch apply_alternatives(struct alt_instr *start,
-                                          struct alt_instr *end)
+static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
+                                                  struct alt_instr *end,
+                                                  bool force)
 {
     struct alt_instr *a, *base;
 
@@ -208,9 +213,10 @@ void init_or_livepatch apply_alternative
         /*
          * Detect sequences of alt_instr's patching the same origin site, and
          * keep base pointing at the first alt_instr entry.  This is so we can
-         * refer to a single ->priv field for patching decisions.  We
-         * deliberately use the alt_instr itself rather than a local variable
-         * in case we end up making multiple passes.
+         * refer to a single ->priv field for some of our patching decisions,
+         * in particular the NOP optimization. We deliberately use the alt_instr
+         * itself rather than a local variable in case we end up making multiple
+         * passes.
          *
          * ->priv being nonzero means that the origin site has already been
          * modified, and we shouldn't try to optimise the nops again.
@@ -218,6 +224,13 @@ void init_or_livepatch apply_alternative
         if ( ALT_ORIG_PTR(base) != orig )
             base = a;
 
+        /* Skip patch sites already handled during the first pass. */
+        if ( a->priv )
+        {
+            ASSERT(force);
+            continue;
+        }
+
         /* If there is no replacement to make, see about optimising the nops. */
         if ( !boot_cpu_has(a->cpuid) )
         {
@@ -225,7 +238,7 @@ void init_or_livepatch apply_alternative
             if ( base->priv )
                 continue;
 
-            base->priv = 1;
+            a->priv = 1;
 
             /* Nothing useful to do? */
             if ( toolchain_nops_are_ideal || a->pad_len <= 1 )
@@ -236,20 +249,74 @@ void init_or_livepatch apply_alternative
             continue;
         }
 
-        base->priv = 1;
-
         memcpy(buf, repl, a->repl_len);
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
-            *(int32_t *)(buf + 1) += repl - orig;
+        {
+            /*
+             * Detect the special case of indirect-to-direct branch patching:
+             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
+             *   checked above),
+             * - replacement's displacement is -5 (pointing back at the very
+             *   insn, which makes no sense in a real replacement insn),
+             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
+             *   using RIP-relative addressing.
+             * Some branch destinations may still be NULL when we come here
+             * the first time. Defer patching of those until the post-presmp-
+             * initcalls re-invocation (with force set to true). If at that
+             * point the branch destination is still NULL, insert "UD2; UD0"
+             * (for ease of recognition) instead of CALL/JMP.
+             */
+            if ( a->cpuid == X86_FEATURE_ALWAYS &&
+                 *(int32_t *)(buf + 1) == -5 &&
+                 a->orig_len >= 6 &&
+                 orig[0] == 0xff &&
+                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
+            {
+                long disp = *(int32_t *)(orig + 2);
+                const uint8_t *dest = *(void **)(orig + 6 + disp);
+
+                if ( dest )
+                {
+                    disp = dest - (orig + 5);
+                    ASSERT(disp == (int32_t)disp);
+                    *(int32_t *)(buf + 1) = disp;
+                }
+                else if ( force )
+                {
+                    buf[0] = 0x0f;
+                    buf[1] = 0x0b;
+                    buf[2] = 0x0f;
+                    buf[3] = 0xff;
+                    buf[4] = 0xff;
+                }
+                else
+                    continue;
+            }
+            else if ( force && system_state < SYS_STATE_active )
+                ASSERT_UNREACHABLE();
+            else
+                *(int32_t *)(buf + 1) += repl - orig;
+        }
+        else if ( force && system_state < SYS_STATE_active )
+            ASSERT_UNREACHABLE();
+
+        a->priv = 1;
 
         add_nops(buf + a->repl_len, total_len - a->repl_len);
         text_poke(orig, buf, total_len);
     }
 }
 
-static bool __initdata alt_done;
+void init_or_livepatch apply_alternatives(struct alt_instr *start,
+                                          struct alt_instr *end)
+{
+    _apply_alternatives(start, end, true);
+}
+
+static unsigned int __initdata alt_todo;
+static unsigned int __initdata alt_done;
 
 /*
  * At boot time, we patch alternatives in NMI context.  This means that the
@@ -264,7 +331,7 @@ static int __init nmi_apply_alternatives
      * More than one NMI may occur between the two set_nmi_callback() below.
      * We only need to apply alternatives once.
      */
-    if ( !alt_done )
+    if ( !(alt_done & alt_todo) )
     {
         unsigned long cr0;
 
@@ -273,11 +340,12 @@ static int __init nmi_apply_alternatives
         /* Disable WP to allow patching read-only pages. */
         write_cr0(cr0 & ~X86_CR0_WP);
 
-        apply_alternatives(__alt_instructions, __alt_instructions_end);
+        _apply_alternatives(__alt_instructions, __alt_instructions_end,
+                            alt_done);
 
         write_cr0(cr0);
 
-        alt_done = true;
+        alt_done |= alt_todo;
     }
 
     return 1;
@@ -287,13 +355,11 @@ static int __init nmi_apply_alternatives
  * This routine is called with local interrupt disabled and used during
  * bootup.
  */
-void __init alternative_instructions(void)
+static void __init _alternative_instructions(bool force)
 {
     unsigned int i;
     nmi_callback_t *saved_nmi_callback;
 
-    arch_init_ideal_nops();
-
     /*
      * Don't stop machine check exceptions while patching.
      * MCEs only happen when something got corrupted and in this
@@ -306,6 +372,10 @@ void __init alternative_instructions(voi
      */
     ASSERT(!local_irq_is_enabled());
 
+    /* Set what operation to perform /before/ setting the callback. */
+    alt_todo = 1u << force;
+    barrier();
+
     /*
      * As soon as the callback is set up, the next NMI will trigger patching,
      * even an NMI ahead of our explicit self-NMI.
@@ -321,11 +391,24 @@ void __init alternative_instructions(voi
      * cover the (hopefully never) async case, poll alt_done for up to one
      * second.
      */
-    for ( i = 0; !ACCESS_ONCE(alt_done) && i < 1000; ++i )
+    for ( i = 0; !(ACCESS_ONCE(alt_done) & alt_todo) && i < 1000; ++i )
         mdelay(1);
 
-    if ( !ACCESS_ONCE(alt_done) )
+    if ( !(ACCESS_ONCE(alt_done) & alt_todo) )
         panic("Timed out waiting for alternatives self-NMI to hit\n");
 
     set_nmi_callback(saved_nmi_callback);
 }
+
+void __init alternative_instructions(void)
+{
+    arch_init_ideal_nops();
+    _alternative_instructions(false);
+}
+
+void __init alternative_branches(void)
+{
+    local_irq_disable();
+    _alternative_instructions(true);
+    local_irq_enable();
+}
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1648,6 +1648,8 @@ void __init noreturn __start_xen(unsigne
 
     do_presmp_initcalls();
 
+    alternative_branches();
+
     /*
      * NB: when running as a PV shim VCPUOP_up/down is wired to the shim
      * physical cpu_add/remove functions, so launch the guest with only
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -4,8 +4,8 @@
 #ifdef __ASSEMBLY__
 #include <asm/alternative-asm.h>
 #else
+#include <xen/lib.h>
 #include <xen/stringify.h>
-#include <xen/types.h>
 #include <asm/asm-macros.h>
 
 struct __packed alt_instr {
@@ -26,6 +26,7 @@ extern void add_nops(void *insns, unsign
 /* Similar to alternative_instructions except it can be run with IRQs enabled. */
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
+extern void alternative_branches(void);
 
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
@@ -149,6 +150,233 @@ extern void alternative_instructions(voi
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
+/*
+ * Machinery to allow converting indirect to direct calls, when the called
+ * function is determined once at boot and later never changed.
+ */
+
+#define ALT_CALL_arg1 "rdi"
+#define ALT_CALL_arg2 "rsi"
+#define ALT_CALL_arg3 "rdx"
+#define ALT_CALL_arg4 "rcx"
+#define ALT_CALL_arg5 "r8"
+#define ALT_CALL_arg6 "r9"
+
+#define ALT_CALL_ARG(arg, n) \
+    register typeof((arg) ? (arg) : 0) a ## n ## _ \
+    asm ( ALT_CALL_arg ## n ) = (arg)
+#define ALT_CALL_NO_ARG(n) \
+    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n )
+
+#define ALT_CALL_NO_ARG6 ALT_CALL_NO_ARG(6)
+#define ALT_CALL_NO_ARG5 ALT_CALL_NO_ARG(5); ALT_CALL_NO_ARG6
+#define ALT_CALL_NO_ARG4 ALT_CALL_NO_ARG(4); ALT_CALL_NO_ARG5
+#define ALT_CALL_NO_ARG3 ALT_CALL_NO_ARG(3); ALT_CALL_NO_ARG4
+#define ALT_CALL_NO_ARG2 ALT_CALL_NO_ARG(2); ALT_CALL_NO_ARG3
+#define ALT_CALL_NO_ARG1 ALT_CALL_NO_ARG(1); ALT_CALL_NO_ARG2
+
+/*
+ * Unfortunately ALT_CALL_NO_ARG() above can't use a fake initializer (to
+ * suppress "uninitialized variable" warnings), as various versions of gcc
+ * older than 8.1 fall on the nose in various ways with that (always because
+ * of some other construct elsewhere in the same function needing to use the
+ * same hard register). Otherwise the asm() below could uniformly use "+r"
+ * output constraints, making unnecessary all these ALT_CALL<n>_OUT macros.
+ */
+#define ALT_CALL0_OUT "=r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL1_OUT "+r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL2_OUT "+r" (a1_), "+r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL3_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL4_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL5_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "=r" (a6_)
+#define ALT_CALL6_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "+r" (a6_)
+
+#define alternative_callN(n, rettype, func) ({                     \
+    rettype ret_;                                                  \
+    register unsigned long r10_ asm("r10");                        \
+    register unsigned long r11_ asm("r11");                        \
+    asm volatile (__stringify(ALTERNATIVE "call *%c[addr](%%rip)", \
+                                          "call .",                \
+                                          X86_FEATURE_ALWAYS)      \
+                  : ALT_CALL ## n ## _OUT, "=a" (ret_),            \
+                    "=r" (r10_), "=r" (r11_) ASM_CALL_CONSTRAINT   \
+                  : [addr] "i" (&(func)), "g" (func)               \
+                  : "memory" );                                    \
+    ret_;                                                          \
+})
+
+#define alternative_vcall0(func) ({             \
+    ALT_CALL_NO_ARG1;                           \
+    ((void)alternative_callN(0, int, func));    \
+})
+
+#define alternative_call0(func) ({              \
+    ALT_CALL_NO_ARG1;                           \
+    alternative_callN(0, typeof(func()), func); \
+})
+
+#define alternative_vcall1(func, arg) ({           \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    (void)sizeof(func(arg));                       \
+    (void)alternative_callN(1, int, func);         \
+})
+
+#define alternative_call1(func, arg) ({            \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    alternative_callN(1, typeof(func(arg)), func); \
+})
+
+#define alternative_vcall2(func, arg1, arg2) ({           \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    (void)sizeof(func(arg1, arg2));                       \
+    (void)alternative_callN(2, int, func);                \
+})
+
+#define alternative_call2(func, arg1, arg2) ({            \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    alternative_callN(2, typeof(func(arg1, arg2)), func); \
+})
+
+#define alternative_vcall3(func, arg1, arg2, arg3) ({    \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    (void)sizeof(func(arg1, arg2, arg3));                \
+    (void)alternative_callN(3, int, func);               \
+})
+
+#define alternative_call3(func, arg1, arg2, arg3) ({     \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    alternative_callN(3, typeof(func(arg1, arg2, arg3)), \
+                      func);                             \
+})
+
+#define alternative_vcall4(func, arg1, arg2, arg3, arg4) ({ \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    (void)sizeof(func(arg1, arg2, arg3, arg4));             \
+    (void)alternative_callN(4, int, func);                  \
+})
+
+#define alternative_call4(func, arg1, arg2, arg3, arg4) ({  \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    alternative_callN(4, typeof(func(arg1, arg2,            \
+                                     arg3, arg4)),          \
+                      func);                                \
+})
+
+#define alternative_vcall5(func, arg1, arg2, arg3, arg4, arg5) ({ \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5));             \
+    (void)alternative_callN(5, int, func);                        \
+})
+
+#define alternative_call5(func, arg1, arg2, arg3, arg4, arg5) ({  \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    alternative_callN(5, typeof(func(arg1, arg2, arg3,            \
+                                     arg4, arg5)),                \
+                      func);                                      \
+})
+
+#define alternative_vcall6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5, arg6));             \
+    (void)alternative_callN(6, int, func);                              \
+})
+
+#define alternative_call6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({  \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    alternative_callN(6, typeof(func(arg1, arg2, arg3,                  \
+                                     arg4, arg5, arg6)),                \
+                      func, ALT_CALL_OUT6);                             \
+})
+
+#define alternative_vcall__(nr) alternative_vcall ## nr
+#define alternative_call__(nr)  alternative_call ## nr
+
+#define alternative_vcall_(nr) alternative_vcall__(nr)
+#define alternative_call_(nr)  alternative_call__(nr)
+
+#define alternative_vcall(func, args...) \
+    alternative_vcall_(count_args(args))(func, ## args)
+
+#define alternative_call(func, args...) \
+    alternative_call_(count_args(args))(func, ## args)
+
 #endif /*  !__ASSEMBLY__  */
 
 #endif /* __X86_ALTERNATIVE_H__ */
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -66,6 +66,10 @@
 
 #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
 
+#define count_args_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
+#define count_args(args...) \
+    count_args_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+
 struct domain;
 
 void cmdline_parse(const char *cmdline);
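
As an aside for readers unfamiliar with the preprocessor idiom above: the
fixed 8..0 tail in count_args_() gets shifted one slot to the right per
real argument, so the ninth parameter (x) always ends up holding the
argument count, and the two-level expansion then pastes that count onto
the macro name. A stand-alone sketch (illustrative names only, reduced to
the counting/dispatch part - the real alternative_call() additionally
passes the function expression through):

#include <stdio.h>

/* Same trick as count_args(): each real argument pushes the 8..0 tail one
 * slot to the right; the leading '.' plus ", ##" keeps the empty case
 * legal. */
#define COUNT_ARGS_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
#define COUNT_ARGS(args...) \
    COUNT_ARGS_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* Two-level expansion, as in alternative_call_{,_}(), so the count gets
 * expanded before being pasted onto the name. */
#define DEMO__(nr) demo ## nr
#define DEMO_(nr)  DEMO__(nr)
#define DEMO(args...) DEMO_(COUNT_ARGS(args))(args)

static void demo1(int a)        { printf("demo1(%d)\n", a); }
static void demo2(int a, int b) { printf("demo2(%d, %d)\n", a, b); }

int main(void)
{
    DEMO(1);    /* expands to demo1(1) */
    DEMO(1, 2); /* expands to demo2(1, 2) */
    return 0;
}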





* [PATCH v6 04/10] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
                     ` (2 preceding siblings ...)
  2018-12-05 16:03   ` [PATCH v6 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2018-12-05 16:04   ` Jan Beulich
  2018-12-05 16:05   ` [PATCH v6 05/10] x86/HVM: patch vINTR " Jan Beulich
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:04 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

This is intentionally not touching hooks used rarely (or not at all)
during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
as well as nested, VM event, and altp2m ones (they can all be done
later, if so desired). Virtual Interrupt delivery ones will be dealt
with in a subsequent patch.
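
As a reading aid for the conversions below: alternative_vcall() is used
where the hook's return value is unused (or void), alternative_call()
where the result is consumed. A minimal stand-alone sketch of the
distinction - the hook table and names are made up, and the two macros
are modeled here by their non-patching fallback, i.e. a plain indirect
call:

#include <stdint.h>

/* Stand-ins so this compiles outside Xen; semantically these are just
 * the indirect calls which the real macros patch into direct ones. */
#define alternative_call(func, args...)  (func)(args)
#define alternative_vcall(func, args...) (func)(args)

/* Hypothetical hook table, set up once at boot and never changed. */
static struct {
    void (*wbinvd_intercept)(void);
    int  (*msr_read_intercept)(unsigned int msr, uint64_t *val);
} demo_funcs;

int demo(unsigned int msr, uint64_t *val)
{
    /* No (used) return value: alternative_vcall(). */
    alternative_vcall(demo_funcs.wbinvd_intercept);

    /* Result consumed: alternative_call(). */
    return alternative_call(demo_funcs.msr_read_intercept, msr, val);
}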

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v6: Re-base.
v3: Re-base.
v2: Drop open-coded numbers from macro invocations. Re-base.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2128,7 +2128,7 @@ static int hvmemul_write_msr(
 static int hvmemul_wbinvd(
     struct x86_emulate_ctxt *ctxt)
 {
-    hvm_funcs.wbinvd_intercept();
+    alternative_vcall(hvm_funcs.wbinvd_intercept);
     return X86EMUL_OKAY;
 }
 
@@ -2146,7 +2146,7 @@ static int hvmemul_get_fpu(
     struct vcpu *curr = current;
 
     if ( !curr->fpu_dirtied )
-        hvm_funcs.fpu_dirty_intercept();
+        alternative_vcall(hvm_funcs.fpu_dirty_intercept);
     else if ( type == X86EMUL_FPU_fpu )
     {
         const typeof(curr->arch.xsave_area->fpu_sse) *fpu_ctxt =
@@ -2263,7 +2263,7 @@ static void hvmemul_put_fpu(
         {
             curr->fpu_dirtied = false;
             stts();
-            hvm_funcs.fpu_leave(curr);
+            alternative_vcall(hvm_funcs.fpu_leave, curr);
         }
     }
 }
@@ -2425,7 +2425,8 @@ static int _hvm_emulate_one(struct hvm_e
     if ( hvmemul_ctxt->intr_shadow != new_intr_shadow )
     {
         hvmemul_ctxt->intr_shadow = new_intr_shadow;
-        hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
+        alternative_vcall(hvm_funcs.set_interrupt_shadow,
+                          curr, new_intr_shadow);
     }
 
     if ( hvmemul_ctxt->ctxt.retire.hlt &&
@@ -2562,7 +2563,8 @@ void hvm_emulate_init_once(
 
     memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
 
-    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
+    hvmemul_ctxt->intr_shadow =
+        alternative_call(hvm_funcs.get_interrupt_shadow, curr);
     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -272,12 +272,12 @@ void hvm_set_rdtsc_exiting(struct domain
     struct vcpu *v;
 
     for_each_vcpu ( d, v )
-        hvm_funcs.set_rdtsc_exiting(v, enable);
+        alternative_vcall(hvm_funcs.set_rdtsc_exiting, v, enable);
 }
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
 {
-    if ( !hvm_funcs.get_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.get_guest_pat, v, guest_pat) )
         *guest_pat = v->arch.hvm.pat_cr;
 }
 
@@ -302,7 +302,7 @@ int hvm_set_guest_pat(struct vcpu *v, u6
             return 0;
         }
 
-    if ( !hvm_funcs.set_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.set_guest_pat, v, guest_pat) )
         v->arch.hvm.pat_cr = guest_pat;
 
     return 1;
@@ -342,7 +342,7 @@ bool hvm_set_guest_bndcfgs(struct vcpu *
             /* nothing, best effort only */;
     }
 
-    return hvm_funcs.set_guest_bndcfgs(v, val);
+    return alternative_call(hvm_funcs.set_guest_bndcfgs, v, val);
 }
 
 /*
@@ -506,7 +506,8 @@ void hvm_migrate_pirqs(struct vcpu *v)
 static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
 {
     info->cr2 = v->arch.hvm.guest_cr[2];
-    return hvm_funcs.get_pending_event(v, info);
+
+    return alternative_call(hvm_funcs.get_pending_event, v, info);
 }
 
 void hvm_do_resume(struct vcpu *v)
@@ -1673,7 +1674,7 @@ void hvm_inject_event(const struct x86_e
         }
     }
 
-    hvm_funcs.inject_event(event);
+    alternative_vcall(hvm_funcs.inject_event, event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -2261,7 +2262,7 @@ int hvm_set_cr0(unsigned long value, boo
          (!rangeset_is_empty(d->iomem_caps) ||
           !rangeset_is_empty(d->arch.ioport_caps) ||
           has_arch_pdevs(d)) )
-        hvm_funcs.handle_cd(v, value);
+        alternative_vcall(hvm_funcs.handle_cd, v, value);
 
     hvm_update_cr(v, 0, value);
 
@@ -3495,7 +3496,8 @@ int hvm_msr_read_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_read_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_read_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3655,7 +3657,8 @@ int hvm_msr_write_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_write_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_write_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3847,7 +3850,7 @@ void hvm_hypercall_page_initialise(struc
                                    void *hypercall_page)
 {
     hvm_latch_shinfo_size(d);
-    hvm_funcs.init_hypercall_page(d, hypercall_page);
+    alternative_vcall(hvm_funcs.init_hypercall_page, d, hypercall_page);
 }
 
 void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
@@ -5053,7 +5056,7 @@ void hvm_domain_soft_reset(struct domain
 void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
                               struct segment_register *reg)
 {
-    hvm_funcs.get_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.get_segment_register, v, seg, reg);
 
     switch ( seg )
     {
@@ -5199,7 +5202,7 @@ void hvm_set_segment_register(struct vcp
         return;
     }
 
-    hvm_funcs.set_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.set_segment_register, v, seg, reg);
 }
 
 /*
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -389,42 +389,42 @@ static inline int
 hvm_guest_x86_mode(struct vcpu *v)
 {
     ASSERT(v == current);
-    return hvm_funcs.guest_x86_mode(v);
+    return alternative_call(hvm_funcs.guest_x86_mode, v);
 }
 
 static inline void
 hvm_update_host_cr3(struct vcpu *v)
 {
     if ( hvm_funcs.update_host_cr3 )
-        hvm_funcs.update_host_cr3(v);
+        alternative_vcall(hvm_funcs.update_host_cr3, v);
 }
 
 static inline void hvm_update_guest_cr(struct vcpu *v, unsigned int cr)
 {
-    hvm_funcs.update_guest_cr(v, cr, 0);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, cr, 0);
 }
 
 static inline void hvm_update_guest_cr3(struct vcpu *v, bool noflush)
 {
     unsigned int flags = noflush ? HVM_UPDATE_GUEST_CR3_NOFLUSH : 0;
 
-    hvm_funcs.update_guest_cr(v, 3, flags);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, 3, flags);
 }
 
 static inline void hvm_update_guest_efer(struct vcpu *v)
 {
-    hvm_funcs.update_guest_efer(v);
+    alternative_vcall(hvm_funcs.update_guest_efer, v);
 }
 
 static inline void hvm_cpuid_policy_changed(struct vcpu *v)
 {
-    hvm_funcs.cpuid_policy_changed(v);
+    alternative_vcall(hvm_funcs.cpuid_policy_changed, v);
 }
 
 static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset,
                                       uint64_t at_tsc)
 {
-    hvm_funcs.set_tsc_offset(v, offset, at_tsc);
+    alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);
 }
 
 /*
@@ -441,18 +441,18 @@ static inline void hvm_flush_guest_tlbs(
 static inline unsigned int
 hvm_get_cpl(struct vcpu *v)
 {
-    return hvm_funcs.get_cpl(v);
+    return alternative_call(hvm_funcs.get_cpl, v);
 }
 
 static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 {
-    return hvm_funcs.get_shadow_gs_base(v);
+    return alternative_call(hvm_funcs.get_shadow_gs_base, v);
 }
 
 static inline bool hvm_get_guest_bndcfgs(struct vcpu *v, u64 *val)
 {
     return hvm_funcs.get_guest_bndcfgs &&
-           hvm_funcs.get_guest_bndcfgs(v, val);
+           alternative_call(hvm_funcs.get_guest_bndcfgs, v, val);
 }
 
 #define has_hvm_params(d) \
@@ -509,12 +509,12 @@ static inline void hvm_inject_page_fault
 
 static inline bool hvm_event_pending(const struct vcpu *v)
 {
-    return hvm_funcs.event_pending(v);
+    return alternative_call(hvm_funcs.event_pending, v);
 }
 
 static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
 {
-    hvm_funcs.invlpg(v, linear);
+    alternative_vcall(hvm_funcs.invlpg, v, linear);
 }
 
 /* These bits in CR4 are owned by the host. */
@@ -539,13 +539,14 @@ static inline void hvm_cpu_down(void)
 
 static inline unsigned int hvm_get_insn_bytes(struct vcpu *v, uint8_t *buf)
 {
-    return (hvm_funcs.get_insn_bytes ? hvm_funcs.get_insn_bytes(v, buf) : 0);
+    return (hvm_funcs.get_insn_bytes
+            ? alternative_call(hvm_funcs.get_insn_bytes, v, buf) : 0);
 }
 
 static inline void hvm_set_info_guest(struct vcpu *v)
 {
     if ( hvm_funcs.set_info_guest )
-        return hvm_funcs.set_info_guest(v);
+        alternative_vcall(hvm_funcs.set_info_guest, v);
 }
 
 static inline void hvm_invalidate_regs_fields(struct cpu_user_regs *regs)





* [PATCH v6 05/10] x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
                     ` (3 preceding siblings ...)
  2018-12-05 16:04   ` [PATCH v6 04/10] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
@ 2018-12-05 16:05   ` Jan Beulich
  2018-12-05 16:06   ` [PATCH v6 06/10] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

While not strictly necessary, change the VMX initialization logic to
update the function table in start_vmx() from NULL rather than to NULL,
to make it more obvious that we won't ever change an already (explicitly)
initialized function pointer.
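
In generic form the resulting idiom looks like the sketch below
(hypothetical names; compare the vmx.c hunk further down): conditional
hooks stay NULL in the static initializer and get assigned exactly once,
and only when the feature is actually usable. This also lines up with the
patching infrastructure, which defers sites with still-NULL targets to
the final patching pass.

#include <stdbool.h>

static void do_handle_eoi(unsigned int vec) { (void)vec; }

static struct {
    void (*handle_eoi)(unsigned int vec); /* starts out NULL */
    bool virtual_intr_delivery_enabled;
} demo_table;

void demo_start(bool cpu_has_feature)
{
    if ( cpu_has_feature )
    {
        /* Filled in from NULL, never overwritten once set. */
        demo_table.handle_eoi = do_handle_eoi;
        demo_table.virtual_intr_delivery_enabled = true;
    }
}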

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v5: Fix indentation.
v4: Re-base.
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -111,10 +111,15 @@ static void vlapic_clear_irr(int vector,
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_IRR]);
 }
 
-static int vlapic_find_highest_irr(struct vlapic *vlapic)
+static void sync_pir_to_irr(struct vcpu *v)
 {
     if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(vlapic_vcpu(vlapic));
+        alternative_vcall(hvm_funcs.sync_pir_to_irr, v);
+}
+
+static int vlapic_find_highest_irr(struct vlapic *vlapic)
+{
+    sync_pir_to_irr(vlapic_vcpu(vlapic));
 
     return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
 }
@@ -143,7 +148,7 @@ bool vlapic_test_irq(const struct vlapic
         return false;
 
     if ( hvm_funcs.test_pir &&
-         hvm_funcs.test_pir(const_vlapic_vcpu(vlapic), vec) )
+         alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) )
         return true;
 
     return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
@@ -165,10 +170,10 @@ void vlapic_set_irq(struct vlapic *vlapi
         vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
 
     if ( hvm_funcs.update_eoi_exit_bitmap )
-        hvm_funcs.update_eoi_exit_bitmap(target, vec, trig);
+        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
 
     if ( hvm_funcs.deliver_posted_intr )
-        hvm_funcs.deliver_posted_intr(target, vec);
+        alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
     else if ( !vlapic_test_and_set_irr(vec, vlapic) )
         vcpu_kick(target);
 }
@@ -448,7 +453,8 @@ void vlapic_EOI_set(struct vlapic *vlapi
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]);
 
     if ( hvm_funcs.handle_eoi )
-        hvm_funcs.handle_eoi(vector, vlapic_find_highest_isr(vlapic));
+        alternative_vcall(hvm_funcs.handle_eoi, vector,
+                          vlapic_find_highest_isr(vlapic));
 
     vlapic_handle_EOI(vlapic, vector);
 
@@ -1412,8 +1418,7 @@ static int lapic_save_regs(struct vcpu *
     if ( !has_vlapic(v->domain) )
         return 0;
 
-    if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(v);
+    sync_pir_to_irr(v);
 
     return hvm_save_entry(LAPIC_REGS, v->vcpu_id, h, vcpu_vlapic(v)->regs);
 }
@@ -1509,7 +1514,8 @@ static int lapic_load_regs(struct domain
         lapic_load_fixup(s);
 
     if ( hvm_funcs.process_isr )
-        hvm_funcs.process_isr(vlapic_find_highest_isr(s), v);
+        alternative_vcall(hvm_funcs.process_isr,
+                          vlapic_find_highest_isr(s), v);
 
     vlapic_adjust_i8259_target(d);
     lapic_rearm(s);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2340,12 +2340,6 @@ static struct hvm_function_table __initd
     .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
-    .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
-    .process_isr          = vmx_process_isr,
-    .deliver_posted_intr  = vmx_deliver_posted_intr,
-    .sync_pir_to_irr      = vmx_sync_pir_to_irr,
-    .test_pir             = vmx_test_pir,
-    .handle_eoi           = vmx_handle_eoi,
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .enable_msr_interception = vmx_enable_msr_interception,
     .is_singlestep_supported = vmx_is_singlestep_supported,
@@ -2473,26 +2467,23 @@ const struct hvm_function_table * __init
         setup_ept_dump();
     }
 
-    if ( !cpu_has_vmx_virtual_intr_delivery )
+    if ( cpu_has_vmx_virtual_intr_delivery )
     {
-        vmx_function_table.update_eoi_exit_bitmap = NULL;
-        vmx_function_table.process_isr = NULL;
-        vmx_function_table.handle_eoi = NULL;
-    }
-    else
+        vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
+        vmx_function_table.process_isr = vmx_process_isr;
+        vmx_function_table.handle_eoi = vmx_handle_eoi;
         vmx_function_table.virtual_intr_delivery_enabled = true;
+    }
 
     if ( cpu_has_vmx_posted_intr_processing )
     {
         alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
         if ( iommu_intpost )
             alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
-    }
-    else
-    {
-        vmx_function_table.deliver_posted_intr = NULL;
-        vmx_function_table.sync_pir_to_irr = NULL;
-        vmx_function_table.test_pir = NULL;
+
+        vmx_function_table.deliver_posted_intr = vmx_deliver_posted_intr;
+        vmx_function_table.sync_pir_to_irr     = vmx_sync_pir_to_irr;
+        vmx_function_table.test_pir            = vmx_test_pir;
     }
 
     if ( cpu_has_vmx_tsc_scaling )





* [PATCH v6 06/10] x86: patch ctxt_switch_masking() indirect call to direct one
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
                     ` (4 preceding siblings ...)
  2018-12-05 16:05   ` [PATCH v6 05/10] x86/HVM: patch vINTR " Jan Beulich
@ 2018-12-05 16:06   ` Jan Beulich
  2018-12-05 16:06   ` [PATCH v6 07/10] x86/genapic: patch indirect calls to direct ones Jan Beulich
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded number from macro invocation.

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -185,7 +185,7 @@ void ctxt_switch_levelling(const struct
 	}
 
 	if (ctxt_switch_masking)
-		ctxt_switch_masking(next);
+		alternative_vcall(ctxt_switch_masking, next);
 }
 
 bool_t opt_cpu_info;




* [PATCH v6 07/10] x86/genapic: patch indirect calls to direct ones
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
                     ` (5 preceding siblings ...)
  2018-12-05 16:06   ` [PATCH v6 06/10] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
@ 2018-12-05 16:06   ` Jan Beulich
  2018-12-05 16:07   ` [PATCH v6 08/10] x86/cpuidle: patch some " Jan Beulich
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

For (I hope) obvious reasons, only the ones used at runtime get
converted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -29,12 +29,12 @@
 
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
-    genapic.send_IPI_mask(mask, vector);
+    alternative_vcall(genapic.send_IPI_mask, mask, vector);
 }
 
 void send_IPI_self(int vector)
 {
-    genapic.send_IPI_self(vector);
+    alternative_vcall(genapic.send_IPI_self, vector);
 }
 
 /*
--- a/xen/include/asm-x86/mach-generic/mach_apic.h
+++ b/xen/include/asm-x86/mach-generic/mach_apic.h
@@ -15,8 +15,18 @@
 #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
 #define init_apic_ldr (genapic.init_apic_ldr)
 #define clustered_apic_check (genapic.clustered_apic_check)
-#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
-#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
+#define cpu_mask_to_apicid(mask) ({ \
+	/* \
+	 * There are a number of places where the address of a local variable \
+	 * gets passed here. The use of ?: in alternative_call<N>() triggers an \
+	 * "address of ... is always true" warning in such a case with at least \
+	 * gcc 7 and 8. Hence the seemingly pointless local variable here. \
+	 */ \
+	const cpumask_t *m_ = (mask); \
+	alternative_call(genapic.cpu_mask_to_apicid, m_); \
+})
+#define vector_allocation_cpumask(cpu) \
+	alternative_call(genapic.vector_allocation_cpumask, cpu)
 
 static inline void enable_apic_mode(void)
 {
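
The warning the new comment refers to can be reproduced outside Xen; a
minimal sketch (gcc with -Waddress; ARG_TYPE stands in for the
typeof((arg) ? (arg) : 0) construct used by ALT_CALL_ARG()):

/* Stand-in for the ?: construct inside ALT_CALL_ARG(). */
#define ARG_TYPE(arg) typeof((arg) ? (arg) : 0)

void demo(void)
{
    int local;

    ARG_TYPE(&local) p1 = &local; /* "address of 'local' ... always true" */

    const int *m_ = &local;       /* the seemingly pointless variable */
    ARG_TYPE(m_) p2 = m_;         /* no warning */

    (void)p1; (void)p2;
}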






* [PATCH v6 08/10] x86/cpuidle: patch some indirect calls to direct ones
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
                     ` (6 preceding siblings ...)
  2018-12-05 16:06   ` [PATCH v6 07/10] x86/genapic: patch indirect calls to direct ones Jan Beulich
@ 2018-12-05 16:07   ` Jan Beulich
  2018-12-05 16:07   ` [PATCH v6 09/10] cpufreq: patch target() indirect call to direct one Jan Beulich
  2018-12-05 16:08   ` [PATCH v6 10/10] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

For now, only the ones used when entering/exiting idle states are
converted. Additionally, pm_idle{,_save} and lapic_timer_{on,off} can't
be converted, as they may get established rather late (when Dom0 is
already active).

Note that for patching to be deferred until after the pre-SMP initcalls
(from where cpuidle_init_cpu() runs the first time) the pointers need to
start out as NULL.
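
Put differently, the ordering this depends on is roughly the following
(compile-only sketch; the last call is what the infrastructure patch adds
to __start_xen()). Had the pointers kept their previous static
initializers, the first pass would already have patched the call sites to
the ACPI PM variants, and the later switch to the TSC-based ones would no
longer take effect.

extern void alternative_instructions(void);
extern void do_presmp_initcalls(void);
extern void alternative_branches(void);

void boot_order_sketch(void)
{
    alternative_instructions(); /* pass 1: call sites whose target is
                                 * still NULL are deliberately left alone */
    do_presmp_initcalls();      /* cpuidle_init_cpu() sets cpuidle_get_tick,
                                 * tick_to_ns and ticks_elapsed exactly once */
    alternative_branches();     /* forced pass 2: the deferred sites now get
                                 * their direct CALLs (UD2 if still NULL) */
}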

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -102,8 +102,6 @@ bool lapic_timer_init(void)
     return true;
 }
 
-static uint64_t (*__read_mostly tick_to_ns)(uint64_t) = acpi_pm_tick_to_ns;
-
 void (*__read_mostly pm_idle_save)(void);
 unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER - 1;
 integer_param("max_cstate", max_cstate);
@@ -289,9 +287,9 @@ static uint64_t acpi_pm_ticks_elapsed(ui
         return ((0xFFFFFFFF - t1) + t2 +1);
 }
 
-uint64_t (*__read_mostly cpuidle_get_tick)(void) = get_acpi_pm_tick;
-static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t)
-    = acpi_pm_ticks_elapsed;
+uint64_t (*__read_mostly cpuidle_get_tick)(void);
+static uint64_t (*__read_mostly tick_to_ns)(uint64_t);
+static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t);
 
 static void print_acpi_power(uint32_t cpu, struct acpi_processor_power *power)
 {
@@ -547,7 +545,7 @@ void update_idle_stats(struct acpi_proce
                        struct acpi_processor_cx *cx,
                        uint64_t before, uint64_t after)
 {
-    int64_t sleep_ticks = ticks_elapsed(before, after);
+    int64_t sleep_ticks = alternative_call(ticks_elapsed, before, after);
     /* Interrupts are disabled */
 
     spin_lock(&power->stat_lock);
@@ -555,7 +553,8 @@ void update_idle_stats(struct acpi_proce
     cx->usage++;
     if ( sleep_ticks > 0 )
     {
-        power->last_residency = tick_to_ns(sleep_ticks) / 1000UL;
+        power->last_residency = alternative_call(tick_to_ns, sleep_ticks) /
+                                1000UL;
         cx->time += sleep_ticks;
     }
     power->last_state = &power->states[0];
@@ -635,7 +634,7 @@ static void acpi_processor_idle(void)
         if ( cx->type == ACPI_STATE_C1 || local_apic_timer_c2_ok )
         {
             /* Get start time (ticks) */
-            t1 = cpuidle_get_tick();
+            t1 = alternative_call(cpuidle_get_tick);
             /* Trace cpu idle entry */
             TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -644,7 +643,7 @@ static void acpi_processor_idle(void)
             /* Invoke C2 */
             acpi_idle_do_entry(cx);
             /* Get end time (ticks) */
-            t2 = cpuidle_get_tick();
+            t2 = alternative_call(cpuidle_get_tick);
             trace_exit_reason(irq_traced);
             /* Trace cpu idle exit */
             TRACE_6D(TRC_PM_IDLE_EXIT, cx->idx, t2,
@@ -666,7 +665,7 @@ static void acpi_processor_idle(void)
         lapic_timer_off();
 
         /* Get start time (ticks) */
-        t1 = cpuidle_get_tick();
+        t1 = alternative_call(cpuidle_get_tick);
         /* Trace cpu idle entry */
         TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -717,7 +716,7 @@ static void acpi_processor_idle(void)
         }
 
         /* Get end time (ticks) */
-        t2 = cpuidle_get_tick();
+        t2 = alternative_call(cpuidle_get_tick);
 
         /* recovering TSC */
         cstate_restore_tsc();
@@ -827,11 +826,20 @@ int cpuidle_init_cpu(unsigned int cpu)
     {
         unsigned int i;
 
-        if ( cpu == 0 && boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+        if ( cpu == 0 && system_state < SYS_STATE_active )
         {
-            cpuidle_get_tick = get_stime_tick;
-            ticks_elapsed = stime_ticks_elapsed;
-            tick_to_ns = stime_tick_to_ns;
+            if ( boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+            {
+                cpuidle_get_tick = get_stime_tick;
+                ticks_elapsed = stime_ticks_elapsed;
+                tick_to_ns = stime_tick_to_ns;
+            }
+            else
+            {
+                cpuidle_get_tick = get_acpi_pm_tick;
+                ticks_elapsed = acpi_pm_ticks_elapsed;
+                tick_to_ns = acpi_pm_tick_to_ns;
+            }
         }
 
         acpi_power = xzalloc(struct acpi_processor_power);
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -778,7 +778,7 @@ static void mwait_idle(void)
 	if (!(lapic_timer_reliable_states & (1 << cstate)))
 		lapic_timer_off();
 
-	before = cpuidle_get_tick();
+	before = alternative_call(cpuidle_get_tick);
 	TRACE_4D(TRC_PM_IDLE_ENTRY, cx->type, before, exp, pred);
 
 	update_last_cx_stat(power, cx, before);
@@ -786,7 +786,7 @@ static void mwait_idle(void)
 	if (cpu_is_haltable(cpu))
 		mwait_idle_with_hints(eax, MWAIT_ECX_INTERRUPT_BREAK);
 
-	after = cpuidle_get_tick();
+	after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();
 	trace_exit_reason(irq_traced);





* [PATCH v6 09/10] cpufreq: patch target() indirect call to direct one
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
                     ` (7 preceding siblings ...)
  2018-12-05 16:07   ` [PATCH v6 08/10] x86/cpuidle: patch some " Jan Beulich
@ 2018-12-05 16:07   ` Jan Beulich
  2018-12-05 16:08   ` [PATCH v6 10/10] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

This looks to be the only frequently executed hook; don't bother
patching any other ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: New.

--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -364,7 +364,8 @@ int __cpufreq_driver_target(struct cpufr
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver.target(policy, target_freq, relation);
+        retval = alternative_call(cpufreq_driver.target,
+                                  policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }




* [PATCH v6 10/10] IOMMU: patch certain indirect calls to direct ones
  2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
                     ` (8 preceding siblings ...)
  2018-12-05 16:07   ` [PATCH v6 09/10] cpufreq: patch target() indirect call to direct one Jan Beulich
@ 2018-12-05 16:08   ` Jan Beulich
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2018-12-05 16:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

This is intentionally not touching hooks used rarely (or not at all)
during the lifetime of a VM, unless perhaps sitting on an error path
next to a call which gets changed (in which case I think the error
path better remains consistent with the respective main path).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v6: Re-base.
v5: Re-base over type-safe changes and dropped IOMMU_MIXED patch. Also
    patch the new lookup_page() hook.
v4: New.

--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -226,8 +226,8 @@ void __hwdom_init iommu_hwdom_init(struc
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, _dfn(dfn), _mfn(mfn),
-                                             mapping);
+            ret = iommu_call(hd->platform_ops, map_page,
+                             d, _dfn(dfn), _mfn(mfn), mapping);
             if ( !rc )
                 rc = ret;
 
@@ -319,8 +319,8 @@ int iommu_map(struct domain *d, dfn_t df
 
     for ( i = 0; i < (1ul << page_order); i++ )
     {
-        rc = hd->platform_ops->map_page(d, dfn_add(dfn, i),
-                                        mfn_add(mfn, i), flags);
+        rc = iommu_call(hd->platform_ops, map_page, d, dfn_add(dfn, i),
+                        mfn_add(mfn, i), flags);
 
         if ( likely(!rc) )
             continue;
@@ -333,7 +333,7 @@ int iommu_map(struct domain *d, dfn_t df
 
         while ( i-- )
             /* if statement to satisfy __must_check */
-            if ( hd->platform_ops->unmap_page(d, dfn_add(dfn, i)) )
+            if ( iommu_call(hd->platform_ops, unmap_page, d, dfn_add(dfn, i)) )
                 continue;
 
         if ( !is_hardware_domain(d) )
@@ -358,7 +358,7 @@ int iommu_unmap(struct domain *d, dfn_t
 
     for ( i = 0; i < (1ul << page_order); i++ )
     {
-        int err = hd->platform_ops->unmap_page(d, dfn_add(dfn, i));
+        int err = iommu_call(hd->platform_ops, unmap_page, d, dfn_add(dfn, i));
 
         if ( likely(!err) )
             continue;
@@ -389,7 +389,7 @@ int iommu_lookup_page(struct domain *d,
     if ( !iommu_enabled || !hd->platform_ops )
         return -EOPNOTSUPP;
 
-    return hd->platform_ops->lookup_page(d, dfn, mfn, flags);
+    return iommu_call(hd->platform_ops, lookup_page, d, dfn, mfn, flags);
 }
 
 static void iommu_free_pagetables(unsigned long unused)
@@ -402,7 +402,7 @@ static void iommu_free_pagetables(unsign
         spin_unlock(&iommu_pt_cleanup_lock);
         if ( !pg )
             return;
-        iommu_get_ops()->free_page_table(pg);
+        iommu_vcall(iommu_get_ops(), free_page_table, pg);
     } while ( !softirq_pending(smp_processor_id()) );
 
     tasklet_schedule_on_cpu(&iommu_pt_cleanup_tasklet,
@@ -417,7 +417,7 @@ int iommu_iotlb_flush(struct domain *d,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count);
+    rc = iommu_call(hd->platform_ops, iotlb_flush, d, dfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -440,7 +440,7 @@ int iommu_iotlb_flush_all(struct domain
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush_all(d);
+    rc = iommu_call(hd->platform_ops, iotlb_flush_all, d);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1349,14 +1349,14 @@ int iommu_update_ire_from_msi(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     return iommu_intremap
-           ? iommu_get_ops()->update_ire_from_msi(msi_desc, msg) : 0;
+           ? iommu_call(&iommu_ops, update_ire_from_msi, msi_desc, msg) : 0;
 }
 
 void iommu_read_msi_from_ire(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     if ( iommu_intremap )
-        iommu_get_ops()->read_msi_from_ire(msi_desc, msg);
+        iommu_vcall(&iommu_ops, read_msi_from_ire, msi_desc, msg);
 }
 
 static int iommu_add_device(struct pci_dev *pdev)
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -28,14 +28,12 @@ struct iommu_ops iommu_ops;
 void iommu_update_ire_from_apic(
     unsigned int apic, unsigned int reg, unsigned int value)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    ops->update_ire_from_apic(apic, reg, value);
+    iommu_vcall(&iommu_ops, update_ire_from_apic, apic, reg, value);
 }
 
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    return ops->read_apic_from_ire(apic, reg);
+    return iommu_call(&iommu_ops, read_apic_from_ire, apic, reg);
 }
 
 int __init iommu_setup_hpet_msi(struct msi_desc *msi)
@@ -46,7 +44,6 @@ int __init iommu_setup_hpet_msi(struct m
 
 int arch_iommu_populate_page_table(struct domain *d)
 {
-    const struct domain_iommu *hd = dom_iommu(d);
     struct page_info *page;
     int rc = 0, n = 0;
 
@@ -68,9 +65,8 @@ int arch_iommu_populate_page_table(struc
             {
                 ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
                 BUG_ON(SHARED_M2P(gfn));
-                rc = hd->platform_ops->map_page(d, _dfn(gfn), _mfn(mfn),
-                                                IOMMUF_readable |
-                                                IOMMUF_writable);
+                rc = iommu_call(&iommu_ops, map_page, d, _dfn(gfn), _mfn(mfn),
+                                IOMMUF_readable | IOMMUF_writable);
             }
             if ( rc )
             {
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -62,6 +62,12 @@ int amd_iov_detect(void);
 
 extern struct iommu_ops iommu_ops;
 
+#ifdef NDEBUG
+# include <asm/alternative.h>
+# define iommu_call(ops, fn, args...)  alternative_call(iommu_ops.fn, ## args)
+# define iommu_vcall(ops, fn, args...) alternative_vcall(iommu_ops.fn, ## args)
+#endif
+
 static inline const struct iommu_ops *iommu_get_ops(void)
 {
     BUG_ON(!iommu_ops.init);
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -196,6 +196,11 @@ struct iommu_ops {
 
 #include <asm/iommu.h>
 
+#ifndef iommu_call
+# define iommu_call(ops, fn, args...) ((ops)->fn(args))
+# define iommu_vcall iommu_call
+#endif
+
 enum iommu_status
 {
     IOMMU_STATUS_disabled,
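
To make the build dependency of the last two hunks concrete: only release
(NDEBUG) x86 builds route through the patching machinery; everything else
keeps the plain indirect call. A stand-alone model of the same shape
(generic names; the "patched" branch is approximated by a direct use of
the single global ops instance, which is also why the real macro can
afford to ignore its ops argument):

struct demo_ops {
    int (*unmap)(unsigned long dfn);
};

static struct demo_ops demo_global_ops;

#ifdef NDEBUG
/* Release: the real code uses alternative_call(iommu_ops.fn, ...) here. */
# define demo_call(ops, fn, args...) demo_global_ops.fn(args)
#else
/* Debug builds (and other architectures): unchanged indirect call. */
# define demo_call(ops, fn, args...) ((ops)->fn(args))
#endif

static int do_unmap(unsigned long dfn) { (void)dfn; return 0; }

int main(void)
{
    demo_global_ops.unmap = do_unmap;

    return demo_call(&demo_global_ops, unmap, 0x1000UL);
}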





* [PATCH v7 00/10] x86: indirect call overhead reduction
       [not found]   ` <5C07F49D020000780021DC1A@prv1-mh.provo.novell.com>
@ 2019-03-12 13:59     ` Jan Beulich
  2019-03-12 14:03       ` [PATCH v7 01/10] x86: reduce general stack alignment to 8 Jan Beulich
                         ` (9 more replies)
  0 siblings, 10 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 13:59 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

While indirect calls have always been more expensive than direct ones,
their cost has further increased with the Spectre v2 mitigations. In a
number of cases we simply pointlessly use them in the first place. In
many other cases the indirection solely exists to abstract from e.g.
vendor specific hardware details, and hence the pointers used never
change once set. Here we can use alternatives patching to get rid of
the indirection.

Further areas where indirect calls could be eliminated (and that I've put
on my todo list in case the general concept here is deemed reasonable)
are vPMU and XSM. For the latter, the Arm side would need to be dealt
with as well - I'm not sure whether replacing indirect calls by direct ones
is worthwhile there; if not, the wrappers would simply need to become
function invocations in the Arm case (as is already done in the IOMMU
case).

01: x86: reduce general stack alignment to 8
02: x86: clone Linux'es ASM_CALL_CONSTRAINT
03: x86: infrastructure to allow converting certain indirect calls to direct ones
04: x86/HVM: patch indirect calls through hvm_funcs to direct ones
05: x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
06: x86: patch ctxt_switch_masking() indirect call to direct one
07: x86/genapic: patch indirect calls to direct ones
08: x86/cpuidle: patch some indirect calls to direct ones
09: cpufreq: patch target() indirect call to direct one
10: IOMMU: patch certain indirect calls to direct ones

v7: Just some re-basing and a minor tweak (see patches 3 and 10).

Given how long this has been pending, I'm intending to commit this
(if necessary without any further tags) as soon as staging is fully open
again, unless (of course) I hear back otherwise by that time.

Jan




* [PATCH v7 01/10] x86: reduce general stack alignment to 8
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
@ 2019-03-12 14:03       ` Jan Beulich
  2019-03-12 14:04       ` [PATCH v7 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
                         ` (8 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:03 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

We don't need bigger alignment except when calling EFI boot or runtime
services functions (and we don't guarantee that either, as explained
close to the top of xen/common/efi/runtime.c in the struct efi_rs_state
declaration). Hence if the compiler supports reducing stack alignment
from the ABI compatible 16 bytes (gcc 7 and newer), do so wherever
possible.

The EFI case itself is largely dealt with already (actually forcing
32-byte alignment) as a result of commit f6b7fedc89 ("x86/EFI: meet
further spec requirements for runtime calls"). However, as explained in
the description of that earlier change, without using
-mincoming-stack-boundary=3 (which we don't want), we still have to make
the compiler assume 16-byte stack boundaries for CUs making EFI calls in
order to keep the compiler from aligning the stack, but then placing an
odd number of 8-byte objects on it, resulting in a mis-aligned outgoing
stack.

As a side effect, this yields some code size reduction, since for a
number of sufficiently simple non-leaf functions the stack adjustment
(by 8, when there are no local stack variables at all) gets dropped
altogether. I notice exceptions though, for example in guest_cpuid(),
where in a release build gcc 8.2 now decides to set up a frame pointer
(without ever using %rbp); I consider this a compiler quirk which we
should leave to the compiler folks to address eventually.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v5: New.

--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -52,6 +52,11 @@ CFLAGS += -fno-jump-tables
 export CONFIG_INDIRECT_THUNK=y
 endif
 
+# If supported by the compiler, reduce stack alignment to 8 bytes. But allow
+# this to be overridden elsewhere.
+$(call cc-option-add,CFLAGS-stack-boundary,CC,-mpreferred-stack-boundary=3)
+CFLAGS += $(CFLAGS-stack-boundary)
+
 # Set up the assembler include path properly for older toolchains.
 CFLAGS += -Wa,-I$(BASEDIR)/include
 
--- a/xen/arch/x86/efi/Makefile
+++ b/xen/arch/x86/efi/Makefile
@@ -5,7 +5,11 @@ CFLAGS += -fshort-wchar
 
 boot.init.o: buildid.o
 
+EFIOBJ := boot.init.o compat.o runtime.o
+
+$(EFIOBJ): CFLAGS-stack-boundary := -mpreferred-stack-boundary=4
+
 obj-y := stub.o
-obj-$(XEN_BUILD_EFI) := boot.init.o compat.o relocs-dummy.o runtime.o
+obj-$(XEN_BUILD_EFI) := $(EFIOBJ) relocs-dummy.o
 extra-$(XEN_BUILD_EFI) += buildid.o
 nocov-$(XEN_BUILD_EFI) += stub.o






* [PATCH v7 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
  2019-03-12 14:03       ` [PATCH v7 01/10] x86: reduce general stack alignment to 8 Jan Beulich
@ 2019-03-12 14:04       ` Jan Beulich
  2019-03-12 14:05       ` [PATCH v7 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
                         ` (7 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:04 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

While we don't mean to run their objtool over our generated code, it
still seems desirable to avoid calls to further functions before a
function's frame pointer is set up.
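
The constraint works by tying a register variable to %rsp and listing it
as a (never really modified) in/out operand of the asm: the artificial
dependency on the stack pointer keeps the compiler from scheduling the
asm - and hence the CALL - ahead of the frame pointer setup. A reduced
stand-alone sketch (x86-64 gcc; demo_target is a made-up external
function, and the clobber list is abridged):

register unsigned long current_stack_pointer asm("rsp");

#define ASM_CALL_CONSTRAINT , "+r" (current_stack_pointer)

extern void demo_target(void);

void demo_call_site(void)
{
    unsigned long res;

    /* Without the constraint, gcc may emit this before "push %rbp". */
    asm volatile ( "call demo_target"
                   : "=a" (res) ASM_CALL_CONSTRAINT
                   :
                   : "rcx", "rdx", "rsi", "rdi",
                     "r8", "r9", "r10", "r11", "memory" );
    (void)res;
}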

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v6: Fix build issue with old gcc.
v5: New.

--- a/xen/arch/x86/efi/stub.c
+++ b/xen/arch/x86/efi/stub.c
@@ -2,8 +2,9 @@
 #include <xen/errno.h>
 #include <xen/init.h>
 #include <xen/lib.h>
-#include <asm/page.h>
+#include <asm/asm_defns.h>
 #include <asm/efibind.h>
+#include <asm/page.h>
 #include <efi/efidef.h>
 #include <efi/eficapsule.h>
 #include <efi/eficon.h>
@@ -34,10 +35,11 @@ void __init noreturn efi_multiboot2(EFI_
      * not be directly supported by C compiler.
      */
     asm volatile(
-    "    call *%3                     \n"
+    "    call *%[outstr]              \n"
     "0:  hlt                          \n"
     "    jmp  0b                      \n"
-       : "+c" (StdErr), "=d" (StdErr) : "1" (err), "rm" (StdErr->OutputString)
+       : "+c" (StdErr), "=d" (StdErr) ASM_CALL_CONSTRAINT
+       : "1" (err), [outstr] "rm" (StdErr->OutputString)
        : "rax", "r8", "r9", "r10", "r11", "memory");
 
     unreachable();
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -168,7 +168,7 @@ static int __init stub_selftest(void)
                        "jmp .Lret%=\n\t"
                        ".popsection\n\t"
                        _ASM_EXTABLE(.Lret%=, .Lfix%=)
-                       : [exn] "+m" (res)
+                       : [exn] "+m" (res) ASM_CALL_CONSTRAINT
                        : [stb] "r" (addr), "a" (tests[i].rax));
 
         if ( res.raw != tests[i].res.raw )
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1101,7 +1101,8 @@ static inline int mkec(uint8_t e, int32_
                    "jmp .Lret%=\n\t"                                    \
                    ".popsection\n\t"                                    \
                    _ASM_EXTABLE(.Lret%=, .Lfix%=)                       \
-                   : [exn] "+g" (stub_exn.info), constraints,           \
+                   : [exn] "+g" (stub_exn.info) ASM_CALL_CONSTRAINT,    \
+                     constraints,                                       \
                      [stub] "r" (stub.func),                            \
                      "m" (*(uint8_t(*)[MAX_INST_LEN + 1])stub.ptr) );   \
     if ( unlikely(~stub_exn.info.raw) )                                 \
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -25,6 +25,19 @@ asm ( "\t.equ CONFIG_INDIRECT_THUNK, "
 
 #ifndef __ASSEMBLY__
 void ret_from_intr(void);
+
+/*
+ * This output constraint should be used for any inline asm which has a "call"
+ * instruction.  Otherwise the asm may be inserted before the frame pointer
+ * gets set up by the containing function.
+ */
+#ifdef CONFIG_FRAME_POINTER
+register unsigned long current_stack_pointer asm("rsp");
+# define ASM_CALL_CONSTRAINT , "+r" (current_stack_pointer)
+#else
+# define ASM_CALL_CONSTRAINT
+#endif
+
 #endif
 
 #ifndef NDEBUG
--- a/xen/include/asm-x86/guest/hypercall.h
+++ b/xen/include/asm-x86/guest/hypercall.h
@@ -40,7 +40,7 @@
         long res, tmp__;                                                \
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
-            : "=a" (res), "=D" (tmp__)                                  \
+            : "=a" (res), "=D" (tmp__) ASM_CALL_CONSTRAINT              \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1))                                          \
             : "memory" );                                               \
@@ -53,6 +53,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__)                    \
+              ASM_CALL_CONSTRAINT                                       \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2))                        \
             : "memory" );                                               \
@@ -65,6 +66,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__), "=d" (tmp__)      \
+              ASM_CALL_CONSTRAINT                                       \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2)), "3" ((long)(a3))      \
             : "memory" );                                               \
@@ -78,7 +80,7 @@
         asm volatile (                                                  \
             "call hypercall_page + %c[offset]"                          \
             : "=a" (res), "=D" (tmp__), "=S" (tmp__), "=d" (tmp__),     \
-              "=&r" (tmp__)                                             \
+              "=&r" (tmp__) ASM_CALL_CONSTRAINT                         \
             : [offset] "i" (hcall * 32),                                \
               "1" ((long)(a1)), "2" ((long)(a2)), "3" ((long)(a3)),     \
               "4" (_a4)                                                 \





* [PATCH v7 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
  2019-03-12 14:03       ` [PATCH v7 01/10] x86: reduce general stack alignment to 8 Jan Beulich
  2019-03-12 14:04       ` [PATCH v7 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
@ 2019-03-12 14:05       ` Jan Beulich
  2019-03-12 14:06       ` [PATCH v7 04/10] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
                         ` (6 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:05 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall

In a number of cases the targets of indirect calls get determined once
at boot time. In such cases we can replace those calls with direct ones
via our alternative instruction patching mechanism.

Some of the targets (in particular the hvm_funcs ones) get established
only in pre-SMP initcalls, making necessary a second pass through the
alternative patching code. Therefore some adjustments beyond the
recognition of the new special pattern are necessary there.

Note that patching such sites more than once is not supported (and the
supplied macros also don't provide any means to do so).
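
The core of the byte-level transformation, pulled out of the
_apply_alternatives() hunk below into a stand-alone illustration (names
are mine; the real code equally handles JMP, i.e. opcode 0xe9 with
ModRM /4, asserts that the displacement fits in 32 bits, and falls back
to UD2 for targets still NULL on the final pass):

#include <stdint.h>
#include <string.h>

/*
 * 'orig' points at an indirect "ff 15 <disp32>" (CALL *func(%rip)) origin
 * site, 'dest' at the function the pointer currently holds. The 5-byte
 * direct replacement is "e8 <rel32>".
 */
void emit_direct_call(const uint8_t *orig, const uint8_t *dest,
                      uint8_t buf[5])
{
    /* rel32 is relative to the end of the 5-byte CALL instruction. */
    int32_t rel = dest - (orig + 5);

    buf[0] = 0xe8; /* direct CALL */
    memcpy(buf + 1, &rel, sizeof(rel));
}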

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v7: Remove stray leftover uses of ALT_CALL_OUT{5,6}.
v6: count_va_arg() -> count_args().
v5: Use ASM_CALL_CONSTRAINT.
v4: Extend / adjust comments.
v3: Use "X" constraint instead of "g" in alternative_callN(). Pre-
    calculate values to be put into local register variables.
v2: Introduce and use count_va_arg(). Don't omit middle operand from
    ?: in ALT_CALL_ARG(). Re-base.

--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -177,9 +177,14 @@ text_poke(void *addr, const void *opcode
  * self modifying code. This implies that asymmetric systems where
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
+ *
+ * The caller will set the "force" argument to true for the final
+ * invocation, such that no CALLs/JMPs to NULL pointers will be left
+ * around. See also the further comment below.
  */
-void init_or_livepatch apply_alternatives(struct alt_instr *start,
-                                          struct alt_instr *end)
+static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
+                                                  struct alt_instr *end,
+                                                  bool force)
 {
     struct alt_instr *a, *base;
 
@@ -208,9 +213,10 @@ void init_or_livepatch apply_alternative
         /*
          * Detect sequences of alt_instr's patching the same origin site, and
          * keep base pointing at the first alt_instr entry.  This is so we can
-         * refer to a single ->priv field for patching decisions.  We
-         * deliberately use the alt_instr itself rather than a local variable
-         * in case we end up making multiple passes.
+         * refer to a single ->priv field for some of our patching decisions,
+         * in particular the NOP optimization. We deliberately use the alt_instr
+         * itself rather than a local variable in case we end up making multiple
+         * passes.
          *
          * ->priv being nonzero means that the origin site has already been
          * modified, and we shouldn't try to optimise the nops again.
@@ -218,6 +224,13 @@ void init_or_livepatch apply_alternative
         if ( ALT_ORIG_PTR(base) != orig )
             base = a;
 
+        /* Skip patch sites already handled during the first pass. */
+        if ( a->priv )
+        {
+            ASSERT(force);
+            continue;
+        }
+
         /* If there is no replacement to make, see about optimising the nops. */
         if ( !boot_cpu_has(a->cpuid) )
         {
@@ -225,7 +238,7 @@ void init_or_livepatch apply_alternative
             if ( base->priv )
                 continue;
 
-            base->priv = 1;
+            a->priv = 1;
 
             /* Nothing useful to do? */
             if ( toolchain_nops_are_ideal || a->pad_len <= 1 )
@@ -236,20 +249,74 @@ void init_or_livepatch apply_alternative
             continue;
         }
 
-        base->priv = 1;
-
         memcpy(buf, repl, a->repl_len);
 
         /* 0xe8/0xe9 are relative branches; fix the offset. */
         if ( a->repl_len >= 5 && (*buf & 0xfe) == 0xe8 )
-            *(int32_t *)(buf + 1) += repl - orig;
+        {
+            /*
+             * Detect the special case of indirect-to-direct branch patching:
+             * - replacement is a direct CALL/JMP (opcodes 0xE8/0xE9; already
+             *   checked above),
+             * - replacement's displacement is -5 (pointing back at the very
+             *   insn, which makes no sense in a real replacement insn),
+             * - original is an indirect CALL/JMP (opcodes 0xFF/2 or 0xFF/4)
+             *   using RIP-relative addressing.
+             * Some branch destinations may still be NULL when we come here
+             * the first time. Defer patching of those until the post-presmp-
+             * initcalls re-invocation (with force set to true). If at that
+             * point the branch destination is still NULL, insert "UD2; UD0"
+             * (for ease of recognition) instead of CALL/JMP.
+             */
+            if ( a->cpuid == X86_FEATURE_ALWAYS &&
+                 *(int32_t *)(buf + 1) == -5 &&
+                 a->orig_len >= 6 &&
+                 orig[0] == 0xff &&
+                 orig[1] == (*buf & 1 ? 0x25 : 0x15) )
+            {
+                long disp = *(int32_t *)(orig + 2);
+                const uint8_t *dest = *(void **)(orig + 6 + disp);
+
+                if ( dest )
+                {
+                    disp = dest - (orig + 5);
+                    ASSERT(disp == (int32_t)disp);
+                    *(int32_t *)(buf + 1) = disp;
+                }
+                else if ( force )
+                {
+                    buf[0] = 0x0f;
+                    buf[1] = 0x0b;
+                    buf[2] = 0x0f;
+                    buf[3] = 0xff;
+                    buf[4] = 0xff;
+                }
+                else
+                    continue;
+            }
+            else if ( force && system_state < SYS_STATE_active )
+                ASSERT_UNREACHABLE();
+            else
+                *(int32_t *)(buf + 1) += repl - orig;
+        }
+        else if ( force && system_state < SYS_STATE_active )
+            ASSERT_UNREACHABLE();
+
+        a->priv = 1;
 
         add_nops(buf + a->repl_len, total_len - a->repl_len);
         text_poke(orig, buf, total_len);
     }
 }
 
-static bool __initdata alt_done;
+void init_or_livepatch apply_alternatives(struct alt_instr *start,
+                                          struct alt_instr *end)
+{
+    _apply_alternatives(start, end, true);
+}
+
+static unsigned int __initdata alt_todo;
+static unsigned int __initdata alt_done;
 
 /*
  * At boot time, we patch alternatives in NMI context.  This means that the
@@ -264,7 +331,7 @@ static int __init nmi_apply_alternatives
      * More than one NMI may occur between the two set_nmi_callback() below.
      * We only need to apply alternatives once.
      */
-    if ( !alt_done )
+    if ( !(alt_done & alt_todo) )
     {
         unsigned long cr0;
 
@@ -273,11 +340,12 @@ static int __init nmi_apply_alternatives
         /* Disable WP to allow patching read-only pages. */
         write_cr0(cr0 & ~X86_CR0_WP);
 
-        apply_alternatives(__alt_instructions, __alt_instructions_end);
+        _apply_alternatives(__alt_instructions, __alt_instructions_end,
+                            alt_done);
 
         write_cr0(cr0);
 
-        alt_done = true;
+        alt_done |= alt_todo;
     }
 
     return 1;
@@ -287,13 +355,11 @@ static int __init nmi_apply_alternatives
  * This routine is called with local interrupt disabled and used during
  * bootup.
  */
-void __init alternative_instructions(void)
+static void __init _alternative_instructions(bool force)
 {
     unsigned int i;
     nmi_callback_t *saved_nmi_callback;
 
-    arch_init_ideal_nops();
-
     /*
      * Don't stop machine check exceptions while patching.
      * MCEs only happen when something got corrupted and in this
@@ -306,6 +372,10 @@ void __init alternative_instructions(voi
      */
     ASSERT(!local_irq_is_enabled());
 
+    /* Set what operation to perform /before/ setting the callback. */
+    alt_todo = 1u << force;
+    barrier();
+
     /*
      * As soon as the callback is set up, the next NMI will trigger patching,
      * even an NMI ahead of our explicit self-NMI.
@@ -321,11 +391,24 @@ void __init alternative_instructions(voi
      * cover the (hopefully never) async case, poll alt_done for up to one
      * second.
      */
-    for ( i = 0; !ACCESS_ONCE(alt_done) && i < 1000; ++i )
+    for ( i = 0; !(ACCESS_ONCE(alt_done) & alt_todo) && i < 1000; ++i )
         mdelay(1);
 
-    if ( !ACCESS_ONCE(alt_done) )
+    if ( !(ACCESS_ONCE(alt_done) & alt_todo) )
         panic("Timed out waiting for alternatives self-NMI to hit\n");
 
     set_nmi_callback(saved_nmi_callback);
 }
+
+void __init alternative_instructions(void)
+{
+    arch_init_ideal_nops();
+    _alternative_instructions(false);
+}
+
+void __init alternative_branches(void)
+{
+    local_irq_disable();
+    _alternative_instructions(true);
+    local_irq_enable();
+}
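
(To illustrate the indirect-to-direct patching above in byte terms --
the xx/yy displacement bytes are placeholders, only the opcodes matter:)

    ff 15 xx xx xx xx   /* call *hook(%rip) -- original indirect site    */
    e8 fb ff ff ff      /* call .    -- replacement marker, disp32 == -5 */
    e8 yy yy yy yy      /* call dest -- patched: disp32 = dest-(orig+5)  */
    0f 0b 0f ff ff      /* ud2; ud0  -- forced pass, hook still NULL     */
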
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1661,6 +1661,8 @@ void __init noreturn __start_xen(unsigne
 
     do_presmp_initcalls();
 
+    alternative_branches();
+
     /*
      * NB: when running as a PV shim VCPUOP_up/down is wired to the shim
      * physical cpu_add/remove functions, so launch the guest with only
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -4,8 +4,8 @@
 #ifdef __ASSEMBLY__
 #include <asm/alternative-asm.h>
 #else
+#include <xen/lib.h>
 #include <xen/stringify.h>
-#include <xen/types.h>
 #include <asm/asm-macros.h>
 
 struct __packed alt_instr {
@@ -26,6 +26,7 @@ extern void add_nops(void *insns, unsign
 /* Similar to alternative_instructions except it can be run with IRQs enabled. */
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
+extern void alternative_branches(void);
 
 #define alt_orig_len       "(.LXEN%=_orig_e - .LXEN%=_orig_s)"
 #define alt_pad_len        "(.LXEN%=_orig_p - .LXEN%=_orig_e)"
@@ -149,6 +150,233 @@ extern void alternative_instructions(voi
 /* Use this macro(s) if you need more than one output parameter. */
 #define ASM_OUTPUT2(a...) a
 
+/*
+ * Machinery to allow converting indirect to direct calls, when the called
+ * function is determined once at boot and later never changed.
+ */
+
+#define ALT_CALL_arg1 "rdi"
+#define ALT_CALL_arg2 "rsi"
+#define ALT_CALL_arg3 "rdx"
+#define ALT_CALL_arg4 "rcx"
+#define ALT_CALL_arg5 "r8"
+#define ALT_CALL_arg6 "r9"
+
+#define ALT_CALL_ARG(arg, n) \
+    register typeof((arg) ? (arg) : 0) a ## n ## _ \
+    asm ( ALT_CALL_arg ## n ) = (arg)
+#define ALT_CALL_NO_ARG(n) \
+    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n )
+
+#define ALT_CALL_NO_ARG6 ALT_CALL_NO_ARG(6)
+#define ALT_CALL_NO_ARG5 ALT_CALL_NO_ARG(5); ALT_CALL_NO_ARG6
+#define ALT_CALL_NO_ARG4 ALT_CALL_NO_ARG(4); ALT_CALL_NO_ARG5
+#define ALT_CALL_NO_ARG3 ALT_CALL_NO_ARG(3); ALT_CALL_NO_ARG4
+#define ALT_CALL_NO_ARG2 ALT_CALL_NO_ARG(2); ALT_CALL_NO_ARG3
+#define ALT_CALL_NO_ARG1 ALT_CALL_NO_ARG(1); ALT_CALL_NO_ARG2
+
+/*
+ * Unfortunately ALT_CALL_NO_ARG() above can't use a fake initializer (to
+ * suppress "uninitialized variable" warnings), as various versions of gcc
+ * older than 8.1 fall on the nose in various ways with that (always because
+ * of some other construct elsewhere in the same function needing to use the
+ * same hard register). Otherwise the asm() below could uniformly use "+r"
+ * output constraints, making unnecessary all these ALT_CALL<n>_OUT macros.
+ */
+#define ALT_CALL0_OUT "=r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL1_OUT "+r" (a1_), "=r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL2_OUT "+r" (a1_), "+r" (a2_), "=r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL3_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "=r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL4_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "=r" (a5_), "=r" (a6_)
+#define ALT_CALL5_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "=r" (a6_)
+#define ALT_CALL6_OUT "+r" (a1_), "+r" (a2_), "+r" (a3_), \
+                      "+r" (a4_), "+r" (a5_), "+r" (a6_)
+
+#define alternative_callN(n, rettype, func) ({                     \
+    rettype ret_;                                                  \
+    register unsigned long r10_ asm("r10");                        \
+    register unsigned long r11_ asm("r11");                        \
+    asm volatile (__stringify(ALTERNATIVE "call *%c[addr](%%rip)", \
+                                          "call .",                \
+                                          X86_FEATURE_ALWAYS)      \
+                  : ALT_CALL ## n ## _OUT, "=a" (ret_),            \
+                    "=r" (r10_), "=r" (r11_) ASM_CALL_CONSTRAINT   \
+                  : [addr] "i" (&(func)), "g" (func)               \
+                  : "memory" );                                    \
+    ret_;                                                          \
+})
+
+#define alternative_vcall0(func) ({             \
+    ALT_CALL_NO_ARG1;                           \
+    ((void)alternative_callN(0, int, func));    \
+})
+
+#define alternative_call0(func) ({              \
+    ALT_CALL_NO_ARG1;                           \
+    alternative_callN(0, typeof(func()), func); \
+})
+
+#define alternative_vcall1(func, arg) ({           \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    (void)sizeof(func(arg));                       \
+    (void)alternative_callN(1, int, func);         \
+})
+
+#define alternative_call1(func, arg) ({            \
+    ALT_CALL_ARG(arg, 1);                          \
+    ALT_CALL_NO_ARG2;                              \
+    alternative_callN(1, typeof(func(arg)), func); \
+})
+
+#define alternative_vcall2(func, arg1, arg2) ({           \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    (void)sizeof(func(arg1, arg2));                       \
+    (void)alternative_callN(2, int, func);                \
+})
+
+#define alternative_call2(func, arg1, arg2) ({            \
+    typeof(arg2) v2_ = (arg2);                            \
+    ALT_CALL_ARG(arg1, 1);                                \
+    ALT_CALL_ARG(v2_, 2);                                 \
+    ALT_CALL_NO_ARG3;                                     \
+    alternative_callN(2, typeof(func(arg1, arg2)), func); \
+})
+
+#define alternative_vcall3(func, arg1, arg2, arg3) ({    \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    (void)sizeof(func(arg1, arg2, arg3));                \
+    (void)alternative_callN(3, int, func);               \
+})
+
+#define alternative_call3(func, arg1, arg2, arg3) ({     \
+    typeof(arg2) v2_ = (arg2);                           \
+    typeof(arg3) v3_ = (arg3);                           \
+    ALT_CALL_ARG(arg1, 1);                               \
+    ALT_CALL_ARG(v2_, 2);                                \
+    ALT_CALL_ARG(v3_, 3);                                \
+    ALT_CALL_NO_ARG4;                                    \
+    alternative_callN(3, typeof(func(arg1, arg2, arg3)), \
+                      func);                             \
+})
+
+#define alternative_vcall4(func, arg1, arg2, arg3, arg4) ({ \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    (void)sizeof(func(arg1, arg2, arg3, arg4));             \
+    (void)alternative_callN(4, int, func);                  \
+})
+
+#define alternative_call4(func, arg1, arg2, arg3, arg4) ({  \
+    typeof(arg2) v2_ = (arg2);                              \
+    typeof(arg3) v3_ = (arg3);                              \
+    typeof(arg4) v4_ = (arg4);                              \
+    ALT_CALL_ARG(arg1, 1);                                  \
+    ALT_CALL_ARG(v2_, 2);                                   \
+    ALT_CALL_ARG(v3_, 3);                                   \
+    ALT_CALL_ARG(v4_, 4);                                   \
+    ALT_CALL_NO_ARG5;                                       \
+    alternative_callN(4, typeof(func(arg1, arg2,            \
+                                     arg3, arg4)),          \
+                      func);                                \
+})
+
+#define alternative_vcall5(func, arg1, arg2, arg3, arg4, arg5) ({ \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5));             \
+    (void)alternative_callN(5, int, func);                        \
+})
+
+#define alternative_call5(func, arg1, arg2, arg3, arg4, arg5) ({  \
+    typeof(arg2) v2_ = (arg2);                                    \
+    typeof(arg3) v3_ = (arg3);                                    \
+    typeof(arg4) v4_ = (arg4);                                    \
+    typeof(arg5) v5_ = (arg5);                                    \
+    ALT_CALL_ARG(arg1, 1);                                        \
+    ALT_CALL_ARG(v2_, 2);                                         \
+    ALT_CALL_ARG(v3_, 3);                                         \
+    ALT_CALL_ARG(v4_, 4);                                         \
+    ALT_CALL_ARG(v5_, 5);                                         \
+    ALT_CALL_NO_ARG6;                                             \
+    alternative_callN(5, typeof(func(arg1, arg2, arg3,            \
+                                     arg4, arg5)),                \
+                      func);                                      \
+})
+
+#define alternative_vcall6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    (void)sizeof(func(arg1, arg2, arg3, arg4, arg5, arg6));             \
+    (void)alternative_callN(6, int, func);                              \
+})
+
+#define alternative_call6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({  \
+    typeof(arg2) v2_ = (arg2);                                          \
+    typeof(arg3) v3_ = (arg3);                                          \
+    typeof(arg4) v4_ = (arg4);                                          \
+    typeof(arg5) v5_ = (arg5);                                          \
+    typeof(arg6) v6_ = (arg6);                                          \
+    ALT_CALL_ARG(arg1, 1);                                              \
+    ALT_CALL_ARG(v2_, 2);                                               \
+    ALT_CALL_ARG(v3_, 3);                                               \
+    ALT_CALL_ARG(v4_, 4);                                               \
+    ALT_CALL_ARG(v5_, 5);                                               \
+    ALT_CALL_ARG(v6_, 6);                                               \
+    alternative_callN(6, typeof(func(arg1, arg2, arg3,                  \
+                                     arg4, arg5, arg6)),                \
+                      func);                                            \
+})
+
+#define alternative_vcall__(nr) alternative_vcall ## nr
+#define alternative_call__(nr)  alternative_call ## nr
+
+#define alternative_vcall_(nr) alternative_vcall__(nr)
+#define alternative_call_(nr)  alternative_call__(nr)
+
+#define alternative_vcall(func, args...) \
+    alternative_vcall_(count_args(args))(func, ## args)
+
+#define alternative_call(func, args...) \
+    alternative_call_(count_args(args))(func, ## args)
+
 #endif /*  !__ASSEMBLY__  */
 
 #endif /* __X86_ALTERNATIVE_H__ */
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -66,6 +66,10 @@
 
 #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
 
+#define count_args_(dot, a1, a2, a3, a4, a5, a6, a7, a8, x, ...) x
+#define count_args(args...) \
+    count_args_(., ## args, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+
 struct domain;
 
 void cmdline_parse(const char *cmdline);
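
A usage sketch tying the pieces together (struct demo_ops and its hooks
are made-up names for illustration, not part of the patch):

    struct demo_ops {
        unsigned int (*get_mode)(struct vcpu *v);
        void (*flush)(struct vcpu *v, unsigned long addr);
    };
    extern struct demo_ops demo_ops;

    /* One argument: count_args(v) == 1, so this expands through
     * alternative_call_(1) to alternative_call1(), binding v to
     * %rdi (ALT_CALL_arg1) and the return value to %rax. */
    unsigned int mode = alternative_call(demo_ops.get_mode, v);

    /* Void hook with two arguments -> alternative_vcall2(). */
    alternative_vcall(demo_ops.flush, v, addr);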





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v7 04/10] x86/HVM: patch indirect calls through hvm_funcs to direct ones
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
                         ` (2 preceding siblings ...)
  2019-03-12 14:05       ` [PATCH v7 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
@ 2019-03-12 14:06       ` Jan Beulich
  2019-03-12 14:06       ` [PATCH v7 05/10] x86/HVM: patch vINTR " Jan Beulich
                         ` (5 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

This intentionally does not touch hooks used rarely (or not at all)
during the lifetime of a VM, like {domain,vcpu}_initialise or cpu_up,
nor the nested, VM event, and altp2m ones (they can all be converted
later, if so desired). The virtual interrupt delivery ones will be
dealt with in a subsequent patch.
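
The conversion pattern, for reference: alternative_vcall() where a
hook's return value is unused, alternative_call() where it is consumed:

    hvm_funcs.wbinvd_intercept();                      /* before */
    alternative_vcall(hvm_funcs.wbinvd_intercept);     /* after  */

    return hvm_funcs.get_cpl(v);                       /* before */
    return alternative_call(hvm_funcs.get_cpl, v);     /* after  */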

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v6: Re-base.
v3: Re-base.
v2: Drop open-coded numbers from macro invocations. Re-base.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2127,7 +2127,7 @@ static int hvmemul_write_msr(
 static int hvmemul_wbinvd(
     struct x86_emulate_ctxt *ctxt)
 {
-    hvm_funcs.wbinvd_intercept();
+    alternative_vcall(hvm_funcs.wbinvd_intercept);
     return X86EMUL_OKAY;
 }
 
@@ -2145,7 +2145,7 @@ static int hvmemul_get_fpu(
     struct vcpu *curr = current;
 
     if ( !curr->fpu_dirtied )
-        hvm_funcs.fpu_dirty_intercept();
+        alternative_vcall(hvm_funcs.fpu_dirty_intercept);
     else if ( type == X86EMUL_FPU_fpu )
     {
         const typeof(curr->arch.xsave_area->fpu_sse) *fpu_ctxt =
@@ -2262,7 +2262,7 @@ static void hvmemul_put_fpu(
         {
             curr->fpu_dirtied = false;
             stts();
-            hvm_funcs.fpu_leave(curr);
+            alternative_vcall(hvm_funcs.fpu_leave, curr);
         }
     }
 }
@@ -2424,7 +2424,8 @@ static int _hvm_emulate_one(struct hvm_e
     if ( hvmemul_ctxt->intr_shadow != new_intr_shadow )
     {
         hvmemul_ctxt->intr_shadow = new_intr_shadow;
-        hvm_funcs.set_interrupt_shadow(curr, new_intr_shadow);
+        alternative_vcall(hvm_funcs.set_interrupt_shadow,
+                          curr, new_intr_shadow);
     }
 
     if ( hvmemul_ctxt->ctxt.retire.hlt &&
@@ -2561,7 +2562,8 @@ void hvm_emulate_init_once(
 
     memset(hvmemul_ctxt, 0, sizeof(*hvmemul_ctxt));
 
-    hvmemul_ctxt->intr_shadow = hvm_funcs.get_interrupt_shadow(curr);
+    hvmemul_ctxt->intr_shadow =
+        alternative_call(hvm_funcs.get_interrupt_shadow, curr);
     hvmemul_get_seg_reg(x86_seg_cs, hvmemul_ctxt);
     hvmemul_get_seg_reg(x86_seg_ss, hvmemul_ctxt);
 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -273,12 +273,12 @@ void hvm_set_rdtsc_exiting(struct domain
     struct vcpu *v;
 
     for_each_vcpu ( d, v )
-        hvm_funcs.set_rdtsc_exiting(v, enable);
+        alternative_vcall(hvm_funcs.set_rdtsc_exiting, v, enable);
 }
 
 void hvm_get_guest_pat(struct vcpu *v, u64 *guest_pat)
 {
-    if ( !hvm_funcs.get_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.get_guest_pat, v, guest_pat) )
         *guest_pat = v->arch.hvm.pat_cr;
 }
 
@@ -303,7 +303,7 @@ int hvm_set_guest_pat(struct vcpu *v, u6
             return 0;
         }
 
-    if ( !hvm_funcs.set_guest_pat(v, guest_pat) )
+    if ( !alternative_call(hvm_funcs.set_guest_pat, v, guest_pat) )
         v->arch.hvm.pat_cr = guest_pat;
 
     return 1;
@@ -343,7 +343,7 @@ bool hvm_set_guest_bndcfgs(struct vcpu *
             /* nothing, best effort only */;
     }
 
-    return hvm_funcs.set_guest_bndcfgs(v, val);
+    return alternative_call(hvm_funcs.set_guest_bndcfgs, v, val);
 }
 
 /*
@@ -507,7 +507,8 @@ void hvm_migrate_pirqs(struct vcpu *v)
 static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
 {
     info->cr2 = v->arch.hvm.guest_cr[2];
-    return hvm_funcs.get_pending_event(v, info);
+
+    return alternative_call(hvm_funcs.get_pending_event, v, info);
 }
 
 void hvm_do_resume(struct vcpu *v)
@@ -1671,7 +1672,7 @@ void hvm_inject_event(const struct x86_e
         }
     }
 
-    hvm_funcs.inject_event(event);
+    alternative_vcall(hvm_funcs.inject_event, event);
 }
 
 int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
@@ -2259,7 +2260,7 @@ int hvm_set_cr0(unsigned long value, boo
          (!rangeset_is_empty(d->iomem_caps) ||
           !rangeset_is_empty(d->arch.ioport_caps) ||
           has_arch_pdevs(d)) )
-        hvm_funcs.handle_cd(v, value);
+        alternative_vcall(hvm_funcs.handle_cd, v, value);
 
     hvm_update_cr(v, 0, value);
 
@@ -3488,7 +3489,8 @@ int hvm_msr_read_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_read_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_read_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3634,7 +3636,8 @@ int hvm_msr_write_intercept(unsigned int
             goto gp_fault;
         /* If ret == 0 then this is not an MCE MSR, see other MSRs. */
         ret = ((ret == 0)
-               ? hvm_funcs.msr_write_intercept(msr, msr_content)
+               ? alternative_call(hvm_funcs.msr_write_intercept,
+                                  msr, msr_content)
                : X86EMUL_OKAY);
         break;
     }
@@ -3826,7 +3829,7 @@ void hvm_hypercall_page_initialise(struc
                                    void *hypercall_page)
 {
     hvm_latch_shinfo_size(d);
-    hvm_funcs.init_hypercall_page(d, hypercall_page);
+    alternative_vcall(hvm_funcs.init_hypercall_page, d, hypercall_page);
 }
 
 void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
@@ -5077,7 +5080,7 @@ void hvm_domain_soft_reset(struct domain
 void hvm_get_segment_register(struct vcpu *v, enum x86_segment seg,
                               struct segment_register *reg)
 {
-    hvm_funcs.get_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.get_segment_register, v, seg, reg);
 
     switch ( seg )
     {
@@ -5223,7 +5226,7 @@ void hvm_set_segment_register(struct vcp
         return;
     }
 
-    hvm_funcs.set_segment_register(v, seg, reg);
+    alternative_vcall(hvm_funcs.set_segment_register, v, seg, reg);
 }
 
 /*
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -388,42 +388,42 @@ static inline int
 hvm_guest_x86_mode(struct vcpu *v)
 {
     ASSERT(v == current);
-    return hvm_funcs.guest_x86_mode(v);
+    return alternative_call(hvm_funcs.guest_x86_mode, v);
 }
 
 static inline void
 hvm_update_host_cr3(struct vcpu *v)
 {
     if ( hvm_funcs.update_host_cr3 )
-        hvm_funcs.update_host_cr3(v);
+        alternative_vcall(hvm_funcs.update_host_cr3, v);
 }
 
 static inline void hvm_update_guest_cr(struct vcpu *v, unsigned int cr)
 {
-    hvm_funcs.update_guest_cr(v, cr, 0);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, cr, 0);
 }
 
 static inline void hvm_update_guest_cr3(struct vcpu *v, bool noflush)
 {
     unsigned int flags = noflush ? HVM_UPDATE_GUEST_CR3_NOFLUSH : 0;
 
-    hvm_funcs.update_guest_cr(v, 3, flags);
+    alternative_vcall(hvm_funcs.update_guest_cr, v, 3, flags);
 }
 
 static inline void hvm_update_guest_efer(struct vcpu *v)
 {
-    hvm_funcs.update_guest_efer(v);
+    alternative_vcall(hvm_funcs.update_guest_efer, v);
 }
 
 static inline void hvm_cpuid_policy_changed(struct vcpu *v)
 {
-    hvm_funcs.cpuid_policy_changed(v);
+    alternative_vcall(hvm_funcs.cpuid_policy_changed, v);
 }
 
 static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset,
                                       uint64_t at_tsc)
 {
-    hvm_funcs.set_tsc_offset(v, offset, at_tsc);
+    alternative_vcall(hvm_funcs.set_tsc_offset, v, offset, at_tsc);
 }
 
 /*
@@ -440,18 +440,18 @@ static inline void hvm_flush_guest_tlbs(
 static inline unsigned int
 hvm_get_cpl(struct vcpu *v)
 {
-    return hvm_funcs.get_cpl(v);
+    return alternative_call(hvm_funcs.get_cpl, v);
 }
 
 static inline unsigned long hvm_get_shadow_gs_base(struct vcpu *v)
 {
-    return hvm_funcs.get_shadow_gs_base(v);
+    return alternative_call(hvm_funcs.get_shadow_gs_base, v);
 }
 
 static inline bool hvm_get_guest_bndcfgs(struct vcpu *v, u64 *val)
 {
     return hvm_funcs.get_guest_bndcfgs &&
-           hvm_funcs.get_guest_bndcfgs(v, val);
+           alternative_call(hvm_funcs.get_guest_bndcfgs, v, val);
 }
 
 #define has_hvm_params(d) \
@@ -508,12 +508,12 @@ static inline void hvm_inject_page_fault
 
 static inline bool hvm_event_pending(const struct vcpu *v)
 {
-    return hvm_funcs.event_pending(v);
+    return alternative_call(hvm_funcs.event_pending, v);
 }
 
 static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
 {
-    hvm_funcs.invlpg(v, linear);
+    alternative_vcall(hvm_funcs.invlpg, v, linear);
 }
 
 /* These bits in CR4 are owned by the host. */
@@ -538,13 +538,14 @@ static inline void hvm_cpu_down(void)
 
 static inline unsigned int hvm_get_insn_bytes(struct vcpu *v, uint8_t *buf)
 {
-    return (hvm_funcs.get_insn_bytes ? hvm_funcs.get_insn_bytes(v, buf) : 0);
+    return (hvm_funcs.get_insn_bytes
+            ? alternative_call(hvm_funcs.get_insn_bytes, v, buf) : 0);
 }
 
 static inline void hvm_set_info_guest(struct vcpu *v)
 {
     if ( hvm_funcs.set_info_guest )
-        return hvm_funcs.set_info_guest(v);
+        alternative_vcall(hvm_funcs.set_info_guest, v);
 }
 
 static inline void hvm_invalidate_regs_fields(struct cpu_user_regs *regs)





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v7 05/10] x86/HVM: patch vINTR indirect calls through hvm_funcs to direct ones
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
                         ` (3 preceding siblings ...)
  2019-03-12 14:06       ` [PATCH v7 04/10] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
@ 2019-03-12 14:06       ` Jan Beulich
  2019-03-12 14:07       ` [PATCH v7 06/10] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
                         ` (4 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:06 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

While not strictly necessary, change the VMX initialization logic to
update the function table in start_vmx() from NULL rather than to NULL,
to make it more obvious that we won't ever change an already
(explicitly) initialized function pointer.
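
Condensed, the resulting pattern is (a sketch; see the vmx.c hunks
below for the real code):

    /* Hooks deliberately absent from the static initializer ... */
    static struct hvm_function_table __initdata vmx_function_table = {
        .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
        /* no .process_isr, .handle_eoi, ... anymore */
    };

    /* ... and set from NULL in start_vmx(), only when the feature is
     * actually available: */
    if ( cpu_has_vmx_virtual_intr_delivery )
    {
        vmx_function_table.process_isr = vmx_process_isr;
        vmx_function_table.handle_eoi  = vmx_handle_eoi;
    }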

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v5: Fix indentation.
v4: Re-base.
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -111,10 +111,15 @@ static void vlapic_clear_irr(int vector,
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_IRR]);
 }
 
-static int vlapic_find_highest_irr(struct vlapic *vlapic)
+static void sync_pir_to_irr(struct vcpu *v)
 {
     if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(vlapic_vcpu(vlapic));
+        alternative_vcall(hvm_funcs.sync_pir_to_irr, v);
+}
+
+static int vlapic_find_highest_irr(struct vlapic *vlapic)
+{
+    sync_pir_to_irr(vlapic_vcpu(vlapic));
 
     return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
 }
@@ -143,7 +148,7 @@ bool vlapic_test_irq(const struct vlapic
         return false;
 
     if ( hvm_funcs.test_pir &&
-         hvm_funcs.test_pir(const_vlapic_vcpu(vlapic), vec) )
+         alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) )
         return true;
 
     return vlapic_test_vector(vec, &vlapic->regs->data[APIC_IRR]);
@@ -165,10 +170,10 @@ void vlapic_set_irq(struct vlapic *vlapi
         vlapic_clear_vector(vec, &vlapic->regs->data[APIC_TMR]);
 
     if ( hvm_funcs.update_eoi_exit_bitmap )
-        hvm_funcs.update_eoi_exit_bitmap(target, vec, trig);
+        alternative_vcall(hvm_funcs.update_eoi_exit_bitmap, target, vec, trig);
 
     if ( hvm_funcs.deliver_posted_intr )
-        hvm_funcs.deliver_posted_intr(target, vec);
+        alternative_vcall(hvm_funcs.deliver_posted_intr, target, vec);
     else if ( !vlapic_test_and_set_irr(vec, vlapic) )
         vcpu_kick(target);
 }
@@ -448,7 +453,8 @@ void vlapic_EOI_set(struct vlapic *vlapi
     vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]);
 
     if ( hvm_funcs.handle_eoi )
-        hvm_funcs.handle_eoi(vector, vlapic_find_highest_isr(vlapic));
+        alternative_vcall(hvm_funcs.handle_eoi, vector,
+                          vlapic_find_highest_isr(vlapic));
 
     vlapic_handle_EOI(vlapic, vector);
 
@@ -1471,8 +1477,7 @@ static int lapic_save_regs(struct vcpu *
     if ( !has_vlapic(v->domain) )
         return 0;
 
-    if ( hvm_funcs.sync_pir_to_irr )
-        hvm_funcs.sync_pir_to_irr(v);
+    sync_pir_to_irr(v);
 
     return hvm_save_entry(LAPIC_REGS, v->vcpu_id, h, vcpu_vlapic(v)->regs);
 }
@@ -1568,7 +1573,8 @@ static int lapic_load_regs(struct domain
         lapic_load_fixup(s);
 
     if ( hvm_funcs.process_isr )
-        hvm_funcs.process_isr(vlapic_find_highest_isr(s), v);
+        alternative_vcall(hvm_funcs.process_isr,
+                          vlapic_find_highest_isr(s), v);
 
     vlapic_adjust_i8259_target(d);
     lapic_rearm(s);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2339,12 +2339,6 @@ static struct hvm_function_table __initd
     .nhvm_vcpu_vmexit_event = nvmx_vmexit_event,
     .nhvm_intr_blocked    = nvmx_intr_blocked,
     .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
-    .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
-    .process_isr          = vmx_process_isr,
-    .deliver_posted_intr  = vmx_deliver_posted_intr,
-    .sync_pir_to_irr      = vmx_sync_pir_to_irr,
-    .test_pir             = vmx_test_pir,
-    .handle_eoi           = vmx_handle_eoi,
     .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
     .enable_msr_interception = vmx_enable_msr_interception,
     .is_singlestep_supported = vmx_is_singlestep_supported,
@@ -2472,26 +2466,23 @@ const struct hvm_function_table * __init
         setup_ept_dump();
     }
 
-    if ( !cpu_has_vmx_virtual_intr_delivery )
+    if ( cpu_has_vmx_virtual_intr_delivery )
     {
-        vmx_function_table.update_eoi_exit_bitmap = NULL;
-        vmx_function_table.process_isr = NULL;
-        vmx_function_table.handle_eoi = NULL;
-    }
-    else
+        vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
+        vmx_function_table.process_isr = vmx_process_isr;
+        vmx_function_table.handle_eoi = vmx_handle_eoi;
         vmx_function_table.virtual_intr_delivery_enabled = true;
+    }
 
     if ( cpu_has_vmx_posted_intr_processing )
     {
         alloc_direct_apic_vector(&posted_intr_vector, pi_notification_interrupt);
         if ( iommu_intpost )
             alloc_direct_apic_vector(&pi_wakeup_vector, pi_wakeup_interrupt);
-    }
-    else
-    {
-        vmx_function_table.deliver_posted_intr = NULL;
-        vmx_function_table.sync_pir_to_irr = NULL;
-        vmx_function_table.test_pir = NULL;
+
+        vmx_function_table.deliver_posted_intr = vmx_deliver_posted_intr;
+        vmx_function_table.sync_pir_to_irr     = vmx_sync_pir_to_irr;
+        vmx_function_table.test_pir            = vmx_test_pir;
     }
 
     if ( cpu_has_vmx_tsc_scaling )





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v7 06/10] x86: patch ctxt_switch_masking() indirect call to direct one
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
                         ` (4 preceding siblings ...)
  2019-03-12 14:06       ` [PATCH v7 05/10] x86/HVM: patch vINTR " Jan Beulich
@ 2019-03-12 14:07       ` Jan Beulich
  2019-03-12 14:07       ` [PATCH v7 07/10] x86/genapic: patch indirect calls to direct ones Jan Beulich
                         ` (3 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded number from macro invocation.

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -185,7 +185,7 @@ void ctxt_switch_levelling(const struct
 	}
 
 	if (ctxt_switch_masking)
-		ctxt_switch_masking(next);
+		alternative_vcall(ctxt_switch_masking, next);
 }
 
 bool_t opt_cpu_info;




^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v7 07/10] x86/genapic: patch indirect calls to direct ones
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
                         ` (5 preceding siblings ...)
  2019-03-12 14:07       ` [PATCH v7 06/10] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
@ 2019-03-12 14:07       ` Jan Beulich
  2019-03-12 14:08       ` [PATCH v7 08/10] x86/cpuidle: patch some " Jan Beulich
                         ` (2 subsequent siblings)
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:07 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

For (I hope) obvious reasons, only the hooks used at runtime get
converted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -29,12 +29,12 @@
 
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
-    genapic.send_IPI_mask(mask, vector);
+    alternative_vcall(genapic.send_IPI_mask, mask, vector);
 }
 
 void send_IPI_self(int vector)
 {
-    genapic.send_IPI_self(vector);
+    alternative_vcall(genapic.send_IPI_self, vector);
 }
 
 /*
--- a/xen/include/asm-x86/mach-generic/mach_apic.h
+++ b/xen/include/asm-x86/mach-generic/mach_apic.h
@@ -15,8 +15,18 @@
 #define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
 #define init_apic_ldr (genapic.init_apic_ldr)
 #define clustered_apic_check (genapic.clustered_apic_check)
-#define cpu_mask_to_apicid (genapic.cpu_mask_to_apicid)
-#define vector_allocation_cpumask(cpu) (genapic.vector_allocation_cpumask(cpu))
+#define cpu_mask_to_apicid(mask) ({ \
+	/* \
+	 * There are a number of places where the address of a local variable \
+	 * gets passed here. The use of ?: in alternative_call<N>() triggers an \
+	 * "address of ... is always true" warning in such a case with at least \
+	 * gcc 7 and 8. Hence the seemingly pointless local variable here. \
+	 */ \
+	const cpumask_t *m_ = (mask); \
+	alternative_call(genapic.cpu_mask_to_apicid, m_); \
+})
+#define vector_allocation_cpumask(cpu) \
+	alternative_call(genapic.vector_allocation_cpumask, cpu)
 
 static inline void enable_apic_mode(void)
 {
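
(A minimal, hypothetical reconstruction of the warning the comment in
the hunk above refers to:)

    cpumask_t mask;
    uint32_t apicid;

    /* ALT_CALL_ARG() derives the argument's type via "(arg) ? (arg) : 0";
     * handed "&mask" directly, gcc 7/8 emit
     *     warning: the address of 'mask' will always evaluate as 'true'
     */
    apicid = alternative_call(genapic.cpu_mask_to_apicid, &mask);

    /* Copying through an ordinary pointer first -- the m_ local in the
     * macro above -- makes the ?: operand a plain variable: */
    const cpumask_t *m_ = &mask;
    apicid = alternative_call(genapic.cpu_mask_to_apicid, m_);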






^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v7 08/10] x86/cpuidle: patch some indirect calls to direct ones
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
                         ` (6 preceding siblings ...)
  2019-03-12 14:07       ` [PATCH v7 07/10] x86/genapic: patch indirect calls to direct ones Jan Beulich
@ 2019-03-12 14:08       ` Jan Beulich
  2019-03-12 14:08       ` [PATCH v7 09/10] cpufreq: patch target() indirect call to direct one Jan Beulich
  2019-03-12 14:09       ` [PATCH v7 10/10] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

For now, only the hooks used when entering/exiting idle states are
converted. Additionally, pm_idle{,_save} and lapic_timer_{on,off} can't
be converted, as they may get established rather late (when Dom0 is
already active).

Note that, for patching to be deferred until after the pre-SMP initcalls
(from where cpuidle_init_cpu() runs for the first time), the pointers
need to start out as NULL.
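
Put together with the infrastructure patch, the boot-time sequence is
roughly:

    /* 1st pass (force == false), from alternative_instructions():
     * cpuidle_get_tick et al are still NULL, so their "call ." marker
     * sites are skipped rather than patched. */

    do_presmp_initcalls();   /* cpuidle_init_cpu() populates the pointers */

    /* 2nd pass (force == true), from alternative_branches(): the now
     * non-NULL pointers become direct calls; any hook still NULL at this
     * point would get the UD2/UD0 filler instead. */
    alternative_branches();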

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Drop open-coded numbers from macro invocations.

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -102,8 +102,6 @@ bool lapic_timer_init(void)
     return true;
 }
 
-static uint64_t (*__read_mostly tick_to_ns)(uint64_t) = acpi_pm_tick_to_ns;
-
 void (*__read_mostly pm_idle_save)(void);
 unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER - 1;
 integer_param("max_cstate", max_cstate);
@@ -289,9 +287,9 @@ static uint64_t acpi_pm_ticks_elapsed(ui
         return ((0xFFFFFFFF - t1) + t2 +1);
 }
 
-uint64_t (*__read_mostly cpuidle_get_tick)(void) = get_acpi_pm_tick;
-static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t)
-    = acpi_pm_ticks_elapsed;
+uint64_t (*__read_mostly cpuidle_get_tick)(void);
+static uint64_t (*__read_mostly tick_to_ns)(uint64_t);
+static uint64_t (*__read_mostly ticks_elapsed)(uint64_t, uint64_t);
 
 static void print_acpi_power(uint32_t cpu, struct acpi_processor_power *power)
 {
@@ -547,7 +545,7 @@ void update_idle_stats(struct acpi_proce
                        struct acpi_processor_cx *cx,
                        uint64_t before, uint64_t after)
 {
-    int64_t sleep_ticks = ticks_elapsed(before, after);
+    int64_t sleep_ticks = alternative_call(ticks_elapsed, before, after);
     /* Interrupts are disabled */
 
     spin_lock(&power->stat_lock);
@@ -555,7 +553,8 @@ void update_idle_stats(struct acpi_proce
     cx->usage++;
     if ( sleep_ticks > 0 )
     {
-        power->last_residency = tick_to_ns(sleep_ticks) / 1000UL;
+        power->last_residency = alternative_call(tick_to_ns, sleep_ticks) /
+                                1000UL;
         cx->time += sleep_ticks;
     }
     power->last_state = &power->states[0];
@@ -635,7 +634,7 @@ static void acpi_processor_idle(void)
         if ( cx->type == ACPI_STATE_C1 || local_apic_timer_c2_ok )
         {
             /* Get start time (ticks) */
-            t1 = cpuidle_get_tick();
+            t1 = alternative_call(cpuidle_get_tick);
             /* Trace cpu idle entry */
             TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -644,7 +643,7 @@ static void acpi_processor_idle(void)
             /* Invoke C2 */
             acpi_idle_do_entry(cx);
             /* Get end time (ticks) */
-            t2 = cpuidle_get_tick();
+            t2 = alternative_call(cpuidle_get_tick);
             trace_exit_reason(irq_traced);
             /* Trace cpu idle exit */
             TRACE_6D(TRC_PM_IDLE_EXIT, cx->idx, t2,
@@ -666,7 +665,7 @@ static void acpi_processor_idle(void)
         lapic_timer_off();
 
         /* Get start time (ticks) */
-        t1 = cpuidle_get_tick();
+        t1 = alternative_call(cpuidle_get_tick);
         /* Trace cpu idle entry */
         TRACE_4D(TRC_PM_IDLE_ENTRY, cx->idx, t1, exp, pred);
 
@@ -717,7 +716,7 @@ static void acpi_processor_idle(void)
         }
 
         /* Get end time (ticks) */
-        t2 = cpuidle_get_tick();
+        t2 = alternative_call(cpuidle_get_tick);
 
         /* recovering TSC */
         cstate_restore_tsc();
@@ -827,11 +826,20 @@ int cpuidle_init_cpu(unsigned int cpu)
     {
         unsigned int i;
 
-        if ( cpu == 0 && boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+        if ( cpu == 0 && system_state < SYS_STATE_active )
         {
-            cpuidle_get_tick = get_stime_tick;
-            ticks_elapsed = stime_ticks_elapsed;
-            tick_to_ns = stime_tick_to_ns;
+            if ( boot_cpu_has(X86_FEATURE_NONSTOP_TSC) )
+            {
+                cpuidle_get_tick = get_stime_tick;
+                ticks_elapsed = stime_ticks_elapsed;
+                tick_to_ns = stime_tick_to_ns;
+            }
+            else
+            {
+                cpuidle_get_tick = get_acpi_pm_tick;
+                ticks_elapsed = acpi_pm_ticks_elapsed;
+                tick_to_ns = acpi_pm_tick_to_ns;
+            }
         }
 
         acpi_power = xzalloc(struct acpi_processor_power);
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -778,7 +778,7 @@ static void mwait_idle(void)
 	if (!(lapic_timer_reliable_states & (1 << cstate)))
 		lapic_timer_off();
 
-	before = cpuidle_get_tick();
+	before = alternative_call(cpuidle_get_tick);
 	TRACE_4D(TRC_PM_IDLE_ENTRY, cx->type, before, exp, pred);
 
 	update_last_cx_stat(power, cx, before);
@@ -786,7 +786,7 @@ static void mwait_idle(void)
 	if (cpu_is_haltable(cpu))
 		mwait_idle_with_hints(eax, MWAIT_ECX_INTERRUPT_BREAK);
 
-	after = cpuidle_get_tick();
+	after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();
 	trace_exit_reason(irq_traced);





^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v7 09/10] cpufreq: patch target() indirect call to direct one
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
                         ` (7 preceding siblings ...)
  2019-03-12 14:08       ` [PATCH v7 08/10] x86/cpuidle: patch some " Jan Beulich
@ 2019-03-12 14:08       ` Jan Beulich
  2019-03-12 14:09       ` [PATCH v7 10/10] IOMMU: patch certain indirect calls to direct ones Jan Beulich
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

This looks to be the only frequently executed hook, so don't bother
patching any of the others.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: New.

--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -364,7 +364,8 @@ int __cpufreq_driver_target(struct cpufr
     {
         unsigned int prev_freq = policy->cur;
 
-        retval = cpufreq_driver.target(policy, target_freq, relation);
+        retval = alternative_call(cpufreq_driver.target,
+                                  policy, target_freq, relation);
         if ( retval == 0 )
             TRACE_2D(TRC_PM_FREQ_CHANGE, prev_freq/1000, policy->cur/1000);
     }




^ permalink raw reply	[flat|nested] 119+ messages in thread

* [PATCH v7 10/10] IOMMU: patch certain indirect calls to direct ones
  2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
                         ` (8 preceding siblings ...)
  2019-03-12 14:08       ` [PATCH v7 09/10] cpufreq: patch target() indirect call to direct one Jan Beulich
@ 2019-03-12 14:09       ` Jan Beulich
  9 siblings, 0 replies; 119+ messages in thread
From: Jan Beulich @ 2019-03-12 14:09 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

This intentionally does not touch hooks used rarely (or not at all)
during the lifetime of a VM, except perhaps where one sits on an error
path next to a call which does get changed (in which case I think the
error path is better kept consistent with the respective main path).
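
The call-site pattern, for reference -- iommu_call() compiles to a
patched direct call in release x86 builds (NDEBUG) and falls back to a
plain indirect call everywhere else:

    rc = hd->platform_ops->unmap_page(d, dfn, flush_flags);              /* before */
    rc = iommu_call(hd->platform_ops, unmap_page, d, dfn, flush_flags);  /* after  */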

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
v7: Re-base once again.
v6: Re-base.
v5: Re-base over type-safe changes and dropped IOMMU_MIXED patch. Also
    patch the new lookup_page() hook.
v4: New.

--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -307,8 +307,8 @@ int iommu_map(struct domain *d, dfn_t df
 
     for ( i = 0; i < (1ul << page_order); i++ )
     {
-        rc = hd->platform_ops->map_page(d, dfn_add(dfn, i), mfn_add(mfn, i),
-                                        flags, flush_flags);
+        rc = iommu_call(hd->platform_ops, map_page, d, dfn_add(dfn, i),
+                        mfn_add(mfn, i), flags, flush_flags);
 
         if ( likely(!rc) )
             continue;
@@ -321,8 +321,8 @@ int iommu_map(struct domain *d, dfn_t df
 
         while ( i-- )
             /* if statement to satisfy __must_check */
-            if ( hd->platform_ops->unmap_page(d, dfn_add(dfn, i),
-                                              flush_flags) )
+            if ( iommu_call(hd->platform_ops, unmap_page, d, dfn_add(dfn, i),
+                            flush_flags) )
                 continue;
 
         if ( !is_hardware_domain(d) )
@@ -366,8 +366,8 @@ int iommu_unmap(struct domain *d, dfn_t
 
     for ( i = 0; i < (1ul << page_order); i++ )
     {
-        int err = hd->platform_ops->unmap_page(d, dfn_add(dfn, i),
-                                               flush_flags);
+        int err = iommu_call(hd->platform_ops, unmap_page, d, dfn_add(dfn, i),
+                             flush_flags);
 
         if ( likely(!err) )
             continue;
@@ -415,7 +415,7 @@ int iommu_lookup_page(struct domain *d,
     if ( !iommu_enabled || !hd->platform_ops )
         return -EOPNOTSUPP;
 
-    return hd->platform_ops->lookup_page(d, dfn, mfn, flags);
+    return iommu_call(hd->platform_ops, lookup_page, d, dfn, mfn, flags);
 }
 
 static void iommu_free_pagetables(unsigned long unused)
@@ -428,7 +428,7 @@ static void iommu_free_pagetables(unsign
         spin_unlock(&iommu_pt_cleanup_lock);
         if ( !pg )
             return;
-        iommu_get_ops()->free_page_table(pg);
+        iommu_vcall(iommu_get_ops(), free_page_table, pg);
     } while ( !softirq_pending(smp_processor_id()) );
 
     tasklet_schedule_on_cpu(&iommu_pt_cleanup_tasklet,
@@ -448,7 +448,8 @@ int iommu_iotlb_flush(struct domain *d,
     if ( dfn_eq(dfn, INVALID_DFN) )
         return -EINVAL;
 
-    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count, flush_flags);
+    rc = iommu_call(hd->platform_ops, iotlb_flush, d, dfn, page_count,
+                    flush_flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -476,7 +477,7 @@ int iommu_iotlb_flush_all(struct domain
      * The operation does a full flush so we don't need to pass the
      * flush_flags in.
      */
-    rc = hd->platform_ops->iotlb_flush_all(d);
+    rc = iommu_call(hd->platform_ops, iotlb_flush_all, d);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1337,14 +1337,14 @@ int iommu_update_ire_from_msi(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     return iommu_intremap
-           ? iommu_get_ops()->update_ire_from_msi(msi_desc, msg) : 0;
+           ? iommu_call(&iommu_ops, update_ire_from_msi, msi_desc, msg) : 0;
 }
 
 void iommu_read_msi_from_ire(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     if ( iommu_intremap )
-        iommu_get_ops()->read_msi_from_ire(msi_desc, msg);
+        iommu_vcall(&iommu_ops, read_msi_from_ire, msi_desc, msg);
 }
 
 static int iommu_add_device(struct pci_dev *pdev)
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -28,14 +28,12 @@ struct iommu_ops iommu_ops;
 void iommu_update_ire_from_apic(
     unsigned int apic, unsigned int reg, unsigned int value)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    ops->update_ire_from_apic(apic, reg, value);
+    iommu_vcall(&iommu_ops, update_ire_from_apic, apic, reg, value);
 }
 
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg)
 {
-    const struct iommu_ops *ops = iommu_get_ops();
-    return ops->read_apic_from_ire(apic, reg);
+    return iommu_call(&iommu_ops, read_apic_from_ire, apic, reg);
 }
 
 int __init iommu_setup_hpet_msi(struct msi_desc *msi)
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -61,6 +61,12 @@ int amd_iov_detect(void);
 
 extern struct iommu_ops iommu_ops;
 
+#ifdef NDEBUG
+# include <asm/alternative.h>
+# define iommu_call(ops, fn, args...)  alternative_call(iommu_ops.fn, ## args)
+# define iommu_vcall(ops, fn, args...) alternative_vcall(iommu_ops.fn, ## args)
+#endif
+
 static inline const struct iommu_ops *iommu_get_ops(void)
 {
     BUG_ON(!iommu_ops.init);
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -236,6 +236,11 @@ struct iommu_ops {
 
 #include <asm/iommu.h>
 
+#ifndef iommu_call
+# define iommu_call(ops, fn, args...) ((ops)->fn(args))
+# define iommu_vcall iommu_call
+#endif
+
 enum iommu_status
 {
     IOMMU_STATUS_disabled,





^ permalink raw reply	[flat|nested] 119+ messages in thread

Thread overview: 119+ messages
2018-09-11 13:26 [PATCH v3 0/9] x86: indirect call overhead reduction Jan Beulich
2018-09-11 13:32 ` [PATCH v3 1/9] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
2018-09-21 10:49   ` Wei Liu
2018-09-21 11:47     ` Jan Beulich
2018-09-21 13:48       ` Wei Liu
2018-09-21 15:26         ` Jan Beulich
2018-09-26 11:06           ` Wei Liu
2018-09-11 13:32 ` [PATCH v3 2/9] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
2018-09-21 10:50   ` Wei Liu
2018-09-11 13:33 ` [PATCH v3 3/9] x86/HVM: patch vINTR " Jan Beulich
2018-09-21 10:50   ` Wei Liu
2018-09-11 13:34 ` [PATCH v3 4/9] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
2018-09-21 10:51   ` Wei Liu
2018-09-11 13:35 ` [PATCH v3 5/9] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
2018-09-21 10:53   ` Wei Liu
2018-09-11 13:35 ` [PATCH v3 6/9] x86/genapic: patch indirect calls to direct ones Jan Beulich
2018-09-21 11:03   ` Wei Liu
2018-09-21 11:53     ` Jan Beulich
2018-09-21 13:55   ` Wei Liu
2018-09-11 13:35 ` [PATCH v3 7/9] x86/cpuidle: patch some " Jan Beulich
2018-09-21 14:01   ` Wei Liu
2018-09-11 13:37 ` [PATCH v3 8/9] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
2018-09-21 14:06   ` Wei Liu
2018-09-11 13:37 ` [PATCH v3 9/9] cpufreq: patch target() indirect call to direct one Jan Beulich
2018-09-21 14:06   ` Wei Liu
2018-10-02 10:09 ` [PATCH v4 00/12] x86: indirect call overhead reduction Jan Beulich
2018-10-02 10:12   ` [PATCH v4 01/12] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
2018-10-02 13:21     ` Andrew Cooper
2018-10-02 13:28       ` Julien Grall
2018-10-02 13:35         ` Andrew Cooper
2018-10-02 13:36           ` Julien Grall
2018-10-02 14:06       ` Jan Beulich
2018-10-02 14:23         ` Andrew Cooper
2018-10-02 14:43           ` Jan Beulich
2018-10-02 15:40             ` Andrew Cooper
2018-10-02 16:06               ` Jan Beulich
2018-10-02 13:55     ` Wei Liu
2018-10-02 14:08       ` Jan Beulich
2018-10-03 18:38     ` Andrew Cooper
2018-10-05 12:39       ` Andrew Cooper
2018-10-05 13:43         ` Jan Beulich
2018-10-05 14:49           ` Andrew Cooper
2018-10-05 15:05             ` Jan Beulich
2018-10-29 11:01       ` Jan Beulich
2018-10-02 10:12   ` [PATCH v4 02/12] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
2018-10-02 13:18     ` Paul Durrant
2018-10-03 18:55     ` Andrew Cooper
2018-10-04 10:19       ` Jan Beulich
2018-10-02 10:13   ` [PATCH v4 03/12] x86/HVM: patch vINTR " Jan Beulich
2018-10-03 19:01     ` Andrew Cooper
2018-10-02 10:13   ` [PATCH v4 04/12] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
2018-10-03 19:01     ` Andrew Cooper
2018-10-02 10:14   ` [PATCH v4 05/12] x86/genapic: remove indirection from genapic hook accesses Jan Beulich
2018-10-03 19:04     ` Andrew Cooper
2018-10-02 10:14   ` [PATCH v4 06/12] x86/genapic: patch indirect calls to direct ones Jan Beulich
2018-10-03 19:07     ` Andrew Cooper
2018-10-02 10:15   ` [PATCH v4 07/12] x86/cpuidle: patch some " Jan Beulich
2018-10-04 10:35     ` Andrew Cooper
2018-10-02 10:16   ` [PATCH v4 08/12] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
2018-10-04 10:36     ` Andrew Cooper
2018-10-02 10:16   ` [PATCH v4 09/12] cpufreq: patch target() indirect call to direct one Jan Beulich
2018-10-04 10:36     ` Andrew Cooper
2018-10-02 10:18   ` [PATCH v4 10/12] IOMMU: introduce IOMMU_MIXED config option Jan Beulich
2018-10-02 10:38     ` Julien Grall
2018-10-02 10:42       ` Jan Beulich
2018-10-02 11:00         ` Julien Grall
2018-10-02 11:58           ` Jan Beulich
2018-10-02 12:58             ` Julien Grall
2018-11-06 15:48       ` Jan Beulich
2018-11-07 18:01         ` Julien Grall
2018-10-02 10:18   ` [PATCH v4 11/12] IOMMU: remove indirection from certain IOMMU hook accesses Jan Beulich
2018-10-02 10:19   ` [PATCH v4 12/12] IOMMU: patch certain indirect calls to direct ones Jan Beulich
2018-11-08 15:56 ` [PATCH v5 00/13] x86: indirect call overhead reduction Jan Beulich
2018-11-08 16:05   ` [PATCH v5 01/13] x86: reduce general stack alignment to 8 Jan Beulich
2018-11-29 14:54     ` Wei Liu
2018-11-29 15:03       ` Jan Beulich
2018-11-29 17:44     ` Wei Liu
2018-11-30  9:03       ` Jan Beulich
2018-12-03 11:29         ` Wei Liu
2018-11-08 16:06   ` [PATCH v5 02/13] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
2018-11-29 17:13     ` Wei Liu
2018-11-08 16:08   ` [PATCH v5 03/13] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
2018-11-12 10:36     ` Jan Beulich
2018-11-08 16:09   ` [PATCH v5 04/13] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
2018-11-08 16:09   ` [PATCH v5 05/13] x86/HVM: patch vINTR " Jan Beulich
2018-11-08 16:10   ` [PATCH v5 06/13] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
2018-11-08 16:11   ` [PATCH v5 07/13] x86/genapic: patch indirect calls to direct ones Jan Beulich
2018-11-08 16:11   ` [PATCH v5 08/13] x86/cpuidle: patch some " Jan Beulich
2018-11-08 16:12   ` [PATCH v5 09/13] cpufreq: convert to a single post-init driver (hooks) instance Jan Beulich
2018-11-08 16:13   ` [PATCH v5 10/13] cpufreq: patch target() indirect call to direct one Jan Beulich
2018-11-08 16:14   ` [PATCH v5 11/13] IOMMU: move inclusion point of asm/iommu.h Jan Beulich
2018-11-12 11:55     ` Julien Grall
2018-11-08 16:16   ` [PATCH v5 12/13] IOMMU/x86: remove indirection from certain IOMMU hook accesses Jan Beulich
2018-11-14  3:25     ` Tian, Kevin
2018-11-14 17:16     ` Woods, Brian
2018-11-08 16:17   ` [PATCH v5 13/13] IOMMU: patch certain indirect calls to direct ones Jan Beulich
2018-11-29 14:49     ` Wei Liu
2018-12-05 15:54 ` [PATCH v6 00/10] x86: indirect call overhead reduction Jan Beulich
2018-12-05 16:02   ` [PATCH v6 01/10] x86: reduce general stack alignment to 8 Jan Beulich
2018-12-05 16:02   ` [PATCH v6 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
2018-12-05 16:03   ` [PATCH v6 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
2018-12-05 16:04   ` [PATCH v6 04/10] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
2018-12-05 16:05   ` [PATCH v6 05/10] x86/HVM: patch vINTR " Jan Beulich
2018-12-05 16:06   ` [PATCH v6 06/10] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
2018-12-05 16:06   ` [PATCH v6 07/10] x86/genapic: patch indirect calls to direct ones Jan Beulich
2018-12-05 16:07   ` [PATCH v6 08/10] x86/cpuidle: patch some " Jan Beulich
2018-12-05 16:07   ` [PATCH v6 09/10] cpufreq: patch target() indirect call to direct one Jan Beulich
2018-12-05 16:08   ` [PATCH v6 10/10] IOMMU: patch certain indirect calls to direct ones Jan Beulich
     [not found] ` <5C07F49D0200000000101036@prv1-mh.provo.novell.com>
     [not found]   ` <5C07F49D020000780021DC1A@prv1-mh.provo.novell.com>
2019-03-12 13:59     ` [PATCH v7 00/10] x86: indirect call overhead reduction Jan Beulich
2019-03-12 14:03       ` [PATCH v7 01/10] x86: reduce general stack alignment to 8 Jan Beulich
2019-03-12 14:04       ` [PATCH v7 02/10] x86: clone Linux'es ASM_CALL_CONSTRAINT Jan Beulich
2019-03-12 14:05       ` [PATCH v7 03/10] x86: infrastructure to allow converting certain indirect calls to direct ones Jan Beulich
2019-03-12 14:06       ` [PATCH v7 04/10] x86/HVM: patch indirect calls through hvm_funcs " Jan Beulich
2019-03-12 14:06       ` [PATCH v7 05/10] x86/HVM: patch vINTR " Jan Beulich
2019-03-12 14:07       ` [PATCH v7 06/10] x86: patch ctxt_switch_masking() indirect call to direct one Jan Beulich
2019-03-12 14:07       ` [PATCH v7 07/10] x86/genapic: patch indirect calls to direct ones Jan Beulich
2019-03-12 14:08       ` [PATCH v7 08/10] x86/cpuidle: patch some " Jan Beulich
2019-03-12 14:08       ` [PATCH v7 09/10] cpufreq: patch target() indirect call to direct one Jan Beulich
2019-03-12 14:09       ` [PATCH v7 10/10] IOMMU: patch certain indirect calls to direct ones Jan Beulich
