All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] x86/traps: Drop exception_table[]
@ 2021-11-19 18:21 Andrew Cooper
  2021-11-19 18:21 ` [PATCH 1/5] x86/traps: Collect PERFC_exceptions stats for IST vectors too Andrew Cooper
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Andrew Cooper @ 2021-11-19 18:21 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

Andrew Cooper (5):
  x86/traps: Collect PERFC_exceptions stats for IST vectors too
  x86/traps: Drop dummy_nmi_callback()
  x86/crash: Drop manual hooking of exception_table[]
  x86/traps: Drop exception_table[] and use if/else dispatching
  x86/traps: Clean up diagnostics

 xen/arch/x86/crash.c            | 15 +-------
 xen/arch/x86/extable.c          |  8 ++--
 xen/arch/x86/hvm/vmx/vmx.c      |  2 +-
 xen/arch/x86/pv/traps.c         |  6 +--
 xen/arch/x86/traps.c            | 83 +++++++++--------------------------------
 xen/arch/x86/x86_64/entry.S     | 74 ++++++++++++++++++++++++++++++++----
 xen/include/asm-x86/processor.h |  3 --
 xen/include/asm-x86/traps.h     |  2 +-
 8 files changed, 96 insertions(+), 97 deletions(-)

-- 
2.11.0



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/5] x86/traps: Collect PERFC_exceptions stats for IST vectors too
  2021-11-19 18:21 [PATCH 0/5] x86/traps: Drop exception_table[] Andrew Cooper
@ 2021-11-19 18:21 ` Andrew Cooper
  2021-11-22  8:50   ` Jan Beulich
  2021-11-19 18:21 ` [PATCH 2/5] x86/traps: Drop dummy_nmi_callback() Andrew Cooper
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-11-19 18:21 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

This causes NMIs, #DB and #MC to be counted, rather than being reported as 0.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/x86_64/entry.S | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index d5998acf8806..3caa5654768d 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -1005,6 +1005,13 @@ handle_ist_exception:
 #endif
         movq  %rsp,%rdi
         movzbl UREGS_entry_vector(%rsp),%eax
+
+#ifdef CONFIG_PERF_COUNTERS
+        lea   per_cpu__perfcounters(%rip), %rcx
+        add   STACK_CPUINFO_FIELD(per_cpu_offset)(%r14), %rcx
+        incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
+#endif
+
         leaq  exception_table(%rip),%rdx
         mov   (%rdx, %rax, 8), %rdx
         INDIRECT_CALL %rdx
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/5] x86/traps: Drop dummy_nmi_callback()
  2021-11-19 18:21 [PATCH 0/5] x86/traps: Drop exception_table[] Andrew Cooper
  2021-11-19 18:21 ` [PATCH 1/5] x86/traps: Collect PERFC_exceptions stats for IST vectors too Andrew Cooper
@ 2021-11-19 18:21 ` Andrew Cooper
  2021-11-22  8:51   ` Jan Beulich
  2021-11-19 18:21 ` [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[] Andrew Cooper
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-11-19 18:21 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

The unconditional nmi_callback() call in do_nmi() calls dummy_nmi_callback()
in all cases other than for a few specific and rare tasks (alternative
patching, microcode loading, etc).

Indirect calls are expensive under retpoline, so rearrange the logic to use
NULL as the default, and skip the call entirely in the common case.

While rearranging the code, fold the exit paths.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/traps.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index d483aa91f2f1..f526298e997d 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1872,29 +1872,23 @@ static void unknown_nmi_error(const struct cpu_user_regs *regs,
     }
 }
 
-static int dummy_nmi_callback(const struct cpu_user_regs *regs, int cpu)
-{
-    return 0;
-}
-
-static nmi_callback_t *nmi_callback = dummy_nmi_callback;
+static nmi_callback_t *__read_mostly nmi_callback;
 
 DEFINE_PER_CPU(unsigned int, nmi_count);
 
 void do_nmi(const struct cpu_user_regs *regs)
 {
     unsigned int cpu = smp_processor_id();
+    nmi_callback_t *callback;
     unsigned char reason = 0;
     bool handle_unknown = false;
 
     this_cpu(nmi_count)++;
     nmi_enter();
 
-    if ( nmi_callback(regs, cpu) )
-    {
-        nmi_exit();
-        return;
-    }
+    callback = ACCESS_ONCE(nmi_callback);
+    if ( unlikely(callback) && callback(regs, cpu) )
+        goto out;
 
     /*
      * Accessing port 0x61 may trap to SMM which has been actually
@@ -1921,6 +1915,7 @@ void do_nmi(const struct cpu_user_regs *regs)
             unknown_nmi_error(regs, reason);
     }
 
+ out:
     nmi_exit();
 }
 
@@ -1935,7 +1930,7 @@ nmi_callback_t *set_nmi_callback(nmi_callback_t *callback)
 
 void unset_nmi_callback(void)
 {
-    nmi_callback = dummy_nmi_callback;
+    nmi_callback = NULL;
 }
 
 bool nmi_check_continuation(void)
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[]
  2021-11-19 18:21 [PATCH 0/5] x86/traps: Drop exception_table[] Andrew Cooper
  2021-11-19 18:21 ` [PATCH 1/5] x86/traps: Collect PERFC_exceptions stats for IST vectors too Andrew Cooper
  2021-11-19 18:21 ` [PATCH 2/5] x86/traps: Drop dummy_nmi_callback() Andrew Cooper
@ 2021-11-19 18:21 ` Andrew Cooper
  2021-11-22  8:57   ` Jan Beulich
  2021-11-19 18:21 ` [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching Andrew Cooper
  2021-11-19 18:21 ` [PATCH 5/5] x86/traps: Clean up diagnostics Andrew Cooper
  4 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-11-19 18:21 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

NMI hooking in the crash path has undergone several revisions since its
introduction.  What we have now is not sufficiently different from the regular
nmi_callback() mechanism to warrant special casing.

Use set_nmi_callback() directly, and do away with patching a read-only data
structure via a read-write alias.  This also means that the
vmx_vmexit_handler() can and should call do_nmi() directly, rather than
indirecting through the exception table to pick up the crash path hook.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/crash.c       | 15 ++-------------
 xen/arch/x86/hvm/vmx/vmx.c |  2 +-
 xen/arch/x86/traps.c       |  5 +++++
 3 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index 0611b4fb9b09..f6264946a681 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -36,10 +36,8 @@ static unsigned int crashing_cpu;
 static DEFINE_PER_CPU_READ_MOSTLY(bool, crash_save_done);
 
 /* This becomes the NMI handler for non-crashing CPUs, when Xen is crashing. */
-static void noreturn do_nmi_crash(const struct cpu_user_regs *regs)
+static int noreturn do_nmi_crash(const struct cpu_user_regs *regs, int cpu)
 {
-    unsigned int cpu = smp_processor_id();
-
     stac();
 
     /* nmi_shootdown_cpus() should ensure that this assertion is correct. */
@@ -136,16 +134,7 @@ static void nmi_shootdown_cpus(void)
                     SYS_DESC_irq_gate, 0, &trap_nop);
     set_ist(&idt_tables[cpu][TRAP_machine_check], IST_NONE);
 
-    /*
-     * Ideally would be:
-     *   exception_table[TRAP_nmi] = &do_nmi_crash;
-     *
-     * but the exception_table is read only.  Access it via its directmap
-     * mappings.
-     */
-    write_atomic((unsigned long *)__va(__pa(&exception_table[TRAP_nmi])),
-                 (unsigned long)&do_nmi_crash);
-
+    set_nmi_callback(do_nmi_crash);
     smp_send_nmi_allbutself();
 
     msecs = 1000; /* Wait at most a second for the other cpus to stop */
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d403e2d8060a..37c31c08b984 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3887,7 +3887,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
              ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
               MASK_INSR(X86_EVENTTYPE_NMI, INTR_INFO_INTR_TYPE_MASK)) )
         {
-            exception_table[TRAP_nmi](regs);
+            do_nmi(regs);
             enable_nmis();
         }
         break;
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index f526298e997d..096a411fdf94 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1886,6 +1886,11 @@ void do_nmi(const struct cpu_user_regs *regs)
     this_cpu(nmi_count)++;
     nmi_enter();
 
+    /*
+     * Think carefully before putting any logic before this point.
+     * nmi_callback() might be the crash quiesce...
+     */
+
     callback = ACCESS_ONCE(nmi_callback);
     if ( unlikely(callback) && callback(regs, cpu) )
         goto out;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching
  2021-11-19 18:21 [PATCH 0/5] x86/traps: Drop exception_table[] Andrew Cooper
                   ` (2 preceding siblings ...)
  2021-11-19 18:21 ` [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[] Andrew Cooper
@ 2021-11-19 18:21 ` Andrew Cooper
  2021-11-22  9:04   ` Jan Beulich
  2021-11-23  9:05   ` Jan Beulich
  2021-11-19 18:21 ` [PATCH 5/5] x86/traps: Clean up diagnostics Andrew Cooper
  4 siblings, 2 replies; 18+ messages in thread
From: Andrew Cooper @ 2021-11-19 18:21 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

There is also a lot of redundancy in the table.  8 vectors head to do_trap(),
3 are handled in the IST logic, and that only leaves 7 others not heading to
the do_reserved_trap() catch-all.  This also removes the fragility that any
accidental NULL entry in the table becomes a ticking timebomb.

Function pointers are expensive under retpoline, and different vectors have
wildly different frequences.  Drop the indirect call, and use an if/else chain
instead, which is a code layout technique used by profile-guided optimsiation.

Using Xen's own perfcounter infrastructure, we see the following frequences of
vectors measured from boot until I can SSH into dom0 and collect the stats:

  vec | CFL-R   | Milan   | Notes
  ----+---------+---------+
  NMI |     345 |    3768 | Watchdog.  Milan has many more CPUs.
  ----+---------+---------+
  #PF | 1233234 | 2006441 |
  #GP |   90054 |   96193 |
  #UD |     848 |     851 |
  #NM |       0 |     132 | Per-vendor lazy vs eager FPU policy.
  #DB |      67 |      67 | No clue, but it's something in userspace.

Bloat-o-meter (after some manual insertion of ELF metadata) reports:

  add/remove: 0/1 grow/shrink: 2/0 up/down: 102/-256 (-154)
  Function                                     old     new   delta
  handle_exception_saved                       148     226     +78
  handle_ist_exception                         453     477     +24
  exception_table                              256       -    -256

showing that the if/else chains are less than half the size that
exception_table[] was in the first place.

As part of this change, make two other minor changes.  do_reserved_trap() is
renamed to do_unhandled_trap() because it is the catchall, and already covers
things that aren't reserved any more (#VE/#VC/#HV/#SX).

Furthermore, don't forward #TS to guests.  #TS is specifically for errors
relating to the Task State Segment, which is a Xen-owned structure, not a
guest-owned structure.  Even in the 32bit days, we never let guests register
their own Task State Segments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

Preliminary perf data says that there is no measurable difference with
bti-thunk=jmp.  bti-thunk=lfence shows up to a 3% improvement, and
bti-thunk=retpoline is expected to have a better improvement still.
---
 xen/arch/x86/traps.c            | 34 ++-------------------
 xen/arch/x86/x86_64/entry.S     | 67 ++++++++++++++++++++++++++++++++++++-----
 xen/include/asm-x86/processor.h |  3 --
 3 files changed, 62 insertions(+), 42 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 096a411fdf94..e82ab007abcf 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -135,36 +135,6 @@ const unsigned int nmi_cpu;
 #define stack_words_per_line 4
 #define ESP_BEFORE_EXCEPTION(regs) ((unsigned long *)regs->rsp)
 
-static void do_trap(struct cpu_user_regs *regs);
-static void do_reserved_trap(struct cpu_user_regs *regs);
-
-void (* const exception_table[TRAP_nr])(struct cpu_user_regs *regs) = {
-    [TRAP_divide_error]                 = do_trap,
-    [TRAP_debug]                        = do_debug,
-    [TRAP_nmi]                          = (void *)do_nmi,
-    [TRAP_int3]                         = do_int3,
-    [TRAP_overflow]                     = do_trap,
-    [TRAP_bounds]                       = do_trap,
-    [TRAP_invalid_op]                   = do_invalid_op,
-    [TRAP_no_device]                    = do_device_not_available,
-    [TRAP_double_fault]                 = do_reserved_trap,
-    [TRAP_copro_seg]                    = do_reserved_trap,
-    [TRAP_invalid_tss]                  = do_trap,
-    [TRAP_no_segment]                   = do_trap,
-    [TRAP_stack_error]                  = do_trap,
-    [TRAP_gp_fault]                     = do_general_protection,
-    [TRAP_page_fault]                   = do_page_fault,
-    [TRAP_spurious_int]                 = do_reserved_trap,
-    [TRAP_copro_error]                  = do_trap,
-    [TRAP_alignment_check]              = do_trap,
-    [TRAP_machine_check]                = (void *)do_machine_check,
-    [TRAP_simd_error]                   = do_trap,
-    [TRAP_virtualisation]               = do_reserved_trap,
-    [X86_EXC_CP]                        = do_entry_CP,
-    [X86_EXC_CP + 1 ...
-     (ARRAY_SIZE(exception_table) - 1)] = do_reserved_trap,
-};
-
 void show_code(const struct cpu_user_regs *regs)
 {
     unsigned char insns_before[8] = {}, insns_after[16] = {};
@@ -881,7 +851,7 @@ void fatal_trap(const struct cpu_user_regs *regs, bool show_remote)
           (regs->eflags & X86_EFLAGS_IF) ? "" : " IN INTERRUPT CONTEXT");
 }
 
-static void do_reserved_trap(struct cpu_user_regs *regs)
+void do_unhandled_trap(struct cpu_user_regs *regs)
 {
     unsigned int trapnr = regs->entry_vector;
 
@@ -973,7 +943,7 @@ static bool extable_fixup(struct cpu_user_regs *regs, bool print)
     return true;
 }
 
-static void do_trap(struct cpu_user_regs *regs)
+void do_trap(struct cpu_user_regs *regs)
 {
     unsigned int trapnr = regs->entry_vector;
 
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 3caa5654768d..3eaf0e67b2b9 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -773,14 +773,48 @@ handle_exception_saved:
         sti
 1:      movq  %rsp,%rdi
         movzbl UREGS_entry_vector(%rsp),%eax
-        leaq  exception_table(%rip),%rdx
 #ifdef CONFIG_PERF_COUNTERS
         lea   per_cpu__perfcounters(%rip), %rcx
         add   STACK_CPUINFO_FIELD(per_cpu_offset)(%r14), %rcx
         incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
 #endif
-        mov   (%rdx, %rax, 8), %rdx
-        INDIRECT_CALL %rdx
+
+        /*
+         * Dispatch to appropriate C handlers.
+         *
+         * The logic is implemented as an if/else chain.  DISPATCH() calls
+         * need be in frequency order for best performance.
+         */
+#define DISPATCH(vec, handler)         \
+        cmp   $vec, %al;               \
+        jne   .L_ ## vec ## _done;     \
+        call  handler;                 \
+        jmp   .L_exn_dispatch_done;    \
+.L_ ## vec ## _done:
+
+        DISPATCH(X86_EXC_PF, do_page_fault)
+        DISPATCH(X86_EXC_GP, do_general_protection)
+        DISPATCH(X86_EXC_UD, do_invalid_op)
+        DISPATCH(X86_EXC_NM, do_device_not_available)
+        DISPATCH(X86_EXC_BP, do_int3)
+
+        /* Logically "if ( (1 << vec) & MASK ) { do_trap(); }" */
+        mov   $(1 << X86_EXC_DE) | (1 << X86_EXC_OF) | (1 << X86_EXC_BR) |\
+               (1 << X86_EXC_NP) | (1 << X86_EXC_SS) | (1 << X86_EXC_MF) |\
+               (1 << X86_EXC_AC) | (1 << X86_EXC_XM), %edx
+        bt    %eax, %edx
+        jnc   .L_do_trap_done
+        call  do_trap
+        jmp   .L_exn_dispatch_done
+.L_do_trap_done:
+
+        DISPATCH(X86_EXC_CP, do_entry_CP)
+#undef DISPATCH
+
+        call  do_unhandled_trap
+        BUG   /* do_unhandled_trap() shouldn't return. */
+
+.L_exn_dispatch_done:
         mov   %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         mov   %r13b, STACK_CPUINFO_FIELD(use_pv_cr3)(%r14)
 #ifdef CONFIG_PV
@@ -1012,9 +1046,28 @@ handle_ist_exception:
         incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
 #endif
 
-        leaq  exception_table(%rip),%rdx
-        mov   (%rdx, %rax, 8), %rdx
-        INDIRECT_CALL %rdx
+        /*
+         * Dispatch to appropriate C handlers.
+         *
+         * The logic is implemented as an if/else chain.  DISPATCH() calls
+         * need be in frequency order for best performance.
+         */
+#define DISPATCH(vec, handler)         \
+        cmp   $vec, %al;               \
+        jne   .L_ ## vec ## _done;     \
+        call  handler;                 \
+        jmp   .L_ist_dispatch_done;    \
+.L_ ## vec ## _done:
+
+        DISPATCH(X86_EXC_NMI, do_nmi)
+        DISPATCH(X86_EXC_DB,  do_debug)
+        DISPATCH(X86_EXC_MC,  do_machine_check)
+#undef DISPATCH
+
+        call  do_unhandled_trap
+        BUG   /* do_unhandled_trap() shouldn't return. */
+
+.L_ist_dispatch_done:
         mov   %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         mov   %bl, STACK_CPUINFO_FIELD(use_pv_cr3)(%r14)
         cmpb  $TRAP_nmi,UREGS_entry_vector(%rsp)
@@ -1088,7 +1141,7 @@ autogen_stubs: /* Automatically generated stubs. */
 
         entrypoint 1b
 
-        /* Reserved exceptions, heading towards do_reserved_trap(). */
+        /* Reserved exceptions, heading towards do_unhandled_trap(). */
         .elseif vec == X86_EXC_CSO || vec == X86_EXC_SPV || \
                 vec == X86_EXC_VE  || (vec > X86_EXC_CP && vec < TRAP_nr)
 
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index bc4dc6925372..f031a050cba0 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -506,9 +506,6 @@ extern void mtrr_bp_init(void);
 
 void mcheck_init(struct cpuinfo_x86 *c, bool_t bsp);
 
-/* Dispatch table for exceptions */
-extern void (* const exception_table[TRAP_nr])(struct cpu_user_regs *regs);
-
 #define DECLARE_TRAP_HANDLER(_name)                    \
     void _name(void);                                  \
     void do_ ## _name(struct cpu_user_regs *regs)
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/5] x86/traps: Clean up diagnostics
  2021-11-19 18:21 [PATCH 0/5] x86/traps: Drop exception_table[] Andrew Cooper
                   ` (3 preceding siblings ...)
  2021-11-19 18:21 ` [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching Andrew Cooper
@ 2021-11-19 18:21 ` Andrew Cooper
  2021-11-22  9:08   ` Jan Beulich
  4 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-11-19 18:21 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu

do{_reserved,}_trap() should use fatal_trap() rather than opencoding part of
it.  This lets the remote stack trace logic work in more fatal error
conditions.

With do_trap() converted, there is only one single user of trapstr()
remaining.  Tweak the formatting in pv_inject_event(), and remove trapstr()
entirely.

Take the opportunity of exporting vec_name() to improve the diagnostics in
stub_selftest().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/extable.c      |  8 +++++---
 xen/arch/x86/pv/traps.c     |  6 +++---
 xen/arch/x86/traps.c        | 25 +++----------------------
 xen/include/asm-x86/traps.h |  2 +-
 4 files changed, 12 insertions(+), 29 deletions(-)

diff --git a/xen/arch/x86/extable.c b/xen/arch/x86/extable.c
index 109ab7da9811..3cb0352abe5f 100644
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -124,6 +124,8 @@ search_exception_table(const struct cpu_user_regs *regs)
 }
 
 #ifndef NDEBUG
+#include <asm/traps.h>
+
 static int __init stub_selftest(void)
 {
     static const struct {
@@ -172,10 +174,10 @@ static int __init stub_selftest(void)
         if ( res.raw != tests[i].res.raw )
         {
             printk("Selftest %u failed: Opc %*ph "
-                   "expected %u[%04x], got %u[%04x]\n",
+                   "expected %s[%04x], got %s[%04x]\n",
                    i, (int)ARRAY_SIZE(tests[i].opc), tests[i].opc,
-                   tests[i].res.fields.trapnr, tests[i].res.fields.ec,
-                   res.fields.trapnr, res.fields.ec);
+                   vec_name(tests[i].res.fields.trapnr), tests[i].res.fields.ec,
+                   vec_name(res.fields.trapnr), res.fields.ec);
 
             fail = true;
         }
diff --git a/xen/arch/x86/pv/traps.c b/xen/arch/x86/pv/traps.c
index 1e05a9f1cdad..1d1cb0784c98 100644
--- a/xen/arch/x86/pv/traps.c
+++ b/xen/arch/x86/pv/traps.c
@@ -89,9 +89,9 @@ void pv_inject_event(const struct x86_event *event)
 
     if ( unlikely(null_trap_bounce(curr, tb)) )
     {
-        gprintk(XENLOG_WARNING,
-                "Unhandled %s fault/trap [#%d, ec=%04x]\n",
-                trapstr(vector), vector, error_code);
+        gprintk(XENLOG_ERR,
+                "Unhandled: vec %u, %s[%04x]\n",
+                vector, vec_name(vector), error_code);
 
         if ( vector == TRAP_page_fault )
             show_page_walk(event->cr2);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index e82ab007abcf..7a178f030c73 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -759,21 +759,7 @@ static int nmi_show_execution_state(const struct cpu_user_regs *regs, int cpu)
     return 1;
 }
 
-const char *trapstr(unsigned int trapnr)
-{
-    static const char * const strings[] = {
-        "divide error", "debug", "nmi", "bkpt", "overflow", "bounds",
-        "invalid opcode", "device not available", "double fault",
-        "coprocessor segment", "invalid tss", "segment not found",
-        "stack error", "general protection fault", "page fault",
-        "spurious interrupt", "coprocessor error", "alignment check",
-        "machine check", "simd error", "virtualisation exception"
-    };
-
-    return trapnr < ARRAY_SIZE(strings) ? strings[trapnr] : "???";
-}
-
-static const char *vec_name(unsigned int vec)
+const char *vec_name(unsigned int vec)
 {
     static const char names[][4] = {
 #define P(x) [X86_EXC_ ## x] = "#" #x
@@ -858,9 +844,7 @@ void do_unhandled_trap(struct cpu_user_regs *regs)
     if ( debugger_trap_fatal(trapnr, regs) )
         return;
 
-    show_execution_state(regs);
-    panic("FATAL RESERVED TRAP: vec %u, %s[%04x]\n",
-          trapnr, vec_name(trapnr), regs->error_code);
+    fatal_trap(regs, false);
 }
 
 static void fixup_exception_return(struct cpu_user_regs *regs,
@@ -970,10 +954,7 @@ void do_trap(struct cpu_user_regs *regs)
     if ( debugger_trap_fatal(trapnr, regs) )
         return;
 
-    show_execution_state(regs);
-    panic("FATAL TRAP: vector = %d (%s)\n"
-          "[error_code=%04x]\n",
-          trapnr, trapstr(trapnr), regs->error_code);
+    fatal_trap(regs, false);
 }
 
 int guest_rdmsr_xen(const struct vcpu *v, uint32_t idx, uint64_t *val)
diff --git a/xen/include/asm-x86/traps.h b/xen/include/asm-x86/traps.h
index ec23d3a70b36..3c5feac5e4b1 100644
--- a/xen/include/asm-x86/traps.h
+++ b/xen/include/asm-x86/traps.h
@@ -19,7 +19,7 @@
 #ifndef ASM_TRAP_H
 #define ASM_TRAP_H
 
-const char *trapstr(unsigned int trapnr);
+const char *vec_name(unsigned int vec);
 
 #endif /* ASM_TRAP_H */
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] x86/traps: Collect PERFC_exceptions stats for IST vectors too
  2021-11-19 18:21 ` [PATCH 1/5] x86/traps: Collect PERFC_exceptions stats for IST vectors too Andrew Cooper
@ 2021-11-22  8:50   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-11-22  8:50 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 19.11.2021 19:21, Andrew Cooper wrote:
> This causes NMIs, #DB and #MC to be counted, rather than being reported as 0.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/5] x86/traps: Drop dummy_nmi_callback()
  2021-11-19 18:21 ` [PATCH 2/5] x86/traps: Drop dummy_nmi_callback() Andrew Cooper
@ 2021-11-22  8:51   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-11-22  8:51 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 19.11.2021 19:21, Andrew Cooper wrote:
> The unconditional nmi_callback() call in do_nmi() calls dummy_nmi_callback()
> in all cases other than for a few specific and rare tasks (alternative
> patching, microcode loading, etc).
> 
> Indirect calls are expensive under retpoline, so rearrange the logic to use
> NULL as the default, and skip the call entirely in the common case.
> 
> While rearranging the code, fold the exit paths.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[]
  2021-11-19 18:21 ` [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[] Andrew Cooper
@ 2021-11-22  8:57   ` Jan Beulich
  2021-11-22 13:48     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-11-22  8:57 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 19.11.2021 19:21, Andrew Cooper wrote:
> NMI hooking in the crash path has undergone several revisions since its
> introduction.  What we have now is not sufficiently different from the regular
> nmi_callback() mechanism to warrant special casing.
> 
> Use set_nmi_callback() directly, and do away with patching a read-only data
> structure via a read-write alias.  This also means that the
> vmx_vmexit_handler() can and should call do_nmi() directly, rather than
> indirecting through the exception table to pick up the crash path hook.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one remark / concern:

> --- a/xen/arch/x86/crash.c
> +++ b/xen/arch/x86/crash.c
> @@ -36,10 +36,8 @@ static unsigned int crashing_cpu;
>  static DEFINE_PER_CPU_READ_MOSTLY(bool, crash_save_done);
>  
>  /* This becomes the NMI handler for non-crashing CPUs, when Xen is crashing. */
> -static void noreturn do_nmi_crash(const struct cpu_user_regs *regs)
> +static int noreturn do_nmi_crash(const struct cpu_user_regs *regs, int cpu)
>  {
> -    unsigned int cpu = smp_processor_id();
> -
>      stac();
>  
>      /* nmi_shootdown_cpus() should ensure that this assertion is correct. */

Looks like this is the first instance of a noreturn function returning non-void.
Are you sufficiently certain that (older) compilers won't complain about missing
return statements (with a value)?

Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching
  2021-11-19 18:21 ` [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching Andrew Cooper
@ 2021-11-22  9:04   ` Jan Beulich
  2021-11-22 16:16     ` Andrew Cooper
  2021-11-23  9:05   ` Jan Beulich
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-11-22  9:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 19.11.2021 19:21, Andrew Cooper wrote:
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -773,14 +773,48 @@ handle_exception_saved:
>          sti
>  1:      movq  %rsp,%rdi
>          movzbl UREGS_entry_vector(%rsp),%eax
> -        leaq  exception_table(%rip),%rdx
>  #ifdef CONFIG_PERF_COUNTERS
>          lea   per_cpu__perfcounters(%rip), %rcx
>          add   STACK_CPUINFO_FIELD(per_cpu_offset)(%r14), %rcx
>          incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
>  #endif
> -        mov   (%rdx, %rax, 8), %rdx
> -        INDIRECT_CALL %rdx
> +
> +        /*
> +         * Dispatch to appropriate C handlers.
> +         *
> +         * The logic is implemented as an if/else chain.  DISPATCH() calls
> +         * need be in frequency order for best performance.
> +         */
> +#define DISPATCH(vec, handler)         \
> +        cmp   $vec, %al;               \
> +        jne   .L_ ## vec ## _done;     \
> +        call  handler;                 \
> +        jmp   .L_exn_dispatch_done;    \
> +.L_ ## vec ## _done:
> +
> +        DISPATCH(X86_EXC_PF, do_page_fault)
> +        DISPATCH(X86_EXC_GP, do_general_protection)
> +        DISPATCH(X86_EXC_UD, do_invalid_op)
> +        DISPATCH(X86_EXC_NM, do_device_not_available)
> +        DISPATCH(X86_EXC_BP, do_int3)
> +
> +        /* Logically "if ( (1 << vec) & MASK ) { do_trap(); }" */
> +        mov   $(1 << X86_EXC_DE) | (1 << X86_EXC_OF) | (1 << X86_EXC_BR) |\
> +               (1 << X86_EXC_NP) | (1 << X86_EXC_SS) | (1 << X86_EXC_MF) |\
> +               (1 << X86_EXC_AC) | (1 << X86_EXC_XM), %edx
> +        bt    %eax, %edx
> +        jnc   .L_do_trap_done
> +        call  do_trap
> +        jmp   .L_exn_dispatch_done
> +.L_do_trap_done:
> +
> +        DISPATCH(X86_EXC_CP, do_entry_CP)
> +#undef DISPATCH

The basis for the choice of ordering imo wants spelling out here. For example,
despite the data provided in the description I'm not really convinced #BP
wants handling ahead of everything going to do_trap().

> @@ -1012,9 +1046,28 @@ handle_ist_exception:
>          incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
>  #endif
>  
> -        leaq  exception_table(%rip),%rdx
> -        mov   (%rdx, %rax, 8), %rdx
> -        INDIRECT_CALL %rdx
> +        /*
> +         * Dispatch to appropriate C handlers.
> +         *
> +         * The logic is implemented as an if/else chain.  DISPATCH() calls
> +         * need be in frequency order for best performance.
> +         */
> +#define DISPATCH(vec, handler)         \
> +        cmp   $vec, %al;               \
> +        jne   .L_ ## vec ## _done;     \
> +        call  handler;                 \
> +        jmp   .L_ist_dispatch_done;    \
> +.L_ ## vec ## _done:
> +
> +        DISPATCH(X86_EXC_NMI, do_nmi)
> +        DISPATCH(X86_EXC_DB,  do_debug)
> +        DISPATCH(X86_EXC_MC,  do_machine_check)
> +#undef DISPATCH
> +
> +        call  do_unhandled_trap
> +        BUG   /* do_unhandled_trap() shouldn't return. */

While I agree with putting BUG here, I don't see the need for the CALL.
Unlike in handle_exception there's no vector left unhandled by the
DISPATCH() invocations. The CALL being there give the (wrong) impression
there would / might be.

Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] x86/traps: Clean up diagnostics
  2021-11-19 18:21 ` [PATCH 5/5] x86/traps: Clean up diagnostics Andrew Cooper
@ 2021-11-22  9:08   ` Jan Beulich
  2021-11-22 16:26     ` Andrew Cooper
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2021-11-22  9:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 19.11.2021 19:21, Andrew Cooper wrote:
> do{_reserved,}_trap() should use fatal_trap() rather than opencoding part of

Nit: That's do{_unhandled,}_trap() now.

> it.  This lets the remote stack trace logic work in more fatal error
> conditions.
> 
> With do_trap() converted, there is only one single user of trapstr()
> remaining.  Tweak the formatting in pv_inject_event(), and remove trapstr()
> entirely.
> 
> Take the opportunity of exporting vec_name() to improve the diagnostics in
> stub_selftest().
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit with one further aspect to consider:

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -759,21 +759,7 @@ static int nmi_show_execution_state(const struct cpu_user_regs *regs, int cpu)
>      return 1;
>  }
>  
> -const char *trapstr(unsigned int trapnr)
> -{
> -    static const char * const strings[] = {
> -        "divide error", "debug", "nmi", "bkpt", "overflow", "bounds",
> -        "invalid opcode", "device not available", "double fault",
> -        "coprocessor segment", "invalid tss", "segment not found",
> -        "stack error", "general protection fault", "page fault",
> -        "spurious interrupt", "coprocessor error", "alignment check",
> -        "machine check", "simd error", "virtualisation exception"
> -    };
> -
> -    return trapnr < ARRAY_SIZE(strings) ? strings[trapnr] : "???";
> -}
> -
> -static const char *vec_name(unsigned int vec)
> +const char *vec_name(unsigned int vec)

Is this perhaps too ambiguous a name for a non-static function? exn_vec_name()
at least, maybe?

Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[]
  2021-11-22  8:57   ` Jan Beulich
@ 2021-11-22 13:48     ` Andrew Cooper
  2021-11-22 14:51       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-11-22 13:48 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 22/11/2021 08:57, Jan Beulich wrote:
> On 19.11.2021 19:21, Andrew Cooper wrote:
>> --- a/xen/arch/x86/crash.c
>> +++ b/xen/arch/x86/crash.c
>> @@ -36,10 +36,8 @@ static unsigned int crashing_cpu;
>>  static DEFINE_PER_CPU_READ_MOSTLY(bool, crash_save_done);
>>  
>>  /* This becomes the NMI handler for non-crashing CPUs, when Xen is crashing. */
>> -static void noreturn do_nmi_crash(const struct cpu_user_regs *regs)
>> +static int noreturn do_nmi_crash(const struct cpu_user_regs *regs, int cpu)
>>  {
>> -    unsigned int cpu = smp_processor_id();
>> -
>>      stac();
>>  
>>      /* nmi_shootdown_cpus() should ensure that this assertion is correct. */
> Looks like this is the first instance of a noreturn function returning non-void.
> Are you sufficiently certain that (older) compilers won't complain about missing
> return statements (with a value)?

Yes.  https://godbolt.org/z/8a1efoh39

~Andrew


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[]
  2021-11-22 13:48     ` Andrew Cooper
@ 2021-11-22 14:51       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-11-22 14:51 UTC (permalink / raw)
  To: Andrew Cooper, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 22.11.2021 14:48, Andrew Cooper wrote:
> On 22/11/2021 08:57, Jan Beulich wrote:
>> On 19.11.2021 19:21, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/crash.c
>>> +++ b/xen/arch/x86/crash.c
>>> @@ -36,10 +36,8 @@ static unsigned int crashing_cpu;
>>>  static DEFINE_PER_CPU_READ_MOSTLY(bool, crash_save_done);
>>>  
>>>  /* This becomes the NMI handler for non-crashing CPUs, when Xen is crashing. */
>>> -static void noreturn do_nmi_crash(const struct cpu_user_regs *regs)
>>> +static int noreturn do_nmi_crash(const struct cpu_user_regs *regs, int cpu)
>>>  {
>>> -    unsigned int cpu = smp_processor_id();
>>> -
>>>      stac();
>>>  
>>>      /* nmi_shootdown_cpus() should ensure that this assertion is correct. */
>> Looks like this is the first instance of a noreturn function returning non-void.
>> Are you sufficiently certain that (older) compilers won't complain about missing
>> return statements (with a value)?
> 
> Yes.  https://godbolt.org/z/8a1efoh39

Okay, thanks. That was with -O2 only, but adding -Wall didn't surface anything either.

Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching
  2021-11-22  9:04   ` Jan Beulich
@ 2021-11-22 16:16     ` Andrew Cooper
  2021-11-22 16:28       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-11-22 16:16 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 22/11/2021 09:04, Jan Beulich wrote:
> On 19.11.2021 19:21, Andrew Cooper wrote:
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -773,14 +773,48 @@ handle_exception_saved:
>>          sti
>>  1:      movq  %rsp,%rdi
>>          movzbl UREGS_entry_vector(%rsp),%eax
>> -        leaq  exception_table(%rip),%rdx
>>  #ifdef CONFIG_PERF_COUNTERS
>>          lea   per_cpu__perfcounters(%rip), %rcx
>>          add   STACK_CPUINFO_FIELD(per_cpu_offset)(%r14), %rcx
>>          incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
>>  #endif
>> -        mov   (%rdx, %rax, 8), %rdx
>> -        INDIRECT_CALL %rdx
>> +
>> +        /*
>> +         * Dispatch to appropriate C handlers.
>> +         *
>> +         * The logic is implemented as an if/else chain.  DISPATCH() calls
>> +         * need be in frequency order for best performance.
>> +         */
>> +#define DISPATCH(vec, handler)         \
>> +        cmp   $vec, %al;               \
>> +        jne   .L_ ## vec ## _done;     \
>> +        call  handler;                 \
>> +        jmp   .L_exn_dispatch_done;    \
>> +.L_ ## vec ## _done:
>> +
>> +        DISPATCH(X86_EXC_PF, do_page_fault)
>> +        DISPATCH(X86_EXC_GP, do_general_protection)
>> +        DISPATCH(X86_EXC_UD, do_invalid_op)
>> +        DISPATCH(X86_EXC_NM, do_device_not_available)
>> +        DISPATCH(X86_EXC_BP, do_int3)
>> +
>> +        /* Logically "if ( (1 << vec) & MASK ) { do_trap(); }" */
>> +        mov   $(1 << X86_EXC_DE) | (1 << X86_EXC_OF) | (1 << X86_EXC_BR) |\
>> +               (1 << X86_EXC_NP) | (1 << X86_EXC_SS) | (1 << X86_EXC_MF) |\
>> +               (1 << X86_EXC_AC) | (1 << X86_EXC_XM), %edx
>> +        bt    %eax, %edx
>> +        jnc   .L_do_trap_done
>> +        call  do_trap
>> +        jmp   .L_exn_dispatch_done
>> +.L_do_trap_done:
>> +
>> +        DISPATCH(X86_EXC_CP, do_entry_CP)
>> +#undef DISPATCH
> The basis for the choice of ordering imo wants spelling out here. For example,
> despite the data provided in the description I'm not really convinced #BP
> wants handling ahead of everything going to do_trap().

Why?

Debugging might be rare, but #BP gets used in non-error cases. 
Everything heading towards do_trap() really got 0 hits, which is
entirely expected because they're all fatal signals by default.

This list is in proper frequency order.

>> @@ -1012,9 +1046,28 @@ handle_ist_exception:
>>          incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
>>  #endif
>>  
>> -        leaq  exception_table(%rip),%rdx
>> -        mov   (%rdx, %rax, 8), %rdx
>> -        INDIRECT_CALL %rdx
>> +        /*
>> +         * Dispatch to appropriate C handlers.
>> +         *
>> +         * The logic is implemented as an if/else chain.  DISPATCH() calls
>> +         * need be in frequency order for best performance.
>> +         */
>> +#define DISPATCH(vec, handler)         \
>> +        cmp   $vec, %al;               \
>> +        jne   .L_ ## vec ## _done;     \
>> +        call  handler;                 \
>> +        jmp   .L_ist_dispatch_done;    \
>> +.L_ ## vec ## _done:
>> +
>> +        DISPATCH(X86_EXC_NMI, do_nmi)
>> +        DISPATCH(X86_EXC_DB,  do_debug)
>> +        DISPATCH(X86_EXC_MC,  do_machine_check)
>> +#undef DISPATCH
>> +
>> +        call  do_unhandled_trap
>> +        BUG   /* do_unhandled_trap() shouldn't return. */
> While I agree with putting BUG here, I don't see the need for the CALL.
> Unlike in handle_exception there's no vector left unhandled by the
> DISPATCH() invocations. The CALL being there give the (wrong) impression
> there would / might be.

I firmly disagree.

do_unhandled_trap() is a fatal error path both here and for the non IST
case, and is absolutely the appropriate thing to call in the dangling
else from this chain.

It is only unreachable if 15 things line up correctly in a very fragile
piece of code, where both you and I have made errors in the past.

~Andrew


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] x86/traps: Clean up diagnostics
  2021-11-22  9:08   ` Jan Beulich
@ 2021-11-22 16:26     ` Andrew Cooper
  2021-11-22 16:34       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Cooper @ 2021-11-22 16:26 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 22/11/2021 09:08, Jan Beulich wrote:
> On 19.11.2021 19:21, Andrew Cooper wrote:
>> do{_reserved,}_trap() should use fatal_trap() rather than opencoding part of
> Nit: That's do{_unhandled,}_trap() now.

Ah yes.  Will fix.

>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -759,21 +759,7 @@ static int nmi_show_execution_state(const struct cpu_user_regs *regs, int cpu)
>>      return 1;
>>  }
>>  
>> -const char *trapstr(unsigned int trapnr)
>> -{
>> -    static const char * const strings[] = {
>> -        "divide error", "debug", "nmi", "bkpt", "overflow", "bounds",
>> -        "invalid opcode", "device not available", "double fault",
>> -        "coprocessor segment", "invalid tss", "segment not found",
>> -        "stack error", "general protection fault", "page fault",
>> -        "spurious interrupt", "coprocessor error", "alignment check",
>> -        "machine check", "simd error", "virtualisation exception"
>> -    };
>> -
>> -    return trapnr < ARRAY_SIZE(strings) ? strings[trapnr] : "???";
>> -}
>> -
>> -static const char *vec_name(unsigned int vec)
>> +const char *vec_name(unsigned int vec)
> Is this perhaps too ambiguous a name for a non-static function? exn_vec_name()
> at least, maybe?

"exception" has the same problem that "trap" has.  It's actively
incorrect naming.

vec_name() is already less bad as a global than trapstr() was, so I
don't consider this an issue.  I could expand it to vector_name(), but
that would gain several more hunks in the patch.

~Andrew


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching
  2021-11-22 16:16     ` Andrew Cooper
@ 2021-11-22 16:28       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-11-22 16:28 UTC (permalink / raw)
  To: Andrew Cooper, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 22.11.2021 17:16, Andrew Cooper wrote:
> On 22/11/2021 09:04, Jan Beulich wrote:
>> On 19.11.2021 19:21, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/x86_64/entry.S
>>> +++ b/xen/arch/x86/x86_64/entry.S
>>> @@ -773,14 +773,48 @@ handle_exception_saved:
>>>          sti
>>>  1:      movq  %rsp,%rdi
>>>          movzbl UREGS_entry_vector(%rsp),%eax
>>> -        leaq  exception_table(%rip),%rdx
>>>  #ifdef CONFIG_PERF_COUNTERS
>>>          lea   per_cpu__perfcounters(%rip), %rcx
>>>          add   STACK_CPUINFO_FIELD(per_cpu_offset)(%r14), %rcx
>>>          incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
>>>  #endif
>>> -        mov   (%rdx, %rax, 8), %rdx
>>> -        INDIRECT_CALL %rdx
>>> +
>>> +        /*
>>> +         * Dispatch to appropriate C handlers.
>>> +         *
>>> +         * The logic is implemented as an if/else chain.  DISPATCH() calls
>>> +         * need be in frequency order for best performance.
>>> +         */
>>> +#define DISPATCH(vec, handler)         \
>>> +        cmp   $vec, %al;               \
>>> +        jne   .L_ ## vec ## _done;     \
>>> +        call  handler;                 \
>>> +        jmp   .L_exn_dispatch_done;    \
>>> +.L_ ## vec ## _done:
>>> +
>>> +        DISPATCH(X86_EXC_PF, do_page_fault)
>>> +        DISPATCH(X86_EXC_GP, do_general_protection)
>>> +        DISPATCH(X86_EXC_UD, do_invalid_op)
>>> +        DISPATCH(X86_EXC_NM, do_device_not_available)
>>> +        DISPATCH(X86_EXC_BP, do_int3)
>>> +
>>> +        /* Logically "if ( (1 << vec) & MASK ) { do_trap(); }" */
>>> +        mov   $(1 << X86_EXC_DE) | (1 << X86_EXC_OF) | (1 << X86_EXC_BR) |\
>>> +               (1 << X86_EXC_NP) | (1 << X86_EXC_SS) | (1 << X86_EXC_MF) |\
>>> +               (1 << X86_EXC_AC) | (1 << X86_EXC_XM), %edx
>>> +        bt    %eax, %edx
>>> +        jnc   .L_do_trap_done
>>> +        call  do_trap
>>> +        jmp   .L_exn_dispatch_done
>>> +.L_do_trap_done:
>>> +
>>> +        DISPATCH(X86_EXC_CP, do_entry_CP)
>>> +#undef DISPATCH
>> The basis for the choice of ordering imo wants spelling out here. For example,
>> despite the data provided in the description I'm not really convinced #BP
>> wants handling ahead of everything going to do_trap().
> 
> Why?
> 
> Debugging might be rare, but #BP gets used in non-error cases. 
> Everything heading towards do_trap() really got 0 hits, which is
> entirely expected because they're all fatal signals by default.

#MF and #XM certainly aren't. For some of the others I agree with
"by default", but without extending this to "in general".

>>> @@ -1012,9 +1046,28 @@ handle_ist_exception:
>>>          incl  ASM_PERFC_exceptions * 4(%rcx, %rax, 4)
>>>  #endif
>>>  
>>> -        leaq  exception_table(%rip),%rdx
>>> -        mov   (%rdx, %rax, 8), %rdx
>>> -        INDIRECT_CALL %rdx
>>> +        /*
>>> +         * Dispatch to appropriate C handlers.
>>> +         *
>>> +         * The logic is implemented as an if/else chain.  DISPATCH() calls
>>> +         * need be in frequency order for best performance.
>>> +         */
>>> +#define DISPATCH(vec, handler)         \
>>> +        cmp   $vec, %al;               \
>>> +        jne   .L_ ## vec ## _done;     \
>>> +        call  handler;                 \
>>> +        jmp   .L_ist_dispatch_done;    \
>>> +.L_ ## vec ## _done:
>>> +
>>> +        DISPATCH(X86_EXC_NMI, do_nmi)
>>> +        DISPATCH(X86_EXC_DB,  do_debug)
>>> +        DISPATCH(X86_EXC_MC,  do_machine_check)
>>> +#undef DISPATCH
>>> +
>>> +        call  do_unhandled_trap
>>> +        BUG   /* do_unhandled_trap() shouldn't return. */
>> While I agree with putting BUG here, I don't see the need for the CALL.
>> Unlike in handle_exception there's no vector left unhandled by the
>> DISPATCH() invocations. The CALL being there give the (wrong) impression
>> there would / might be.
> 
> I firmly disagree.
> 
> do_unhandled_trap() is a fatal error path both here and for the non IST
> case, and is absolutely the appropriate thing to call in the dangling
> else from this chain.
> 
> It is only unreachable if 15 things line up correctly in a very fragile
> piece of code, where both you and I have made errors in the past.

The 15 things are: There may not be a new exception type vectored here
without updating the dispatch code. Failing to do so is quite fine to
be caught by the BUG alone. And adding new IST handlers requires care
in a number of other places anyway, so anyone doing so will surely
grep for all occurrences of the types presently handled here. That'll
make them spot the dispatch which needs updating alongside the several
other places.

Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] x86/traps: Clean up diagnostics
  2021-11-22 16:26     ` Andrew Cooper
@ 2021-11-22 16:34       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-11-22 16:34 UTC (permalink / raw)
  To: Andrew Cooper, Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 22.11.2021 17:26, Andrew Cooper wrote:
> On 22/11/2021 09:08, Jan Beulich wrote:
>> On 19.11.2021 19:21, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/traps.c
>>> +++ b/xen/arch/x86/traps.c
>>> @@ -759,21 +759,7 @@ static int nmi_show_execution_state(const struct cpu_user_regs *regs, int cpu)
>>>      return 1;
>>>  }
>>>  
>>> -const char *trapstr(unsigned int trapnr)
>>> -{
>>> -    static const char * const strings[] = {
>>> -        "divide error", "debug", "nmi", "bkpt", "overflow", "bounds",
>>> -        "invalid opcode", "device not available", "double fault",
>>> -        "coprocessor segment", "invalid tss", "segment not found",
>>> -        "stack error", "general protection fault", "page fault",
>>> -        "spurious interrupt", "coprocessor error", "alignment check",
>>> -        "machine check", "simd error", "virtualisation exception"
>>> -    };
>>> -
>>> -    return trapnr < ARRAY_SIZE(strings) ? strings[trapnr] : "???";
>>> -}
>>> -
>>> -static const char *vec_name(unsigned int vec)
>>> +const char *vec_name(unsigned int vec)
>> Is this perhaps too ambiguous a name for a non-static function? exn_vec_name()
>> at least, maybe?
> 
> "exception" has the same problem that "trap" has.  It's actively
> incorrect naming.

Well, yes, "exception_or_interrupt_name" would be more correct but quite
a bit too long for my taste. In an earlier project I did work on we
used "interruption" to cover everything (including hardware interrupts),
but the abbreviation thereof wouldn't be distinguishable from
"interrupt"'s.

Bottom line - you have my R-b, and the name change was just an extra
consideration.

Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching
  2021-11-19 18:21 ` [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching Andrew Cooper
  2021-11-22  9:04   ` Jan Beulich
@ 2021-11-23  9:05   ` Jan Beulich
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2021-11-23  9:05 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Roger Pau Monné, Wei Liu, Xen-devel

On 19.11.2021 19:21, Andrew Cooper wrote:
> There is also a lot of redundancy in the table.  8 vectors head to do_trap(),
> 3 are handled in the IST logic, and that only leaves 7 others not heading to
> the do_reserved_trap() catch-all.  This also removes the fragility that any
> accidental NULL entry in the table becomes a ticking timebomb.
> 
> Function pointers are expensive under retpoline, and different vectors have
> wildly different frequences.  Drop the indirect call, and use an if/else chain
> instead, which is a code layout technique used by profile-guided optimsiation.
> 
> Using Xen's own perfcounter infrastructure, we see the following frequences of
> vectors measured from boot until I can SSH into dom0 and collect the stats:
> 
>   vec | CFL-R   | Milan   | Notes
>   ----+---------+---------+
>   NMI |     345 |    3768 | Watchdog.  Milan has many more CPUs.
>   ----+---------+---------+
>   #PF | 1233234 | 2006441 |
>   #GP |   90054 |   96193 |
>   #UD |     848 |     851 |
>   #NM |       0 |     132 | Per-vendor lazy vs eager FPU policy.
>   #DB |      67 |      67 | No clue, but it's something in userspace.
> 
> Bloat-o-meter (after some manual insertion of ELF metadata) reports:
> 
>   add/remove: 0/1 grow/shrink: 2/0 up/down: 102/-256 (-154)
>   Function                                     old     new   delta
>   handle_exception_saved                       148     226     +78
>   handle_ist_exception                         453     477     +24
>   exception_table                              256       -    -256
> 
> showing that the if/else chains are less than half the size that
> exception_table[] was in the first place.
> 
> As part of this change, make two other minor changes.  do_reserved_trap() is
> renamed to do_unhandled_trap() because it is the catchall, and already covers
> things that aren't reserved any more (#VE/#VC/#HV/#SX).
> 
> Furthermore, don't forward #TS to guests.  #TS is specifically for errors
> relating to the Task State Segment, which is a Xen-owned structure, not a
> guest-owned structure.  Even in the 32bit days, we never let guests register
> their own Task State Segments.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

As it feels like we won't be coming to an agreement here, despite
my reservations against the specific form of some of this and on
the basis that the code change itself is certainly an improvement:
Acked-by: Jan Beulich <jbeulich@suse.com>

I'd nevertheless be curious whether in five or ten years time you'd
still agree with your choice of basing a decision on not really
representative data (at least as far as what the description says).

Jan



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-11-23  9:06 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-19 18:21 [PATCH 0/5] x86/traps: Drop exception_table[] Andrew Cooper
2021-11-19 18:21 ` [PATCH 1/5] x86/traps: Collect PERFC_exceptions stats for IST vectors too Andrew Cooper
2021-11-22  8:50   ` Jan Beulich
2021-11-19 18:21 ` [PATCH 2/5] x86/traps: Drop dummy_nmi_callback() Andrew Cooper
2021-11-22  8:51   ` Jan Beulich
2021-11-19 18:21 ` [PATCH 3/5] x86/crash: Drop manual hooking of exception_table[] Andrew Cooper
2021-11-22  8:57   ` Jan Beulich
2021-11-22 13:48     ` Andrew Cooper
2021-11-22 14:51       ` Jan Beulich
2021-11-19 18:21 ` [PATCH 4/5] x86/traps: Drop exception_table[] and use if/else dispatching Andrew Cooper
2021-11-22  9:04   ` Jan Beulich
2021-11-22 16:16     ` Andrew Cooper
2021-11-22 16:28       ` Jan Beulich
2021-11-23  9:05   ` Jan Beulich
2021-11-19 18:21 ` [PATCH 5/5] x86/traps: Clean up diagnostics Andrew Cooper
2021-11-22  9:08   ` Jan Beulich
2021-11-22 16:26     ` Andrew Cooper
2021-11-22 16:34       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.