* [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions
@ 2020-05-05 13:49 Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced Thomas Gleixner
                   ` (23 more replies)
  0 siblings, 24 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Folks,

This is the fourth part of the rework series. Part 3 can be found here:

 https://lore.kernel.org/r/20200505134354.774943181@linutronix.de

The series has a total of 138 patches and is split into 5 parts. The base
for this 4th series is:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git entry-v4-part-3

The full series with all parts applied is available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git entry-v4-part-5

The fourth part, i.e. this series, is available from:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git entry-v4-part-4
 
This part contains the modifications for complex and nasty exceptions and traps:

 - Conversion of int3 including a full isolation of the text poke handler
   so it is fully self-contained, i.e. it does not call out into any
   instrumentable code.

 - Conversion of NMI handling including protection against instrumentation

 - Conversion of #DB with separation of the user and kernel mode entries

 - Conversion of #MC and #DF

The objtool check for the noinstr.text correctness is not yet added to the
build machinery and has to be invoked manually for now:

   objtool check -fal vmlinux.o

The checking only works for built-in code, as objtool cannot do a combined
analysis of vmlinux.o and a module.o.

Thanks,

	tglx

8<----------
 arch/x86/entry/entry_32.S            |   38 ----
 arch/x86/entry/entry_64.S            |   35 +---
 arch/x86/include/asm/desc.h          |    8 
 arch/x86/include/asm/idtentry.h      |  235 +++++++++++++++++++++++++++++
 arch/x86/include/asm/mce.h           |    2 
 arch/x86/include/asm/ptrace.h        |    2 
 arch/x86/include/asm/text-patching.h |   11 -
 arch/x86/include/asm/traps.h         |   23 --
 arch/x86/kernel/alternative.c        |   25 +--
 arch/x86/kernel/cpu/common.c         |    6 
 arch/x86/kernel/cpu/mce/core.c       |   91 ++++++++---
 arch/x86/kernel/cpu/mce/inject.c     |    4 
 arch/x86/kernel/cpu/mce/internal.h   |    2 
 arch/x86/kernel/cpu/mce/p5.c         |    8 
 arch/x86/kernel/cpu/mce/winchip.c    |    8 
 arch/x86/kernel/doublefault_32.c     |   10 -
 arch/x86/kernel/hw_breakpoint.c      |    6 
 arch/x86/kernel/idt.c                |   22 +-
 arch/x86/kernel/nmi.c                |   14 -
 arch/x86/kernel/traps.c              |  283 ++++++++++++++++++++++++-----------
 arch/x86/kvm/vmx/vmx.c               |    2 
 arch/x86/xen/enlighten_pv.c          |   17 +-
 arch/x86/xen/xen-asm_64.S            |   12 -
 include/linux/bsearch.h              |   26 ++-
 kernel/time/timekeeping.c            |    2 
 lib/bsearch.c                        |   22 --
 26 files changed, 615 insertions(+), 299 deletions(-)


* [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-14  4:57   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation Thomas Gleixner
                   ` (22 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon, Peter Zijlstra (Intel)

From: Thomas Gleixner <tglx@linutronix.de>

In order to ensure poke_int3_handler() is completely self contained -- this
is called while modifying other text, imagine the fun of hitting another
INT3 -- ensure that everything it uses is not traced.

The primary means here is to force inlining; bsearch() is notrace because
all of lib/ is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/ptrace.h        |    2 +-
 arch/x86/include/asm/text-patching.h |   11 +++++++----
 arch/x86/kernel/alternative.c        |   13 ++++++-------
 3 files changed, 14 insertions(+), 12 deletions(-)

--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -123,7 +123,7 @@ static inline void regs_set_return_value
  * On x86_64, vm86 mode is mercifully nonexistent, and we don't need
  * the extra check.
  */
-static inline int user_mode(struct pt_regs *regs)
+static __always_inline int user_mode(struct pt_regs *regs)
 {
 #ifdef CONFIG_X86_32
 	return ((regs->cs & SEGMENT_RPL_MASK) | (regs->flags & X86_VM_MASK)) >= USER_RPL;
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -64,7 +64,7 @@ extern void text_poke_finish(void);
 
 #define DISP32_SIZE		4
 
-static inline int text_opcode_size(u8 opcode)
+static __always_inline int text_opcode_size(u8 opcode)
 {
 	int size = 0;
 
@@ -118,12 +118,14 @@ extern __ro_after_init struct mm_struct
 extern __ro_after_init unsigned long poking_addr;
 
 #ifndef CONFIG_UML_X86
-static inline void int3_emulate_jmp(struct pt_regs *regs, unsigned long ip)
+static __always_inline
+void int3_emulate_jmp(struct pt_regs *regs, unsigned long ip)
 {
 	regs->ip = ip;
 }
 
-static inline void int3_emulate_push(struct pt_regs *regs, unsigned long val)
+static __always_inline
+void int3_emulate_push(struct pt_regs *regs, unsigned long val)
 {
 	/*
 	 * The int3 handler in entry_64.S adds a gap between the
@@ -138,7 +140,8 @@ static inline void int3_emulate_push(str
 	*(unsigned long *)regs->sp = val;
 }
 
-static inline void int3_emulate_call(struct pt_regs *regs, unsigned long func)
+static __always_inline
+void int3_emulate_call(struct pt_regs *regs, unsigned long func)
 {
 	int3_emulate_push(regs, regs->ip - INT3_INSN_SIZE + CALL_INSN_SIZE);
 	int3_emulate_jmp(regs, func);
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -957,7 +957,8 @@ struct bp_patching_desc {
 
 static struct bp_patching_desc *bp_desc;
 
-static inline struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
+static __always_inline
+struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
 {
 	struct bp_patching_desc *desc = READ_ONCE(*descp); /* rcu_dereference */
 
@@ -967,18 +968,18 @@ static inline struct bp_patching_desc *t
 	return desc;
 }
 
-static inline void put_desc(struct bp_patching_desc *desc)
+static __always_inline void put_desc(struct bp_patching_desc *desc)
 {
 	smp_mb__before_atomic();
 	atomic_dec(&desc->refs);
 }
 
-static inline void *text_poke_addr(struct text_poke_loc *tp)
+static __always_inline void *text_poke_addr(struct text_poke_loc *tp)
 {
 	return _stext + tp->rel_addr;
 }
 
-static int notrace patch_cmp(const void *key, const void *elt)
+static int noinstr patch_cmp(const void *key, const void *elt)
 {
 	struct text_poke_loc *tp = (struct text_poke_loc *) elt;
 
@@ -988,9 +989,8 @@ static int notrace patch_cmp(const void
 		return 1;
 	return 0;
 }
-NOKPROBE_SYMBOL(patch_cmp);
 
-int notrace poke_int3_handler(struct pt_regs *regs)
+int noinstr poke_int3_handler(struct pt_regs *regs)
 {
 	struct bp_patching_desc *desc;
 	struct text_poke_loc *tp;
@@ -1064,7 +1064,6 @@ int notrace poke_int3_handler(struct pt_
 	put_desc(desc);
 	return ret;
 }
-NOKPROBE_SYMBOL(poke_int3_handler);
 
 #define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_loc))
 static struct text_poke_loc tp_vec[TP_VEC_MAX];



* [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-08 13:27   ` Masami Hiramatsu
                     ` (2 more replies)
  2020-05-05 13:49 ` [patch V4 part 4 03/24] lib/bsearch: Provide __always_inline variant Thomas Gleixner
                   ` (21 subsequent siblings)
  23 siblings, 3 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon, Peter Zijlstra (Intel)

From: Peter Zijlstra <peterz@infradead.org>

Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
creeps in and ruins things.

That is, this is the INT3 text poke handler: strictly limit the code
that runs in it, lest it inadvertently hit yet another INT3.
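
As an illustration only, the get/put pattern that try_get_desc() and
put_desc() rely on can be sketched in userspace C11. This is not the
kernel code: desc_sketch, inc_not_zero(), try_get() and put() are
hypothetical names, and <stdatomic.h> stands in for the arch_atomic_*()
primitives. The point is that a reference is only taken while the count
is still non-zero, so a descriptor that is being torn down is never
handed out again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bp_patching_desc: only the refcount. */
struct desc_sketch {
	atomic_int refs;
};

/* Increment *v unless it is zero; returns true if the increment happened. */
static inline bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	do {
		if (old == 0)
			return false;
		/* On failure the CAS reloads 'old' and the zero check repeats. */
	} while (!atomic_compare_exchange_weak(v, &old, old + 1));

	return true;
}

/* Take a reference on the published descriptor, or return NULL. */
static inline struct desc_sketch *try_get(_Atomic(struct desc_sketch *) *slot)
{
	struct desc_sketch *d = atomic_load(slot);

	if (!d || !inc_not_zero(&d->refs))
		return NULL;

	return d;
}

/* Drop a reference previously obtained with try_get(). */
static inline void put(struct desc_sketch *d)
{
	assert(d != NULL);
	atomic_fetch_sub(&d->refs, 1);
}
```

The sketch deliberately keeps everything in small inline helpers, which
mirrors the idea of the patch: nothing in this path should call out to
code that could be instrumented.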

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/alternative.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -960,9 +960,9 @@ static struct bp_patching_desc *bp_desc;
 static __always_inline
 struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
 {
-	struct bp_patching_desc *desc = READ_ONCE(*descp); /* rcu_dereference */
+	struct bp_patching_desc *desc = READ_ONCE_NOCHECK(*descp); /* rcu_dereference */
 
-	if (!desc || !atomic_inc_not_zero(&desc->refs))
+	if (!desc || !arch_atomic_inc_not_zero(&desc->refs))
 		return NULL;
 
 	return desc;
@@ -971,7 +971,7 @@ struct bp_patching_desc *try_get_desc(st
 static __always_inline void put_desc(struct bp_patching_desc *desc)
 {
 	smp_mb__before_atomic();
-	atomic_dec(&desc->refs);
+	arch_atomic_dec(&desc->refs);
 }
 
 static __always_inline void *text_poke_addr(struct text_poke_loc *tp)



* [patch V4 part 4 03/24] lib/bsearch: Provide __always_inline variant
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-14  4:58   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-05-05 13:49 ` [patch V4 part 4 04/24] x86/int3: Inline bsearch() Thomas Gleixner
                   ` (20 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon, Peter Zijlstra (Intel)

From: Peter Zijlstra <peterz@infradead.org>

For code that needs the ultimate performance (it can inline the @cmp
function too) or simply needs to avoid calling external functions for
whatever reason, provide an __always_inline variant of bsearch().
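
A minimal userspace sketch of the idea, assuming GCC or Clang for the
always_inline attribute. The names always_inline_sketch, cmp_fn,
inline_bsearch(), cmp_int() and find_int() are made up for this example
and are not part of the patch. The search loop follows the one moved
into the header below; because it is forced inline into its caller, the
compiler sees the comparison function as a constant and can inline that
as well when optimizing.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's __always_inline (GCC/Clang assumed). */
#define always_inline_sketch inline __attribute__((__always_inline__))

typedef int (*cmp_fn)(const void *key, const void *elt);

/* Binary search over a sorted array of 'num' elements of 'size' bytes. */
static always_inline_sketch
void *inline_bsearch(const void *key, const void *base, size_t num,
		     size_t size, cmp_fn cmp)
{
	const char *lo = base;
	const char *pivot;
	int result;

	assert(size > 0);

	while (num > 0) {
		pivot = lo + (num >> 1) * size;
		result = cmp(key, pivot);

		if (result == 0)
			return (void *)pivot;

		if (result > 0) {
			/* Continue in the upper half, excluding the pivot. */
			lo = pivot + size;
			num--;
		}
		num >>= 1;
	}

	return NULL;
}

/* Three-way comparison of two ints; small enough to be inlined. */
static int cmp_int(const void *key, const void *elt)
{
	int k = *(const int *)key;
	int e = *(const int *)elt;

	return (k > e) - (k < e);
}

/* Hypothetical caller: the search loop is expanded right here. */
static int *find_int(const int *sorted, size_t n, int value)
{
	return inline_bsearch(&value, sorted, n, sizeof(*sorted), cmp_int);
}
```

A caller such as find_int() then contains no call to an external search
routine, which is the property poke_int3_handler() needs.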

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/bsearch.h |   26 ++++++++++++++++++++++++--
 lib/bsearch.c           |   22 ++--------------------
 2 files changed, 26 insertions(+), 22 deletions(-)

--- a/include/linux/bsearch.h
+++ b/include/linux/bsearch.h
@@ -4,7 +4,29 @@
 
 #include <linux/types.h>
 
-void *bsearch(const void *key, const void *base, size_t num, size_t size,
-	      cmp_func_t cmp);
+static __always_inline
+void *__bsearch(const void *key, const void *base, size_t num, size_t size, cmp_func_t cmp)
+{
+	const char *pivot;
+	int result;
+
+	while (num > 0) {
+		pivot = base + (num >> 1) * size;
+		result = cmp(key, pivot);
+
+		if (result == 0)
+			return (void *)pivot;
+
+		if (result > 0) {
+			base = pivot + size;
+			num--;
+		}
+		num >>= 1;
+	}
+
+	return NULL;
+}
+
+extern void *bsearch(const void *key, const void *base, size_t num, size_t size, cmp_func_t cmp);
 
 #endif /* _LINUX_BSEARCH_H */
--- a/lib/bsearch.c
+++ b/lib/bsearch.c
@@ -28,27 +28,9 @@
  * the key and elements in the array are of the same type, you can use
  * the same comparison function for both sort() and bsearch().
  */
-void *bsearch(const void *key, const void *base, size_t num, size_t size,
-	      cmp_func_t cmp)
+void *bsearch(const void *key, const void *base, size_t num, size_t size, cmp_func_t cmp)
 {
-	const char *pivot;
-	int result;
-
-	while (num > 0) {
-		pivot = base + (num >> 1) * size;
-		result = cmp(key, pivot);
-
-		if (result == 0)
-			return (void *)pivot;
-
-		if (result > 0) {
-			base = pivot + size;
-			num--;
-		}
-		num >>= 1;
-	}
-
-	return NULL;
+	return __bsearch(key, base, num, size, cmp);
 }
 EXPORT_SYMBOL(bsearch);
 NOKPROBE_SYMBOL(bsearch);



* [patch V4 part 4 04/24] x86/int3: Inline bsearch()
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (2 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 03/24] lib/bsearch: Provide __always_inline variant Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-14  4:58   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-05-05 13:49 ` [patch V4 part 4 05/24] x86/entry: Provide IDTENTRY_RAW Thomas Gleixner
                   ` (19 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon, Peter Zijlstra (Intel)

From: Peter Zijlstra <peterz@infradead.org>

Avoid calling out to bsearch() by inlining it. For normal kernel configs
this was the last external call, so poke_int3_handler() is now fully
self-sufficient -- no calls to external code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/alternative.c |    8 ++++----
 arch/x86/kernel/traps.c       |    5 +++++
 2 files changed, 9 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -979,7 +979,7 @@ static __always_inline void *text_poke_a
 	return _stext + tp->rel_addr;
 }
 
-static int noinstr patch_cmp(const void *key, const void *elt)
+static __always_inline int patch_cmp(const void *key, const void *elt)
 {
 	struct text_poke_loc *tp = (struct text_poke_loc *) elt;
 
@@ -1023,9 +1023,9 @@ int noinstr poke_int3_handler(struct pt_
 	 * Skip the binary search if there is a single member in the vector.
 	 */
 	if (unlikely(desc->nr_entries > 1)) {
-		tp = bsearch(ip, desc->vec, desc->nr_entries,
-			     sizeof(struct text_poke_loc),
-			     patch_cmp);
+		tp = __bsearch(ip, desc->vec, desc->nr_entries,
+			       sizeof(struct text_poke_loc),
+			       patch_cmp);
 		if (!tp)
 			goto out_put;
 	} else {
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -566,6 +566,11 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_pr
 
 dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 {
+	/*
+	 * poke_int3_handler() is completely self contained code; it does (and
+	 * must) *NOT* call out to anything, lest it hits upon yet another
+	 * INT3.
+	 */
 	if (poke_int3_handler(regs))
 		return;
 



* [patch V4 part 4 05/24] x86/entry: Provide IDTENTRY_RAW
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (3 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 04/24] x86/int3: Inline bsearch() Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-14  4:59   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 06/24] x86/entry: Convert INT3 exception to IDTENTRY_RAW Thomas Gleixner
                   ` (18 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Some exception handlers need to do extra work before any of the entry
helpers are invoked. Provide IDTENTRY_RAW for this.
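
To illustrate the difference, here is a userspace sketch. It is not the
kernel macros: regs_sketch, record(), enter_sketch(), exit_sketch() and
the *_SKETCH macros are hypothetical names invented for this example.
The wrapped variant emits an outer function which calls the enter/exit
helpers around the body, while the raw variant only emits the function
header and leaves it to the body to decide when the helpers run.

```c
#include <assert.h>

enum { EV_EARLY = 1, EV_ENTER, EV_BODY, EV_EXIT };

/* Hypothetical stand-in for struct pt_regs: just an event log. */
struct regs_sketch {
	int events[8];
	int count;
};

static inline void record(struct regs_sketch *regs, int ev)
{
	assert(regs->count < 8);
	regs->events[regs->count++] = ev;
}

static inline void enter_sketch(struct regs_sketch *regs)
{
	record(regs, EV_ENTER);
}

static inline void exit_sketch(struct regs_sketch *regs)
{
	record(regs, EV_EXIT);
}

/* Wrapped variant: the helpers always surround the body. */
#define DEFINE_ENTRY_SKETCH(func)					\
static inline void func##_body(struct regs_sketch *regs);		\
void func(struct regs_sketch *regs)					\
{									\
	enter_sketch(regs);						\
	func##_body(regs);						\
	exit_sketch(regs);						\
}									\
static inline void func##_body(struct regs_sketch *regs)

/* Raw variant: only the function header, nothing is called implicitly. */
#define DEFINE_ENTRY_RAW_SKETCH(func)					\
void func(struct regs_sketch *regs)

DEFINE_ENTRY_SKETCH(wrapped_handler)
{
	record(regs, EV_BODY);
}

DEFINE_ENTRY_RAW_SKETCH(raw_handler)
{
	/* Extra work which must happen before the enter helper runs. */
	record(regs, EV_EARLY);

	enter_sketch(regs);
	record(regs, EV_BODY);
	exit_sketch(regs);
}
```

Both macros are used like a function definition followed by a body in
curly brackets, which is the convention the kernel-doc comment in the
patch describes.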

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/idtentry.h |   31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -104,6 +104,34 @@ static __always_inline void __##func(str
 static __always_inline void __##func(struct pt_regs *regs,		\
 				     unsigned long error_code)
 
+/**
+ * DECLARE_IDTENTRY_RAW - Declare functions for raw IDT entry points
+ *		      No error code pushed by hardware
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY().
+ */
+#define DECLARE_IDTENTRY_RAW(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_RAW - Emit code for raw IDT entry points
+ * @func:	Function name of the entry point
+ *
+ * @func is called from ASM entry code with interrupts disabled.
+ *
+ * The macro is written so it acts as function definition. Append the
+ * body with a pair of curly brackets.
+ *
+ * Contrary to DEFINE_IDTENTRY() this does not invoke the
+ * idtentry_enter/exit() helpers before and after the body invocation. This
+ * needs to be done in the body itself if applicable. Use if extra work
+ * is required before the enter/exit() helpers are invoked.
+ */
+#define DEFINE_IDTENTRY_RAW(func)					\
+__visible noinstr void func(struct pt_regs *regs)
+
 #else /* !__ASSEMBLY__ */
 
 /*
@@ -118,6 +146,9 @@ static __always_inline void __##func(str
 /* Special case for 32bit IRET 'trap'. Do not emit ASM code */
 #define DECLARE_IDTENTRY_SW(vector, func)
 
+#define DECLARE_IDTENTRY_RAW(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+
 #endif /* __ASSEMBLY__ */
 
 /*



* [patch V4 part 4 06/24] x86/entry: Convert INT3 exception to IDTENTRY_RAW
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (4 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 05/24] x86/entry: Provide IDTENTRY_RAW Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-14  5:01   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 07/24] x86/traps: Split int3 handler up Thomas Gleixner
                   ` (17 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Convert #BP to IDTENTRY_RAW:
  - Implement the C entry point with DEFINE_IDTENTRY_RAW
  - Invoke idtentry_enter/exit() from the function body
  - Emit the ASM stub with DECLARE_IDTENTRY_RAW
  - Remove the ASM idtentry in 64bit
  - Remove the open coded ASM entry code in 32bit
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.

This could be a plain IDTENTRY, but as Peter pointed out, INT3 is broken
vs. the static key in the context tracking code: this static key might be
in the middle of being patched, in which case it contains an int3 and the
handler would recurse forever. IDTENTRY_RAW is therefore chosen to allow
addressing this issue without lots of code churn.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/entry/entry_32.S       |    7 -------
 arch/x86/entry/entry_64.S       |    2 --
 arch/x86/include/asm/idtentry.h |    3 +++
 arch/x86/include/asm/traps.h    |    3 ---
 arch/x86/kernel/idt.c           |    2 +-
 arch/x86/kernel/traps.c         |   28 +++++++++++++++++-----------
 arch/x86/xen/enlighten_pv.c     |    2 +-
 arch/x86/xen/xen-asm_64.S       |    2 +-
 8 files changed, 23 insertions(+), 26 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1651,13 +1651,6 @@ SYM_CODE_START(nmi)
 #endif
 SYM_CODE_END(nmi)
 
-SYM_CODE_START(int3)
-	ASM_CLAC
-	pushl	$0
-	pushl	$do_int3
-	jmp	common_exception
-SYM_CODE_END(int3)
-
 .pushsection .text, "ax"
 SYM_CODE_START(rewind_stack_do_exit)
 	/* Prevent any naive code from trying to unwind to our caller. */
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1073,8 +1073,6 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work
  * Exception entry points.
  */
 
-idtentry	X86_TRAP_BP		int3			do_int3				has_error_code=0
-
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
 #ifdef CONFIG_X86_MCE
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -181,4 +181,7 @@ DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_SS,
 DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_GP,	exc_general_protection);
 DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_AC,	exc_alignment_check);
 
+/* Raw exception entries which need extra work */
+DECLARE_IDTENTRY_RAW(X86_TRAP_BP,	exc_int3);
+
 #endif
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -13,7 +13,6 @@
 
 asmlinkage void debug(void);
 asmlinkage void nmi(void);
-asmlinkage void int3(void);
 #ifdef CONFIG_X86_64
 asmlinkage void double_fault(void);
 #endif
@@ -26,7 +25,6 @@ asmlinkage void machine_check(void);
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
 asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
-asmlinkage void xen_int3(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #ifdef CONFIG_X86_MCE
@@ -36,7 +34,6 @@ asmlinkage void xen_machine_check(void);
 
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
 dotraplinkage void do_nmi(struct pt_regs *regs, long error_code);
-dotraplinkage void do_int3(struct pt_regs *regs, long error_code);
 #if defined(CONFIG_X86_64) || defined(CONFIG_DOUBLEFAULT)
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 #endif
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -57,7 +57,7 @@ struct idt_data {
  */
 static const __initconst struct idt_data early_idts[] = {
 	INTG(X86_TRAP_DB,		debug),
-	SYSG(X86_TRAP_BP,		int3),
+	SYSG(X86_TRAP_BP,		asm_exc_int3),
 #ifdef CONFIG_X86_32
 	INTG(X86_TRAP_PF,		page_fault),
 #endif
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -564,7 +564,7 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_pr
 	cond_local_irq_disable(regs);
 }
 
-dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_RAW(exc_int3)
 {
 	/*
 	 * poke_int3_handler() is completely self contained code; it does (and
@@ -575,16 +575,20 @@ dotraplinkage void notrace do_int3(struc
 		return;
 
 	/*
-	 * Unlike any other non-IST entry, we can be called from pretty much
-	 * any location in the kernel through kprobes -- text_poke() will most
-	 * likely be handled by poke_int3_handler() above. This means this
-	 * handler is effectively NMI-like.
+	 * idtentry_enter() uses static_branch_{,un}likely() and therefore
+	 * can trigger INT3, hence poke_int3_handler() must be done
+	 * before. If the entry came from kernel mode, then use nmi_enter()
+	 * because the INT3 could have been hit in any context including
+	 * NMI.
 	 */
-	if (!user_mode(regs))
+	if (user_mode(regs))
+		idtentry_enter(regs);
+	else
 		nmi_enter();
 
+	instr_begin();
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
-	if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
+	if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
 				SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 #endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
@@ -594,19 +598,21 @@ dotraplinkage void notrace do_int3(struc
 		goto exit;
 #endif
 
-	if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
+	if (notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
 			SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 
 	cond_local_irq_enable(regs);
-	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, error_code, 0, NULL);
+	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
 	cond_local_irq_disable(regs);
 
 exit:
-	if (!user_mode(regs))
+	instr_end();
+	if (user_mode(regs))
+		idtentry_exit(regs);
+	else
 		nmi_exit();
 }
-NOKPROBE_SYMBOL(do_int3);
 
 #ifdef CONFIG_X86_64
 /*
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -617,7 +617,7 @@ static struct trap_array_entry trap_arra
 	{ machine_check,               xen_machine_check,               true },
 #endif
 	{ nmi,                         xen_xennmi,                      true },
-	{ int3,                        xen_int3,                        false },
+	TRAP_ENTRY(exc_int3,				false ),
 	TRAP_ENTRY(exc_overflow,			false ),
 #ifdef CONFIG_IA32_EMULATION
 	{ entry_INT80_compat,          xen_entry_INT80_compat,          false },
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -31,7 +31,7 @@ SYM_CODE_END(xen_\name)
 xen_pv_trap asm_exc_divide_error
 xen_pv_trap debug
 xen_pv_trap xendebug
-xen_pv_trap int3
+xen_pv_trap asm_exc_int3
 xen_pv_trap xennmi
 xen_pv_trap asm_exc_overflow
 xen_pv_trap asm_exc_bounds



* [patch V4 part 4 07/24] x86/traps: Split int3 handler up
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (5 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 06/24] x86/entry: Convert INT3 exception to IDTENTRY_RAW Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-14  5:03   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-05-05 13:49 ` [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST Thomas Gleixner
                   ` (16 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon, Peter Zijlstra (Intel)

Split the int3 handler into a kernel part and a user part, which makes
the code flow simpler to understand.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/traps.c |   67 +++++++++++++++++++++++++++---------------------
 1 file changed, 39 insertions(+), 28 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -564,6 +564,35 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_pr
 	cond_local_irq_disable(regs);
 }
 
+static bool do_int3(struct pt_regs *regs)
+{
+	int res;
+
+#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
+	if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
+			 SIGTRAP) == NOTIFY_STOP)
+		return true;
+#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
+
+#ifdef CONFIG_KPROBES
+	if (kprobe_int3_handler(regs))
+		return true;
+#endif
+	res = notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP, SIGTRAP);
+
+	return res == NOTIFY_STOP;
+}
+
+static void do_int3_user(struct pt_regs *regs)
+{
+	if (do_int3(regs))
+		return;
+
+	cond_local_irq_enable(regs);
+	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
+	cond_local_irq_disable(regs);
+}
+
 DEFINE_IDTENTRY_RAW(exc_int3)
 {
 	/*
@@ -581,37 +610,19 @@ DEFINE_IDTENTRY_RAW(exc_int3)
 	 * because the INT3 could have been hit in any context including
 	 * NMI.
 	 */
-	if (user_mode(regs))
+	if (user_mode(regs)) {
 		idtentry_enter(regs);
-	else
-		nmi_enter();
-
-	instr_begin();
-#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
-	if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
-				SIGTRAP) == NOTIFY_STOP)
-		goto exit;
-#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
-
-#ifdef CONFIG_KPROBES
-	if (kprobe_int3_handler(regs))
-		goto exit;
-#endif
-
-	if (notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
-			SIGTRAP) == NOTIFY_STOP)
-		goto exit;
-
-	cond_local_irq_enable(regs);
-	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
-	cond_local_irq_disable(regs);
-
-exit:
-	instr_end();
-	if (user_mode(regs))
+		instr_begin();
+		do_int3_user(regs);
+		instr_end();
 		idtentry_exit(regs);
-	else
+	} else {
+		nmi_enter();
+		instr_begin();
+		do_int3(regs);
+		instr_end();
 		nmi_exit();
+	}
 }
 
 #ifdef CONFIG_X86_64



* [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (6 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 07/24] x86/traps: Split int3 handler up Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-14 16:39   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point Thomas Gleixner
                   ` (15 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Same as IDTENTRY but for exceptions which run on Interrupt STacks (IST) on
64bit. For 32bit this maps to IDTENTRY.

There are 3 variants which will be used:
      IDTENTRY_MCE
      IDTENTRY_DEBUG
      IDTENTRY_NMI

These map to IDTENTRY_IST, but only the MCE and DEBUG variants emit ASM
code, as the NMI entry still needs hand-crafted ASM.

The function defines do not contain any idtentry_enter/exit() calls as
these exceptions need special treatment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


---
 arch/x86/include/asm/idtentry.h |   54 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -132,6 +132,42 @@ static __always_inline void __##func(str
 #define DEFINE_IDTENTRY_RAW(func)					\
 __visible noinstr void func(struct pt_regs *regs)
 
+#ifdef CONFIG_X86_64
+/**
+ * DECLARE_IDTENTRY_IST - Declare functions for IST handling IDT entry points
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY_RAW
+ */
+#define DECLARE_IDTENTRY_IST(vector, func)				\
+	DECLARE_IDTENTRY_RAW(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_IST - Emit code for IST entry points
+ * @func:	Function name of the entry point
+ *
+ * Maps to DEFINE_IDTENTRY_RAW
+ */
+#define DEFINE_IDTENTRY_IST(func)					\
+	DEFINE_IDTENTRY_RAW(func)
+
+#else	/* CONFIG_X86_64 */
+/* Maps to a regular IDTENTRY on 32bit for now */
+# define DECLARE_IDTENTRY_IST		DECLARE_IDTENTRY
+# define DEFINE_IDTENTRY_IST		DEFINE_IDTENTRY
+#endif	/* !CONFIG_X86_64 */
+
+/* C-Code mapping */
+#define DECLARE_IDTENTRY_MCE		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_MCE		DEFINE_IDTENTRY_IST
+
+#define DECLARE_IDTENTRY_NMI		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_IST
+
+#define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
+
 #else /* !__ASSEMBLY__ */
 
 /*
@@ -149,6 +185,24 @@ static __always_inline void __##func(str
 #define DECLARE_IDTENTRY_RAW(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
 
+#ifdef CONFIG_X86_64
+# define DECLARE_IDTENTRY_MCE(vector, func)				\
+	idtentry_mce_db vector asm_##func func
+
+# define DECLARE_IDTENTRY_DEBUG(vector, func)				\
+	idtentry_mce_db vector asm_##func func
+
+#else
+# define DECLARE_IDTENTRY_MCE(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+
+# define DECLARE_IDTENTRY_DEBUG(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+#endif
+
+/* No ASM code emitted for NMI */
+#define DECLARE_IDTENTRY_NMI(vector, func)
+
 #endif /* __ASSEMBLY__ */
 
 /*



* [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (7 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:23   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST Thomas Gleixner
                   ` (14 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

There is no reason to have nmi_enter/exit() in the actual MCE
handlers. Move them to the entry point. This also covers the initial
handler, which only prints a message and was not covered until now.
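
A rough user-space model of the idea: once the enter/exit pair lives in the
entry point, every handler reachable through the function pointer runs in
NMI context, including the initial one. All mock_* names are hypothetical,
the nesting counter is a stand-in for the real context tracking, and the
offline/crashing CPU check is left out.

```c
/* Stub: the real struct pt_regs has many more fields. */
struct pt_regs {
	unsigned long cs;
};

/* Mock NMI context tracking: a plain nesting counter. */
static int mock_nmi_depth;
/* Depth observed by the most recently invoked handler. */
static int mock_depth_seen_by_handler = -1;

static void mock_nmi_enter(void) { mock_nmi_depth++; }
static void mock_nmi_exit(void)  { mock_nmi_depth--; }

/* Initial handler: it never had its own enter/exit pair. */
void mock_unexpected_machine_check(struct pt_regs *regs)
{
	(void)regs;
	mock_depth_seen_by_handler = mock_nmi_depth;
}

/* Full handler: its enter/exit pair is assumed to be removed already. */
void mock_do_machine_check(struct pt_regs *regs)
{
	(void)regs;
	mock_depth_seen_by_handler = mock_nmi_depth;
}

/* Handler installed for this mock CPU setup. */
void (*mock_machine_check_vector)(struct pt_regs *) =
					mock_unexpected_machine_check;

/* Entry point: the only place which enters and leaves NMI context. */
void mock_do_mce(struct pt_regs *regs)
{
	mock_nmi_enter();

	mock_machine_check_vector(regs);

	mock_nmi_exit();
}
```

Whichever handler is installed, it observes a depth of 1 and the counter is
back at 0 when mock_do_mce() returns.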

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mce/core.c    |   26 +++++++++++++-------------
 arch/x86/kernel/cpu/mce/p5.c      |    4 ----
 arch/x86/kernel/cpu/mce/winchip.c |    4 ----
 3 files changed, 13 insertions(+), 21 deletions(-)

--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1100,8 +1100,10 @@ static void mce_clear_state(unsigned lon
  * kdump kernel establishing a new #MC handler where a broadcasted MCE
  * might not get handled properly.
  */
-static bool __mc_check_crashing_cpu(int cpu)
+static noinstr bool mce_check_crashing_cpu(void)
 {
+	unsigned int cpu = smp_processor_id();
+
 	if (cpu_is_offline(cpu) ||
 	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
@@ -1235,7 +1237,6 @@ void noinstr do_machine_check(struct pt_
 	DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
 	DECLARE_BITMAP(toclear, MAX_NR_BANKS);
 	struct mca_config *cfg = &mca_cfg;
-	int cpu = smp_processor_id();
 	struct mce m, *final;
 	char *msg = NULL;
 	int worst = 0;
@@ -1264,11 +1265,6 @@ void noinstr do_machine_check(struct pt_
 	 */
 	int lmce = 1;
 
-	if (__mc_check_crashing_cpu(cpu))
-		return;
-
-	nmi_enter();
-
 	this_cpu_inc(mce_exception_count);
 
 	mce_gather_info(&m, regs);
@@ -1356,7 +1352,7 @@ void noinstr do_machine_check(struct pt_
 	sync_core();
 
 	if (worst != MCE_AR_SEVERITY && !kill_it)
-		goto out_ist;
+		return;
 
 	/* Fault was in user mode and we need to take some action */
 	if ((m.cs & 3) == 3) {
@@ -1373,9 +1369,6 @@ void noinstr do_machine_check(struct pt_
 		if (!fixup_exception(regs, X86_TRAP_MC, error_code, 0))
 			mce_panic("Failed kernel mode recovery", &m, msg);
 	}
-
-out_ist:
-	nmi_exit();
 }
 EXPORT_SYMBOL_GPL(do_machine_check);
 
@@ -1912,11 +1905,18 @@ static void unexpected_machine_check(str
 void (*machine_check_vector)(struct pt_regs *, long error_code) =
 						unexpected_machine_check;
 
-dotraplinkage notrace void do_mce(struct pt_regs *regs, long error_code)
+dotraplinkage noinstr void do_mce(struct pt_regs *regs, long error_code)
 {
+	if (machine_check_vector == do_machine_check &&
+	    mce_check_crashing_cpu())
+		return;
+
+	nmi_enter();
+
 	machine_check_vector(regs, error_code);
+
+	nmi_exit();
 }
-NOKPROBE_SYMBOL(do_mce);
 
 /*
  * Called for each booted CPU to set up machine checks.
--- a/arch/x86/kernel/cpu/mce/p5.c
+++ b/arch/x86/kernel/cpu/mce/p5.c
@@ -25,8 +25,6 @@ static void pentium_machine_check(struct
 {
 	u32 loaddr, hi, lotype;
 
-	nmi_enter();
-
 	rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
 	rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
 
@@ -39,8 +37,6 @@ static void pentium_machine_check(struct
 	}
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
-
-	nmi_exit();
 }
 
 /* Set up machine check reporting for processors with Intel style MCE: */
--- a/arch/x86/kernel/cpu/mce/winchip.c
+++ b/arch/x86/kernel/cpu/mce/winchip.c
@@ -19,12 +19,8 @@
 /* Machine check handler for WinChip C6: */
 static void winchip_machine_check(struct pt_regs *regs, long error_code)
 {
-	nmi_enter();
-
 	pr_emerg("CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
-
-	nmi_exit();
 }
 
 /* Set up machine check reporting on the Winchip C6 series */



* [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (8 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:24   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check Thomas Gleixner
                   ` (13 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Convert #MC to IDTENTRY_MCE:
  - Implement the C entry points with DEFINE_IDTENTRY_MCE
  - Emit the ASM stub with DECLARE_IDTENTRY_MCE
  - Remove the ASM idtentry in 64bit
  - Remove the open coded ASM entry code in 32bit
  - Fixup the XEN/PV code
  - Remove the old prototypes
  - Remove the error code from *machine_check_vector() as
    it is always 0 and not used by any of the functions
    it can point to. Fixup all the functions as well.

No functional change.
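
For illustration, the shape of the converted C entry point can be modelled
in user space: the handler is reached through a function pointer that no
longer takes an error code, and the enter/exit pair is chosen by the mode
the exception came from. This is a sketch under assumptions: the mock_*
names and the context enum are hypothetical, and user mode is approximated
by testing the low two bits of CS.

```c
#include <stdbool.h>

/* Stub: the real struct pt_regs has many more fields. */
struct pt_regs {
	unsigned long cs;
};

enum mock_ctx { MOCK_CTX_NONE, MOCK_CTX_IDTENTRY, MOCK_CTX_NMI };

static enum mock_ctx mock_ctx = MOCK_CTX_NONE;
static enum mock_ctx mock_ctx_in_handler = MOCK_CTX_NONE;
/* Counts exits which do not match the preceding enter. */
static int mock_unbalanced;

/* Simplification: CPL 3 in the low two bits of CS means user mode. */
static bool mock_user_mode(const struct pt_regs *regs)
{
	return (regs->cs & 3) == 3;
}

static void mock_idtentry_enter(void) { mock_ctx = MOCK_CTX_IDTENTRY; }
static void mock_nmi_enter(void)      { mock_ctx = MOCK_CTX_NMI; }

static void mock_idtentry_exit(void)
{
	if (mock_ctx != MOCK_CTX_IDTENTRY)
		mock_unbalanced++;
	mock_ctx = MOCK_CTX_NONE;
}

static void mock_nmi_exit(void)
{
	if (mock_ctx != MOCK_CTX_NMI)
		mock_unbalanced++;
	mock_ctx = MOCK_CTX_NONE;
}

/* Handler without an error code argument, as after the conversion. */
static void mock_handler(struct pt_regs *regs)
{
	(void)regs;
	mock_ctx_in_handler = mock_ctx;
}

static void (*mock_machine_check_vector)(struct pt_regs *) = mock_handler;

void mock_exc_machine_check(struct pt_regs *regs)
{
	if (mock_user_mode(regs))
		mock_idtentry_enter();
	else
		mock_nmi_enter();

	mock_machine_check_vector(regs);

	if (mock_user_mode(regs))
		mock_idtentry_exit();
	else
		mock_nmi_exit();
}
```

The mode test is evaluated twice on the same regs, so enter and exit always
pair up as long as the handler does not change regs->cs.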

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_32.S          |    9 ---------
 arch/x86/entry/entry_64.S          |    3 ---
 arch/x86/include/asm/idtentry.h    |    4 ++++
 arch/x86/include/asm/mce.h         |    2 +-
 arch/x86/include/asm/traps.h       |    7 -------
 arch/x86/kernel/cpu/mce/core.c     |   23 ++++++++++++++---------
 arch/x86/kernel/cpu/mce/inject.c   |    4 ++--
 arch/x86/kernel/cpu/mce/internal.h |    2 +-
 arch/x86/kernel/cpu/mce/p5.c       |    2 +-
 arch/x86/kernel/cpu/mce/winchip.c  |    2 +-
 arch/x86/kernel/idt.c              |   10 +++++-----
 arch/x86/kvm/vmx/vmx.c             |    2 +-
 arch/x86/xen/enlighten_pv.c        |    2 +-
 arch/x86/xen/xen-asm_64.S          |    2 +-
 14 files changed, 32 insertions(+), 42 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1297,15 +1297,6 @@ SYM_CODE_START(native_iret)
 SYM_CODE_END(native_iret)
 #endif
 
-#ifdef CONFIG_X86_MCE
-SYM_CODE_START(machine_check)
-	ASM_CLAC
-	pushl	$0
-	pushl	$do_mce
-	jmp	common_exception
-SYM_CODE_END(machine_check)
-#endif
-
 #ifdef CONFIG_XEN_PV
 SYM_FUNC_START(xen_hypervisor_callback)
 	/*
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1075,9 +1075,6 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work
 
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
-#ifdef CONFIG_X86_MCE
-idtentry_mce_db	X86_TRAP_MCE	 	machine_check		do_mce
-#endif
 idtentry_mce_db	X86_TRAP_DB		debug			do_debug
 idtentry_df	X86_TRAP_DF		double_fault		do_double_fault
 
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -238,4 +238,8 @@ DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_AC,
 /* Raw exception entries which need extra work */
 DECLARE_IDTENTRY_RAW(X86_TRAP_BP,	exc_int3);
 
+#ifdef CONFIG_X86_MCE
+DECLARE_IDTENTRY_MCE(X86_TRAP_MC,	exc_machine_check);
+#endif
+
 #endif
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -238,7 +238,7 @@ extern void mce_disable_bank(int bank);
 /*
  * Exception handler
  */
-void do_machine_check(struct pt_regs *, long);
+void do_machine_check(struct pt_regs *pt_regs);
 
 /*
  * Threshold handler
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -18,18 +18,12 @@ asmlinkage void double_fault(void);
 #endif
 asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
-#ifdef CONFIG_X86_MCE
-asmlinkage void machine_check(void);
-#endif /* CONFIG_X86_MCE */
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
 asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
-#ifdef CONFIG_X86_MCE
-asmlinkage void xen_machine_check(void);
-#endif /* CONFIG_X86_MCE */
 #endif
 
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
@@ -38,7 +32,6 @@ dotraplinkage void do_nmi(struct pt_regs
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 #endif
 dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
-dotraplinkage void do_mce(struct pt_regs *regs, long error_code);
 
 #ifdef CONFIG_X86_64
 asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs);
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1232,7 +1232,7 @@ static void kill_me_maybe(struct callbac
  * backing the user stack, tracing that reads the user stack will cause
  * potentially infinite recursion.
  */
-void noinstr do_machine_check(struct pt_regs *regs, long error_code)
+void noinstr do_machine_check(struct pt_regs *regs)
 {
 	DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
 	DECLARE_BITMAP(toclear, MAX_NR_BANKS);
@@ -1366,7 +1366,7 @@ void noinstr do_machine_check(struct pt_
 			current->mce_kill_me.func = kill_me_now;
 		task_work_add(current, &current->mce_kill_me, true);
 	} else {
-		if (!fixup_exception(regs, X86_TRAP_MC, error_code, 0))
+		if (!fixup_exception(regs, X86_TRAP_MC, 0, 0))
 			mce_panic("Failed kernel mode recovery", &m, msg);
 	}
 }
@@ -1895,27 +1895,32 @@ bool filter_mce(struct mce *m)
 }
 
 /* Handle unconfigured int18 (should never happen) */
-static void unexpected_machine_check(struct pt_regs *regs, long error_code)
+static void unexpected_machine_check(struct pt_regs *regs)
 {
 	pr_err("CPU#%d: Unexpected int18 (Machine Check)\n",
 	       smp_processor_id());
 }
 
 /* Call the installed machine check handler for this CPU setup. */
-void (*machine_check_vector)(struct pt_regs *, long error_code) =
-						unexpected_machine_check;
+void (*machine_check_vector)(struct pt_regs *) = unexpected_machine_check;
 
-dotraplinkage noinstr void do_mce(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_MCE(exc_machine_check)
 {
 	if (machine_check_vector == do_machine_check &&
 	    mce_check_crashing_cpu())
 		return;
 
-	nmi_enter();
+	if (user_mode(regs))
+		idtentry_enter(regs);
+	else
+		nmi_enter();
 
-	machine_check_vector(regs, error_code);
+	machine_check_vector(regs);
 
-	nmi_exit();
+	if (user_mode(regs))
+		idtentry_exit(regs);
+	else
+		nmi_exit();
 }
 
 /*
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -146,9 +146,9 @@ static void raise_exception(struct mce *
 		regs.cs = m->cs;
 		pregs = &regs;
 	}
-	/* in mcheck exeception handler, irq will be disabled */
+	/* do_machine_check() expects interrupts disabled -- at least */
 	local_irq_save(flags);
-	do_machine_check(pregs, 0);
+	do_machine_check(pregs);
 	local_irq_restore(flags);
 	m->finished = 0;
 }
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -9,7 +9,7 @@
 #include <asm/mce.h>
 
 /* Pointer to the installed machine check handler for this CPU setup. */
-extern void (*machine_check_vector)(struct pt_regs *, long error_code);
+extern void (*machine_check_vector)(struct pt_regs *);
 
 enum severity_level {
 	MCE_NO_SEVERITY,
--- a/arch/x86/kernel/cpu/mce/p5.c
+++ b/arch/x86/kernel/cpu/mce/p5.c
@@ -21,7 +21,7 @@
 int mce_p5_enabled __read_mostly;
 
 /* Machine check handler for Pentium class Intel CPUs: */
-static void pentium_machine_check(struct pt_regs *regs, long error_code)
+static void pentium_machine_check(struct pt_regs *regs)
 {
 	u32 loaddr, hi, lotype;
 
--- a/arch/x86/kernel/cpu/mce/winchip.c
+++ b/arch/x86/kernel/cpu/mce/winchip.c
@@ -17,7 +17,7 @@
 #include "internal.h"
 
 /* Machine check handler for WinChip C6: */
-static void winchip_machine_check(struct pt_regs *regs, long error_code)
+static void winchip_machine_check(struct pt_regs *regs)
 {
 	pr_emerg("CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -93,7 +93,7 @@ static const __initconst struct idt_data
 	INTG(X86_TRAP_DB,		debug),
 
 #ifdef CONFIG_X86_MCE
-	INTG(X86_TRAP_MC,		&machine_check),
+	INTG(X86_TRAP_MC,		asm_exc_machine_check),
 #endif
 
 	SYSG(X86_TRAP_OF,		asm_exc_overflow),
@@ -182,11 +182,11 @@ gate_desc debug_idt_table[IDT_ENTRIES] _
  * cpu_init() when the TSS has been initialized.
  */
 static const __initconst struct idt_data ist_idts[] = {
-	ISTG(X86_TRAP_DB,	debug,		IST_INDEX_DB),
-	ISTG(X86_TRAP_NMI,	nmi,		IST_INDEX_NMI),
-	ISTG(X86_TRAP_DF,	double_fault,	IST_INDEX_DF),
+	ISTG(X86_TRAP_DB,	debug,			IST_INDEX_DB),
+	ISTG(X86_TRAP_NMI,	nmi,			IST_INDEX_NMI),
+	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
-	ISTG(X86_TRAP_MC,	&machine_check,	IST_INDEX_MCE),
+	ISTG(X86_TRAP_MC,	asm_exc_machine_check,	IST_INDEX_MCE),
 #endif
 };
 
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4578,7 +4578,7 @@ static void kvm_machine_check(void)
 		.flags = X86_EFLAGS_IF,
 	};
 
-	do_machine_check(&regs, 0);
+	do_machine_check(&regs);
 #endif
 }
 
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -614,7 +614,7 @@ static struct trap_array_entry trap_arra
 	{ debug,                       xen_xendebug,                    true },
 	{ double_fault,                xen_double_fault,                true },
 #ifdef CONFIG_X86_MCE
-	{ machine_check,               xen_machine_check,               true },
+	TRAP_ENTRY(exc_machine_check,			true  ),
 #endif
 	{ nmi,                         xen_xennmi,                      true },
 	TRAP_ENTRY(exc_int3,				false ),
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -48,7 +48,7 @@ xen_pv_trap asm_exc_spurious_interrupt_b
 xen_pv_trap asm_exc_coprocessor_error
 xen_pv_trap asm_exc_alignment_check
 #ifdef CONFIG_X86_MCE
-xen_pv_trap machine_check
+xen_pv_trap asm_exc_machine_check
 #endif /* CONFIG_X86_MCE */
 xen_pv_trap asm_exc_simd_coprocessor_error
 #ifdef CONFIG_IA32_EMULATION



* [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (9 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:24   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV Thomas Gleixner
                   ` (12 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

mce_check_crashing_cpu() is called right at the entry of the MCE
handler. It uses mce_rdmsrl() and mce_wrmsrl(), which are wrappers around
rdmsr() and wrmsr() that handle the MCE error injection mechanism. That is
pointless in this context, i.e. when the MCE hits an offline CPU or the
system is already marked crashing.

The MSR access can also be traced, so use the untraceable variants. This
is also safe vs. XEN paravirt as these MSRs are not affected by XEN PV
modifications.
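
The difference between the two access paths can be modelled in user space.
This is a sketch, not kernel code: the MSR is a plain variable, the trace
counter stands in for tracepoints and the injection hook, all mock_* names
are hypothetical, the bit position of MOCK_MCG_STATUS_RIPV is an assumption
of this mock, and the Zhaoxin LMCES branch is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, chosen for this mock only. */
#define MOCK_MCG_STATUS_RIPV	(1ULL << 0)

/* The "MSR" is a plain variable here. */
static uint64_t mock_mcg_status;
/* Counts side effects caused by the traced wrappers. */
static int mock_trace_events;

/* Raw accessors: touch the register and nothing else. */
static uint64_t mock_raw_rdmsr(void)
{
	return mock_mcg_status;
}

/* Like __wrmsr() in the patch, the value is passed as two 32bit halves. */
static void mock_raw_wrmsr(uint32_t low, uint32_t high)
{
	mock_mcg_status = ((uint64_t)high << 32) | low;
}

/* Traced wrappers: same access plus an observable side effect. */
uint64_t mock_traced_rdmsrl(void)
{
	mock_trace_events++;
	return mock_raw_rdmsr();
}

void mock_traced_wrmsrl(uint64_t val)
{
	mock_trace_events++;
	mock_raw_wrmsr((uint32_t)val, (uint32_t)(val >> 32));
}

/* Early check: uses only the raw accessors, so nothing is traced. */
bool mock_check_crashing_cpu(bool cpu_offline, int crashing_cpu, int cpu)
{
	if (cpu_offline || (crashing_cpu != -1 && crashing_cpu != cpu)) {
		uint64_t mcgstatus = mock_raw_rdmsr();

		if (mcgstatus & MOCK_MCG_STATUS_RIPV) {
			mock_raw_wrmsr(0, 0);
			return true;
		}
	}
	return false;
}
```

In this model mock_check_crashing_cpu() never increments the trace counter,
which is the property the patch is after for the real entry path.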

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mce/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1108,7 +1108,7 @@ static noinstr bool mce_check_crashing_c
 	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
 
-		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
+		mcgstatus = __rdmsr(MSR_IA32_MCG_STATUS);
 
 		if (boot_cpu_data.x86_vendor == X86_VENDOR_ZHAOXIN) {
 			if (mcgstatus & MCG_STATUS_LMCES)
@@ -1116,7 +1116,7 @@ static noinstr bool mce_check_crashing_c
 		}
 
 		if (mcgstatus & MCG_STATUS_RIPV) {
-			mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
+			__wrmsr(MSR_IA32_MCG_STATUS, 0, 0);
 			return true;
 		}
 	}



* [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (10 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:25   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI Thomas Gleixner
                   ` (11 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

XEN/PV has special wrappers for NMI and DB exceptions. They redirect these
exceptions through regular IDTENTRY points. Provide the necessary IDTENTRY
macros to make this work.
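
How the C-side macro derives the two symbol names by token pasting can be
tried out in user space. This is a sketch under assumptions: asmlinkage is
dropped, the function bodies are written in C although the real ones are
ASM stubs, the call chain between them is this mock's reading of the
redirection, and MOCK_TRAP_NMI and mock_exc_nmi() are hypothetical.

```c
/* Hypothetical stand-in for the vector number, which C ignores anyway. */
#define MOCK_TRAP_NMI	2

/* Same shape as the C-side macro, minus asmlinkage. */
#define DECLARE_IDTENTRY_XEN(vector, func)				\
	void xen_asm_exc_xen##func(void);				\
	void asm_exc_xen##func(void)

/* Declares xen_asm_exc_xennmi() and asm_exc_xennmi(). */
DECLARE_IDTENTRY_XEN(MOCK_TRAP_NMI, nmi);

/* Hypothetical C handler which the redirection finally reaches. */
static int mock_exc_nmi_calls;

static void mock_exc_nmi(void)
{
	mock_exc_nmi_calls++;
}

/* Mock of the regular IDTENTRY stub which calls the C handler. */
void asm_exc_xennmi(void)
{
	mock_exc_nmi();
}

/* Mock of the XEN/PV wrapper which redirects to the regular stub. */
void xen_asm_exc_xennmi(void)
{
	asm_exc_xennmi();
}
```

Because the prototypes come from the macro, a misspelled definition would
leave one of them undefined and fail at link time once it is referenced.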

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/idtentry.h |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -168,6 +168,18 @@ static __always_inline void __##func(str
 #define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
 
+/**
+ * DECLARE_IDTENTRY_XEN - Declare functions for XEN redirect IDT entry points
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Used for xennmi and xendebug redirections. No DEFINE as this is all ASM
+ * indirection magic.
+ */
+#define DECLARE_IDTENTRY_XEN(vector, func)				\
+	asmlinkage void xen_asm_exc_xen##func(void);			\
+	asmlinkage void asm_exc_xen##func(void)
+
 #else /* !__ASSEMBLY__ */
 
 /*
@@ -203,6 +215,10 @@ static __always_inline void __##func(str
 /* No ASM code emitted for NMI */
 #define DECLARE_IDTENTRY_NMI(vector, func)
 
+/* XEN NMI and DB wrapper */
+#define DECLARE_IDTENTRY_XEN(vector, func)				\
+	idtentry vector asm_exc_xen##func exc_##func has_error_code=0 sane=1
+
 #endif /* __ASSEMBLY__ */
 
 /*



* [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (11 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:26   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation Thomas Gleixner
                   ` (10 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Convert #NMI to IDTENTRY_NMI:
  - Implement the C entry point with DEFINE_IDTENTRY_NMI
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.
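
The XEN/PV fixup relies on a table which pairs each native entry stub with
its XEN replacement, where the NMI entry needs a differently named
replacement. A user-space sketch of that pairing follows. The field types,
the empty stub bodies and the lookup helper mock_xen_entry_for() are
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*trap_fn)(void);

/* Field names follow the patch; the types are assumed. */
struct trap_array_entry {
	trap_fn	orig;
	trap_fn	xen;
	bool	ist_okay;
};

/* Replacement has the same base name as the native entry. */
#define TRAP_ENTRY(func, ist_ok) {			\
	.orig		= asm_##func,			\
	.xen		= xen_asm_##func,		\
	.ist_okay	= ist_ok }

/* Replacement has a different base name than the native entry. */
#define TRAP_ENTRY_REDIR(func, xenfunc, ist_ok) {	\
	.orig		= asm_##func,			\
	.xen		= xen_asm_##xenfunc,		\
	.ist_okay	= ist_ok }

/* Empty mock stubs; the real ones are ASM entry points. */
void asm_exc_nmi(void) { }
void xen_asm_exc_xennmi(void) { }
void asm_exc_int3(void) { }
void xen_asm_exc_int3(void) { }

static struct trap_array_entry trap_array[] = {
	TRAP_ENTRY_REDIR(exc_nmi, exc_xennmi,		true  ),
	TRAP_ENTRY(exc_int3,				false ),
};

/* Hypothetical helper: find the XEN replacement for a native entry. */
trap_fn mock_xen_entry_for(trap_fn orig)
{
	size_t i;

	for (i = 0; i < sizeof(trap_array) / sizeof(trap_array[0]); i++) {
		if (trap_array[i].orig == orig)
			return trap_array[i].xen;
	}
	return NULL;
}
```

TRAP_ENTRY(exc_int3, ...) pastes the same name on both sides, while
TRAP_ENTRY_REDIR(exc_nmi, exc_xennmi, ...) pairs asm_exc_nmi with
xen_asm_exc_xennmi.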

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_32.S       |    8 ++++----
 arch/x86/entry/entry_64.S       |   15 +++++++--------
 arch/x86/include/asm/idtentry.h |    4 ++++
 arch/x86/include/asm/traps.h    |    3 ---
 arch/x86/kernel/idt.c           |    4 ++--
 arch/x86/kernel/nmi.c           |    4 +---
 arch/x86/xen/enlighten_pv.c     |    7 ++++++-
 arch/x86/xen/xen-asm_64.S       |    2 +-
 8 files changed, 25 insertions(+), 22 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1547,7 +1547,7 @@ SYM_CODE_END(double_fault)
  * switched stacks.  We handle both conditions by simply checking whether we
  * interrupted kernel code running on the SYSENTER stack.
  */
-SYM_CODE_START(nmi)
+SYM_CODE_START(asm_exc_nmi)
 	ASM_CLAC
 
 #ifdef CONFIG_X86_ESPFIX32
@@ -1576,7 +1576,7 @@ SYM_CODE_START(nmi)
 	jb	.Lnmi_from_sysenter_stack
 
 	/* Not on SYSENTER stack. */
-	call	do_nmi
+	call	exc_nmi
 	jmp	.Lnmi_return
 
 .Lnmi_from_sysenter_stack:
@@ -1586,7 +1586,7 @@ SYM_CODE_START(nmi)
 	 */
 	movl	%esp, %ebx
 	movl	PER_CPU_VAR(cpu_current_top_of_stack), %esp
-	call	do_nmi
+	call	exc_nmi
 	movl	%ebx, %esp
 
 .Lnmi_return:
@@ -1640,7 +1640,7 @@ SYM_CODE_START(nmi)
 	lss	(1+5+6)*4(%esp), %esp			# back to espfix stack
 	jmp	.Lirq_return
 #endif
-SYM_CODE_END(nmi)
+SYM_CODE_END(asm_exc_nmi)
 
 .pushsection .text, "ax"
 SYM_CODE_START(rewind_stack_do_exit)
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1080,7 +1080,6 @@ idtentry_df	X86_TRAP_DF		double_fault		d
 
 #ifdef CONFIG_XEN_PV
 idtentry	512 /* dummy */		hypervisor_callback	xen_do_hypervisor_callback	has_error_code=0
-idtentry	X86_TRAP_NMI		xennmi			do_nmi				has_error_code=0
 idtentry	X86_TRAP_DB		xendebug		do_debug			has_error_code=0
 #endif
 
@@ -1416,7 +1415,7 @@ SYM_CODE_END(error_return)
  *	%r14: Used to save/restore the CR3 of the interrupted context
  *	      when PAGE_TABLE_ISOLATION is in use.  Do not clobber.
  */
-SYM_CODE_START(nmi)
+SYM_CODE_START(asm_exc_nmi)
 	UNWIND_HINT_IRET_REGS
 
 	/*
@@ -1501,7 +1500,7 @@ SYM_CODE_START(nmi)
 
 	movq	%rsp, %rdi
 	movq	$-1, %rsi
-	call	do_nmi
+	call	exc_nmi
 
 	/*
 	 * Return back to user mode.  We must *not* do the normal exit
@@ -1558,7 +1557,7 @@ SYM_CODE_START(nmi)
 	 * end_repeat_nmi, then we are a nested NMI.  We must not
 	 * modify the "iret" frame because it's being written by
 	 * the outer NMI.  That's okay; the outer NMI handler is
-	 * about to about to call do_nmi anyway, so we can just
+	 * about to about to call exc_nmi() anyway, so we can just
 	 * resume the outer NMI.
 	 */
 
@@ -1677,7 +1676,7 @@ SYM_CODE_START(nmi)
 	 * RSP is pointing to "outermost RIP".  gsbase is unknown, but, if
 	 * we're repeating an NMI, gsbase has the same value that it had on
 	 * the first iteration.  paranoid_entry will load the kernel
-	 * gsbase if needed before we call do_nmi.  "NMI executing"
+	 * gsbase if needed before we call exc_nmi().  "NMI executing"
 	 * is zero.
 	 */
 	movq	$1, 10*8(%rsp)		/* Set "NMI executing". */
@@ -1711,10 +1710,10 @@ SYM_CODE_START(nmi)
 	call	paranoid_entry
 	UNWIND_HINT_REGS
 
-	/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
+	/* paranoidentry exc_nmi(), 0; without TRACE_IRQS_OFF */
 	movq	%rsp, %rdi
 	movq	$-1, %rsi
-	call	do_nmi
+	call	exc_nmi
 
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
@@ -1751,7 +1750,7 @@ SYM_CODE_START(nmi)
 	 * about espfix64 on the way back to kernel mode.
 	 */
 	iretq
-SYM_CODE_END(nmi)
+SYM_CODE_END(asm_exc_nmi)
 
 #ifndef CONFIG_IA32_EMULATION
 /*
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -260,4 +260,8 @@ DECLARE_IDTENTRY_RAW(X86_TRAP_BP,	exc_in
 DECLARE_IDTENTRY_MCE(X86_TRAP_MC,	exc_machine_check);
 #endif
 
+/* NMI */
+DECLARE_IDTENTRY_NMI(X86_TRAP_NMI,	exc_nmi);
+DECLARE_IDTENTRY_XEN(X86_TRAP_NMI,	nmi);
+
 #endif
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -12,7 +12,6 @@
 #define dotraplinkage __visible
 
 asmlinkage void debug(void);
-asmlinkage void nmi(void);
 #ifdef CONFIG_X86_64
 asmlinkage void double_fault(void);
 #endif
@@ -20,14 +19,12 @@ asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
-asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #endif
 
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
-dotraplinkage void do_nmi(struct pt_regs *regs, long error_code);
 #if defined(CONFIG_X86_64) || defined(CONFIG_DOUBLEFAULT)
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 #endif
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -71,7 +71,7 @@ static const __initconst struct idt_data
  */
 static const __initconst struct idt_data def_idts[] = {
 	INTG(X86_TRAP_DE,		asm_exc_divide_error),
-	INTG(X86_TRAP_NMI,		nmi),
+	INTG(X86_TRAP_NMI,		asm_exc_nmi),
 	INTG(X86_TRAP_BR,		asm_exc_bounds),
 	INTG(X86_TRAP_UD,		asm_exc_invalid_op),
 	INTG(X86_TRAP_NM,		asm_exc_device_not_available),
@@ -183,7 +183,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] _
  */
 static const __initconst struct idt_data ist_idts[] = {
 	ISTG(X86_TRAP_DB,	debug,			IST_INDEX_DB),
-	ISTG(X86_TRAP_NMI,	nmi,			IST_INDEX_NMI),
+	ISTG(X86_TRAP_NMI,	asm_exc_nmi,		IST_INDEX_NMI),
 	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
 	ISTG(X86_TRAP_MC,	asm_exc_machine_check,	IST_INDEX_MCE),
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -507,8 +507,7 @@ static bool notrace is_debug_stack(unsig
 NOKPROBE_SYMBOL(is_debug_stack);
 #endif
 
-dotraplinkage notrace void
-do_nmi(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_NMI(exc_nmi)
 {
 	if (IS_ENABLED(CONFIG_SMP) && cpu_is_offline(smp_processor_id()))
 		return;
@@ -558,7 +557,6 @@ do_nmi(struct pt_regs *regs, long error_
 	if (user_mode(regs))
 		mds_user_clear_cpu_buffers();
 }
-NOKPROBE_SYMBOL(do_nmi);
 
 void stop_nmi(void)
 {
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -610,13 +610,18 @@ struct trap_array_entry {
 	.xen		= xen_asm_##func,		\
 	.ist_okay	= ist_ok }
 
+#define TRAP_ENTRY_REDIR(func, xenfunc, ist_ok) {	\
+	.orig		= asm_##func,			\
+	.xen		= xen_asm_##xenfunc,		\
+	.ist_okay	= ist_ok }
+
 static struct trap_array_entry trap_array[] = {
 	{ debug,                       xen_xendebug,                    true },
 	{ double_fault,                xen_double_fault,                true },
 #ifdef CONFIG_X86_MCE
 	TRAP_ENTRY(exc_machine_check,			true  ),
 #endif
-	{ nmi,                         xen_xennmi,                      true },
+	TRAP_ENTRY_REDIR(exc_nmi, exc_xennmi,		true  ),
 	TRAP_ENTRY(exc_int3,				false ),
 	TRAP_ENTRY(exc_overflow,			false ),
 #ifdef CONFIG_IA32_EMULATION
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -32,7 +32,7 @@ xen_pv_trap asm_exc_divide_error
 xen_pv_trap debug
 xen_pv_trap xendebug
 xen_pv_trap asm_exc_int3
-xen_pv_trap xennmi
+xen_pv_trap asm_exc_xennmi
 xen_pv_trap asm_exc_overflow
 xen_pv_trap asm_exc_bounds
 xen_pv_trap asm_exc_invalid_op



* [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (12 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:26   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling Thomas Gleixner
                   ` (9 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Mark all functions in the fragile code parts noinstr or force inlining so
they can't be instrumented.
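
The two techniques can be approximated outside the kernel with GCC/Clang
attributes. This is a sketch under assumptions: mock_noinstr only combines
noinline and no_instrument_function, so it does not place the function in
a dedicated text section, and all mock_* names and counters are
hypothetical stand-ins for the per-CPU state and the IDT load.

```c
/* Approximations of the kernel annotations for GCC/Clang. */
#define mock_always_inline	inline __attribute__((always_inline))
#define mock_noinstr		__attribute__((noinline, no_instrument_function))

/* Stand-ins for the per-CPU counter and the actual IDT load. */
static int mock_debug_idt_ctr;
static int mock_idt_loads;
static int mock_last_idt_was_debug;

/* Forced inline: no out-of-line copy which could be instrumented. */
static mock_always_inline int mock_is_debug_idt_enabled(void)
{
	return mock_debug_idt_ctr != 0;
}

static mock_always_inline void mock_load_current_idt(void)
{
	mock_last_idt_was_debug = mock_is_debug_idt_enabled();
	mock_idt_loads++;
}

/* Out-of-line, but excluded from -finstrument-functions hooks. */
mock_noinstr void mock_debug_stack_set_zero(void)
{
	mock_debug_idt_ctr++;
	mock_load_current_idt();
}

mock_noinstr void mock_debug_stack_reset(void)
{
	if (!mock_debug_idt_ctr)
		return;
	if (--mock_debug_idt_ctr == 0)
		mock_load_current_idt();
}
```

The helpers called from the mock_noinstr functions are forced inline so that
their code ends up inside the protected callers instead of in a separate
function which a compiler could instrument.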

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/desc.h  |    8 ++++----
 arch/x86/kernel/cpu/common.c |    6 ++----
 arch/x86/kernel/nmi.c        |   10 ++++++----
 3 files changed, 12 insertions(+), 12 deletions(-)

--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -214,7 +214,7 @@ static inline void native_load_gdt(const
 	asm volatile("lgdt %0"::"m" (*dtr));
 }
 
-static inline void native_load_idt(const struct desc_ptr *dtr)
+static __always_inline void native_load_idt(const struct desc_ptr *dtr)
 {
 	asm volatile("lidt %0"::"m" (*dtr));
 }
@@ -393,7 +393,7 @@ extern unsigned long system_vectors[];
 
 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(u32, debug_idt_ctr);
-static inline bool is_debug_idt_enabled(void)
+static __always_inline bool is_debug_idt_enabled(void)
 {
 	if (this_cpu_read(debug_idt_ctr))
 		return true;
@@ -401,7 +401,7 @@ static inline bool is_debug_idt_enabled(
 	return false;
 }
 
-static inline void load_debug_idt(void)
+static __always_inline void load_debug_idt(void)
 {
 	load_idt((const struct desc_ptr *)&debug_idt_descr);
 }
@@ -423,7 +423,7 @@ static inline void load_debug_idt(void)
  * that doesn't need to disable interrupts, as nothing should be
  * bothering the CPU then.
  */
-static inline void load_current_idt(void)
+static __always_inline void load_current_idt(void)
 {
 	if (is_debug_idt_enabled())
 		load_debug_idt();
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1692,21 +1692,19 @@ void syscall_init(void)
 DEFINE_PER_CPU(int, debug_stack_usage);
 DEFINE_PER_CPU(u32, debug_idt_ctr);
 
-void debug_stack_set_zero(void)
+noinstr void debug_stack_set_zero(void)
 {
 	this_cpu_inc(debug_idt_ctr);
 	load_current_idt();
 }
-NOKPROBE_SYMBOL(debug_stack_set_zero);
 
-void debug_stack_reset(void)
+noinstr void debug_stack_reset(void)
 {
 	if (WARN_ON(!this_cpu_read(debug_idt_ctr)))
 		return;
 	if (this_cpu_dec_return(debug_idt_ctr) == 0)
 		load_current_idt();
 }
-NOKPROBE_SYMBOL(debug_stack_reset);
 
 #else	/* CONFIG_X86_64 */
 
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -307,7 +307,7 @@ NOKPROBE_SYMBOL(unknown_nmi_error);
 static DEFINE_PER_CPU(bool, swallow_nmi);
 static DEFINE_PER_CPU(unsigned long, last_nmi_rip);
 
-static void default_do_nmi(struct pt_regs *regs)
+static noinstr void default_do_nmi(struct pt_regs *regs)
 {
 	unsigned char reason = 0;
 	int handled;
@@ -333,6 +333,7 @@ static void default_do_nmi(struct pt_reg
 
 	__this_cpu_write(last_nmi_rip, regs->ip);
 
+	instr_begin();
 	handled = nmi_handle(NMI_LOCAL, regs);
 	__this_cpu_add(nmi_stats.normal, handled);
 	if (handled) {
@@ -346,6 +347,7 @@ static void default_do_nmi(struct pt_reg
 		 */
 		if (handled > 1)
 			__this_cpu_write(swallow_nmi, true);
+		instr_end();
 		return;
 	}
 
@@ -378,6 +380,7 @@ static void default_do_nmi(struct pt_reg
 #endif
 		__this_cpu_add(nmi_stats.external, 1);
 		raw_spin_unlock(&nmi_reason_lock);
+		instr_end();
 		return;
 	}
 	raw_spin_unlock(&nmi_reason_lock);
@@ -416,8 +419,8 @@ static void default_do_nmi(struct pt_reg
 		__this_cpu_add(nmi_stats.swallow, 1);
 	else
 		unknown_nmi_error(reason, regs);
+	instr_end();
 }
-NOKPROBE_SYMBOL(default_do_nmi);
 
 /*
  * NMIs can page fault or hit breakpoints which will cause it to lose
@@ -489,7 +492,7 @@ static DEFINE_PER_CPU(unsigned long, nmi
  */
 static DEFINE_PER_CPU(int, update_debug_stack);
 
-static bool notrace is_debug_stack(unsigned long addr)
+static noinstr bool is_debug_stack(unsigned long addr)
 {
 	struct cea_exception_stacks *cs = __this_cpu_read(cea_exception_stacks);
 	unsigned long top = CEA_ESTACK_TOP(cs, DB);
@@ -504,7 +507,6 @@ static bool notrace is_debug_stack(unsig
 	 */
 	return addr >= bot && addr < top;
 }
-NOKPROBE_SYMBOL(is_debug_stack);
 #endif
 
 DEFINE_IDTENTRY_NMI(exc_nmi)


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (13 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-07 17:18   ` Alexandre Chartre
                     ` (3 more replies)
  2020-05-05 13:49 ` [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB Thomas Gleixner
                   ` (8 subsequent siblings)
  23 siblings, 4 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon, Peter Zijlstra

From: Peter Zijlstra <peterz@infradead.org>

DR6/7 should be handled before nmi_enter() is invoked and restored after
nmi_exit() to minimize the exposure.

Split it out into helper inlines and bring it into the correct order.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
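Note on the intended ordering: DR7 is saved and cleared first, DR6 is
read, cleared and masked, and the saved DR7 value is written back on
exit. The saved value has to come from debug register 7 for the restore
to be meaningful. A minimal userspace sketch with a simulated register
file follows. The demo_* helpers are hypothetical, and the reserved-bits
mask value is an assumption made for the example.

```c
#include <assert.h>

/* Assumed value of the reserved DR6 bits, for illustration only. */
#define DEMO_DR6_RESERVED	0xFFFF0FF0UL

/* Simulated debug registers DR0..DR7. */
static unsigned long demo_dr[8];

static unsigned long demo_get_debugreg(int nr)
{
	return demo_dr[nr];
}

static void demo_set_debugreg(unsigned long val, int nr)
{
	demo_dr[nr] = val;
}

static void demo_debug_enter(unsigned long *dr6, unsigned long *dr7)
{
	/* Save DR7 and disable all breakpoints while handling #DB. */
	*dr7 = demo_get_debugreg(7);
	demo_set_debugreg(0, 7);

	/* Read DR6, clear it right away and drop the reserved bits. */
	*dr6 = demo_get_debugreg(6);
	demo_set_debugreg(0, 6);
	*dr6 &= ~DEMO_DR6_RESERVED;
}

static void demo_debug_exit(unsigned long dr7)
{
	/* Re-enable the breakpoints which were active on entry. */
	demo_set_debugreg(dr7, 7);
}
```

The caller brackets the nmi_enter()/nmi_exit() section with these two
helpers, so breakpoints stay disabled for the whole handler.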
 arch/x86/kernel/hw_breakpoint.c |    6 ---
 arch/x86/kernel/traps.c         |   62 +++++++++++++++++++++++++++-------------
 2 files changed, 44 insertions(+), 24 deletions(-)

--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -464,7 +464,7 @@ static int hw_breakpoint_handler(struct
 {
 	int i, cpu, rc = NOTIFY_STOP;
 	struct perf_event *bp;
-	unsigned long dr7, dr6;
+	unsigned long dr6;
 	unsigned long *dr6_p;
 
 	/* The DR6 value is pointed by args->err */
@@ -479,9 +479,6 @@ static int hw_breakpoint_handler(struct
 	if ((dr6 & DR_TRAP_BITS) == 0)
 		return NOTIFY_DONE;
 
-	get_debugreg(dr7, 7);
-	/* Disable breakpoints during exception handling */
-	set_debugreg(0UL, 7);
 	/*
 	 * Assert that local interrupts are disabled
 	 * Reset the DRn bits in the virtualized register value.
@@ -538,7 +535,6 @@ static int hw_breakpoint_handler(struct
 	    (dr6 & (~DR_TRAP_BITS)))
 		rc = NOTIFY_DONE;
 
-	set_debugreg(dr7, 7);
 	put_cpu();
 
 	return rc;
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -691,6 +691,44 @@ static bool is_sysenter_singlestep(struc
 #endif
 }
 
+static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
+{
+	/*
+	 * Disable breakpoints during exception handling; recursive exceptions
+	 * are exceedingly 'fun'.
+	 *
+	 * Since this function is NOKPROBE, and that also applies to
+	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
+	 * HW_BREAKPOINT_W on our stack)
+	 *
+	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
+	 * includes the entry stack, is excluded for everything.
+	 */
+	get_debugreg(*dr7, 7);
+	set_debugreg(0, 7);
+
+	/*
+	 * The Intel SDM says:
+	 *
+	 *   Certain debug exceptions may clear bits 0-3. The remaining
+	 *   contents of the DR6 register are never cleared by the
+	 *   processor. To avoid confusion in identifying debug
+	 *   exceptions, debug handlers should clear the register before
+	 *   returning to the interrupted task.
+	 *
+	 * Keep it simple: clear DR6 immediately.
+	 */
+	get_debugreg(*dr6, 6);
+	set_debugreg(0, 6);
+	/* Filter out all the reserved bits which are preset to 1 */
+	*dr6 &= ~DR6_RESERVED;
+}
+
+static __always_inline void debug_exit(unsigned long dr7)
+{
+	set_debugreg(dr7, 7);
+}
+
 /*
  * Our handling of the processor debug registers is non-trivial.
  * We do not clear them on entry and exit from the kernel. Therefore
@@ -718,28 +756,13 @@ static bool is_sysenter_singlestep(struc
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 {
 	struct task_struct *tsk = current;
+	unsigned long dr6, dr7;
 	int user_icebp = 0;
-	unsigned long dr6;
 	int si_code;
 
-	nmi_enter();
-
-	get_debugreg(dr6, 6);
-	/*
-	 * The Intel SDM says:
-	 *
-	 *   Certain debug exceptions may clear bits 0-3. The remaining
-	 *   contents of the DR6 register are never cleared by the
-	 *   processor. To avoid confusion in identifying debug
-	 *   exceptions, debug handlers should clear the register before
-	 *   returning to the interrupted task.
-	 *
-	 * Keep it simple: clear DR6 immediately.
-	 */
-	set_debugreg(0, 6);
+	debug_enter(&dr6, &dr7);
 
-	/* Filter out all the reserved bits which are preset to 1 */
-	dr6 &= ~DR6_RESERVED;
+	nmi_enter();
 
 	/*
 	 * The SDM says "The processor clears the BTF flag when it
@@ -777,7 +800,7 @@ dotraplinkage void do_debug(struct pt_re
 #endif
 
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
-							SIGTRAP) == NOTIFY_STOP)
+		       SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 
 	/*
@@ -816,6 +839,7 @@ dotraplinkage void do_debug(struct pt_re
 
 exit:
 	nmi_exit();
+	debug_exit(dr7);
 }
 NOKPROBE_SYMBOL(do_debug);
 


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (14 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:27   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub Thomas Gleixner
                   ` (7 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Convert #DB to IDTENTRY_DB:
  - Implement the C entry point with DEFINE_IDTENTRY_DEBUG
  - Emit the ASM stub with DECLARE_IDTENTRY_DEBUG
  - Remove the ASM idtentry in 64bit
  - Remove the open coded ASM entry code in 32bit
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_32.S       |   10 ----------
 arch/x86/entry/entry_64.S       |    2 --
 arch/x86/include/asm/idtentry.h |    4 ++++
 arch/x86/include/asm/traps.h    |    3 ---
 arch/x86/kernel/idt.c           |    8 ++++----
 arch/x86/kernel/traps.c         |   21 +++++++++++++--------
 arch/x86/xen/enlighten_pv.c     |    2 +-
 arch/x86/xen/xen-asm_64.S       |    4 ++--
 8 files changed, 24 insertions(+), 30 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1488,16 +1488,6 @@ SYM_CODE_START_LOCAL_NOALIGN(handle_exce
 	jmp	restore_all_switch_stack
 SYM_CODE_END(handle_exception)
 
-SYM_CODE_START(debug)
-	/*
-	 * Entry from sysenter is now handled in common_exception
-	 */
-	ASM_CLAC
-	pushl	$0
-	pushl	$do_debug
-	jmp	common_exception
-SYM_CODE_END(debug)
-
 #ifdef CONFIG_DOUBLEFAULT
 SYM_CODE_START(double_fault)
 1:
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1075,12 +1075,10 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work
 
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
-idtentry_mce_db	X86_TRAP_DB		debug			do_debug
 idtentry_df	X86_TRAP_DF		double_fault		do_double_fault
 
 #ifdef CONFIG_XEN_PV
 idtentry	512 /* dummy */		hypervisor_callback	xen_do_hypervisor_callback	has_error_code=0
-idtentry	X86_TRAP_DB		xendebug		do_debug			has_error_code=0
 #endif
 
 /*
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -262,4 +262,8 @@ DECLARE_IDTENTRY_MCE(X86_TRAP_MC,	exc_ma
 DECLARE_IDTENTRY_NMI(X86_TRAP_NMI,	exc_nmi);
 DECLARE_IDTENTRY_XEN(X86_TRAP_NMI,	nmi);
 
+/* #DB */
+DECLARE_IDTENTRY_DEBUG(X86_TRAP_DB,	exc_debug);
+DECLARE_IDTENTRY_XEN(X86_TRAP_DB,	debug);
+
 #endif
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -11,7 +11,6 @@
 
 #define dotraplinkage __visible
 
-asmlinkage void debug(void);
 #ifdef CONFIG_X86_64
 asmlinkage void double_fault(void);
 #endif
@@ -19,12 +18,10 @@ asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
-asmlinkage void xen_xendebug(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #endif
 
-dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
 #if defined(CONFIG_X86_64) || defined(CONFIG_DOUBLEFAULT)
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 #endif
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -56,7 +56,7 @@ struct idt_data {
  * stacks work only after cpu_init().
  */
 static const __initconst struct idt_data early_idts[] = {
-	INTG(X86_TRAP_DB,		debug),
+	INTG(X86_TRAP_DB,		asm_exc_debug),
 	SYSG(X86_TRAP_BP,		asm_exc_int3),
 #ifdef CONFIG_X86_32
 	INTG(X86_TRAP_PF,		page_fault),
@@ -90,7 +90,7 @@ static const __initconst struct idt_data
 #else
 	INTG(X86_TRAP_DF,		double_fault),
 #endif
-	INTG(X86_TRAP_DB,		debug),
+	INTG(X86_TRAP_DB,		asm_exc_debug),
 
 #ifdef CONFIG_X86_MCE
 	INTG(X86_TRAP_MC,		asm_exc_machine_check),
@@ -161,7 +161,7 @@ static const __initconst struct idt_data
  * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
  */
 static const __initconst struct idt_data dbg_idts[] = {
-	INTG(X86_TRAP_DB,	debug),
+	INTG(X86_TRAP_DB,		asm_exc_debug),
 };
 #endif
 
@@ -182,7 +182,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] _
  * cpu_init() when the TSS has been initialized.
  */
 static const __initconst struct idt_data ist_idts[] = {
-	ISTG(X86_TRAP_DB,	debug,			IST_INDEX_DB),
+	ISTG(X86_TRAP_DB,	asm_exc_debug,		IST_INDEX_DB),
 	ISTG(X86_TRAP_NMI,	asm_exc_nmi,		IST_INDEX_NMI),
 	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -753,7 +753,7 @@ static __always_inline void debug_exit(u
  *
  * May run on IST stack.
  */
-dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_DEBUG(exc_debug)
 {
 	struct task_struct *tsk = current;
 	unsigned long dr6, dr7;
@@ -762,7 +762,10 @@ dotraplinkage void do_debug(struct pt_re
 
 	debug_enter(&dr6, &dr7);
 
-	nmi_enter();
+	if (user_mode(regs))
+		idtentry_enter(regs);
+	else
+		nmi_enter();
 
 	/*
 	 * The SDM says "The processor clears the BTF flag when it
@@ -799,7 +802,7 @@ dotraplinkage void do_debug(struct pt_re
 		goto exit;
 #endif
 
-	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
+	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
 		       SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 
@@ -813,8 +816,8 @@ dotraplinkage void do_debug(struct pt_re
 	cond_local_irq_enable(regs);
 
 	if (v8086_mode(regs)) {
-		handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code,
-					X86_TRAP_DB);
+		handle_vm86_trap((struct kernel_vm86_regs *) regs, 0,
+				 X86_TRAP_DB);
 		cond_local_irq_disable(regs);
 		debug_stack_usage_dec();
 		goto exit;
@@ -833,15 +836,17 @@ dotraplinkage void do_debug(struct pt_re
 	}
 	si_code = get_si_code(tsk->thread.debugreg6);
 	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
-		send_sigtrap(regs, error_code, si_code);
+		send_sigtrap(regs, 0, si_code);
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
 
 exit:
-	nmi_exit();
+	if (user_mode(regs))
+		idtentry_exit(regs);
+	else
+		nmi_exit();
 	debug_exit(dr7);
 }
-NOKPROBE_SYMBOL(do_debug);
 
 /*
  * Note that we play around with the 'TS' bit in an attempt to get
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -616,7 +616,7 @@ struct trap_array_entry {
 	.ist_okay	= ist_ok }
 
 static struct trap_array_entry trap_array[] = {
-	{ debug,                       xen_xendebug,                    true },
+	TRAP_ENTRY_REDIR(exc_debug, exc_xendebug,	true  ),
 	{ double_fault,                xen_double_fault,                true },
 #ifdef CONFIG_X86_MCE
 	TRAP_ENTRY(exc_machine_check,			true  ),
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -29,8 +29,8 @@ SYM_CODE_END(xen_\name)
 .endm
 
 xen_pv_trap asm_exc_divide_error
-xen_pv_trap debug
-xen_pv_trap xendebug
+xen_pv_trap asm_exc_debug
+xen_pv_trap asm_exc_xendebug
 xen_pv_trap asm_exc_int3
 xen_pv_trap asm_exc_xennmi
 xen_pv_trap asm_exc_overflow


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (15 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:27   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 18/24] x86/entry: Provide IDTENTRY_NOIST variants for #DB and #MC Thomas Gleixner
                   ` (6 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

The C entry points do not expect an error code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/entry_64.S |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -643,7 +643,6 @@ SYM_CODE_START(\asmsym)
 	.endif
 
 	movq	%rsp, %rdi		/* pt_regs pointer */
-	xorl	%esi, %esi		/* Clear the error code */
 
 	.if \vector == X86_TRAP_DB
 		subq	$DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [patch V4 part 4 18/24] x86/entry: Provide IDTENTRY_NOIST variants for #DB and #MC
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (16 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:29   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE Thomas Gleixner
                   ` (5 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Provide NOIST entry point macros which allow implementing NOIST variants
of the C entry points. These are invoked when #DB or #MC enter from user
space. This allows explicit handling of the difference between user mode
and kernel mode entry later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
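Note on the naming scheme: the NOIST variant is derived from the IST
function name by token pasting, which lets the ASM stub call
noist_<func> without a separate declaration macro. A minimal userspace
sketch of the mechanism follows. The DEMO_* macros, struct demo_regs and
demo_exc are hypothetical and heavily simplified; the real macros map to
the RAW variants and carry extra attributes.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pt_regs. */
struct demo_regs {
	int dummy;
};

/* One declaration covers both the IST handler and its NOIST sibling. */
#define DEMO_DECLARE_IST(func)					\
	void func(struct demo_regs *regs);			\
	void noist_##func(struct demo_regs *regs)

/* Both definition macros take the same name; the prefix is pasted on. */
#define DEMO_DEFINE_IST(func)					\
	void func(struct demo_regs *regs)

#define DEMO_DEFINE_NOIST(func)					\
	void noist_##func(struct demo_regs *regs)

/* Records which entry point ran last: 1 = IST, 2 = NOIST. */
static int demo_last_entry;

DEMO_DECLARE_IST(demo_exc);

/* Entry from kernel mode, runs on the IST stack. */
DEMO_DEFINE_IST(demo_exc)
{
	(void)regs;
	demo_last_entry = 1;
}

/* Entry from user mode, runs on the regular task stack. */
DEMO_DEFINE_NOIST(demo_exc)
{
	(void)regs;
	demo_last_entry = 2;
}
```

Using the same name in both definition macros keeps the pairing obvious
at the definition site and makes a mismatch a link error rather than a
silent bug.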
 arch/x86/include/asm/idtentry.h |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -138,10 +138,12 @@ static __always_inline void __##func(str
  * @vector:	Vector number (ignored for C)
  * @func:	Function name of the entry point
  *
- * Maps to DECLARE_IDTENTRY_RAW
+ * Maps to DECLARE_IDTENTRY_RAW, but declares also the NOIST C handler
+ * which is called from the ASM entry point on user mode entry
  */
 #define DECLARE_IDTENTRY_IST(vector, func)				\
-	DECLARE_IDTENTRY_RAW(vector, func)
+	DECLARE_IDTENTRY_RAW(vector, func);				\
+	__visible void noist_##func(struct pt_regs *regs)
 
 /**
  * DEFINE_IDTENTRY_IST - Emit code for IST entry points
@@ -152,6 +154,17 @@ static __always_inline void __##func(str
 #define DEFINE_IDTENTRY_IST(func)					\
 	DEFINE_IDTENTRY_RAW(func)
 
+/**
+ * DEFINE_IDTENTRY_NOIST - Emit code for NOIST entry points which
+ *			   belong to an IST entry point (MCE, DB)
+ * @func:	Function name of the entry point. Must be the same as
+ *		the function name of the corresponding IST variant
+ *
+ * Maps to DEFINE_IDTENTRY_RAW().
+ */
+#define DEFINE_IDTENTRY_NOIST(func)					\
+	DEFINE_IDTENTRY_RAW(noist_##func)
+
 #else	/* CONFIG_X86_64 */
 /* Maps to a regular IDTENTRY on 32bit for now */
 # define DECLARE_IDTENTRY_IST		DECLARE_IDTENTRY
@@ -161,12 +174,14 @@ static __always_inline void __##func(str
 /* C-Code mapping */
 #define DECLARE_IDTENTRY_MCE		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_MCE		DEFINE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_MCE_USER	DEFINE_IDTENTRY_NOIST
 
 #define DECLARE_IDTENTRY_NMI		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_IST
 
 #define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_DEBUG_USER	DEFINE_IDTENTRY_NOIST
 
 /**
  * DECLARE_IDTENTRY_XEN - Declare functions for XEN redirect IDT entry points


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (17 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 18/24] x86/entry: Provide IDTENTRY_NOIST variants for #DB and #MC Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:32   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 20/24] x86/traps: Restructure #DB handling Thomas Gleixner
                   ` (4 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

The MCE entry point uses the same mechanism as the IST entry point for
now. For #DB split the inner workings and just keep the ist_enter/exit
magic in the IST variant. Fixup the ASM code to emit the proper
noist_##cfunc call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
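Note on the resulting structure: the kernel and user variants wrap the
same handler in different enter/exit pairs. 64bit reaches them through
two separate entry points, while 32bit keeps one entry point which
dispatches on the mode. A minimal userspace sketch follows. All demo_*
names are hypothetical, and the trace string merely records the call
order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Records the order in which the pieces ran. */
static char demo_trace[32];

static void demo_log(const char *s)
{
	/* Bounded append; the strings logged here are tiny. */
	strncat(demo_trace, s, sizeof(demo_trace) - strlen(demo_trace) - 1);
}

static void demo_reset(void)
{
	demo_trace[0] = '\0';
}

/* Common handler shared by both variants. */
static void demo_handler(void)
{
	demo_log("h");
}

/* Kernel mode entry: NMI-like context around the handler. */
static void demo_exc_kernel(void)
{
	demo_log("N(");		/* stands in for nmi_enter() */
	demo_handler();
	demo_log(")N");		/* stands in for nmi_exit() */
}

/* User mode entry: regular entry/exit work around the handler. */
static void demo_exc_user(void)
{
	demo_log("I(");		/* stands in for idtentry_enter() */
	demo_handler();
	demo_log(")I");		/* stands in for idtentry_exit() */
}

/* 32bit style unified entry point: dispatch on the interrupted mode. */
static void demo_exc_unified(bool user_mode)
{
	if (user_mode)
		demo_exc_user();
	else
		demo_exc_kernel();
}
```

With the split, the 64bit entry points need no runtime mode check at
all; only the unified 32bit variant keeps the user_mode() test.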
 arch/x86/entry/entry_64.S      |    2 -
 arch/x86/kernel/cpu/mce/core.c |   40 +++++++++++++++++++----
 arch/x86/kernel/traps.c        |   70 +++++++++++++++++++++++++++++++----------
 3 files changed, 88 insertions(+), 24 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -658,7 +658,7 @@ SYM_CODE_START(\asmsym)
 
 	/* Switch to the regular task stack and use the noist entry point */
 .Lfrom_usermode_switch_stack_\@:
-	idtentry_body vector \cfunc, has_error_code=0
+	idtentry_body vector noist_\cfunc, has_error_code=0 sane=1
 
 _ASM_NOKPROBE(\asmsym)
 SYM_CODE_END(\asmsym)
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1904,24 +1904,50 @@ static void unexpected_machine_check(str
 /* Call the installed machine check handler for this CPU setup. */
 void (*machine_check_vector)(struct pt_regs *) = unexpected_machine_check;
 
-DEFINE_IDTENTRY_MCE(exc_machine_check)
+static __always_inline void exc_machine_check_kernel(struct pt_regs *regs)
 {
+	/*
+	 * Only required when from kernel mode. See
+	 * mce_check_crashing_cpu() for details.
+	 */
 	if (machine_check_vector == do_machine_check &&
 	    mce_check_crashing_cpu())
 		return;
 
-	if (user_mode(regs))
-		idtentry_enter(regs);
-	else
-		nmi_enter();
+	nmi_enter();
+	machine_check_vector(regs);
+	nmi_exit();
+}
 
+static __always_inline void exc_machine_check_user(struct pt_regs *regs)
+{
+	idtentry_enter(regs);
 	machine_check_vector(regs);
+	idtentry_exit(regs);
+}
 
+#ifdef CONFIG_X86_64
+/* MCE hit kernel mode */
+DEFINE_IDTENTRY_MCE(exc_machine_check)
+{
+	exc_machine_check_kernel(regs);
+}
+
+/* The user mode variant. */
+DEFINE_IDTENTRY_MCE_USER(exc_machine_check)
+{
+	exc_machine_check_user(regs);
+}
+#else
+/* 32bit unified entry point */
+DEFINE_IDTENTRY_MCE(exc_machine_check)
+{
 	if (user_mode(regs))
-		idtentry_exit(regs);
+		exc_machine_check_user(regs);
 	else
-		nmi_exit();
+		exc_machine_check_kernel(regs);
 }
+#endif
 
 /*
  * Called for each booted CPU to set up machine checks.
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -753,20 +753,12 @@ static __always_inline void debug_exit(u
  *
  * May run on IST stack.
  */
-DEFINE_IDTENTRY_DEBUG(exc_debug)
+static noinstr void handle_debug(struct pt_regs *regs, unsigned long dr6)
 {
 	struct task_struct *tsk = current;
-	unsigned long dr6, dr7;
 	int user_icebp = 0;
 	int si_code;
 
-	debug_enter(&dr6, &dr7);
-
-	if (user_mode(regs))
-		idtentry_enter(regs);
-	else
-		nmi_enter();
-
 	/*
 	 * The SDM says "The processor clears the BTF flag when it
 	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
@@ -778,7 +770,7 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 		     is_sysenter_singlestep(regs))) {
 		dr6 &= ~DR_STEP;
 		if (!dr6)
-			goto exit;
+			return;
 		/*
 		 * else we might have gotten a single-step trap and hit a
 		 * watchpoint at the same time, in which case we should fall
@@ -799,12 +791,12 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 
 #ifdef CONFIG_KPROBES
 	if (kprobe_debug_handler(regs))
-		goto exit;
+		return;
 #endif
 
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
 		       SIGTRAP) == NOTIFY_STOP)
-		goto exit;
+		return;
 
 	/*
 	 * Let others (NMI) know that the debug stack is in use
@@ -820,7 +812,7 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 				 X86_TRAP_DB);
 		cond_local_irq_disable(regs);
 		debug_stack_usage_dec();
-		goto exit;
+		return;
 	}
 
 	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
@@ -839,14 +831,60 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 		send_sigtrap(regs, 0, si_code);
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
+}
+
+static __always_inline void exc_debug_kernel(struct pt_regs *regs,
+					     unsigned long dr6)
+{
+	nmi_enter();
+	handle_debug(regs, dr6);
+	nmi_exit();
+}
+
+static __always_inline void exc_debug_user(struct pt_regs *regs,
+					   unsigned long dr6)
+{
+	idtentry_enter(regs);
+	handle_debug(regs, dr6);
+	idtentry_exit(regs);
+}
+
+#ifdef CONFIG_X86_64
+/* IST stack entry */
+DEFINE_IDTENTRY_DEBUG(exc_debug)
+{
+	unsigned long dr6, dr7;
+
+	debug_enter(&dr6, &dr7);
+	exc_debug_kernel(regs, dr6);
+	debug_exit(dr7);
+}
+
+/* User entry, runs on regular task stack */
+DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
+{
+	unsigned long dr6, dr7;
+
+	debug_enter(&dr6, &dr7);
+	exc_debug_user(regs, dr6);
+	debug_exit(dr7);
+}
+#else
+/* 32 bit does not have separate entry points. */
+DEFINE_IDTENTRY_DEBUG(exc_debug)
+{
+	unsigned long dr6, dr7;
+
+	debug_enter(&dr6, &dr7);
 
-exit:
 	if (user_mode(regs))
-		idtentry_exit(regs);
+		exc_debug_user(regs, dr6);
 	else
-		nmi_exit();
+		exc_debug_kernel(regs, dr6);
+
 	debug_exit(dr7);
 }
+#endif
 
 /*
  * Note that we play around with the 'TS' bit in an attempt to get


^ permalink raw reply	[flat|nested] 94+ messages in thread

* [patch V4 part 4 20/24] x86/traps: Restructure #DB handling
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (18 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:39   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB Thomas Gleixner
                   ` (3 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Now that there are separate entry points, move the kernel/user_mode specific
checks into the entry functions so the common handling code does not need
the extra mode checks. Make the code more readable while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
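Note on the per-mode DR6 decisions after this change: the kernel entry
drops DR_STEP for the SYSENTER single-step case and skips the common
handler when nothing is left in DR6, while the user entry always calls
the handler and treats an empty DR6 as icebp/int01. A minimal userspace
sketch of just that decision logic follows. The demo_* names and the
result struct are hypothetical, and the DR_STEP value is assumed to be
bit 14.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed: single-step (BS) flag in DR6, bit 14. */
#define DEMO_DR_STEP	0x4000UL

/* What the entry function decided to pass on to the common handler. */
struct demo_result {
	bool		handler_called;
	bool		user_icebp;
	unsigned long	dr6;
};

static struct demo_result demo_debug_kernel(unsigned long dr6,
					    bool sysenter_singlestep)
{
	struct demo_result res = { false, false, 0 };

	/* SYSENTER with TF set: ignore the single-step indication. */
	if ((dr6 & DEMO_DR_STEP) && sysenter_singlestep)
		dr6 &= ~DEMO_DR_STEP;

	/* Nothing left to handle; the kernel does not use INT1. */
	if (dr6) {
		res.handler_called = true;
		res.dr6 = dr6;
	}
	return res;
}

static struct demo_result demo_debug_user(unsigned long dr6)
{
	/* Empty DR6 from user space most likely means icebp/int01. */
	struct demo_result res = { true, !dr6, dr6 };

	return res;
}
```

A watchpoint hit which coincides with the SYSENTER single-step still
reaches the handler, because only the DR_STEP bit is removed.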
 arch/x86/kernel/traps.c |   69 ++++++++++++++++++++++++------------------------
 1 file changed, 35 insertions(+), 34 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -753,39 +753,12 @@ static __always_inline void debug_exit(u
  *
  * May run on IST stack.
  */
-static noinstr void handle_debug(struct pt_regs *regs, unsigned long dr6)
+static void noinstr handle_debug(struct pt_regs *regs, unsigned long dr6,
+				 bool user_icebp)
 {
 	struct task_struct *tsk = current;
-	int user_icebp = 0;
 	int si_code;
 
-	/*
-	 * The SDM says "The processor clears the BTF flag when it
-	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
-	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
-	 */
-	clear_tsk_thread_flag(tsk, TIF_BLOCKSTEP);
-
-	if (unlikely(!user_mode(regs) && (dr6 & DR_STEP) &&
-		     is_sysenter_singlestep(regs))) {
-		dr6 &= ~DR_STEP;
-		if (!dr6)
-			return;
-		/*
-		 * else we might have gotten a single-step trap and hit a
-		 * watchpoint at the same time, in which case we should fall
-		 * through and handle the watchpoint.
-		 */
-	}
-
-	/*
-	 * If dr6 has no reason to give us about the origin of this trap,
-	 * then it's very likely the result of an icebp/int01 trap.
-	 * User wants a sigtrap for that.
-	 */
-	if (!dr6 && user_mode(regs))
-		user_icebp = 1;
-
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
 
@@ -810,9 +783,7 @@ static noinstr void handle_debug(struct
 	if (v8086_mode(regs)) {
 		handle_vm86_trap((struct kernel_vm86_regs *) regs, 0,
 				 X86_TRAP_DB);
-		cond_local_irq_disable(regs);
-		debug_stack_usage_dec();
-		return;
+		goto out;
 	}
 
 	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
@@ -826,9 +797,12 @@ static noinstr void handle_debug(struct
 		set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
 		regs->flags &= ~X86_EFLAGS_TF;
 	}
+
 	si_code = get_si_code(tsk->thread.debugreg6);
 	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
 		send_sigtrap(regs, 0, si_code);
+
+out:
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
 }
@@ -837,7 +811,27 @@ static __always_inline void exc_debug_ke
 					     unsigned long dr6)
 {
 	nmi_enter();
-	handle_debug(regs, dr6);
+	/*
+	 * The SDM says "The processor clears the BTF flag when it
+	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
+	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
+	 */
+	clear_tsk_thread_flag(current, TIF_BLOCKSTEP);
+
+	/*
+	 * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
+	 * watchpoint at the same time then that will still be handled.
+	 */
+	if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
+		dr6 &= ~DR_STEP;
+
+	/*
+	 * If DR6 is zero, no point in trying to handle it. The kernel is
+	 * not using INT1.
+	 */
+	if (dr6)
+		handle_debug(regs, dr6, false);
+
 	nmi_exit();
 }
 
@@ -845,7 +839,14 @@ static __always_inline void exc_debug_us
 					   unsigned long dr6)
 {
 	idtentry_enter(regs);
-	handle_debug(regs, dr6);
+	clear_tsk_thread_flag(current, TIF_BLOCKSTEP);
+
+	/*
+	 * If dr6 has no reason to give us about the origin of this trap,
+	 * then it's very likely the result of an icebp/int01 trap.
+	 * User wants a sigtrap for that.
+	 */
+	handle_debug(regs, dr6, !dr6);
 	idtentry_exit(regs);
 }
 



* [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (19 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 20/24] x86/traps: Restructure #DB handling Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:39   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints Thomas Gleixner
                   ` (2 subsequent siblings)
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

The functions invoked from handle_debug() can be instrumented. Tell objtool
about it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
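Note, not part of the patch: the rule the diff below follows is that a
region opened with instr_begin() has to be closed with instr_end() on
every path that leaves the function, including early returns. A minimal
user-space sketch of that pairing, assuming a plain depth counter in place
of the real objtool annotations (the counter and the names
notifier_says_stop() and handle_event() are hypothetical stand-ins, not
kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's objtool annotations. */
static int instr_depth;	/* > 0 while instrumentable code may run */

static void instr_begin(void)
{
	instr_depth++;
}

static void instr_end(void)
{
	assert(instr_depth > 0);	/* catches an unbalanced end */
	instr_depth--;
}

/* Stand-in for an instrumentable callee such as a notifier chain. */
static bool notifier_says_stop(int event)
{
	return event < 0;
}

/*
 * Model of a noinstr handler: every exit taken after instr_begin()
 * closes the region again before returning.
 */
static int handle_event(int event)
{
	instr_begin();
	if (notifier_says_stop(event)) {
		instr_end();
		return 0;	/* early return, region closed */
	}

	/* ... further instrumentable work would go here ... */

	instr_end();
	return 1;
}
```

In this model the depth is back at zero after either return path, which
is the property objtool is checking for in the real code.
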
 arch/x86/kernel/traps.c |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -762,14 +762,19 @@ static void noinstr handle_debug(struct
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
 
+	instr_begin();
 #ifdef CONFIG_KPROBES
-	if (kprobe_debug_handler(regs))
+	if (kprobe_debug_handler(regs)) {
+		instr_end();
 		return;
+	}
 #endif
 
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
-		       SIGTRAP) == NOTIFY_STOP)
+		       SIGTRAP) == NOTIFY_STOP) {
+		instr_end();
 		return;
+	}
 
 	/*
 	 * Let others (NMI) know that the debug stack is in use
@@ -805,6 +810,7 @@ static void noinstr handle_debug(struct
 out:
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
+	instr_end();
 }
 
 static __always_inline void exc_debug_kernel(struct pt_regs *regs,
@@ -816,7 +822,7 @@ static __always_inline void exc_debug_ke
 	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
 	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
 	 */
-	clear_tsk_thread_flag(tsk, TIF_BLOCKSTEP);
+	clear_thread_flag(TIF_BLOCKSTEP);
 
 	/*
 	 * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
@@ -839,7 +845,7 @@ static __always_inline void exc_debug_us
 					   unsigned long dr6)
 {
 	idtentry_enter(regs);
-	clear_tsk_thread_flag(tsk, TIF_BLOCKSTEP);
+	clear_thread_flag(TIF_BLOCKSTEP);
 
 	/*
 	 * If dr6 has no reason to give us about the origin of this trap,



* [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (20 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:40   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF Thomas Gleixner
  2020-05-05 13:49 ` [patch V4 part 4 24/24] " Thomas Gleixner
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Mark the relevant functions noinstr and use the plain non-instrumented MSR
accessors. The only odd part is the instr_begin()/end() pair around the
indirect machine_check_vector() call, as objtool cannot resolve the target
of an indirect call. The possibly invoked functions are annotated correctly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
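Note, not part of the patch: the mce_setup() hunk below swaps a
statement-style accessor that stores into its second argument for a
value-style accessor that returns what it read. A rough user-space sketch
of the two call shapes, assuming a fake MSR store and a counter that
stands in for an instrumentation hook (every name ending in _model, plus
fake_read_msr() and trace_hits, is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative MSR number and value; the backing store is a fake. */
#define MODEL_MSR_MCG_CAP	0x179u

static unsigned int trace_hits;	/* counts reads that passed the hook */

static uint64_t fake_read_msr(uint32_t msr)
{
	return msr == MODEL_MSR_MCG_CAP ? 0x0c09u : 0;
}

/* Statement-style accessor: stores into @val and passes a hook. */
#define rdmsrl_model(msr, val)			\
do {						\
	trace_hits++;				\
	(val) = fake_read_msr(msr);		\
} while (0)

/* Value-style accessor: returns the value, no hook on the way. */
static inline uint64_t plain_rdmsr_model(uint32_t msr)
{
	return fake_read_msr(msr);
}

struct mce_model {
	uint64_t mcgcap;
};

void accessor_selfcheck(void)
{
	struct mce_model traced = { 0 }, plain = { 0 };
	unsigned int before = trace_hits;

	rdmsrl_model(MODEL_MSR_MCG_CAP, traced.mcgcap);
	plain.mcgcap = plain_rdmsr_model(MODEL_MSR_MCG_CAP);

	/* Same data either way; only the first form touched the hook. */
	assert(traced.mcgcap == plain.mcgcap);
	assert(trace_hits == before + 1);
}
```
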
 arch/x86/kernel/cpu/mce/core.c    |   20 +++++++++++++++-----
 arch/x86/kernel/cpu/mce/p5.c      |    4 +++-
 arch/x86/kernel/cpu/mce/winchip.c |    4 +++-
 kernel/time/timekeeping.c         |    2 +-
 4 files changed, 22 insertions(+), 8 deletions(-)

--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -130,7 +130,7 @@ static void (*quirk_no_way_out)(int bank
 BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
 
 /* Do initial initialization of a struct mce */
-void mce_setup(struct mce *m)
+noinstr void mce_setup(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
 	m->cpu = m->extcpu = smp_processor_id();
@@ -140,12 +140,12 @@ void mce_setup(struct mce *m)
 	m->cpuid = cpuid_eax(1);
 	m->socketid = cpu_data(m->extcpu).phys_proc_id;
 	m->apicid = cpu_data(m->extcpu).initial_apicid;
-	rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap);
+	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
 
 	if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
-		rdmsrl(MSR_PPIN, m->ppin);
+		m->ppin = __rdmsr(MSR_PPIN);
 	else if (this_cpu_has(X86_FEATURE_AMD_PPIN))
-		rdmsrl(MSR_AMD_PPIN, m->ppin);
+		m->ppin = __rdmsr(MSR_AMD_PPIN);
 
 	m->microcode = boot_cpu_data.microcode;
 }
@@ -1895,10 +1895,12 @@ bool filter_mce(struct mce *m)
 }
 
 /* Handle unconfigured int18 (should never happen) */
-static void unexpected_machine_check(struct pt_regs *regs)
+static noinstr void unexpected_machine_check(struct pt_regs *regs)
 {
+	instr_begin();
 	pr_err("CPU#%d: Unexpected int18 (Machine Check)\n",
 	       smp_processor_id());
+	instr_end();
 }
 
 /* Call the installed machine check handler for this CPU setup. */
@@ -1915,14 +1917,22 @@ static __always_inline void exc_machine_
 		return;
 
 	nmi_enter();
+	/*
+	 * The call targets are marked noinstr, but objtool can't figure
+	 * that out because it's an indirect call. Annotate it.
+	 */
+	instr_begin();
 	machine_check_vector(regs);
+	instr_end();
 	nmi_exit();
 }
 
 static __always_inline void exc_machine_check_user(struct pt_regs *regs)
 {
 	idtentry_enter(regs);
+	instr_begin();
 	machine_check_vector(regs);
+	instr_end();
 	idtentry_exit(regs);
 }
 
--- a/arch/x86/kernel/cpu/mce/p5.c
+++ b/arch/x86/kernel/cpu/mce/p5.c
@@ -21,10 +21,11 @@
 int mce_p5_enabled __read_mostly;
 
 /* Machine check handler for Pentium class Intel CPUs: */
-static void pentium_machine_check(struct pt_regs *regs)
+static noinstr void pentium_machine_check(struct pt_regs *regs)
 {
 	u32 loaddr, hi, lotype;
 
+	instr_begin();
 	rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
 	rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
 
@@ -37,6 +38,7 @@ static void pentium_machine_check(struct
 	}
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+	instr_end();
 }
 
 /* Set up machine check reporting for processors with Intel style MCE: */
--- a/arch/x86/kernel/cpu/mce/winchip.c
+++ b/arch/x86/kernel/cpu/mce/winchip.c
@@ -17,10 +17,12 @@
 #include "internal.h"
 
 /* Machine check handler for WinChip C6: */
-static void winchip_machine_check(struct pt_regs *regs)
+static noinstr void winchip_machine_check(struct pt_regs *regs)
 {
+	instr_begin();
 	pr_emerg("CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+	instr_end();
 }
 
 /* Set up machine check reporting on the Winchip C6 series */
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -953,7 +953,7 @@ EXPORT_SYMBOL_GPL(ktime_get_real_seconds
  * but without the sequence counter protect. This internal function
  * is called just when timekeeping lock is already held.
  */
-time64_t __ktime_get_real_seconds(void)
+noinstr time64_t __ktime_get_real_seconds(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 



* [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (21 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-15  5:41   ` Andy Lutomirski
                     ` (2 more replies)
  2020-05-05 13:49 ` [patch V4 part 4 24/24] " Thomas Gleixner
  23 siblings, 3 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Provide a separate macro for #DF as this needs to emit paranoid-only code
and also has a special ASM stub on 32bit.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
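Note, not part of the patch: the 32bit DEFINE_IDTENTRY_DF() below uses the
"macro acts as function definition" pattern. It declares an inline body
function, emits the externally visible wrapper that calls it, and ends on
the header of the body function so that the braces written after the macro
become that body. A reduced user-space sketch of the pattern, assuming
made-up names (DEFINE_HANDLER_MODEL, demo_fault) and a return value added
only so the result can be checked; the notrace and NOKPROBE_SYMBOL() parts
of the real macro are left out:

```c
#include <assert.h>

#define DEFINE_HANDLER_MODEL(func)					\
static inline long func##_body(long error_code,				\
			       unsigned long address);			\
									\
long func(long error_code, unsigned long address);			\
									\
long func(long error_code, unsigned long address)			\
{									\
	return func##_body(error_code, address);			\
}									\
									\
static inline long func##_body(long error_code,				\
			       unsigned long address)

/* The braces after the macro become the body of demo_fault_body(). */
DEFINE_HANDLER_MODEL(demo_fault)
{
	return error_code + (long)(address & 0xfffUL);
}

void handler_model_selfcheck(void)
{
	/* Callers only ever see the wrapper, demo_fault(). */
	assert(demo_fault(1, 0x1234UL) == 0x235);
}
```

The sketch appends _body instead of prepending two underscores because
identifiers with a leading double underscore are reserved outside the
kernel.
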
 arch/x86/include/asm/idtentry.h |   99 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)

--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -132,6 +132,35 @@ static __always_inline void __##func(str
 #define DEFINE_IDTENTRY_RAW(func)					\
 __visible noinstr void func(struct pt_regs *regs)
 
+/**
+ * DECLARE_IDTENTRY_RAW_ERRORCODE - Declare functions for raw IDT entry points
+ *				    Error code pushed by hardware
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY_ERRORCODE()
+ */
+#define DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func)			\
+	DECLARE_IDTENTRY_ERRORCODE(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_RAW_ERRORCODE - Emit code for raw IDT entry points
+ * @func:	Function name of the entry point
+ *
+ * @func is called from ASM entry code with interrupts disabled.
+ *
+ * The macro is written so it acts as function definition. Append the
+ * body with a pair of curly brackets.
+ *
+ * Contrary to DEFINE_IDTENTRY_ERRORCODE() this does not invoke the
+ * idtentry_enter/exit() helpers before and after the body invocation. This
+ * needs to be done in the body itself if applicable. Use if extra work
+ * is required before the enter/exit() helpers are invoked.
+ */
+#define DEFINE_IDTENTRY_RAW_ERRORCODE(func)				\
+__visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
+
+
 #ifdef CONFIG_X86_64
 /**
  * DECLARE_IDTENTRY_IST - Declare functions for IST handling IDT entry points
@@ -165,10 +194,70 @@ static __always_inline void __##func(str
 #define DEFINE_IDTENTRY_NOIST(func)					\
 	DEFINE_IDTENTRY_RAW(noist_##func)
 
+/**
+ * DECLARE_IDTENTRY_DF - Declare functions for double fault
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY_RAW_ERRORCODE
+ */
+#define DECLARE_IDTENTRY_DF(vector, func)				\
+	DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_DF - Emit code for double fault
+ * @func:	Function name of the entry point
+ *
+ * Maps to DEFINE_IDTENTRY_RAW_ERRORCODE
+ */
+#define DEFINE_IDTENTRY_DF(func)					\
+	DEFINE_IDTENTRY_RAW_ERRORCODE(func)
+
 #else	/* CONFIG_X86_64 */
+
 /* Maps to a regular IDTENTRY on 32bit for now */
 # define DECLARE_IDTENTRY_IST		DECLARE_IDTENTRY
 # define DEFINE_IDTENTRY_IST		DEFINE_IDTENTRY
+
+/**
+ * DECLARE_IDTENTRY_DF - Declare functions for double fault 32bit variant
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Declares two functions:
+ * - The ASM entry point: asm_##func
+ * - The C handler called from the C shim
+ */
+#define DECLARE_IDTENTRY_DF(vector, func)				\
+	asmlinkage void asm_##func(void);				\
+	__visible void func(struct pt_regs *regs,			\
+			    unsigned long error_code,			\
+			    unsigned long address)
+
+/**
+ * DEFINE_IDTENTRY_DF - Emit code for double fault on 32bit
+ * @func:	Function name of the entry point
+ *
+ * This is called through the doublefault shim which already provides
+ * cr2 in the address argument.
+ */
+#define DEFINE_IDTENTRY_DF(func)					\
+static __always_inline void __##func(struct pt_regs *regs,		\
+				     unsigned long error_code,		\
+				     unsigned long address);		\
+									\
+__visible notrace void func(struct pt_regs *regs,			\
+			    unsigned long error_code,			\
+			    unsigned long address)			\
+{									\
+	__##func (regs, error_code, address);				\
+}									\
+NOKPROBE_SYMBOL(func);							\
+									\
+static __always_inline void __##func(struct pt_regs *regs,		\
+				     unsigned long error_code,		\
+				     unsigned long address)
+
 #endif	/* !CONFIG_X86_64 */
 
 /* C-Code mapping */
@@ -212,6 +301,9 @@ static __always_inline void __##func(str
 #define DECLARE_IDTENTRY_RAW(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
 
+#define DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func)			\
+	DECLARE_IDTENTRY_ERRORCODE(vector, func)
+
 #ifdef CONFIG_X86_64
 # define DECLARE_IDTENTRY_MCE(vector, func)				\
 	idtentry_mce_db vector asm_##func func
@@ -219,12 +311,19 @@ static __always_inline void __##func(str
 # define DECLARE_IDTENTRY_DEBUG(vector, func)				\
 	idtentry_mce_db vector asm_##func func
 
+# define DECLARE_IDTENTRY_DF(vector, func)				\
+	idtentry_df vector asm_##func func
+
 #else
 # define DECLARE_IDTENTRY_MCE(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
 
 # define DECLARE_IDTENTRY_DEBUG(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
+
+/* No ASM emitted for DF as this goes through a C shim */
+# define DECLARE_IDTENTRY_DF(vector, func)
+
 #endif
 
 /* No ASM code emitted for NMI */



* [patch V4 part 4 24/24] x86/entry: Convert double fault exception to IDTENTRY_DF
  2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
                   ` (22 preceding siblings ...)
  2020-05-05 13:49 ` [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF Thomas Gleixner
@ 2020-05-05 13:49 ` Thomas Gleixner
  2020-05-07 19:55   ` Alexandre Chartre
  2020-05-15  5:42   ` Andy Lutomirski
  23 siblings, 2 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-05 13:49 UTC (permalink / raw)
  To: LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Convert #DF to IDTENTRY_DF
  - Implement the C entry point with DEFINE_IDTENTRY_DF
  - Emit the ASM stub with DECLARE_IDTENTRY_DF on 64bit
  - Remove the ASM idtentry in 64bit
  - Adjust the 32bit shim code
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
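Note, not part of the patch: the stack overflow test that the traps.c hunk
below reflows relies on unsigned wraparound to do a range check in a single
comparison. As far as I read it, the expression is true only when the
faulting address lies in the one page directly below the stack base. A
small user-space sketch, assuming a 4096 byte page and a made-up stack base
(MODEL_PAGE_SIZE and hits_guard_page() are hypothetical names):

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE	4096UL

/*
 * True when @address is in [stack_base - PAGE_SIZE, stack_base - 1].
 * Addresses at or above @stack_base wrap around to a huge value and
 * addresses further below yield PAGE_SIZE or more, so both fail.
 */
static bool hits_guard_page(unsigned long stack_base, unsigned long address)
{
	return stack_base - 1 - address < MODEL_PAGE_SIZE;
}

void guard_page_selfcheck(void)
{
	const unsigned long base = 0x100000UL;

	assert(hits_guard_page(base, base - 1));
	assert(hits_guard_page(base, base - MODEL_PAGE_SIZE));
	assert(!hits_guard_page(base, base - MODEL_PAGE_SIZE - 1));
	assert(!hits_guard_page(base, base));
}
```
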
 arch/x86/entry/entry_32.S        |    4 ++--
 arch/x86/entry/entry_64.S        |   10 +---------
 arch/x86/include/asm/idtentry.h  |    5 +++++
 arch/x86/include/asm/traps.h     |    7 -------
 arch/x86/kernel/doublefault_32.c |   10 ++++------
 arch/x86/kernel/idt.c            |    4 ++--
 arch/x86/kernel/traps.c          |   17 ++++++++++++++---
 arch/x86/xen/enlighten_pv.c      |    4 ++--
 arch/x86/xen/xen-asm_64.S        |    2 +-
 9 files changed, 31 insertions(+), 32 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1489,7 +1489,7 @@ SYM_CODE_START_LOCAL_NOALIGN(handle_exce
 SYM_CODE_END(handle_exception)
 
 #ifdef CONFIG_DOUBLEFAULT
-SYM_CODE_START(double_fault)
+SYM_CODE_START(asm_exc_double_fault)
 1:
 	/*
 	 * This is a task gate handler, not an interrupt gate handler.
@@ -1527,7 +1527,7 @@ SYM_CODE_START(double_fault)
 1:
 	hlt
 	jmp 1b
-SYM_CODE_END(double_fault)
+SYM_CODE_END(asm_exc_double_fault)
 #endif
 
 /*
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -681,15 +681,9 @@ SYM_CODE_START(\asmsym)
 	call	paranoid_entry
 	UNWIND_HINT_REGS
 
-	/* Read CR2 early */
-	GET_CR2_INTO(%r12);
-
-	TRACE_IRQS_OFF
-
 	movq	%rsp, %rdi		/* pt_regs pointer into first argument */
 	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
 	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
-	movq	%r12, %rdx		/* Move CR2 into 3rd argument */
 	call	\cfunc
 
 	jmp	paranoid_exit
@@ -918,7 +912,7 @@ SYM_INNER_LABEL(native_irq_return_iret,
 	/*
 	 * This may fault.  Non-paranoid faults on return to userspace are
 	 * handled by fixup_bad_iret.  These include #SS, #GP, and #NP.
-	 * Double-faults due to espfix64 are handled in do_double_fault.
+	 * Double-faults due to espfix64 are handled in exc_double_fault.
 	 * Other faults here are fatal.
 	 */
 	iretq
@@ -1074,8 +1068,6 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work
 
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
-idtentry_df	X86_TRAP_DF		double_fault		do_double_fault
-
 #ifdef CONFIG_XEN_PV
 idtentry	512 /* dummy */		hypervisor_callback	xen_do_hypervisor_callback	has_error_code=0
 #endif
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -380,4 +380,9 @@ DECLARE_IDTENTRY_XEN(X86_TRAP_NMI,	nmi);
 DECLARE_IDTENTRY_DEBUG(X86_TRAP_DB,	exc_debug);
 DECLARE_IDTENTRY_XEN(X86_TRAP_DB,	debug);
 
+/* #DF */
+#if defined(CONFIG_X86_64) || defined(CONFIG_DOUBLEFAULT)
+DECLARE_IDTENTRY_DF(X86_TRAP_DF,	exc_double_fault);
+#endif
+
 #endif
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -11,20 +11,13 @@
 
 #define dotraplinkage __visible
 
-#ifdef CONFIG_X86_64
-asmlinkage void double_fault(void);
-#endif
 asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
-asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #endif
 
-#if defined(CONFIG_X86_64) || defined(CONFIG_DOUBLEFAULT)
-dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
-#endif
 dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
 
 #ifdef CONFIG_X86_64
--- a/arch/x86/kernel/doublefault_32.c
+++ b/arch/x86/kernel/doublefault_32.c
@@ -11,7 +11,6 @@
 #include <asm/desc.h>
 #include <asm/traps.h>
 
-extern void double_fault(void);
 #define ptr_ok(x) ((x) > PAGE_OFFSET && (x) < PAGE_OFFSET + MAXMEM)
 
 #define TSS(x) this_cpu_read(cpu_tss_rw.x86_tss.x)
@@ -22,7 +21,7 @@ static void set_df_gdt_entry(unsigned in
  * Called by double_fault with CR0.TS and EFLAGS.NT cleared.  The CPU thinks
  * we're running the doublefault task.  Cannot return.
  */
-asmlinkage notrace void __noreturn doublefault_shim(void)
+asmlinkage noinstr void __noreturn doublefault_shim(void)
 {
 	unsigned long cr2;
 	struct pt_regs regs;
@@ -41,7 +40,7 @@ asmlinkage notrace void __noreturn doubl
 	 * Fill in pt_regs.  A downside of doing this in C is that the unwinder
 	 * won't see it (no ENCODE_FRAME_POINTER), so a nested stack dump
 	 * won't successfully unwind to the source of the double fault.
-	 * The main dump from do_double_fault() is fine, though, since it
+	 * The main dump from exc_double_fault() is fine, though, since it
 	 * uses these regs directly.
 	 *
 	 * If anyone ever cares, this could be moved to asm.
@@ -71,7 +70,7 @@ asmlinkage notrace void __noreturn doubl
 	regs.cx		= TSS(cx);
 	regs.bx		= TSS(bx);
 
-	do_double_fault(&regs, 0, cr2);
+	exc_double_fault(&regs, 0, cr2);
 
 	/*
 	 * x86_32 does not save the original CR3 anywhere on a task switch.
@@ -85,7 +84,6 @@ asmlinkage notrace void __noreturn doubl
 	 */
 	panic("cannot return from double fault\n");
 }
-NOKPROBE_SYMBOL(doublefault_shim);
 
 DEFINE_PER_CPU_PAGE_ALIGNED(struct doublefault_stack, doublefault_stack) = {
 	.tss = {
@@ -96,7 +94,7 @@ DEFINE_PER_CPU_PAGE_ALIGNED(struct doubl
 		.ldt		= 0,
 	.io_bitmap_base	= IO_BITMAP_OFFSET_INVALID,
 
-		.ip		= (unsigned long) double_fault,
+		.ip		= (unsigned long) asm_exc_double_fault,
 		.flags		= X86_EFLAGS_FIXED,
 		.es		= __USER_DS,
 		.cs		= __KERNEL_CS,
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -88,7 +88,7 @@ static const __initconst struct idt_data
 #ifdef CONFIG_X86_32
 	TSKG(X86_TRAP_DF,		GDT_ENTRY_DOUBLEFAULT_TSS),
 #else
-	INTG(X86_TRAP_DF,		double_fault),
+	INTG(X86_TRAP_DF,		asm_exc_double_fault),
 #endif
 	INTG(X86_TRAP_DB,		asm_exc_debug),
 
@@ -184,7 +184,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] _
 static const __initconst struct idt_data ist_idts[] = {
 	ISTG(X86_TRAP_DB,	asm_exc_debug,		IST_INDEX_DB),
 	ISTG(X86_TRAP_NMI,	asm_exc_nmi,		IST_INDEX_NMI),
-	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
+	ISTG(X86_TRAP_DF,	asm_exc_double_fault,	IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
 	ISTG(X86_TRAP_MC,	asm_exc_machine_check,	IST_INDEX_MCE),
 #endif
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -310,12 +310,19 @@ DEFINE_IDTENTRY_ERRORCODE(exc_alignment_
  * from the TSS.  Returning is, in principle, okay, but changes to regs will
  * be lost.  If, for some reason, we need to return to a context with modified
  * regs, the shim code could be adjusted to synchronize the registers.
+ *
+ * The 32bit #DF shim provides CR2 already as an argument. On 64bit it needs
+ * to be read before doing anything else.
  */
-dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2)
+DEFINE_IDTENTRY_DF(exc_double_fault)
 {
 	static const char str[] = "double fault";
 	struct task_struct *tsk = current;
 
+#ifdef CONFIG_X86_64
+	unsigned long address = read_cr2();
+#endif
+
 #ifdef CONFIG_X86_ESPFIX64
 	extern unsigned char native_irq_return_iret[];
 
@@ -372,6 +379,7 @@ dotraplinkage void do_double_fault(struc
 #endif
 
 	nmi_enter();
+	instr_begin();
 	notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV);
 
 	tsk->thread.error_code = error_code;
@@ -415,13 +423,16 @@ dotraplinkage void do_double_fault(struc
 	 * stack even if the actual trigger for the double fault was
 	 * something else.
 	 */
-	if ((unsigned long)task_stack_page(tsk) - 1 - cr2 < PAGE_SIZE)
-		handle_stack_overflow("kernel stack overflow (double-fault)", regs, cr2);
+	if ((unsigned long)task_stack_page(tsk) - 1 - address < PAGE_SIZE) {
+		handle_stack_overflow("kernel stack overflow (double-fault)",
+				      regs, address);
+	}
 #endif
 
 	pr_emerg("PANIC: double fault, error_code: 0x%lx\n", error_code);
 	die("double fault", regs, error_code);
 	panic("Machine halted.");
+	instr_end();
 }
 #endif
 
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -617,7 +617,7 @@ struct trap_array_entry {
 
 static struct trap_array_entry trap_array[] = {
 	TRAP_ENTRY_REDIR(exc_debug, exc_xendebug,	true  ),
-	{ double_fault,                xen_double_fault,                true },
+	TRAP_ENTRY(exc_double_fault,			true  ),
 #ifdef CONFIG_X86_MCE
 	TRAP_ENTRY(exc_machine_check,			true  ),
 #endif
@@ -652,7 +652,7 @@ static bool __ref get_trap_addr(void **a
 	 * Replace trap handler addresses by Xen specific ones.
 	 * Check for known traps using IST and whitelist them.
 	 * The debugger ones are the only ones we care about.
-	 * Xen will handle faults like double_fault, * so we should never see
+	 * Xen will handle faults like double_fault, so we should never see
 	 * them.  Warn if there's an unexpected IST-using fault handler.
 	 */
 	for (nr = 0; nr < ARRAY_SIZE(trap_array); nr++) {
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -37,7 +37,7 @@ xen_pv_trap asm_exc_overflow
 xen_pv_trap asm_exc_bounds
 xen_pv_trap asm_exc_invalid_op
 xen_pv_trap asm_exc_device_not_available
-xen_pv_trap double_fault
+xen_pv_trap asm_exc_double_fault
 xen_pv_trap asm_exc_coproc_segment_overrun
 xen_pv_trap asm_exc_invalid_tss
 xen_pv_trap asm_exc_segment_not_present



* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-05 13:49 ` [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling Thomas Gleixner
@ 2020-05-07 17:18   ` Alexandre Chartre
  2020-05-08  8:59     ` Peter Zijlstra
  2020-05-14  2:24   ` Mathieu Desnoyers
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 94+ messages in thread
From: Alexandre Chartre @ 2020-05-07 17:18 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf,
	Will Deacon, Peter Zijlstra


On 5/5/20 3:49 PM, Thomas Gleixner wrote:
> From: Peter Zijlstra <peterz@infradead.org>
> 
> DR6/7 should be handled before nmi_enter() is invoked and restored after
> nmi_exit() to minimize the exposure.
> 
> Split it out into helper inlines and bring it into the correct order.
> 
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/kernel/hw_breakpoint.c |    6 ---
>   arch/x86/kernel/traps.c         |   62 +++++++++++++++++++++++++++-------------
>   2 files changed, 44 insertions(+), 24 deletions(-)
> 
...
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -691,6 +691,44 @@ static bool is_sysenter_singlestep(struc
>   #endif
>   }
>   
> +static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
> +{
> +	/*
> +	 * Disable breakpoints during exception handling; recursive exceptions
> +	 * are exceedingly 'fun'.
> +	 *
> +	 * Since this function is NOKPROBE, and that also applies to
> +	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
> +	 * HW_BREAKPOINT_W on our stack)
> +	 *
> +	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
> +	 * includes the entry stack is excluded for everything.
> +	 */
> +	get_debugreg(*dr7, 6);

Do you mean  get_debugreg(*dr7, 7); ?

alex.


* Re: [patch V4 part 4 24/24] x86/entry: Convert double fault exception to IDTENTRY_DF
  2020-05-05 13:49 ` [patch V4 part 4 24/24] " Thomas Gleixner
@ 2020-05-07 19:55   ` Alexandre Chartre
  2020-05-15  5:42   ` Andy Lutomirski
  1 sibling, 0 replies; 94+ messages in thread
From: Alexandre Chartre @ 2020-05-07 19:55 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Paul E. McKenney, Andy Lutomirski, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf,
	Will Deacon


On 5/5/20 3:49 PM, Thomas Gleixner wrote:
> Convert #DF to IDTENTRY_DF
>    - Implement the C entry point with DEFINE_IDTENTRY_DF
>    - Emit the ASM stub with DECLARE_IDTENTRY_DF on 64bit
>    - Remove the ASM idtentry in 64bit
>    - Adjust the 32bit shim code
>    - Fixup the XEN/PV code
>    - Remove the old prototypes
> 
> No functional change.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> 

For all patches of part 4:

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

alex.


* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-07 17:18   ` Alexandre Chartre
@ 2020-05-08  8:59     ` Peter Zijlstra
  2020-05-08 11:58       ` Thomas Gleixner
  0 siblings, 1 reply; 94+ messages in thread
From: Peter Zijlstra @ 2020-05-08  8:59 UTC (permalink / raw)
  To: Alexandre Chartre
  Cc: Thomas Gleixner, LKML, x86, Paul E. McKenney, Andy Lutomirski,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

On Thu, May 07, 2020 at 07:18:45PM +0200, Alexandre Chartre wrote:
> 
> On 5/5/20 3:49 PM, Thomas Gleixner wrote:
> > From: Peter Zijlstra <peterz@infradead.org>
> > 
> > DR6/7 should be handled before nmi_enter() is invoked and restored after
> > nmi_exit() to minimize the exposure.
> > 
> > Split it out into helper inlines and bring it into the correct order.
> > 
> > Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > ---
> >   arch/x86/kernel/hw_breakpoint.c |    6 ---
> >   arch/x86/kernel/traps.c         |   62 +++++++++++++++++++++++++++-------------
> >   2 files changed, 44 insertions(+), 24 deletions(-)
> > 
> ...
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -691,6 +691,44 @@ static bool is_sysenter_singlestep(struc
> >   #endif
> >   }
> > +static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
> > +{
> > +	/*
> > +	 * Disable breakpoints during exception handling; recursive exceptions
> > +	 * are exceedingly 'fun'.
> > +	 *
> > +	 * Since this function is NOKPROBE, and that also applies to
> > +	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
> > +	 * HW_BREAKPOINT_W on our stack)
> > +	 *
> > +	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
> > +	 * includes the entry stack is excluded for everything.
> > +	 */
> > +	get_debugreg(*dr7, 6);
> 
> Do you mean  get_debugreg(*dr7, 7); ?

Shees, I have to go buy a new stack of brown paper bags at this rate,
don't I :/


* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-08  8:59     ` Peter Zijlstra
@ 2020-05-08 11:58       ` Thomas Gleixner
  2020-05-08 12:45         ` Peter Zijlstra
  0 siblings, 1 reply; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-08 11:58 UTC (permalink / raw)
  To: Peter Zijlstra, Alexandre Chartre
  Cc: LKML, x86, Paul E. McKenney, Andy Lutomirski,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Peter Zijlstra <peterz@infradead.org> writes:
>> > +static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
>> > +{
>> > +	/*
>> > +	 * Disable breakpoints during exception handling; recursive exceptions
>> > +	 * are exceedingly 'fun'.
>> > +	 *
>> > +	 * Since this function is NOKPROBE, and that also applies to
>> > +	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
>> > +	 * HW_BREAKPOINT_W on our stack)
>> > +	 *
>> > +	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
>> > +	 * includes the entry stack is excluded for everything.
>> > +	 */
>> > +	get_debugreg(*dr7, 6);
>> 
>> Do you mean  get_debugreg(*dr7, 7); ?
>
> Shees, I have to go buy a new stack of brown paper bags at this rate,
> don't I :/

Not only you, but it's also amazing that the selftests didn't catch
that.


* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-08 11:58       ` Thomas Gleixner
@ 2020-05-08 12:45         ` Peter Zijlstra
  0 siblings, 0 replies; 94+ messages in thread
From: Peter Zijlstra @ 2020-05-08 12:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Alexandre Chartre, LKML, x86, Paul E. McKenney, Andy Lutomirski,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

On Fri, May 08, 2020 at 01:58:30PM +0200, Thomas Gleixner wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> >> > +static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
> >> > +{
> >> > +	/*
> >> > +	 * Disable breakpoints during exception handling; recursive exceptions
> >> > +	 * are exceedingly 'fun'.
> >> > +	 *
> >> > +	 * Since this function is NOKPROBE, and that also applies to
> >> > +	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
> >> > +	 * HW_BREAKPOINT_W on our stack)
> >> > +	 *
> >> > +	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
> >> > +	 * includes the entry stack is excluded for everything.
> >> > +	 */
> >> > +	get_debugreg(*dr7, 6);
> >> 
> >> Do you mean  get_debugreg(*dr7, 7); ?
> >
> > Shees, I have to go buy a new stack of brown paper bags at this rate,
> > don't I :/
> 
> Not only you, but it's also  amazing that the selftests didn't catch
> that.

I don't think the selftests try and set hardware breakpoints in the
kernel.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-05 13:49 ` [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation Thomas Gleixner
@ 2020-05-08 13:27   ` Masami Hiramatsu
  2020-05-14  4:57   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 94+ messages in thread
From: Masami Hiramatsu @ 2020-05-08 13:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Paul E. McKenney, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon, Peter Zijlstra (Intel)

On Tue, 05 May 2020 15:49:28 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> From: Peter Zijlstra <peterz@infradead.org>
> 
> Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
> creeps in and ruins things.
> 
> That is; this is the INT3 text poke handler, strictly limit the code
> that runs in it, lest it inadvertenly hits yet another INT3.
> 
> Reported-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Looks good to me.

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>

> ---
>  arch/x86/kernel/alternative.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -960,9 +960,9 @@ static struct bp_patching_desc *bp_desc;
>  static __always_inline
>  struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
>  {
> -	struct bp_patching_desc *desc = READ_ONCE(*descp); /* rcu_dereference */
> +	struct bp_patching_desc *desc = READ_ONCE_NOCHECK(*descp); /* rcu_dereference */
>  
> -	if (!desc || !atomic_inc_not_zero(&desc->refs))
> +	if (!desc || !arch_atomic_inc_not_zero(&desc->refs))
>  		return NULL;
>  
>  	return desc;
> @@ -971,7 +971,7 @@ struct bp_patching_desc *try_get_desc(st
>  static __always_inline void put_desc(struct bp_patching_desc *desc)
>  {
>  	smp_mb__before_atomic();
> -	atomic_dec(&desc->refs);
> +	arch_atomic_dec(&desc->refs);
>  }
>  
>  static __always_inline void *text_poke_addr(struct text_poke_loc *tp)
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>
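
For readers following the try_get_desc()/put_desc() hunk above, the
"increment unless zero" reference pattern can be sketched in userspace
with C11 atomics. This is only an illustration under stated assumptions:
struct desc, ref_inc_not_zero() and ref_dec() are hypothetical names, and
the code is not the kernel's arch_atomic_inc_not_zero(). The point of the
patch itself is different: it switches to variants that are expected to
carry no instrumentation hooks, so the INT3 handler cannot wander into
patched or traced code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted descriptor. */
struct desc {
	atomic_int refs;
};

/*
 * Take a reference only if the count has not already dropped to zero.
 * Returns true on success, false if the object is being torn down.
 */
static inline bool ref_inc_not_zero(struct desc *d)
{
	int old = atomic_load(&d->refs);

	while (old != 0) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&d->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken with ref_inc_not_zero(). */
static inline void ref_dec(struct desc *d)
{
	atomic_fetch_sub(&d->refs, 1);
}
```

A caller would bail out, as try_get_desc() does by returning NULL, whenever
ref_inc_not_zero() reports false.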

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-05 13:49 ` [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling Thomas Gleixner
  2020-05-07 17:18   ` Alexandre Chartre
@ 2020-05-14  2:24   ` Mathieu Desnoyers
  2020-05-14 17:28     ` Thomas Gleixner
  2020-05-14 18:06     ` Steven Rostedt
  2020-05-15  5:37   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  3 siblings, 2 replies; 94+ messages in thread
From: Mathieu Desnoyers @ 2020-05-14  2:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, x86, paulmck, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, rostedt, Joel Fernandes, Google,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Josh Poimboeuf,
	Will Deacon, Peter Zijlstra

----- On May 5, 2020, at 9:49 AM, Thomas Gleixner tglx@linutronix.de wrote:

> From: Peter Zijlstra <peterz@infradead.org>
> 
> DR6/7 should be handled before nmi_enter() is invoked and restore after
> nmi_exit() to minimize the exposure.
> 
> Split it out into helper inlines and bring it into the correct order.
> 
[...]
> 
> +static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
> +{
> +	/*
> +	 * Disable breakpoints during exception handling; recursive exceptions
> +	 * are exceedingly 'fun'.
> +	 *
> +	 * Since this function is NOKPROBE, and that also applies to
> +	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
> +	 * HW_BREAKPOINT_W on our stack)
> +	 *
> +	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
> +	 * includes the entry stack is excluded for everything.
> +	 */
> +	get_debugreg(*dr7, 6);
> +	set_debugreg(0, 7);
> +
> +	/*
> +	 * The Intel SDM says:
> +	 *
> +	 *   Certain debug exceptions may clear bits 0-3. The remaining
> +	 *   contents of the DR6 register are never cleared by the
> +	 *   processor. To avoid confusion in identifying debug
> +	 *   exceptions, debug handlers should clear the register before
> +	 *   returning to the interrupted task.
> +	 *
> +	 * Keep it simple: clear DR6 immediately.
> +	 */
> +	get_debugreg(*dr6, 6);
> +	set_debugreg(0, 6);
> +	/* Filter out all the reserved bits which are preset to 1 */
> +	*dr6 &= ~DR6_RESERVED;
> +}
> +
> +static __always_inline void debug_exit(unsigned long dr7)
> +{
> +	set_debugreg(dr7, 7);
> +}

Out of curiosity, what prevents the compiler from moving instructions
outside of the code regions surrounded by entry/exit ? This is an always
inline, which invokes set_debugreg which is inline for CONFIG_PARAVIRT_XXL=n,
which in turn uses an asm() (not volatile), without any memory clobber.

Also, considering that "inline" is not sufficient to ensure the compiler
does not emit a traceable function, I suspect you'll also want to mark
"native_get_debugreg" and "native_set_debugreg" always inline as well.

Thanks,

Mathieu
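
To make the ordering question above concrete, here is a small userspace
sketch of the usual GCC/Clang idiom: mark the asm volatile and add a
"memory" clobber so the compiler neither drops the statement nor moves
memory accesses across it. Per the GCC documentation, an asm with no
output operands is already implicitly volatile, so the stricter concern
is the read side and the missing clobber. Everything here is an
assumption made for illustration: fake_dr7 and the model_*() helpers are
hypothetical stand-ins, the asm templates are empty because real debug
registers cannot be touched from userspace, and none of this is the
kernel's native_get_debugreg()/native_set_debugreg().

```c
/* Stand-in for a hardware register; 0x400 is just a sample value. */
static unsigned long fake_dr7 = 0x400;

static inline __attribute__((always_inline)) unsigned long model_get_reg(void)
{
	unsigned long val;

	/*
	 * volatile: the statement is kept even if 'val' looks unused.
	 * "memory": memory accesses may not be moved across it.
	 * The "0" constraint ties the input to the output register.
	 */
	__asm__ __volatile__("" : "=r" (val) : "0" (fake_dr7) : "memory");
	return val;
}

static inline __attribute__((always_inline)) void model_set_reg(unsigned long val)
{
	/* Same barrier properties on the write side. */
	__asm__ __volatile__("" : : "r" (val) : "memory");
	fake_dr7 = val;
}

/* Shape of debug_enter()/debug_exit(): save and disable, then restore. */
static inline __attribute__((always_inline)) unsigned long model_debug_enter(void)
{
	unsigned long saved = model_get_reg();

	model_set_reg(0);
	return saved;
}

static inline __attribute__((always_inline)) void model_debug_exit(unsigned long saved)
{
	model_set_reg(saved);
}
```

Forcing always_inline on the helpers mirrors the second point raised
above: a plain "inline" is only a hint, so the compiler may still emit an
out-of-line, traceable copy.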

> +
> /*
>  * Our handling of the processor debug registers is non-trivial.
>  * We do not clear them on entry and exit from the kernel. Therefore
> @@ -718,28 +756,13 @@ static bool is_sysenter_singlestep(struc
> dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
> {
> 	struct task_struct *tsk = current;
> +	unsigned long dr6, dr7;
> 	int user_icebp = 0;
> -	unsigned long dr6;
> 	int si_code;
> 
> -	nmi_enter();
> -
> -	get_debugreg(dr6, 6);
> -	/*
> -	 * The Intel SDM says:
> -	 *
> -	 *   Certain debug exceptions may clear bits 0-3. The remaining
> -	 *   contents of the DR6 register are never cleared by the
> -	 *   processor. To avoid confusion in identifying debug
> -	 *   exceptions, debug handlers should clear the register before
> -	 *   returning to the interrupted task.
> -	 *
> -	 * Keep it simple: clear DR6 immediately.
> -	 */
> -	set_debugreg(0, 6);
> +	debug_enter(&dr6, &dr7);
> 
> -	/* Filter out all the reserved bits which are preset to 1 */
> -	dr6 &= ~DR6_RESERVED;
> +	nmi_enter();
> 
> 	/*
> 	 * The SDM says "The processor clears the BTF flag when it
> @@ -777,7 +800,7 @@ dotraplinkage void do_debug(struct pt_re
> #endif
> 
> 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
> -							SIGTRAP) == NOTIFY_STOP)
> +		       SIGTRAP) == NOTIFY_STOP)
> 		goto exit;
> 
> 	/*
> @@ -816,6 +839,7 @@ dotraplinkage void do_debug(struct pt_re
> 
> exit:
> 	nmi_exit();
> +	debug_exit(dr7);
> }
>  NOKPROBE_SYMBOL(do_debug);

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced
  2020-05-05 13:49 ` [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced Thomas Gleixner
@ 2020-05-14  4:57   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14  4:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon,
	Peter Zijlstra (Intel)

On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> From: Thomas Gleixner <tglx@linutronix.de>
>
> In order to ensure poke_int3_handler() is completely self contained -- this
> is called while modifying other text, imagine the fun of hitting another
> INT3 -- ensure that everything it uses is not traced.
>
> The primary means here is to force inlining; bsearch() is notrace because
> all of lib/ is.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-05 13:49 ` [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation Thomas Gleixner
  2020-05-08 13:27   ` Masami Hiramatsu
@ 2020-05-14  4:57   ` Andy Lutomirski
  2020-05-14  9:32     ` Peter Zijlstra
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2 siblings, 1 reply; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14  4:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon,
	Peter Zijlstra (Intel)

On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> From: Peter Zijlstra <peterz@infradead.org>
>
> Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
> creeps in and ruins things.
>
> That is; this is the INT3 text poke handler, strictly limit the code
> that runs in it, lest it inadvertenly hits yet another INT3.


Acked-by: Andy Lutomirski <luto@kernel.org>

Does objtool catch this error?

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 03/24] lib/bsearch: Provide __always_inline variant
  2020-05-05 13:49 ` [patch V4 part 4 03/24] lib/bsearch: Provide __always_inline variant Thomas Gleixner
@ 2020-05-14  4:58   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14  4:58 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon,
	Peter Zijlstra (Intel)

On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> From: Peter Zijlstra <peterz@infradead.org>
>
> For code that needs the ultimate performance (it can inline the @cmp
> function too) or simply needs to avoid calling external functions for
> whatever reason, provide an __always_inline variant of bsearch().


Acked-by: Andy Lutomirski <luto@kernel.org>

Although maybe a more explicit name (e.g. __inlined_bsearch()) would
be more clear?
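
As a rough illustration of what an always-inline binary search looks
like, here is a userspace sketch. The names inline_bsearch_sketch(),
cmp_ulong() and find_addr() are hypothetical and chosen for this example
only; the code follows the classic bsearch() loop and is not copied from
the patch.

```c
#include <stddef.h>

#define ALWAYS_INLINE inline __attribute__((always_inline))

typedef int (*cmp_fn)(const void *key, const void *elt);

/*
 * Binary search over a sorted array. Because the function is forced
 * inline, an optimizing compiler can also inline a constant 'cmp' at the
 * call site, so no external function is called.
 */
static ALWAYS_INLINE void *inline_bsearch_sketch(const void *key,
						 const void *base,
						 size_t num, size_t size,
						 cmp_fn cmp)
{
	const char *pivot;
	int result;

	while (num > 0) {
		pivot = (const char *)base + (num >> 1) * size;
		result = cmp(key, pivot);

		if (result == 0)
			return (void *)pivot;

		if (result > 0) {
			/* Continue in the upper half, past the pivot. */
			base = pivot + size;
			num--;
		}
		num >>= 1;
	}
	return NULL;
}

/* Example comparator: orders unsigned long addresses. */
static int cmp_ulong(const void *key, const void *elt)
{
	unsigned long k = *(const unsigned long *)key;
	unsigned long e = *(const unsigned long *)elt;

	return (k > e) - (k < e);
}

/* Example caller: look up an address in a sorted table. */
static void *find_addr(const unsigned long *table, size_t n,
		       unsigned long addr)
{
	return inline_bsearch_sketch(&addr, table, n, sizeof(*table),
				     cmp_ulong);
}
```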

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 04/24] x86/int3: Inline bsearch()
  2020-05-05 13:49 ` [patch V4 part 4 04/24] x86/int3: Inline bsearch() Thomas Gleixner
@ 2020-05-14  4:58   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14  4:58 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon,
	Peter Zijlstra (Intel)

On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> From: Peter Zijlstra <peterz@infradead.org>
>
> Avoid calling out to bsearch() by inlining it, for normal kernel configs
> this was the last external call and poke_int3_handler() is now fully self
> sufficient -- no calls to external code.
>


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 05/24] x86/entry: Provide IDTENTRY_RAW
  2020-05-05 13:49 ` [patch V4 part 4 05/24] x86/entry: Provide IDTENTRY_RAW Thomas Gleixner
@ 2020-05-14  4:59   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14  4:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Some exception handlers need to do extra work before any of the entry
> helpers are invoked. Provide IDTENTRY_RAW for this.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 06/24] x86/entry: Convert INT3 exception to IDTENTRY_RAW
  2020-05-05 13:49 ` [patch V4 part 4 06/24] x86/entry: Convert INT3 exception to IDTENTRY_RAW Thomas Gleixner
@ 2020-05-14  5:01   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14  5:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Convert #BP to IDTENTRY_RAW:
>   - Implement the C entry point with DEFINE_IDTENTRY_RAW
>   - Invoke idtentry_enter/exit() from the function body
>   - Emit the ASM stub with DECLARE_IDTENTRY_RAW
>   - Remove the ASM idtentry in 64bit
>   - Remove the open coded ASM entry code in 32bit
>   - Fixup the XEN/PV code
>   - Remove the old prototypes

Gmail is so amused by your prototypo that it fixes it sometimes in the
quoted text.  See just above :)


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 07/24] x86/traps: Split int3 handler up
  2020-05-05 13:49 ` [patch V4 part 4 07/24] x86/traps: Split int3 handler up Thomas Gleixner
@ 2020-05-14  5:03   ` Andy Lutomirski
  2020-05-14  9:39     ` Peter Zijlstra
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  1 sibling, 1 reply; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14  5:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon,
	Peter Zijlstra (Intel)

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> For code simplicity split up the int3 handler into a kernel and user part
> which makes the code flow simpler to understand.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kernel/traps.c |   67 +++++++++++++++++++++++++++---------------------
>  1 file changed, 39 insertions(+), 28 deletions(-)
>
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -564,6 +564,35 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_pr
>         cond_local_irq_disable(regs);
>  }
>
> +static bool do_int3(struct pt_regs *regs)
> +{
> +       int res;
> +
> +#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
> +       if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
> +                        SIGTRAP) == NOTIFY_STOP)
> +               return true;
> +#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
> +
> +#ifdef CONFIG_KPROBES
> +       if (kprobe_int3_handler(regs))
> +               return true;
> +#endif
> +       res = notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP, SIGTRAP);
> +
> +       return res == NOTIFY_STOP;
> +}
> +
> +static void do_int3_user(struct pt_regs *regs)
> +{
> +       if (do_int3(regs))
> +               return;
> +
> +       cond_local_irq_enable(regs);
> +       do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
> +       cond_local_irq_disable(regs);
> +}
> +
>  DEFINE_IDTENTRY_RAW(exc_int3)
>  {
>         /*
> @@ -581,37 +610,19 @@ DEFINE_IDTENTRY_RAW(exc_int3)
>          * because the INT3 could have been hit in any context including
>          * NMI.
>          */
> -       if (user_mode(regs))
> +       if (user_mode(regs)) {
>                 idtentry_enter(regs);
> -       else
> -               nmi_enter();
> -
> -       instr_begin();
> -#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
> -       if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
> -                               SIGTRAP) == NOTIFY_STOP)
> -               goto exit;
> -#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
> -
> -#ifdef CONFIG_KPROBES
> -       if (kprobe_int3_handler(regs))
> -               goto exit;
> -#endif
> -
> -       if (notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
> -                       SIGTRAP) == NOTIFY_STOP)
> -               goto exit;
> -
> -       cond_local_irq_enable(regs);
> -       do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
> -       cond_local_irq_disable(regs);
> -
> -exit:
> -       instr_end();
> -       if (user_mode(regs))
> +               instr_begin();
> +               do_int3_user(regs);
> +               instr_end();
>                 idtentry_exit(regs);
> -       else
> +       } else {
> +               nmi_enter();
> +               instr_begin();
> +               do_int3(regs);

I think you should be checking the return value here.  Presumably this
should die() if it's not handled, since otherwise it will just
infinite loop.
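
One way to read this suggestion, sketched as a userspace model rather
than kernel code: the kernel-mode path looks at the handler's boolean
result and treats an unclaimed breakpoint as fatal instead of silently
returning. All names below (int3_outcome, handle_kernel_int3(),
claims_trap(), ignores_trap()) are hypothetical; in the kernel the fatal
branch would presumably be a die() call.

```c
#include <stdbool.h>
#include <stddef.h>

enum int3_outcome {
	INT3_HANDLED,	/* some handler claimed the breakpoint */
	INT3_FATAL,	/* nobody claimed it: the kernel would die() here */
};

/* Models do_int3(): returns true when the trap was handled. */
typedef bool (*int3_handler_fn)(void *regs);

static enum int3_outcome handle_kernel_int3(int3_handler_fn handler,
					    void *regs)
{
	/*
	 * Ignoring the result would resume the interrupted code as if
	 * nothing had happened, which is the case the review flags.
	 */
	if (!handler(regs))
		return INT3_FATAL;

	return INT3_HANDLED;
}

/* Example handlers used to exercise both branches. */
static bool claims_trap(void *regs)
{
	(void)regs;
	return true;
}

static bool ignores_trap(void *regs)
{
	(void)regs;
	return false;
}
```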

> +               instr_end();
>                 nmi_exit();
> +       }
>  }
>
>  #ifdef CONFIG_X86_64
>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-14  4:57   ` Andy Lutomirski
@ 2020-05-14  9:32     ` Peter Zijlstra
  2020-05-14 12:51       ` Thomas Gleixner
  0 siblings, 1 reply; 94+ messages in thread
From: Peter Zijlstra @ 2020-05-14  9:32 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, LKML, X86 ML, Paul E. McKenney,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Wed, May 13, 2020 at 09:57:52PM -0700, Andy Lutomirski wrote:
> On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > From: Peter Zijlstra <peterz@infradead.org>
> >
> > Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
> > creeps in and ruins things.
> >
> > That is; this is the INT3 text poke handler, strictly limit the code
> > that runs in it, lest it inadvertenly hits yet another INT3.
> 
> 
> Acked-by: Andy Lutomirski <luto@kernel.org>
> 
> Does objtool catch this error?

It does not. I'll put it on the (endless) todo list..

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 07/24] x86/traps: Split int3 handler up
  2020-05-14  5:03   ` Andy Lutomirski
@ 2020-05-14  9:39     ` Peter Zijlstra
  0 siblings, 0 replies; 94+ messages in thread
From: Peter Zijlstra @ 2020-05-14  9:39 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, LKML, X86 ML, Paul E. McKenney,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Wed, May 13, 2020 at 10:03:13PM -0700, Andy Lutomirski wrote:
> On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > For code simplicity split up the int3 handler into a kernel and user part
> > which makes the code flow simpler to understand.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > ---
> >  arch/x86/kernel/traps.c |   67 +++++++++++++++++++++++++++---------------------
> >  1 file changed, 39 insertions(+), 28 deletions(-)
> >
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -564,6 +564,35 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_pr
> >         cond_local_irq_disable(regs);
> >  }
> >
> > +static bool do_int3(struct pt_regs *regs)
> > +{
> > +       int res;
> > +
> > +#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
> > +       if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
> > +                        SIGTRAP) == NOTIFY_STOP)
> > +               return true;
> > +#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
> > +
> > +#ifdef CONFIG_KPROBES
> > +       if (kprobe_int3_handler(regs))
> > +               return true;
> > +#endif
> > +       res = notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP, SIGTRAP);
> > +
> > +       return res == NOTIFY_STOP;
> > +}
> > +
> > +static void do_int3_user(struct pt_regs *regs)
> > +{
> > +       if (do_int3(regs))
> > +               return;
> > +
> > +       cond_local_irq_enable(regs);
> > +       do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
> > +       cond_local_irq_disable(regs);
> > +}
> > +
> >  DEFINE_IDTENTRY_RAW(exc_int3)
> >  {
> >         /*
> > @@ -581,37 +610,19 @@ DEFINE_IDTENTRY_RAW(exc_int3)
> >          * because the INT3 could have been hit in any context including
> >          * NMI.
> >          */
> > +       if (user_mode(regs)) {
> >                 idtentry_enter(regs);
> > +               instr_begin();
> > +               do_int3_user(regs);
> > +               instr_end();
> >                 idtentry_exit(regs);
> > +       } else {
> > +               nmi_enter();
> > +               instr_begin();
> > +               do_int3(regs);
> 
> I think you should be checking the return value here.  Presumably this
> should die() if it's not handled, since otherwise it will just
> infinite loop.

Indeed. Thanks!

> > +               instr_end();
> >                 nmi_exit();
> > +       }
> >  }
> >
> >  #ifdef CONFIG_X86_64
> >

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-14  9:32     ` Peter Zijlstra
@ 2020-05-14 12:51       ` Thomas Gleixner
  2020-05-14 13:15         ` Peter Zijlstra
  0 siblings, 1 reply; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-14 12:51 UTC (permalink / raw)
  To: Peter Zijlstra, Andy Lutomirski
  Cc: LKML, X86 ML, Paul E. McKenney, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Mathieu Desnoyers,
	Josh Poimboeuf, Will Deacon

Peter Zijlstra <peterz@infradead.org> writes:
> On Wed, May 13, 2020 at 09:57:52PM -0700, Andy Lutomirski wrote:
>> On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> >
>> > From: Peter Zijlstra <peterz@infradead.org>
>> >
>> > Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
>> > creeps in and ruins things.
>> >
>> > That is; this is the INT3 text poke handler, strictly limit the code
>> > that runs in it, lest it inadvertenly hits yet another INT3.
>> 
>> 
>> Acked-by: Andy Lutomirski <luto@kernel.org>
>> 
>> Does objtool catch this error?
>
> It does not. I'll put it on the (endless) todo list..

Well, at least it detects when that code calls out into something which
is not in the non-instrumentable section.

As long as instrumentation respects the rules that this section is taboo,
this should not happen. Emphasis on *should*.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-14 12:51       ` Thomas Gleixner
@ 2020-05-14 13:15         ` Peter Zijlstra
  2020-05-14 14:55           ` Andy Lutomirski
  2020-05-14 15:06           ` Thomas Gleixner
  0 siblings, 2 replies; 94+ messages in thread
From: Peter Zijlstra @ 2020-05-14 13:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andy Lutomirski, LKML, X86 ML, Paul E. McKenney,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Thu, May 14, 2020 at 02:51:32PM +0200, Thomas Gleixner wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> > On Wed, May 13, 2020 at 09:57:52PM -0700, Andy Lutomirski wrote:
> >> On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> >
> >> > From: Peter Zijlstra <peterz@infradead.org>
> >> >
> >> > Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
> >> > creeps in and ruins things.
> >> >
> >> > That is; this is the INT3 text poke handler, strictly limit the code
> >> > that runs in it, lest it inadvertenly hits yet another INT3.
> >> 
> >> 
> >> Acked-by: Andy Lutomirski <luto@kernel.org>
> >> 
> >> Does objtool catch this error?
> >
> > It does not. I'll put it on the (endless) todo list..
> 
> Well, at least it detects when that code calls out into something which
> is not in the non-instrumentable section.

True, but the more specific problem is that noinstr code can use
jump_label/static_call just fine.

So a more specific test is validating none of that happens in the INT3
handler before poke_int3_handler(). Which is what I think Andy was
after.
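
The kind of check described here can be pictured as a simple range test:
given the addresses of all jump_label/static_call patch sites and the
text ranges that run before poke_int3_handler(), report any site that
falls inside such a range. The sketch below is a toy userspace model
under that assumption. It is not how objtool works, and struct text_range
and count_forbidden_sites() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end) of text that must stay unpatched. */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

static bool site_in_range(uintptr_t site, const struct text_range *r)
{
	return site >= r->start && site < r->end;
}

/* Count patch sites that land inside any forbidden range. */
static size_t count_forbidden_sites(const uintptr_t *sites, size_t nsites,
				    const struct text_range *ranges,
				    size_t nranges)
{
	size_t bad = 0;

	for (size_t i = 0; i < nsites; i++) {
		for (size_t j = 0; j < nranges; j++) {
			if (site_in_range(sites[i], &ranges[j])) {
				bad++;
				break;	/* count each site at most once */
			}
		}
	}
	return bad;
}
```

A non-zero result would mean that a patchable location sits on the path
to the text poke handler.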

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-14 13:15         ` Peter Zijlstra
@ 2020-05-14 14:55           ` Andy Lutomirski
  2020-05-14 15:06           ` Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14 14:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Paul E. McKenney,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon



> On May 14, 2020, at 6:15 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Thu, May 14, 2020 at 02:51:32PM +0200, Thomas Gleixner wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>>> On Wed, May 13, 2020 at 09:57:52PM -0700, Andy Lutomirski wrote:
>>>>> On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>>>>> 
>>>>>> From: Peter Zijlstra <peterz@infradead.org>
>>>>>> 
>>>>>> Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
>>>>>> creeps in and ruins things.
>>>>>> 
>>>>>> That is; this is the INT3 text poke handler, strictly limit the code
>>>>>> that runs in it, lest it inadvertenly hits yet another INT3.
>>>>> 
>>>>> 
>>>>> Acked-by: Andy Lutomirski <luto@kernel.org>
>>>>> 
>>>>> Does objtool catch this error?
>>> 
>>> It does not. I'll put it on the (endless) todo list..
>> 
>> Well, at least it detects when that code calls out into something which
>> is not in the non-instrumentable section.
> 
> True, but the more specific problem is that noinstr code can use
> jump_label/static_call just fine.
> 
> So a more specific test is validating none of that happens in the INT3
> handler before poke_int3_handler(). Which is what I think Andy was
> after.

Exactly.  I admit that sleep-deprived Andy was actually thinking “tglx and/or PeterZ found this by inspection, and somewhere it escaped objtool’s notice,” which is sort of the same thing :)

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-14 13:15         ` Peter Zijlstra
  2020-05-14 14:55           ` Andy Lutomirski
@ 2020-05-14 15:06           ` Thomas Gleixner
  2020-05-14 15:08             ` Andy Lutomirski
  1 sibling, 1 reply; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-14 15:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, LKML, X86 ML, Paul E. McKenney,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

Peter Zijlstra <peterz@infradead.org> writes:
> On Thu, May 14, 2020 at 02:51:32PM +0200, Thomas Gleixner wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>> > On Wed, May 13, 2020 at 09:57:52PM -0700, Andy Lutomirski wrote:
>> >> On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> >> >
>> >> > From: Peter Zijlstra <peterz@infradead.org>
>> >> >
>> >> > Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
>> >> > creeps in and ruins things.
>> >> >
>> >> > That is; this is the INT3 text poke handler, strictly limit the code
>> >> > that runs in it, lest it inadvertenly hits yet another INT3.
>> >> 
>> >> 
>> >> Acked-by: Andy Lutomirski <luto@kernel.org>
>> >> 
>> >> Does objtool catch this error?
>> >
>> > It does not. I'll put it on the (endless) todo list..
>> 
>> Well, at least it detects when that code calls out into something which
>> is not in the non-instrumentable section.
>
> True, but the more specific problem is that noinstr code can use
> jump_label/static_call just fine.
>
> So a more specific test is validating none of that happens in the INT3
> handler before poke_int3_handler(). Which is what I think Andy was
> after.

Indeed. Forgot about that one.

Hmm, alternatives and jump_label patch locations in entry.text and
noinstr.text can be valid, at least during early boot, where we know
that we don't run those code paths...

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-14 15:06           ` Thomas Gleixner
@ 2020-05-14 15:08             ` Andy Lutomirski
  2020-05-14 15:10               ` Peter Zijlstra
  0 siblings, 1 reply; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14 15:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Andy Lutomirski, LKML, X86 ML, Paul E. McKenney,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon



> On May 14, 2020, at 8:06 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> Peter Zijlstra <peterz@infradead.org> writes:
>>> On Thu, May 14, 2020 at 02:51:32PM +0200, Thomas Gleixner wrote:
>>> Peter Zijlstra <peterz@infradead.org> writes:
>>>> On Wed, May 13, 2020 at 09:57:52PM -0700, Andy Lutomirski wrote:
>>>>> On Tue, May 5, 2020 at 7:15 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>>>>> 
>>>>>> From: Peter Zijlstra <peterz@infradead.org>
>>>>>> 
>>>>>> Use arch_atomic_*() and READ_ONCE_NOCHECK() to ensure nothing untoward
>>>>>> creeps in and ruins things.
>>>>>> 
>>>>>> That is; this is the INT3 text poke handler, strictly limit the code
>>>>>> that runs in it, lest it inadvertently hits yet another INT3.
>>>>> 
>>>>> 
>>>>> Acked-by: Andy Lutomirski <luto@kernel.org>
>>>>> 
>>>>> Does objtool catch this error?
>>>> 
>>>> It does not. I'll put it on the (endless) todo list..
>>> 
>>> Well, at least it detects when that code calls out into something which
>>> is not in the non-instrumentable section.
>> 
>> True, but the more specific problem is that noinstr code can use
>> jump_label/static_call just fine.
>> 
>> So a more specific test is validating none of that happens in the INT3
>> handler before poke_int3_handler(). Which is what I think Andy was
>> after.
> 
> Indeed. Forgot about that one.
> 
> Hmm, alternatives and jumplabel patch locations in entry.text and
> noinstr.text can be valid at least during early boot where we know that
> we don't run those code paths...

Alternatives should be valid regardless. Isn’t the world essentially stopped while we apply them?

> 
> Thanks,
> 
>        tglx

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation
  2020-05-14 15:08             ` Andy Lutomirski
@ 2020-05-14 15:10               ` Peter Zijlstra
  0 siblings, 0 replies; 94+ messages in thread
From: Peter Zijlstra @ 2020-05-14 15:10 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Paul E. McKenney,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Thu, May 14, 2020 at 08:08:46AM -0700, Andy Lutomirski wrote:
> Alternatives should be valid regardless. Isn’t the world essentially stopped while we apply them?

Yes, we do that before we go SMP.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST
  2020-05-05 13:49 ` [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST Thomas Gleixner
@ 2020-05-14 16:39   ` Andy Lutomirski
  2020-05-14 18:44     ` Thomas Gleixner
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
  1 sibling, 1 reply; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-14 16:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Same as IDTENTRY but for exceptions which run on Interrupt STacks (IST) on
> 64bit. For 32bit this maps to IDTENTRY.
>
> There are 3 variants which will be used:
>       IDTENTRY_MCE
>       IDTENTRY_DB
>       IDTENTRY_NMI
>
> These map to IDTENTRY_IST, but only the MCE and DB variants are emitting
> ASM code as the NMI entry needs hand crafted ASM still.
>
> The function defines do not contain any idtenter/exit calls as these
> exceptions need special treatment.

Okay I guess, but in the long run I'm guessing that we'll want to
merge a bunch of this to DECLARE_IDTENTRY_NOASM and just manually emit
the special cases in entry_32/64.S.

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-14  2:24   ` Mathieu Desnoyers
@ 2020-05-14 17:28     ` Thomas Gleixner
  2020-05-14 17:46       ` Mathieu Desnoyers
  2020-05-14 18:06     ` Steven Rostedt
  1 sibling, 1 reply; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-14 17:28 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, x86, paulmck, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, rostedt, Joel Fernandes, Google,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Josh Poimboeuf,
	Will Deacon, Peter Zijlstra

Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
> ----- On May 5, 2020, at 9:49 AM, Thomas Gleixner tglx@linutronix.de wrote:
>> +
>> +static __always_inline void debug_exit(unsigned long dr7)
>> +{
>> +	set_debugreg(dr7, 7);
>> +}
>
> Out of curiosity, what prevents the compiler from moving instructions
> outside of the code regions surrounded by entry/exit ? This is an always
> inline, which invokes set_debugreg which is inline for CONFIG_PARAVIRT_XXL=n,
> which in turn uses an asm() (not volatile), without any memory
> clobber.
>
> Also, considering that "inline" is not sufficient to ensure the compiler
> does not emit a traceable function, I suspect you'll also want to mark
> "native_get_debugreg" and "native_set_debugreg" always inline as well.

It can move it into a function and call that. Fine. If that function
stays in the noinstr section then it should not emit a trace stub and if
it moves it out of the section or reuses another instance in text then
objtool will complain.

Checking for trace stubs and other instrumentation nonsense is on the
objtool wishlist anyway.

But yes, marking these __always_inline prevents that. Those escaped my
chase. But I would have found them once I go and fix that paravirt muck.

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-14 17:28     ` Thomas Gleixner
@ 2020-05-14 17:46       ` Mathieu Desnoyers
  2020-05-15 14:32         ` Thomas Gleixner
  0 siblings, 1 reply; 94+ messages in thread
From: Mathieu Desnoyers @ 2020-05-14 17:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, x86, paulmck, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, rostedt, Joel Fernandes, Google,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Josh Poimboeuf,
	Will Deacon, Peter Zijlstra

----- On May 14, 2020, at 1:28 PM, Thomas Gleixner tglx@linutronix.de wrote:

> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
>> ----- On May 5, 2020, at 9:49 AM, Thomas Gleixner tglx@linutronix.de wrote:
>>> +
>>> +static __always_inline void debug_exit(unsigned long dr7)
>>> +{
>>> +	set_debugreg(dr7, 7);
>>> +}
>>

* Question 1

>> Out of curiosity, what prevents the compiler from moving instructions
>> outside of the code regions surrounded by entry/exit ? This is an always
>> inline, which invokes set_debugreg which is inline for CONFIG_PARAVIRT_XXL=n,
>> which in turn uses an asm() (not volatile), without any memory
>> clobber.
>>

?

* Question 2

>> Also, considering that "inline" is not sufficient to ensure the compiler
>> does not emit a traceable function, I suspect you'll also want to mark
>> "native_get_debugreg" and "native_set_debugreg" always inline as well.
> 
> It can move it into a function and call that. Fine. If that function
> stays in the noinstr section then it should not emit a trace stub and if
> it moves it out of the section or reuses another instance in text then
> objtool will complain.
> 
> Checking for trace stubs and other instrumentation nonsense is on the
> objtool wishlist anyway.
> 
> But yes, marking these __always_inline prevents that. Those escaped my
> chase. But I would have found them once I go and fix that paravirt muck.

This answers only my second question (see "Question 1" above).

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-14  2:24   ` Mathieu Desnoyers
  2020-05-14 17:28     ` Thomas Gleixner
@ 2020-05-14 18:06     ` Steven Rostedt
  1 sibling, 0 replies; 94+ messages in thread
From: Steven Rostedt @ 2020-05-14 18:06 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Thomas Gleixner, linux-kernel, x86, paulmck, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Joel Fernandes, Google, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Josh Poimboeuf, Will Deacon, Peter Zijlstra

On Wed, 13 May 2020 22:24:56 -0400 (EDT)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> Also, considering that "inline" is not sufficient to ensure the compiler
> does not emit a traceable function, I suspect you'll also want to mark
> "native_get_debugreg" and "native_set_debugreg" always inline as well.

I was thinking that the noinstr sections were more about not doing tracing
and kprobes. "inline" has been defined to include "notrace" for some time, so
any function marked as "inline" will not be available to ftrace even if the
compiler decides not to honor the inline.

in linux/compiler_types.h:

#define inline inline __gnu_inline __inline_maybe_unused notrace

-- Steve
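
A minimal user-space sketch of the two annotations discussed in this
sub-thread, assuming GCC or Clang. All sketch_* names are hypothetical and
are not kernel API; sketch_notrace uses the no_instrument_function
attribute, which is one common expansion of "notrace", and
sketch_always_inline stands in for __always_inline.

```c
#include <assert.h>

/* Hypothetical stand-ins modelled on the kernel's attribute macros. */
#define sketch_notrace       __attribute__((__no_instrument_function__))
#define sketch_always_inline inline __attribute__((__always_inline__))

/*
 * Plain "inline" is only a hint: the compiler may still emit an
 * out-of-line copy. The attribute keeps such a copy free of
 * compiler-inserted entry/exit profiling hooks.
 */
static inline sketch_notrace unsigned long sketch_get_reg(const unsigned long *reg)
{
	return *reg;
}

/*
 * always_inline forces the body into every caller, so no separate
 * function exists that could end up outside the caller's section.
 */
static sketch_always_inline void sketch_set_reg(unsigned long *reg, unsigned long val)
{
	*reg = val;
}

/* Caller that uses both helpers; returns the previous register value. */
unsigned long sketch_swap_reg(unsigned long *reg, unsigned long val)
{
	unsigned long old = sketch_get_reg(reg);

	sketch_set_reg(reg, val);
	return old;
}

/* Usage example. */
void sketch_inline_demo(void)
{
	unsigned long reg = 5;

	assert(sketch_swap_reg(&reg, 7) == 5);
	assert(reg == 7);
}
```

The sketch only shows where the attributes go. Whether an out-of-line copy
lands in the right section is what the objtool check in this series is
meant to verify.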

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST
  2020-05-14 16:39   ` Andy Lutomirski
@ 2020-05-14 18:44     ` Thomas Gleixner
  0 siblings, 0 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-14 18:44 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

Andy Lutomirski <luto@kernel.org> writes:
> On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> Same as IDTENTRY but for exceptions which run on Interrupt STacks (IST) on
>> 64bit. For 32bit this maps to IDTENTRY.
>>
>> There are 3 variants which will be used:
>>       IDTENTRY_MCE
>>       IDTENTRY_DB
>>       IDTENTRY_NMI
>>
>> These map to IDTENTRY_IST, but only the MCE and DB variants are emitting
>> ASM code as the NMI entry needs hand crafted ASM still.
>>
>> The function defines do not contain any idtenter/exit calls as these
>> exceptions need special treatment.
>
> Okay I guess, but in the long run I'm guessing that we'll want to
> merge a bunch of this to DECLARE_IDTENTRY_NOASM and just manually emit
> the special cases in entry_32/64.S.

The ASM is still the paranoid muck which is emitted nicely.

But on the C side this needs a different treatment than the regular
exceptions which all use idtentry_enter() before and idtentry_exit()
after the handler function body.

Those need magic things before and after nmi_enter/exit(). That's why
the C function is directly called and does not have any automatically
emitted enter/exit stuff like the other IDTENTRY variants.

Thanks,

        tglx
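
A minimal user-space sketch of the difference described above, under stated
assumptions: the SKETCH_* macros and sketch_* functions are hypothetical and
only loosely modelled on the IDTENTRY macros, with an int pointer standing in
for the register frame. The regular variant brackets the handler body with
enter/exit automatically, while the IST-style variant emits the bare function
so the handler can place its own work around nmi_enter/exit().

```c
#include <assert.h>

/* Hypothetical stand-ins; the real enter/exit functions do far more. */
static int sketch_enter_count, sketch_exit_count;

static void sketch_idtentry_enter(void) { sketch_enter_count++; }
static void sketch_idtentry_exit(void)  { sketch_exit_count++; }

/*
 * Regular variant: the macro emits the entry function, which wraps
 * the handler body with enter/exit.
 */
#define SKETCH_DEFINE_IDTENTRY(func)			\
static void func##_body(int *regs);			\
void func(int *regs)					\
{							\
	sketch_idtentry_enter();			\
	func##_body(regs);				\
	sketch_idtentry_exit();				\
}							\
static void func##_body(int *regs)

/*
 * IST-style variant: nothing is emitted around the body, so the C
 * function is called directly and does its own entry/exit handling.
 */
#define SKETCH_DEFINE_IDTENTRY_IST(func)		\
void func(int *regs)

SKETCH_DEFINE_IDTENTRY(sketch_exc_regular)
{
	*regs += 1;
}

SKETCH_DEFINE_IDTENTRY_IST(sketch_exc_ist)
{
	/* handler-specific work before nmi_enter() would go here */
	*regs += 10;
	/* ... and handler-specific work after nmi_exit() here */
}

/* Usage example: only the regular variant touches the counters. */
void sketch_idtentry_demo(void)
{
	int regs = 0;

	sketch_exc_regular(&regs);
	assert(regs == 1 && sketch_enter_count == 1 && sketch_exit_count == 1);

	sketch_exc_ist(&regs);
	assert(regs == 11 && sketch_enter_count == 1 && sketch_exit_count == 1);
}
```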



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point
  2020-05-05 13:49 ` [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point Thomas Gleixner
@ 2020-05-15  5:23   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> There is no reason to have nmi_enter/exit() in the actual MCE
> handlers. Move it to the entry point. This also covers the until now
> uncovered initial handler which only prints.

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST
  2020-05-05 13:49 ` [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST Thomas Gleixner
@ 2020-05-15  5:24   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Convert #MC to IDTENTRY_MCE:
>   - Implement the C entry points with DEFINE_IDTENTRY_MCE
>   - Emit the ASM stub with DECLARE_IDTENTRY_MCE
>   - Remove the ASM idtentry in 64bit
>   - Remove the open coded ASM entry code in 32bit
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>   - Remove the error code from *machine_check_vector() as
>     it is always 0 and not used by any of the functions
>     it can point to. Fixup all the functions as well.
>
> No functional change.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check
  2020-05-05 13:49 ` [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check Thomas Gleixner
@ 2020-05-15  5:24   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> mce_check_crashing_cpu() is called right at the entry of the MCE
> handler. It uses mce_rdmsr() and mce_wrmsr() which are wrappers around
> rdmsr() and wrmsr() to handle the MCE error injection mechanism, which is
> pointless in this context, i.e. when the MCE hits an offline CPU or the
> system is already marked crashing.
>
> The MSR access can also be traced, so use the untraceable variants. This
> is also safe vs. XEN paravirt as these MSRs are not affected by XEN PV
> modifications.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV
  2020-05-05 13:49 ` [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV Thomas Gleixner
@ 2020-05-15  5:25   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> XEN/PV has special wrappers for NMI and DB exceptions. They redirect these
> exceptions through regular IDTENTRY points. Provide the necessary IDTENTRY
> macros to make this work.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI
  2020-05-05 13:49 ` [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI Thomas Gleixner
@ 2020-05-15  5:26   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Convert #NMI to IDTENTRY_NMI:
>   - Implement the C entry point with DEFINE_IDTENTRY_NMI
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>
> No functional change.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation
  2020-05-05 13:49 ` [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation Thomas Gleixner
@ 2020-05-15  5:26   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Mark all functions in the fragile code parts noinstr or force inlining so
> they can't be instrumented.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB
  2020-05-05 13:49 ` [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB Thomas Gleixner
@ 2020-05-15  5:27   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Convert #DB to IDTENTRY_ERRORCODE:
>   - Implement the C entry point with DEFINE_IDTENTRY_DB
>   - Emit the ASM stub with DECLARE_IDTENTRY
>   - Remove the ASM idtentry in 64bit
>   - Remove the open coded ASM entry code in 32bit
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>
> No functional change.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub
  2020-05-05 13:49 ` [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub Thomas Gleixner
@ 2020-05-15  5:27   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The C entry points do not expect an error code.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 18/24] x86/entry: Provide IDTRENTRY_NOIST variants for #DB and #MC
  2020-05-05 13:49 ` [patch V4 part 4 18/24] x86/entry: Provide IDTRENTRY_NOIST variants for #DB and #MC Thomas Gleixner
@ 2020-05-15  5:29   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:29 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Provide NOIST entry point macros which allow implementing NOIST variants
> of the C entry points. These are invoked when #DB or #MC enter from user
> space. This allows explicit handling of the difference between user mode
> and kernel mode entry later.


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE
  2020-05-05 13:49 ` [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE Thomas Gleixner
@ 2020-05-15  5:32   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:32 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The MCE entry point uses the same mechanism as the IST entry point for
> now. For #DB split the inner workings and just keep the ist_enter/exit
> magic in the IST variant. Fixup the ASM code to emit the proper
> noist_##cfunc call.
>


Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-05 13:49 ` [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling Thomas Gleixner
  2020-05-07 17:18   ` Alexandre Chartre
  2020-05-14  2:24   ` Mathieu Desnoyers
@ 2020-05-15  5:37   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  3 siblings, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon,
	Peter Zijlstra

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> From: Peter Zijlstra <peterz@infradead.org>
>
> DR6/7 should be handled before nmi_enter() is invoked and restored after
> nmi_exit() to minimize the exposure.
>
> Split it out into helper inlines and bring it into the correct order.

> +        *
> +        * Entry text is excluded for HW_BP_X and cpu_entry_area, which
> +        * includes the entry stack is excluded for everything.
> +        */
> +       get_debugreg(*dr7, 6);
> +       set_debugreg(0, 7);

Fortunately, PeterZ is hiding in a brown paper bag, so I don't have to
comment :)

Other than that:

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 20/24] x86/traps: Restructure #DB handling
  2020-05-05 13:49 ` [patch V4 part 4 20/24] x86/traps: Restructure #DB handling Thomas Gleixner
@ 2020-05-15  5:39   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Now that there are separate entry points, move the kernel/user_mode specific
> checks into the entry functions so the common handling code does not need
> the extra mode checks. Make the code more readable while at it.

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB
  2020-05-05 13:49 ` [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB Thomas Gleixner
@ 2020-05-15  5:39   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The functions invoked from handle_debug() can be instrumented. Tell objtool
> about it.

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints
  2020-05-05 13:49 ` [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints Thomas Gleixner
@ 2020-05-15  5:40   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Mark the relevant functions noinstr, use the plain non-instrumented MSR
> accessors. The only odd part is the instr_begin()/end() pair around the
> indirect machine_check_vector() call as objtool can't figure that out. The
> possible invoked functions are annotated correctly.

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF
  2020-05-05 13:49 ` [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF Thomas Gleixner
@ 2020-05-15  5:41   ` Andy Lutomirski
  2020-05-15 15:01     ` Thomas Gleixner
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
  2020-05-19 19:58   ` [tip: x86/entry] x86/entry: Convert double fault exception to IDTENTRY_DF tip-bot2 for Thomas Gleixner
  2 siblings, 1 reply; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Provide a separate macro for #DF as this needs to emit paranoid only code
> and has also a special ASM stub in 32bit.

Acked-by: Andy Lutomirski <luto@kernel.org>

but... maybe it would be cleaner just to open-code all of this in the
next patch?  This is a lot of macro to do nothing at all.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 24/24] x86/entry: Convert double fault exception to IDTENTRY_DF
  2020-05-05 13:49 ` [patch V4 part 4 24/24] " Thomas Gleixner
  2020-05-07 19:55   ` Alexandre Chartre
@ 2020-05-15  5:42   ` Andy Lutomirski
  1 sibling, 0 replies; 94+ messages in thread
From: Andy Lutomirski @ 2020-05-15  5:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Convert #DF to IDTENTRY_DF
>   - Implement the C entry point with DEFINE_IDTENTRY_DF
>   - Emit the ASM stub with DECLARE_IDTENTRY_DF on 64bit
>   - Remove the ASM idtentry in 64bit
>   - Adjust the 32bit shim code
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>
> No functional change.
>

Acked-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling
  2020-05-14 17:46       ` Mathieu Desnoyers
@ 2020-05-15 14:32         ` Thomas Gleixner
  0 siblings, 0 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-15 14:32 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, x86, paulmck, Andy Lutomirski, Alexandre Chartre,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, rostedt, Joel Fernandes, Google,
	Boris Ostrovsky, Juergen Gross, Brian Gerst, Josh Poimboeuf,
	Will Deacon, Peter Zijlstra

Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:

> ----- On May 14, 2020, at 1:28 PM, Thomas Gleixner tglx@linutronix.de wrote:
>
>> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
>>> ----- On May 5, 2020, at 9:49 AM, Thomas Gleixner tglx@linutronix.de wrote:
>>>> +
>>>> +static __always_inline void debug_exit(unsigned long dr7)
>>>> +{
>>>> +	set_debugreg(dr7, 7);
>>>> +}
>>>
>
> * Question 1
>
>>> Out of curiosity, what prevents the compiler from moving instructions
>>> outside of the code regions surrounded by entry/exit ? This is an always
>>> inline, which invokes set_debugreg which is inline for CONFIG_PARAVIRT_XXL=n,
>>> which in turn uses an asm() (not volatile), without any memory
>>> clobber.

I misread 'surrounded by entry/exit'.

Reading it again I assume you mean nmi_enter/exit(). And yes, there is a
compiler barrier missing.

Thanks,

        tglx
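
The barrier placement in the patch below can be sketched in user space,
assuming GCC or Clang. The sketch_* names are hypothetical and a plain
variable stands in for DR7; barrier() is defined the same way the kernel
defines it for GCC-style compilers. It is a compile-time barrier only, not a
CPU fence, and with plain memory accesses as used here the effect is not
observable at run time. The sketch merely shows the intended ordering.

```c
#include <assert.h>

/* Compiler barrier: empty asm with a "memory" clobber. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-in for the DR7 register. */
static unsigned long sketch_dr7 = 0x401UL;
static int sketch_bp_seen_in_critical;

static inline unsigned long sketch_debug_enter(void)
{
	unsigned long saved = sketch_dr7;

	sketch_dr7 = 0;		/* disable breakpoints */
	/* Keep the disable above from sinking into the critical section. */
	barrier();
	return saved;
}

static inline void sketch_debug_exit(unsigned long dr7)
{
	/* Keep the re-enable below from rising into the critical section. */
	barrier();
	sketch_dr7 = dr7;
}

/* Records whether breakpoints were still enabled while it ran. */
static void sketch_critical_section(void)
{
	if (sketch_dr7 != 0)
		sketch_bp_seen_in_critical = 1;
}

void sketch_handle_debug(void)
{
	unsigned long dr7 = sketch_debug_enter();

	sketch_critical_section();
	sketch_debug_exit(dr7);
}

/* Usage example. */
void sketch_barrier_demo(void)
{
	sketch_handle_debug();
	assert(sketch_dr7 == 0x401UL);
	assert(sketch_bp_seen_in_critical == 0);
}
```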

8<----------------
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e11ad0791dc3..ae1e61345225 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -718,6 +718,13 @@ static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
 	get_debugreg(*dr7, 7);
 	set_debugreg(0, 7);
 
+	/*
+	 * Ensure the compiler doesn't lower the above statements into
+	 * the critical section; disabling breakpoints late would not
+	 * be good.
+	 */
+	barrier();
+
 	/*
 	 * The Intel SDM says:
 	 *
@@ -737,6 +744,12 @@ static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
 
 static __always_inline void debug_exit(unsigned long dr7)
 {
+	/*
+	 * Ensure the compiler doesn't raise this statement into
+	 * the critical section; enabling breakpoints early would
+	 * not be good.
+	 */
+	barrier();
 	set_debugreg(dr7, 7);
 }
 

^ permalink raw reply related	[flat|nested] 94+ messages in thread
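
For readers following the barrier() discussion above, here is a minimal
userspace sketch of the ordering problem. It assumes GCC or Clang.
fake_dr7, barrier_sketch() and the *_sketch functions are hypothetical
stand-ins, not kernel code. An ordinary variable plays the role of DR7, so
the "memory" clobber is what keeps the compiler from moving the stores
across the barrier.

```c
#include <assert.h>

/*
 * Stand-in for the kernel's barrier(): an empty asm statement with a
 * "memory" clobber. It emits no instruction, but the compiler must not
 * move memory accesses across it.
 */
#define barrier_sketch() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for DR7; real code uses get/set_debugreg(). */
unsigned long fake_dr7 = 0x401;
/* Records what the "critical section" observed; illustration only. */
unsigned long dr7_seen_in_section;

void debug_enter_sketch(unsigned long *saved)
{
	*saved = fake_dr7;
	fake_dr7 = 0;		/* disable breakpoints */
	barrier_sketch();	/* keep the stores above this point */
}

void debug_exit_sketch(unsigned long saved)
{
	barrier_sketch();	/* keep the restore below this point */
	fake_dr7 = saved;
}

void handle_exception_sketch(void)
{
	unsigned long saved;

	debug_enter_sketch(&saved);
	dr7_seen_in_section = fake_dr7;	/* the critical section */
	debug_exit_sketch(saved);
}

/* Usage: breakpoints are off inside the section and restored after it. */
void barrier_selftest_sketch(void)
{
	handle_exception_sketch();
	assert(dr7_seen_in_section == 0);
	assert(fake_dr7 == 0x401);
}
```

The sketch only shows where the barriers sit relative to the register
accesses. It does not model the inline asm accessors discussed in the
thread.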

* Re: [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF
  2020-05-15  5:41   ` Andy Lutomirski
@ 2020-05-15 15:01     ` Thomas Gleixner
  0 siblings, 0 replies; 94+ messages in thread
From: Thomas Gleixner @ 2020-05-15 15:01 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: LKML, X86 ML, Paul E. McKenney, Andy Lutomirski,
	Alexandre Chartre, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Juergen Gross,
	Brian Gerst, Mathieu Desnoyers, Josh Poimboeuf, Will Deacon

Andy Lutomirski <luto@kernel.org> writes:
> On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> Provide a separate macro for #DF as this needs to emit paranoid-only code
>> and also has a special ASM stub in 32bit.
>
> Acked-by: Andy Lutomirski <luto@kernel.org>
>
> but... maybe it would be cleaner just to open-code all of this in the
> next patch?  This is a lot of macro to do nothing at all.

Well, yes and no. The point is that we really want to have all IDT
entries marked IDTENTRY_ and have the prototypes, including the XEN
variants, generated.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 94+ messages in thread

* [tip: x86/entry] x86/idtentry: Provide IDTENTRY_DF
  2020-05-05 13:49 ` [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF Thomas Gleixner
  2020-05-15  5:41   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  2020-05-19 19:58   ` [tip: x86/entry] x86/entry: Convert double fault exception to IDTENTRY_DF tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     9bf779984c19883354f55a54283292b927939b6a
Gitweb:        https://git.kernel.org/tip/9bf779984c19883354f55a54283292b927939b6a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:30 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:14 +02:00

x86/idtentry: Provide IDTENTRY_DF

Provide a separate macro for #DF as this needs to emit paranoid-only code
and also has a special ASM stub in 32bit.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135315.583415264@linutronix.de



---
 arch/x86/include/asm/idtentry.h | 87 ++++++++++++++++++++++++++++++++-
 1 file changed, 87 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 060f9e3..9521f32 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -132,6 +132,35 @@ static __always_inline void __##func(struct pt_regs *regs,		\
 #define DEFINE_IDTENTRY_RAW(func)					\
 __visible noinstr void func(struct pt_regs *regs)
 
+/**
+ * DECLARE_IDTENTRY_RAW_ERRORCODE - Declare functions for raw IDT entry points
+ *				    Error code pushed by hardware
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY_ERRORCODE()
+ */
+#define DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func)			\
+	DECLARE_IDTENTRY_ERRORCODE(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_RAW_ERRORCODE - Emit code for raw IDT entry points
+ * @func:	Function name of the entry point
+ *
+ * @func is called from ASM entry code with interrupts disabled.
+ *
+ * The macro is written so it acts as function definition. Append the
+ * body with a pair of curly brackets.
+ *
+ * Contrary to DEFINE_IDTENTRY_ERRORCODE() this does not invoke the
+ * idtentry_enter/exit() helpers before and after the body invocation. This
+ * needs to be done in the body itself if applicable. Use if extra work
+ * is required before the enter/exit() helpers are invoked.
+ */
+#define DEFINE_IDTENTRY_RAW_ERRORCODE(func)				\
+__visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
+
+
 #ifdef CONFIG_X86_64
 /**
  * DECLARE_IDTENTRY_IST - Declare functions for IST handling IDT entry points
@@ -165,10 +194,58 @@ __visible noinstr void func(struct pt_regs *regs)
 #define DEFINE_IDTENTRY_NOIST(func)					\
 	DEFINE_IDTENTRY_RAW(noist_##func)
 
+/**
+ * DECLARE_IDTENTRY_DF - Declare functions for double fault
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY_RAW_ERRORCODE
+ */
+#define DECLARE_IDTENTRY_DF(vector, func)				\
+	DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_DF - Emit code for double fault
+ * @func:	Function name of the entry point
+ *
+ * Maps to DEFINE_IDTENTRY_RAW_ERRORCODE
+ */
+#define DEFINE_IDTENTRY_DF(func)					\
+	DEFINE_IDTENTRY_RAW_ERRORCODE(func)
+
 #else	/* CONFIG_X86_64 */
+
 /* Maps to a regular IDTENTRY on 32bit for now */
 # define DECLARE_IDTENTRY_IST		DECLARE_IDTENTRY
 # define DEFINE_IDTENTRY_IST		DEFINE_IDTENTRY
+
+/**
+ * DECLARE_IDTENTRY_DF - Declare functions for double fault 32bit variant
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Declares two functions:
+ * - The ASM entry point: asm_##func
+ * - The C handler called from the C shim
+ */
+#define DECLARE_IDTENTRY_DF(vector, func)				\
+	asmlinkage void asm_##func(void);				\
+	__visible void func(struct pt_regs *regs,			\
+			    unsigned long error_code,			\
+			    unsigned long address)
+
+/**
+ * DEFINE_IDTENTRY_DF - Emit code for double fault on 32bit
+ * @func:	Function name of the entry point
+ *
+ * This is called through the doublefault shim which already provides
+ * cr2 in the address argument.
+ */
+#define DEFINE_IDTENTRY_DF(func)					\
+__visible noinstr void func(struct pt_regs *regs,			\
+			    unsigned long error_code,			\
+			    unsigned long address)
+
 #endif	/* !CONFIG_X86_64 */
 
 /* C-Code mapping */
@@ -212,6 +289,9 @@ __visible noinstr void func(struct pt_regs *regs)
 #define DECLARE_IDTENTRY_RAW(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
 
+#define DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func)			\
+	DECLARE_IDTENTRY_ERRORCODE(vector, func)
+
 #ifdef CONFIG_X86_64
 # define DECLARE_IDTENTRY_MCE(vector, func)				\
 	idtentry_mce_db vector asm_##func func
@@ -219,12 +299,19 @@ __visible noinstr void func(struct pt_regs *regs)
 # define DECLARE_IDTENTRY_DEBUG(vector, func)				\
 	idtentry_mce_db vector asm_##func func
 
+# define DECLARE_IDTENTRY_DF(vector, func)				\
+	idtentry_df vector asm_##func func
+
 #else
 # define DECLARE_IDTENTRY_MCE(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
 
 # define DECLARE_IDTENTRY_DEBUG(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
+
+/* No ASM emitted for DF as this goes through a C shim */
+# define DECLARE_IDTENTRY_DF(vector, func)
+
 #endif
 
 /* No ASM code emitted for NMI */

^ permalink raw reply related	[flat|nested] 94+ messages in thread
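
The kernel-doc in the patch above says the DEFINE_ macro "acts as function
definition" and that the body is appended in curly brackets. A minimal
userspace sketch of that pattern follows. struct regs_sketch and all
*_SKETCH / *_sketch names are hypothetical and far simpler than the real
idtentry.h macros.

```c
#include <assert.h>

/* Hypothetical, cut-down register frame; not the kernel's struct pt_regs. */
struct regs_sketch {
	unsigned long ip;
	unsigned long seen_error_code;
};

/* Declaration macro: emits only the prototype. */
#define DECLARE_ENTRY_ERRORCODE_SKETCH(func)				\
	void func(struct regs_sketch *regs, unsigned long error_code)

/*
 * Definition macro: emits the function header and nothing else, so the
 * body in curly brackets that follows completes the definition.
 */
#define DEFINE_ENTRY_ERRORCODE_SKETCH(func)				\
void func(struct regs_sketch *regs, unsigned long error_code)

DECLARE_ENTRY_ERRORCODE_SKETCH(exc_demo_sketch);

DEFINE_ENTRY_ERRORCODE_SKETCH(exc_demo_sketch)
{
	/* A raw entry point would do its own enter/exit work here. */
	regs->seen_error_code = error_code;
}

/* Usage: the macro-defined function is called like any other. */
void define_macro_selftest_sketch(void)
{
	struct regs_sketch regs = { 0, 0 };

	exc_demo_sketch(&regs, 0x2a);
	assert(regs.seen_error_code == 0x2a);
}
```

Keeping declaration and definition behind macros means the signature can
differ per configuration (as DEFINE_IDTENTRY_DF does for 32bit and 64bit)
without touching the handler body.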

* [tip: x86/entry] x86/entry: Convert double fault exception to IDTENTRY_DF
  2020-05-05 13:49 ` [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF Thomas Gleixner
  2020-05-15  5:41   ` Andy Lutomirski
  2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  2 siblings, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     095b7a3e7745e6fb7cf0a1c09967c4f43e76f8f4
Gitweb:        https://git.kernel.org/tip/095b7a3e7745e6fb7cf0a1c09967c4f43e76f8f4
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:31 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:15 +02:00

x86/entry: Convert double fault exception to IDTENTRY_DF

Convert #DF to IDTENTRY_DF:
  - Implement the C entry point with DEFINE_IDTENTRY_DF
  - Emit the ASM stub with DECLARE_IDTENTRY_DF on 64bit
  - Remove the ASM idtentry in 64bit
  - Adjust the 32bit shim code
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135315.583415264@linutronix.de

---
 arch/x86/entry/entry_32.S        |  4 ++--
 arch/x86/entry/entry_64.S        | 10 +---------
 arch/x86/include/asm/idtentry.h  |  3 +++
 arch/x86/include/asm/traps.h     |  5 -----
 arch/x86/kernel/doublefault_32.c | 10 ++++------
 arch/x86/kernel/idt.c            |  4 ++--
 arch/x86/kernel/traps.c          | 17 ++++++++++++++---
 arch/x86/xen/enlighten_pv.c      |  4 ++--
 arch/x86/xen/xen-asm_64.S        |  2 +-
 9 files changed, 29 insertions(+), 30 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 30c6ed3..28d13f0 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1488,7 +1488,7 @@ ret_to_user:
 	jmp	restore_all_switch_stack
 SYM_CODE_END(handle_exception)
 
-SYM_CODE_START(double_fault)
+SYM_CODE_START(asm_exc_double_fault)
 1:
 	/*
 	 * This is a task gate handler, not an interrupt gate handler.
@@ -1526,7 +1526,7 @@ SYM_CODE_START(double_fault)
 1:
 	hlt
 	jmp 1b
-SYM_CODE_END(double_fault)
+SYM_CODE_END(asm_exc_double_fault)
 
 /*
  * NMI is doubly nasty.  It can happen on the first instruction of
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d302839..d983a0d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -680,15 +680,9 @@ SYM_CODE_START(\asmsym)
 	call	paranoid_entry
 	UNWIND_HINT_REGS
 
-	/* Read CR2 early */
-	GET_CR2_INTO(%r12);
-
-	TRACE_IRQS_OFF
-
 	movq	%rsp, %rdi		/* pt_regs pointer into first argument */
 	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
 	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
-	movq	%r12, %rdx		/* Move CR2 into 3rd argument */
 	call	\cfunc
 
 	jmp	paranoid_exit
@@ -918,7 +912,7 @@ SYM_INNER_LABEL(native_irq_return_iret, SYM_L_GLOBAL)
 	/*
 	 * This may fault.  Non-paranoid faults on return to userspace are
 	 * handled by fixup_bad_iret.  These include #SS, #GP, and #NP.
-	 * Double-faults due to espfix64 are handled in do_double_fault.
+	 * Double-faults due to espfix64 are handled in exc_double_fault.
 	 * Other faults here are fatal.
 	 */
 	iretq
@@ -1073,8 +1067,6 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
 
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
-idtentry_df	X86_TRAP_DF		double_fault		do_double_fault
-
 #ifdef CONFIG_XEN_PV
 idtentry	512 /* dummy */		hypervisor_callback	xen_do_hypervisor_callback	has_error_code=0
 #endif
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 9521f32..ce97478 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -368,4 +368,7 @@ DECLARE_IDTENTRY_XEN(X86_TRAP_NMI,	nmi);
 DECLARE_IDTENTRY_DEBUG(X86_TRAP_DB,	exc_debug);
 DECLARE_IDTENTRY_XEN(X86_TRAP_DB,	debug);
 
+/* #DF */
+DECLARE_IDTENTRY_DF(X86_TRAP_DF,	exc_double_fault);
+
 #endif
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 9bd602d..f5a2e43 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -11,18 +11,13 @@
 
 #define dotraplinkage __visible
 
-#ifdef CONFIG_X86_64
-asmlinkage void double_fault(void);
-#endif
 asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
-asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #endif
 
-dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
 
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/kernel/doublefault_32.c b/arch/x86/kernel/doublefault_32.c
index 3793646..25692bd 100644
--- a/arch/x86/kernel/doublefault_32.c
+++ b/arch/x86/kernel/doublefault_32.c
@@ -11,7 +11,6 @@
 #include <asm/desc.h>
 #include <asm/traps.h>
 
-extern void double_fault(void);
 #define ptr_ok(x) ((x) > PAGE_OFFSET && (x) < PAGE_OFFSET + MAXMEM)
 
 #define TSS(x) this_cpu_read(cpu_tss_rw.x86_tss.x)
@@ -22,7 +21,7 @@ static void set_df_gdt_entry(unsigned int cpu);
  * Called by double_fault with CR0.TS and EFLAGS.NT cleared.  The CPU thinks
  * we're running the doublefault task.  Cannot return.
  */
-asmlinkage notrace void __noreturn doublefault_shim(void)
+asmlinkage noinstr void __noreturn doublefault_shim(void)
 {
 	unsigned long cr2;
 	struct pt_regs regs;
@@ -41,7 +40,7 @@ asmlinkage notrace void __noreturn doublefault_shim(void)
 	 * Fill in pt_regs.  A downside of doing this in C is that the unwinder
 	 * won't see it (no ENCODE_FRAME_POINTER), so a nested stack dump
 	 * won't successfully unwind to the source of the double fault.
-	 * The main dump from do_double_fault() is fine, though, since it
+	 * The main dump from exc_double_fault() is fine, though, since it
 	 * uses these regs directly.
 	 *
 	 * If anyone ever cares, this could be moved to asm.
@@ -71,7 +70,7 @@ asmlinkage notrace void __noreturn doublefault_shim(void)
 	regs.cx		= TSS(cx);
 	regs.bx		= TSS(bx);
 
-	do_double_fault(&regs, 0, cr2);
+	exc_double_fault(&regs, 0, cr2);
 
 	/*
 	 * x86_32 does not save the original CR3 anywhere on a task switch.
@@ -85,7 +84,6 @@ asmlinkage notrace void __noreturn doublefault_shim(void)
 	 */
 	panic("cannot return from double fault\n");
 }
-NOKPROBE_SYMBOL(doublefault_shim);
 
 DEFINE_PER_CPU_PAGE_ALIGNED(struct doublefault_stack, doublefault_stack) = {
 	.tss = {
@@ -96,7 +94,7 @@ DEFINE_PER_CPU_PAGE_ALIGNED(struct doublefault_stack, doublefault_stack) = {
 		.ldt		= 0,
 	.io_bitmap_base	= IO_BITMAP_OFFSET_INVALID,
 
-		.ip		= (unsigned long) double_fault,
+		.ip		= (unsigned long) asm_exc_double_fault,
 		.flags		= X86_EFLAGS_FIXED,
 		.es		= __USER_DS,
 		.cs		= __KERNEL_CS,
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index ddf3f3d..ec55479 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -91,7 +91,7 @@ static const __initconst struct idt_data def_idts[] = {
 #ifdef CONFIG_X86_32
 	TSKG(X86_TRAP_DF,		GDT_ENTRY_DOUBLEFAULT_TSS),
 #else
-	INTG(X86_TRAP_DF,		double_fault),
+	INTG(X86_TRAP_DF,		asm_exc_double_fault),
 #endif
 	INTG(X86_TRAP_DB,		asm_exc_debug),
 
@@ -187,7 +187,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
 static const __initconst struct idt_data ist_idts[] = {
 	ISTG(X86_TRAP_DB,	asm_exc_debug,		IST_INDEX_DB),
 	ISTG(X86_TRAP_NMI,	asm_exc_nmi,		IST_INDEX_NMI),
-	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
+	ISTG(X86_TRAP_DF,	asm_exc_double_fault,	IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
 	ISTG(X86_TRAP_MC,	asm_exc_machine_check,	IST_INDEX_MCE),
 #endif
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 41bb0cb..35298c1 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -319,12 +319,19 @@ __visible void __noreturn handle_stack_overflow(const char *message,
  * from the TSS.  Returning is, in principle, okay, but changes to regs will
  * be lost.  If, for some reason, we need to return to a context with modified
  * regs, the shim code could be adjusted to synchronize the registers.
+ *
+ * The 32bit #DF shim provides CR2 already as an argument. On 64bit it needs
+ * to be read before doing anything else.
  */
-dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2)
+DEFINE_IDTENTRY_DF(exc_double_fault)
 {
 	static const char str[] = "double fault";
 	struct task_struct *tsk = current;
 
+#ifdef CONFIG_X86_64
+	unsigned long address = read_cr2();
+#endif
+
 #ifdef CONFIG_X86_ESPFIX64
 	extern unsigned char native_irq_return_iret[];
 
@@ -381,6 +388,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsign
 #endif
 
 	nmi_enter();
+	instrumentation_begin();
 	notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV);
 
 	tsk->thread.error_code = error_code;
@@ -424,13 +432,16 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsign
 	 * stack even if the actual trigger for the double fault was
 	 * something else.
 	 */
-	if ((unsigned long)task_stack_page(tsk) - 1 - cr2 < PAGE_SIZE)
-		handle_stack_overflow("kernel stack overflow (double-fault)", regs, cr2);
+	if ((unsigned long)task_stack_page(tsk) - 1 - address < PAGE_SIZE) {
+		handle_stack_overflow("kernel stack overflow (double-fault)",
+				      regs, address);
+	}
 #endif
 
 	pr_emerg("PANIC: double fault, error_code: 0x%lx\n", error_code);
 	die("double fault", regs, error_code);
 	panic("Machine halted.");
+	instrumentation_end();
 }
 
 DEFINE_IDTENTRY(exc_bounds)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 535dde1..851ea41 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -617,7 +617,7 @@ struct trap_array_entry {
 
 static struct trap_array_entry trap_array[] = {
 	TRAP_ENTRY_REDIR(exc_debug, exc_xendebug,	true  ),
-	{ double_fault,                xen_double_fault,                true },
+	TRAP_ENTRY(exc_double_fault,			true  ),
 #ifdef CONFIG_X86_MCE
 	TRAP_ENTRY(exc_machine_check,			true  ),
 #endif
@@ -652,7 +652,7 @@ static bool __ref get_trap_addr(void **addr, unsigned int ist)
 	 * Replace trap handler addresses by Xen specific ones.
 	 * Check for known traps using IST and whitelist them.
 	 * The debugger ones are the only ones we care about.
-	 * Xen will handle faults like double_fault, * so we should never see
+	 * Xen will handle faults like double_fault, so we should never see
 	 * them.  Warn if there's an unexpected IST-using fault handler.
 	 */
 	for (nr = 0; nr < ARRAY_SIZE(trap_array); nr++) {
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 9999ea3..e46d863 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -37,7 +37,7 @@ xen_pv_trap asm_exc_overflow
 xen_pv_trap asm_exc_bounds
 xen_pv_trap asm_exc_invalid_op
 xen_pv_trap asm_exc_device_not_available
-xen_pv_trap double_fault
+xen_pv_trap asm_exc_double_fault
 xen_pv_trap asm_exc_coproc_segment_overrun
 xen_pv_trap asm_exc_invalid_tss
 xen_pv_trap asm_exc_segment_not_present

^ permalink raw reply related	[flat|nested] 94+ messages in thread
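
The stack overflow test in exc_double_fault() above relies on unsigned
wraparound to check "is the fault address inside the page just below the
stack" with a single comparison. A small sketch of that arithmetic follows.
PAGE_SIZE_SKETCH, the function names and the example addresses are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE_SKETCH 4096UL	/* assumed page size for the example */

/*
 * True if @address lies in the page directly below @stack_base.
 *
 * stack_base - 1 is the highest address of that page. If @address is
 * above it, the unsigned subtraction wraps to a huge value. If @address
 * is more than a page below it, the difference is at least the page
 * size. Either way the single comparison fails.
 */
bool hits_guard_page_sketch(unsigned long stack_base, unsigned long address)
{
	return stack_base - 1 - address < PAGE_SIZE_SKETCH;
}

/* Usage: probe both edges of the page below a stack at 0x10000. */
void guard_page_selftest_sketch(void)
{
	unsigned long base = 0x10000UL;

	assert(hits_guard_page_sketch(base, 0xffffUL));		/* top byte */
	assert(hits_guard_page_sketch(base, 0xf000UL));		/* bottom byte */
	assert(!hits_guard_page_sketch(base, 0xefffUL));	/* one below */
	assert(!hits_guard_page_sketch(base, base));		/* on the stack */
}
```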

* [tip: x86/entry] x86/mce: Address objtools noinstr complaints
  2020-05-05 13:49 ` [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints Thomas Gleixner
  2020-05-15  5:40   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     260ba6c939f6ac42c8a96d2b50750b18706f1663
Gitweb:        https://git.kernel.org/tip/260ba6c939f6ac42c8a96d2b50750b18706f1663
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 21 Apr 2020 21:22:36 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:14 +02:00

x86/mce: Address objtools noinstr complaints

Mark the relevant functions noinstr and use the plain non-instrumented MSR
accessors. The only odd part is the instrumentation_begin()/end() pair around the
indirect machine_check_vector() call as objtool can't figure that out. The
functions which can be invoked through it are annotated correctly.

Also use the notrace variant of nmi_enter/exit(). If MCEs happen then hardware
latency tracing is the least of the worries.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135315.476734898@linutronix.de


---
 arch/x86/kernel/cpu/mce/core.c    | 20 +++++++++++++++-----
 arch/x86/kernel/cpu/mce/p5.c      |  4 +++-
 arch/x86/kernel/cpu/mce/winchip.c |  4 +++-
 kernel/time/timekeeping.c         |  2 +-
 4 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index a72c013..a32a7e2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -130,7 +130,7 @@ static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
 BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
 
 /* Do initial initialization of a struct mce */
-void mce_setup(struct mce *m)
+noinstr void mce_setup(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
 	m->cpu = m->extcpu = smp_processor_id();
@@ -140,12 +140,12 @@ void mce_setup(struct mce *m)
 	m->cpuid = cpuid_eax(1);
 	m->socketid = cpu_data(m->extcpu).phys_proc_id;
 	m->apicid = cpu_data(m->extcpu).initial_apicid;
-	rdmsrl(MSR_IA32_MCG_CAP, m->mcgcap);
+	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
 
 	if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
-		rdmsrl(MSR_PPIN, m->ppin);
+		m->ppin = __rdmsr(MSR_PPIN);
 	else if (this_cpu_has(X86_FEATURE_AMD_PPIN))
-		rdmsrl(MSR_AMD_PPIN, m->ppin);
+		m->ppin = __rdmsr(MSR_AMD_PPIN);
 
 	m->microcode = boot_cpu_data.microcode;
 }
@@ -1895,10 +1895,12 @@ bool filter_mce(struct mce *m)
 }
 
 /* Handle unconfigured int18 (should never happen) */
-static void unexpected_machine_check(struct pt_regs *regs)
+static noinstr void unexpected_machine_check(struct pt_regs *regs)
 {
+	instrumentation_begin();
 	pr_err("CPU#%d: Unexpected int18 (Machine Check)\n",
 	       smp_processor_id());
+	instrumentation_end();
 }
 
 /* Call the installed machine check handler for this CPU setup. */
@@ -1915,14 +1917,22 @@ static __always_inline void exc_machine_check_kernel(struct pt_regs *regs)
 		return;
 
 	nmi_enter();
+	/*
+	 * The call targets are marked noinstr, but objtool can't figure
+	 * that out because it's an indirect call. Annotate it.
+	 */
+	instrumentation_begin();
 	machine_check_vector(regs);
+	instrumentation_end();
 	nmi_exit();
 }
 
 static __always_inline void exc_machine_check_user(struct pt_regs *regs)
 {
 	idtentry_enter(regs);
+	instrumentation_begin();
 	machine_check_vector(regs);
+	instrumentation_end();
 	idtentry_exit(regs);
 }
 
diff --git a/arch/x86/kernel/cpu/mce/p5.c b/arch/x86/kernel/cpu/mce/p5.c
index eaebc4c..19e90ca 100644
--- a/arch/x86/kernel/cpu/mce/p5.c
+++ b/arch/x86/kernel/cpu/mce/p5.c
@@ -21,10 +21,11 @@
 int mce_p5_enabled __read_mostly;
 
 /* Machine check handler for Pentium class Intel CPUs: */
-static void pentium_machine_check(struct pt_regs *regs)
+static noinstr void pentium_machine_check(struct pt_regs *regs)
 {
 	u32 loaddr, hi, lotype;
 
+	instrumentation_begin();
 	rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
 	rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
 
@@ -37,6 +38,7 @@ static void pentium_machine_check(struct pt_regs *regs)
 	}
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+	instrumentation_end();
 }
 
 /* Set up machine check reporting for processors with Intel style MCE: */
diff --git a/arch/x86/kernel/cpu/mce/winchip.c b/arch/x86/kernel/cpu/mce/winchip.c
index 90e3d60..9c9f0ab 100644
--- a/arch/x86/kernel/cpu/mce/winchip.c
+++ b/arch/x86/kernel/cpu/mce/winchip.c
@@ -17,10 +17,12 @@
 #include "internal.h"
 
 /* Machine check handler for WinChip C6: */
-static void winchip_machine_check(struct pt_regs *regs)
+static noinstr void winchip_machine_check(struct pt_regs *regs)
 {
+	instrumentation_begin();
 	pr_emerg("CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
+	instrumentation_end();
 }
 
 /* Set up machine check reporting on the Winchip C6 series */
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9ebaab1..d20d489 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -953,7 +953,7 @@ EXPORT_SYMBOL_GPL(ktime_get_real_seconds);
  * but without the sequence counter protect. This internal function
  * is called just when timekeeping lock is already held.
  */
-time64_t __ktime_get_real_seconds(void)
+noinstr time64_t __ktime_get_real_seconds(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 

^ permalink raw reply related	[flat|nested] 94+ messages in thread
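
The changelog above explains that the indirect machine_check_vector() call
is bracketed by instrumentation_begin()/end() because objtool cannot see
through the pointer. A userspace sketch of the call-site shape follows. In
the kernel these markers are annotations consumed by objtool. The *_sketch
stand-ins below are hypothetical and merely count nesting.

```c
#include <assert.h>

/* Hypothetical stand-ins: only track nesting depth for the demo. */
int instr_depth_sketch;
#define instrumentation_begin_sketch()	(instr_depth_sketch++)
#define instrumentation_end_sketch()	(instr_depth_sketch--)

int handler_calls_sketch;

/* Default target, playing the role of unexpected_machine_check(). */
void unexpected_check_sketch(void)
{
	handler_calls_sketch++;
}

/* Indirect call target, replaceable at setup time. */
void (*check_vector_sketch)(void) = unexpected_check_sketch;

void exc_check_sketch(void)
{
	/*
	 * A static checker cannot resolve the pointer, so the call site
	 * is bracketed explicitly.
	 */
	instrumentation_begin_sketch();
	check_vector_sketch();
	instrumentation_end_sketch();
}

/* Usage: one dispatch, markers balanced afterwards. */
void indirect_call_selftest_sketch(void)
{
	exc_check_sketch();
	assert(handler_calls_sketch == 1);
	assert(instr_depth_sketch == 0);
}
```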

* [tip: x86/entry] x86/traps: Address objtool noinstr complaints in #DB
  2020-05-05 13:49 ` [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB Thomas Gleixner
  2020-05-15  5:39   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     467a8425d10508886eb5bcc0b80e0c73da4751c4
Gitweb:        https://git.kernel.org/tip/467a8425d10508886eb5bcc0b80e0c73da4751c4
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 30 Apr 2020 11:07:20 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:13 +02:00

x86/traps: Address objtool noinstr complaints in #DB

The functions invoked from handle_debug() can be instrumented. Tell objtool
about it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135315.380927730@linutronix.de


---
 arch/x86/kernel/traps.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b62e962..41bb0cb 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -784,14 +784,19 @@ static void noinstr handle_debug(struct pt_regs *regs, unsigned long dr6,
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
 
+	instrumentation_begin();
 #ifdef CONFIG_KPROBES
-	if (kprobe_debug_handler(regs))
+	if (kprobe_debug_handler(regs)) {
+		instrumentation_end();
 		return;
+	}
 #endif
 
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
-		       SIGTRAP) == NOTIFY_STOP)
+		       SIGTRAP) == NOTIFY_STOP) {
+		instrumentation_end();
 		return;
+	}
 
 	/*
 	 * Let others (NMI) know that the debug stack is in use
@@ -827,6 +832,7 @@ static void noinstr handle_debug(struct pt_regs *regs, unsigned long dr6,
 out:
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
+	instrumentation_end();
 }
 
 static __always_inline void exc_debug_kernel(struct pt_regs *regs,

^ permalink raw reply related	[flat|nested] 94+ messages in thread
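
The handle_debug() change above has to close the instrumentable region on
every return path, which is why the early returns gained braces. A sketch
of that balancing rule follows, with a depth counter standing in for the
objtool annotation. All *_sketch names are hypothetical, and the
claimed_early flag merely models a notifier that consumed the event.

```c
#include <assert.h>
#include <stdbool.h>

int db_instr_depth_sketch;

static inline void db_instr_begin_sketch(void) { db_instr_depth_sketch++; }
static inline void db_instr_end_sketch(void)   { db_instr_depth_sketch--; }

/*
 * @claimed_early models a notifier (kprobes, notify_die()) that consumed
 * the event. Returns true if the full handler ran.
 */
bool handle_debug_sketch(bool claimed_early)
{
	db_instr_begin_sketch();

	if (claimed_early) {
		db_instr_end_sketch();	/* close the region before returning */
		return false;
	}

	/* ... remaining, instrumentable work would go here ... */

	db_instr_end_sketch();
	return true;
}

/* Usage: the depth is back at zero on both paths. */
void balance_selftest_sketch(void)
{
	assert(!handle_debug_sketch(true));
	assert(db_instr_depth_sketch == 0);
	assert(handle_debug_sketch(false));
	assert(db_instr_depth_sketch == 0);
}
```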

* [tip: x86/entry] x86/entry: Implement user mode C entry points for #DB and #MCE
  2020-05-05 13:49 ` [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE Thomas Gleixner
  2020-05-15  5:32   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     210d5380b6e055f0cd991a2ebefaa884689d4f95
Gitweb:        https://git.kernel.org/tip/210d5380b6e055f0cd991a2ebefaa884689d4f95
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:29 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:12 +02:00

x86/entry: Implement user mode C entry points for #DB and #MCE

The MCE entry point uses the same mechanism as the IST entry point for
now. For #DB, split the inner workings and just keep the nmi_enter/exit()
magic in the IST variant. Fix up the ASM code to emit the proper
noist_##cfunc call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135315.177564104@linutronix.de


---
 arch/x86/entry/entry_64.S      |  2 +-
 arch/x86/kernel/cpu/mce/core.c | 40 +++++++++++++++----
 arch/x86/kernel/traps.c        | 70 +++++++++++++++++++++++++--------
 3 files changed, 88 insertions(+), 24 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index eeb4285..d302839 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -657,7 +657,7 @@ SYM_CODE_START(\asmsym)
 
 	/* Switch to the regular task stack and use the noist entry point */
 .Lfrom_usermode_switch_stack_\@:
-	idtentry_body vector \cfunc, has_error_code=0
+	idtentry_body vector noist_\cfunc, has_error_code=0 sane=1
 
 _ASM_NOKPROBE(\asmsym)
 SYM_CODE_END(\asmsym)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 3177652..a72c013 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1904,24 +1904,50 @@ static void unexpected_machine_check(struct pt_regs *regs)
 /* Call the installed machine check handler for this CPU setup. */
 void (*machine_check_vector)(struct pt_regs *) = unexpected_machine_check;
 
-DEFINE_IDTENTRY_MCE(exc_machine_check)
+static __always_inline void exc_machine_check_kernel(struct pt_regs *regs)
 {
+	/*
+	 * Only required when from kernel mode. See
+	 * mce_check_crashing_cpu() for details.
+	 */
 	if (machine_check_vector == do_machine_check &&
 	    mce_check_crashing_cpu())
 		return;
 
-	if (user_mode(regs))
-		idtentry_enter(regs);
-	else
-		nmi_enter();
+	nmi_enter();
+	machine_check_vector(regs);
+	nmi_exit();
+}
 
+static __always_inline void exc_machine_check_user(struct pt_regs *regs)
+{
+	idtentry_enter(regs);
 	machine_check_vector(regs);
+	idtentry_exit(regs);
+}
 
+#ifdef CONFIG_X86_64
+/* MCE hit kernel mode */
+DEFINE_IDTENTRY_MCE(exc_machine_check)
+{
+	exc_machine_check_kernel(regs);
+}
+
+/* The user mode variant. */
+DEFINE_IDTENTRY_MCE_USER(exc_machine_check)
+{
+	exc_machine_check_user(regs);
+}
+#else
+/* 32bit unified entry point */
+DEFINE_IDTENTRY_MCE(exc_machine_check)
+{
 	if (user_mode(regs))
-		idtentry_exit(regs);
+		exc_machine_check_user(regs);
 	else
-		nmi_exit();
+		exc_machine_check_kernel(regs);
 }
+#endif
 
 /*
  * Called for each booted CPU to set up machine checks.
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 569408a..4f248c5 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -775,20 +775,12 @@ static __always_inline void debug_exit(unsigned long dr7)
  *
  * May run on IST stack.
  */
-DEFINE_IDTENTRY_DEBUG(exc_debug)
+static noinstr void handle_debug(struct pt_regs *regs, unsigned long dr6)
 {
 	struct task_struct *tsk = current;
-	unsigned long dr6, dr7;
 	int user_icebp = 0;
 	int si_code;
 
-	debug_enter(&dr6, &dr7);
-
-	if (user_mode(regs))
-		idtentry_enter(regs);
-	else
-		nmi_enter();
-
 	/*
 	 * The SDM says "The processor clears the BTF flag when it
 	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
@@ -800,7 +792,7 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 		     is_sysenter_singlestep(regs))) {
 		dr6 &= ~DR_STEP;
 		if (!dr6)
-			goto exit;
+			return;
 		/*
 		 * else we might have gotten a single-step trap and hit a
 		 * watchpoint at the same time, in which case we should fall
@@ -821,12 +813,12 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 
 #ifdef CONFIG_KPROBES
 	if (kprobe_debug_handler(regs))
-		goto exit;
+		return;
 #endif
 
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
 		       SIGTRAP) == NOTIFY_STOP)
-		goto exit;
+		return;
 
 	/*
 	 * Let others (NMI) know that the debug stack is in use
@@ -842,7 +834,7 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 				 X86_TRAP_DB);
 		cond_local_irq_disable(regs);
 		debug_stack_usage_dec();
-		goto exit;
+		return;
 	}
 
 	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
@@ -861,14 +853,60 @@ DEFINE_IDTENTRY_DEBUG(exc_debug)
 		send_sigtrap(regs, 0, si_code);
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
+}
+
+static __always_inline void exc_debug_kernel(struct pt_regs *regs,
+					     unsigned long dr6)
+{
+	nmi_enter();
+	handle_debug(regs, dr6);
+	nmi_exit();
+}
+
+static __always_inline void exc_debug_user(struct pt_regs *regs,
+					   unsigned long dr6)
+{
+	idtentry_enter(regs);
+	handle_debug(regs, dr6);
+	idtentry_exit(regs);
+}
+
+#ifdef CONFIG_X86_64
+/* IST stack entry */
+DEFINE_IDTENTRY_DEBUG(exc_debug)
+{
+	unsigned long dr6, dr7;
+
+	debug_enter(&dr6, &dr7);
+	exc_debug_kernel(regs, dr6);
+	debug_exit(dr7);
+}
+
+/* User entry, runs on regular task stack */
+DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
+{
+	unsigned long dr6, dr7;
+
+	debug_enter(&dr6, &dr7);
+	exc_debug_user(regs, dr6);
+	debug_exit(dr7);
+}
+#else
+/* 32 bit does not have separate entry points. */
+DEFINE_IDTENTRY_DEBUG(exc_debug)
+{
+	unsigned long dr6, dr7;
+
+	debug_enter(&dr6, &dr7);
 
-exit:
 	if (user_mode(regs))
-		idtentry_exit(regs);
+		exc_debug_user(regs, dr6);
 	else
-		nmi_exit();
+		exc_debug_kernel(regs, dr6);
+
 	debug_exit(dr7);
 }
+#endif
 
 /*
  * Note that we play around with the 'TS' bit in an attempt to get

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [tip: x86/entry] x86/traps: Restructure #DB handling
  2020-05-05 13:49 ` [patch V4 part 4 20/24] x86/traps: Restructure #DB handling Thomas Gleixner
  2020-05-15  5:39   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     ee8324f0167a914509d1db5ee0263620fd8e83b8
Gitweb:        https://git.kernel.org/tip/ee8324f0167a914509d1db5ee0263620fd8e83b8
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 04 May 2020 19:56:26 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:13 +02:00

x86/traps: Restructure #DB handling

Now that there are separate entry points, move the kernel/user_mode specific
checks into the entry functions so the common handling code does not need
the extra mode checks. Make the code more readable while at it.
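
As a rough illustration of that split, here is a user-space sketch (not kernel
code). All model_* names and the sysenter_singlestep parameter are
hypothetical stand-ins; only the DR_STEP value and the decision logic are
taken from the patch below.

```c
#include <assert.h>
#include <stdbool.h>

#define DR_STEP 0x4000UL	/* DR6 single-step bit, same value as the kernel's DR_STEP */

/* Hypothetical record of what the common handler was handed. */
struct model_result {
	bool handler_called;
	unsigned long dr6_seen;
	bool user_icebp;
};

/* Stand-in for the common handler: it no longer checks the mode itself. */
static void model_handle_debug(struct model_result *res, unsigned long dr6,
			       bool user_icebp)
{
	assert(res);
	res->handler_called = true;
	res->dr6_seen = dr6;
	res->user_icebp = user_icebp;
}

/* Kernel mode entry: filter SYSENTER single-step, skip an empty DR6. */
static struct model_result model_debug_kernel(unsigned long dr6,
					      bool sysenter_singlestep)
{
	struct model_result res = { false, 0, false };

	if ((dr6 & DR_STEP) && sysenter_singlestep)
		dr6 &= ~DR_STEP;

	/* The kernel does not use INT1, so a zero DR6 needs no handling. */
	if (dr6)
		model_handle_debug(&res, dr6, false);

	return res;
}

/* User mode entry: a zero DR6 is treated as icebp/int01. */
static struct model_result model_debug_user(unsigned long dr6)
{
	struct model_result res = { false, 0, false };

	model_handle_debug(&res, dr6, !dr6);
	return res;
}
```

In this model a kernel mode entry with only DR_STEP set from a SYSENTER single
step never reaches the common handler, while a user mode entry with DR6 == 0
reaches it with user_icebp set.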

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135315.283276272@linutronix.de


---
 arch/x86/kernel/traps.c | 69 ++++++++++++++++++++--------------------
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4f248c5..b62e962 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -775,39 +775,12 @@ static __always_inline void debug_exit(unsigned long dr7)
  *
  * May run on IST stack.
  */
-static noinstr void handle_debug(struct pt_regs *regs, unsigned long dr6)
+static void noinstr handle_debug(struct pt_regs *regs, unsigned long dr6,
+				 bool user_icebp)
 {
 	struct task_struct *tsk = current;
-	int user_icebp = 0;
 	int si_code;
 
-	/*
-	 * The SDM says "The processor clears the BTF flag when it
-	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
-	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
-	 */
-	clear_tsk_thread_flag(tsk, TIF_BLOCKSTEP);
-
-	if (unlikely(!user_mode(regs) && (dr6 & DR_STEP) &&
-		     is_sysenter_singlestep(regs))) {
-		dr6 &= ~DR_STEP;
-		if (!dr6)
-			return;
-		/*
-		 * else we might have gotten a single-step trap and hit a
-		 * watchpoint at the same time, in which case we should fall
-		 * through and handle the watchpoint.
-		 */
-	}
-
-	/*
-	 * If dr6 has no reason to give us about the origin of this trap,
-	 * then it's very likely the result of an icebp/int01 trap.
-	 * User wants a sigtrap for that.
-	 */
-	if (!dr6 && user_mode(regs))
-		user_icebp = 1;
-
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
 
@@ -832,9 +805,7 @@ static noinstr void handle_debug(struct pt_regs *regs, unsigned long dr6)
 	if (v8086_mode(regs)) {
 		handle_vm86_trap((struct kernel_vm86_regs *) regs, 0,
 				 X86_TRAP_DB);
-		cond_local_irq_disable(regs);
-		debug_stack_usage_dec();
-		return;
+		goto out;
 	}
 
 	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
@@ -848,9 +819,12 @@ static noinstr void handle_debug(struct pt_regs *regs, unsigned long dr6)
 		set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
 		regs->flags &= ~X86_EFLAGS_TF;
 	}
+
 	si_code = get_si_code(tsk->thread.debugreg6);
 	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
 		send_sigtrap(regs, 0, si_code);
+
+out:
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
 }
@@ -859,7 +833,27 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 					     unsigned long dr6)
 {
 	nmi_enter();
-	handle_debug(regs, dr6);
+	/*
+	 * The SDM says "The processor clears the BTF flag when it
+	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
+	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
+	 */
+	clear_thread_flag(TIF_BLOCKSTEP);
+
+	/*
+	 * Catch SYSENTER with TF set and clear DR_STEP. If this hit a
+	 * watchpoint at the same time then that will still be handled.
+	 */
+	if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
+		dr6 &= ~DR_STEP;
+
+	/*
+	 * If DR6 is zero, no point in trying to handle it. The kernel is
+	 * not using INT1.
+	 */
+	if (dr6)
+		handle_debug(regs, dr6, false);
+
 	nmi_exit();
 }
 
@@ -867,7 +861,14 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
 					   unsigned long dr6)
 {
 	idtentry_enter(regs);
-	handle_debug(regs, dr6);
+	clear_thread_flag(TIF_BLOCKSTEP);
+
+	/*
+	 * If dr6 has no reason to give us about the origin of this trap,
+	 * then it's very likely the result of an icebp/int01 trap.
+	 * User wants a sigtrap for that.
+	 */
+	handle_debug(regs, dr6, !dr6);
 	idtentry_exit(regs);
 }
 

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [tip: x86/entry] x86/idtentry: Provide IDTRENTRY_NOIST variants for #DB and #MC
  2020-05-05 13:49 ` [patch V4 part 4 18/24] x86/entry: Provide IDTRENTRY_NOIST variants for #DB and #MC Thomas Gleixner
  2020-05-15  5:29   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     97f4e8b75a99efc7376b6c7b5bc2a3124f609bc1
Gitweb:        https://git.kernel.org/tip/97f4e8b75a99efc7376b6c7b5bc2a3124f609bc1
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:28 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:12 +02:00

x86/idtentry: Provide IDTRENTRY_NOIST variants for #DB and #MC

Provide NOIST entry point macros which allow implementing NOIST variants
of the C entry points. These are invoked when #DB or #MC enter from user
space. This allows explicit handling of the difference between user mode
and kernel mode entry later.
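
The naming scheme can be sketched in plain C with token pasting. This is a
simplified user-space model, not the kernel macros: the *_MODEL macros,
struct pt_regs_model and the dispatch helpers are hypothetical, and the real
selection between the two C handlers happens in the ASM entry stub.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pt_regs. */
struct pt_regs_model {
	int from_user;		/* 1 if the exception came from user space */
	int handled_by;		/* 1 = IST variant ran, 2 = NOIST variant ran */
};

/* Model of DEFINE_IDTENTRY_IST(): emits a handler named func. */
#define DEFINE_ENTRY_IST_MODEL(func)					\
	static void func(struct pt_regs_model *regs)

/* Model of DEFINE_IDTENTRY_NOIST(): same name, prefixed with noist_. */
#define DEFINE_ENTRY_NOIST_MODEL(func)					\
	static void noist_##func(struct pt_regs_model *regs)

/* Kernel mode entry, would run on the IST stack. */
DEFINE_ENTRY_IST_MODEL(exc_debug_model)
{
	assert(regs);
	regs->handled_by = 1;
}

/* User mode entry, would run on the regular task stack. */
DEFINE_ENTRY_NOIST_MODEL(exc_debug_model)
{
	assert(regs);
	regs->handled_by = 2;
}

/* Stand-in for the ASM stub that picks the C entry point. */
static void asm_stub_model(struct pt_regs_model *regs)
{
	if (regs->from_user)
		noist_exc_debug_model(regs);
	else
		exc_debug_model(regs);
}

/* Convenience wrapper: returns which variant handled the entry. */
static int dispatch_model(int from_user)
{
	struct pt_regs_model regs = { from_user, 0 };

	asm_stub_model(&regs);
	return regs.handled_by;
}
```

Because both variants are derived from the same base name, the function name
passed to the NOIST macro must match the one used for the IST variant.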

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135315.084882104@linutronix.de


---
 arch/x86/include/asm/idtentry.h | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index fcd4230..060f9e3 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -138,10 +138,12 @@ __visible noinstr void func(struct pt_regs *regs)
  * @vector:	Vector number (ignored for C)
  * @func:	Function name of the entry point
  *
- * Maps to DECLARE_IDTENTRY_RAW
+ * Maps to DECLARE_IDTENTRY_RAW, but declares also the NOIST C handler
+ * which is called from the ASM entry point on user mode entry
  */
 #define DECLARE_IDTENTRY_IST(vector, func)				\
-	DECLARE_IDTENTRY_RAW(vector, func)
+	DECLARE_IDTENTRY_RAW(vector, func);				\
+	__visible void noist_##func(struct pt_regs *regs)
 
 /**
  * DEFINE_IDTENTRY_IST - Emit code for IST entry points
@@ -152,6 +154,17 @@ __visible noinstr void func(struct pt_regs *regs)
 #define DEFINE_IDTENTRY_IST(func)					\
 	DEFINE_IDTENTRY_RAW(func)
 
+/**
+ * DEFINE_IDTENTRY_NOIST - Emit code for NOIST entry points which
+ *			   belong to a IST entry point (MCE, DB)
+ * @func:	Function name of the entry point. Must be the same as
+ *		the function name of the corresponding IST variant
+ *
+ * Maps to DEFINE_IDTENTRY_RAW().
+ */
+#define DEFINE_IDTENTRY_NOIST(func)					\
+	DEFINE_IDTENTRY_RAW(noist_##func)
+
 #else	/* CONFIG_X86_64 */
 /* Maps to a regular IDTENTRY on 32bit for now */
 # define DECLARE_IDTENTRY_IST		DECLARE_IDTENTRY
@@ -161,12 +174,14 @@ __visible noinstr void func(struct pt_regs *regs)
 /* C-Code mapping */
 #define DECLARE_IDTENTRY_MCE		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_MCE		DEFINE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_MCE_USER	DEFINE_IDTENTRY_NOIST
 
 #define DECLARE_IDTENTRY_NMI		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_IST
 
 #define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_DEBUG_USER	DEFINE_IDTENTRY_NOIST
 
 /**
  * DECLARE_IDTENTRY_XEN - Declare functions for XEN redirect IDT entry points

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [tip: x86/entry] x86/entry: Convert Debug exception to IDTENTRY_DB
  2020-05-05 13:49 ` [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB Thomas Gleixner
  2020-05-15  5:27   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     c087b87b1469166dbb0d757c7d79fecf14a90a0a
Gitweb:        https://git.kernel.org/tip/c087b87b1469166dbb0d757c7d79fecf14a90a0a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:26 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:11 +02:00

x86/entry: Convert Debug exception to IDTENTRY_DB

Convert #DB to IDTENTRY_DB:
  - Implement the C entry point with DEFINE_IDTENTRY_DEBUG
  - Emit the ASM stub with DECLARE_IDTENTRY_DEBUG
  - Remove the ASM idtentry in 64bit
  - Remove the open coded ASM entry code in 32bit
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.900297476@linutronix.de


---
 arch/x86/entry/entry_32.S       | 10 ----------
 arch/x86/entry/entry_64.S       |  2 --
 arch/x86/include/asm/idtentry.h |  4 ++++
 arch/x86/include/asm/traps.h    |  3 ---
 arch/x86/kernel/idt.c           |  8 ++++----
 arch/x86/kernel/traps.c         | 21 +++++++++++++--------
 arch/x86/xen/enlighten_pv.c     |  2 +-
 arch/x86/xen/xen-asm_64.S       |  4 ++--
 8 files changed, 24 insertions(+), 30 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index d4961ca..30c6ed3 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1488,16 +1488,6 @@ ret_to_user:
 	jmp	restore_all_switch_stack
 SYM_CODE_END(handle_exception)
 
-SYM_CODE_START(debug)
-	/*
-	 * Entry from sysenter is now handled in common_exception
-	 */
-	ASM_CLAC
-	pushl	$0
-	pushl	$do_debug
-	jmp	common_exception
-SYM_CODE_END(debug)
-
 SYM_CODE_START(double_fault)
 1:
 	/*
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3d7f2cc..f47629a 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1074,12 +1074,10 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
 
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
-idtentry_mce_db	X86_TRAP_DB		debug			do_debug
 idtentry_df	X86_TRAP_DF		double_fault		do_double_fault
 
 #ifdef CONFIG_XEN_PV
 idtentry	512 /* dummy */		hypervisor_callback	xen_do_hypervisor_callback	has_error_code=0
-idtentry	X86_TRAP_DB		xendebug		do_debug			has_error_code=0
 #endif
 
 /*
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 1f067e6..fcd4230 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -262,4 +262,8 @@ DECLARE_IDTENTRY_MCE(X86_TRAP_MC,	exc_machine_check);
 DECLARE_IDTENTRY_NMI(X86_TRAP_NMI,	exc_nmi);
 DECLARE_IDTENTRY_XEN(X86_TRAP_NMI,	nmi);
 
+/* #DB */
+DECLARE_IDTENTRY_DEBUG(X86_TRAP_DB,	exc_debug);
+DECLARE_IDTENTRY_XEN(X86_TRAP_DB,	debug);
+
 #endif
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 57b83ae..9bd602d 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -11,7 +11,6 @@
 
 #define dotraplinkage __visible
 
-asmlinkage void debug(void);
 #ifdef CONFIG_X86_64
 asmlinkage void double_fault(void);
 #endif
@@ -19,12 +18,10 @@ asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
-asmlinkage void xen_xendebug(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #endif
 
-dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
 
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index d3fecd8..ddf3f3d 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -59,7 +59,7 @@ static bool idt_setup_done __initdata;
  * stacks work only after cpu_init().
  */
 static const __initconst struct idt_data early_idts[] = {
-	INTG(X86_TRAP_DB,		debug),
+	INTG(X86_TRAP_DB,		asm_exc_debug),
 	SYSG(X86_TRAP_BP,		asm_exc_int3),
 #ifdef CONFIG_X86_32
 	INTG(X86_TRAP_PF,		page_fault),
@@ -93,7 +93,7 @@ static const __initconst struct idt_data def_idts[] = {
 #else
 	INTG(X86_TRAP_DF,		double_fault),
 #endif
-	INTG(X86_TRAP_DB,		debug),
+	INTG(X86_TRAP_DB,		asm_exc_debug),
 
 #ifdef CONFIG_X86_MCE
 	INTG(X86_TRAP_MC,		asm_exc_machine_check),
@@ -164,7 +164,7 @@ static const __initconst struct idt_data early_pf_idts[] = {
  * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
  */
 static const __initconst struct idt_data dbg_idts[] = {
-	INTG(X86_TRAP_DB,	debug),
+	INTG(X86_TRAP_DB,		asm_exc_debug),
 };
 #endif
 
@@ -185,7 +185,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
  * cpu_init() when the TSS has been initialized.
  */
 static const __initconst struct idt_data ist_idts[] = {
-	ISTG(X86_TRAP_DB,	debug,			IST_INDEX_DB),
+	ISTG(X86_TRAP_DB,	asm_exc_debug,		IST_INDEX_DB),
 	ISTG(X86_TRAP_NMI,	asm_exc_nmi,		IST_INDEX_NMI),
 	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index de5120e..569408a 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -775,7 +775,7 @@ static __always_inline void debug_exit(unsigned long dr7)
  *
  * May run on IST stack.
  */
-dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_DEBUG(exc_debug)
 {
 	struct task_struct *tsk = current;
 	unsigned long dr6, dr7;
@@ -784,7 +784,10 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 
 	debug_enter(&dr6, &dr7);
 
-	nmi_enter();
+	if (user_mode(regs))
+		idtentry_enter(regs);
+	else
+		nmi_enter();
 
 	/*
 	 * The SDM says "The processor clears the BTF flag when it
@@ -821,7 +824,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 		goto exit;
 #endif
 
-	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
+	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
 		       SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 
@@ -835,8 +838,8 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	cond_local_irq_enable(regs);
 
 	if (v8086_mode(regs)) {
-		handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code,
-					X86_TRAP_DB);
+		handle_vm86_trap((struct kernel_vm86_regs *) regs, 0,
+				 X86_TRAP_DB);
 		cond_local_irq_disable(regs);
 		debug_stack_usage_dec();
 		goto exit;
@@ -855,15 +858,17 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	}
 	si_code = get_si_code(tsk->thread.debugreg6);
 	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
-		send_sigtrap(regs, error_code, si_code);
+		send_sigtrap(regs, 0, si_code);
 	cond_local_irq_disable(regs);
 	debug_stack_usage_dec();
 
 exit:
-	nmi_exit();
+	if (user_mode(regs))
+		idtentry_exit(regs);
+	else
+		nmi_exit();
 	debug_exit(dr7);
 }
-NOKPROBE_SYMBOL(do_debug);
 
 /*
  * Note that we play around with the 'TS' bit in an attempt to get
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index c65aa4c..535dde1 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -616,7 +616,7 @@ struct trap_array_entry {
 	.ist_okay	= ist_ok }
 
 static struct trap_array_entry trap_array[] = {
-	{ debug,                       xen_xendebug,                    true },
+	TRAP_ENTRY_REDIR(exc_debug, exc_xendebug,	true  ),
 	{ double_fault,                xen_double_fault,                true },
 #ifdef CONFIG_X86_MCE
 	TRAP_ENTRY(exc_machine_check,			true  ),
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 04fa01b..9999ea3 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -29,8 +29,8 @@ _ASM_NOKPROBE(xen_\name)
 .endm
 
 xen_pv_trap asm_exc_divide_error
-xen_pv_trap debug
-xen_pv_trap xendebug
+xen_pv_trap asm_exc_debug
+xen_pv_trap asm_exc_xendebug
 xen_pv_trap asm_exc_int3
 xen_pv_trap asm_exc_xennmi
 xen_pv_trap asm_exc_overflow

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [tip: x86/entry] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub
  2020-05-05 13:49 ` [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub Thomas Gleixner
  2020-05-15  5:27   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     f951cbcf04fc9c09feaa88791ad1e0d5f9f0f000
Gitweb:        https://git.kernel.org/tip/f951cbcf04fc9c09feaa88791ad1e0d5f9f0f000
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:27 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:11 +02:00

x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub

The C entry points do not expect an error code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.992621707@linutronix.de


---
 arch/x86/entry/entry_64.S | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f47629a..eeb4285 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -642,7 +642,6 @@ SYM_CODE_START(\asmsym)
 	.endif
 
 	movq	%rsp, %rdi		/* pt_regs pointer */
-	xorl	%esi, %esi		/* Clear the error code */
 
 	.if \vector == X86_TRAP_DB
 		subq	$DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [tip: x86/entry] x86/db: Split out dr6/7 handling
  2020-05-05 13:49 ` [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling Thomas Gleixner
                     ` (2 preceding siblings ...)
  2020-05-15  5:37   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Peter Zijlstra
  3 siblings, 0 replies; 94+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra, Thomas Gleixner, Alexandre Chartre,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     9a3d7c76d28e55173be5dd7cadbe8760fb814afa
Gitweb:        https://git.kernel.org/tip/9a3d7c76d28e55173be5dd7cadbe8760fb814afa
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 06 Apr 2020 21:02:56 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:10 +02:00

x86/db: Split out dr6/7 handling

DR6/7 should be handled before nmi_enter() is invoked and restored after
nmi_exit() to minimize the exposure.

Split it out into helper inlines and bring it into the correct order.
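
The intended ordering can be modelled in user space roughly as follows. The
fake_dr7 variable and all model_* functions are hypothetical; the real helpers
use get_debugreg()/set_debugreg() plus compiler barriers, which this sketch
does not reproduce.

```c
#include <assert.h>

/* Hypothetical stand-in for the hardware DR7 register. */
static unsigned long fake_dr7;

/* Model of debug_enter(): save DR7, then disable all breakpoints. */
static void model_debug_enter(unsigned long *dr7)
{
	assert(dr7);
	*dr7 = fake_dr7;
	fake_dr7 = 0;
}

/* Model of debug_exit(): restore the saved DR7 value. */
static void model_debug_exit(unsigned long dr7)
{
	fake_dr7 = dr7;
}

/*
 * Model of the handler body. Reports the DR7 value visible inside the
 * section that nmi_enter()/nmi_exit() would bracket and returns the
 * DR7 value after the handler has finished.
 */
static unsigned long model_do_debug(unsigned long initial_dr7,
				    unsigned long *dr7_in_handler)
{
	unsigned long saved;

	assert(dr7_in_handler);
	fake_dr7 = initial_dr7;

	model_debug_enter(&saved);
	/* nmi_enter() would be invoked here */
	*dr7_in_handler = fake_dr7;
	/* nmi_exit() would be invoked here */
	model_debug_exit(saved);

	return fake_dr7;
}
```

In the model, breakpoints are already disabled when the bracketed section
starts and are only re-enabled after it ends.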

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.808628211@linutronix.de


---
 arch/x86/kernel/hw_breakpoint.c |  6 +---
 arch/x86/kernel/traps.c         | 75 +++++++++++++++++++++++---------
 2 files changed, 57 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index d42fc0e..9ddf441 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -464,7 +464,7 @@ static int hw_breakpoint_handler(struct die_args *args)
 {
 	int i, cpu, rc = NOTIFY_STOP;
 	struct perf_event *bp;
-	unsigned long dr7, dr6;
+	unsigned long dr6;
 	unsigned long *dr6_p;
 
 	/* The DR6 value is pointed by args->err */
@@ -479,9 +479,6 @@ static int hw_breakpoint_handler(struct die_args *args)
 	if ((dr6 & DR_TRAP_BITS) == 0)
 		return NOTIFY_DONE;
 
-	get_debugreg(dr7, 7);
-	/* Disable breakpoints during exception handling */
-	set_debugreg(0UL, 7);
 	/*
 	 * Assert that local interrupts are disabled
 	 * Reset the DRn bits in the virtualized register value.
@@ -538,7 +535,6 @@ static int hw_breakpoint_handler(struct die_args *args)
 	    (dr6 & (~DR_TRAP_BITS)))
 		rc = NOTIFY_DONE;
 
-	set_debugreg(dr7, 7);
 	put_cpu();
 
 	return rc;
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 21c8cfc..de5120e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -700,6 +700,57 @@ static bool is_sysenter_singlestep(struct pt_regs *regs)
 #endif
 }
 
+static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
+{
+	/*
+	 * Disable breakpoints during exception handling; recursive exceptions
+	 * are exceedingly 'fun'.
+	 *
+	 * Since this function is NOKPROBE, and that also applies to
+	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
+	 * HW_BREAKPOINT_W on our stack)
+	 *
+	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
+	 * includes the entry stack is excluded for everything.
+	 */
+	get_debugreg(*dr7, 7);
+	set_debugreg(0, 7);
+
+	/*
+	 * Ensure the compiler doesn't lower the above statements into
+	 * the critical section; disabling breakpoints late would not
+	 * be good.
+	 */
+	barrier();
+
+	/*
+	 * The Intel SDM says:
+	 *
+	 *   Certain debug exceptions may clear bits 0-3. The remaining
+	 *   contents of the DR6 register are never cleared by the
+	 *   processor. To avoid confusion in identifying debug
+	 *   exceptions, debug handlers should clear the register before
+	 *   returning to the interrupted task.
+	 *
+	 * Keep it simple: clear DR6 immediately.
+	 */
+	get_debugreg(*dr6, 6);
+	set_debugreg(0, 6);
+	/* Filter out all the reserved bits which are preset to 1 */
+	*dr6 &= ~DR6_RESERVED;
+}
+
+static __always_inline void debug_exit(unsigned long dr7)
+{
+	/*
+	 * Ensure the compiler doesn't raise this statement into
+	 * the critical section; enabling breakpoints early would
+	 * not be good.
+	 */
+	barrier();
+	set_debugreg(dr7, 7);
+}
+
 /*
  * Our handling of the processor debug registers is non-trivial.
  * We do not clear them on entry and exit from the kernel. Therefore
@@ -727,28 +778,13 @@ static bool is_sysenter_singlestep(struct pt_regs *regs)
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 {
 	struct task_struct *tsk = current;
+	unsigned long dr6, dr7;
 	int user_icebp = 0;
-	unsigned long dr6;
 	int si_code;
 
-	nmi_enter();
-
-	get_debugreg(dr6, 6);
-	/*
-	 * The Intel SDM says:
-	 *
-	 *   Certain debug exceptions may clear bits 0-3. The remaining
-	 *   contents of the DR6 register are never cleared by the
-	 *   processor. To avoid confusion in identifying debug
-	 *   exceptions, debug handlers should clear the register before
-	 *   returning to the interrupted task.
-	 *
-	 * Keep it simple: clear DR6 immediately.
-	 */
-	set_debugreg(0, 6);
+	debug_enter(&dr6, &dr7);
 
-	/* Filter out all the reserved bits which are preset to 1 */
-	dr6 &= ~DR6_RESERVED;
+	nmi_enter();
 
 	/*
 	 * The SDM says "The processor clears the BTF flag when it
@@ -786,7 +822,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 #endif
 
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, error_code,
-							SIGTRAP) == NOTIFY_STOP)
+		       SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 
 	/*
@@ -825,6 +861,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 
 exit:
 	nmi_exit();
+	debug_exit(dr7);
 }
 NOKPROBE_SYMBOL(do_debug);
 

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* [tip: x86/entry] x86/nmi: Protect NMI entry against instrumentation
  2020-05-05 13:49 ` [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation Thomas Gleixner
  2020-05-15  5:26   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     3a301dc808b77c6ca25be35660f2fcb13389b038
Gitweb:        https://git.kernel.org/tip/3a301dc808b77c6ca25be35660f2fcb13389b038
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 06 Apr 2020 15:55:06 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:10 +02:00

x86/nmi: Protect NMI entry against instrumentation

Mark all functions in the fragile code parts noinstr or force inlining so
they can't be instrumented.

Also make the hardware latency tracer invocation explicit outside of the
non-instrumentable section.
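
One consequence for code structure is that every exit path of a region opened
with instrumentation_begin() has to pass through instrumentation_end(), which
is why early returns become jumps to a common label. A minimal user-space
sketch of that pattern follows. The depth counter and the model_* names are
hypothetical; the real annotations are markers for objtool, not runtime
counters.

```c
#include <assert.h>

/* Hypothetical runtime counter standing in for the objtool annotations. */
static int instr_depth;

static void model_instrumentation_begin(void)
{
	instr_depth++;
}

static void model_instrumentation_end(void)
{
	assert(instr_depth > 0);	/* end without a matching begin */
	instr_depth--;
}

/*
 * Model of a noinstr function with an instrumentable region.
 * Returns 1 if a local handler claimed the event, 2 otherwise.
 */
static int model_default_do_nmi(int handled)
{
	int ret;

	model_instrumentation_begin();

	if (handled) {
		ret = 1;
		goto out;	/* not a plain return: the region must be closed */
	}

	ret = 2;
out:
	model_instrumentation_end();
	return ret;
}
```

Whichever path is taken, the counter is back at zero when the function
returns, mirroring the balanced begin/end pairs in the patch.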

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.716186134@linutronix.de


---
 arch/x86/include/asm/desc.h  |  8 ++++----
 arch/x86/kernel/cpu/common.c |  6 ++----
 arch/x86/kernel/nmi.c        | 15 +++++++++------
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index 085a2dd..d6c3d34 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -214,7 +214,7 @@ static inline void native_load_gdt(const struct desc_ptr *dtr)
 	asm volatile("lgdt %0"::"m" (*dtr));
 }
 
-static inline void native_load_idt(const struct desc_ptr *dtr)
+static __always_inline void native_load_idt(const struct desc_ptr *dtr)
 {
 	asm volatile("lidt %0"::"m" (*dtr));
 }
@@ -392,7 +392,7 @@ extern unsigned long system_vectors[];
 
 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(u32, debug_idt_ctr);
-static inline bool is_debug_idt_enabled(void)
+static __always_inline bool is_debug_idt_enabled(void)
 {
 	if (this_cpu_read(debug_idt_ctr))
 		return true;
@@ -400,7 +400,7 @@ static inline bool is_debug_idt_enabled(void)
 	return false;
 }
 
-static inline void load_debug_idt(void)
+static __always_inline void load_debug_idt(void)
 {
 	load_idt((const struct desc_ptr *)&debug_idt_descr);
 }
@@ -422,7 +422,7 @@ static inline void load_debug_idt(void)
  * that doesn't need to disable interrupts, as nothing should be
  * bothering the CPU then.
  */
-static inline void load_current_idt(void)
+static __always_inline void load_current_idt(void)
 {
 	if (is_debug_idt_enabled())
 		load_debug_idt();
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bed0cb8..6751b81 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1692,21 +1692,19 @@ void syscall_init(void)
 DEFINE_PER_CPU(int, debug_stack_usage);
 DEFINE_PER_CPU(u32, debug_idt_ctr);
 
-void debug_stack_set_zero(void)
+noinstr void debug_stack_set_zero(void)
 {
 	this_cpu_inc(debug_idt_ctr);
 	load_current_idt();
 }
-NOKPROBE_SYMBOL(debug_stack_set_zero);
 
-void debug_stack_reset(void)
+noinstr void debug_stack_reset(void)
 {
 	if (WARN_ON(!this_cpu_read(debug_idt_ctr)))
 		return;
 	if (this_cpu_dec_return(debug_idt_ctr) == 0)
 		load_current_idt();
 }
-NOKPROBE_SYMBOL(debug_stack_reset);
 
 #else	/* CONFIG_X86_64 */
 
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index d55e448..d18ec18 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -307,7 +307,7 @@ NOKPROBE_SYMBOL(unknown_nmi_error);
 static DEFINE_PER_CPU(bool, swallow_nmi);
 static DEFINE_PER_CPU(unsigned long, last_nmi_rip);
 
-static void default_do_nmi(struct pt_regs *regs)
+static noinstr void default_do_nmi(struct pt_regs *regs)
 {
 	unsigned char reason = 0;
 	int handled;
@@ -333,6 +333,8 @@ static void default_do_nmi(struct pt_regs *regs)
 
 	__this_cpu_write(last_nmi_rip, regs->ip);
 
+	instrumentation_begin();
+
 	handled = nmi_handle(NMI_LOCAL, regs);
 	__this_cpu_add(nmi_stats.normal, handled);
 	if (handled) {
@@ -346,7 +348,7 @@ static void default_do_nmi(struct pt_regs *regs)
 		 */
 		if (handled > 1)
 			__this_cpu_write(swallow_nmi, true);
-		return;
+		goto out;
 	}
 
 	/*
@@ -378,7 +380,7 @@ static void default_do_nmi(struct pt_regs *regs)
 #endif
 		__this_cpu_add(nmi_stats.external, 1);
 		raw_spin_unlock(&nmi_reason_lock);
-		return;
+		goto out;
 	}
 	raw_spin_unlock(&nmi_reason_lock);
 
@@ -416,8 +418,10 @@ static void default_do_nmi(struct pt_regs *regs)
 		__this_cpu_add(nmi_stats.swallow, 1);
 	else
 		unknown_nmi_error(reason, regs);
+
+out:
+	instrumentation_end();
 }
-NOKPROBE_SYMBOL(default_do_nmi);
 
 /*
  * NMIs can page fault or hit breakpoints which will cause it to lose
@@ -489,7 +493,7 @@ static DEFINE_PER_CPU(unsigned long, nmi_cr2);
  */
 static DEFINE_PER_CPU(int, update_debug_stack);
 
-static bool notrace is_debug_stack(unsigned long addr)
+static noinstr bool is_debug_stack(unsigned long addr)
 {
 	struct cea_exception_stacks *cs = __this_cpu_read(cea_exception_stacks);
 	unsigned long top = CEA_ESTACK_TOP(cs, DB);
@@ -504,7 +508,6 @@ static bool notrace is_debug_stack(unsigned long addr)
 	 */
 	return addr >= bot && addr < top;
 }
-NOKPROBE_SYMBOL(is_debug_stack);
 #endif
 
 DEFINE_IDTENTRY_NMI(exc_nmi)

* [tip: x86/entry] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV
  2020-05-05 13:49 ` [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV Thomas Gleixner
  2020-05-15  5:25   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     9769a24d77c5708376bf99e04cffe5c764cd3e40
Gitweb:        https://git.kernel.org/tip/9769a24d77c5708376bf99e04cffe5c764cd3e40
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:24 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:09 +02:00

x86/idtentry: Provide IDTENTRY_XEN for XEN/PV

XEN/PV has special wrappers for NMI and DB exceptions. They redirect these
exceptions through regular IDTENTRY points. Provide the necessary IDTENTRY
macros to make this work.
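
As a rough illustration of the C-side declaration in this patch, the
stand-alone sketch below shows how token pasting turns the func argument
into the two redirect symbol names. The MODEL_* macros and
model_is_xen_nmi_redirect() are hypothetical helpers made up for this
example; the kernel macro declares asmlinkage functions rather than
building strings.

```c
#include <string.h>

/* Hypothetical helpers, not kernel code: stringify the pasted symbol names. */
#define MODEL_STR(x)			#x
#define MODEL_XEN_ASM_NAME(func)	MODEL_STR(xen_asm_exc_xen##func)
#define MODEL_ASM_NAME(func)		MODEL_STR(asm_exc_xen##func)

/* Returns 1 if sym is one of the two names derived from func == nmi. */
static int model_is_xen_nmi_redirect(const char *sym)
{
	return !strcmp(sym, MODEL_XEN_ASM_NAME(nmi)) ||
	       !strcmp(sym, MODEL_ASM_NAME(nmi));
}
```

With func == nmi the pasted names are xen_asm_exc_xennmi and
asm_exc_xennmi, which are the symbols the NMI conversion later refers to.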

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.518622698@linutronix.de


---
 arch/x86/include/asm/idtentry.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 36fe964..2315eec 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -168,6 +168,18 @@ __visible noinstr void func(struct pt_regs *regs)
 #define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
 
+/**
+ * DECLARE_IDTENTRY_XEN - Declare functions for XEN redirect IDT entry points
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Used for xennmi and xendebug redirections. No DEFINE as this is all ASM
+ * indirection magic.
+ */
+#define DECLARE_IDTENTRY_XEN(vector, func)				\
+	asmlinkage void xen_asm_exc_xen##func(void);			\
+	asmlinkage void asm_exc_xen##func(void)
+
 #else /* !__ASSEMBLY__ */
 
 /*
@@ -203,6 +215,10 @@ __visible noinstr void func(struct pt_regs *regs)
 /* No ASM code emitted for NMI */
 #define DECLARE_IDTENTRY_NMI(vector, func)
 
+/* XEN NMI and DB wrapper */
+#define DECLARE_IDTENTRY_XEN(vector, func)				\
+	idtentry vector asm_exc_xen##func exc_##func has_error_code=0 sane=1
+
 #endif /* __ASSEMBLY__ */
 
 /*

* [tip: x86/entry] x86/entry: Convert NMI to IDTENTRY_NMI
  2020-05-05 13:49 ` [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI Thomas Gleixner
  2020-05-15  5:26   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     b209b183b6db506737d2bf6c76a866fe954e513a
Gitweb:        https://git.kernel.org/tip/b209b183b6db506737d2bf6c76a866fe954e513a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:25 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:09 +02:00

x86/entry: Convert NMI to IDTENTRY_NMI

Convert #NMI to IDTENTRY_NMI:
  - Implement the C entry point with DEFINE_IDTENTRY_NMI
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.609932306@linutronix.de


---
 arch/x86/entry/entry_32.S       |  8 ++++----
 arch/x86/entry/entry_64.S       | 15 +++++++--------
 arch/x86/include/asm/idtentry.h |  4 ++++
 arch/x86/include/asm/traps.h    |  3 ---
 arch/x86/kernel/idt.c           |  4 ++--
 arch/x86/kernel/nmi.c           |  4 +---
 arch/x86/xen/enlighten_pv.c     |  7 ++++++-
 arch/x86/xen/xen-asm_64.S       |  2 +-
 8 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 4dd3d70..d4961ca 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1545,7 +1545,7 @@ SYM_CODE_END(double_fault)
  * switched stacks.  We handle both conditions by simply checking whether we
  * interrupted kernel code running on the SYSENTER stack.
  */
-SYM_CODE_START(nmi)
+SYM_CODE_START(asm_exc_nmi)
 	ASM_CLAC
 
 #ifdef CONFIG_X86_ESPFIX32
@@ -1574,7 +1574,7 @@ SYM_CODE_START(nmi)
 	jb	.Lnmi_from_sysenter_stack
 
 	/* Not on SYSENTER stack. */
-	call	do_nmi
+	call	exc_nmi
 	jmp	.Lnmi_return
 
 .Lnmi_from_sysenter_stack:
@@ -1584,7 +1584,7 @@ SYM_CODE_START(nmi)
 	 */
 	movl	%esp, %ebx
 	movl	PER_CPU_VAR(cpu_current_top_of_stack), %esp
-	call	do_nmi
+	call	exc_nmi
 	movl	%ebx, %esp
 
 .Lnmi_return:
@@ -1638,7 +1638,7 @@ SYM_CODE_START(nmi)
 	lss	(1+5+6)*4(%esp), %esp			# back to espfix stack
 	jmp	.Lirq_return
 #endif
-SYM_CODE_END(nmi)
+SYM_CODE_END(asm_exc_nmi)
 
 .pushsection .text, "ax"
 SYM_CODE_START(rewind_stack_do_exit)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 5007b97..3d7f2cc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1079,7 +1079,6 @@ idtentry_df	X86_TRAP_DF		double_fault		do_double_fault
 
 #ifdef CONFIG_XEN_PV
 idtentry	512 /* dummy */		hypervisor_callback	xen_do_hypervisor_callback	has_error_code=0
-idtentry	X86_TRAP_NMI		xennmi			do_nmi				has_error_code=0
 idtentry	X86_TRAP_DB		xendebug		do_debug			has_error_code=0
 #endif
 
@@ -1414,7 +1413,7 @@ SYM_CODE_END(error_return)
  *	%r14: Used to save/restore the CR3 of the interrupted context
  *	      when PAGE_TABLE_ISOLATION is in use.  Do not clobber.
  */
-SYM_CODE_START(nmi)
+SYM_CODE_START(asm_exc_nmi)
 	UNWIND_HINT_IRET_REGS
 
 	/*
@@ -1499,7 +1498,7 @@ SYM_CODE_START(nmi)
 
 	movq	%rsp, %rdi
 	movq	$-1, %rsi
-	call	do_nmi
+	call	exc_nmi
 
 	/*
 	 * Return back to user mode.  We must *not* do the normal exit
@@ -1556,7 +1555,7 @@ SYM_CODE_START(nmi)
 	 * end_repeat_nmi, then we are a nested NMI.  We must not
 	 * modify the "iret" frame because it's being written by
 	 * the outer NMI.  That's okay; the outer NMI handler is
-	 * about to about to call do_nmi anyway, so we can just
+	 * about to about to call exc_nmi() anyway, so we can just
 	 * resume the outer NMI.
 	 */
 
@@ -1675,7 +1674,7 @@ repeat_nmi:
 	 * RSP is pointing to "outermost RIP".  gsbase is unknown, but, if
 	 * we're repeating an NMI, gsbase has the same value that it had on
 	 * the first iteration.  paranoid_entry will load the kernel
-	 * gsbase if needed before we call do_nmi.  "NMI executing"
+	 * gsbase if needed before we call exc_nmi().  "NMI executing"
 	 * is zero.
 	 */
 	movq	$1, 10*8(%rsp)		/* Set "NMI executing". */
@@ -1709,10 +1708,10 @@ end_repeat_nmi:
 	call	paranoid_entry
 	UNWIND_HINT_REGS
 
-	/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
+	/* paranoidentry exc_nmi(), 0; without TRACE_IRQS_OFF */
 	movq	%rsp, %rdi
 	movq	$-1, %rsi
-	call	do_nmi
+	call	exc_nmi
 
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
@@ -1749,7 +1748,7 @@ nmi_restore:
 	 * about espfix64 on the way back to kernel mode.
 	 */
 	iretq
-SYM_CODE_END(nmi)
+SYM_CODE_END(asm_exc_nmi)
 
 #ifndef CONFIG_IA32_EMULATION
 /*
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 2315eec..1f067e6 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -258,4 +258,8 @@ DECLARE_IDTENTRY_RAW(X86_TRAP_BP,	exc_int3);
 DECLARE_IDTENTRY_MCE(X86_TRAP_MC,	exc_machine_check);
 #endif
 
+/* NMI */
+DECLARE_IDTENTRY_NMI(X86_TRAP_NMI,	exc_nmi);
+DECLARE_IDTENTRY_XEN(X86_TRAP_NMI,	nmi);
+
 #endif
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 6096db9..57b83ae 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -12,7 +12,6 @@
 #define dotraplinkage __visible
 
 asmlinkage void debug(void);
-asmlinkage void nmi(void);
 #ifdef CONFIG_X86_64
 asmlinkage void double_fault(void);
 #endif
@@ -20,14 +19,12 @@ asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
-asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #endif
 
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
-dotraplinkage void do_nmi(struct pt_regs *regs, long error_code);
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
 
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index 6b93840..d3fecd8 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -74,7 +74,7 @@ static const __initconst struct idt_data early_idts[] = {
  */
 static const __initconst struct idt_data def_idts[] = {
 	INTG(X86_TRAP_DE,		asm_exc_divide_error),
-	INTG(X86_TRAP_NMI,		nmi),
+	INTG(X86_TRAP_NMI,		asm_exc_nmi),
 	INTG(X86_TRAP_BR,		asm_exc_bounds),
 	INTG(X86_TRAP_UD,		asm_exc_invalid_op),
 	INTG(X86_TRAP_NM,		asm_exc_device_not_available),
@@ -186,7 +186,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
  */
 static const __initconst struct idt_data ist_idts[] = {
 	ISTG(X86_TRAP_DB,	debug,			IST_INDEX_DB),
-	ISTG(X86_TRAP_NMI,	nmi,			IST_INDEX_NMI),
+	ISTG(X86_TRAP_NMI,	asm_exc_nmi,		IST_INDEX_NMI),
 	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
 	ISTG(X86_TRAP_MC,	asm_exc_machine_check,	IST_INDEX_MCE),
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 6407ea2..d55e448 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -507,8 +507,7 @@ static bool notrace is_debug_stack(unsigned long addr)
 NOKPROBE_SYMBOL(is_debug_stack);
 #endif
 
-dotraplinkage notrace void
-do_nmi(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_NMI(exc_nmi)
 {
 	if (IS_ENABLED(CONFIG_SMP) && cpu_is_offline(smp_processor_id()))
 		return;
@@ -558,7 +557,6 @@ nmi_restart:
 	if (user_mode(regs))
 		mds_user_clear_cpu_buffers();
 }
-NOKPROBE_SYMBOL(do_nmi);
 
 void stop_nmi(void)
 {
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index f116afe..c65aa4c 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -610,13 +610,18 @@ struct trap_array_entry {
 	.xen		= xen_asm_##func,		\
 	.ist_okay	= ist_ok }
 
+#define TRAP_ENTRY_REDIR(func, xenfunc, ist_ok) {	\
+	.orig		= asm_##func,			\
+	.xen		= xen_asm_##xenfunc,		\
+	.ist_okay	= ist_ok }
+
 static struct trap_array_entry trap_array[] = {
 	{ debug,                       xen_xendebug,                    true },
 	{ double_fault,                xen_double_fault,                true },
 #ifdef CONFIG_X86_MCE
 	TRAP_ENTRY(exc_machine_check,			true  ),
 #endif
-	{ nmi,                         xen_xennmi,                      true },
+	TRAP_ENTRY_REDIR(exc_nmi, exc_xennmi,		true  ),
 	TRAP_ENTRY(exc_int3,				false ),
 	TRAP_ENTRY(exc_overflow,			false ),
 #ifdef CONFIG_IA32_EMULATION
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 617ef3f..04fa01b 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -32,7 +32,7 @@ xen_pv_trap asm_exc_divide_error
 xen_pv_trap debug
 xen_pv_trap xendebug
 xen_pv_trap asm_exc_int3
-xen_pv_trap xennmi
+xen_pv_trap asm_exc_xennmi
 xen_pv_trap asm_exc_overflow
 xen_pv_trap asm_exc_bounds
 xen_pv_trap asm_exc_invalid_op

* [tip: x86/entry] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check
  2020-05-05 13:49 ` [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check Thomas Gleixner
  2020-05-15  5:24   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     89cee5d63761ee0e9ca00631793eb16c8931421b
Gitweb:        https://git.kernel.org/tip/89cee5d63761ee0e9ca00631793eb16c8931421b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Sat, 04 Apr 2020 15:39:13 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:08 +02:00

x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check

mce_check_crashing_cpu() is called right at the entry of the MCE
handler. It uses mce_rdmsr() and mce_wrmsr() which are wrappers around
rdmsr() and wrmsr() to handle the MCE error injection mechanism, which is
pointless in this context, i.e. when the MCE hits an offline CPU or the
system is already marked crashing.

The MSR access can also be traced, so use the untraceable variants. This
is also safe vs. XEN paravirt as these MSRs are not affected by XEN PV
modifications.
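
A minimal user-space sketch of the resulting check, under stated
assumptions: the raw write takes the 64-bit value as separate low and high
32-bit halves, which is why the replacement call passes two zeros. All
model_* names and the fake_msr variable are hypothetical stand-ins; real
code executes the rdmsr/wrmsr instructions and this model only mirrors
the control flow.

```c
#include <stdint.h>

/* Hypothetical stand-in for IA32_MCG_STATUS; not a real MSR access. */
static uint64_t fake_msr;

/* Model of a raw write: the value is passed as low/high 32-bit halves. */
static void model_wrmsr(uint32_t low, uint32_t high)
{
	fake_msr = ((uint64_t)high << 32) | low;
}

/* Model of a raw read: returns the full 64-bit value. */
static uint64_t model_rdmsr(void)
{
	return fake_msr;
}

/* Assumed for the model: RIPV is bit 0 of the status value. */
#define MODEL_MCG_STATUS_RIPV	(1ULL << 0)

/* Returns 1 and clears the status if RIPV is set, else returns 0. */
static int model_check_and_clear(void)
{
	uint64_t mcgstatus = model_rdmsr();

	if (mcgstatus & MODEL_MCG_STATUS_RIPV) {
		model_wrmsr(0, 0);
		return 1;
	}
	return 0;
}
```

The model has no injection or tracing hooks in the access path, which is
the property the patch is after for this early check.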

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.426347351@linutronix.de


---
 arch/x86/kernel/cpu/mce/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 842dd03..3177652 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1108,7 +1108,7 @@ static noinstr bool mce_check_crashing_cpu(void)
 	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
 
-		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
+		mcgstatus = __rdmsr(MSR_IA32_MCG_STATUS);
 
 		if (boot_cpu_data.x86_vendor == X86_VENDOR_ZHAOXIN) {
 			if (mcgstatus & MCG_STATUS_LMCES)
@@ -1116,7 +1116,7 @@ static noinstr bool mce_check_crashing_cpu(void)
 		}
 
 		if (mcgstatus & MCG_STATUS_RIPV) {
-			mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
+			__wrmsr(MSR_IA32_MCG_STATUS, 0, 0);
 			return true;
 		}
 	}

* [tip: x86/entry] x86/mce: Move nmi_enter/exit() into the entry point
  2020-05-05 13:49 ` [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point Thomas Gleixner
  2020-05-15  5:23   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     46dbb1443cd5c4e1c892b820d7a578995a1708cf
Gitweb:        https://git.kernel.org/tip/46dbb1443cd5c4e1c892b820d7a578995a1708cf
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Fri, 03 Apr 2020 22:37:31 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:08 +02:00

x86/mce: Move nmi_enter/exit() into the entry point

There is no reason to have nmi_enter/exit() in the actual MCE
handlers. Move them to the entry point. This also covers the initial
handler, which only prints and was not covered until now.
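
The pattern can be sketched in plain C as follows. This is a user-space
model, not the kernel code: every model_* name and the counters are
hypothetical, and the enter/exit helpers only count nesting depth. It
shows the entry point bracketing whichever handler is installed, so the
handlers themselves no longer need the calls.

```c
/* Hypothetical model state: nesting depth and what the handler observed. */
static int model_nmi_depth;
static int model_handler_calls;
static int model_depth_seen;

static void model_nmi_enter(void) { model_nmi_depth++; }
static void model_nmi_exit(void)  { model_nmi_depth--; }

/* A handler that "only prints": it contains no enter/exit calls itself. */
static void model_unexpected_machine_check(void)
{
	model_handler_calls++;
	model_depth_seen = model_nmi_depth;
}

/* Installed handler, analogous to a function pointer set at CPU setup. */
static void (*model_vector)(void) = model_unexpected_machine_check;

/* Entry point: the enter/exit bracket now lives here, around any handler. */
static void model_do_mce(void)
{
	model_nmi_enter();
	model_vector();
	model_nmi_exit();
}
```

Because the bracket sits in the entry point, a newly installed handler
gets the same coverage without repeating the calls.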

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.243936614@linutronix.de


---
 arch/x86/kernel/cpu/mce/core.c    | 26 +++++++++++++-------------
 arch/x86/kernel/cpu/mce/p5.c      |  4 ----
 arch/x86/kernel/cpu/mce/winchip.c |  4 ----
 3 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index e9265e2..f5993ed 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1100,8 +1100,10 @@ static void mce_clear_state(unsigned long *toclear)
  * kdump kernel establishing a new #MC handler where a broadcasted MCE
  * might not get handled properly.
  */
-static bool __mc_check_crashing_cpu(int cpu)
+static noinstr bool mce_check_crashing_cpu(void)
 {
+	unsigned int cpu = smp_processor_id();
+
 	if (cpu_is_offline(cpu) ||
 	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
@@ -1235,7 +1237,6 @@ void noinstr do_machine_check(struct pt_regs *regs, long error_code)
 	DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
 	DECLARE_BITMAP(toclear, MAX_NR_BANKS);
 	struct mca_config *cfg = &mca_cfg;
-	int cpu = smp_processor_id();
 	struct mce m, *final;
 	char *msg = NULL;
 	int worst = 0;
@@ -1264,11 +1265,6 @@ void noinstr do_machine_check(struct pt_regs *regs, long error_code)
 	 */
 	int lmce = 1;
 
-	if (__mc_check_crashing_cpu(cpu))
-		return;
-
-	nmi_enter();
-
 	this_cpu_inc(mce_exception_count);
 
 	mce_gather_info(&m, regs);
@@ -1356,7 +1352,7 @@ void noinstr do_machine_check(struct pt_regs *regs, long error_code)
 	sync_core();
 
 	if (worst != MCE_AR_SEVERITY && !kill_it)
-		goto out_ist;
+		return;
 
 	/* Fault was in user mode and we need to take some action */
 	if ((m.cs & 3) == 3) {
@@ -1373,9 +1369,6 @@ void noinstr do_machine_check(struct pt_regs *regs, long error_code)
 		if (!fixup_exception(regs, X86_TRAP_MC, error_code, 0))
 			mce_panic("Failed kernel mode recovery", &m, msg);
 	}
-
-out_ist:
-	nmi_exit();
 }
 EXPORT_SYMBOL_GPL(do_machine_check);
 
@@ -1912,11 +1905,18 @@ static void unexpected_machine_check(struct pt_regs *regs, long error_code)
 void (*machine_check_vector)(struct pt_regs *, long error_code) =
 						unexpected_machine_check;
 
-dotraplinkage notrace void do_mce(struct pt_regs *regs, long error_code)
+dotraplinkage noinstr void do_mce(struct pt_regs *regs, long error_code)
 {
+	if (machine_check_vector == do_machine_check &&
+	    mce_check_crashing_cpu())
+		return;
+
+	nmi_enter();
+
 	machine_check_vector(regs, error_code);
+
+	nmi_exit();
 }
-NOKPROBE_SYMBOL(do_mce);
 
 /*
  * Called for each booted CPU to set up machine checks.
diff --git a/arch/x86/kernel/cpu/mce/p5.c b/arch/x86/kernel/cpu/mce/p5.c
index 5ee94aa..dc29f0f 100644
--- a/arch/x86/kernel/cpu/mce/p5.c
+++ b/arch/x86/kernel/cpu/mce/p5.c
@@ -25,8 +25,6 @@ static void pentium_machine_check(struct pt_regs *regs, long error_code)
 {
 	u32 loaddr, hi, lotype;
 
-	nmi_enter();
-
 	rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
 	rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
 
@@ -39,8 +37,6 @@ static void pentium_machine_check(struct pt_regs *regs, long error_code)
 	}
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
-
-	nmi_exit();
 }
 
 /* Set up machine check reporting for processors with Intel style MCE: */
diff --git a/arch/x86/kernel/cpu/mce/winchip.c b/arch/x86/kernel/cpu/mce/winchip.c
index b3938c1..3f8f84b 100644
--- a/arch/x86/kernel/cpu/mce/winchip.c
+++ b/arch/x86/kernel/cpu/mce/winchip.c
@@ -19,12 +19,8 @@
 /* Machine check handler for WinChip C6: */
 static void winchip_machine_check(struct pt_regs *regs, long error_code)
 {
-	nmi_enter();
-
 	pr_emerg("CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
-
-	nmi_exit();
 }
 
 /* Set up machine check reporting on the Winchip C6 series */

* [tip: x86/entry] x86/entry: Convert Machine Check to IDTENTRY_IST
  2020-05-05 13:49 ` [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST Thomas Gleixner
  2020-05-15  5:24   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     aaa4947defff8e6e647e5c1dbc0b4b0dfd4c4359
Gitweb:        https://git.kernel.org/tip/aaa4947defff8e6e647e5c1dbc0b4b0dfd4c4359
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:23 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:08 +02:00

x86/entry: Convert Machine Check to IDTENTRY_IST

Convert #MC to IDTENTRY_MCE:
  - Implement the C entry points with DEFINE_IDTENTRY_MCE
  - Emit the ASM stub with DECLARE_IDTENTRY_MCE
  - Remove the ASM idtentry in 64bit
  - Remove the open coded ASM entry code in 32bit
  - Fixup the XEN/PV code
  - Remove the old prototypes
  - Remove the error code from *machine_check_vector() as
    it is always 0 and not used by any of the functions
    it can point to. Fixup all the functions as well.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.334980426@linutronix.de


---
 arch/x86/entry/entry_32.S          |  9 ---------
 arch/x86/entry/entry_64.S          |  3 ---
 arch/x86/include/asm/idtentry.h    |  4 ++++
 arch/x86/include/asm/mce.h         |  2 +-
 arch/x86/include/asm/traps.h       |  7 -------
 arch/x86/kernel/cpu/mce/core.c     | 23 ++++++++++++++---------
 arch/x86/kernel/cpu/mce/inject.c   |  4 ++--
 arch/x86/kernel/cpu/mce/internal.h |  2 +-
 arch/x86/kernel/cpu/mce/p5.c       |  2 +-
 arch/x86/kernel/cpu/mce/winchip.c  |  2 +-
 arch/x86/kernel/idt.c              | 10 +++++-----
 arch/x86/kvm/vmx/vmx.c             |  2 +-
 arch/x86/xen/enlighten_pv.c        |  2 +-
 arch/x86/xen/xen-asm_64.S          |  2 +-
 14 files changed, 32 insertions(+), 42 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index b9b0ddb..4dd3d70 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1297,15 +1297,6 @@ SYM_CODE_START(native_iret)
 SYM_CODE_END(native_iret)
 #endif
 
-#ifdef CONFIG_X86_MCE
-SYM_CODE_START(machine_check)
-	ASM_CLAC
-	pushl	$0
-	pushl	$do_mce
-	jmp	common_exception
-SYM_CODE_END(machine_check)
-#endif
-
 #ifdef CONFIG_XEN_PV
 SYM_FUNC_START(xen_hypervisor_callback)
 	/*
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 69ddd05..5007b97 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1074,9 +1074,6 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
 
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
-#ifdef CONFIG_X86_MCE
-idtentry_mce_db	X86_TRAP_MCE	 	machine_check		do_mce
-#endif
 idtentry_mce_db	X86_TRAP_DB		debug			do_debug
 idtentry_df	X86_TRAP_DF		double_fault		do_double_fault
 
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 3edd6d0..36fe964 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -238,4 +238,8 @@ DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_AC,	exc_alignment_check);
 /* Raw exception entries which need extra work */
 DECLARE_IDTENTRY_RAW(X86_TRAP_BP,	exc_int3);
 
+#ifdef CONFIG_X86_MCE
+DECLARE_IDTENTRY_MCE(X86_TRAP_MC,	exc_machine_check);
+#endif
+
 #endif
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index f9cea08..a001301 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -238,7 +238,7 @@ extern void mce_disable_bank(int bank);
 /*
  * Exception handler
  */
-void do_machine_check(struct pt_regs *, long);
+void do_machine_check(struct pt_regs *pt_regs);
 
 /*
  * Threshold handler
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 698285a..6096db9 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -18,25 +18,18 @@ asmlinkage void double_fault(void);
 #endif
 asmlinkage void page_fault(void);
 asmlinkage void async_page_fault(void);
-#ifdef CONFIG_X86_MCE
-asmlinkage void machine_check(void);
-#endif /* CONFIG_X86_MCE */
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
 asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
-#ifdef CONFIG_X86_MCE
-asmlinkage void xen_machine_check(void);
-#endif /* CONFIG_X86_MCE */
 #endif
 
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
 dotraplinkage void do_nmi(struct pt_regs *regs, long error_code);
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
-dotraplinkage void do_mce(struct pt_regs *regs, long error_code);
 
 #ifdef CONFIG_X86_64
 asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs);
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f5993ed..842dd03 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1232,7 +1232,7 @@ static void kill_me_maybe(struct callback_head *cb)
  * backing the user stack, tracing that reads the user stack will cause
  * potentially infinite recursion.
  */
-void noinstr do_machine_check(struct pt_regs *regs, long error_code)
+void noinstr do_machine_check(struct pt_regs *regs)
 {
 	DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
 	DECLARE_BITMAP(toclear, MAX_NR_BANKS);
@@ -1366,7 +1366,7 @@ void noinstr do_machine_check(struct pt_regs *regs, long error_code)
 			current->mce_kill_me.func = kill_me_now;
 		task_work_add(current, &current->mce_kill_me, true);
 	} else {
-		if (!fixup_exception(regs, X86_TRAP_MC, error_code, 0))
+		if (!fixup_exception(regs, X86_TRAP_MC, 0, 0))
 			mce_panic("Failed kernel mode recovery", &m, msg);
 	}
 }
@@ -1895,27 +1895,32 @@ bool filter_mce(struct mce *m)
 }
 
 /* Handle unconfigured int18 (should never happen) */
-static void unexpected_machine_check(struct pt_regs *regs, long error_code)
+static void unexpected_machine_check(struct pt_regs *regs)
 {
 	pr_err("CPU#%d: Unexpected int18 (Machine Check)\n",
 	       smp_processor_id());
 }
 
 /* Call the installed machine check handler for this CPU setup. */
-void (*machine_check_vector)(struct pt_regs *, long error_code) =
-						unexpected_machine_check;
+void (*machine_check_vector)(struct pt_regs *) = unexpected_machine_check;
 
-dotraplinkage noinstr void do_mce(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_MCE(exc_machine_check)
 {
 	if (machine_check_vector == do_machine_check &&
 	    mce_check_crashing_cpu())
 		return;
 
-	nmi_enter();
+	if (user_mode(regs))
+		idtentry_enter(regs);
+	else
+		nmi_enter();
 
-	machine_check_vector(regs, error_code);
+	machine_check_vector(regs);
 
-	nmi_exit();
+	if (user_mode(regs))
+		idtentry_exit(regs);
+	else
+		nmi_exit();
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 3413b41..0593b19 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -146,9 +146,9 @@ static void raise_exception(struct mce *m, struct pt_regs *pregs)
 		regs.cs = m->cs;
 		pregs = &regs;
 	}
-	/* in mcheck exeception handler, irq will be disabled */
+	/* do_machine_check() expects interrupts disabled -- at least */
 	local_irq_save(flags);
-	do_machine_check(pregs, 0);
+	do_machine_check(pregs);
 	local_irq_restore(flags);
 	m->finished = 0;
 }
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index 3b00817..b74ca4a 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -9,7 +9,7 @@
 #include <asm/mce.h>
 
 /* Pointer to the installed machine check handler for this CPU setup. */
-extern void (*machine_check_vector)(struct pt_regs *, long error_code);
+extern void (*machine_check_vector)(struct pt_regs *);
 
 enum severity_level {
 	MCE_NO_SEVERITY,
diff --git a/arch/x86/kernel/cpu/mce/p5.c b/arch/x86/kernel/cpu/mce/p5.c
index dc29f0f..eaebc4c 100644
--- a/arch/x86/kernel/cpu/mce/p5.c
+++ b/arch/x86/kernel/cpu/mce/p5.c
@@ -21,7 +21,7 @@
 int mce_p5_enabled __read_mostly;
 
 /* Machine check handler for Pentium class Intel CPUs: */
-static void pentium_machine_check(struct pt_regs *regs, long error_code)
+static void pentium_machine_check(struct pt_regs *regs)
 {
 	u32 loaddr, hi, lotype;
 
diff --git a/arch/x86/kernel/cpu/mce/winchip.c b/arch/x86/kernel/cpu/mce/winchip.c
index 3f8f84b..90e3d60 100644
--- a/arch/x86/kernel/cpu/mce/winchip.c
+++ b/arch/x86/kernel/cpu/mce/winchip.c
@@ -17,7 +17,7 @@
 #include "internal.h"
 
 /* Machine check handler for WinChip C6: */
-static void winchip_machine_check(struct pt_regs *regs, long error_code)
+static void winchip_machine_check(struct pt_regs *regs)
 {
 	pr_emerg("CPU0: Machine Check Exception.\n");
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index 9ca8af6..6b93840 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -96,7 +96,7 @@ static const __initconst struct idt_data def_idts[] = {
 	INTG(X86_TRAP_DB,		debug),
 
 #ifdef CONFIG_X86_MCE
-	INTG(X86_TRAP_MC,		machine_check),
+	INTG(X86_TRAP_MC,		asm_exc_machine_check),
 #endif
 
 	SYSG(X86_TRAP_OF,		asm_exc_overflow),
@@ -185,11 +185,11 @@ gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
  * cpu_init() when the TSS has been initialized.
  */
 static const __initconst struct idt_data ist_idts[] = {
-	ISTG(X86_TRAP_DB,	debug,		IST_INDEX_DB),
-	ISTG(X86_TRAP_NMI,	nmi,		IST_INDEX_NMI),
-	ISTG(X86_TRAP_DF,	double_fault,	IST_INDEX_DF),
+	ISTG(X86_TRAP_DB,	debug,			IST_INDEX_DB),
+	ISTG(X86_TRAP_NMI,	nmi,			IST_INDEX_NMI),
+	ISTG(X86_TRAP_DF,	double_fault,		IST_INDEX_DF),
 #ifdef CONFIG_X86_MCE
-	ISTG(X86_TRAP_MC,	machine_check,	IST_INDEX_MCE),
+	ISTG(X86_TRAP_MC,	asm_exc_machine_check,	IST_INDEX_MCE),
 #endif
 };
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c2c6335..513378c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4578,7 +4578,7 @@ static void kvm_machine_check(void)
 		.flags = X86_EFLAGS_IF,
 	};
 
-	do_machine_check(&regs, 0);
+	do_machine_check(&regs);
 #endif
 }
 
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 725d550..f116afe 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -614,7 +614,7 @@ static struct trap_array_entry trap_array[] = {
 	{ debug,                       xen_xendebug,                    true },
 	{ double_fault,                xen_double_fault,                true },
 #ifdef CONFIG_X86_MCE
-	{ machine_check,               xen_machine_check,               true },
+	TRAP_ENTRY(exc_machine_check,			true  ),
 #endif
 	{ nmi,                         xen_xennmi,                      true },
 	TRAP_ENTRY(exc_int3,				false ),
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 44f5556..617ef3f 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -48,7 +48,7 @@ xen_pv_trap asm_exc_spurious_interrupt_bug
 xen_pv_trap asm_exc_coprocessor_error
 xen_pv_trap asm_exc_alignment_check
 #ifdef CONFIG_X86_MCE
-xen_pv_trap machine_check
+xen_pv_trap asm_exc_machine_check
 #endif /* CONFIG_X86_MCE */
 xen_pv_trap asm_exc_simd_coprocessor_error
 #ifdef CONFIG_IA32_EMULATION


* [tip: x86/entry] x86/idtentry: Provide IDTENTRY_IST
  2020-05-05 13:49 ` [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST Thomas Gleixner
  2020-05-14 16:39   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     2f2ed27cb62200e4faa2a088131e972e26b5b585
Gitweb:        https://git.kernel.org/tip/2f2ed27cb62200e4faa2a088131e972e26b5b585
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:33:22 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:07 +02:00

x86/idtentry: Provide IDTENTRY_IST

Same as IDTENTRY but for exceptions which run on Interrupt Stacks (IST) on
64bit. For 32bit this maps to IDTENTRY.

There are 3 variants which will be used:
      IDTENTRY_MCE
      IDTENTRY_DB
      IDTENTRY_NMI

These map to IDTENTRY_IST, but only the MCE and DB variants emit ASM code,
as the NMI entry still needs hand-crafted ASM.

The function defines do not contain any idtenter/exit calls as these
exceptions need special treatment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135314.137125609@linutronix.de



---
 arch/x86/include/asm/idtentry.h | 54 ++++++++++++++++++++++++++++++++-
 1 file changed, 54 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 3dc4d5b..3edd6d0 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -132,6 +132,42 @@ static __always_inline void __##func(struct pt_regs *regs,		\
 #define DEFINE_IDTENTRY_RAW(func)					\
 __visible noinstr void func(struct pt_regs *regs)
 
+#ifdef CONFIG_X86_64
+/**
+ * DECLARE_IDTENTRY_IST - Declare functions for IST handling IDT entry points
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY_RAW
+ */
+#define DECLARE_IDTENTRY_IST(vector, func)				\
+	DECLARE_IDTENTRY_RAW(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_IST - Emit code for IST entry points
+ * @func:	Function name of the entry point
+ *
+ * Maps to DEFINE_IDTENTRY_RAW
+ */
+#define DEFINE_IDTENTRY_IST(func)					\
+	DEFINE_IDTENTRY_RAW(func)
+
+#else	/* CONFIG_X86_64 */
+/* Maps to a regular IDTENTRY on 32bit for now */
+# define DECLARE_IDTENTRY_IST		DECLARE_IDTENTRY
+# define DEFINE_IDTENTRY_IST		DEFINE_IDTENTRY
+#endif	/* !CONFIG_X86_64 */
+
+/* C-Code mapping */
+#define DECLARE_IDTENTRY_MCE		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_MCE		DEFINE_IDTENTRY_IST
+
+#define DECLARE_IDTENTRY_NMI		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_IST
+
+#define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
+
 #else /* !__ASSEMBLY__ */
 
 /*
@@ -149,6 +185,24 @@ __visible noinstr void func(struct pt_regs *regs)
 #define DECLARE_IDTENTRY_RAW(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
 
+#ifdef CONFIG_X86_64
+# define DECLARE_IDTENTRY_MCE(vector, func)				\
+	idtentry_mce_db vector asm_##func func
+
+# define DECLARE_IDTENTRY_DEBUG(vector, func)				\
+	idtentry_mce_db vector asm_##func func
+
+#else
+# define DECLARE_IDTENTRY_MCE(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+
+# define DECLARE_IDTENTRY_DEBUG(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+#endif
+
+/* No ASM code emitted for NMI */
+#define DECLARE_IDTENTRY_NMI(vector, func)
+
 #endif /* __ASSEMBLY__ */
 
 /*


* [tip: x86/entry] x86/traps: Split int3 handler up
  2020-05-05 13:49 ` [patch V4 part 4 07/24] x86/traps: Split int3 handler up Thomas Gleixner
  2020-05-14  5:03   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Thomas Gleixner, Alexandre Chartre, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     f4f6b66fd8011eb5f24dec936faaa4cab2ca7ebc
Gitweb:        https://git.kernel.org/tip/f4f6b66fd8011eb5f24dec936faaa4cab2ca7ebc
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Thu, 05 Mar 2020 16:09:52 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:07 +02:00

x86/traps: Split int3 handler up

Split the int3 handler into a kernel part and a user part, which makes the
code flow simpler to understand.
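
The resulting control flow can be sketched in user space as follows. This is
only an illustration under stated assumptions: struct trap, enum outcome and
the handle_*() names are hypothetical stand-ins, not kernel interfaces, and
the notifier chain is reduced to a single "claimed" flag.

```c
#include <assert.h>
#include <stdbool.h>

enum outcome { HANDLED, SIGTRAP_SENT, DIED };

/* Hypothetical stand-in for the trap state seen by the handler. */
struct trap {
	bool from_user;	/* trap was raised in user mode */
	bool claimed;	/* a debugger/kprobe style hook owns this int3 */
};

/* Common part: ask the registered hooks whether they own this int3. */
static bool handle_common(const struct trap *t)
{
	return t->claimed;
}

/* User part: an unclaimed int3 turns into a SIGTRAP for the task. */
static enum outcome handle_user(const struct trap *t)
{
	if (handle_common(t))
		return HANDLED;
	return SIGTRAP_SENT;
}

/* Entry point: pick the path once, based on where the trap came from. */
static enum outcome handle_int3(const struct trap *t)
{
	if (t->from_user)
		return handle_user(t);

	/* Kernel part: an unclaimed int3 in kernel mode is fatal. */
	return handle_common(t) ? HANDLED : DIED;
}
```

Deciding user vs. kernel once at the top means each branch can pair its own
enter/exit helpers without a shared exit label.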

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Link: https://lkml.kernel.org/r/20200505135314.045220765@linutronix.de


---
 arch/x86/kernel/traps.c | 68 +++++++++++++++++++++++-----------------
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 0ad12df..21c8cfc 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -568,6 +568,35 @@ exit:
 	cond_local_irq_disable(regs);
 }
 
+static bool do_int3(struct pt_regs *regs)
+{
+	int res;
+
+#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
+	if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
+			 SIGTRAP) == NOTIFY_STOP)
+		return true;
+#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
+
+#ifdef CONFIG_KPROBES
+	if (kprobe_int3_handler(regs))
+		return true;
+#endif
+	res = notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP, SIGTRAP);
+
+	return res == NOTIFY_STOP;
+}
+
+static void do_int3_user(struct pt_regs *regs)
+{
+	if (do_int3(regs))
+		return;
+
+	cond_local_irq_enable(regs);
+	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
+	cond_local_irq_disable(regs);
+}
+
 DEFINE_IDTENTRY_RAW(exc_int3)
 {
 	/*
@@ -585,37 +614,20 @@ DEFINE_IDTENTRY_RAW(exc_int3)
 	 * because the INT3 could have been hit in any context including
 	 * NMI.
 	 */
-	if (user_mode(regs))
+	if (user_mode(regs)) {
 		idtentry_enter(regs);
-	else
-		nmi_enter();
-
-	instrumentation_begin();
-#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
-	if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
-				SIGTRAP) == NOTIFY_STOP)
-		goto exit;
-#endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
-
-#ifdef CONFIG_KPROBES
-	if (kprobe_int3_handler(regs))
-		goto exit;
-#endif
-
-	if (notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
-			SIGTRAP) == NOTIFY_STOP)
-		goto exit;
-
-	cond_local_irq_enable(regs);
-	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
-	cond_local_irq_disable(regs);
-
-exit:
-	instrumentation_end();
-	if (user_mode(regs))
+		instrumentation_begin();
+		do_int3_user(regs);
+		instrumentation_end();
 		idtentry_exit(regs);
-	else
+	} else {
+		nmi_enter();
+		instrumentation_begin();
+		if (!do_int3(regs))
+			die("int3", regs, 0);
+		instrumentation_end();
 		nmi_exit();
+	}
 }
 
 #ifdef CONFIG_X86_64


* [tip: x86/entry] x86/idtentry: Provide IDTENTRY_RAW
  2020-05-05 13:49 ` [patch V4 part 4 05/24] x86/entry: Provide IDTENTRY_RAW Thomas Gleixner
  2020-05-14  4:59   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     e448b97001b48f410dd9843f4611e879db261ee2
Gitweb:        https://git.kernel.org/tip/e448b97001b48f410dd9843f4611e879db261ee2
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 04 Mar 2020 15:22:09 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:06 +02:00

x86/idtentry: Provide IDTENTRY_RAW

Some exception handlers need to do extra work before any of the entry
helpers are invoked. Provide IDTENTRY_RAW for this.
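
A minimal sketch of the "macro acts as function definition" idea, assuming a
user-space build. DEFINE_ENTRY_RAW, struct fake_regs, entry_enter() and
entry_exit() are hypothetical names chosen for illustration; the real macro
additionally applies __visible and noinstr and takes struct pt_regs.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pt_regs; records the order of steps. */
struct fake_regs {
	int trace[4];
	int n;
};

/* Expands to a function header; the body follows in curly brackets. */
#define DEFINE_ENTRY_RAW(func)	void func(struct fake_regs *regs)

/* Hypothetical entry/exit helpers that just log their invocation. */
static void entry_enter(struct fake_regs *regs) { regs->trace[regs->n++] = 1; }
static void entry_exit(struct fake_regs *regs)  { regs->trace[regs->n++] = 3; }

DEFINE_ENTRY_RAW(exc_example)
{
	/* Extra work that must run before the entry helper goes here. */
	regs->trace[regs->n++] = 0;

	/* The raw variant leaves the enter/exit calls to the body. */
	entry_enter(regs);
	regs->trace[regs->n++] = 2;	/* the actual handler work */
	entry_exit(regs);
}
```

Because the macro does not wrap the body, the handler itself decides what
runs before the enter helper and which enter/exit pair is appropriate.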

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135313.830540017@linutronix.de



---
 arch/x86/include/asm/idtentry.h | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index ee6ebfe..2f31d03 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -104,6 +104,34 @@ __visible noinstr void func(struct pt_regs *regs,			\
 static __always_inline void __##func(struct pt_regs *regs,		\
 				     unsigned long error_code)
 
+/**
+ * DECLARE_IDTENTRY_RAW - Declare functions for raw IDT entry points
+ *		      No error code pushed by hardware
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY().
+ */
+#define DECLARE_IDTENTRY_RAW(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+
+/**
+ * DEFINE_IDTENTRY_RAW - Emit code for raw IDT entry points
+ * @func:	Function name of the entry point
+ *
+ * @func is called from ASM entry code with interrupts disabled.
+ *
+ * The macro is written so it acts as function definition. Append the
+ * body with a pair of curly brackets.
+ *
+ * Contrary to DEFINE_IDTENTRY() this does not invoke the
+ * idtentry_enter/exit() helpers before and after the body invocation. This
+ * needs to be done in the body itself if applicable. Use if extra work
+ * is required before the enter/exit() helpers are invoked.
+ */
+#define DEFINE_IDTENTRY_RAW(func)					\
+__visible noinstr void func(struct pt_regs *regs)
+
 #else /* !__ASSEMBLY__ */
 
 /*
@@ -118,6 +146,9 @@ static __always_inline void __##func(struct pt_regs *regs,		\
 /* Special case for 32bit IRET 'trap'. Do not emit ASM code */
 #define DECLARE_IDTENTRY_SW(vector, func)
 
+#define DECLARE_IDTENTRY_RAW(vector, func)				\
+	DECLARE_IDTENTRY(vector, func)
+
 #endif /* __ASSEMBLY__ */
 
 /*


* [tip: x86/entry] x86/entry: Convert INT3 exception to IDTENTRY_RAW
  2020-05-05 13:49 ` [patch V4 part 4 06/24] x86/entry: Convert INT3 exception to IDTENTRY_RAW Thomas Gleixner
  2020-05-14  5:01   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     3512eab9b00ac8bd3930222233847677dc3b7592
Gitweb:        https://git.kernel.org/tip/3512eab9b00ac8bd3930222233847677dc3b7592
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Feb 2020 23:16:16 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:06 +02:00

x86/entry: Convert INT3 exception to IDTENTRY_RAW

Convert #BP to IDTENTRY_RAW:
  - Implement the C entry point with DEFINE_IDTENTRY_RAW
  - Invoke idtentry_enter/exit() from the function body
  - Emit the ASM stub with DECLARE_IDTENTRY_RAW
  - Remove the ASM idtentry in 64bit
  - Remove the open coded ASM entry code in 32bit
  - Fixup the XEN/PV code
  - Remove the old prototypes

No functional change.

This could be a plain IDTENTRY, but as Peter pointed out, INT3 is broken
vs. the static key in the context tracking code: that static key might be
in the middle of being patched and therefore contain an int3, which would
recurse forever. IDTENTRY_RAW is therefore chosen to allow addressing this
issue without lots of code churn.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135313.938474960@linutronix.de



---
 arch/x86/entry/entry_32.S       |  7 -------
 arch/x86/entry/entry_64.S       |  2 --
 arch/x86/include/asm/idtentry.h |  3 +++
 arch/x86/include/asm/traps.h    |  3 ---
 arch/x86/kernel/idt.c           |  2 +-
 arch/x86/kernel/traps.c         | 28 +++++++++++++++++-----------
 arch/x86/xen/enlighten_pv.c     |  2 +-
 arch/x86/xen/xen-asm_64.S       |  2 +-
 8 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index f7a5f1c..b9b0ddb 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1649,13 +1649,6 @@ SYM_CODE_START(nmi)
 #endif
 SYM_CODE_END(nmi)
 
-SYM_CODE_START(int3)
-	ASM_CLAC
-	pushl	$0
-	pushl	$do_int3
-	jmp	common_exception
-SYM_CODE_END(int3)
-
 .pushsection .text, "ax"
 SYM_CODE_START(rewind_stack_do_exit)
 	/* Prevent any naive code from trying to unwind to our caller. */
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1bada7b..69ddd05 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1072,8 +1072,6 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
  * Exception entry points.
  */
 
-idtentry	X86_TRAP_BP		int3			do_int3				has_error_code=0
-
 idtentry	X86_TRAP_PF		page_fault		do_page_fault			has_error_code=1
 
 #ifdef CONFIG_X86_MCE
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 2f31d03..3dc4d5b 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -181,4 +181,7 @@ DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_SS,	exc_stack_segment);
 DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_GP,	exc_general_protection);
 DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_AC,	exc_alignment_check);
 
+/* Raw exception entries which need extra work */
+DECLARE_IDTENTRY_RAW(X86_TRAP_BP,	exc_int3);
+
 #endif
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 5774d0b..698285a 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -13,7 +13,6 @@
 
 asmlinkage void debug(void);
 asmlinkage void nmi(void);
-asmlinkage void int3(void);
 #ifdef CONFIG_X86_64
 asmlinkage void double_fault(void);
 #endif
@@ -26,7 +25,6 @@ asmlinkage void machine_check(void);
 #if defined(CONFIG_X86_64) && defined(CONFIG_XEN_PV)
 asmlinkage void xen_xennmi(void);
 asmlinkage void xen_xendebug(void);
-asmlinkage void xen_int3(void);
 asmlinkage void xen_double_fault(void);
 asmlinkage void xen_page_fault(void);
 #ifdef CONFIG_X86_MCE
@@ -36,7 +34,6 @@ asmlinkage void xen_machine_check(void);
 
 dotraplinkage void do_debug(struct pt_regs *regs, long error_code);
 dotraplinkage void do_nmi(struct pt_regs *regs, long error_code);
-dotraplinkage void do_int3(struct pt_regs *regs, long error_code);
 dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code, unsigned long cr2);
 dotraplinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
 dotraplinkage void do_mce(struct pt_regs *regs, long error_code);
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index 38b565b..9ca8af6 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -60,7 +60,7 @@ static bool idt_setup_done __initdata;
  */
 static const __initconst struct idt_data early_idts[] = {
 	INTG(X86_TRAP_DB,		debug),
-	SYSG(X86_TRAP_BP,		int3),
+	SYSG(X86_TRAP_BP,		asm_exc_int3),
 #ifdef CONFIG_X86_32
 	INTG(X86_TRAP_PF,		page_fault),
 #endif
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 280c290..0ad12df 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -568,7 +568,7 @@ exit:
 	cond_local_irq_disable(regs);
 }
 
-dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
+DEFINE_IDTENTRY_RAW(exc_int3)
 {
 	/*
 	 * poke_int3_handler() is completely self contained code; it does (and
@@ -579,16 +579,20 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 		return;
 
 	/*
-	 * Unlike any other non-IST entry, we can be called from pretty much
-	 * any location in the kernel through kprobes -- text_poke() will most
-	 * likely be handled by poke_int3_handler() above. This means this
-	 * handler is effectively NMI-like.
+	 * idtentry_enter() uses static_branch_{,un}likely() and therefore
+	 * can trigger INT3, hence poke_int3_handler() must be done
+	 * before. If the entry came from kernel mode, then use nmi_enter()
+	 * because the INT3 could have been hit in any context including
+	 * NMI.
 	 */
-	if (!user_mode(regs))
+	if (user_mode(regs))
+		idtentry_enter(regs);
+	else
 		nmi_enter();
 
+	instrumentation_begin();
 #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
-	if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
+	if (kgdb_ll_trap(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
 				SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 #endif /* CONFIG_KGDB_LOW_LEVEL_TRAP */
@@ -598,19 +602,21 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 		goto exit;
 #endif
 
-	if (notify_die(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
+	if (notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP,
 			SIGTRAP) == NOTIFY_STOP)
 		goto exit;
 
 	cond_local_irq_enable(regs);
-	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, error_code, 0, NULL);
+	do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, 0, 0, NULL);
 	cond_local_irq_disable(regs);
 
 exit:
-	if (!user_mode(regs))
+	instrumentation_end();
+	if (user_mode(regs))
+		idtentry_exit(regs);
+	else
 		nmi_exit();
 }
-NOKPROBE_SYMBOL(do_int3);
 
 #ifdef CONFIG_X86_64
 /*
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index c41152b..725d550 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -617,7 +617,7 @@ static struct trap_array_entry trap_array[] = {
 	{ machine_check,               xen_machine_check,               true },
 #endif
 	{ nmi,                         xen_xennmi,                      true },
-	{ int3,                        xen_int3,                        false },
+	TRAP_ENTRY(exc_int3,				false ),
 	TRAP_ENTRY(exc_overflow,			false ),
 #ifdef CONFIG_IA32_EMULATION
 	{ entry_INT80_compat,          xen_entry_INT80_compat,          false },
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 6a91157..44f5556 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -31,7 +31,7 @@ _ASM_NOKPROBE(xen_\name)
 xen_pv_trap asm_exc_divide_error
 xen_pv_trap debug
 xen_pv_trap xendebug
-xen_pv_trap int3
+xen_pv_trap asm_exc_int3
 xen_pv_trap xennmi
 xen_pv_trap asm_exc_overflow
 xen_pv_trap asm_exc_bounds


* [tip: x86/entry] x86/int3: Inline bsearch()
  2020-05-05 13:49 ` [patch V4 part 4 04/24] x86/int3: Inline bsearch() Thomas Gleixner
  2020-05-14  4:58   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Thomas Gleixner, Alexandre Chartre, Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     c3be358894063fc2a628dd94ca7f17c232177dbb
Gitweb:        https://git.kernel.org/tip/c3be358894063fc2a628dd94ca7f17c232177dbb
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Thu, 20 Feb 2020 13:28:06 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:05 +02:00

x86/int3: Inline bsearch()

Avoid calling out to bsearch() by inlining it. For normal kernel configs
this was the last external call, so poke_int3_handler() is now fully
self-sufficient: it makes no calls to external code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135313.731774429@linutronix.de


---
 arch/x86/kernel/alternative.c | 8 ++++----
 arch/x86/kernel/traps.c       | 5 +++++
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 686c8ac..0f70712 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -979,7 +979,7 @@ static __always_inline void *text_poke_addr(struct text_poke_loc *tp)
 	return _stext + tp->rel_addr;
 }
 
-static int noinstr patch_cmp(const void *key, const void *elt)
+static __always_inline int patch_cmp(const void *key, const void *elt)
 {
 	struct text_poke_loc *tp = (struct text_poke_loc *) elt;
 
@@ -1023,9 +1023,9 @@ int noinstr poke_int3_handler(struct pt_regs *regs)
 	 * Skip the binary search if there is a single member in the vector.
 	 */
 	if (unlikely(desc->nr_entries > 1)) {
-		tp = bsearch(ip, desc->vec, desc->nr_entries,
-			     sizeof(struct text_poke_loc),
-			     patch_cmp);
+		tp = __inline_bsearch(ip, desc->vec, desc->nr_entries,
+				      sizeof(struct text_poke_loc),
+				      patch_cmp);
 		if (!tp)
 			goto out_put;
 	} else {
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b28a64d..280c290 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -570,6 +570,11 @@ exit:
 
 dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code)
 {
+	/*
+	 * poke_int3_handler() is completely self contained code; it does (and
+	 * must) *NOT* call out to anything, lest it hits upon yet another
+	 * INT3.
+	 */
 	if (poke_int3_handler(regs))
 		return;
 


* [tip: x86/entry] lib/bsearch: Provide __always_inline variant
  2020-05-05 13:49 ` [patch V4 part 4 03/24] lib/bsearch: Provide __always_inline variant Thomas Gleixner
  2020-05-14  4:58   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Thomas Gleixner, Alexandre Chartre, Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     83b169bb1d30546c21cf1e22a0834547bfe91f06
Gitweb:        https://git.kernel.org/tip/83b169bb1d30546c21cf1e22a0834547bfe91f06
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 19 Feb 2020 18:25:09 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:05 +02:00

lib/bsearch: Provide __always_inline variant

For code that needs the ultimate performance (it can inline the @cmp
function too) or simply needs to avoid calling external functions for
whatever reason, provide an __always_inline variant of bsearch().

[ tglx: Renamed to __inline_bsearch() as suggested by Andy ]
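
A user-space sketch of the technique, assuming GCC or Clang for the
always_inline attribute. The names my_always_inline, inline_bsearch(),
cmp_ulong() and find_addr() are hypothetical; only the search loop mirrors
the one this patch moves into the header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's __always_inline (GCC/Clang). */
#define my_always_inline inline __attribute__((always_inline))

typedef int (*cmp_fn)(const void *key, const void *elt);

/* Same loop as the patch: halve the window until a match or it is empty. */
static my_always_inline void *
inline_bsearch(const void *key, const void *base, size_t num, size_t size,
	       cmp_fn cmp)
{
	const char *lo = base;

	while (num > 0) {
		const char *pivot = lo + (num >> 1) * size;
		int result = cmp(key, pivot);

		if (result == 0)
			return (void *)pivot;

		if (result > 0) {
			lo = pivot + size;
			num--;
		}
		num >>= 1;
	}
	return NULL;
}

/* Comparator that the compiler can fold into the search loop. */
static my_always_inline int cmp_ulong(const void *key, const void *elt)
{
	unsigned long k = *(const unsigned long *)key;
	unsigned long e = *(const unsigned long *)elt;

	return (k > e) - (k < e);
}

/* Returns the index of addr in the sorted table, or -1 if absent. */
static long find_addr(const unsigned long *table, size_t n, unsigned long addr)
{
	const unsigned long *hit;

	hit = inline_bsearch(&addr, table, n, sizeof(*table), cmp_ulong);
	return hit ? (long)(hit - table) : -1;
}
```

With both the search and the comparator forced inline, an optimizing build
can compile find_addr() without an out-of-line call, which is the property
a self-contained handler needs.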

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135313.624443814@linutronix.de


---
 include/linux/bsearch.h | 26 ++++++++++++++++++++++++--
 lib/bsearch.c           | 22 ++--------------------
 2 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/include/linux/bsearch.h b/include/linux/bsearch.h
index 8ed53d7..e66b711 100644
--- a/include/linux/bsearch.h
+++ b/include/linux/bsearch.h
@@ -4,7 +4,29 @@
 
 #include <linux/types.h>
 
-void *bsearch(const void *key, const void *base, size_t num, size_t size,
-	      cmp_func_t cmp);
+static __always_inline
+void *__inline_bsearch(const void *key, const void *base, size_t num, size_t size, cmp_func_t cmp)
+{
+	const char *pivot;
+	int result;
+
+	while (num > 0) {
+		pivot = base + (num >> 1) * size;
+		result = cmp(key, pivot);
+
+		if (result == 0)
+			return (void *)pivot;
+
+		if (result > 0) {
+			base = pivot + size;
+			num--;
+		}
+		num >>= 1;
+	}
+
+	return NULL;
+}
+
+extern void *bsearch(const void *key, const void *base, size_t num, size_t size, cmp_func_t cmp);
 
 #endif /* _LINUX_BSEARCH_H */
diff --git a/lib/bsearch.c b/lib/bsearch.c
index 8b3aae5..bf86aa6 100644
--- a/lib/bsearch.c
+++ b/lib/bsearch.c
@@ -28,27 +28,9 @@
  * the key and elements in the array are of the same type, you can use
  * the same comparison function for both sort() and bsearch().
  */
-void *bsearch(const void *key, const void *base, size_t num, size_t size,
-	      cmp_func_t cmp)
+void *bsearch(const void *key, const void *base, size_t num, size_t size, cmp_func_t cmp)
 {
-	const char *pivot;
-	int result;
-
-	while (num > 0) {
-		pivot = base + (num >> 1) * size;
-		result = cmp(key, pivot);
-
-		if (result == 0)
-			return (void *)pivot;
-
-		if (result > 0) {
-			base = pivot + size;
-			num--;
-		}
-		num >>= 1;
-	}
-
-	return NULL;
+	return __inline_bsearch(key, base, num, size, cmp);
 }
 EXPORT_SYMBOL(bsearch);
 NOKPROBE_SYMBOL(bsearch);


* [tip: x86/entry] x86/int3: Avoid atomic instrumentation
  2020-05-05 13:49 ` [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation Thomas Gleixner
  2020-05-08 13:27   ` Masami Hiramatsu
  2020-05-14  4:57   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 94+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Peter Zijlstra (Intel),
	Masami Hiramatsu, Alexandre Chartre, Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     a53a1d0435cdc2b66f41f75fa1cee31e8fe6d99e
Gitweb:        https://git.kernel.org/tip/a53a1d0435cdc2b66f41f75fa1cee31e8fe6d99e
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 24 Jan 2020 22:08:45 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:05 +02:00

x86/int3: Avoid atomic instrumentation

Use arch_atomic_*() and __READ_ONCE() to ensure nothing untoward
creeps in and ruins things.

That is, this is the INT3 text poke handler; strictly limit the code
that runs in it, lest it inadvertently hits yet another INT3.
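
The get/put pattern that try_get_desc() and put_desc() implement can be
sketched with C11 atomics as below. This is an assumption-laden user-space
analogue: struct desc, inc_not_zero(), try_get() and put() are hypothetical
names, <stdatomic.h> stands in for the arch_atomic_*() operations, and the
instrumentation aspect itself is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor; only the reference count matters here. */
struct desc {
	atomic_int refs;
};

/* Take a reference unless the count has already dropped to zero. */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Counterpart of try_get_desc(): NULL means "no live descriptor". */
static struct desc *try_get(struct desc *d)
{
	if (!d || !inc_not_zero(&d->refs))
		return NULL;
	return d;
}

/* Counterpart of put_desc(): drop the reference taken by try_get(). */
static void put(struct desc *d)
{
	atomic_fetch_sub(&d->refs, 1);
}
```

Once the count reaches zero, try_get() refuses new references, so the
patching side can wait for zero and then safely tear the descriptor down.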

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135313.517429268@linutronix.de


---
 arch/x86/kernel/alternative.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 1f4cb2c..686c8ac 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -960,9 +960,9 @@ static struct bp_patching_desc *bp_desc;
 static __always_inline
 struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
 {
-	struct bp_patching_desc *desc = READ_ONCE(*descp); /* rcu_dereference */
+	struct bp_patching_desc *desc = __READ_ONCE(*descp); /* rcu_dereference */
 
-	if (!desc || !atomic_inc_not_zero(&desc->refs))
+	if (!desc || !arch_atomic_inc_not_zero(&desc->refs))
 		return NULL;
 
 	return desc;
@@ -971,7 +971,7 @@ struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
 static __always_inline void put_desc(struct bp_patching_desc *desc)
 {
 	smp_mb__before_atomic();
-	atomic_dec(&desc->refs);
+	arch_atomic_dec(&desc->refs);
 }
 
 static __always_inline void *text_poke_addr(struct text_poke_loc *tp)


* [tip: x86/entry] x86/int3: Ensure that poke_int3_handler() is not traced
  2020-05-05 13:49 ` [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced Thomas Gleixner
  2020-05-14  4:57   ` Andy Lutomirski
@ 2020-05-19 19:58   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 94+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2020-05-19 19:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Peter Zijlstra (Intel),
	Alexandre Chartre, Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     819f5f8cfbcf3c143ef2ca7a674cec3702ab3807
Gitweb:        https://git.kernel.org/tip/819f5f8cfbcf3c143ef2ca7a674cec3702ab3807
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 21 Jan 2020 15:53:09 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 19 May 2020 16:04:04 +02:00

x86/int3: Ensure that poke_int3_handler() is not traced

In order to ensure poke_int3_handler() is completely self contained -- this
is called while modifying other text, imagine the fun of hitting another
INT3 -- ensure that everything it uses is not traced.

The primary means here is to force inlining; bsearch() is notrace because
all of lib/ is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505135313.410702173@linutronix.de


---
 arch/x86/include/asm/ptrace.h        |  2 +-
 arch/x86/include/asm/text-patching.h | 11 +++++++----
 arch/x86/kernel/alternative.c        | 13 ++++++-------
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 6d6475f..ebedeab 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -123,7 +123,7 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
  * On x86_64, vm86 mode is mercifully nonexistent, and we don't need
  * the extra check.
  */
-static inline int user_mode(struct pt_regs *regs)
+static __always_inline int user_mode(struct pt_regs *regs)
 {
 #ifdef CONFIG_X86_32
 	return ((regs->cs & SEGMENT_RPL_MASK) | (regs->flags & X86_VM_MASK)) >= USER_RPL;
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 67315fa..6593b42 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -64,7 +64,7 @@ extern void text_poke_finish(void);
 
 #define DISP32_SIZE		4
 
-static inline int text_opcode_size(u8 opcode)
+static __always_inline int text_opcode_size(u8 opcode)
 {
 	int size = 0;
 
@@ -118,12 +118,14 @@ extern __ro_after_init struct mm_struct *poking_mm;
 extern __ro_after_init unsigned long poking_addr;
 
 #ifndef CONFIG_UML_X86
-static inline void int3_emulate_jmp(struct pt_regs *regs, unsigned long ip)
+static __always_inline
+void int3_emulate_jmp(struct pt_regs *regs, unsigned long ip)
 {
 	regs->ip = ip;
 }
 
-static inline void int3_emulate_push(struct pt_regs *regs, unsigned long val)
+static __always_inline
+void int3_emulate_push(struct pt_regs *regs, unsigned long val)
 {
 	/*
 	 * The int3 handler in entry_64.S adds a gap between the
@@ -138,7 +140,8 @@ static inline void int3_emulate_push(struct pt_regs *regs, unsigned long val)
 	*(unsigned long *)regs->sp = val;
 }
 
-static inline void int3_emulate_call(struct pt_regs *regs, unsigned long func)
+static __always_inline
+void int3_emulate_call(struct pt_regs *regs, unsigned long func)
 {
 	int3_emulate_push(regs, regs->ip - INT3_INSN_SIZE + CALL_INSN_SIZE);
 	int3_emulate_jmp(regs, func);
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7867dfb..1f4cb2c 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -957,7 +957,8 @@ struct bp_patching_desc {
 
 static struct bp_patching_desc *bp_desc;
 
-static inline struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
+static __always_inline
+struct bp_patching_desc *try_get_desc(struct bp_patching_desc **descp)
 {
 	struct bp_patching_desc *desc = READ_ONCE(*descp); /* rcu_dereference */
 
@@ -967,18 +968,18 @@ static inline struct bp_patching_desc *try_get_desc(struct bp_patching_desc **de
 	return desc;
 }
 
-static inline void put_desc(struct bp_patching_desc *desc)
+static __always_inline void put_desc(struct bp_patching_desc *desc)
 {
 	smp_mb__before_atomic();
 	atomic_dec(&desc->refs);
 }
 
-static inline void *text_poke_addr(struct text_poke_loc *tp)
+static __always_inline void *text_poke_addr(struct text_poke_loc *tp)
 {
 	return _stext + tp->rel_addr;
 }
 
-static int notrace patch_cmp(const void *key, const void *elt)
+static int noinstr patch_cmp(const void *key, const void *elt)
 {
 	struct text_poke_loc *tp = (struct text_poke_loc *) elt;
 
@@ -988,9 +989,8 @@ static int notrace patch_cmp(const void *key, const void *elt)
 		return 1;
 	return 0;
 }
-NOKPROBE_SYMBOL(patch_cmp);
 
-int notrace poke_int3_handler(struct pt_regs *regs)
+int noinstr poke_int3_handler(struct pt_regs *regs)
 {
 	struct bp_patching_desc *desc;
 	struct text_poke_loc *tp;
@@ -1064,7 +1064,6 @@ out_put:
 	put_desc(desc);
 	return ret;
 }
-NOKPROBE_SYMBOL(poke_int3_handler);
 
 #define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_loc))
 static struct text_poke_loc tp_vec[TP_VEC_MAX];
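
The return-address arithmetic in int3_emulate_call() from the diff above can be worked through with a small user-space sketch. It is not the kernel code: struct demo_frame and the demo_* names are hypothetical, the stack-pointer decrement in demo_emulate_push() is an assumption (that part of the real helper is outside the hunk shown), and the sizes assume a one-byte INT3 and a five-byte CALL rel32.

```c
#include <stdint.h>

#define DEMO_INT3_INSN_SIZE 1	/* INT3 is the single byte 0xCC */
#define DEMO_CALL_INSN_SIZE 5	/* CALL rel32: opcode + 4-byte displacement */

/* Hypothetical stand-in holding only the registers the sketch needs. */
struct demo_frame {
	uintptr_t ip;
	uintptr_t sp;
};

/* Assumed push semantics: make room on the stack, then store the value. */
static inline void demo_emulate_push(struct demo_frame *f, uintptr_t val)
{
	f->sp -= sizeof(uintptr_t);
	*(uintptr_t *)f->sp = val;
}

static inline void demo_emulate_call(struct demo_frame *f, uintptr_t func)
{
	/*
	 * f->ip points just past the INT3 byte. The CALL being emulated
	 * starts one byte earlier and is five bytes long, so the return
	 * address is ip - 1 + 5.
	 */
	demo_emulate_push(f, f->ip - DEMO_INT3_INSN_SIZE + DEMO_CALL_INSN_SIZE);
	f->ip = func;
}
```

For example, if the breakpoint sits at 0x1000, the trap reports ip = 0x1001 and the emulated call pushes 0x1005, the address of the instruction following the five-byte CALL.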



Thread overview: 94+ messages
2020-05-05 13:49 [patch V4 part 4 00/24] x86/entry: Entry/exception code rework, nasty exceptions Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 01/24] x86/int3: Ensure that poke_int3_handler() is not traced Thomas Gleixner
2020-05-14  4:57   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 02/24] x86/int3: Avoid atomic instrumentation Thomas Gleixner
2020-05-08 13:27   ` Masami Hiramatsu
2020-05-14  4:57   ` Andy Lutomirski
2020-05-14  9:32     ` Peter Zijlstra
2020-05-14 12:51       ` Thomas Gleixner
2020-05-14 13:15         ` Peter Zijlstra
2020-05-14 14:55           ` Andy Lutomirski
2020-05-14 15:06           ` Thomas Gleixner
2020-05-14 15:08             ` Andy Lutomirski
2020-05-14 15:10               ` Peter Zijlstra
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-05-05 13:49 ` [patch V4 part 4 03/24] lib/bsearch: Provide __always_inline variant Thomas Gleixner
2020-05-14  4:58   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-05-05 13:49 ` [patch V4 part 4 04/24] x86/int3: Inline bsearch() Thomas Gleixner
2020-05-14  4:58   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-05-05 13:49 ` [patch V4 part 4 05/24] x86/entry: Provide IDTENTRY_RAW Thomas Gleixner
2020-05-14  4:59   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 06/24] x86/entry: Convert INT3 exception to IDTENTRY_RAW Thomas Gleixner
2020-05-14  5:01   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 07/24] x86/traps: Split int3 handler up Thomas Gleixner
2020-05-14  5:03   ` Andy Lutomirski
2020-05-14  9:39     ` Peter Zijlstra
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-05-05 13:49 ` [patch V4 part 4 08/24] x86/entry: Provide IDTENTRY_IST Thomas Gleixner
2020-05-14 16:39   ` Andy Lutomirski
2020-05-14 18:44     ` Thomas Gleixner
2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point Thomas Gleixner
2020-05-15  5:23   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST Thomas Gleixner
2020-05-15  5:24   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check Thomas Gleixner
2020-05-15  5:24   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV Thomas Gleixner
2020-05-15  5:25   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI Thomas Gleixner
2020-05-15  5:26   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation Thomas Gleixner
2020-05-15  5:26   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling Thomas Gleixner
2020-05-07 17:18   ` Alexandre Chartre
2020-05-08  8:59     ` Peter Zijlstra
2020-05-08 11:58       ` Thomas Gleixner
2020-05-08 12:45         ` Peter Zijlstra
2020-05-14  2:24   ` Mathieu Desnoyers
2020-05-14 17:28     ` Thomas Gleixner
2020-05-14 17:46       ` Mathieu Desnoyers
2020-05-15 14:32         ` Thomas Gleixner
2020-05-14 18:06     ` Steven Rostedt
2020-05-15  5:37   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-05-05 13:49 ` [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB Thomas Gleixner
2020-05-15  5:27   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub Thomas Gleixner
2020-05-15  5:27   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 18/24] x86/entry: Provide IDTRENTRY_NOIST variants for #DB and #MC Thomas Gleixner
2020-05-15  5:29   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE Thomas Gleixner
2020-05-15  5:32   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 20/24] x86/traps: Restructure #DB handling Thomas Gleixner
2020-05-15  5:39   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB Thomas Gleixner
2020-05-15  5:39   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints Thomas Gleixner
2020-05-15  5:40   ` Andy Lutomirski
2020-05-19 19:58   ` [tip: x86/entry] " tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF Thomas Gleixner
2020-05-15  5:41   ` Andy Lutomirski
2020-05-15 15:01     ` Thomas Gleixner
2020-05-19 19:58   ` [tip: x86/entry] x86/idtentry: " tip-bot2 for Thomas Gleixner
2020-05-19 19:58   ` [tip: x86/entry] x86/entry: Convert double fault exception to IDTENTRY_DF tip-bot2 for Thomas Gleixner
2020-05-05 13:49 ` [patch V4 part 4 24/24] " Thomas Gleixner
2020-05-07 19:55   ` Alexandre Chartre
2020-05-15  5:42   ` Andy Lutomirski
