linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/13] x86/debug: Untangle handle_debug()
@ 2020-09-02 13:25 Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 01/13] x86/entry: Fix AC assertion Peter Zijlstra
                   ` (13 more replies)
  0 siblings, 14 replies; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

Hi,

The first two patches probably ought to go in x86/urgent, the rest (!RFC) can
go into x86/core and wait a bit.

handle_debug() is a mess, and now that we have separate user and kernel paths,
try and clean it up a bit.

There's two RFC patches at the end that impact the ptrace_{get,set}_debugreg(6)
ABI, I've no idea what, if anything, is expected of that or if anybody actually
cares about that. If I read the code correctly nothing actually consumes the
value from ptrace_set_debugreg(6).

Kyle, you seem to be pushing all this to the edge with RR, any clues?

Since v2:

 - fixed (user) INT1 / icebp detection
 - some further cleanups
 - two additional RFC patches


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/13] x86/entry: Fix AC assertion
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-02 15:58   ` Brian Gerst
  2020-09-04 13:16   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 02/13] x86/debug: Allow a single level of #DB recursion Peter Zijlstra
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra, Andrew Cooper

From: Peter Zijlstra <peterz@infradead.org>

The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
improve user entry sanity checks") unconditionally triggers on my IVB
machine because it does not support SMAP.

For !SMAP hardware we patch out CLAC/STAC instructions and thus if
userspace sets AC, we'll still have it set after entry.

Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/entry-common.h |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -18,8 +18,16 @@ static __always_inline void arch_check_u
 		 * state, not the interrupt state as imagined by Xen.
 		 */
 		unsigned long flags = native_save_fl();
-		WARN_ON_ONCE(flags & (X86_EFLAGS_AC | X86_EFLAGS_DF |
-				      X86_EFLAGS_NT));
+		unsigned long mask = X86_EFLAGS_DF | X86_EFLAGS_NT;
+
+		/*
+		 * For !SMAP hardware we patch out CLAC on entry.
+		 */
+		if (boot_cpu_has(X86_FEATURE_SMAP) ||
+		    (IS_ENABLED(CONFIG_64_BIT) && boot_cpu_has(X86_FEATURE_XENPV)))
+			mask |= X86_EFLAGS_AC;
+
+		WARN_ON_ONCE(flags & mask);
 
 		/* We think we came from user mode. Make sure pt_regs agrees. */
 		WARN_ON_ONCE(!user_mode(regs));



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 02/13] x86/debug: Allow a single level of #DB recursion
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 01/13] x86/entry: Fix AC assertion Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-02 23:59   ` Sasha Levin
  2020-09-03 16:12   ` Josh Poimboeuf
  2020-09-02 13:25 ` [PATCH 03/13] x86/debug: Sync BTF earlier Peter Zijlstra
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra, stable

From: Andy Lutomirski <luto@kernel.org>

Trying to clear DR7 around a #DB from usermode malfunctions if we
schedule when delivering SIGTRAP.  Rather than trying to define a
special no-recursion region, just allow a single level of recursion.
We do the same thing for NMI, and it hasn't caused any problems yet.

Fixes: 9f58fdde95c9 ("x86/db: Split out dr6/7 handling")
Reported-by: Kyle Huey <me@kylehuey.com>
Debugged-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/8b9bd05f187231df008d48cf818a6a311cbd5c98.1597882384.git.luto@kernel.org
---
 arch/x86/kernel/traps.c |   65 ++++++++++++++++++++++--------------------------
 1 file changed, 31 insertions(+), 34 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -729,20 +729,9 @@ static bool is_sysenter_singlestep(struc
 #endif
 }
 
-static __always_inline void debug_enter(unsigned long *dr6, unsigned long *dr7)
+static __always_inline unsigned long debug_read_clear_dr6(void)
 {
-	/*
-	 * Disable breakpoints during exception handling; recursive exceptions
-	 * are exceedingly 'fun'.
-	 *
-	 * Since this function is NOKPROBE, and that also applies to
-	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
-	 * HW_BREAKPOINT_W on our stack)
-	 *
-	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
-	 * includes the entry stack is excluded for everything.
-	 */
-	*dr7 = local_db_save();
+	unsigned long dr6;
 
 	/*
 	 * The Intel SDM says:
@@ -755,15 +744,12 @@ static __always_inline void debug_enter(
 	 *
 	 * Keep it simple: clear DR6 immediately.
 	 */
-	get_debugreg(*dr6, 6);
+	get_debugreg(dr6, 6);
 	set_debugreg(0, 6);
 	/* Filter out all the reserved bits which are preset to 1 */
-	*dr6 &= ~DR6_RESERVED;
-}
+	dr6 &= ~DR6_RESERVED;
 
-static __always_inline void debug_exit(unsigned long dr7)
-{
-	local_db_restore(dr7);
+	return dr6;
 }
 
 /*
@@ -863,6 +849,18 @@ static void handle_debug(struct pt_regs
 static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 					     unsigned long dr6)
 {
+	/*
+	 * Disable breakpoints during exception handling; recursive exceptions
+	 * are exceedingly 'fun'.
+	 *
+	 * Since this function is NOKPROBE, and that also applies to
+	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
+	 * HW_BREAKPOINT_W on our stack)
+	 *
+	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
+	 * includes the entry stack is excluded for everything.
+	 */
+	unsigned long dr7 = local_db_save();
 	bool irq_state = idtentry_enter_nmi(regs);
 	instrumentation_begin();
 
@@ -883,6 +881,8 @@ static __always_inline void exc_debug_ke
 
 	instrumentation_end();
 	idtentry_exit_nmi(regs, irq_state);
+
+	local_db_restore(dr7);
 }
 
 static __always_inline void exc_debug_user(struct pt_regs *regs,
@@ -894,6 +894,15 @@ static __always_inline void exc_debug_us
 	 */
 	WARN_ON_ONCE(!user_mode(regs));
 
+	/*
+	 * NB: We can't easily clear DR7 here because
+	 * idtentry_exit_to_usermode() can invoke ptrace, schedule, access
+	 * user memory, etc.  This means that a recursive #DB is possible.  If
+	 * this happens, that #DB will hit exc_debug_kernel() and clear DR7.
+	 * Since we're not on the IST stack right now, everything will be
+	 * fine.
+	 */
+
 	irqentry_enter_from_user_mode(regs);
 	instrumentation_begin();
 
@@ -907,36 +916,24 @@ static __always_inline void exc_debug_us
 /* IST stack entry */
 DEFINE_IDTENTRY_DEBUG(exc_debug)
 {
-	unsigned long dr6, dr7;
-
-	debug_enter(&dr6, &dr7);
-	exc_debug_kernel(regs, dr6);
-	debug_exit(dr7);
+	exc_debug_kernel(regs, debug_read_clear_dr6());
 }
 
 /* User entry, runs on regular task stack */
 DEFINE_IDTENTRY_DEBUG_USER(exc_debug)
 {
-	unsigned long dr6, dr7;
-
-	debug_enter(&dr6, &dr7);
-	exc_debug_user(regs, dr6);
-	debug_exit(dr7);
+	exc_debug_user(regs, debug_read_clear_dr6());
 }
 #else
 /* 32 bit does not have separate entry points. */
 DEFINE_IDTENTRY_RAW(exc_debug)
 {
-	unsigned long dr6, dr7;
-
-	debug_enter(&dr6, &dr7);
+	unsigned long dr6 = debug_read_clear_dr6();
 
 	if (user_mode(regs))
 		exc_debug_user(regs, dr6);
 	else
 		exc_debug_kernel(regs, dr6);
-
-	debug_exit(dr7);
 }
 #endif
 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 03/13] x86/debug: Sync BTF earlier
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 01/13] x86/entry: Fix AC assertion Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 02/13] x86/debug: Allow a single level of #DB recursion Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 04/13] x86/debug: Move kprobe_debug_handler() into exc_debug_kernel() Peter Zijlstra
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

Move the BTF sync near the DR6 load, as this will be the only common
code guaranteed to run on every #DB.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/traps.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -749,6 +749,13 @@ static __always_inline unsigned long deb
 	/* Filter out all the reserved bits which are preset to 1 */
 	dr6 &= ~DR6_RESERVED;
 
+	/*
+	 * The SDM says "The processor clears the BTF flag when it
+	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
+	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
+	 */
+	clear_thread_flag(TIF_BLOCKSTEP);
+
 	return dr6;
 }
 
@@ -783,13 +790,6 @@ static void handle_debug(struct pt_regs
 	int si_code;
 
 	/*
-	 * The SDM says "The processor clears the BTF flag when it
-	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
-	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
-	 */
-	clear_thread_flag(TIF_BLOCKSTEP);
-
-	/*
 	 * If DR6 is zero, no point in trying to handle it. The kernel is
 	 * not using INT1.
 	 */



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 04/13] x86/debug: Move kprobe_debug_handler() into exc_debug_kernel()
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (2 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 03/13] x86/debug: Sync BTF earlier Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 05/13] x86/debug: Remove handle_debug(.user) argument Peter Zijlstra
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

Kprobes are on kernel text, and thus only matter for #DB-from-kernel.
Kprobes are ordered before the generic notifier, preserve that order.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/kprobes.h |    4 ++++
 arch/x86/kernel/traps.c        |   10 ++++------
 2 files changed, 8 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -106,5 +106,9 @@ extern int kprobe_exceptions_notify(stru
 extern int kprobe_int3_handler(struct pt_regs *regs);
 extern int kprobe_debug_handler(struct pt_regs *regs);
 
+#else
+
+static inline int kprobe_debug_handler(struct pt_regs *regs) { return 0; }
+
 #endif /* CONFIG_KPROBES */
 #endif /* _ASM_X86_KPROBES_H */
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -806,12 +806,6 @@ static void handle_debug(struct pt_regs
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
 
-#ifdef CONFIG_KPROBES
-	if (kprobe_debug_handler(regs)) {
-		return;
-	}
-#endif
-
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
 		       SIGTRAP) == NOTIFY_STOP) {
 		return;
@@ -877,8 +871,12 @@ static __always_inline void exc_debug_ke
 	if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
 		dr6 &= ~DR_STEP;
 
+	if (kprobe_debug_handler(regs))
+		goto out;
+
 	handle_debug(regs, dr6, false);
 
+out:
 	instrumentation_end();
 	idtentry_exit_nmi(regs, irq_state);
 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 05/13] x86/debug: Remove handle_debug(.user) argument
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (3 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 04/13] x86/debug: Move kprobe_debug_handler() into exc_debug_kernel() Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 06/13] x86/debug: Simplify #DB signal code Peter Zijlstra
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

The handle_debug(.user) argument is used to terminate the #DB handler
early for the INT1-from-kernel case, since the kernel doesn't use
INT1.

Remove the argument and handle this explicitly in #DB-from-kernel.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/traps.c |   21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -783,25 +783,18 @@ static __always_inline unsigned long deb
  *
  * May run on IST stack.
  */
-static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
+static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 {
 	struct task_struct *tsk = current;
 	bool user_icebp;
 	int si_code;
 
 	/*
-	 * If DR6 is zero, no point in trying to handle it. The kernel is
-	 * not using INT1.
-	 */
-	if (!user && !dr6)
-		return;
-
-	/*
 	 * If dr6 has no reason to give us about the origin of this trap,
 	 * then it's very likely the result of an icebp/int01 trap.
 	 * User wants a sigtrap for that.
 	 */
-	user_icebp = user && !dr6;
+	user_icebp = !dr6;
 
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
@@ -874,7 +867,13 @@ static __always_inline void exc_debug_ke
 	if (kprobe_debug_handler(regs))
 		goto out;
 
-	handle_debug(regs, dr6, false);
+	/*
+	 * The kernel doesn't use INT1
+	 */
+	if (!dr6)
+		goto out;
+
+	handle_debug(regs, dr6);
 
 out:
 	instrumentation_end();
@@ -904,7 +903,7 @@ static __always_inline void exc_debug_us
 	irqentry_enter_from_user_mode(regs);
 	instrumentation_begin();
 
-	handle_debug(regs, dr6, true);
+	handle_debug(regs, dr6);
 
 	instrumentation_end();
 	irqentry_exit_to_user_mode(regs);



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 06/13] x86/debug: Simplify #DB signal code
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (4 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 05/13] x86/debug: Remove handle_debug(.user) argument Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 07/13] x86/debug: Move historical SYSENTER junk into exc_debug_kernel() Peter Zijlstra
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

There's no point in calculating si_code if we're not going to use it.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/traps.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -786,15 +786,14 @@ static __always_inline unsigned long deb
 static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 {
 	struct task_struct *tsk = current;
-	bool user_icebp;
-	int si_code;
+	bool icebp;
 
 	/*
 	 * If dr6 has no reason to give us about the origin of this trap,
 	 * then it's very likely the result of an icebp/int01 trap.
 	 * User wants a sigtrap for that.
 	 */
-	user_icebp = !dr6;
+	icebp = !dr6;
 
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
@@ -813,6 +812,11 @@ static void handle_debug(struct pt_regs
 		goto out;
 	}
 
+	/*
+	 * Reload dr6, the notifier might have changed it.
+	 */
+	dr6 = tsk->thread.debugreg6;
+
 	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
 		/*
 		 * Historical junk that used to handle SYSENTER single-stepping.
@@ -825,9 +829,8 @@ static void handle_debug(struct pt_regs
 		regs->flags &= ~X86_EFLAGS_TF;
 	}
 
-	si_code = get_si_code(tsk->thread.debugreg6);
-	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
-		send_sigtrap(regs, 0, si_code);
+	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
+		send_sigtrap(regs, 0, get_si_code(dr6));
 
 out:
 	cond_local_irq_disable(regs);



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 07/13] x86/debug: Move historical SYSENTER junk into exc_debug_kernel()
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (5 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 06/13] x86/debug: Simplify #DB signal code Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 08/13] x86/debug: Move cond_local_irq_enable() block into exc_debug_user() Peter Zijlstra
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

The historical SYSENTER junk is explicitly for from-kernel, so move it
to the #DB-from-kernel handler.

It is ordered after the notifier, this is important for KGDB which
uses TF single-step and needs to consume the event before we get here.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/traps.c |   49 ++++++++++++++++++++++++------------------------
 1 file changed, 25 insertions(+), 24 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -783,7 +783,7 @@ static __always_inline unsigned long deb
  *
  * May run on IST stack.
  */
-static void handle_debug(struct pt_regs *regs, unsigned long dr6)
+static bool handle_debug(struct pt_regs *regs, unsigned long *dr6)
 {
 	struct task_struct *tsk = current;
 	bool icebp;
@@ -793,15 +793,13 @@ static void handle_debug(struct pt_regs
 	 * then it's very likely the result of an icebp/int01 trap.
 	 * User wants a sigtrap for that.
 	 */
-	icebp = !dr6;
+	icebp = !*dr6;
 
 	/* Store the virtualized DR6 value */
-	tsk->thread.debugreg6 = dr6;
+	tsk->thread.debugreg6 = *dr6;
 
-	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
-		       SIGTRAP) == NOTIFY_STOP) {
-		return;
-	}
+	if (notify_die(DIE_DEBUG, "debug", regs, (long)dr6, 0, SIGTRAP) == NOTIFY_STOP)
+		return true;
 
 	/* It's safe to allow irq's after DR6 has been saved */
 	cond_local_irq_enable(regs);
@@ -815,25 +813,15 @@ static void handle_debug(struct pt_regs
 	/*
 	 * Reload dr6, the notifier might have changed it.
 	 */
-	dr6 = tsk->thread.debugreg6;
+	*dr6 = tsk->thread.debugreg6;
 
-	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
-		/*
-		 * Historical junk that used to handle SYSENTER single-stepping.
-		 * This should be unreachable now.  If we survive for a while
-		 * without anyone hitting this warning, we'll turn this into
-		 * an oops.
-		 */
-		tsk->thread.debugreg6 &= ~DR_STEP;
-		set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
-		regs->flags &= ~X86_EFLAGS_TF;
-	}
-
-	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
-		send_sigtrap(regs, 0, get_si_code(dr6));
+	if (*dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
+		send_sigtrap(regs, 0, get_si_code(*dr6));
 
 out:
 	cond_local_irq_disable(regs);
+
+	return false;
 }
 
 static __always_inline void exc_debug_kernel(struct pt_regs *regs,
@@ -876,7 +864,20 @@ static __always_inline void exc_debug_ke
 	if (!dr6)
 		goto out;
 
-	handle_debug(regs, dr6);
+	if (handle_debug(regs, &dr6))
+		goto out;
+
+	if (WARN_ON_ONCE(dr6 & DR_STEP)) {
+		/*
+		 * Historical junk that used to handle SYSENTER single-stepping.
+		 * This should be unreachable now.  If we survive for a while
+		 * without anyone hitting this warning, we'll turn this into
+		 * an oops.
+		 */
+		dr6 &= ~DR_STEP;
+		set_thread_flag(TIF_SINGLESTEP);
+		regs->flags &= ~X86_EFLAGS_TF;
+	}
 
 out:
 	instrumentation_end();
@@ -906,7 +907,7 @@ static __always_inline void exc_debug_us
 	irqentry_enter_from_user_mode(regs);
 	instrumentation_begin();
 
-	handle_debug(regs, dr6);
+	handle_debug(regs, &dr6);
 
 	instrumentation_end();
 	irqentry_exit_to_user_mode(regs);



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 08/13] x86/debug: Move cond_local_irq_enable() block into exc_debug_user()
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (6 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 07/13] x86/debug: Move historical SYSENTER junk into exc_debug_kernel() Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 09/13] x86/debug: Remove the historical junk Peter Zijlstra
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

The cond_local_irq_enable() block, dealing with vm86 and sending
signals is only relevant for #DB-from-user, move it there.

This then reduces handle_debug() to only the notifier call, so rename
it to notify_debug().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/traps.c |   58 ++++++++++++++++++++++++------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -783,17 +783,10 @@ static __always_inline unsigned long deb
  *
  * May run on IST stack.
  */
-static bool handle_debug(struct pt_regs *regs, unsigned long *dr6)
+
+static bool notify_debug(struct pt_regs *regs, unsigned long *dr6)
 {
 	struct task_struct *tsk = current;
-	bool icebp;
-
-	/*
-	 * If dr6 has no reason to give us about the origin of this trap,
-	 * then it's very likely the result of an icebp/int01 trap.
-	 * User wants a sigtrap for that.
-	 */
-	icebp = !*dr6;
 
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = *dr6;
@@ -801,26 +794,9 @@ static bool handle_debug(struct pt_regs
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)dr6, 0, SIGTRAP) == NOTIFY_STOP)
 		return true;
 
-	/* It's safe to allow irq's after DR6 has been saved */
-	cond_local_irq_enable(regs);
-
-	if (v8086_mode(regs)) {
-		handle_vm86_trap((struct kernel_vm86_regs *) regs, 0,
-				 X86_TRAP_DB);
-		goto out;
-	}
-
-	/*
-	 * Reload dr6, the notifier might have changed it.
-	 */
+	/* Reload the DR6 value, the notifier might have changed it */
 	*dr6 = tsk->thread.debugreg6;
 
-	if (*dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
-		send_sigtrap(regs, 0, get_si_code(*dr6));
-
-out:
-	cond_local_irq_disable(regs);
-
 	return false;
 }
 
@@ -864,7 +840,7 @@ static __always_inline void exc_debug_ke
 	if (!dr6)
 		goto out;
 
-	if (handle_debug(regs, &dr6))
+	if (notify_debug(regs, &dr6))
 		goto out;
 
 	if (WARN_ON_ONCE(dr6 & DR_STEP)) {
@@ -889,6 +865,8 @@ static __always_inline void exc_debug_ke
 static __always_inline void exc_debug_user(struct pt_regs *regs,
 					   unsigned long dr6)
 {
+	bool icebp;
+
 	/*
 	 * If something gets miswired and we end up here for a kernel mode
 	 * #DB, we will malfunction.
@@ -907,8 +885,30 @@ static __always_inline void exc_debug_us
 	irqentry_enter_from_user_mode(regs);
 	instrumentation_begin();
 
-	handle_debug(regs, &dr6);
+	/*
+	 * If dr6 has no reason to give us about the origin of this trap,
+	 * then it's very likely the result of an icebp/int01 trap.
+	 * User wants a sigtrap for that.
+	 */
+	icebp = !dr6;
 
+	if (notify_debug(regs, &dr6))
+		goto out;
+
+	/* It's safe to allow irq's after DR6 has been saved */
+	local_irq_enable();
+
+	if (v8086_mode(regs)) {
+		handle_vm86_trap((struct kernel_vm86_regs *)regs, 0, X86_TRAP_DB);
+		goto out_irq;
+	}
+
+	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
+		send_sigtrap(regs, 0, get_si_code(dr6));
+
+out_irq:
+	local_irq_disable();
+out:
 	instrumentation_end();
 	irqentry_exit_to_user_mode(regs);
 }



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 09/13] x86/debug: Remove the historical junk
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (7 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 08/13] x86/debug: Move cond_local_irq_enable() block into exc_debug_user() Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:25 ` [PATCH 10/13] x86/debug: Remove aout_dump_debugregs() Peter Zijlstra
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra, Andy Lutomirski

Remove the historical junk and replace it with a WARN and a comment.

The problem is that even though the kernel only uses TF single-step in
kprobes and KGDB, both of which consume the event before this,
QEMU/KVM has bugs in this area that can trigger this state so we have
to deal with it.

Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/traps.c |   23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -843,18 +843,19 @@ static __always_inline void exc_debug_ke
 	if (notify_debug(regs, &dr6))
 		goto out;
 
-	if (WARN_ON_ONCE(dr6 & DR_STEP)) {
-		/*
-		 * Historical junk that used to handle SYSENTER single-stepping.
-		 * This should be unreachable now.  If we survive for a while
-		 * without anyone hitting this warning, we'll turn this into
-		 * an oops.
-		 */
-		dr6 &= ~DR_STEP;
-		set_thread_flag(TIF_SINGLESTEP);
+	/*
+	 * The kernel doesn't use TF single-step outside of:
+	 *
+	 *  - Kprobes, consumed through kprobe_debug_handler()
+	 *  - KGDB, consumed through notify_debug()
+	 *
+	 * So if we get here with DR_STEP set, something is wonky.
+	 *
+	 * A known way to trigger this is through QEMU's GDB stub,
+	 * which leaks #DB into the guest and causes IST recursion.
+	 */
+	if (WARN_ON_ONCE(current->thread.debugreg6 & DR_STEP))
 		regs->flags &= ~X86_EFLAGS_TF;
-	}
-
 out:
 	instrumentation_end();
 	idtentry_exit_nmi(regs, irq_state);



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 10/13] x86/debug: Remove aout_dump_debugregs()
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (8 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 09/13] x86/debug: Remove the historical junk Peter Zijlstra
@ 2020-09-02 13:25 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:26 ` [PATCH 11/13] x86/debug: Simplify hw_breakpoint_handler() Peter Zijlstra
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:25 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

Unused remnants for the bit-bucket.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/debugreg.h |    2 --
 arch/x86/kernel/hw_breakpoint.c |   36 ------------------------------------
 2 files changed, 38 deletions(-)

--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -90,8 +90,6 @@ static __always_inline bool hw_breakpoin
 	return __this_cpu_read(cpu_dr7) & DR_GLOBAL_ENABLE_MASK;
 }
 
-extern void aout_dump_debugregs(struct user *dump);
-
 extern void hw_breakpoint_restore(void);
 
 static __always_inline unsigned long local_db_save(void)
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -442,42 +442,6 @@ int hw_breakpoint_arch_parse(struct perf
 }
 
 /*
- * Dump the debug register contents to the user.
- * We can't dump our per cpu values because it
- * may contain cpu wide breakpoint, something that
- * doesn't belong to the current task.
- *
- * TODO: include non-ptrace user breakpoints (perf)
- */
-void aout_dump_debugregs(struct user *dump)
-{
-	int i;
-	int dr7 = 0;
-	struct perf_event *bp;
-	struct arch_hw_breakpoint *info;
-	struct thread_struct *thread = &current->thread;
-
-	for (i = 0; i < HBP_NUM; i++) {
-		bp = thread->ptrace_bps[i];
-
-		if (bp && !bp->attr.disabled) {
-			dump->u_debugreg[i] = bp->attr.bp_addr;
-			info = counter_arch_bp(bp);
-			dr7 |= encode_dr7(i, info->len, info->type);
-		} else {
-			dump->u_debugreg[i] = 0;
-		}
-	}
-
-	dump->u_debugreg[4] = 0;
-	dump->u_debugreg[5] = 0;
-	dump->u_debugreg[6] = current->thread.debugreg6;
-
-	dump->u_debugreg[7] = dr7;
-}
-EXPORT_SYMBOL_GPL(aout_dump_debugregs);
-
-/*
  * Release the user breakpoints used by ptrace
  */
 void flush_ptrace_hw_breakpoint(struct task_struct *tsk)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 11/13] x86/debug: Simplify hw_breakpoint_handler()
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (9 preceding siblings ...)
  2020-09-02 13:25 ` [PATCH 10/13] x86/debug: Remove aout_dump_debugregs() Peter Zijlstra
@ 2020-09-02 13:26 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:26 ` [RFC][PATCH 12/13] x86/debug: Support negative polarity DR6 bits Peter Zijlstra
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:26 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

This is called with interrupts disabled, there's no point in using
get_cpu() and per_cpu().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/hw_breakpoint.c |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -487,7 +487,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore)
  */
 static int hw_breakpoint_handler(struct die_args *args)
 {
-	int i, cpu, rc = NOTIFY_STOP;
+	int i, rc = NOTIFY_STOP;
 	struct perf_event *bp;
 	unsigned long dr6;
 	unsigned long *dr6_p;
@@ -505,12 +505,10 @@ static int hw_breakpoint_handler(struct
 		return NOTIFY_DONE;
 
 	/*
-	 * Assert that local interrupts are disabled
 	 * Reset the DRn bits in the virtualized register value.
 	 * The ptrace trigger routine will add in whatever is needed.
 	 */
 	current->thread.debugreg6 &= ~DR_TRAP_BITS;
-	cpu = get_cpu();
 
 	/* Handle all the breakpoints that were triggered */
 	for (i = 0; i < HBP_NUM; ++i) {
@@ -525,7 +523,7 @@ static int hw_breakpoint_handler(struct
 		 */
 		rcu_read_lock();
 
-		bp = per_cpu(bp_per_reg[i], cpu);
+		bp = this_cpu_read(bp_per_reg[i]);
 		/*
 		 * Reset the 'i'th TRAP bit in dr6 to denote completion of
 		 * exception handling
@@ -560,8 +558,6 @@ static int hw_breakpoint_handler(struct
 	    (dr6 & (~DR_TRAP_BITS)))
 		rc = NOTIFY_DONE;
 
-	put_cpu();
-
 	return rc;
 }
 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [RFC][PATCH 12/13] x86/debug: Support negative polarity DR6 bits
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (10 preceding siblings ...)
  2020-09-02 13:26 ` [PATCH 11/13] x86/debug: Simplify hw_breakpoint_handler() Peter Zijlstra
@ 2020-09-02 13:26 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-02 13:26 ` [RFC][PATCH 13/13] x86/debug: Change thread.debugreg6 to thread.virtual_dr6 Peter Zijlstra
  2020-09-03 15:21 ` [PATCH 00/13] x86/debug: Untangle handle_debug() Daniel Thompson
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:26 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra, Andrew Cooper

DR6 has a whole bunch of bits that have negative polarity; they were
architecturally reserved and defined to be 1 and are now getting used.
Since they're 1 by default, 0 becomes the signal value.

Handle this by xor'ing the read DR6 value by the reserved mask, this
will flip them around such that 1 is the signal value (positive
polarity).

Current Linux doesn't yet support any of these bits, but there's two
defined:

 - DR6[11] Bus Lock Debug Exception		(ISEr39)
 - DR6[16] Restricted Transactional Memory	(SDM)

Update ptrace_{set,get}_debugreg() to provide/consume the value in
architectural polarity. Although afaict ptrace_set_debugreg(6) is
pointless, the value is not consumed anywhere.

Change hw_breakpoint_restore() to alway write the DR6_RESERVED value
to DR6, again, no consumer for that write.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/hw_breakpoint.c |    2 +-
 arch/x86/kernel/ptrace.c        |    4 ++--
 arch/x86/kernel/traps.c         |    5 ++---
 3 files changed, 5 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -500,7 +500,7 @@ void hw_breakpoint_restore(void)
 	set_debugreg(__this_cpu_read(cpu_debugreg[1]), 1);
 	set_debugreg(__this_cpu_read(cpu_debugreg[2]), 2);
 	set_debugreg(__this_cpu_read(cpu_debugreg[3]), 3);
-	set_debugreg(current->thread.debugreg6, 6);
+	set_debugreg(DR6_RESERVED, 6);
 	set_debugreg(__this_cpu_read(cpu_dr7), 7);
 }
 EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -601,7 +601,7 @@ static unsigned long ptrace_get_debugreg
 		if (bp)
 			val = bp->hw.info.address;
 	} else if (n == 6) {
-		val = thread->debugreg6;
+		val = thread->debugreg6 ^ DR6_RESERVED; /* Flip back to arch polarity */
 	} else if (n == 7) {
 		val = thread->ptrace_dr7;
 	}
@@ -657,7 +657,7 @@ static int ptrace_set_debugreg(struct ta
 	if (n < HBP_NUM) {
 		rc = ptrace_set_breakpoint_addr(tsk, n, val);
 	} else if (n == 6) {
-		thread->debugreg6 = val;
+		thread->debugreg6 = val ^ DR6_RESERVED; /* Flip to positive polarity */
 		rc = 0;
 	} else if (n == 7) {
 		rc = ptrace_write_dr7(tsk, val);
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -745,9 +745,8 @@ static __always_inline unsigned long deb
 	 * Keep it simple: clear DR6 immediately.
 	 */
 	get_debugreg(dr6, 6);
-	set_debugreg(0, 6);
-	/* Filter out all the reserved bits which are preset to 1 */
-	dr6 &= ~DR6_RESERVED;
+	set_debugreg(DR6_RESERVED, 6);
+	dr6 ^= DR6_RESERVED; /* Flip to positive polarity */
 
 	/*
 	 * The SDM says "The processor clears the BTF flag when it



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [RFC][PATCH 13/13] x86/debug: Change thread.debugreg6 to thread.virtual_dr6
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (11 preceding siblings ...)
  2020-09-02 13:26 ` [RFC][PATCH 12/13] x86/debug: Support negative polarity DR6 bits Peter Zijlstra
@ 2020-09-02 13:26 ` Peter Zijlstra
  2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
  2020-09-03 15:21 ` [PATCH 00/13] x86/debug: Untangle handle_debug() Daniel Thompson
  13 siblings, 1 reply; 34+ messages in thread
From: Peter Zijlstra @ 2020-09-02 13:26 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Peter Zijlstra

Current usage of thread.debugreg6 is convoluted at best. It starts
life as a copy of the hardware DR6 value, but then we clear and set
various bits.

Replace this with a new variable thread.virtual_dr6 that is
initialized to 0 when we read DR6 and only gains bits, at the same
time the actual (on stack) dr6 value we read from the hardware only
gets bits cleared.

Again, I'm not sure what the expected effect of
ptrace_{set,get}debugreg(6) would be.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/processor.h |    2 +-
 arch/x86/kernel/hw_breakpoint.c  |   12 +++---------
 arch/x86/kernel/kgdb.c           |    5 +++--
 arch/x86/kernel/ptrace.c         |    6 +++---
 arch/x86/kernel/traps.c          |   25 ++++++++++++++++---------
 5 files changed, 26 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -517,7 +517,7 @@ struct thread_struct {
 	/* Save middle states of ptrace breakpoints */
 	struct perf_event	*ptrace_bps[HBP_NUM];
 	/* Debug status used for traps, single steps, etc... */
-	unsigned long           debugreg6;
+	unsigned long           virtual_dr6;
 	/* Keep track of the exact dr7 value set by the user */
 	unsigned long           ptrace_dr7;
 	/* Fault info: */
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -454,7 +454,7 @@ void flush_ptrace_hw_breakpoint(struct t
 		t->ptrace_bps[i] = NULL;
 	}
 
-	t->debugreg6 = 0;
+	t->virtual_dr6 = 0;
 	t->ptrace_dr7 = 0;
 }
 
@@ -489,8 +489,8 @@ static int hw_breakpoint_handler(struct
 {
 	int i, rc = NOTIFY_STOP;
 	struct perf_event *bp;
-	unsigned long dr6;
 	unsigned long *dr6_p;
+	unsigned long dr6;
 
 	/* The DR6 value is pointed by args->err */
 	dr6_p = (unsigned long *)ERR_PTR(args->err);
@@ -504,12 +504,6 @@ static int hw_breakpoint_handler(struct
 	if ((dr6 & DR_TRAP_BITS) == 0)
 		return NOTIFY_DONE;
 
-	/*
-	 * Reset the DRn bits in the virtualized register value.
-	 * The ptrace trigger routine will add in whatever is needed.
-	 */
-	current->thread.debugreg6 &= ~DR_TRAP_BITS;
-
 	/* Handle all the breakpoints that were triggered */
 	for (i = 0; i < HBP_NUM; ++i) {
 		if (likely(!(dr6 & (DR_TRAP0 << i))))
@@ -554,7 +548,7 @@ static int hw_breakpoint_handler(struct
 	 * breakpoints (to generate signals) and b) when the system has
 	 * taken exception due to multiple causes
 	 */
-	if ((current->thread.debugreg6 & DR_TRAP_BITS) ||
+	if ((current->thread.virtual_dr6 & DR_TRAP_BITS) ||
 	    (dr6 & (~DR_TRAP_BITS)))
 		rc = NOTIFY_DONE;
 
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -629,9 +629,10 @@ static void kgdb_hw_overflow_handler(str
 	struct task_struct *tsk = current;
 	int i;
 
-	for (i = 0; i < 4; i++)
+	for (i = 0; i < 4; i++) {
 		if (breakinfo[i].enabled)
-			tsk->thread.debugreg6 |= (DR_TRAP0 << i);
+			tsk->thread.virtual_dr6 |= (DR_TRAP0 << i);
+	}
 }
 
 void kgdb_arch_late(void)
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -465,7 +465,7 @@ static void ptrace_triggered(struct perf
 			break;
 	}
 
-	thread->debugreg6 |= (DR_TRAP0 << i);
+	thread->virtual_dr6 |= (DR_TRAP0 << i);
 }
 
 /*
@@ -601,7 +601,7 @@ static unsigned long ptrace_get_debugreg
 		if (bp)
 			val = bp->hw.info.address;
 	} else if (n == 6) {
-		val = thread->debugreg6 ^ DR6_RESERVED; /* Flip back to arch polarity */
+		val = thread->virtual_dr6 ^ DR6_RESERVED; /* Flip back to arch polarity */
 	} else if (n == 7) {
 		val = thread->ptrace_dr7;
 	}
@@ -657,7 +657,7 @@ static int ptrace_set_debugreg(struct ta
 	if (n < HBP_NUM) {
 		rc = ptrace_set_breakpoint_addr(tsk, n, val);
 	} else if (n == 6) {
-		thread->debugreg6 = val ^ DR6_RESERVED; /* Flip to positive polarity */
+		thread->virtual_dr6 = val ^ DR6_RESERVED; /* Flip to positive polarity */
 		rc = 0;
 	} else if (n == 7) {
 		rc = ptrace_write_dr7(tsk, val);
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -749,6 +749,12 @@ static __always_inline unsigned long deb
 	dr6 ^= DR6_RESERVED; /* Flip to positive polarity */
 
 	/*
+	 * Clear the virtual DR6 value, ptrace routines will set bits here for
+	 * things we want signals for.
+	 */
+	current->thread.virtual_dr6 = 0;
+
+	/*
 	 * The SDM says "The processor clears the BTF flag when it
 	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
 	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
@@ -785,17 +791,16 @@ static __always_inline unsigned long deb
 
 static bool notify_debug(struct pt_regs *regs, unsigned long *dr6)
 {
-	struct task_struct *tsk = current;
-
-	/* Store the virtualized DR6 value */
-	tsk->thread.debugreg6 = *dr6;
-
+	/*
+	 * Notifiers will clear bits in @dr6 to indicate the event has been
+	 * consumed - hw_breakpoint_handler(), single_stop_cont().
+	 *
+	 * Notifiers will set bits in @virtual_dr6 to indicate the desire
+	 * for signals - ptrace_triggered(), kgdb_hw_overflow_handler().
+	 */
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)dr6, 0, SIGTRAP) == NOTIFY_STOP)
 		return true;
 
-	/* Reload the DR6 value, the notifier might have changed it */
-	*dr6 = tsk->thread.debugreg6;
-
 	return false;
 }
 
@@ -853,7 +858,7 @@ static __always_inline void exc_debug_ke
 	 * A known way to trigger this is through QEMU's GDB stub,
 	 * which leaks #DB into the guest and causes IST recursion.
 	 */
-	if (WARN_ON_ONCE(current->thread.debugreg6 & DR_STEP))
+	if (WARN_ON_ONCE(dr6 & DR_STEP))
 		regs->flags &= ~X86_EFLAGS_TF;
 out:
 	instrumentation_end();
@@ -903,6 +908,8 @@ static __always_inline void exc_debug_us
 		goto out_irq;
 	}
 
+	/* Add the virtual_dr6 bits for signals. */
+	dr6 |= current->thread.virtual_dr6;
 	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
 		send_sigtrap(regs, 0, get_si_code(dr6));
 



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/13] x86/entry: Fix AC assertion
  2020-09-02 13:25 ` [PATCH 01/13] x86/entry: Fix AC assertion Peter Zijlstra
@ 2020-09-02 15:58   ` Brian Gerst
  2020-09-02 16:24     ` Jürgen Groß
  2020-09-02 16:26     ` Andrew Cooper
  2020-09-04 13:16   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  1 sibling, 2 replies; 34+ messages in thread
From: Brian Gerst @ 2020-09-02 15:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Kyle Huey,
	Alexandre Chartre, Robert O'Callahan, Paul E. McKenney,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Andrew Cooper

On Wed, Sep 2, 2020 at 9:38 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> From: Peter Zijlstra <peterz@infradead.org>
>
> The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
> improve user entry sanity checks") unconditionally triggers on my IVB
> machine because it does not support SMAP.
>
> For !SMAP hardware we patch out CLAC/STAC instructions and thus if
> userspace sets AC, we'll still have it set after entry.
>
> Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Acked-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/entry-common.h |   11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> --- a/arch/x86/include/asm/entry-common.h
> +++ b/arch/x86/include/asm/entry-common.h
> @@ -18,8 +18,16 @@ static __always_inline void arch_check_u
>                  * state, not the interrupt state as imagined by Xen.
>                  */
>                 unsigned long flags = native_save_fl();
> -               WARN_ON_ONCE(flags & (X86_EFLAGS_AC | X86_EFLAGS_DF |
> -                                     X86_EFLAGS_NT));
> +               unsigned long mask = X86_EFLAGS_DF | X86_EFLAGS_NT;
> +
> +               /*
> +                * For !SMAP hardware we patch out CLAC on entry.
> +                */
> +               if (boot_cpu_has(X86_FEATURE_SMAP) ||
> +                   (IS_ENABLED(CONFIG_64_BIT) && boot_cpu_has(X86_FEATURE_XENPV)))
> +                       mask |= X86_EFLAGS_AC;

Is the explicit Xen check necessary?  IIRC the Xen hypervisor will
filter out the SMAP bit in the cpuid pvop.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/13] x86/entry: Fix AC assertion
  2020-09-02 15:58   ` Brian Gerst
@ 2020-09-02 16:24     ` Jürgen Groß
  2020-09-02 16:31       ` peterz
  2020-09-02 16:26     ` Andrew Cooper
  1 sibling, 1 reply; 34+ messages in thread
From: Jürgen Groß @ 2020-09-02 16:24 UTC (permalink / raw)
  To: Brian Gerst, Peter Zijlstra
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Kyle Huey,
	Alexandre Chartre, Robert O'Callahan, Paul E. McKenney,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Andrew Cooper

On 02.09.20 17:58, Brian Gerst wrote:
> On Wed, Sep 2, 2020 at 9:38 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> From: Peter Zijlstra <peterz@infradead.org>
>>
>> The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
>> improve user entry sanity checks") unconditionally triggers on my IVB
>> machine because it does not support SMAP.
>>
>> For !SMAP hardware we patch out CLAC/STAC instructions and thus if
>> userspace sets AC, we'll still have it set after entry.
>>
>> Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Acked-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>   arch/x86/include/asm/entry-common.h |   11 +++++++++--
>>   1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> --- a/arch/x86/include/asm/entry-common.h
>> +++ b/arch/x86/include/asm/entry-common.h
>> @@ -18,8 +18,16 @@ static __always_inline void arch_check_u
>>                   * state, not the interrupt state as imagined by Xen.
>>                   */
>>                  unsigned long flags = native_save_fl();
>> -               WARN_ON_ONCE(flags & (X86_EFLAGS_AC | X86_EFLAGS_DF |
>> -                                     X86_EFLAGS_NT));
>> +               unsigned long mask = X86_EFLAGS_DF | X86_EFLAGS_NT;
>> +
>> +               /*
>> +                * For !SMAP hardware we patch out CLAC on entry.
>> +                */
>> +               if (boot_cpu_has(X86_FEATURE_SMAP) ||
>> +                   (IS_ENABLED(CONFIG_64_BIT) && boot_cpu_has(X86_FEATURE_XENPV)))
>> +                       mask |= X86_EFLAGS_AC;
> 
> Is the explicit Xen check necessary?  IIRC the Xen hypervisor will
> filter out the SMAP bit in the cpuid pvop.

Right, and this test will nevertheless result in setting AC in the mask.
IIRC this was the main objective here.


Juergen


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/13] x86/entry: Fix AC assertion
  2020-09-02 15:58   ` Brian Gerst
  2020-09-02 16:24     ` Jürgen Groß
@ 2020-09-02 16:26     ` Andrew Cooper
  1 sibling, 0 replies; 34+ messages in thread
From: Andrew Cooper @ 2020-09-02 16:26 UTC (permalink / raw)
  To: Brian Gerst, Peter Zijlstra
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Kyle Huey,
	Alexandre Chartre, Robert O'Callahan, Paul E. McKenney,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Juergen Gross, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson

On 02/09/2020 16:58, Brian Gerst wrote:
> On Wed, Sep 2, 2020 at 9:38 AM Peter Zijlstra <peterz@infradead.org> wrote:
>> From: Peter Zijlstra <peterz@infradead.org>
>>
>> The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
>> improve user entry sanity checks") unconditionally triggers on my IVB
>> machine because it does not support SMAP.
>>
>> For !SMAP hardware we patch out CLAC/STAC instructions and thus if
>> userspace sets AC, we'll still have it set after entry.
>>
>> Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Acked-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>  arch/x86/include/asm/entry-common.h |   11 +++++++++--
>>  1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> --- a/arch/x86/include/asm/entry-common.h
>> +++ b/arch/x86/include/asm/entry-common.h
>> @@ -18,8 +18,16 @@ static __always_inline void arch_check_u
>>                  * state, not the interrupt state as imagined by Xen.
>>                  */
>>                 unsigned long flags = native_save_fl();
>> -               WARN_ON_ONCE(flags & (X86_EFLAGS_AC | X86_EFLAGS_DF |
>> -                                     X86_EFLAGS_NT));
>> +               unsigned long mask = X86_EFLAGS_DF | X86_EFLAGS_NT;
>> +
>> +               /*
>> +                * For !SMAP hardware we patch out CLAC on entry.
>> +                */
>> +               if (boot_cpu_has(X86_FEATURE_SMAP) ||
>> +                   (IS_ENABLED(CONFIG_64_BIT) && boot_cpu_has(X86_FEATURE_XENPV)))
>> +                       mask |= X86_EFLAGS_AC;
> Is the explicit Xen check necessary?  IIRC the Xen hypervisor will
> filter out the SMAP bit in the cpuid pvop.

The Xen check isn't anything to do with SMAP.

64bit PV guest kernels run in Ring3, so userspace's choice of AC for
real alignment check purposes needs to not leak into kernel context.

Xen's ABI for a user => kernel context switch should clear AC on behalf
of the kernel, but the fact still remains that if AC actually leaks into
context for whatever reason, stuff is going to break.

~Andrew

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/13] x86/entry: Fix AC assertion
  2020-09-02 16:24     ` Jürgen Groß
@ 2020-09-02 16:31       ` peterz
  2020-09-02 17:02         ` Brian Gerst
  0 siblings, 1 reply; 34+ messages in thread
From: peterz @ 2020-09-02 16:31 UTC (permalink / raw)
  To: Jürgen Groß
  Cc: Brian Gerst, the arch/x86 maintainers, Linux Kernel Mailing List,
	Kyle Huey, Alexandre Chartre, Robert O'Callahan,
	Paul E. McKenney, Frederic Weisbecker, Paolo Bonzini,
	Sean Christopherson, Masami Hiramatsu, Petr Mladek,
	Steven Rostedt, Joel Fernandes, Boris Ostrovsky, Andy Lutomirski,
	Josh Poimboeuf, Daniel Thompson, Andrew Cooper

On Wed, Sep 02, 2020 at 06:24:27PM +0200, Jürgen Groß wrote:
> On 02.09.20 17:58, Brian Gerst wrote:
> > On Wed, Sep 2, 2020 at 9:38 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > 
> > > From: Peter Zijlstra <peterz@infradead.org>
> > > 
> > > The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
> > > improve user entry sanity checks") unconditionally triggers on my IVB
> > > machine because it does not support SMAP.
> > > 
> > > For !SMAP hardware we patch out CLAC/STAC instructions and thus if
> > > userspace sets AC, we'll still have it set after entry.
> > > 
> > > Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > Acked-by: Andy Lutomirski <luto@kernel.org>
> > > ---
> > >   arch/x86/include/asm/entry-common.h |   11 +++++++++--
> > >   1 file changed, 9 insertions(+), 2 deletions(-)
> > > 
> > > --- a/arch/x86/include/asm/entry-common.h
> > > +++ b/arch/x86/include/asm/entry-common.h
> > > @@ -18,8 +18,16 @@ static __always_inline void arch_check_u
> > >                   * state, not the interrupt state as imagined by Xen.
> > >                   */
> > >                  unsigned long flags = native_save_fl();
> > > -               WARN_ON_ONCE(flags & (X86_EFLAGS_AC | X86_EFLAGS_DF |
> > > -                                     X86_EFLAGS_NT));
> > > +               unsigned long mask = X86_EFLAGS_DF | X86_EFLAGS_NT;
> > > +
> > > +               /*
> > > +                * For !SMAP hardware we patch out CLAC on entry.
> > > +                */
> > > +               if (boot_cpu_has(X86_FEATURE_SMAP) ||
> > > +                   (IS_ENABLED(CONFIG_64_BIT) && boot_cpu_has(X86_FEATURE_XENPV)))
> > > +                       mask |= X86_EFLAGS_AC;
> > 
> > Is the explicit Xen check necessary?  IIRC the Xen hypervisor will
> > filter out the SMAP bit in the cpuid pvop.
> 
> Right, and this test will nevertheless result in setting AC in the mask.
> IIRC this was the main objective here.

Correct, this asserts that 64bit Xen-PV will never have AC set; it had
better not have it set since it runs in ring 3.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/13] x86/entry: Fix AC assertion
  2020-09-02 16:31       ` peterz
@ 2020-09-02 17:02         ` Brian Gerst
  0 siblings, 0 replies; 34+ messages in thread
From: Brian Gerst @ 2020-09-02 17:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jürgen Groß,
	the arch/x86 maintainers, Linux Kernel Mailing List, Kyle Huey,
	Alexandre Chartre, Robert O'Callahan, Paul E. McKenney,
	Frederic Weisbecker, Paolo Bonzini, Sean Christopherson,
	Masami Hiramatsu, Petr Mladek, Steven Rostedt, Joel Fernandes,
	Boris Ostrovsky, Andy Lutomirski, Josh Poimboeuf,
	Daniel Thompson, Andrew Cooper

On Wed, Sep 2, 2020 at 12:31 PM <peterz@infradead.org> wrote:
>
> On Wed, Sep 02, 2020 at 06:24:27PM +0200, Jürgen Groß wrote:
> > On 02.09.20 17:58, Brian Gerst wrote:
> > > On Wed, Sep 2, 2020 at 9:38 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > From: Peter Zijlstra <peterz@infradead.org>
> > > >
> > > > The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
> > > > improve user entry sanity checks") unconditionally triggers on my IVB
> > > > machine because it does not support SMAP.
> > > >
> > > > For !SMAP hardware we patch out CLAC/STAC instructions and thus if
> > > > userspace sets AC, we'll still have it set after entry.
> > > >
> > > > Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
> > > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > > Acked-by: Andy Lutomirski <luto@kernel.org>
> > > > ---
> > > >   arch/x86/include/asm/entry-common.h |   11 +++++++++--
> > > >   1 file changed, 9 insertions(+), 2 deletions(-)
> > > >
> > > > --- a/arch/x86/include/asm/entry-common.h
> > > > +++ b/arch/x86/include/asm/entry-common.h
> > > > @@ -18,8 +18,16 @@ static __always_inline void arch_check_u
> > > >                   * state, not the interrupt state as imagined by Xen.
> > > >                   */
> > > >                  unsigned long flags = native_save_fl();
> > > > -               WARN_ON_ONCE(flags & (X86_EFLAGS_AC | X86_EFLAGS_DF |
> > > > -                                     X86_EFLAGS_NT));
> > > > +               unsigned long mask = X86_EFLAGS_DF | X86_EFLAGS_NT;
> > > > +
> > > > +               /*
> > > > +                * For !SMAP hardware we patch out CLAC on entry.
> > > > +                */
> > > > +               if (boot_cpu_has(X86_FEATURE_SMAP) ||
> > > > +                   (IS_ENABLED(CONFIG_64_BIT) && boot_cpu_has(X86_FEATURE_XENPV)))
> > > > +                       mask |= X86_EFLAGS_AC;
> > >
> > > Is the explicit Xen check necessary?  IIRC the Xen hypervisor will
> > > filter out the SMAP bit in the cpuid pvop.
> >
> > Right, and this test will nevertheless result in setting AC in the mask.
> > IIRC this was the main objective here.
>
> Correct, this asserts that 64bit Xen-PV will never have AC set; it had
> better not have it set since it runs in ring 3.

Ok.  That should be added to the comment to avoid confusion.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 02/13] x86/debug: Allow a single level of #DB recursion
  2020-09-02 13:25 ` [PATCH 02/13] x86/debug: Allow a single level of #DB recursion Peter Zijlstra
@ 2020-09-02 23:59   ` Sasha Levin
  2020-09-03 16:12   ` Josh Poimboeuf
  1 sibling, 0 replies; 34+ messages in thread
From: Sasha Levin @ 2020-09-02 23:59 UTC (permalink / raw)
  To: Sasha Levin, Peter Zijlstra, Andy Lutomirski, x86
  Cc: linux-kernel, Kyle Huey, stable, stable

Hi

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag
fixing commit: 9f58fdde95c9 ("x86/db: Split out dr6/7 handling").

The bot has tested the following trees: v5.8.5.

v5.8.5: Failed to apply! Possible dependencies:
    0b085e68f407 ("x86/entry: Consolidate 32/64 bit syscall entry")
    27d6b4d14f5c ("x86/entry: Use generic syscall entry function")
    517e499227be ("x86/entry: Cleanup idtentry_entry/exit_user")
    8d5ea35c5e91 ("x86/entry: Consolidate check_user_regs()")
    a377ac1cd9d7 ("x86/entry: Move user return notifier out of loop")
    b037b09b9058 ("x86/entry: Rename idtentry_enter/exit_cond_rcu() to idtentry_enter/exit()")
    ba1f2b2eaa2a ("x86/entry: Fix NMI vs IRQ state tracking")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- 
Thanks
Sasha

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 00/13] x86/debug: Untangle handle_debug()
  2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
                   ` (12 preceding siblings ...)
  2020-09-02 13:26 ` [RFC][PATCH 13/13] x86/debug: Change thread.debugreg6 to thread.virtual_dr6 Peter Zijlstra
@ 2020-09-03 15:21 ` Daniel Thompson
  13 siblings, 0 replies; 34+ messages in thread
From: Daniel Thompson @ 2020-09-03 15:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Josh Poimboeuf

On Wed, Sep 02, 2020 at 03:25:49PM +0200, Peter Zijlstra wrote:
> Hi,
> 
> The first two patches probably ought to go in x86/urgent, the rest (!RFC) can
> go into x86/core and wait a bit.
> 
> handle_debug() is a mess, and now that we have separate user and kernel paths,
> try and clean it up a bit.
> 
> There's two RFC patches at the end that impact the ptrace_{get,set}_debugreg(6)
> ABI, I've no idea what, if anything, is expected of that or if anybody actually
> cares about that. If I read the code correctly nothing actually consumes the
> value from ptrace_set_debugreg(6).

I applied this both with and without the RFC patches and pointed the
(still work-in-progress) kgdb test suite at it. I won't pretend the
suite is comprehensive but nevertheless FWIW:
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>


Daniel.


> 
> Kyle, you seem to be pushing all this to the edge with RR, any clues?
> 
> Since v2:
> 
>  - fixed (user) INT1 / icebp detection
>  - some further cleanups
>  - two additional RFC patches
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 02/13] x86/debug: Allow a single level of #DB recursion
  2020-09-02 13:25 ` [PATCH 02/13] x86/debug: Allow a single level of #DB recursion Peter Zijlstra
  2020-09-02 23:59   ` Sasha Levin
@ 2020-09-03 16:12   ` Josh Poimboeuf
  1 sibling, 0 replies; 34+ messages in thread
From: Josh Poimboeuf @ 2020-09-03 16:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, Kyle Huey, Alexandre Chartre,
	Robert O'Callahan, Paul E. McKenney, Frederic Weisbecker,
	Paolo Bonzini, Sean Christopherson, Masami Hiramatsu,
	Petr Mladek, Steven Rostedt, Joel Fernandes, Boris Ostrovsky,
	Juergen Gross, Brian Gerst, Andy Lutomirski, Daniel Thompson,
	stable

On Wed, Sep 02, 2020 at 03:25:51PM +0200, Peter Zijlstra wrote:
> @@ -863,6 +849,18 @@ static void handle_debug(struct pt_regs
>  static __always_inline void exc_debug_kernel(struct pt_regs *regs,
>  					     unsigned long dr6)
>  {
> +	/*
> +	 * Disable breakpoints during exception handling; recursive exceptions
> +	 * are exceedingly 'fun'.
> +	 *
> +	 * Since this function is NOKPROBE, and that also applies to
> +	 * HW_BREAKPOINT_X, we can't hit a breakpoint before this (XXX except a
> +	 * HW_BREAKPOINT_W on our stack)
> +	 *
> +	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
> +	 * includes the entry stack is excluded for everything.
> +	 */

I know this comment was copy/pasted, but I had to stare at the last
paragraph like one of those 3D paintings they used to have at the mall.

Recommended rewording:

	 * HW_BREAKPOINT_X is disallowed for entry text; all breakpoints
	 * are disallowed for cpu_entry_area (which includes the entry
	 * stack).

-- 
Josh


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [tip: x86/urgent] x86/entry: Fix AC assertion
  2020-09-02 13:25 ` [PATCH 01/13] x86/entry: Fix AC assertion Peter Zijlstra
  2020-09-02 15:58   ` Brian Gerst
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Thomas Gleixner, Daniel Thompson, Andy Lutomirski, stable, x86,
	LKML

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     662a0221893a3d58aa72719671844264306f6e4b
Gitweb:        https://git.kernel.org/tip/662a0221893a3d58aa72719671844264306f6e4b
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:50 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:09:29 +02:00

x86/entry: Fix AC assertion

The WARN added in commit 3c73b81a9164 ("x86/entry, selftests: Further
improve user entry sanity checks") unconditionally triggers on a IVB
machine because it does not support SMAP.

For !SMAP hardware the CLAC/STAC instructions are patched out and thus if
userspace sets AC, it is still have set after entry.

Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200902133200.666781610@infradead.org

---
 arch/x86/include/asm/entry-common.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index a8f9315..6fe54b2 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -18,8 +18,16 @@ static __always_inline void arch_check_user_regs(struct pt_regs *regs)
 		 * state, not the interrupt state as imagined by Xen.
 		 */
 		unsigned long flags = native_save_fl();
-		WARN_ON_ONCE(flags & (X86_EFLAGS_AC | X86_EFLAGS_DF |
-				      X86_EFLAGS_NT));
+		unsigned long mask = X86_EFLAGS_DF | X86_EFLAGS_NT;
+
+		/*
+		 * For !SMAP hardware we patch out CLAC on entry.
+		 */
+		if (boot_cpu_has(X86_FEATURE_SMAP) ||
+		    (IS_ENABLED(CONFIG_64_BIT) && boot_cpu_has(X86_FEATURE_XENPV)))
+			mask |= X86_EFLAGS_AC;
+
+		WARN_ON_ONCE(flags & mask);
 
 		/* We think we came from user mode. Make sure pt_regs agrees. */
 		WARN_ON_ONCE(!user_mode(regs));

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Change thread.debugreg6 to thread.virtual_dr6
  2020-09-02 13:26 ` [RFC][PATCH 13/13] x86/debug: Change thread.debugreg6 to thread.virtual_dr6 Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Peter Zijlstra (Intel),
	Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     d53d9bc0cf783e93b374de3895145c7375e570ba
Gitweb:        https://git.kernel.org/tip/d53d9bc0cf783e93b374de3895145c7375e570ba
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:26:02 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:58 +02:00

x86/debug: Change thread.debugreg6 to thread.virtual_dr6

Current usage of thread.debugreg6 is convoluted at best. It starts life as
a copy of the hardware DR6 value, but then various bits are cleared and
set.

Replace this with a new variable thread.virtual_dr6 that is initialized to
0 when DR6 is read and only gains bits, at the same time the actual (on
stack) dr6 value which is read from the hardware only gets bits cleared.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.415372940@infradead.org

---
 arch/x86/include/asm/processor.h |  2 +-
 arch/x86/kernel/hw_breakpoint.c  | 12 +++---------
 arch/x86/kernel/kgdb.c           |  5 +++--
 arch/x86/kernel/ptrace.c         |  6 +++---
 arch/x86/kernel/traps.c          | 25 ++++++++++++++++---------
 5 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 97143d8..d8a82e6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -517,7 +517,7 @@ struct thread_struct {
 	/* Save middle states of ptrace breakpoints */
 	struct perf_event	*ptrace_bps[HBP_NUM];
 	/* Debug status used for traps, single steps, etc... */
-	unsigned long           debugreg6;
+	unsigned long           virtual_dr6;
 	/* Keep track of the exact dr7 value set by the user */
 	unsigned long           ptrace_dr7;
 	/* Fault info: */
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index d17a1da..03aa33b 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -454,7 +454,7 @@ void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
 		t->ptrace_bps[i] = NULL;
 	}
 
-	t->debugreg6 = 0;
+	t->virtual_dr6 = 0;
 	t->ptrace_dr7 = 0;
 }
 
@@ -489,8 +489,8 @@ static int hw_breakpoint_handler(struct die_args *args)
 {
 	int i, rc = NOTIFY_STOP;
 	struct perf_event *bp;
-	unsigned long dr6;
 	unsigned long *dr6_p;
+	unsigned long dr6;
 
 	/* The DR6 value is pointed by args->err */
 	dr6_p = (unsigned long *)ERR_PTR(args->err);
@@ -504,12 +504,6 @@ static int hw_breakpoint_handler(struct die_args *args)
 	if ((dr6 & DR_TRAP_BITS) == 0)
 		return NOTIFY_DONE;
 
-	/*
-	 * Reset the DRn bits in the virtualized register value.
-	 * The ptrace trigger routine will add in whatever is needed.
-	 */
-	current->thread.debugreg6 &= ~DR_TRAP_BITS;
-
 	/* Handle all the breakpoints that were triggered */
 	for (i = 0; i < HBP_NUM; ++i) {
 		if (likely(!(dr6 & (DR_TRAP0 << i))))
@@ -554,7 +548,7 @@ static int hw_breakpoint_handler(struct die_args *args)
 	 * breakpoints (to generate signals) and b) when the system has
 	 * taken exception due to multiple causes
 	 */
-	if ((current->thread.debugreg6 & DR_TRAP_BITS) ||
+	if ((current->thread.virtual_dr6 & DR_TRAP_BITS) ||
 	    (dr6 & (~DR_TRAP_BITS)))
 		rc = NOTIFY_DONE;
 
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index c2f02f3..ff7878d 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -629,9 +629,10 @@ static void kgdb_hw_overflow_handler(struct perf_event *event,
 	struct task_struct *tsk = current;
 	int i;
 
-	for (i = 0; i < 4; i++)
+	for (i = 0; i < 4; i++) {
 		if (breakinfo[i].enabled)
-			tsk->thread.debugreg6 |= (DR_TRAP0 << i);
+			tsk->thread.virtual_dr6 |= (DR_TRAP0 << i);
+	}
 }
 
 void kgdb_arch_late(void)
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 5f98289..bedca01 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -465,7 +465,7 @@ static void ptrace_triggered(struct perf_event *bp,
 			break;
 	}
 
-	thread->debugreg6 |= (DR_TRAP0 << i);
+	thread->virtual_dr6 |= (DR_TRAP0 << i);
 }
 
 /*
@@ -601,7 +601,7 @@ static unsigned long ptrace_get_debugreg(struct task_struct *tsk, int n)
 		if (bp)
 			val = bp->hw.info.address;
 	} else if (n == 6) {
-		val = thread->debugreg6 ^ DR6_RESERVED; /* Flip back to arch polarity */
+		val = thread->virtual_dr6 ^ DR6_RESERVED; /* Flip back to arch polarity */
 	} else if (n == 7) {
 		val = thread->ptrace_dr7;
 	}
@@ -657,7 +657,7 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n,
 	if (n < HBP_NUM) {
 		rc = ptrace_set_breakpoint_addr(tsk, n, val);
 	} else if (n == 6) {
-		thread->debugreg6 = val ^ DR6_RESERVED; /* Flip to positive polarity */
+		thread->virtual_dr6 = val ^ DR6_RESERVED; /* Flip to positive polarity */
 		rc = 0;
 	} else if (n == 7) {
 		rc = ptrace_write_dr7(tsk, val);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 114515b..df9c655 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -749,6 +749,12 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
 	dr6 ^= DR6_RESERVED; /* Flip to positive polarity */
 
 	/*
+	 * Clear the virtual DR6 value, ptrace routines will set bits here for
+	 * things we want signals for.
+	 */
+	current->thread.virtual_dr6 = 0;
+
+	/*
 	 * The SDM says "The processor clears the BTF flag when it
 	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
 	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
@@ -785,17 +791,16 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
 
 static bool notify_debug(struct pt_regs *regs, unsigned long *dr6)
 {
-	struct task_struct *tsk = current;
-
-	/* Store the virtualized DR6 value */
-	tsk->thread.debugreg6 = *dr6;
-
+	/*
+	 * Notifiers will clear bits in @dr6 to indicate the event has been
+	 * consumed - hw_breakpoint_handler(), single_stop_cont().
+	 *
+	 * Notifiers will set bits in @virtual_dr6 to indicate the desire
+	 * for signals - ptrace_triggered(), kgdb_hw_overflow_handler().
+	 */
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)dr6, 0, SIGTRAP) == NOTIFY_STOP)
 		return true;
 
-	/* Reload the DR6 value, the notifier might have changed it */
-	*dr6 = tsk->thread.debugreg6;
-
 	return false;
 }
 
@@ -853,7 +858,7 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	 * A known way to trigger this is through QEMU's GDB stub,
 	 * which leaks #DB into the guest and causes IST recursion.
 	 */
-	if (WARN_ON_ONCE(current->thread.debugreg6 & DR_STEP))
+	if (WARN_ON_ONCE(dr6 & DR_STEP))
 		regs->flags &= ~X86_EFLAGS_TF;
 out:
 	instrumentation_end();
@@ -903,6 +908,8 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
 		goto out_irq;
 	}
 
+	/* Add the virtual_dr6 bits for signals. */
+	dr6 |= current->thread.virtual_dr6;
 	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
 		send_sigtrap(regs, 0, get_si_code(dr6));
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Support negative polarity DR6 bits
  2020-09-02 13:26 ` [RFC][PATCH 12/13] x86/debug: Support negative polarity DR6 bits Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andrew Cooper, Peter Zijlstra (Intel),
	Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     f4956cf83ed12271bdbd5b547f3378add72bbffb
Gitweb:        https://git.kernel.org/tip/f4956cf83ed12271bdbd5b547f3378add72bbffb
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:26:01 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:57 +02:00

x86/debug: Support negative polarity DR6 bits

DR6 has a whole bunch of bits that have negative polarity; they were
architecturally reserved and defined to be 1 and are now getting used.
Since they're 1 by default, 0 becomes the signal value.

Handle this by xor'ing the read DR6 value by the reserved mask, this
will flip them around such that 1 is the signal value (positive
polarity).

Current Linux doesn't yet support any of these bits, but there's two
defined:

 - DR6[11] Bus Lock Debug Exception		(ISEr39)
 - DR6[16] Restricted Transactional Memory	(SDM)

Update ptrace_{set,get}_debugreg() to provide/consume the value in
architectural polarity. Although afaict ptrace_set_debugreg(6) is
pointless, the value is not consumed anywhere.

Change hw_breakpoint_restore() to alway write the DR6_RESERVED value
to DR6, again, no consumer for that write.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.354220797@infradead.org

---
 arch/x86/kernel/hw_breakpoint.c | 2 +-
 arch/x86/kernel/ptrace.c        | 4 ++--
 arch/x86/kernel/traps.c         | 5 ++---
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 7b7d9f2..d17a1da 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -464,7 +464,7 @@ void hw_breakpoint_restore(void)
 	set_debugreg(__this_cpu_read(cpu_debugreg[1]), 1);
 	set_debugreg(__this_cpu_read(cpu_debugreg[2]), 2);
 	set_debugreg(__this_cpu_read(cpu_debugreg[3]), 3);
-	set_debugreg(current->thread.debugreg6, 6);
+	set_debugreg(DR6_RESERVED, 6);
 	set_debugreg(__this_cpu_read(cpu_dr7), 7);
 }
 EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index e7537c5..5f98289 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -601,7 +601,7 @@ static unsigned long ptrace_get_debugreg(struct task_struct *tsk, int n)
 		if (bp)
 			val = bp->hw.info.address;
 	} else if (n == 6) {
-		val = thread->debugreg6;
+		val = thread->debugreg6 ^ DR6_RESERVED; /* Flip back to arch polarity */
 	} else if (n == 7) {
 		val = thread->ptrace_dr7;
 	}
@@ -657,7 +657,7 @@ static int ptrace_set_debugreg(struct task_struct *tsk, int n,
 	if (n < HBP_NUM) {
 		rc = ptrace_set_breakpoint_addr(tsk, n, val);
 	} else if (n == 6) {
-		thread->debugreg6 = val;
+		thread->debugreg6 = val ^ DR6_RESERVED; /* Flip to positive polarity */
 		rc = 0;
 	} else if (n == 7) {
 		rc = ptrace_write_dr7(tsk, val);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 1e89001..114515b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -745,9 +745,8 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
 	 * Keep it simple: clear DR6 immediately.
 	 */
 	get_debugreg(dr6, 6);
-	set_debugreg(0, 6);
-	/* Filter out all the reserved bits which are preset to 1 */
-	dr6 &= ~DR6_RESERVED;
+	set_debugreg(DR6_RESERVED, 6);
+	dr6 ^= DR6_RESERVED; /* Flip to positive polarity */
 
 	/*
 	 * The SDM says "The processor clears the BTF flag when it

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Remove aout_dump_debugregs()
  2020-09-02 13:25 ` [PATCH 10/13] x86/debug: Remove aout_dump_debugregs() Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     b84d42b6c6ac6a60519286e72b69f2dbf08dfb70
Gitweb:        https://git.kernel.org/tip/b84d42b6c6ac6a60519286e72b69f2dbf08dfb70
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:59 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:55 +02:00

x86/debug: Remove aout_dump_debugregs()

Unused remnants for the bit-bucket.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.233022474@infradead.org

---
 arch/x86/include/asm/debugreg.h |  2 +--
 arch/x86/kernel/hw_breakpoint.c | 36 +--------------------------------
 2 files changed, 38 deletions(-)

diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
index e89558a..cfdf307 100644
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -90,8 +90,6 @@ static __always_inline bool hw_breakpoint_active(void)
 	return __this_cpu_read(cpu_dr7) & DR_GLOBAL_ENABLE_MASK;
 }
 
-extern void aout_dump_debugregs(struct user *dump);
-
 extern void hw_breakpoint_restore(void);
 
 static __always_inline unsigned long local_db_save(void)
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index b98ff62..ca9de59 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -442,42 +442,6 @@ int hw_breakpoint_arch_parse(struct perf_event *bp,
 }
 
 /*
- * Dump the debug register contents to the user.
- * We can't dump our per cpu values because it
- * may contain cpu wide breakpoint, something that
- * doesn't belong to the current task.
- *
- * TODO: include non-ptrace user breakpoints (perf)
- */
-void aout_dump_debugregs(struct user *dump)
-{
-	int i;
-	int dr7 = 0;
-	struct perf_event *bp;
-	struct arch_hw_breakpoint *info;
-	struct thread_struct *thread = &current->thread;
-
-	for (i = 0; i < HBP_NUM; i++) {
-		bp = thread->ptrace_bps[i];
-
-		if (bp && !bp->attr.disabled) {
-			dump->u_debugreg[i] = bp->attr.bp_addr;
-			info = counter_arch_bp(bp);
-			dr7 |= encode_dr7(i, info->len, info->type);
-		} else {
-			dump->u_debugreg[i] = 0;
-		}
-	}
-
-	dump->u_debugreg[4] = 0;
-	dump->u_debugreg[5] = 0;
-	dump->u_debugreg[6] = current->thread.debugreg6;
-
-	dump->u_debugreg[7] = dr7;
-}
-EXPORT_SYMBOL_GPL(aout_dump_debugregs);
-
-/*
  * Release the user breakpoints used by ptrace
  */
 void flush_ptrace_hw_breakpoint(struct task_struct *tsk)

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Simplify hw_breakpoint_handler()
  2020-09-02 13:26 ` [PATCH 11/13] x86/debug: Simplify hw_breakpoint_handler() Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     21d44be7b6ff4c254dc971e2c99d4082dd470afd
Gitweb:        https://git.kernel.org/tip/21d44be7b6ff4c254dc971e2c99d4082dd470afd
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:26:00 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:56 +02:00

x86/debug: Simplify hw_breakpoint_handler()

This is called with interrupts disabled, there's no point in using
get_cpu() and per_cpu().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.292906672@infradead.org

---
 arch/x86/kernel/hw_breakpoint.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index ca9de59..7b7d9f2 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -487,7 +487,7 @@ EXPORT_SYMBOL_GPL(hw_breakpoint_restore);
  */
 static int hw_breakpoint_handler(struct die_args *args)
 {
-	int i, cpu, rc = NOTIFY_STOP;
+	int i, rc = NOTIFY_STOP;
 	struct perf_event *bp;
 	unsigned long dr6;
 	unsigned long *dr6_p;
@@ -505,12 +505,10 @@ static int hw_breakpoint_handler(struct die_args *args)
 		return NOTIFY_DONE;
 
 	/*
-	 * Assert that local interrupts are disabled
 	 * Reset the DRn bits in the virtualized register value.
 	 * The ptrace trigger routine will add in whatever is needed.
 	 */
 	current->thread.debugreg6 &= ~DR_TRAP_BITS;
-	cpu = get_cpu();
 
 	/* Handle all the breakpoints that were triggered */
 	for (i = 0; i < HBP_NUM; ++i) {
@@ -525,7 +523,7 @@ static int hw_breakpoint_handler(struct die_args *args)
 		 */
 		rcu_read_lock();
 
-		bp = per_cpu(bp_per_reg[i], cpu);
+		bp = this_cpu_read(bp_per_reg[i]);
 		/*
 		 * Reset the 'i'th TRAP bit in dr6 to denote completion of
 		 * exception handling
@@ -560,8 +558,6 @@ static int hw_breakpoint_handler(struct die_args *args)
 	    (dr6 & (~DR_TRAP_BITS)))
 		rc = NOTIFY_DONE;
 
-	put_cpu();
-
 	return rc;
 }
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Move cond_local_irq_enable() block into exc_debug_user()
  2020-09-02 13:25 ` [PATCH 08/13] x86/debug: Move cond_local_irq_enable() block into exc_debug_user() Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     f0b67c39c190e19bc1604a13bcc985c4445a4b2f
Gitweb:        https://git.kernel.org/tip/f0b67c39c190e19bc1604a13bcc985c4445a4b2f
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:57 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:53 +02:00

x86/debug: Move cond_local_irq_enable() block into exc_debug_user()

The cond_local_irq_enable() block, dealing with vm86 and sending
signals is only relevant for #DB-from-user, move it there.

This then reduces handle_debug() to only the notifier call, so rename
it to notify_debug().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.094265982@infradead.org

---
 arch/x86/kernel/traps.c | 58 ++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 2605686..682af24 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -783,17 +783,10 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
  *
  * May run on IST stack.
  */
-static bool handle_debug(struct pt_regs *regs, unsigned long *dr6)
+
+static bool notify_debug(struct pt_regs *regs, unsigned long *dr6)
 {
 	struct task_struct *tsk = current;
-	bool icebp;
-
-	/*
-	 * If dr6 has no reason to give us about the origin of this trap,
-	 * then it's very likely the result of an icebp/int01 trap.
-	 * User wants a sigtrap for that.
-	 */
-	icebp = !*dr6;
 
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = *dr6;
@@ -801,26 +794,9 @@ static bool handle_debug(struct pt_regs *regs, unsigned long *dr6)
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)dr6, 0, SIGTRAP) == NOTIFY_STOP)
 		return true;
 
-	/* It's safe to allow irq's after DR6 has been saved */
-	cond_local_irq_enable(regs);
-
-	if (v8086_mode(regs)) {
-		handle_vm86_trap((struct kernel_vm86_regs *) regs, 0,
-				 X86_TRAP_DB);
-		goto out;
-	}
-
-	/*
-	 * Reload dr6, the notifier might have changed it.
-	 */
+	/* Reload the DR6 value, the notifier might have changed it */
 	*dr6 = tsk->thread.debugreg6;
 
-	if (*dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
-		send_sigtrap(regs, 0, get_si_code(*dr6));
-
-out:
-	cond_local_irq_disable(regs);
-
 	return false;
 }
 
@@ -864,7 +840,7 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	if (!dr6)
 		goto out;
 
-	if (handle_debug(regs, &dr6))
+	if (notify_debug(regs, &dr6))
 		goto out;
 
 	if (WARN_ON_ONCE(dr6 & DR_STEP)) {
@@ -889,6 +865,8 @@ out:
 static __always_inline void exc_debug_user(struct pt_regs *regs,
 					   unsigned long dr6)
 {
+	bool icebp;
+
 	/*
 	 * If something gets miswired and we end up here for a kernel mode
 	 * #DB, we will malfunction.
@@ -907,8 +885,30 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
 	irqentry_enter_from_user_mode(regs);
 	instrumentation_begin();
 
-	handle_debug(regs, &dr6);
+	/*
+	 * If dr6 has no reason to give us about the origin of this trap,
+	 * then it's very likely the result of an icebp/int01 trap.
+	 * User wants a sigtrap for that.
+	 */
+	icebp = !dr6;
+
+	if (notify_debug(regs, &dr6))
+		goto out;
 
+	/* It's safe to allow irq's after DR6 has been saved */
+	local_irq_enable();
+
+	if (v8086_mode(regs)) {
+		handle_vm86_trap((struct kernel_vm86_regs *)regs, 0, X86_TRAP_DB);
+		goto out_irq;
+	}
+
+	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
+		send_sigtrap(regs, 0, get_si_code(dr6));
+
+out_irq:
+	local_irq_disable();
+out:
 	instrumentation_end();
 	irqentry_exit_to_user_mode(regs);
 }

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Move historical SYSENTER junk into exc_debug_kernel()
  2020-09-02 13:25 ` [PATCH 07/13] x86/debug: Move historical SYSENTER junk into exc_debug_kernel() Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     4eb5acc39187a7ba578fbb44f7bb1965057309ae
Gitweb:        https://git.kernel.org/tip/4eb5acc39187a7ba578fbb44f7bb1965057309ae
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:56 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:53 +02:00

x86/debug: Move historical SYSENTER junk into exc_debug_kernel()

The historical SYSENTER junk is explicitly for from-kernel, so move it
to the #DB-from-kernel handler.

It is ordered after the notifier, which is important for KGDB which uses TF
single-step and needs to consume the event before that point.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.031099736@infradead.org

---
 arch/x86/kernel/traps.c | 49 ++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 24e09f8..2605686 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -783,7 +783,7 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
  *
  * May run on IST stack.
  */
-static void handle_debug(struct pt_regs *regs, unsigned long dr6)
+static bool handle_debug(struct pt_regs *regs, unsigned long *dr6)
 {
 	struct task_struct *tsk = current;
 	bool icebp;
@@ -793,15 +793,13 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 	 * then it's very likely the result of an icebp/int01 trap.
 	 * User wants a sigtrap for that.
 	 */
-	icebp = !dr6;
+	icebp = !*dr6;
 
 	/* Store the virtualized DR6 value */
-	tsk->thread.debugreg6 = dr6;
+	tsk->thread.debugreg6 = *dr6;
 
-	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
-		       SIGTRAP) == NOTIFY_STOP) {
-		return;
-	}
+	if (notify_die(DIE_DEBUG, "debug", regs, (long)dr6, 0, SIGTRAP) == NOTIFY_STOP)
+		return true;
 
 	/* It's safe to allow irq's after DR6 has been saved */
 	cond_local_irq_enable(regs);
@@ -815,25 +813,15 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 	/*
 	 * Reload dr6, the notifier might have changed it.
 	 */
-	dr6 = tsk->thread.debugreg6;
+	*dr6 = tsk->thread.debugreg6;
 
-	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
-		/*
-		 * Historical junk that used to handle SYSENTER single-stepping.
-		 * This should be unreachable now.  If we survive for a while
-		 * without anyone hitting this warning, we'll turn this into
-		 * an oops.
-		 */
-		tsk->thread.debugreg6 &= ~DR_STEP;
-		set_tsk_thread_flag(tsk, TIF_SINGLESTEP);
-		regs->flags &= ~X86_EFLAGS_TF;
-	}
-
-	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
-		send_sigtrap(regs, 0, get_si_code(dr6));
+	if (*dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
+		send_sigtrap(regs, 0, get_si_code(*dr6));
 
 out:
 	cond_local_irq_disable(regs);
+
+	return false;
 }
 
 static __always_inline void exc_debug_kernel(struct pt_regs *regs,
@@ -876,7 +864,20 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	if (!dr6)
 		goto out;
 
-	handle_debug(regs, dr6);
+	if (handle_debug(regs, &dr6))
+		goto out;
+
+	if (WARN_ON_ONCE(dr6 & DR_STEP)) {
+		/*
+		 * Historical junk that used to handle SYSENTER single-stepping.
+		 * This should be unreachable now.  If we survive for a while
+		 * without anyone hitting this warning, we'll turn this into
+		 * an oops.
+		 */
+		dr6 &= ~DR_STEP;
+		set_thread_flag(TIF_SINGLESTEP);
+		regs->flags &= ~X86_EFLAGS_TF;
+	}
 
 out:
 	instrumentation_end();
@@ -906,7 +907,7 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
 	irqentry_enter_from_user_mode(regs);
 	instrumentation_begin();
 
-	handle_debug(regs, dr6);
+	handle_debug(regs, &dr6);
 
 	instrumentation_end();
 	irqentry_exit_to_user_mode(regs);

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Remove the historical junk
  2020-09-02 13:25 ` [PATCH 09/13] x86/debug: Remove the historical junk Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Brian Gerst, Andy Lutomirski, Peter Zijlstra (Intel),
	Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     389cd0cd8b3790b555c3679da946f4aa4fba3bab
Gitweb:        https://git.kernel.org/tip/389cd0cd8b3790b555c3679da946f4aa4fba3bab
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:58 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:55 +02:00

x86/debug: Remove the historical junk

Remove the historical junk and replace it with a WARN and a comment.

The problem is that even though the kernel only uses TF single-step in
kprobes and KGDB, both of which consume the event before this, QEMU/KVM has
bugs in this area that can trigger this state so it has to be dealt with.

Suggested-by: Brian Gerst <brgerst@gmail.com>
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133201.170216274@infradead.org

---
 arch/x86/kernel/traps.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 682af24..1e89001 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -843,18 +843,19 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	if (notify_debug(regs, &dr6))
 		goto out;
 
-	if (WARN_ON_ONCE(dr6 & DR_STEP)) {
-		/*
-		 * Historical junk that used to handle SYSENTER single-stepping.
-		 * This should be unreachable now.  If we survive for a while
-		 * without anyone hitting this warning, we'll turn this into
-		 * an oops.
-		 */
-		dr6 &= ~DR_STEP;
-		set_thread_flag(TIF_SINGLESTEP);
+	/*
+	 * The kernel doesn't use TF single-step outside of:
+	 *
+	 *  - Kprobes, consumed through kprobe_debug_handler()
+	 *  - KGDB, consumed through notify_debug()
+	 *
+	 * So if we get here with DR_STEP set, something is wonky.
+	 *
+	 * A known way to trigger this is through QEMU's GDB stub,
+	 * which leaks #DB into the guest and causes IST recursion.
+	 */
+	if (WARN_ON_ONCE(current->thread.debugreg6 & DR_STEP))
 		regs->flags &= ~X86_EFLAGS_TF;
-	}
-
 out:
 	instrumentation_end();
 	idtentry_exit_nmi(regs, irq_state);

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Remove handle_debug(.user) argument
  2020-09-02 13:25 ` [PATCH 05/13] x86/debug: Remove handle_debug(.user) argument Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Thomas Gleixner, Daniel Thompson, Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     7043679a989af969e9f20cc7d90195b36f54036f
Gitweb:        https://git.kernel.org/tip/7043679a989af969e9f20cc7d90195b36f54036f
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:54 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:52 +02:00

x86/debug: Remove handle_debug(.user) argument

The handle_debug(.user) argument is used to terminate the #DB handler early
for the INT1-from-kernel case, since the kernel doesn't use INT1.

Remove the argument and handle this explicitly in #DB-from-kernel.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20200902133200.907020598@infradead.org

---
 arch/x86/kernel/traps.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 9cb39d3..58bc434 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -783,25 +783,18 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
  *
  * May run on IST stack.
  */
-static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
+static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 {
 	struct task_struct *tsk = current;
 	bool user_icebp;
 	int si_code;
 
 	/*
-	 * If DR6 is zero, no point in trying to handle it. The kernel is
-	 * not using INT1.
-	 */
-	if (!user && !dr6)
-		return;
-
-	/*
 	 * If dr6 has no reason to give us about the origin of this trap,
 	 * then it's very likely the result of an icebp/int01 trap.
 	 * User wants a sigtrap for that.
 	 */
-	user_icebp = user && !dr6;
+	user_icebp = !dr6;
 
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
@@ -874,7 +867,13 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	if (kprobe_debug_handler(regs))
 		goto out;
 
-	handle_debug(regs, dr6, false);
+	/*
+	 * The kernel doesn't use INT1
+	 */
+	if (!dr6)
+		goto out;
+
+	handle_debug(regs, dr6);
 
 out:
 	instrumentation_end();
@@ -904,7 +903,7 @@ static __always_inline void exc_debug_user(struct pt_regs *regs,
 	irqentry_enter_from_user_mode(regs);
 	instrumentation_begin();
 
-	handle_debug(regs, dr6, true);
+	handle_debug(regs, dr6);
 
 	instrumentation_end();
 	irqentry_exit_to_user_mode(regs);

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Simplify #DB signal code
  2020-09-02 13:25 ` [PATCH 06/13] x86/debug: Simplify #DB signal code Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Thomas Gleixner, Daniel Thompson, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     4182e9436916a48f16207f0619562f1d3843a0c8
Gitweb:        https://git.kernel.org/tip/4182e9436916a48f16207f0619562f1d3843a0c8
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:55 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:52 +02:00

x86/debug: Simplify #DB signal code

There's no point in calculating si_code if it's not going to be used.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200902133200.967434217@infradead.org

---
 arch/x86/kernel/traps.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 58bc434..24e09f8 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -786,15 +786,14 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
 static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 {
 	struct task_struct *tsk = current;
-	bool user_icebp;
-	int si_code;
+	bool icebp;
 
 	/*
 	 * If dr6 has no reason to give us about the origin of this trap,
 	 * then it's very likely the result of an icebp/int01 trap.
 	 * User wants a sigtrap for that.
 	 */
-	user_icebp = !dr6;
+	icebp = !dr6;
 
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
@@ -813,6 +812,11 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 		goto out;
 	}
 
+	/*
+	 * Reload dr6, the notifier might have changed it.
+	 */
+	dr6 = tsk->thread.debugreg6;
+
 	if (WARN_ON_ONCE((dr6 & DR_STEP) && !user_mode(regs))) {
 		/*
 		 * Historical junk that used to handle SYSENTER single-stepping.
@@ -825,9 +829,8 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6)
 		regs->flags &= ~X86_EFLAGS_TF;
 	}
 
-	si_code = get_si_code(tsk->thread.debugreg6);
-	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS) || user_icebp)
-		send_sigtrap(regs, 0, si_code);
+	if (dr6 & (DR_STEP | DR_TRAP_BITS) || icebp)
+		send_sigtrap(regs, 0, get_si_code(dr6));
 
 out:
 	cond_local_irq_disable(regs);

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Move kprobe_debug_handler() into exc_debug_kernel()
  2020-09-02 13:25 ` [PATCH 04/13] x86/debug: Move kprobe_debug_handler() into exc_debug_kernel() Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Thomas Gleixner, Daniel Thompson, Masami Hiramatsu,
	Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     20a6e35a948284b8ab246ed35eefc56d674ad076
Gitweb:        https://git.kernel.org/tip/20a6e35a948284b8ab246ed35eefc56d674ad076
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:53 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:52 +02:00

x86/debug: Move kprobe_debug_handler() into exc_debug_kernel()

Kprobes are on kernel text, and thus only matter for #DB-from-kernel.
Kprobes are ordered before the generic notifier, preserve that order.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20200902133200.847465360@infradead.org

---
 arch/x86/include/asm/kprobes.h |  4 ++++
 arch/x86/kernel/traps.c        | 10 ++++------
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 143bc9a..991a7ad 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -106,5 +106,9 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
 extern int kprobe_int3_handler(struct pt_regs *regs);
 extern int kprobe_debug_handler(struct pt_regs *regs);
 
+#else
+
+static inline int kprobe_debug_handler(struct pt_regs *regs) { return 0; }
+
 #endif /* CONFIG_KPROBES */
 #endif /* _ASM_X86_KPROBES_H */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 9945642..9cb39d3 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -806,12 +806,6 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
 	/* Store the virtualized DR6 value */
 	tsk->thread.debugreg6 = dr6;
 
-#ifdef CONFIG_KPROBES
-	if (kprobe_debug_handler(regs)) {
-		return;
-	}
-#endif
-
 	if (notify_die(DIE_DEBUG, "debug", regs, (long)&dr6, 0,
 		       SIGTRAP) == NOTIFY_STOP) {
 		return;
@@ -877,8 +871,12 @@ static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 	if ((dr6 & DR_STEP) && is_sysenter_singlestep(regs))
 		dr6 &= ~DR_STEP;
 
+	if (kprobe_debug_handler(regs))
+		goto out;
+
 	handle_debug(regs, dr6, false);
 
+out:
 	instrumentation_end();
 	idtentry_exit_nmi(regs, irq_state);
 

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [tip: x86/entry] x86/debug: Sync BTF earlier
  2020-09-02 13:25 ` [PATCH 03/13] x86/debug: Sync BTF earlier Peter Zijlstra
@ 2020-09-04 13:16   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 34+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-09-04 13:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Thomas Gleixner, Daniel Thompson, Andy Lutomirski, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     c182487da1b5281463f2255d2347885dba219c08
Gitweb:        https://git.kernel.org/tip/c182487da1b5281463f2255d2347885dba219c08
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Sep 2020 15:25:52 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 04 Sep 2020 15:12:52 +02:00

x86/debug: Sync BTF earlier

Move the BTF sync near the DR6 load, as this will be the only common
code guaranteed to run on every #DB.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20200902133200.786888252@infradead.org

---
 arch/x86/kernel/traps.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 81a2fb7..9945642 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -749,6 +749,13 @@ static __always_inline unsigned long debug_read_clear_dr6(void)
 	/* Filter out all the reserved bits which are preset to 1 */
 	dr6 &= ~DR6_RESERVED;
 
+	/*
+	 * The SDM says "The processor clears the BTF flag when it
+	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
+	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
+	 */
+	clear_thread_flag(TIF_BLOCKSTEP);
+
 	return dr6;
 }
 
@@ -783,13 +790,6 @@ static void handle_debug(struct pt_regs *regs, unsigned long dr6, bool user)
 	int si_code;
 
 	/*
-	 * The SDM says "The processor clears the BTF flag when it
-	 * generates a debug exception."  Clear TIF_BLOCKSTEP to keep
-	 * TIF_BLOCKSTEP in sync with the hardware BTF flag.
-	 */
-	clear_thread_flag(TIF_BLOCKSTEP);
-
-	/*
 	 * If DR6 is zero, no point in trying to handle it. The kernel is
 	 * not using INT1.
 	 */

^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2020-09-04 13:21 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-02 13:25 [PATCH 00/13] x86/debug: Untangle handle_debug() Peter Zijlstra
2020-09-02 13:25 ` [PATCH 01/13] x86/entry: Fix AC assertion Peter Zijlstra
2020-09-02 15:58   ` Brian Gerst
2020-09-02 16:24     ` Jürgen Groß
2020-09-02 16:31       ` peterz
2020-09-02 17:02         ` Brian Gerst
2020-09-02 16:26     ` Andrew Cooper
2020-09-04 13:16   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 02/13] x86/debug: Allow a single level of #DB recursion Peter Zijlstra
2020-09-02 23:59   ` Sasha Levin
2020-09-03 16:12   ` Josh Poimboeuf
2020-09-02 13:25 ` [PATCH 03/13] x86/debug: Sync BTF earlier Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 04/13] x86/debug: Move kprobe_debug_handler() into exc_debug_kernel() Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 05/13] x86/debug: Remove handle_debug(.user) argument Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 06/13] x86/debug: Simplify #DB signal code Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 07/13] x86/debug: Move historical SYSENTER junk into exc_debug_kernel() Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 08/13] x86/debug: Move cond_local_irq_enable() block into exc_debug_user() Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 09/13] x86/debug: Remove the historical junk Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:25 ` [PATCH 10/13] x86/debug: Remove aout_dump_debugregs() Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:26 ` [PATCH 11/13] x86/debug: Simplify hw_breakpoint_handler() Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:26 ` [RFC][PATCH 12/13] x86/debug: Support negative polarity DR6 bits Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-02 13:26 ` [RFC][PATCH 13/13] x86/debug: Change thread.debugreg6 to thread.virtual_dr6 Peter Zijlstra
2020-09-04 13:16   ` [tip: x86/entry] " tip-bot2 for Peter Zijlstra
2020-09-03 15:21 ` [PATCH 00/13] x86/debug: Untangle handle_debug() Daniel Thompson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).