* [PATCH v2 0/6] x86/entry,lockdep: Improve IRQ state tracking
@ 2020-05-28 14:05 Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 1/6] x86/entry: Clarify irq_{enter,exit}_rcu() Peter Zijlstra
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-28 14:05 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy, peterz

Ahmed and Sebastian wanted additional lockdep_assert*() macros and ran
into header hell.

Move the IRQ state into per-cpu variables, removing the dependency on
task_struct, which is what generated the header hell.

Also fix IRQ state tracking to not be affected by lockdep_off() (it
really should not have been anyway) and extend IRQ state tracking
across (x86) NMIs.

The patches now survive a kbuild+perf-top load.

Changes since v1:

 - this_cpu_inc_return() fix
 - reordered patches
 - fixed up changelogs
 - fixed the actual NMI bits; we can't use IF on exit.



* [PATCH v2 1/6] x86/entry: Clarify irq_{enter,exit}_rcu()
  2020-05-28 14:05 [PATCH v2 0/6] x86/entry,lockdep: Improve IRQ state tracking Peter Zijlstra
@ 2020-05-28 14:05 ` Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 2/6] x86/entry: Rename trace_hardirqs_off_prepare() Peter Zijlstra
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-28 14:05 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy, peterz

Currently:

  irq_enter_rcu() includes lockdep_hardirq_enter()
  irq_exit_rcu() does *NOT* include lockdep_hardirq_exit()

This asymmetry resulted in two 'stray' lockdep_hardirq_exit() calls
in idtentry.h, and a long time spent hunting for the matching enter
calls. Move lockdep_hardirq_exit() into irq_exit_rcu() so that the
pair is symmetric again.
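
A minimal usage sketch of the resulting flow (the __handler name is
illustrative only):

  irq_enter_rcu();	/* includes lockdep_hardirq_enter() */
  __handler(regs);	/* the actual exception/interrupt work */
  irq_exit_rcu();	/* now ends with lockdep_hardirq_exit() */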

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/idtentry.h |    2 --
 kernel/softirq.c                |   19 +++++++++++++------
 2 files changed, 13 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -206,7 +206,6 @@ __visible noinstr void func(struct pt_re
 	kvm_set_cpu_l1tf_flush_l1d();					\
 	__##func (regs, (u8)error_code);				\
 	irq_exit_rcu();							\
-	lockdep_hardirq_exit();						\
 	instrumentation_end();						\
 	idtentry_exit_cond_rcu(regs, rcu_exit);				\
 }									\
@@ -249,7 +248,6 @@ __visible noinstr void func(struct pt_re
 	kvm_set_cpu_l1tf_flush_l1d();					\
 	run_on_irqstack_cond(__##func, regs, regs);			\
 	irq_exit_rcu();							\
-	lockdep_hardirq_exit();						\
 	instrumentation_end();						\
 	idtentry_exit_cond_rcu(regs, rcu_exit);				\
 }									\
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -404,12 +404,7 @@ static inline void tick_irq_exit(void)
 #endif
 }
 
-/**
- * irq_exit_rcu() - Exit an interrupt context without updating RCU
- *
- * Also processes softirqs if needed and possible.
- */
-void irq_exit_rcu(void)
+static inline void __irq_exit_rcu(void)
 {
 #ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
 	local_irq_disable();
@@ -425,6 +420,18 @@ void irq_exit_rcu(void)
 }
 
 /**
+ * irq_exit_rcu() - Exit an interrupt context without updating RCU
+ *
+ * Also processes softirqs if needed and possible.
+ */
+void irq_exit_rcu(void)
+{
+	__irq_exit_rcu();
+	 /* must be last! */
+	lockdep_hardirq_exit();
+}
+
+/**
  * irq_exit - Exit an interrupt context, update RCU and lockdep
  *
  * Also processes softirqs if needed and possible.




* [PATCH v2 2/6] x86/entry: Rename trace_hardirqs_off_prepare()
  2020-05-28 14:05 [PATCH v2 0/6] x86/entry,lockdep: Improve IRQ state tracking Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 1/6] x86/entry: Clarify irq_{enter,exit}_rcu() Peter Zijlstra
@ 2020-05-28 14:05 ` Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables Peter Zijlstra
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-28 14:05 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy, peterz

The typical pattern for trace_hardirqs_off_prepare() is:

  ENTRY
    lockdep_hardirqs_off(); // because hardware
    ... do entry magic
    instrumentation_begin();
    trace_hardirqs_off_prepare();
    ... do actual work
    trace_hardirqs_on_prepare();
    lockdep_hardirqs_on_prepare();
    instrumentation_end();
    ... do exit magic
    lockdep_hardirqs_on();

which shows that it is named wrong; rename it to
trace_hardirqs_off_finish(), as it concludes the hardirqs-off
transition.

Also, given that the above is the only correct order, make the
traditional all-in-one trace_hardirqs_off() follow suit.
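
A condensed sketch of the resulting all-in-one helper, mirroring the
trace_preemptirq.c hunk below:

  void trace_hardirqs_off(void)
  {
  	lockdep_hardirqs_off(CALLER_ADDR0);	/* lockdep first, as at ENTRY */
  	/* ... tracer/tracepoint work, the equivalent of what
  	 * trace_hardirqs_off_finish() does in the split case ... */
  }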

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/common.c         |    6 +++---
 arch/x86/kernel/cpu/mce/core.c  |    2 +-
 arch/x86/kernel/nmi.c           |    2 +-
 arch/x86/kernel/traps.c         |    4 ++--
 include/linux/irqflags.h        |    4 ++--
 kernel/trace/trace_preemptirq.c |   10 +++++-----
 6 files changed, 14 insertions(+), 14 deletions(-)

--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -65,7 +65,7 @@ static noinstr void enter_from_user_mode
 
 	instrumentation_begin();
 	CT_WARN_ON(state != CONTEXT_USER);
-	trace_hardirqs_off_prepare();
+	trace_hardirqs_off_finish();
 	instrumentation_end();
 }
 #else
@@ -73,7 +73,7 @@ static __always_inline void enter_from_u
 {
 	lockdep_hardirqs_off(CALLER_ADDR0);
 	instrumentation_begin();
-	trace_hardirqs_off_prepare();
+	trace_hardirqs_off_finish();
 	instrumentation_end();
 }
 #endif
@@ -569,7 +569,7 @@ bool noinstr idtentry_enter_cond_rcu(str
 		lockdep_hardirqs_off(CALLER_ADDR0);
 		rcu_irq_enter();
 		instrumentation_begin();
-		trace_hardirqs_off_prepare();
+		trace_hardirqs_off_finish();
 		instrumentation_end();
 
 		return true;
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1915,7 +1915,7 @@ static __always_inline void exc_machine_
 	 * that out because it's an indirect call. Annotate it.
 	 */
 	instrumentation_begin();
-	trace_hardirqs_off_prepare();
+	trace_hardirqs_off_finish();
 	machine_check_vector(regs);
 	if (regs->flags & X86_EFLAGS_IF)
 		trace_hardirqs_on_prepare();
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -330,7 +330,7 @@ static noinstr void default_do_nmi(struc
 	__this_cpu_write(last_nmi_rip, regs->ip);
 
 	instrumentation_begin();
-	trace_hardirqs_off_prepare();
+	trace_hardirqs_off_finish();
 
 	handled = nmi_handle(NMI_LOCAL, regs);
 	__this_cpu_add(nmi_stats.normal, handled);
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -634,7 +634,7 @@ DEFINE_IDTENTRY_RAW(exc_int3)
 	} else {
 		nmi_enter();
 		instrumentation_begin();
-		trace_hardirqs_off_prepare();
+		trace_hardirqs_off_finish();
 		if (!do_int3(regs))
 			die("int3", regs, 0);
 		if (regs->flags & X86_EFLAGS_IF)
@@ -854,7 +854,7 @@ static __always_inline void exc_debug_ke
 {
 	nmi_enter();
 	instrumentation_begin();
-	trace_hardirqs_off_prepare();
+	trace_hardirqs_off_finish();
 	instrumentation_end();
 
 	/*
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -32,7 +32,7 @@
 
 #ifdef CONFIG_TRACE_IRQFLAGS
   extern void trace_hardirqs_on_prepare(void);
-  extern void trace_hardirqs_off_prepare(void);
+  extern void trace_hardirqs_off_finish(void);
   extern void trace_hardirqs_on(void);
   extern void trace_hardirqs_off(void);
 # define lockdep_hardirq_context(p)	((p)->hardirq_context)
@@ -101,7 +101,7 @@ do {						\
 
 #else
 # define trace_hardirqs_on_prepare()		do { } while (0)
-# define trace_hardirqs_off_prepare()		do { } while (0)
+# define trace_hardirqs_off_finish()		do { } while (0)
 # define trace_hardirqs_on()		do { } while (0)
 # define trace_hardirqs_off()		do { } while (0)
 # define lockdep_hardirq_context(p)	0
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -58,7 +58,7 @@ NOKPROBE_SYMBOL(trace_hardirqs_on);
  * and lockdep uses a staged approach which splits the lockdep hardirq
  * tracking into a RCU on and a RCU off section.
  */
-void trace_hardirqs_off_prepare(void)
+void trace_hardirqs_off_finish(void)
 {
 	if (!this_cpu_read(tracing_irq_cpu)) {
 		this_cpu_write(tracing_irq_cpu, 1);
@@ -68,19 +68,19 @@ void trace_hardirqs_off_prepare(void)
 	}
 
 }
-EXPORT_SYMBOL(trace_hardirqs_off_prepare);
-NOKPROBE_SYMBOL(trace_hardirqs_off_prepare);
+EXPORT_SYMBOL(trace_hardirqs_off_finish);
+NOKPROBE_SYMBOL(trace_hardirqs_off_finish);
 
 void trace_hardirqs_off(void)
 {
+	lockdep_hardirqs_off(CALLER_ADDR0);
+
 	if (!this_cpu_read(tracing_irq_cpu)) {
 		this_cpu_write(tracing_irq_cpu, 1);
 		tracer_hardirqs_off(CALLER_ADDR0, CALLER_ADDR1);
 		if (!in_nmi())
 			trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
 	}
-
-	lockdep_hardirqs_off(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_off);
 NOKPROBE_SYMBOL(trace_hardirqs_off);




* [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables
  2020-05-28 14:05 [PATCH v2 0/6] x86/entry,lockdep: Improve IRQ state tracking Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 1/6] x86/entry: Clarify irq_{enter,exit}_rcu() Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 2/6] x86/entry: Rename trace_hardirqs_off_prepare() Peter Zijlstra
@ 2020-05-28 14:05 ` Peter Zijlstra
  2020-05-29  3:55   ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled, _context} " kbuild test robot
                     ` (2 more replies)
  2020-05-28 14:05 ` [PATCH v2 4/6] lockdep: Remove lockdep_hardirq{s_enabled,_context}() argument Peter Zijlstra
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-28 14:05 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy, peterz

Currently all IRQ-tracking state is in task_struct, which means that
task_struct needs to be defined before we can use it.

Especially for lockdep_assert_irq*() this can lead to header hell.

Move the hardirq state into per-cpu variables to avoid the task_struct
dependency.
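
The assertions then need only a per-cpu int instead of task_struct; a
condensed sketch of the new form, taken from the diff below:

  DECLARE_PER_CPU(int, hardirqs_enabled);

  #define lockdep_assert_irqs_disabled()				\
  do {									\
  	WARN_ON_ONCE(debug_locks && this_cpu_read(hardirqs_enabled));	\
  } while (0)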

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/irqflags.h |   19 ++++++++++++-------
 include/linux/lockdep.h  |   34 ++++++++++++++++++----------------
 include/linux/sched.h    |    2 --
 kernel/fork.c            |    4 +---
 kernel/locking/lockdep.c |   30 +++++++++++++++---------------
 kernel/softirq.c         |    6 ++++++
 6 files changed, 52 insertions(+), 43 deletions(-)

--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -14,6 +14,7 @@
 
 #include <linux/typecheck.h>
 #include <asm/irqflags.h>
+#include <asm/percpu.h>
 
 /* Currently lockdep_softirqs_on/off is used only by lockdep */
 #ifdef CONFIG_PROVE_LOCKING
@@ -31,18 +32,22 @@
 #endif
 
 #ifdef CONFIG_TRACE_IRQFLAGS
+
+DECLARE_PER_CPU(int, hardirqs_enabled);
+DECLARE_PER_CPU(int, hardirq_context);
+
   extern void trace_hardirqs_on_prepare(void);
   extern void trace_hardirqs_off_finish(void);
   extern void trace_hardirqs_on(void);
   extern void trace_hardirqs_off(void);
-# define lockdep_hardirq_context(p)	((p)->hardirq_context)
+# define lockdep_hardirq_context(p)	(this_cpu_read(hardirq_context))
 # define lockdep_softirq_context(p)	((p)->softirq_context)
-# define lockdep_hardirqs_enabled(p)	((p)->hardirqs_enabled)
+# define lockdep_hardirqs_enabled(p)	(this_cpu_read(hardirqs_enabled))
 # define lockdep_softirqs_enabled(p)	((p)->softirqs_enabled)
-# define lockdep_hardirq_enter()		\
-do {						\
-	if (!current->hardirq_context++)	\
-		current->hardirq_threaded = 0;	\
+# define lockdep_hardirq_enter()			\
+do {							\
+	if (this_cpu_inc_return(hardirq_context) == 1)	\
+		current->hardirq_threaded = 0;		\
 } while (0)
 # define lockdep_hardirq_threaded()		\
 do {						\
@@ -50,7 +55,7 @@ do {						\
 } while (0)
 # define lockdep_hardirq_exit()			\
 do {						\
-	current->hardirq_context--;		\
+	this_cpu_dec(hardirq_context);		\
 } while (0)
 # define lockdep_softirq_enter()		\
 do {						\
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -20,6 +20,7 @@ extern int lock_stat;
 #define MAX_LOCKDEP_SUBCLASSES		8UL
 
 #include <linux/types.h>
+#include <asm/percpu.h>
 
 enum lockdep_wait_type {
 	LD_WAIT_INV = 0,	/* not checked, catch all */
@@ -703,28 +704,29 @@ do {									\
 	lock_release(&(lock)->dep_map, _THIS_IP_);			\
 } while (0)
 
-#define lockdep_assert_irqs_enabled()	do {				\
-		WARN_ONCE(debug_locks && !current->lockdep_recursion &&	\
-			  !current->hardirqs_enabled,			\
-			  "IRQs not enabled as expected\n");		\
-	} while (0)
+DECLARE_PER_CPU(int, hardirqs_enabled);
+DECLARE_PER_CPU(int, hardirq_context);
 
-#define lockdep_assert_irqs_disabled()	do {				\
-		WARN_ONCE(debug_locks && !current->lockdep_recursion &&	\
-			  current->hardirqs_enabled,			\
-			  "IRQs not disabled as expected\n");		\
-	} while (0)
+#define lockdep_assert_irqs_enabled()					\
+do {									\
+	WARN_ON_ONCE(debug_locks && !this_cpu_read(hardirqs_enabled));	\
+} while (0)
 
-#define lockdep_assert_in_irq() do {					\
-		WARN_ONCE(debug_locks && !current->lockdep_recursion &&	\
-			  !current->hardirq_context,			\
-			  "Not in hardirq as expected\n");		\
-	} while (0)
+#define lockdep_assert_irqs_disabled()					\
+do {									\
+	WARN_ON_ONCE(debug_locks && this_cpu_read(hardirqs_enabled));	\
+} while (0)
+
+#define lockdep_assert_in_irq()						\
+do {									\
+	WARN_ON_ONCE(debug_locks && !this_cpu_read(hardirq_context));	\
+} while (0)
 
 #else
 # define might_lock(lock) do { } while (0)
 # define might_lock_read(lock) do { } while (0)
 # define might_lock_nested(lock, subclass) do { } while (0)
+
 # define lockdep_assert_irqs_enabled() do { } while (0)
 # define lockdep_assert_irqs_disabled() do { } while (0)
 # define lockdep_assert_in_irq() do { } while (0)
@@ -734,7 +736,7 @@ do {									\
 
 # define lockdep_assert_RT_in_threaded_ctx() do {			\
 		WARN_ONCE(debug_locks && !current->lockdep_recursion &&	\
-			  current->hardirq_context &&			\
+			  lockdep_hardirq_context(current) &&		\
 			  !(current->hardirq_threaded || current->irq_config),	\
 			  "Not in threaded context on PREEMPT_RT as expected\n");	\
 } while (0)
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -991,8 +991,6 @@ struct task_struct {
 	unsigned long			hardirq_disable_ip;
 	unsigned int			hardirq_enable_event;
 	unsigned int			hardirq_disable_event;
-	int				hardirqs_enabled;
-	int				hardirq_context;
 	u64				hardirq_chain_key;
 	unsigned long			softirq_disable_ip;
 	unsigned long			softirq_enable_ip;
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1946,8 +1946,8 @@ static __latent_entropy struct task_stru
 
 	rt_mutex_init_task(p);
 
+	lockdep_assert_irqs_enabled();
 #ifdef CONFIG_PROVE_LOCKING
-	DEBUG_LOCKS_WARN_ON(!p->hardirqs_enabled);
 	DEBUG_LOCKS_WARN_ON(!p->softirqs_enabled);
 #endif
 	retval = -EAGAIN;
@@ -2028,7 +2028,6 @@ static __latent_entropy struct task_stru
 #endif
 #ifdef CONFIG_TRACE_IRQFLAGS
 	p->irq_events = 0;
-	p->hardirqs_enabled = 0;
 	p->hardirq_enable_ip = 0;
 	p->hardirq_enable_event = 0;
 	p->hardirq_disable_ip = _THIS_IP_;
@@ -2038,7 +2037,6 @@ static __latent_entropy struct task_stru
 	p->softirq_enable_event = 0;
 	p->softirq_disable_ip = 0;
 	p->softirq_disable_event = 0;
-	p->hardirq_context = 0;
 	p->softirq_context = 0;
 #endif
 
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2062,9 +2062,9 @@ print_bad_irq_dependency(struct task_str
 	pr_warn("-----------------------------------------------------\n");
 	pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
 		curr->comm, task_pid_nr(curr),
-		curr->hardirq_context, hardirq_count() >> HARDIRQ_SHIFT,
+		lockdep_hardirq_context(curr), hardirq_count() >> HARDIRQ_SHIFT,
 		curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
-		curr->hardirqs_enabled,
+		lockdep_hardirqs_enabled(curr),
 		curr->softirqs_enabled);
 	print_lock(next);
 
@@ -3649,7 +3649,7 @@ void lockdep_hardirqs_on_prepare(unsigne
 	if (unlikely(!debug_locks || current->lockdep_recursion))
 		return;
 
-	if (unlikely(current->hardirqs_enabled)) {
+	if (unlikely(lockdep_hardirqs_enabled(current))) {
 		/*
 		 * Neither irq nor preemption are disabled here
 		 * so this is racy by nature but losing one hit
@@ -3677,7 +3677,7 @@ void lockdep_hardirqs_on_prepare(unsigne
 	 * Can't allow enabling interrupts while in an interrupt handler,
 	 * that's general bad form and such. Recursion, limited stack etc..
 	 */
-	if (DEBUG_LOCKS_WARN_ON(current->hardirq_context))
+	if (DEBUG_LOCKS_WARN_ON(lockdep_hardirq_context(current)))
 		return;
 
 	current->hardirq_chain_key = current->curr_chain_key;
@@ -3695,7 +3695,7 @@ void noinstr lockdep_hardirqs_on(unsigne
 	if (unlikely(!debug_locks || curr->lockdep_recursion))
 		return;
 
-	if (curr->hardirqs_enabled) {
+	if (lockdep_hardirqs_enabled(curr)) {
 		/*
 		 * Neither irq nor preemption are disabled here
 		 * so this is racy by nature but losing one hit
@@ -3721,7 +3721,7 @@ void noinstr lockdep_hardirqs_on(unsigne
 			    current->curr_chain_key);
 
 	/* we'll do an OFF -> ON transition: */
-	curr->hardirqs_enabled = 1;
+	this_cpu_write(hardirqs_enabled, 1);
 	curr->hardirq_enable_ip = ip;
 	curr->hardirq_enable_event = ++curr->irq_events;
 	debug_atomic_inc(hardirqs_on_events);
@@ -3745,11 +3745,11 @@ void noinstr lockdep_hardirqs_off(unsign
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return;
 
-	if (curr->hardirqs_enabled) {
+	if (lockdep_hardirqs_enabled(curr)) {
 		/*
 		 * We have done an ON -> OFF transition:
 		 */
-		curr->hardirqs_enabled = 0;
+		this_cpu_write(hardirqs_enabled, 0);
 		curr->hardirq_disable_ip = ip;
 		curr->hardirq_disable_event = ++curr->irq_events;
 		debug_atomic_inc(hardirqs_off_events);
@@ -3794,7 +3794,7 @@ void lockdep_softirqs_on(unsigned long i
 	 * usage bit for all held locks, if hardirqs are
 	 * enabled too:
 	 */
-	if (curr->hardirqs_enabled)
+	if (lockdep_hardirqs_enabled(curr))
 		mark_held_locks(curr, LOCK_ENABLED_SOFTIRQ);
 	lockdep_recursion_finish();
 }
@@ -3843,7 +3843,7 @@ mark_usage(struct task_struct *curr, str
 	 */
 	if (!hlock->trylock) {
 		if (hlock->read) {
-			if (curr->hardirq_context)
+			if (lockdep_hardirq_context(curr))
 				if (!mark_lock(curr, hlock,
 						LOCK_USED_IN_HARDIRQ_READ))
 					return 0;
@@ -3852,7 +3852,7 @@ mark_usage(struct task_struct *curr, str
 						LOCK_USED_IN_SOFTIRQ_READ))
 					return 0;
 		} else {
-			if (curr->hardirq_context)
+			if (lockdep_hardirq_context(curr))
 				if (!mark_lock(curr, hlock, LOCK_USED_IN_HARDIRQ))
 					return 0;
 			if (curr->softirq_context)
@@ -3890,7 +3890,7 @@ mark_usage(struct task_struct *curr, str
 
 static inline unsigned int task_irq_context(struct task_struct *task)
 {
-	return LOCK_CHAIN_HARDIRQ_CONTEXT * !!task->hardirq_context +
+	return LOCK_CHAIN_HARDIRQ_CONTEXT * !!lockdep_hardirq_context(task) +
 	       LOCK_CHAIN_SOFTIRQ_CONTEXT * !!task->softirq_context;
 }
 
@@ -3983,7 +3983,7 @@ static inline short task_wait_context(st
 	 * Set appropriate wait type for the context; for IRQs we have to take
 	 * into account force_irqthread as that is implied by PREEMPT_RT.
 	 */
-	if (curr->hardirq_context) {
+	if (lockdep_hardirq_context(curr)) {
 		/*
 		 * Check if force_irqthreads will run us threaded.
 		 */
@@ -4826,11 +4826,11 @@ static void check_flags(unsigned long fl
 		return;
 
 	if (irqs_disabled_flags(flags)) {
-		if (DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)) {
+		if (DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled(current))) {
 			printk("possible reason: unannotated irqs-off.\n");
 		}
 	} else {
-		if (DEBUG_LOCKS_WARN_ON(!current->hardirqs_enabled)) {
+		if (DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled(current))) {
 			printk("possible reason: unannotated irqs-on.\n");
 		}
 	}
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -107,6 +107,12 @@ static bool ksoftirqd_running(unsigned l
  * where hardirqs are disabled legitimately:
  */
 #ifdef CONFIG_TRACE_IRQFLAGS
+
+DEFINE_PER_CPU(int, hardirqs_enabled);
+DEFINE_PER_CPU(int, hardirq_context);
+EXPORT_PER_CPU_SYMBOL_GPL(hardirqs_enabled);
+EXPORT_PER_CPU_SYMBOL_GPL(hardirq_context);
+
 void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
 {
 	unsigned long flags;




* [PATCH v2 4/6] lockdep: Remove lockdep_hardirq{s_enabled,_context}() argument
  2020-05-28 14:05 [PATCH v2 0/6] x86/entry,lockdep: Improve IRQ state tracking Peter Zijlstra
                   ` (2 preceding siblings ...)
  2020-05-28 14:05 ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables Peter Zijlstra
@ 2020-05-28 14:05 ` Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 5/6] lockdep: Prepare for NMI IRQ state tracking Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 6/6] x86/entry: Fix NMI vs " Peter Zijlstra
  5 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-28 14:05 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy, peterz

Now that the macros use per-cpu data, we no longer need the argument.
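
The conversion is mechanical; for example:

  /* before */ if (lockdep_hardirqs_enabled(current)) { ... }
  /* after  */ if (lockdep_hardirqs_enabled()) { ... }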

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/irqflags.h       |    8 ++++----
 include/linux/lockdep.h        |    2 +-
 kernel/locking/lockdep.c       |   30 +++++++++++++++---------------
 kernel/softirq.c               |    2 +-
 tools/include/linux/irqflags.h |    4 ++--
 5 files changed, 23 insertions(+), 23 deletions(-)

--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -40,9 +40,9 @@ DECLARE_PER_CPU(int, hardirq_context);
   extern void trace_hardirqs_off_finish(void);
   extern void trace_hardirqs_on(void);
   extern void trace_hardirqs_off(void);
-# define lockdep_hardirq_context(p)	(this_cpu_read(hardirq_context))
+# define lockdep_hardirq_context()	(this_cpu_read(hardirq_context))
 # define lockdep_softirq_context(p)	((p)->softirq_context)
-# define lockdep_hardirqs_enabled(p)	(this_cpu_read(hardirqs_enabled))
+# define lockdep_hardirqs_enabled()	(this_cpu_read(hardirqs_enabled))
 # define lockdep_softirqs_enabled(p)	((p)->softirqs_enabled)
 # define lockdep_hardirq_enter()			\
 do {							\
@@ -109,9 +109,9 @@ do {						\
 # define trace_hardirqs_off_finish()		do { } while (0)
 # define trace_hardirqs_on()		do { } while (0)
 # define trace_hardirqs_off()		do { } while (0)
-# define lockdep_hardirq_context(p)	0
+# define lockdep_hardirq_context()	0
 # define lockdep_softirq_context(p)	0
-# define lockdep_hardirqs_enabled(p)	0
+# define lockdep_hardirqs_enabled()	0
 # define lockdep_softirqs_enabled(p)	0
 # define lockdep_hardirq_enter()	do { } while (0)
 # define lockdep_hardirq_threaded()	do { } while (0)
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -736,7 +736,7 @@ do {									\
 
 # define lockdep_assert_RT_in_threaded_ctx() do {			\
 		WARN_ONCE(debug_locks && !current->lockdep_recursion &&	\
-			  lockdep_hardirq_context(current) &&		\
+			  lockdep_hardirq_context() &&			\
 			  !(current->hardirq_threaded || current->irq_config),	\
 			  "Not in threaded context on PREEMPT_RT as expected\n");	\
 } while (0)
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2062,9 +2062,9 @@ print_bad_irq_dependency(struct task_str
 	pr_warn("-----------------------------------------------------\n");
 	pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
 		curr->comm, task_pid_nr(curr),
-		lockdep_hardirq_context(curr), hardirq_count() >> HARDIRQ_SHIFT,
+		lockdep_hardirq_context(), hardirq_count() >> HARDIRQ_SHIFT,
 		curr->softirq_context, softirq_count() >> SOFTIRQ_SHIFT,
-		lockdep_hardirqs_enabled(curr),
+		lockdep_hardirqs_enabled(),
 		curr->softirqs_enabled);
 	print_lock(next);
 
@@ -3331,9 +3331,9 @@ print_usage_bug(struct task_struct *curr
 
 	pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] takes:\n",
 		curr->comm, task_pid_nr(curr),
-		lockdep_hardirq_context(curr), hardirq_count() >> HARDIRQ_SHIFT,
+		lockdep_hardirq_context(), hardirq_count() >> HARDIRQ_SHIFT,
 		lockdep_softirq_context(curr), softirq_count() >> SOFTIRQ_SHIFT,
-		lockdep_hardirqs_enabled(curr),
+		lockdep_hardirqs_enabled(),
 		lockdep_softirqs_enabled(curr));
 	print_lock(this);
 
@@ -3649,7 +3649,7 @@ void lockdep_hardirqs_on_prepare(unsigne
 	if (unlikely(!debug_locks || current->lockdep_recursion))
 		return;
 
-	if (unlikely(lockdep_hardirqs_enabled(current))) {
+	if (unlikely(lockdep_hardirqs_enabled())) {
 		/*
 		 * Neither irq nor preemption are disabled here
 		 * so this is racy by nature but losing one hit
@@ -3677,7 +3677,7 @@ void lockdep_hardirqs_on_prepare(unsigne
 	 * Can't allow enabling interrupts while in an interrupt handler,
 	 * that's general bad form and such. Recursion, limited stack etc..
 	 */
-	if (DEBUG_LOCKS_WARN_ON(lockdep_hardirq_context(current)))
+	if (DEBUG_LOCKS_WARN_ON(lockdep_hardirq_context()))
 		return;
 
 	current->hardirq_chain_key = current->curr_chain_key;
@@ -3695,7 +3695,7 @@ void noinstr lockdep_hardirqs_on(unsigne
 	if (unlikely(!debug_locks || curr->lockdep_recursion))
 		return;
 
-	if (lockdep_hardirqs_enabled(curr)) {
+	if (lockdep_hardirqs_enabled()) {
 		/*
 		 * Neither irq nor preemption are disabled here
 		 * so this is racy by nature but losing one hit
@@ -3745,7 +3745,7 @@ void noinstr lockdep_hardirqs_off(unsign
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return;
 
-	if (lockdep_hardirqs_enabled(curr)) {
+	if (lockdep_hardirqs_enabled()) {
 		/*
 		 * We have done an ON -> OFF transition:
 		 */
@@ -3794,7 +3794,7 @@ void lockdep_softirqs_on(unsigned long i
 	 * usage bit for all held locks, if hardirqs are
 	 * enabled too:
 	 */
-	if (lockdep_hardirqs_enabled(curr))
+	if (lockdep_hardirqs_enabled())
 		mark_held_locks(curr, LOCK_ENABLED_SOFTIRQ);
 	lockdep_recursion_finish();
 }
@@ -3843,7 +3843,7 @@ mark_usage(struct task_struct *curr, str
 	 */
 	if (!hlock->trylock) {
 		if (hlock->read) {
-			if (lockdep_hardirq_context(curr))
+			if (lockdep_hardirq_context())
 				if (!mark_lock(curr, hlock,
 						LOCK_USED_IN_HARDIRQ_READ))
 					return 0;
@@ -3852,7 +3852,7 @@ mark_usage(struct task_struct *curr, str
 						LOCK_USED_IN_SOFTIRQ_READ))
 					return 0;
 		} else {
-			if (lockdep_hardirq_context(curr))
+			if (lockdep_hardirq_context())
 				if (!mark_lock(curr, hlock, LOCK_USED_IN_HARDIRQ))
 					return 0;
 			if (curr->softirq_context)
@@ -3890,7 +3890,7 @@ mark_usage(struct task_struct *curr, str
 
 static inline unsigned int task_irq_context(struct task_struct *task)
 {
-	return LOCK_CHAIN_HARDIRQ_CONTEXT * !!lockdep_hardirq_context(task) +
+	return LOCK_CHAIN_HARDIRQ_CONTEXT * !!lockdep_hardirq_context() +
 	       LOCK_CHAIN_SOFTIRQ_CONTEXT * !!task->softirq_context;
 }
 
@@ -3983,7 +3983,7 @@ static inline short task_wait_context(st
 	 * Set appropriate wait type for the context; for IRQs we have to take
 	 * into account force_irqthread as that is implied by PREEMPT_RT.
 	 */
-	if (lockdep_hardirq_context(curr)) {
+	if (lockdep_hardirq_context()) {
 		/*
 		 * Check if force_irqthreads will run us threaded.
 		 */
@@ -4826,11 +4826,11 @@ static void check_flags(unsigned long fl
 		return;
 
 	if (irqs_disabled_flags(flags)) {
-		if (DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled(current))) {
+		if (DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())) {
 			printk("possible reason: unannotated irqs-off.\n");
 		}
 	} else {
-		if (DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled(current))) {
+		if (DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled())) {
 			printk("possible reason: unannotated irqs-on.\n");
 		}
 	}
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -230,7 +230,7 @@ static inline bool lockdep_softirq_start
 {
 	bool in_hardirq = false;
 
-	if (lockdep_hardirq_context(current)) {
+	if (lockdep_hardirq_context()) {
 		in_hardirq = true;
 		lockdep_hardirq_exit();
 	}
--- a/tools/include/linux/irqflags.h
+++ b/tools/include/linux/irqflags.h
@@ -2,9 +2,9 @@
 #ifndef _LIBLOCKDEP_LINUX_TRACE_IRQFLAGS_H_
 #define _LIBLOCKDEP_LINUX_TRACE_IRQFLAGS_H_
 
-# define lockdep_hardirq_context(p)	0
+# define lockdep_hardirq_context()	0
 # define lockdep_softirq_context(p)	0
-# define lockdep_hardirqs_enabled(p)	0
+# define lockdep_hardirqs_enabled()	0
 # define lockdep_softirqs_enabled(p)	0
 # define lockdep_hardirq_enter()	do { } while (0)
 # define lockdep_hardirq_exit()		do { } while (0)




* [PATCH v2 5/6] lockdep: Prepare for NMI IRQ state tracking
  2020-05-28 14:05 [PATCH v2 0/6] x86/entry,lockdep: Improve IRQ state tracking Peter Zijlstra
                   ` (3 preceding siblings ...)
  2020-05-28 14:05 ` [PATCH v2 4/6] lockdep: Remove lockdep_hardirq{s_enabled,_context}() argument Peter Zijlstra
@ 2020-05-28 14:05 ` Peter Zijlstra
  2020-05-28 14:05 ` [PATCH v2 6/6] x86/entry: Fix NMI vs " Peter Zijlstra
  5 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-28 14:05 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy, peterz

There is no reason not to always, accurately, track IRQ state.

This change also makes IRQ state tracking ignore lockdep_off().
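
The NMI handling condenses to the following (abridged from the hunk
below):

  void noinstr lockdep_hardirqs_on(unsigned long ip)
  {
  	if (in_nmi()) {
  		/*
  		 * NMIs can hit lockdep itself; skip the recursion,
  		 * hardware-state and chain_key checks.
  		 */
  		goto skip_checks;
  	}
  	...
  }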

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/locking/lockdep.c |   33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3646,7 +3646,13 @@ static void __trace_hardirqs_on_caller(v
  */
 void lockdep_hardirqs_on_prepare(unsigned long ip)
 {
-	if (unlikely(!debug_locks || current->lockdep_recursion))
+	/*
+	 * NMIs do not (and cannot) track lock dependencies, nothing to do.
+	 */
+	if (in_nmi())
+		return;
+
+	if (DEBUG_LOCKS_WARN_ON(current->lockdep_recursion & LOCKDEP_RECURSION_MASK))
 		return;
 
 	if (unlikely(lockdep_hardirqs_enabled())) {
@@ -3692,7 +3698,24 @@ void noinstr lockdep_hardirqs_on(unsigne
 {
 	struct task_struct *curr = current;
 
-	if (unlikely(!debug_locks || curr->lockdep_recursion))
+	/*
+	 * NMIs can happen in the middle of local_irq_{en,dis}able() where the
+	 * tracking state and hardware state are out of sync.
+	 *
+	 * NMIs must save lockdep_hardirqs_enabled() to restore IRQ state from,
+	 * and not rely on hardware state like normal interrupts.
+	 */
+	if (in_nmi()) {
+		/*
+		 * Skip:
+		 *  - recursion check, because NMI can hit lockdep;
+		 *  - hardware state check, because above;
+		 *  - chain_key check, see lockdep_hardirqs_on_prepare().
+		 */
+		goto skip_checks;
+	}
+
+	if (DEBUG_LOCKS_WARN_ON(curr->lockdep_recursion & LOCKDEP_RECURSION_MASK))
 		return;
 
 	if (lockdep_hardirqs_enabled()) {
@@ -3720,6 +3743,7 @@ void noinstr lockdep_hardirqs_on(unsigne
 	DEBUG_LOCKS_WARN_ON(current->hardirq_chain_key !=
 			    current->curr_chain_key);
 
+skip_checks:
 	/* we'll do an OFF -> ON transition: */
 	this_cpu_write(hardirqs_enabled, 1);
 	curr->hardirq_enable_ip = ip;
@@ -3735,7 +3759,10 @@ void noinstr lockdep_hardirqs_off(unsign
 {
 	struct task_struct *curr = current;
 
-	if (unlikely(!debug_locks || curr->lockdep_recursion))
+	/*
+	 * NMIs can happen in lockdep.
+	 */
+	if (!in_nmi() && DEBUG_LOCKS_WARN_ON(curr->lockdep_recursion & LOCKDEP_RECURSION_MASK))
 		return;
 
 	/*




* [PATCH v2 6/6] x86/entry: Fix NMI vs IRQ state tracking
  2020-05-28 14:05 [PATCH v2 0/6] x86/entry,lockdep: Improve IRQ state tracking Peter Zijlstra
                   ` (4 preceding siblings ...)
  2020-05-28 14:05 ` [PATCH v2 5/6] lockdep: Prepare for NMI IRQ state tracking Peter Zijlstra
@ 2020-05-28 14:05 ` Peter Zijlstra
  5 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-28 14:05 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy, peterz

While the nmi_enter() users did call
trace_hardirqs_{off_finish,on_prepare}(), there were no matching
lockdep_hardirqs_*() calls to complete the picture.

Introduce idtentry_{enter,exit}_nmi() to enable proper IRQ state
tracking across NMIs.
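
An NMI-like entry point then brackets its work like so (see the
exc_nmi hunk below; abridged):

  bool irq_state = idtentry_enter_nmi(regs);
  ...
  idtentry_exit_nmi(regs, irq_state);

The entry side saves lockdep_hardirqs_enabled() and the exit side
restores from that saved value instead of the hardware IF flag, since
an NMI can hit in the middle of local_irq_{en,dis}able() where
software and hardware state disagree.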

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/common.c         |   42 ++++++++++++++++++++++++++++++++++++----
 arch/x86/include/asm/idtentry.h |    3 ++
 arch/x86/kernel/nmi.c           |    9 +++-----
 arch/x86/kernel/traps.c         |   20 ++++---------------
 include/linux/hardirq.h         |   28 ++++++++++++++++++--------
 5 files changed, 69 insertions(+), 33 deletions(-)

--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -550,7 +550,7 @@ SYSCALL_DEFINE0(ni_syscall)
  * The return value must be fed into the rcu_exit argument of
  * idtentry_exit_cond_rcu().
  */
-bool noinstr idtentry_enter_cond_rcu(struct pt_regs *regs)
+noinstr bool idtentry_enter_cond_rcu(struct pt_regs *regs)
 {
 	if (user_mode(regs)) {
 		enter_from_user_mode();
@@ -619,7 +619,7 @@ static void idtentry_exit_cond_resched(s
  * Counterpart to idtentry_enter_cond_rcu(). The return value of the entry
  * function must be fed into the @rcu_exit argument.
  */
-void noinstr idtentry_exit_cond_rcu(struct pt_regs *regs, bool rcu_exit)
+noinstr void idtentry_exit_cond_rcu(struct pt_regs *regs, bool rcu_exit)
 {
 	lockdep_assert_irqs_disabled();
 
@@ -663,7 +663,7 @@ void noinstr idtentry_exit_cond_rcu(stru
  * Invokes enter_from_user_mode() to establish the proper context for
  * NOHZ_FULL. Otherwise scheduling on exit would not be possible.
  */
-void noinstr idtentry_enter_user(struct pt_regs *regs)
+noinstr void idtentry_enter_user(struct pt_regs *regs)
 {
 	enter_from_user_mode();
 }
@@ -680,13 +680,47 @@ void noinstr idtentry_enter_user(struct
  *
  * Counterpart to idtentry_enter_user().
  */
-void noinstr idtentry_exit_user(struct pt_regs *regs)
+noinstr void idtentry_exit_user(struct pt_regs *regs)
 {
 	lockdep_assert_irqs_disabled();
 
 	prepare_exit_to_usermode(regs);
 }
 
+noinstr bool idtentry_enter_nmi(struct pt_regs *regs)
+{
+	bool irq_state = lockdep_hardirqs_enabled();
+
+	__nmi_enter();
+	lockdep_hardirqs_off(CALLER_ADDR0);
+	lockdep_hardirq_enter();
+	rcu_nmi_enter();
+
+	instrumentation_begin();
+	trace_hardirqs_off_finish();
+	ftrace_nmi_enter();
+	instrumentation_end();
+
+	return irq_state;
+}
+
+noinstr void idtentry_exit_nmi(struct pt_regs *regs, bool restore)
+{
+	instrumentation_begin();
+	ftrace_nmi_exit();
+	if (restore) {
+		trace_hardirqs_on_prepare();
+		lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+	}
+	instrumentation_end();
+
+	rcu_nmi_exit();
+	lockdep_hardirq_exit();
+	if (restore)
+		lockdep_hardirqs_on(CALLER_ADDR0);
+	__nmi_exit();
+}
+
 #ifdef CONFIG_XEN_PV
 #ifndef CONFIG_PREEMPTION
 /*
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -16,6 +16,9 @@ void idtentry_exit_user(struct pt_regs *
 bool idtentry_enter_cond_rcu(struct pt_regs *regs);
 void idtentry_exit_cond_rcu(struct pt_regs *regs, bool rcu_exit);
 
+bool idtentry_enter_nmi(struct pt_regs *regs);
+void idtentry_exit_nmi(struct pt_regs *regs, bool irq_state);
+
 /**
  * DECLARE_IDTENTRY - Declare functions for simple IDT entry points
  *		      No error code pushed by hardware
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -330,7 +330,6 @@ static noinstr void default_do_nmi(struc
 	__this_cpu_write(last_nmi_rip, regs->ip);
 
 	instrumentation_begin();
-	trace_hardirqs_off_finish();
 
 	handled = nmi_handle(NMI_LOCAL, regs);
 	__this_cpu_add(nmi_stats.normal, handled);
@@ -417,8 +416,6 @@ static noinstr void default_do_nmi(struc
 		unknown_nmi_error(reason, regs);
 
 out:
-	if (regs->flags & X86_EFLAGS_IF)
-		trace_hardirqs_on_prepare();
 	instrumentation_end();
 }
 
@@ -511,6 +508,8 @@ static noinstr bool is_debug_stack(unsig
 
 DEFINE_IDTENTRY_NMI(exc_nmi)
 {
+	bool irq_state;
+
 	if (IS_ENABLED(CONFIG_SMP) && cpu_is_offline(smp_processor_id()))
 		return;
 
@@ -535,14 +534,14 @@ DEFINE_IDTENTRY_NMI(exc_nmi)
 	}
 #endif
 
-	nmi_enter();
+	irq_state = idtentry_enter_nmi(regs);
 
 	inc_irq_stat(__nmi_count);
 
 	if (!ignore_nmis)
 		default_do_nmi(regs);
 
-	nmi_exit();
+	idtentry_exit_nmi(regs, irq_state);
 
 #ifdef CONFIG_X86_64
 	if (unlikely(this_cpu_read(update_debug_stack))) {
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -387,7 +387,7 @@ DEFINE_IDTENTRY_DF(exc_double_fault)
 	}
 #endif
 
-	nmi_enter();
+	idtentry_enter_nmi(regs);
 	instrumentation_begin();
 	notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV);
 
@@ -632,15 +632,12 @@ DEFINE_IDTENTRY_RAW(exc_int3)
 		instrumentation_end();
 		idtentry_exit_user(regs);
 	} else {
-		nmi_enter();
+		bool irq_state = idtentry_enter_nmi(regs);
 		instrumentation_begin();
-		trace_hardirqs_off_finish();
 		if (!do_int3(regs))
 			die("int3", regs, 0);
-		if (regs->flags & X86_EFLAGS_IF)
-			trace_hardirqs_on_prepare();
 		instrumentation_end();
-		nmi_exit();
+		idtentry_exit_nmi(regs, irq_state);
 	}
 }
 
@@ -852,10 +849,7 @@ static void noinstr handle_debug(struct
 static __always_inline void exc_debug_kernel(struct pt_regs *regs,
 					     unsigned long dr6)
 {
-	nmi_enter();
-	instrumentation_begin();
-	trace_hardirqs_off_finish();
-	instrumentation_end();
+	bool irq_state = idtentry_enter_nmi(regs);
 
 	/*
 	 * The SDM says "The processor clears the BTF flag when it
@@ -878,11 +872,7 @@ static __always_inline void exc_debug_ke
 	if (dr6)
 		handle_debug(regs, dr6, false);
 
-	instrumentation_begin();
-	if (regs->flags & X86_EFLAGS_IF)
-		trace_hardirqs_on_prepare();
-	instrumentation_end();
-	nmi_exit();
+	idtentry_exit_nmi(regs, irq_state);
 }
 
 static __always_inline void exc_debug_user(struct pt_regs *regs,
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -111,32 +111,42 @@ extern void rcu_nmi_exit(void);
 /*
  * nmi_enter() can nest up to 15 times; see NMI_BITS.
  */
-#define nmi_enter()						\
+#define __nmi_enter()						\
 	do {							\
+		lockdep_off();					\
 		arch_nmi_enter();				\
 		printk_nmi_enter();				\
-		lockdep_off();					\
 		BUG_ON(in_nmi() == NMI_MASK);			\
 		__preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);	\
-		rcu_nmi_enter();				\
+	} while (0)
+
+#define nmi_enter()						\
+	do {							\
+		__nmi_enter();					\
 		lockdep_hardirq_enter();			\
+		rcu_nmi_enter();				\
 		instrumentation_begin();			\
 		ftrace_nmi_enter();				\
 		instrumentation_end();				\
 	} while (0)
 
+#define __nmi_exit()						\
+	do {							\
+		BUG_ON(!in_nmi());				\
+		__preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);	\
+		printk_nmi_exit();				\
+		arch_nmi_exit();				\
+		lockdep_on();					\
+	} while (0)
+
 #define nmi_exit()						\
 	do {							\
 		instrumentation_begin();			\
 		ftrace_nmi_exit();				\
 		instrumentation_end();				\
-		lockdep_hardirq_exit();				\
 		rcu_nmi_exit();					\
-		BUG_ON(!in_nmi());				\
-		__preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);	\
-		lockdep_on();					\
-		printk_nmi_exit();				\
-		arch_nmi_exit();				\
+		lockdep_hardirq_exit();				\
+		__nmi_exit();					\
 	} while (0)
 
 #endif /* LINUX_HARDIRQ_H */




* Re: [PATCH v2 3/6] lockdep: Change hardirq{s_enabled, _context} to per-cpu variables
  2020-05-28 14:05 ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables Peter Zijlstra
@ 2020-05-29  3:55   ` kbuild test robot
  2020-05-29  4:12   ` kbuild test robot
  2020-05-29 16:16   ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} " Peter Zijlstra
  2 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2020-05-29  3:55 UTC (permalink / raw)
  To: kbuild-all


Hi Peter,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/auto-latest]
[cannot apply to tip/locking/core tip/x86/asm linus/master tip/x86/core v5.7-rc7 next-20200528]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Peter-Zijlstra/x86-entry-lockdep-Improve-IRQ-state-tracking/20200528-222257
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 4e052965f46b6fd9641d79bf10208eac2631475f
config: s390-randconfig-m031-20200528 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

In file included from arch/s390/include/asm/processor.h:41,
                 from arch/s390/include/asm/thread_info.h:27,
                 from include/linux/thread_info.h:38,
                 from arch/s390/include/asm/preempt.h:6,
                 from include/linux/preempt.h:78,
                 from arch/s390/include/asm/percpu.h:5,
                 from include/linux/lockdep.h:23,
                 from include/linux/spinlock_types.h:18,
                 from kernel/bounds.c:14:
>> include/linux/irqflags.h:36:22: error: unknown type name 'hardirqs_enabled'
      36 | DECLARE_PER_CPU(int, hardirqs_enabled);
         |                      ^~~~~~~~~~~~~~~~
>> include/linux/irqflags.h:37:22: error: unknown type name 'hardirq_context'
      37 | DECLARE_PER_CPU(int, hardirq_context);
         |                      ^~~~~~~~~~~~~~~
make[2]: *** [scripts/Makefile.build:100: kernel/bounds.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [Makefile:1149: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:180: sub-make] Error 2

vim +/hardirqs_enabled +36 include/linux/irqflags.h

    35	
  > 36	DECLARE_PER_CPU(int, hardirqs_enabled);
  > 37	DECLARE_PER_CPU(int, hardirq_context);
    38	
    39	  extern void trace_hardirqs_on_prepare(void);
    40	  extern void trace_hardirqs_off_finish(void);
    41	  extern void trace_hardirqs_on(void);
    42	  extern void trace_hardirqs_off(void);
    43	# define lockdep_hardirq_context(p)	(this_cpu_read(hardirq_context))
    44	# define lockdep_softirq_context(p)	((p)->softirq_context)
    45	# define lockdep_hardirqs_enabled(p)	(this_cpu_read(hardirqs_enabled))
    46	# define lockdep_softirqs_enabled(p)	((p)->softirqs_enabled)
    47	# define lockdep_hardirq_enter()			\
    48	do {							\
    49		if (this_cpu_inc_return(hardirq_context) == 1)	\
    50			current->hardirq_threaded = 0;		\
    51	} while (0)
    52	# define lockdep_hardirq_threaded()		\
    53	do {						\
    54		current->hardirq_threaded = 1;		\
    55	} while (0)
    56	# define lockdep_hardirq_exit()			\
    57	do {						\
    58		this_cpu_dec(hardirq_context);		\
    59	} while (0)
    60	# define lockdep_softirq_enter()		\
    61	do {						\
    62		current->softirq_context++;		\
    63	} while (0)
    64	# define lockdep_softirq_exit()			\
    65	do {						\
    66		current->softirq_context--;		\
    67	} while (0)
    68	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org



* Re: [PATCH v2 3/6] lockdep: Change hardirq{s_enabled, _context} to per-cpu variables
  2020-05-28 14:05 ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables Peter Zijlstra
  2020-05-29  3:55   ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled, _context} " kbuild test robot
@ 2020-05-29  4:12   ` kbuild test robot
  2020-05-29 16:16   ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} " Peter Zijlstra
  2 siblings, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2020-05-29  4:12 UTC (permalink / raw)
  To: kbuild-all


Hi Peter,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/auto-latest]
[cannot apply to tip/locking/core tip/x86/asm linus/master tip/x86/core v5.7-rc7 next-20200528]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Peter-Zijlstra/x86-entry-lockdep-Improve-IRQ-state-tracking/20200528-222257
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 4e052965f46b6fd9641d79bf10208eac2631475f
config: sparc-allyesconfig (attached as .config)
compiler: sparc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=sparc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

In file included from arch/sparc/include/asm/percpu_64.h:11,
                 from arch/sparc/include/asm/percpu.h:5,
                 from include/linux/irqflags.h:17,
                 from include/asm-generic/cmpxchg-local.h:6,
                 from arch/sparc/include/asm/cmpxchg_64.h:111,
                 from arch/sparc/include/asm/cmpxchg.h:5,
                 from arch/sparc/include/asm/atomic_64.h:12,
                 from arch/sparc/include/asm/atomic.h:5,
                 from include/linux/atomic.h:7,
                 from include/asm-generic/bitops/lock.h:5,
                 from arch/sparc/include/asm/bitops_64.h:52,
                 from arch/sparc/include/asm/bitops.h:5,
                 from include/linux/bitops.h:29,
                 from include/linux/kernel.h:12,
                 from include/asm-generic/bug.h:19,
                 from arch/sparc/include/asm/bug.h:25,
                 from include/linux/bug.h:5,
                 from include/linux/page-flags.h:10,
                 from kernel/bounds.c:10:
>> arch/sparc/include/asm/trap_block.h:54:39: error: 'NR_CPUS' undeclared here (not in a function)
      54 | extern struct trap_per_cpu trap_block[NR_CPUS];
         |                                       ^~~~~~~
make[2]: *** [scripts/Makefile.build:100: kernel/bounds.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [Makefile:1149: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:180: sub-make] Error 2

vim +/NR_CPUS +54 arch/sparc/include/asm/trap_block.h

19f0fa3fb3499d8c David S. Miller 2009-04-01   9  
19f0fa3fb3499d8c David S. Miller 2009-04-01  10  /* Trap handling code needs to get at a few critical values upon
19f0fa3fb3499d8c David S. Miller 2009-04-01  11   * trap entry and to process TSB misses.  These cannot be in the
19f0fa3fb3499d8c David S. Miller 2009-04-01  12   * per_cpu() area as we really need to lock them into the TLB and
19f0fa3fb3499d8c David S. Miller 2009-04-01  13   * thus make them part of the main kernel image.  As a result we
19f0fa3fb3499d8c David S. Miller 2009-04-01  14   * try to make this as small as possible.
19f0fa3fb3499d8c David S. Miller 2009-04-01  15   *
19f0fa3fb3499d8c David S. Miller 2009-04-01  16   * This is padded out and aligned to 64-bytes to avoid false sharing
19f0fa3fb3499d8c David S. Miller 2009-04-01  17   * on SMP.
19f0fa3fb3499d8c David S. Miller 2009-04-01  18   */
19f0fa3fb3499d8c David S. Miller 2009-04-01  19  
19f0fa3fb3499d8c David S. Miller 2009-04-01  20  /* If you modify the size of this structure, please update
19f0fa3fb3499d8c David S. Miller 2009-04-01  21   * TRAP_BLOCK_SZ_SHIFT below.
19f0fa3fb3499d8c David S. Miller 2009-04-01  22   */
19f0fa3fb3499d8c David S. Miller 2009-04-01  23  struct thread_info;
19f0fa3fb3499d8c David S. Miller 2009-04-01  24  struct trap_per_cpu {
19f0fa3fb3499d8c David S. Miller 2009-04-01  25  /* D-cache line 1: Basic thread information, cpu and device mondo queues */
19f0fa3fb3499d8c David S. Miller 2009-04-01  26  	struct thread_info	*thread;
19f0fa3fb3499d8c David S. Miller 2009-04-01  27  	unsigned long		pgd_paddr;
19f0fa3fb3499d8c David S. Miller 2009-04-01  28  	unsigned long		cpu_mondo_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  29  	unsigned long		dev_mondo_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  30  
19f0fa3fb3499d8c David S. Miller 2009-04-01  31  /* D-cache line 2: Error Mondo Queue and kernel buffer pointers */
19f0fa3fb3499d8c David S. Miller 2009-04-01  32  	unsigned long		resum_mondo_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  33  	unsigned long		resum_kernel_buf_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  34  	unsigned long		nonresum_mondo_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  35  	unsigned long		nonresum_kernel_buf_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  36  
19f0fa3fb3499d8c David S. Miller 2009-04-01  37  /* Dcache lines 3, 4, 5, and 6: Hypervisor Fault Status */
19f0fa3fb3499d8c David S. Miller 2009-04-01  38  	struct hv_fault_status	fault_info;
19f0fa3fb3499d8c David S. Miller 2009-04-01  39  
19f0fa3fb3499d8c David S. Miller 2009-04-01  40  /* Dcache line 7: Physical addresses of CPU send mondo block and CPU list.  */
19f0fa3fb3499d8c David S. Miller 2009-04-01  41  	unsigned long		cpu_mondo_block_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  42  	unsigned long		cpu_list_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  43  	unsigned long		tsb_huge;
19f0fa3fb3499d8c David S. Miller 2009-04-01  44  	unsigned long		tsb_huge_temp;
19f0fa3fb3499d8c David S. Miller 2009-04-01  45  
19f0fa3fb3499d8c David S. Miller 2009-04-01  46  /* Dcache line 8: IRQ work list, and keep trap_block a power-of-2 in size.  */
19f0fa3fb3499d8c David S. Miller 2009-04-01  47  	unsigned long		irq_worklist_pa;
19f0fa3fb3499d8c David S. Miller 2009-04-01  48  	unsigned int		cpu_mondo_qmask;
19f0fa3fb3499d8c David S. Miller 2009-04-01  49  	unsigned int		dev_mondo_qmask;
19f0fa3fb3499d8c David S. Miller 2009-04-01  50  	unsigned int		resum_qmask;
19f0fa3fb3499d8c David S. Miller 2009-04-01  51  	unsigned int		nonresum_qmask;
5a5488d3bb9a23d9 David S. Miller 2009-04-01  52  	unsigned long		__per_cpu_base;
19f0fa3fb3499d8c David S. Miller 2009-04-01  53  } __attribute__((aligned(64)));
19f0fa3fb3499d8c David S. Miller 2009-04-01 @54  extern struct trap_per_cpu trap_block[NR_CPUS];
f05a68653e56ca2f Sam Ravnborg    2014-05-16  55  void init_cur_cpu_trap(struct thread_info *);
f05a68653e56ca2f Sam Ravnborg    2014-05-16  56  void setup_tba(void);
19f0fa3fb3499d8c David S. Miller 2009-04-01  57  extern int ncpus_probed;
9d53caec84c7c570 Jane Chu        2017-07-11  58  extern u64 cpu_mondo_counter[NR_CPUS];
19f0fa3fb3499d8c David S. Miller 2009-04-01  59  

:::::: The code at line 54 was first introduced by commit
:::::: 19f0fa3fb3499d8c5fb861933959f546d05fc202 sparc64: Move trap_block[] definitions into a new header file.

:::::: TO: David S. Miller <davem@davemloft.net>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org



* Re: [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables
  2020-05-28 14:05 ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables Peter Zijlstra
  2020-05-29  3:55   ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled, _context} " kbuild test robot
  2020-05-29  4:12   ` kbuild test robot
@ 2020-05-29 16:16   ` Peter Zijlstra
  2020-05-29 19:38     ` Peter Zijlstra
  2 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-29 16:16 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy

On Thu, May 28, 2020 at 04:05:38PM +0200, Peter Zijlstra wrote:
> +#include <asm/percpu.h>
> +#include <asm/percpu.h>

If we change those to <linux/percpu-defs.h>, in direct conflict with
what the header says about itself, all builds seem to be good again.

s390, sparc64, power64 all build again.
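
For reference, the change being suggested here (which the follow-up
below finds insufficient):

  -#include <asm/percpu.h>
  +#include <linux/percpu-defs.h>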


* Re: [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} to per-cpu variables
  2020-05-29 16:16   ` [PATCH v2 3/6] lockdep: Change hardirq{s_enabled,_context} " Peter Zijlstra
@ 2020-05-29 19:38     ` Peter Zijlstra
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2020-05-29 19:38 UTC (permalink / raw)
  To: mingo, will, tglx; +Cc: x86, linux-kernel, a.darwish, rostedt, bigeasy

On Fri, May 29, 2020 at 06:16:47PM +0200, Peter Zijlstra wrote:
> On Thu, May 28, 2020 at 04:05:38PM +0200, Peter Zijlstra wrote:
> > +#include <asm/percpu.h>
> > +#include <asm/percpu.h>
> 
> If we change those to <linux/percpu-defs.h>, in direct conflict with
> what the header says about itself, all builds seem to be good again.
> 
> s390, sparc64, power64 all build again.

n/m that gives other fail :/

