* [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable()
@ 2014-01-16 17:18 Steven Rostedt
  2014-01-16 17:18 ` [RFC][PATCH 1/2] preempt: Add CONFIG_DEBUG_PREEMPT_COUNT to consolidate ifdef logic Steven Rostedt
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Steven Rostedt @ 2014-01-16 17:18 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Clark Williams


Back in the original real-time patch there was a nice debug feature
that tracked where preemption was disabled but not matched by a
preempt_enable(). This information made debugging "scheduling while
atomic" bugs much easier, especially when the culprit was buried in a
function chain that happened just before the schedule. For example:

bar() {
  preempt_disable();
  [...]
  if (ret < 0)
     return;
  [...]
}

foo() {
  [...]
  bar();
  [...]
}

foo();
schedule();

The "scheduling while atomic" warning detects the bug, but it does not
tell you where the bug happened. It may take several tries to figure out
where the missing preempt_enable() should have been. Sometimes those
call chains are very deep, which makes the culprit even more difficult
to find.

Although this bug does not happen often, this feature is extremely
helpful when it does. I decided to take Ingo's code, which was written
just for debugging the -rt patch, and clean it up into something
suitable for mainline.

I started with his code and modified it to be a bit simpler, as he
never meant that code for mainline. First, as preempt disabling is
really a CPU state and not a process state, I made the stacks per-CPU
arrays. This eliminates any changes to sched.h or the task_struct.

The preempt trace is only needed when "scheduling while atomic" or
"might sleep" bugs are found, so the information is only printed in
those cases (not on all bug reports).

I renamed the config option from PREEMPT_TRACE to PREEMPT_STACK to
prevent confusion with the tracing facility PREEMPT_TRACER, which
traces how long preemption is disabled.

Using a struct to save the ip and parent ip together also keeps the code
a bit cleaner than using two different arrays.

Using a test module that creates a kernel thread, where the thread
calls a function bad_preempt_disable() that does a preempt_disable()
with no matching preempt_enable() and then calls schedule(), the output
looks like the following:

BUG: scheduling while atomic: task1/3889/0x00000002
Modules linked in: sched_bug(O) [..]
CPU: 7 PID: 3889 Comm: task1 Tainted: G           O 3.13.0-rc8-test+ #28
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 0000000000000007 ffff8800c87a1dd8 ffffffff81627676 ffff88011ebcf5d0
 ffff8800ca0eaf00 ffff8800c87a1df8 ffffffff81623825 0000000000000002
 ffff88011ebd3600 ffff8800c87a1e78 ffffffff8162b0f5 ffff8800c87a1e28
Call Trace:
 [<ffffffff81627676>] dump_stack+0x4f/0x7c
 [<ffffffff81623825>] __schedule_bug+0x59/0x6e
 [<ffffffff8162b0f5>] __schedule+0x6b5/0x7e0
 [<ffffffff816329f1>] ? preempt_count_sub+0xb1/0x100
 [<ffffffffa06d9020>] ? bad_preempt_disable+0x20/0x20 [sched_bug]
 [<ffffffff8162b339>] schedule+0x29/0x70
 [<ffffffffa06d9038>] dead_thread+0x18/0x78 [sched_bug]
 [<ffffffff81073559>] kthread+0xc9/0xe0
 [<ffffffff81620000>] ? pci_add_new_bus+0x150/0x3b0
 [<ffffffff81073490>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8163676c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81073490>] ? flush_kthread_worker+0xb0/0xb0
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffffa06d9013>] .... bad_preempt_disable+0x13/0x20 [sched_bug]
.....[<ffffffffa06d9033>] ..   ( <= dead_thread+0x13/0x78 [sched_bug])
.. [<ffffffff8162aa89>] .... __schedule+0x49/0x7e0
.....[<ffffffff8162b339>] ..   ( <= schedule+0x29/0x70)

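For reference, a minimal sketch of what such a test module could look
like follows. This is only an illustration: the actual sched_bug module
was not posted, and everything beyond the symbol names visible in the
trace above is guessed.

#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/preempt.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *thread;

/* Disable preemption and "forget" the matching preempt_enable() */
static noinline void bad_preempt_disable(void)
{
	preempt_disable();
}

static int dead_thread(void *data)
{
	bad_preempt_disable();
	set_current_state(TASK_INTERRUPTIBLE);
	/* Scheduling with a nonzero preempt count triggers the report */
	schedule();
	preempt_enable();	/* fix up the count so the thread can exit */

	while (!kthread_should_stop()) {
		set_current_state(TASK_INTERRUPTIBLE);
		schedule();
	}
	__set_current_state(TASK_RUNNING);
	return 0;
}

static int __init sched_bug_init(void)
{
	thread = kthread_run(dead_thread, NULL, "task1");
	return IS_ERR(thread) ? PTR_ERR(thread) : 0;
}

static void __exit sched_bug_exit(void)
{
	kthread_stop(thread);
}

module_init(sched_bug_init);
module_exit(sched_bug_exit);
MODULE_LICENSE("GPL");

Loading a module like this (insmod sched_bug.ko) immediately produces
the report above.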

Steven Rostedt (Red Hat) (2):
      preempt: Add CONFIG_DEBUG_PREEMPT_COUNT to consolidate ifdef logic
      preempt: Show preempt disable stack on schedule bugs

----
 include/linux/preempt.h |  2 +-
 kernel/sched/core.c     | 63 ++++++++++++++++++++++++++++++++++++++++++++++---
 kernel/trace/Kconfig    |  1 +
 lib/Kconfig.debug       | 20 ++++++++++++++++
 4 files changed, 82 insertions(+), 4 deletions(-)


* [RFC][PATCH 1/2] preempt: Add CONFIG_DEBUG_PREEMPT_COUNT to consolidate ifdef logic
  2014-01-16 17:18 [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable() Steven Rostedt
@ 2014-01-16 17:18 ` Steven Rostedt
  2014-01-16 17:18 ` [RFC][PATCH 2/2] preempt: Show preempt disable stack on schedule bugs Steven Rostedt
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2014-01-16 17:18 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Clark Williams


From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Add a Kconfig bool DEBUG_PREEMPT_COUNT that gets selected by both
DEBUG_PREEMPT and PREEMPT_TRACER, as both need the full
preempt_count_add() and preempt_count_sub() function calls to debug the
preempt count logic.

This consolidates the ifdef logic and lets other config options enable
the debug version of preempt_count_add() without messing up the code
further.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/preempt.h | 2 +-
 kernel/sched/core.c     | 3 +--
 kernel/trace/Kconfig    | 1 +
 lib/Kconfig.debug       | 4 ++++
 4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index a3d9dc8..cd48ecc 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -17,7 +17,7 @@
 
 #include <asm/preempt.h>
 
-#if defined(CONFIG_DEBUG_PREEMPT) || defined(CONFIG_PREEMPT_TRACER)
+#ifdef CONFIG_DEBUG_PREEMPT_COUNT
 extern void preempt_count_add(int val);
 extern void preempt_count_sub(int val);
 #define preempt_count_dec_and_test() ({ preempt_count_sub(1); should_resched(); })
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a88f4a4..68b0c1f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2339,8 +2339,7 @@ notrace unsigned long get_parent_ip(unsigned long addr)
 	return addr;
 }
 
-#if defined(CONFIG_PREEMPT) && (defined(CONFIG_DEBUG_PREEMPT) || \
-				defined(CONFIG_PREEMPT_TRACER))
+#ifdef CONFIG_DEBUG_PREEMPT_COUNT
 
 void __kprobes preempt_count_add(int val)
 {
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 015f85a..3fb0331 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -202,6 +202,7 @@ config PREEMPT_TRACER
 	select RING_BUFFER_ALLOW_SWAP
 	select TRACER_SNAPSHOT
 	select TRACER_SNAPSHOT_PER_CPU_SWAP
+	select DEBUG_PREEMPT_COUNT
 	help
 	  This option measures the time spent in preemption-off critical
 	  sections, with microsecond accuracy.
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index db25707..8cf7819 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -796,9 +796,13 @@ config TIMER_STATS
 	  (it defaults to deactivated on bootup and will only be activated
 	  if some application like powertop activates it explicitly).
 
+config DEBUG_PREEMPT_COUNT
+       bool
+
 config DEBUG_PREEMPT
 	bool "Debug preemptible kernel"
 	depends on DEBUG_KERNEL && PREEMPT && TRACE_IRQFLAGS_SUPPORT
+	select DEBUG_PREEMPT_COUNT
 	default y
 	help
 	  If you say Y here then the kernel will use a debug variant of the
-- 
1.8.4.3




* [RFC][PATCH 2/2] preempt: Show preempt disable stack on schedule bugs
  2014-01-16 17:18 [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable() Steven Rostedt
  2014-01-16 17:18 ` [RFC][PATCH 1/2] preempt: Add CONFIG_DEBUG_PREEMPT_COUNT to consolidate ifdef logic Steven Rostedt
@ 2014-01-16 17:18 ` Steven Rostedt
  2014-01-16 17:42 ` [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable() Steven Rostedt
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2014-01-16 17:18 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Clark Williams


From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

There have been a few times where a function did a preempt_disable()
without a matching preempt_enable().

For example, if a function deep in a call chain does a preempt_disable()
but, due to an early exit from the routine, never does the matching
preempt_enable(), the kernel may not warn about it until much later:

bar() {
  preempt_disable();
  [...]
  if (ret < 0)
     return;
  [...]
}

foo() {
  [...]
  bar();
  [...]
}

foo();
schedule();

The "scheduling while atomic" warning detects the bug, but it does not
tell you where the bug happened. It may take several tries to figure out
where the missing preempt_enable() should have been. Sometimes those
call chains are very deep, which makes the culprit even more difficult
to find.

The original real-time patch had a feature, written by Ingo Molnar, to
trace all the locations that disabled preemption for each process. On a
bug, it would report where preemption had been disabled.

I started with his code and modified it to be a bit simpler, as he
never meant that code for mainline. First, as preempt disabling is
really a CPU state and not a process state, I made the stacks per-CPU
arrays. This eliminates any changes to sched.h or the task_struct.

The preempt trace is only needed when "scheduling while atomic" or
"might sleep" bugs are found, so the information is only printed in
those cases (not on all bug reports).

I renamed the config option from PREEMPT_TRACE to PREEMPT_STACK to
prevent confusion with the tracing facility PREEMPT_TRACER, which
traces how long preemption is disabled.

Using a struct to save the ip and parent ip together also keeps the code
a bit cleaner than using two different arrays.

Using a test module that creates a kernel thread, where the thread
calls a function bad_preempt_disable() that does a preempt_disable()
with no matching preempt_enable() and then calls schedule(), the output
looks like the following:

BUG: scheduling while atomic: task1/3889/0x00000002
Modules linked in: sched_bug(O) [..]
CPU: 7 PID: 3889 Comm: task1 Tainted: G           O 3.13.0-rc8-test+ #28
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 0000000000000007 ffff8800c87a1dd8 ffffffff81627676 ffff88011ebcf5d0
 ffff8800ca0eaf00 ffff8800c87a1df8 ffffffff81623825 0000000000000002
 ffff88011ebd3600 ffff8800c87a1e78 ffffffff8162b0f5 ffff8800c87a1e28
Call Trace:
 [<ffffffff81627676>] dump_stack+0x4f/0x7c
 [<ffffffff81623825>] __schedule_bug+0x59/0x6e
 [<ffffffff8162b0f5>] __schedule+0x6b5/0x7e0
 [<ffffffff816329f1>] ? preempt_count_sub+0xb1/0x100
 [<ffffffffa06d9020>] ? bad_preempt_disable+0x20/0x20 [sched_bug]
 [<ffffffff8162b339>] schedule+0x29/0x70
 [<ffffffffa06d9038>] dead_thread+0x18/0x78 [sched_bug]
 [<ffffffff81073559>] kthread+0xc9/0xe0
 [<ffffffff81620000>] ? pci_add_new_bus+0x150/0x3b0
 [<ffffffff81073490>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8163676c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81073490>] ? flush_kthread_worker+0xb0/0xb0
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffffa06d9013>] .... bad_preempt_disable+0x13/0x20 [sched_bug]
.....[<ffffffffa06d9033>] ..   ( <= dead_thread+0x13/0x78 [sched_bug])
.. [<ffffffff8162aa89>] .... __schedule+0x49/0x7e0
.....[<ffffffff8162b339>] ..   ( <= schedule+0x29/0x70)

Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/sched/core.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 lib/Kconfig.debug   | 16 ++++++++++++++
 2 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 68b0c1f..336b2c0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2341,8 +2341,51 @@ notrace unsigned long get_parent_ip(unsigned long addr)
 
 #ifdef CONFIG_DEBUG_PREEMPT_COUNT
 
+#ifdef CONFIG_PREEMPT_STACK
+# define MAX_PREEMPT_STACK 25
+struct preempt_stack {
+	unsigned long		ip;
+	unsigned long		parent_ip;
+};
+
+static DEFINE_PER_CPU(struct preempt_stack,
+		      preempt_stack_trace[MAX_PREEMPT_STACK]);
+
+void print_preempt_trace(void)
+{
+	struct preempt_stack *stack;
+	unsigned int count;
+	unsigned int i, lim;
+
+	stack = this_cpu_ptr(preempt_stack_trace);
+
+	count = preempt_count();
+
+	lim = count & PREEMPT_MASK;
+
+	if (lim >= MAX_PREEMPT_STACK)
+		lim = MAX_PREEMPT_STACK-1;
+	printk(KERN_ERR "---------------------------\n");
+	printk(KERN_CONT "| preempt count: %08x ]\n", count);
+	printk(KERN_CONT "| %d-level deep critical section nesting:\n", lim);
+	printk(KERN_CONT "----------------------------------------\n");
+	for (i = 1; i <= lim; i++) {
+		printk(KERN_CONT ".. [<%08lx>] .... %pS\n",
+		       stack[i].ip, (void *)stack[i].ip);
+		printk(KERN_CONT ".....[<%08lx>] ..   ( <= %pS)\n",
+		       stack[i].parent_ip, (void *)stack[i].parent_ip);
+	}
+	printk(KERN_CONT "\n");
+}
+#else
+static inline void print_preempt_trace(void) { }
+#endif
+
 void __kprobes preempt_count_add(int val)
 {
+	unsigned long ip = CALLER_ADDR0;
+	unsigned long parent_ip = get_parent_ip(CALLER_ADDR1);
+
 #ifdef CONFIG_DEBUG_PREEMPT
 	/*
 	 * Underflow?
@@ -2351,6 +2394,19 @@ void __kprobes preempt_count_add(int val)
 		return;
 #endif
 	__preempt_count_add(val);
+
+#ifdef CONFIG_PREEMPT_STACK
+	if (val < SOFTIRQ_OFFSET) {
+		unsigned int idx = preempt_count() & PREEMPT_MASK;
+		if (idx < MAX_PREEMPT_STACK) {
+			struct preempt_stack *stack;
+			stack = this_cpu_ptr(preempt_stack_trace);
+			stack[idx].ip = ip;
+			stack[idx].parent_ip = parent_ip;
+		}
+	}
+#endif
+
 #ifdef CONFIG_DEBUG_PREEMPT
 	/*
 	 * Spinlock count overflowing soon?
@@ -2359,7 +2415,7 @@ void __kprobes preempt_count_add(int val)
 				PREEMPT_MASK - 10);
 #endif
 	if (preempt_count() == val)
-		trace_preempt_off(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));
+		trace_preempt_off(ip, parent_ip);
 }
 EXPORT_SYMBOL(preempt_count_add);
 
@@ -2403,6 +2459,7 @@ static noinline void __schedule_bug(struct task_struct *prev)
 	if (irqs_disabled())
 		print_irqtrace_events(prev);
 	dump_stack();
+	print_preempt_trace();
 	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
 }
 
@@ -6388,6 +6445,7 @@ void __might_sleep(const char *file, int line, int preempt_offset)
 	if (irqs_disabled())
 		print_irqtrace_events(current);
 	dump_stack();
+	print_preempt_trace();
 }
 EXPORT_SYMBOL(__might_sleep);
 #endif
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8cf7819..b7999a4 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -810,6 +810,22 @@ config DEBUG_PREEMPT
 	  if kernel code uses it in a preemption-unsafe way. Also, the kernel
 	  will detect preemption count underflows.
 
+config PREEMPT_STACK
+	bool "Keep a record of preempt disabled locations"
+	depends on DEBUG_KERNEL
+	depends on PREEMPT
+	select DEBUG_PREEMPT_COUNT
+	help
+	 When enabled, it keeps track of the last 25 locations that disabled
+	 preemption. This is useful to debug a "scheduling while atomic" error.
+	 Sometimes it is hard to find what left preemption disabled, and this
+	 utility can help.
+
+	 Saying yes to this will introduce overhead, as all preempt_disable()
+	 calls are tracked.
+
+	 If unsure, say N.
+
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
 config DEBUG_RT_MUTEXES
-- 
1.8.4.3




* Re: [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable()
  2014-01-16 17:18 [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable() Steven Rostedt
  2014-01-16 17:18 ` [RFC][PATCH 1/2] preempt: Add CONFIG_DEBUG_PREEMPT_COUNT to consolidate ifdef logic Steven Rostedt
  2014-01-16 17:18 ` [RFC][PATCH 2/2] preempt: Show preempt disable stack on schedule bugs Steven Rostedt
@ 2014-01-16 17:42 ` Steven Rostedt
  2014-01-16 17:45 ` Peter Zijlstra
  2014-01-16 18:53 ` John Kacur
  4 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2014-01-16 17:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
	Clark Williams

On Thu, 16 Jan 2014 12:18:24 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> 
> Back in the original real-time patch there was a nice debug feature
> that tracked where preemption was disabled but not matched by a
> preempt_enable(). This information made debugging "scheduling while
> atomic" bugs much easier, especially when the culprit was buried in a
> function chain that happened just before the schedule. For example:
> 

> Although this bug does not happen often, this feature is extremely
> helpful when it does. I decided to take Ingo's code, which was written
> just for debugging the -rt patch, and clean it up into something
> suitable for mainline.
> 

Hmm, I rewrote this prologue a few times, and even though patch 2
states it, I want to make it clear that the original feature this is
based on was written by Ingo Molnar. I had that clear in my original
version of the prologue, but somehow it got deleted.

-- Steve


* Re: [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable()
  2014-01-16 17:18 [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable() Steven Rostedt
                   ` (2 preceding siblings ...)
  2014-01-16 17:42 ` [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable() Steven Rostedt
@ 2014-01-16 17:45 ` Peter Zijlstra
  2014-01-16 18:39   ` Steven Rostedt
  2014-01-16 18:53 ` John Kacur
  4 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2014-01-16 17:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Clark Williams

On Thu, Jan 16, 2014 at 12:18:24PM -0500, Steven Rostedt wrote:

Didn't see anything horrible in there, so fine with me.

Another thing I recently ran into was:

  local_irq_disable();
  preempt_disable();
  local_irq_enable();

  local_irq_disable();
  preempt_enable();
  local_irq_enable();

I'm not sure any of our debugging catches that, I meant to extend
lockdep, but maybe it could also be hooked into something else.


* Re: [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable()
  2014-01-16 17:45 ` Peter Zijlstra
@ 2014-01-16 18:39   ` Steven Rostedt
  2014-01-16 18:42     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2014-01-16 18:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-rt-users, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Clark Williams

On Thu, 16 Jan 2014 18:45:36 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Jan 16, 2014 at 12:18:24PM -0500, Steven Rostedt wrote:
> 
> Didn't see anything horrible in there, so fine with me.
> 
> Another thing I recently ran into was:
> 
>   local_irq_disable();
>   preempt_disable();
>   local_irq_enable();
> 
>   local_irq_disable();
>   preempt_enable();
>   local_irq_enable();
> 
> I'm not sure any of our debugging catches that, I meant to extend
> lockdep, but maybe it could also be hooked into something else.

Hmm, what's wrong with the above?

Note, the preemptirqsoff tracer will trace that as one non-preemptible
section, from the first local_irq_disable() to the last
local_irq_enable(). In fact, that's basically what the preemptirqsoff
tracer was made for ;-)

-- Steve


* Re: [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable()
  2014-01-16 18:39   ` Steven Rostedt
@ 2014-01-16 18:42     ` Peter Zijlstra
  2014-01-16 19:00       ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2014-01-16 18:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Clark Williams

On Thu, Jan 16, 2014 at 01:39:06PM -0500, Steven Rostedt wrote:
> On Thu, 16 Jan 2014 18:45:36 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Thu, Jan 16, 2014 at 12:18:24PM -0500, Steven Rostedt wrote:
> > 
> > Didn't see anything horrible in there, so fine with me.
> > 
> > Another thing I recently ran into was:
> > 
> >   local_irq_disable();
> >   preempt_disable();
> >   local_irq_enable();
> > 
> >   local_irq_disable();
> >   preempt_enable();
> >   local_irq_enable();
> > 
> > I'm not sure any of our debugging catches that, I meant to extend
> > lockdep, but maybe it could also be hooked into something else.
> 
> Hmm, what's wrong with the above?

Suppose a reschedule interrupt happens in the middle; we set
NEED_RESCHED but won't act because preempt is disabled.

When we hit preempt_enable() we'll call into preempt_schedule() but fail
because preemptible() finds irqs_disabled().

And in the end we do not schedule, but have NEED_RESCHED set.

FAIL.


* Re: [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable()
  2014-01-16 17:18 [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable() Steven Rostedt
                   ` (3 preceding siblings ...)
  2014-01-16 17:45 ` Peter Zijlstra
@ 2014-01-16 18:53 ` John Kacur
  4 siblings, 0 replies; 9+ messages in thread
From: John Kacur @ 2014-01-16 18:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, RT, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Clark Williams

On Thu, Jan 16, 2014 at 6:18 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Back in the original real-time patch there was a nice debug feature
> that tracked where preemption was disabled but not matched by a
> preempt_enable(). This information made debugging "scheduling while
> atomic" bugs much easier, especially when the culprit was buried in a
> function chain that happened just before the schedule. For example:
>
> bar() {
>   preempt_disable();
>   [...]
>   if (ret < 0)
>      return;
>   [...]
> }
>
> foo() {
>   [...]
>   bar();
>   [...]
> }
>
> foo();
> schedule();
>
> The "scheduling while atomic" warning detects the bug, but it does not
> tell you where the bug happened. It may take several tries to figure
> out where the missing preempt_enable() should have been. Sometimes
> those call chains are very deep, which makes the culprit even more
> difficult to find.
>
> Although this bug does not happen often, this feature is extremely
> helpful when it does. I decided to take Ingo's code, which was written
> just for debugging the -rt patch, and clean it up into something
> suitable for mainline.
>
> I started with his code and modified it to be a bit simpler, as he
> never meant that code for mainline. First, as preempt disabling is
> really a CPU state and not a process state, I made the stacks per-CPU
> arrays. This eliminates any changes to sched.h or the task_struct.
>
> The preempt trace is only needed when "scheduling while atomic" or
> "might sleep" bugs are found, so the information is only printed in
> those cases (not on all bug reports).
>
> I renamed the config option from PREEMPT_TRACE to PREEMPT_STACK to
> prevent confusion with the tracing facility PREEMPT_TRACER, which
> traces how long preemption is disabled.
>
> Using a struct to save the ip and parent ip together also keeps the code
> a bit cleaner than using two different arrays.
>
> Using a test module that creates a kernel thread, where the thread
> calls a function bad_preempt_disable() that does a preempt_disable()
> with no matching preempt_enable() and then calls schedule(), the
> output looks like the following:
>
> BUG: scheduling while atomic: task1/3889/0x00000002
> Modules linked in: sched_bug(O) [..]
> CPU: 7 PID: 3889 Comm: task1 Tainted: G           O 3.13.0-rc8-test+ #28
> Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
>  0000000000000007 ffff8800c87a1dd8 ffffffff81627676 ffff88011ebcf5d0
>  ffff8800ca0eaf00 ffff8800c87a1df8 ffffffff81623825 0000000000000002
>  ffff88011ebd3600 ffff8800c87a1e78 ffffffff8162b0f5 ffff8800c87a1e28
> Call Trace:
>  [<ffffffff81627676>] dump_stack+0x4f/0x7c
>  [<ffffffff81623825>] __schedule_bug+0x59/0x6e
>  [<ffffffff8162b0f5>] __schedule+0x6b5/0x7e0
>  [<ffffffff816329f1>] ? preempt_count_sub+0xb1/0x100
>  [<ffffffffa06d9020>] ? bad_preempt_disable+0x20/0x20 [sched_bug]
>  [<ffffffff8162b339>] schedule+0x29/0x70
>  [<ffffffffa06d9038>] dead_thread+0x18/0x78 [sched_bug]
>  [<ffffffff81073559>] kthread+0xc9/0xe0
>  [<ffffffff81620000>] ? pci_add_new_bus+0x150/0x3b0
>  [<ffffffff81073490>] ? flush_kthread_worker+0xb0/0xb0
>  [<ffffffff8163676c>] ret_from_fork+0x7c/0xb0
>  [<ffffffff81073490>] ? flush_kthread_worker+0xb0/0xb0
> ---------------------------
> | preempt count: 00000002 ]
> | 2-level deep critical section nesting:
> ----------------------------------------
> .. [<ffffffffa06d9013>] .... bad_preempt_disable+0x13/0x20 [sched_bug]
> .....[<ffffffffa06d9033>] ..   ( <= dead_thread+0x13/0x78 [sched_bug])
> .. [<ffffffff8162aa89>] .... __schedule+0x49/0x7e0
> .....[<ffffffff8162b339>] ..   ( <= schedule+0x29/0x70)
>
>
> Steven Rostedt (Red Hat) (2):
>       preempt: Add CONFIG_DEBUG_PREEMPT_COUNT to consolidate ifdef logic
>       preempt: Show preempt disable stack on schedule bugs
>
> ----
>  include/linux/preempt.h |  2 +-
>  kernel/sched/core.c     | 63 ++++++++++++++++++++++++++++++++++++++++++++++---
>  kernel/trace/Kconfig    |  1 +
>  lib/Kconfig.debug       | 20 ++++++++++++++++
>  4 files changed, 82 insertions(+), 4 deletions(-)
> --

fwiw - I have found this feature useful in the past, and would love to
see it in mainline.

Thanks

John Kacur


* Re: [RFC][PATCH 0/2] preempt: Track unmatched preempt_enable() to preempt_disable()
  2014-01-16 18:42     ` Peter Zijlstra
@ 2014-01-16 19:00       ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2014-01-16 19:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-rt-users, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, Clark Williams

On Thu, 16 Jan 2014 19:42:51 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> Suppose a reschedule interrupt happens in the middle; we set
> NEED_RESCHED but won't act because preempt is disabled.
> 
> When we hit preempt_enable() we'll call into preempt_schedule() but fail
> because preemptible() finds irqs_disabled().
> 
> And in the end we do not schedule, but have NEED_RESCHED set.
> 
> FAIL.

Ah! Interesting. Actually, you don't even need the first part. That
happens simply with:

	preempt_disable();

	<interrupt - set need_resched>

	local_irq_save();
	preempt_enable();
	local_irq_restore();

I think we can get lockdep to trigger on that. I have ideas for making
the preempt tracer catch it, which shouldn't be too hard to convert to
lockdep.

It looks like all we need to do is check for any time interrupts become
disabled while preemption is disabled, and preemption is then enabled
without interrupts being enabled first. This could even be driven by
the state of the task or CPU.
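
Roughly, such a check could look like the following purely hypothetical
sketch (invented hook names, assuming it lives somewhere like
kernel/sched/core.c; this is not part of the posted patches):

static DEFINE_PER_CPU(bool, irqs_off_in_preempt);

/* hypothetical hook, run whenever interrupts become disabled */
static void debug_hardirqs_off(void)
{
	if (preempt_count() & PREEMPT_MASK)
		this_cpu_write(irqs_off_in_preempt, true);
}

/* hypothetical hook, run whenever interrupts are enabled again */
static void debug_hardirqs_on(void)
{
	this_cpu_write(irqs_off_in_preempt, false);
}

/* hypothetical hook, run from the preempt_enable() debug path */
static void debug_preempt_enable(void)
{
	/*
	 * If interrupts were disabled after preemption was disabled,
	 * and preemption is re-enabled before interrupts are, a
	 * NEED_RESCHED set in between cannot be acted upon.
	 */
	WARN_ONCE(irqs_disabled() && this_cpu_read(irqs_off_in_preempt),
		  "preempt_enable() with IRQs disabled may lose NEED_RESCHED\n");
}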

I can see if I can whip up a patch to do that.

-- Steve

