* [PATCH] kernel/{lockdep,hung_task}: Show locks and backtrace of running tasks.
@ 2018-09-03 11:44 Tetsuo Handa
From: Tetsuo Handa @ 2018-09-03 11:44 UTC
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
Cc: linux-kernel, Tetsuo Handa, Dmitry Vyukov

We are getting reports from syzbot where a running task seems to be
relevant to a hung task problem, but the NMI backtrace does not print
useful information [1].

Although commit 8cc05c71ba5f7936 ("locking/lockdep: Move sanity check to
inside lockdep_print_held_locks()") says that calling
lockdep_print_held_locks() on a running thread is considered unsafe,
showing the locks and backtraces of running tasks is useful for syzbot.
Thus, let's allow it if CONFIG_DEBUG_AID_FOR_SYZBOT is defined.
[1] https://syzkaller.appspot.com/bug?id=8bab7a6a5597bb10f90e8227a7d8a483748d93be
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Dmitry Vyukov <dvyukov@google.com>
---
 kernel/hung_task.c       | 20 ++++++++++++++++++++
 kernel/locking/lockdep.c |  9 +++++++++
 2 files changed, 29 insertions(+)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index b9132d1..1ac49a5 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -201,6 +201,26 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
 	if (hung_task_show_lock)
 		debug_show_all_locks();
 	if (hung_task_call_panic) {
+#ifdef CONFIG_DEBUG_AID_FOR_SYZBOT
+		/*
+		 * debug_show_all_locks() above forcibly dumped locks held by
+		 * running tasks with locks held. Now, let's dump backtrace of
+		 * running tasks as well, for NMI backtrace below tends to show
+		 * current thread (i.e. khungtaskd thread itself) and idle CPU
+		 * which are useless for debugging hung task problems.
+		 */
+		rcu_read_lock();
+		for_each_process_thread(g, t) {
+			if (t->state != TASK_RUNNING || t == current)
+				continue;
+			pr_err("INFO: task %s:%d was running.\n", t->comm,
+			       t->pid);
+			sched_show_task(t);
+			touch_nmi_watchdog();
+			touch_all_softlockup_watchdogs();
+		}
+		rcu_read_unlock();
+#endif
 		trigger_all_cpu_backtrace();
 		panic("hung_task: blocked tasks");
 	}
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e406c5f..efeebf6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -565,12 +565,21 @@ static void lockdep_print_held_locks(struct task_struct *p)
 	else
 		printk("%d lock%s held by %s/%d:\n", depth,
 		       depth > 1 ? "s" : "", p->comm, task_pid_nr(p));
+#ifndef CONFIG_DEBUG_AID_FOR_SYZBOT
 	/*
 	 * It's not reliable to print a task's held locks if it's not sleeping
 	 * and it's not the current task.
 	 */
 	if (p->state == TASK_RUNNING && p != current)
 		return;
+#else
+	/*
+	 * But showing locks and backtrace of running tasks seems to be helpful
+	 * for debugging hung task problems. Since syzbot will call panic()
+	 * shortly, risking problems caused by accessing stale information is
+	 * acceptable here.
+	 */
+#endif
 	for (i = 0; i < depth; i++) {
 		printk(" #%d: ", i);
 		print_lock(p->held_locks + i);
--
1.8.3.1
* Re: [PATCH] kernel/{lockdep,hung_task}: Show locks and backtrace of running tasks.
@ 2018-09-10 6:07 ` Tetsuo Handa
From: Tetsuo Handa @ 2018-09-10 6:07 UTC
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
Cc: linux-kernel, Dmitry Vyukov
On 2018/09/03 20:44, Tetsuo Handa wrote:
> We are getting reports from syzbot where a running task seems to be
> relevant to a hung task problem, but the NMI backtrace does not print
> useful information [1].

According to my local cache, 69% of hung task reports from syzbot say that
one CPU was running check_hung_uninterruptible_tasks() and the other CPU
was idle. I think that in many cases this patch would give more useful
information than the trigger_all_cpu_backtrace() output does. Can we try
this patch?

$ ls -l */CrashLog.*[0-9a-f] | wc -l
1666
$ for i in */CrashLog.*; do awk ' BEGIN { flag = 0; } { if (index($0, "NMI backtrace") > 0) { flag = 1; } else if (index($0, "panic") > 0) { exit; } if (flag == 1) { print $0; } }' "$i" > "$i.tmp"; done
$ ls -l */*.tmp | wc -l
1666
$ grep -i watchdog+ */*.tmp | wc -l
1662
$ grep -i "idling at" */*.tmp | wc -l
1151
$ grep -F '<IRQ>' */*.tmp | wc -l
220
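
The 69% above appears to be the "idling at" count divided by the number
of crash logs (1151 / 1666). That reading is inferred from the numbers,
since grep | wc -l counts matching lines rather than reports. A minimal
C sketch of the arithmetic, where percent_of() and check_quoted_figure()
are hypothetical helper names used only for illustration:

```c
#include <assert.h>

/* Hypothetical helper: share of `part` in `total`, in percent. */
double percent_of(unsigned int part, unsigned int total)
{
	if (total == 0)
		return 0.0;	/* avoid dividing by zero for an empty cache */
	return 100.0 * part / total;
}

/* Counts copied from the grep/wc output quoted in this mail. */
enum {
	CRASH_LOGS	= 1666,	/* extracted NMI backtrace sections */
	WATCHDOG_LINES	= 1662,	/* lines matching "watchdog+" */
	IDLE_LINES	= 1151,	/* lines matching "idling at" */
	IRQ_LINES	= 220,	/* lines matching "<IRQ>" */
};

/* Sanity check: the idle share truncates to the quoted 69%. */
void check_quoted_figure(void)
{
	assert((int)percent_of(IDLE_LINES, CRASH_LOGS) == 69);
}
```

By the same arithmetic, the "<IRQ>" count corresponds to roughly 13% of
the extracted sections, assuming at most one such line per report.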
* Re: [PATCH] kernel/{lockdep,hung_task}: Show locks and backtrace of running tasks.
@ 2018-10-17 10:12 ` Tetsuo Handa
From: Tetsuo Handa @ 2018-10-17 10:12 UTC
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
Cc: linux-kernel, Dmitry Vyukov
Hello.

I think that this patch helps in examining reports like
https://syzkaller.appspot.com/text?tag=CrashLog&x=150eab91400000
where there is a TASK_RUNNING thread with a lock held

  1 lock held by syz-executor0/18295:

and presumably that is the lock which the hung tasks are waiting for.
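
To make the effect on such a report concrete, the two checks in the
patch can be modelled in userspace as small predicates. This is only a
sketch under stated assumptions: struct task_model, the MODEL_* state
values, may_print_lock_details(), should_dump_backtrace() and the
debug_aid flag are hypothetical stand-ins for struct task_struct, the
kernel task states, the check in lockdep_print_held_locks(), the loop
added to check_hung_uninterruptible_tasks() and
CONFIG_DEBUG_AID_FOR_SYZBOT respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model values only; they stand in for the kernel's task states. */
#define MODEL_TASK_RUNNING		0
#define MODEL_TASK_UNINTERRUPTIBLE	2

/* Hypothetical, reduced stand-in for struct task_struct. */
struct task_model {
	long state;
	bool is_current;	/* models "p == current" */
};

/*
 * Models the check in lockdep_print_held_locks(): without the debug
 * aid, lock details are skipped for a running task that is not the
 * current one; with it, they are always printed.
 */
bool may_print_lock_details(const struct task_model *p, bool debug_aid)
{
	assert(p != NULL);
	if (debug_aid)
		return true;
	return !(p->state == MODEL_TASK_RUNNING && !p->is_current);
}

/*
 * Models the filter in the added hung_task loop: only running tasks
 * other than the current one (khungtaskd) get a backtrace.
 */
bool should_dump_backtrace(const struct task_model *t)
{
	assert(t != NULL);
	return t->state == MODEL_TASK_RUNNING && !t->is_current;
}
```

In this model, a task like syz-executor0/18295 (running, not current)
gets its lock details only when debug_aid is true, and it is also one of
the tasks whose backtrace the new loop would dump.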