linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH (repost)] locking/lockdep: add debug_show_all_lock_holders()
@ 2022-09-16 14:15 Tetsuo Handa
  2022-09-16 14:51 ` Waiman Long
  0 siblings, 1 reply; 8+ messages in thread
From: Tetsuo Handa @ 2022-09-16 14:15 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng
  Cc: Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

Currently, check_hung_uninterruptible_tasks() reports details of locks
held in the system. Also, lockdep_print_held_locks() does not report
details of locks held by a thread if that thread is in TASK_RUNNING state.
Several years of experience of debugging without vmcore tells me that
these limitations have been a barrier for understanding what went wrong
in syzbot's "INFO: task hung in" reports.

I initially thought that the cause of "INFO: task hung in" reports is
due to over-stressing. But I understood that over-stressing is unlikely.
I now consider that there likely is a deadlock/livelock bug where lockdep
cannot report as a deadlock when "INFO: task hung in" is reported.

A typical case is that thread-1 is waiting for something to happen (e.g.
wait_event_*()) with a lock held. When thread-2 tries to hold that lock
using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
to hold. But currently check_hung_uninterruptible_tasks() cannot report
the exact location of thread-1 which gives us an important hint for
understanding why thread-1 is holding that lock for so long period.

When check_hung_uninterruptible_tasks() reports a thread waiting for a
lock, it is important to report backtrace of threads which already held
that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
the exact location of threads which is holding any lock.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
This is repost of https://lkml.kernel.org/r/82af40cc-bf85-2b53-b8f9-dfc12e66a781@I-love.SAKURA.ne.jp .
I think there was no critical objection which blocks this change.

I wish that lockdep continues tracking locks (i.e. debug_locks remains 1)
even after something went wrong, for recently I sometimes encounter problems
that disable lockdep during boot stage.

It would be noisy to report possibility of e.g. circular locking dependency
every time due to keeping debug_locks enabled. But tracking locks even after
something went wrong will help debug_show_all_lock_holders() to survive
problems during boot stage.

I'm not expecting lockdep to report the same problem forever.
Reporting possibility of each problem pattern (e.g. circular locking dependency)
up to once, by using cmpxchg() inside reporting functions that call printk(),
would be enough.

I'm expecting lockdep to continue working without calling printk() even after
one of problem patterns (e.g. circular locking dependency) was printk()ed, so that
debug_show_all_locks()/debug_show_all_lock_holders() can call printk() when needed.

Changing debug_locks behavior is a future patch. For now, this patch alone
will help debugging Greg's usb.git#usb-testing tree which is generating
many "INFO: task hung in" reports.

 include/linux/debug_locks.h |  5 +++++
 kernel/hung_task.c          |  2 +-
 kernel/locking/lockdep.c    | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
index dbb409d..0567d5c 100644
--- a/include/linux/debug_locks.h
+++ b/include/linux/debug_locks.h
@@ -50,6 +50,7 @@ static __always_inline int __debug_locks_off(void)
 #ifdef CONFIG_LOCKDEP
 extern void debug_show_all_locks(void);
 extern void debug_show_held_locks(struct task_struct *task);
+extern void debug_show_all_lock_holders(void);
 extern void debug_check_no_locks_freed(const void *from, unsigned long len);
 extern void debug_check_no_locks_held(void);
 #else
@@ -61,6 +62,10 @@ static inline void debug_show_held_locks(struct task_struct *task)
 {
 }
 
+static inline void debug_show_all_lock_holders(void)
+{
+}
+
 static inline void
 debug_check_no_locks_freed(const void *from, unsigned long len)
 {
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index bb2354f..18e22bb 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -205,7 +205,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
  unlock:
 	rcu_read_unlock();
 	if (hung_task_show_lock)
-		debug_show_all_locks();
+		debug_show_all_lock_holders();
 
 	if (hung_task_show_all_bt) {
 		hung_task_show_all_bt = false;
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 64a13eb..d062541 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -55,6 +55,7 @@
 #include <linux/rcupdate.h>
 #include <linux/kprobes.h>
 #include <linux/lockdep.h>
+#include <linux/sched/debug.h>
 
 #include <asm/sections.h>
 
@@ -6509,6 +6510,37 @@ void debug_show_all_locks(void)
 	pr_warn("=============================================\n\n");
 }
 EXPORT_SYMBOL_GPL(debug_show_all_locks);
+
+void debug_show_all_lock_holders(void)
+{
+	struct task_struct *g, *p;
+
+	if (unlikely(!debug_locks)) {
+		pr_warn("INFO: lockdep is turned off.\n");
+		return;
+	}
+	pr_warn("\nShowing all threads with locks held in the system:\n");
+
+	rcu_read_lock();
+	for_each_process_thread(g, p) {
+		if (!p->lockdep_depth)
+			continue;
+		/*
+		 * Assuming that the caller of this function is in a process
+		 * context without any locks held, skip current thread which is
+		 * holding only RCU read lock.
+		 */
+		if (p == current)
+			continue;
+		sched_show_task(p);
+		lockdep_print_held_locks(p);
+		touch_nmi_watchdog();
+		touch_all_softlockup_watchdogs();
+	}
+	rcu_read_unlock();
+	pr_warn("\n");
+	pr_warn("=============================================\n\n");
+}
 #endif
 
 /*
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2022-09-16 14:15 [PATCH (repost)] locking/lockdep: add debug_show_all_lock_holders() Tetsuo Handa
@ 2022-09-16 14:51 ` Waiman Long
  2022-09-16 15:57   ` [PATCH v2] " Tetsuo Handa
  2022-09-16 17:46   ` [PATCH (repost)] " Boqun Feng
  0 siblings, 2 replies; 8+ messages in thread
From: Waiman Long @ 2022-09-16 14:51 UTC (permalink / raw)
  To: Tetsuo Handa, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

On 9/16/22 10:15, Tetsuo Handa wrote:
> Currently, check_hung_uninterruptible_tasks() reports details of locks
> held in the system. Also, lockdep_print_held_locks() does not report
> details of locks held by a thread if that thread is in TASK_RUNNING state.
> Several years of experience of debugging without vmcore tells me that
> these limitations have been a barrier for understanding what went wrong
> in syzbot's "INFO: task hung in" reports.
>
> I initially thought that the cause of "INFO: task hung in" reports is
> due to over-stressing. But I understood that over-stressing is unlikely.
> I now consider that there likely is a deadlock/livelock bug where lockdep
> cannot report as a deadlock when "INFO: task hung in" is reported.
>
> A typical case is that thread-1 is waiting for something to happen (e.g.
> wait_event_*()) with a lock held. When thread-2 tries to hold that lock
> using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
> thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
> to hold. But currently check_hung_uninterruptible_tasks() cannot report
> the exact location of thread-1 which gives us an important hint for
> understanding why thread-1 is holding that lock for so long period.
>
> When check_hung_uninterruptible_tasks() reports a thread waiting for a
> lock, it is important to report backtrace of threads which already held
> that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
> the exact location of threads which is holding any lock.

I am not against this patch, but I do like to see you wrapping your code 
in a __debug_show_all_locks() wrapper, for instance, with flags and make 
debug_show_all_locks() uses the new wrapper to avoid code redundancy.


> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> This is repost of https://lkml.kernel.org/r/82af40cc-bf85-2b53-b8f9-dfc12e66a781@I-love.SAKURA.ne.jp .
> I think there was no critical objection which blocks this change.
>
> I wish that lockdep continues tracking locks (i.e. debug_locks remains 1)
> even after something went wrong, for recently I sometimes encounter problems
> that disable lockdep during boot stage.
>
> It would be noisy to report possibility of e.g. circular locking dependency
> every time due to keeping debug_locks enabled. But tracking locks even after
> something went wrong will help debug_show_all_lock_holders() to survive
> problems during boot stage.
>
> I'm not expecting lockdep to report the same problem forever.
> Reporting possibility of each problem pattern (e.g. circular locking dependency)
> up to once, by using cmpxchg() inside reporting functions that call printk(),
> would be enough.
>
> I'm expecting lockdep to continue working without calling printk() even after
> one of problem patterns (e.g. circular locking dependency) was printk()ed, so that
> debug_show_all_locks()/debug_show_all_lock_holders() can call printk() when needed.
>
> Changing debug_locks behavior is a future patch. For now, this patch alone
> will help debugging Greg's usb.git#usb-testing tree which is generating
> many "INFO: task hung in" reports.

Boqun is actually working on a modularization patch to make some lockdep 
checking still active even after a lockdep bug is reported. I think he 
will take into consideration about this request.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] locking/lockdep: add debug_show_all_lock_holders()
  2022-09-16 14:51 ` Waiman Long
@ 2022-09-16 15:57   ` Tetsuo Handa
  2022-09-16 18:41     ` Waiman Long
  2022-09-16 17:46   ` [PATCH (repost)] " Boqun Feng
  1 sibling, 1 reply; 8+ messages in thread
From: Tetsuo Handa @ 2022-09-16 15:57 UTC (permalink / raw)
  To: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

Currently, check_hung_uninterruptible_tasks() reports details of locks
held in the system. Also, lockdep_print_held_locks() does not report
details of locks held by a thread if that thread is in TASK_RUNNING state.
Several years of experience of debugging without vmcore tells me that
these limitations have been a barrier for understanding what went wrong
in syzbot's "INFO: task hung in" reports.

I initially thought that the cause of "INFO: task hung in" reports is
due to over-stressing. But I understood that over-stressing is unlikely.
I now consider that there likely is a deadlock/livelock bug where lockdep
cannot report as a deadlock when "INFO: task hung in" is reported.

A typical case is that thread-1 is waiting for something to happen (e.g.
wait_event_*()) with a lock held. When thread-2 tries to hold that lock
using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
to hold. But currently check_hung_uninterruptible_tasks() cannot report
the exact location of thread-1 which gives us an important hint for
understanding why thread-1 is holding that lock for so long period.

When check_hung_uninterruptible_tasks() reports a thread waiting for a
lock, it is important to report backtrace of threads which already held
that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
the exact location of threads which is holding any lock.

To deduplicate code, share debug_show_all_{locks,lock_holders}() using
a flag. As a side effect of sharing, __debug_show_all_locks() skips
current thread if the caller is holding no lock, for reporting RCU lock
taken inside __debug_show_all_locks() is generally useless.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
Changes in v2:
  Share debug_show_all_lock_holders() and debug_show_all_locks(),
  suggested by Waiman Long <longman@redhat.com>.

 include/linux/debug_locks.h | 17 ++++++++++++++++-
 kernel/hung_task.c          |  2 +-
 kernel/locking/lockdep.c    | 14 +++++++++++---
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
index dbb409d77d4f..b45c89fadfe4 100644
--- a/include/linux/debug_locks.h
+++ b/include/linux/debug_locks.h
@@ -48,7 +48,18 @@ extern int debug_locks_off(void);
 #endif
 
 #ifdef CONFIG_LOCKDEP
-extern void debug_show_all_locks(void);
+extern void __debug_show_all_locks(bool show_stack);
+
+static inline void debug_show_all_locks(void)
+{
+	__debug_show_all_locks(false);
+}
+
+static inline void debug_show_all_lock_holders(void)
+{
+	__debug_show_all_locks(true);
+}
+
 extern void debug_show_held_locks(struct task_struct *task);
 extern void debug_check_no_locks_freed(const void *from, unsigned long len);
 extern void debug_check_no_locks_held(void);
@@ -61,6 +72,10 @@ static inline void debug_show_held_locks(struct task_struct *task)
 {
 }
 
+static inline void debug_show_all_lock_holders(void)
+{
+}
+
 static inline void
 debug_check_no_locks_freed(const void *from, unsigned long len)
 {
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index bb2354f73ded..18e22bbb714f 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -205,7 +205,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
  unlock:
 	rcu_read_unlock();
 	if (hung_task_show_lock)
-		debug_show_all_locks();
+		debug_show_all_lock_holders();
 
 	if (hung_task_show_all_bt) {
 		hung_task_show_all_bt = false;
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 64a13eb56078..7870f7e5c46b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -55,6 +55,7 @@
 #include <linux/rcupdate.h>
 #include <linux/kprobes.h>
 #include <linux/lockdep.h>
+#include <linux/sched/debug.h>
 
 #include <asm/sections.h>
 
@@ -6485,7 +6486,7 @@ void debug_check_no_locks_held(void)
 EXPORT_SYMBOL_GPL(debug_check_no_locks_held);
 
 #ifdef __KERNEL__
-void debug_show_all_locks(void)
+void __debug_show_all_locks(bool show_stack)
 {
 	struct task_struct *g, *p;
 
@@ -6493,12 +6494,19 @@ void debug_show_all_locks(void)
 		pr_warn("INFO: lockdep is turned off.\n");
 		return;
 	}
-	pr_warn("\nShowing all locks held in the system:\n");
+	if (show_stack)
+		pr_warn("\nShowing all threads with locks held in the system:\n");
+	else
+		pr_warn("\nShowing all locks held in the system:\n");
 
 	rcu_read_lock();
 	for_each_process_thread(g, p) {
 		if (!p->lockdep_depth)
 			continue;
+		if (p == current && p->lockdep_depth == 1)
+			continue;
+		if (show_stack)
+			sched_show_task(p);
 		lockdep_print_held_locks(p);
 		touch_nmi_watchdog();
 		touch_all_softlockup_watchdogs();
@@ -6508,7 +6516,7 @@ void debug_show_all_locks(void)
 	pr_warn("\n");
 	pr_warn("=============================================\n\n");
 }
-EXPORT_SYMBOL_GPL(debug_show_all_locks);
+EXPORT_SYMBOL_GPL(__debug_show_all_locks);
 #endif
 
 /*
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2022-09-16 14:51 ` Waiman Long
  2022-09-16 15:57   ` [PATCH v2] " Tetsuo Handa
@ 2022-09-16 17:46   ` Boqun Feng
  1 sibling, 0 replies; 8+ messages in thread
From: Boqun Feng @ 2022-09-16 17:46 UTC (permalink / raw)
  To: Waiman Long
  Cc: Tetsuo Handa, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

On Fri, Sep 16, 2022 at 10:51:14AM -0400, Waiman Long wrote:
[...]
> > 
> > I'm expecting lockdep to continue working without calling printk() even after
> > one of problem patterns (e.g. circular locking dependency) was printk()ed, so that
> > debug_show_all_locks()/debug_show_all_lock_holders() can call printk() when needed.
> > 
> > Changing debug_locks behavior is a future patch. For now, this patch alone
> > will help debugging Greg's usb.git#usb-testing tree which is generating
> > many "INFO: task hung in" reports.

The default behavior today does make sense: if the system has one
deadlock it should be fixed first before searching out another. So I
expect that changing debug_locks behavior would be configurable and we
keep the default as today.

> 
> Boqun is actually working on a modularization patch to make some lockdep
> checking still active even after a lockdep bug is reported. I think he will
> take into consideration about this request.
> 

Yes, this is one of the "problems" I try to resolve while cleaning up
the lockdep.

Regards,
Boqun

> Cheers,
> Longman
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] locking/lockdep: add debug_show_all_lock_holders()
  2022-09-16 15:57   ` [PATCH v2] " Tetsuo Handa
@ 2022-09-16 18:41     ` Waiman Long
  2022-10-03 22:18       ` Tetsuo Handa
  0 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2022-09-16 18:41 UTC (permalink / raw)
  To: Tetsuo Handa, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

On 9/16/22 11:57, Tetsuo Handa wrote:
> Currently, check_hung_uninterruptible_tasks() reports details of locks
> held in the system. Also, lockdep_print_held_locks() does not report
> details of locks held by a thread if that thread is in TASK_RUNNING state.
> Several years of experience of debugging without vmcore tells me that
> these limitations have been a barrier for understanding what went wrong
> in syzbot's "INFO: task hung in" reports.
>
> I initially thought that the cause of "INFO: task hung in" reports is
> due to over-stressing. But I understood that over-stressing is unlikely.
> I now consider that there likely is a deadlock/livelock bug where lockdep
> cannot report as a deadlock when "INFO: task hung in" is reported.
>
> A typical case is that thread-1 is waiting for something to happen (e.g.
> wait_event_*()) with a lock held. When thread-2 tries to hold that lock
> using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
> thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
> to hold. But currently check_hung_uninterruptible_tasks() cannot report
> the exact location of thread-1 which gives us an important hint for
> understanding why thread-1 is holding that lock for so long period.
>
> When check_hung_uninterruptible_tasks() reports a thread waiting for a
> lock, it is important to report backtrace of threads which already held
> that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
> the exact location of threads which is holding any lock.
>
> To deduplicate code, share debug_show_all_{locks,lock_holders}() using
> a flag. As a side effect of sharing, __debug_show_all_locks() skips
> current thread if the caller is holding no lock, for reporting RCU lock
> taken inside __debug_show_all_locks() is generally useless.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> Changes in v2:
>    Share debug_show_all_lock_holders() and debug_show_all_locks(),
>    suggested by Waiman Long <longman@redhat.com>.
>
>   include/linux/debug_locks.h | 17 ++++++++++++++++-
>   kernel/hung_task.c          |  2 +-
>   kernel/locking/lockdep.c    | 14 +++++++++++---
>   3 files changed, 28 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
> index dbb409d77d4f..b45c89fadfe4 100644
> --- a/include/linux/debug_locks.h
> +++ b/include/linux/debug_locks.h
> @@ -48,7 +48,18 @@ extern int debug_locks_off(void);
>   #endif
>   
>   #ifdef CONFIG_LOCKDEP
> -extern void debug_show_all_locks(void);
> +extern void __debug_show_all_locks(bool show_stack);
> +
> +static inline void debug_show_all_locks(void)
> +{
> +	__debug_show_all_locks(false);
> +}
> +
> +static inline void debug_show_all_lock_holders(void)
> +{
> +	__debug_show_all_locks(true);
> +}
> +
>   extern void debug_show_held_locks(struct task_struct *task);
>   extern void debug_check_no_locks_freed(const void *from, unsigned long len);
>   extern void debug_check_no_locks_held(void);
> @@ -61,6 +72,10 @@ static inline void debug_show_held_locks(struct task_struct *task)
>   {
>   }
>   
> +static inline void debug_show_all_lock_holders(void)
> +{
> +}
> +
>   static inline void
>   debug_check_no_locks_freed(const void *from, unsigned long len)
>   {
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index bb2354f73ded..18e22bbb714f 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -205,7 +205,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>    unlock:
>   	rcu_read_unlock();
>   	if (hung_task_show_lock)
> -		debug_show_all_locks();
> +		debug_show_all_lock_holders();
>   
>   	if (hung_task_show_all_bt) {
>   		hung_task_show_all_bt = false;
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 64a13eb56078..7870f7e5c46b 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -55,6 +55,7 @@
>   #include <linux/rcupdate.h>
>   #include <linux/kprobes.h>
>   #include <linux/lockdep.h>
> +#include <linux/sched/debug.h>
>   
>   #include <asm/sections.h>
>   
> @@ -6485,7 +6486,7 @@ void debug_check_no_locks_held(void)
>   EXPORT_SYMBOL_GPL(debug_check_no_locks_held);
>   
>   #ifdef __KERNEL__
> -void debug_show_all_locks(void)
> +void __debug_show_all_locks(bool show_stack)
>   {
>   	struct task_struct *g, *p;
>   
> @@ -6493,12 +6494,19 @@ void debug_show_all_locks(void)
>   		pr_warn("INFO: lockdep is turned off.\n");
>   		return;
>   	}
> -	pr_warn("\nShowing all locks held in the system:\n");
> +	if (show_stack)
> +		pr_warn("\nShowing all threads with locks held in the system:\n");
> +	else
> +		pr_warn("\nShowing all locks held in the system:\n");
>   
>   	rcu_read_lock();
>   	for_each_process_thread(g, p) {
>   		if (!p->lockdep_depth)
>   			continue;
> +		if (p == current && p->lockdep_depth == 1)
> +			continue;
> +		if (show_stack)
> +			sched_show_task(p);
>   		lockdep_print_held_locks(p);
>   		touch_nmi_watchdog();
>   		touch_all_softlockup_watchdogs();
> @@ -6508,7 +6516,7 @@ void debug_show_all_locks(void)
>   	pr_warn("\n");
>   	pr_warn("=============================================\n\n");
>   }
> -EXPORT_SYMBOL_GPL(debug_show_all_locks);
> +EXPORT_SYMBOL_GPL(__debug_show_all_locks);
>   #endif
>   
>   /*
Acked-by: Waiman Long <longman@redhat.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] locking/lockdep: add debug_show_all_lock_holders()
  2022-09-16 18:41     ` Waiman Long
@ 2022-10-03 22:18       ` Tetsuo Handa
  2022-10-04 18:09         ` Waiman Long
  0 siblings, 1 reply; 8+ messages in thread
From: Tetsuo Handa @ 2022-10-03 22:18 UTC (permalink / raw)
  To: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

Can this patch go to linux.git ?

On 2022/09/17 3:41, Waiman Long wrote:
> On 9/16/22 11:57, Tetsuo Handa wrote:
>> Currently, check_hung_uninterruptible_tasks() reports details of locks
>> held in the system. Also, lockdep_print_held_locks() does not report
>> details of locks held by a thread if that thread is in TASK_RUNNING state.
>> Several years of experience of debugging without vmcore tells me that
>> these limitations have been a barrier for understanding what went wrong
>> in syzbot's "INFO: task hung in" reports.
>>
>> I initially thought that the cause of "INFO: task hung in" reports is
>> due to over-stressing. But I understood that over-stressing is unlikely.
>> I now consider that there likely is a deadlock/livelock bug where lockdep
>> cannot report as a deadlock when "INFO: task hung in" is reported.
>>
>> A typical case is that thread-1 is waiting for something to happen (e.g.
>> wait_event_*()) with a lock held. When thread-2 tries to hold that lock
>> using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
>> thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
>> to hold. But currently check_hung_uninterruptible_tasks() cannot report
>> the exact location of thread-1 which gives us an important hint for
>> understanding why thread-1 is holding that lock for so long period.
>>
>> When check_hung_uninterruptible_tasks() reports a thread waiting for a
>> lock, it is important to report backtrace of threads which already held
>> that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
>> the exact location of threads which is holding any lock.
>>
>> To deduplicate code, share debug_show_all_{locks,lock_holders}() using
>> a flag. As a side effect of sharing, __debug_show_all_locks() skips
>> current thread if the caller is holding no lock, for reporting RCU lock
>> taken inside __debug_show_all_locks() is generally useless.
>>
>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Acked-by: Waiman Long <longman@redhat.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] locking/lockdep: add debug_show_all_lock_holders()
  2022-10-03 22:18       ` Tetsuo Handa
@ 2022-10-04 18:09         ` Waiman Long
  2022-11-14 10:05           ` Tetsuo Handa
  0 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2022-10-04 18:09 UTC (permalink / raw)
  To: Tetsuo Handa, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

On 10/3/22 18:18, Tetsuo Handa wrote:
> Can this patch go to linux.git ?

It is now up to Peter or Ingo to take it  to tip.

Cheers,
Longman

>
> On 2022/09/17 3:41, Waiman Long wrote:
>> On 9/16/22 11:57, Tetsuo Handa wrote:
>>> Currently, check_hung_uninterruptible_tasks() reports details of locks
>>> held in the system. Also, lockdep_print_held_locks() does not report
>>> details of locks held by a thread if that thread is in TASK_RUNNING state.
>>> Several years of experience of debugging without vmcore tells me that
>>> these limitations have been a barrier for understanding what went wrong
>>> in syzbot's "INFO: task hung in" reports.
>>>
>>> I initially thought that the cause of "INFO: task hung in" reports is
>>> due to over-stressing. But I understood that over-stressing is unlikely.
>>> I now consider that there likely is a deadlock/livelock bug where lockdep
>>> cannot report as a deadlock when "INFO: task hung in" is reported.
>>>
>>> A typical case is that thread-1 is waiting for something to happen (e.g.
>>> wait_event_*()) with a lock held. When thread-2 tries to hold that lock
>>> using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
>>> thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
>>> to hold. But currently check_hung_uninterruptible_tasks() cannot report
>>> the exact location of thread-1 which gives us an important hint for
>>> understanding why thread-1 is holding that lock for so long period.
>>>
>>> When check_hung_uninterruptible_tasks() reports a thread waiting for a
>>> lock, it is important to report backtrace of threads which already held
>>> that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
>>> the exact location of threads which is holding any lock.
>>>
>>> To deduplicate code, share debug_show_all_{locks,lock_holders}() using
>>> a flag. As a side effect of sharing, __debug_show_all_locks() skips
>>> current thread if the caller is holding no lock, for reporting RCU lock
>>> taken inside __debug_show_all_locks() is generally useless.
>>>
>>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> Acked-by: Waiman Long <longman@redhat.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] locking/lockdep: add debug_show_all_lock_holders()
  2022-10-04 18:09         ` Waiman Long
@ 2022-11-14 10:05           ` Tetsuo Handa
  0 siblings, 0 replies; 8+ messages in thread
From: Tetsuo Handa @ 2022-11-14 10:05 UTC (permalink / raw)
  To: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
  Cc: Thomas Gleixner, Shaokun Zhang, Sebastian Andrzej Siewior,
	Petr Mladek, Andrew Morton, Ben Dooks, Rasmus Villemoes,
	Luis Chamberlain, Xiaoming Ni, John Ogness, LKML

Peter or Ingo, can you take this patch?

On 2022/10/05 3:09, Waiman Long wrote:
> On 10/3/22 18:18, Tetsuo Handa wrote:
>> Can this patch go to linux.git ?
> 
> It is now up to Peter or Ingo to take it  to tip.
> 
> Cheers,
> Longman
> 
>>
>> On 2022/09/17 3:41, Waiman Long wrote:
>>> On 9/16/22 11:57, Tetsuo Handa wrote:
>>>> Currently, check_hung_uninterruptible_tasks() reports details of locks
>>>> held in the system. Also, lockdep_print_held_locks() does not report
>>>> details of locks held by a thread if that thread is in TASK_RUNNING state.
>>>> Several years of experience of debugging without vmcore tells me that
>>>> these limitations have been a barrier for understanding what went wrong
>>>> in syzbot's "INFO: task hung in" reports.
>>>>
>>>> I initially thought that the cause of "INFO: task hung in" reports is
>>>> due to over-stressing. But I understood that over-stressing is unlikely.
>>>> I now consider that there likely is a deadlock/livelock bug where lockdep
>>>> cannot report as a deadlock when "INFO: task hung in" is reported.
>>>>
>>>> A typical case is that thread-1 is waiting for something to happen (e.g.
>>>> wait_event_*()) with a lock held. When thread-2 tries to hold that lock
>>>> using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
>>>> thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
>>>> to hold. But currently check_hung_uninterruptible_tasks() cannot report
>>>> the exact location of thread-1 which gives us an important hint for
>>>> understanding why thread-1 is holding that lock for so long period.
>>>>
>>>> When check_hung_uninterruptible_tasks() reports a thread waiting for a
>>>> lock, it is important to report backtrace of threads which already held
>>>> that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
>>>> the exact location of threads which is holding any lock.
>>>>
>>>> To deduplicate code, share debug_show_all_{locks,lock_holders}() using
>>>> a flag. As a side effect of sharing, __debug_show_all_locks() skips
>>>> current thread if the caller is holding no lock, for reporting RCU lock
>>>> taken inside __debug_show_all_locks() is generally useless.
>>>>
>>>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>>> Acked-by: Waiman Long <longman@redhat.com>
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-11-14 10:07 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-16 14:15 [PATCH (repost)] locking/lockdep: add debug_show_all_lock_holders() Tetsuo Handa
2022-09-16 14:51 ` Waiman Long
2022-09-16 15:57   ` [PATCH v2] " Tetsuo Handa
2022-09-16 18:41     ` Waiman Long
2022-10-03 22:18       ` Tetsuo Handa
2022-10-04 18:09         ` Waiman Long
2022-11-14 10:05           ` Tetsuo Handa
2022-09-16 17:46   ` [PATCH (repost)] " Boqun Feng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).