All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders()
@ 2022-11-21 10:10 Tetsuo Handa
  2022-12-29  0:25 ` Tetsuo Handa
  2023-01-14  9:36 ` Ingo Molnar
  0 siblings, 2 replies; 7+ messages in thread
From: Tetsuo Handa @ 2022-11-21 10:10 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, Andrew Morton
  Cc: LKML

Currently, check_hung_uninterruptible_tasks() reports details of locks
held in the system. Also, lockdep_print_held_locks() does not report
details of locks held by a thread if that thread is in TASK_RUNNING state.
Several years of experience of debugging without vmcore tells me that
these limitations have been a barrier for understanding what went wrong
in syzbot's "INFO: task hung in" reports.

I initially thought that the cause of "INFO: task hung in" reports is
due to over-stressing. But I understood that over-stressing is unlikely.
I now consider that there likely is a deadlock/livelock bug where lockdep
cannot report as a deadlock when "INFO: task hung in" is reported.

A typical case is that thread-1 is waiting for something to happen (e.g.
wait_event_*()) with a lock held. When thread-2 tries to hold that lock
using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
to hold. But currently check_hung_uninterruptible_tasks() cannot report
the exact location of thread-1 which gives us an important hint for
understanding why thread-1 is holding that lock for so long period.

When check_hung_uninterruptible_tasks() reports a thread waiting for a
lock, it is important to report backtrace of threads which already held
that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
the exact location of threads which is holding any lock.

To deduplicate code, share debug_show_all_{locks,lock_holders}() using
a flag. As a side effect of sharing, __debug_show_all_locks() skips
current thread if the caller is holding no lock, for reporting RCU lock
taken inside __debug_show_all_locks() is generally useless.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Waiman Long <longman@redhat.com>
---
Changes in v2:
  Share debug_show_all_lock_holders() and debug_show_all_locks(),
  suggested by Waiman Long <longman@redhat.com>.

 include/linux/debug_locks.h | 17 ++++++++++++++++-
 kernel/hung_task.c          |  2 +-
 kernel/locking/lockdep.c    | 14 +++++++++++---
 3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
index dbb409d77d4f..b45c89fadfe4 100644
--- a/include/linux/debug_locks.h
+++ b/include/linux/debug_locks.h
@@ -48,7 +48,18 @@ extern int debug_locks_off(void);
 #endif
 
 #ifdef CONFIG_LOCKDEP
-extern void debug_show_all_locks(void);
+extern void __debug_show_all_locks(bool show_stack);
+
+static inline void debug_show_all_locks(void)
+{
+	__debug_show_all_locks(false);
+}
+
+static inline void debug_show_all_lock_holders(void)
+{
+	__debug_show_all_locks(true);
+}
+
 extern void debug_show_held_locks(struct task_struct *task);
 extern void debug_check_no_locks_freed(const void *from, unsigned long len);
 extern void debug_check_no_locks_held(void);
@@ -61,6 +72,10 @@ static inline void debug_show_held_locks(struct task_struct *task)
 {
 }
 
+static inline void debug_show_all_lock_holders(void)
+{
+}
+
 static inline void
 debug_check_no_locks_freed(const void *from, unsigned long len)
 {
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index c71889f3f3fc..5fba784258b7 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -213,7 +213,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
  unlock:
 	rcu_read_unlock();
 	if (hung_task_show_lock)
-		debug_show_all_locks();
+		debug_show_all_lock_holders();
 
 	if (hung_task_show_all_bt) {
 		hung_task_show_all_bt = false;
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e3375bc40dad..b3da133825cc 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -55,6 +55,7 @@
 #include <linux/rcupdate.h>
 #include <linux/kprobes.h>
 #include <linux/lockdep.h>
+#include <linux/sched/debug.h>
 
 #include <asm/sections.h>
 
@@ -6487,7 +6488,7 @@ void debug_check_no_locks_held(void)
 EXPORT_SYMBOL_GPL(debug_check_no_locks_held);
 
 #ifdef __KERNEL__
-void debug_show_all_locks(void)
+void __debug_show_all_locks(bool show_stack)
 {
 	struct task_struct *g, *p;
 
@@ -6495,12 +6496,19 @@ void debug_show_all_locks(void)
 		pr_warn("INFO: lockdep is turned off.\n");
 		return;
 	}
-	pr_warn("\nShowing all locks held in the system:\n");
+	if (show_stack)
+		pr_warn("\nShowing all threads with locks held in the system:\n");
+	else
+		pr_warn("\nShowing all locks held in the system:\n");
 
 	rcu_read_lock();
 	for_each_process_thread(g, p) {
 		if (!p->lockdep_depth)
 			continue;
+		if (p == current && p->lockdep_depth == 1)
+			continue;
+		if (show_stack)
+			sched_show_task(p);
 		lockdep_print_held_locks(p);
 		touch_nmi_watchdog();
 		touch_all_softlockup_watchdogs();
@@ -6510,7 +6518,7 @@ void debug_show_all_locks(void)
 	pr_warn("\n");
 	pr_warn("=============================================\n\n");
 }
-EXPORT_SYMBOL_GPL(debug_show_all_locks);
+EXPORT_SYMBOL_GPL(__debug_show_all_locks);
 #endif
 
 /*
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2022-11-21 10:10 [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders() Tetsuo Handa
@ 2022-12-29  0:25 ` Tetsuo Handa
  2023-01-07 12:58   ` Tetsuo Handa
  2023-01-14  9:36 ` Ingo Molnar
  1 sibling, 1 reply; 7+ messages in thread
From: Tetsuo Handa @ 2022-12-29  0:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, Linus Torvalds

Hello, Andrew.

Since neither Peter nor Ingo is responding, would you take this patch via
your tree?

On 2022/11/21 19:10, Tetsuo Handa wrote:
> Currently, check_hung_uninterruptible_tasks() reports details of locks
> held in the system. Also, lockdep_print_held_locks() does not report
> details of locks held by a thread if that thread is in TASK_RUNNING state.
> Several years of experience of debugging without vmcore tells me that
> these limitations have been a barrier for understanding what went wrong
> in syzbot's "INFO: task hung in" reports.
> 
> I initially thought that the cause of "INFO: task hung in" reports is
> due to over-stressing. But I understood that over-stressing is unlikely.
> I now consider that there likely is a deadlock/livelock bug where lockdep
> cannot report as a deadlock when "INFO: task hung in" is reported.
> 
> A typical case is that thread-1 is waiting for something to happen (e.g.
> wait_event_*()) with a lock held. When thread-2 tries to hold that lock
> using e.g. mutex_lock(), check_hung_uninterruptible_tasks() reports that
> thread-2 is hung and thread-1 is holding a lock which thread-2 is trying
> to hold. But currently check_hung_uninterruptible_tasks() cannot report
> the exact location of thread-1 which gives us an important hint for
> understanding why thread-1 is holding that lock for so long period.
> 
> When check_hung_uninterruptible_tasks() reports a thread waiting for a
> lock, it is important to report backtrace of threads which already held
> that lock. Therefore, allow check_hung_uninterruptible_tasks() to report
> the exact location of threads which is holding any lock.
> 
> To deduplicate code, share debug_show_all_{locks,lock_holders}() using
> a flag. As a side effect of sharing, __debug_show_all_locks() skips
> current thread if the caller is holding no lock, for reporting RCU lock
> taken inside __debug_show_all_locks() is generally useless.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Acked-by: Waiman Long <longman@redhat.com>
> ---
> Changes in v2:
>   Share debug_show_all_lock_holders() and debug_show_all_locks(),
>   suggested by Waiman Long <longman@redhat.com>.
> 
>  include/linux/debug_locks.h | 17 ++++++++++++++++-
>  kernel/hung_task.c          |  2 +-
>  kernel/locking/lockdep.c    | 14 +++++++++++---
>  3 files changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
> index dbb409d77d4f..b45c89fadfe4 100644
> --- a/include/linux/debug_locks.h
> +++ b/include/linux/debug_locks.h
> @@ -48,7 +48,18 @@ extern int debug_locks_off(void);
>  #endif
>  
>  #ifdef CONFIG_LOCKDEP
> -extern void debug_show_all_locks(void);
> +extern void __debug_show_all_locks(bool show_stack);
> +
> +static inline void debug_show_all_locks(void)
> +{
> +	__debug_show_all_locks(false);
> +}
> +
> +static inline void debug_show_all_lock_holders(void)
> +{
> +	__debug_show_all_locks(true);
> +}
> +
>  extern void debug_show_held_locks(struct task_struct *task);
>  extern void debug_check_no_locks_freed(const void *from, unsigned long len);
>  extern void debug_check_no_locks_held(void);
> @@ -61,6 +72,10 @@ static inline void debug_show_held_locks(struct task_struct *task)
>  {
>  }
>  
> +static inline void debug_show_all_lock_holders(void)
> +{
> +}
> +
>  static inline void
>  debug_check_no_locks_freed(const void *from, unsigned long len)
>  {
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index c71889f3f3fc..5fba784258b7 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -213,7 +213,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>   unlock:
>  	rcu_read_unlock();
>  	if (hung_task_show_lock)
> -		debug_show_all_locks();
> +		debug_show_all_lock_holders();
>  
>  	if (hung_task_show_all_bt) {
>  		hung_task_show_all_bt = false;
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index e3375bc40dad..b3da133825cc 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -55,6 +55,7 @@
>  #include <linux/rcupdate.h>
>  #include <linux/kprobes.h>
>  #include <linux/lockdep.h>
> +#include <linux/sched/debug.h>
>  
>  #include <asm/sections.h>
>  
> @@ -6487,7 +6488,7 @@ void debug_check_no_locks_held(void)
>  EXPORT_SYMBOL_GPL(debug_check_no_locks_held);
>  
>  #ifdef __KERNEL__
> -void debug_show_all_locks(void)
> +void __debug_show_all_locks(bool show_stack)
>  {
>  	struct task_struct *g, *p;
>  
> @@ -6495,12 +6496,19 @@ void debug_show_all_locks(void)
>  		pr_warn("INFO: lockdep is turned off.\n");
>  		return;
>  	}
> -	pr_warn("\nShowing all locks held in the system:\n");
> +	if (show_stack)
> +		pr_warn("\nShowing all threads with locks held in the system:\n");
> +	else
> +		pr_warn("\nShowing all locks held in the system:\n");
>  
>  	rcu_read_lock();
>  	for_each_process_thread(g, p) {
>  		if (!p->lockdep_depth)
>  			continue;
> +		if (p == current && p->lockdep_depth == 1)
> +			continue;
> +		if (show_stack)
> +			sched_show_task(p);
>  		lockdep_print_held_locks(p);
>  		touch_nmi_watchdog();
>  		touch_all_softlockup_watchdogs();
> @@ -6510,7 +6518,7 @@ void debug_show_all_locks(void)
>  	pr_warn("\n");
>  	pr_warn("=============================================\n\n");
>  }
> -EXPORT_SYMBOL_GPL(debug_show_all_locks);
> +EXPORT_SYMBOL_GPL(__debug_show_all_locks);
>  #endif
>  
>  /*


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2022-12-29  0:25 ` Tetsuo Handa
@ 2023-01-07 12:58   ` Tetsuo Handa
  0 siblings, 0 replies; 7+ messages in thread
From: Tetsuo Handa @ 2023-01-07 12:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, Linus Torvalds

I decided to send this patch via my tree.

On 2022/12/29 9:25, Tetsuo Handa wrote:
> Hello, Andrew.
> 
> Since neither Peter nor Ingo is responding, would you take this patch via
> your tree?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2022-11-21 10:10 [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders() Tetsuo Handa
  2022-12-29  0:25 ` Tetsuo Handa
@ 2023-01-14  9:36 ` Ingo Molnar
  2023-01-14  9:53   ` Tetsuo Handa
  1 sibling, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2023-01-14  9:36 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, Andrew Morton, LKML, Linus Torvalds


* Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:

> --- a/include/linux/debug_locks.h
> +++ b/include/linux/debug_locks.h
> @@ -48,7 +48,18 @@ extern int debug_locks_off(void);
>  #endif
>  
>  #ifdef CONFIG_LOCKDEP
> -extern void debug_show_all_locks(void);
> +extern void __debug_show_all_locks(bool show_stack);
> +
> +static inline void debug_show_all_locks(void)
> +{
> +	__debug_show_all_locks(false);
> +}
> +
> +static inline void debug_show_all_lock_holders(void)
> +{
> +	__debug_show_all_locks(true);
> +}
> +
>  extern void debug_show_held_locks(struct task_struct *task);
>  extern void debug_check_no_locks_freed(const void *from, unsigned long len);
>  extern void debug_check_no_locks_held(void);
> @@ -61,6 +72,10 @@ static inline void debug_show_held_locks(struct task_struct *task)
>  {
>  }
>  
> +static inline void debug_show_all_lock_holders(void)
> +{
> +}
> +

> -		debug_show_all_locks();
> +		debug_show_all_lock_holders();

> -void debug_show_all_locks(void)
> +void __debug_show_all_locks(bool show_stack)
>  {
>  	struct task_struct *g, *p;
>  
> @@ -6495,12 +6496,19 @@ void debug_show_all_locks(void)
>  		pr_warn("INFO: lockdep is turned off.\n");
>  		return;
>  	}
> -	pr_warn("\nShowing all locks held in the system:\n");
> +	if (show_stack)
> +		pr_warn("\nShowing all threads with locks held in the system:\n");
> +	else
> +		pr_warn("\nShowing all locks held in the system:\n");
>  
>  	rcu_read_lock();
>  	for_each_process_thread(g, p) {
>  		if (!p->lockdep_depth)
>  			continue;
> +		if (p == current && p->lockdep_depth == 1)
> +			continue;
> +		if (show_stack)
> +			sched_show_task(p);
>  		lockdep_print_held_locks(p);
>  		touch_nmi_watchdog();
>  		touch_all_softlockup_watchdogs();

Yeah, so note how you introduce a function with a parameter:

	void __debug_show_all_locks(bool show_stack)

... only to then *hide* the new parameter via helper functions:

	static inline void debug_show_all_locks(void)
	{
		__debug_show_all_locks(false);
	}

	static inline void debug_show_all_lock_holders(void)
	{
		__debug_show_all_locks(true);
	}

... which is a *strong* hint by our universe that the new parameter was 
probably a bad idea to begin with.

Given how small debug_show_all_locks() is to begin with, I'd suggest simply 
duplicating the loop into debug_show_all_lock_holders() or so.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2023-01-14  9:36 ` Ingo Molnar
@ 2023-01-14  9:53   ` Tetsuo Handa
  2023-01-17  3:28     ` Waiman Long
  0 siblings, 1 reply; 7+ messages in thread
From: Tetsuo Handa @ 2023-01-14  9:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Boqun Feng, Andrew Morton, LKML, Linus Torvalds

On 2023/01/14 18:36, Ingo Molnar wrote:
> Yeah, so note how you introduce a function with a parameter:
> 
> 	void __debug_show_all_locks(bool show_stack)
> 
> ... only to then *hide* the new parameter via helper functions:
> 
> 	static inline void debug_show_all_locks(void)
> 	{
> 		__debug_show_all_locks(false);
> 	}
> 
> 	static inline void debug_show_all_lock_holders(void)
> 	{
> 		__debug_show_all_locks(true);
> 	}
> 
> ... which is a *strong* hint by our universe that the new parameter was 
> probably a bad idea to begin with.
> 
> Given how small debug_show_all_locks() is to begin with, I'd suggest simply 
> duplicating the loop into debug_show_all_lock_holders() or so.

Initial version at https://lkml.kernel.org/r/82af40cc-bf85-2b53-b8f9-dfc12e66a781@I-love.SAKURA.ne.jp
was duplicating the loop.

Waiman Long suggested me not to duplicate the loop at
https://lkml.kernel.org/r/3e027453-fda4-3891-3ec3-5623f1525e56@redhat.com .

Please talk with Waiman. I'm fine with either approach.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2023-01-14  9:53   ` Tetsuo Handa
@ 2023-01-17  3:28     ` Waiman Long
  2023-01-31  9:52       ` Tetsuo Handa
  0 siblings, 1 reply; 7+ messages in thread
From: Waiman Long @ 2023-01-17  3:28 UTC (permalink / raw)
  To: Tetsuo Handa, Ingo Molnar
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng,
	Andrew Morton, LKML, Linus Torvalds

On 1/14/23 04:53, Tetsuo Handa wrote:
> On 2023/01/14 18:36, Ingo Molnar wrote:
>> Yeah, so note how you introduce a function with a parameter:
>>
>> 	void __debug_show_all_locks(bool show_stack)
>>
>> ... only to then *hide* the new parameter via helper functions:
>>
>> 	static inline void debug_show_all_locks(void)
>> 	{
>> 		__debug_show_all_locks(false);
>> 	}
>>
>> 	static inline void debug_show_all_lock_holders(void)
>> 	{
>> 		__debug_show_all_locks(true);
>> 	}
>>
>> ... which is a *strong* hint by our universe that the new parameter was
>> probably a bad idea to begin with.
>>
>> Given how small debug_show_all_locks() is to begin with, I'd suggest simply
>> duplicating the loop into debug_show_all_lock_holders() or so.
> Initial version at https://lkml.kernel.org/r/82af40cc-bf85-2b53-b8f9-dfc12e66a781@I-love.SAKURA.ne.jp
> was duplicating the loop.
>
> Waiman Long suggested me not to duplicate the loop at
> https://lkml.kernel.org/r/3e027453-fda4-3891-3ec3-5623f1525e56@redhat.com .
>
> Please talk with Waiman. I'm fine with either approach.

My original concern was that two functions are very similar with some 
minor difference. My suggestion was to use a common helper to reduce the 
code redundancy and future maintenance.

I do have some nits about the patch. The show_stack parameter isn't 
informative. Maybe you can use show_tasks as the parameter name since 
the major difference is the calling of sched_show_task().

Define a new helper like debug_show_all_locks_tasks(bool show_tasks), 
use it directly in check_hung_uninterruptible_tasks() and make 
debug_show_all_lock() call debug_show_all_locks_tasks().

Ingo, will that OK with you?

Cheers,
Longman


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders()
  2023-01-17  3:28     ` Waiman Long
@ 2023-01-31  9:52       ` Tetsuo Handa
  0 siblings, 0 replies; 7+ messages in thread
From: Tetsuo Handa @ 2023-01-31  9:52 UTC (permalink / raw)
  To: Waiman Long, Ingo Molnar
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng,
	Andrew Morton, LKML, Linus Torvalds

Ingo, what do you think?

I want your tree to send this patch in the upcoming merge window.

On 2023/01/17 12:28, Waiman Long wrote:
>> Please talk with Waiman. I'm fine with either approach.
> 
> My original concern was that two functions are very similar with some minor difference. My suggestion was to use a common helper to reduce the code redundancy and future maintenance.
> 
> I do have some nits about the patch. The show_stack parameter isn't informative. Maybe you can use show_tasks as the parameter name since the major difference is the calling of sched_show_task().
> 
> Define a new helper like debug_show_all_locks_tasks(bool show_tasks), use it directly in check_hung_uninterruptible_tasks() and make debug_show_all_lock() call debug_show_all_locks_tasks().
> 
> Ingo, will that OK with you?
> 
> Cheers,
> Longman
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-01-31  9:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-21 10:10 [PATCH v2 (repost)] locking/lockdep: add debug_show_all_lock_holders() Tetsuo Handa
2022-12-29  0:25 ` Tetsuo Handa
2023-01-07 12:58   ` Tetsuo Handa
2023-01-14  9:36 ` Ingo Molnar
2023-01-14  9:53   ` Tetsuo Handa
2023-01-17  3:28     ` Waiman Long
2023-01-31  9:52       ` Tetsuo Handa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.