All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] workqueue: Add rcu lock check after work execute end
@ 2024-01-09 11:10 Xuewen Yan
  2024-01-09 16:39 ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: Xuewen Yan @ 2024-01-09 11:10 UTC (permalink / raw)
  To: tj, jiangshanlai; +Cc: longman, linux-kernel, ke.wang

Now the workqueue just check the atomic and lock after
work execute end. However, sometimes, drivers's work
may don't unlock rcu after call rcu_read_lock().
And as a result, it would cause rcu stall, but the rcu stall warning
can not dump the work func, because the work has finished.

In order to quickly discover those works that do not call
rcu_read_unlock after rcu_read_lock(). Add the rcu lock check.

Use rcu_preempt_depth() to check the work's rcu status,
Normally, this value is 0. If this value is bigger than 0,
it means that the rcu lock is still held after the work ends.
At this time, we print err info and print the work func.

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/workqueue.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2989b57e154a..a5a0df824df1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2634,11 +2634,12 @@ __acquires(&pool->lock)
 	lock_map_release(&lockdep_map);
 	lock_map_release(&pwq->wq->lockdep_map);
 
-	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
-		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
+	if (unlikely(in_atomic() || lockdep_depth(current) > 0) ||
+		rcu_preempt_depth() > 0) {
+		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n"
 		       "     last function: %ps\n",
-		       current->comm, preempt_count(), task_pid_nr(current),
-		       worker->current_func);
+		       current->comm, preempt_count(), rcu_preempt_depth(),
+		       task_pid_nr(current), worker->current_func);
 		debug_show_held_locks(current);
 		dump_stack();
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] workqueue: Add rcu lock check after work execute end
  2024-01-09 11:10 [PATCH] workqueue: Add rcu lock check after work execute end Xuewen Yan
@ 2024-01-09 16:39 ` Waiman Long
  2024-01-10  3:27   ` [PATCH v2] " Xuewen Yan
  0 siblings, 1 reply; 6+ messages in thread
From: Waiman Long @ 2024-01-09 16:39 UTC (permalink / raw)
  To: Xuewen Yan, tj, jiangshanlai; +Cc: linux-kernel, ke.wang

On 1/9/24 06:10, Xuewen Yan wrote:
> Now the workqueue just check the atomic and lock after
> work execute end. However, sometimes, drivers's work
> may don't unlock rcu after call rcu_read_lock().
> And as a result, it would cause rcu stall, but the rcu stall warning
> can not dump the work func, because the work has finished.
>
> In order to quickly discover those works that do not call
> rcu_read_unlock after rcu_read_lock(). Add the rcu lock check.
>
> Use rcu_preempt_depth() to check the work's rcu status,
> Normally, this value is 0. If this value is bigger than 0,
> it means that the rcu lock is still held after the work ends.
> At this time, we print err info and print the work func.
>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>   kernel/workqueue.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 2989b57e154a..a5a0df824df1 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2634,11 +2634,12 @@ __acquires(&pool->lock)
>   	lock_map_release(&lockdep_map);
>   	lock_map_release(&pwq->wq->lockdep_map);
>   
> -	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
> -		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
> +	if (unlikely(in_atomic() || lockdep_depth(current) > 0) ||
> +		rcu_preempt_depth() > 0) {

The rcu_preempt_depth() check should be within the unlikely() helper. 
Other than that, it looks good to me.

Cheers,
Longman

> +		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n"
>   		       "     last function: %ps\n",
> -		       current->comm, preempt_count(), task_pid_nr(current),
> -		       worker->current_func);
> +		       current->comm, preempt_count(), rcu_preempt_depth(),
> +		       task_pid_nr(current), worker->current_func);
>   		debug_show_held_locks(current);
>   		dump_stack();
>   	}


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] workqueue: Add rcu lock check after work execute end
  2024-01-09 16:39 ` Waiman Long
@ 2024-01-10  3:27   ` Xuewen Yan
  2024-01-10  9:08     ` Lai Jiangshan
                       ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Xuewen Yan @ 2024-01-10  3:27 UTC (permalink / raw)
  To: tj, longman; +Cc: jiangshanlai, ke.wang, xuewen.yan94, linux-kernel

Now the workqueue just check the atomic and lock after
work execute end. However, sometimes, drivers's work
may don't unlock rcu after call rcu_read_lock().
And as a result, it would cause rcu stall, but the rcu stall warning
can not dump the work func, because the work has finished.

In order to quickly discover those works that do not call
rcu_read_unlock after rcu_read_lock(). Add the rcu lock check.

Use rcu_preempt_depth() to check the work's rcu status,
Normally, this value is 0. If this value is bigger than 0,
it means the work are still holding rcu lock.
At this time, we print err info and print the work func.

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
V2:
- move check to unlikely() helper (Longman)
---
 kernel/workqueue.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2989b57e154a..c2a73364f5ad 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2634,11 +2634,12 @@ __acquires(&pool->lock)
 	lock_map_release(&lockdep_map);
 	lock_map_release(&pwq->wq->lockdep_map);
 
-	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
-		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
+	if (unlikely(in_atomic() || lockdep_depth(current) > 0 ||
+		rcu_preempt_depth() > 0)) {
+		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n"
 		       "     last function: %ps\n",
-		       current->comm, preempt_count(), task_pid_nr(current),
-		       worker->current_func);
+		       current->comm, preempt_count(), rcu_preempt_depth(),
+		       task_pid_nr(current), worker->current_func);
 		debug_show_held_locks(current);
 		dump_stack();
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] workqueue: Add rcu lock check after work execute end
  2024-01-10  3:27   ` [PATCH v2] " Xuewen Yan
@ 2024-01-10  9:08     ` Lai Jiangshan
  2024-01-10 14:28     ` Waiman Long
  2024-01-16 20:22     ` Tejun Heo
  2 siblings, 0 replies; 6+ messages in thread
From: Lai Jiangshan @ 2024-01-10  9:08 UTC (permalink / raw)
  To: Xuewen Yan; +Cc: tj, longman, ke.wang, xuewen.yan94, linux-kernel

On Wed, Jan 10, 2024 at 11:27 AM Xuewen Yan <xuewen.yan@unisoc.com> wrote:
>
> Now the workqueue just check the atomic and lock after
> work execute end. However, sometimes, drivers's work
> may don't unlock rcu after call rcu_read_lock().
> And as a result, it would cause rcu stall, but the rcu stall warning
> can not dump the work func, because the work has finished.
>
> In order to quickly discover those works that do not call
> rcu_read_unlock after rcu_read_lock(). Add the rcu lock check.
>
> Use rcu_preempt_depth() to check the work's rcu status,
> Normally, this value is 0. If this value is bigger than 0,
> it means the work are still holding rcu lock.
> At this time, we print err info and print the work func.
>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
> V2:
> - move check to unlikely() helper (Longman)
> ---
>  kernel/workqueue.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 2989b57e154a..c2a73364f5ad 100644

Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>

> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2634,11 +2634,12 @@ __acquires(&pool->lock)
>         lock_map_release(&lockdep_map);
>         lock_map_release(&pwq->wq->lockdep_map);
>
> -       if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
> -               pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
> +       if (unlikely(in_atomic() || lockdep_depth(current) > 0 ||
> +               rcu_preempt_depth() > 0)) {
> +               pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n"
>                        "     last function: %ps\n",
> -                      current->comm, preempt_count(), task_pid_nr(current),
> -                      worker->current_func);
> +                      current->comm, preempt_count(), rcu_preempt_depth(),
> +                      task_pid_nr(current), worker->current_func);
>                 debug_show_held_locks(current);
>                 dump_stack();
>         }
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] workqueue: Add rcu lock check after work execute end
  2024-01-10  3:27   ` [PATCH v2] " Xuewen Yan
  2024-01-10  9:08     ` Lai Jiangshan
@ 2024-01-10 14:28     ` Waiman Long
  2024-01-16 20:22     ` Tejun Heo
  2 siblings, 0 replies; 6+ messages in thread
From: Waiman Long @ 2024-01-10 14:28 UTC (permalink / raw)
  To: Xuewen Yan, tj; +Cc: jiangshanlai, ke.wang, xuewen.yan94, linux-kernel

On 1/9/24 22:27, Xuewen Yan wrote:
> Now the workqueue just check the atomic and lock after
> work execute end. However, sometimes, drivers's work
> may don't unlock rcu after call rcu_read_lock().
> And as a result, it would cause rcu stall, but the rcu stall warning
> can not dump the work func, because the work has finished.
>
> In order to quickly discover those works that do not call
> rcu_read_unlock after rcu_read_lock(). Add the rcu lock check.
>
> Use rcu_preempt_depth() to check the work's rcu status,
> Normally, this value is 0. If this value is bigger than 0,
> it means the work are still holding rcu lock.
> At this time, we print err info and print the work func.
>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
> V2:
> - move check to unlikely() helper (Longman)
> ---
>   kernel/workqueue.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 2989b57e154a..c2a73364f5ad 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2634,11 +2634,12 @@ __acquires(&pool->lock)
>   	lock_map_release(&lockdep_map);
>   	lock_map_release(&pwq->wq->lockdep_map);
>   
> -	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
> -		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
> +	if (unlikely(in_atomic() || lockdep_depth(current) > 0 ||
> +		rcu_preempt_depth() > 0)) {
> +		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n"
>   		       "     last function: %ps\n",
> -		       current->comm, preempt_count(), task_pid_nr(current),
> -		       worker->current_func);
> +		       current->comm, preempt_count(), rcu_preempt_depth(),
> +		       task_pid_nr(current), worker->current_func);
>   		debug_show_held_locks(current);
>   		dump_stack();
>   	}

This can be a useful additional sanity test.

Reviewed-by: Waiman Long <longman@redhat.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] workqueue: Add rcu lock check after work execute end
  2024-01-10  3:27   ` [PATCH v2] " Xuewen Yan
  2024-01-10  9:08     ` Lai Jiangshan
  2024-01-10 14:28     ` Waiman Long
@ 2024-01-16 20:22     ` Tejun Heo
  2 siblings, 0 replies; 6+ messages in thread
From: Tejun Heo @ 2024-01-16 20:22 UTC (permalink / raw)
  To: Xuewen Yan; +Cc: longman, jiangshanlai, ke.wang, xuewen.yan94, linux-kernel

Hello,

I massaged the description for clarity and applied to wq/for-6.9.

Thanks.

From 1a65a6d17cbc58e1aeffb2be962acce49efbef9c Mon Sep 17 00:00:00 2001
From: Xuewen Yan <xuewen.yan@unisoc.com>
Date: Wed, 10 Jan 2024 11:27:24 +0800
Subject: [PATCH] workqueue: Add rcu lock check at the end of work item
 execution

Currently the workqueue just checks the atomic and locking states after work
execution ends. However, sometimes, a work item may not unlock rcu after
acquiring rcu_read_lock(). And as a result, it would cause rcu stall, but
the rcu stall warning can not dump the work func, because the work has
finished.

In order to quickly discover those works that do not call rcu_read_unlock()
after rcu_read_lock(), add the rcu lock check.

Use rcu_preempt_depth() to check the work's rcu status. Normally, this value
is 0. If this value is bigger than 0, it means the work are still holding
rcu lock. If so, print err info and the work func.

tj: Reworded the description for clarity. Minor formatting tweak.

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/workqueue.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ed442cefea7c..aec3efbaaf93 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2640,11 +2640,12 @@ __acquires(&pool->lock)
 	lock_map_release(&lockdep_map);
 	lock_map_release(&pwq->wq->lockdep_map);
 
-	if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
-		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
+	if (unlikely(in_atomic() || lockdep_depth(current) > 0 ||
+		     rcu_preempt_depth() > 0)) {
+		pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n"
 		       "     last function: %ps\n",
-		       current->comm, preempt_count(), task_pid_nr(current),
-		       worker->current_func);
+		       current->comm, preempt_count(), rcu_preempt_depth(),
+		       task_pid_nr(current), worker->current_func);
 		debug_show_held_locks(current);
 		dump_stack();
 	}
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-01-16 20:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-09 11:10 [PATCH] workqueue: Add rcu lock check after work execute end Xuewen Yan
2024-01-09 16:39 ` Waiman Long
2024-01-10  3:27   ` [PATCH v2] " Xuewen Yan
2024-01-10  9:08     ` Lai Jiangshan
2024-01-10 14:28     ` Waiman Long
2024-01-16 20:22     ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.