All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] workqueue: fix possible race condition when rescuer VS pwq-release
@ 2014-03-27 16:20 Lai Jiangshan
  2014-03-28 12:07 ` [PATCH V2] " Lai Jiangshan
  0 siblings, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-03-27 16:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, Lai Jiangshan

There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

The works of the @pwq may be processed by some other worker,
and @pwq is scheduled to release before the rescuer starts
to process. In this case pwq_unbound_release_workfn() will
corrupt wq->maydays, and rescuer_thead will access to corrupted
data.

Using get_pwq() when send_mayday() will keep @pwq's lifetime
and avoid the race condition.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 82ef9f3..7066519 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1902,6 +1902,12 @@ static void send_mayday(struct work_struct *work)
 
 	/* mayday mayday mayday */
 	if (list_empty(&pwq->mayday_node)) {
+		/*
+		 * Keep the pwq and avoid the pwq to be scheduled to release
+		 * when someone else processes all the works before the rescuer
+		 * starts to process.
+		 */
+		get_pwq(pwq);
 		list_add_tail(&pwq->mayday_node, &wq->maydays);
 		wake_up_process(wq->rescuer->task);
 	}
@@ -2418,6 +2424,7 @@ repeat:
 		/* migrate to the target cpu if possible */
 		worker_maybe_bind_and_lock(pool);
 		rescuer->pool = pool;
+		put_pwq(pwq);
 
 		/*
 		 * Slurp in all works issued via this workqueue and
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-03-27 16:20 [PATCH] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
@ 2014-03-28 12:07 ` Lai Jiangshan
  2014-03-31 14:40   ` Lai Jiangshan
  2014-04-15 16:47   ` Tejun Heo
  0 siblings, 2 replies; 24+ messages in thread
From: Lai Jiangshan @ 2014-03-28 12:07 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

>From 11af0cd0306309f0deaf3326cc26d3e7e517e3d1 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Fri, 28 Mar 2014 00:20:12 +0800
Subject: [PATCH] workqueue: fix possible race condition when rescuer VS
 pwq-release

There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

The works of the @pwq may be processed by some other workers,
and @pwq is scheduled to release(due to its wq's attr is changed)
before the rescuer starts to process. In this case
pwq_unbound_release_workfn() will corrupt wq->maydays list,
and rescuer_thead() will access to corrupted data.

Using get_unbound_pwq() when send_mayday() will keep @pwq's lifetime
and avoid the race condition.

Changed from V1:
	1) Introduce get_unbound_pwq() for beter readibility. Since
	   get_pwq() is considerred no-op for percpu workqueue,
	   so the patch are the same behavior in functionality.
	2) More precise comments.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0c74979..d845bdd 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1050,6 +1050,12 @@ static void get_pwq(struct pool_workqueue *pwq)
 	pwq->refcnt++;
 }
 
+static inline void get_unbound_pwq(struct pool_workqueue *pwq)
+{
+	if (pwq->wq->flags & WQ_UNBOUND)
+		get_pwq(pwq);
+}
+
 /**
  * put_pwq - put a pool_workqueue reference
  * @pwq: pool_workqueue to put
@@ -1075,6 +1081,12 @@ static void put_pwq(struct pool_workqueue *pwq)
 	schedule_work(&pwq->unbound_release_work);
 }
 
+static inline void put_unbound_pwq(struct pool_workqueue *pwq)
+{
+	if (pwq->wq->flags & WQ_UNBOUND)
+		put_pwq(pwq);
+}
+
 /**
  * put_pwq_unlocked - put_pwq() with surrounding pool lock/unlock
  * @pwq: pool_workqueue to put (can be %NULL)
@@ -1908,6 +1920,19 @@ static void send_mayday(struct work_struct *work)
 
 	/* mayday mayday mayday */
 	if (list_empty(&pwq->mayday_node)) {
+		/*
+		 * A pwq of an unbound wq may be released before wq's
+		 * destruction when the wq's attr is changed. In this case,
+		 * pwq_unbound_release_workfn() may execute earlier before
+		 * rescuer_thread() and corrupt wq->maydays list.
+		 *
+		 * get_unbound_pwq() keeps the unbound pwq until the rescuer
+		 * processes it and protects the pwq from being scheduled to
+		 * release when someone else processes all the works before
+		 * the rescuer starts to process.
+		 */
+		get_unbound_pwq(pwq);
+
 		list_add_tail(&pwq->mayday_node, &wq->maydays);
 		wake_up_process(wq->rescuer->task);
 	}
@@ -2424,6 +2449,7 @@ repeat:
 		/* migrate to the target cpu if possible */
 		worker_maybe_bind_and_lock(pool);
 		rescuer->pool = pool;
+		put_unbound_pwq(pwq);
 
 		/*
 		 * Slurp in all works issued via this workqueue and
@@ -4318,6 +4344,10 @@ void destroy_workqueue(struct workqueue_struct *wq)
 		/*
 		 * The base ref is never dropped on per-cpu pwqs.  Directly
 		 * free the pwqs and wq.
+		 *
+		 * The wq->maydays list maybe still have some pwqs linked,
+		 * but it is safe to free them all together since the rescuer
+		 * is stopped.
 		 */
 		free_percpu(wq->cpu_pwqs);
 		kfree(wq);
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-03-28 12:07 ` [PATCH V2] " Lai Jiangshan
@ 2014-03-31 14:40   ` Lai Jiangshan
  2014-03-31 20:06     ` Tejun Heo
  2014-04-15 16:47   ` Tejun Heo
  1 sibling, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-03-31 14:40 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Tejun Heo, linux-kernel

Ping

On 03/28/2014 08:07 PM, Lai Jiangshan wrote:
>>From 11af0cd0306309f0deaf3326cc26d3e7e517e3d1 Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> Date: Fri, 28 Mar 2014 00:20:12 +0800
> Subject: [PATCH] workqueue: fix possible race condition when rescuer VS
>  pwq-release
> 
> There is a race condition between rescuer_thread() and
> pwq_unbound_release_workfn().
> 
> The works of the @pwq may be processed by some other workers,
> and @pwq is scheduled to release(due to its wq's attr is changed)
> before the rescuer starts to process. In this case
> pwq_unbound_release_workfn() will corrupt wq->maydays list,
> and rescuer_thead() will access to corrupted data.
> 
> Using get_unbound_pwq() when send_mayday() will keep @pwq's lifetime
> and avoid the race condition.
> 
> Changed from V1:
> 	1) Introduce get_unbound_pwq() for beter readibility. Since
> 	   get_pwq() is considerred no-op for percpu workqueue,
> 	   so the patch are the same behavior in functionality.
> 	2) More precise comments.
> 
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  kernel/workqueue.c |   30 ++++++++++++++++++++++++++++++
>  1 files changed, 30 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 0c74979..d845bdd 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1050,6 +1050,12 @@ static void get_pwq(struct pool_workqueue *pwq)
>  	pwq->refcnt++;
>  }
>  
> +static inline void get_unbound_pwq(struct pool_workqueue *pwq)
> +{
> +	if (pwq->wq->flags & WQ_UNBOUND)
> +		get_pwq(pwq);
> +}
> +
>  /**
>   * put_pwq - put a pool_workqueue reference
>   * @pwq: pool_workqueue to put
> @@ -1075,6 +1081,12 @@ static void put_pwq(struct pool_workqueue *pwq)
>  	schedule_work(&pwq->unbound_release_work);
>  }
>  
> +static inline void put_unbound_pwq(struct pool_workqueue *pwq)
> +{
> +	if (pwq->wq->flags & WQ_UNBOUND)
> +		put_pwq(pwq);
> +}
> +
>  /**
>   * put_pwq_unlocked - put_pwq() with surrounding pool lock/unlock
>   * @pwq: pool_workqueue to put (can be %NULL)
> @@ -1908,6 +1920,19 @@ static void send_mayday(struct work_struct *work)
>  
>  	/* mayday mayday mayday */
>  	if (list_empty(&pwq->mayday_node)) {
> +		/*
> +		 * A pwq of an unbound wq may be released before wq's
> +		 * destruction when the wq's attr is changed. In this case,
> +		 * pwq_unbound_release_workfn() may execute earlier before
> +		 * rescuer_thread() and corrupt wq->maydays list.
> +		 *
> +		 * get_unbound_pwq() keeps the unbound pwq until the rescuer
> +		 * processes it and protects the pwq from being scheduled to
> +		 * release when someone else processes all the works before
> +		 * the rescuer starts to process.
> +		 */
> +		get_unbound_pwq(pwq);
> +
>  		list_add_tail(&pwq->mayday_node, &wq->maydays);
>  		wake_up_process(wq->rescuer->task);
>  	}
> @@ -2424,6 +2449,7 @@ repeat:
>  		/* migrate to the target cpu if possible */
>  		worker_maybe_bind_and_lock(pool);
>  		rescuer->pool = pool;
> +		put_unbound_pwq(pwq);
>  
>  		/*
>  		 * Slurp in all works issued via this workqueue and
> @@ -4318,6 +4344,10 @@ void destroy_workqueue(struct workqueue_struct *wq)
>  		/*
>  		 * The base ref is never dropped on per-cpu pwqs.  Directly
>  		 * free the pwqs and wq.
> +		 *
> +		 * The wq->maydays list maybe still have some pwqs linked,
> +		 * but it is safe to free them all together since the rescuer
> +		 * is stopped.
>  		 */
>  		free_percpu(wq->cpu_pwqs);
>  		kfree(wq);


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-03-31 14:40   ` Lai Jiangshan
@ 2014-03-31 20:06     ` Tejun Heo
  2014-04-14  7:02       ` Lai Jiangshan
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2014-03-31 20:06 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

On Mon, Mar 31, 2014 at 10:40:47PM +0800, Lai Jiangshan wrote:
> Ping

Lai, I'll keep the mail tagged but can you pleaes ping me once -rc1
drops?  While it is an actual bug, given that it'd be pretty difficult
to trigger, I don't think it's too urgent.

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-03-31 20:06     ` Tejun Heo
@ 2014-04-14  7:02       ` Lai Jiangshan
  0 siblings, 0 replies; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-14  7:02 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On 04/01/2014 04:06 AM, Tejun Heo wrote:
> On Mon, Mar 31, 2014 at 10:40:47PM +0800, Lai Jiangshan wrote:
>> Ping
> 
> Lai, I'll keep the mail tagged but can you pleaes ping me once -rc1
> drops?  While it is an actual bug, given that it'd be pretty difficult
> to trigger, I don't think it's too urgent.
> 
> Thanks!
> 

Hi, Tejun
Ping.

Very old (resent) patch(1 patch):
workqueue: add __WQ_FREEZING and remove POOL_FREEZING

Possible race condition patch(1 patch), (In this email thread)
workqueue: fix possible race condition when rescuer VS pwq-release

Worker management patchset(6 patches)
workqueue: simpler&better workers management synchronization
  workqueue: generic routine to restore percpu/unbound pools' workers'
    cpumask
  workqueue: generic framework to manage normal&rescuer workers'
    cpumask
  workqueue: make destroy_worker() atomically
  workqueue: commit worker to pool's concurrency setting atomically.
  workqueue: remove manager_mutex
  workqueue: destroy worker directly in idle timeout handler

Sorry for adding some work to you.

Thanks
Lai

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-03-28 12:07 ` [PATCH V2] " Lai Jiangshan
  2014-03-31 14:40   ` Lai Jiangshan
@ 2014-04-15 16:47   ` Tejun Heo
  2014-04-16  1:25     ` Lai Jiangshan
  1 sibling, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2014-04-15 16:47 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

On Fri, Mar 28, 2014 at 08:07:58PM +0800, Lai Jiangshan wrote:
> +static inline void get_unbound_pwq(struct pool_workqueue *pwq)
> +{
> +	if (pwq->wq->flags & WQ_UNBOUND)
> +		get_pwq(pwq);
> +}
> +
>  /**
>   * put_pwq - put a pool_workqueue reference
>   * @pwq: pool_workqueue to put
> @@ -1075,6 +1081,12 @@ static void put_pwq(struct pool_workqueue *pwq)
>  	schedule_work(&pwq->unbound_release_work);
>  }
>  
> +static inline void put_unbound_pwq(struct pool_workqueue *pwq)
> +{
> +	if (pwq->wq->flags & WQ_UNBOUND)
> +		put_pwq(pwq);
> +}

Ugh... please drop these helpers.

> +		get_unbound_pwq(pwq);

Why not just do get_pwq() here?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-04-15 16:47   ` Tejun Heo
@ 2014-04-16  1:25     ` Lai Jiangshan
  2014-04-16 15:23       ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-16  1:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Lai Jiangshan

On 04/16/2014 12:47 AM, Tejun Heo wrote:
> On Fri, Mar 28, 2014 at 08:07:58PM +0800, Lai Jiangshan wrote:
>> +static inline void get_unbound_pwq(struct pool_workqueue *pwq)
>> +{
>> +	if (pwq->wq->flags & WQ_UNBOUND)
>> +		get_pwq(pwq);
>> +}
>> +
>>  /**
>>   * put_pwq - put a pool_workqueue reference
>>   * @pwq: pool_workqueue to put
>> @@ -1075,6 +1081,12 @@ static void put_pwq(struct pool_workqueue *pwq)
>>  	schedule_work(&pwq->unbound_release_work);
>>  }
>>  
>> +static inline void put_unbound_pwq(struct pool_workqueue *pwq)
>> +{
>> +	if (pwq->wq->flags & WQ_UNBOUND)
>> +		put_pwq(pwq);
>> +}
> 
> Ugh... please drop these helpers.
> 
>> +		get_unbound_pwq(pwq);
> 
> Why not just do get_pwq() here?

V1 patch just do get_pwq().

> 
> Thanks.
> 

1) Our aim is to protect unbound pwq, not percpu pwq which can't be be protected by get_pwq().
2) get_pwq() will make reviewers confused/surprised, destroy_workqueue() may destroy percpu pwqs
   with ref > 1. At least we need to add more comments explain this behavior. Origin comments:
		/*
		 * The base ref is never dropped on per-cpu pwqs.  Directly
		 * free the pwqs and wq.
		 */
3) get_unbound_pwq() self document.

Thanks,
Lai

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-04-16  1:25     ` Lai Jiangshan
@ 2014-04-16 15:23       ` Tejun Heo
  2014-04-16 16:21         ` Lai Jiangshan
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2014-04-16 15:23 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello, Lai.

On Wed, Apr 16, 2014 at 09:25:16AM +0800, Lai Jiangshan wrote:
> 1) Our aim is to protect unbound pwq, not percpu pwq which can't be be protected by get_pwq().
> 2) get_pwq() will make reviewers confused/surprised, destroy_workqueue() may destroy percpu pwqs
>    with ref > 1. At least we need to add more comments explain this behavior. Origin comments:
> 		/*
> 		 * The base ref is never dropped on per-cpu pwqs.  Directly
> 		 * free the pwqs and wq.
> 		 */

You can just comment "pwqs might go away at any time, pin it until
rescuer is done with it" and that's actually the better way to do it.
percpu wq's not supporting attribute changes may change in the future.
What the code path wants is pinning down the pwq no matter where it
came from.  There's no point in distinguishing unbound and per-cpu
here.

> 3) get_unbound_pwq() self document.

Not really.  If the name is get_pwq_if_unbound(), maybe.  Functions
which take args and become noop depending on the argument aren't
generally good ideas.  There are specific cases that they are suitable
but this is just gratuituous.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-04-16 15:23       ` Tejun Heo
@ 2014-04-16 16:21         ` Lai Jiangshan
  2014-04-16 16:50           ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-16 16:21 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML

On Wed, Apr 16, 2014 at 11:23 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Lai.
>
> On Wed, Apr 16, 2014 at 09:25:16AM +0800, Lai Jiangshan wrote:
>> 1) Our aim is to protect unbound pwq, not percpu pwq which can't be be protected by get_pwq().
>> 2) get_pwq() will make reviewers confused/surprised, destroy_workqueue() may destroy percpu pwqs
>>    with ref > 1. At least we need to add more comments explain this behavior. Origin comments:
>>               /*
>>                * The base ref is never dropped on per-cpu pwqs.  Directly
>>                * free the pwqs and wq.
>>                */

Hi, Tejun

OK. It is better to use get_pwq(). I will also change the above comments to:

  The base ref and the possible ref from rerscuer(stopped) are never
dropped on per-cpu pwqs.
  Directly free the pwqs and wq.

The reason I quickly dropped V1 and wrote the V2 is that I saw this comment.
"The base ref" is precise after I used get_pwq() in V1.

Or to avoid to change to this comments.
I can also move the following code down to the bottom of the rescuer_thread().

if (kthread_should_stop()) {
__set_current_state(TASK_RUNNING);
rescuer->task->flags &= ~PF_WQ_WORKER;
return 0;
}

(I reply this email on browser, never mind the syntax).
Maybe the second choice are better.

Any think?

Thanks,
Lai.

>
> You can just comment "pwqs might go away at any time, pin it until
> rescuer is done with it" and that's actually the better way to do it.
> percpu wq's not supporting attribute changes may change in the future.
> What the code path wants is pinning down the pwq no matter where it
> came from.  There's no point in distinguishing unbound and per-cpu
> here.
>
>> 3) get_unbound_pwq() self document.
>
> Not really.  If the name is get_pwq_if_unbound(), maybe.  Functions
> which take args and become noop depending on the argument aren't
> generally good ideas.  There are specific cases that they are suitable
> but this is just gratuituous.
>
> Thanks.
>
> --
> tejun
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-04-16 16:21         ` Lai Jiangshan
@ 2014-04-16 16:50           ` Tejun Heo
  2014-04-16 22:35             ` Lai Jiangshan
  2014-04-16 23:34             ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Lai Jiangshan
  0 siblings, 2 replies; 24+ messages in thread
From: Tejun Heo @ 2014-04-16 16:50 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: LKML

Hello, Lai.

On Thu, Apr 17, 2014 at 12:21:21AM +0800, Lai Jiangshan wrote:
> OK. It is better to use get_pwq(). I will also change the above comments to:
> 
>   The base ref and the possible ref from rerscuer(stopped) are never
> dropped on per-cpu pwqs.
>   Directly free the pwqs and wq.

Hmmm, why does that matter?  The base ref is the business of
create/destroy path and as long as base ref is never put other refs
just don't matter.

> The reason I quickly dropped V1 and wrote the V2 is that I saw this comment.
> "The base ref" is precise after I used get_pwq() in V1.
> 
> Or to avoid to change to this comments.
> I can also move the following code down to the bottom of the rescuer_thread().
> 
> if (kthread_should_stop()) {
> __set_current_state(TASK_RUNNING);
> rescuer->task->flags &= ~PF_WQ_WORKER;
> return 0;
> }
> 
> (I reply this email on browser, never mind the syntax).
> Maybe the second choice are better.

Hmmm... yeah, regardlesss of the above, it's a bit nasty that rescuer
may exit with non-empty mayday list.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] workqueue: fix possible race condition when rescuer VS pwq-release
  2014-04-16 16:50           ` Tejun Heo
@ 2014-04-16 22:35             ` Lai Jiangshan
  2014-04-16 23:34             ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Lai Jiangshan
  1 sibling, 0 replies; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-16 22:35 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML

On 04/17/2014 12:50 AM, Tejun Heo wrote:
> Hello, Lai.
> 
> On Thu, Apr 17, 2014 at 12:21:21AM +0800, Lai Jiangshan wrote:
>> OK. It is better to use get_pwq(). I will also change the above comments to:
>>
>>   The base ref and the possible ref from rerscuer(stopped) are never
>> dropped on per-cpu pwqs.
>>   Directly free the pwqs and wq.
> 
> Hmmm, why does that matter?  The base ref is the business of
> create/destroy path and as long as base ref is never put other refs
> just don't matter.
> 
>> The reason I quickly dropped V1 and wrote the V2 is that I saw this comment.
>> "The base ref" is precise after I used get_pwq() in V1.
>>
>> Or to avoid to change to this comments.
>> I can also move the following code down to the bottom of the rescuer_thread().
>>
>> if (kthread_should_stop()) {
>> __set_current_state(TASK_RUNNING);
>> rescuer->task->flags &= ~PF_WQ_WORKER;
>> return 0;
>> }
>>
>> (I reply this email on browser, never mind the syntax).
>> Maybe the second choice are better.
> 
> Hmmm... yeah, regardlesss of the above, it's a bit nasty that rescuer
> may exit with non-empty mayday list.
> 

+		 * The wq->maydays list maybe still have some pwqs linked,
+		 * but it is safe to free them all together since the rescuer
+		 * is stopped.

I just comment it without fix. It has no harm before, but it harms after I added a get_pwq().
Sorry.
Lai

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit
  2014-04-16 16:50           ` Tejun Heo
  2014-04-16 22:35             ` Lai Jiangshan
@ 2014-04-16 23:34             ` Lai Jiangshan
  2014-04-16 23:34               ` [PATCH 2/2] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
  2014-04-17 15:27               ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Tejun Heo
  1 sibling, 2 replies; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-16 23:34 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

Before the rescuer is picked to running, the works of the @pwq
may be processed by some other workers, and destroy_workqueue()
may called at the same time. This may result a nasty situation
that rescuer may exit with non-empty mayday list.

It is no harm currently, destroy_workqueue() can safely to free
them all(workqueue&pwqs) togerther, since the rescuer is stopped.
No rescuer nor mayday-timer can access the mayday list.

But it is nasty and error-prone in future develop. Fix it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0ee63af..832125f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2409,12 +2409,6 @@ static int rescuer_thread(void *__rescuer)
 repeat:
 	set_current_state(TASK_INTERRUPTIBLE);
 
-	if (kthread_should_stop()) {
-		__set_current_state(TASK_RUNNING);
-		rescuer->task->flags &= ~PF_WQ_WORKER;
-		return 0;
-	}
-
 	/* see whether any pwq is asking for help */
 	spin_lock_irq(&wq_mayday_lock);
 
@@ -2459,6 +2453,12 @@ repeat:
 
 	spin_unlock_irq(&wq_mayday_lock);
 
+	if (kthread_should_stop()) {
+		__set_current_state(TASK_RUNNING);
+		rescuer->task->flags &= ~PF_WQ_WORKER;
+		return 0;
+	}
+
 	/* rescuers should never participate in concurrency management */
 	WARN_ON_ONCE(!(rescuer->flags & WORKER_NOT_RUNNING));
 	schedule();
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] workqueue: fix possible race condition when rescuer VS  pwq-release
  2014-04-16 23:34             ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Lai Jiangshan
@ 2014-04-16 23:34               ` Lai Jiangshan
  2014-04-17 15:27               ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Tejun Heo
  1 sibling, 0 replies; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-16 23:34 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

The works of the @pwq may be processed by some other workers,
and @pwq is scheduled to release(due to its wq's attr is changed)
before the rescuer starts to process. In this case
pwq_unbound_release_workfn() will corrupt wq->maydays list,
and rescuer_thead() will access to corrupted data.

Using get_pwq() pin it until rescuer is done with it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 832125f..77c29b7 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1916,6 +1916,16 @@ static void send_mayday(struct work_struct *work)
 
 	/* mayday mayday mayday */
 	if (list_empty(&pwq->mayday_node)) {
+		/*
+		 * pwqs might go away at any time, pin it until
+		 * rescuer is done with it.
+		 *
+		 * Especially a pwq of an unbound wq may be released
+		 * before wq's destruction when the wq's attr is changed.
+		 * In this case, pwq_unbound_release_workfn() may execute
+		 * earlier before rescuer_thread() and corrupt wq->maydays.
+		 */
+		get_pwq(pwq);
 		list_add_tail(&pwq->mayday_node, &wq->maydays);
 		wake_up_process(wq->rescuer->task);
 	}
@@ -2438,6 +2448,9 @@ repeat:
 
 		process_scheduled_works(rescuer);
 
+		/* put the reference grabbed by send_mayday(). */
+		put_pwq(pwq);
+
 		/*
 		 * Leave this pool.  If keep_working() is %true, notify a
 		 * regular worker; otherwise, we end up with 0 concurrency
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit
  2014-04-16 23:34             ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Lai Jiangshan
  2014-04-16 23:34               ` [PATCH 2/2] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
@ 2014-04-17 15:27               ` Tejun Heo
  2014-04-17 16:04                 ` Lai Jiangshan
  1 sibling, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2014-04-17 15:27 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello,

On Thu, Apr 17, 2014 at 07:34:08AM +0800, Lai Jiangshan wrote:
> Before the rescuer is picked to running, the works of the @pwq
> may be processed by some other workers, and destroy_workqueue()
> may called at the same time. This may result a nasty situation
> that rescuer may exit with non-empty mayday list.
> 
> It is no harm currently, destroy_workqueue() can safely to free
> them all(workqueue&pwqs) togerther, since the rescuer is stopped.
> No rescuer nor mayday-timer can access the mayday list.
> 
> But it is nasty and error-prone in future develop. Fix it.
> 
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  kernel/workqueue.c |   12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 0ee63af..832125f 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2409,12 +2409,6 @@ static int rescuer_thread(void *__rescuer)
>  repeat:
>  	set_current_state(TASK_INTERRUPTIBLE);
>  
> -	if (kthread_should_stop()) {
> -		__set_current_state(TASK_RUNNING);
> -		rescuer->task->flags &= ~PF_WQ_WORKER;
> -		return 0;
> -	}
> -
>  	/* see whether any pwq is asking for help */
>  	spin_lock_irq(&wq_mayday_lock);
>  
> @@ -2459,6 +2453,12 @@ repeat:
>  
>  	spin_unlock_irq(&wq_mayday_lock);
>  
> +	if (kthread_should_stop()) {
> +		__set_current_state(TASK_RUNNING);
> +		rescuer->task->flags &= ~PF_WQ_WORKER;
> +		return 0;
> +	}
> +

I don't think this is reliable.  What if mayday requests take place
between wq_mayday_lock and kthread_should_stop() check?  We'll
probably need to run through mayday list after checking should_stop.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit
  2014-04-17 15:27               ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Tejun Heo
@ 2014-04-17 16:04                 ` Lai Jiangshan
  2014-04-17 16:08                   ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-17 16:04 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML

On Thu, Apr 17, 2014 at 11:27 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Thu, Apr 17, 2014 at 07:34:08AM +0800, Lai Jiangshan wrote:
>> Before the rescuer is picked to running, the works of the @pwq
>> may be processed by some other workers, and destroy_workqueue()
>> may called at the same time. This may result a nasty situation
>> that rescuer may exit with non-empty mayday list.
>>
>> It is no harm currently, destroy_workqueue() can safely to free
>> them all(workqueue&pwqs) togerther, since the rescuer is stopped.
>> No rescuer nor mayday-timer can access the mayday list.
>>
>> But it is nasty and error-prone in future develop. Fix it.
>>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>> ---
>>  kernel/workqueue.c |   12 ++++++------
>>  1 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 0ee63af..832125f 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -2409,12 +2409,6 @@ static int rescuer_thread(void *__rescuer)
>>  repeat:
>>       set_current_state(TASK_INTERRUPTIBLE);
>>
>> -     if (kthread_should_stop()) {
>> -             __set_current_state(TASK_RUNNING);
>> -             rescuer->task->flags &= ~PF_WQ_WORKER;
>> -             return 0;
>> -     }
>> -
>>       /* see whether any pwq is asking for help */
>>       spin_lock_irq(&wq_mayday_lock);
>>
>> @@ -2459,6 +2453,12 @@ repeat:
>>
>>       spin_unlock_irq(&wq_mayday_lock);
>>
>> +     if (kthread_should_stop()) {
>> +             __set_current_state(TASK_RUNNING);
>> +             rescuer->task->flags &= ~PF_WQ_WORKER;
>> +             return 0;
>> +     }
>> +
>
> I don't think this is reliable.  What if mayday requests take place
> between wq_mayday_lock and kthread_should_stop() check?  We'll
> probably need to run through mayday list after checking should_stop.

It is destroy_workqueue()'s responsibility to avoid this.
destroy_workqueue() should drain all works and refuse any new work queued
on the wq before destroy the wq.

So since there is no works, there is no new mayday request,
and there is no mayday request take place between wq_mayday_lock
and kthread_should_stop() check.

Thanks,
Lai

>
> Thanks.
>
> --
> tejun
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit
  2014-04-17 16:04                 ` Lai Jiangshan
@ 2014-04-17 16:08                   ` Tejun Heo
  2014-04-17 16:21                     ` Lai Jiangshan
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2014-04-17 16:08 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: LKML

Hello, Lai.

On Fri, Apr 18, 2014 at 12:04:29AM +0800, Lai Jiangshan wrote:
> > I don't think this is reliable.  What if mayday requests take place
> > between wq_mayday_lock and kthread_should_stop() check?  We'll
> > probably need to run through mayday list after checking should_stop.
> 
> It is destroy_workqueue()'s responsibility to avoid this.
> destroy_workqueue() should drain all works and refuse any new work queued
> on the wq before destroy the wq.
> 
> So since there is no works, there is no new mayday request,
> and there is no mayday request take place between wq_mayday_lock
> and kthread_should_stop() check.

Hmmm?  Isn't this the same race condition that you tried to remove by
relocating the test?  It doesn't matter what destroy_workqueue() does,
the rescuer may get preempted inbetween and anything can happen
inbetween including someone maydaying and initiation of
destroy_workqueue().  Your patch doesn't change the situation at all.
It can still return with non-empty mayday list.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit
  2014-04-17 16:08                   ` Tejun Heo
@ 2014-04-17 16:21                     ` Lai Jiangshan
  2014-04-17 16:27                       ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-17 16:21 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML

On Fri, Apr 18, 2014 at 12:08 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Lai.
>
> On Fri, Apr 18, 2014 at 12:04:29AM +0800, Lai Jiangshan wrote:
>> > I don't think this is reliable.  What if mayday requests take place
>> > between wq_mayday_lock and kthread_should_stop() check?  We'll
>> > probably need to run through mayday list after checking should_stop.
>>
>> It is destroy_workqueue()'s responsibility to avoid this.
>> destroy_workqueue() should drain all works and refuse any new work queued
>> on the wq before destroy the wq.
>>
>> So since there is no works, there is no new mayday request,
>> and there is no mayday request take place between wq_mayday_lock
>> and kthread_should_stop() check.
>
> Hmmm?  Isn't this the same race condition that you tried to remove by
> relocating the test?  It doesn't matter what destroy_workqueue() does,
> the rescuer may get preempted inbetween and anything can happen
> inbetween including someone maydaying and initiation of
> destroy_workqueue().  Your patch doesn't change the situation at all.
> It can still return with non-empty mayday list.

You are right. We need a additional atomic check.

Thanks,
Lai

>
> Thanks.
>
> --
> tejun
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit
  2014-04-17 16:21                     ` Lai Jiangshan
@ 2014-04-17 16:27                       ` Tejun Heo
  2014-04-18 13:25                         ` [PATCH 1/2 V4] " Lai Jiangshan
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2014-04-17 16:27 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: LKML

On Fri, Apr 18, 2014 at 12:21:51AM +0800, Lai Jiangshan wrote:
> > Hmmm?  Isn't this the same race condition that you tried to remove by
> > relocating the test?  It doesn't matter what destroy_workqueue() does,
> > the rescuer may get preempted inbetween and anything can happen
> > inbetween including someone maydaying and initiation of
> > destroy_workqueue().  Your patch doesn't change the situation at all.
> > It can still return with non-empty mayday list.
> 
> You are right. We need a additional atomic check.

I don't think increasing the number of checks will do anything.  We
just need to keep the test where it is now and exit after doing the
loop.  ie.

	set TASK_INTRRUPTIBLE
	test should_stop and cache the result

	spin_lock_irq();

	while () {
		process the list
	}

	spin_unlock_irq();

	if (was should_stop asserted?)
		exit;

	schedule();

Should do, no?

-- 
tejun

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2 V4] workqueue: rescuer_thread() processes all pwqs before exit
  2014-04-17 16:27                       ` Tejun Heo
@ 2014-04-18 13:25                         ` Lai Jiangshan
  2014-04-18 13:25                           ` [PATCH 2/2 V4] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
  2014-04-18 15:06                           ` [PATCH 1/2] workqueue: make rescuer_thread() empty wq->maydays list before exiting Tejun Heo
  0 siblings, 2 replies; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-18 13:25 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

Before the rescuer is picked to running, the works of the @pwq
may be processed by some other workers, and destroy_workqueue()
may called at the same time. This may result a nasty situation
that rescuer may exit with non-empty mayday list.

It is no harm currently, destroy_workqueue() can safely to free
them all(workqueue&pwqs) togerther, since the rescuer is stopped.
No rescuer nor mayday-timer can access the mayday list.

But it is nasty and error-prone in future development. Fix it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   21 +++++++++++++++------
 1 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0ee63af..7539244 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2398,6 +2398,7 @@ static int rescuer_thread(void *__rescuer)
 	struct worker *rescuer = __rescuer;
 	struct workqueue_struct *wq = rescuer->rescue_wq;
 	struct list_head *scheduled = &rescuer->scheduled;
+	bool should_stop;
 
 	set_user_nice(current, RESCUER_NICE_LEVEL);
 
@@ -2408,12 +2409,14 @@ static int rescuer_thread(void *__rescuer)
 	rescuer->task->flags |= PF_WQ_WORKER;
 repeat:
 	set_current_state(TASK_INTERRUPTIBLE);
-
-	if (kthread_should_stop()) {
-		__set_current_state(TASK_RUNNING);
-		rescuer->task->flags &= ~PF_WQ_WORKER;
-		return 0;
-	}
+	/*
+	 * When the rescuer is requested to stop, the workqueue has no
+	 * work pending, but wq->maydays may still have pwq(s) queued.
+	 * This can happend when some other workers process all works
+	 * before this rescuer is scheduled. The rescuer must process
+	 * all pwq(s) before exit.
+	 */
+	should_stop = kthread_should_stop();
 
 	/* see whether any pwq is asking for help */
 	spin_lock_irq(&wq_mayday_lock);
@@ -2459,6 +2462,12 @@ repeat:
 
 	spin_unlock_irq(&wq_mayday_lock);
 
+	if (should_stop) {
+		__set_current_state(TASK_RUNNING);
+		rescuer->task->flags &= ~PF_WQ_WORKER;
+		return 0;
+	}
+
 	/* rescuers should never participate in concurrency management */
 	WARN_ON_ONCE(!(rescuer->flags & WORKER_NOT_RUNNING));
 	schedule();
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2 V4] workqueue: fix possible race condition when rescuer VS  pwq-release
  2014-04-18 13:25                         ` [PATCH 1/2 V4] " Lai Jiangshan
@ 2014-04-18 13:25                           ` Lai Jiangshan
  2014-04-18 15:06                             ` [PATCH 2/2] workqueue: fix a possible race condition between rescuer and pwq-release Tejun Heo
  2014-04-18 15:06                           ` [PATCH 1/2] workqueue: make rescuer_thread() empty wq->maydays list before exiting Tejun Heo
  1 sibling, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-18 13:25 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

The works of the @pwq may be processed by some other workers,
and @pwq is scheduled to release(due to its wq's attr is changed)
before the rescuer starts to process. In this case
pwq_unbound_release_workfn() will corrupt wq->maydays list,
and rescuer_thead() will access to corrupted data.

Using get_pwq() pin it until rescuer is done with it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7539244..8c0830c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1916,6 +1916,16 @@ static void send_mayday(struct work_struct *work)
 
 	/* mayday mayday mayday */
 	if (list_empty(&pwq->mayday_node)) {
+		/*
+		 * pwqs might go away at any time, pin it until the
+		 * rescuer is done with it.
+		 *
+		 * Especially a pwq of an unbound wq may be released
+		 * before wq's destruction when the wq's attr is changed.
+		 * In this case, pwq_unbound_release_workfn() may execute
+		 * earlier before rescuer_thread() and corrupt wq->maydays.
+		 */
+		get_pwq(pwq);
 		list_add_tail(&pwq->mayday_node, &wq->maydays);
 		wake_up_process(wq->rescuer->task);
 	}
@@ -2447,6 +2457,9 @@ repeat:
 
 		process_scheduled_works(rescuer);
 
+		/* put the reference grabbed by send_mayday(). */
+		put_pwq(pwq);
+
 		/*
 		 * Leave this pool.  If keep_working() is %true, notify a
 		 * regular worker; otherwise, we end up with 0 concurrency
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 1/2] workqueue: make rescuer_thread() empty wq->maydays list before exiting
  2014-04-18 13:25                         ` [PATCH 1/2 V4] " Lai Jiangshan
  2014-04-18 13:25                           ` [PATCH 2/2 V4] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
@ 2014-04-18 15:06                           ` Tejun Heo
  1 sibling, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2014-04-18 15:06 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Applied to wq/for-3.15-fixes w/ comment and description updates.

Thanks.
------- 8< -------
>From 4d595b866d2c653dc90a492b9973a834eabfa354 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Fri, 18 Apr 2014 11:04:16 -0400

After a @pwq is scheduled for emergency execution, other workers may
consume the affectd work items before the rescuer gets to them.  This
means that a workqueue many have pwqs queued on @wq->maydays list
while not having any work item pending or in-flight.  If
destroy_workqueue() executes in such condition, the rescuer may exit
without emptying @wq->maydays.

This currently doesn't cause any actual harm.  destroy_workqueue() can
safely destroy all the involved data structures whether @wq->maydays
is populated or not as nobody access the list once the rescuer exits.

However, this is nasty and makes future development difficult.  Let's
update rescuer_thread() so that it empties @wq->maydays after seeing
should_stop to guarantee that the list is empty on rescuer exit.

tj: Updated comment and patch description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v3.10+
---
 kernel/workqueue.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3150b21..6ba0c60 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2398,6 +2398,7 @@ static int rescuer_thread(void *__rescuer)
 	struct worker *rescuer = __rescuer;
 	struct workqueue_struct *wq = rescuer->rescue_wq;
 	struct list_head *scheduled = &rescuer->scheduled;
+	bool should_stop;
 
 	set_user_nice(current, RESCUER_NICE_LEVEL);
 
@@ -2409,11 +2410,15 @@ static int rescuer_thread(void *__rescuer)
 repeat:
 	set_current_state(TASK_INTERRUPTIBLE);
 
-	if (kthread_should_stop()) {
-		__set_current_state(TASK_RUNNING);
-		rescuer->task->flags &= ~PF_WQ_WORKER;
-		return 0;
-	}
+	/*
+	 * By the time the rescuer is requested to stop, the workqueue
+	 * shouldn't have any work pending, but @wq->maydays may still have
+	 * pwq(s) queued.  This can happen by non-rescuer workers consuming
+	 * all the work items before the rescuer got to them.  Go through
+	 * @wq->maydays processing before acting on should_stop so that the
+	 * list is always empty on exit.
+	 */
+	should_stop = kthread_should_stop();
 
 	/* see whether any pwq is asking for help */
 	spin_lock_irq(&wq_mayday_lock);
@@ -2459,6 +2464,12 @@ repeat:
 
 	spin_unlock_irq(&wq_mayday_lock);
 
+	if (should_stop) {
+		__set_current_state(TASK_RUNNING);
+		rescuer->task->flags &= ~PF_WQ_WORKER;
+		return 0;
+	}
+
 	/* rescuers should never participate in concurrency management */
 	WARN_ON_ONCE(!(rescuer->flags & WORKER_NOT_RUNNING));
 	schedule();
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] workqueue: fix a possible race condition between rescuer and pwq-release
  2014-04-18 13:25                           ` [PATCH 2/2 V4] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
@ 2014-04-18 15:06                             ` Tejun Heo
  2014-04-18 16:24                               ` Lai Jiangshan
  0 siblings, 1 reply; 24+ messages in thread
From: Tejun Heo @ 2014-04-18 15:06 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello,

Applied to wq/for-3.15-fixes with put_pwq() relocated and patch
description and comments updated.

Thanks.
-------- 8< --------
>From 4b81955722abe4306096c7b0137e0491a7ba7b0e Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Fri, 18 Apr 2014 11:04:16 -0400

There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

Even after a pwq is scheduled for rescue, the associated work items
may be consumed by any worker.  If all of them are consumed before the
rescuer gets to them and the pwq's base ref was put due to attribute
change, the pwq may be released while still being linked on
@wq->maydays list making the rescuer dereference already freed pwq
later.

Make send_mayday() pin the target pwq until the rescuer is done with
it.

tj: Updated comment and patch description.  Moved put_pwq() to after
    the last @pwq->pool access.  This isn't strictly necessarily for
    correctness but is cleaner as the pool is accessed through the
    pwq.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v3.10+
---
 kernel/workqueue.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 6ba0c60..a6532f2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1916,6 +1916,12 @@ static void send_mayday(struct work_struct *work)
 
 	/* mayday mayday mayday */
 	if (list_empty(&pwq->mayday_node)) {
+		/*
+		 * If @pwq is for an unbound wq, its base ref may be put at
+		 * any time due to an attribute change.  Pin @pwq until the
+		 * rescuer is done with it.
+		 */
+		get_pwq(pwq);
 		list_add_tail(&pwq->mayday_node, &wq->maydays);
 		wake_up_process(wq->rescuer->task);
 	}
@@ -2459,6 +2465,7 @@ repeat:
 
 		rescuer->pool = NULL;
 		spin_unlock(&pool->lock);
+		put_pwq(pwq);	/* put the ref from send_mayday() */
 		spin_lock(&wq_mayday_lock);
 	}
 
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] workqueue: fix a possible race condition between rescuer and pwq-release
  2014-04-18 15:06                             ` [PATCH 2/2] workqueue: fix a possible race condition between rescuer and pwq-release Tejun Heo
@ 2014-04-18 16:24                               ` Lai Jiangshan
  2014-04-18 16:35                                 ` Tejun Heo
  0 siblings, 1 reply; 24+ messages in thread
From: Lai Jiangshan @ 2014-04-18 16:24 UTC (permalink / raw)
  To: Tejun Heo; +Cc: LKML

On Fri, Apr 18, 2014 at 11:06 PM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> Applied to wq/for-3.15-fixes with put_pwq() relocated and patch
> description and comments updated.
>
> Thanks.
> -------- 8< --------
> From 4b81955722abe4306096c7b0137e0491a7ba7b0e Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> Date: Fri, 18 Apr 2014 11:04:16 -0400
>
> There is a race condition between rescuer_thread() and
> pwq_unbound_release_workfn().
>
> Even after a pwq is scheduled for rescue, the associated work items
> may be consumed by any worker.  If all of them are consumed before the
> rescuer gets to them and the pwq's base ref was put due to attribute
> change, the pwq may be released while still being linked on
> @wq->maydays list making the rescuer dereference already freed pwq
> later.
>
> Make send_mayday() pin the target pwq until the rescuer is done with
> it.
>
> tj: Updated comment and patch description.  Moved put_pwq() to after
>     the last @pwq->pool access.  This isn't strictly necessarily for
>     correctness but is cleaner as the pool is accessed through the
>     pwq.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: stable@vger.kernel.org # v3.10+
> ---
>  kernel/workqueue.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 6ba0c60..a6532f2 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1916,6 +1916,12 @@ static void send_mayday(struct work_struct *work)
>
>         /* mayday mayday mayday */
>         if (list_empty(&pwq->mayday_node)) {
> +               /*
> +                * If @pwq is for an unbound wq, its base ref may be put at
> +                * any time due to an attribute change.  Pin @pwq until the
> +                * rescuer is done with it.
> +                */
> +               get_pwq(pwq);
>                 list_add_tail(&pwq->mayday_node, &wq->maydays);
>                 wake_up_process(wq->rescuer->task);
>         }
> @@ -2459,6 +2465,7 @@ repeat:
>
>                 rescuer->pool = NULL;
>                 spin_unlock(&pool->lock);
> +               put_pwq(pwq);   /* put the ref from send_mayday() */

put_pwq() requires pool->lock held.

>                 spin_lock(&wq_mayday_lock);
>         }
>
> --
> 1.9.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 2/2] workqueue: fix a possible race condition between rescuer and pwq-release
  2014-04-18 16:24                               ` Lai Jiangshan
@ 2014-04-18 16:35                                 ` Tejun Heo
  0 siblings, 0 replies; 24+ messages in thread
From: Tejun Heo @ 2014-04-18 16:35 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: LKML

Ah, you're right.  Fixed.

Thanks.
------ 8< ------
>From 77668c8b559e4fe2acf2a0749c7c83cde49a5025 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Fri, 18 Apr 2014 11:04:16 -0400

There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

Even after a pwq is scheduled for rescue, the associated work items
may be consumed by any worker.  If all of them are consumed before the
rescuer gets to them and the pwq's base ref was put due to attribute
change, the pwq may be released while still being linked on
@wq->maydays list making the rescuer dereference already freed pwq
later.

Make send_mayday() pin the target pwq until the rescuer is done with
it.

tj: Updated comment and patch description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v3.10+
---
 kernel/workqueue.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 6ba0c60..8edc871 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1916,6 +1916,12 @@ static void send_mayday(struct work_struct *work)
 
 	/* mayday mayday mayday */
 	if (list_empty(&pwq->mayday_node)) {
+		/*
+		 * If @pwq is for an unbound wq, its base ref may be put at
+		 * any time due to an attribute change.  Pin @pwq until the
+		 * rescuer is done with it.
+		 */
+		get_pwq(pwq);
 		list_add_tail(&pwq->mayday_node, &wq->maydays);
 		wake_up_process(wq->rescuer->task);
 	}
@@ -2450,6 +2456,12 @@ repeat:
 		process_scheduled_works(rescuer);
 
 		/*
+		 * Put the reference grabbed by send_mayday().  @pool won't
+		 * go away while we're holding its lock.
+		 */
+		put_pwq(pwq);
+
+		/*
 		 * Leave this pool.  If keep_working() is %true, notify a
 		 * regular worker; otherwise, we end up with 0 concurrency
 		 * and stalling the execution.
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2014-04-18 16:35 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-27 16:20 [PATCH] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
2014-03-28 12:07 ` [PATCH V2] " Lai Jiangshan
2014-03-31 14:40   ` Lai Jiangshan
2014-03-31 20:06     ` Tejun Heo
2014-04-14  7:02       ` Lai Jiangshan
2014-04-15 16:47   ` Tejun Heo
2014-04-16  1:25     ` Lai Jiangshan
2014-04-16 15:23       ` Tejun Heo
2014-04-16 16:21         ` Lai Jiangshan
2014-04-16 16:50           ` Tejun Heo
2014-04-16 22:35             ` Lai Jiangshan
2014-04-16 23:34             ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Lai Jiangshan
2014-04-16 23:34               ` [PATCH 2/2] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
2014-04-17 15:27               ` [PATCH 1/2] workqueue: rescuer_thread() processes all pwqs before exit Tejun Heo
2014-04-17 16:04                 ` Lai Jiangshan
2014-04-17 16:08                   ` Tejun Heo
2014-04-17 16:21                     ` Lai Jiangshan
2014-04-17 16:27                       ` Tejun Heo
2014-04-18 13:25                         ` [PATCH 1/2 V4] " Lai Jiangshan
2014-04-18 13:25                           ` [PATCH 2/2 V4] workqueue: fix possible race condition when rescuer VS pwq-release Lai Jiangshan
2014-04-18 15:06                             ` [PATCH 2/2] workqueue: fix a possible race condition between rescuer and pwq-release Tejun Heo
2014-04-18 16:24                               ` Lai Jiangshan
2014-04-18 16:35                                 ` Tejun Heo
2014-04-18 15:06                           ` [PATCH 1/2] workqueue: make rescuer_thread() empty wq->maydays list before exiting Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.