* [PATCH] workqueue: fix possible bug which may silence the pool
@ 2013-03-02 15:55 Lai Jiangshan
2013-03-04 19:20 ` Tejun Heo
2013-03-08 23:15 ` [PATCH wq/for-3.9-fixes] workqueue: fix possible pool stall bug in wq_unbind_fn() Tejun Heo
0 siblings, 2 replies; 7+ messages in thread
From: Lai Jiangshan @ 2013-03-02 15:55 UTC (permalink / raw)
To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan
After multiple pools per CPU were introduced, part of the comment
in wq_unbind_fn() became wrong.
It claimed that the "current worker would trigger unbound chain execution".
That is no longer true: the current worker belongs to only one of the pools.
If wq_unbind_fn() unbinds the normal_pri pool (not the pool of the current
worker), the current worker is not an available worker to trigger unbound
chain execution of the normal_pri pool. If all workers of
the normal_pri pool go to sleep after %WORKER_UNBOUND is set but before
they finish their current work, unbound chain execution is never triggered
at all. The pool stalls!
We could make wq_unbind_fn() unbind only one pool and launch multiple
wq_unbind_fn()s, one per pool, to solve the problem,
but that would add unnecessary latency to the hotplug path.
So we choose to wake up a worker directly to trigger unbound chain execution.
The current worker may sleep on &second_pool->assoc_mutex, so we also move
the wakeup code into the loop, to keep the second_pool from silencing the first_pool.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
kernel/workqueue.c | 45 ++++++++++++++++++++++++++-------------------
1 files changed, 26 insertions(+), 19 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 81f2457..03159c2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3446,28 +3446,35 @@ static void wq_unbind_fn(struct work_struct *work)
spin_unlock_irq(&pool->lock);
mutex_unlock(&pool->assoc_mutex);
- }
- /*
- * Call schedule() so that we cross rq->lock and thus can guarantee
- * sched callbacks see the %WORKER_UNBOUND flag. This is necessary
- * as scheduler callbacks may be invoked from other cpus.
- */
- schedule();
+ /*
+ * Call schedule() so that we cross rq->lock and thus can
+ * guarantee sched callbacks see the %WORKER_UNBOUND flag.
+ * This is necessary as scheduler callbacks may be invoked
+ * from other cpus.
+ */
+ schedule();
- /*
- * Sched callbacks are disabled now. Zap nr_running. After this,
- * nr_running stays zero and need_more_worker() and keep_working()
- * are always true as long as the worklist is not empty. Pools on
- * @cpu now behave as unbound (in terms of concurrency management)
- * pools which are served by workers tied to the CPU.
- *
- * On return from this function, the current worker would trigger
- * unbound chain execution of pending work items if other workers
- * didn't already.
- */
- for_each_std_worker_pool(pool, cpu)
+ /*
+ * Sched callbacks are disabled now. Zap nr_running.
+ * After this, nr_running stays zero and need_more_worker()
+ * and keep_working() are always true as long as the worklist
+ * is not empty. This pool now behaves as an unbound pool
+ * (in terms of concurrency management) which is served by
+ * workers tied to the pool.
+ */
atomic_set(&pool->nr_running, 0);
+
+ /*
+ * Busy workers of this pool may go to sleep without waking
+ * another worker after %WORKER_UNBOUND is set. Wake up a
+ * worker here to kick off unbound chain execution of pending
+ * work items in that case.
+ */
+ spin_lock_irq(&pool->lock);
+ wake_up_worker(pool);
+ spin_unlock_irq(&pool->lock);
+ }
}
/*
--
1.7.7.6
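To summarize the restructuring, here is a pseudocode sketch (not the literal kernel code; set_unbound_and_disassociate() is an invented placeholder for the unbind body that runs under assoc_mutex):

```c
/*
 * Before: one pass sets WORKER_UNBOUND on every pool, a single
 * schedule() follows, and a second pass zaps nr_running.  Only the
 * pool running wq_unbind_fn() is guaranteed a kick-off afterwards.
 */
for_each_std_worker_pool(pool, cpu)
        set_unbound_and_disassociate(pool);     /* under assoc_mutex */
schedule();
for_each_std_worker_pool(pool, cpu)
        atomic_set(&pool->nr_running, 0);

/*
 * After: each pool is unbound, flushed past rq->lock, zapped and
 * explicitly woken before moving on to the next pool, so no pool
 * depends on the current worker for its kick-off.
 */
for_each_std_worker_pool(pool, cpu) {
        set_unbound_and_disassociate(pool);     /* under assoc_mutex */
        schedule();
        atomic_set(&pool->nr_running, 0);
        spin_lock_irq(&pool->lock);
        wake_up_worker(pool);
        spin_unlock_irq(&pool->lock);
}
```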
* Re: [PATCH] workqueue: fix possible bug which may silence the pool
2013-03-02 15:55 [PATCH] workqueue: fix possible bug which may silence the pool Lai Jiangshan
@ 2013-03-04 19:20 ` Tejun Heo
2013-03-05 1:17 ` Lai Jiangshan
2013-03-08 23:15 ` [PATCH wq/for-3.9-fixes] workqueue: fix possible pool stall bug in wq_unbind_fn() Tejun Heo
1 sibling, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2013-03-04 19:20 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: linux-kernel
Hello, Lai.
On Sat, Mar 02, 2013 at 11:55:29PM +0800, Lai Jiangshan wrote:
> After multiple pools per CPU were introduced, part of the comment
> in wq_unbind_fn() became wrong.
>
> It claimed that the "current worker would trigger unbound chain execution".
> That is no longer true: the current worker belongs to only one of the pools.
>
> If wq_unbind_fn() unbinds the normal_pri pool (not the pool of the current
> worker), the current worker is not an available worker to trigger unbound
> chain execution of the normal_pri pool. If all workers of
> the normal_pri pool go to sleep after %WORKER_UNBOUND is set but before
> they finish their current work, unbound chain execution is never triggered
> at all. The pool stalls!
>
> We could make wq_unbind_fn() unbind only one pool and launch multiple
> wq_unbind_fn()s, one per pool, to solve the problem,
> but that would add unnecessary latency to the hotplug path.
>
> So we choose to wake up a worker directly to trigger unbound chain execution.
>
> The current worker may sleep on &second_pool->assoc_mutex, so we also move
> the wakeup code into the loop, to keep the second_pool from silencing the first_pool.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Nice catch.
> @@ -3446,28 +3446,35 @@ static void wq_unbind_fn(struct work_struct *work)
>
> spin_unlock_irq(&pool->lock);
> mutex_unlock(&pool->assoc_mutex);
> - }
>
> - /*
> - * Call schedule() so that we cross rq->lock and thus can guarantee
> - * sched callbacks see the %WORKER_UNBOUND flag. This is necessary
> - * as scheduler callbacks may be invoked from other cpus.
> - */
> - schedule();
> + /*
> + * Call schedule() so that we cross rq->lock and thus can
> + * guarantee sched callbacks see the %WORKER_UNBOUND flag.
> + * This is necessary as scheduler callbacks may be invoked
> + * from other cpus.
> + */
> + schedule();
>
> - /*
> - * Sched callbacks are disabled now. Zap nr_running. After this,
> - * nr_running stays zero and need_more_worker() and keep_working()
> - * are always true as long as the worklist is not empty. Pools on
> - * @cpu now behave as unbound (in terms of concurrency management)
> - * pools which are served by workers tied to the CPU.
> - *
> - * On return from this function, the current worker would trigger
> - * unbound chain execution of pending work items if other workers
> - * didn't already.
> - */
> - for_each_std_worker_pool(pool, cpu)
> + /*
> + * Sched callbacks are disabled now. Zap nr_running.
> + * After this, nr_running stays zero and need_more_worker()
> + * and keep_working() are always true as long as the worklist
> + * is not empty. This pool now behaves as an unbound pool
> + * (in terms of concurrency management) which is served by
> + * workers tied to the pool.
> + */
> atomic_set(&pool->nr_running, 0);
> +
> + /*
> + * Busy workers of this pool may go to sleep without waking
> + * another worker after %WORKER_UNBOUND is set. Wake up a
> + * worker here to kick off unbound chain execution of pending
> + * work items in that case.
> + */
> + spin_lock_irq(&pool->lock);
> + wake_up_worker(pool);
> + spin_unlock_irq(&pool->lock);
> + }
But can we please just add wake_up_worker() in the
for_each_std_worker_pool() loop? We want to mark the patch for
-stable and keep it short and to the point. This patch is a couple
of times larger than necessary.
Thanks.
--
tejun
* Re: [PATCH] workqueue: fix possible bug which may silence the pool
2013-03-04 19:20 ` Tejun Heo
@ 2013-03-05 1:17 ` Lai Jiangshan
2013-03-05 18:04 ` Tejun Heo
0 siblings, 1 reply; 7+ messages in thread
From: Lai Jiangshan @ 2013-03-05 1:17 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-kernel
On 03/05/2013 03:20 AM, Tejun Heo wrote:
> Hello, Lai.
>
> On Sat, Mar 02, 2013 at 11:55:29PM +0800, Lai Jiangshan wrote:
>> After multiple pools per CPU were introduced, part of the comment
>> in wq_unbind_fn() became wrong.
>>
>> It claimed that the "current worker would trigger unbound chain execution".
>> That is no longer true: the current worker belongs to only one of the pools.
>>
>> If wq_unbind_fn() unbinds the normal_pri pool (not the pool of the current
>> worker), the current worker is not an available worker to trigger unbound
>> chain execution of the normal_pri pool. If all workers of
>> the normal_pri pool go to sleep after %WORKER_UNBOUND is set but before
>> they finish their current work, unbound chain execution is never triggered
>> at all. The pool stalls!
>>
>> We could make wq_unbind_fn() unbind only one pool and launch multiple
>> wq_unbind_fn()s, one per pool, to solve the problem,
>> but that would add unnecessary latency to the hotplug path.
>>
>> So we choose to wake up a worker directly to trigger unbound chain execution.
>>
>> The current worker may sleep on &second_pool->assoc_mutex, so we also move
>> the wakeup code into the loop, to keep the second_pool from silencing the first_pool.
>>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>
> Nice catch.
>
>> @@ -3446,28 +3446,35 @@ static void wq_unbind_fn(struct work_struct *work)
>>
>> spin_unlock_irq(&pool->lock);
>> mutex_unlock(&pool->assoc_mutex);
>> - }
>>
>> - /*
>> - * Call schedule() so that we cross rq->lock and thus can guarantee
>> - * sched callbacks see the %WORKER_UNBOUND flag. This is necessary
>> - * as scheduler callbacks may be invoked from other cpus.
>> - */
>> - schedule();
>> + /*
>> + * Call schedule() so that we cross rq->lock and thus can
>> + * guarantee sched callbacks see the %WORKER_UNBOUND flag.
>> + * This is necessary as scheduler callbacks may be invoked
>> + * from other cpus.
>> + */
>> + schedule();
>>
>> - /*
>> - * Sched callbacks are disabled now. Zap nr_running. After this,
>> - * nr_running stays zero and need_more_worker() and keep_working()
>> - * are always true as long as the worklist is not empty. Pools on
>> - * @cpu now behave as unbound (in terms of concurrency management)
>> - * pools which are served by workers tied to the CPU.
>> - *
>> - * On return from this function, the current worker would trigger
>> - * unbound chain execution of pending work items if other workers
>> - * didn't already.
>> - */
>> - for_each_std_worker_pool(pool, cpu)
>> + /*
>> + * Sched callbacks are disabled now. Zap nr_running.
>> + * After this, nr_running stays zero and need_more_worker()
>> + * and keep_working() are always true as long as the worklist
>> + * is not empty. This pool now behaves as an unbound pool
>> + * (in terms of concurrency management) which is served by
>> + * workers tied to the pool.
>> + */
>> atomic_set(&pool->nr_running, 0);
>> +
>> + /*
>> + * Busy workers of this pool may go to sleep without waking
>> + * another worker after %WORKER_UNBOUND is set. Wake up a
>> + * worker here to kick off unbound chain execution of pending
>> + * work items in that case.
>> + */
>> + spin_lock_irq(&pool->lock);
>> + wake_up_worker(pool);
>> + spin_unlock_irq(&pool->lock);
>> + }
>
> But can we please just add wake_up_worker() in the
> for_each_std_worker_pool() loop?
wake_up_worker() needs to be put in the same loop that sets %WORKER_UNBOUND.
mutex_lock(&pool->assoc_mutex);
set %WORKER_UNBOUND for the normal_pri pool
mutex_unlock(&pool->assoc_mutex);
// no wakeup for the normal_pri pool,
// but all workers of the normal_pri pool go to sleep
// try to set %WORKER_UNBOUND for the high_pri pool
mutex_lock(&pool->assoc_mutex);
waits forever here: the high_pri pool's manage_workers()
waits forever on memory allocation (waiting for the normal_pri
pool to free memory, but the normal_pri pool is silenced)
mutex_unlock(&pool->assoc_mutex);
> We want to mark the patch for
> -stable and keep it short and to the point. This patch is a couple
> times larger than necessary.
>
> Thanks.
>
* Re: [PATCH] workqueue: fix possible bug which may silence the pool
2013-03-05 1:17 ` Lai Jiangshan
@ 2013-03-05 18:04 ` Tejun Heo
0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2013-03-05 18:04 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: linux-kernel
Hello, Lai.
On Tue, Mar 05, 2013 at 09:17:16AM +0800, Lai Jiangshan wrote:
> > But can we please just add wake_up_worker() in the
> > for_each_std_worker_pool() loop?
>
> wake_up_worker() needs to be put in the same loop that sets %WORKER_UNBOUND.
>
> mutex_lock(&pool->assoc_mutex);
> set %WORKER_UNBOUND for the normal_pri pool
> mutex_unlock(&pool->assoc_mutex);
>
> // no wakeup for the normal_pri pool,
> // but all workers of the normal_pri pool go to sleep
>
> // try to set %WORKER_UNBOUND for the high_pri pool
> mutex_lock(&pool->assoc_mutex);
> waits forever here: the high_pri pool's manage_workers()
> waits forever on memory allocation (waiting for the normal_pri
> pool to free memory, but the normal_pri pool is silenced)
> mutex_unlock(&pool->assoc_mutex);
Hmmm.... I see. Can you please separate it into two patches then?
One to restructure the loop into one, and the other to add the wakeup
and a comment explaining why the loop has to be per-pool?
Thanks.
--
tejun
* [PATCH wq/for-3.9-fixes] workqueue: fix possible pool stall bug in wq_unbind_fn()
2013-03-02 15:55 [PATCH] workqueue: fix possible bug which may silence the pool Lai Jiangshan
2013-03-04 19:20 ` Tejun Heo
@ 2013-03-08 23:15 ` Tejun Heo
2013-03-11 16:09 ` Lai Jiangshan
1 sibling, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2013-03-08 23:15 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: linux-kernel
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Since multiple pools per cpu have been introduced, wq_unbind_fn() has
a subtle bug which may theoretically stall work item processing. The
problem is two-fold.
* wq_unbind_fn() depends on the worker executing wq_unbind_fn() itself
to start unbound chain execution, which works fine when there was
only a single pool. With multiple pools, only the pool which is
running wq_unbind_fn() - the highpri one - is guaranteed to have
such kick-off. The other pool could stall when its busy workers
block.
* The current code is setting WORKER_UNBIND / POOL_DISASSOCIATED of
the two pools in succession without initiating work execution
in between. Because setting the flags requires grabbing assoc_mutex
which is held while new workers are created, this could lead to
stalls if a pool's manager is waiting for the previous pool's work
items to release memory. This is almost purely theoretical tho.
Update wq_unbind_fn() such that it sets WORKER_UNBIND /
POOL_DISASSOCIATED, goes over schedule() and explicitly kicks off
execution for a pool and then moves on to the next one.
tj: Updated comments and description.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
---
As you seem to have disappeared, I just fixed up this patch and
applied it to wq/for-3.9-fixes.
Thanks.
kernel/workqueue.c | 44 +++++++++++++++++++++++++-------------------
1 file changed, 25 insertions(+), 19 deletions(-)
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3446,28 +3446,34 @@ static void wq_unbind_fn(struct work_str
spin_unlock_irq(&pool->lock);
mutex_unlock(&pool->assoc_mutex);
- }
- /*
- * Call schedule() so that we cross rq->lock and thus can guarantee
- * sched callbacks see the %WORKER_UNBOUND flag. This is necessary
- * as scheduler callbacks may be invoked from other cpus.
- */
- schedule();
+ /*
+ * Call schedule() so that we cross rq->lock and thus can
+ * guarantee sched callbacks see the %WORKER_UNBOUND flag.
+ * This is necessary as scheduler callbacks may be invoked
+ * from other cpus.
+ */
+ schedule();
- /*
- * Sched callbacks are disabled now. Zap nr_running. After this,
- * nr_running stays zero and need_more_worker() and keep_working()
- * are always true as long as the worklist is not empty. Pools on
- * @cpu now behave as unbound (in terms of concurrency management)
- * pools which are served by workers tied to the CPU.
- *
- * On return from this function, the current worker would trigger
- * unbound chain execution of pending work items if other workers
- * didn't already.
- */
- for_each_std_worker_pool(pool, cpu)
+ /*
+ * Sched callbacks are disabled now. Zap nr_running.
+ * After this, nr_running stays zero and need_more_worker()
+ * and keep_working() are always true as long as the
+ * worklist is not empty. This pool now behaves as an
+ * unbound (in terms of concurrency management) pool which
+ * is served by workers tied to the pool.
+ */
atomic_set(&pool->nr_running, 0);
+
+ /*
+ * With concurrency management just turned off, a busy
+ * worker blocking could lead to lengthy stalls. Kick off
+ * unbound chain execution of currently pending work items.
+ */
+ spin_lock_irq(&pool->lock);
+ wake_up_worker(pool);
+ spin_unlock_irq(&pool->lock);
+ }
}
/*
* Re: [PATCH wq/for-3.9-fixes] workqueue: fix possible pool stall bug in wq_unbind_fn()
2013-03-08 23:15 ` [PATCH wq/for-3.9-fixes] workqueue: fix possible pool stall bug in wq_unbind_fn() Tejun Heo
@ 2013-03-11 16:09 ` Lai Jiangshan
2013-03-11 16:24 ` Tejun Heo
0 siblings, 1 reply; 7+ messages in thread
From: Lai Jiangshan @ 2013-03-11 16:09 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-kernel, Linus Torvalds
Hi, Tejun,
Forgot to send a pull-request?
Add CC Linus.
Thanks,
Lai
On 09/03/13 07:15, Tejun Heo wrote:
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>
> Since multiple pools per cpu have been introduced, wq_unbind_fn() has
> a subtle bug which may theoretically stall work item processing. The
> problem is two-fold.
>
> * wq_unbind_fn() depends on the worker executing wq_unbind_fn() itself
> to start unbound chain execution, which works fine when there was
> only a single pool. With multiple pools, only the pool which is
> running wq_unbind_fn() - the highpri one - is guaranteed to have
> such kick-off. The other pool could stall when its busy workers
> block.
>
> * The current code is setting WORKER_UNBIND / POOL_DISASSOCIATED of
> the two pools in succession without initiating work execution
> in between. Because setting the flags requires grabbing assoc_mutex
> which is held while new workers are created, this could lead to
> stalls if a pool's manager is waiting for the previous pool's work
> items to release memory. This is almost purely theoretical tho.
>
> Update wq_unbind_fn() such that it sets WORKER_UNBIND /
> POOL_DISASSOCIATED, goes over schedule() and explicitly kicks off
> execution for a pool and then moves on to the next one.
>
> tj: Updated comments and description.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: stable@vger.kernel.org
> ---
> As you seem to have disappeared, I just fixed up this patch and
> applied it to wq/for-3.9-fixes.
>
> Thanks.
>
> kernel/workqueue.c | 44 +++++++++++++++++++++++++-------------------
> 1 file changed, 25 insertions(+), 19 deletions(-)
>
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3446,28 +3446,34 @@ static void wq_unbind_fn(struct work_str
>
> spin_unlock_irq(&pool->lock);
> mutex_unlock(&pool->assoc_mutex);
> - }
>
> - /*
> - * Call schedule() so that we cross rq->lock and thus can guarantee
> - * sched callbacks see the %WORKER_UNBOUND flag. This is necessary
> - * as scheduler callbacks may be invoked from other cpus.
> - */
> - schedule();
> + /*
> + * Call schedule() so that we cross rq->lock and thus can
> + * guarantee sched callbacks see the %WORKER_UNBOUND flag.
> + * This is necessary as scheduler callbacks may be invoked
> + * from other cpus.
> + */
> + schedule();
>
> - /*
> - * Sched callbacks are disabled now. Zap nr_running. After this,
> - * nr_running stays zero and need_more_worker() and keep_working()
> - * are always true as long as the worklist is not empty. Pools on
> - * @cpu now behave as unbound (in terms of concurrency management)
> - * pools which are served by workers tied to the CPU.
> - *
> - * On return from this function, the current worker would trigger
> - * unbound chain execution of pending work items if other workers
> - * didn't already.
> - */
> - for_each_std_worker_pool(pool, cpu)
> + /*
> + * Sched callbacks are disabled now. Zap nr_running.
> + * After this, nr_running stays zero and need_more_worker()
> + * and keep_working() are always true as long as the
> + * worklist is not empty. This pool now behaves as an
> + * unbound (in terms of concurrency management) pool which
> + * is served by workers tied to the pool.
> + */
> atomic_set(&pool->nr_running, 0);
> +
> + /*
> + * With concurrency management just turned off, a busy
> + * worker blocking could lead to lengthy stalls. Kick off
> + * unbound chain execution of currently pending work items.
> + */
> + spin_lock_irq(&pool->lock);
> + wake_up_worker(pool);
> + spin_unlock_irq(&pool->lock);
> + }
> }
>
> /*
>
* Re: [PATCH wq/for-3.9-fixes] workqueue: fix possible pool stall bug in wq_unbind_fn()
2013-03-11 16:09 ` Lai Jiangshan
@ 2013-03-11 16:24 ` Tejun Heo
0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2013-03-11 16:24 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: linux-kernel, Linus Torvalds
Hey, Lai.
On Tue, Mar 12, 2013 at 12:09:47AM +0800, Lai Jiangshan wrote:
> Hi, Tejun,
>
> Forgot to send a pull-request?
> Add CC Linus.
Ummm... I (and maintainers in general) don't send a pull request
whenever I queue a fix patch. I usually let fixes sit in the tree for
some weeks before sending a pull request to Linus, so that they
get some exposure in -next and more of them can be aggregated. As
workqueue is seeing a high level of activity these days, it's likely
that we'll see more fixes coming out of it, and your fix, while marked
for -stable, isn't that urgent, so I'm gonna wait for some more weeks
before sending out a pull request to Linus.
Thanks for checking tho.
--
tejun