linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2 V7 for-3.6-fixes] workqueue: add POOL_MANAGING_WORKERS
@ 2012-09-10  2:11 Lai Jiangshan
  2012-09-10  2:11 ` [PATCH 2/2 V7 for-3.6-fixes] workqueue: fix idle worker depletion Lai Jiangshan
  2012-09-10 17:08 ` [PATCH wq/for-3.6-fixes 1/2] workqueue: restore POOL_MANAGING_WORKERS Tejun Heo
  0 siblings, 2 replies; 4+ messages in thread
From: Lai Jiangshan @ 2012-09-10  2:11 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

When hotplug happens, the plug code will also grab the manager_mutex,
it will break too_many_workers()'s assumption, and make too_many_workers()
ugly(kick the timer wrongly, no found bug).

To avoid assumption-coruption, we add the original POOL_MANAGING_WORKERS back.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index dc7b845..383548e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -66,6 +66,7 @@ enum {
 
 	/* pool flags */
 	POOL_MANAGE_WORKERS	= 1 << 0,	/* need to manage workers */
+	POOL_MANAGING_WORKERS   = 1 << 1,       /* managing workers */
 
 	/* worker flags */
 	WORKER_STARTED		= 1 << 0,	/* started */
@@ -652,7 +653,7 @@ static bool need_to_manage_workers(struct worker_pool *pool)
 /* Do we have too many workers and should some go away? */
 static bool too_many_workers(struct worker_pool *pool)
 {
-	bool managing = mutex_is_locked(&pool->manager_mutex);
+	bool managing = pool->flags & POOL_MANAGING_WORKERS;
 	int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
 	int nr_busy = pool->nr_workers - nr_idle;
 
@@ -1827,6 +1828,7 @@ static bool manage_workers(struct worker *worker)
 	if (!mutex_trylock(&pool->manager_mutex))
 		return ret;
 
+	pool->flags |= POOL_MANAGING_WORKERS;
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
@@ -1836,6 +1838,7 @@ static bool manage_workers(struct worker *worker)
 	ret |= maybe_destroy_workers(pool);
 	ret |= maybe_create_worker(pool);
 
+	pool->flags &= ~POOL_MANAGING_WORKERS;
 	mutex_unlock(&pool->manager_mutex);
 	return ret;
 }
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/2 V7 for-3.6-fixes] workqueue: fix idle worker depletion
  2012-09-10  2:11 [PATCH 1/2 V7 for-3.6-fixes] workqueue: add POOL_MANAGING_WORKERS Lai Jiangshan
@ 2012-09-10  2:11 ` Lai Jiangshan
  2012-09-10 17:11   ` [PATCH wq/for-3.6-fixes 2/2] workqueue: fix possible idle worker depletion across CPU hotplug Tejun Heo
  2012-09-10 17:08 ` [PATCH wq/for-3.6-fixes 1/2] workqueue: restore POOL_MANAGING_WORKERS Tejun Heo
  1 sibling, 1 reply; 4+ messages in thread
From: Lai Jiangshan @ 2012-09-10  2:11 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

If hotplug code grabbed the manager_mutex and worker_thread try to create
a worker, the manage_worker() will return false and worker_thread go to
process work items. Now, on the CPU, all workers are processing work items,
no idle_worker left/ready for managing. It breaks the concept of workqueue
and it is bug.

So when manage_worker() failed to grab the manager_mutex, it should
release gcwq->lock and then grab manager_mutex.

After gcwq->lock is released, hotplug can happen. but the hoplug code
can't unbind/rebind the manager, so the manager should try to rebind
itself unconditionaly, if it fails, unbind itself.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   31 ++++++++++++++++++++++++++++++-
 1 files changed, 30 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 383548e..74434c8 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1825,10 +1825,39 @@ static bool manage_workers(struct worker *worker)
 	struct worker_pool *pool = worker->pool;
 	bool ret = false;
 
-	if (!mutex_trylock(&pool->manager_mutex))
+	if (pool->flags & POOL_MANAGING_WORKERS)
 		return ret;
 
 	pool->flags |= POOL_MANAGING_WORKERS;
+
+	if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
+		/*
+		 * Ouch! rebind_workers() or gcwq_unbind_fn() beats it.
+		 * it can't return false here, otherwise it will lead to
+		 * worker depletion. So we release gcwq->lock and then
+		 * grab manager_mutex again.
+		 */
+		spin_unlock_irq(&pool->gcwq->lock);
+		mutex_lock(&pool->manager_mutex);
+
+		/*
+		 * The hotplug had happened after the previous releasing
+		 * of gcwq->lock. So we can't assume that this worker is
+		 * still associated or not. And we have to try to rebind it
+		 * via worker_maybe_bind_and_lock(). If it returns false,
+		 * we can conclude that the whole gcwq is disassociated,
+		 * and we must unbind this worker. (hotplug code can't
+		 * unbind/rebind the manager, because hotplug code can't
+		 * iterate the manager)
+		 */
+		if (worker_maybe_bind_and_lock(worker))
+			worker->flags &= ~WORKER_UNBOUND;
+		else
+			worker->flags |= WORKER_UNBOUND;
+
+		ret = true;
+	}
+
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH wq/for-3.6-fixes 1/2] workqueue: restore POOL_MANAGING_WORKERS
  2012-09-10  2:11 [PATCH 1/2 V7 for-3.6-fixes] workqueue: add POOL_MANAGING_WORKERS Lai Jiangshan
  2012-09-10  2:11 ` [PATCH 2/2 V7 for-3.6-fixes] workqueue: fix idle worker depletion Lai Jiangshan
@ 2012-09-10 17:08 ` Tejun Heo
  1 sibling, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2012-09-10 17:08 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

>From 552a37e9360a293cd20e7f8ff1fb326a244c5f1e Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Mon, 10 Sep 2012 10:03:33 -0700

This patch restores POOL_MANAGING_WORKERS which was replaced by
pool->manager_mutex by 6037315269 "workqueue: use mutex for global_cwq
manager exclusion".

There's a subtle idle worker depletion bug across CPU hotplug events
and we need to distinguish an actual manager and CPU hotplug
preventing management.  POOL_MANAGING_WORKERS will be used for the
former and manager_mutex the later.

This patch just lays POOL_MANAGING_WORKERS on top of the existing
manager_mutex and doesn't introduce any synchronization changes.  The
next patch will update it.

Note that this patch fixes a non-critical anomaly where
too_many_workers() may return %true spuriously while CPU hotplug is in
progress.  While the issue could schedule idle timer spuriously, it
didn't trigger any actual misbehavior.

tj: Rewrote patch description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
Applied to wq/for-3.6-fixes with rewritten patch description to
explain why it's being restored (to fix idle worker depletion across
CPU hotplug).

Thanks!

 kernel/workqueue.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index dc7b845..383548e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -66,6 +66,7 @@ enum {
 
 	/* pool flags */
 	POOL_MANAGE_WORKERS	= 1 << 0,	/* need to manage workers */
+	POOL_MANAGING_WORKERS   = 1 << 1,       /* managing workers */
 
 	/* worker flags */
 	WORKER_STARTED		= 1 << 0,	/* started */
@@ -652,7 +653,7 @@ static bool need_to_manage_workers(struct worker_pool *pool)
 /* Do we have too many workers and should some go away? */
 static bool too_many_workers(struct worker_pool *pool)
 {
-	bool managing = mutex_is_locked(&pool->manager_mutex);
+	bool managing = pool->flags & POOL_MANAGING_WORKERS;
 	int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
 	int nr_busy = pool->nr_workers - nr_idle;
 
@@ -1827,6 +1828,7 @@ static bool manage_workers(struct worker *worker)
 	if (!mutex_trylock(&pool->manager_mutex))
 		return ret;
 
+	pool->flags |= POOL_MANAGING_WORKERS;
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
@@ -1836,6 +1838,7 @@ static bool manage_workers(struct worker *worker)
 	ret |= maybe_destroy_workers(pool);
 	ret |= maybe_create_worker(pool);
 
+	pool->flags &= ~POOL_MANAGING_WORKERS;
 	mutex_unlock(&pool->manager_mutex);
 	return ret;
 }
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH wq/for-3.6-fixes 2/2] workqueue: fix possible idle worker depletion across CPU hotplug
  2012-09-10  2:11 ` [PATCH 2/2 V7 for-3.6-fixes] workqueue: fix idle worker depletion Lai Jiangshan
@ 2012-09-10 17:11   ` Tejun Heo
  0 siblings, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2012-09-10 17:11 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

>From ee378aa49b594da9bda6a2c768cc5b2ad585f911 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Date: Mon, 10 Sep 2012 10:03:44 -0700

To simplify both normal and CPU hotplug paths, worker management is
prevented while CPU hoplug is in progress.  This is achieved by CPU
hotplug holding the same exclusion mechanism used by workers to ensure
there's only one manager per pool.

If someone else seems to be performing the manager role, workers
proceed to execute work items.  CPU hotplug using the same mechanism
can lead to idle worker depletion because all workers could proceed to
execute work items while CPU hotplug is in progress and CPU hotplug
itself wouldn't actually perform the worker management duty - it
doesn't guarantee that there's an idle worker left when it releases
management.

This idle worker depletion, under extreme circumstances, can break
forward-progress guarantee and thus lead to deadlock.

This patch fixes the bug by using separate mechanisms for manager
exclusion among workers and hotplug exclusion.  For manager exclusion,
POOL_MANAGING_WORKERS which was restored by the previous patch is
used.  pool->manager_mutex is now only used for exclusion between the
elected manager and CPU hotplug.  The elected manager won't proceed
without holding pool->manager_mutex.

This ensures that the worker which won the manager position can't skip
managing while CPU hotplug is in progress.  It will block on
manager_mutex and perform management after CPU hotplug is complete.

Note that hotplug may happen while waiting for manager_mutex.  A
manager isn't either on idle or busy list and thus the hoplug code
can't unbind/rebind it.  Make the manager handle its own un/rebinding.

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
Applied to wq/for-3.6-fixes w/ comment and patch description updated.
Sorry about hacking up your comment / description but this is a tricky
issue and I thought a bit more verbosity is a good idea.

Thanks a lot for fixing the issue and the persistence!

 kernel/workqueue.c |   37 ++++++++++++++++++++++++++++++++++++-
 1 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 383548e..1e1373b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1825,10 +1825,45 @@ static bool manage_workers(struct worker *worker)
 	struct worker_pool *pool = worker->pool;
 	bool ret = false;
 
-	if (!mutex_trylock(&pool->manager_mutex))
+	if (pool->flags & POOL_MANAGING_WORKERS)
 		return ret;
 
 	pool->flags |= POOL_MANAGING_WORKERS;
+
+	/*
+	 * To simplify both worker management and CPU hotplug, hold off
+	 * management while hotplug is in progress.  CPU hotplug path can't
+	 * grab %POOL_MANAGING_WORKERS to achieve this because that can
+	 * lead to idle worker depletion (all become busy thinking someone
+	 * else is managing) which in turn can result in deadlock under
+	 * extreme circumstances.  Use @pool->manager_mutex to synchronize
+	 * manager against CPU hotplug.
+	 *
+	 * manager_mutex would always be free unless CPU hotplug is in
+	 * progress.  trylock first without dropping @gcwq->lock.
+	 */
+	if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
+		spin_unlock_irq(&pool->gcwq->lock);
+		mutex_lock(&pool->manager_mutex);
+		/*
+		 * CPU hotplug could have happened while we were waiting
+		 * for manager_mutex.  Hotplug itself can't handle us
+		 * because manager isn't either on idle or busy list, and
+		 * @gcwq's state and ours could have deviated.
+		 *
+		 * As hotplug is now excluded via manager_mutex, we can
+		 * simply try to bind.  It will succeed or fail depending
+		 * on @gcwq's current state.  Try it and adjust
+		 * %WORKER_UNBOUND accordingly.
+		 */
+		if (worker_maybe_bind_and_lock(worker))
+			worker->flags &= ~WORKER_UNBOUND;
+		else
+			worker->flags |= WORKER_UNBOUND;
+
+		ret = true;
+	}
+
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-09-10 17:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-10  2:11 [PATCH 1/2 V7 for-3.6-fixes] workqueue: add POOL_MANAGING_WORKERS Lai Jiangshan
2012-09-10  2:11 ` [PATCH 2/2 V7 for-3.6-fixes] workqueue: fix idle worker depletion Lai Jiangshan
2012-09-10 17:11   ` [PATCH wq/for-3.6-fixes 2/2] workqueue: fix possible idle worker depletion across CPU hotplug Tejun Heo
2012-09-10 17:08 ` [PATCH wq/for-3.6-fixes 1/2] workqueue: restore POOL_MANAGING_WORKERS Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).