linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/11 V5] workqueue: reimplement unbind/rebind
@ 2012-09-05 10:37 Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 01/11 V5] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

Patch 2 a different way to fix deadlock problem. it removes 50 LOC !!
Patch 3-4 cleanup rebind

Patch 5-9 prepare for patch 10
Patch 10  unbind/rebind without manager_mutex, fix bug hotplug VS manage

	The first core algorithm for 5-10 is detecting the changes of binding
	by GCWQ_DISASSOCIATED bit and WORKER_UNBOUND bit.
	The second core algorithm is exile-operation. Patch2,9


Patch 11  cleanup manager.


Patch 1 accepted, just resent.

Changed from V4:
Give up to make manage_mutex safer, remove it instead.

Lai Jiangshan (11):
  workqueue: ensure the wq_worker_sleeping() see the      right flags
  async idle rebinding
  new day don't need WORKER_REBIND for busy worker
  remove WORKER_REBIND
  Add @bind arguement back
  unbind manager
  rebind manager
  unbind newly created worker
  rebind newly created worker
  unbind/rebind without manager_mutex
  remove manager_mutex

 kernel/workqueue.c |  270 ++++++++++++++++++++++++----------------------------
 1 files changed, 124 insertions(+), 146 deletions(-)

-- 
1.7.4.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 01/11 V5] workqueue: ensure the wq_worker_sleeping() see the  right flags
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 02/11 V5] workqueue: async idle rebinding Lai Jiangshan
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

The compiler may compile this code into TWO write/modify instructions.
		worker->flags &= ~WORKER_UNBOUND;
		worker->flags |= WORKER_REBIND;

so the other CPU may see the temporary of worker->flags which has
not WORKER_UNBOUND nor WORKER_REBIND, it will wrongly do local wake up.

so we use one write/modify instruction explicitly instead.

This bug will not occur on idle workers, because they have another
WORKER_NOT_RUNNING flags.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 85bd340..050b2a5 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1739,10 +1739,13 @@ retry:
 	for_each_busy_worker(worker, i, pos, gcwq) {
 		struct work_struct *rebind_work = &worker->rebind_work;
 		struct workqueue_struct *wq;
+		unsigned long worker_flags = worker->flags;
 
 		/* morph UNBOUND to REBIND */
-		worker->flags &= ~WORKER_UNBOUND;
-		worker->flags |= WORKER_REBIND;
+		worker_flags &= ~WORKER_UNBOUND;
+		worker_flags |= WORKER_REBIND;
+		/* ensure the wq_worker_sleeping() see the right flags */
+		ACCESS_ONCE(worker->flags) = worker_flags;
 
 		if (test_and_set_bit(WORK_STRUCT_PENDING_BIT,
 				     work_data_bits(rebind_work)))
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 02/11 V5] workqueue: async idle rebinding
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 01/11 V5] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 18:06   ` Tejun Heo
  2012-09-05 10:37 ` [PATCH 03/11 V5] workqueue: new day don't need WORKER_REBIND for busy rebinding Lai Jiangshan
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

fix deadlock in rebind_workers()

Current idle_worker_rebind() has a bug.

idle_worker_rebind() path			HOTPLUG path
						online
							rebind_workers()
wait_event(gcwq->rebind_hold)
	woken up but no scheduled			rebind_workers() returns
						the same cpu offline
						the same cpu online again
							rebind_workers()
								set WORKER_REBIND
	scheduled,see the WORKER_REBIND
	wait rebind_workers() clear it    <--bug-->		wait idle_worker_rebind()
								rebound.

The two thread wait each other. It is bug.

This patch gives up to rebind idles synchronously. make them async instead.

To avoid to wrongly do local-wake-up, we add a exile-operation for
idle-worker-rebinding. When a idle worker is exiled, it will not queued
on @idle_list until it is rebound.

After we have exile-operation, the @nr_idle is not only the count of @idle_list,
but also exiled idle workers. so I check all the code, and make
them exile-operation aware. (too_many_workers())

exile-operation is also the core idea to rebind newly created worker.
(patch 9)

rebind_workers() become single pass and don't release gcwq->lock.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |  100 +++++++++++++---------------------------------------
 1 files changed, 25 insertions(+), 75 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 050b2a5..3dd7ce2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -125,7 +125,6 @@ enum {
 
 struct global_cwq;
 struct worker_pool;
-struct idle_rebind;
 
 /*
  * The poor guys doing the actual heavy lifting.  All on-duty workers
@@ -149,7 +148,6 @@ struct worker {
 	int			id;		/* I: worker id */
 
 	/* for rebinding worker to CPU */
-	struct idle_rebind	*idle_rebind;	/* L: for idle worker */
 	struct work_struct	rebind_work;	/* L: for busy worker */
 };
 
@@ -159,7 +157,9 @@ struct worker_pool {
 
 	struct list_head	worklist;	/* L: list of pending works */
 	int			nr_workers;	/* L: total number of workers */
-	int			nr_idle;	/* L: currently idle ones */
+	int			nr_idle;	/* L: currently idle ones,
+						      include ones in idle_list
+						      and in doing rebind. */
 
 	struct list_head	idle_list;	/* X: list of idle workers */
 	struct timer_list	idle_timer;	/* L: worker idle timeout */
@@ -185,8 +185,6 @@ struct global_cwq {
 
 	struct worker_pool	pools[NR_WORKER_POOLS];
 						/* normal and highpri pools */
-
-	wait_queue_head_t	rebind_hold;	/* rebind hold wait */
 } ____cacheline_aligned_in_smp;
 
 /*
@@ -686,6 +684,10 @@ static bool too_many_workers(struct worker_pool *pool)
 	int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
 	int nr_busy = pool->nr_workers - nr_idle;
 
+	/* Is any idle home ? */
+	if (unlikely(list_empty(&pool->idle_list)))
+		return false;
+
 	return nr_idle > 2 && (nr_idle - 2) * MAX_IDLE_WORKERS_RATIO >= nr_busy;
 }
 
@@ -1608,28 +1610,20 @@ __acquires(&gcwq->lock)
 	}
 }
 
-struct idle_rebind {
-	int			cnt;		/* # workers to be rebound */
-	struct completion	done;		/* all workers rebound */
-};
-
 /*
- * Rebind an idle @worker to its CPU.  During CPU onlining, this has to
- * happen synchronously for idle workers.  worker_thread() will test
+ * Rebind an idle @worker to its CPU. worker_thread() will test
  * %WORKER_REBIND before leaving idle and call this function.
  */
 static void idle_worker_rebind(struct worker *worker)
 {
 	struct global_cwq *gcwq = worker->pool->gcwq;
 
-	/* CPU must be online at this point */
-	WARN_ON(!worker_maybe_bind_and_lock(worker));
-	if (!--worker->idle_rebind->cnt)
-		complete(&worker->idle_rebind->done);
-	spin_unlock_irq(&worker->pool->gcwq->lock);
+	if (worker_maybe_bind_and_lock(worker))
+		worker_clr_flags(worker, WORKER_UNBOUND);
 
-	/* we did our part, wait for rebind_workers() to finish up */
-	wait_event(gcwq->rebind_hold, !(worker->flags & WORKER_REBIND));
+	worker_clr_flags(worker, WORKER_REBIND);
+	list_add(&worker->entry, &worker->pool->idle_list);
+	spin_unlock_irq(&gcwq->lock);
 }
 
 /*
@@ -1656,29 +1650,22 @@ static void busy_worker_rebind_fn(struct work_struct *work)
  * @gcwq->cpu is coming online.  Rebind all workers to the CPU.  Rebinding
  * is different for idle and busy ones.
  *
- * The idle ones should be rebound synchronously and idle rebinding should
- * be complete before any worker starts executing work items with
- * concurrency management enabled; otherwise, scheduler may oops trying to
- * wake up non-local idle worker from wq_worker_sleeping().
- *
- * This is achieved by repeatedly requesting rebinding until all idle
- * workers are known to have been rebound under @gcwq->lock and holding all
- * idle workers from becoming busy until idle rebinding is complete.
+ * The idle ones will be kicked out of the idle_list, and it will
+ * add itself back when it finish to rebind.
  *
- * Once idle workers are rebound, busy workers can be rebound as they
- * finish executing their current work items.  Queueing the rebind work at
- * the head of their scheduled lists is enough.  Note that nr_running will
+ * The busy workers can be rebound as they finish executing
+ * their current work items.  Queueing the rebind work at the head of
+ * their scheduled lists is enough.  Note that nr_running will
  * be properbly bumped as busy workers rebind.
  *
- * On return, all workers are guaranteed to either be bound or have rebind
- * work item scheduled.
+ * On return, all workers are guaranteed to be rebound when they
+ * add themselves back to idle_list if the gcwq is still associated.
  */
 static void rebind_workers(struct global_cwq *gcwq)
 	__releases(&gcwq->lock) __acquires(&gcwq->lock)
 {
-	struct idle_rebind idle_rebind;
 	struct worker_pool *pool;
-	struct worker *worker;
+	struct worker *worker, *n;
 	struct hlist_node *pos;
 	int i;
 
@@ -1687,54 +1674,19 @@ static void rebind_workers(struct global_cwq *gcwq)
 	for_each_worker_pool(pool, gcwq)
 		lockdep_assert_held(&pool->manager_mutex);
 
-	/*
-	 * Rebind idle workers.  Interlocked both ways.  We wait for
-	 * workers to rebind via @idle_rebind.done.  Workers will wait for
-	 * us to finish up by watching %WORKER_REBIND.
-	 */
-	init_completion(&idle_rebind.done);
-retry:
-	idle_rebind.cnt = 1;
-	INIT_COMPLETION(idle_rebind.done);
-
-	/* set REBIND and kick idle ones, we'll wait for these later */
+	/* set REBIND and kick idle ones */
 	for_each_worker_pool(pool, gcwq) {
-		list_for_each_entry(worker, &pool->idle_list, entry) {
-			if (worker->flags & WORKER_REBIND)
-				continue;
-
-			/* morph UNBOUND to REBIND */
-			worker->flags &= ~WORKER_UNBOUND;
+		list_for_each_entry_safe(worker, n, &pool->idle_list, entry) {
 			worker->flags |= WORKER_REBIND;
 
-			idle_rebind.cnt++;
-			worker->idle_rebind = &idle_rebind;
+			/* exile idle workers */
+			list_del_init(&worker->entry);
 
 			/* worker_thread() will call idle_worker_rebind() */
 			wake_up_process(worker->task);
 		}
 	}
 
-	if (--idle_rebind.cnt) {
-		spin_unlock_irq(&gcwq->lock);
-		wait_for_completion(&idle_rebind.done);
-		spin_lock_irq(&gcwq->lock);
-		/* busy ones might have become idle while waiting, retry */
-		goto retry;
-	}
-
-	/*
-	 * All idle workers are rebound and waiting for %WORKER_REBIND to
-	 * be cleared inside idle_worker_rebind().  Clear and release.
-	 * Clearing %WORKER_REBIND from this foreign context is safe
-	 * because these workers are still guaranteed to be idle.
-	 */
-	for_each_worker_pool(pool, gcwq)
-		list_for_each_entry(worker, &pool->idle_list, entry)
-			worker->flags &= ~WORKER_REBIND;
-
-	wake_up_all(&gcwq->rebind_hold);
-
 	/* rebind busy workers */
 	for_each_busy_worker(worker, i, pos, gcwq) {
 		struct work_struct *rebind_work = &worker->rebind_work;
@@ -3811,8 +3763,6 @@ static int __init init_workqueues(void)
 			mutex_init(&pool->manager_mutex);
 			ida_init(&pool->worker_ida);
 		}
-
-		init_waitqueue_head(&gcwq->rebind_hold);
 	}
 
 	/* create the initial worker */
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 03/11 V5] workqueue: new day don't need WORKER_REBIND for busy rebinding
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 01/11 V5] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 02/11 V5] workqueue: async idle rebinding Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 18:31   ` Tejun Heo
  2012-09-05 10:37 ` [PATCH 04/11 V5] workqueue: remove WORKER_REBIND Lai Jiangshan
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

because old busy_worker_rebind_fn() have to wait until all idle worker finish.
so we have to use two flags WORKER_UNBOUND and WORKER_REBIND to avoid
prematurely clear all NOT_RUNNING bit when highly frequent offline/online.

but current code don't need to wait idle workers. so we don't need to
use two flags, just one is enough. remove WORKER_REBIND from busy rebinding.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    9 +--------
 1 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3dd7ce2..ba0ba33 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1638,7 +1638,7 @@ static void busy_worker_rebind_fn(struct work_struct *work)
 	struct global_cwq *gcwq = worker->pool->gcwq;
 
 	if (worker_maybe_bind_and_lock(worker))
-		worker_clr_flags(worker, WORKER_REBIND);
+		worker_clr_flags(worker, WORKER_UNBOUND);
 
 	spin_unlock_irq(&gcwq->lock);
 }
@@ -1691,13 +1691,6 @@ static void rebind_workers(struct global_cwq *gcwq)
 	for_each_busy_worker(worker, i, pos, gcwq) {
 		struct work_struct *rebind_work = &worker->rebind_work;
 		struct workqueue_struct *wq;
-		unsigned long worker_flags = worker->flags;
-
-		/* morph UNBOUND to REBIND */
-		worker_flags &= ~WORKER_UNBOUND;
-		worker_flags |= WORKER_REBIND;
-		/* ensure the wq_worker_sleeping() see the right flags */
-		ACCESS_ONCE(worker->flags) = worker_flags;
 
 		if (test_and_set_bit(WORK_STRUCT_PENDING_BIT,
 				     work_data_bits(rebind_work)))
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 04/11 V5] workqueue: remove WORKER_REBIND
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (2 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 03/11 V5] workqueue: new day don't need WORKER_REBIND for busy rebinding Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing Lai Jiangshan
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

exile operation       = list_del_init(&worker->entry).
and destory operation -> list_del_init(&worker->entry).

so we can use list_empty(&worker->entry) to know: does the worker
has been exiled or killed.

WORKER_REBIND is not need any more, remove it to reduce the states
of workers.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   21 ++++++++-------------
 1 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ba0ba33..6bf4185 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -72,11 +72,10 @@ enum {
 	WORKER_DIE		= 1 << 1,	/* die die die */
 	WORKER_IDLE		= 1 << 2,	/* is idle */
 	WORKER_PREP		= 1 << 3,	/* preparing to run works */
-	WORKER_REBIND		= 1 << 5,	/* mom is home, come back */
 	WORKER_CPU_INTENSIVE	= 1 << 6,	/* cpu intensive */
 	WORKER_UNBOUND		= 1 << 7,	/* worker is unbound */
 
-	WORKER_NOT_RUNNING	= WORKER_PREP | WORKER_REBIND | WORKER_UNBOUND |
+	WORKER_NOT_RUNNING	= WORKER_PREP | WORKER_UNBOUND |
 				  WORKER_CPU_INTENSIVE,
 
 	NR_WORKER_POOLS		= 2,		/* # worker pools per gcwq */
@@ -1612,7 +1611,7 @@ __acquires(&gcwq->lock)
 
 /*
  * Rebind an idle @worker to its CPU. worker_thread() will test
- * %WORKER_REBIND before leaving idle and call this function.
+ * worker->entry before leaving idle and call this function.
  */
 static void idle_worker_rebind(struct worker *worker)
 {
@@ -1621,7 +1620,6 @@ static void idle_worker_rebind(struct worker *worker)
 	if (worker_maybe_bind_and_lock(worker))
 		worker_clr_flags(worker, WORKER_UNBOUND);
 
-	worker_clr_flags(worker, WORKER_REBIND);
 	list_add(&worker->entry, &worker->pool->idle_list);
 	spin_unlock_irq(&gcwq->lock);
 }
@@ -1674,11 +1672,9 @@ static void rebind_workers(struct global_cwq *gcwq)
 	for_each_worker_pool(pool, gcwq)
 		lockdep_assert_held(&pool->manager_mutex);
 
-	/* set REBIND and kick idle ones */
+	/* exile and kick idle ones */
 	for_each_worker_pool(pool, gcwq) {
 		list_for_each_entry_safe(worker, n, &pool->idle_list, entry) {
-			worker->flags |= WORKER_REBIND;
-
 			/* exile idle workers */
 			list_del_init(&worker->entry);
 
@@ -2115,7 +2111,7 @@ __acquires(&gcwq->lock)
 	 * necessary to avoid spurious warnings from rescuers servicing the
 	 * unbound or a disassociated gcwq.
 	 */
-	WARN_ON_ONCE(!(worker->flags & (WORKER_UNBOUND | WORKER_REBIND)) &&
+	WARN_ON_ONCE(!(worker->flags & WORKER_UNBOUND) &&
 		     !(gcwq->flags & GCWQ_DISASSOCIATED) &&
 		     raw_smp_processor_id() != gcwq->cpu);
 
@@ -2239,18 +2235,17 @@ static int worker_thread(void *__worker)
 woke_up:
 	spin_lock_irq(&gcwq->lock);
 
-	/*
-	 * DIE can be set only while idle and REBIND set while busy has
-	 * @worker->rebind_work scheduled.  Checking here is enough.
-	 */
-	if (unlikely(worker->flags & (WORKER_REBIND | WORKER_DIE))) {
+	/* Is it still home ? */
+	if (unlikely(list_empty(&worker->entry))) {
 		spin_unlock_irq(&gcwq->lock);
 
+		/* reason: DIE */
 		if (worker->flags & WORKER_DIE) {
 			worker->task->flags &= ~PF_WQ_WORKER;
 			return 0;
 		}
 
+		/* reason: idle rebind */
 		idle_worker_rebind(worker);
 		goto woke_up;
 	}
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (3 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 04/11 V5] workqueue: remove WORKER_REBIND Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 19:49   ` Tejun Heo
  2012-09-05 10:37 ` [PATCH 06/11 V5] workqueue: unbind manager Lai Jiangshan
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

Ensure the gcwq->flags is only accessed with gcwq->lock held.
And make the code more easier to understand.

In all current callsite of create_worker(), DISASSOCIATED can't
be flipped while create_worker().
So the whole behavior is unchanged with this patch.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   33 ++++++++++++++++++---------------
 1 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 6bf4185..1946be4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -56,10 +56,6 @@ enum {
 	 * While DISASSOCIATED, the cpu may be offline and all workers have
 	 * %WORKER_UNBOUND set and concurrency management disabled, and may
 	 * be executing on any CPU.  The gcwq behaves as an unbound one.
-	 *
-	 * Note that DISASSOCIATED can be flipped only while holding
-	 * managership of all pools on the gcwq to avoid changing binding
-	 * state while create_worker() is in progress.
 	 */
 	GCWQ_DISASSOCIATED	= 1 << 0,	/* cpu can't serve workers */
 	GCWQ_FREEZING		= 1 << 1,	/* freeze in progress */
@@ -1727,6 +1723,7 @@ static struct worker *alloc_worker(void)
 /**
  * create_worker - create a new workqueue worker
  * @pool: pool the new worker will belong to
+ * @bind: whether to set affinity to @cpu or not
  *
  * Create a new worker which is bound to @pool.  The returned worker
  * can be started by calling start_worker() or destroyed using
@@ -1738,7 +1735,7 @@ static struct worker *alloc_worker(void)
  * RETURNS:
  * Pointer to the newly created worker.
  */
-static struct worker *create_worker(struct worker_pool *pool)
+static struct worker *create_worker(struct worker_pool *pool, bool bind)
 {
 	struct global_cwq *gcwq = pool->gcwq;
 	const char *pri = worker_pool_pri(pool) ? "H" : "";
@@ -1775,15 +1772,10 @@ static struct worker *create_worker(struct worker_pool *pool)
 		set_user_nice(worker->task, HIGHPRI_NICE_LEVEL);
 
 	/*
-	 * Determine CPU binding of the new worker depending on
-	 * %GCWQ_DISASSOCIATED.  The caller is responsible for ensuring the
-	 * flag remains stable across this function.  See the comments
-	 * above the flag definition for details.
-	 *
 	 * As an unbound worker may later become a regular one if CPU comes
 	 * online, make sure every worker has %PF_THREAD_BOUND set.
 	 */
-	if (!(gcwq->flags & GCWQ_DISASSOCIATED)) {
+	if (bind) {
 		kthread_bind(worker->task, gcwq->cpu);
 	} else {
 		worker->task->flags |= PF_THREAD_BOUND;
@@ -1951,6 +1943,14 @@ __releases(&gcwq->lock)
 __acquires(&gcwq->lock)
 {
 	struct global_cwq *gcwq = pool->gcwq;
+	bool bind;
+
+	/*
+	 * Note we have held the manage_mutex, so DISASSOCIATED can't be
+	 * flipped and it is correct that we calculate @bind only once
+	 * and then release the gcwq->lock.
+	 */
+	bind = !(gcwq->flags & GCWQ_DISASSOCIATED);
 
 	if (!need_to_create_worker(pool))
 		return false;
@@ -1963,7 +1963,7 @@ restart:
 	while (true) {
 		struct worker *worker;
 
-		worker = create_worker(pool);
+		worker = create_worker(pool, bind);
 		if (worker) {
 			del_timer_sync(&pool->mayday_timer);
 			spin_lock_irq(&gcwq->lock);
@@ -3479,7 +3479,7 @@ static int __devinit workqueue_cpu_up_callback(struct notifier_block *nfb,
 			if (pool->nr_workers)
 				continue;
 
-			worker = create_worker(pool);
+			worker = create_worker(pool, false);
 			if (!worker)
 				return NOTIFY_BAD;
 
@@ -3757,14 +3757,17 @@ static int __init init_workqueues(void)
 	for_each_online_gcwq_cpu(cpu) {
 		struct global_cwq *gcwq = get_gcwq(cpu);
 		struct worker_pool *pool;
+		bool bind = false;
 
-		if (cpu != WORK_CPU_UNBOUND)
+		if (cpu != WORK_CPU_UNBOUND) {
 			gcwq->flags &= ~GCWQ_DISASSOCIATED;
+			bind = true;
+		}
 
 		for_each_worker_pool(pool, gcwq) {
 			struct worker *worker;
 
-			worker = create_worker(pool);
+			worker = create_worker(pool, bind);
 			BUG_ON(!worker);
 			spin_lock_irq(&gcwq->lock);
 			start_worker(worker);
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 06/11 V5] workqueue: unbind manager
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (4 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 07/11 V5] workqueue: rebind manager Lai Jiangshan
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

add ->manager to make gcwq_unbind_fn() knows the manager and unbind it.

It just prepares, this code is useless until
"we unbind/rebind without manager_mutex"

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1946be4..9f4dd20 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -160,6 +160,7 @@ struct worker_pool {
 	struct timer_list	idle_timer;	/* L: worker idle timeout */
 	struct timer_list	mayday_timer;	/* L: SOS timer for workers */
 
+	struct worker		*manager;	/* L: manager worker */
 	struct mutex		manager_mutex;	/* mutex manager should hold */
 	struct ida		worker_ida;	/* L: for worker IDs */
 };
@@ -2055,6 +2056,7 @@ static bool manage_workers(struct worker *worker)
 	if (!mutex_trylock(&pool->manager_mutex))
 		return ret;
 
+	pool->manager = worker;
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
@@ -2065,6 +2067,8 @@ static bool manage_workers(struct worker *worker)
 	ret |= maybe_create_worker(pool);
 
 	mutex_unlock(&pool->manager_mutex);
+	pool->manager = NULL;
+
 	return ret;
 }
 
@@ -3426,9 +3430,12 @@ static void gcwq_unbind_fn(struct work_struct *work)
 	 * ones which are still executing works from before the last CPU
 	 * down must be on the cpu.  After this, they may become diasporas.
 	 */
-	for_each_worker_pool(pool, gcwq)
+	for_each_worker_pool(pool, gcwq) {
 		list_for_each_entry(worker, &pool->idle_list, entry)
 			worker->flags |= WORKER_UNBOUND;
+		if (pool->manager)
+			pool->manager->flags |= WORKER_UNBOUND;
+	}
 
 	for_each_busy_worker(worker, i, pos, gcwq)
 		worker->flags |= WORKER_UNBOUND;
@@ -3748,6 +3755,7 @@ static int __init init_workqueues(void)
 			setup_timer(&pool->mayday_timer, gcwq_mayday_timeout,
 				    (unsigned long)pool);
 
+			pool->manager = NULL;
 			mutex_init(&pool->manager_mutex);
 			ida_init(&pool->worker_ida);
 		}
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 07/11 V5] workqueue: rebind manager
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (5 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 06/11 V5] workqueue: unbind manager Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 08/11 V5] workqueue: unbind newly created worker Lai Jiangshan
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

Because the %UNBOUND bit of manager can't be cleared while it
is manage workers. maybe_rebind_manager() will be noticed and
will do rebind when needed.

It just prepares, this code is useless until
"we unbind/rebind without manager_mutex"

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9f4dd20..e9fc065 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1540,6 +1540,14 @@ static void worker_leave_idle(struct worker *worker)
 	list_del_init(&worker->entry);
 }
 
+/* Does this worker need to be rebound */
+static bool need_to_rebind_worker(struct worker *worker)
+{
+	bool assoc = !(worker->pool->gcwq->flags & GCWQ_DISASSOCIATED);
+
+	return assoc && (worker->flags & WORKER_UNBOUND);
+}
+
 /**
  * worker_maybe_bind_and_lock - bind worker to its cpu if possible and lock gcwq
  * @worker: self
@@ -2028,6 +2036,20 @@ static bool maybe_destroy_workers(struct worker_pool *pool)
 	return ret;
 }
 
+static bool maybe_rebind_manager(struct worker *manager)
+{
+	if (unlikely(need_to_rebind_worker(manager))) {
+		spin_unlock_irq(&manager->pool->gcwq->lock);
+
+		if (worker_maybe_bind_and_lock(manager))
+			worker_clr_flags(manager, WORKER_UNBOUND);
+
+		return true;
+	}
+
+	return false;
+}
+
 /**
  * manage_workers - manage worker pool
  * @worker: self
@@ -2065,6 +2087,7 @@ static bool manage_workers(struct worker *worker)
 	 */
 	ret |= maybe_destroy_workers(pool);
 	ret |= maybe_create_worker(pool);
+	ret |= maybe_rebind_manager(worker);
 
 	mutex_unlock(&pool->manager_mutex);
 	pool->manager = NULL;
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 08/11 V5] workqueue: unbind newly created worker
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (6 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 07/11 V5] workqueue: rebind manager Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 09/11 V5] workqueue: rebind " Lai Jiangshan
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

unbind newly created worker when manager is unbound.

It just prepares, this code is useless until
"we unbind/rebind without manager_mutex"

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   23 ++++++++++++++++++++++-
 1 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e9fc065..223d128 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -74,6 +74,8 @@ enum {
 	WORKER_NOT_RUNNING	= WORKER_PREP | WORKER_UNBOUND |
 				  WORKER_CPU_INTENSIVE,
 
+	WORKER_COPY_FLAGS	= WORKER_UNBOUND,
+
 	NR_WORKER_POOLS		= 2,		/* # worker pools per gcwq */
 
 	BUSY_WORKER_HASH_ORDER	= 6,		/* 64 pointers */
@@ -1820,6 +1822,25 @@ static void start_worker(struct worker *worker)
 }
 
 /**
+ * manager_start_worker - start a newly created worker by manager
+ * @worker: worker to start
+ * @manager: the manager
+ *
+ * Make the gcwq aware of @worker and start it.
+ * Fix the newly created worker's binding.
+ *
+ * CONTEXT:
+ * spin_lock_irq(gcwq->lock).
+ */
+static void manager_start_worker(struct worker *worker, struct worker *manager)
+{
+	start_worker(worker);
+
+	/* copy the flags. also unbind the worker if the manager is unbound */
+	worker->flags |= manager->flags & WORKER_COPY_FLAGS;
+}
+
+/**
  * destroy_worker - destroy a workqueue worker
  * @worker: worker to be destroyed
  *
@@ -1976,7 +1997,7 @@ restart:
 		if (worker) {
 			del_timer_sync(&pool->mayday_timer);
 			spin_lock_irq(&gcwq->lock);
-			start_worker(worker);
+			manager_start_worker(worker, pool->manager);
 			BUG_ON(need_to_create_worker(pool));
 			return true;
 		}
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 09/11 V5] workqueue: rebind newly created worker
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (7 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 08/11 V5] workqueue: unbind newly created worker Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 16:19   ` Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex Lai Jiangshan
  2012-09-05 10:37 ` [PATCH 11/11 V5] workqueue: remove manager_mutex Lai Jiangshan
  10 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

when the newly created needs to be rebound. exile it!
it will rebind itself in worker_thead().

It just prepares, this code is useless until
"we unbind/rebind without manager_mutex"

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 223d128..819c84e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1838,6 +1838,11 @@ static void manager_start_worker(struct worker *worker, struct worker *manager)
 
 	/* copy the flags. also unbind the worker if the manager is unbound */
 	worker->flags |= manager->flags & WORKER_COPY_FLAGS;
+
+	if (need_to_rebind_worker(worker)) {
+		/* exile it, and let it rebind itself */
+		list_del_init(&worker->entry);
+	}
 }
 
 /**
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (8 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 09/11 V5] workqueue: rebind " Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  2012-09-05 20:04   ` Tejun Heo
  2012-09-05 10:37 ` [PATCH 11/11 V5] workqueue: remove manager_mutex Lai Jiangshan
  10 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

gcwq_unbind_fn() unbind manager by ->manager pointer.

rebinding-manger, unbinding/rebinding newly created worker are done by
other place. so we don't need manager_mutex any more.

Also change the comment of @bind accordingly.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   52 ++++++++++++++++------------------------------------
 1 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 819c84e..5fb712a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1667,17 +1667,15 @@ static void busy_worker_rebind_fn(struct work_struct *work)
  * add themselves back to idle_list if the gcwq is still associated.
  */
 static void rebind_workers(struct global_cwq *gcwq)
-	__releases(&gcwq->lock) __acquires(&gcwq->lock)
 {
 	struct worker_pool *pool;
 	struct worker *worker, *n;
 	struct hlist_node *pos;
 	int i;
 
-	lockdep_assert_held(&gcwq->lock);
+	spin_lock_irq(&gcwq->lock);
 
-	for_each_worker_pool(pool, gcwq)
-		lockdep_assert_held(&pool->manager_mutex);
+	gcwq->flags &= ~GCWQ_DISASSOCIATED;
 
 	/* exile and kick idle ones */
 	for_each_worker_pool(pool, gcwq) {
@@ -1714,6 +1712,8 @@ static void rebind_workers(struct global_cwq *gcwq)
 			worker->scheduled.next,
 			work_color_to_flags(WORK_NO_COLOR));
 	}
+
+	spin_unlock_irq(&gcwq->lock);
 }
 
 static struct worker *alloc_worker(void)
@@ -1981,11 +1981,14 @@ __acquires(&gcwq->lock)
 	bool bind;
 
 	/*
-	 * Note we have held the manage_mutex, so DISASSOCIATED can't be
-	 * flipped and it is correct that we calculate @bind only once
-	 * and then release the gcwq->lock.
+	 * Snapshot the value of @bind.
+	 * Because the %UNBOUND bit of manager can't be cleared while it
+	 * is creating worker. So we can detect whether the real @bind
+	 * is/was changed and fix the newly created worker's binding in
+	 * manager_start_worker().
 	 */
 	bind = !(gcwq->flags & GCWQ_DISASSOCIATED);
+	BUG_ON(!bind && !(pool->manager->flags & WORKER_UNBOUND));
 
 	if (!need_to_create_worker(pool))
 		return false;
@@ -3441,26 +3444,6 @@ EXPORT_SYMBOL_GPL(work_busy);
  * cpu comes back online.
  */
 
-/* claim manager positions of all pools */
-static void gcwq_claim_management_and_lock(struct global_cwq *gcwq)
-{
-	struct worker_pool *pool;
-
-	for_each_worker_pool(pool, gcwq)
-		mutex_lock_nested(&pool->manager_mutex, pool - gcwq->pools);
-	spin_lock_irq(&gcwq->lock);
-}
-
-/* release manager positions */
-static void gcwq_release_management_and_unlock(struct global_cwq *gcwq)
-{
-	struct worker_pool *pool;
-
-	spin_unlock_irq(&gcwq->lock);
-	for_each_worker_pool(pool, gcwq)
-		mutex_unlock(&pool->manager_mutex);
-}
-
 static void gcwq_unbind_fn(struct work_struct *work)
 {
 	struct global_cwq *gcwq = get_gcwq(smp_processor_id());
@@ -3471,13 +3454,13 @@ static void gcwq_unbind_fn(struct work_struct *work)
 
 	BUG_ON(gcwq->cpu != smp_processor_id());
 
-	gcwq_claim_management_and_lock(gcwq);
+	spin_lock_irq(&gcwq->lock);
 
 	/*
-	 * We've claimed all manager positions.  Make all workers unbound
-	 * and set DISASSOCIATED.  Before this, all workers except for the
-	 * ones which are still executing works from before the last CPU
-	 * down must be on the cpu.  After this, they may become diasporas.
+	 * Make all workers unbound and set DISASSOCIATED.
+	 * Before this, all workers except for the ones which are still
+	 * executing works from before the last CPU down must be on the cpu.
+	 * After this, they may become diasporas.
 	 */
 	for_each_worker_pool(pool, gcwq) {
 		list_for_each_entry(worker, &pool->idle_list, entry)
@@ -3491,7 +3474,7 @@ static void gcwq_unbind_fn(struct work_struct *work)
 
 	gcwq->flags |= GCWQ_DISASSOCIATED;
 
-	gcwq_release_management_and_unlock(gcwq);
+	spin_unlock_irq(&gcwq->lock);
 
 	/*
 	 * Call schedule() so that we cross rq->lock and thus can guarantee
@@ -3547,10 +3530,7 @@ static int __devinit workqueue_cpu_up_callback(struct notifier_block *nfb,
 
 	case CPU_DOWN_FAILED:
 	case CPU_ONLINE:
-		gcwq_claim_management_and_lock(gcwq);
-		gcwq->flags &= ~GCWQ_DISASSOCIATED;
 		rebind_workers(gcwq);
-		gcwq_release_management_and_unlock(gcwq);
 		break;
 	}
 	return NOTIFY_OK;
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 11/11 V5] workqueue: remove manager_mutex
  2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
                   ` (9 preceding siblings ...)
  2012-09-05 10:37 ` [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex Lai Jiangshan
@ 2012-09-05 10:37 ` Lai Jiangshan
  10 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 10:37 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

no one except manage_workers() use it. remove it.
manage_workers() will use ->manager instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 5fb712a..eee3fad 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -163,7 +163,6 @@ struct worker_pool {
 	struct timer_list	mayday_timer;	/* L: SOS timer for workers */
 
 	struct worker		*manager;	/* L: manager worker */
-	struct mutex		manager_mutex;	/* mutex manager should hold */
 	struct ida		worker_ida;	/* L: for worker IDs */
 };
 
@@ -678,7 +677,7 @@ static bool need_to_manage_workers(struct worker_pool *pool)
 /* Do we have too many workers and should some go away? */
 static bool too_many_workers(struct worker_pool *pool)
 {
-	bool managing = mutex_is_locked(&pool->manager_mutex);
+	bool managing = !!pool->manager;
 	int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
 	int nr_busy = pool->nr_workers - nr_idle;
 
@@ -2104,7 +2103,7 @@ static bool manage_workers(struct worker *worker)
 	struct worker_pool *pool = worker->pool;
 	bool ret = false;
 
-	if (!mutex_trylock(&pool->manager_mutex))
+	if (pool->manager)
 		return ret;
 
 	pool->manager = worker;
@@ -2118,7 +2117,6 @@ static bool manage_workers(struct worker *worker)
 	ret |= maybe_create_worker(pool);
 	ret |= maybe_rebind_manager(worker);
 
-	mutex_unlock(&pool->manager_mutex);
 	pool->manager = NULL;
 
 	return ret;
@@ -3785,7 +3783,6 @@ static int __init init_workqueues(void)
 				    (unsigned long)pool);
 
 			pool->manager = NULL;
-			mutex_init(&pool->manager_mutex);
 			ida_init(&pool->worker_ida);
 		}
 	}
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 09/11 V5] workqueue: rebind newly created worker
  2012-09-05 10:37 ` [PATCH 09/11 V5] workqueue: rebind " Lai Jiangshan
@ 2012-09-05 16:19   ` Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-05 16:19 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Tejun Heo, linux-kernel

On Wed, Sep 5, 2012 at 6:37 PM, Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
> when the newly created needs to be rebound. exile it!
> it will rebind itself in worker_thead().
>
> It just prepares, this code is useless until
> "we unbind/rebind without manager_mutex"
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  kernel/workqueue.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 223d128..819c84e 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1838,6 +1838,11 @@ static void manager_start_worker(struct worker *worker, struct worker *manager)
>
>         /* copy the flags. also unbind the worker if the manager is unbound */
>         worker->flags |= manager->flags & WORKER_COPY_FLAGS;
> +
> +       if (need_to_rebind_worker(worker)) {
> +               /* exile it, and let it rebind itself */
> +               list_del_init(&worker->entry);

Sorry, we should not add it to idle_list.
"add and then remove " is wrong here.

> +       }
>  }
>
>  /**
> --
> 1.7.4.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/11 V5] workqueue: async idle rebinding
  2012-09-05 10:37 ` [PATCH 02/11 V5] workqueue: async idle rebinding Lai Jiangshan
@ 2012-09-05 18:06   ` Tejun Heo
  2012-09-06  1:28     ` Lai Jiangshan
  0 siblings, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2012-09-05 18:06 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello, Lai.

Ooh, I like the approach.  That said, I think it's a bit too invasive
for 3.6-fixes.  I'll merge the two patches I posted yesterday in
3.6-fixes.  Let's do this restructuring in for-3.7.

On Wed, Sep 05, 2012 at 06:37:39PM +0800, Lai Jiangshan wrote:
>  static void idle_worker_rebind(struct worker *worker)
>  {
>  	struct global_cwq *gcwq = worker->pool->gcwq;
>  
> -	/* CPU must be online at this point */
> -	WARN_ON(!worker_maybe_bind_and_lock(worker));
> -	if (!--worker->idle_rebind->cnt)
> -		complete(&worker->idle_rebind->done);
> -	spin_unlock_irq(&worker->pool->gcwq->lock);
> +	if (worker_maybe_bind_and_lock(worker))
> +		worker_clr_flags(worker, WORKER_UNBOUND);
>  
> -	/* we did our part, wait for rebind_workers() to finish up */
> -	wait_event(gcwq->rebind_hold, !(worker->flags & WORKER_REBIND));
> +	worker_clr_flags(worker, WORKER_REBIND);
> +	list_add(&worker->entry, &worker->pool->idle_list);
> +	spin_unlock_irq(&gcwq->lock);

This looks correct to me but it's still a bit scary.  Some comments
explaining why the above is correct would be nice.

Yeah, other than that, looks good to me.  I'll prepare new for-3.7
branch this can be based on, so please wait a bit.  Also, I think I'll
probably update commit description / comments while committing.

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 03/11 V5] workqueue: new day don't need WORKER_REBIND for busy rebinding
  2012-09-05 10:37 ` [PATCH 03/11 V5] workqueue: new day don't need WORKER_REBIND for busy rebinding Lai Jiangshan
@ 2012-09-05 18:31   ` Tejun Heo
  2012-09-06  2:10     ` Lai Jiangshan
  0 siblings, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2012-09-05 18:31 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

On Wed, Sep 05, 2012 at 06:37:40PM +0800, Lai Jiangshan wrote:
> because old busy_worker_rebind_fn() have to wait until all idle worker finish.
> so we have to use two flags WORKER_UNBOUND and WORKER_REBIND to avoid
> prematurely clear all NOT_RUNNING bit when highly frequent offline/online.
> 
> but current code don't need to wait idle workers. so we don't need to
> use two flags, just one is enough. remove WORKER_REBIND from busy rebinding.

ROGUE / REBIND thing existed for busy workers from the beginning when
there was no idle worker rebinding, so this definitely wasn't about
whether idle rebind is synchronous or not.  Trying to remember
what... ah, okay, setting of DISASSOCIATED and setting of WORKER_ROGUE
didn't use to happen together with gcwq->lock held.  CPU_DOWN would
first set ROGUE and then later on set DISASSOCIATED, so if the
rebind_fn kicks in inbetween that, it would break CPU_DOWN.

I think now that both CPU_DOWN and UP are done under single holding of
gcwq->lock, this should be safe.  It would be nice to note what
changed in the patch description and the atomicity requirement as a
comment tho.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing
  2012-09-05 10:37 ` [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing Lai Jiangshan
@ 2012-09-05 19:49   ` Tejun Heo
  2012-09-06  1:04     ` Lai Jiangshan
  0 siblings, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2012-09-05 19:49 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello,

On Wed, Sep 05, 2012 at 06:37:42PM +0800, Lai Jiangshan wrote:
> Ensure the gcwq->flags is only accessed with gcwq->lock held.
> And make the code more easier to understand.
>
> In all current callsite of create_worker(), DISASSOCIATED can't
> be flipped while create_worker().
> So the whole behavior is unchanged with this patch.

This doesn't change anything.  You're just moving the test to the
caller with comments there explaining how it won't change even if
gcwq->lock is released.  It seems more confusing to me.  The flag is
still protected by manager_mutex.  How is this an improvement?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex
  2012-09-05 10:37 ` [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex Lai Jiangshan
@ 2012-09-05 20:04   ` Tejun Heo
  2012-09-06 10:44     ` Lai Jiangshan
  0 siblings, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2012-09-05 20:04 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello, Lai.

On Wed, Sep 05, 2012 at 06:37:47PM +0800, Lai Jiangshan wrote:
> gcwq_unbind_fn() unbind manager by ->manager pointer.
> 
> rebinding-manger, unbinding/rebinding newly created worker are done by
> other place. so we don't need manager_mutex any more.
> 
> Also change the comment of @bind accordingly.

Please don't scatter small prep patches like this.  Each piece in
isolation doesn't make much sense to me and the patch descriptions
don't help much.  Please collect the prep patches and explain in more
detail.

In general, I'm not sure about this approach.  I'd really like the
hotplug logic to be contained in hotplug logic proper as much as
possible.  This scatters around hotplug handling to usual code paths
and seems too invasive for 3.6-fixes.

Also, can you please talk to me before going ahead and sending me
completely new 10 patch series every other day?  You're taking
disproportionate amount of my time and I can't continue to do this.
Please discuss with me or at least explain the high-level approach in
the head message in detail.  Going through the patch series to figure
out high-level design which is constantly flipping is rather
inefficient and unfortunately your patch descriptions aren't too
helpful.  :(

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing
  2012-09-05 19:49   ` Tejun Heo
@ 2012-09-06  1:04     ` Lai Jiangshan
  2012-09-06 16:51       ` Tejun Heo
  0 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-06  1:04 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On 09/06/2012 03:49 AM, Tejun Heo wrote:
> Hello,
> 
> On Wed, Sep 05, 2012 at 06:37:42PM +0800, Lai Jiangshan wrote:
>> Ensure the gcwq->flags is only accessed with gcwq->lock held.
>> And make the code more easier to understand.
>>
>> In all current callsite of create_worker(), DISASSOCIATED can't
>> be flipped while create_worker().
>> So the whole behavior is unchanged with this patch.
> 
> This doesn't change anything.  You're just moving the test to the
> caller with comments there explaining how it won't change even if
> gcwq->lock is released.  It seems more confusing to me.  The flag is
> still protected by manager_mutex.  How is this an improvement?
> 

Some other bit of gcwq->flags is accessed(modified) without manager_mutex.
making gcwq->flags be accessed only form gcwq->lock C.S. will help the reviewer.

I don't like adding special things/code when not-absolutely-required.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 02/11 V5] workqueue: async idle rebinding
  2012-09-05 18:06   ` Tejun Heo
@ 2012-09-06  1:28     ` Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-06  1:28 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On 09/06/2012 02:06 AM, Tejun Heo wrote:
> Hello, Lai.
> 
> Ooh, I like the approach.  That said, I think it's a bit too invasive
> for 3.6-fixes.  I'll merge the two patches I posted yesterday in
> 3.6-fixes.  Let's do this restructuring in for-3.7.

OK for me.
it is too complicated for 3.6.

> 
> On Wed, Sep 05, 2012 at 06:37:39PM +0800, Lai Jiangshan wrote:
>>  static void idle_worker_rebind(struct worker *worker)
>>  {
>>  	struct global_cwq *gcwq = worker->pool->gcwq;
>>  
>> -	/* CPU must be online at this point */
>> -	WARN_ON(!worker_maybe_bind_and_lock(worker));
>> -	if (!--worker->idle_rebind->cnt)
>> -		complete(&worker->idle_rebind->done);
>> -	spin_unlock_irq(&worker->pool->gcwq->lock);
>> +	if (worker_maybe_bind_and_lock(worker))
>> +		worker_clr_flags(worker, WORKER_UNBOUND);
>>  
>> -	/* we did our part, wait for rebind_workers() to finish up */
>> -	wait_event(gcwq->rebind_hold, !(worker->flags & WORKER_REBIND));
>> +	worker_clr_flags(worker, WORKER_REBIND);
>> +	list_add(&worker->entry, &worker->pool->idle_list);
>> +	spin_unlock_irq(&gcwq->lock);
> 
> This looks correct to me but it's still a bit scary.  Some comments
> explaining why the above is correct would be nice.

How to explain the correct, could you give some clues.
correctness for rebinding and the flags: comments is missing. (old code miss it too, so I forgot it)
correctness for idle management: list_del_init() and list_add(), I don't like to add comment for slef-explain-code.
correctness for quick-enabled-CMWQ, local-wake-up: comments is in the changelog. (I should also add it to the code)
correctness for integrating of above: ..

> 
> Yeah, other than that, looks good to me.  I'll prepare new for-3.7
> branch this can be based on, so please wait a bit.  Also, I think I'll
> probably update commit description / comments while committing.
> 

I was coding it based on wq/for-3.7. so you can merge it easier.
waiting for you merged-result.

Thanks.
Lai


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 03/11 V5] workqueue: new day don't need WORKER_REBIND for busy rebinding
  2012-09-05 18:31   ` Tejun Heo
@ 2012-09-06  2:10     ` Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-06  2:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On 09/06/2012 02:31 AM, Tejun Heo wrote:
> On Wed, Sep 05, 2012 at 06:37:40PM +0800, Lai Jiangshan wrote:
>> because old busy_worker_rebind_fn() have to wait until all idle worker finish.
>> so we have to use two flags WORKER_UNBOUND and WORKER_REBIND to avoid
>> prematurely clear all NOT_RUNNING bit when highly frequent offline/online.
>>
>> but current code don't need to wait idle workers. so we don't need to
>> use two flags, just one is enough. remove WORKER_REBIND from busy rebinding.
> 
> ROGUE / REBIND thing existed for busy workers from the beginning when
> there was no idle worker rebinding, so this definitely wasn't about
> whether idle rebind is synchronous or not. 

In very old day, this definitely wasn't about whether idle rebind is synchronous or not.
but after you reimplement rebind_worker(), it is the only reason for WORKER_REBIND in busy rebinding.

if I miss something, this 03/11 will be wrong. the old code did not comment all why
WORKER_REBIND is needed. so we have to think more about the correctness of this 03/11.

> Trying to remember
> what... ah, okay, setting of DISASSOCIATED and setting of WORKER_ROGUE
> didn't use to happen together with gcwq->lock held.  CPU_DOWN would
> first set ROGUE and then later on set DISASSOCIATED, so if the
> rebind_fn kicks in inbetween that, it would break CPU_DOWN.
> 
> I think now that both CPU_DOWN and UP are done under single holding of
> gcwq->lock, this should be safe.  It would be nice to note what
> changed in the patch description and the atomicity requirement as a
> comment tho.
> 

Oh, I forgot to add changelog about single holding of gcwq->lock.


Thanks
Lai


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex
  2012-09-05 20:04   ` Tejun Heo
@ 2012-09-06 10:44     ` Lai Jiangshan
  2012-09-06 17:00       ` Tejun Heo
  0 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-06 10:44 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On 09/06/2012 04:04 AM, Tejun Heo wrote:
> Hello, Lai.
> 
> On Wed, Sep 05, 2012 at 06:37:47PM +0800, Lai Jiangshan wrote:
>> gcwq_unbind_fn() unbind manager by ->manager pointer.
>>
>> rebinding-manger, unbinding/rebinding newly created worker are done by
>> other place. so we don't need manager_mutex any more.
>>
>> Also change the comment of @bind accordingly.
> 
> Please don't scatter small prep patches like this.  Each piece in
> isolation doesn't make much sense to me and the patch descriptions
> don't help much.  Please collect the prep patches and explain in more
> detail.

There are 4 different tasks. unbind/rebind manager/newbie

1 task for 1 patch. if I collect them into one patch, it will be hard
to explain which code do which task.

> 
> In general, I'm not sure about this approach.  I'd really like the
> hotplug logic to be contained in hotplug logic proper as much as
> possible.  This scatters around hotplug handling to usual code paths
> and seems too invasive for 3.6-fixes.

I don't expect to fix it in 3.6. no approach is simple.

> 
> Also, can you please talk to me before going ahead and sending me
> completely new 10 patch series every other day?  You're taking
> disproportionate amount of my time and I can't continue to do this.
> Please discuss with me or at least explain the high-level approach in
> the head message in detail.  Going through the patch series to figure
> out high-level design which is constantly flipping is rather
> inefficient and unfortunately your patch descriptions aren't too
> helpful.  :(
> 

I'm not good in English, so I prefer to attach code when I show my idea.
(and the code can prove the idea). I admit that my changelog and comments
are always bad.


I have 4 idea/approach for bug of hotplug VS manage_workers().
there all come up to my mind last week. 
NOTE: (this V5 patch is my approach2)

(list with the order they came into my mind)
Approach 1	V3 patchset		non_manager_role_manager_mutex_unlock()
Approach 2	V5 patchset		"rebind manager, unbind/rebind newbie" are done outside. no manage mutex for hotplug
Approach 3	un-implemented		move unbind/rebind to worker_thread and handle them as POOL_MANAGE_WORKERS
Approach 4	V4 parchset		manage_workers_slowpath()

Approach 2,3 is partial implemented last week, but Approach2 is quickly finished yesterday.
Approach 3 is too complicated to finish.


Approach 1: the simplest. after it, we can use manage_mutex anywhere as needed, but we need to use non_manager_role_manager_mutex_unlock() to unlock.

Approach 2: the binding of manager and newly created worker is handled outside of hotplug code. thus hoplug code don't need manage_mutex. manage_mutex is typical protect-code-pattern, it is not good. we should always use lock to protect data instead of protecting code. although in linux kernel, there are many lock which are only used for protecting code, I think we can reduce them as possible. the removing of BIG-KERNEL-LOCK is an example. the line of code is also less in this approach, but it touch 2 place outside of hotplug code and the logic/path are increasing. GOOD to me: disallow manage_mutex(for future), not too much code.

Approach 3: complicated. make unbind/rebind 's calle-site and context are the same as manage_workers(). BAD: we can't free to use manage_mutex in future when need. encounter some other problems.(you suggested approach will also have some problem I encountered)

Approach 4: the problem comes from manage_worker(), just add manage_workers_slowpath() to fix it inside manage_worker(). it fixs problem in only 1 bulk of code. after it, we can use manage_mutex anywhere as needed. the line of code is more, but it just in one place. GOOD: the most clean approach.

Thanks
Lai


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing
  2012-09-06  1:04     ` Lai Jiangshan
@ 2012-09-06 16:51       ` Tejun Heo
  2012-09-07  2:11         ` Lai Jiangshan
  0 siblings, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2012-09-06 16:51 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello, Lai.

On Thu, Sep 06, 2012 at 09:04:06AM +0800, Lai Jiangshan wrote:
> > This doesn't change anything.  You're just moving the test to the
> > caller with comments there explaining how it won't change even if
> > gcwq->lock is released.  It seems more confusing to me.  The flag is
> > still protected by manager_mutex.  How is this an improvement?
> > 
> 
> Some other bit of gcwq->flags is accessed(modified) without manager_mutex.
> making gcwq->flags be accessed only form gcwq->lock C.S. will help the reviewer.
> 
> I don't like adding special things/code when not-absolutely-required.

I really fail to see this.  The flag has to stay stable while
manage_mutex is held no matter where you test it.  It doesn't make any
it any more readable whether you test it inside gcwq->lock with the
comment saying "this won't change while manager_mutex is held" or just
test it while manager_mutex is held.  It is a synchronization oddity
no matter what and as long as it's well documented, I don't really see
the point in the change.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex
  2012-09-06 10:44     ` Lai Jiangshan
@ 2012-09-06 17:00       ` Tejun Heo
  0 siblings, 0 replies; 25+ messages in thread
From: Tejun Heo @ 2012-09-06 17:00 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello, Lai.

On Thu, Sep 06, 2012 at 06:44:53PM +0800, Lai Jiangshan wrote:
> On 09/06/2012 04:04 AM, Tejun Heo wrote:
> > Please don't scatter small prep patches like this.  Each piece in
> > isolation doesn't make much sense to me and the patch descriptions
> > don't help much.  Please collect the prep patches and explain in more
> > detail.
> 
> There are 4 different tasks. unbind/rebind manager/newbie
> 
> 1 task for 1 patch. if I collect them into one patch, it will be hard
> to explain which code do which task.

Not really.  Just list what each part does in the commit log and
explain how they're gonna be used by the following patch.

> > In general, I'm not sure about this approach.  I'd really like the
> > hotplug logic to be contained in hotplug logic proper as much as
> > possible.  This scatters around hotplug handling to usual code paths
> > and seems too invasive for 3.6-fixes.
> 
> I don't expect to fix it in 3.6. no approach is simple.

I think I can come up with something fairly simple.  Will post soon.

> > Also, can you please talk to me before going ahead and sending me
> > completely new 10 patch series every other day?  You're taking
> > disproportionate amount of my time and I can't continue to do this.
> > Please discuss with me or at least explain the high-level approach in
> > the head message in detail.  Going through the patch series to figure
> > out high-level design which is constantly flipping is rather
> > inefficient and unfortunately your patch descriptions aren't too
> > helpful.  :(
> 
> I'm not good in English, so I prefer to attach code when I show my idea.
> (and the code can prove the idea). I admit that my changelog and comments
> are always bad.

English isn't my first language either and I struggled with it for
quite a while too and it's perfectly okay to write non-perfect
sentences, but please do keep trying to express your ideas rather than
just throwing patches with one line description.  I'd be happy to
update the patch description and comments as necessary but no matter
how imperfect trying to communicate high level ideas in plain text
helps a lot.

* People might not understand fully but they would understand a lot of
  it.

* You'll have think one more time about it while trying to explain and
  justify all the changes in the patch.  It tends to make the code a
  lot better.

* Good patch descriptions and comments are often very important
  especially if one wants to make high-level restructuring changes
  like you're trying to.  It might be difficult right now but it won't
  get better without trying, right?

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing
  2012-09-06 16:51       ` Tejun Heo
@ 2012-09-07  2:11         ` Lai Jiangshan
  2012-09-07 19:37           ` Tejun Heo
  0 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2012-09-07  2:11 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel

On 09/07/2012 12:51 AM, Tejun Heo wrote:
> Hello, Lai.
> 
> On Thu, Sep 06, 2012 at 09:04:06AM +0800, Lai Jiangshan wrote:
>>> This doesn't change anything.  You're just moving the test to the
>>> caller with comments there explaining how it won't change even if
>>> gcwq->lock is released.  It seems more confusing to me.  The flag is
>>> still protected by manager_mutex.  How is this an improvement?
>>>
>>
>> Some other bit of gcwq->flags is accessed(modified) without manager_mutex.
>> making gcwq->flags be accessed only form gcwq->lock C.S. will help the reviewer.
>>
>> I don't like adding special things/code when not-absolutely-required.
> 
> I really fail to see this.  The flag has to stay stable while
> manage_mutex is held no matter where you test it. 

Only one bit is stable, the whole flags can be changed outside.

I prefer the whole byte or short or int or long is protected under the same synchronization.
I don't prefer different bit uses different synchronization.

> It doesn't make any
> it any more readable whether you test it inside gcwq->lock with the
> comment saying "this won't change while manager_mutex is held" or just
> test it while manager_mutex is held.  It is a synchronization oddity
> no matter what and as long as it's well documented, I don't really see
> the point in the change.
> 

When I read "gcwq->flags & GCWQ_DISASSOCIATED" in create_worker, I thought:
WTF, gcwq->flags can be change by other, is it correct?. When I saw the comments claim
it is correct, I have to use about 30 mins to check whether it is correct in several
places of code in workqueue.c(include check does flags has internal state in all gcwq->lock).
I'm not good on it, but I think there are some reviewers will be confused like me.

Thanks,
Lai
will be 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing
  2012-09-07  2:11         ` Lai Jiangshan
@ 2012-09-07 19:37           ` Tejun Heo
  0 siblings, 0 replies; 25+ messages in thread
From: Tejun Heo @ 2012-09-07 19:37 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello, Lai.

On Fri, Sep 07, 2012 at 10:11:56AM +0800, Lai Jiangshan wrote:
> When I read "gcwq->flags & GCWQ_DISASSOCIATED" in create_worker, I
> thought: WTF, gcwq->flags can be change by other, is it
> correct?. When I saw the comments claim it is correct, I have to use
> about 30 mins to check whether it is correct in several places of
> code in workqueue.c(include check does flags has internal state in
> all gcwq->lock).  I'm not good on it, but I think there are some
> reviewers will be confused like me.

The head-scratchy part there is not where it's tested - under
manager_mutex or gcwq->lock.  It's that DISASSOCIATED has an
additional locking restriction while still living in gcwq->flags.
Simply moving DISASSOCIATED test inside gcwq->lock doesn't change
anything including readability.  You still have to verify the flag
changes according to the said locking requirement.

Separating DISASSOCIATED into a separate variable could help it but I
really don't see much point in doing that.  I don't know why it took
you 30 minutes to figure out when it's clearly stated in the constant
definition and there are only two places which modify the flag - one
setting and the other clearing.  Were you worrying about the flag
flipping while other bits are modified?  But even then, there's only
one other flag - GCWQ_FREEZING which again is set and cleared once in
the whole code.

So, either you were hopelessly confused for some reason or I'm missing
something.  If the latter, please enlighten me; otherwise, I actually
like putting the test outside gcwq->lock.  That makes it explicit that
the flag shouldn't be changing outside manager_mutex, which is the
necessary locking requirement.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2012-09-07 19:38 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-05 10:37 [PATCH 00/11 V5] workqueue: reimplement unbind/rebind Lai Jiangshan
2012-09-05 10:37 ` [PATCH 01/11 V5] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
2012-09-05 10:37 ` [PATCH 02/11 V5] workqueue: async idle rebinding Lai Jiangshan
2012-09-05 18:06   ` Tejun Heo
2012-09-06  1:28     ` Lai Jiangshan
2012-09-05 10:37 ` [PATCH 03/11 V5] workqueue: new day don't need WORKER_REBIND for busy rebinding Lai Jiangshan
2012-09-05 18:31   ` Tejun Heo
2012-09-06  2:10     ` Lai Jiangshan
2012-09-05 10:37 ` [PATCH 04/11 V5] workqueue: remove WORKER_REBIND Lai Jiangshan
2012-09-05 10:37 ` [PATCH 05/11 V5] workqueue: Add @bind arguement back without change any thing Lai Jiangshan
2012-09-05 19:49   ` Tejun Heo
2012-09-06  1:04     ` Lai Jiangshan
2012-09-06 16:51       ` Tejun Heo
2012-09-07  2:11         ` Lai Jiangshan
2012-09-07 19:37           ` Tejun Heo
2012-09-05 10:37 ` [PATCH 06/11 V5] workqueue: unbind manager Lai Jiangshan
2012-09-05 10:37 ` [PATCH 07/11 V5] workqueue: rebind manager Lai Jiangshan
2012-09-05 10:37 ` [PATCH 08/11 V5] workqueue: unbind newly created worker Lai Jiangshan
2012-09-05 10:37 ` [PATCH 09/11 V5] workqueue: rebind " Lai Jiangshan
2012-09-05 16:19   ` Lai Jiangshan
2012-09-05 10:37 ` [PATCH 10/11 V5] workqueue: unbind/rebind without manager_mutex Lai Jiangshan
2012-09-05 20:04   ` Tejun Heo
2012-09-06 10:44     ` Lai Jiangshan
2012-09-06 17:00       ` Tejun Heo
2012-09-05 10:37 ` [PATCH 11/11 V5] workqueue: remove manager_mutex Lai Jiangshan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).