linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7 V6] workqueue: fix hoplug things
@ 2012-09-08 17:12 Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 1/7 V6] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
                   ` (7 more replies)
  0 siblings, 8 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

The patch set is based on 3b07e9ca26866697616097044f25fbe53dbab693 of wq.git

Patch 1,2 are accepted. Patch 1 goes to 3.6. tj has a replacement goes
to 3.6 instead of Patch 2. so Patch2 will go to 3.7. Patch2 will need
to be rebased if the replacement is still in 3.7.
(tj, could you help me do the rebase if I don't need to respin the patchset
as V7 ?)

Patch3,4 fix depletion problem, it is simple enough. it goes to 3.6.

Patch 5,6,7 are clean up. -> 3.7


Lai Jiangshan (7):
  workqueue: ensure the wq_worker_sleeping() see the right flags
  workqueue: async idle rebinding
  workqueue:  add manager pointer for worker_pool
  workqueue: fix idle worker depletion
  workqueue: rename manager_mutex to assoc_mutex
  workqueue: new day don't need WORKER_REBIND for busy rebinding
  workqueue: remove WORKER_REBIND

 kernel/workqueue.c |  195 +++++++++++++++++++++++-----------------------------
 1 files changed, 85 insertions(+), 110 deletions(-)

-- 
1.7.4.4


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/7 V6] workqueue: ensure the wq_worker_sleeping() see the right flags
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
@ 2012-09-08 17:12 ` Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 2/7 V6] workqueue: async idle rebinding Lai Jiangshan
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

The compiler may compile this code into TWO write/modify instructions.
		worker->flags &= ~WORKER_UNBOUND;
		worker->flags |= WORKER_REBIND;

so the other CPU may see the temporary of worker->flags which has
not WORKER_UNBOUND nor WORKER_REBIND, it will wrongly do local wake up.

so we use one write/modify instruction explicitly instead.

This bug will not occur on idle workers, because they have another
WORKER_NOT_RUNNING flags.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 85bd340..050b2a5 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1739,10 +1739,13 @@ retry:
 	for_each_busy_worker(worker, i, pos, gcwq) {
 		struct work_struct *rebind_work = &worker->rebind_work;
 		struct workqueue_struct *wq;
+		unsigned long worker_flags = worker->flags;
 
 		/* morph UNBOUND to REBIND */
-		worker->flags &= ~WORKER_UNBOUND;
-		worker->flags |= WORKER_REBIND;
+		worker_flags &= ~WORKER_UNBOUND;
+		worker_flags |= WORKER_REBIND;
+		/* ensure the wq_worker_sleeping() see the right flags */
+		ACCESS_ONCE(worker->flags) = worker_flags;
 
 		if (test_and_set_bit(WORK_STRUCT_PENDING_BIT,
 				     work_data_bits(rebind_work)))
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/7 V6] workqueue: async idle rebinding
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 1/7 V6] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
@ 2012-09-08 17:12 ` Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 3/7 V6] workqueue: add manager pointer for worker_pool Lai Jiangshan
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

fix deadlock in rebind_workers()

Current idle_worker_rebind() has a bug.

idle_worker_rebind() path			HOTPLUG path
						online
							rebind_workers()
wait_event(gcwq->rebind_hold)
	woken up but no scheduled			rebind_workers() returns
						the same cpu offline
						the same cpu online again
							rebind_workers()
								set WORKER_REBIND
	scheduled,see the WORKER_REBIND
	wait rebind_workers() clear it    <--bug-->		wait idle_worker_rebind()
								rebound.

The two thread wait each other. It is bug.

This patch gives up to rebind idles synchronously. make them async instead.

To avoid to wrongly do local-wake-up, we add a exile-operation for
idle-worker-rebinding. When a idle worker is exiled, it will not queued
on @idle_list until it is rebound.

After we have exile-operation, the @nr_idle is not only the count of @idle_list,
but also exiled idle workers. so I check all the code, and make
them exile-operation aware. (too_many_workers())

exile-operation is also the core idea to rebind newly created worker.
(patch 9)

rebind_workers() become single pass and don't release gcwq->lock.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |  100 +++++++++++++---------------------------------------
 1 files changed, 25 insertions(+), 75 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 050b2a5..3dd7ce2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -125,7 +125,6 @@ enum {
 
 struct global_cwq;
 struct worker_pool;
-struct idle_rebind;
 
 /*
  * The poor guys doing the actual heavy lifting.  All on-duty workers
@@ -149,7 +148,6 @@ struct worker {
 	int			id;		/* I: worker id */
 
 	/* for rebinding worker to CPU */
-	struct idle_rebind	*idle_rebind;	/* L: for idle worker */
 	struct work_struct	rebind_work;	/* L: for busy worker */
 };
 
@@ -159,7 +157,9 @@ struct worker_pool {
 
 	struct list_head	worklist;	/* L: list of pending works */
 	int			nr_workers;	/* L: total number of workers */
-	int			nr_idle;	/* L: currently idle ones */
+	int			nr_idle;	/* L: currently idle ones,
+						      include ones in idle_list
+						      and in doing rebind. */
 
 	struct list_head	idle_list;	/* X: list of idle workers */
 	struct timer_list	idle_timer;	/* L: worker idle timeout */
@@ -185,8 +185,6 @@ struct global_cwq {
 
 	struct worker_pool	pools[NR_WORKER_POOLS];
 						/* normal and highpri pools */
-
-	wait_queue_head_t	rebind_hold;	/* rebind hold wait */
 } ____cacheline_aligned_in_smp;
 
 /*
@@ -686,6 +684,10 @@ static bool too_many_workers(struct worker_pool *pool)
 	int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
 	int nr_busy = pool->nr_workers - nr_idle;
 
+	/* Is any idle home ? */
+	if (unlikely(list_empty(&pool->idle_list)))
+		return false;
+
 	return nr_idle > 2 && (nr_idle - 2) * MAX_IDLE_WORKERS_RATIO >= nr_busy;
 }
 
@@ -1608,28 +1610,20 @@ __acquires(&gcwq->lock)
 	}
 }
 
-struct idle_rebind {
-	int			cnt;		/* # workers to be rebound */
-	struct completion	done;		/* all workers rebound */
-};
-
 /*
- * Rebind an idle @worker to its CPU.  During CPU onlining, this has to
- * happen synchronously for idle workers.  worker_thread() will test
+ * Rebind an idle @worker to its CPU. worker_thread() will test
  * %WORKER_REBIND before leaving idle and call this function.
  */
 static void idle_worker_rebind(struct worker *worker)
 {
 	struct global_cwq *gcwq = worker->pool->gcwq;
 
-	/* CPU must be online at this point */
-	WARN_ON(!worker_maybe_bind_and_lock(worker));
-	if (!--worker->idle_rebind->cnt)
-		complete(&worker->idle_rebind->done);
-	spin_unlock_irq(&worker->pool->gcwq->lock);
+	if (worker_maybe_bind_and_lock(worker))
+		worker_clr_flags(worker, WORKER_UNBOUND);
 
-	/* we did our part, wait for rebind_workers() to finish up */
-	wait_event(gcwq->rebind_hold, !(worker->flags & WORKER_REBIND));
+	worker_clr_flags(worker, WORKER_REBIND);
+	list_add(&worker->entry, &worker->pool->idle_list);
+	spin_unlock_irq(&gcwq->lock);
 }
 
 /*
@@ -1656,29 +1650,22 @@ static void busy_worker_rebind_fn(struct work_struct *work)
  * @gcwq->cpu is coming online.  Rebind all workers to the CPU.  Rebinding
  * is different for idle and busy ones.
  *
- * The idle ones should be rebound synchronously and idle rebinding should
- * be complete before any worker starts executing work items with
- * concurrency management enabled; otherwise, scheduler may oops trying to
- * wake up non-local idle worker from wq_worker_sleeping().
- *
- * This is achieved by repeatedly requesting rebinding until all idle
- * workers are known to have been rebound under @gcwq->lock and holding all
- * idle workers from becoming busy until idle rebinding is complete.
+ * The idle ones will be kicked out of the idle_list, and it will
+ * add itself back when it finish to rebind.
  *
- * Once idle workers are rebound, busy workers can be rebound as they
- * finish executing their current work items.  Queueing the rebind work at
- * the head of their scheduled lists is enough.  Note that nr_running will
+ * The busy workers can be rebound as they finish executing
+ * their current work items.  Queueing the rebind work at the head of
+ * their scheduled lists is enough.  Note that nr_running will
  * be properbly bumped as busy workers rebind.
  *
- * On return, all workers are guaranteed to either be bound or have rebind
- * work item scheduled.
+ * On return, all workers are guaranteed to be rebound when they
+ * add themselves back to idle_list if the gcwq is still associated.
  */
 static void rebind_workers(struct global_cwq *gcwq)
 	__releases(&gcwq->lock) __acquires(&gcwq->lock)
 {
-	struct idle_rebind idle_rebind;
 	struct worker_pool *pool;
-	struct worker *worker;
+	struct worker *worker, *n;
 	struct hlist_node *pos;
 	int i;
 
@@ -1687,54 +1674,19 @@ static void rebind_workers(struct global_cwq *gcwq)
 	for_each_worker_pool(pool, gcwq)
 		lockdep_assert_held(&pool->manager_mutex);
 
-	/*
-	 * Rebind idle workers.  Interlocked both ways.  We wait for
-	 * workers to rebind via @idle_rebind.done.  Workers will wait for
-	 * us to finish up by watching %WORKER_REBIND.
-	 */
-	init_completion(&idle_rebind.done);
-retry:
-	idle_rebind.cnt = 1;
-	INIT_COMPLETION(idle_rebind.done);
-
-	/* set REBIND and kick idle ones, we'll wait for these later */
+	/* set REBIND and kick idle ones */
 	for_each_worker_pool(pool, gcwq) {
-		list_for_each_entry(worker, &pool->idle_list, entry) {
-			if (worker->flags & WORKER_REBIND)
-				continue;
-
-			/* morph UNBOUND to REBIND */
-			worker->flags &= ~WORKER_UNBOUND;
+		list_for_each_entry_safe(worker, n, &pool->idle_list, entry) {
 			worker->flags |= WORKER_REBIND;
 
-			idle_rebind.cnt++;
-			worker->idle_rebind = &idle_rebind;
+			/* exile idle workers */
+			list_del_init(&worker->entry);
 
 			/* worker_thread() will call idle_worker_rebind() */
 			wake_up_process(worker->task);
 		}
 	}
 
-	if (--idle_rebind.cnt) {
-		spin_unlock_irq(&gcwq->lock);
-		wait_for_completion(&idle_rebind.done);
-		spin_lock_irq(&gcwq->lock);
-		/* busy ones might have become idle while waiting, retry */
-		goto retry;
-	}
-
-	/*
-	 * All idle workers are rebound and waiting for %WORKER_REBIND to
-	 * be cleared inside idle_worker_rebind().  Clear and release.
-	 * Clearing %WORKER_REBIND from this foreign context is safe
-	 * because these workers are still guaranteed to be idle.
-	 */
-	for_each_worker_pool(pool, gcwq)
-		list_for_each_entry(worker, &pool->idle_list, entry)
-			worker->flags &= ~WORKER_REBIND;
-
-	wake_up_all(&gcwq->rebind_hold);
-
 	/* rebind busy workers */
 	for_each_busy_worker(worker, i, pos, gcwq) {
 		struct work_struct *rebind_work = &worker->rebind_work;
@@ -3811,8 +3763,6 @@ static int __init init_workqueues(void)
 			mutex_init(&pool->manager_mutex);
 			ida_init(&pool->worker_ida);
 		}
-
-		init_waitqueue_head(&gcwq->rebind_hold);
 	}
 
 	/* create the initial worker */
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/7 V6] workqueue:  add manager pointer for worker_pool
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 1/7 V6] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 2/7 V6] workqueue: async idle rebinding Lai Jiangshan
@ 2012-09-08 17:12 ` Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 4/7 V6] workqueue: fix idle worker depletion Lai Jiangshan
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

We have this plan for manage_workers(): if failed to grab
manager_mutex via mutex_trylock(), we will release gcwq->lock and then
grab manager_mutex again.

This plan will open a hole: hotplug is running after we release gcwq->lock,
and it will not handle the binding of manager. so we add ->manager
on worker_pool and let hotplug code(gcwq_unbind_fn()) handle it.

also fix too_many_workers() to use this pointer.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3dd7ce2..b203806 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -165,6 +165,7 @@ struct worker_pool {
 	struct timer_list	idle_timer;	/* L: worker idle timeout */
 	struct timer_list	mayday_timer;	/* L: SOS timer for workers */
 
+	struct worker		*manager;	/* L: manager worker */
 	struct mutex		manager_mutex;	/* mutex manager should hold */
 	struct ida		worker_ida;	/* L: for worker IDs */
 };
@@ -680,7 +681,7 @@ static bool need_to_manage_workers(struct worker_pool *pool)
 /* Do we have too many workers and should some go away? */
 static bool too_many_workers(struct worker_pool *pool)
 {
-	bool managing = mutex_is_locked(&pool->manager_mutex);
+	bool managing = !!pool->manager;
 	int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
 	int nr_busy = pool->nr_workers - nr_idle;
 
@@ -2066,6 +2067,7 @@ static bool manage_workers(struct worker *worker)
 	if (!mutex_trylock(&pool->manager_mutex))
 		return ret;
 
+	pool->manager = worker;
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
@@ -2076,6 +2078,8 @@ static bool manage_workers(struct worker *worker)
 	ret |= maybe_create_worker(pool);
 
 	mutex_unlock(&pool->manager_mutex);
+	pool->manager = NULL;
+
 	return ret;
 }
 
@@ -3438,9 +3442,12 @@ static void gcwq_unbind_fn(struct work_struct *work)
 	 * ones which are still executing works from before the last CPU
 	 * down must be on the cpu.  After this, they may become diasporas.
 	 */
-	for_each_worker_pool(pool, gcwq)
+	for_each_worker_pool(pool, gcwq) {
 		list_for_each_entry(worker, &pool->idle_list, entry)
 			worker->flags |= WORKER_UNBOUND;
+		if (pool->manager)
+			pool->manager->flags |= WORKER_UNBOUND;
+	}
 
 	for_each_busy_worker(worker, i, pos, gcwq)
 		worker->flags |= WORKER_UNBOUND;
@@ -3760,6 +3767,7 @@ static int __init init_workqueues(void)
 			setup_timer(&pool->mayday_timer, gcwq_mayday_timeout,
 				    (unsigned long)pool);
 
+			pool->manager = NULL;
 			mutex_init(&pool->manager_mutex);
 			ida_init(&pool->worker_ida);
 		}
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
                   ` (2 preceding siblings ...)
  2012-09-08 17:12 ` [PATCH 3/7 V6] workqueue: add manager pointer for worker_pool Lai Jiangshan
@ 2012-09-08 17:12 ` Lai Jiangshan
  2012-09-08 17:40   ` Tejun Heo
  2012-09-08 17:12 ` [PATCH 5/7 V6] workqueue: rename manager_mutex to assoc_mutex Lai Jiangshan
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

If hotplug code grabbed the manager_mutex and worker_thread try to create
a worker, the manage_worker() will return false and worker_thread go to
process work items. Now, on the CPU, all workers are processing work items,
no idle_worker left/ready for managing. It breaks the concept of workqueue
and it is bug.

So when manage_worker() failed to grab the manager_mutex, it should
release gcwq->lock and try again.

After gcwq->lock is released, hotplug can happen. gcwq_unbind_fn() will
do the right thing for manager via ->manager. But rebind_workers()
can't rebind workers directly, worker rebind itself when it is noticed.

Manager worker will be noticed by the bit of GCWQ_DISASSOCIATED and
WORKER_UNBIND. Because the %UNBOUND bit of manager can't be cleared
while it is managing workers. maybe_rebind_manager() will be noticed
when rebind_workers() happens.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   33 ++++++++++++++++++++++++++++++++-
 1 files changed, 32 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b203806..207b6a1 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2039,6 +2039,20 @@ static bool maybe_destroy_workers(struct worker_pool *pool)
 	return ret;
 }
 
+/* does the manager need to be rebind after we just release gcwq->lock */
+static void maybe_rebind_manager(struct worker *manager)
+{
+	struct global_cwq *gcwq = manager->pool->gcwq;
+	bool assoc = !(gcwq->flags & GCWQ_DISASSOCIATED);
+
+	if (assoc && (manager->flags & WORKER_UNBOUND)) {
+		spin_unlock_irq(&gcwq->lock);
+
+		if (worker_maybe_bind_and_lock(manager))
+			worker_clr_flags(manager, WORKER_UNBOUND);
+	}
+}
+
 /**
  * manage_workers - manage worker pool
  * @worker: self
@@ -2062,12 +2076,29 @@ static bool maybe_destroy_workers(struct worker_pool *pool)
 static bool manage_workers(struct worker *worker)
 {
 	struct worker_pool *pool = worker->pool;
+	struct global_cwq *gcwq = pool->gcwq;
 	bool ret = false;
 
-	if (!mutex_trylock(&pool->manager_mutex))
+	if (pool->manager)
 		return ret;
 
 	pool->manager = worker;
+	if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
+		/*
+		 * Ouch! rebind_workers() or gcwq_unbind_fn() beats we.
+		 * it can't return false here, otherwise it will lead to
+		 * worker depletion. So we release gcwq->lock and then
+		 * grab manager_mutex again.
+		 */
+		spin_unlock_irq(&gcwq->lock);
+		mutex_lock(&pool->manager_mutex);
+		spin_lock_irq(&gcwq->lock);
+
+		/* rebind_workers() can happen when we release gcwq->lock */
+		maybe_rebind_manager(worker);
+		ret = true;
+	}
+
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 5/7 V6] workqueue: rename manager_mutex to assoc_mutex
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
                   ` (3 preceding siblings ...)
  2012-09-08 17:12 ` [PATCH 4/7 V6] workqueue: fix idle worker depletion Lai Jiangshan
@ 2012-09-08 17:12 ` Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 6/7 V6] workqueue: new day don't need WORKER_REBIND for busy rebinding Lai Jiangshan
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

assoc_mutex is clear, it protects GCWQ_DISASSOCIATED.

And the C.S. of assoc_mutex is narrowed, it protects
create_worker()+start_worker() which require
GCWQ_DISASSOCIATED stable. don't need to protects
the whole manage_workers().

A result of narrowed C.S. maybe_rebind_manager()
has to be moved to the bottom of manage_workers().

Other result of narrowed C.S. manager_workers() becomes
simpler.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   63 ++++++++++++++++++++++++----------------------------
 1 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 207b6a1..d9765c4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -58,7 +58,7 @@ enum {
 	 * be executing on any CPU.  The gcwq behaves as an unbound one.
 	 *
 	 * Note that DISASSOCIATED can be flipped only while holding
-	 * managership of all pools on the gcwq to avoid changing binding
+	 * assoc_mutex of all pools on the gcwq to avoid changing binding
 	 * state while create_worker() is in progress.
 	 */
 	GCWQ_DISASSOCIATED	= 1 << 0,	/* cpu can't serve workers */
@@ -166,7 +166,7 @@ struct worker_pool {
 	struct timer_list	mayday_timer;	/* L: SOS timer for workers */
 
 	struct worker		*manager;	/* L: manager worker */
-	struct mutex		manager_mutex;	/* mutex manager should hold */
+	struct mutex		assoc_mutex;	/* protect GCWQ_DISASSOCIATED */
 	struct ida		worker_ida;	/* L: for worker IDs */
 };
 
@@ -1673,7 +1673,7 @@ static void rebind_workers(struct global_cwq *gcwq)
 	lockdep_assert_held(&gcwq->lock);
 
 	for_each_worker_pool(pool, gcwq)
-		lockdep_assert_held(&pool->manager_mutex);
+		lockdep_assert_held(&pool->assoc_mutex);
 
 	/* set REBIND and kick idle ones */
 	for_each_worker_pool(pool, gcwq) {
@@ -1975,15 +1975,18 @@ restart:
 	while (true) {
 		struct worker *worker;
 
+		mutex_lock(&pool->assoc_mutex);
 		worker = create_worker(pool);
 		if (worker) {
 			del_timer_sync(&pool->mayday_timer);
 			spin_lock_irq(&gcwq->lock);
 			start_worker(worker);
 			BUG_ON(need_to_create_worker(pool));
+			mutex_unlock(&pool->assoc_mutex);
 			return true;
 		}
 
+		mutex_unlock(&pool->assoc_mutex);
 		if (!need_to_create_worker(pool))
 			break;
 
@@ -2040,7 +2043,7 @@ static bool maybe_destroy_workers(struct worker_pool *pool)
 }
 
 /* does the manager need to be rebind after we just release gcwq->lock */
-static void maybe_rebind_manager(struct worker *manager)
+static bool maybe_rebind_manager(struct worker *manager)
 {
 	struct global_cwq *gcwq = manager->pool->gcwq;
 	bool assoc = !(gcwq->flags & GCWQ_DISASSOCIATED);
@@ -2050,7 +2053,11 @@ static void maybe_rebind_manager(struct worker *manager)
 
 		if (worker_maybe_bind_and_lock(manager))
 			worker_clr_flags(manager, WORKER_UNBOUND);
+
+		return true;
 	}
+
+	return false;
 }
 
 /**
@@ -2061,9 +2068,7 @@ static void maybe_rebind_manager(struct worker *manager)
  * to.  At any given time, there can be only zero or one manager per
  * gcwq.  The exclusion is handled automatically by this function.
  *
- * The caller can safely start processing works on false return.  On
- * true return, it's guaranteed that need_to_create_worker() is false
- * and may_start_working() is true.
+ * The caller can safely start processing works on false return.
  *
  * CONTEXT:
  * spin_lock_irq(gcwq->lock) which may be released and regrabbed
@@ -2076,29 +2081,12 @@ static void maybe_rebind_manager(struct worker *manager)
 static bool manage_workers(struct worker *worker)
 {
 	struct worker_pool *pool = worker->pool;
-	struct global_cwq *gcwq = pool->gcwq;
 	bool ret = false;
 
 	if (pool->manager)
 		return ret;
 
 	pool->manager = worker;
-	if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
-		/*
-		 * Ouch! rebind_workers() or gcwq_unbind_fn() beats we.
-		 * it can't return false here, otherwise it will lead to
-		 * worker depletion. So we release gcwq->lock and then
-		 * grab manager_mutex again.
-		 */
-		spin_unlock_irq(&gcwq->lock);
-		mutex_lock(&pool->manager_mutex);
-		spin_lock_irq(&gcwq->lock);
-
-		/* rebind_workers() can happen when we release gcwq->lock */
-		maybe_rebind_manager(worker);
-		ret = true;
-	}
-
 	pool->flags &= ~POOL_MANAGE_WORKERS;
 
 	/*
@@ -2108,7 +2096,14 @@ static bool manage_workers(struct worker *worker)
 	ret |= maybe_destroy_workers(pool);
 	ret |= maybe_create_worker(pool);
 
-	mutex_unlock(&pool->manager_mutex);
+	/*
+	 * rebind_workers() can happen while it does manage, so it has to
+	 * check and do rebind by itself.
+	 * Before rebind, it's guaranteed that need_to_create_worker() is
+	 * false and may_start_working() is true here.
+	 */
+	ret |= maybe_rebind_manager(worker);
+
 	pool->manager = NULL;
 
 	return ret;
@@ -3436,23 +3431,23 @@ EXPORT_SYMBOL_GPL(work_busy);
  */
 
 /* claim manager positions of all pools */
-static void gcwq_claim_management_and_lock(struct global_cwq *gcwq)
+static void gcwq_claim_assoc_and_lock(struct global_cwq *gcwq)
 {
 	struct worker_pool *pool;
 
 	for_each_worker_pool(pool, gcwq)
-		mutex_lock_nested(&pool->manager_mutex, pool - gcwq->pools);
+		mutex_lock_nested(&pool->assoc_mutex, pool - gcwq->pools);
 	spin_lock_irq(&gcwq->lock);
 }
 
 /* release manager positions */
-static void gcwq_release_management_and_unlock(struct global_cwq *gcwq)
+static void gcwq_release_assoc_and_unlock(struct global_cwq *gcwq)
 {
 	struct worker_pool *pool;
 
 	spin_unlock_irq(&gcwq->lock);
 	for_each_worker_pool(pool, gcwq)
-		mutex_unlock(&pool->manager_mutex);
+		mutex_unlock(&pool->assoc_mutex);
 }
 
 static void gcwq_unbind_fn(struct work_struct *work)
@@ -3465,7 +3460,7 @@ static void gcwq_unbind_fn(struct work_struct *work)
 
 	BUG_ON(gcwq->cpu != smp_processor_id());
 
-	gcwq_claim_management_and_lock(gcwq);
+	gcwq_claim_assoc_and_lock(gcwq);
 
 	/*
 	 * We've claimed all manager positions.  Make all workers unbound
@@ -3485,7 +3480,7 @@ static void gcwq_unbind_fn(struct work_struct *work)
 
 	gcwq->flags |= GCWQ_DISASSOCIATED;
 
-	gcwq_release_management_and_unlock(gcwq);
+	gcwq_release_assoc_and_unlock(gcwq);
 
 	/*
 	 * Call schedule() so that we cross rq->lock and thus can guarantee
@@ -3541,10 +3536,10 @@ static int __devinit workqueue_cpu_up_callback(struct notifier_block *nfb,
 
 	case CPU_DOWN_FAILED:
 	case CPU_ONLINE:
-		gcwq_claim_management_and_lock(gcwq);
+		gcwq_claim_assoc_and_lock(gcwq);
 		gcwq->flags &= ~GCWQ_DISASSOCIATED;
 		rebind_workers(gcwq);
-		gcwq_release_management_and_unlock(gcwq);
+		gcwq_release_assoc_and_unlock(gcwq);
 		break;
 	}
 	return NOTIFY_OK;
@@ -3799,7 +3794,7 @@ static int __init init_workqueues(void)
 				    (unsigned long)pool);
 
 			pool->manager = NULL;
-			mutex_init(&pool->manager_mutex);
+			mutex_init(&pool->assoc_mutex);
 			ida_init(&pool->worker_ida);
 		}
 	}
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 6/7 V6] workqueue: new day don't need WORKER_REBIND for busy rebinding
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
                   ` (4 preceding siblings ...)
  2012-09-08 17:12 ` [PATCH 5/7 V6] workqueue: rename manager_mutex to assoc_mutex Lai Jiangshan
@ 2012-09-08 17:12 ` Lai Jiangshan
  2012-09-08 17:12 ` [PATCH 7/7 V6] workqueue: remove WORKER_REBIND Lai Jiangshan
  2012-09-08 17:27 ` [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
  7 siblings, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

because old busy_worker_rebind_fn() have to wait until all idle worker finish.
so we have to use two flags WORKER_UNBOUND and WORKER_REBIND to avoid
prematurely clear all NOT_RUNNING bit when highly frequent offline/online.

but current code don't need to wait idle workers. so we don't need to
use two flags, just one is enough. remove WORKER_REBIND from busy rebinding.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |    9 +--------
 1 files changed, 1 insertions(+), 8 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index d9765c4..4863162 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1639,7 +1639,7 @@ static void busy_worker_rebind_fn(struct work_struct *work)
 	struct global_cwq *gcwq = worker->pool->gcwq;
 
 	if (worker_maybe_bind_and_lock(worker))
-		worker_clr_flags(worker, WORKER_REBIND);
+		worker_clr_flags(worker, WORKER_UNBOUND);
 
 	spin_unlock_irq(&gcwq->lock);
 }
@@ -1692,13 +1692,6 @@ static void rebind_workers(struct global_cwq *gcwq)
 	for_each_busy_worker(worker, i, pos, gcwq) {
 		struct work_struct *rebind_work = &worker->rebind_work;
 		struct workqueue_struct *wq;
-		unsigned long worker_flags = worker->flags;
-
-		/* morph UNBOUND to REBIND */
-		worker_flags &= ~WORKER_UNBOUND;
-		worker_flags |= WORKER_REBIND;
-		/* ensure the wq_worker_sleeping() see the right flags */
-		ACCESS_ONCE(worker->flags) = worker_flags;
 
 		if (test_and_set_bit(WORK_STRUCT_PENDING_BIT,
 				     work_data_bits(rebind_work)))
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 7/7 V6] workqueue: remove WORKER_REBIND
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
                   ` (5 preceding siblings ...)
  2012-09-08 17:12 ` [PATCH 6/7 V6] workqueue: new day don't need WORKER_REBIND for busy rebinding Lai Jiangshan
@ 2012-09-08 17:12 ` Lai Jiangshan
  2012-09-08 17:27 ` [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
  7 siblings, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:12 UTC (permalink / raw)
  To: Tejun Heo, linux-kernel; +Cc: Lai Jiangshan

exile operation       = list_del_init(&worker->entry).
and destory operation -> list_del_init(&worker->entry).

so we can use list_empty(&worker->entry) to know: does the worker
has been exiled or killed.

WORKER_REBIND is not need any more, remove it to reduce the states
of workers.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/workqueue.c |   21 ++++++++-------------
 1 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4863162..aa46308 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -72,11 +72,10 @@ enum {
 	WORKER_DIE		= 1 << 1,	/* die die die */
 	WORKER_IDLE		= 1 << 2,	/* is idle */
 	WORKER_PREP		= 1 << 3,	/* preparing to run works */
-	WORKER_REBIND		= 1 << 5,	/* mom is home, come back */
 	WORKER_CPU_INTENSIVE	= 1 << 6,	/* cpu intensive */
 	WORKER_UNBOUND		= 1 << 7,	/* worker is unbound */
 
-	WORKER_NOT_RUNNING	= WORKER_PREP | WORKER_REBIND | WORKER_UNBOUND |
+	WORKER_NOT_RUNNING	= WORKER_PREP | WORKER_UNBOUND |
 				  WORKER_CPU_INTENSIVE,
 
 	NR_WORKER_POOLS		= 2,		/* # worker pools per gcwq */
@@ -1613,7 +1612,7 @@ __acquires(&gcwq->lock)
 
 /*
  * Rebind an idle @worker to its CPU. worker_thread() will test
- * %WORKER_REBIND before leaving idle and call this function.
+ * worker->entry before leaving idle and call this function.
  */
 static void idle_worker_rebind(struct worker *worker)
 {
@@ -1622,7 +1621,6 @@ static void idle_worker_rebind(struct worker *worker)
 	if (worker_maybe_bind_and_lock(worker))
 		worker_clr_flags(worker, WORKER_UNBOUND);
 
-	worker_clr_flags(worker, WORKER_REBIND);
 	list_add(&worker->entry, &worker->pool->idle_list);
 	spin_unlock_irq(&gcwq->lock);
 }
@@ -1675,11 +1673,9 @@ static void rebind_workers(struct global_cwq *gcwq)
 	for_each_worker_pool(pool, gcwq)
 		lockdep_assert_held(&pool->assoc_mutex);
 
-	/* set REBIND and kick idle ones */
+	/* exile and kick idle ones */
 	for_each_worker_pool(pool, gcwq) {
 		list_for_each_entry_safe(worker, n, &pool->idle_list, entry) {
-			worker->flags |= WORKER_REBIND;
-
 			/* exile idle workers */
 			list_del_init(&worker->entry);
 
@@ -2145,7 +2141,7 @@ __acquires(&gcwq->lock)
 	 * necessary to avoid spurious warnings from rescuers servicing the
 	 * unbound or a disassociated gcwq.
 	 */
-	WARN_ON_ONCE(!(worker->flags & (WORKER_UNBOUND | WORKER_REBIND)) &&
+	WARN_ON_ONCE(!(worker->flags & WORKER_UNBOUND) &&
 		     !(gcwq->flags & GCWQ_DISASSOCIATED) &&
 		     raw_smp_processor_id() != gcwq->cpu);
 
@@ -2269,18 +2265,17 @@ static int worker_thread(void *__worker)
 woke_up:
 	spin_lock_irq(&gcwq->lock);
 
-	/*
-	 * DIE can be set only while idle and REBIND set while busy has
-	 * @worker->rebind_work scheduled.  Checking here is enough.
-	 */
-	if (unlikely(worker->flags & (WORKER_REBIND | WORKER_DIE))) {
+	/* Is it still home ? */
+	if (unlikely(list_empty(&worker->entry))) {
 		spin_unlock_irq(&gcwq->lock);
 
+		/* reason: DIE */
 		if (worker->flags & WORKER_DIE) {
 			worker->task->flags &= ~PF_WQ_WORKER;
 			return 0;
 		}
 
+		/* reason: idle rebind */
 		idle_worker_rebind(worker);
 		goto woke_up;
 	}
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
                   ` (6 preceding siblings ...)
  2012-09-08 17:12 ` [PATCH 7/7 V6] workqueue: remove WORKER_REBIND Lai Jiangshan
@ 2012-09-08 17:27 ` Lai Jiangshan
  2012-09-08 17:34   ` Lai Jiangshan
  2012-09-08 17:37   ` Tejun Heo
  7 siblings, 2 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:27 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Tejun Heo, linux-kernel

On Sun, Sep 9, 2012 at 1:12 AM, Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
> The patch set is based on 3b07e9ca26866697616097044f25fbe53dbab693 of wq.git
>
> Patch 1,2 are accepted. Patch 1 goes to 3.6. tj has a replacement goes
> to 3.6 instead of Patch 2. so Patch2 will go to 3.7. Patch2 will need
> to be rebased if the replacement is still in 3.7.
> (tj, could you help me do the rebase if I don't need to respin the patchset
> as V7 ?)
>
> Patch3,4 fix depletion problem, it is simple enough. it goes to 3.6.

sorry.
3.6 is synchronous idles when we use tj's replacement for patch2.
and maybe_rebind_manager() don't wait for idles rebind. so it can't go to 3.6.

Choice1: also push Patch 2(async idle rebinding) to 3.6? thus patch 4
can goto 3.6 too.
Choice2: add workaroud and make patch4 and make it go to 3.6. (add some code.)

Thanks.
Lai

>
> Patch 5,6,7 are clean up. -> 3.7
>
>
> Lai Jiangshan (7):
>   workqueue: ensure the wq_worker_sleeping() see the right flags
>   workqueue: async idle rebinding
>   workqueue:  add manager pointer for worker_pool
>   workqueue: fix idle worker depletion
>   workqueue: rename manager_mutex to assoc_mutex
>   workqueue: new day don't need WORKER_REBIND for busy rebinding
>   workqueue: remove WORKER_REBIND
>
>  kernel/workqueue.c |  195 +++++++++++++++++++++++-----------------------------
>  1 files changed, 85 insertions(+), 110 deletions(-)
>
> --
> 1.7.4.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:27 ` [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
@ 2012-09-08 17:34   ` Lai Jiangshan
  2012-09-08 17:37   ` Tejun Heo
  1 sibling, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:34 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Tejun Heo, linux-kernel

On Sun, Sep 9, 2012 at 1:27 AM, Lai Jiangshan <eag0628@gmail.com> wrote:
> On Sun, Sep 9, 2012 at 1:12 AM, Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
>> The patch set is based on 3b07e9ca26866697616097044f25fbe53dbab693 of wq.git
>>
>> Patch 1,2 are accepted. Patch 1 goes to 3.6. tj has a replacement goes
>> to 3.6 instead of Patch 2. so Patch2 will go to 3.7. Patch2 will need
>> to be rebased if the replacement is still in 3.7.
>> (tj, could you help me do the rebase if I don't need to respin the patchset
>> as V7 ?)
>>
>> Patch3,4 fix depletion problem, it is simple enough. it goes to 3.6.
>
> sorry.
> 3.6 is synchronous idles when we use tj's replacement for patch2.
> and maybe_rebind_manager() don't wait for idles rebind. so it can't go to 3.6.
>
> Choice1: also push Patch 2(async idle rebinding) to 3.6? thus patch 4
> can goto 3.6 too.
> Choice2: add workaroud and make patch4 and make it go to 3.6. (add some code.)
>

Sorry again. the above worry is incorrect.
maybe_rebind_manager() **DO** wait for idles rebind via
mutex_lock(manager_lock).
so it is safe to 3.6.

sorry. don't worry anything. I just think it without code.

Thanks
Lai

>
>>
>> Patch 5,6,7 are clean up. -> 3.7
>>
>>
>> Lai Jiangshan (7):
>>   workqueue: ensure the wq_worker_sleeping() see the right flags
>>   workqueue: async idle rebinding
>>   workqueue:  add manager pointer for worker_pool
>>   workqueue: fix idle worker depletion
>>   workqueue: rename manager_mutex to assoc_mutex
>>   workqueue: new day don't need WORKER_REBIND for busy rebinding
>>   workqueue: remove WORKER_REBIND
>>
>>  kernel/workqueue.c |  195 +++++++++++++++++++++++-----------------------------
>>  1 files changed, 85 insertions(+), 110 deletions(-)
>>
>> --
>> 1.7.4.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:27 ` [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
  2012-09-08 17:34   ` Lai Jiangshan
@ 2012-09-08 17:37   ` Tejun Heo
  2012-09-08 17:46     ` Lai Jiangshan
  1 sibling, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 17:37 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

Hello, Lai.

On Sun, Sep 09, 2012 at 01:27:37AM +0800, Lai Jiangshan wrote:
> On Sun, Sep 9, 2012 at 1:12 AM, Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
> > The patch set is based on 3b07e9ca26866697616097044f25fbe53dbab693 of wq.git
> >
> > Patch 1,2 are accepted. Patch 1 goes to 3.6. tj has a replacement goes
> > to 3.6 instead of Patch 2. so Patch2 will go to 3.7. Patch2 will need
> > to be rebased if the replacement is still in 3.7.
> > (tj, could you help me do the rebase if I don't need to respin the patchset
> > as V7 ?)
> >
> > Patch3,4 fix depletion problem, it is simple enough. it goes to 3.6.
> 
> sorry.
> 3.6 is synchronous idles when we use tj's replacement for patch2.
> and maybe_rebind_manager() don't wait for idles rebind. so it can't go to 3.6.

Let's get the fix down first.  I *think* we can do it for 3.6-fixes.
Can't we do the following?

* Instead of MANAGING, add pool->manager.

* Fix the idle depletion bug by using pool->manager for exclusion and
  always grabbing pool->manager_mutex.  Hotplug can use pool->manager
  to schedule rebind work (or UNBIND) to the manager.

Thoughts?

Also, can you please base the fix patches on top of wq/for-3.6-fixes?
It's getting quite confusing.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-3.6-fixes

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 17:12 ` [PATCH 4/7 V6] workqueue: fix idle worker depletion Lai Jiangshan
@ 2012-09-08 17:40   ` Tejun Heo
  2012-09-08 17:50     ` Lai Jiangshan
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 17:40 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel

Hello, Lai.

On Sun, Sep 09, 2012 at 01:12:53AM +0800, Lai Jiangshan wrote:
> +/* does the manager need to be rebind after we just release gcwq->lock */
> +static void maybe_rebind_manager(struct worker *manager)
> +{
> +	struct global_cwq *gcwq = manager->pool->gcwq;
> +	bool assoc = !(gcwq->flags & GCWQ_DISASSOCIATED);
> +
> +	if (assoc && (manager->flags & WORKER_UNBOUND)) {
> +		spin_unlock_irq(&gcwq->lock);
> +
> +		if (worker_maybe_bind_and_lock(manager))
> +			worker_clr_flags(manager, WORKER_UNBOUND);
> +	}
> +}

We can reuse busy_worker_rebind_fn(), right?

>  	pool->manager = worker;
> +	if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
> +		/*
> +		 * Ouch! rebind_workers() or gcwq_unbind_fn() beats we.
> +		 * it can't return false here, otherwise it will lead to
> +		 * worker depletion. So we release gcwq->lock and then
> +		 * grab manager_mutex again.
> +		 */
> +		spin_unlock_irq(&gcwq->lock);
> +		mutex_lock(&pool->manager_mutex);
> +		spin_lock_irq(&gcwq->lock);
> +
> +		/* rebind_workers() can happen when we release gcwq->lock */
> +		maybe_rebind_manager(worker);

And we can call process_scheduled_works() here and make the CPU
hotplug check pool->manager and schedule rebind_work there.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:37   ` Tejun Heo
@ 2012-09-08 17:46     ` Lai Jiangshan
  2012-09-08 17:50       ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 9, 2012 at 1:37 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Lai.
>
> On Sun, Sep 09, 2012 at 01:27:37AM +0800, Lai Jiangshan wrote:
>> On Sun, Sep 9, 2012 at 1:12 AM, Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
>> > The patch set is based on 3b07e9ca26866697616097044f25fbe53dbab693 of wq.git
>> >
>> > Patch 1,2 are accepted. Patch 1 goes to 3.6. tj has a replacement goes
>> > to 3.6 instead of Patch 2. so Patch2 will go to 3.7. Patch2 will need
>> > to be rebased if the replacement is still in 3.7.
>> > (tj, could you help me do the rebase if I don't need to respin the patchset
>> > as V7 ?)
>> >
>> > Patch3,4 fix depletion problem, it is simple enough. it goes to 3.6.
>>
>> sorry.
>> 3.6 is synchronous idles when we use tj's replacement for patch2.
>> and maybe_rebind_manager() don't wait for idles rebind. so it can't go to 3.6.
>
> Let's get the fix down first.  I *think* we can do it for 3.6-fixes.
> Can't we do the following?
>
> * Instead of MANAGING, add pool->manager.
>
> * Fix the idle depletion bug by using pool->manager for exclusion and
>   always grabbing pool->manager_mutex.  Hotplug can use pool->manager
>   to schedule rebind work (or UNBIND) to the manager.
>
> Thoughts?

Don't need.  my worry is wrong.

>
> Also, can you please base the fix patches on top of wq/for-3.6-fixes?
> It's getting quite confusing.
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-3.6-fixes

I base on wq/for-3.7 of several days ago.

I can change the base, but which blanch should patch5,6,7 base on?


Thanks.
Lai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:46     ` Lai Jiangshan
@ 2012-09-08 17:50       ` Tejun Heo
  2012-09-08 17:51         ` Tejun Heo
  2012-09-08 17:56         ` Lai Jiangshan
  0 siblings, 2 replies; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 17:50 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

Hello, Lai.

On Sun, Sep 09, 2012 at 01:46:59AM +0800, Lai Jiangshan wrote:
> > * Instead of MANAGING, add pool->manager.
> >
> > * Fix the idle depletion bug by using pool->manager for exclusion and
> >   always grabbing pool->manager_mutex.  Hotplug can use pool->manager
> >   to schedule rebind work (or UNBIND) to the manager.
> >
> > Thoughts?
> 
> Don't need.  my worry is wrong.

So, your worry was incorrect and the above is what we're gonna do, no?

> > Also, can you please base the fix patches on top of wq/for-3.6-fixes?
> > It's getting quite confusing.
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-3.6-fixes
> 
> I base on wq/for-3.7 of several days ago.
> 
> I can change the base, but which blanch should patch5,6,7 base on?

Please create two patches, first introducing pool->manager_mutex and
second fixing the depletion bug by making manager always grab
manager_mutex on top of wq/for-3.6-fixes.  For the rest of the series,
please wait for me to merge the for-3.6-fixes into for-3.7 and base on
top of them and *please* stop sending restructuring patches before the
fixes are settled.  It's wasting time for both of us.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 17:40   ` Tejun Heo
@ 2012-09-08 17:50     ` Lai Jiangshan
  2012-09-08 17:53       ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:50 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 9, 2012 at 1:40 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Lai.
>
> On Sun, Sep 09, 2012 at 01:12:53AM +0800, Lai Jiangshan wrote:
>> +/* does the manager need to be rebind after we just release gcwq->lock */
>> +static void maybe_rebind_manager(struct worker *manager)
>> +{
>> +     struct global_cwq *gcwq = manager->pool->gcwq;
>> +     bool assoc = !(gcwq->flags & GCWQ_DISASSOCIATED);
>> +
>> +     if (assoc && (manager->flags & WORKER_UNBOUND)) {
>> +             spin_unlock_irq(&gcwq->lock);
>> +
>> +             if (worker_maybe_bind_and_lock(manager))
>> +                     worker_clr_flags(manager, WORKER_UNBOUND);
>> +     }
>> +}
>
> We can reuse busy_worker_rebind_fn(), right?

busy_worker_rebind_fn() releases the gcwq->lock. we can't release the lock here.

>
>>       pool->manager = worker;
>> +     if (unlikely(!mutex_trylock(&pool->manager_mutex))) {
>> +             /*
>> +              * Ouch! rebind_workers() or gcwq_unbind_fn() beats we.
>> +              * it can't return false here, otherwise it will lead to
>> +              * worker depletion. So we release gcwq->lock and then
>> +              * grab manager_mutex again.
>> +              */
>> +             spin_unlock_irq(&gcwq->lock);
>> +             mutex_lock(&pool->manager_mutex);
>> +             spin_lock_irq(&gcwq->lock);
>> +
>> +             /* rebind_workers() can happen when we release gcwq->lock */
>> +             maybe_rebind_manager(worker);
>
> And we can call process_scheduled_works() here and make the CPU
> hotplug check pool->manager and schedule rebind_work there.
>

sorry again. don't need.

Thanks.
Lai

>
> --
> tejun
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:50       ` Tejun Heo
@ 2012-09-08 17:51         ` Tejun Heo
  2012-09-08 17:56         ` Lai Jiangshan
  1 sibling, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 17:51 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

On Sat, Sep 08, 2012 at 10:50:07AM -0700, Tejun Heo wrote:
> Please create two patches, first introducing pool->manager_mutex and

                                               ^ pool->manager

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 17:50     ` Lai Jiangshan
@ 2012-09-08 17:53       ` Tejun Heo
  2012-09-08 18:07         ` Lai Jiangshan
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 17:53 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

Hello,

On Sun, Sep 09, 2012 at 01:50:41AM +0800, Lai Jiangshan wrote:
> >> +             if (worker_maybe_bind_and_lock(manager))
> >> +                     worker_clr_flags(manager, WORKER_UNBOUND);
> >> +     }
> >> +}
> >
> > We can reuse busy_worker_rebind_fn(), right?
> 
> busy_worker_rebind_fn() releases the gcwq->lock. we can't release
> the lock here.

Why so?  Can you please elaborate?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:50       ` Tejun Heo
  2012-09-08 17:51         ` Tejun Heo
@ 2012-09-08 17:56         ` Lai Jiangshan
  2012-09-08 17:59           ` Tejun Heo
  1 sibling, 1 reply; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 17:56 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 9, 2012 at 1:50 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Lai.
>
> On Sun, Sep 09, 2012 at 01:46:59AM +0800, Lai Jiangshan wrote:
>> > * Instead of MANAGING, add pool->manager.
>> >
>> > * Fix the idle depletion bug by using pool->manager for exclusion and
>> >   always grabbing pool->manager_mutex.  Hotplug can use pool->manager
>> >   to schedule rebind work (or UNBIND) to the manager.
>> >
>> > Thoughts?
>>
>> Don't need.  my worry is wrong.
>
> So, your worry was incorrect and the above is what we're gonna do, no?

NO BUG found for PATCH4 even idle rebinding is synchronous.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:56         ` Lai Jiangshan
@ 2012-09-08 17:59           ` Tejun Heo
  2012-09-08 18:21             ` Lai Jiangshan
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 17:59 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 09, 2012 at 01:56:47AM +0800, Lai Jiangshan wrote:
> > So, your worry was incorrect and the above is what we're gonna do, no?
> 
> NO BUG found for PATCH4 even idle rebinding is synchronous.

Hmmm... so, I'm having some difficulty communicating with you.  We
need two separate patch series.  One for for-3.6-fixes and the other
for restructuring on top of for-3.7 after the fixes are merged into
it.

As you currently posted, the patches are based on for-3.7 and fixes
and restructuring are intermixed, so I'm asking you to separate out
two patches to fix the idle depletion bug and base them on top of
for-3.6-fixes.  Am I misunderstanding something?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 17:53       ` Tejun Heo
@ 2012-09-08 18:07         ` Lai Jiangshan
  2012-09-08 18:11           ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 18:07 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 9, 2012 at 1:53 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Sun, Sep 09, 2012 at 01:50:41AM +0800, Lai Jiangshan wrote:
>> >> +             if (worker_maybe_bind_and_lock(manager))
>> >> +                     worker_clr_flags(manager, WORKER_UNBOUND);
>> >> +     }
>> >> +}
>> >
>> > We can reuse busy_worker_rebind_fn(), right?
>>
>> busy_worker_rebind_fn() releases the gcwq->lock. we can't release
>> the lock here.
>
> Why so?  Can you please elaborate?
>

when we release gcwq->lock and then grab it, we leave a hole that things
can be changed.

I don't want to open a hole. if the hole has bug we have to fix it.
if the hole has no bug, we have to add lot of comments to explain it.

When I write this reply. I am thinking: is the hole  has bug if
I release gcwq->lock here? result: no. But I don't want to add all things
what I have thought as comments to explain there is no bug even when we
open a hole. don't leave reviewers too much burden.

Thanks.
Lai.

>
> --
> tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 18:07         ` Lai Jiangshan
@ 2012-09-08 18:11           ` Tejun Heo
  2012-09-08 18:34             ` Lai Jiangshan
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 18:11 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 09, 2012 at 02:07:50AM +0800, Lai Jiangshan wrote:
> when we release gcwq->lock and then grab it, we leave a hole that things
> can be changed.
> 
> I don't want to open a hole. if the hole has bug we have to fix it.
> if the hole has no bug, we have to add lot of comments to explain it.
> 
> When I write this reply. I am thinking: is the hole  has bug if
> I release gcwq->lock here? result: no. But I don't want to add all things
> what I have thought as comments to explain there is no bug even when we
> open a hole. don't leave reviewers too much burden.

We're already releasing gcwq->lock in maybe_create_worker().  That's
the reason why @ret is set to true.  In addition, we already released
the lock to grab manager_mutex.  So, you're not adding any burden.
Please reuse the busy rebinding mechanism.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/7 V6] workqueue: fix hoplug things
  2012-09-08 17:59           ` Tejun Heo
@ 2012-09-08 18:21             ` Lai Jiangshan
  0 siblings, 0 replies; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 18:21 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

>
> Hmmm... so, I'm having some difficulty communicating with you.  We
> need two separate patch series.  One for for-3.6-fixes and the other
> for restructuring on top of for-3.7 after the fixes are merged into
> it.
>
> As you currently posted, the patches are based on for-3.7 and fixes
> and restructuring are intermixed, so I'm asking you to separate out
> two patches to fix the idle depletion bug and base them on top of
> for-3.6-fixes.  Am I misunderstanding something?
>


Patch 3,4 are ready for 3.6 and they can be applied to 3.6 directly.
don't need to revise nor resend.
you just pick them up to for-3.6-fixes.

Thanks.
Lai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 18:11           ` Tejun Heo
@ 2012-09-08 18:34             ` Lai Jiangshan
  2012-09-08 19:02               ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-08 18:34 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 9, 2012 at 2:11 AM, Tejun Heo <tj@kernel.org> wrote:
> On Sun, Sep 09, 2012 at 02:07:50AM +0800, Lai Jiangshan wrote:
>> when we release gcwq->lock and then grab it, we leave a hole that things
>> can be changed.
>>
>> I don't want to open a hole. if the hole has bug we have to fix it.
>> if the hole has no bug, we have to add lot of comments to explain it.
>>
>> When I write this reply. I am thinking: is the hole  has bug if
>> I release gcwq->lock here? result: no. But I don't want to add all things
>> what I have thought as comments to explain there is no bug even when we
>> open a hole. don't leave reviewers too much burden.
>
> We're already releasing gcwq->lock in maybe_create_worker().  That's
> the reason why @ret is set to true.  In addition, we already released
> the lock to grab manager_mutex.  So, you're not adding any burden.
> Please reuse the busy rebinding mechanism.
>

in 3.6 busy_worker_rebind() handle WORKER_REBIND bit,
not WORKER_UNBOUND bit.

busy_worker_rebind() takes struct work_struct *work argument, we have to
add new patch to add a helper and restruct it at first.

worker_maybe_bind_and_lock() 's mean is very clear here. busy_worker_rebind()
seems for busy workers, manager is not busy workers.

>
> --
> tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 18:34             ` Lai Jiangshan
@ 2012-09-08 19:02               ` Tejun Heo
  2012-09-09  4:09                 ` Lai Jiangshan
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2012-09-08 19:02 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

Hello, Lai.

On Sun, Sep 09, 2012 at 02:34:02AM +0800, Lai Jiangshan wrote:
> in 3.6 busy_worker_rebind() handle WORKER_REBIND bit,
> not WORKER_UNBOUND bit.
>
> busy_worker_rebind() takes struct work_struct *work argument, we have to
> add new patch to add a helper and restruct it at first.

What's wrong with just treating manager as busy.  Factor out,
rebind_work scheduling from rebind_workers() and call it for busy
workers and the manager if it exists.  manage_workers() only need to
call process_scheduled_works().  Wouldn't that work?

> worker_maybe_bind_and_lock() 's mean is very clear
> here. busy_worker_rebind() seems for busy workers, manager is not
> busy workers.

I don't know.  It just seems unnecessarily wordy.  If you don't like
reusing the busy worker path, how about just calling
maybe_bind_and_lock() unconditionally after locking manager_mutex?  I
mean, can't it just do the following?

	spin_unlock_irq(&gcwq->lock);

	/*
	 * Explain what's going on.
	 */
	mutex_lock(&pool->manager_mutex);
	if (worker_maybe_bind_and_lock(worker))
		worker_clr_flags(worker, WORKER_UNBOUND);
	ret = true;

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-08 19:02               ` Tejun Heo
@ 2012-09-09  4:09                 ` Lai Jiangshan
  2012-09-09  6:30                   ` Tejun Heo
  0 siblings, 1 reply; 26+ messages in thread
From: Lai Jiangshan @ 2012-09-09  4:09 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

On Sun, Sep 9, 2012 at 3:02 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello, Lai.
>
> On Sun, Sep 09, 2012 at 02:34:02AM +0800, Lai Jiangshan wrote:
>> in 3.6 busy_worker_rebind() handle WORKER_REBIND bit,
>> not WORKER_UNBOUND bit.
>>
>> busy_worker_rebind() takes struct work_struct *work argument, we have to
>> add new patch to add a helper and restruct it at first.
>
> What's wrong with just treating manager as busy.  Factor out,
> rebind_work scheduling from rebind_workers() and call it for busy
> workers and the manager if it exists.  manage_workers() only need to
> call process_scheduled_works().  Wouldn't that work?
>
>> worker_maybe_bind_and_lock() 's mean is very clear
>> here. busy_worker_rebind() seems for busy workers, manager is not
>> busy workers.
>
> I don't know.  It just seems unnecessarily wordy.  If you don't like
> reusing the busy worker path, how about just calling
> maybe_bind_and_lock() unconditionally after locking manager_mutex?  I
> mean, can't it just do the following?
>
>         spin_unlock_irq(&gcwq->lock);
>
>         /*
>          * Explain what's going on.
>          */
>         mutex_lock(&pool->manager_mutex);
>         if (worker_maybe_bind_and_lock(worker))
>                 worker_clr_flags(worker, WORKER_UNBOUND);
>         ret = true;
>


This code is correct. worker_maybe_bind_and_lock() can be called any time.
but I like to call it only when it is really needed.

Thanks.
Lai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/7 V6] workqueue: fix idle worker depletion
  2012-09-09  4:09                 ` Lai Jiangshan
@ 2012-09-09  6:30                   ` Tejun Heo
  0 siblings, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2012-09-09  6:30 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, linux-kernel

Hello, Lai.

On Sun, Sep 09, 2012 at 12:09:58PM +0800, Lai Jiangshan wrote:
> >         mutex_lock(&pool->manager_mutex);
> >         if (worker_maybe_bind_and_lock(worker))
> >                 worker_clr_flags(worker, WORKER_UNBOUND);
> >         ret = true;
> >
> 
> 
> This code is correct. worker_maybe_bind_and_lock() can be called any time.
> but I like to call it only when it is really needed.

Ummm... that code is complex for no reason.  It isn't easier to read
or better in any way.  Given the recent exchanges, it seems our tastes
differ greatly.  Can you please just refresh the two patches as
requested and repost them on top of for-3.6-fixes?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-09-09  6:30 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-08 17:12 [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
2012-09-08 17:12 ` [PATCH 1/7 V6] workqueue: ensure the wq_worker_sleeping() see the right flags Lai Jiangshan
2012-09-08 17:12 ` [PATCH 2/7 V6] workqueue: async idle rebinding Lai Jiangshan
2012-09-08 17:12 ` [PATCH 3/7 V6] workqueue: add manager pointer for worker_pool Lai Jiangshan
2012-09-08 17:12 ` [PATCH 4/7 V6] workqueue: fix idle worker depletion Lai Jiangshan
2012-09-08 17:40   ` Tejun Heo
2012-09-08 17:50     ` Lai Jiangshan
2012-09-08 17:53       ` Tejun Heo
2012-09-08 18:07         ` Lai Jiangshan
2012-09-08 18:11           ` Tejun Heo
2012-09-08 18:34             ` Lai Jiangshan
2012-09-08 19:02               ` Tejun Heo
2012-09-09  4:09                 ` Lai Jiangshan
2012-09-09  6:30                   ` Tejun Heo
2012-09-08 17:12 ` [PATCH 5/7 V6] workqueue: rename manager_mutex to assoc_mutex Lai Jiangshan
2012-09-08 17:12 ` [PATCH 6/7 V6] workqueue: new day don't need WORKER_REBIND for busy rebinding Lai Jiangshan
2012-09-08 17:12 ` [PATCH 7/7 V6] workqueue: remove WORKER_REBIND Lai Jiangshan
2012-09-08 17:27 ` [PATCH 0/7 V6] workqueue: fix hoplug things Lai Jiangshan
2012-09-08 17:34   ` Lai Jiangshan
2012-09-08 17:37   ` Tejun Heo
2012-09-08 17:46     ` Lai Jiangshan
2012-09-08 17:50       ` Tejun Heo
2012-09-08 17:51         ` Tejun Heo
2012-09-08 17:56         ` Lai Jiangshan
2012-09-08 17:59           ` Tejun Heo
2012-09-08 18:21             ` Lai Jiangshan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).