* [PATCH 0/7] workqueue: cleanups for schedule callbacks
@ 2021-12-07  7:35 Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 1/7] workqueue: Remove the outdated comment before wq_worker_sleeping() Lai Jiangshan
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

The commit 6d25be5782e4 ("sched/core, workqueues: Distangle worker
accounting from rq lock") changed the schedule callbacks for workqueue.

It simplified the connection between the scheduler and the workqueue
code.  But it left some code useless and some comments outdated in
workqueue.  This patchset cleans them up.

Patch 1 is unrelated to 6d25be5782e4; it follows up on a recent change
after which wq_worker_sleeping() is no longer called inside a
preempt-disabled section.

Patch 2 is a cleanup for the fact that 6d25be5782e4 no longer calls the
schedule callbacks deep in the sleeping path in a local-wake-up fashion.

Patch 3 is unrelated to 6d25be5782e4, but loosely prepares for patch 4.

Patches 4-6 are cleanups for the fact that 6d25be5782e4 no longer calls
the schedule callbacks from the wakeup code, so the cacheline_aligned
for nr_running and the schedule() call in unbind_workers() are unneeded.

6d25be5782e4 also changed wq_worker_sleeping() to take the pool lock,
and patch 7 changes it back to use preemption disabling.  That patch is
marked 'RFC' because using the pool lock in the slow (sleeping) path is
also OK for me and paves the road to removing the "X:" protection.

Several further cleanups depend on whether patch 7 is accepted or not.
For example, the mb() in insert_work() (sketched below) can be removed
if the pool lock wins.
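
For reference, the mb() in question pairs work queueing with the
nr_running check in wq_worker_sleeping(); condensed from insert_work()
in kernel/workqueue.c (slightly simplified, KASAN bookkeeping dropped):

static void insert_work(struct pool_workqueue *pwq, struct work_struct *work,
			struct list_head *head, unsigned int extra_flags)
{
	struct worker_pool *pool = pwq->pool;

	/* we own @work, set data and link */
	set_work_pwq(work, pwq, extra_flags);
	list_add_tail(&work->entry, head);
	get_pwq(pwq);

	/*
	 * Ensure either wq_worker_sleeping() sees the list_add_tail() above
	 * or this path sees a zero nr_running and wakes a worker, so no
	 * work item is left pending while every worker is asleep.
	 */
	smp_mb();

	if (__need_more_worker(pool))
		wake_up_worker(pool);
}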

Lai Jiangshan (7):
  workqueue: Remove the outdated comment before wq_worker_sleeping()
  workqueue: Remove the advanced kicking of the idle workers in
    rebind_workers()
  workqueue: Remove outdated comment about exceptional workers in
    unbind_workers()
  workqueue: Remove schedule() in unbind_workers()
  workqueue: Move the code of waking a worker up in unbind_workers()
  workqueue: Remove the cacheline_aligned for nr_running
  workqueue: Replace pool lock with preemption disabling in
    wq_worker_sleeping()

 kernel/workqueue.c | 102 +++++++++++++++++----------------------------
 1 file changed, 38 insertions(+), 64 deletions(-)

-- 
2.19.1.6.gb485710b



* [PATCH 1/7] workqueue: Remove the outdated comment before wq_worker_sleeping()
  2021-12-07  7:35 [PATCH 0/7] workqueue: cleanups for schedule callbacks Lai Jiangshan
@ 2021-12-07  7:35 ` Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 2/7] workqueue: Remove the advanced kicking of the idle workers in rebind_workers() Lai Jiangshan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan, Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

wq_worker_sleeping() is no longer called with preemption disabled, so
the comment is outdated.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 5557d19ea81c..312b806b3956 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -887,8 +887,7 @@ void wq_worker_running(struct task_struct *task)
  * @task: task going to sleep
  *
  * This function is called from schedule() when a busy worker is
- * going to sleep. Preemption needs to be disabled to protect ->sleeping
- * assignment.
+ * going to sleep.
  */
 void wq_worker_sleeping(struct task_struct *task)
 {
-- 
2.19.1.6.gb485710b



* [PATCH 2/7] workqueue: Remove the advanced kicking of the idle workers in rebind_workers()
  2021-12-07  7:35 [PATCH 0/7] workqueue: cleanups for schedule callbacks Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 1/7] workqueue: Remove the outdated comment before wq_worker_sleeping() Lai Jiangshan
@ 2021-12-07  7:35 ` Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 3/7] workqueue: Remove outdated comment about exceptional workers in unbind_workers() Lai Jiangshan
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan, Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

The commit 6d25be5782e4 ("sched/core, workqueues: Distangle worker
accounting from rq lock") changed the schedule callbacks for workqueue
and removed the local-wake-up functionality.

Now workers are woken up in the normal fashion, and workers in a
concurrency-managed pool that have not yet migrated to the associated
CPU can also be woken up by workers that are already bound to that CPU.

So this advanced kicking of the idle workers to migrate them to the
associated CPU is unneeded now.
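
For context, the normal fashion here is just a wake_up_process() on the
first idle worker via wake_up_worker(); condensed from kernel/workqueue.c:

static void wake_up_worker(struct worker_pool *pool)
{
	struct worker *worker = first_idle_worker(pool);

	if (likely(worker))
		wake_up_process(worker->task);
}

The scheduler's wakeup path places the woken task on one of its allowed
CPUs, so no explicit kick is needed to migrate idle workers beforehand.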

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 312b806b3956..ba8cf5f5e7fa 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5077,17 +5077,6 @@ static void rebind_workers(struct worker_pool *pool)
 	for_each_pool_worker(worker, pool) {
 		unsigned int worker_flags = worker->flags;
 
-		/*
-		 * A bound idle worker should actually be on the runqueue
-		 * of the associated CPU for local wake-ups targeting it to
-		 * work.  Kick all idle workers so that they migrate to the
-		 * associated CPU.  Doing this in the same loop as
-		 * replacing UNBOUND with REBOUND is safe as no worker will
-		 * be bound before @pool->lock is released.
-		 */
-		if (worker_flags & WORKER_IDLE)
-			wake_up_process(worker->task);
-
 		/*
 		 * We want to clear UNBOUND but can't directly call
 		 * worker_clr_flags() or adjust nr_running.  Atomically
-- 
2.19.1.6.gb485710b



* [PATCH 3/7] workqueue: Remove outdated comment about exceptional workers in unbind_workers()
  2021-12-07  7:35 [PATCH 0/7] workqueue: cleanups for schedule callbacks Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 1/7] workqueue: Remove the outdated comment before wq_worker_sleeping() Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 2/7] workqueue: Remove the advanced kicking of the idle workers in rebind_workers() Lai Jiangshan
@ 2021-12-07  7:35 ` Lai Jiangshan
  2021-12-09 22:16   ` Tejun Heo
  2021-12-07  7:35 ` [PATCH 4/7] workqueue: Remove schedule() " Lai Jiangshan
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan, Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

Long ago, workers were not ALL bound after CPU_ONLINE: they could still
be running on other CPUs before rebinding themselves.

But commit a9ab775bcadf ("workqueue: directly restore CPU affinity
of workers from CPU_ONLINE") made rebind_workers() bind them all.

So all workers are on the CPU before the CPU goes down.

Thus the comment in unbind_workers() referring to the workers "which
are still executing works from before the last CPU down" is outdated.
Just remove it.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ba8cf5f5e7fa..ad663853bb78 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4998,9 +4998,7 @@ static void unbind_workers(int cpu)
 		/*
 		 * We've blocked all attach/detach operations. Make all workers
 		 * unbound and set DISASSOCIATED.  Before this, all workers
-		 * except for the ones which are still executing works from
-		 * before the last CPU down must be on the cpu.  After
-		 * this, they may become diasporas.
+		 * must be on the cpu.  After this, they may become diasporas.
 		 */
 		for_each_pool_worker(worker, pool)
 			worker->flags |= WORKER_UNBOUND;
-- 
2.19.1.6.gb485710b



* [PATCH 4/7] workqueue: Remove schedule() in unbind_workers()
  2021-12-07  7:35 [PATCH 0/7] workqueue: cleanups for schedule callbacks Lai Jiangshan
                   ` (2 preceding siblings ...)
  2021-12-07  7:35 ` [PATCH 3/7] workqueue: Remove outdated comment about exceptional workers in unbind_workers() Lai Jiangshan
@ 2021-12-07  7:35 ` Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 5/7] workqueue: Move the code of waking a worker up " Lai Jiangshan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan, Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

The commit 6d25be5782e4 ("sched/core, workqueues: Distangle worker
accounting from rq lock") changed the schedule callbacks for workqueue
and moved the schedule callback from the wakeup code to the end of
schedule() in the worker's own process context.

This means wq_worker_running() is guaranteed to see the %WORKER_UNBOUND
flag after being scheduled, because unbind_workers() runs on the same
CPU that all the pool's workers are bound to.  The schedule() call,
whose only purpose was to cross rq->lock so that the sched callbacks
would see the flag, is therefore no longer needed.
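
For context, the callback on the running-again side checks
WORKER_NOT_RUNNING, which includes %WORKER_UNBOUND; roughly (condensed
from wq_worker_running() in kernel/workqueue.c, additional details
omitted):

void wq_worker_running(struct task_struct *task)
{
	struct worker *worker = kthread_data(task);

	if (!worker->sleeping)
		return;
	if (!(worker->flags & WORKER_NOT_RUNNING))
		atomic_inc(&worker->pool->nr_running);
	worker->sleeping = 0;
}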

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ad663853bb78..595f9bd5ad29 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4999,6 +4999,9 @@ static void unbind_workers(int cpu)
 		 * We've blocked all attach/detach operations. Make all workers
 		 * unbound and set DISASSOCIATED.  Before this, all workers
 		 * must be on the cpu.  After this, they may become diasporas.
+		 * And the preemption-disabled sections in their sched callbacks
+		 * are guaranteed to see WORKER_UNBOUND since the code here
+		 * runs on the same cpu.
 		 */
 		for_each_pool_worker(worker, pool)
 			worker->flags |= WORKER_UNBOUND;
@@ -5014,14 +5017,6 @@ static void unbind_workers(int cpu)
 
 		mutex_unlock(&wq_pool_attach_mutex);
 
-		/*
-		 * Call schedule() so that we cross rq->lock and thus can
-		 * guarantee sched callbacks see the %WORKER_UNBOUND flag.
-		 * This is necessary as scheduler callbacks may be invoked
-		 * from other cpus.
-		 */
-		schedule();
-
 		/*
 		 * Sched callbacks are disabled now.  Zap nr_running.
 		 * After this, nr_running stays zero and need_more_worker()
-- 
2.19.1.6.gb485710b



* [PATCH 5/7] workqueue: Move the code of waking a worker up in unbind_workers()
  2021-12-07  7:35 [PATCH 0/7] workqueue: cleanups for schedule callbacks Lai Jiangshan
                   ` (3 preceding siblings ...)
  2021-12-07  7:35 ` [PATCH 4/7] workqueue: Remove schedule() " Lai Jiangshan
@ 2021-12-07  7:35 ` Lai Jiangshan
  2021-12-07  7:35 ` [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running Lai Jiangshan
  2021-12-07  7:35 ` [RFC PATCH 7/7] workqueue: Replace pool lock with preemption disabling in wq_worker_sleeping() Lai Jiangshan
  6 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan, Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

In unbind_workers(), there are two pool->lock-held sections separated
by the code that zaps nr_running.  wake_up_worker() needs to be inside
a pool->lock-held section and after nr_running has been zapped.  And
zapping nr_running had to happen after schedule() when the local
wake-up functionality was in use.  Now that the call to schedule() has
been removed along with the local wake-up functionality, the code can
be merged into a single pool->lock-held section.

The diffstat makes it look as if other code moved down, because the
diff tools cannot express that the two lock-held sections are merged by
swapping the two code blocks.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 38 +++++++++++++++-----------------------
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 595f9bd5ad29..256f552e9513 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1830,14 +1830,8 @@ static void worker_enter_idle(struct worker *worker)
 	if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
 		mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
 
-	/*
-	 * Sanity check nr_running.  Because unbind_workers() releases
-	 * pool->lock between setting %WORKER_UNBOUND and zapping
-	 * nr_running, the warning may trigger spuriously.  Check iff
-	 * unbind is not in progress.
-	 */
-	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
-		     pool->nr_workers == pool->nr_idle &&
+	/* Sanity check nr_running. */
+	WARN_ON_ONCE(pool->nr_workers == pool->nr_idle &&
 		     atomic_read(&pool->nr_running));
 }
 
@@ -5008,21 +5002,12 @@ static void unbind_workers(int cpu)
 
 		pool->flags |= POOL_DISASSOCIATED;
 
-		raw_spin_unlock_irq(&pool->lock);
-
-		for_each_pool_worker(worker, pool) {
-			kthread_set_per_cpu(worker->task, -1);
-			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
-		}
-
-		mutex_unlock(&wq_pool_attach_mutex);
-
 		/*
-		 * Sched callbacks are disabled now.  Zap nr_running.
-		 * After this, nr_running stays zero and need_more_worker()
-		 * and keep_working() are always true as long as the
-		 * worklist is not empty.  This pool now behaves as an
-		 * unbound (in terms of concurrency management) pool which
+		 * The handling of nr_running in sched callbacks is disabled
+		 * now.  Zap nr_running.  After this, nr_running stays zero and
+		 * need_more_worker() and keep_working() are always true as
+		 * long as the worklist is not empty.  This pool now behaves as
+		 * an unbound (in terms of concurrency management) pool which
 		 * are served by workers tied to the pool.
 		 */
 		atomic_set(&pool->nr_running, 0);
@@ -5032,9 +5017,16 @@ static void unbind_workers(int cpu)
 		 * worker blocking could lead to lengthy stalls.  Kick off
 		 * unbound chain execution of currently pending work items.
 		 */
-		raw_spin_lock_irq(&pool->lock);
 		wake_up_worker(pool);
+
 		raw_spin_unlock_irq(&pool->lock);
+
+		for_each_pool_worker(worker, pool) {
+			kthread_set_per_cpu(worker->task, -1);
+			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
+		}
+
+		mutex_unlock(&wq_pool_attach_mutex);
 	}
 }
 
-- 
2.19.1.6.gb485710b



* [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running
  2021-12-07  7:35 [PATCH 0/7] workqueue: cleanups for schedule callbacks Lai Jiangshan
                   ` (4 preceding siblings ...)
  2021-12-07  7:35 ` [PATCH 5/7] workqueue: Move the code of waking a worker up " Lai Jiangshan
@ 2021-12-07  7:35 ` Lai Jiangshan
  2021-12-09 22:07   ` Tejun Heo
  2021-12-09 22:27   ` Tejun Heo
  2021-12-07  7:35 ` [RFC PATCH 7/7] workqueue: Replace pool lock with preemption disabling in wq_worker_sleeping() Lai Jiangshan
  6 siblings, 2 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan, Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

nr_running is never modified remotely now that the schedule callback in
the wakeup path has been removed.

Rather, nr_running is often accessed together with other fields in the
pool, so the cacheline_aligned for nr_running isn't needed.
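
For reference, the attribute being dropped expands to roughly the
following (paraphrasing include/linux/cache.h):

#ifdef CONFIG_SMP
#define ____cacheline_aligned_in_smp \
	__attribute__((__aligned__(SMP_CACHE_BYTES)))
#else
#define ____cacheline_aligned_in_smp
#endif

Removing it only drops the forced alignment and the padding it implies;
it does not change how nr_running is accessed.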

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 256f552e9513..33f1106b4f99 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -154,6 +154,9 @@ struct worker_pool {
 
 	unsigned long		watchdog_ts;	/* L: watchdog timestamp */
 
+	/* The current concurrency level. */
+	atomic_t		nr_running;
+
 	struct list_head	worklist;	/* L: list of pending works */
 
 	int			nr_workers;	/* L: total number of workers */
@@ -177,19 +180,12 @@ struct worker_pool {
 	struct hlist_node	hash_node;	/* PL: unbound_pool_hash node */
 	int			refcnt;		/* PL: refcnt for unbound pools */
 
-	/*
-	 * The current concurrency level.  As it's likely to be accessed
-	 * from other CPUs during try_to_wake_up(), put it in a separate
-	 * cacheline.
-	 */
-	atomic_t		nr_running ____cacheline_aligned_in_smp;
-
 	/*
 	 * Destruction of pool is RCU protected to allow dereferences
 	 * from get_work_pool().
 	 */
 	struct rcu_head		rcu;
-} ____cacheline_aligned_in_smp;
+};
 
 /*
  * The per-pool workqueue.  While queued, the lower WORK_STRUCT_FLAG_BITS
-- 
2.19.1.6.gb485710b



* [RFC PATCH 7/7] workqueue: Replace pool lock with preemption disabling in wq_worker_sleeping()
  2021-12-07  7:35 [PATCH 0/7] workqueue: cleanups for schedule callbacks Lai Jiangshan
                   ` (5 preceding siblings ...)
  2021-12-07  7:35 ` [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running Lai Jiangshan
@ 2021-12-07  7:35 ` Lai Jiangshan
  2021-12-09 22:14   ` Tejun Heo
  6 siblings, 1 reply; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-07  7:35 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo; +Cc: Lai Jiangshan, Lai Jiangshan

From: Lai Jiangshan <laijs@linux.alibaba.com>

Once upon a time, wq_worker_sleeping() was called with the rq lock held,
so it could not take the pool lock.  Instead it used "X:" protection:
preemption disabled on the local CPU, so wq_worker_sleeping() did not
depend on the rq lock even while it was held.

Now wq_worker_sleeping() is no longer called with the rq lock held and
takes the pool lock instead.  But the "X:" protection has not been
removed, and wq_worker_running() still relies on it.

So we can also use "X:" protection in wq_worker_sleeping() and avoid
taking the pool lock.

This patch also documents that only idle_list.next is under "X:"
protection, not the whole idle_list, because destroy_worker() called
from the idle timer can remove non-first idle workers.  The idle timer
may fire on a different CPU, and even on the same CPU it can interrupt
a preemption-disabled context.
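
For reference, the idle timer only ever removes workers from the tail of
idle_list, which is why idle_list.next alone needs the stronger
protection; condensed from idle_worker_timeout() in kernel/workqueue.c:

static void idle_worker_timeout(struct timer_list *t)
{
	struct worker_pool *pool = from_timer(pool, t, idle_timer);

	raw_spin_lock_irq(&pool->lock);

	while (too_many_workers(pool)) {
		struct worker *worker;
		unsigned long expires;

		/* idle_list is kept in LIFO order, check the last one */
		worker = list_entry(pool->idle_list.prev, struct worker, entry);
		expires = worker->last_active + IDLE_WORKER_TIMEOUT;

		if (time_before(jiffies, expires)) {
			mod_timer(&pool->idle_timer, expires);
			break;
		}

		destroy_worker(worker);
	}

	raw_spin_unlock_irq(&pool->lock);
}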

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 33f1106b4f99..6c30edbe2fc2 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -162,7 +162,8 @@ struct worker_pool {
 	int			nr_workers;	/* L: total number of workers */
 	int			nr_idle;	/* L: currently idle workers */
 
-	struct list_head	idle_list;	/* X: list of idle workers */
+	/* idle_list.next and first_idle_worker(): X: first idle worker */
+	struct list_head	idle_list;	/* L: list of idle workers */
 	struct timer_list	idle_timer;	/* L: worker idle timeout */
 	struct timer_list	mayday_timer;	/* L: SOS timer for workers */
 
@@ -819,6 +820,11 @@ static bool too_many_workers(struct worker_pool *pool)
 	int nr_idle = pool->nr_idle + managing; /* manager is considered idle */
 	int nr_busy = pool->nr_workers - nr_idle;
 
+	/*
+	 * nr_idle must be at least 2 to allow for a manager and at least one
+	 * idle worker in idle_list, so that idle_worker_timeout() doesn't
+	 * touch idle_list.next.
+	 */
 	return nr_idle > 2 && (nr_idle - 2) * MAX_IDLE_WORKERS_RATIO >= nr_busy;
 }
 
@@ -826,7 +832,7 @@ static bool too_many_workers(struct worker_pool *pool)
  * Wake up functions.
  */
 
-/* Return the first idle worker.  Safe with preemption disabled */
+/* Return the first idle worker.  Safe with "X:" protection */
 static struct worker *first_idle_worker(struct worker_pool *pool)
 {
 	if (unlikely(list_empty(&pool->idle_list)))
@@ -905,7 +911,7 @@ void wq_worker_sleeping(struct task_struct *task)
 		return;
 
 	worker->sleeping = 1;
-	raw_spin_lock_irq(&pool->lock);
+	preempt_disable();
 
 	/*
 	 * Recheck in case unbind_workers() preempted us. We don't
@@ -913,7 +919,7 @@ void wq_worker_sleeping(struct task_struct *task)
 	 * and nr_running has been reset.
 	 */
 	if (worker->flags & WORKER_NOT_RUNNING) {
-		raw_spin_unlock_irq(&pool->lock);
+		preempt_enable();
 		return;
 	}
 
@@ -923,10 +929,9 @@ void wq_worker_sleeping(struct task_struct *task)
 	 * Please read comment there.
 	 *
 	 * NOT_RUNNING is clear.  This means that we're bound to and
-	 * running on the local cpu w/ rq lock held and preemption
-	 * disabled, which in turn means that none else could be
-	 * manipulating idle_list, so dereferencing idle_list without pool
-	 * lock is safe.
+	 * running on the local cpu with preemption disabled, which in turn
+	 * means that no one else could be manipulating idle_list.next
+	 * so dereferencing idle_list.next without pool lock is safe.
 	 */
 	if (atomic_dec_and_test(&pool->nr_running) &&
 	    !list_empty(&pool->worklist)) {
@@ -934,7 +939,7 @@ void wq_worker_sleeping(struct task_struct *task)
 		if (next)
 			wake_up_process(next->task);
 	}
-	raw_spin_unlock_irq(&pool->lock);
+	preempt_enable();
 }
 
 /**
-- 
2.19.1.6.gb485710b



* Re: [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running
  2021-12-07  7:35 ` [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running Lai Jiangshan
@ 2021-12-09 22:07   ` Tejun Heo
  2021-12-09 23:31     ` Lai Jiangshan
  2021-12-09 22:27   ` Tejun Heo
  1 sibling, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 22:07 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

On Tue, Dec 07, 2021 at 03:35:42PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> nr_running is never modified remotely now that the schedule callback in
> the wakeup path has been removed.
> 
> Rather, nr_running is often accessed together with other fields in the
> pool, so the cacheline_aligned for nr_running isn't needed.

Does it even need to be atomic anymore?

Thanks.

-- 
tejun


* Re: [RFC PATCH 7/7] workqueue: Replace pool lock with preemption disabling in wq_worker_sleeping()
  2021-12-07  7:35 ` [RFC PATCH 7/7] workqueue: Replace pool lock with preemption disabling in wq_worker_sleeping() Lai Jiangshan
@ 2021-12-09 22:14   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 22:14 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

Hello,

On Tue, Dec 07, 2021 at 03:35:43PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> Once upon a time, wq_worker_sleeping() was called with the rq lock held,
> so it could not take the pool lock.  Instead it used "X:" protection:
> preemption disabled on the local CPU, so wq_worker_sleeping() did not
> depend on the rq lock even while it was held.
> 
> Now wq_worker_sleeping() is no longer called with the rq lock held and
> takes the pool lock instead.  But the "X:" protection has not been
> removed, and wq_worker_running() still relies on it.
> 
> So we can also use "X:" protection in wq_worker_sleeping() and avoid
> taking the pool lock.
> 
> This patch also documents that only idle_list.next is under "X:"
> protection, not the whole idle_list, because destroy_worker() called
> from the idle timer can remove non-first idle workers.  The idle timer
> may fire on a different CPU, and even on the same CPU it can interrupt
> a preemption-disabled context.

It's nice to go back to not needing to grab pool lock in the worker sleeping
path but I'm not sure it actually matters. This isn't in a red-hot path and
we're touching a bunch of stuff in the pool anyway, so the overhead of
grabbing a lock which likely isn't too contended shouldn't matter all that
much. So, maybe it'd be better to just keep things simple?

Thanks.

-- 
tejun


* Re: [PATCH 3/7] workqueue: Remove outdated comment about exceptional workers in unbind_workers()
  2021-12-07  7:35 ` [PATCH 3/7] workqueue: Remove outdated comment about exceptional workers in unbind_workers() Lai Jiangshan
@ 2021-12-09 22:16   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 22:16 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

On Tue, Dec 07, 2021 at 03:35:39PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> Long ago, workers were not ALL bound after CPU_ONLINE: they could still
> be running on other CPUs before rebinding themselves.
> 
> But commit a9ab775bcadf ("workqueue: directly restore CPU affinity
> of workers from CPU_ONLINE") made rebind_workers() bind them all.
> 
> So all workers are on the CPU before the CPU goes down.
> 
> Thus the comment in unbind_workers() referring to the workers "which
> are still executing works from before the last CPU down" is outdated.
> Just remove it.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>

Applied 1-3 to wq/for-5.17.

Thanks.

-- 
tejun


* Re: [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running
  2021-12-07  7:35 ` [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running Lai Jiangshan
  2021-12-09 22:07   ` Tejun Heo
@ 2021-12-09 22:27   ` Tejun Heo
  1 sibling, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2021-12-09 22:27 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

On Tue, Dec 07, 2021 at 03:35:42PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> nr_running is never modified remotely now that the schedule callback in
> the wakeup path has been removed.
> 
> Rather, nr_running is often accessed together with other fields in the
> pool, so the cacheline_aligned for nr_running isn't needed.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>

Applied 4-6 to cgroup/for-5.17.

Thanks.

-- 
tejun


* Re: [PATCH 6/7] workqueue: Remove the cacheline_aligned for nr_running
  2021-12-09 22:07   ` Tejun Heo
@ 2021-12-09 23:31     ` Lai Jiangshan
  0 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-09 23:31 UTC (permalink / raw)
  To: Tejun Heo, Lai Jiangshan; +Cc: linux-kernel



On 2021/12/10 06:07, Tejun Heo wrote:
> On Tue, Dec 07, 2021 at 03:35:42PM +0800, Lai Jiangshan wrote:
>> From: Lai Jiangshan <laijs@linux.alibaba.com>
>>
>> nr_running is never modified remotely now that the schedule callback in
>> the wakeup path has been removed.
>>
>> Rather, nr_running is often accessed together with other fields in the
>> pool, so the cacheline_aligned for nr_running isn't needed.
> 
> Does it even need to be atomic anymore?
> 

It doesn't need to be atomic; it is only modified on its associated CPU
in process context.
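
Not part of this series, but a rough, unaudited sketch of what dropping
the atomic type could look like (remote readers such as
__need_more_worker() would still need READ_ONCE()/WRITE_ONCE() or the
pool lock, which is not worked out here):

	/*
	 * The current concurrency level.  Only modified on the pool's
	 * associated CPU in process context; a plain int is enough.
	 */
	int			nr_running;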
