* [PATCH] workqueue: ensure all flush_work() completed when being destroyed
@ 2020-06-01  6:08 Lai Jiangshan
From: Lai Jiangshan @ 2020-06-01  6:08 UTC
  To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan

In the old days, worker threads were not shared among different
workqueues, and destroy_workqueue() used kthread_stop() to destroy
all workers before tearing down the workqueue structures.
kthread_stop() ensured that the scheduled (worker->scheduled) work
items, including the linked barrier work items queued by
flush_work(), had completed.

From the wq user's point of view, a workqueue is finished/unused
once all queued work items have completed and all flush_work() calls
have returned; only then is it legitimate to pass the workqueue to
destroy_workqueue().

But commit
e22bee782b3b ("workqueue: implement concurrency managed dynamic worker pool")
made worker pools and workers shared among different workqueues, and
kthread_stop() is no longer used to synchronize the completion of
work items. destroy_workqueue() uses drain_workqueue() to drain user
work items, but the internal barrier work items queued by
flush_work() are not drained because they have no color.

So a problem can occur when wq_barrier_func() does
complete(&barr->done) and the woken-up wq user immediately calls
destroy_workqueue(): destroy_workqueue() can run before
process_one_work() has done its put_pwq(), so the sanity check in
destroy_workqueue() sees the not-yet-dropped pwq->refcnt and
complains with a false positive.
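
For illustration, a minimal user-side pattern that can hit this race
(hypothetical names; assume my_work has been queued on my_wq, the
workqueue being destroyed):

#include <linux/workqueue.h>

static struct workqueue_struct *my_wq;
static struct work_struct my_work;

static void my_teardown(void)
{
	/*
	 * flush_work() returns once wq_barrier_func() has completed
	 * barr->done, which may be before the worker finishes
	 * process_one_work() and drops its pwq reference.
	 */
	flush_work(&my_work);

	/*
	 * If this runs before the worker's put_pwq(), the sanity
	 * check in destroy_workqueue() sees the lingering
	 * pwq->refcnt and warns, even though the wq user did
	 * nothing wrong.
	 */
	destroy_workqueue(my_wq);
}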

The problem could easily be fixed by removing WORK_NO_COLOR and
making the internal barrier work item queued by flush_work() inherit
the color of the work item being flushed, but that would revert the
design of WORK_NO_COLOR and give up its benefits.

The color-based flush_workqueue() was designed well enough that it
can also easily be adapted to flush WORK_NO_COLOR work items. This
is the approach taken by this patch, which introduces
flush_no_color() to flush all WORK_NO_COLOR work items.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 include/linux/workqueue.h |  5 +--
 kernel/workqueue.c        | 72 +++++++++++++++++++++++++++++++++------
 2 files changed, 64 insertions(+), 13 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 8b505d22fc0e..85aea89fb6fc 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -55,8 +55,9 @@ enum {
 	 * The last color is no color used for works which don't
 	 * participate in workqueue flushing.
 	 */
-	WORK_NR_COLORS		= (1 << WORK_STRUCT_COLOR_BITS) - 1,
-	WORK_NO_COLOR		= WORK_NR_COLORS,
+	WORK_NR_COLORS		= (1 << WORK_STRUCT_COLOR_BITS),
+	WORK_NR_FLUSH_COLORS	= (WORK_NR_COLORS - 1),
+	WORK_NO_COLOR		= WORK_NR_FLUSH_COLORS,
 
 	/* not bound to any CPU, prefer the local CPU */
 	WORK_CPU_UNBOUND	= NR_CPUS,
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7a1fc9fe6314..cf281e273b28 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -585,7 +585,7 @@ static int get_work_color(struct work_struct *work)
 
 static int work_next_color(int color)
 {
-	return (color + 1) % WORK_NR_COLORS;
+	return (color + 1) % WORK_NR_FLUSH_COLORS;
 }
 
 /*
@@ -1167,19 +1167,18 @@ static void pwq_activate_first_delayed(struct pool_workqueue *pwq)
  */
 static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, int color)
 {
-	/* uncolored work items don't participate in flushing or nr_active */
-	if (color == WORK_NO_COLOR)
-		goto out_put;
+	/* uncolored work items don't participate in nr_active */
+	if (color != WORK_NO_COLOR) {
+		pwq->nr_active--;
+		if (!list_empty(&pwq->delayed_works)) {
+			/* one down, submit a delayed one */
+			if (pwq->nr_active < pwq->max_active)
+				pwq_activate_first_delayed(pwq);
+		}
+	}
 
 	pwq->nr_in_flight[color]--;
 
-	pwq->nr_active--;
-	if (!list_empty(&pwq->delayed_works)) {
-		/* one down, submit a delayed one */
-		if (pwq->nr_active < pwq->max_active)
-			pwq_activate_first_delayed(pwq);
-	}
-
 	/* is flush in progress and are we at the flushing tip? */
 	if (likely(pwq->flush_color != color))
 		goto out_put;
@@ -2682,6 +2681,7 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
 	}
 
 	debug_work_activate(&barr->work);
+	pwq->nr_in_flight[WORK_NO_COLOR]++;
 	insert_work(pwq, &barr->work, head,
 		    work_color_to_flags(WORK_NO_COLOR) | linked);
 }
@@ -2915,6 +2915,55 @@ void flush_workqueue(struct workqueue_struct *wq)
 }
 EXPORT_SYMBOL(flush_workqueue);
 
+/*
+ * Called from destroy_workqueue() to flush all uncompleted internal
+ * work items queued by flush_work().
+ */
+static void flush_no_color(struct workqueue_struct *wq)
+{
+	struct wq_flusher this_flusher = {
+		.list = LIST_HEAD_INIT(this_flusher.list),
+		.flush_color = -1,
+		.done = COMPLETION_INITIALIZER_ONSTACK_MAP(this_flusher.done, wq->lockdep_map),
+	};
+
+	lock_map_acquire(&wq->lockdep_map);
+	lock_map_release(&wq->lockdep_map);
+
+	mutex_lock(&wq->mutex);
+	if (WARN_ON_ONCE(wq->first_flusher))
+		goto out_unlock;
+
+	wq->first_flusher = &this_flusher;
+	this_flusher.flush_color = wq->work_color;
+	WARN_ON_ONCE(wq->flush_color != this_flusher.flush_color);
+	WARN_ON_ONCE(!list_empty(&wq->flusher_overflow));
+	WARN_ON_ONCE(!list_empty(&wq->flusher_queue));
+
+	if (!flush_workqueue_prep_pwqs(wq, WORK_NO_COLOR, -1)) {
+		/* nothing to flush, done */
+		WRITE_ONCE(wq->first_flusher, NULL);
+		goto out_unlock;
+	}
+
+	check_flush_dependency(wq, NULL);
+
+	mutex_unlock(&wq->mutex);
+
+	wait_for_completion(&this_flusher.done);
+
+	mutex_lock(&wq->mutex);
+	WARN_ON_ONCE(wq->first_flusher != &this_flusher);
+	WARN_ON_ONCE(!list_empty(&wq->flusher_overflow));
+	WARN_ON_ONCE(!list_empty(&wq->flusher_queue));
+	WARN_ON_ONCE(!list_empty(&this_flusher.list));
+	WARN_ON_ONCE(wq->flush_color != this_flusher.flush_color);
+	WRITE_ONCE(wq->first_flusher, NULL);
+
+out_unlock:
+	mutex_unlock(&wq->mutex);
+}
+
 /**
  * drain_workqueue - drain a workqueue
  * @wq: workqueue to drain
@@ -4353,6 +4402,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
 
 	/* drain it before proceeding with destruction */
 	drain_workqueue(wq);
+	flush_no_color(wq);
 
 	/* kill rescuer, if sanity checks fail, leave it w/o rescuer */
 	if (wq->rescuer) {
-- 
2.20.1



* Re: [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Tejun Heo @ 2020-06-01 15:31 UTC
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

Hello, Lai.

On Mon, Jun 01, 2020 at 06:08:02AM +0000, Lai Jiangshan wrote:
> +static void flush_no_color(struct workqueue_struct *wq)
> +{
...

I'm not too sure about using the colored flushing system for this. Given
that the requirements are a lot simpler, I'd prefer to keep it separate and
dumb rather than intertwining it with regular flushes. e.g. each wq can keep
the number of flush work items in flight and a simple wake up mechanism for
the task doing destroy_workqueue().

Thanks.

-- 
tejun


* [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Lai Jiangshan @ 2020-06-02 13:49 UTC
  To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan

In the old days, worker threads were not shared among different
workqueues, and destroy_workqueue() used kthread_stop() to destroy
all workers before tearing down the workqueue structures.
kthread_stop() ensured that the scheduled (worker->scheduled) work
items, including the linked barrier work items queued by
flush_work(), had completed.

From the wq user's point of view, a workqueue is finished/unused
once all queued work items have completed and all flush_work() calls
have returned; only then is it legitimate to pass the workqueue to
destroy_workqueue().

But commit
e22bee782b3b ("workqueue: implement concurrency managed dynamic worker pool")
made worker pools and workers shared among different workqueues, and
kthread_stop() is no longer used to synchronize the completion of
work items. destroy_workqueue() uses drain_workqueue() to drain user
work items, but the internal barrier work items queued by
flush_work() are not drained because they have no color.

So a problem can occur when wq_barrier_func() does
complete(&barr->done) and the woken-up wq user immediately calls
destroy_workqueue(): destroy_workqueue() can run before
process_one_work() has done its put_pwq(), so the sanity check in
destroy_workqueue() sees the not-yet-dropped pwq->refcnt and
complains with a false positive.

The problem could easily be fixed by removing WORK_NO_COLOR and
making the internal barrier work item queued by flush_work() inherit
the color of the work item being flushed, but that would revert the
design of WORK_NO_COLOR and give up its benefits.

This patch simply adds an atomic counter of in-flight flush_work()
barrier work items and a completion for destroy_workqueue() to wait on.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
Changed from V1:
	Changed from the flush_no_color() based mechanism to an
	atomic+completion based one, as TJ suggested.

 kernel/workqueue.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1921c982f920..71272beb8e01 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -253,6 +253,9 @@ struct workqueue_struct {
 	int			nr_drainers;	/* WQ: drain in progress */
 	int			saved_max_active; /* WQ: saved pwq max_active */
 
+	atomic_t		nr_flush_work;	/* flush work in progress */
+	struct completion	flush_work_done; /* sync flush_work() */
+
 	struct workqueue_attrs	*unbound_attrs;	/* PW: only for unbound wqs */
 	struct pool_workqueue	*dfl_pwq;	/* PW: only for unbound wqs */
 
@@ -1154,6 +1157,12 @@ static void pwq_activate_first_delayed(struct pool_workqueue *pwq)
 	pwq_activate_delayed_work(work);
 }
 
+static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
+{
+	if (atomic_dec_and_test(&wq->nr_flush_work))
+		complete(&wq->flush_work_done);
+}
+
 /**
  * pwq_dec_nr_in_flight - decrement pwq's nr_in_flight
  * @pwq: pwq of interest
@@ -1168,8 +1177,10 @@ static void pwq_activate_first_delayed(struct pool_workqueue *pwq)
 static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, int color)
 {
 	/* uncolored work items don't participate in flushing or nr_active */
-	if (color == WORK_NO_COLOR)
+	if (color == WORK_NO_COLOR) {
+		dec_nr_in_flight_flush_work(pwq->wq);
 		goto out_put;
+	}
 
 	pwq->nr_in_flight[color]--;
 
@@ -2682,6 +2693,7 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
 	}
 
 	debug_work_activate(&barr->work);
+	atomic_inc(&pwq->wq->nr_flush_work);
 	insert_work(pwq, &barr->work, head,
 		    work_color_to_flags(WORK_NO_COLOR) | linked);
 }
@@ -4278,6 +4290,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
 	wq_init_lockdep(wq);
 	INIT_LIST_HEAD(&wq->list);
 
+	atomic_set(&wq->nr_flush_work, 1);
+	init_completion(&wq->flush_work_done);
+
 	if (alloc_and_link_pwqs(wq) < 0)
 		goto err_unreg_lockdep;
 
@@ -4354,6 +4369,10 @@ void destroy_workqueue(struct workqueue_struct *wq)
 	/* drain it before proceeding with destruction */
 	drain_workqueue(wq);
 
+	/* flush all uncompleted internal work items queued by flush_work() */
+	dec_nr_in_flight_flush_work(wq);
+	wait_for_completion(&wq->flush_work_done);
+
 	/* kill rescuer, if sanity checks fail, leave it w/o rescuer */
 	if (wq->rescuer) {
 		struct worker *rescuer = wq->rescuer;
-- 
2.20.1



* Re: [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Tejun Heo @ 2020-06-02 16:13 UTC
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

Hello, Lai.

On Tue, Jun 02, 2020 at 01:49:14PM +0000, Lai Jiangshan wrote:
> +static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
> +{
> +	if (atomic_dec_and_test(&wq->nr_flush_work))

Do you think it'd make sense to put this in pwq so that it can be
synchronized with the pool lock instead of using a separate atomic counter?

Makes sense to me otherwise.

Thanks.

-- 
tejun


* Re: [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Lai Jiangshan @ 2020-06-02 23:13 UTC
  To: Tejun Heo; +Cc: Lai Jiangshan, LKML

On Wed, Jun 3, 2020 at 12:13 AM Tejun Heo <tj@kernel.org> wrote:
>
> Hello, Lai.
>
> On Tue, Jun 02, 2020 at 01:49:14PM +0000, Lai Jiangshan wrote:
> > +static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
> > +{
> > +     if (atomic_dec_and_test(&wq->nr_flush_work))
>
> Do you think it'd make sense to put this in pwq so that it can be
> synchronized with the pool lock instead of using a separate atomic counter?
>

Hello, Tejun

If the counter is put in the pwq, there has to be a per-pwq counter,
a per-pwq completion (or a pointer to one), and a
flush_workqueue_prep_pwqs()-like function (although simpler) to set
them up before waiting on them.

That would end up looking like the V1 patch. A per-pwq counter may
perform better than a per-wq atomic, but the gain seems too tiny to
justify the added code complexity. V1 does have the simplest
pwq_dec_nr_in_flight(), except that there are too many WARN_ON_ONCE()
calls in flush_no_color().
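
Roughly, the per-pwq variant might look like the sketch below. The
field and helper names (nr_flush_work, flush_work_done,
wait_for_flush_work()) are made up for illustration, the locking is
simplified, and this was never posted as a patch:

/* hypothetical: counter and completion pointer hang off each pwq */
struct pool_workqueue {
	/* ... existing fields in kernel/workqueue.c ... */
	int			nr_flush_work;	  /* L: barriers in flight */
	struct completion	*flush_work_done; /* L: armed by destroyer */
};

/* worker side, with pool->lock held, when a barrier work item finishes */
static void pwq_dec_nr_flush_work(struct pool_workqueue *pwq)
{
	if (!--pwq->nr_flush_work && pwq->flush_work_done)
		complete(pwq->flush_work_done);
}

/*
 * destroy_workqueue() side: arm and wait for each pwq in turn.
 * Caller holds wq->mutex, which stabilizes the pwq list; the wq has
 * already been drained, so no new barriers can be queued.
 */
static void wait_for_flush_work(struct workqueue_struct *wq)
{
	DECLARE_COMPLETION_ONSTACK(done);
	struct pool_workqueue *pwq;

	for_each_pwq(pwq, wq) {
		bool busy;

		spin_lock_irq(&pwq->pool->lock);
		busy = pwq->nr_flush_work != 0;
		if (busy)
			pwq->flush_work_done = &done;
		spin_unlock_irq(&pwq->pool->lock);

		if (busy) {
			wait_for_completion(&done);
			reinit_completion(&done);
		}
	}
}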

Thanks
Lai

> Makes sense to me otherwise.
>
> Thanks.
>
> --
> tejun

