* [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Lai Jiangshan @ 2020-06-01 6:08 UTC (permalink / raw)
To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan
In the old days, worker threads were not shared among different
workqueues, and destroy_workqueue() used kthread_stop() to destroy
all workers before tearing down the workqueue structures.
kthread_stop() ensured that the scheduled (worker->scheduled)
work items and the linked work items queued by flush_work()
had completed.
For wq users, a workqueue being complete/unused means that all
queued works have completed and all flush_work() calls have returned,
making the workqueue legitimate to pass to destroy_workqueue().
But
e22bee782b3b ("workqueue: implement concurrency managed dynamic worker pool")
made worker pools and workers shared among different
workqueues, and kthread_stop() is no longer used to synchronize
the completion of the work items. destroy_workqueue() uses
drain_workqueue() to drain user work items, but the internal work
items queued by flush_work() are not drained because they have no color.
So a problem can occur when wq_barrier_func() does complete(&barr->done)
and the woken-up wq-user code calls destroy_workqueue():
destroy_workqueue() can be scheduled earlier than process_one_work()
doing the put_pwq(), so the sanity check in destroy_workqueue()
sees the not-yet-put pwq->refcnt and complains with a false positive.
The problem could be fixed by removing WORK_NO_COLOR
and making the internal work item queued by flush_work() inherit
the color of the work items being flushed, but that would
revert the design and the benefits of WORK_NO_COLOR.
The color-based flush_workqueue() was designed well enough that
we can also easily adapt it to flush WORK_NO_COLOR works. That
is the approach taken by this patch, which introduces
flush_no_color() to flush all the WORK_NO_COLOR works.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
include/linux/workqueue.h | 5 +--
kernel/workqueue.c | 72 +++++++++++++++++++++++++++++++++------
2 files changed, 64 insertions(+), 13 deletions(-)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 8b505d22fc0e..85aea89fb6fc 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -55,8 +55,9 @@ enum {
* The last color is no color used for works which don't
* participate in workqueue flushing.
*/
- WORK_NR_COLORS = (1 << WORK_STRUCT_COLOR_BITS) - 1,
- WORK_NO_COLOR = WORK_NR_COLORS,
+ WORK_NR_COLORS = (1 << WORK_STRUCT_COLOR_BITS),
+ WORK_NR_FLUSH_COLORS = (WORK_NR_COLORS - 1),
+ WORK_NO_COLOR = WORK_NR_FLUSH_COLORS,
/* not bound to any CPU, prefer the local CPU */
WORK_CPU_UNBOUND = NR_CPUS,
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7a1fc9fe6314..cf281e273b28 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -585,7 +585,7 @@ static int get_work_color(struct work_struct *work)
static int work_next_color(int color)
{
- return (color + 1) % WORK_NR_COLORS;
+ return (color + 1) % WORK_NR_FLUSH_COLORS;
}
/*
@@ -1167,19 +1167,18 @@ static void pwq_activate_first_delayed(struct pool_workqueue *pwq)
*/
static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, int color)
{
- /* uncolored work items don't participate in flushing or nr_active */
- if (color == WORK_NO_COLOR)
- goto out_put;
+ /* uncolored work items don't participate in nr_active */
+ if (color != WORK_NO_COLOR) {
+ pwq->nr_active--;
+ if (!list_empty(&pwq->delayed_works)) {
+ /* one down, submit a delayed one */
+ if (pwq->nr_active < pwq->max_active)
+ pwq_activate_first_delayed(pwq);
+ }
+ }
pwq->nr_in_flight[color]--;
- pwq->nr_active--;
- if (!list_empty(&pwq->delayed_works)) {
- /* one down, submit a delayed one */
- if (pwq->nr_active < pwq->max_active)
- pwq_activate_first_delayed(pwq);
- }
-
/* is flush in progress and are we at the flushing tip? */
if (likely(pwq->flush_color != color))
goto out_put;
@@ -2682,6 +2681,7 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
}
debug_work_activate(&barr->work);
+ pwq->nr_in_flight[WORK_NO_COLOR]++;
insert_work(pwq, &barr->work, head,
work_color_to_flags(WORK_NO_COLOR) | linked);
}
@@ -2915,6 +2915,55 @@ void flush_workqueue(struct workqueue_struct *wq)
}
EXPORT_SYMBOL(flush_workqueue);
+/*
+ * Called from destroy_workqueue() to flush all uncompleted internal
+ * work items queued by flush_work().
+ */
+static void flush_no_color(struct workqueue_struct *wq)
+{
+ struct wq_flusher this_flusher = {
+ .list = LIST_HEAD_INIT(this_flusher.list),
+ .flush_color = -1,
+ .done = COMPLETION_INITIALIZER_ONSTACK_MAP(this_flusher.done, wq->lockdep_map),
+ };
+
+ lock_map_acquire(&wq->lockdep_map);
+ lock_map_release(&wq->lockdep_map);
+
+ mutex_lock(&wq->mutex);
+ if (WARN_ON_ONCE(wq->first_flusher))
+ goto out_unlock;
+
+ wq->first_flusher = &this_flusher;
+ this_flusher.flush_color = wq->work_color;
+ WARN_ON_ONCE(wq->flush_color != this_flusher.flush_color);
+ WARN_ON_ONCE(!list_empty(&wq->flusher_overflow));
+ WARN_ON_ONCE(!list_empty(&wq->flusher_queue));
+
+ if (!flush_workqueue_prep_pwqs(wq, WORK_NO_COLOR, -1)) {
+ /* nothing to flush, done */
+ WRITE_ONCE(wq->first_flusher, NULL);
+ goto out_unlock;
+ }
+
+ check_flush_dependency(wq, NULL);
+
+ mutex_unlock(&wq->mutex);
+
+ wait_for_completion(&this_flusher.done);
+
+ mutex_lock(&wq->mutex);
+ WARN_ON_ONCE(wq->first_flusher != &this_flusher);
+ WARN_ON_ONCE(!list_empty(&wq->flusher_overflow));
+ WARN_ON_ONCE(!list_empty(&wq->flusher_queue));
+ WARN_ON_ONCE(!list_empty(&this_flusher.list));
+ WARN_ON_ONCE(wq->flush_color != this_flusher.flush_color);
+ WRITE_ONCE(wq->first_flusher, NULL);
+
+out_unlock:
+ mutex_unlock(&wq->mutex);
+}
+
/**
* drain_workqueue - drain a workqueue
* @wq: workqueue to drain
@@ -4353,6 +4402,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
/* drain it before proceeding with destruction */
drain_workqueue(wq);
+ flush_no_color(wq);
/* kill rescuer, if sanity checks fail, leave it w/o rescuer */
if (wq->rescuer) {
--
2.20.1
* Re: [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Tejun Heo @ 2020-06-01 15:31 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan
Hello, Lai.
On Mon, Jun 01, 2020 at 06:08:02AM +0000, Lai Jiangshan wrote:
> +static void flush_no_color(struct workqueue_struct *wq)
> +{
...
I'm not too sure about using the colored flushing system for this. Given
that the requirements are a lot simpler, I'd prefer to keep it separate and
dumb rather than intertwining it with regular flushes. e.g. each wq can keep
the number of flush work items in flight and a simple wake up mechanism for
the task doing destroy_workqueue().
Thanks.
--
tejun
* [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Lai Jiangshan @ 2020-06-02 13:49 UTC (permalink / raw)
To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan
In the old days, worker threads were not shared among different
workqueues, and destroy_workqueue() used kthread_stop() to destroy
all workers before tearing down the workqueue structures.
kthread_stop() ensured that the scheduled (worker->scheduled)
work items and the linked work items queued by flush_work()
had completed.
For wq users, a workqueue being complete/unused means that all
queued works have completed and all flush_work() calls have returned,
making the workqueue legitimate to pass to destroy_workqueue().
But
e22bee782b3b ("workqueue: implement concurrency managed dynamic worker pool")
made worker pools and workers shared among different
workqueues, and kthread_stop() is no longer used to synchronize
the completion of the work items. destroy_workqueue() uses
drain_workqueue() to drain user work items, but the internal work
items queued by flush_work() are not drained because they have no color.
So a problem can occur when wq_barrier_func() does complete(&barr->done)
and the woken-up wq-user code calls destroy_workqueue():
destroy_workqueue() can be scheduled earlier than process_one_work()
doing the put_pwq(), so the sanity check in destroy_workqueue()
sees the not-yet-put pwq->refcnt and complains with a false positive.
The problem could be fixed by removing WORK_NO_COLOR
and making the internal work item queued by flush_work() inherit
the color of the work items being flushed, but that would
revert the design and the benefits of WORK_NO_COLOR.
This patch instead simply adds an atomic counter for in-flight
flush_work() barriers and a completion for destroy_workqueue()
to wait for them.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
Changes from V1:
Switched from the flush_no_color()-based mechanism to an
atomic+completion-based one, as TJ suggested.
kernel/workqueue.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1921c982f920..71272beb8e01 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -253,6 +253,9 @@ struct workqueue_struct {
int nr_drainers; /* WQ: drain in progress */
int saved_max_active; /* WQ: saved pwq max_active */
+ atomic_t nr_flush_work; /* flush work in progress */
+ struct completion flush_work_done; /* sync flush_work() */
+
struct workqueue_attrs *unbound_attrs; /* PW: only for unbound wqs */
struct pool_workqueue *dfl_pwq; /* PW: only for unbound wqs */
@@ -1154,6 +1157,12 @@ static void pwq_activate_first_delayed(struct pool_workqueue *pwq)
pwq_activate_delayed_work(work);
}
+static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
+{
+ if (atomic_dec_and_test(&wq->nr_flush_work))
+ complete(&wq->flush_work_done);
+}
+
/**
* pwq_dec_nr_in_flight - decrement pwq's nr_in_flight
* @pwq: pwq of interest
@@ -1168,8 +1177,10 @@ static void pwq_activate_first_delayed(struct pool_workqueue *pwq)
static void pwq_dec_nr_in_flight(struct pool_workqueue *pwq, int color)
{
/* uncolored work items don't participate in flushing or nr_active */
- if (color == WORK_NO_COLOR)
+ if (color == WORK_NO_COLOR) {
+ dec_nr_in_flight_flush_work(pwq->wq);
goto out_put;
+ }
pwq->nr_in_flight[color]--;
@@ -2682,6 +2693,7 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
}
debug_work_activate(&barr->work);
+ atomic_inc(&pwq->wq->nr_flush_work);
insert_work(pwq, &barr->work, head,
work_color_to_flags(WORK_NO_COLOR) | linked);
}
@@ -4278,6 +4290,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
wq_init_lockdep(wq);
INIT_LIST_HEAD(&wq->list);
+ atomic_set(&wq->nr_flush_work, 1);
+ init_completion(&wq->flush_work_done);
+
if (alloc_and_link_pwqs(wq) < 0)
goto err_unreg_lockdep;
@@ -4354,6 +4369,10 @@ void destroy_workqueue(struct workqueue_struct *wq)
/* drain it before proceeding with destruction */
drain_workqueue(wq);
+ /* flush all uncompleted internal work items queued by flush_work() */
+ dec_nr_in_flight_flush_work(wq);
+ wait_for_completion(&wq->flush_work_done);
+
/* kill rescuer, if sanity checks fail, leave it w/o rescuer */
if (wq->rescuer) {
struct worker *rescuer = wq->rescuer;
--
2.20.1
* Re: [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Tejun Heo @ 2020-06-02 16:13 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan
Hello, Lai.
On Tue, Jun 02, 2020 at 01:49:14PM +0000, Lai Jiangshan wrote:
> +static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
> +{
> + if (atomic_dec_and_test(&wq->nr_flush_work))
Do you think it'd make sense to put this in pwq so that it can be
synchronized with the pool lock instead of using a separate atomic counter?
Makes sense to me otherwise.
Thanks.
--
tejun
* Re: [PATCH] workqueue: ensure all flush_work() completed when being destroyed
From: Lai Jiangshan @ 2020-06-02 23:13 UTC (permalink / raw)
To: Tejun Heo; +Cc: Lai Jiangshan, LKML
On Wed, Jun 3, 2020 at 12:13 AM Tejun Heo <tj@kernel.org> wrote:
>
> Hello, Lai.
>
> On Tue, Jun 02, 2020 at 01:49:14PM +0000, Lai Jiangshan wrote:
> > +static void dec_nr_in_flight_flush_work(struct workqueue_struct *wq)
> > +{
> > + if (atomic_dec_and_test(&wq->nr_flush_work))
>
> Do you think it'd make sense to put this in pwq so that it can be
> synchronized with the pool lock instead of using a separate atomic counter?
>
Hello, Tejun
If the counter is put in the pwq, there will be a per-pwq counter,
a per-pwq completion (or pointer) and a flush_workqueue_prep_pwqs()-like
function (although simpler) to set them up before waiting on them.
That would look much like the V1 patch. A per-pwq counter may perform
better than a per-wq atomic, but the gain seems too tiny to justify the
added code complexity. V1 has the simplest pwq_dec_nr_in_flight(),
except that there are too many WARN_ON_ONCE() calls in flush_no_color().
Thanks
Lai
> Makes sense to me otherwise.
>
> Thanks.
>
> --
> tejun