[RFC 0/2] Re-entrace of a work when requeued to a different workqueue

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC 0/2] Re-entrace of a work when requeued to a different workqueue
@ 2021-10-08 10:04 Boqun Feng
  2021-10-08 10:04 ` [RFC 1/2] NOT FOR MERGE: A selftest shows that re-entrance can happen Boqun Feng
  2021-10-08 10:04 ` [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue Boqun Feng
  0 siblings, 2 replies; 6+ messages in thread
From: Boqun Feng @ 2021-10-08 10:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Lai Jiangshan, Paul E . McKenney, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Boqun Feng

Hi Tejun,

I found out a possible re-entrace case of a work item:

	queue_work_on(0, WQ1, W);
	// after a worker picks up W and clear the pending bit
	queue_work_on(1, WQ2, W);
	// workers on CPU0 and CPU1 will execute W in the same time.

To make this happen, work W must be queued to different workqueues (WQ1
& WQ2), which may look weird, but IIUC, we don't disallow it?

I'm sending this patchset out to see whether 1) this is by design (sane
users of workqueues should guarantee no concurrent queuing one work on
two workqueues) or 2) this is a real problem that we should fix. If it's
2), then I have a straight-forward fix in patch #2, which needs some
discussion because I changes the queue_work_on() semantics a little bit:
if WQ1 is a unbound workqueue and WQ2 is a bound one, my change makes
the second queue_work_on() in the above case effectively queue W on WQ1,
which I'm not sure is a desired change.

Patch #1 contains a simple reproduce of the re-entrance case, which is
of course not for merge.

Regards,
Boqun

Boqun Feng (2):
  NOT FOR MERGE: A selftest shows that re-entrance can happen
  workqueue: Fix work re-entrance when requeue to a different workqueue

 kernel/workqueue.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

-- 
2.32.0

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [RFC 1/2] NOT FOR MERGE: A selftest shows that re-entrance can happen
  2021-10-08 10:04 [RFC 0/2] Re-entrace of a work when requeued to a different workqueue Boqun Feng
@ 2021-10-08 10:04 ` Boqun Feng
  2021-10-08 10:04 ` [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue Boqun Feng
  1 sibling, 0 replies; 6+ messages in thread
From: Boqun Feng @ 2021-10-08 10:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Lai Jiangshan, Paul E . McKenney, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Boqun Feng

Re-entrance can be confirmed when PREEMPT=y as the racy output below:

[    1.498285] second queue succeeds
[    1.500679] result of i is 1031665
[    1.501069] result of i is 1221348

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/workqueue.c | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 33a6b4a2443d..1418710bffcd 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -6099,3 +6099,44 @@ void __init workqueue_init(void)
 	wq_online = true;
 	wq_watchdog_init();
 }
+
+struct temp_work {
+	struct work_struct work;
+	int i;
+};
+
+#include <linux/delay.h>
+static void __init work_func(struct work_struct *work)
+{
+	struct temp_work *tmp = (struct temp_work *)work;
+	int p = 0;
+	int q;
+
+	for (p = 0; p < 1000000; p++) {
+		q = READ_ONCE(tmp->i);
+		WRITE_ONCE(tmp->i, q + 1);
+	}
+
+	printk("result of i is %d\n", tmp->i);
+}
+
+static int __init work_reentry(void)
+{
+	struct temp_work tmp;
+
+	tmp.i = 0;
+
+	INIT_WORK_ONSTACK(&tmp.work, work_func);
+
+	queue_work_on(1, system_wq, &tmp.work);
+
+	while (!queue_work_on(2, system_unbound_wq, &tmp.work)) { }
+
+	printk("second queue succeeds\n");
+
+	flush_work(&tmp.work);
+
+	return 0;
+}
+
+late_initcall(work_reentry);
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue
  2021-10-08 10:04 [RFC 0/2] Re-entrace of a work when requeued to a different workqueue Boqun Feng
  2021-10-08 10:04 ` [RFC 1/2] NOT FOR MERGE: A selftest shows that re-entrance can happen Boqun Feng
@ 2021-10-08 10:04 ` Boqun Feng
  2021-10-09  2:06   ` Lai Jiangshan
  1 sibling, 1 reply; 6+ messages in thread
From: Boqun Feng @ 2021-10-08 10:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tejun Heo, Lai Jiangshan, Paul E . McKenney, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Boqun Feng

When requeuing a work to a different workqueue while it's still getting
processed, re-entrace as the follow can happen:

	{ both WQ1 and WQ2 are bounded workqueue, and a work W has been
	  queued on CPU0 for WQ1}

	CPU 0			CPU 1
	=====			====
	<In worker on CPU 0>
	process_one_work():
	  ...
	  // pick up W
	  worker->current_work = W;
	  worker->current_func = W->func;
	  ...
	  set_work_pool_and_clear_pending(...);
	  // W can be requeued afterwards
				queue_work_on(1, WQ2, W):
				  if (!test_and_set_bit(...)) {
				    // this branch is taken, as CPU 0
				    // just clears pending bit.
				    __queue_work(...):
				      pwq = <pool for CPU1 of WQ2>;
				      last_pool = <pool for CPU 0 of WQ1>;
				      if (last_pool != pwq->pool) { // true
				        if (.. && worker->current_pwq->wq == wq) {
					  // false, since @worker is a
					  // a worker of @last_pool (for
					  // WQ1), and @wq is WQ2.
					}
					...
					insert_work(pwq, W, ...);
				      }
				// W queued.
				<schedule to worker on CPU 1>
				process_one_work():
				  collision = find_worker_executing_work(..);
				  // NULL, because we're searching the
				  // worker pool of CPU 1, while W is
				  // the current work on worker pool of
				  // CPU 0.
				  worker->current_work = W;
				  worker->current_func = W->func;
	  worker->current_func(...);
	  			  ...
	  			  worker->current_func(...); // Re-entrance

This issue is already partially fixed because in queue_work_on(),
last_pool can be used to queue the work, as a result the requeued work
processing will find the collision and wait for the existing one to
finish. However, currently the last_pool is only used when two
workqueues are the same one, which causes the issue. Therefore extend
the behavior to allow last_pool to requeue the work W even if the
workqueues are different. It's safe to do this since the work W has been
proved safe to queue and run on the last_pool.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1418710bffcd..410141cc5f88 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1465,7 +1465,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
 
 		worker = find_worker_executing_work(last_pool, work);
 
-		if (worker && worker->current_pwq->wq == wq) {
+		if (worker) {
 			pwq = worker->current_pwq;
 		} else {
 			/* meh... not running there, queue here */
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue
  2021-10-08 10:04 ` [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue Boqun Feng
@ 2021-10-09  2:06   ` Lai Jiangshan
  2021-10-09  3:21     ` Boqun Feng
  0 siblings, 1 reply; 6+ messages in thread
From: Lai Jiangshan @ 2021-10-09  2:06 UTC (permalink / raw)
  To: Boqun Feng
  Cc: LKML, Tejun Heo, Paul E . McKenney, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker

On Fri, Oct 8, 2021 at 6:06 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> When requeuing a work to a different workqueue while it's still getting
> processed, re-entrace as the follow can happen:
>
>         { both WQ1 and WQ2 are bounded workqueue, and a work W has been
>           queued on CPU0 for WQ1}
>
>         CPU 0                   CPU 1
>         =====                   ====
>         <In worker on CPU 0>
>         process_one_work():
>           ...
>           // pick up W
>           worker->current_work = W;
>           worker->current_func = W->func;
>           ...
>           set_work_pool_and_clear_pending(...);
>           // W can be requeued afterwards
>                                 queue_work_on(1, WQ2, W):
>                                   if (!test_and_set_bit(...)) {
>                                     // this branch is taken, as CPU 0
>                                     // just clears pending bit.
>                                     __queue_work(...):
>                                       pwq = <pool for CPU1 of WQ2>;
>                                       last_pool = <pool for CPU 0 of WQ1>;
>                                       if (last_pool != pwq->pool) { // true
>                                         if (.. && worker->current_pwq->wq == wq) {
>                                           // false, since @worker is a
>                                           // a worker of @last_pool (for
>                                           // WQ1), and @wq is WQ2.
>                                         }
>                                         ...
>                                         insert_work(pwq, W, ...);
>                                       }
>                                 // W queued.
>                                 <schedule to worker on CPU 1>
>                                 process_one_work():
>                                   collision = find_worker_executing_work(..);
>                                   // NULL, because we're searching the
>                                   // worker pool of CPU 1, while W is
>                                   // the current work on worker pool of
>                                   // CPU 0.
>                                   worker->current_work = W;
>                                   worker->current_func = W->func;
>           worker->current_func(...);
>                                   ...
>                                   worker->current_func(...); // Re-entrance

Concurrent or parallel executions on the same work item aren't
considered as "Re-entrance" if the workqueue is changed.

It allows the work function to free itself(the item) and another
subsystem allocates the same item and reuses it.

"Re-entrance" is defined as:
  work function has not been changed
  wq has not been changed
  the item has not been reinitiated.
  (To reduce the check complication, the workqueue subsystem often
considers it "Re-entrance" if the condition is changed and has changed
back. But the wq users should not depend on this behavior and should avoid
it)


>
> This issue is already partially fixed because in queue_work_on(),
> last_pool can be used to queue the work, as a result the requeued work
> processing will find the collision and wait for the existing one to
> finish. However, currently the last_pool is only used when two
> workqueues are the same one, which causes the issue. Therefore extend
> the behavior to allow last_pool to requeue the work W even if the
> workqueues are different. It's safe to do this since the work W has been
> proved safe to queue and run on the last_pool.
>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  kernel/workqueue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 1418710bffcd..410141cc5f88 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1465,7 +1465,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
>
>                 worker = find_worker_executing_work(last_pool, work);
>
> -               if (worker && worker->current_pwq->wq == wq) {
> +               if (worker) {
>                         pwq = worker->current_pwq;
>                 } else {
>                         /* meh... not running there, queue here */
> --
> 2.32.0
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue
  2021-10-09  2:06   ` Lai Jiangshan
@ 2021-10-09  3:21     ` Boqun Feng
  2021-10-09 15:06       ` Boqun Feng
  0 siblings, 1 reply; 6+ messages in thread
From: Boqun Feng @ 2021-10-09  3:21 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: LKML, Tejun Heo, Paul E . McKenney, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker

On Sat, Oct 09, 2021 at 10:06:23AM +0800, Lai Jiangshan wrote:
> On Fri, Oct 8, 2021 at 6:06 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > When requeuing a work to a different workqueue while it's still getting
> > processed, re-entrace as the follow can happen:
> >
> >         { both WQ1 and WQ2 are bounded workqueue, and a work W has been
> >           queued on CPU0 for WQ1}
> >
> >         CPU 0                   CPU 1
> >         =====                   ====
> >         <In worker on CPU 0>
> >         process_one_work():
> >           ...
> >           // pick up W
> >           worker->current_work = W;
> >           worker->current_func = W->func;
> >           ...
> >           set_work_pool_and_clear_pending(...);
> >           // W can be requeued afterwards
> >                                 queue_work_on(1, WQ2, W):
> >                                   if (!test_and_set_bit(...)) {
> >                                     // this branch is taken, as CPU 0
> >                                     // just clears pending bit.
> >                                     __queue_work(...):
> >                                       pwq = <pool for CPU1 of WQ2>;
> >                                       last_pool = <pool for CPU 0 of WQ1>;
> >                                       if (last_pool != pwq->pool) { // true
> >                                         if (.. && worker->current_pwq->wq == wq) {
> >                                           // false, since @worker is a
> >                                           // a worker of @last_pool (for
> >                                           // WQ1), and @wq is WQ2.
> >                                         }
> >                                         ...
> >                                         insert_work(pwq, W, ...);
> >                                       }
> >                                 // W queued.
> >                                 <schedule to worker on CPU 1>
> >                                 process_one_work():
> >                                   collision = find_worker_executing_work(..);
> >                                   // NULL, because we're searching the
> >                                   // worker pool of CPU 1, while W is
> >                                   // the current work on worker pool of
> >                                   // CPU 0.
> >                                   worker->current_work = W;
> >                                   worker->current_func = W->func;
> >           worker->current_func(...);
> >                                   ...
> >                                   worker->current_func(...); // Re-entrance
> 
> Concurrent or parallel executions on the same work item aren't
> considered as "Re-entrance" if the workqueue is changed.
> 

Well, then Documentation/core-api/workqueue.rst can use some help:

"Note that the flag ``WQ_NON_REENTRANT`` no longer exists as all
workqueues are now non-reentrant - any work item is guaranteed to be
executed by at most one worker system-wide at any given time."

Clearly in the above case that a work item is executed by two worker at
the same time.

> It allows the work function to free itself(the item) and another
> subsystem allocates the same item and reuses it.
> 

So you're saying in process_one_work(), ->current_work can point to a
work which gets freed and reallocated before the worker actually
execute it? And users should guarantee it's safe to do so? I mean this
is something that workqueue subsystem allows/expects users to do?

> "Re-entrance" is defined as:
>   work function has not been changed
>   wq has not been changed
>   the item has not been reinitiated.
>   (To reduce the check complication, the workqueue subsystem often
> considers it "Re-entrance" if the condition is changed and has changed
> back. But the wq users should not depend on this behavior and should avoid
> it)
> 

Thanks for clarifiction, could you also update the documentation to
avoid future confusion? Thanks!

Regards,
Boqun

> 
> >
> > This issue is already partially fixed because in queue_work_on(),
> > last_pool can be used to queue the work, as a result the requeued work
> > processing will find the collision and wait for the existing one to
> > finish. However, currently the last_pool is only used when two
> > workqueues are the same one, which causes the issue. Therefore extend
> > the behavior to allow last_pool to requeue the work W even if the
> > workqueues are different. It's safe to do this since the work W has been
> > proved safe to queue and run on the last_pool.
> >
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > ---
> >  kernel/workqueue.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 1418710bffcd..410141cc5f88 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -1465,7 +1465,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
> >
> >                 worker = find_worker_executing_work(last_pool, work);
> >
> > -               if (worker && worker->current_pwq->wq == wq) {
> > +               if (worker) {
> >                         pwq = worker->current_pwq;
> >                 } else {
> >                         /* meh... not running there, queue here */
> > --
> > 2.32.0
> >

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue
  2021-10-09  3:21     ` Boqun Feng
@ 2021-10-09 15:06       ` Boqun Feng
  0 siblings, 0 replies; 6+ messages in thread
From: Boqun Feng @ 2021-10-09 15:06 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: LKML, Tejun Heo, Paul E . McKenney, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker

On Sat, Oct 09, 2021 at 11:21:58AM +0800, Boqun Feng wrote:
[...]
> > "Re-entrance" is defined as:
> >   work function has not been changed
> >   wq has not been changed
> >   the item has not been reinitiated.
> >   (To reduce the check complication, the workqueue subsystem often
> > considers it "Re-entrance" if the condition is changed and has changed
> > back. But the wq users should not depend on this behavior and should avoid
> > it)
> > 
> 
> Thanks for clarifiction, could you also update the documentation to
> avoid future confusion? Thanks!
> 

I come up with the following, will send a proper patch tomorrow. Looking
forwards to your and others' suggestions.

Regards,
Boqun

--------->8
diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst
index 541d31de8926..3b22ed137662 100644
--- a/Documentation/core-api/workqueue.rst
+++ b/Documentation/core-api/workqueue.rst
@@ -216,10 +216,6 @@ resources, scheduled and executed.

   This flag is meaningless for unbound wq.

-Note that the flag ``WQ_NON_REENTRANT`` no longer exists as all
-workqueues are now non-reentrant - any work item is guaranteed to be
-executed by at most one worker system-wide at any given time.
-

 ``max_active``
 --------------
@@ -391,6 +387,23 @@ the stack trace of the offending worker thread. ::
 The work item's function should be trivially visible in the stack
 trace.

+Non-reentrance Conditions
+=========================
+
+Workqueue guarantees that a work item cannot be re-entrant if the following
+conditions hold after a work item gets queued:
+
+        1. The work function hasn't been changed.
+        2. No one queues the work item to another workqueue.
+        3. The work item hasn't been reinitiated.
+
+In other words, if the above conditions hold, the work item is guaranteed to be
+executed by at most one worker system-wide at any given time.
+
+Note that requeuing the work item (to the same queue) in the self function
+doesn't break these conditions, so it's safe to do. Otherwise, caution is
+required when breaking the conditions inside a work function.
+

 Kernel Inline Documentations Reference
 ======================================


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-10-09 15:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-08 10:04 [RFC 0/2] Re-entrace of a work when requeued to a different workqueue Boqun Feng
2021-10-08 10:04 ` [RFC 1/2] NOT FOR MERGE: A selftest shows that re-entrance can happen Boqun Feng
2021-10-08 10:04 ` [RFC 2/2] workqueue: Fix work re-entrance when requeue to a different workqueue Boqun Feng
2021-10-09  2:06   ` Lai Jiangshan
2021-10-09  3:21     ` Boqun Feng
2021-10-09 15:06       ` Boqun Feng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).