All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu()
@ 2009-11-13  9:29 Tejun Heo
  2009-11-13  9:33 ` Ingo Molnar
  2009-11-13 15:38 ` Oleg Nesterov
  0 siblings, 2 replies; 5+ messages in thread
From: Tejun Heo @ 2009-11-13  9:29 UTC (permalink / raw)
  To: Oleg Nesterov, Ingo Molnar, Linus Torvalds, lkml

Commit 65a64464349883891e21e74af16c05d6e1eeb4e9 which allows
schedule_on_each_cpu() to be called from keventd added a race
condition.  schedule_on_each_cpu() may race with cpu hotplug and end
up executing the function twice on a cpu.

Fix it by moving direct execution into the section protected with
get/put_online_cpus().  While at it, update code such that direct
execution is done after works have been scheduled for all other cpus
and drop unnecessary cpu != orig test from flush loop.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
---
Andi, Oleg, this patch tested fine on my machine but it would be great
if you guys can ack it.  Ingo, upon ack, can you please route this
patch?

Thanks.

 kernel/workqueue.c |   28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

Index: work/kernel/workqueue.c
===================================================================
--- work.orig/kernel/workqueue.c
+++ work/kernel/workqueue.c
@@ -692,31 +692,29 @@ int schedule_on_each_cpu(work_func_t fun
 	if (!works)
 		return -ENOMEM;
 
+	get_online_cpus();
+
 	/*
-	 * when running in keventd don't schedule a work item on itself.
-	 * Can just call directly because the work queue is already bound.
-	 * This also is faster.
-	 * Make this a generic parameter for other workqueues?
+	 * When running in keventd don't schedule a work item on
+	 * itself.  Can just call directly because the work queue is
+	 * already bound.  This also is faster.
 	 */
-	if (current_is_keventd()) {
+	if (current_is_keventd())
 		orig = raw_smp_processor_id();
-		INIT_WORK(per_cpu_ptr(works, orig), func);
-		func(per_cpu_ptr(works, orig));
-	}
 
-	get_online_cpus();
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = per_cpu_ptr(works, cpu);
 
-		if (cpu == orig)
-			continue;
 		INIT_WORK(work, func);
-		schedule_work_on(cpu, work);
-	}
-	for_each_online_cpu(cpu) {
 		if (cpu != orig)
-			flush_work(per_cpu_ptr(works, cpu));
+			schedule_work_on(cpu, work);
 	}
+	if (orig >= 0)
+		func(per_cpu_ptr(works, orig));
+
+	for_each_online_cpu(cpu)
+		flush_work(per_cpu_ptr(works, cpu));
+
 	put_online_cpus();
 	free_percpu(works);
 	return 0;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu()
  2009-11-13  9:29 [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu() Tejun Heo
@ 2009-11-13  9:33 ` Ingo Molnar
  2009-11-13 15:38 ` Oleg Nesterov
  1 sibling, 0 replies; 5+ messages in thread
From: Ingo Molnar @ 2009-11-13  9:33 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton; +Cc: Oleg Nesterov, Ingo Molnar, Linus Torvalds, lkml


* Tejun Heo <tj@kernel.org> wrote:

> Commit 65a64464349883891e21e74af16c05d6e1eeb4e9 which allows
> schedule_on_each_cpu() to be called from keventd added a race
> condition.  schedule_on_each_cpu() may race with cpu hotplug and end
> up executing the function twice on a cpu.
> 
> Fix it by moving direct execution into the section protected with
> get/put_online_cpus().  While at it, update code such that direct
> execution is done after works have been scheduled for all other cpus
> and drop unnecessary cpu != orig test from flush loop.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> ---
> Andi, Oleg, this patch tested fine on my machine but it would be great
> if you guys can ack it.  Ingo, upon ack, can you please route this
> patch?
> 
> Thanks.

Andrew's doing workqueue patches - i've Cc:-ed him.

Looks good to me. Btw., 65a644643 should have been done via -mm, or at 
least it should have gathered proper Acks from interested parties before 
pushing the change upstream.

	Ingo

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu()
  2009-11-13  9:29 [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu() Tejun Heo
  2009-11-13  9:33 ` Ingo Molnar
@ 2009-11-13 15:38 ` Oleg Nesterov
  2009-11-14  9:01   ` Tejun Heo
  1 sibling, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2009-11-13 15:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, Linus Torvalds, lkml

On 11/13, Tejun Heo wrote:
>
> Commit 65a64464349883891e21e74af16c05d6e1eeb4e9 which allows
> schedule_on_each_cpu() to be called from keventd added a race
> condition.  schedule_on_each_cpu() may race with cpu hotplug and end
> up executing the function twice on a cpu.
>
> Fix it by moving direct execution into the section protected with
> get/put_online_cpus().  While at it, update code such that direct
> execution is done after works have been scheduled for all other cpus
> and drop unnecessary cpu != orig test from flush loop.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> ---
> Andi, Oleg, this patch tested fine on my machine but it would be great
> if you guys can ack it.  Ingo, upon ack, can you please route this
> patch?

Thanks, I think this is correct.

A very minor nit, schedule_on_each_cpu() still checks "orig" twice,
perhaps it makes sense to do

	for_each_online_cpu(cpu) {
		struct work_struct *work = per_cpu_ptr(works, cpu);

		INIT_WORK(work, func);

		if (likely(cpu != orig))
			schedule_work_on(cpu, work);
		else
			func(work);
	}

instead and simplify the code a little bit.

But this is minor and up to you.

Acked-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu()
  2009-11-13 15:38 ` Oleg Nesterov
@ 2009-11-14  9:01   ` Tejun Heo
  2009-11-14 18:49     ` Oleg Nesterov
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2009-11-14  9:01 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Ingo Molnar, Linus Torvalds, lkml

Hello, Oleg.

11/14/2009 12:38 AM, Oleg Nesterov wrote:
> A very minor nit, schedule_on_each_cpu() still checks "orig" twice,
> perhaps it makes sense to do
> 
> 	for_each_online_cpu(cpu) {
> 		struct work_struct *work = per_cpu_ptr(works, cpu);
> 
> 		INIT_WORK(work, func);
> 
> 		if (likely(cpu != orig))
> 			schedule_work_on(cpu, work);
> 		else
> 			func(work);
> 	}

The intention was to schedule works on all other cpus first and then
execute it on local cpu so that if it takes some time, it doesn't have
to go through the latency twice.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu()
  2009-11-14  9:01   ` Tejun Heo
@ 2009-11-14 18:49     ` Oleg Nesterov
  0 siblings, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2009-11-14 18:49 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Ingo Molnar, Linus Torvalds, lkml

On 11/14, Tejun Heo wrote:
>
> 11/14/2009 12:38 AM, Oleg Nesterov wrote:
> > A very minor nit, schedule_on_each_cpu() still checks "orig" twice,
> > perhaps it makes sense to do
> >
> > 	for_each_online_cpu(cpu) {
> > 		struct work_struct *work = per_cpu_ptr(works, cpu);
> >
> > 		INIT_WORK(work, func);
> >
> > 		if (likely(cpu != orig))
> > 			schedule_work_on(cpu, work);
> > 		else
> > 			func(work);
> > 	}
>
> The intention was to schedule works on all other cpus first and then
> execute it on local cpu so that if it takes some time, it doesn't have
> to go through the latency twice.

Ah, indeed, you are right.

Oleg.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-11-14 18:54 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-13  9:29 [PATCH 2.6.32-rc6] workqueue: fix race condition in schedule_on_each_cpu() Tejun Heo
2009-11-13  9:33 ` Ingo Molnar
2009-11-13 15:38 ` Oleg Nesterov
2009-11-14  9:01   ` Tejun Heo
2009-11-14 18:49     ` Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.