linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: + workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch added to -mm tree
       [not found] <200807220826.m6M8Q4I6018588@imap1.linux-foundation.org>
@ 2008-07-22  8:45 ` Oleg Nesterov
  2008-07-22  8:59   ` Akinobu Mita
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2008-07-22  8:45 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, akinobu.mita

On 07/22, Andrew Morton wrote:
> 
> From: Akinobu Mita <akinobu.mita@gmail.com>
> 
> Add proper error unwinding in error path in CPU_UP_PREPARE notifier.

Could you clarify?

> --- a/kernel/workqueue.c~workqueue-proper-error-unwinding-in-cpu-hotplug-error-path
> +++ a/kernel/workqueue.c
> @@ -928,6 +928,15 @@ static int __devinit workqueue_cpu_callb
>  				break;
>  			printk(KERN_ERR "workqueue [%s] for %i failed\n",
>  				wq->name, cpu);
> +
> +			list_for_each_entry_continue_reverse(wq, &workqueues,
> +							list) {
> +				cwq = per_cpu_ptr(wq->cpu_wq, cpu);
> +				start_workqueue_thread(cwq, -1);
> +				cleanup_workqueue_thread(cwq);
> +			}
> +			cpu_clear(cpu, cpu_populated_map);
> +
>  			return NOTIFY_BAD;

If CPU_UP_PREPARE fails, _cpu_up() sends CPU_UP_CANCELED, and afaics
workqueue_cpu_callback() correctly cleanups cwq->thread's.

Oleg.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: + workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch added to -mm tree
  2008-07-22  8:45 ` + workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch added to -mm tree Oleg Nesterov
@ 2008-07-22  8:59   ` Akinobu Mita
  2008-07-22  9:16     ` Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: Akinobu Mita @ 2008-07-22  8:59 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: akpm, linux-kernel

On Tue, Jul 22, 2008 at 12:45:26PM +0400, Oleg Nesterov wrote:
> On 07/22, Andrew Morton wrote:
> > 
> > From: Akinobu Mita <akinobu.mita@gmail.com>
> > 
> > Add proper error unwinding in error path in CPU_UP_PREPARE notifier.
> 
> Could you clarify?

Sure.

> > --- a/kernel/workqueue.c~workqueue-proper-error-unwinding-in-cpu-hotplug-error-path
> > +++ a/kernel/workqueue.c
> > @@ -928,6 +928,15 @@ static int __devinit workqueue_cpu_callb
> >  				break;
> >  			printk(KERN_ERR "workqueue [%s] for %i failed\n",
> >  				wq->name, cpu);
> > +
> > +			list_for_each_entry_continue_reverse(wq, &workqueues,
> > +							list) {
> > +				cwq = per_cpu_ptr(wq->cpu_wq, cpu);
> > +				start_workqueue_thread(cwq, -1);
> > +				cleanup_workqueue_thread(cwq);
> > +			}
> > +			cpu_clear(cpu, cpu_populated_map);
> > +
> >  			return NOTIFY_BAD;
> 
> If CPU_UP_PREPARE fails, _cpu_up() sends CPU_UP_CANCELED, and afaics
> workqueue_cpu_callback() correctly cleanups cwq->thread's.

_cpu_up() does not send CPU_UP_CANCELED to the callback which has
returned NOTIFY_BAD.

The behavior was changed by this commit:

commit a0d8cdb652d35af9319a9e0fb7134de2a276c636
Author: Akinobu Mita <akinobu.mita@gmail.com>
Date:   Thu Oct 18 03:05:12 2007 -0700

    cpu hotplug: cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE
    
    The functions in a CPU notifier chain is called with CPU_UP_PREPARE event
    before making the CPU online.  If one of the callback returns NOTIFY_BAD, it
    stops to deliver CPU_UP_PREPARE event, and CPU online operation is canceled.
    Then CPU_UP_CANCELED event is delivered to the functions in a CPU notifier
    chain again.
    
    This CPU_UP_CANCELED event is delivered to the functions which have been
    called with CPU_UP_PREPARE, not delivered to the functions which haven't been
    called with CPU_UP_PREPARE.
    
    The problem that makes existing cpu hotplug error handlings complex is that
    the CPU_UP_CANCELED event is delivered to the function that has returned
    NOTIFY_BAD, too.
    
    Usually we don't expect to call destructor function against the object that
    has failed to initialize.  It is like:
    
    	err = register_something();
    	if (err) {
    		unregister_something();
    		return err;
    	}
    
    So it is natural to deliver CPU_UP_CANCELED event only to the functions that
    have returned NOTIFY_OK with CPU_UP_PREPARE event and not to call the function
    that have returned NOTIFY_BAD.  This is what this patch is doing.
    
    Otherwise, every cpu hotplug notifiler has to track whether notifiler event is
    failed or not for each cpu.  (drivers/base/topology.c is doing this with
    topology_dev_map)
    
    Similary this patch makes same thing with CPU_DOWN_PREPARE and CPU_DOWN_FAILED
    evnets.
    
    Acked-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
    Cc: Gautham R Shenoy <ego@in.ibm.com>
    Cc: Oleg Nesterov <oleg@tv-sign.ru>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 38033db..a21f71a 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -150,6 +150,7 @@ static int _cpu_down(unsigned int cpu, int tasks_frozen)
 	err = __raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE | mod,
 					hcpu, -1, &nr_calls);
 	if (err == NOTIFY_BAD) {
+		nr_calls--;
 		__raw_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED | mod,
 					  hcpu, nr_calls, NULL);
 		printk("%s: attempt to take down CPU %u failed\n",
@@ -233,6 +234,7 @@ static int __cpuinit _cpu_up(unsigned int cpu, int tasks_frozen)
 	ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
 							-1, &nr_calls);
 	if (ret == NOTIFY_BAD) {
+		nr_calls--;
 		printk("%s: attempt to bring up CPU %u failed\n",
 				__FUNCTION__, cpu);
 		ret = -EINVAL;

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: + workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch added to -mm tree
  2008-07-22  8:59   ` Akinobu Mita
@ 2008-07-22  9:16     ` Oleg Nesterov
  2008-07-22  9:55       ` Akinobu Mita
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2008-07-22  9:16 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: akpm, linux-kernel

On 07/22, Akinobu Mita wrote:
>
> On Tue, Jul 22, 2008 at 12:45:26PM +0400, Oleg Nesterov wrote:
> > > 
> > > From: Akinobu Mita <akinobu.mita@gmail.com>
> > > 
> > > --- a/kernel/workqueue.c~workqueue-proper-error-unwinding-in-cpu-hotplug-error-path
> > > +++ a/kernel/workqueue.c
> > > @@ -928,6 +928,15 @@ static int __devinit workqueue_cpu_callb
> > >  				break;
> > >  			printk(KERN_ERR "workqueue [%s] for %i failed\n",
> > >  				wq->name, cpu);
> > > +
> > > +			list_for_each_entry_continue_reverse(wq, &workqueues,
> > > +							list) {
> > > +				cwq = per_cpu_ptr(wq->cpu_wq, cpu);
> > > +				start_workqueue_thread(cwq, -1);
> > > +				cleanup_workqueue_thread(cwq);
> > > +			}
> > > +			cpu_clear(cpu, cpu_populated_map);
> > > +
> > >  			return NOTIFY_BAD;
> > 
> > If CPU_UP_PREPARE fails, _cpu_up() sends CPU_UP_CANCELED, and afaics
> > workqueue_cpu_callback() correctly cleanups cwq->thread's.
> 
> _cpu_up() does not send CPU_UP_CANCELED to the callback which has
> returned NOTIFY_BAD.
>
> The behavior was changed by this commit:
> 
> commit a0d8cdb652d35af9319a9e0fb7134de2a276c636
> Author: Akinobu Mita <akinobu.mita@gmail.com>
> Date:   Thu Oct 18 03:05:12 2007 -0700
>
>     cpu hotplug: cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE

Thanks Akinobu!

Can't we simplify the fix? I don't like the fact that the CPU_UP_CANCELED
logic is duplicated.

What do you think about the patch below?

Oleg.

--- 26-rc2/kernel/workqueue.c~WQ_CPU_UP_PREPARE	2008-07-12 19:40:57.000000000 +0400
+++ 26-rc2/kernel/workqueue.c	2008-07-22 13:15:16.000000000 +0400
@@ -911,6 +911,7 @@ static int __devinit workqueue_cpu_callb
 	unsigned int cpu = (unsigned long)hcpu;
 	struct cpu_workqueue_struct *cwq;
 	struct workqueue_struct *wq;
+	int ret = NOTIFY_OK;
 
 	action &= ~CPU_TASKS_FROZEN;
 
@@ -919,6 +920,7 @@ static int __devinit workqueue_cpu_callb
 		cpu_set(cpu, cpu_populated_map);
 	}
 
+cancel:
 	list_for_each_entry(wq, &workqueues, list) {
 		cwq = per_cpu_ptr(wq->cpu_wq, cpu);
 
@@ -928,7 +930,9 @@ static int __devinit workqueue_cpu_callb
 				break;
 			printk(KERN_ERR "workqueue [%s] for %i failed\n",
 				wq->name, cpu);
-			return NOTIFY_BAD;
+			action = CPU_UP_CANCELED;
+			ret = NOTIFY_BAD;
+			goto cancel;
 
 		case CPU_ONLINE:
 			start_workqueue_thread(cwq, cpu);
@@ -948,7 +952,7 @@ static int __devinit workqueue_cpu_callb
 		cpu_clear(cpu, cpu_populated_map);
 	}
 
-	return NOTIFY_OK;
+	return ret;
 }
 
 void __init init_workqueues(void)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: + workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch added to -mm tree
  2008-07-22  9:16     ` Oleg Nesterov
@ 2008-07-22  9:55       ` Akinobu Mita
  2008-07-22 10:23         ` Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: Akinobu Mita @ 2008-07-22  9:55 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: akpm, linux-kernel

On Tue, Jul 22, 2008 at 01:16:36PM +0400, Oleg Nesterov wrote:
> Thanks Akinobu!

You are welcome!

> Can't we simplify the fix? I don't like the fact that the CPU_UP_CANCELED
> logic is duplicated.
> 
> What do you think about the patch below?

Yes, it is no duplication. But the goto usage in this patch looks bit
tricky to me.  Maybe it is better to factor out the list_for_each() block.

> 
> Oleg.
> 
> --- 26-rc2/kernel/workqueue.c~WQ_CPU_UP_PREPARE	2008-07-12 19:40:57.000000000 +0400
> +++ 26-rc2/kernel/workqueue.c	2008-07-22 13:15:16.000000000 +0400
> @@ -911,6 +911,7 @@ static int __devinit workqueue_cpu_callb
>  	unsigned int cpu = (unsigned long)hcpu;
>  	struct cpu_workqueue_struct *cwq;
>  	struct workqueue_struct *wq;
> +	int ret = NOTIFY_OK;
>  
>  	action &= ~CPU_TASKS_FROZEN;
>  
> @@ -919,6 +920,7 @@ static int __devinit workqueue_cpu_callb
>  		cpu_set(cpu, cpu_populated_map);
>  	}
>  
> +cancel:
>  	list_for_each_entry(wq, &workqueues, list) {
>  		cwq = per_cpu_ptr(wq->cpu_wq, cpu);
>  
> @@ -928,7 +930,9 @@ static int __devinit workqueue_cpu_callb
>  				break;
>  			printk(KERN_ERR "workqueue [%s] for %i failed\n",
>  				wq->name, cpu);
> -			return NOTIFY_BAD;
> +			action = CPU_UP_CANCELED;
> +			ret = NOTIFY_BAD;
> +			goto cancel;
>  
>  		case CPU_ONLINE:
>  			start_workqueue_thread(cwq, cpu);
> @@ -948,7 +952,7 @@ static int __devinit workqueue_cpu_callb
>  		cpu_clear(cpu, cpu_populated_map);
>  	}
>  
> -	return NOTIFY_OK;
> +	return ret;
>  }
>  
>  void __init init_workqueues(void)
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: + workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch added to -mm tree
  2008-07-22  9:55       ` Akinobu Mita
@ 2008-07-22 10:23         ` Oleg Nesterov
  2008-07-22 10:27           ` [PATCH] workqueues: do CPU_UP_CANCELED if CPU_UP_PREPARE fails Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: Oleg Nesterov @ 2008-07-22 10:23 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: akpm, linux-kernel

On 07/22, Akinobu Mita wrote:
>
> On Tue, Jul 22, 2008 at 01:16:36PM +0400, Oleg Nesterov wrote:
> 
> > Can't we simplify the fix? I don't like the fact that the CPU_UP_CANCELED
> > logic is duplicated.
> > 
> > What do you think about the patch below?
> 
> Yes, it is no duplication. But the goto usage in this patch looks bit
> tricky to me.

Well, the kernel is full of "goto again" / "goto retry". Not that I
claim this patch improves the readability ;)

But the duplication is really bad, imho.

> Maybe it is better to factor out the list_for_each() block.

Not sure I understand... do you mean add another function which
starts with list_for_each_entry ? This is not necessary, instead
of goto we can just do

		workqueue_cpu_callback(CPU_UP_CANCELED);
		return NOTIFY_BAD;

, in that case the patch will be one-liner.

Akinobu, Andrew, I'd suggest to drop

	workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch

, I'll send the new patch. If nothing else, it is simpler.

Oleg.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] workqueues: do CPU_UP_CANCELED if CPU_UP_PREPARE fails
  2008-07-22 10:23         ` Oleg Nesterov
@ 2008-07-22 10:27           ` Oleg Nesterov
  0 siblings, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2008-07-22 10:27 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: akpm, linux-kernel

(replaces workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch)

The bug was pointed out by Akinobu Mita <akinobu.mita@gmail.com>, and this
patch is based on his original patch.

workqueue_cpu_callback(CPU_UP_PREPARE) expects that if it returns NOTIFY_BAD,
_cpu_up() will send CPU_UP_CANCELED then.

However, this is not true since

	"cpu hotplug: cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE"
	commit: a0d8cdb652d35af9319a9e0fb7134de2a276c636

The callback which has returned NOTIFY_BAD will not receive CPU_UP_CANCELED.
Change the code to fulfil the CPU_UP_CANCELED logic if CPU_UP_PREPARE fails.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 26-rc2/kernel/workqueue.c~WQ_CPU_UP_PREPARE	2008-07-12 19:40:57.000000000 +0400
+++ 26-rc2/kernel/workqueue.c	2008-07-22 13:43:47.000000000 +0400
@@ -911,6 +911,7 @@ static int __devinit workqueue_cpu_callb
 	unsigned int cpu = (unsigned long)hcpu;
 	struct cpu_workqueue_struct *cwq;
 	struct workqueue_struct *wq;
+	int ret = NOTIFY_OK;
 
 	action &= ~CPU_TASKS_FROZEN;
 
@@ -918,7 +919,7 @@ static int __devinit workqueue_cpu_callb
 	case CPU_UP_PREPARE:
 		cpu_set(cpu, cpu_populated_map);
 	}
-
+undo:
 	list_for_each_entry(wq, &workqueues, list) {
 		cwq = per_cpu_ptr(wq->cpu_wq, cpu);
 
@@ -928,7 +929,9 @@ static int __devinit workqueue_cpu_callb
 				break;
 			printk(KERN_ERR "workqueue [%s] for %i failed\n",
 				wq->name, cpu);
-			return NOTIFY_BAD;
+			action = CPU_UP_CANCELED;
+			ret = NOTIFY_BAD;
+			goto undo;
 
 		case CPU_ONLINE:
 			start_workqueue_thread(cwq, cpu);
@@ -948,7 +951,7 @@ static int __devinit workqueue_cpu_callb
 		cpu_clear(cpu, cpu_populated_map);
 	}
 
-	return NOTIFY_OK;
+	return ret;
 }
 
 void __init init_workqueues(void)


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-07-22 10:23 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <200807220826.m6M8Q4I6018588@imap1.linux-foundation.org>
2008-07-22  8:45 ` + workqueue-proper-error-unwinding-in-cpu-hotplug-error-path.patch added to -mm tree Oleg Nesterov
2008-07-22  8:59   ` Akinobu Mita
2008-07-22  9:16     ` Oleg Nesterov
2008-07-22  9:55       ` Akinobu Mita
2008-07-22 10:23         ` Oleg Nesterov
2008-07-22 10:27           ` [PATCH] workqueues: do CPU_UP_CANCELED if CPU_UP_PREPARE fails Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).