* [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
@ 2009-03-07 10:54 Lai Jiangshan
  2009-03-07 17:29 ` Paul E. McKenney
  2009-03-08 16:00 ` Ingo Molnar
  0 siblings, 2 replies; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-07 10:54 UTC (permalink / raw)
To: Ingo Molnar, Paul E. McKenney, Peter Zijlstra, LKML

[RFC]
I don't like this patch, but I have thought about it for several days
and could not come up with a better one.

I very much hope that rcu_barrier() can be called anywhere (in any
sleepable context). But get_online_cpus() is a very coarse-grained
lock, and it limits rcu_barrier().

We can avoid get_online_cpus() easily for rcupreempt by using a new
rcu_barrier:

	void rcu_barrier(void)
	{
		for each rcu_data {
			lock rcu_data;
			if rcu_data is not empty,
				queue a callback for rcu_barrier;
			unlock rcu_data;
		}
	}

But we cannot use this algorithm for rcuclassic and rcutree:
rcu_data in rcuclassic and rcutree does not have a spinlock for
queueing callbacks.

From: Lai Jiangshan <laijs@cn.fujitsu.com>

CPU hotplug may happen asynchronously, so some RCU callbacks may still
be queued on a dead CPU. rcu_barrier() also needs to wait for those
callbacks to complete, so we must ensure that callbacks on a dead CPU
are migrated to an online CPU.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..4665d18 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -129,6 +129,8 @@ static void rcu_barrier_func(void *type)
 static void _rcu_barrier(enum rcu_barrier type)
 {
 	BUG_ON(in_interrupt());
+	/* Ensure callbacks in dead cpu are migrated to online cpu */
+	get_online_cpus();
 	/* Take cpucontrol mutex to protect against CPU hotplug */
 	mutex_lock(&rcu_barrier_mutex);
 	init_completion(&rcu_barrier_completion);
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
 	complete(&rcu_barrier_completion);
 	wait_for_completion(&rcu_barrier_completion);
 	mutex_unlock(&rcu_barrier_mutex);
+	put_online_cpus();
 }

 /**

^ permalink raw reply related	[flat|nested] 12+ messages in thread
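The per-rcu_data barrier idea from the pseudocode above can be modeled in plain userspace C. Everything below (the queue layout and names such as `sketch_rcu_barrier_enqueue`) is an illustrative sketch under a pthread mutex, not the kernel's rcupreempt data structures:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Toy callback and per-"CPU" callback queue. */
struct cb { void (*func)(struct cb *); struct cb *next; };

struct cpu_queue {
	pthread_mutex_t lock;	/* the per-queue lock the idea relies on */
	struct cb *head, *tail;
};

enum { NCPU = 4 };
static struct cpu_queue q[NCPU];
static struct cb barrier_cbs[NCPU];
static int barrier_outstanding;	/* queues the barrier still has to drain */

static void nop(struct cb *c) { (void)c; }	/* dummy work callback */

static void enqueue_locked(struct cpu_queue *cq, struct cb *c)
{
	c->next = NULL;
	if (cq->tail)
		cq->tail->next = c;
	else
		cq->head = c;
	cq->tail = c;
}

static void barrier_done(struct cb *c) { (void)c; barrier_outstanding--; }

/* The proposed rcu_barrier(): lock each queue; if it is non-empty,
 * append a barrier callback behind the callbacks already there. */
static void sketch_rcu_barrier_enqueue(void)
{
	for (int i = 0; i < NCPU; i++) {
		pthread_mutex_lock(&q[i].lock);
		if (q[i].head) {
			barrier_cbs[i].func = barrier_done;
			enqueue_locked(&q[i], &barrier_cbs[i]);
			barrier_outstanding++;
		}
		pthread_mutex_unlock(&q[i].lock);
	}
}

/* Invoke a queue's callbacks in order, as a grace period would. */
static void run_queue(struct cpu_queue *cq)
{
	struct cb *c = cq->head;
	cq->head = cq->tail = NULL;
	while (c) {
		struct cb *next = c->next;
		c->func(c);
		c = next;
	}
}
```

Because the barrier callback is queued behind existing callbacks under the same lock, the barrier completes only after every callback that preceded it, and empty queues need no callback at all.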
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
  2009-03-07 10:54 [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
@ 2009-03-07 17:29 ` Paul E. McKenney
  2009-03-08  2:58   ` Lai Jiangshan
  2009-03-08 16:00 ` Ingo Molnar
  1 sibling, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2009-03-07 17:29 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML

On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote:
> [RFC]
> I don't like this patch, but I thought for some days and I can't
> thought out a better one.
> 
> I'm very hope rcu_barrier() can be called anywhere(any sleepable context).
> But get_online_cpus() is a very large lock, it limits rcu_barrier().
> 
> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier:
> void rcu_barrier(void)
> {
> 	for each rcu_data {
> 		lock rcu_data;
> 		if rcu_data is not empty, queue a callback for rcu_barrier;
> 		unlock rcu_data;
> 	}
> }
> But we cannot use this algorithm for rcuclassic and rcutree,
> rcu_data in rcuclassic and rcutree have not a spinlock for queuing callback.
> 
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> 
> cpu hotplug may be happened asynchronously, some rcu callbacks are maybe
> still in dead cpu, rcu_barrier() also needs to wait for these rcu callbacks
> to complete, so we must ensure callbacks in dead cpu are migrated to
> online cpu.

Hmmm...  I thought that on_each_cpu() took care of interlocking with
CPU hotplug via smp_call_function().  During a CPU-hotplug operation,
the RCU callbacks do get migrated from the CPU going offline.  Are you
seeing a sequence of events that finds a hole in this approach?

Now, if a CPU were to go offline in the middle of smp_call_function()
there could be trouble, but I was under the impression that the
preempt_disable() in on_each_cpu() prevented this from happening.

So, please tell me more!
							Thanx, Paul

> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> index cae8a05..4665d18 100644
> --- a/kernel/rcupdate.c
> +++ b/kernel/rcupdate.c
> @@ -129,6 +129,8 @@ static void rcu_barrier_func(void *type)
>  static void _rcu_barrier(enum rcu_barrier type)
>  {
>  	BUG_ON(in_interrupt());
> +	/* Ensure callbacks in dead cpu are migrated to online cpu */
> +	get_online_cpus();
>  	/* Take cpucontrol mutex to protect against CPU hotplug */
>  	mutex_lock(&rcu_barrier_mutex);
>  	init_completion(&rcu_barrier_completion);
> @@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
>  	complete(&rcu_barrier_completion);
>  	wait_for_completion(&rcu_barrier_completion);
>  	mutex_unlock(&rcu_barrier_mutex);
> +	put_online_cpus();
>  }
> 
>  /**

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu 2009-03-07 17:29 ` Paul E. McKenney @ 2009-03-08 2:58 ` Lai Jiangshan 2009-03-08 6:20 ` Paul E. McKenney 0 siblings, 1 reply; 12+ messages in thread From: Lai Jiangshan @ 2009-03-08 2:58 UTC (permalink / raw) To: paulmck; +Cc: Ingo Molnar, Peter Zijlstra, LKML Paul E. McKenney wrote: > On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote: >> [RFC] >> I don't like this patch, but I thought for some days and I can't >> thought out a better one. >> >> I'm very hope rcu_barrier() can be called anywhere(any sleepable context). >> But get_online_cpus() is a very large lock, it limits rcu_barrier(). >> >> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier: >> void rcu_barrier(void) >> { >> for each rcu_data { >> lock rcu_data; >> if rcu_data is not empty, queue a callback for rcu_barrier; >> unlock rcu_data; >> } >> } >> But we cannot use this algorithm for rcuclassic and rcutree, >> rcu_data in rcuclassic and rcutree have not a spinlock for queuing callback. >> >> From: Lai Jiangshan <laijs@cn.fujitsu.com> >> >> cpu hotplug may be happened asynchronously, some rcu callbacks are maybe >> still in dead cpu, rcu_barrier() also needs to wait for these rcu callbacks >> to complete, so we must ensure callbacks in dead cpu are migrated to >> online cpu. > > Hmmm... I thought that on_each_cpu() took care of interlocking with > CPU hotplug via smp_call_function(). During a CPU-hotplug operation, > the RCU callbacks do get migrated from the CPU going offline. Are you > seeing a sequence of events that finds a hole in this approach? > > Now, if a CPU were to go offline in the middle of smp_call_function() > there could be trouble, but I was under the impression that the > preempt_disable() in on_each_cpu() prevented this from happening. > > So, please tell me more! 
> 

preempt_disable() ensures that an online cpu stays online until
preempt_enable(), but preempt_disable()/preempt_enable() cannot ensure
that the rcu callbacks have been migrated.

rcu_barrier()           |  _cpu_down()
                        |    __cpu_die() (cpu D is dead)
........................|............................
on_each_cpu()           |
........................|...........................
wait_for_completion()   |    rcu_offline_cpu() (move cpu D's
                        |      rcu callbacks to A, B, or C)

on_each_cpu() does not queue rcu_barrier_callback to cpu D (it is dead),
so rcu_barrier() will not wait for the callbacks that were originally on
cpu D.

We need to ensure that the callbacks on a dead cpu are migrated to an
online cpu before we call on_each_cpu().

Thanks, Lai.

^ permalink raw reply	[flat|nested] 12+ messages in thread
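The race in the timeline above can be modeled with a few counters. The numbers and names below are made up purely for illustration: one "dead" CPU still holds callbacks that the barrier, queued via on_each_cpu(), never covers:

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded model of the race: CPU "D" (index 3) has already gone
 * through __cpu_die(), but rcu_offline_cpu() has not yet migrated its
 * callbacks. */
enum { NCPU = 4, DEAD = 3 };

static bool online[NCPU]  = { true, true, true, false };
static int  pending[NCPU] = { 1, 1, 1, 2 };	/* D still holds 2 callbacks */

/* on_each_cpu() reaches only online CPUs, so a barrier queued through
 * it can only wait for callbacks sitting on online CPUs. */
static int barrier_coverage(void)
{
	int covered = 0;
	for (int i = 0; i < NCPU; i++)
		if (online[i])
			covered += pending[i];
	return covered;
}

static int total_pending(void)
{
	int n = 0;
	for (int i = 0; i < NCPU; i++)
		n += pending[i];
	return n;
}

/* The fix being discussed: migrate the dead CPU's callbacks to an
 * online CPU before the barrier runs. */
static void migrate_dead_callbacks(void)
{
	pending[0] += pending[DEAD];
	pending[DEAD] = 0;
}
```

Before migration the barrier accounts for 3 of the 5 pending callbacks; the 2 stranded on cpu D silently escape it, which is exactly the bug.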
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu 2009-03-08 2:58 ` Lai Jiangshan @ 2009-03-08 6:20 ` Paul E. McKenney 2009-03-09 2:56 ` Lai Jiangshan 0 siblings, 1 reply; 12+ messages in thread From: Paul E. McKenney @ 2009-03-08 6:20 UTC (permalink / raw) To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML On Sun, Mar 08, 2009 at 10:58:43AM +0800, Lai Jiangshan wrote: > Paul E. McKenney wrote: > > On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote: > >> [RFC] > >> I don't like this patch, but I thought for some days and I can't > >> thought out a better one. > >> > >> I'm very hope rcu_barrier() can be called anywhere(any sleepable context). > >> But get_online_cpus() is a very large lock, it limits rcu_barrier(). > >> > >> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier: > >> void rcu_barrier(void) > >> { > >> for each rcu_data { > >> lock rcu_data; > >> if rcu_data is not empty, queue a callback for rcu_barrier; > >> unlock rcu_data; > >> } > >> } > >> But we cannot use this algorithm for rcuclassic and rcutree, > >> rcu_data in rcuclassic and rcutree have not a spinlock for queuing callback. > >> > >> From: Lai Jiangshan <laijs@cn.fujitsu.com> > >> > >> cpu hotplug may be happened asynchronously, some rcu callbacks are maybe > >> still in dead cpu, rcu_barrier() also needs to wait for these rcu callbacks > >> to complete, so we must ensure callbacks in dead cpu are migrated to > >> online cpu. > > > > Hmmm... I thought that on_each_cpu() took care of interlocking with > > CPU hotplug via smp_call_function(). During a CPU-hotplug operation, > > the RCU callbacks do get migrated from the CPU going offline. Are you > > seeing a sequence of events that finds a hole in this approach? 
> > 
> > Now, if a CPU were to go offline in the middle of smp_call_function()
> > there could be trouble, but I was under the impression that the
> > preempt_disable() in on_each_cpu() prevented this from happening.
> > 
> > So, please tell me more!
> > 
> 
> preempt_disable() ensure online cpu is still online until preempt_enable(),
> but preempt_disable()/preempt_enable() can't ensure rcu callbacks migrated.
> 
> rcu_barrier()           |  _cpu_down()
>                         |    __cpu_die() (cpu D is dead)
> ........................|............................
> on_each_cpu()           |
> ........................|...........................
> wait_for_completion()   |    rcu_offline_cpu() (move cpu D's
>                         |      rcu callbacks to A,B,or C)
> 
> on_each_cpu() does not queue rcu_barrier_callback to cpu D(it's dead).
> So rcu_barrier() will not wait for callbacks which are original at cpu D.
> 
> We need ensure callbacks in dead cpu are migrated to online cpu before
> we call on_each_cpu().

Good catch!!!  I did indeed miss that possibility.  :-/

Hmmmm...  rcu_barrier() already acquires a global mutex, and is an
infrequent operation, so I am not all that worried about the
scalability.  But I agree that there should be a better way to do this.

One approach might be to have the dying CPU enqueue the rcu_barrier()
callback on its own list when it goes offline, during the
stop_machine() time period.
Have a per-CPU counter of the number of callbacks outstanding for each counter value. Then rcu_barrier() simply increments the rcu_barrier() counter, and waits until the number of outstanding callbacks corresponding to the old value drops to zero. This would get rid of the need for rcu_barrier() to enqueue callbacks, preventing the scenario above from arising in the first place. Other thoughts? And again, good catch!!! Thanx, Paul ^ permalink raw reply [flat|nested] 12+ messages in thread
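The counter-based scheme described above can be sketched as follows. The names and structure are assumptions for illustration, not an implementation: a one-bit barrier generation, and per-CPU counts of callbacks outstanding for each generation value:

```c
#include <assert.h>

enum { NCPU = 4 };

static int gen;			/* current barrier generation, 0 or 1 */
static int outstanding[NCPU][2];	/* callbacks per CPU per generation */

static void enqueue_callback(int cpu)
{
	outstanding[cpu][gen]++;	/* new callbacks carry the current bit */
}

static void invoke_callback(int cpu, int g)
{
	outstanding[cpu][g]--;	/* a finished callback drops its count */
}

static int remaining(int g)
{
	int n = 0;
	for (int i = 0; i < NCPU; i++)
		n += outstanding[i][g];
	return n;
}

/* rcu_barrier() flips the generation and then waits (here: polls via
 * remaining()) until the old generation's count drops to zero.  No
 * barrier callback is ever enqueued, so a dying CPU cannot be missed;
 * migrated callbacks simply carry their counts with them. */
static int sketch_rcu_barrier_begin(void)
{
	int old = gen;
	gen ^= 1;
	return old;
}
```

A callback enqueued after the barrier begins lands in the new generation and is deliberately not waited for, matching rcu_barrier()'s semantics.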
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu 2009-03-08 6:20 ` Paul E. McKenney @ 2009-03-09 2:56 ` Lai Jiangshan 2009-03-09 4:28 ` Paul E. McKenney 0 siblings, 1 reply; 12+ messages in thread From: Lai Jiangshan @ 2009-03-09 2:56 UTC (permalink / raw) To: paulmck; +Cc: Ingo Molnar, Peter Zijlstra, LKML Paul E. McKenney wrote: > On Sun, Mar 08, 2009 at 10:58:43AM +0800, Lai Jiangshan wrote: >> Paul E. McKenney wrote: >>> On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote: >>>> [RFC] >>>> I don't like this patch, but I thought for some days and I can't >>>> thought out a better one. >>>> >>>> I'm very hope rcu_barrier() can be called anywhere(any sleepable context). >>>> But get_online_cpus() is a very large lock, it limits rcu_barrier(). >>>> >>>> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier: >>>> void rcu_barrier(void) >>>> { >>>> for each rcu_data { >>>> lock rcu_data; >>>> if rcu_data is not empty, queue a callback for rcu_barrier; >>>> unlock rcu_data; >>>> } >>>> } >>>> But we cannot use this algorithm for rcuclassic and rcutree, >>>> rcu_data in rcuclassic and rcutree have not a spinlock for queuing callback. >>>> >>>> From: Lai Jiangshan <laijs@cn.fujitsu.com> >>>> >>>> cpu hotplug may be happened asynchronously, some rcu callbacks are maybe >>>> still in dead cpu, rcu_barrier() also needs to wait for these rcu callbacks >>>> to complete, so we must ensure callbacks in dead cpu are migrated to >>>> online cpu. >>> Hmmm... I thought that on_each_cpu() took care of interlocking with >>> CPU hotplug via smp_call_function(). During a CPU-hotplug operation, >>> the RCU callbacks do get migrated from the CPU going offline. Are you >>> seeing a sequence of events that finds a hole in this approach? 
>>> 
>>> Now, if a CPU were to go offline in the middle of smp_call_function()
>>> there could be trouble, but I was under the impression that the
>>> preempt_disable() in on_each_cpu() prevented this from happening.
>>> 
>>> So, please tell me more!
>>> 
>> preempt_disable() ensure online cpu is still online until preempt_enable(),
>> but preempt_disable()/preempt_enable() can't ensure rcu callbacks migrated.
>> 
>> rcu_barrier()           |  _cpu_down()
>>                         |    __cpu_die() (cpu D is dead)
>> ........................|............................
>> on_each_cpu()           |
>> ........................|...........................
>> wait_for_completion()   |    rcu_offline_cpu() (move cpu D's
>>                         |      rcu callbacks to A,B,or C)
>> 
>> on_each_cpu() does not queue rcu_barrier_callback to cpu D(it's dead).
>> So rcu_barrier() will not wait for callbacks which are original at cpu D.
>> 
>> We need ensure callbacks in dead cpu are migrated to online cpu before
>> we call on_each_cpu().
> 
> Good catch!!!  I did indeed miss that possibility.  :-/
> 
> Hmmmm...  rcu_barrier() already acquires a global mutex, and is an
> infrequent operation, so I am not all that worried about the scalability.

I do not worry about the scalability either. What I worry about is that
when we use get_online_cpus(), rcu_barrier() can no longer be called
anywhere (in any sleepable context).

Most locks in the kernel are acquired after cpu_hotplug.lock, so a path
that already holds one of those locks cannot call get_online_cpus()
(to avoid ABBA deadlock). So, if we use get_online_cpus() in
rcu_barrier(), we cannot use rcu_barrier() in most areas of the kernel.

> 
> But I agree that there should be a better way to do this.  One approach
> might be to the dying CPU enqueue the rcu_barrier() callback on its
> own list when it goes offline, during the stop_machine() time period.
> This enqueuing operation would require some care -- it would be necessary
> to check to see if the callback was already on the list, for example,
> as well as to properly adjust the rcu_barrier_completion() state.
> 
> Of course, it would also be necessary to handle the case where an
> rcu_barrier() callback was enqueued when there was no rcu_barrier()
> in flight, preferably by preventing this from happening.
> 
> An entirely different approach would be to steal a trick from CPU
> designers, and use a count of the number of rcu_barrier() calls (this
> counter could be a single bit).  Have a per-CPU counter of the number
> of callbacks outstanding for each counter value.  Then rcu_barrier()
> simply increments the rcu_barrier() counter, and waits until the
> number of outstanding callbacks corresponding to the old value drops
> to zero.  This would get rid of the need for rcu_barrier() to enqueue
> callbacks, preventing the scenario above from arising in the first
> place.
> 

Will you implement it with one of these better ways?

Lai

^ permalink raw reply	[flat|nested] 12+ messages in thread
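The ABBA concern raised above (most kernel locks nest inside cpu_hotplug.lock, so taking cpu_hotplug.lock while holding one of them inverts the order) can be illustrated with a toy lockdep-style order checker. Nothing here is kernel code; `HOTPLUG` stands in for cpu_hotplug.lock and `INNER` for some lock normally taken after it:

```c
#include <assert.h>

enum lock_id { HOTPLUG, INNER, NLOCKS };

static int after[NLOCKS][NLOCKS];	/* after[a][b]: b was taken while a held */
static int held[NLOCKS];
static int nheld;

/* Returns 1 if this acquisition inverts a previously seen order,
 * i.e. a potential ABBA deadlock. */
static int acquire(enum lock_id l)
{
	int inverted = 0;
	for (int i = 0; i < nheld; i++) {
		if (after[l][held[i]])	/* the opposite order was seen before */
			inverted = 1;
		after[held[i]][l] = 1;	/* record: held[i] -> l */
	}
	held[nheld++] = l;
	return inverted;
}

static void release(void) { nheld--; }
```

Once the common hotplug-then-inner order has been observed, a path that holds the inner lock and then calls into get_online_cpus() is exactly the inversion the checker flags.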
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu 2009-03-09 2:56 ` Lai Jiangshan @ 2009-03-09 4:28 ` Paul E. McKenney 0 siblings, 0 replies; 12+ messages in thread From: Paul E. McKenney @ 2009-03-09 4:28 UTC (permalink / raw) To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML On Mon, Mar 09, 2009 at 10:56:12AM +0800, Lai Jiangshan wrote: > Paul E. McKenney wrote: > > On Sun, Mar 08, 2009 at 10:58:43AM +0800, Lai Jiangshan wrote: > >> Paul E. McKenney wrote: > >>> On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote: > >>>> [RFC] > >>>> I don't like this patch, but I thought for some days and I can't > >>>> thought out a better one. > >>>> > >>>> I'm very hope rcu_barrier() can be called anywhere(any sleepable context). > >>>> But get_online_cpus() is a very large lock, it limits rcu_barrier(). > >>>> > >>>> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier: > >>>> void rcu_barrier(void) > >>>> { > >>>> for each rcu_data { > >>>> lock rcu_data; > >>>> if rcu_data is not empty, queue a callback for rcu_barrier; > >>>> unlock rcu_data; > >>>> } > >>>> } > >>>> But we cannot use this algorithm for rcuclassic and rcutree, > >>>> rcu_data in rcuclassic and rcutree have not a spinlock for queuing callback. > >>>> > >>>> From: Lai Jiangshan <laijs@cn.fujitsu.com> > >>>> > >>>> cpu hotplug may be happened asynchronously, some rcu callbacks are maybe > >>>> still in dead cpu, rcu_barrier() also needs to wait for these rcu callbacks > >>>> to complete, so we must ensure callbacks in dead cpu are migrated to > >>>> online cpu. > >>> Hmmm... I thought that on_each_cpu() took care of interlocking with > >>> CPU hotplug via smp_call_function(). During a CPU-hotplug operation, > >>> the RCU callbacks do get migrated from the CPU going offline. Are you > >>> seeing a sequence of events that finds a hole in this approach? 
> >>> > >>> Now, if a CPU were to go offline in the middle of smp_call_function() > >>> there could be trouble, but I was under the impression that the > >>> preempt_disable() in on_each_cpu() prevented this from happening. > >>> > >>> So, please tell me more! > >>> > >> preempt_disable() ensure online cpu is still online until preempt_enable(), > >> but preempt_disable()/preempt_enable() can't ensure rcu callbacks migrated. > >> > >> > >> rcu_barrier() | _cpu_down() > >> | __cpu_die() (cpu D is dead) > >> ........................|............................ > >> on_each_cpu() | > >> ........................|........................... > >> wait_for_completion() | rcu_offline_cpu() (move cpu D's > >> | rcu callbacks to A,B,or C) > >> > >> > >> on_each_cpu() does not queue rcu_barrier_callback to cpu D(it's dead). > >> So rcu_barrier() will not wait for callbacks which are original at cpu D. > >> > >> We need ensure callbacks in dead cpu are migrated to online cpu before > >> we call on_each_cpu(). > > > > Good catch!!! I did indeed miss that possibility. :-/ > > > > Hmmmm... rcu_barrier() already acquires a global mutex, and is an > > infrequent operation, so I am not all that worried about the scalability. > > I do not worry about the scalability either. > When we use get_online_cpus(), rcu_barrier() can not be called anywhere > (any sleepable context), this is what I worry about. > > Most locks in kernel are locked after cpu_hotplug.lock, > if a path has required one of these lock, it cannot call get_online_cpus(). > (to avoid ABBA deadlock) > So, if we use get_online_cpus() in rcu_barrier(), we cannot use rcu_barrier() > in most area in kernel. Yes, that could be painful. > > But I agree that there should be a better way to do this. One approach > > might be to the dying CPU enqueue the rcu_barrier() callback on its > > own list when it goes offline, during the stop_machine() time period. 
> > This enqueuing operation would require some care -- it would be necessary > > to check to see if the callback was already on the list, for example, > > as well as to properly adjust the rcu_barrier_completion() state. > > > > Of course, it would also be necessary to handle the case where an > > rcu_barrier() callback was enqueued when there was no rcu_barrier() > > in flight, preferably by preventing this from happening. > > > > An entirely different approach would be to steal a trick from CPU > > designers, and use a count of the number of rcu_barrier() calls (this > > counter could be a single bit). Have a per-CPU counter of the number > > of callbacks outstanding for each counter value. Then rcu_barrier() > > simply increments the rcu_barrier() counter, and waits until the > > number of outstanding callbacks corresponding to the old value drops > > to zero. This would get rid of the need for rcu_barrier() to enqueue > > callbacks, preventing the scenario above from arising in the first > > place. > > Will you implement it with one of better ways? I would start after I get the idr crash resolved -- first patch already done (http://lkml.org/lkml/2009/3/7/133), but problems remain. So if you want to get started, I would be happy to review. Your tail-pointer trick would be needed to allow the counts to be correctly maintained when migrating callbacks, of course. Thanx, Paul ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
  2009-03-07 10:54 [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
  2009-03-07 17:29 ` Paul E. McKenney
@ 2009-03-08 16:00 ` Ingo Molnar
  2009-03-19  3:06   ` Lai Jiangshan
  1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2009-03-08 16:00 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Paul E. McKenney, Peter Zijlstra, LKML

* Lai Jiangshan <laijs@cn.fujitsu.com> wrote:

> [RFC]
> I don't like this patch, but I thought for some days and I can't
> thought out a better one.

Interesting find. Found via code review or via testing? If via
testing, what is the symptom of the bug when it hits - did you
see CPU hotplug stress-tests hanging? Crashing too perhaps? How
frequently did it occur?

It also appears we introduced this bug around 2.6.26 (when the CPU
hotplug code was rewritten) - or we may have had it before that as
well. We are late in the -rc's, so i'm trying to scope whether it's
a 2.6.29 must-have or whether it can wait for 2.6.30.

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
  2009-03-08 16:00 ` Ingo Molnar
@ 2009-03-19  3:06   ` Lai Jiangshan
  2009-03-19  4:05     ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-19 3:06 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Paul E. McKenney, Peter Zijlstra, LKML

Ingo Molnar wrote:
> * Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
> 
>> [RFC]
>> I don't like this patch, but I thought for some days and I can't
>> thought out a better one.
> 
> Interesting find. Found via code review or via testing? If via
> testing, what is the symptom of the bug when it hits - did you
> see CPU hotplug stress-tests hanging? Crashing too perhaps? How
> frequently did it occur?

I found this bug when I tested the draft version of kfree_rcu() (V3).

I noticed that kfree_rcu_cpu_notify() is called earlier than
rcu_cpu_notify(). This means rcu_barrier() is called earlier than the
RCU callback migration, so a lockup was expected. But in fact this
lockup did not occur; I tried to explore why, and I found that
rcu_barrier() does not handle cpu_hotplug at all. This involves two
bugs.

kfree_rcu() (V3) (V4 is available too; it will be sent soon):
http://lkml.org/lkml/2009/3/6/156

The V1 fix of this bug:
http://lkml.org/lkml/2009/3/7/38

The fix of the other bug (it changed the scheduler's code too):
http://lkml.org/lkml/2009/3/7/39

Subject: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu (V2)

CPU hotplug may happen asynchronously, so some RCU callbacks may still
be queued on a dead CPU. rcu_barrier() also needs to wait for those
callbacks to complete, so we must ensure that callbacks on a dead CPU
are migrated to an online CPU.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
 	}
 }

+static inline void wait_migrated_callbacks(void);
+
 /*
  * Orchestrate the specified type of RCU barrier, waiting for all
  * RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
 	complete(&rcu_barrier_completion);
 	wait_for_completion(&rcu_barrier_completion);
 	mutex_unlock(&rcu_barrier_mutex);
+	wait_migrated_callbacks();
 }

 /**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
 }
 EXPORT_SYMBOL_GPL(rcu_barrier_sched);

+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+	if (atomic_dec_and_test(&rcu_migrate_type_count))
+		wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+	wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+		unsigned long action, void *hcpu)
+{
+	if (action == CPU_DYING) {
+		/*
+		 * preempt_disable() in on_each_cpu() prevents stop_machine(),
+		 * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+		 * returns, all online cpus have queued rcu_barrier_func(),
+		 * and the dead cpu(if it exist) queues rcu_migrate_callback()s.
+		 *
+		 * These callbacks ensure _rcu_barrier() waits for all
+		 * RCU callbacks of the specified type to complete.
+		 */
+		atomic_set(&rcu_migrate_type_count, 3);
+		call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+		call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+		call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+	} else if (action == CPU_POST_DEAD) {
+		/* rcu_migrate_head is protected by cpu_add_remove_lock */
+		wait_migrated_callbacks();
+	}
+
+	return NOTIFY_OK;
+}
+
 void __init rcu_init(void)
 {
 	__rcu_init();
+	hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
 }

 void rcu_scheduler_starting(void)

^ permalink raw reply related	[flat|nested] 12+ messages in thread
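The counting at the heart of the patch above can be sanity-checked with a userspace analogue. This is a sketch only: `cpu_dying()`, `migration_complete()`, and the plain `wakeups` counter stand in for the CPU_DYING notifier, `wait_event()`, and `wake_up()` respectively:

```c
#include <assert.h>
#include <stdatomic.h>

/* CPU_DYING posts one callback per RCU flavour (three in all);
 * waiters proceed once every one of them has run. */
static atomic_int rcu_migrate_type_count;
static int wakeups;		/* stands in for wake_up(&rcu_migrate_wq) */

static void rcu_migrate_callback(void)
{
	/* Mirrors atomic_dec_and_test(): only the final decrement wakes. */
	if (atomic_fetch_sub(&rcu_migrate_type_count, 1) == 1)
		wakeups++;
}

/* Mirrors the wait_event() condition: nothing left outstanding. */
static int migration_complete(void)
{
	return atomic_load(&rcu_migrate_type_count) == 0;
}

/* CPU_DYING: one callback per flavour (call_rcu_bh, call_rcu_sched,
 * and call_rcu in the real patch). */
static void cpu_dying(void)
{
	atomic_store(&rcu_migrate_type_count, 3);
}
```

Because the counter is set to 3 before any callback can run, the wake-up fires exactly once, on the last flavour's callback, and both rcu_barrier() and the CPU_POST_DEAD path can safely wait on the same condition.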
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu 2009-03-19 3:06 ` Lai Jiangshan @ 2009-03-19 4:05 ` Paul E. McKenney [not found] ` <20090319082237.GA32179@elte.hu> 0 siblings, 1 reply; 12+ messages in thread From: Paul E. McKenney @ 2009-03-19 4:05 UTC (permalink / raw) To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML On Thu, Mar 19, 2009 at 11:06:39AM +0800, Lai Jiangshan wrote: > Ingo Molnar wrote: > > * Lai Jiangshan <laijs@cn.fujitsu.com> wrote: > > > >> [RFC] > >> I don't like this patch, but I thought for some days and I can't > >> thought out a better one. > > > > Interesting find. Found via code review or via testing? If via > > testing, what is the symptom of the bug when it hits - did you > > see CPU hotplug stress-tests hanging? Crashing too perhaps? How > > frequently did it occur? > > I found this bug when I tested the draft version of kfree_rcu(V3). > > I noticed kfree_rcu_cpu_notify() is called earlier than > rcu_cpu_notify(). This means rcu_barrier() is called earlier than > RCU callbacks migration, it should lockup as expectation. But actually, > this lockup can not occurred, I tried to explore it, and I found that > rcu_barrier() does not handle cpu_hotplug. It includes two bugs. > > kfree_rcu(V3) (V4 is available too, it will be sent soon): > http://lkml.org/lkml/2009/3/6/156 > > The V1 fix of this bug: > http://lkml.org/lkml/2009/3/7/38 > > The fix of the other bug: (it changed the scheduler's code too) > http://lkml.org/lkml/2009/3/7/39 > > Subject: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu (V2) > > cpu hotplug may be happened asynchronously, some rcu callbacks are maybe > still in dead cpu, rcu_barrier() also needs to wait for these rcu callbacks > to complete, so we must ensure callbacks in dead cpu are migrated to > online cpu. Good stuff, Lai!!! 
Simpler than any of the approaches that I was considering, and, better
yet, independent of the underlying RCU implementation!!!

I was initially worried that wake_up() might wake only one of two
possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD code,
but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids that
issue.

I was also worried about the fact that different RCU implementations
have different mappings of call_rcu(), call_rcu_bh(), and
call_rcu_sched(), but this is OK as well because we just get an extra
(harmless) callback in the case that they map together (for example,
Classic RCU has call_rcu_sched() mapping to call_rcu()).

Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
and any stray callbacks that arrive (for example, from irq handlers
running on the dying CPU) either are ahead of the CPU_DYING callbacks on
the one hand (and thus accounted for), or happened after the
rcu_barrier() started on the other (and thus don't need to be accounted
for).

So...

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> index cae8a05..2c7b845 100644
> --- a/kernel/rcupdate.c
> +++ b/kernel/rcupdate.c
> @@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
>  	}
>  }
> 
> +static inline void wait_migrated_callbacks(void);
> +
>  /*
>   * Orchestrate the specified type of RCU barrier, waiting for all
>   * RCU callbacks of the specified type to complete.
> @@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
>  	complete(&rcu_barrier_completion);
>  	wait_for_completion(&rcu_barrier_completion);
>  	mutex_unlock(&rcu_barrier_mutex);
> +	wait_migrated_callbacks();
>  }
> 
>  /**
> @@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
>  }
>  EXPORT_SYMBOL_GPL(rcu_barrier_sched);
> 
> +static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
> +static struct rcu_head rcu_migrate_head[3];
> +static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
> +
> +static void rcu_migrate_callback(struct rcu_head *notused)
> +{
> +	if (atomic_dec_and_test(&rcu_migrate_type_count))
> +		wake_up(&rcu_migrate_wq);
> +}
> +
> +static inline void wait_migrated_callbacks(void)
> +{
> +	wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
> +}
> +
> +static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
> +		unsigned long action, void *hcpu)
> +{
> +	if (action == CPU_DYING) {
> +		/*
> +		 * preempt_disable() in on_each_cpu() prevents stop_machine(),
> +		 * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
> +		 * returns, all online cpus have queued rcu_barrier_func(),
> +		 * and the dead cpu(if it exist) queues rcu_migrate_callback()s.
> +		 *
> +		 * These callbacks ensure _rcu_barrier() waits for all
> +		 * RCU callbacks of the specified type to complete.
> +		 */
> +		atomic_set(&rcu_migrate_type_count, 3);
> +		call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
> +		call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
> +		call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
> +	} else if (action == CPU_POST_DEAD) {
> +		/* rcu_migrate_head is protected by cpu_add_remove_lock */
> +		wait_migrated_callbacks();
> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
>  void __init rcu_init(void)
>  {
>  	__rcu_init();
> +	hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
>  }
> 
>  void rcu_scheduler_starting(void)
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread
[parent not found: <20090319082237.GA32179@elte.hu>]
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
From: Lai Jiangshan @ 2009-03-20 9:40 UTC (permalink / raw)
To: Ingo Molnar, Paul E. McKenney, Peter Zijlstra, LKML

Ingo Molnar wrote:
>
> Lai, would you mind to resend the full patch with an updated
> changelog that adds Paul's review tag, against latest -tip?
>

Subject: [PATCH -tip] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu (V2)

CPU hotplug may happen asynchronously: some RCU callbacks may still be
queued on the dead cpu, and rcu_barrier() needs to wait for those
callbacks to complete as well, so we must ensure that callbacks on a
dead cpu are migrated to an online cpu.

Paul E. McKenney's review:

  Good stuff, Lai!!!  Simpler than any of the approaches that I was
  considering, and, better yet, independent of the underlying RCU
  implementation!!!

  I was initially worried that wake_up() might wake only one of two
  possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD
  code, but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids
  that issue.

  I was also worried about the fact that different RCU implementations
  have different mappings of call_rcu(), call_rcu_bh(), and
  call_rcu_sched(), but this is OK as well because we just get an extra
  (harmless) callback in the case that they map together (for example,
  Classic RCU has call_rcu_sched() mapping to call_rcu()).

  Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
  and any stray callbacks that arrive (for example, from irq handlers
  running on the dying CPU) either are ahead of the CPU_DYING callbacks
  on the one hand (and thus accounted for), or happened after the
  rcu_barrier() started on the other (and thus don't need to be
  accounted for).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
 	}
 }
 
+static inline void wait_migrated_callbacks(void);
+
 /*
  * Orchestrate the specified type of RCU barrier, waiting for all
  * RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
 	complete(&rcu_barrier_completion);
 	wait_for_completion(&rcu_barrier_completion);
 	mutex_unlock(&rcu_barrier_mutex);
+	wait_migrated_callbacks();
 }
 
 /**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
 }
 EXPORT_SYMBOL_GPL(rcu_barrier_sched);
 
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+	if (atomic_dec_and_test(&rcu_migrate_type_count))
+		wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+	wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+		unsigned long action, void *hcpu)
+{
+	if (action == CPU_DYING) {
+		/*
+		 * preempt_disable() in on_each_cpu() prevents stop_machine(),
+		 * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+		 * returns, all online cpus have queued rcu_barrier_func(),
+		 * and the dead cpu(if it exist) queues rcu_migrate_callback()s.
+		 *
+		 * These callbacks ensure _rcu_barrier() waits for all
+		 * RCU callbacks of the specified type to complete.
+		 */
+		atomic_set(&rcu_migrate_type_count, 3);
+		call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+		call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+		call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+	} else if (action == CPU_POST_DEAD) {
+		/* rcu_migrate_head is protected by cpu_add_remove_lock */
+		wait_migrated_callbacks();
+	}
+
+	return NOTIFY_OK;
+}
+
 void __init rcu_init(void)
 {
 	__rcu_init();
+	hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
 }
 
 void rcu_scheduler_starting(void)
* [tip:core/rcu] rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
From: Lai Jiangshan @ 2009-03-20 20:00 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulmck, hpa, mingo, peterz, tglx, laijs, mingo

Commit-ID:  04cb9ac11ba898049c99dc67ede6e68f6d700dee
Gitweb:     http://git.kernel.org/tip/04cb9ac11ba898049c99dc67ede6e68f6d700dee
Author:     Lai Jiangshan <laijs@cn.fujitsu.com>
AuthorDate: Fri, 20 Mar 2009 17:40:06 +0800
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 20 Mar 2009 19:37:33 +0100

rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu

cpu hotplug may happen asynchronously, some rcu callbacks are maybe
still on dead cpu, rcu_barrier() also needs to wait for these rcu
callbacks to complete, so we must ensure callbacks in dead cpu are
migrated to online cpu.

Paul E. McKenney's review:

  Good stuff, Lai!!!  Simpler than any of the approaches that I was
  considering, and, better yet, independent of the underlying RCU
  implementation!!!

  I was initially worried that wake_up() might wake only one of two
  possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD
  code, but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids
  that issue.

  I was also worried about the fact that different RCU implementations
  have different mappings of call_rcu(), call_rcu_bh(), and
  call_rcu_sched(), but this is OK as well because we just get an extra
  (harmless) callback in the case that they map together (for example,
  Classic RCU has call_rcu_sched() mapping to call_rcu()).

  Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
  and any stray callbacks that arrive (for example, from irq handlers
  running on the dying CPU) either are ahead of the CPU_DYING callbacks
  on the one hand (and thus accounted for), or happened after the
  rcu_barrier() started on the other (and thus don't need to be
  accounted for).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <49C36476.1010400@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/rcupdate.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
 	}
 }
 
+static inline void wait_migrated_callbacks(void);
+
 /*
  * Orchestrate the specified type of RCU barrier, waiting for all
  * RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
 	complete(&rcu_barrier_completion);
 	wait_for_completion(&rcu_barrier_completion);
 	mutex_unlock(&rcu_barrier_mutex);
+	wait_migrated_callbacks();
 }
 
 /**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
 }
 EXPORT_SYMBOL_GPL(rcu_barrier_sched);
 
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+	if (atomic_dec_and_test(&rcu_migrate_type_count))
+		wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+	wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+		unsigned long action, void *hcpu)
+{
+	if (action == CPU_DYING) {
+		/*
+		 * preempt_disable() in on_each_cpu() prevents stop_machine(),
+		 * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+		 * returns, all online cpus have queued rcu_barrier_func(),
+		 * and the dead cpu(if it exist) queues rcu_migrate_callback()s.
+		 *
+		 * These callbacks ensure _rcu_barrier() waits for all
+		 * RCU callbacks of the specified type to complete.
+		 */
+		atomic_set(&rcu_migrate_type_count, 3);
+		call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+		call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+		call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+	} else if (action == CPU_POST_DEAD) {
+		/* rcu_migrate_head is protected by cpu_add_remove_lock */
+		wait_migrated_callbacks();
+	}
+
+	return NOTIFY_OK;
+}
+
 void __init rcu_init(void)
 {
 	__rcu_init();
+	hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
 }
 
 void rcu_scheduler_starting(void)
* [tip:core/rcu] rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
From: Lai Jiangshan @ 2009-03-30 22:12 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulmck, hpa, mingo, peterz, tglx, laijs, mingo

Commit-ID:  f69b17d7e745d8edd7c0d90390cbaa77e63c5ea3
Gitweb:     http://git.kernel.org/tip/f69b17d7e745d8edd7c0d90390cbaa77e63c5ea3
Author:     Lai Jiangshan <laijs@cn.fujitsu.com>
AuthorDate: Fri, 20 Mar 2009 17:40:06 +0800
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 31 Mar 2009 00:09:37 +0200

rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu

cpu hotplug may happen asynchronously, some rcu callbacks are maybe
still on dead cpu, rcu_barrier() also needs to wait for these rcu
callbacks to complete, so we must ensure callbacks in dead cpu are
migrated to online cpu.

Paul E. McKenney's review:

  Good stuff, Lai!!!  Simpler than any of the approaches that I was
  considering, and, better yet, independent of the underlying RCU
  implementation!!!

  I was initially worried that wake_up() might wake only one of two
  possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD
  code, but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids
  that issue.

  I was also worried about the fact that different RCU implementations
  have different mappings of call_rcu(), call_rcu_bh(), and
  call_rcu_sched(), but this is OK as well because we just get an extra
  (harmless) callback in the case that they map together (for example,
  Classic RCU has call_rcu_sched() mapping to call_rcu()).

  Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
  and any stray callbacks that arrive (for example, from irq handlers
  running on the dying CPU) either are ahead of the CPU_DYING callbacks
  on the one hand (and thus accounted for), or happened after the
  rcu_barrier() started on the other (and thus don't need to be
  accounted for).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <49C36476.1010400@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/rcupdate.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
 	}
 }
 
+static inline void wait_migrated_callbacks(void);
+
 /*
  * Orchestrate the specified type of RCU barrier, waiting for all
  * RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
 	complete(&rcu_barrier_completion);
 	wait_for_completion(&rcu_barrier_completion);
 	mutex_unlock(&rcu_barrier_mutex);
+	wait_migrated_callbacks();
 }
 
 /**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
 }
 EXPORT_SYMBOL_GPL(rcu_barrier_sched);
 
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+	if (atomic_dec_and_test(&rcu_migrate_type_count))
+		wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+	wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+		unsigned long action, void *hcpu)
+{
+	if (action == CPU_DYING) {
+		/*
+		 * preempt_disable() in on_each_cpu() prevents stop_machine(),
+		 * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+		 * returns, all online cpus have queued rcu_barrier_func(),
+		 * and the dead cpu(if it exist) queues rcu_migrate_callback()s.
+		 *
+		 * These callbacks ensure _rcu_barrier() waits for all
+		 * RCU callbacks of the specified type to complete.
+		 */
+		atomic_set(&rcu_migrate_type_count, 3);
+		call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+		call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+		call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+	} else if (action == CPU_POST_DEAD) {
+		/* rcu_migrate_head is protected by cpu_add_remove_lock */
+		wait_migrated_callbacks();
+	}
+
+	return NOTIFY_OK;
+}
+
 void __init rcu_init(void)
 {
 	__rcu_init();
+	hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
 }
 
 void rcu_scheduler_starting(void)
end of thread, other threads: [~2009-03-30 22:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-03-07 10:54 [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
2009-03-07 17:29 ` Paul E. McKenney
2009-03-08  2:58   ` Lai Jiangshan
2009-03-08  6:20     ` Paul E. McKenney
2009-03-09  2:56       ` Lai Jiangshan
2009-03-09  4:28         ` Paul E. McKenney
2009-03-08 16:00 ` Ingo Molnar
2009-03-19  3:06   ` Lai Jiangshan
2009-03-19  4:05     ` Paul E. McKenney
     [not found] ` <20090319082237.GA32179@elte.hu>
2009-03-20  9:40   ` Lai Jiangshan
2009-03-20 20:00     ` [tip:core/rcu] rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
2009-03-30 22:12     ` Lai Jiangshan