* [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
@ 2009-03-07 10:54 Lai Jiangshan
2009-03-07 17:29 ` Paul E. McKenney
2009-03-08 16:00 ` Ingo Molnar
0 siblings, 2 replies; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-07 10:54 UTC (permalink / raw)
To: Ingo Molnar, Paul E. McKenney, Peter Zijlstra, LKML
[RFC]
I don't like this patch, but I have thought about it for several days
and cannot come up with a better one.
I very much hope rcu_barrier() can be called anywhere (in any sleepable context).
But get_online_cpus() is a very coarse lock, and it limits rcu_barrier().
We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier:
void rcu_barrier(void)
{
	for each rcu_data {
		lock rcu_data;
		if rcu_data is not empty, queue a callback for rcu_barrier;
		unlock rcu_data;
	}
}
But we cannot use this algorithm for rcuclassic and rcutree: in those
implementations, rcu_data has no spinlock for queuing callbacks.
From: Lai Jiangshan <laijs@cn.fujitsu.com>
CPU hotplug may happen asynchronously, so some RCU callbacks may still be
queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
to complete, so we must ensure that callbacks on a dead CPU are migrated
to an online CPU.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..4665d18 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -129,6 +129,8 @@ static void rcu_barrier_func(void *type)
static void _rcu_barrier(enum rcu_barrier type)
{
BUG_ON(in_interrupt());
+ /* Ensure callbacks in dead cpu are migrated to online cpu */
+ get_online_cpus();
/* Take cpucontrol mutex to protect against CPU hotplug */
mutex_lock(&rcu_barrier_mutex);
init_completion(&rcu_barrier_completion);
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
complete(&rcu_barrier_completion);
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
+ put_online_cpus();
}
/**
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-07 10:54 [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
@ 2009-03-07 17:29 ` Paul E. McKenney
2009-03-08 2:58 ` Lai Jiangshan
2009-03-08 16:00 ` Ingo Molnar
1 sibling, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2009-03-07 17:29 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML
On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote:
> [RFC]
> I don't like this patch, but I have thought about it for several days
> and cannot come up with a better one.
>
> I very much hope rcu_barrier() can be called anywhere (in any sleepable context).
> But get_online_cpus() is a very coarse lock, and it limits rcu_barrier().
>
> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier:
> void rcu_barrier(void)
> {
> 	for each rcu_data {
> 		lock rcu_data;
> 		if rcu_data is not empty, queue a callback for rcu_barrier;
> 		unlock rcu_data;
> 	}
> }
> But we cannot use this algorithm for rcuclassic and rcutree: in those
> implementations, rcu_data has no spinlock for queuing callbacks.
>
> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>
> CPU hotplug may happen asynchronously, so some RCU callbacks may still be
> queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
> to complete, so we must ensure that callbacks on a dead CPU are migrated
> to an online CPU.
Hmmm... I thought that on_each_cpu() took care of interlocking with
CPU hotplug via smp_call_function(). During a CPU-hotplug operation,
the RCU callbacks do get migrated from the CPU going offline. Are you
seeing a sequence of events that finds a hole in this approach?
Now, if a CPU were to go offline in the middle of smp_call_function()
there could be trouble, but I was under the impression that the
preempt_disable() in on_each_cpu() prevented this from happening.
So, please tell me more!
Thanx, Paul
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> index cae8a05..4665d18 100644
> --- a/kernel/rcupdate.c
> +++ b/kernel/rcupdate.c
> @@ -129,6 +129,8 @@ static void rcu_barrier_func(void *type)
> static void _rcu_barrier(enum rcu_barrier type)
> {
> BUG_ON(in_interrupt());
> + /* Ensure callbacks in dead cpu are migrated to online cpu */
> + get_online_cpus();
> /* Take cpucontrol mutex to protect against CPU hotplug */
> mutex_lock(&rcu_barrier_mutex);
> init_completion(&rcu_barrier_completion);
> @@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
> complete(&rcu_barrier_completion);
> wait_for_completion(&rcu_barrier_completion);
> mutex_unlock(&rcu_barrier_mutex);
> + put_online_cpus();
> }
>
> /**
>
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-07 17:29 ` Paul E. McKenney
@ 2009-03-08 2:58 ` Lai Jiangshan
2009-03-08 6:20 ` Paul E. McKenney
0 siblings, 1 reply; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-08 2:58 UTC (permalink / raw)
To: paulmck; +Cc: Ingo Molnar, Peter Zijlstra, LKML
Paul E. McKenney wrote:
> On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote:
>> [RFC]
>> I don't like this patch, but I have thought about it for several days
>> and cannot come up with a better one.
>>
>> I very much hope rcu_barrier() can be called anywhere (in any sleepable context).
>> But get_online_cpus() is a very coarse lock, and it limits rcu_barrier().
>>
>> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier:
>> void rcu_barrier(void)
>> {
>> 	for each rcu_data {
>> 		lock rcu_data;
>> 		if rcu_data is not empty, queue a callback for rcu_barrier;
>> 		unlock rcu_data;
>> 	}
>> }
>> But we cannot use this algorithm for rcuclassic and rcutree: in those
>> implementations, rcu_data has no spinlock for queuing callbacks.
>>
>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>
>> CPU hotplug may happen asynchronously, so some RCU callbacks may still be
>> queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
>> to complete, so we must ensure that callbacks on a dead CPU are migrated
>> to an online CPU.
>
> Hmmm... I thought that on_each_cpu() took care of interlocking with
> CPU hotplug via smp_call_function(). During a CPU-hotplug operation,
> the RCU callbacks do get migrated from the CPU going offline. Are you
> seeing a sequence of events that finds a hole in this approach?
>
> Now, if a CPU were to go offline in the middle of smp_call_function()
> there could be trouble, but I was under the impression that the
> preempt_disable() in on_each_cpu() prevented this from happening.
>
> So, please tell me more!
>
preempt_disable() ensures an online CPU stays online until preempt_enable(),
but preempt_disable()/preempt_enable() cannot ensure that RCU callbacks
have been migrated.
rcu_barrier()           | _cpu_down()
                        | __cpu_die()  (cpu D is dead)
........................|............................
on_each_cpu()           |
........................|............................
wait_for_completion()   | rcu_offline_cpu()  (moves cpu D's
                        |   rcu callbacks to A, B, or C)
on_each_cpu() does not queue rcu_barrier_callback() on CPU D (it is dead),
so rcu_barrier() will not wait for the callbacks that were originally on CPU D.
We need to ensure that callbacks on a dead CPU are migrated to an online CPU
before we call on_each_cpu().
Thanks, Lai.
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-08 2:58 ` Lai Jiangshan
@ 2009-03-08 6:20 ` Paul E. McKenney
2009-03-09 2:56 ` Lai Jiangshan
0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2009-03-08 6:20 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML
On Sun, Mar 08, 2009 at 10:58:43AM +0800, Lai Jiangshan wrote:
> Paul E. McKenney wrote:
> > On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote:
> >> [RFC]
> >> I don't like this patch, but I have thought about it for several days
> >> and cannot come up with a better one.
> >>
> >> I very much hope rcu_barrier() can be called anywhere (in any sleepable context).
> >> But get_online_cpus() is a very coarse lock, and it limits rcu_barrier().
> >>
> >> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier:
> >> void rcu_barrier(void)
> >> {
> >> 	for each rcu_data {
> >> 		lock rcu_data;
> >> 		if rcu_data is not empty, queue a callback for rcu_barrier;
> >> 		unlock rcu_data;
> >> 	}
> >> }
> >> But we cannot use this algorithm for rcuclassic and rcutree: in those
> >> implementations, rcu_data has no spinlock for queuing callbacks.
> >>
> >> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> >>
> >> CPU hotplug may happen asynchronously, so some RCU callbacks may still be
> >> queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
> >> to complete, so we must ensure that callbacks on a dead CPU are migrated
> >> to an online CPU.
> >
> > Hmmm... I thought that on_each_cpu() took care of interlocking with
> > CPU hotplug via smp_call_function(). During a CPU-hotplug operation,
> > the RCU callbacks do get migrated from the CPU going offline. Are you
> > seeing a sequence of events that finds a hole in this approach?
> >
> > Now, if a CPU were to go offline in the middle of smp_call_function()
> > there could be trouble, but I was under the impression that the
> > preempt_disable() in on_each_cpu() prevented this from happening.
> >
> > So, please tell me more!
> >
>
> preempt_disable() ensures an online CPU stays online until preempt_enable(),
> but preempt_disable()/preempt_enable() cannot ensure that RCU callbacks
> have been migrated.
>
>
> rcu_barrier()           | _cpu_down()
>                         | __cpu_die()  (cpu D is dead)
> ........................|............................
> on_each_cpu()           |
> ........................|............................
> wait_for_completion()   | rcu_offline_cpu()  (moves cpu D's
>                         |   rcu callbacks to A, B, or C)
>
>
> on_each_cpu() does not queue rcu_barrier_callback() on CPU D (it is dead),
> so rcu_barrier() will not wait for the callbacks that were originally on CPU D.
>
> We need to ensure that callbacks on a dead CPU are migrated to an online CPU
> before we call on_each_cpu().
Good catch!!! I did indeed miss that possibility. :-/
Hmmmm... rcu_barrier() already acquires a global mutex, and is an
infrequent operation, so I am not all that worried about the scalability.
But I agree that there should be a better way to do this. One approach
might be to have the dying CPU enqueue the rcu_barrier() callback on its
own list when it goes offline, during the stop_machine() time period.
This enqueuing operation would require some care -- it would be necessary
to check to see if the callback was already on the list, for example,
as well as to properly adjust the rcu_barrier_completion() state.
Of course, it would also be necessary to handle the case where an
rcu_barrier() callback was enqueued when there was no rcu_barrier()
in flight, preferably by preventing this from happening.
An entirely different approach would be to steal a trick from CPU
designers, and use a count of the number of rcu_barrier() calls (this
counter could be a single bit). Have a per-CPU counter of the number
of callbacks outstanding for each counter value. Then rcu_barrier()
simply increments the rcu_barrier() counter, and waits until the
number of outstanding callbacks corresponding to the old value drops
to zero. This would get rid of the need for rcu_barrier() to enqueue
callbacks, preventing the scenario above from arising in the first
place.
Other thoughts?
And again, good catch!!!
Thanx, Paul
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-07 10:54 [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
2009-03-07 17:29 ` Paul E. McKenney
@ 2009-03-08 16:00 ` Ingo Molnar
2009-03-19 3:06 ` Lai Jiangshan
1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2009-03-08 16:00 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Paul E. McKenney, Peter Zijlstra, LKML
* Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
> [RFC]
> I don't like this patch, but I have thought about it for several days
> and cannot come up with a better one.
Interesting find. Found via code review or via testing? If via
testing, what is the symptom of the bug when it hits - did you
see CPU hotplug stress-tests hanging? Crashing too perhaps? How
frequently did it occur?
It also appears that we introduced this bug around 2.6.26 (when the
CPU hotplug code was rewritten), or that we had it before that as well.
We are late in the -rc cycle, so I'm trying to scope whether this is a
2.6.29 must-have or whether it can wait for 2.6.30.
Ingo
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-08 6:20 ` Paul E. McKenney
@ 2009-03-09 2:56 ` Lai Jiangshan
2009-03-09 4:28 ` Paul E. McKenney
0 siblings, 1 reply; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-09 2:56 UTC (permalink / raw)
To: paulmck; +Cc: Ingo Molnar, Peter Zijlstra, LKML
Paul E. McKenney wrote:
> On Sun, Mar 08, 2009 at 10:58:43AM +0800, Lai Jiangshan wrote:
>> Paul E. McKenney wrote:
>>> On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote:
>>>> [RFC]
>>>> I don't like this patch, but I have thought about it for several days
>>>> and cannot come up with a better one.
>>>>
>>>> I very much hope rcu_barrier() can be called anywhere (in any sleepable context).
>>>> But get_online_cpus() is a very coarse lock, and it limits rcu_barrier().
>>>>
>>>> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier:
>>>> void rcu_barrier(void)
>>>> {
>>>> 	for each rcu_data {
>>>> 		lock rcu_data;
>>>> 		if rcu_data is not empty, queue a callback for rcu_barrier;
>>>> 		unlock rcu_data;
>>>> 	}
>>>> }
>>>> But we cannot use this algorithm for rcuclassic and rcutree: in those
>>>> implementations, rcu_data has no spinlock for queuing callbacks.
>>>>
>>>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
>>>>
>>>> CPU hotplug may happen asynchronously, so some RCU callbacks may still be
>>>> queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
>>>> to complete, so we must ensure that callbacks on a dead CPU are migrated
>>>> to an online CPU.
>>> Hmmm... I thought that on_each_cpu() took care of interlocking with
>>> CPU hotplug via smp_call_function(). During a CPU-hotplug operation,
>>> the RCU callbacks do get migrated from the CPU going offline. Are you
>>> seeing a sequence of events that finds a hole in this approach?
>>>
>>> Now, if a CPU were to go offline in the middle of smp_call_function()
>>> there could be trouble, but I was under the impression that the
>>> preempt_disable() in on_each_cpu() prevented this from happening.
>>>
>>> So, please tell me more!
>>>
>> preempt_disable() ensures an online CPU stays online until preempt_enable(),
>> but preempt_disable()/preempt_enable() cannot ensure that RCU callbacks
>> have been migrated.
>>
>>
>> rcu_barrier()           | _cpu_down()
>>                         | __cpu_die()  (cpu D is dead)
>> ........................|............................
>> on_each_cpu()           |
>> ........................|............................
>> wait_for_completion()   | rcu_offline_cpu()  (moves cpu D's
>>                         |   rcu callbacks to A, B, or C)
>>
>>
>> on_each_cpu() does not queue rcu_barrier_callback() on CPU D (it is dead),
>> so rcu_barrier() will not wait for the callbacks that were originally on CPU D.
>>
>> We need to ensure that callbacks on a dead CPU are migrated to an online CPU
>> before we call on_each_cpu().
>
> Good catch!!! I did indeed miss that possibility. :-/
>
> Hmmmm... rcu_barrier() already acquires a global mutex, and is an
> infrequent operation, so I am not all that worried about the scalability.
I do not worry about the scalability either. What I worry about is that
when we use get_online_cpus(), rcu_barrier() cannot be called from just
anywhere (from any sleepable context).
Most locks in the kernel are acquired after cpu_hotplug.lock; if a code
path already holds one of those locks, it cannot call get_online_cpus()
(to avoid an ABBA deadlock). So if we use get_online_cpus() in
rcu_barrier(), we cannot use rcu_barrier() in most areas of the kernel.
>
> But I agree that there should be a better way to do this. One approach
> might be to have the dying CPU enqueue the rcu_barrier() callback on its
> own list when it goes offline, during the stop_machine() time period.
> This enqueuing operation would require some care -- it would be necessary
> to check to see if the callback was already on the list, for example,
> as well as to properly adjust the rcu_barrier_completion() state.
>
> Of course, it would also be necessary to handle the case where an
> rcu_barrier() callback was enqueued when there was no rcu_barrier()
> in flight, preferably by preventing this from happening.
>
> An entirely different approach would be to steal a trick from CPU
> designers, and use a count of the number of rcu_barrier() calls (this
> counter could be a single bit). Have a per-CPU counter of the number
> of callbacks outstanding for each counter value. Then rcu_barrier()
> simply increments the rcu_barrier() counter, and waits until the
> number of outstanding callbacks corresponding to the old value drops
> to zero. This would get rid of the need for rcu_barrier() to enqueue
> callbacks, preventing the scenario above from arising in the first
> place.
>
Will you implement it using one of these better approaches?
Lai
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-09 2:56 ` Lai Jiangshan
@ 2009-03-09 4:28 ` Paul E. McKenney
0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2009-03-09 4:28 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML
On Mon, Mar 09, 2009 at 10:56:12AM +0800, Lai Jiangshan wrote:
> Paul E. McKenney wrote:
> > On Sun, Mar 08, 2009 at 10:58:43AM +0800, Lai Jiangshan wrote:
> >> Paul E. McKenney wrote:
> >>> On Sat, Mar 07, 2009 at 06:54:38PM +0800, Lai Jiangshan wrote:
> >>>> [RFC]
> >>>> I don't like this patch, but I have thought about it for several days
> >>>> and cannot come up with a better one.
> >>>>
> >>>> I very much hope rcu_barrier() can be called anywhere (in any sleepable context).
> >>>> But get_online_cpus() is a very coarse lock, and it limits rcu_barrier().
> >>>>
> >>>> We can avoid get_online_cpus() easily for rcupreempt by using a new rcu_barrier:
> >>>> void rcu_barrier(void)
> >>>> {
> >>>> 	for each rcu_data {
> >>>> 		lock rcu_data;
> >>>> 		if rcu_data is not empty, queue a callback for rcu_barrier;
> >>>> 		unlock rcu_data;
> >>>> 	}
> >>>> }
> >>>> But we cannot use this algorithm for rcuclassic and rcutree: in those
> >>>> implementations, rcu_data has no spinlock for queuing callbacks.
> >>>>
> >>>> From: Lai Jiangshan <laijs@cn.fujitsu.com>
> >>>>
> >>>> CPU hotplug may happen asynchronously, so some RCU callbacks may still be
> >>>> queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
> >>>> to complete, so we must ensure that callbacks on a dead CPU are migrated
> >>>> to an online CPU.
> >>> Hmmm... I thought that on_each_cpu() took care of interlocking with
> >>> CPU hotplug via smp_call_function(). During a CPU-hotplug operation,
> >>> the RCU callbacks do get migrated from the CPU going offline. Are you
> >>> seeing a sequence of events that finds a hole in this approach?
> >>>
> >>> Now, if a CPU were to go offline in the middle of smp_call_function()
> >>> there could be trouble, but I was under the impression that the
> >>> preempt_disable() in on_each_cpu() prevented this from happening.
> >>>
> >>> So, please tell me more!
> >>>
> >> preempt_disable() ensures an online CPU stays online until preempt_enable(),
> >> but preempt_disable()/preempt_enable() cannot ensure that RCU callbacks
> >> have been migrated.
> >>
> >>
> >> rcu_barrier()           | _cpu_down()
> >>                         | __cpu_die()  (cpu D is dead)
> >> ........................|............................
> >> on_each_cpu()           |
> >> ........................|............................
> >> wait_for_completion()   | rcu_offline_cpu()  (moves cpu D's
> >>                         |   rcu callbacks to A, B, or C)
> >>
> >>
> >> on_each_cpu() does not queue rcu_barrier_callback() on CPU D (it is dead),
> >> so rcu_barrier() will not wait for the callbacks that were originally on CPU D.
> >>
> >> We need to ensure that callbacks on a dead CPU are migrated to an online CPU
> >> before we call on_each_cpu().
> >
> > Good catch!!! I did indeed miss that possibility. :-/
> >
> > Hmmmm... rcu_barrier() already acquires a global mutex, and is an
> > infrequent operation, so I am not all that worried about the scalability.
>
> I do not worry about the scalability either. What I worry about is that
> when we use get_online_cpus(), rcu_barrier() cannot be called from just
> anywhere (from any sleepable context).
>
> Most locks in the kernel are acquired after cpu_hotplug.lock; if a code
> path already holds one of those locks, it cannot call get_online_cpus()
> (to avoid an ABBA deadlock). So if we use get_online_cpus() in
> rcu_barrier(), we cannot use rcu_barrier() in most areas of the kernel.
Yes, that could be painful.
> > But I agree that there should be a better way to do this. One approach
> > might be to have the dying CPU enqueue the rcu_barrier() callback on its
> > own list when it goes offline, during the stop_machine() time period.
> > This enqueuing operation would require some care -- it would be necessary
> > to check to see if the callback was already on the list, for example,
> > as well as to properly adjust the rcu_barrier_completion() state.
> >
> > Of course, it would also be necessary to handle the case where an
> > rcu_barrier() callback was enqueued when there was no rcu_barrier()
> > in flight, preferably by preventing this from happening.
> >
> > An entirely different approach would be to steal a trick from CPU
> > designers, and use a count of the number of rcu_barrier() calls (this
> > counter could be a single bit). Have a per-CPU counter of the number
> > of callbacks outstanding for each counter value. Then rcu_barrier()
> > simply increments the rcu_barrier() counter, and waits until the
> > number of outstanding callbacks corresponding to the old value drops
> > to zero. This would get rid of the need for rcu_barrier() to enqueue
> > callbacks, preventing the scenario above from arising in the first
> > place.
>
> Will you implement it using one of these better approaches?
I would start after I get the idr crash resolved -- first patch already
done (http://lkml.org/lkml/2009/3/7/133), but problems remain. So if
you want to get started, I would be happy to review. Your tail-pointer
trick would be needed to allow the counts to be correctly maintained
when migrating callbacks, of course.
Thanx, Paul
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-08 16:00 ` Ingo Molnar
@ 2009-03-19 3:06 ` Lai Jiangshan
2009-03-19 4:05 ` Paul E. McKenney
0 siblings, 1 reply; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-19 3:06 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Paul E. McKenney, Peter Zijlstra, LKML
Ingo Molnar wrote:
> * Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
>
>> [RFC]
>> I don't like this patch, but I have thought about it for several days
>> and cannot come up with a better one.
>
> Interesting find. Found via code review or via testing? If via
> testing, what is the symptom of the bug when it hits - did you
> see CPU hotplug stress-tests hanging? Crashing too perhaps? How
> frequently did it occur?
I found this bug when I tested the draft version of kfree_rcu() (V3).
I noticed that kfree_rcu_cpu_notify() is called earlier than
rcu_cpu_notify(). This means rcu_barrier() is called before the RCU
callbacks are migrated, so I expected a lockup. But the lockup did not
occur; when I explored why, I found that rcu_barrier() does not handle
CPU hotplug. There are two bugs involved.
kfree_rcu() V3 (V4 is available too and will be sent soon):
http://lkml.org/lkml/2009/3/6/156
The V1 fix for this bug:
http://lkml.org/lkml/2009/3/7/38
The fix for the other bug (it also changes the scheduler's code):
http://lkml.org/lkml/2009/3/7/39
Subject: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu (V2)
CPU hotplug may happen asynchronously, so some RCU callbacks may still be
queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
to complete, so we must ensure that callbacks on a dead CPU are migrated
to an online CPU.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
}
}
+static inline void wait_migrated_callbacks(void);
+
/*
* Orchestrate the specified type of RCU barrier, waiting for all
* RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
complete(&rcu_barrier_completion);
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
+ wait_migrated_callbacks();
}
/**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
}
EXPORT_SYMBOL_GPL(rcu_barrier_sched);
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+ if (atomic_dec_and_test(&rcu_migrate_type_count))
+ wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+ wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ if (action == CPU_DYING) {
+ /*
+ * preempt_disable() in on_each_cpu() prevents stop_machine(),
+ * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+ * returns, all online cpus have queued rcu_barrier_func(),
+ * and the dead cpu (if it exists) queues rcu_migrate_callback()s.
+ *
+ * These callbacks ensure _rcu_barrier() waits for all
+ * RCU callbacks of the specified type to complete.
+ */
+ atomic_set(&rcu_migrate_type_count, 3);
+ call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+ call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+ call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+ } else if (action == CPU_POST_DEAD) {
+ /* rcu_migrate_head is protected by cpu_add_remove_lock */
+ wait_migrated_callbacks();
+ }
+
+ return NOTIFY_OK;
+}
+
void __init rcu_init(void)
{
__rcu_init();
+ hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
}
void rcu_scheduler_starting(void)
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-19 3:06 ` Lai Jiangshan
@ 2009-03-19 4:05 ` Paul E. McKenney
[not found] ` <20090319082237.GA32179@elte.hu>
0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2009-03-19 4:05 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Ingo Molnar, Peter Zijlstra, LKML
On Thu, Mar 19, 2009 at 11:06:39AM +0800, Lai Jiangshan wrote:
> Ingo Molnar wrote:
> > * Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
> >
> >> [RFC]
> >> I don't like this patch, but I have thought about it for several days
> >> and cannot come up with a better one.
> >
> > Interesting find. Found via code review or via testing? If via
> > testing, what is the symptom of the bug when it hits - did you
> > see CPU hotplug stress-tests hanging? Crashing too perhaps? How
> > frequently did it occur?
>
> I found this bug when I tested the draft version of kfree_rcu() (V3).
>
> I noticed that kfree_rcu_cpu_notify() is called earlier than
> rcu_cpu_notify(). This means rcu_barrier() is called before the RCU
> callbacks are migrated, so I expected a lockup. But the lockup did not
> occur; when I explored why, I found that rcu_barrier() does not handle
> CPU hotplug. There are two bugs involved.
>
> kfree_rcu() V3 (V4 is available too and will be sent soon):
> http://lkml.org/lkml/2009/3/6/156
>
> The V1 fix for this bug:
> http://lkml.org/lkml/2009/3/7/38
>
> The fix for the other bug (it also changes the scheduler's code):
> http://lkml.org/lkml/2009/3/7/39
>
> Subject: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu (V2)
>
> CPU hotplug may happen asynchronously, so some RCU callbacks may still be
> queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
> to complete, so we must ensure that callbacks on a dead CPU are migrated
> to an online CPU.
Good stuff, Lai!!! Simpler than any of the approaches that I was
considering, and, better yet, independent of the underlying RCU
implementation!!!
I was initially worried that wake_up() might wake only one of two
possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD code,
but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids that issue.
I was also worried about the fact that different RCU implementations have
different mappings of call_rcu(), call_rcu_bh(), and call_rcu_sched(), but
this is OK as well because we just get an extra (harmless) callback in the
case that they map together (for example, Classic RCU has call_rcu_sched()
mapping to call_rcu()).
Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
and any stray callbacks that arrive (for example, from irq handlers
running on the dying CPU) either are ahead of the CPU_DYING callbacks on
the one hand (and thus accounted for), or happened after the rcu_barrier()
started on the other (and thus don't need to be accounted for).
So...
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> index cae8a05..2c7b845 100644
> --- a/kernel/rcupdate.c
> +++ b/kernel/rcupdate.c
> @@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
> }
> }
>
> +static inline void wait_migrated_callbacks(void);
> +
> /*
> * Orchestrate the specified type of RCU barrier, waiting for all
> * RCU callbacks of the specified type to complete.
> @@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
> complete(&rcu_barrier_completion);
> wait_for_completion(&rcu_barrier_completion);
> mutex_unlock(&rcu_barrier_mutex);
> + wait_migrated_callbacks();
> }
>
> /**
> @@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
> }
> EXPORT_SYMBOL_GPL(rcu_barrier_sched);
>
> +static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
> +static struct rcu_head rcu_migrate_head[3];
> +static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
> +
> +static void rcu_migrate_callback(struct rcu_head *notused)
> +{
> + if (atomic_dec_and_test(&rcu_migrate_type_count))
> + wake_up(&rcu_migrate_wq);
> +}
> +
> +static inline void wait_migrated_callbacks(void)
> +{
> + wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
> +}
> +
> +static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
> + unsigned long action, void *hcpu)
> +{
> + if (action == CPU_DYING) {
> + /*
> + * preempt_disable() in on_each_cpu() prevents stop_machine(),
> + * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
> + * returns, all online cpus have queued rcu_barrier_func(),
> + * and the dead cpu (if it exists) queues rcu_migrate_callback()s.
> + *
> + * These callbacks ensure _rcu_barrier() waits for all
> + * RCU callbacks of the specified type to complete.
> + */
> + atomic_set(&rcu_migrate_type_count, 3);
> + call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
> + call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
> + call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
> + } else if (action == CPU_POST_DEAD) {
> + /* rcu_migrate_head is protected by cpu_add_remove_lock */
> + wait_migrated_callbacks();
> + }
> +
> + return NOTIFY_OK;
> +}
> +
> void __init rcu_init(void)
> {
> __rcu_init();
> + hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
> }
>
> void rcu_scheduler_starting(void)
>
* Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
[not found] ` <20090319082237.GA32179@elte.hu>
@ 2009-03-20 9:40 ` Lai Jiangshan
2009-03-20 20:00 ` [tip:core/rcu] rcu: " Lai Jiangshan
2009-03-30 22:12 ` Lai Jiangshan
0 siblings, 2 replies; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-20 9:40 UTC (permalink / raw)
To: Ingo Molnar, Paul E. McKenney, Peter Zijlstra, LKML
Ingo Molnar wrote:
>
> Lai, would you mind resending the full patch with an updated
> changelog that adds Paul's review tag, against latest -tip?
>
Subject: [PATCH -tip] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu (V2)
CPU hotplug may happen asynchronously, so some RCU callbacks may still be
queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
to complete, so we must ensure that callbacks on a dead CPU are migrated
to an online CPU.
Paul E. McKenney's review:
Good stuff, Lai!!! Simpler than any of the approaches that I was
considering, and, better yet, independent of the underlying RCU
implementation!!!
I was initially worried that wake_up() might wake only one of two
possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD code,
but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids that issue.
I was also worried about the fact that different RCU implementations have
different mappings of call_rcu(), call_rcu_bh(), and call_rcu_sched(), but
this is OK as well because we just get an extra (harmless) callback in the
case that they map together (for example, Classic RCU has call_rcu_sched()
mapping to call_rcu()).
Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
and any stray callbacks that arrive (for example, from irq handlers
running on the dying CPU) either are ahead of the CPU_DYING callbacks on
the one hand (and thus accounted for), or happened after the rcu_barrier()
started on the other (and thus don't need to be accounted for).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
}
}
+static inline void wait_migrated_callbacks(void);
+
/*
* Orchestrate the specified type of RCU barrier, waiting for all
* RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
complete(&rcu_barrier_completion);
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
+ wait_migrated_callbacks();
}
/**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
}
EXPORT_SYMBOL_GPL(rcu_barrier_sched);
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+ if (atomic_dec_and_test(&rcu_migrate_type_count))
+ wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+ wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ if (action == CPU_DYING) {
+ /*
+ * preempt_disable() in on_each_cpu() prevents stop_machine(),
+ * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+ * returns, all online cpus have queued rcu_barrier_func(),
+ * and the dead cpu (if it exists) queues rcu_migrate_callback()s.
+ *
+ * These callbacks ensure _rcu_barrier() waits for all
+ * RCU callbacks of the specified type to complete.
+ */
+ atomic_set(&rcu_migrate_type_count, 3);
+ call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+ call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+ call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+ } else if (action == CPU_POST_DEAD) {
+ /* rcu_migrate_head is protected by cpu_add_remove_lock */
+ wait_migrated_callbacks();
+ }
+
+ return NOTIFY_OK;
+}
+
void __init rcu_init(void)
{
__rcu_init();
+ hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
}
void rcu_scheduler_starting(void)
* [tip:core/rcu] rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-20 9:40 ` Lai Jiangshan
@ 2009-03-20 20:00 ` Lai Jiangshan
2009-03-30 22:12 ` Lai Jiangshan
1 sibling, 0 replies; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-20 20:00 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulmck, hpa, mingo, peterz, tglx, laijs, mingo
Commit-ID: 04cb9ac11ba898049c99dc67ede6e68f6d700dee
Gitweb: http://git.kernel.org/tip/04cb9ac11ba898049c99dc67ede6e68f6d700dee
Author: Lai Jiangshan <laijs@cn.fujitsu.com>
AuthorDate: Fri, 20 Mar 2009 17:40:06 +0800
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 20 Mar 2009 19:37:33 +0100
rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
CPU hotplug may happen asynchronously, so some RCU callbacks may still be
queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
to complete, so we must ensure that callbacks on a dead CPU are migrated
to an online CPU.
Paul E. McKenney's review:
Good stuff, Lai!!! Simpler than any of the approaches that I was
considering, and, better yet, independent of the underlying RCU
implementation!!!
I was initially worried that wake_up() might wake only one of two
possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD code,
but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids that issue.
I was also worried about the fact that different RCU implementations have
different mappings of call_rcu(), call_rcu_bh(), and call_rcu_sched(), but
this is OK as well because we just get an extra (harmless) callback in the
case that they map together (for example, Classic RCU has call_rcu_sched()
mapping to call_rcu()).
Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
and any stray callbacks that arrive (for example, from irq handlers
running on the dying CPU) either are ahead of the CPU_DYING callbacks on
the one hand (and thus accounted for), or happened after the rcu_barrier()
started on the other (and thus don't need to be accounted for).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <49C36476.1010400@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/rcupdate.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
}
}
+static inline void wait_migrated_callbacks(void);
+
/*
* Orchestrate the specified type of RCU barrier, waiting for all
* RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
complete(&rcu_barrier_completion);
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
+ wait_migrated_callbacks();
}
/**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
}
EXPORT_SYMBOL_GPL(rcu_barrier_sched);
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+ if (atomic_dec_and_test(&rcu_migrate_type_count))
+ wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+ wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ if (action == CPU_DYING) {
+ /*
+ * preempt_disable() in on_each_cpu() prevents stop_machine(),
+ * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+ * returns, all online cpus have queued rcu_barrier_func(),
+ * and the dead cpu (if it exists) queues rcu_migrate_callback()s.
+ *
+ * These callbacks ensure _rcu_barrier() waits for all
+ * RCU callbacks of the specified type to complete.
+ */
+ atomic_set(&rcu_migrate_type_count, 3);
+ call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+ call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+ call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+ } else if (action == CPU_POST_DEAD) {
+ /* rcu_migrate_head is protected by cpu_add_remove_lock */
+ wait_migrated_callbacks();
+ }
+
+ return NOTIFY_OK;
+}
+
void __init rcu_init(void)
{
__rcu_init();
+ hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
}
void rcu_scheduler_starting(void)
* [tip:core/rcu] rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
2009-03-20 9:40 ` Lai Jiangshan
2009-03-20 20:00 ` [tip:core/rcu] rcu: " Lai Jiangshan
@ 2009-03-30 22:12 ` Lai Jiangshan
1 sibling, 0 replies; 12+ messages in thread
From: Lai Jiangshan @ 2009-03-30 22:12 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, paulmck, hpa, mingo, peterz, tglx, laijs, mingo
Commit-ID: f69b17d7e745d8edd7c0d90390cbaa77e63c5ea3
Gitweb: http://git.kernel.org/tip/f69b17d7e745d8edd7c0d90390cbaa77e63c5ea3
Author: Lai Jiangshan <laijs@cn.fujitsu.com>
AuthorDate: Fri, 20 Mar 2009 17:40:06 +0800
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 31 Mar 2009 00:09:37 +0200
rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
CPU hotplug may happen asynchronously, so some RCU callbacks may still be
queued on a dead CPU. rcu_barrier() also needs to wait for those callbacks
to complete, so we must ensure that callbacks on a dead CPU are migrated
to an online CPU.
Paul E. McKenney's review:
Good stuff, Lai!!! Simpler than any of the approaches that I was
considering, and, better yet, independent of the underlying RCU
implementation!!!
I was initially worried that wake_up() might wake only one of two
possible wait_event()s, namely rcu_barrier() and the CPU_POST_DEAD code,
but the fact that wait_event() clears WQ_FLAG_EXCLUSIVE avoids that issue.
I was also worried about the fact that different RCU implementations have
different mappings of call_rcu(), call_rcu_bh(), and call_rcu_sched(), but
this is OK as well because we just get an extra (harmless) callback in the
case that they map together (for example, Classic RCU has call_rcu_sched()
mapping to call_rcu()).
Overlap of CPU-hotplug operations is prevented by cpu_add_remove_lock,
and any stray callbacks that arrive (for example, from irq handlers
running on the dying CPU) either are ahead of the CPU_DYING callbacks on
the one hand (and thus accounted for), or happened after the rcu_barrier()
started on the other (and thus don't need to be accounted for).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <49C36476.1010400@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/rcupdate.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
}
}
+static inline void wait_migrated_callbacks(void);
+
/*
* Orchestrate the specified type of RCU barrier, waiting for all
* RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
complete(&rcu_barrier_completion);
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
+ wait_migrated_callbacks();
}
/**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
}
EXPORT_SYMBOL_GPL(rcu_barrier_sched);
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+ if (atomic_dec_and_test(&rcu_migrate_type_count))
+ wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+ wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ if (action == CPU_DYING) {
+ /*
+ * preempt_disable() in on_each_cpu() prevents stop_machine(),
+ * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+ * returns, all online cpus have queued rcu_barrier_func(),
+ * and the dead cpu (if it exists) queues rcu_migrate_callback()s.
+ *
+ * These callbacks ensure _rcu_barrier() waits for all
+ * RCU callbacks of the specified type to complete.
+ */
+ atomic_set(&rcu_migrate_type_count, 3);
+ call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+ call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+ call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+ } else if (action == CPU_POST_DEAD) {
+ /* rcu_migrate_head is protected by cpu_add_remove_lock */
+ wait_migrated_callbacks();
+ }
+
+ return NOTIFY_OK;
+}
+
void __init rcu_init(void)
{
__rcu_init();
+ hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
}
void rcu_scheduler_starting(void)
Thread overview: 12+ messages
2009-03-07 10:54 [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
2009-03-07 17:29 ` Paul E. McKenney
2009-03-08 2:58 ` Lai Jiangshan
2009-03-08 6:20 ` Paul E. McKenney
2009-03-09 2:56 ` Lai Jiangshan
2009-03-09 4:28 ` Paul E. McKenney
2009-03-08 16:00 ` Ingo Molnar
2009-03-19 3:06 ` Lai Jiangshan
2009-03-19 4:05 ` Paul E. McKenney
[not found] ` <20090319082237.GA32179@elte.hu>
2009-03-20 9:40 ` Lai Jiangshan
2009-03-20 20:00 ` [tip:core/rcu] rcu: " Lai Jiangshan
2009-03-30 22:12 ` Lai Jiangshan