* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
@ 2006-01-08 22:00 ` Paul E. McKenney
2006-01-09 9:31 ` Srivatsa Vaddagiri
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-08 22:00 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Andrew Morton
On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
>
> rcu_process_callbacks() always calls rcu_do_batch() when
> ->donelist != NULL.
>
> rcu_do_batch() schedules rcu_process_callbacks() again if
> ->donelist was not flushed entirely.
>
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().
>
> [ This patch was tested with rcutorture.ko, I don't understand
> it's output, but it says "End of test: SUCCESS". So if this
> patch incorrect blame Paul, not me! ]
;-)
Did you run the newest version of rcutorture.ko that includes
Vatsa's CPU-hotplug additions?
Thanx, Paul
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
>
> --- 2.6.15/kernel/rcupdate.c~2_DONE 2006-01-08 21:35:21.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c 2006-01-08 21:55:45.000000000 +0300
> @@ -454,10 +454,6 @@ static int __rcu_pending(struct rcu_ctrl
> if (!rdp->curlist && rdp->nxtlist)
> return 1;
>
> - /* This cpu has finished callbacks to invoke */
> - if (rdp->donelist)
> - return 1;
> -
> /* The rcu core waits for a quiescent state from the cpu */
> if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
> return 1;
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
2006-01-08 22:00 ` Paul E. McKenney
@ 2006-01-09 9:31 ` Srivatsa Vaddagiri
2006-01-09 14:51 ` Oleg Nesterov
2006-01-09 18:59 ` Paul E. McKenney
2006-01-10 18:13 ` Dipankar Sarma
3 siblings, 1 reply; 17+ messages in thread
From: Srivatsa Vaddagiri @ 2006-01-09 9:31 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Paul E. McKenney, Andrew Morton
On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
>
> rcu_process_callbacks() always calls rcu_do_batch() when
> ->donelist != NULL.
>
> rcu_do_batch() schedules rcu_process_callbacks() again if
> ->donelist was not flushed entirely.
>
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().
Do I smell a bug wrt CPU Hotplug here? Basically, I see that we do
a rcu_move_batch of ->curlist and ->nxtlist of the dead CPU. Why not
->donelist? If we have to do a rcu_move_batch of ->donelist also,
then perhaps the ->donelist != NULL check is required in
rcu_pending? This is considering that the RCU tasklet of the dead
CPU is killed (rather than moved over to a different CPU).
--
Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-09 9:31 ` Srivatsa Vaddagiri
@ 2006-01-09 14:51 ` Oleg Nesterov
2006-01-09 13:42 ` Srivatsa Vaddagiri
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-09 14:51 UTC (permalink / raw)
To: vatsa
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Paul E. McKenney, Andrew Morton
Srivatsa Vaddagiri wrote:
> On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> > ->donelist becomes != NULL only in rcu_process_callbacks().
> >
> > rcu_process_callbacks() always calls rcu_do_batch() when
> > ->donelist != NULL.
> >
> > rcu_do_batch() schedules rcu_process_callbacks() again if
> > ->donelist was not flushed entirely.
> >
> > So ->donelist != NULL means that rcu_tasklet is either
> > TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> > check it in __rcu_pending().
>
> Do I smell a bug wrt CPU Hotplug here? Basically, I see that we do
> a rcu_move_batch of ->curlist and ->nxtlist of the dead CPU. Why not
> ->donelist?
After the quick reading CPU Hotplug code I think you are right, there
is a bug in rcu_offline_cpu(). It should also move ->donelist, it is
lost otherwise.
> If we have to do a rcu_move_batch of ->donelist also,
> then perhaps the ->donelist != NULL check is required in
> rcu_pending?
rcu_move_batch() always adds entries to the ->nxttail, so I think
this patch is correct.
> This is considering that the RCU tasklet of the dead
> CPU is killed (rather than moved over to a different CPU).
Yes, it is killed explicitly in rcu_offline_cpu() via tasklet_kill_immediate().
Note that we can't remove this tasklet_kill_immediate() and rely on
takeover_tasklets(). rcu_process_callbacks() does __get_cpu_var(),
so it can't find orphaned rcu_data anyway if the tasklet was moved
to another cpu.
So, do we need something like this (untested, uncompiled) patch or not?
--- 2.6.15/kernel/rcupdate.c~ 2006-01-09 20:23:32.000000000 +0300
+++ 2.6.15/kernel/rcupdate.c 2006-01-09 20:26:20.000000000 +0300
@@ -355,8 +355,9 @@ static void __rcu_offline_cpu(struct rcu
spin_unlock_bh(&rcp->lock);
rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
-
+ rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
}
+
static void rcu_offline_cpu(int cpu)
{
struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-09 14:51 ` Oleg Nesterov
@ 2006-01-09 13:42 ` Srivatsa Vaddagiri
2006-01-09 15:33 ` Oleg Nesterov
0 siblings, 1 reply; 17+ messages in thread
From: Srivatsa Vaddagiri @ 2006-01-09 13:42 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Paul E. McKenney, Andrew Morton
On Mon, Jan 09, 2006 at 05:51:08PM +0300, Oleg Nesterov wrote:
> Srivatsa Vaddagiri wrote:
> > If we have to do a rcu_move_batch of ->donelist also,
> > then perhaps the ->donelist != NULL check is required in
> > rcu_pending?
>
> rcu_move_batch() always adds entries to the ->nxttail, so I think
> this patch is correct.
Hmm ..adding entries on dead cpu's ->donelist to ->nxtlist of some other CPU
doesnt make sense (it re-triggers graceperiods for all those callbacks which is
not needed). Maybe rcu_move_batch should take that into account and instead add
to ->donelist of current CPU which is processing the CPU_DEAD callback.
In which case, ->donelist != NULL check is still reqd in rcu_pending ?
--
Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-09 13:42 ` Srivatsa Vaddagiri
@ 2006-01-09 15:33 ` Oleg Nesterov
0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-09 15:33 UTC (permalink / raw)
To: vatsa
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Paul E. McKenney, Andrew Morton
Srivatsa Vaddagiri wrote:
>
> On Mon, Jan 09, 2006 at 05:51:08PM +0300, Oleg Nesterov wrote:
> > Srivatsa Vaddagiri wrote:
> > > If we have to do a rcu_move_batch of ->donelist also,
> > > then perhaps the ->donelist != NULL check is required in
> > > rcu_pending?
> >
> > rcu_move_batch() always adds entries to the ->nxttail, so I think
> > this patch is correct.
>
> Hmm ..adding entries on dead cpu's ->donelist to ->nxtlist of some other CPU
> doesnt make sense (it re-triggers graceperiods for all those callbacks which is
> not needed).
Yes, but this is rare event, so ...
> Maybe rcu_move_batch should take that into account and instead add
> to ->donelist of current CPU which is processing the CPU_DEAD callback.
May be you are right,
> In which case, ->donelist != NULL check is still reqd in rcu_pending ?
Yes, in that case it is required. Do you think it is worth to optimize
CPU_DEAD case?
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
2006-01-08 22:00 ` Paul E. McKenney
2006-01-09 9:31 ` Srivatsa Vaddagiri
@ 2006-01-09 18:59 ` Paul E. McKenney
2006-01-09 20:31 ` Oleg Nesterov
2006-01-10 18:13 ` Dipankar Sarma
3 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-09 18:59 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Andrew Morton
On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
>
> rcu_process_callbacks() always calls rcu_do_batch() when
> ->donelist != NULL.
>
> rcu_do_batch() schedules rcu_process_callbacks() again if
> ->donelist was not flushed entirely.
>
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().
As Vatsa noted, this is needed if the CPU-hotplug case moves
from ->donelist to ->donelist. It could be omitted if CPU-hotplug
instead moves from ->donelist to ->nextlist, as is the case in Oleg's
patch. The extra grace-period delay should not be a problem for the
presumably rare hotplug case, but:
o the extra test in __rcu_pending() should be quite inexpensive,
since the cacheline is already loaded given the earlier tests.
o although tasklet_schedule() looks to be perfectly reliable
right now, and although any bugs in tasklet_schedule() must
be fixed, having RCU leakage be the major symptom of
tasklet_schedule() failure sounds quite unfriendly to me.
So I am not (yet) convinced that this patch is the way to go.
Thanx, Paul
> [ This patch was tested with rcutorture.ko, I don't understand
> it's output, but it says "End of test: SUCCESS". So if this
> patch incorrect blame Paul, not me! ]
>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
>
> --- 2.6.15/kernel/rcupdate.c~2_DONE 2006-01-08 21:35:21.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c 2006-01-08 21:55:45.000000000 +0300
> @@ -454,10 +454,6 @@ static int __rcu_pending(struct rcu_ctrl
> if (!rdp->curlist && rdp->nxtlist)
> return 1;
>
> - /* This cpu has finished callbacks to invoke */
> - if (rdp->donelist)
> - return 1;
> -
> /* The rcu core waits for a quiescent state from the cpu */
> if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
> return 1;
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-09 18:59 ` Paul E. McKenney
@ 2006-01-09 20:31 ` Oleg Nesterov
2006-01-09 19:59 ` Paul E. McKenney
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-09 20:31 UTC (permalink / raw)
To: paulmck
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Andrew Morton
"Paul E. McKenney" wrote:
>
> On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> > ->donelist becomes != NULL only in rcu_process_callbacks().
> >
> > rcu_process_callbacks() always calls rcu_do_batch() when
> > ->donelist != NULL.
> >
> > rcu_do_batch() schedules rcu_process_callbacks() again if
> > ->donelist was not flushed entirely.
> >
> > So ->donelist != NULL means that rcu_tasklet is either
> > TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> > check it in __rcu_pending().
>
> As Vatsa noted, this is needed if the CPU-hotplug case moves
> from ->donelist to ->donelist. It could be omitted if CPU-hotplug
> instead moves from ->donelist to ->nextlist, as is the case in Oleg's
> patch. The extra grace-period delay should not be a problem for the
> presumably rare hotplug case, but:
Just to be sure. So do you agree that CPU-hotplug is buggy now (without
that patch) ?
> o the extra test in __rcu_pending() should be quite inexpensive,
> since the cacheline is already loaded given the earlier tests.
Yes, it was a cleanup, not an optimization.
> o although tasklet_schedule() looks to be perfectly reliable
> right now, and although any bugs in tasklet_schedule() must
> be fixed, having RCU leakage be the major symptom of
> tasklet_schedule() failure sounds quite unfriendly to me.
>
> So I am not (yet) convinced that this patch is the way to go.
Ok, I agree.
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-09 20:31 ` Oleg Nesterov
@ 2006-01-09 19:59 ` Paul E. McKenney
2006-01-10 9:58 ` Srivatsa Vaddagiri
0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-09 19:59 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
Andrew Morton, vatsa
On Mon, Jan 09, 2006 at 11:31:20PM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> >
> > On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> > > ->donelist becomes != NULL only in rcu_process_callbacks().
> > >
> > > rcu_process_callbacks() always calls rcu_do_batch() when
> > > ->donelist != NULL.
> > >
> > > rcu_do_batch() schedules rcu_process_callbacks() again if
> > > ->donelist was not flushed entirely.
> > >
> > > So ->donelist != NULL means that rcu_tasklet is either
> > > TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> > > check it in __rcu_pending().
> >
> > As Vatsa noted, this is needed if the CPU-hotplug case moves
> > from ->donelist to ->donelist. It could be omitted if CPU-hotplug
> > instead moves from ->donelist to ->nextlist, as is the case in Oleg's
> > patch. The extra grace-period delay should not be a problem for the
> > presumably rare hotplug case, but:
>
> Just to be sure. So do you agree that CPU-hotplug is buggy now (without
> that patch) ?
Hmmm... So your thought is that __rcu_offline_cpu() moves nxtlist and
curlist, but not donelist, but then returns to rcu_offline_cpu(), which
might well do the tasklet_kill_immediate() before the tasklet completed
processing all of donelist.
Seems plausible to me. If true, your patch adding the following statement
to the ed of __rcu_offline_cpu seems like a reasonable fix:
rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
Vatsa, is there something that Oleg and I are missing?
> > o the extra test in __rcu_pending() should be quite inexpensive,
> > since the cacheline is already loaded given the earlier tests.
>
> Yes, it was a cleanup, not an optimization.
>
> > o although tasklet_schedule() looks to be perfectly reliable
> > right now, and although any bugs in tasklet_schedule() must
> > be fixed, having RCU leakage be the major symptom of
> > tasklet_schedule() failure sounds quite unfriendly to me.
> >
> > So I am not (yet) convinced that this patch is the way to go.
>
> Ok, I agree.
Sounds good!
Thanx, Paul
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-09 19:59 ` Paul E. McKenney
@ 2006-01-10 9:58 ` Srivatsa Vaddagiri
2006-01-10 14:24 ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
2006-01-10 14:27 ` [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
0 siblings, 2 replies; 17+ messages in thread
From: Srivatsa Vaddagiri @ 2006-01-10 9:58 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Oleg Nesterov, linux-kernel, Dipankar Sarma, Manfred Spraul,
Linus Torvalds, Andrew Morton
On Mon, Jan 09, 2006 at 11:59:33AM -0800, Paul E. McKenney wrote:
> Hmmm... So your thought is that __rcu_offline_cpu() moves nxtlist and
> curlist, but not donelist, but then returns to rcu_offline_cpu(), which
> might well do the tasklet_kill_immediate() before the tasklet completed
> processing all of donelist.
>
> Seems plausible to me. If true, your patch adding the following statement
> to the ed of __rcu_offline_cpu seems like a reasonable fix:
>
> rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
>
> Vatsa, is there something that Oleg and I are missing?
I think this should take care of the CPU Hotplug bug, with the caveat
that the callbacks on ->donelist will wait for additional grace period before
being invoked (which seems ok).
Oleg, do you want to resend the patch after some testing?
- vatsa
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] rcu: fix hotplug-cpu ->donelist leak
2006-01-10 9:58 ` Srivatsa Vaddagiri
@ 2006-01-10 14:24 ` Oleg Nesterov
2006-01-11 5:01 ` Paul E. McKenney
2006-01-11 16:28 ` Paul E. McKenney
2006-01-10 14:27 ` [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
1 sibling, 2 replies; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-10 14:24 UTC (permalink / raw)
To: vatsa
Cc: Paul E. McKenney, linux-kernel, Dipankar Sarma, Manfred Spraul,
Linus Torvalds, Andrew Morton
Pointed out by Srivatsa Vaddagiri <vatsa@in.ibm.com>.
rcu_do_batch() stops after processing maxbatch callbacks
on ->donelist leaving rcu_tasklet in TASKLET_STATE_SCHED
state.
If CPU_DEAD event happens remaining ->donelist entries are
lost, rcu_offline_cpu() kills this tasklet.
With this patch ->donelist migrates along with ->curlist
and ->nxtlist to the current cpu.
Compile tested.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
--- 2.6.15/kernel/rcupdate.c~4_HPFIX 2006-01-10 19:06:38.000000000 +0300
+++ 2.6.15/kernel/rcupdate.c 2006-01-10 19:15:01.000000000 +0300
@@ -343,8 +343,9 @@ static void __rcu_offline_cpu(struct rcu
spin_unlock_bh(&rcp->lock);
rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
-
+ rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
}
+
static void rcu_offline_cpu(int cpu)
{
struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
2006-01-10 14:24 ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
@ 2006-01-11 5:01 ` Paul E. McKenney
2006-01-11 16:28 ` Paul E. McKenney
1 sibling, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-11 5:01 UTC (permalink / raw)
To: Oleg Nesterov
Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
Linus Torvalds, Andrew Morton
On Tue, Jan 10, 2006 at 05:24:53PM +0300, Oleg Nesterov wrote:
> Pointed out by Srivatsa Vaddagiri <vatsa@in.ibm.com>.
>
> rcu_do_batch() stops after processing maxbatch callbacks
> on ->donelist leaving rcu_tasklet in TASKLET_STATE_SCHED
> state.
>
> If CPU_DEAD event happens remaining ->donelist entries are
> lost, rcu_offline_cpu() kills this tasklet.
>
> With this patch ->donelist migrates along with ->curlist
> and ->nxtlist to the current cpu.
>
> Compile tested.
Looks good, but I fat-fingered my test scripts and left it out. :-/
Started another RCU-torture run...
Thanx, Paul
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
>
> --- 2.6.15/kernel/rcupdate.c~4_HPFIX 2006-01-10 19:06:38.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c 2006-01-10 19:15:01.000000000 +0300
> @@ -343,8 +343,9 @@ static void __rcu_offline_cpu(struct rcu
> spin_unlock_bh(&rcp->lock);
> rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
> rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
> -
> + rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
> }
> +
> static void rcu_offline_cpu(int cpu)
> {
> struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
2006-01-10 14:24 ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
2006-01-11 5:01 ` Paul E. McKenney
@ 2006-01-11 16:28 ` Paul E. McKenney
2006-01-11 19:25 ` Oleg Nesterov
1 sibling, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-11 16:28 UTC (permalink / raw)
To: Oleg Nesterov
Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
Linus Torvalds, Andrew Morton
On Tue, Jan 10, 2006 at 05:24:53PM +0300, Oleg Nesterov wrote:
> Pointed out by Srivatsa Vaddagiri <vatsa@in.ibm.com>.
>
> rcu_do_batch() stops after processing maxbatch callbacks
> on ->donelist leaving rcu_tasklet in TASKLET_STATE_SCHED
> state.
>
> If CPU_DEAD event happens remaining ->donelist entries are
> lost, rcu_offline_cpu() kills this tasklet.
>
> With this patch ->donelist migrates along with ->curlist
> and ->nxtlist to the current cpu.
>
> Compile tested.
This passed a ten-hour RCU torture test, with the torture test augmented
by Vatsa's CPU-hotplug RCU-torture-test patch.
Thanx, Paul
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
>
> --- 2.6.15/kernel/rcupdate.c~4_HPFIX 2006-01-10 19:06:38.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c 2006-01-10 19:15:01.000000000 +0300
> @@ -343,8 +343,9 @@ static void __rcu_offline_cpu(struct rcu
> spin_unlock_bh(&rcp->lock);
> rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
> rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
> -
> + rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
> }
> +
> static void rcu_offline_cpu(int cpu)
> {
> struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
2006-01-11 16:28 ` Paul E. McKenney
@ 2006-01-11 19:25 ` Oleg Nesterov
2006-01-11 18:21 ` Paul E. McKenney
0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-11 19:25 UTC (permalink / raw)
To: paulmck
Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
Linus Torvalds, Andrew Morton
"Paul E. McKenney" wrote:
>
> This passed a ten-hour RCU torture test, with the torture test augmented
Thank you!
> by Vatsa's CPU-hotplug RCU-torture-test patch.
I can't find this patch, could you point me?
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
2006-01-11 19:25 ` Oleg Nesterov
@ 2006-01-11 18:21 ` Paul E. McKenney
0 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-11 18:21 UTC (permalink / raw)
To: Oleg Nesterov
Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
Linus Torvalds, Andrew Morton
On Wed, Jan 11, 2006 at 10:25:35PM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> >
> > This passed a ten-hour RCU torture test, with the torture test augmented
>
> Thank you!
>
> > by Vatsa's CPU-hotplug RCU-torture-test patch.
>
> I can't find this patch, could you point me?
http://marc.theaimsgroup.com/?l=linux-kernel&m=113378075217761&w=2
I just realized that I had failed to ack this one, so just did so.
Thanx, Paul
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-10 9:58 ` Srivatsa Vaddagiri
2006-01-10 14:24 ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
@ 2006-01-10 14:27 ` Oleg Nesterov
1 sibling, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-10 14:27 UTC (permalink / raw)
To: vatsa
Cc: Paul E. McKenney, linux-kernel, Dipankar Sarma, Manfred Spraul,
Linus Torvalds, Andrew Morton
Srivatsa Vaddagiri wrote:
>
> On Mon, Jan 09, 2006 at 11:59:33AM -0800, Paul E. McKenney wrote:
> > Hmmm... So your thought is that __rcu_offline_cpu() moves nxtlist and
> > curlist, but not donelist, but then returns to rcu_offline_cpu(), which
> > might well do the tasklet_kill_immediate() before the tasklet completed
> > processing all of donelist.
> >
> > Seems plausible to me. If true, your patch adding the following statement
> > to the ed of __rcu_offline_cpu seems like a reasonable fix:
> >
> > rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
> >
> > Vatsa, is there something that Oleg and I are missing?
>
> I think this should take care of the CPU Hotplug bug, with the caveat
> that the callbacks on ->donelist will wait for additional grace period before
> being invoked (which seems ok).
>
> Oleg, do you want to resend the patch after some testing?
Sure. The problem is that I have no idea how to test this change.
However, the patch seems "obviously correct", so I am sending it.
Srivatsa, I am sorry, I forgot to add you on CC: list while re-sending
these cleanups. I hope you can see them on lkml.
Oleg.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
` (2 preceding siblings ...)
2006-01-09 18:59 ` Paul E. McKenney
@ 2006-01-10 18:13 ` Dipankar Sarma
3 siblings, 0 replies; 17+ messages in thread
From: Dipankar Sarma @ 2006-01-10 18:13 UTC (permalink / raw)
To: Oleg Nesterov
Cc: linux-kernel, Manfred Spraul, Linus Torvalds, Paul E. McKenney,
Andrew Morton
On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
>
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().
>
> [ This patch was tested with rcutorture.ko, I don't understand
> it's output, but it says "End of test: SUCCESS". So if this
> patch incorrect blame Paul, not me! ]
>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
>
> --- 2.6.15/kernel/rcupdate.c~2_DONE 2006-01-08 21:35:21.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c 2006-01-08 21:55:45.000000000 +0300
> @@ -454,10 +454,6 @@ static int __rcu_pending(struct rcu_ctrl
> if (!rdp->curlist && rdp->nxtlist)
> return 1;
>
> - /* This cpu has finished callbacks to invoke */
> - if (rdp->donelist)
> - return 1;
> -
> /* The rcu core waits for a quiescent state from the cpu */
> if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
> return 1;
This may not be a good idea. For example, during cpu hotplug,
a cpu may inherit a set of finished callbacks that need to be
invoked. So, an rcu_pending() check needs to detect that.
Thanks
Dipankar
^ permalink raw reply [flat|nested] 17+ messages in thread