linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
@ 2006-01-08 19:19 Oleg Nesterov
  2006-01-08 22:00 ` Paul E. McKenney
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-08 19:19 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dipankar Sarma, Manfred Spraul, Linus Torvalds, Paul E. McKenney,
	Andrew Morton

->donelist becomes != NULL only in rcu_process_callbacks().

rcu_process_callbacks() always calls rcu_do_batch() when
->donelist != NULL.

rcu_do_batch() schedules rcu_process_callbacks() again if
->donelist was not flushed entirely.

So ->donelist != NULL means that rcu_tasklet is either
TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
check it in __rcu_pending().

[ This patch was tested with rcutorture.ko, I don't understand
  it's output, but it says "End of test: SUCCESS". So if this
  patch incorrect blame Paul, not me! ]

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 2.6.15/kernel/rcupdate.c~2_DONE	2006-01-08 21:35:21.000000000 +0300
+++ 2.6.15/kernel/rcupdate.c	2006-01-08 21:55:45.000000000 +0300
@@ -454,10 +454,6 @@ static int __rcu_pending(struct rcu_ctrl
 	if (!rdp->curlist && rdp->nxtlist)
 		return 1;
 
-	/* This cpu has finished callbacks to invoke */
-	if (rdp->donelist)
-		return 1;
-
 	/* The rcu core waits for a quiescent state from the cpu */
 	if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
 		return 1;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
@ 2006-01-08 22:00 ` Paul E. McKenney
  2006-01-09  9:31 ` Srivatsa Vaddagiri
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-08 22:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Andrew Morton

On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
> 
> rcu_process_callbacks() always calls rcu_do_batch() when
> ->donelist != NULL.
> 
> rcu_do_batch() schedules rcu_process_callbacks() again if
> ->donelist was not flushed entirely.
> 
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().
> 
> [ This patch was tested with rcutorture.ko, I don't understand
>   it's output, but it says "End of test: SUCCESS". So if this
>   patch incorrect blame Paul, not me! ]

;-)

Did you run the newest version of rcutorture.ko that includes
Vatsa's CPU-hotplug additions?

						Thanx, Paul

> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- 2.6.15/kernel/rcupdate.c~2_DONE	2006-01-08 21:35:21.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c	2006-01-08 21:55:45.000000000 +0300
> @@ -454,10 +454,6 @@ static int __rcu_pending(struct rcu_ctrl
>  	if (!rdp->curlist && rdp->nxtlist)
>  		return 1;
>  
> -	/* This cpu has finished callbacks to invoke */
> -	if (rdp->donelist)
> -		return 1;
> -
>  	/* The rcu core waits for a quiescent state from the cpu */
>  	if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
>  		return 1;
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
  2006-01-08 22:00 ` Paul E. McKenney
@ 2006-01-09  9:31 ` Srivatsa Vaddagiri
  2006-01-09 14:51   ` Oleg Nesterov
  2006-01-09 18:59 ` Paul E. McKenney
  2006-01-10 18:13 ` Dipankar Sarma
  3 siblings, 1 reply; 17+ messages in thread
From: Srivatsa Vaddagiri @ 2006-01-09  9:31 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Paul E. McKenney, Andrew Morton

On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
> 
> rcu_process_callbacks() always calls rcu_do_batch() when
> ->donelist != NULL.
> 
> rcu_do_batch() schedules rcu_process_callbacks() again if
> ->donelist was not flushed entirely.
> 
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().

Do I smell a bug wrt CPU Hotplug here? Basically, I see that we do
a rcu_move_batch of ->curlist and ->nxtlist of the dead CPU. Why not
->donelist? If we have to do a rcu_move_batch of ->donelist also,
then perhaps the ->donelist != NULL check is required in
rcu_pending? This is considering that the RCU tasklet of the dead
CPU is killed (rather than moved over to a different CPU).


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-09 14:51   ` Oleg Nesterov
@ 2006-01-09 13:42     ` Srivatsa Vaddagiri
  2006-01-09 15:33       ` Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: Srivatsa Vaddagiri @ 2006-01-09 13:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Paul E. McKenney, Andrew Morton

On Mon, Jan 09, 2006 at 05:51:08PM +0300, Oleg Nesterov wrote:
> Srivatsa Vaddagiri wrote:
> >             If we have to do a rcu_move_batch of ->donelist also,
> > then perhaps the ->donelist != NULL check is required in
> > rcu_pending?
> 
> rcu_move_batch() always adds entries to the ->nxttail, so I think
> this patch is correct.

Hmm ..adding entries on dead cpu's ->donelist to ->nxtlist of some other CPU 
doesnt make sense (it re-triggers graceperiods for all those callbacks which is 
not needed). Maybe rcu_move_batch should take that into account and instead add
to ->donelist of current CPU which is processing the CPU_DEAD callback.

In which case, ->donelist != NULL check is still reqd in rcu_pending ?


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-09  9:31 ` Srivatsa Vaddagiri
@ 2006-01-09 14:51   ` Oleg Nesterov
  2006-01-09 13:42     ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-09 14:51 UTC (permalink / raw)
  To: vatsa
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Paul E. McKenney, Andrew Morton

Srivatsa Vaddagiri wrote:
> On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> > ->donelist becomes != NULL only in rcu_process_callbacks().
> >
> > rcu_process_callbacks() always calls rcu_do_batch() when
> > ->donelist != NULL.
> >
> > rcu_do_batch() schedules rcu_process_callbacks() again if
> > ->donelist was not flushed entirely.
> >
> > So ->donelist != NULL means that rcu_tasklet is either
> > TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> > check it in __rcu_pending().
>
> Do I smell a bug wrt CPU Hotplug here? Basically, I see that we do
> a rcu_move_batch of ->curlist and ->nxtlist of the dead CPU. Why not
> ->donelist?

After the quick reading CPU Hotplug code I think you are right, there
is a bug in rcu_offline_cpu(). It should also move ->donelist, it is
lost otherwise.

>             If we have to do a rcu_move_batch of ->donelist also,
> then perhaps the ->donelist != NULL check is required in
> rcu_pending?

rcu_move_batch() always adds entries to the ->nxttail, so I think
this patch is correct.

>               This is considering that the RCU tasklet of the dead
> CPU is killed (rather than moved over to a different CPU).

Yes, it is killed explicitly in rcu_offline_cpu() via tasklet_kill_immediate().

Note that we can't remove this tasklet_kill_immediate() and rely on
takeover_tasklets(). rcu_process_callbacks() does __get_cpu_var(),
so it can't find orphaned rcu_data anyway if the tasklet was moved
to another cpu.

So, do we need something like this (untested, uncompiled) patch or not?

--- 2.6.15/kernel/rcupdate.c~	2006-01-09 20:23:32.000000000 +0300
+++ 2.6.15/kernel/rcupdate.c	2006-01-09 20:26:20.000000000 +0300
@@ -355,8 +355,9 @@ static void __rcu_offline_cpu(struct rcu
 	spin_unlock_bh(&rcp->lock);
 	rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
 	rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
-
+	rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
 }
+
 static void rcu_offline_cpu(int cpu)
 {
 	struct rcu_data *this_rdp = &get_cpu_var(rcu_data);

Oleg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-09 13:42     ` Srivatsa Vaddagiri
@ 2006-01-09 15:33       ` Oleg Nesterov
  0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-09 15:33 UTC (permalink / raw)
  To: vatsa
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Paul E. McKenney, Andrew Morton

Srivatsa Vaddagiri wrote:
> 
> On Mon, Jan 09, 2006 at 05:51:08PM +0300, Oleg Nesterov wrote:
> > Srivatsa Vaddagiri wrote:
> > >             If we have to do a rcu_move_batch of ->donelist also,
> > > then perhaps the ->donelist != NULL check is required in
> > > rcu_pending?
> >
> > rcu_move_batch() always adds entries to the ->nxttail, so I think
> > this patch is correct.
> 
> Hmm ..adding entries on dead cpu's ->donelist to ->nxtlist of some other CPU
> doesnt make sense (it re-triggers graceperiods for all those callbacks which is
> not needed).

Yes, but this is rare event, so ...

>              Maybe rcu_move_batch should take that into account and instead add
> to ->donelist of current CPU which is processing the CPU_DEAD callback.

May be you are right,

> In which case, ->donelist != NULL check is still reqd in rcu_pending ?

Yes, in that case it is required. Do you think it is worth to optimize
CPU_DEAD case?

Oleg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
  2006-01-08 22:00 ` Paul E. McKenney
  2006-01-09  9:31 ` Srivatsa Vaddagiri
@ 2006-01-09 18:59 ` Paul E. McKenney
  2006-01-09 20:31   ` Oleg Nesterov
  2006-01-10 18:13 ` Dipankar Sarma
  3 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-09 18:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Andrew Morton

On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
> 
> rcu_process_callbacks() always calls rcu_do_batch() when
> ->donelist != NULL.
> 
> rcu_do_batch() schedules rcu_process_callbacks() again if
> ->donelist was not flushed entirely.
> 
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().

As Vatsa noted, this is needed if the CPU-hotplug case moves
from ->donelist to ->donelist.  It could be omitted if CPU-hotplug
instead moves from ->donelist to ->nextlist, as is the case in Oleg's
patch.  The extra grace-period delay should not be a problem for the
presumably rare hotplug case, but:

o	the extra test in __rcu_pending() should be quite inexpensive,
	since the cacheline is already loaded given the earlier tests.

o	although tasklet_schedule() looks to be perfectly reliable
	right now, and although any bugs in tasklet_schedule() must
	be fixed, having RCU leakage be the major symptom of
	tasklet_schedule() failure sounds quite unfriendly to me.

So I am not (yet) convinced that this patch is the way to go.

						Thanx, Paul

> [ This patch was tested with rcutorture.ko, I don't understand
>   it's output, but it says "End of test: SUCCESS". So if this
>   patch incorrect blame Paul, not me! ]
> 
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- 2.6.15/kernel/rcupdate.c~2_DONE	2006-01-08 21:35:21.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c	2006-01-08 21:55:45.000000000 +0300
> @@ -454,10 +454,6 @@ static int __rcu_pending(struct rcu_ctrl
>  	if (!rdp->curlist && rdp->nxtlist)
>  		return 1;
>  
> -	/* This cpu has finished callbacks to invoke */
> -	if (rdp->donelist)
> -		return 1;
> -
>  	/* The rcu core waits for a quiescent state from the cpu */
>  	if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
>  		return 1;
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-09 20:31   ` Oleg Nesterov
@ 2006-01-09 19:59     ` Paul E. McKenney
  2006-01-10  9:58       ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-09 19:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Andrew Morton, vatsa

On Mon, Jan 09, 2006 at 11:31:20PM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> > 
> > On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> > > ->donelist becomes != NULL only in rcu_process_callbacks().
> > >
> > > rcu_process_callbacks() always calls rcu_do_batch() when
> > > ->donelist != NULL.
> > >
> > > rcu_do_batch() schedules rcu_process_callbacks() again if
> > > ->donelist was not flushed entirely.
> > >
> > > So ->donelist != NULL means that rcu_tasklet is either
> > > TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> > > check it in __rcu_pending().
> > 
> > As Vatsa noted, this is needed if the CPU-hotplug case moves
> > from ->donelist to ->donelist.  It could be omitted if CPU-hotplug
> > instead moves from ->donelist to ->nextlist, as is the case in Oleg's
> > patch.  The extra grace-period delay should not be a problem for the
> > presumably rare hotplug case, but:
> 
> Just to be sure. So do you agree that CPU-hotplug is buggy now (without
> that patch) ?

Hmmm...  So your thought is that __rcu_offline_cpu() moves nxtlist and
curlist, but not donelist, but then returns to rcu_offline_cpu(), which
might well do the tasklet_kill_immediate() before the tasklet completed
processing all of donelist.

Seems plausible to me.  If true, your patch adding the following statement
to the ed of __rcu_offline_cpu seems like a reasonable fix:

	rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);

Vatsa, is there something that Oleg and I are missing?

> > o       the extra test in __rcu_pending() should be quite inexpensive,
> >         since the cacheline is already loaded given the earlier tests.
> 
> Yes, it was a cleanup, not an optimization.
> 
> > o       although tasklet_schedule() looks to be perfectly reliable
> >         right now, and although any bugs in tasklet_schedule() must
> >         be fixed, having RCU leakage be the major symptom of
> >         tasklet_schedule() failure sounds quite unfriendly to me.
> > 
> > So I am not (yet) convinced that this patch is the way to go.
> 
> Ok, I agree.

Sounds good!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-09 18:59 ` Paul E. McKenney
@ 2006-01-09 20:31   ` Oleg Nesterov
  2006-01-09 19:59     ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-09 20:31 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, Dipankar Sarma, Manfred Spraul, Linus Torvalds,
	Andrew Morton

"Paul E. McKenney" wrote:
> 
> On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> > ->donelist becomes != NULL only in rcu_process_callbacks().
> >
> > rcu_process_callbacks() always calls rcu_do_batch() when
> > ->donelist != NULL.
> >
> > rcu_do_batch() schedules rcu_process_callbacks() again if
> > ->donelist was not flushed entirely.
> >
> > So ->donelist != NULL means that rcu_tasklet is either
> > TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> > check it in __rcu_pending().
> 
> As Vatsa noted, this is needed if the CPU-hotplug case moves
> from ->donelist to ->donelist.  It could be omitted if CPU-hotplug
> instead moves from ->donelist to ->nextlist, as is the case in Oleg's
> patch.  The extra grace-period delay should not be a problem for the
> presumably rare hotplug case, but:

Just to be sure. So do you agree that CPU-hotplug is buggy now (without
that patch) ?

> o       the extra test in __rcu_pending() should be quite inexpensive,
>         since the cacheline is already loaded given the earlier tests.

Yes, it was a cleanup, not an optimization.

> o       although tasklet_schedule() looks to be perfectly reliable
>         right now, and although any bugs in tasklet_schedule() must
>         be fixed, having RCU leakage be the major symptom of
>         tasklet_schedule() failure sounds quite unfriendly to me.
> 
> So I am not (yet) convinced that this patch is the way to go.

Ok, I agree.

Oleg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-09 19:59     ` Paul E. McKenney
@ 2006-01-10  9:58       ` Srivatsa Vaddagiri
  2006-01-10 14:24         ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
  2006-01-10 14:27         ` [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
  0 siblings, 2 replies; 17+ messages in thread
From: Srivatsa Vaddagiri @ 2006-01-10  9:58 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Oleg Nesterov, linux-kernel, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, Andrew Morton

On Mon, Jan 09, 2006 at 11:59:33AM -0800, Paul E. McKenney wrote:
> Hmmm...  So your thought is that __rcu_offline_cpu() moves nxtlist and
> curlist, but not donelist, but then returns to rcu_offline_cpu(), which
> might well do the tasklet_kill_immediate() before the tasklet completed
> processing all of donelist.
> 
> Seems plausible to me.  If true, your patch adding the following statement
> to the ed of __rcu_offline_cpu seems like a reasonable fix:
> 
> 	rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
>
> Vatsa, is there something that Oleg and I are missing?

I think this should take care of the CPU Hotplug bug, with the caveat
that the callbacks on ->donelist will wait for additional grace period before 
being invoked (which seems ok).

Oleg, do you want to resend the patch after some testing? 

- vatsa

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] rcu: fix hotplug-cpu ->donelist leak
  2006-01-10  9:58       ` Srivatsa Vaddagiri
@ 2006-01-10 14:24         ` Oleg Nesterov
  2006-01-11  5:01           ` Paul E. McKenney
  2006-01-11 16:28           ` Paul E. McKenney
  2006-01-10 14:27         ` [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
  1 sibling, 2 replies; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-10 14:24 UTC (permalink / raw)
  To: vatsa
  Cc: Paul E. McKenney, linux-kernel, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, Andrew Morton

Pointed out by Srivatsa Vaddagiri <vatsa@in.ibm.com>.

rcu_do_batch() stops after processing maxbatch callbacks
on ->donelist leaving rcu_tasklet in TASKLET_STATE_SCHED
state.

If CPU_DEAD event happens remaining ->donelist entries are
lost, rcu_offline_cpu() kills this tasklet.

With this patch ->donelist migrates along with ->curlist
and ->nxtlist to the current cpu.

Compile tested.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 2.6.15/kernel/rcupdate.c~4_HPFIX	2006-01-10 19:06:38.000000000 +0300
+++ 2.6.15/kernel/rcupdate.c	2006-01-10 19:15:01.000000000 +0300
@@ -343,8 +343,9 @@ static void __rcu_offline_cpu(struct rcu
 	spin_unlock_bh(&rcp->lock);
 	rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
 	rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
-
+	rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
 }
+
 static void rcu_offline_cpu(int cpu)
 {
 	struct rcu_data *this_rdp = &get_cpu_var(rcu_data);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-10  9:58       ` Srivatsa Vaddagiri
  2006-01-10 14:24         ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
@ 2006-01-10 14:27         ` Oleg Nesterov
  1 sibling, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-10 14:27 UTC (permalink / raw)
  To: vatsa
  Cc: Paul E. McKenney, linux-kernel, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, Andrew Morton

Srivatsa Vaddagiri wrote:
> 
> On Mon, Jan 09, 2006 at 11:59:33AM -0800, Paul E. McKenney wrote:
> > Hmmm...  So your thought is that __rcu_offline_cpu() moves nxtlist and
> > curlist, but not donelist, but then returns to rcu_offline_cpu(), which
> > might well do the tasklet_kill_immediate() before the tasklet completed
> > processing all of donelist.
> >
> > Seems plausible to me.  If true, your patch adding the following statement
> > to the ed of __rcu_offline_cpu seems like a reasonable fix:
> >
> >       rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
> >
> > Vatsa, is there something that Oleg and I are missing?
> 
> I think this should take care of the CPU Hotplug bug, with the caveat
> that the callbacks on ->donelist will wait for additional grace period before
> being invoked (which seems ok).
> 
> Oleg, do you want to resend the patch after some testing?

Sure. The problem is that I have no idea how to test this change.
However, the patch seems "obviously correct", so I am sending it.

Srivatsa, I am sorry, I forgot to add you on CC: list while re-sending
these cleanups. I hope you can see them on lkml.

Oleg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending()
  2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
                   ` (2 preceding siblings ...)
  2006-01-09 18:59 ` Paul E. McKenney
@ 2006-01-10 18:13 ` Dipankar Sarma
  3 siblings, 0 replies; 17+ messages in thread
From: Dipankar Sarma @ 2006-01-10 18:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Manfred Spraul, Linus Torvalds, Paul E. McKenney,
	Andrew Morton

On Sun, Jan 08, 2006 at 10:19:24PM +0300, Oleg Nesterov wrote:
> ->donelist becomes != NULL only in rcu_process_callbacks().
> 
> So ->donelist != NULL means that rcu_tasklet is either
> TASKLET_STATE_SCHED or TASKLET_STATE_RUN, we don't need to
> check it in __rcu_pending().
> 
> [ This patch was tested with rcutorture.ko, I don't understand
>   it's output, but it says "End of test: SUCCESS". So if this
>   patch incorrect blame Paul, not me! ]
> 
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- 2.6.15/kernel/rcupdate.c~2_DONE	2006-01-08 21:35:21.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c	2006-01-08 21:55:45.000000000 +0300
> @@ -454,10 +454,6 @@ static int __rcu_pending(struct rcu_ctrl
>  	if (!rdp->curlist && rdp->nxtlist)
>  		return 1;
>  
> -	/* This cpu has finished callbacks to invoke */
> -	if (rdp->donelist)
> -		return 1;
> -
>  	/* The rcu core waits for a quiescent state from the cpu */
>  	if (rdp->quiescbatch != rcp->cur || rdp->qs_pending)
>  		return 1;


This may not be a good idea. For example, during cpu hotplug,
a cpu may inherit a set of finished callbacks that need to be
invoked. So, an rcu_pending() check needs to detect that.

Thanks
Dipankar

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
  2006-01-10 14:24         ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
@ 2006-01-11  5:01           ` Paul E. McKenney
  2006-01-11 16:28           ` Paul E. McKenney
  1 sibling, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-11  5:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, Andrew Morton

On Tue, Jan 10, 2006 at 05:24:53PM +0300, Oleg Nesterov wrote:
> Pointed out by Srivatsa Vaddagiri <vatsa@in.ibm.com>.
> 
> rcu_do_batch() stops after processing maxbatch callbacks
> on ->donelist leaving rcu_tasklet in TASKLET_STATE_SCHED
> state.
> 
> If CPU_DEAD event happens remaining ->donelist entries are
> lost, rcu_offline_cpu() kills this tasklet.
> 
> With this patch ->donelist migrates along with ->curlist
> and ->nxtlist to the current cpu.
> 
> Compile tested.

Looks good, but I fat-fingered my test scripts and left it out.  :-/

Started another RCU-torture run...

							Thanx, Paul

> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- 2.6.15/kernel/rcupdate.c~4_HPFIX	2006-01-10 19:06:38.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c	2006-01-10 19:15:01.000000000 +0300
> @@ -343,8 +343,9 @@ static void __rcu_offline_cpu(struct rcu
>  	spin_unlock_bh(&rcp->lock);
>  	rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
>  	rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
> -
> +	rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
>  }
> +
>  static void rcu_offline_cpu(int cpu)
>  {
>  	struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
  2006-01-10 14:24         ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
  2006-01-11  5:01           ` Paul E. McKenney
@ 2006-01-11 16:28           ` Paul E. McKenney
  2006-01-11 19:25             ` Oleg Nesterov
  1 sibling, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-11 16:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, Andrew Morton

On Tue, Jan 10, 2006 at 05:24:53PM +0300, Oleg Nesterov wrote:
> Pointed out by Srivatsa Vaddagiri <vatsa@in.ibm.com>.
> 
> rcu_do_batch() stops after processing maxbatch callbacks
> on ->donelist leaving rcu_tasklet in TASKLET_STATE_SCHED
> state.
> 
> If CPU_DEAD event happens remaining ->donelist entries are
> lost, rcu_offline_cpu() kills this tasklet.
> 
> With this patch ->donelist migrates along with ->curlist
> and ->nxtlist to the current cpu.
> 
> Compile tested.

This passed a ten-hour RCU torture test, with the torture test augmented
by Vatsa's CPU-hotplug RCU-torture-test patch.

							Thanx, Paul

Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- 2.6.15/kernel/rcupdate.c~4_HPFIX	2006-01-10 19:06:38.000000000 +0300
> +++ 2.6.15/kernel/rcupdate.c	2006-01-10 19:15:01.000000000 +0300
> @@ -343,8 +343,9 @@ static void __rcu_offline_cpu(struct rcu
>  	spin_unlock_bh(&rcp->lock);
>  	rcu_move_batch(this_rdp, rdp->curlist, rdp->curtail);
>  	rcu_move_batch(this_rdp, rdp->nxtlist, rdp->nxttail);
> -
> +	rcu_move_batch(this_rdp, rdp->donelist, rdp->donetail);
>  }
> +
>  static void rcu_offline_cpu(int cpu)
>  {
>  	struct rcu_data *this_rdp = &get_cpu_var(rcu_data);
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
  2006-01-11 19:25             ` Oleg Nesterov
@ 2006-01-11 18:21               ` Paul E. McKenney
  0 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2006-01-11 18:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, Andrew Morton

On Wed, Jan 11, 2006 at 10:25:35PM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> > 
> > This passed a ten-hour RCU torture test, with the torture test augmented
> 
> Thank you!
> 
> > by Vatsa's CPU-hotplug RCU-torture-test patch.
> 
> I can't find this patch, could you point me?

http://marc.theaimsgroup.com/?l=linux-kernel&m=113378075217761&w=2

I just realized that I had failed to ack this one, so just did so.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] rcu: fix hotplug-cpu ->donelist leak
  2006-01-11 16:28           ` Paul E. McKenney
@ 2006-01-11 19:25             ` Oleg Nesterov
  2006-01-11 18:21               ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2006-01-11 19:25 UTC (permalink / raw)
  To: paulmck
  Cc: vatsa, linux-kernel, Dipankar Sarma, Manfred Spraul,
	Linus Torvalds, Andrew Morton

"Paul E. McKenney" wrote:
> 
> This passed a ten-hour RCU torture test, with the torture test augmented

Thank you!

> by Vatsa's CPU-hotplug RCU-torture-test patch.

I can't find this patch, could you point me?

Oleg.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2006-01-11 18:21 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-08 19:19 [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
2006-01-08 22:00 ` Paul E. McKenney
2006-01-09  9:31 ` Srivatsa Vaddagiri
2006-01-09 14:51   ` Oleg Nesterov
2006-01-09 13:42     ` Srivatsa Vaddagiri
2006-01-09 15:33       ` Oleg Nesterov
2006-01-09 18:59 ` Paul E. McKenney
2006-01-09 20:31   ` Oleg Nesterov
2006-01-09 19:59     ` Paul E. McKenney
2006-01-10  9:58       ` Srivatsa Vaddagiri
2006-01-10 14:24         ` [PATCH] rcu: fix hotplug-cpu ->donelist leak Oleg Nesterov
2006-01-11  5:01           ` Paul E. McKenney
2006-01-11 16:28           ` Paul E. McKenney
2006-01-11 19:25             ` Oleg Nesterov
2006-01-11 18:21               ` Paul E. McKenney
2006-01-10 14:27         ` [PATCH 2/5] rcu: don't check ->donelist in __rcu_pending() Oleg Nesterov
2006-01-10 18:13 ` Dipankar Sarma

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).