* CONFIG_PREEMPT_RCU in next/mmotm
@ 2009-08-08 19:34 Hugh Dickins
2009-08-08 23:56 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2009-08-08 19:34 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
Hi Paul,
Is CONFIG_PREEMPT_RCU=y expected to be working in -next (or mmotm)?
I ask because it appears to break down on PowerPC G5 when I try two
"make -j20" kernel builds in parallel. The "filp" slab which usually
contains a couple of thousand objects or so, jumps up to a couple of
hundred thousand before the builds complete, and continues rising
from then on - I think that's a sign of RCU in disgrace?
And rebooting hangs thereafter.
And I notice that include/linux/rcupreempt.h currently says:
static inline void synchronize_rcu_expedited(void)
{
synchronize_rcu(); /* Placeholder for new rcupreempt implementation. */
}
which gives an impression of work in progress?
CONFIG_PREEMPT_TREE=y seems okay on PowerPC; and when I briefly tried
CONFIG_PREEMPT_RCU=y on x86_64, it didn't show above symptoms there.
I did try bisecting yesterday's linux-next git, but that led me to
commit 8ca17c6082feee5841a7b0e91d00e18c3f85f063
Merge: dafcc6e... 7256cf0...
Author: Ingo Molnar <mingo@elte.hu>
Date: Mon Aug 3 15:50:37 2009 +0200
Merge branch 'core/rcu' into auto-sched-next
rather than to any particular patch of yours which that merges:
which seemed odd, but I'm not accustomed to bisecting next.
Another odd thing is that mmotm .DATE=2009-07-30-05-01, the last
I tried before Thursday's, showed no such symptoms: yet appears to
contain all the larger RCU changes, just lacking some more recent
on/offline race fixes. I didn't notice any likely difference
between those mmotm trees down in arch/powerpc either.
My guess is that there's some other issue which is triggering
the RCU disgrace, but that is just a guess.
Config attached. Any suggestions?
Thanks,
Hugh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-08 19:34 CONFIG_PREEMPT_RCU in next/mmotm Hugh Dickins
@ 2009-08-08 23:56 ` Paul E. McKenney
2009-08-09 3:32 ` Hugh Dickins
0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-08 23:56 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sat, Aug 08, 2009 at 08:34:03PM +0100, Hugh Dickins wrote:
> Hi Paul,
>
> Is CONFIG_PREEMPT_RCU=y expected to be working in -next (or mmotm)?
>
> I ask because it appears to break down on PowerPC G5 when I try two
> "make -j20" kernel builds in parallel. The "filp" slab which usually
> contains a couple of thousand objects or so, jumps up to a couple of
> hundred thousand before the builds complete, and continues rising
> from then on - I think that's a sign of RCU in disgrace?
> And rebooting hangs thereafter.
That would indeed be bad.
> And I notice that include/linux/rcupreempt.h currently says:
> static inline void synchronize_rcu_expedited(void)
> {
> synchronize_rcu(); /* Placeholder for new rcupreempt implementation. */
> }
> which gives an impression of work in progress?
Yep, but I would be surprised if this was causing your problem. In fact,
I would be surprised if this function was being invoked. Then again,
I have been surprised more than once.
For whatever it is worth, with luck, include/linux/rcupreempt.h disappears
entirely at some point.
> CONFIG_PREEMPT_TREE=y seems okay on PowerPC; and when I briefly tried
CONFIG_TREE_RCU=y, you mean, right?
> CONFIG_PREEMPT_RCU=y on x86_64, it didn't show above symptoms there.
Interesting...
> I did try bisecting yesterday's linux-next git, but that led me to
>
> commit 8ca17c6082feee5841a7b0e91d00e18c3f85f063
> Merge: dafcc6e... 7256cf0...
> Author: Ingo Molnar <mingo@elte.hu>
> Date: Mon Aug 3 15:50:37 2009 +0200
>
> Merge branch 'core/rcu' into auto-sched-next
>
> rather than to any particular patch of yours which that merges:
> which seemed odd, but I'm not accustomed to bisecting next.
I honestly don't know how "git bisect" interacts with merges.
> Another odd thing is that mmotm .DATE=2009-07-30-05-01, the last
> I tried before Thursday's, showed no such symptoms: yet appears to
> contain all the larger RCU changes, just lacking some more recent
> on/offline race fixes. I didn't notice any likely difference
> between those mmotm trees down in arch/powerpc either.
>
> My guess is that there's some other issue which is triggering
> the RCU disgrace, but that is just a guess.
The one other issue I know of is one that apparently happens only on
one of Ingo's machines, which suddenly decides that it need not inform
RCU when onlining a CPU (at boot time).
But you seem to be making it past boot, and I would guess that you are
not doing any hotplug CPU operations, right?
> Config attached. Any suggestions?
/me scratches head... Will try some more CONFIG_PREEMPT_RCU testing
locally.
Thanx, Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-08 23:56 ` Paul E. McKenney
@ 2009-08-09 3:32 ` Hugh Dickins
2009-08-09 5:33 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2009-08-09 3:32 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Hugh Dickins, Ingo Molnar, Ben Herrenschmidt, Andrew Morton,
linux-kernel
On Sat, 8 Aug 2009, Paul E. McKenney wrote:
> On Sat, Aug 08, 2009 at 08:34:03PM +0100, Hugh Dickins wrote:
>
> > CONFIG_PREEMPT_TREE=y seems okay on PowerPC; and when I briefly tried
>
> CONFIG_TREE_RCU=y, you mean, right?
Whoops, yes indeed.
> > My guess is that there's some other issue which is triggering
> > the RCU disgrace, but that is just a guess.
>
> The one other issue I know of is one that apparently happens only on
> one of Ingo's machines, which suddenly decides that it need not inform
> RCU when onlining a CPU (at boot time).
>
> But you seem to be making it past boot, and I would guess that you are
> not doing any hotplug CPU operations, right?
That's right, I'm well past boot, and don't explicitly get into CPU
hotplug, nor using suspend/hibernate on that machine; and looking at
the config, see I don't even have CONFIG_HOTPLUG_CPU set.
In the morning I'll check that /proc/cpuinfo really shows all four
cpus before I start, and is still showing four cpus when it goes bad,
in case there's nevertheless an issue with that.
Is there some particular attribute of a cpu, most relevant to its RCU
grace status, that I can check on with a printk?
>
> > Config attached. Any suggestions?
[ Of course I forgot the config, but sent it just to Paul later. ]
>
> /me scratches head... Will try some more CONFIG_PREEMPT_RCU testing
> locally.
Thanks: I'm kind of glad that it's interesting,
though it may sink some of our time.
Hugh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-09 3:32 ` Hugh Dickins
@ 2009-08-09 5:33 ` Paul E. McKenney
2009-08-09 11:24 ` Hugh Dickins
0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-09 5:33 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sun, Aug 09, 2009 at 04:32:28AM +0100, Hugh Dickins wrote:
> On Sat, 8 Aug 2009, Paul E. McKenney wrote:
> > On Sat, Aug 08, 2009 at 08:34:03PM +0100, Hugh Dickins wrote:
> >
> > > CONFIG_PREEMPT_TREE=y seems okay on PowerPC; and when I briefly tried
> >
> > CONFIG_TREE_RCU=y, you mean, right?
>
> Whoops, yes indeed.
>
> > > My guess is that there's some other issue which is triggering
> > > the RCU disgrace, but that is just a guess.
> >
> > The one other issue I know of is one that apparently happens only on
> > one of Ingo's machines, which suddenly decides that it need not inform
> > RCU when onlining a CPU (at boot time).
> >
> > But you seem to be making it past boot, and I would guess that you are
> > not doing any hotplug CPU operations, right?
>
> That's right, I'm well past boot, and don't explicitly get into CPU
> hotplug, nor using suspend/hibernate on that machine; and looking at
> the config, see I don't even have CONFIG_HOTPLUG_CPU set.
>
> In the morning I'll check that /proc/cpuinfo really shows all four
> cpus before I start, and is still showing four cpus when it goes bad,
> in case there's nevertheless an issue with that.
>
> Is there some particular attribute of a cpu, most relevant to its RCU
> grace status, that I can check on with a printk?
The debugfs files enabled by CONFIG_RCU_TRACE, except rcu/rcugp, which,
since it will attempt to wait for a grace period, will probably wait
forever.
> > > Config attached. Any suggestions?
>
> [ Of course I forgot the config, but sent it just to Paul later. ]
>
> >
> > /me scratches head... Will try some more CONFIG_PREEMPT_RCU testing
> > locally.
>
> Thanks: I'm kind of glad that it's interesting,
> though it may sink some of our time.
Well, it certainly does have me a bit confused at the moment!
Thanx, Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-09 5:33 ` Paul E. McKenney
@ 2009-08-09 11:24 ` Hugh Dickins
2009-08-09 13:06 ` Hugh Dickins
0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2009-08-09 11:24 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sat, 8 Aug 2009, Paul E. McKenney wrote:
> On Sun, Aug 09, 2009 at 04:32:28AM +0100, Hugh Dickins wrote:
> >
> > Is there some particular attribute of a cpu, most relevant to its RCU
> > grace status, that I can check on with a printk?
>
> The debugfs files enabled by CONFIG_RCU_TRACE, except rcu/rcugp, which,
> since it will attempt to wait for a grace period, will probably wait
Here's ten second samples from just before and after it "goes":
the number of filp objects (my own guide to it going), rcuctrs and
rctstats. With ggp stuck on 35124, as you predicted, rcugp just hangs.
filp 5616 objects:
CPU last cur F M
0 0 0 0 1
1 0 0 0 0
2 0 -1 0 0
3 0 1 0 0
ggp = 33821, state = waitmb
na=2751281 nl=64 wa=2751217 wl=176 da=2751041 dl=0 dr=2751041 di=2751023
1=353642 e1=193343 i1=33939 ie1=118 g1=33821 a1=38041 ae1=4220 a2=33821
z1=36834 ze1=3013 z2=33821 m1=51485 me1=17665 m2=33820
filp 5780 objects:
CPU last cur F M
0 0 0 0 0
1 0 0 0 0
2 -1 0 0 0
3 1 0 0 0
ggp = 34518, state = waitack
na=2826191 nl=18 wa=2826173 wl=71 da=2826102 dl=0 dr=2826102 di=2826058
1=362636 e1=198972 i1=34640 ie1=122 g1=34518 a1=38847 ae1=4330 a2=34517
z1=37702 ze1=3185 z2=34517 m1=52475 me1=17958 m2=34517
filp 13343 objects:
CPU last cur F M
0 1 0 0 0
1 0 0 0 0
2 -1 0 0 0
3 0 0 0 0
ggp = 35124, state = waitzero
na=2898045 nl=15380 wa=2882665 wl=150 da=2882515 dl=0 dr=2882515 di=2882465
1=371717 e1=204532 i1=35248 ie1=124 g1=35124 a1=39545 ae1=4421 a2=35124
z1=39077 ze1=3954 z2=35123 m1=53315 me1=18192 m2=35123
filp 67190 objects:
CPU last cur F M
0 1 0 0 0
1 0 1 0 0
2 -1 0 0 0
3 0 0 0 0
ggp = 35124, state = waitzero
na=2961758 nl=79093 wa=2882665 wl=150 da=2882515 dl=0 dr=2882515 di=2882465
1=383301 e1=212065 i1=35248 ie1=124 g1=35124 a1=39545 ae1=4421 a2=35124
z1=43128 ze1=8005 z2=35123 m1=53315 me1=18192 m2=35123
filp 137160 objects:
CPU last cur F M
0 1 1 0 0
1 0 0 0 0
2 -1 0 0 0
3 0 0 0 0
ggp = 35124, state = waitzero
na=3041696 nl=159031 wa=2882665 wl=150 da=2882515 dl=0 dr=2882515 di=2882465
1=394717 e1=219335 i1=35248 ie1=124 g1=35124 a1=39545 ae1=4421 a2=35124
z1=47274 ze1=12151 z2=35123 m1=53315 me1=18192 m2=35123
Hugh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-09 11:24 ` Hugh Dickins
@ 2009-08-09 13:06 ` Hugh Dickins
2009-08-09 18:57 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2009-08-09 13:06 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sun, 9 Aug 2009, Hugh Dickins wrote:
>
> filp 13343 objects:
>
> CPU last cur F M
> 0 1 0 0 0
> 1 0 0 0 0
> 2 -1 0 0 0
> 3 0 0 0 0
> ggp = 35124, state = waitzero
Interesting that rcu_try_flip_waitzero() doesn't see 1 + 0 + -1 + 0 == 0.
That's because rcu_cpu_online_map is 0x1 instead of the 0xf it should be.
Which is because I don't have CONFIG_HOTPLUG_CPU=y on that PPC machine
(unlike the x86s), and I think you've made some recent mods which
accidentally made the rcu cpu initialization dependent on hotplug
cpu notifiers? CONFIG_HOTPLUG_CPU=y and it works properly again.
And TREE_RCU doesn't use an rcu_cpu_online_map (though it does expect
some per-cpu initialization, but seems to get away without it).
So I think that's the mystery solved - I'll let you decide the right fix!
Hugh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-09 13:06 ` Hugh Dickins
@ 2009-08-09 18:57 ` Paul E. McKenney
2009-08-09 20:53 ` Hugh Dickins
0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-09 18:57 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sun, Aug 09, 2009 at 02:06:05PM +0100, Hugh Dickins wrote:
> On Sun, 9 Aug 2009, Hugh Dickins wrote:
> >
> > filp 13343 objects:
> >
> > CPU last cur F M
> > 0 1 0 0 0
> > 1 0 0 0 0
> > 2 -1 0 0 0
> > 3 0 0 0 0
> > ggp = 35124, state = waitzero
>
> Interesting that rcu_try_flip_waitzero() doesn't see 1 + 0 + -1 + 0 == 0.
Indeed!!! You nailed it!!!
> That's because rcu_cpu_online_map is 0x1 instead of the 0xf it should be.
>
> Which is because I don't have CONFIG_HOTPLUG_CPU=y on that PPC machine
> (unlike the x86s), and I think you've made some recent mods which
> accidentally made the rcu cpu initialization dependent on hotplug
> cpu notifiers? CONFIG_HOTPLUG_CPU=y and it works properly again.
Back to the drawing board at this end...
> And TREE_RCU doesn't use an rcu_cpu_online_map (though it does expect
> some per-cpu initialization, but seems to get away without it).
Only by pure luck. The kind of luck that gets you into serious trouble
later on.
> So I think that's the mystery solved - I'll let you decide the right fix!
Thank you -very- much for tracking this down, Hugh!!!
I introduced the problem in commit 7fe616c5dd50a50f334edec1ea0580b90b7af0d9
by changing from register_cpu_notifier() to hotcpu_notifier(). The former
lets you know when CPUs come on line unconditionally, the latter only
when CONFIG_HOTPLUG_CPU is in effect.
But hotcpu_notifier() is much nicer to use, so I propose introducing
a cpu_notifier() that is invoked like hotcpu_notifier() is, but is
unconditional in the same way that register_cpu_notifier().
Something like the following (untested, probably does not compile):
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4d668e0..d5dfc1f 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -48,6 +48,15 @@ struct notifier_block;
#ifdef CONFIG_SMP
/* Need to know about CPUs going up/down? */
+#if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
+#define cpu_notifier(fn, pri) { \
+ static struct notifier_block fn##_nb __cpuinitdata = \
+ { .notifier_call = fn, .priority = pri }; \
+ register_cpu_notifier(&fn##_nb); \
+}
+#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
+#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
#ifdef CONFIG_HOTPLUG_CPU
extern int register_cpu_notifier(struct notifier_block *nb);
extern void unregister_cpu_notifier(struct notifier_block *nb);
@@ -99,11 +108,7 @@ extern struct sysdev_class cpu_sysdev_class;
extern void get_online_cpus(void);
extern void put_online_cpus(void);
-#define hotcpu_notifier(fn, pri) { \
- static struct notifier_block fn##_nb __cpuinitdata = \
- { .notifier_call = fn, .priority = pri }; \
- register_cpu_notifier(&fn##_nb); \
-}
+#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
int cpu_down(unsigned int cpu);
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 9f0584e..c1bbfd5 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -251,7 +251,7 @@ void __init rcu_init(void)
int i;
__rcu_init();
- hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
+ cpu_notifier(rcu_barrier_cpu_hotplug, 0);
/*
* We don't need protection against CPU-hotplug here because
With this in place:
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 9f0584e..c1bbfd5 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -251,7 +251,7 @@ void __init rcu_init(void)
int i;
__rcu_init();
- hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
+ cpu_notifier(rcu_barrier_cpu_hotplug, 0);
/*
* We don't need protection against CPU-hotplug here because
Thoughts?
Thanx, Paul
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-09 18:57 ` Paul E. McKenney
@ 2009-08-09 20:53 ` Hugh Dickins
2009-08-09 21:16 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Hugh Dickins @ 2009-08-09 20:53 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sun, 9 Aug 2009, Paul E. McKenney wrote:
>
> I introduced the problem in commit 7fe616c5dd50a50f334edec1ea0580b90b7af0d9
> by changing from register_cpu_notifier() to hotcpu_notifier(). The former
> lets you know when CPUs come on line unconditionally, the latter only
> when CONFIG_HOTPLUG_CPU is in effect.
>
> But hotcpu_notifier() is much nicer to use, so I propose introducing
> a cpu_notifier() that is invoked like hotcpu_notifier() is, but is
> unconditional in the same way that register_cpu_notifier().
>
> Something like the following (untested, probably does not compile):
>
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index 4d668e0..d5dfc1f 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -48,6 +48,15 @@ struct notifier_block;
>
> #ifdef CONFIG_SMP
> /* Need to know about CPUs going up/down? */
> +#if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
> +#define cpu_notifier(fn, pri) { \
> + static struct notifier_block fn##_nb __cpuinitdata = \
> + { .notifier_call = fn, .priority = pri }; \
> + register_cpu_notifier(&fn##_nb); \
> +}
> +#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> +#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
> +#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> #ifdef CONFIG_HOTPLUG_CPU
> extern int register_cpu_notifier(struct notifier_block *nb);
> extern void unregister_cpu_notifier(struct notifier_block *nb);
> @@ -99,11 +108,7 @@ extern struct sysdev_class cpu_sysdev_class;
>
> extern void get_online_cpus(void);
> extern void put_online_cpus(void);
> -#define hotcpu_notifier(fn, pri) { \
> - static struct notifier_block fn##_nb __cpuinitdata = \
> - { .notifier_call = fn, .priority = pri }; \
> - register_cpu_notifier(&fn##_nb); \
> -}
> +#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
> #define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
> #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
> int cpu_down(unsigned int cpu);
> diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> index 9f0584e..c1bbfd5 100644
> --- a/kernel/rcupdate.c
> +++ b/kernel/rcupdate.c
> @@ -251,7 +251,7 @@ void __init rcu_init(void)
> int i;
>
> __rcu_init();
> - hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
> + cpu_notifier(rcu_barrier_cpu_hotplug, 0);
>
> /*
> * We don't need protection against CPU-hotplug here because
>
[ removed repeat of rcupdate.c patch ]
>
> Thoughts?
That builds and works for me, with or without CONFIG_HOTPLUG_CPU.
But I didn't get what you're achieving with the MODULE part of it;
and (I'm not a notifier buff at all) it does seems rather baroque to
me - a single callsite, why not stick with register_cpu_notifier()?
Ah, perhaps it's your ambition to move others over to this
(or perhaps it's your ambition to leave that to someone else ;-)
Hugh
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-09 20:53 ` Hugh Dickins
@ 2009-08-09 21:16 ` Paul E. McKenney
2009-08-10 3:39 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-09 21:16 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sun, Aug 09, 2009 at 09:53:53PM +0100, Hugh Dickins wrote:
> On Sun, 9 Aug 2009, Paul E. McKenney wrote:
> >
> > I introduced the problem in commit 7fe616c5dd50a50f334edec1ea0580b90b7af0d9
> > by changing from register_cpu_notifier() to hotcpu_notifier(). The former
> > lets you know when CPUs come on line unconditionally, the latter only
> > when CONFIG_HOTPLUG_CPU is in effect.
> >
> > But hotcpu_notifier() is much nicer to use, so I propose introducing
> > a cpu_notifier() that is invoked like hotcpu_notifier() is, but is
> > unconditional in the same way that register_cpu_notifier().
> >
> > Something like the following (untested, probably does not compile):
> >
> > diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> > index 4d668e0..d5dfc1f 100644
> > --- a/include/linux/cpu.h
> > +++ b/include/linux/cpu.h
> > @@ -48,6 +48,15 @@ struct notifier_block;
> >
> > #ifdef CONFIG_SMP
> > /* Need to know about CPUs going up/down? */
> > +#if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
> > +#define cpu_notifier(fn, pri) { \
> > + static struct notifier_block fn##_nb __cpuinitdata = \
> > + { .notifier_call = fn, .priority = pri }; \
> > + register_cpu_notifier(&fn##_nb); \
> > +}
> > +#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> > +#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
> > +#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> > #ifdef CONFIG_HOTPLUG_CPU
> > extern int register_cpu_notifier(struct notifier_block *nb);
> > extern void unregister_cpu_notifier(struct notifier_block *nb);
> > @@ -99,11 +108,7 @@ extern struct sysdev_class cpu_sysdev_class;
> >
> > extern void get_online_cpus(void);
> > extern void put_online_cpus(void);
> > -#define hotcpu_notifier(fn, pri) { \
> > - static struct notifier_block fn##_nb __cpuinitdata = \
> > - { .notifier_call = fn, .priority = pri }; \
> > - register_cpu_notifier(&fn##_nb); \
> > -}
> > +#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
> > #define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
> > #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
> > int cpu_down(unsigned int cpu);
> > diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> > index 9f0584e..c1bbfd5 100644
> > --- a/kernel/rcupdate.c
> > +++ b/kernel/rcupdate.c
> > @@ -251,7 +251,7 @@ void __init rcu_init(void)
> > int i;
> >
> > __rcu_init();
> > - hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
> > + cpu_notifier(rcu_barrier_cpu_hotplug, 0);
> >
> > /*
> > * We don't need protection against CPU-hotplug here because
> >
> [ removed repeat of rcupdate.c patch ]
> >
> > Thoughts?
>
> That builds and works for me, with or without CONFIG_HOTPLUG_CPU.
>
> But I didn't get what you're achieving with the MODULE part of it;
> and (I'm not a notifier buff at all) it does seems rather baroque to
> me - a single callsite, why not stick with register_cpu_notifier()?
>
> Ah, perhaps it's your ambition to move others over to this
> (or perhaps it's your ambition to leave that to someone else ;-)
Actually, nothing quite that clearly thought out. I was just following
the pattern set for register_cpu_notifier(). My guess at the reasoning
is that when !HOTPLUG_CPU, modules cannot be loaded until all the CPUs
are online, so there is no point in letting a module set itself up for
notification.
But whatever their reasoning, mine was that there is no point in
creating a struct notifier_block that wasn't going to be used. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-09 21:16 ` Paul E. McKenney
@ 2009-08-10 3:39 ` Paul E. McKenney
2009-08-10 22:43 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-10 3:39 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sun, Aug 09, 2009 at 02:16:29PM -0700, Paul E. McKenney wrote:
> On Sun, Aug 09, 2009 at 09:53:53PM +0100, Hugh Dickins wrote:
> > On Sun, 9 Aug 2009, Paul E. McKenney wrote:
> > >
> > > I introduced the problem in commit 7fe616c5dd50a50f334edec1ea0580b90b7af0d9
> > > by changing from register_cpu_notifier() to hotcpu_notifier(). The former
> > > lets you know when CPUs come on line unconditionally, the latter only
> > > when CONFIG_HOTPLUG_CPU is in effect.
> > >
> > > But hotcpu_notifier() is much nicer to use, so I propose introducing
> > > a cpu_notifier() that is invoked like hotcpu_notifier() is, but is
> > > unconditional in the same way that register_cpu_notifier().
> > >
> > > Something like the following (untested, probably does not compile):
> > >
> > > diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> > > index 4d668e0..d5dfc1f 100644
> > > --- a/include/linux/cpu.h
> > > +++ b/include/linux/cpu.h
> > > @@ -48,6 +48,15 @@ struct notifier_block;
> > >
> > > #ifdef CONFIG_SMP
> > > /* Need to know about CPUs going up/down? */
> > > +#if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
> > > +#define cpu_notifier(fn, pri) { \
> > > + static struct notifier_block fn##_nb __cpuinitdata = \
> > > + { .notifier_call = fn, .priority = pri }; \
> > > + register_cpu_notifier(&fn##_nb); \
> > > +}
> > > +#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> > > +#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
> > > +#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> > > #ifdef CONFIG_HOTPLUG_CPU
> > > extern int register_cpu_notifier(struct notifier_block *nb);
> > > extern void unregister_cpu_notifier(struct notifier_block *nb);
> > > @@ -99,11 +108,7 @@ extern struct sysdev_class cpu_sysdev_class;
> > >
> > > extern void get_online_cpus(void);
> > > extern void put_online_cpus(void);
> > > -#define hotcpu_notifier(fn, pri) { \
> > > - static struct notifier_block fn##_nb __cpuinitdata = \
> > > - { .notifier_call = fn, .priority = pri }; \
> > > - register_cpu_notifier(&fn##_nb); \
> > > -}
> > > +#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
> > > #define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
> > > #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
> > > int cpu_down(unsigned int cpu);
> > > diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> > > index 9f0584e..c1bbfd5 100644
> > > --- a/kernel/rcupdate.c
> > > +++ b/kernel/rcupdate.c
> > > @@ -251,7 +251,7 @@ void __init rcu_init(void)
> > > int i;
> > >
> > > __rcu_init();
> > > - hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
> > > + cpu_notifier(rcu_barrier_cpu_hotplug, 0);
> > >
> > > /*
> > > * We don't need protection against CPU-hotplug here because
> > >
> > [ removed repeat of rcupdate.c patch ]
> > >
> > > Thoughts?
> >
> > That builds and works for me, with or without CONFIG_HOTPLUG_CPU.
> >
> > But I didn't get what you're achieving with the MODULE part of it;
> > and (I'm not a notifier buff at all) it does seems rather baroque to
> > me - a single callsite, why not stick with register_cpu_notifier()?
> >
> > Ah, perhaps it's your ambition to move others over to this
> > (or perhaps it's your ambition to leave that to someone else ;-)
>
> Actually, nothing quite that clearly thought out. I was just following
> the pattern set for register_cpu_notifier(). My guess at the reasoning
> is that when !HOTPLUG_CPU, modules cannot be loaded until all the CPUs
> are online, so there is no point in letting a module set itself up for
> notification.
>
> But whatever their reasoning, mine was that there is no point in
> creating a struct notifier_block that wasn't going to be used. ;-)
And the above patch fails for !CONFIG_SMP. Here is an update, testing
in progress. Still not fully tested, but results are encouraging.
In particular, this one is more likely to compile.
Thanx, Paul
------------------------------------------------------------------------
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4d668e0..4753619 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -48,6 +48,15 @@ struct notifier_block;
#ifdef CONFIG_SMP
/* Need to know about CPUs going up/down? */
+#if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
+#define cpu_notifier(fn, pri) { \
+ static struct notifier_block fn##_nb __cpuinitdata = \
+ { .notifier_call = fn, .priority = pri }; \
+ register_cpu_notifier(&fn##_nb); \
+}
+#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
+#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
#ifdef CONFIG_HOTPLUG_CPU
extern int register_cpu_notifier(struct notifier_block *nb);
extern void unregister_cpu_notifier(struct notifier_block *nb);
@@ -74,6 +83,8 @@ extern void cpu_maps_update_done(void);
#else /* CONFIG_SMP */
+#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+
static inline int register_cpu_notifier(struct notifier_block *nb)
{
return 0;
@@ -99,11 +110,7 @@ extern struct sysdev_class cpu_sysdev_class;
extern void get_online_cpus(void);
extern void put_online_cpus(void);
-#define hotcpu_notifier(fn, pri) { \
- static struct notifier_block fn##_nb __cpuinitdata = \
- { .notifier_call = fn, .priority = pri }; \
- register_cpu_notifier(&fn##_nb); \
-}
+#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
int cpu_down(unsigned int cpu);
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 9f0584e..c1bbfd5 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -251,7 +251,7 @@ void __init rcu_init(void)
int i;
__rcu_init();
- hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
+ cpu_notifier(rcu_barrier_cpu_hotplug, 0);
/*
* We don't need protection against CPU-hotplug here because
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-10 3:39 ` Paul E. McKenney
@ 2009-08-10 22:43 ` Paul E. McKenney
2009-08-12 1:34 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-10 22:43 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Ingo Molnar, Ben Herrenschmidt, Andrew Morton, linux-kernel
On Sun, Aug 09, 2009 at 08:39:07PM -0700, Paul E. McKenney wrote:
> On Sun, Aug 09, 2009 at 02:16:29PM -0700, Paul E. McKenney wrote:
> > On Sun, Aug 09, 2009 at 09:53:53PM +0100, Hugh Dickins wrote:
> > > On Sun, 9 Aug 2009, Paul E. McKenney wrote:
> > > >
> > > > I introduced the problem in commit 7fe616c5dd50a50f334edec1ea0580b90b7af0d9
> > > > by changing from register_cpu_notifier() to hotcpu_notifier(). The former
> > > > lets you know when CPUs come on line unconditionally, the latter only
> > > > when CONFIG_HOTPLUG_CPU is in effect.
> > > >
> > > > But hotcpu_notifier() is much nicer to use, so I propose introducing
> > > > a cpu_notifier() that is invoked like hotcpu_notifier() is, but is
> > > > unconditional in the same way that register_cpu_notifier().
> > > >
> > > > Something like the following (untested, probably does not compile):
> > > >
> > > > diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> > > > index 4d668e0..d5dfc1f 100644
> > > > --- a/include/linux/cpu.h
> > > > +++ b/include/linux/cpu.h
> > > > @@ -48,6 +48,15 @@ struct notifier_block;
> > > >
> > > > #ifdef CONFIG_SMP
> > > > /* Need to know about CPUs going up/down? */
> > > > +#if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
> > > > +#define cpu_notifier(fn, pri) { \
> > > > + static struct notifier_block fn##_nb __cpuinitdata = \
> > > > + { .notifier_call = fn, .priority = pri }; \
> > > > + register_cpu_notifier(&fn##_nb); \
> > > > +}
> > > > +#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> > > > +#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
> > > > +#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
> > > > #ifdef CONFIG_HOTPLUG_CPU
> > > > extern int register_cpu_notifier(struct notifier_block *nb);
> > > > extern void unregister_cpu_notifier(struct notifier_block *nb);
> > > > @@ -99,11 +108,7 @@ extern struct sysdev_class cpu_sysdev_class;
> > > >
> > > > extern void get_online_cpus(void);
> > > > extern void put_online_cpus(void);
> > > > -#define hotcpu_notifier(fn, pri) { \
> > > > - static struct notifier_block fn##_nb __cpuinitdata = \
> > > > - { .notifier_call = fn, .priority = pri }; \
> > > > - register_cpu_notifier(&fn##_nb); \
> > > > -}
> > > > +#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
> > > > #define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
> > > > #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
> > > > int cpu_down(unsigned int cpu);
> > > > diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> > > > index 9f0584e..c1bbfd5 100644
> > > > --- a/kernel/rcupdate.c
> > > > +++ b/kernel/rcupdate.c
> > > > @@ -251,7 +251,7 @@ void __init rcu_init(void)
> > > > int i;
> > > >
> > > > __rcu_init();
> > > > - hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
> > > > + cpu_notifier(rcu_barrier_cpu_hotplug, 0);
> > > >
> > > > /*
> > > > * We don't need protection against CPU-hotplug here because
> > > >
> > > [ removed repeat of rcupdate.c patch ]
> > > >
> > > > Thoughts?
> > >
> > > That builds and works for me, with or without CONFIG_HOTPLUG_CPU.
> > >
> > > But I didn't get what you're achieving with the MODULE part of it;
> > > and (I'm not a notifier buff at all) it does seems rather baroque to
> > > me - a single callsite, why not stick with register_cpu_notifier()?
> > >
> > > Ah, perhaps it's your ambition to move others over to this
> > > (or perhaps it's your ambition to leave that to someone else ;-)
> >
> > Actually, nothing quite that clearly thought out. I was just following
> > the pattern set for register_cpu_notifier(). My guess at the reasoning
> > is that when !HOTPLUG_CPU, modules cannot be loaded until all the CPUs
> > are online, so there is no point in letting a module set itself up for
> > notification.
> >
> > But whatever their reasoning, mine was that there is no point in
> > creating a struct notifier_block that wasn't going to be used. ;-)
>
> And the above patch fails for !CONFIG_SMP. Here is an update, testing
> in progress. Still not fully tested, but results are encouraging.
> In particular, this one is more likely to compile.
And this handled !CONFIG_SMP, but fails two of fifteen test cases.
So better, but still far from perfect.
Chasing the failures down.
Thanx, Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-10 22:43 ` Paul E. McKenney
@ 2009-08-12 1:34 ` Paul E. McKenney
2009-08-12 9:22 ` Ingo Molnar
0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-12 1:34 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Ben Herrenschmidt, Andrew Morton, linux-kernel, Hugh Dickins
On Mon, Aug 10, 2009 at 03:43:45PM -0700, Paul E. McKenney wrote:
> On Sun, Aug 09, 2009 at 08:39:07PM -0700, Paul E. McKenney wrote:
> > On Sun, Aug 09, 2009 at 02:16:29PM -0700, Paul E. McKenney wrote:
> > > On Sun, Aug 09, 2009 at 09:53:53PM +0100, Hugh Dickins wrote:
[ . . . ]
> > > > That builds and works for me, with or without CONFIG_HOTPLUG_CPU.
> > > >
> > > > But I didn't get what you're achieving with the MODULE part of it;
> > > > and (I'm not a notifier buff at all) it does seems rather baroque to
> > > > me - a single callsite, why not stick with register_cpu_notifier()?
> > > >
> > > > Ah, perhaps it's your ambition to move others over to this
> > > > (or perhaps it's your ambition to leave that to someone else ;-)
> > >
> > > Actually, nothing quite that clearly thought out. I was just following
> > > the pattern set for register_cpu_notifier(). My guess at the reasoning
> > > is that when !HOTPLUG_CPU, modules cannot be loaded until all the CPUs
> > > are online, so there is no point in letting a module set itself up for
> > > notification.
> > >
> > > But whatever their reasoning, mine was that there is no point in
> > > creating a struct notifier_block that wasn't going to be used. ;-)
> >
> > And the above patch fails for !CONFIG_SMP. Here is an update, testing
> > in progress. Still not fully tested, but results are encouraging.
> > In particular, this one is more likely to compile.
>
> And this handled !CONFIG_SMP, but fails two of fifteen test cases.
> So better, but still far from perfect.
>
> Chasing the failures down.
And I believe I have a patch that works for all of my test cases, but
am rerunning the full set to double-check. Patch against tip/core/rcu
below for your collective amusement.
Should these tests pass...
Unless someone tells me otherwise, I will make a patch series intended
to replace tip/core/rcu commits 7fe616c5d ("Simplify RCU CPU-hotplug
notification"), 04b06256c ("Fix RCU & CPU hotplug hang"), and 7256cf0e83b
("Add diagnostic check for a possible CPU-hotplug race"), re-run all tests
on that patchset, and submit the series. I expect the resulting patch
set to have three patches, one to split out boot-time initialization
for RCU_TREE, a second to create the cpu_notifier() API, and the third
to make RCU use it.
I guess the lesson to me is that although I should send a patch quickly
in response to bug reports, I need to nevertheless run my full set of RCU
torture tests on it -- and verify that the specified kernel configuration
parameters actually were in effect for those tests. :-/
Thanx, Paul
------------------------------------------------------------------------
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 4d668e0..4753619 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -48,6 +48,15 @@ struct notifier_block;
#ifdef CONFIG_SMP
/* Need to know about CPUs going up/down? */
+#if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE)
+#define cpu_notifier(fn, pri) { \
+ static struct notifier_block fn##_nb __cpuinitdata = \
+ { .notifier_call = fn, .priority = pri }; \
+ register_cpu_notifier(&fn##_nb); \
+}
+#else /* #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
+#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+#endif /* #else #if defined(CONFIG_HOTPLUG_CPU) || !defined(MODULE) */
#ifdef CONFIG_HOTPLUG_CPU
extern int register_cpu_notifier(struct notifier_block *nb);
extern void unregister_cpu_notifier(struct notifier_block *nb);
@@ -74,6 +83,8 @@ extern void cpu_maps_update_done(void);
#else /* CONFIG_SMP */
+#define cpu_notifier(fn, pri) do { (void)(fn); } while (0)
+
static inline int register_cpu_notifier(struct notifier_block *nb)
{
return 0;
@@ -99,11 +110,7 @@ extern struct sysdev_class cpu_sysdev_class;
extern void get_online_cpus(void);
extern void put_online_cpus(void);
-#define hotcpu_notifier(fn, pri) { \
- static struct notifier_block fn##_nb __cpuinitdata = \
- { .notifier_call = fn, .priority = pri }; \
- register_cpu_notifier(&fn##_nb); \
-}
+#define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri)
#define register_hotcpu_notifier(nb) register_cpu_notifier(nb)
#define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb)
int cpu_down(unsigned int cpu);
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 9f0584e..8df1156 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -238,7 +238,7 @@ static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
- } else if (action == CPU_DEAD) {
+ } else if (action == CPU_POST_DEAD) {
/* rcu_migrate_head is protected by cpu_add_remove_lock */
wait_migrated_callbacks();
}
@@ -251,7 +251,7 @@ void __init rcu_init(void)
int i;
__rcu_init();
- hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
+ cpu_notifier(rcu_barrier_cpu_hotplug, 0);
/*
* We don't need protection against CPU-hotplug here because
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-12 1:34 ` Paul E. McKenney
@ 2009-08-12 9:22 ` Ingo Molnar
2009-08-13 0:51 ` Paul E. McKenney
0 siblings, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2009-08-12 9:22 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ben Herrenschmidt, Andrew Morton, linux-kernel, Hugh Dickins
* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> Should these tests pass...
>
> Unless someone tells me otherwise, I will make a patch series
> intended to replace tip/core/rcu commits 7fe616c5d ("Simplify RCU
> CPU-hotplug notification"), 04b06256c ("Fix RCU & CPU hotplug
> hang"), and 7256cf0e83b ("Add diagnostic check for a possible
> CPU-hotplug race"), re-run all tests on that patchset, and submit
> the series. I expect the resulting patch set to have three
> patches, one to split out boot-time initialization for RCU_TREE, a
> second to create the cpu_notifier() API, and the third to make RCU
> use it.
Sure - we can reasonably rebase portions of that stack of commits:
earth4:~/tip> gll linus..core/rcu
7256cf0: rcu: Add diagnostic check for a possible CPU-hotplug race
04b0625: rcu: Fix RCU & CPU hotplug hang
7fe616c: rcu: Simplify RCU CPU-hotplug notification
240ebbf: rcu: Add synchronize_sched_expedited() rcutorture doc + updates
0acc512: rcu: Add synchronize_sched_expedited() torture tests
03b042b: rcu: Add synchronize_sched_expedited() primitive
c17ef45: rcu: Remove Classic RCU
Please mention the magic words "please reset core/rcu to 240ebbf
before applying these patches" in the mail to me, should i forget in
the days to come.
(hm, what was i supposed to not forget? Weird.)
Ingo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: CONFIG_PREEMPT_RCU in next/mmotm
2009-08-12 9:22 ` Ingo Molnar
@ 2009-08-13 0:51 ` Paul E. McKenney
0 siblings, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2009-08-13 0:51 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Ben Herrenschmidt, Andrew Morton, linux-kernel, Hugh Dickins
On Wed, Aug 12, 2009 at 11:22:50AM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
>
> > Should these tests pass...
Things are working much better, but I can still cause failures by
hotplugging CPUs with zero wait time between them while concurrently
modprobe-ing and rmmod-ing rcutorture repeatedly while running
CONFIG_PREEMPT_RCU. I increased the kernel's lifespan by an order of
magnitude or so by simplifying rcupreempt's hotplug code path.
And I just -know- that I will deeply regret having described my test
process, given that my life is much easier when my RCU testing is more
rigorous than anyone else's. ;-)
The remaining (known) problem appears to be due to a kthread_stop()
deadlock between the migration threads and a few of rcutorture's kthreads.
In this deadlock, rcutorture is waiting for one of the fakewriters
to stop (via kthread_stop()), while the fakewriter is waiting for
synchronize_rcu() to complete. The migration thread's CPU-hotplug
notifier is blocked in kthread_stop() because rcutorture holds the
kthread_stop() mutex.
I could argue that CPU hotplug should allow RCU grace periods to
proceed regardless, but I believe that the problem is that some thread
is preempted in the middle of an RCU read-side critical section, but
cannot be migrated to a CPU that could run it due to the fact that the
migration kthread is stuck in its CPU-hotplug notifier. RCU being what
it is, it seems reasonable for me to instead arrange so that rcutorture
never invokes kthread_stop() unless it knows that the target thread cannot
possibly be in the midst of synchronize_rcu(). That said, there is the
concern that this general pattern might rear its ugly head elsewhere.
> > Unless someone tells me otherwise, I will make a patch series
> > intended to replace tip/core/rcu commits 7fe616c5d ("Simplify RCU
> > CPU-hotplug notification"), 04b06256c ("Fix RCU & CPU hotplug
> > hang"), and 7256cf0e83b ("Add diagnostic check for a possible
> > CPU-hotplug race"), re-run all tests on that patchset, and submit
> > the series. I expect the resulting patch set to have three
> > patches, one to split out boot-time initialization for RCU_TREE, a
> > second to create the cpu_notifier() API, and the third to make RCU
> > use it.
While thinking this over, I am rebasing as described above, and doing
full-up testing at each step. No more Mr. Nice Guy!!! ;-)
In the meantime, can anyone tell me why we only let one kthread stop at
a time?
> Sure - we can reasonably rebase portions of that stack of commits:
>
> earth4:~/tip> gll linus..core/rcu
> 7256cf0: rcu: Add diagnostic check for a possible CPU-hotplug race
> 04b0625: rcu: Fix RCU & CPU hotplug hang
> 7fe616c: rcu: Simplify RCU CPU-hotplug notification
> 240ebbf: rcu: Add synchronize_sched_expedited() rcutorture doc + updates
> 0acc512: rcu: Add synchronize_sched_expedited() torture tests
> 03b042b: rcu: Add synchronize_sched_expedited() primitive
> c17ef45: rcu: Remove Classic RCU
>
> Please mention the magic words "please reset core/rcu to 240ebbf
> before applying these patches" in the mail to me, should i forget in
> the days to come.
Will do!
> (hm, what was i supposed to not forget? Weird.)
I can't remember. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-08-13 0:51 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-08 19:34 CONFIG_PREEMPT_RCU in next/mmotm Hugh Dickins
2009-08-08 23:56 ` Paul E. McKenney
2009-08-09 3:32 ` Hugh Dickins
2009-08-09 5:33 ` Paul E. McKenney
2009-08-09 11:24 ` Hugh Dickins
2009-08-09 13:06 ` Hugh Dickins
2009-08-09 18:57 ` Paul E. McKenney
2009-08-09 20:53 ` Hugh Dickins
2009-08-09 21:16 ` Paul E. McKenney
2009-08-10 3:39 ` Paul E. McKenney
2009-08-10 22:43 ` Paul E. McKenney
2009-08-12 1:34 ` Paul E. McKenney
2009-08-12 9:22 ` Ingo Molnar
2009-08-13 0:51 ` Paul E. McKenney
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.