On Wed, Dec 21, 2016 at 08:48:45AM -0800, Paul E. McKenney wrote: > On Wed, Dec 21, 2016 at 12:18:08PM +0800, Boqun Feng wrote: > > On Tue, Dec 20, 2016 at 07:40:24PM -0800, Paul E. McKenney wrote: > > [...] > > > > > > > > Agreed, my intent is to keep this overcare check for couples of releases > > > > and if no one shoots his/her foot, we can remove it, if not, it > > > > definitely means this part is subtle, and we need to pay more attention > > > > to it, maybe write some regression tests for this particular problem to > > > > help developers avoid it. > > > > > > > > This check is supposed to be removed, so I'm not stick to keeping it. > > > > > > I suggest keeping through validation. If it triggers during that time, > > > consider keeping it longer. If it does not trigger, remove it before > > > it goes upstream. > > > > Good point ;-) > > > > [...] > > > > > > > > > > > > But this brings a side question, is the callsite of rcu_cpu_starting() > > > > > > is correct? Given rcu_cpu_starting() ignores the @cpu parameter and only > > > > > > set _this_ cpu's bit in a leaf node? > > > > > > > > > > The calls from notify_cpu_starting() are called from the various > > > > > start_kernel_secondary(), secondary_start_kernel(), and similarly > > > > > named functions. These are called on the incoming CPU early in that > > > > > CPU's execution. The call from rcu_init() is correct until such time > > > > > as more than one CPU can be running at rcu_init() time. And that > > > > > day might be coming, so please see the untested patch below. > > > > > > > > Looks better than mine ;-) > > > > > > > > But do we need to worry that we start rcu on each CPU twice, which may > > > > slow down the boot? > > > > > > We only start a given CPU once. The boot CPU at rcu_init() time, and > > > the rest at CPU-hotplug time. Unless of course a CPU is later taken > > > > Confused... we call rcu_cpu_starting() in a for_each_online_cpu() loop > > in rcu_init(), so we basically start all online CPUs there after > > applying your patch. And all the rest CPUs will get themselves start > > again at CPU-hotplug time, right? > > At rcu_init() time, there is only one online CPU, namely the boot CPU. > > Or perhaps your point is that if CPUs come online before rcu_init(), they > might do so via the normal online mechanism. I don't believe that this > is likely, because the normal online mechanism reaquires the scheduler > be running. But either way, my hope would be that whoever fires up CPUs > before rcu_init() asks a few questions when they run into bugs. ;-) > ;-) > > Besides, without your patch, we started the boot CPU many times in the > > for_each_online_cpu() loop. > > That is true. It is harmless because it just does a group of assignments > repeatedly, and because there is only one CPU and because interrupts > are disabled, this cannot have any effect. And my fix inadvertently > fixed this issue, didn't it? > Yep! > So I do need to update the commit log accordingly. Done! > > > Am I missing something subtle? > > Given the nature of RCU, the only possible answer I can give to that > question is "probably". (Hey, you asked!!!) > True, I misread the for_each_online_cpu() loop in rcu_init(), I thought at that time, CPUs other than boot cpu have already mask themselves in the cpu_online_mask. But that's not true.. Sorry for the noice, and thank you for explanation ;-) Regards, Boqun > Thanx, Paul > > > Regards, > > Boqun > > > > > offline, in which case we start it again when it comes back online. > > > > > > Thanx, Paul > > > > > > > Regards, > > > > Boqun > > > > > > > > > Thanx, Paul > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > commit 1e84402587173d6d4da8645689f0e24c877b3269 > > > > > Author: Paul E. McKenney > > > > > Date: Tue Dec 20 07:17:58 2016 -0800 > > > > > > > > > > rcu: Make rcu_cpu_starting() use its "cpu" argument > > > > > > > > > > The rcu_cpu_starting() function uses this_cpu_ptr() to locate the > > > > > incoming CPU's rcu_data structure. This works for the boot CPU and for > > > > > all CPUs onlined after rcu_init() executes (during very early boot). > > > > > Currently, this is the full set of CPUs, so all is well. But if > > > > > anyone ever parallelizes boot before rcu_init() time, it will fail. > > > > > This commit therefore substitutes the rcu_cpu_starting() function's > > > > > this_cpu_pointer() for per_cpu_ptr(), future-proofing the code and > > > > > (arguably) improving readability. > > > > > > > > > > Reported-by: Boqun Feng > > > > > Signed-off-by: Paul E. McKenney > > > > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > > > > index b9d3c0e30935..083cb8a6299c 100644 > > > > > --- a/kernel/rcu/tree.c > > > > > +++ b/kernel/rcu/tree.c > > > > > @@ -4017,7 +4017,7 @@ void rcu_cpu_starting(unsigned int cpu) > > > > > struct rcu_state *rsp; > > > > > > > > > > for_each_rcu_flavor(rsp) { > > > > > - rdp = this_cpu_ptr(rsp->rda); > > > > > + rdp = per_cpu_ptr(rsp->rda, cpu); > > > > > rnp = rdp->mynode; > > > > > mask = rdp->grpmask; > > > > > raw_spin_lock_irqsave_rcu_node(rnp, flags); > > > > > > > > > > > > >