From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755575AbcHSVO7 (ORCPT ); Fri, 19 Aug 2016 17:14:59 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:33448 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754611AbcHSVO6 (ORCPT ); Fri, 19 Aug 2016 17:14:58 -0400 X-IBM-Helo: d03dlp01.boulder.ibm.com X-IBM-MailFrom: paulmck@linux.vnet.ibm.com Date: Fri, 19 Aug 2016 14:14:57 -0700 From: "Paul E. McKenney" To: Sebastian Andrzej Siewior Cc: Sebastian Andrzej Siewior , Anna-Maria Gleixner , LKML , Peter Zijlstra , Ingo Molnar Subject: Re: [patch 59/66] rcu: Convert rcutree to hotplug state machine Reply-To: paulmck@linux.vnet.ibm.com References: <20160711122450.923603742@linutronix.de> <20160711122535.613832499@linutronix.de> <20160711183828.GO4650@linux.vnet.ibm.com> <20160818173535.l5i24wepg26ktqbd@breakpoint.cc> <20160818183013.GT3482@linux.vnet.ibm.com> <20160819201200.warurd7e7t7uiggg@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160819201200.warurd7e7t7uiggg@linutronix.de> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16081921-0020-0000-0000-00000998215F X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00005617; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000183; SDB=6.00747032; UDB=6.00352275; IPR=6.00519554; BA=6.00004671; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00012398; XFM=3.00000011; UTC=2016-08-19 21:14:49 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16081921-0021-0000-0000-000054B96A93 Message-Id: <20160819211457.GH3482@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-08-19_07:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1604210000 definitions=main-1608190259 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Aug 19, 2016 at 10:12:00PM +0200, Sebastian Andrzej Siewior wrote: > On 2016-08-18 11:30:13 [-0700], Paul E. McKenney wrote: > > > > --- a/kernel/cpu.c > > > > +++ b/kernel/cpu.c > > > > @@ -882,6 +882,7 @@ void notify_cpu_starting(unsigned int cpu) > > > > struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu); > > > > enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE); > > > > > > > > + rcu_cpu_starting(cpu); /* All CPU_STARTING notifiers can use RCU. */ > > > > while (st->state < target) { > > > > struct cpuhp_step *step; > > > > > > What happens if something post CPUHP_AP_ONLINE fails and we do a > > > rollback to 0? Do we need to revert / undo rcu_cpu_starting() doing? > > > > Yes, by calling rcu_cleanup_dying_idle_cpu(). > > Thank you for that :) No problem! ;-) > > But please note that rcu_cpu_starting() is invoked from the CPU itself > > during the onlining process. Is it really possible to fail beyond > > that point? > > "sure". We enter here at CPUHP_BRINGUP_CPU. The next states until > (approx) CPUHP_AP_ONLINE are currently defined as "can't fail". As you > see in notify_cpu_starting() the return code of cpuhp_invoke_callback() > isn't checked and there is no rollback. > Later, CPUHP_AP_SMPBOOT_THREADS could fail (not in current but from here > on we no longer the return code). At this point we would return to where > we started (CPUHP_OFFLINE is most cases). During the rollback we get to > rcu_cleanup_dying_idle_cpu() via rcu_report_dead() so that is fine. OK, so any later failure looks to RCU like the CPU came online, then immediately went offline. Works for me! > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > > > index 3121242b8579..5e7c1d6a6108 100644 > > > > --- a/kernel/rcu/tree.c > > > > +++ b/kernel/rcu/tree.c > > > > @@ -3793,8 +3793,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp) > > > > rnp = rdp->mynode; > > > > mask = rdp->grpmask; > > > > raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */ > > > > - rnp->qsmaskinitnext |= mask; > > > > - rnp->expmaskinitnext |= mask; > > > > if (!rdp->beenonline) > > > > WRITE_ONCE(rsp->ncpus, READ_ONCE(rsp->ncpus) + 1); > > > > rdp->beenonline = true; /* We have now been online. */ > > > > @@ -4211,8 +4235,10 @@ void __init rcu_init(void) > > > > */ > > > > cpu_notifier(rcu_cpu_notify, 0); > > > > pm_notifier(rcu_pm_notify, 0); > > > > - for_each_online_cpu(cpu) > > > > + for_each_online_cpu(cpu) { > > > > rcu_cpu_notify(NULL, CPU_UP_PREPARE, (void *)(long)cpu); > > > > + rcu_cpu_starting(cpu); > > > > + } > > > > > > and looking at this from x86-64 then at this point I have CPU0 online > > > and all other are down (or not yet up). So this is invoked for one CPU > > > only. And later via hotplug callbacks for the other CPUs. > > > > Yes, that is the current situation. The reason for the loop is in > > case someone clever decides to online other CPUs before RCU initializes > > itself. No idea how anyone would make that sort of thing work, but I > > have learned not to underestimate the creativity of the fast-boot guys. > > doubt this would work. start_kernel() is invoked from the boot CPU. > Other CPUs are usually down because the scheduler wasn't up yet (so they > can't idle in their idle thread nor could they be marked "online" in the > CPU mask). Later (rest_init() -> kernel_init() -> > kernel_init_freeable()) there is smp_init() which boots the additional > CPUs. Just as there is currently an early boot implementation of printk(), some crazy person might well create an early boot implementation of the scheduler that had fixed per-CPU tasks for which blocking is forbidden. I understand that this is crazy, but much else has been considered crazy not long before inclusion. PREEMPT_RT, for but one example. ;-) > > And please do not invoke rcu_cpu_starting() before rcu_init(), at least > > not without some careful inspection and likely reworking of rcu_init() > > and the things that it calls. > > I did not intend to. I was thinking about moving it to > CPUHP_AP_RCUTREE_DYING but since the teardown method does not match it > makes no sense. Whew!!! ;-) > Thank you for clarifying the counterpart of rcu_cpu_starting(). Again, no problem! Thanx, Paul