From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755580Ab1GLVPn (ORCPT ); Tue, 12 Jul 2011 17:15:43 -0400
Received: from mail-yi0-f46.google.com ([209.85.218.46]:38534 "EHLO
        mail-yi0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1754984Ab1GLVPl convert rfc822-to-8bit (ORCPT );
        Tue, 12 Jul 2011 17:15:41 -0400
MIME-Version: 1.0
In-Reply-To: <20110711214301.GP2245@linux.vnet.ibm.com>
References: <20110710032510.GG6014@linux.vnet.ibm.com>
        <20110710171626.GK6014@linux.vnet.ibm.com>
        <20110710173530.GA16954@linux.vnet.ibm.com>
        <20110710214639.GP6014@linux.vnet.ibm.com>
        <20110710231449.GQ6014@linux.vnet.ibm.com>
        <20110711214301.GP2245@linux.vnet.ibm.com>
Date: Tue, 12 Jul 2011 22:15:40 +0100
Message-ID:
Subject: Re: PROBLEM: 3.0-rc kernels unbootable since -rc3
From: Julie Sullivan
To: paulmck@linux.vnet.ibm.com, linux-kernel-mail
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 11, 2011 at 10:43 PM, Paul E. McKenney wrote:
> On Mon, Jul 11, 2011 at 09:37:53PM +0100, julie Sullivan wrote:
>> > And here is what I am proposing sending upstream.  I have your Tested-by,
>> > but had to make a small but very real change in order to make it work
>> > under all configurations that I test under.  So could you please try
>> > the attached patch out?  I am particularly interested in how it works
>> > out when CONFIG_RCU_BOOST=n.
>> >
>> >                                                        Thanx, Paul
>> >
>> > ------------------------------------------------------------------------
>> >
>> > rcu: Prevent RCU callbacks from executing during early boot
>> >
>> > Under some rare but real combinations of configuration parameters, RCU
>> > callbacks are posted during early boot that use kernel facilities that
>> > are not yet initialized.  Therefore, when these callbacks are invoked,
>> > hard hangs and crashes ensue.  This commit therefore prevents RCU
>> > callbacks from being invoked until after the scheduler is up and running.
>> >
>> > It might well turn out that a better approach is to identify the specific
>> > RCU callbacks that are causing this problem, but that discussion will
>> > wait until such time as someone really needs an RCU callback to be
>> > invoked during early boot.
>> >
>> > Reported-by: julie Sullivan
>> > Tested-by: julie Sullivan
>> > Signed-off-by: Paul E. McKenney
>> >
>> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
>> > index 7e59ffb..4c0210f 100644
>> > --- a/kernel/rcutree.c
>> > +++ b/kernel/rcutree.c
>> > @@ -1467,7 +1467,7 @@ static void rcu_process_callbacks(struct softirq_action *unused)
>> >  */
>> >  static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
>> >  {
>> > -       if (likely(!rsp->boost)) {
>> > +       if (likely(rcu_scheduler_active && !rsp->boost)) {
>> >                rcu_do_batch(rsp, rdp);
>> >                return;
>> >        }
>> > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
>> > index 14dc7dd..ca3c6dc 100644
>> > --- a/kernel/rcutree_plugin.h
>> > +++ b/kernel/rcutree_plugin.h
>> > @@ -1703,7 +1703,7 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
>> >
>> >  static void invoke_rcu_callbacks_kthread(void)
>> >  {
>> > -       WARN_ON_ONCE(1);
>> > +       WARN_ON_ONCE(rcu_scheduler_active);
>> >  }
>> >
>> >  static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
>> >
>>
>> Hi Paul,
>> Is this to be applied on a clean v3.0-rc4? I tried this but I'm afraid
>> the boot crash is back again (on -rc5 and -rc6 too).
>
> I must confess that it did seem to be giving up a bit too easily.  :-(
>
> So, I have created a new branch jms.2011.07.11a on the -rcu git tree at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
>
> If the new branch jms.2011.07.11a fails and the old branch jms.2011.07.07a
> succeeds (both with CONFIG_RCU_BOOST=n), then that indicates that my
> mainlinable patch didn't delay the callbacks quite far enough.  On the
> other hand, if both succeed, then that means that there is another bug
> lurking later on in the sequence of commits.
>
> Could you please test these out?
>
>                                                        Thanx, Paul
>

OK tested- jms.2011.07.11a fails.
The other one's fine (I'm actually running an -rc6 with its patches right now :-)

Julie
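
The idea behind the quoted patch (hold RCU callbacks until the scheduler is
running) can be modelled in user space. What follows is a minimal sketch
under stated assumptions, not kernel code: every demo_* name is hypothetical,
demo_scheduler_active only stands in for rcu_scheduler_active, and the model
simply returns early where the real diff changes a condition inside
invoke_rcu_callbacks(). Note that jms.2011.07.11a still failed in the test
reported above, so this shows the intent of the change, not a confirmed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of the demo_* names are kernel APIs. */
typedef void (*demo_cb_fn)(void *arg);

struct demo_cb {
    demo_cb_fn fn;
    void *arg;
};

#define DEMO_MAX_CBS 16

static struct demo_cb demo_queue[DEMO_MAX_CBS];
static size_t demo_pending;

/* Assumption: stands in for rcu_scheduler_active; false during "early boot". */
static bool demo_scheduler_active;

/* Queue a callback for later. Returns false if the fixed-size queue is full. */
static bool demo_post_callback(demo_cb_fn fn, void *arg)
{
    assert(fn != NULL);
    if (demo_pending == DEMO_MAX_CBS)
        return false;
    demo_queue[demo_pending].fn = fn;
    demo_queue[demo_pending].arg = arg;
    demo_pending++;
    return true;
}

/*
 * Run the queued callbacks, but only once the "scheduler" is up.
 * Returns how many callbacks ran. Not reentrant: a callback must not
 * post new callbacks while the queue is being drained.
 */
static size_t demo_invoke_callbacks(void)
{
    size_t i, ran;

    if (!demo_scheduler_active)
        return 0;               /* too early: leave everything queued */

    for (i = 0; i < demo_pending; i++)
        demo_queue[i].fn(demo_queue[i].arg);
    ran = demo_pending;
    demo_pending = 0;
    return ran;
}

/* Example callback: increments the int it is handed. */
static void demo_count(void *arg)
{
    (*(int *)arg)++;
}
```

A caller would post a callback with demo_post_callback(demo_count, &n), see
demo_invoke_callbacks() return 0 while demo_scheduler_active is false, then
set the flag and call it again to drain the queue.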