linux-kernel.vger.kernel.org archive mirror
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
	peterz@infradead.org, tglx@linutronix.de
Subject: Re: RCU used on incoming CPU before rcu_cpu_starting() called
Date: Wed, 8 Mar 2017 21:24:14 -0800	[thread overview]
Message-ID: <20170309052414.GQ30506@linux.vnet.ibm.com> (raw)
In-Reply-To: <20170309035527.GB6416@lerouge>

On Thu, Mar 09, 2017 at 04:55:29AM +0100, Frederic Weisbecker wrote:
> On Wed, Mar 08, 2017 at 03:41:52PM -0800, Paul E. McKenney wrote:
> > On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> > > Hello!
> > > 
> > > I am seeing the following splat in rcutorture testing of v4.11-rc1:
> > > 
> > > [   30.694013] =============================
> > > [   30.694013] WARNING: suspicious RCU usage
> > > [   30.694013] 4.11.0-rc1+ #1 Not tainted
> > > [   30.694013] -----------------------------
> > > [   30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
> > > [   30.694013] 
> > > [   30.694013] other info that might help us debug this:
> > > [   30.694013] 
> > > [   30.694013] 
> > > [   30.694013] RCU used illegally from offline CPU!
> > > [   30.694013] rcu_scheduler_active = 2, debug_locks = 0
> > > [   30.694013] no locks held by swapper/1/0.
> > > [   30.694013] 
> > > [   30.694013] stack backtrace:
> > > [   30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> > > [   30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > > [   30.694013] Call Trace:
> > > [   30.694013]  dump_stack+0x67/0x99
> > > [   30.694013]  lockdep_rcu_suspicious+0xe7/0x120
> > > [   30.694013]  get_work_pool+0x82/0x90
> > > [   30.694013]  __queue_work+0x70/0x5f0
> > > [   30.694013]  queue_work_on+0x33/0x70
> > > [   30.694013]  clear_sched_clock_stable+0x33/0x40
> > > [   30.694013]  early_init_intel+0xe7/0x2f0
> > > [   30.694013]  init_intel+0x11/0x350
> > > [   30.694013]  identify_cpu+0x344/0x5a0
> > > [   30.694013]  identify_secondary_cpu+0x18/0x80
> > > [   30.694013]  smp_store_cpu_info+0x39/0x40
> > > [   30.694013]  start_secondary+0x4e/0x100
> > > [   30.694013]  start_cpu+0x14/0x14
> > > 
> > > Here is the relevant code from x86's smp_callin():
> > > 
> > > 	/*
> > > 	 * Save our processor parameters. Note: this information
> > > 	 * is needed for clock calibration.
> > > 	 */
> > > 	smp_store_cpu_info(cpuid);
> > > 
> > > 	/*
> > > 	 * Get our bogomips.
> > > 	 * Update loops_per_jiffy in cpu_data. Previous call to
> > > 	 * smp_store_cpu_info() stored a value that is close but not as
> > > 	 * accurate as the value just calculated.
> > > 	 */
> > > 	calibrate_delay();
> > > 	cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> > > 	pr_debug("Stack at about %p\n", &cpuid);
> > > 
> > > 	/*
> > > 	 * This must be done before setting cpu_online_mask
> > > 	 * or calling notify_cpu_starting.
> > > 	 */
> > > 	set_cpu_sibling_map(raw_smp_processor_id());
> > > 	wmb();
> > > 
> > > 	notify_cpu_starting(cpuid);
> > > 
> > > The problem is that smp_store_cpu_info() indirectly invokes
> > > schedule_work(), which wants to use RCU.  But RCU isn't informed
> > > of the incoming CPU until the call to notify_cpu_starting(), which
> > > causes lockdep to complain bitterly about the use of RCU by the
> > > premature call to schedule_work().
> > > 
> > > I considered just moving the notify_cpu_starting() earlier in the
> > > sequence, but the comments make it seem like this would not be
> > > a wise choice.
> > > 
> > > Any suggestions?
> 
> Calling schedule_work() from an offline (booting) CPU doesn't sound like a good
> idea in the first place. And neither is it a good idea to allow using
> RCU on a CPU that is not yet online.

Fair point, though it is only RCU readers that are in use, so no
RCU core code would be running, at least not until interrupts are
enabled and so on.  But I needed the patch to get this out of the
way of my rcutorture testing, so it is serving a purpose whether or
not it eventually hits mainline.

> Perhaps we could delay this sched clock stability check to a later
> stage in the secondary CPU boot-up code? Once the CPU is online
> and RCU is initialized? For example it could be a CPU_ONLINE hotplug
> callback.

In theory, this does seem like a cleaner solution to me.  In practice,
I must defer to those who know the code better than I do.

							Thanx, Paul

Thread overview: 11+ messages
2017-03-08 22:16 RCU used on incoming CPU before rcu_cpu_starting() called Paul E. McKenney
2017-03-08 23:41 ` Paul E. McKenney
2017-03-09  3:55   ` Frederic Weisbecker
2017-03-09  5:24     ` Paul E. McKenney [this message]
2017-03-09 13:08 ` Thomas Gleixner
2017-03-09 15:12   ` Peter Zijlstra
2017-03-09 15:29     ` Paul E. McKenney
2017-03-09 15:50       ` Paul E. McKenney
2017-03-20  8:32         ` Tomeu Vizoso
2017-03-20 12:50           ` Paul E. McKenney
2017-03-09 15:24   ` Paul E. McKenney
