From: Peter Zijlstra <peterz@infradead.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Tejun Heo <tj@kernel.org>,
	jiangshanlai@gmail.com, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Workqueues splat due to ending up on wrong CPU
Date: Thu, 5 Dec 2019 11:29:28 +0100
Message-ID: <20191205102928.GG2810@hirez.programming.kicks-ass.net>
In-Reply-To: <20191204201150.GA14040@paulmck-ThinkPad-P72>

On Wed, Dec 04, 2019 at 12:11:50PM -0800, Paul E. McKenney wrote:

> And the good news is that I didn't see the workqueue splat, though my
> best guess is that I had about a 13% chance of not seeing it due to
> random chance (and I am currently trying an idea that I hope will make
> it more probable).  But I did get a couple of new complaints about RCU
> being used illegally from an offline CPU.  Splats below.

Shiny!

> Your patch did rearrange the CPU-online sequence, so let's see if I
> can piece things together...
> 
> RCU considers a CPU to be online at rcu_cpu_starting() time.  This is
> called from notify_cpu_starting(), which is called from the arch-specific
> CPU-bringup code.  Any RCU readers before rcu_cpu_starting() will trigger
> the warning I am seeing.

Right.
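
To make the rule above concrete, here is a minimal user-space sketch of it. It is only a toy model, not kernel code: every toy_* name, the flag array and the warning counter are hypothetical, made up for illustration. The only things taken from this thread are that notify_cpu_starting() ends up calling rcu_cpu_starting(), and that a reader running before that point triggers the warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for RCU's per-CPU notion of "online". */
static bool toy_rcu_online[TOY_NR_CPUS];
/* Counts how often the toy "RCU used from offline CPU" complaint fired. */
static int toy_warnings;

/* Models rcu_cpu_starting(): from here on RCU treats the CPU as online. */
static void toy_rcu_cpu_starting(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_rcu_online[cpu] = true;
}

/* Models notify_cpu_starting(), which (per the thread) calls rcu_cpu_starting(). */
static void toy_notify_cpu_starting(int cpu)
{
	toy_rcu_cpu_starting(cpu);
}

/*
 * Models an RCU reader on @cpu.  Returns true if the read is legal,
 * false (and bumps the warning counter) if RCU does not yet consider
 * the CPU online.
 */
static bool toy_rcu_read(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	if (!toy_rcu_online[cpu]) {
		toy_warnings++;
		fprintf(stderr, "toy: RCU used from offline CPU %d\n", cpu);
		return false;
	}
	return true;
}
```

In this model, calling toy_rcu_read(1) before toy_notify_cpu_starting(1) complains, and calling it afterwards does not.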

> The original location of the stop_machine_unpark() was in
> bringup_wait_for_ap(), which is called from bringup_cpu(), which is in
> the CPUHP_BRINGUP_CPU entry of cpuhp_hp_states[].  Which, if I am not
> too confused, is invoked by some CPU other than the to-be-incoming CPU.

Correct.

> The new location of the stop_machine_unpark() is in cpuhp_online_idle(),
> which is called from cpu_startup_entry(), which is invoked from
> the arch-specific bringup code that runs on the incoming CPU.

The new place is the final piece of bringup; it is right before the
point where the freshly woken CPU drops into the idle loop and starts
scheduling (for the first time).

> Which
> is the same code that invokes notify_cpu_starting(), so we need
> notify_cpu_starting() to be invoked before cpu_startup_entry().

Right, that is right before we run what used to be the CPU_STARTING
notifiers. This is in fact (on x86) before the CPU is marked
cpu_online(). It has to be before cpu_startup_entry(), because this is
run with IRQs disabled, while cpu_startup_entry() demands that IRQs are
enabled.
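
The ordering constraints described here can be written down as a small user-space model. This is a sketch under stated assumptions, not the actual bringup path: struct model_cpu and all model_* functions are hypothetical names, and the checks only encode what is said in this thread (notify_cpu_starting() runs with IRQs disabled and brings RCU up, the CPU is marked online after that, cpu_startup_entry() wants IRQs enabled, and the new stop_machine_unpark() call site is reached from cpu_startup_entry() via cpuhp_online_idle()).

```c
#include <stdbool.h>

/* Hypothetical per-CPU state for the model; none of this is kernel API. */
struct model_cpu {
	bool irqs_enabled;
	bool rcu_online;        /* set by the notify_cpu_starting() step */
	bool cpu_online;        /* models the CPU being marked cpu_online() */
	bool stopper_unparked;  /* models stop_machine_unpark() having run */
};

/* Models notify_cpu_starting(): must run with IRQs disabled. */
static bool model_notify_cpu_starting(struct model_cpu *c)
{
	if (c->irqs_enabled)
		return false;
	c->rcu_online = true;
	return true;
}

/*
 * Models cpu_startup_entry(): demands IRQs enabled, and in this model
 * also RCU being up, since the unpark step is where the thread saw
 * RCU complaints.  Unparking the stopper stands in for
 * cpuhp_online_idle() calling stop_machine_unpark().
 */
static bool model_cpu_startup_entry(struct model_cpu *c)
{
	if (!c->irqs_enabled || !c->rcu_online)
		return false;
	c->stopper_unparked = true;
	return true;
}

/* The order the thread expects on the incoming CPU. */
static bool model_bringup(struct model_cpu *c)
{
	*c = (struct model_cpu){ 0 };      /* start with IRQs disabled */
	if (!model_notify_cpu_starting(c))
		return false;
	c->cpu_online = true;              /* marked online after the notifiers */
	c->irqs_enabled = true;            /* IRQs enabled before the idle entry */
	return model_cpu_startup_entry(c);
}
```

Swapping the two calls in model_bringup() makes the model fail in either direction, which is the point being made: the starting notifiers have to come first.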

> The order is not immediately obvious on IA64.  But it looks like
> everything else does it in the required order, so I am a bit confused
> about this.

That makes two of us; afaict we have RCU up and running by the time we
get to the idle loop.
