All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>,
	jiangshanlai@gmail.com, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Workqueues splat due to ending up on wrong CPU
Date: Tue, 10 Dec 2019 14:56:45 -0800	[thread overview]
Message-ID: <20191210225645.GW2889@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20191210090839.GJ2844@hirez.programming.kicks-ass.net>

On Tue, Dec 10, 2019 at 10:08:39AM +0100, Peter Zijlstra wrote:
> On Mon, Dec 09, 2019 at 10:59:08AM -0800, Paul E. McKenney wrote:
> > And it survived!  ;-)
> > 
> > Peter, could I please have your Signed-off-by?  Or take my Tested-by if
> > you would prefer to send it up some other way.
> 
> How's this?

Very good, thank you!  I have queued it on -rcu, but please let me
know if you would rather that it go in via some other path.

							Thanx, Paul

> ---
> Subject: cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Tue Dec 10 09:34:54 CET 2019
> 
> Paul reported a very sporadic, rcutorture induced, workqueue failure.
> When the planets align, the workqueue rescuer's self-migrate fails and
> then triggers a WARN for running a work on the wrong CPU.
> 
> Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call
> could be ignored! When stopper->enabled is false, stop_machine will
> insta complete the work, without actually doing the work. Worse, it
> will not WARN about this (we really should fix this).
> 
> It turns out there is a small window where a freshly online'ed CPU is
> marked 'online' but doesn't yet have the stopper task running:
> 
> 	BP				AP
> 
> 	bringup_cpu()
> 	  __cpu_up(cpu, idle)	 -->	start_secondary()
> 					...
> 					cpu_startup_entry()
> 	  bringup_wait_for_ap()
> 	    wait_for_ap_thread() <--	  cpuhp_online_idle()
> 					  while (1)
> 					    do_idle()
> 
> 					... available to run kthreads ...
> 
> 	    stop_machine_unpark()
> 	      stopper->enable = true;
> 
> Close this by moving the stop_machine_unpark() into
> cpuhp_online_idle(), such that the stopper thread is ready before we
> start the idle loop and schedule.
> 
> Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
> Debugged-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -525,8 +525,7 @@ static int bringup_wait_for_ap(unsigned
>  	if (WARN_ON_ONCE((!cpu_online(cpu))))
>  		return -ECANCELED;
>  
> -	/* Unpark the stopper thread and the hotplug thread of the target cpu */
> -	stop_machine_unpark(cpu);
> +	/* Unpark the hotplug thread of the target cpu */
>  	kthread_unpark(st->thread);
>  
>  	/*
> @@ -1089,8 +1088,8 @@ void notify_cpu_starting(unsigned int cp
>  
>  /*
>   * Called from the idle task. Wake up the controlling task which brings the
> - * stopper and the hotplug thread of the upcoming CPU up and then delegates
> - * the rest of the online bringup to the hotplug thread.
> + * hotplug thread of the upcoming CPU up and then delegates the rest of the
> + * online bringup to the hotplug thread.
>   */
>  void cpuhp_online_idle(enum cpuhp_state state)
>  {
> @@ -1100,6 +1099,12 @@ void cpuhp_online_idle(enum cpuhp_state
>  	if (state != CPUHP_AP_ONLINE_IDLE)
>  		return;
>  
> +	/*
> +	 * Unpart the stopper thread before we start the idle loop (and start
> +	 * scheduling); this ensures the stopper task is always available.
> +	 */
> +	stop_machine_unpark(smp_processor_id());
> +
>  	st->state = CPUHP_AP_ONLINE_IDLE;
>  	complete_ap_thread(st, true);
>  }

      reply	other threads:[~2019-12-10 22:56 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-25 23:03 Workqueues splat due to ending up on wrong CPU Paul E. McKenney
2019-11-26 18:33 ` Tejun Heo
2019-11-26 22:05   ` Paul E. McKenney
2019-11-27 15:50     ` Paul E. McKenney
2019-11-28 16:18       ` Paul E. McKenney
2019-11-29 15:58         ` Paul E. McKenney
2019-12-02  1:55           ` Paul E. McKenney
2019-12-02 20:13             ` Tejun Heo
2019-12-02 23:39               ` Paul E. McKenney
2019-12-03 10:00                 ` Peter Zijlstra
2019-12-03 17:45                   ` Paul E. McKenney
2019-12-03 18:13                     ` Tejun Heo
2019-12-03  9:55               ` Peter Zijlstra
2019-12-03 10:06                 ` Peter Zijlstra
2019-12-03 15:42                 ` Tejun Heo
2019-12-03 16:04                   ` Paul E. McKenney
2019-12-04 20:11                 ` Paul E. McKenney
2019-12-05 10:29                   ` Peter Zijlstra
2019-12-05 10:32                     ` Peter Zijlstra
2019-12-05 14:48                       ` Paul E. McKenney
2019-12-06  3:19                         ` Paul E. McKenney
2019-12-06 18:52                         ` Paul E. McKenney
2019-12-06 22:00                           ` Paul E. McKenney
2019-12-09 18:59                             ` Paul E. McKenney
2019-12-10  9:08                               ` Peter Zijlstra
2019-12-10 22:56                                 ` Paul E. McKenney [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191210225645.GW2889@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=jiangshanlai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.