linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: tip-bot for Thomas Gleixner <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@kernel.org,
	konrad.wilk@oracle.com, rusty@rustcorp.com.au,
	srivatsa.bhat@linux.vnet.ibm.com, tglx@linutronix.de,
	linux@eikelenboom.it
Subject: [tip:core/urgent] stop_machine: Mark per cpu stopper enabled early
Date: Tue, 26 Feb 2013 13:38:06 -0800	[thread overview]
Message-ID: <tip-46c498c2cdee5efe44f617bcd4f388179be36115@git.kernel.org> (raw)
In-Reply-To: <alpine.LFD.2.02.1302261843240.22263@ionos>

Commit-ID:  46c498c2cdee5efe44f617bcd4f388179be36115
Gitweb:     http://git.kernel.org/tip/46c498c2cdee5efe44f617bcd4f388179be36115
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 26 Feb 2013 18:44:33 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 26 Feb 2013 22:25:17 +0100

stop_machine: Mark per cpu stopper enabled early

commit 14e568e78 (stop_machine: Use smpboot threads) introduced the
following regression:

Before this commit the stopper enabled bit was set in the online
notifier.

CPU0				CPU1
cpu_up
				cpu online
hotplug_notifier(ONLINE)
  stopper(CPU1)->enabled = true;
...
stop_machine()

The conversion to smpboot threads moved the enablement to the wakeup
path of the parked thread. The majority of users seem to have the
following working order:

CPU0				CPU1
cpu_up
				cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
				stopper thread runs
				  stopper(CPU1)->enabled = true;
stop_machine()

But Konrad and Sander have observed:

CPU0				CPU1
cpu_up
				cpu online
unpark_threads()
  wakeup(stopper[CPU1])
....
stop_machine()
				stopper thread runs
				  stopper(CPU1)->enabled = true;

Now the stop machinery kicks CPU0 into the stop loop, where it gets
stuck forever because the queue code saw stopper(CPU1)->enabled ==
false, so CPU0 waits for CPU1 to enter stomp_machine, but the CPU1
stopper work got discarded due to enabled == false.

Add a pre_unpark function to the smpboot thread descriptor and call it
before waking the thread.

This fixes the problem at hand, but the stop_machine code should be
more robust. The stopper->enabled flag smells fishy at best.

Thanks to Konrad for going through a loop of debug patches and
providing the information to decode this issue.

Reported-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1302261843240.22263@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/smpboot.h | 4 ++++
 kernel/smpboot.c        | 2 ++
 kernel/stop_machine.c   | 2 +-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/smpboot.h b/include/linux/smpboot.h
index c65dee0..13e9296 100644
--- a/include/linux/smpboot.h
+++ b/include/linux/smpboot.h
@@ -24,6 +24,9 @@ struct smpboot_thread_data;
  *			parked (cpu offline)
  * @unpark:		Optional unpark function, called when the thread is
  *			unparked (cpu online)
+ * @pre_unpark:		Optional unpark function, called before the thread is
+ *			unparked (cpu online). This is not guaranteed to be
+ *			called on the target cpu of the thread. Careful!
  * @selfparking:	Thread is not parked by the park function.
  * @thread_comm:	The base name of the thread
  */
@@ -37,6 +40,7 @@ struct smp_hotplug_thread {
 	void				(*cleanup)(unsigned int cpu, bool online);
 	void				(*park)(unsigned int cpu);
 	void				(*unpark)(unsigned int cpu);
+	void				(*pre_unpark)(unsigned int cpu);
 	bool				selfparking;
 	const char			*thread_comm;
 };
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index d4abac2..8eaed9a 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -209,6 +209,8 @@ static void smpboot_unpark_thread(struct smp_hotplug_thread *ht, unsigned int cp
 {
 	struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
 
+	if (ht->pre_unpark)
+		ht->pre_unpark(cpu);
 	kthread_unpark(tsk);
 }
 
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 95d178c..c09f295 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -336,7 +336,7 @@ static struct smp_hotplug_thread cpu_stop_threads = {
 	.create			= cpu_stop_create,
 	.setup			= cpu_stop_unpark,
 	.park			= cpu_stop_park,
-	.unpark			= cpu_stop_unpark,
+	.pre_unpark		= cpu_stop_unpark,
 	.selfparking		= true,
 };
 

      parent reply	other threads:[~2013-02-26 21:39 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-23  1:18 Regression introduced with 14e568e78f6f80ca1e27256641ddf524c7dbdc51 (stop_machine: Use smpboot threads) Konrad Rzeszutek Wilk
2013-02-26 12:36 ` Thomas Gleixner
2013-02-26 13:07   ` [Xen-devel] " Sander Eikelenboom
2013-02-26 17:44     ` Thomas Gleixner
2013-02-26 20:33       ` Sander Eikelenboom
2013-02-26 21:38       ` tip-bot for Thomas Gleixner [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=tip-46c498c2cdee5efe44f617bcd4f388179be36115@git.kernel.org \
    --to=tipbot@zytor.com \
    --cc=hpa@zytor.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=linux@eikelenboom.it \
    --cc=mingo@kernel.org \
    --cc=rusty@rustcorp.com.au \
    --cc=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).