linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gautham R Shenoy <ego@in.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Slaby <jirislaby@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, "Rafael J. Wysocki" <rjw@sisk.pl>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linux-pm mailing list <linux-pm@lists.linux-foundation.org>,
	Dipankar Sarma <dipankar@in.ibm.com>
Subject: Re: broken suspend (sched related) [Was: 2.6.24-rc4-mm1]
Date: Mon, 10 Dec 2007 15:45:00 +0530	[thread overview]
Message-ID: <20071210101500.GB12880@in.ibm.com> (raw)
In-Reply-To: <20071210091052.GA14487@elte.hu>

On Mon, Dec 10, 2007 at 10:10:52AM +0100, Ingo Molnar wrote:
> 
> > > softlockup: remove get_online_cpus() which doesn't help here.
> > > 
> > > The get_online_cpus() protection seems to be bogus in 
> > > kernel/softlockup.c as cpu cached in check_cpu can go offline once 
> > > we do a put_online_cpus().
> > > 
> > > This can also cause deadlock during a cpu offline as follows:
> 
> i'm wondering, what's the proper CPU-hotplug safe sequence here then? 
> I'm picking a CPU number from cpu_online_map, and that CPU could go away 
> while i'm still using it, right? What's saving us here?

In this particular case, we are trying to see if any task on a particular
cpu has not been scheduled for a really long time. If we do this check
on a cpu which has gone offline, then
a) If the tasks have not been migrated on to another cpu yet, we will
still perform that check and yell if something has been holding any task
for a sufficiently long time.
b) If the tasks have been migrated off, then we have nothing to check.

However, if we still want that particular cpu to not go offline during
the check, then could we use the following patch

commit a49736844717e00f7a37c96368cea8fec7eb31cf
Author: Gautham R Shenoy <ego@in.ibm.com>
Date:   Mon Dec 10 15:43:32 2007 +0530

CPU-Hotplug: Add try_get_online_cpus()

Add the fastpath code, try_get_online_cpus() which will
return 1 once it has managed to hold the reference to the cpu_online_map
if there are no threads trying to perform a cpu-hotplug.

Use the primitive in the softlockup code.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linuxtronix.de>
Cc: Jiri Slaby <jirislaby@gmail.com>

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index e0132cb..d236e21 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -107,6 +107,7 @@ static inline void cpuhotplug_mutex_unlock(struct mutex *cpu_hp_mutex)
 }
 
 extern void get_online_cpus(void);
+extern int  try_get_online_cpus(void);
 extern void put_online_cpus(void);
 #define hotcpu_notifier(fn, pri) {				\
 	static struct notifier_block fn##_nb =			\
@@ -127,6 +128,9 @@ static inline void cpuhotplug_mutex_unlock(struct mutex *cpu_hp_mutex)
 
 #define get_online_cpus()	do { } while (0)
 #define put_online_cpus()	do { } while (0)
+static inline int try_get_online_cpus(void) 
+{ return 1;}
+
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e0d3a4f..38537c9 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -48,11 +48,35 @@ void __init cpu_hotplug_init(void)
 
 #ifdef CONFIG_HOTPLUG_CPU
 
+/*
+ * try_get_online_cpus(): Tries to hold a reference 
+ * to the cpu_online_map if no one is trying to perform 
+ * a cpu-hotplug operation. This is the fastpath code for
+ * get_online_cpus.
+ *
+ * Returns 1 if there is no cpu-hotplug operation
+ * currently in progress.
+ */
+int try_get_online_cpus(void)
+{
+	if(!cpu_hotplug.active_writer) {
+		cpu_hotplug.refcount++;
+		return 1;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(try_get_online_cpus);
+
 void get_online_cpus(void)
 {
 	might_sleep();
 	if (cpu_hotplug.active_writer == current)
 		return;
+	if (try_get_online_cpus())
+		return;
+
+	/* The writer exists, hence the slowpath */
 	mutex_lock(&cpu_hotplug.lock);
 	cpu_hotplug.refcount++;
 	mutex_unlock(&cpu_hotplug.lock);
@@ -120,6 +144,11 @@ static void cpu_hotplug_begin(void)
 	mutex_lock(&cpu_hotplug.lock);
 
 	cpu_hotplug.active_writer = current;
+	synchronize_sched();
+	/* New users of get_online_cpus() will see a non-NULL value
+	 * for cpu_hotplug.active_writer here and will take the slowpath
+	 */
+
 	add_wait_queue_exclusive(&cpu_hotplug.writer_queue, &wait);
 	while (cpu_hotplug.refcount) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
index 576eb9c..cb0616d 100644
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -150,8 +150,8 @@ static void check_hung_task(struct task_struct *t, unsigned long now)
 	sysctl_hung_task_warnings--;
 
 	/*
-	 * Ok, the task did not get scheduled for more than 2 minutes,
-	 * complain:
+	 * Ok, the task did not get scheduled for more than
+	 * sysctl_hung_task_timeout_secs, complain:
 	 */
 	printk(KERN_ERR "INFO: task %s:%d blocked for more than "
 			"%ld seconds.\n", t->comm, t->pid,
@@ -216,16 +216,21 @@ static int watchdog(void *__bind_cpu)
 		touch_softlockup_watchdog();
 		msleep_interruptible(10000);
 
+		/* 
+		 * If a cpu-hotplug operation is in progress
+		 * we can come back later
+		 */
+		if (!try_get_online_cpus())
+			continue; 
 		/*
 		 * Only do the hung-tasks check on one CPU:
 		 */
 		check_cpu = any_online_cpu(cpu_online_map);
 
-		if (this_cpu != check_cpu)
-			continue;
-
-		if (sysctl_hung_task_timeout_secs)
+		if ((this_cpu == check_cpu) && sysctl_hung_task_timeout_secs)
 			check_hung_uninterruptible_tasks(this_cpu);
+
+		put_online_cpus();
 	}
 
 	return 0;






> 
> 	Ingo

-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

  reply	other threads:[~2007-12-10 10:15 UTC|newest]

Thread overview: 112+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-05  5:17 2.6.24-rc4-mm1 Andrew Morton
2007-12-05  9:15 ` 2.6.24-rc4-mm1: kobj changes fallout on powerpc Olof Johansson
2007-12-05 13:11   ` Kamalesh Babulal
2007-12-05 15:46     ` Greg KH
2007-12-05 14:12 ` 2.6.24-rc4-mm1 kobject changes broken with hvcs driver " Kamalesh Babulal
2007-12-05 15:47   ` Greg KH
2007-12-06 18:19   ` Balbir Singh
2007-12-06 18:50     ` Greg KH
2007-12-06 18:49       ` Kamalesh Babulal
2007-12-06 18:58       ` Balbir Singh
2007-12-06 19:21         ` Badari Pulavarty
2007-12-07  1:29           ` Balbir Singh
2007-12-06 20:31         ` Greg KH
2007-12-06 23:54           ` Badari Pulavarty
2007-12-07  0:32             ` Greg KH
2007-12-07  3:02               ` Kamalesh Babulal
2007-12-07  5:14                 ` Greg KH
2007-12-07 22:01                   ` Balbir Singh
2007-12-05 23:41 ` 2.6.24-rc4-mm1: hostbyte=0x01 driverbyte=0x00 (now bisected) Alexey Dobriyan
2007-12-06  7:52   ` Hannes Reinecke
2007-12-06 12:08     ` Jens Axboe
2007-12-06 19:19     ` Alexey Dobriyan
2007-12-06  3:15 ` 2.6.24-rc4-mm1 Kernel build fails on S390x Kamalesh Babulal
2007-12-06  7:19   ` Andrew Morton
2007-12-06  6:59 ` 2.6.24-rc4-mm1 Reuben Farrelly
2007-12-06  7:09   ` 2.6.24-rc4-mm1 David Miller
2007-12-07 13:16     ` 2.6.24-rc4-mm1 Ilpo Järvinen
2007-12-12 17:57       ` 2.6.24-rc4-mm1 Cedric Le Goater
2007-12-06  7:35   ` 2.6.24-rc4-mm1 Andrew Morton
2007-12-10 12:24     ` 2.6.24-rc4-mm1 Ilpo Järvinen
2007-12-10 20:05       ` 2.6.24-rc4-mm1 Ilpo Järvinen
2007-12-12 19:21       ` 2.6.24-rc4-mm1 Cedric Le Goater
2007-12-13 17:38         ` tcp_sacktag_one() WARNING (was Re: 2.6.24-rc4-mm1) Cedric Le Goater
2007-12-06 11:49 ` 2.6.24-rc4-mm1 Valdis.Kletnieks
2007-12-06 12:04   ` 2.6.24-rc4-mm1 Andrew Morton
2007-12-06 19:18     ` 2.6.24-rc4-mm1 Valdis.Kletnieks
2007-12-06 19:38       ` 2.6.24-rc4-mm1 Greg KH
2007-12-06 20:04         ` 2.6.24-rc4-mm1 Valdis.Kletnieks
2007-12-06 22:04         ` [dm-devel] 2.6.24-rc4-mm1 Kay Sievers
2007-12-06 22:12           ` Alasdair G Kergon
2007-12-06 23:12           ` Valdis.Kletnieks
2007-12-06 23:24             ` Kay Sievers
2007-12-07 18:20               ` Valdis.Kletnieks
2007-12-07 18:44                 ` Kay Sievers
2007-12-07 20:28                   ` Valdis.Kletnieks
2007-12-07 20:49                     ` Kay Sievers
2007-12-06 22:28 ` 2.6.24-rc4-mm1: VDSOSYM build error Laurent Riffard
2007-12-06 22:37   ` Andrew Morton
2007-12-06 23:28     ` Miles Lane
2007-12-06 23:34       ` Andrew Morton
2007-12-06 23:47         ` Miles Lane
2007-12-07 10:36         ` Ingo Molnar
2007-12-07  1:14     ` [PATCH x86/mm] x86 vDSO: canonicalize sysenter .eh_frame Roland McGrath
2007-12-07  1:27       ` Harvey Harrison
2007-12-07  3:27         ` Miles Lane
2007-12-07  9:44       ` Ingo Molnar
2007-12-07  2:12 ` 2.6.24-rc4-mm1 Dave Young
2007-12-07 22:22   ` 2.6.24-rc4-mm1 Luis R. Rodriguez
2007-12-10  1:07     ` 2.6.24-rc4-mm1 Dave Young
2007-12-09 17:55   ` 2.6.24-rc4-mm1 Nick Kossifidis
     [not found] ` <E1J0Yfy-0001V7-Rr@localhost>
2007-12-07  8:35   ` [PATCH BUGFIX] hid: the `bit' in hidinput_mapping_quirks() is an out parameter Fengguang Wu
2007-12-10 10:03     ` Jiri Kosina
2007-12-07 14:34 ` broken suspend (sched related) [Was: 2.6.24-rc4-mm1] Jiri Slaby
2007-12-07 15:11   ` Ingo Molnar
2007-12-07 17:51     ` Ingo Molnar
2007-12-08  8:10       ` Jiri Slaby
2007-12-08  8:39         ` Ingo Molnar
2007-12-08  9:23           ` Jiri Slaby
2007-12-08 15:24             ` Ingo Molnar
2007-12-08 17:34               ` Jiri Slaby
2007-12-08 17:43               ` Jiri Slaby
2007-12-09  8:06                 ` Ingo Molnar
2007-12-08 23:12               ` Jiri Slaby
2007-12-09  7:46                 ` Ingo Molnar
2007-12-09  9:09                   ` Jiri Slaby
2007-12-10  8:19                   ` Gautham R Shenoy
2007-12-10  8:55                     ` Jiri Slaby
2007-12-10  9:10                       ` Ingo Molnar
2007-12-10 10:15                         ` Gautham R Shenoy [this message]
2007-12-10 10:21                           ` Ingo Molnar
2007-12-10 11:08                             ` Gautham R Shenoy
2007-12-10 11:28                               ` Ingo Molnar
2007-12-10 11:49                                 ` Gautham R Shenoy
2007-12-10  9:29                     ` Ingo Molnar
2007-12-07 18:20 ` [PATCH] md: balance braces in raid5 debug code Mariusz Kozlowski
2007-12-08  0:04 ` 2.6.24-rc4-mm1: undefined reference to `compat_sys_timerfd' on sparc64 Mariusz Kozlowski
2007-12-08  0:08   ` Andrew Morton
2007-12-08  9:17     ` Mariusz Kozlowski
2007-12-11 10:15     ` David Miller
2007-12-08 18:20 ` 2.6.24-rc4-mm1: some issues " Mariusz Kozlowski
2007-12-08 18:22   ` Andrew Morton
2007-12-09  8:45     ` David Miller
2007-12-09  9:03       ` Andrew Morton
2007-12-10 14:48 ` 2.6.24-rc4-mm1 Reuben Farrelly
2007-12-10 21:11   ` 2.6.24-rc4-mm1 Andrew Morton
2007-12-11 14:12     ` 2.6.24-rc4-mm1 Reuben Farrelly
2007-12-11 16:20 ` 2.6.24-rc4-mm1 Martin Bligh
2007-12-11 16:59   ` 2.6.24-rc4-mm1 Randy Dunlap
2007-12-11 17:50     ` 2.6.24-rc4-mm1 Martin Bligh
     [not found] ` <33307c790712110813h23def95dvd068b7226e9fcd36@mail.gmail.com>
2007-12-11 20:37   ` 2.6.24-rc4-mm1 Andrew Morton
2007-12-11 21:20     ` 2.6.24-rc4-mm1 Ingo Molnar
2007-12-11 21:26     ` 2.6.24-rc4-mm1 Kok, Auke
2007-12-11 21:59       ` 2.6.24-rc4-mm1 Kok, Auke
2007-12-11 22:10       ` 2.6.24-rc4-mm1 Andrew Morton
2007-12-11 22:17         ` 2.6.24-rc4-mm1 Kok, Auke
2007-12-11 23:15           ` 2.6.24-rc4-mm1 Randy Dunlap
2007-12-12  4:16 ` 2.6.24-rc4-mm1 Rik van Riel
2007-12-13 17:45 ` 2.6.24-rc4-mm1 - BUG in tcp_fragment Cedric Le Goater
2007-12-13 23:00   ` Ilpo Järvinen
2007-12-14  6:52     ` Cedric Le Goater
2007-12-14 20:14     ` [PATCH net-2.6.25] Revert recent TCP work Ilpo Järvinen
2007-12-16 22:21       ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20071210101500.GB12880@in.ibm.com \
    --to=ego@in.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@linux.intel.com \
    --cc=dipankar@in.ibm.com \
    --cc=jirislaby@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@lists.linux-foundation.org \
    --cc=mingo@elte.hu \
    --cc=rjw@sisk.pl \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).