* [PATCH] sched: fix the endless sync_sched/rcu() inside _cpu_down()
From: Michael wang @ 2013-11-13  3:10 UTC
  To: Peter Zijlstra; +Cc: Ingo Molnar, Fengguang Wu, LKML


Commit 6acce3ef8:

	sched: Remove get_online_cpus() usage

tries to do sync_sched/rcu() inside _cpu_down() but triggers:

	INFO: task swapper/0:1 blocked for more than 120 seconds.
	...
	[<ffffffff811263dc>] synchronize_rcu+0x2c/0x30
	[<ffffffff81d1bd82>] _cpu_down+0x2b2/0x340
	...

This happens because, in the rcu boost case, we rely on the smpboot thread
to finish the rcu callback, but that thread has already been parked before
the sync here, which leads to the endless sync_sched/rcu().

This patch swaps the order of smpboot_park_threads() and
sync_sched/rcu() to fix the bug.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
---
 kernel/cpu.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 63aa50d..2227b58 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -306,7 +306,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 				__func__, cpu);
 		goto out_release;
 	}
-	smpboot_park_threads(cpu);
 
 	/*
 	 * By now we've cleared cpu_active_mask, wait for all preempt-disabled
@@ -315,12 +314,16 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 	 *
 	 * For CONFIG_PREEMPT we have preemptible RCU and its sync_rcu() might
 	 * not imply sync_sched(), so explicitly call both.
+	 *
+	 * Do sync before park smpboot threads to take care the rcu boost case.
 	 */
 #ifdef CONFIG_PREEMPT
 	synchronize_sched();
 #endif
 	synchronize_rcu();
 
+	smpboot_park_threads(cpu);
+
 	/*
 	 * So now all preempt/rcu users must observe !cpu_active().
 	 */
-- 
1.7.9.5
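
The ordering problem described in the changelog can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct cpu_model and every model_* function are hypothetical names invented for illustration, and the "wait" is reduced to a boolean that reports whether the sync would ever complete, so that the old ordering can be shown without actually hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the changelog talks about (not kernel code). */
struct cpu_model {
	bool helper_parked;	/* stands in for the parked smpboot thread */
	int pending_callbacks;	/* work that only the helper thread can finish */
};

/* Stand-in for smpboot_park_threads(): the helper stops running callbacks. */
static void model_park_helper(struct cpu_model *m)
{
	assert(m);
	m->helper_parked = true;
}

/*
 * Stand-in for sync_sched/rcu(). Returns true if the wait would complete,
 * false if it would block forever because the only thread able to finish
 * the pending callbacks is already parked.
 */
static bool model_synchronize(struct cpu_model *m)
{
	assert(m);
	if (m->pending_callbacks > 0) {
		if (m->helper_parked)
			return false;
		m->pending_callbacks = 0;	/* helper finishes the callbacks */
	}
	return true;
}

/* Ordering before the patch: park first, then sync. */
static bool model_cpu_down_old(struct cpu_model *m)
{
	model_park_helper(m);
	return model_synchronize(m);
}

/* Ordering after the patch: sync first, then park. */
static bool model_cpu_down_fixed(struct cpu_model *m)
{
	bool done = model_synchronize(m);

	model_park_helper(m);
	return done;
}
```

Starting from a model with one pending callback and an unparked helper, model_cpu_down_old() reports that the sync never completes, while model_cpu_down_fixed() completes and still leaves the helper parked at the end, which is the reordering the diff above performs.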


* [tip:sched/urgent] sched: Fix endless sync_sched/rcu() loop inside _cpu_down()
From: tip-bot for Michael wang @ 2013-11-13 17:25 UTC
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, peterz, wangyun, tglx, fengguang.wu

Commit-ID:  106dd5afde3cd10db7e1370b6ddc77f0b2496a75
Gitweb:     http://git.kernel.org/tip/106dd5afde3cd10db7e1370b6ddc77f0b2496a75
Author:     Michael wang <wangyun@linux.vnet.ibm.com>
AuthorDate: Wed, 13 Nov 2013 11:10:56 +0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 13 Nov 2013 13:33:50 +0100

sched: Fix endless sync_sched/rcu() loop inside _cpu_down()

Commit 6acce3ef8:

	sched: Remove get_online_cpus() usage

tries to do sync_sched/rcu() inside _cpu_down() but triggers:

	INFO: task swapper/0:1 blocked for more than 120 seconds.
	...
	[<ffffffff811263dc>] synchronize_rcu+0x2c/0x30
	[<ffffffff81d1bd82>] _cpu_down+0x2b2/0x340
	...

This happens because, in the rcu boost case, we rely on the smpboot thread
to finish the rcu callback, but that thread has already been parked before
the sync here, which leads to the endless sync_sched/rcu().

This patch swaps the order of smpboot_park_threads() and
sync_sched/rcu() to fix the bug.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5282EDC0.6060003@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/cpu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 63aa50d..2227b58 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -306,7 +306,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 				__func__, cpu);
 		goto out_release;
 	}
-	smpboot_park_threads(cpu);
 
 	/*
 	 * By now we've cleared cpu_active_mask, wait for all preempt-disabled
@@ -315,12 +314,16 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 	 *
 	 * For CONFIG_PREEMPT we have preemptible RCU and its sync_rcu() might
 	 * not imply sync_sched(), so explicitly call both.
+	 *
+	 * Do sync before park smpboot threads to take care the rcu boost case.
 	 */
 #ifdef CONFIG_PREEMPT
 	synchronize_sched();
 #endif
 	synchronize_rcu();
 
+	smpboot_park_threads(cpu);
+
 	/*
 	 * So now all preempt/rcu users must observe !cpu_active().
 	 */
