* [nacked] stop_machine-stalls-for-a-considerable-period-on-large-cpu-count-machines.patch removed from -mm tree
From: akpm @ 2009-07-02 7:34 UTC (permalink / raw)
To: holt, rusty, stable, travis, mm-commits
The patch titled
stop_machine() stalls for a considerable period on large cpu count machines
has been removed from the -mm tree. Its filename was
stop_machine-stalls-for-a-considerable-period-on-large-cpu-count-machines.patch
This patch was dropped because it was nacked.
The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/
------------------------------------------------------
Subject: stop_machine() stalls for a considerable period on large cpu count machines
From: Robin Holt <holt@sgi.com>
Mike Travis noted that a 2048 cpu machine booting would take hours to get
through its modprobes. We would get numerous back traces from stop_cpu
indicating they had not serviced interrupts.
A quick code review indicated we have a situation of heavy cacheline
contention due to the 'state' (read-mostly) and 'thread_ack'
(write-mostly) variables being located in the same cacheline.
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/stop_machine.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff -puN kernel/stop_machine.c~stop_machine-stalls-for-a-considerable-period-on-large-cpu-count-machines kernel/stop_machine.c
--- a/kernel/stop_machine.c~stop_machine-stalls-for-a-considerable-period-on-large-cpu-count-machines
+++ a/kernel/stop_machine.c
@@ -13,6 +13,13 @@
#include <asm/atomic.h>
#include <asm/uaccess.h>
+/*
+ * It is important to keep 'thread_ack' and 'state' in separate
+ * cachelines to prevent cacheline sharing between threads updating
+ * thread_ack and other threads spinning on state.
+ */
+static atomic_t thread_ack ____cacheline_aligned;
+
/* This controls the threads on each CPU. */
enum stopmachine_state {
/* Dummy starting state for thread. */
@@ -26,7 +33,7 @@ enum stopmachine_state {
/* Exit */
STOPMACHINE_EXIT,
};
-static enum stopmachine_state state;
+static enum stopmachine_state state ____cacheline_aligned;
struct stop_machine_data {
int (*fn)(void *);
@@ -36,7 +43,6 @@ struct stop_machine_data {
/* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */
static unsigned int num_threads;
-static atomic_t thread_ack;
static DEFINE_MUTEX(lock);
/* setup_lock protects refcount, stop_machine_wq and stop_machine_work. */
static DEFINE_MUTEX(setup_lock);
_
Patches currently in -mm which might be from holt@sgi.com are
stop_machine-stalls-for-a-considerable-period-on-large-cpu-count-machines.patch
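For readers unfamiliar with the cacheline contention described in the changelog, here is a minimal userspace sketch of the same idea. It is not the kernel code: it uses C11 alignas() in place of the kernel's ____cacheline_aligned, assumes a 64-byte cacheline (CACHELINE_SIZE is an assumption; real hardware varies), and the helpers cacheline_of() and in_separate_cachelines() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed cacheline size; not taken from the patch. */
#define CACHELINE_SIZE 64

/* alignas() requires a power-of-two alignment. */
static_assert((CACHELINE_SIZE & (CACHELINE_SIZE - 1)) == 0,
	      "cacheline size must be a power of two");

/*
 * Userspace stand-ins for the two variables named in the changelog.
 * Each one starts on its own cacheline boundary, so writers bumping
 * thread_ack do not invalidate the line that readers of state spin on.
 */
static alignas(CACHELINE_SIZE) atomic_int thread_ack;
static alignas(CACHELINE_SIZE) int state;

/* Hypothetical helper: index of the cacheline holding address p. */
static uintptr_t cacheline_of(const void *p)
{
	return (uintptr_t)p / CACHELINE_SIZE;
}

/* Hypothetical helper: 1 if the two variables are on different lines. */
static int in_separate_cachelines(void)
{
	return cacheline_of(&thread_ack) != cacheline_of(&state);
}
```

Calling in_separate_cachelines() returns 1 here because both objects are aligned to CACHELINE_SIZE and are distinct. Note that alignment only fixes where each variable starts; an unrelated variable may still be placed later in the same line unless it is padded out as well.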