LKML Archive on
 help / color / Atom feed
From: Andi Kleen <>
	Andi Kleen <>
Subject: [PATCH] Allocate idle task for a CPU always on its local node
Date: Tue, 17 May 2016 06:44:54 -0700
Message-ID: <> (raw)

From: Andi Kleen <>

Linux pre-allocates the task structs of the idle tasks for all possible CPUs.
This currently means they all end up on node 0. This also implies
that the cache line of MWAIT, which is around the flags field in the task
struct, are all located in node 0.

We see a noticeable performance improvement on Knights Landing CPUs when
the cache lines used for MWAIT are located in the local nodes of the CPUs
using them. I would expect this to give a (likely slight) improvement
on other systems too.

The patch implements placing the idle task in the node of
its CPUs, by passing the right target node to copy_process()

Signed-off-by: Andi Kleen <>
 kernel/fork.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index d277e83..da449a9 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -340,13 +340,14 @@ void set_task_stack_end_magic(struct task_struct *tsk)
 	*stackend = STACK_END_MAGIC;	/* for overflow detection */
-static struct task_struct *dup_task_struct(struct task_struct *orig)
+static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	struct task_struct *tsk;
 	struct thread_info *ti;
-	int node = tsk_fork_get_node(orig);
 	int err;
+	if (node < 0)
+		node = tsk_fork_get_node(orig);
 	tsk = alloc_task_struct_node(node);
 	if (!tsk)
 		return NULL;
@@ -1256,7 +1257,8 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 					int __user *child_tidptr,
 					struct pid *pid,
 					int trace,
-					unsigned long tls)
+					unsigned long tls,
+					int node)
 	int retval;
 	struct task_struct *p;
@@ -1308,7 +1310,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 		goto fork_out;
 	retval = -ENOMEM;
-	p = dup_task_struct(current);
+	p = dup_task_struct(current, node);
 	if (!p)
 		goto fork_out;
@@ -1684,7 +1686,8 @@ static inline void init_idle_pids(struct pid_link *links)
 struct task_struct *fork_idle(int cpu)
 	struct task_struct *task;
-	task = copy_process(CLONE_VM, 0, 0, NULL, &init_struct_pid, 0, 0);
+	task = copy_process(CLONE_VM, 0, 0, NULL, &init_struct_pid, 0, 0,
+			    cpu_to_node(cpu));
 	if (!IS_ERR(task)) {
 		init_idle(task, cpu);
@@ -1729,7 +1732,7 @@ long _do_fork(unsigned long clone_flags,
 	p = copy_process(clone_flags, stack_start, stack_size,
-			 child_tidptr, NULL, trace, tls);
+			 child_tidptr, NULL, trace, tls, -1);
 	 * Do this prior waking up the new thread - the thread pointer
 	 * might get invalid after that point, if the thread exits quickly.

             reply index

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-17 13:44 Andi Kleen [this message]
2016-05-17 21:13 ` Andrew Morton
2016-05-17 21:50   ` Andi Kleen
2016-05-18  8:28 ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on

Archives are clonable:
	git clone --mirror lkml/git/0.git
	git clone --mirror lkml/git/1.git
	git clone --mirror lkml/git/2.git
	git clone --mirror lkml/git/3.git
	git clone --mirror lkml/git/4.git
	git clone --mirror lkml/git/5.git
	git clone --mirror lkml/git/6.git
	git clone --mirror lkml/git/7.git
	git clone --mirror lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ \
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone