From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751651AbcEQVNM (ORCPT ); Tue, 17 May 2016 17:13:12 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:37590 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750875AbcEQVNK (ORCPT );
	Tue, 17 May 2016 17:13:10 -0400
Date: Tue, 17 May 2016 14:13:08 -0700
From: Andrew Morton 
To: Andi Kleen 
Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org, Andi Kleen 
Subject: Re: [PATCH] Allocate idle task for a CPU always on its local node
Message-Id: <20160517141308.8a2df2028d19d375d3660e1f@linux-foundation.org>
In-Reply-To: <1463492694-15833-1-git-send-email-andi@firstfloor.org>
References: <1463492694-15833-1-git-send-email-andi@firstfloor.org>
X-Mailer: Sylpheed 3.4.1 (GTK+ 2.24.23; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 17 May 2016 06:44:54 -0700 Andi Kleen wrote:

> From: Andi Kleen 
> 
> Linux pre-allocates the task structs of the idle tasks for all possible CPUs.
> This currently means they all end up on node 0. This also implies
> that the cache line of MWAIT, which is around the flags field in the task
> struct, are all located in node 0.
> 
> We see a noticeable performance improvement on Knights Landing CPUs when
> the cache lines used for MWAIT are located in the local nodes of the CPUs
> using them. I would expect this to give a (likely slight) improvement
> on other systems too.
> 
> The patch implements placing the idle task in the node of
> its CPUs, by passing the right target node to copy_process()
> 

Looks nice.

This is nicer ;)


From: Andrew Morton 
Subject: allocate-idle-task-for-a-cpu-always-on-its-local-node-fix

use NUMA_NO_NODE, not a bare -1

Cc: Andi Kleen 
Cc: Thomas Gleixner 
Signed-off-by: Andrew Morton 
---

 kernel/fork.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN kernel/fork.c~allocate-idle-task-for-a-cpu-always-on-its-local-node-fix kernel/fork.c
--- a/kernel/fork.c~allocate-idle-task-for-a-cpu-always-on-its-local-node-fix
+++ a/kernel/fork.c
@@ -346,7 +346,7 @@ static struct task_struct *dup_task_stru
 	struct thread_info *ti;
 	int err;
 
-	if (node < 0)
+	if (node == NUMA_NO_NODE)
 		node = tsk_fork_get_node(orig);
 	tsk = alloc_task_struct_node(node);
 	if (!tsk)
@@ -1754,7 +1754,7 @@ long _do_fork(unsigned long clone_flags,
 	}
 
 	p = copy_process(clone_flags, stack_start, stack_size,
-			 child_tidptr, NULL, trace, tls, -1);
+			 child_tidptr, NULL, trace, tls, NUMA_NO_NODE);
 	/*
 	 * Do this prior waking up the new thread - the thread pointer
 	 * might get invalid after that point, if the thread exits quickly.
_
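
For anyone reading this outside the kernel tree, the sentinel pattern the fix relies on can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: NUMA_NO_NODE is redefined locally with the kernel's value of -1, and default_fork_node() and resolve_alloc_node() are hypothetical names standing in for tsk_fork_get_node() and the check in dup_task_struct().

```c
#include <assert.h>

/* Same value the kernel uses; redefined here so the sketch builds in userspace. */
#define NUMA_NO_NODE (-1)

/*
 * Hypothetical stand-in for tsk_fork_get_node(): in this sketch the
 * default is simply the node the parent runs on.
 */
int default_fork_node(int parent_node)
{
	return parent_node;
}

/*
 * Hypothetical helper mirroring the check in dup_task_struct():
 * an explicit node (e.g. the local node of the CPU an idle task is
 * created for) is used as-is, while NUMA_NO_NODE means "no preference"
 * and falls back to the default.
 */
int resolve_alloc_node(int requested_node, int parent_node)
{
	int node = requested_node;

	if (node == NUMA_NO_NODE)
		node = default_fork_node(parent_node);

	/* After resolution the node must be a real node id. */
	assert(node >= 0);
	return node;
}
```

An ordinary fork would call resolve_alloc_node(NUMA_NO_NODE, parent_node), while the idle-task path would pass the target CPU's node explicitly. Comparing against the named constant rather than testing node < 0 makes the "no preference" case explicit at both the caller and the callee.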