linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kthread: ensure locality of task_struct allocations
@ 2014-01-28 18:38 Nishanth Aravamudan
  2014-01-29  8:13 ` David Rientjes
  2014-01-29 15:57 ` Christoph Lameter
  0 siblings, 2 replies; 10+ messages in thread
From: Nishanth Aravamudan @ 2014-01-28 18:38 UTC (permalink / raw)
  To: LKML
  Cc: Anton Blanchard, Christoph Lameter, Andrew Morton, Tejun Heo,
	Oleg Nesterov, Jan Kara, David Rientjes, Thomas Gleixner,
	Tetsuo Handa, linux-kernel, linux-mm, Wanpeng Li, Joonsoo Kim,
	Ben Herrenschmidt

In the presence of memoryless nodes, numa_node_id()/cpu_to_node() will
return the current CPU's NUMA node, but that may not be where we expect
to allocate from memory from. Instead, we should use
numa_mem_id()/cpu_to_mem(). On one ppc64 system with a memoryless Node
0, this ends up saving nearly 500M of slab due to less fragmentation.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

diff --git a/kernel/kthread.c b/kernel/kthread.c
index b5ae3ee..8573e4e 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -217,7 +217,7 @@ int tsk_fork_get_node(struct task_struct *tsk)
 	if (tsk == kthreadd_task)
 		return tsk->pref_node_fork;
 #endif
-	return numa_node_id();
+	return numa_mem_id();
 }
 
 static void create_kthread(struct kthread_create_info *create)
@@ -369,7 +369,7 @@ struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
 {
 	struct task_struct *p;
 
-	p = kthread_create_on_node(threadfn, data, cpu_to_node(cpu), namefmt,
+	p = kthread_create_on_node(threadfn, data, cpu_to_mem(cpu), namefmt,
 				   cpu);
 	if (IS_ERR(p))
 		return p;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-28 18:38 [PATCH] kthread: ensure locality of task_struct allocations Nishanth Aravamudan
@ 2014-01-29  8:13 ` David Rientjes
  2014-01-29 15:58   ` Christoph Lameter
  2014-01-29 15:57 ` Christoph Lameter
  1 sibling, 1 reply; 10+ messages in thread
From: David Rientjes @ 2014-01-29  8:13 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: LKML, Anton Blanchard, Christoph Lameter, Andrew Morton,
	Tejun Heo, Oleg Nesterov, Jan Kara, Thomas Gleixner,
	Tetsuo Handa, linux-mm, Wanpeng Li, Joonsoo Kim,
	Ben Herrenschmidt

On Tue, 28 Jan 2014, Nishanth Aravamudan wrote:

> In the presence of memoryless nodes, numa_node_id()/cpu_to_node() will
> return the current CPU's NUMA node, but that may not be where we expect
> to allocate from memory from. Instead, we should use
> numa_mem_id()/cpu_to_mem(). On one ppc64 system with a memoryless Node
> 0, this ends up saving nearly 500M of slab due to less fragmentation.
> 
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>

> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index b5ae3ee..8573e4e 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -217,7 +217,7 @@ int tsk_fork_get_node(struct task_struct *tsk)
>  	if (tsk == kthreadd_task)
>  		return tsk->pref_node_fork;
>  #endif
> -	return numa_node_id();
> +	return numa_mem_id();

I'm wondering why return NUMA_NO_NODE wouldn't have the same effect and 
prefer the local node?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-28 18:38 [PATCH] kthread: ensure locality of task_struct allocations Nishanth Aravamudan
  2014-01-29  8:13 ` David Rientjes
@ 2014-01-29 15:57 ` Christoph Lameter
  1 sibling, 0 replies; 10+ messages in thread
From: Christoph Lameter @ 2014-01-29 15:57 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: LKML, Anton Blanchard, Andrew Morton, Tejun Heo, Oleg Nesterov,
	Jan Kara, David Rientjes, Thomas Gleixner, Tetsuo Handa,
	linux-mm, Wanpeng Li, Joonsoo Kim, Ben Herrenschmidt

On Tue, 28 Jan 2014, Nishanth Aravamudan wrote:

> In the presence of memoryless nodes, numa_node_id()/cpu_to_node() will
> return the current CPU's NUMA node, but that may not be where we expect
> to allocate from memory from. Instead, we should use
> numa_mem_id()/cpu_to_mem(). On one ppc64 system with a memoryless Node
> 0, this ends up saving nearly 500M of slab due to less fragmentation.

Reviewed-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-29  8:13 ` David Rientjes
@ 2014-01-29 15:58   ` Christoph Lameter
  2014-01-30  0:27     ` David Rientjes
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Lameter @ 2014-01-29 15:58 UTC (permalink / raw)
  To: David Rientjes
  Cc: Nishanth Aravamudan, LKML, Anton Blanchard, Andrew Morton,
	Tejun Heo, Oleg Nesterov, Jan Kara, Thomas Gleixner,
	Tetsuo Handa, linux-mm, Wanpeng Li, Joonsoo Kim,
	Ben Herrenschmidt

On Wed, 29 Jan 2014, David Rientjes wrote:

> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index b5ae3ee..8573e4e 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -217,7 +217,7 @@ int tsk_fork_get_node(struct task_struct *tsk)
> >  	if (tsk == kthreadd_task)
> >  		return tsk->pref_node_fork;
> >  #endif
> > -	return numa_node_id();
> > +	return numa_mem_id();
>
> I'm wondering why return NUMA_NO_NODE wouldn't have the same effect and
> prefer the local node?
>

The idea here seems to be that the allocation may occur from a cpu that is
different from where the process will run later on.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-29 15:58   ` Christoph Lameter
@ 2014-01-30  0:27     ` David Rientjes
  2014-01-30  6:14       ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: David Rientjes @ 2014-01-30  0:27 UTC (permalink / raw)
  To: Christoph Lameter, Eric Dumazet
  Cc: Nishanth Aravamudan, LKML, Anton Blanchard, Andrew Morton,
	Tejun Heo, Oleg Nesterov, Jan Kara, Thomas Gleixner,
	Tetsuo Handa, linux-mm, Wanpeng Li, Joonsoo Kim,
	Ben Herrenschmidt

On Wed, 29 Jan 2014, Christoph Lameter wrote:

> > > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > > index b5ae3ee..8573e4e 100644
> > > --- a/kernel/kthread.c
> > > +++ b/kernel/kthread.c
> > > @@ -217,7 +217,7 @@ int tsk_fork_get_node(struct task_struct *tsk)
> > >  	if (tsk == kthreadd_task)
> > >  		return tsk->pref_node_fork;
> > >  #endif
> > > -	return numa_node_id();
> > > +	return numa_mem_id();
> >
> > I'm wondering why return NUMA_NO_NODE wouldn't have the same effect and
> > prefer the local node?
> >
> 
> The idea here seems to be that the allocation may occur from a cpu that is
> different from where the process will run later on.
> 

Yeah, that makes sense for kthreadd, but I'm wondering why we have to 
return numa_mem_id() rather than just NUMA_NO_NODE.  Sorry for not being 
specific about doing s/numa_mem_id/NUMA_NO_NODE/ here.

That should just turn kmem_cache_alloc_node() into kmem_cache_alloc() and 
alloc_pages_node() into alloc_pages() for the allocators that use this 
return value, task_struct and thread_info.  If that's not allocating local 
memory, if possible, and numa_mem_id() magically does, then there's a 
problem.

Eric, did you try this when writing 207205a2ba26 ("kthread: NUMA aware 
kthread_create_on_node()") or was it always numa_node_id() from the 
beginning?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-30  0:27     ` David Rientjes
@ 2014-01-30  6:14       ` Eric Dumazet
  2014-01-30 22:47         ` David Rientjes
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2014-01-30  6:14 UTC (permalink / raw)
  To: David Rientjes
  Cc: Christoph Lameter, Eric Dumazet, Nishanth Aravamudan, LKML,
	Anton Blanchard, Andrew Morton, Tejun Heo, Oleg Nesterov,
	Jan Kara, Thomas Gleixner, Tetsuo Handa, linux-mm, Wanpeng Li,
	Joonsoo Kim, Ben Herrenschmidt

On Wed, 2014-01-29 at 16:27 -0800, David Rientjes wrote:

> Eric, did you try this when writing 207205a2ba26 ("kthread: NUMA aware 
> kthread_create_on_node()") or was it always numa_node_id() from the 
> beginning?

Hmm, I think I did not try this, its absolutely possible NUMA_NO_NODE
was better here.




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-30  6:14       ` Eric Dumazet
@ 2014-01-30 22:47         ` David Rientjes
  2014-01-30 23:08           ` Nishanth Aravamudan
  0 siblings, 1 reply; 10+ messages in thread
From: David Rientjes @ 2014-01-30 22:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Christoph Lameter, Eric Dumazet, Nishanth Aravamudan, LKML,
	Anton Blanchard, Andrew Morton, Tejun Heo, Oleg Nesterov,
	Jan Kara, Thomas Gleixner, Tetsuo Handa, linux-mm, Wanpeng Li,
	Joonsoo Kim, Ben Herrenschmidt

On Wed, 29 Jan 2014, Eric Dumazet wrote:

> > Eric, did you try this when writing 207205a2ba26 ("kthread: NUMA aware 
> > kthread_create_on_node()") or was it always numa_node_id() from the 
> > beginning?
> 
> Hmm, I think I did not try this, its absolutely possible NUMA_NO_NODE
> was better here.
> 

Nishanth, could you change your patch to just return NUMA_NO_NODE for the 
non-kthreadd case?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-30 22:47         ` David Rientjes
@ 2014-01-30 23:08           ` Nishanth Aravamudan
  2014-01-30 23:31             ` David Rientjes
  2014-01-31 15:14             ` Christoph Lameter
  0 siblings, 2 replies; 10+ messages in thread
From: Nishanth Aravamudan @ 2014-01-30 23:08 UTC (permalink / raw)
  To: David Rientjes
  Cc: Eric Dumazet, Christoph Lameter, Eric Dumazet, LKML,
	Anton Blanchard, Andrew Morton, Tejun Heo, Oleg Nesterov,
	Jan Kara, Thomas Gleixner, Tetsuo Handa, linux-mm, Wanpeng Li,
	Joonsoo Kim, Ben Herrenschmidt

On 30.01.2014 [14:47:05 -0800], David Rientjes wrote:
> On Wed, 29 Jan 2014, Eric Dumazet wrote:
> 
> > > Eric, did you try this when writing 207205a2ba26 ("kthread: NUMA aware 
> > > kthread_create_on_node()") or was it always numa_node_id() from the 
> > > beginning?
> > 
> > Hmm, I think I did not try this, its absolutely possible NUMA_NO_NODE
> > was better here.
> > 
> 
> Nishanth, could you change your patch to just return NUMA_NO_NODE for the 
> non-kthreadd case?

Something like the following?


In the presence of memoryless nodes, numa_node_id() will return the
current CPU's NUMA node, but that may not be where we expect to allocate
from memory from. Instead, we should rely on the fallback code in the
memory allocator itself, by using NUMA_NO_NODE. Also, when calling
kthread_create_on_node(), use the nearest node with memory to the cpu in
question, rather than the node it is running on.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-kernel@vger.kernel.org
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>

---
Note that I haven't yet tested this change on the system that reproduce
the original problem yet.

diff --git a/kernel/kthread.c b/kernel/kthread.c
index b5ae3ee..9a130ec 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -217,7 +217,7 @@ int tsk_fork_get_node(struct task_struct *tsk)
 	if (tsk == kthreadd_task)
 		return tsk->pref_node_fork;
 #endif
-	return numa_node_id();
+	return NUMA_NO_NODE;
 }
 
 static void create_kthread(struct kthread_create_info *create)
@@ -369,7 +369,7 @@ struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
 {
 	struct task_struct *p;
 
-	p = kthread_create_on_node(threadfn, data, cpu_to_node(cpu), namefmt,
+	p = kthread_create_on_node(threadfn, data, cpu_to_mem(cpu), namefmt,
 				   cpu);
 	if (IS_ERR(p))
 		return p;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-30 23:08           ` Nishanth Aravamudan
@ 2014-01-30 23:31             ` David Rientjes
  2014-01-31 15:14             ` Christoph Lameter
  1 sibling, 0 replies; 10+ messages in thread
From: David Rientjes @ 2014-01-30 23:31 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Eric Dumazet, Christoph Lameter, Eric Dumazet, LKML,
	Anton Blanchard, Andrew Morton, Tejun Heo, Oleg Nesterov,
	Jan Kara, Thomas Gleixner, Tetsuo Handa, linux-mm, Wanpeng Li,
	Joonsoo Kim, Ben Herrenschmidt

On Thu, 30 Jan 2014, Nishanth Aravamudan wrote:

> In the presence of memoryless nodes, numa_node_id() will return the
> current CPU's NUMA node, but that may not be where we expect to allocate
> from memory from. Instead, we should rely on the fallback code in the
> memory allocator itself, by using NUMA_NO_NODE. Also, when calling
> kthread_create_on_node(), use the nearest node with memory to the cpu in
> question, rather than the node it is running on.
> 
> Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] kthread: ensure locality of task_struct allocations
  2014-01-30 23:08           ` Nishanth Aravamudan
  2014-01-30 23:31             ` David Rientjes
@ 2014-01-31 15:14             ` Christoph Lameter
  1 sibling, 0 replies; 10+ messages in thread
From: Christoph Lameter @ 2014-01-31 15:14 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: David Rientjes, Eric Dumazet, Eric Dumazet, LKML,
	Anton Blanchard, Andrew Morton, Tejun Heo, Oleg Nesterov,
	Jan Kara, Thomas Gleixner, Tetsuo Handa, linux-mm, Wanpeng Li,
	Joonsoo Kim, Ben Herrenschmidt

On Thu, 30 Jan 2014, Nishanth Aravamudan wrote:

> In the presence of memoryless nodes, numa_node_id() will return the
> current CPU's NUMA node, but that may not be where we expect to allocate
> from memory from. Instead, we should rely on the fallback code in the
> memory allocator itself, by using NUMA_NO_NODE. Also, when calling
> kthread_create_on_node(), use the nearest node with memory to the cpu in
> question, rather than the node it is running on.

Looks good to me.

Reviewed-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-01-31 15:14 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-28 18:38 [PATCH] kthread: ensure locality of task_struct allocations Nishanth Aravamudan
2014-01-29  8:13 ` David Rientjes
2014-01-29 15:58   ` Christoph Lameter
2014-01-30  0:27     ` David Rientjes
2014-01-30  6:14       ` Eric Dumazet
2014-01-30 22:47         ` David Rientjes
2014-01-30 23:08           ` Nishanth Aravamudan
2014-01-30 23:31             ` David Rientjes
2014-01-31 15:14             ` Christoph Lameter
2014-01-29 15:57 ` Christoph Lameter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).