linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch] fs, seq_file: fallback to vmalloc instead of oom kill processes
@ 2014-11-26 22:16 David Rientjes
  2014-11-26 22:24 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: David Rientjes @ 2014-11-26 22:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Heiko Carstens, Christoph Hellwig, Al Viro, linux-kernel

Since commit 058504edd026 ("fs/seq_file: fallback to vmalloc allocation"),
seq_buf_alloc() falls back to vmalloc() when the kmalloc() for contiguous
memory fails.  This was done to address order-4 slab allocations for
reading /proc/stat on large machines and noticed because
PAGE_ALLOC_COSTLY_ORDER < 4, so there is no infinite loop in the page
allocator when allocating new slab for such high-order allocations.

Contiguous memory isn't necessary for caller of seq_buf_alloc(), however.
Other GFP_KERNEL high-order allocations that are <=
PAGE_ALLOC_COSTLY_ORDER will simply loop forever in the page allocator
and oom kill processes as a result.

We don't want to kill processes so that we can allocate contiguous memory
in situations when contiguous memory isn't necessary.

This patch does the kmalloc() allocation with __GFP_NORETRY for
high-order allocations.  This still utilizes memory compaction and direct 
reclaim in the allocation path, the only difference is that it will fail 
immediately instead of oom kill processes when out of memory.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 fs/seq_file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -36,7 +36,7 @@ static void *seq_buf_alloc(unsigned long size)
 {
 	void *buf;
 
-	buf = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
+	buf = kmalloc(size, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
 	if (!buf && size > PAGE_SIZE)
 		buf = vmalloc(size);
 	return buf;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fs, seq_file: fallback to vmalloc instead of oom kill processes
  2014-11-26 22:16 [patch] fs, seq_file: fallback to vmalloc instead of oom kill processes David Rientjes
@ 2014-11-26 22:24 ` Andrew Morton
  2014-11-26 22:38   ` Joe Perches
  2014-11-26 22:40   ` David Rientjes
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Morton @ 2014-11-26 22:24 UTC (permalink / raw)
  To: David Rientjes; +Cc: Heiko Carstens, Christoph Hellwig, Al Viro, linux-kernel

On Wed, 26 Nov 2014 14:16:46 -0800 (PST) David Rientjes <rientjes@google.com> wrote:

> Since commit 058504edd026 ("fs/seq_file: fallback to vmalloc allocation"),
> seq_buf_alloc() falls back to vmalloc() when the kmalloc() for contiguous
> memory fails.  This was done to address order-4 slab allocations for
> reading /proc/stat on large machines and noticed because
> PAGE_ALLOC_COSTLY_ORDER < 4, so there is no infinite loop in the page
> allocator when allocating new slab for such high-order allocations.
> 
> Contiguous memory isn't necessary for caller of seq_buf_alloc(), however.
> Other GFP_KERNEL high-order allocations that are <=
> PAGE_ALLOC_COSTLY_ORDER will simply loop forever in the page allocator
> and oom kill processes as a result.
> 
> We don't want to kill processes so that we can allocate contiguous memory
> in situations when contiguous memory isn't necessary.
> 
> This patch does the kmalloc() allocation with __GFP_NORETRY for
> high-order allocations.  This still utilizes memory compaction and direct 
> reclaim in the allocation path, the only difference is that it will fail 
> immediately instead of oom kill processes when out of memory.
> 
> --- a/fs/seq_file.c
> +++ b/fs/seq_file.c
> @@ -36,7 +36,7 @@ static void *seq_buf_alloc(unsigned long size)
>  {
>  	void *buf;
>  
> -	buf = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
> +	buf = kmalloc(size, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
>  	if (!buf && size > PAGE_SIZE)
>  		buf = vmalloc(size);
>  	return buf;

You forgot something.

--- a/fs/seq_file.c~fs-seq_file-fallback-to-vmalloc-instead-of-oom-kill-processes-fix
+++ a/fs/seq_file.c
@@ -36,6 +36,10 @@ static void *seq_buf_alloc(unsigned long
 {
 	void *buf;
 
+	/*
+	 * __GFP_NORETRY to avoid oom-killings with high-order allocations -
+	 * it's better to fall back to vmalloc() than to kill things.
+	 */
 	buf = kmalloc(size, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
 	if (!buf && size > PAGE_SIZE)
 		buf = vmalloc(size);

Is __GFP_NORETRY our preferred way of preventing oom-killings?  If so,
it's a bit obscure - wouldn't it be better to create a
__GFP_NO_OOM_KILL?

There are eleventy billion places where we do the open coded
kmalloc-or-vmalloc thing.  Sigh.  Perhaps it is time to add a helper
function which does this, so that all such callers use
__GFP_NO_OOM_KILL.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fs, seq_file: fallback to vmalloc instead of oom kill processes
  2014-11-26 22:24 ` Andrew Morton
@ 2014-11-26 22:38   ` Joe Perches
  2014-11-26 22:40   ` David Rientjes
  1 sibling, 0 replies; 5+ messages in thread
From: Joe Perches @ 2014-11-26 22:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Rientjes, Heiko Carstens, Christoph Hellwig, Al Viro, linux-kernel

On Wed, 2014-11-26 at 14:24 -0800, Andrew Morton wrote:
> There are eleventy billion places where we do the open coded
> kmalloc-or-vmalloc thing.  Sigh.  Perhaps it is time to add a helper
> function which does this, so that all such callers use
> __GFP_NO_OOM_KILL.

That would go along nicely with kvfree() in mm/util.c

void kvfree(const void *addr)
{
	if (is_vmalloc_addr(addr))
		vfree(addr);
	else
		kfree(addr);
}
EXPORT_SYMBOL(kvfree);



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fs, seq_file: fallback to vmalloc instead of oom kill processes
  2014-11-26 22:24 ` Andrew Morton
  2014-11-26 22:38   ` Joe Perches
@ 2014-11-26 22:40   ` David Rientjes
  2014-11-26 22:48     ` Andrew Morton
  1 sibling, 1 reply; 5+ messages in thread
From: David Rientjes @ 2014-11-26 22:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Heiko Carstens, Christoph Hellwig, Al Viro, linux-kernel

On Wed, 26 Nov 2014, Andrew Morton wrote:

> You forgot something.
> 
> --- a/fs/seq_file.c~fs-seq_file-fallback-to-vmalloc-instead-of-oom-kill-processes-fix
> +++ a/fs/seq_file.c
> @@ -36,6 +36,10 @@ static void *seq_buf_alloc(unsigned long
>  {
>  	void *buf;
>  
> +	/*
> +	 * __GFP_NORETRY to avoid oom-killings with high-order allocations -
> +	 * it's better to fall back to vmalloc() than to kill things.
> +	 */
>  	buf = kmalloc(size, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
>  	if (!buf && size > PAGE_SIZE)
>  		buf = vmalloc(size);
> 

Thanks!

> Is __GFP_NORETRY our preferred way of preventing oom-killings?  If so,
> it's a bit obscure - wouldn't it be better to create a
> __GFP_NO_OOM_KILL?
> 

The slowpath tries to allocate, calls memory compaction if necessary, 
tries to allocate, calls direct reclaim, tries to allocate, call the oom 
killer and tries to allocate if we are going to loop, and then loop if 
allowed.  There's no need to try to allocate if we don't call the oom 
killer since it won't succeed and there's no need to call the oom killer 
to free memory if we aren't going to retry.

Even if __GFP_NO_OOM_KILL existed, it wouldn't be applicable to this 
patch: the change here is that seqfile will now return -ENOMEM instead of 
oom killing processes; the slab allocation will no longer loop forever 
trying to allocate memory.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [patch] fs, seq_file: fallback to vmalloc instead of oom kill processes
  2014-11-26 22:40   ` David Rientjes
@ 2014-11-26 22:48     ` Andrew Morton
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2014-11-26 22:48 UTC (permalink / raw)
  To: David Rientjes; +Cc: Heiko Carstens, Christoph Hellwig, Al Viro, linux-kernel

On Wed, 26 Nov 2014 14:40:06 -0800 (PST) David Rientjes <rientjes@google.com> wrote:

> On Wed, 26 Nov 2014, Andrew Morton wrote:
> 
> > You forgot something.
> > 
> > --- a/fs/seq_file.c~fs-seq_file-fallback-to-vmalloc-instead-of-oom-kill-processes-fix
> > +++ a/fs/seq_file.c
> > @@ -36,6 +36,10 @@ static void *seq_buf_alloc(unsigned long
> >  {
> >  	void *buf;
> >  
> > +	/*
> > +	 * __GFP_NORETRY to avoid oom-killings with high-order allocations -
> > +	 * it's better to fall back to vmalloc() than to kill things.
> > +	 */
> >  	buf = kmalloc(size, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
> >  	if (!buf && size > PAGE_SIZE)
> >  		buf = vmalloc(size);
> > 
> 
> ...
>
> The slowpath tries to allocate, calls memory compaction if necessary, 
> tries to allocate, calls direct reclaim, tries to allocate, call the oom 
> killer and tries to allocate if we are going to loop, and then loop if 
> allowed.  There's no need to try to allocate if we don't call the oom 
> killer since it won't succeed and there's no need to call the oom killer 
> to free memory if we aren't going to retry.

My concern is that an open-coded __GFP_NORETRY is very obscure.  Even
something like

#define __GFP_NO_OOM_KILL __GFP_NORETRY

would help make things a bit self-documenting.

> Even if __GFP_NO_OOM_KILL existed, it wouldn't be applicable to this 
> patch: the change here is that seqfile will now return -ENOMEM instead of 
> oom killing processes;

Not really?  The change makes seq_buf_alloc() fall back to vmalloc()
rather than killing things.  Doesn't it?

Unless the allocation is < PAGE_SIZE, in which case we do go ENOMEM. 
That's daft - it would be better to vmalloc() a whole page in this
case.  Not that the vmalloc is likely to be successful anyway..

> the slab allocation will no longer loop forever 
> trying to allocate memory.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-11-26 22:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-26 22:16 [patch] fs, seq_file: fallback to vmalloc instead of oom kill processes David Rientjes
2014-11-26 22:24 ` Andrew Morton
2014-11-26 22:38   ` Joe Perches
2014-11-26 22:40   ` David Rientjes
2014-11-26 22:48     ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).