From: Jeff Layton <jlayton@poochiereds.net>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Olaf Hering <olaf@aepfle.de>, Bruce Fields <bfields@fieldses.org>
Cc: Michal Hocko <mhocko@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Markus Trippelsdorf <markus@trippelsdorf.de>,
Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>,
Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com>,
Jiri Slaby <jslaby@suse.com>,
Greg KH <gregkh@linuxfoundation.org>,
Vlastimil Babka <vbabka@suse.cz>, Joonsoo Kim <js1304@gmail.com>,
linux-mm <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: OOM detection regressions since 4.7
Date: Mon, 29 Aug 2016 13:52:42 -0400
Message-ID: <1472493162.16070.10.camel@poochiereds.net>
In-Reply-To: <CA+55aFxbBszp+O9=9MrwXxp_fNw6xzNjQ0Kktm-8ipgqbido8w@mail.gmail.com>
On Mon, 2016-08-29 at 10:28 -0700, Linus Torvalds wrote:
> > On Mon, Aug 29, 2016 at 7:52 AM, Olaf Hering <olaf@aepfle.de> wrote:
> >
> >
> > Today I noticed the nfsserver was disabled, probably for months already.
> > Starting it gives an OOM, not sure if this is new with 4.7+.
>
> That's not an oom, that's just an allocation failure.
>
> And with order-4, that's actually pretty normal. Nobody should use
> order-4 (that's 16 contiguous pages, fragmentation can easily make
> that hard - *much* harder than the small order-1 or order-2 cases that
> we should largely be able to rely on).
>
> In fact, people who do multi-order allocations should always have a
> fallback, and use __GFP_NOWARN.
>
> >
> > [93348.306406] Call Trace:
> > [93348.306490]  [<ffffffff81198cef>] __alloc_pages_slowpath+0x1af/0xa10
> > [93348.306501]  [<ffffffff811997a0>] __alloc_pages_nodemask+0x250/0x290
> > [93348.306511]  [<ffffffff811f1c3d>] cache_grow_begin+0x8d/0x540
> > [93348.306520]  [<ffffffff811f23d1>] fallback_alloc+0x161/0x200
> > [93348.306530]  [<ffffffff811f43f2>] __kmalloc+0x1d2/0x570
> > [93348.306589]  [<ffffffffa08f025a>] nfsd_reply_cache_init+0xaa/0x110 [nfsd]
>
> Hmm. That's kmalloc itself falling back after already failing to grow
> the slab cache earlier (the earlier allocations *were* done with
> NOWARN afaik).
>
> It does look like nfsd starts out by allocating the hash table with one
> single fairly big allocation, and has no fallback position.
>
> I suspect the code expects to be started at boot time, when this just
> isn't an issue. The fact that you loaded the nfsd kernel module with
> memory already fragmented after heavy use is likely why nobody else
> has seen this.
>
> Adding the nfsd people to the cc, because just from a robustness
> standpoint I suspect it would be better if the code did something like
>
>  (a) shrink the hash table if the allocation fails (we've got some
> examples of that elsewhere)
>
> or
>
>  (b) fall back on a vmalloc allocation (that's certainly the simpler model)
>
> We do have a "kvfree()" helper function for the "free either a kmalloc
> or vmalloc allocation" but we don't actually have a good helper
> pattern for the allocation side. People just do it by hand, at least
> partly because we have so many different ways to allocate things -
> zeroing, non-zeroing, node-specific or not, atomic or not (atomic
> cannot fall back to vmalloc, obviously) etc etc.
>
> Bruce, Jeff, comments?
>
>              Linus

Yeah, that makes total sense.

Hmm...we _do_ already auto-size the hash at init time, so shrinking it
downward and retrying if the allocation fails wouldn't be hard to do.
Maybe I can just cut it in half and throw a pr_warn to tell the admin
in that case.

In any case...I'll take a look at how we can improve it.

Thanks for the heads-up!
-- 
Jeff Layton <jlayton@poochiereds.net>
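
For illustration only, the "cut it in half and retry" idea discussed above
could look roughly like the following userspace C sketch. It is not the nfsd
code: try_alloc(), max_contig_bytes, struct bucket and alloc_hash_table() are
hypothetical names made up for this example, calloc() stands in for a kernel
allocation, and the fprintf() stands in for the pr_warn mentioned in the
reply.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical limit that mimics a fragmented system: any request larger
 * than this many bytes is treated as a failed contiguous allocation. */
static size_t max_contig_bytes = 16 * 1024;

/* Hypothetical stand-in for a kernel allocation that may fail. */
static void *try_alloc(size_t bytes)
{
	if (bytes > max_contig_bytes)
		return NULL;
	return calloc(1, bytes);
}

/* Hypothetical hash bucket; the real layout does not matter here. */
struct bucket {
	void *head;
};

/*
 * Try to allocate a table of *nbuckets entries. On failure, halve the
 * size and retry until the size would drop below min_buckets. On success,
 * *nbuckets is updated to the size actually obtained. On total failure,
 * NULL is returned and *nbuckets is left unchanged.
 */
static struct bucket *alloc_hash_table(size_t *nbuckets, size_t min_buckets)
{
	size_t n = *nbuckets;

	if (min_buckets == 0)
		min_buckets = 1;	/* guarantees the loop terminates */

	while (n >= min_buckets) {
		struct bucket *tbl = try_alloc(n * sizeof(*tbl));

		if (tbl) {
			if (n != *nbuckets)
				fprintf(stderr,
					"hash table shrunk from %zu to %zu buckets\n",
					*nbuckets, n);
			*nbuckets = n;
			return tbl;
		}
		n /= 2;		/* allocation failed: try half the size */
	}
	return NULL;
}
```

A caller would pass its auto-sized bucket count by pointer, use the returned
count for hashing from then on, and release the table with free(). The
trade-off of this approach is longer hash chains after a shrink, which is
why warning the admin seems worthwhile.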