From: Michal Hocko <mhocko@kernel.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Florian Westphal <fw@strlen.de>,
	Georgi Nikolov <gnikolov@icdsoft.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	bugzilla-daemon@bugzilla.kernel.org, linux-mm@kvack.org,
	netfilter-devel@vger.kernel.org
Subject: Re: [Bug 200651] New: cgroups iptables-restor: vmalloc: allocation failure
Date: Wed, 1 Aug 2018 10:33:49 +0200
Message-ID: <20180801083349.GF16767@dhcp22.suse.cz>
In-Reply-To: <e5b24629-0296-5a4d-577a-c25d1c52b03b@suse.cz>

On Wed 01-08-18 09:34:23, Vlastimil Babka wrote:
> On 07/31/2018 04:05 PM, Florian Westphal wrote:
> > Georgi Nikolov <gnikolov@icdsoft.com> wrote:
> >>> No, I think that's rather for the netfilter folks to decide. However, it
> >>> seems there has been the debate already [1] and it was not found. The
> >>> conclusion was that __GFP_NORETRY worked fine before, so it should work
> >>> again after it's added back. But now we know that it doesn't...
> >>>
> >>> [1] https://lore.kernel.org/lkml/20180130140104.GE21609@dhcp22.suse.cz/T/#u
> >>
> >> Yes, I see. I will add Florian Westphal to the CC list. netfilter-devel is
> >> already on this list, so we probably have to wait for their opinion.
> > 
> > It hasn't changed, I think having OOM killer zap random processes
> > just because userspace wants to import large iptables ruleset is not a
> > good idea.
> 
> If we denied the allocation instead of OOM (e.g. by using
> __GFP_RETRY_MAYFAIL), a slightly smaller one may succeed, still leaving
> the system without much memory, so it will invoke OOM killer sooner or
> later anyway.
> 
> I don't see any silver-bullet solution, unfortunately. If this can be
> abused by (multiple) namespaces, then they have to be contained by
> kmemcg as that's the generic mechanism intended for this. Then we could
> use the __GFP_RETRY_MAYFAIL.
> The only limit we could impose to outright deny the allocation (to
> prevent obvious bugs/admin mistakes or abuses) could be based on the
> amount of RAM, as was suggested in the old thread.
> 
> __GFP_NORETRY might look like a good match at first sight as that stops
> allocating when "reclaim becomes hard" which means the system is still
> relatively far from OOM. But it's not reliable in principle, as this
> bug report shows. That's fine when __GFP_NORETRY is used for optimistic
> allocations that have some other fallback (e.g. huge page with fallback
> to base page), but far from ideal when failure means returning -ENOMEM
> to userspace.

I absolutely agree. The whole __GFP_NORETRY is quite dubious TBH. I have
used it to get the original behavior because the change wasn't really
intended to make functional changes. But considering this requires
higher privileges, I fail to see where the distrust comes from. If
this is really about untrusted root in a namespace then the proper way
is to use __GFP_ACCOUNT and limit that via kmemcg.
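
To illustrate the idea of charging an allocation to a group and refusing
it once the group's budget is exhausted, here is a minimal userspace
sketch. It is only a toy model, not kernel code: struct toy_memcg,
toy_alloc_accounted() and toy_free_accounted() are hypothetical names
invented for this example, and the real kmemcg accounting is far more
involved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a kmemcg: a byte budget per group. */
struct toy_memcg {
	size_t limit;	/* maximum bytes the group may hold */
	size_t usage;	/* bytes currently charged; always <= limit */
};

/*
 * Charge `size` bytes to `cg`, then allocate. A request that would push
 * the group over its limit fails with ENOMEM and leaves usage unchanged,
 * so only the requesting group is affected.
 */
static void *toy_alloc_accounted(struct toy_memcg *cg, size_t size)
{
	void *p;

	assert(cg != NULL);
	if (size > cg->limit - cg->usage) {
		errno = ENOMEM;
		return NULL;
	}
	p = malloc(size);
	if (p == NULL) {
		errno = ENOMEM;
		return NULL;
	}
	cg->usage += size;
	return p;
}

/* Release memory and return its charge to the group. */
static void toy_free_accounted(struct toy_memcg *cg, void *p, size_t size)
{
	assert(cg != NULL && cg->usage >= size);
	free(p);
	cg->usage -= size;
}
```

With a 1024-byte limit, a 1000-byte request succeeds and a following
100-byte request is refused with ENOMEM until the first one is freed.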

__GFP_NORETRY can fail really easily if kswapd doesn't keep pace with
the allocations, which might be completely unrelated to this
particular request.
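
In contrast to a flag whose outcome depends on unrelated memory
pressure, the RAM-based limit mentioned in the quoted text is a
deterministic check made before allocating. A minimal userspace sketch
follows. The function name toy_exceeds_ram() and the threshold (refuse
anything needing more pages than the machine has in total) are
assumptions made for illustration, not the policy agreed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if a request of `size` bytes should be refused up front.
 * Assumption: "too large" means the request needs more pages than
 * `total_pages`, the total number of RAM pages in the machine.
 */
static bool toy_exceeds_ram(size_t size, size_t page_size, size_t total_pages)
{
	size_t pages;

	assert(page_size > 0);
	/* Round the byte count up to whole pages. */
	pages = size / page_size + (size % page_size != 0);
	return pages > total_pages;
}
```

On Linux with glibc, the two inputs could be taken from
sysconf(_SC_PAGESIZE) and sysconf(_SC_PHYS_PAGES). A check like this
only catches obvious mistakes. It says nothing about how much memory is
actually free.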
-- 
Michal Hocko
SUSE Labs
