From: David Rientjes <rientjes@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>,
	linux-mm@vger.kernel.org, Michal Hocko <mhocko@suse.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Tejun Heo <tj@kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v13 0/7] cgroup-aware OOM killer
Date: Sun, 14 Jan 2018 15:44:09 -0800 (PST)
Message-ID: <alpine.DEB.2.10.1801141536380.131380@chino.kir.corp.google.com>
In-Reply-To: <20180113171432.GA23484@cmpxchg.org>

On Sat, 13 Jan 2018, Johannes Weiner wrote:

> You don't have any control and no accounting of the stuff situated
> inside the root cgroup, so it doesn't make sense to leave anything in
> there while also using sophisticated containerization mechanisms like
> this group oom setting.
> 
> In fact, the laptop I'm writing this email on runs an unmodified
> mainstream Linux distribution. The only things in the root cgroup are
> kernel threads.
> 
> The decisions are good enough for the rare cases you forget something
> in there and it explodes.
> 

It's quite trivial to allow the root mem cgroup to be compared exactly the
same as any other cgroup.  Please see
https://marc.info/?l=linux-kernel&m=151579459920305.

> This assumes you even need one. Right now, the OOM killer picks the
> biggest MM, so you can evade selection by forking your MM. This patch
> allows picking the biggest cgroup, so you can evade by forking groups.
> 

It's quite trivial to prevent any cgroup from evading the oom killer by 
either forking their mm or attaching all their processes to subcontainers.  
Please see https://marc.info/?l=linux-kernel&m=151579459920305.

> It's not a new vector, and clearly nobody cares. This has never been
> brought up against the current design that I know of.
> 

As cgroup v2 becomes more popular, people will organize their cgroup 
hierarchies for all controllers they need to use.  We do this today, for 
example, by attaching some individual consumers to child mem cgroups 
purely for the rich statistics and vmscan stats that mem cgroup provides 
without any limitation on those cgroups.

> Note, however, that there actually *is* a way to guard against it: in
> cgroup2 there is a hierarchical limit you can configure for the number
> of cgroups that are allowed to be created in the subtree. See
> 1a926e0bbab8 ("cgroup: implement hierarchy limits").
> 

Preventing users from creating subcontainers to track statistics, just to
paper over an obvious and acknowledged shortcoming in the design of the
cgroup-aware oom killer, seems like a pretty nasty shortcoming itself.

> It could be useful, but we have no consensus on the desired
> semantics. And it's not clear why we couldn't add it later as long as
> the default settings of a new knob maintain the default behavior
> (which would have to be preserved anyway, since we rely on it).
>

The active proposal is
https://marc.info/?l=linux-kernel&m=151579459920305, which describes an
extensible interface that covers all the shortcomings of this patchset
without polluting the mem cgroup filesystem.  The default oom policy in
that proposal would be "none", i.e. we do what we do today, based on
process usage.  You can configure that policy for local or hierarchical
cgroup targeting without the mount option this patchset introduces.
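
To make the difference between local and hierarchical cgroup targeting
concrete, here is a minimal Python sketch.  It is only an illustration
under stated assumptions, not kernel code and not the interface from
either proposal: the Cgroup class, the helper names, and the usage
numbers are all hypothetical.  It models a workload "B" that has spread
its memory across four subcontainers next to a single cgroup "A".

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Cgroup:
    """Hypothetical stand-in for a mem cgroup; not a kernel structure."""
    name: str
    usage: int = 0  # memory charged directly to this group (arbitrary units)
    children: List["Cgroup"] = field(default_factory=list)

    def total(self) -> int:
        """Usage of this group plus all of its descendants."""
        return self.usage + sum(child.total() for child in self.children)

    def leaves(self) -> Iterator["Cgroup"]:
        """Yield every cgroup in this subtree that has no children."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.leaves()


def pick_local(root: Cgroup) -> Cgroup:
    """Local targeting: compare leaf cgroups on their own usage only."""
    return max(root.leaves(), key=lambda cg: cg.usage)


def pick_hierarchical(root: Cgroup) -> Cgroup:
    """Hierarchical targeting: at each level, descend into the child whose
    whole subtree uses the most memory, until a leaf is reached."""
    node = root
    while node.children:
        node = max(node.children, key=Cgroup.total)
    return node


def build_example() -> Cgroup:
    """Assumed layout: A is one cgroup using 600; B uses 1000 in total,
    split evenly across four subcontainers of 250 each."""
    a = Cgroup("A", usage=600)
    b = Cgroup("B", children=[Cgroup(f"B/b{i}", usage=250) for i in range(1, 5)])
    return Cgroup("root", children=[a, b])


if __name__ == "__main__":
    root = build_example()
    print("local pick:       ", pick_local(root).name)
    print("hierarchical pick:", pick_hierarchical(root).name)
```

In this made-up layout the local comparison selects A even though B as a
whole uses more memory, which is the "evade by forking groups" case
discussed above; the level-by-level comparison descends into B instead.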
 
> > > > I proposed a solution in 
> > > > https://marc.info/?l=linux-kernel&m=150956897302725, which was never 
> > > > responded to, for all of these issues.  The idea is to do hierarchical 
> > > > accounting of mem cgroup hierarchies so that the hierarchy is traversed 
> > > > comparing total usage at each level to select target cgroups.  Admins and 
> > > > users can use memory.oom_score_adj to influence that decisionmaking at 
> > > > each level.
> 
> We did respond repeatedly: this doesn't work for a lot of setups.
> 

We need to move this discussion to the active proposal at
https://marc.info/?l=linux-kernel&m=151579459920305, because it does
address your setup, so it's not a good use of anyone's time to further
discuss memory.oom_score_adj alone.

Thanks.

