linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Michal Hocko <mhocko@suse.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Tejun Heo <tj@kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Roman Gushchin <guro@fb.com>
Subject: Re: [RESEND v12 0/6] cgroup-aware OOM killer
Date: Sun, 22 Oct 2017 17:24:51 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.10.1710221715010.70210@chino.kir.corp.google.com> (raw)
In-Reply-To: <20171019194534.GA5502@cmpxchg.org>

On Thu, 19 Oct 2017, Johannes Weiner wrote:

> David would have really liked for this patchset to include knobs to
> influence how the algorithm picks cgroup victims. The rest of us
> agreed that this is beyond the scope of these patches, that the
> patches don't need it to be useful, and that there is nothing
> preventing anyone from adding configurability later on. David
> subsequently nacked the series as he considers it incomplete. Neither
> Michal nor I see technical merit in David's nack.
> 

The nack is for three reasons:

 (1) unfair comparison of root mem cgroup usage to bias against that mem 
     cgroup from oom kill in system oom conditions,

 (2) the ability of users to completely evade the oom killer by attaching
     all processes to child cgroups either purposefully or unpurposefully,
     and

 (3) the inability of userspace to effectively control oom victim  
     selection.

For (1), the difference in v12 is adding the rss of all processes attached 
to the root mem cgroup as the evaluation.  This is not the same criteria 
that child cgroups are evaluated on, and they are compared using that 
bias.  It is very trivial to provide a fair comparison as was suggested in 
v11.

For (2), users who do

	for i in $(cat cgroup.procs); do mkdir $i; echo $i > $i/cgroup.procs; done

can completely evade the oom killer and this may be part of their standard 
operating procedure for restricting resources with other cgroups other 
than the mem cgroup.  Again, it's an unfair comparison to all other 
cgroups on the system.

For (3), users need the ability to protect important cgroups, such as 
protecting a cgroup that is limited to 50% of system memory.  They need 
the ability to kill processes from other cgroups to protect these 
important processes.  This is nothing new: the oom killer has always 
provided the ability to bias against important processes.

There was follow-up email on all of these points where very trivial 
changes were suggested to address all three of these issues, and which 
Roman has implemented in one form or another in previous iterations with 
the bonus that no accounting to the root mem cgroup needs to be done.

> Michal acked the implementation, but on the condition that the new
> behavior be opt-in, to not surprise existing users. I *think* we agree
> that respecting the cgroup topography during global OOM is what we
> should have been doing when cgroups were initially introduced; where
> we disagree is that I think users shouldn't have to opt in to
> improvements. We have done much more invasive changes to the victim
> selection without actual regressions in the past. Further, this change
> only applies to mounts of the new cgroup2. Tejun also wasn't convinced
> of the risk for regression, and too would prefer cgroup-awareness to
> be the default in cgroup2. I would ask for patch 5/6 to be dropped.
> 

I agree with Michal that the new victim selection should be opt-in with 
CGRP_GROUP_OOM.

  parent reply	other threads:[~2017-10-23  0:24 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-19 18:52 [RESEND v12 0/6] cgroup-aware OOM killer Roman Gushchin
2017-10-19 18:52 ` [RESEND v12 1/6] mm, oom: refactor the oom_kill_process() function Roman Gushchin
2017-10-19 18:52 ` [RESEND v12 2/6] mm: implement mem_cgroup_scan_tasks() for the root memory cgroup Roman Gushchin
2017-10-19 18:52 ` [RESEND v12 3/6] mm, oom: cgroup-aware OOM killer Roman Gushchin
2017-10-19 19:30   ` Michal Hocko
2017-10-31 15:04   ` Shakeel Butt
2017-10-31 15:29     ` Michal Hocko
2017-10-31 19:06       ` Michal Hocko
2017-10-31 19:13         ` Michal Hocko
2017-10-31 16:40     ` Johannes Weiner
2017-10-31 17:50       ` Shakeel Butt
2017-10-31 18:44         ` Johannes Weiner
2017-10-19 18:52 ` [RESEND v12 4/6] mm, oom: introduce memory.oom_group Roman Gushchin
2017-10-19 18:52 ` [RESEND v12 5/6] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer Roman Gushchin
2017-10-19 18:52 ` [RESEND v12 6/6] mm, oom, docs: describe the " Roman Gushchin
2017-10-19 19:45 ` [RESEND v12 0/6] " Johannes Weiner
2017-10-19 21:09   ` Michal Hocko
2017-10-23  0:24   ` David Rientjes [this message]
2017-10-23 11:49     ` Michal Hocko
2017-10-25 20:12       ` David Rientjes
2017-10-26 14:24     ` Johannes Weiner
2017-10-26 21:03       ` David Rientjes
2017-10-27  9:31         ` Roman Gushchin
2017-10-30 21:36           ` David Rientjes
2017-10-31  7:54             ` Michal Hocko
2017-10-31 22:21               ` David Rientjes
2017-11-01  7:37                 ` Michal Hocko
2017-11-01 20:42                   ` David Rientjes
2017-10-27 20:05         ` Johannes Weiner
2017-10-31 14:17           ` peter enderborg
2017-10-31 14:34             ` Michal Hocko
2017-10-31 15:07               ` peter enderborg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.10.1710221715010.70210@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=tj@kernel.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).