linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable
Date: Wed, 17 Jan 2018 07:41:55 -0800	[thread overview]
Message-ID: <20180117154155.GU3460072@devbig577.frc2.facebook.com> (raw)
In-Reply-To: <alpine.DEB.2.10.1801161814130.28198@chino.kir.corp.google.com>

Hello, David.

On Tue, Jan 16, 2018 at 06:15:08PM -0800, David Rientjes wrote:
> The behavior of killing an entire indivisible memory consumer, enabled
> by memory.oom_group, is an oom policy itself.  It specifies that all

I thought we discussed this before but maybe I'm misremembering.
There are two parts to the OOM policy.  One is victim selection, the
other is the action to take thereafter.

The two are different and conflating the two don't work too well.  For
example, please consider what should be given to the delegatee when
delegating a subtree, which often is a good excercise when designing
these APIs.

When a given workload is selected for OOM kill (IOW, selected to free
some memory), whether the workload can handle individual process kills
or not is the property of the workload itself.  Some applications can
safely handle some of its processes picked off and killed.  Most
others can't and want to be handled as a single unit, which makes it a
property of the workload.

That makes sense in the hierarchy too because whether one process or
the whole workload is killed doesn't infringe upon the parent's
authority over resources which in turn implies that there's nothing to
worry about how the parent's groupoom setting should constrain the
descendants.

OOM victim selection policy is a different beast.  As you've mentioned
multiple times, especially if you're worrying about people abusing OOM
policies by creating sub-cgroups and so on, the policy, first of all,
shouldn't be delegatable and secondly should have meaningful
hierarchical restrictions so that a policy that an ancestor chose
can't be nullified by a descendant.

I'm not necessarily against adding hierarchical victim selection
policy tunables; however, I am skeptical whether static tunables on
cgroup hierarchy (including selectable policies) can be made clean and
versatile enough, especially because the resource hierarchy doesn't
necessarily, or rather in most cases, match the OOM victim selection
decision tree, but I'd be happy to be proven wrong.

Without explicit configurations, the only thing the OOM killer needs
to guarantee is that the system can make forward progress.  We've
always been tweaking victim selection with or without cgroup and
absolutely shouldn't be locked into a specific heuristics.  The
heuristics is an implementaiton detail subject to improvements.

To me, your patchset actually seems to demonstrate that these are
separate issues.  The goal of groupoom is just to kill logical units
as cgroup hierarchy can inform the kernel of how workloads are
composed in the userspace.  If you want to improve victim selection,
sure, please go ahead, but your argument that groupoom can't be merged
because of victim selection policy doesn't make sense to me.

Thanks.

-- 
tejun

  reply	other threads:[~2018-01-17 15:42 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-17  2:14 [patch -mm 0/4] mm, memcg: introduce oom policies David Rientjes
2018-01-17  2:15 ` [patch -mm 1/4] mm, memcg: introduce per-memcg oom policy tunable David Rientjes
2018-01-17  2:15 ` [patch -mm 2/4] mm, memcg: replace cgroup aware oom killer mount option with tunable David Rientjes
2018-01-17  2:15 ` [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable David Rientjes
2018-01-17 15:41   ` Tejun Heo [this message]
2018-01-17 16:00     ` Michal Hocko
2018-01-17 22:18       ` David Rientjes
2018-01-23 15:13         ` Michal Hocko
2018-01-17 22:14     ` David Rientjes
2018-01-19 20:53       ` David Rientjes
2018-01-20 12:32         ` Tejun Heo
2018-01-22 22:34           ` David Rientjes
2018-01-23 15:53             ` Michal Hocko
2018-01-23 22:22               ` David Rientjes
2018-01-24  8:20                 ` Michal Hocko
2018-01-24 21:44                   ` David Rientjes
2018-01-24 22:08                     ` Andrew Morton
2018-01-24 22:18                       ` Tejun Heo
2018-01-25  8:11                       ` Michal Hocko
2018-01-25  8:05                     ` Michal Hocko
2018-01-25 23:27                       ` David Rientjes
2018-01-26 10:07                         ` Michal Hocko
2018-01-26 22:33                           ` David Rientjes
2018-01-17  2:15 ` [patch -mm 4/4] mm, memcg: add hierarchical usage oom policy David Rientjes
2018-01-17 11:46 ` [patch -mm 0/4] mm, memcg: introduce oom policies Roman Gushchin
2018-01-17 22:31   ` David Rientjes
2018-01-25 23:53 ` [patch -mm v2 0/3] " David Rientjes
2018-01-25 23:53   ` [patch -mm v2 1/3] mm, memcg: introduce per-memcg oom policy tunable David Rientjes
2018-01-26 17:15     ` Michal Hocko
2018-01-29 22:38       ` David Rientjes
2018-01-30  8:50         ` Michal Hocko
2018-01-30 22:38           ` David Rientjes
2018-01-31  9:47             ` Michal Hocko
2018-02-01 10:11               ` David Rientjes
2018-01-25 23:53   ` [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable David Rientjes
2018-01-26  0:00     ` Andrew Morton
2018-01-26 22:20       ` David Rientjes
2018-01-26 22:39         ` Andrew Morton
2018-01-26 22:52           ` David Rientjes
2018-01-27  0:17             ` Andrew Morton
2018-01-29 10:46               ` Michal Hocko
2018-01-29 19:11                 ` Tejun Heo
2018-01-30  8:54                   ` Michal Hocko
2018-01-30 11:58                     ` Roman Gushchin
2018-01-30 12:08                       ` Michal Hocko
2018-01-30 12:13                         ` Roman Gushchin
2018-01-30 12:20                           ` Michal Hocko
2018-01-30 15:15                             ` Tejun Heo
2018-01-30 17:30                             ` Johannes Weiner
2018-01-30 19:39                             ` Andrew Morton
2018-01-29 22:16               ` David Rientjes
2018-01-25 23:53   ` [patch -mm v2 3/3] mm, memcg: add hierarchical usage oom policy David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180117154155.GU3460072@devbig577.frc2.facebook.com \
    --to=tj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=rientjes@google.com \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).