linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	rientjes@google.com, hannes@cmpxchg.org, tj@kernel.org,
	gthelen@google.com
Subject: Re: cgroup-aware OOM killer, how to move forward
Date: Thu, 12 Jul 2018 14:07:03 +0200
Message-ID: <20180712120703.GJ32648@dhcp22.suse.cz>
In-Reply-To: <20180711223959.GA13981@castle.DHCP.thefacebook.com>

On Wed 11-07-18 15:40:03, Roman Gushchin wrote:
> Hello!
> 
> I was thinking on how to move forward with the cgroup-aware OOM killer.
> It looks to me, that we all agree on the "cleanup" part of the patchset:
> it's a nice feature to be able to kill all tasks in the cgroup
> to guarantee the consistent state of the workload.
> All our disagreements are related to the victim selection algorithm.
> 
> So, I wonder, if the right thing to do is to split the problem.
> We can agree on the "cleanup" part, which is useful by itself,
> merge it upstream, and then return to the victim selection
> algorithm.

Could you be more specific about which patches those are, please?

> So, here is my proposal:
> let's introduce the memory.group_oom knob with the following semantics:
> if the knob is set, the OOM killer can kill either none, either all
> tasks in the cgroup*.
> It can perfectly work with the current OOM killer (as a "cleanup" option),
> and allows _any_ further approach on the OOM victim selection.
> It also doesn't require any mount/boot/tree-wide options.
> 
> How does it sound?

Well, I guess we have already discussed that. One problem I can see with
that approach is that there is a disconnect between the oom killable
entity and the oom candidate entity. This will matter when we start
seeing reports that the wrong container has been torn down even though
larger ones were running, all just because the larger ones consist of
smaller tasks.
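
To make the disconnect concrete, here is a toy model, not kernel code. It
assumes the victim is selected per task by memory footprint (a rough
stand-in for the current badness heuristic) and that the whole cgroup of
that task is then killed when the proposed group_oom knob is set. The
Task type, the helper names, the cgroup names and all sizes are made up
for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    pid: int
    cgroup: str
    rss_mb: int


def select_victim_task(tasks):
    """Per-task selection: pick the single largest task (toy heuristic)."""
    return max(tasks, key=lambda t: t.rss_mb)


def oom_kill(tasks, group_oom):
    """Return the tasks that get killed.

    The candidate is a single task, but if its cgroup has group_oom set,
    the killable entity becomes the whole cgroup.
    """
    victim = select_victim_task(tasks)
    if victim.cgroup in group_oom:
        return [t for t in tasks if t.cgroup == victim.cgroup]
    return [victim]


def cgroup_usage(tasks):
    """Sum memory usage per cgroup."""
    usage = {}
    for t in tasks:
        usage[t.cgroup] = usage.get(t.cgroup, 0) + t.rss_mb
    return usage


# Container A: one big task plus a small helper.
# Container B: much larger overall, but made of many small tasks.
tasks = [Task(1, "A", 600), Task(2, "A", 50)]
tasks += [Task(100 + i, "B", 100) for i in range(20)]

usage = cgroup_usage(tasks)
killed = oom_kill(tasks, group_oom={"A", "B"})
killed_cgroups = sorted({t.cgroup for t in killed})

print(usage)           # {'A': 650, 'B': 2000}
print(killed_cgroups)  # ['A']
```

In this sketch container A is torn down completely although container B
uses roughly three times as much memory, simply because no single task in
B is larger than the biggest task in A.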

Is this a fundamental roadblock? I am not sure, but I would tend to say
_no_ because the oom victim selection has always been an implementation
detail. We just need to kill _somebody_ to release _some_ memory. Killing
the whole workload is a sensible thing to do.

So I would be ok with that, even though I am still not sure why we should
start with something half done when your original implementation was
much more consistent. Sure, there is some disagreement, but I suspect
that we will get stuck with an intermediate solution later on again for
the very same reasons. I have summarized [1] the current contention points
and I would really appreciate it if somebody who wasn't really involved in
the previous discussions could join there and weigh the arguments. OOM
selection policy is just a heuristic with some potential drawbacks, and
somebody might object and block otherwise useful features for others
forever. So we should really find some consensus on what is reasonable
and what is just over the line.

[1] http://lkml.kernel.org/r/20180605114729.GB19202@dhcp22.suse.cz
-- 
Michal Hocko
SUSE Labs
