linux-mm.kvack.org archive mirror
From: Roman Gushchin <guro@fb.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	rientjes@google.com, hannes@cmpxchg.org, tj@kernel.org,
	gthelen@google.com
Subject: Re: cgroup-aware OOM killer, how to move forward
Date: Thu, 12 Jul 2018 08:55:00 -0700
Message-ID: <20180712155456.GA28187@castle.DHCP.thefacebook.com>
In-Reply-To: <20180712120703.GJ32648@dhcp22.suse.cz>

On Thu, Jul 12, 2018 at 02:07:03PM +0200, Michal Hocko wrote:
> On Wed 11-07-18 15:40:03, Roman Gushchin wrote:
> > Hello!
> > 
> > I was thinking about how to move forward with the cgroup-aware OOM killer.
> > It looks to me that we all agree on the "cleanup" part of the patchset:
> > it's a nice feature to be able to kill all tasks in the cgroup
> > to guarantee the consistent state of the workload.
> > All our disagreements are related to the victim selection algorithm.
> > 
> > So I wonder if the right thing to do is to split the problem.
> > We can agree on the "cleanup" part, which is useful by itself,
> > merge it upstream, and then return to the victim selection
> > algorithm.
> 
> Could you be more specific which patches are those please?

It's not quite a part of the existing patchset, but I had such a version
during my work on the current patchset, and it was really small and cute.
I need some time to restore/rebase it.

> 
> > So, here is my proposal:
> > let's introduce the memory.group_oom knob with the following semantics:
> > if the knob is set, the OOM killer can kill either none or all
> > tasks in the cgroup*.
> > It can perfectly work with the current OOM killer (as a "cleanup" option),
> > and allows _any_ further approach on the OOM victim selection.
> > It also doesn't require any mount/boot/tree-wide options.
> > 
> > How does it sound?
> 
> Well, I guess we have already discussed that. One problem I can see with
> that approach is that there is a disconnection between what is the oom
> killable entity and oom candidate entity. This will matter when we start
> seeing reports that a wrong container has been torn down because there
> were larger ones running. All that just because the latter ones consist
> of smaller tasks.
> 
> Is this a fundamental roadblock? I am not sure but I would tend to say
> _no_ because the oom victim selection has always been an implementation
> detail. We just need to kill _somebody_ to release _some_ memory. Killing
> the whole workload is a sensible thing to do.

Yes. We also use Johannes's memory pressure metrics for making OOM
decisions internally, which is working nicely. In this case the in-kernel
OOM decision logic serves more as a backup solution, and consistency
is the only thing that really matters.

> 
> So I would be ok with that even though I am still not sure why we should
> start with something half done when your original implementation was
> much more consistent. Sure there is some disagreement but I suspect
> that we will get stuck with an intermediate solution later on again for
> the very same reasons. I have summarized [1] the current contention points and
> I would really appreciate it if somebody who wasn't really involved in the
> previous discussions could just join there and weigh the arguments. OOM
> selection policy is just a heuristic with some potential drawbacks and
> somebody might object and block otherwise useful features for others
> forever.  So we should really find some consensus on what is reasonable and
> what is just over the line.

I would definitely prefer just to land the existing version, and I prefer
it over this proposal. But it doesn't seem to be going forward well...

Maybe taking the described step first would help.

Thanks,
Roman
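
The proposed memory.group_oom semantics, and the disconnect Michal
describes between the OOM candidate (a single task) and the OOM killable
entity (the whole cgroup), can be illustrated with a small user-space
model. This is only a sketch, not kernel code: the names Task, Cgroup,
select_victim, oom_kill and demo_cgroups are hypothetical, and picking
the task with the largest RSS is a simplified stand-in for the kernel's
per-task victim selection heuristic.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    pid: int
    rss_mb: int  # resident memory; a simplified stand-in for the real OOM score


@dataclass
class Cgroup:
    name: str
    tasks: List[Task]
    group_oom: bool = False  # models the proposed memory.group_oom knob

    def total_mb(self) -> int:
        return sum(t.rss_mb for t in self.tasks)


def select_victim(cgroups: List[Cgroup]) -> Tuple[Cgroup, Task]:
    """Per-task selection: pick the single largest task, ignoring cgroup totals."""
    candidates = [(cg, t) for cg in cgroups for t in cg.tasks]
    return max(candidates, key=lambda pair: pair[1].rss_mb)


def oom_kill(cgroups: List[Cgroup]) -> Tuple[str, List[int]]:
    """Return the victim's cgroup name and the PIDs that would be killed.

    With group_oom set on the victim's cgroup, every task in that cgroup
    is killed; otherwise only the selected task is killed.
    """
    cg, victim = select_victim(cgroups)
    killed = cg.tasks if cg.group_oom else [victim]
    return cg.name, [t.pid for t in killed]


def demo_cgroups() -> List[Cgroup]:
    # "small" holds one big task; "large" uses more memory in total,
    # but spreads it over many smaller tasks.
    small = Cgroup("small", [Task(1, 600), Task(2, 50)])
    large = Cgroup("large", [Task(pid, 200) for pid in range(10, 20)])
    return [small, large]


if __name__ == "__main__":
    cgroups = demo_cgroups()
    print(oom_kill(cgroups))       # only the largest task is killed
    cgroups[0].group_oom = True
    print(oom_kill(cgroups))       # the whole "small" cgroup is torn down
```

In this model the "large" cgroup uses 2000 MB against 650 MB for
"small", yet the victim always comes from "small" because it contains
the single largest task. Setting group_oom only changes how much is
cleaned up after selection, not which cgroup is selected, which is
exactly the split between "cleanup" and victim selection discussed
above.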
