From: Roman Gushchin <guro@fb.com>
To: David Rientjes <rientjes@google.com>
Cc: <linux-mm@kvack.org>, Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Tejun Heo <tj@kernel.org>, <kernel-team@fb.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/3] introduce memory.oom.group
Date: Wed, 1 Aug 2018 15:47:09 -0700
Message-ID: <20180801224706.GA32269@castle.DHCP.thefacebook.com>
In-Reply-To: <alpine.DEB.2.21.1808011437350.38896@chino.kir.corp.google.com>

On Wed, Aug 01, 2018 at 02:51:25PM -0700, David Rientjes wrote:
> On Tue, 31 Jul 2018, Roman Gushchin wrote:
> 
> > > What's the plan with the cgroup aware oom killer?  It has been sitting in 
> > > the -mm tree for ages with no clear path to being merged.
> > 
> > It's because of your nack, isn't it?
> > Everybody else seems to be fine with it.
> > 
> 
> If they are fine with it, I'm not sure they have tested it :)  Killing
> entire cgroups needlessly for mempolicy oom kills that will not free
> memory on the target nodes is the first regression they may notice.  It
> also respects oom_score_adj settings only for processes attached to the
> root mem cgroup.  That may be fine in very specialized use cases, but
> your bash shell being considered equal to a 96GB cgroup isn't very
> useful.  These are all fixed in my follow-up patch series, which you say
> later in this email you have reviewed.
> 
> > > Are you planning on reviewing the patchset to fix the cgroup aware oom 
> > > killer at https://marc.info/?l=linux-kernel&m=153152325411865 which has 
> > > been waiting for feedback since March?
> > > 
> > 
> > I already did.
> > As I said, I find the proposed oom_policy interface confusing.
> > I'm not sure I understand why some memcg OOMs should be handled
> > by the memcg-aware logic, while others by the traditional per-process
> > logic; and why this should be set on the OOMing memcg.
> > IMO this adds nothing but confusion.
> > 
> 
> If your entire review was that one email about a single patch, I
> misinterpreted it as the review not being done, sorry.  I volunteered to
> separate out, into its own tunable, the logic that determines whether a
> cgroup should be considered on its own (kill the largest cgroup on the
> system) or whether subtree usage should be considered as well.  I haven't
> received an answer yet, but it's a trivial patch on top of my series if
> you prefer it.  Just let me know so we can make progress.
> 
> > It doesn't look nice to me (nor am I a fan of the mount option).
> > If you need an option to evaluate a cgroup as a whole, but kill
> > only one task inside (the ability we've discussed before),
> > let's make it explicit. It's possible with the new memory.oom.group.
> > 
> 
> The purpose is for subtrees delegated to users, so that they can continue
> to expect the same process to be oom killed, with oom_score_adj
> respected, even though the ancestor's oom policy is cgroup-aware
> targeting.  It is perfectly legitimate, and necessary, for a user who
> controls their own subtree to prefer killing the single largest process,
> as has always been done.  Secondary to that is their ability to
> influence the decision with oom_score_adj, which they lose without my
> patches.
> 
> > Patches which adjust root memory cgroup accounting and NUMA
> > handling should be handled separately; they are really not
> > about the interface. I've nothing against them.
> > 
> 
> That's good to know; it would be helpful if you would ack the patches
> that you are not objecting to.  Your feedback about the overloading of
> "cgroup" and "tree" is well received, and I can easily separate that into
> a tunable, as I said.  I do not know of any user that would want to
> specify "tree" without having cgroup-aware behavior, however.  If you
> would prefer this, please let me know!
> 
> > Anyway, at this point I really think that this patch (memory.oom.group)
> > is a reasonable way forward. It implements a useful and complete feature,
> > doesn't block any further development and has a clean interface.
> > So, you can build memory.oom.policy on top of it.
> > Does this sound good?
> > 
> 
> I have no objection to this series, of course.  The functionality of group 
> oom was unchanged in my series.  I'd very much appreciate a review of my 
> patchset, though, so the cgroup-aware policy can be merged as well.
> 

Ok, I think this is what we'll do here:
1) drop the current cgroup-aware OOM killer implementation from the mm tree
2) land memory.oom.group in the mm tree (your ack will be appreciated);
   a minimal usage sketch is below
3) discuss and, hopefully, agree on the memory.oom.policy interface
4) land memory.oom.policy
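
To make (2) concrete, here is a minimal sketch of how memory.oom.group is
meant to be used (the cgroup path and name below are just examples):

    # Enable group OOM killing for a delegated workload cgroup
    # (cgroup v2 hierarchy; the path is illustrative):
    echo 1 > /sys/fs/cgroup/workload/memory.oom.group

    # With oom.group set, if the OOM killer picks a victim inside
    # "workload", all processes belonging to the cgroup are killed
    # together instead of just a single task.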

Basically, with oom.group separated out, all we need is another boolean
knob saying that the memcg should be evaluated as a whole.
Am I right? If so, the main problem to solve is how to handle the
following case:
      A
     / \             B/memory.oom.evaluate_as_a_group* = 1
    B   C            C/memory.oom.evaluate_as_a_group* = 0
   / \
  D   E        * I'm not proposing to actually use this name; it's just an example.

In this case you would have to compare the tasks in C with cgroup B as a whole,
and this is what I'd like to avoid. Maybe it should be enforced on A's level?
I don't think it should be linked to the OOMing group, as in your patchset.
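
Purely as an illustration of that alternative (the knob name below is
hypothetical, just like the one in the diagram), enforcing the choice at
A's level could look like:

    # Hypothetical knob, for illustration only: A declares how its
    # children are compared when an OOM happens in A, so B's subtree
    # and C's individual tasks are never mixed in one comparison.
    echo 1 > /sys/fs/cgroup/A/memory.oom.evaluate_children_as_groups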

I would really prefer to discuss the interface first, without going
into code and implementation details. It's not because I do not
appreciate your work; it's only because it's hard to think about the
interface when there are two big patchsets stacked on top of each other.

Thank you!
