From: David Rientjes <rientjes@google.com>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Tejun Heo <tj@kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v3 2/6] mm, oom: cgroup-aware OOM killer
Date: Wed, 12 Jul 2017 13:26:20 -0700 (PDT)
Message-ID: <alpine.DEB.2.10.1707121317580.57341@chino.kir.corp.google.com>
In-Reply-To: <20170712121110.GA9017@castle>

On Wed, 12 Jul 2017, Roman Gushchin wrote:

> > It's a no-op if nobody sets up priorities or the system-wide sysctl is 
> > disabled.  Presumably, as in our model, the Activity Manager sets the 
> > sysctl and is responsible for configuring the priorities if present.  All 
> > memcgs at the sibling level or subcontainer level remain the default if 
> > not defined by the chown'd user, so this falls back to an rss model for 
> > backwards compatibility.
> 
> Hm, this is interesting...
> 
> What I'm thinking about, is that we can introduce the following model:
> each memory cgroup has an integer oom priority value, 0 by default.
> Root cgroup priority is always 0, other cgroups can have both positive
> or negative priorities.
> 

For our purposes we use a range of [0, 10000] for the per-process oom 
priority; 10000 implies the process is not oom killable, 5000 is the 
default.  We use a range of [0, 9999] for the per-memcg oom priority since 
memcgs cannot disable themselves from oom killing (although they could oom 
disable all attached processes).  We can obviously remap our priorities to 
whatever we decide here, but I think we should give ourselves more room 
and provide 10000 priorities at the minimum (we have 5000 true priorities 
plus overlimit bias).  I'm not sure that negative priorities make sense in
this model; is there a strong reason to prefer [-5000, 5000] over
[0, 10000]?
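
To make the comparison concrete, here is a minimal sketch of how the two
ranges relate.  It is only an illustration and not code from either patch
set; the macro and function names are hypothetical.  It assumes the signed
range is the unsigned one shifted by the default value, so the ordering of
priorities is preserved and only the default moves (5000 versus 0).

```c
#include <assert.h>

/* Hypothetical constants taken from the ranges discussed above. */
#define OOM_PRIO_MIN		0
#define OOM_PRIO_MAX		10000	/* per-process: not oom killable */
#define OOM_PRIO_DEFAULT	5000

/* Map [0, 10000] onto [-5000, 5000]; relative ordering is unchanged. */
int oom_prio_to_signed(int prio)
{
	assert(prio >= OOM_PRIO_MIN && prio <= OOM_PRIO_MAX);
	return prio - OOM_PRIO_DEFAULT;
}

/* Inverse mapping, from [-5000, 5000] back to [0, 10000]. */
int oom_prio_from_signed(int sprio)
{
	assert(sprio >= OOM_PRIO_MIN - OOM_PRIO_DEFAULT &&
	       sprio <= OOM_PRIO_MAX - OOM_PRIO_DEFAULT);
	return sprio + OOM_PRIO_DEFAULT;
}
```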

And, yes, the root memcg keeps a constant oom priority that is never
actually checked.

> During OOM victim selection we compare cgroups on each hierarchy level
> based on priority and size, if there are several cgroups with equal priority.
> Per-task oom_score_adj will affect task selection inside a cgroup if
> oom_kill_all_tasks is not set. -1000 special value will also completely
> protect a task from being killed, if only oom_kill_all_tasks is not set.
> 

If there are several cgroups of equal priority, we prefer to kill from the
one that was created most recently, to avoid losing work that has been
running for a long period of time.  But the key in this proposal is that we
_always_ continue to iterate the memcg hierarchy until we find a process
attached to a memcg with the lowest priority relative to sibling cgroups,
if any.
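
The iteration described above can be sketched roughly as follows.  This is
a userspace model under stated assumptions, not our kernel code: the
struct, its fields and the function names are hypothetical, processes are
assumed to be attached only to memcgs without descendants, and a larger
"created" value means a more recently created memcg.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a memcg; not the kernel's struct mem_cgroup. */
struct memcg_node {
	int oom_prio;			/* per-memcg priority, [0, 9999] */
	unsigned long created;		/* creation time, larger is newer */
	struct memcg_node **children;
	size_t nr_children;
};

/* Lowest priority wins; equal priority goes to the newest sibling. */
struct memcg_node *oom_pick_child(const struct memcg_node *parent)
{
	struct memcg_node *best = NULL;
	size_t i;

	for (i = 0; i < parent->nr_children; i++) {
		struct memcg_node *c = parent->children[i];

		if (!best || c->oom_prio < best->oom_prio ||
		    (c->oom_prio == best->oom_prio &&
		     c->created > best->created))
			best = c;
	}
	return best;
}

/*
 * Descend until a memcg with no descendants is reached.  The root's own
 * priority is never read, only those of its descendants.
 */
struct memcg_node *oom_select_memcg(struct memcg_node *root)
{
	struct memcg_node *cur = root;

	assert(root != NULL);
	while (cur->nr_children > 0)
		cur = oom_pick_child(cur);
	return cur;
}
```

Note that the comparison is only ever made between siblings, so a low
priority deep in one subtree does not compete with memcgs in another.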

To adapt your model to this proposal, memory.oom_kill_all_tasks would only 
be effective if there are no descendant memcgs.  In that case, iteration 
stops anyway and in my model we kill the process with the lowest 
per-process priority.  This could trivially check 
memory.oom_kill_all_tasks and kill everything, and I'm happy to support 
that feature since we have had a need for it in the past as well.
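
As a rough sketch of what happens once iteration stops, assuming the
[0, 10000] per-process range described earlier in this mail: the names
below are hypothetical, "killed" merely marks the victim in this model, and
the choice of the first task among equal priorities is an assumption of the
sketch.  With the flag set, every task is marked, including oom disabled
ones, which matches the quoted description of oom_kill_all_tasks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OOM_PRIO_DISABLE	10000	/* process is not oom killable */

/* Hypothetical model of a task attached to the selected memcg. */
struct oom_task {
	int oom_prio;		/* per-process priority, [0, 10000] */
	bool killed;		/* set when chosen as a victim */
};

/* Mark the victims in a memcg with no descendants; return how many. */
size_t oom_kill_leaf(struct oom_task *tasks, size_t nr, bool kill_all)
{
	struct oom_task *victim = NULL;
	size_t i;

	assert(tasks != NULL || nr == 0);

	if (kill_all) {
		/* memory.oom_kill_all_tasks: kill everything attached. */
		for (i = 0; i < nr; i++)
			tasks[i].killed = true;
		return nr;
	}

	/* Otherwise kill the killable task with the lowest priority. */
	for (i = 0; i < nr; i++) {
		if (tasks[i].oom_prio >= OOM_PRIO_DISABLE)
			continue;
		if (!victim || tasks[i].oom_prio < victim->oom_prio)
			victim = &tasks[i];
	}
	if (!victim)
		return 0;
	victim->killed = true;
	return 1;
}
```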

We should talk about when this priority-based scoring becomes effective.  
We enable it by default in our kernel, but it could be guarded with a VM 
sysctl if necessary to enact a system-wide policy.
