linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, Vladimir Davydov <vdavydov.dev@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer
Date: Tue, 5 Sep 2017 15:44:12 +0200	[thread overview]
Message-ID: <20170905134412.qdvqcfhvbdzmarna@dhcp22.suse.cz> (raw)
In-Reply-To: <20170904142108.7165-6-guro@fb.com>

I will go and check patch 2 more deeply but this is something that I
wanted to sort out first.

On Mon 04-09-17 15:21:08, Roman Gushchin wrote:
> Introducing of cgroup-aware OOM killer changes the victim selection
> algorithm used by default: instead of picking the largest process,
> it will pick the largest memcg and then the largest process inside.
> 
> This affects only cgroup v2 users.
> 
> To provide a way to use cgroups v2 if the old OOM victim selection
> algorithm is preferred for some reason, the nogroupoom mount option
> is added.
> 
> If set, the OOM selection is performed in a "traditional" per-process
> way. Both oom_priority and oom_group memcg knobs are ignored.

Why is this an opt out rather than opt-in? IMHO the original oom logic
should be preserved by default and specific workloads should opt in for
the cgroup aware logic. Changing the global behavior depending on
whether cgroup v2 interface is in use is more than unexpected and IMHO
wrong approach to take. I think we should instead go with 
oom_strategy=[alloc_task,biggest_task,cgroup]

we currently have alloc_task (via sysctl_oom_kill_allocating_task) and
biggest_task which is the default. You are adding cgroup and the more I
think about the more I agree that it doesn't really make sense to try to
fit thew new semantic into the existing one (compare tasks to kill-all
memcgs). Just introduce a new strategy and define a new semantic from
scratch. Memcg priority and kill-all are a natural extension of this new
strategy. This will make the life easier and easier to understand by
users.

Does that make sense to you?

> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: kernel-team@fb.com
> Cc: cgroups@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 1 +
>  mm/memcontrol.c                                 | 8 ++++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 28f1a0f84456..07891f1030aa 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -489,6 +489,7 @@
>  			Format: <string>
>  			nosocket -- Disable socket memory accounting.
>  			nokmem -- Disable kernel memory accounting.
> +			nogroupoom -- Disable cgroup-aware OOM killer.
>  
>  	checkreqprot	[SELINUX] Set initial checkreqprot flag value.
>  			Format: { "0" | "1" }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d7dd293897ca..6a8235dc41f6 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -87,6 +87,9 @@ static bool cgroup_memory_nosocket;
>  /* Kernel memory accounting disabled? */
>  static bool cgroup_memory_nokmem;
>  
> +/* Cgroup-aware OOM  disabled? */
> +static bool cgroup_memory_nogroupoom;
> +
>  /* Whether the swap controller is active */
>  #ifdef CONFIG_MEMCG_SWAP
>  int do_swap_account __read_mostly;
> @@ -2822,6 +2825,9 @@ bool mem_cgroup_select_oom_victim(struct oom_control *oc)
>  	if (mem_cgroup_disabled())
>  		return false;
>  
> +	if (cgroup_memory_nogroupoom)
> +		return false;
> +
>  	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
>  		return false;
>  
> @@ -6188,6 +6194,8 @@ static int __init cgroup_memory(char *s)
>  			cgroup_memory_nosocket = true;
>  		if (!strcmp(token, "nokmem"))
>  			cgroup_memory_nokmem = true;
> +		if (!strcmp(token, "nogroupoom"))
> +			cgroup_memory_nogroupoom = true;
>  	}
>  	return 0;
>  }
> -- 
> 2.13.5

-- 
Michal Hocko
SUSE Labs

  parent reply	other threads:[~2017-09-05 13:44 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-04 14:21 [v7 0/5] cgroup-aware OOM killer Roman Gushchin
2017-09-04 14:21 ` [v7 1/5] mm, oom: refactor the oom_kill_process() function Roman Gushchin
2017-09-05 13:34   ` Michal Hocko
2017-09-04 14:21 ` [v7 2/5] mm, oom: cgroup-aware OOM killer Roman Gushchin
2017-09-05 14:57   ` Michal Hocko
2017-09-05 20:23     ` Roman Gushchin
2017-09-06  8:31       ` Michal Hocko
2017-09-06 12:57         ` Roman Gushchin
2017-09-06 13:22           ` Michal Hocko
2017-09-06 13:41             ` Roman Gushchin
2017-09-06 14:10               ` Michal Hocko
2017-09-06  8:34       ` Michal Hocko
2017-09-06 12:33         ` Roman Gushchin
2017-09-07 16:18   ` Christopher Lameter
2017-09-11  8:49     ` Michal Hocko
2017-09-04 14:21 ` [v7 3/5] mm, oom: introduce oom_priority for memory cgroups Roman Gushchin
2017-09-04 14:21 ` [v7 4/5] mm, oom, docs: describe the cgroup-aware OOM killer Roman Gushchin
2017-09-04 14:21 ` [v7 5/5] mm, oom: cgroup v2 mount option to disable " Roman Gushchin
2017-09-04 17:32   ` Shakeel Butt
2017-09-04 17:51     ` Roman Gushchin
2017-09-05 13:44   ` Michal Hocko [this message]
2017-09-05 14:30     ` Roman Gushchin
2017-09-05 15:12       ` Michal Hocko
2017-09-05 19:16         ` Roman Gushchin
2017-09-06  8:42           ` Michal Hocko
2017-09-06 17:40             ` Roman Gushchin
2017-09-06 17:59               ` Michal Hocko
2017-09-06 20:59               ` David Rientjes
2017-09-07 14:43                 ` Christopher Lameter
2017-09-07 14:52                   ` Roman Gushchin
2017-09-07 15:03                     ` Christopher Lameter
2017-09-07 16:42                       ` Roman Gushchin
2017-09-07 17:03                         ` Christopher Lameter
2017-09-07 21:55                   ` David Rientjes
2017-09-07 16:21         ` Christopher Lameter
2017-09-05 21:53     ` Johannes Weiner
2017-09-06  8:28       ` Michal Hocko
2017-09-07 16:14         ` Johannes Weiner
2017-09-11  9:05           ` Michal Hocko
2017-09-11 12:50             ` Roman Gushchin
2017-09-07 16:27         ` Christopher Lameter
2017-09-07 22:03           ` David Rientjes
2017-09-08 21:07             ` Christopher Lameter
2017-09-09  8:45               ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170905134412.qdvqcfhvbdzmarna@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    --cc=tj@kernel.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).