From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: David Rientjes <rientjes@google.com>
Cc: kosaki.motohiro@jp.fujitsu.com,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Nick Piggin <npiggin@suse.de>,
	Oleg Nesterov <oleg@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	linux-mm@kvack.org
Subject: Re: [patch -mm 08/18] oom: badness heuristic rewrite
Date: Tue,  1 Jun 2010 16:36:48 +0900 (JST)
Message-ID: <20100601163627.245D.A69D9226@jp.fujitsu.com>
In-Reply-To: <alpine.DEB.2.00.1006010015030.29202@chino.kir.corp.google.com>

> This is a complete rewrite of the oom killer's badness() heuristic which is
> used to determine which task to kill in oom conditions.  The goal is to
> make it as simple and predictable as possible so the results are better
> understood and we end up killing the task which will lead to the most
> memory freeing while still respecting the fine-tuning from userspace.
> 
> The baseline for the heuristic is a proportion of memory that each task is
> currently using in memory plus swap compared to the amount of "allowable"
> memory.  "Allowable," in this sense, means the system-wide resources for
> unconstrained oom conditions, the set of mempolicy nodes, the mems
> attached to current's cpuset, or a memory controller's limit.  The
> proportion is given on a scale of 0 (never kill) to 1000 (always kill),
> roughly meaning that a task with a badness() score of 500 consumes
> approximately 50% of allowable memory resident in RAM or in swap space.
> 
> The proportion is always relative to the amount of "allowable" memory and
> not the total amount of RAM systemwide so that mempolicies and cpusets may
> operate in isolation; they shall not need to know the true size of the
> machine on which they are running if they are bound to a specific set of
> nodes or mems, respectively.
> 
> Root tasks are given 3% extra memory just like __vm_enough_memory()
> provides in LSMs.  In the event of two tasks consuming similar amounts of
> memory, it is generally better to save root's task.
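
To make the baseline concrete, here is a minimal sketch of the arithmetic
described so far. It is not the code from the patch: the function name, the
page-count parameters, and the reading of the 3% root bonus as a flat
30-point discount are all assumptions made for illustration.

```c
/*
 * Hypothetical helper, not taken from the patch: estimate the baseline
 * score as the share of "allowable" memory a task uses, in thousandths.
 */
static long baseline_points(unsigned long long rss_pages,
                            unsigned long long swap_pages,
                            unsigned long long allowable_pages,
                            int is_root)
{
    long points;

    if (allowable_pages == 0)
        return 0;       /* nothing to compare against */

    /* RAM plus swap usage as a proportion of allowable memory. */
    points = (long)((rss_pages + swap_pages) * 1000ULL / allowable_pages);

    /* Assumption: the 3% root bonus is a flat 30-point discount. */
    if (is_root)
        points -= 30;

    return points < 0 ? 0 : points;
}
```

Under these assumptions, a task holding 400 pages in RAM and 100 in swap
out of 1000 allowable pages scores 500, or 470 if it belongs to root.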
> 
> Because of the change in the badness() heuristic's baseline, it is also
> necessary to introduce a new user interface to tune it.  It's not possible
> to redefine the meaning of /proc/pid/oom_adj with a new scale since the
> ABI cannot be changed for backward compatibility.  Instead, a new tunable,
> /proc/pid/oom_score_adj, is added that ranges from -1000 to +1000.  It may
> be used to polarize the heuristic such that certain tasks are never
> considered for oom kill while others may always be considered.  The value
> is added directly into the badness() score so a value of -500, for
> example, means to discount 50% of its memory consumption in comparison to
> other tasks either on the system, bound to the mempolicy, in the cpuset,
> or sharing the same memory controller.
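
As an illustration of "added directly into the badness() score", the
sketch below applies an oom_score_adj value to an already computed score.
The helper name and the clamping to the 0..1000 scale are assumptions; the
message only states that the value is added and that scores range from 0
to 1000.

```c
#define SCORE_MAX 1000  /* top of the 0..1000 scale described above */

/*
 * Hypothetical helper: add the per-task oom_score_adj (-1000..+1000)
 * to a baseline score. Clamping the result is an assumption.
 */
static long apply_score_adj(long points, int oom_score_adj)
{
    points += oom_score_adj;

    if (points < 0)
        return 0;
    return points > SCORE_MAX ? SCORE_MAX : points;
}
```

With this sketch, a task scoring 800 with oom_score_adj set to -500 ends up
at 300, which matches the "discount 50% of its memory consumption" reading,
while any task at or below 500 drops to 0.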
> 
> /proc/pid/oom_adj is changed so that its meaning is rescaled into the
> units used by /proc/pid/oom_score_adj, and vice versa.  Changing one of
> these per-task tunables will rescale the value of the other to an
> equivalent meaning.  Although /proc/pid/oom_adj was originally defined as
> a bitshift on the badness score, it now shares the same linear growth as
> /proc/pid/oom_score_adj but with different granularity.  This is required
> so the ABI is not broken with userspace applications and allows oom_adj to
> be deprecated for future removal.
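
One way to picture the rescaling is a linear map between the two ranges.
The sketch below is an assumption, not the conversion used by the patch:
the helper names are hypothetical, the -17..+15 range comes from the
existing /proc/pid/oom_adj documentation rather than from this message, and
the choices to divide by 17 and to clamp at the ends are illustrative.

```c
#define OOM_ADJ_MIN        (-17)  /* documented oom_adj minimum (disable) */
#define OOM_ADJ_MAX        15     /* documented oom_adj maximum */
#define OOM_SCORE_ADJ_MAX  1000   /* oom_score_adj spans -1000..+1000 */

/* Hypothetical: rescale oom_adj into oom_score_adj units, linearly. */
static int oom_adj_to_score_adj(int oom_adj)
{
    if (oom_adj >= OOM_ADJ_MAX)
        return OOM_SCORE_ADJ_MAX;
    if (oom_adj <= OOM_ADJ_MIN)
        return -OOM_SCORE_ADJ_MAX;
    return oom_adj * OOM_SCORE_ADJ_MAX / -OOM_ADJ_MIN;
}

/* Hypothetical: rescale oom_score_adj back into oom_adj units. */
static int score_adj_to_oom_adj(int score_adj)
{
    int oom_adj = score_adj * -OOM_ADJ_MIN / OOM_SCORE_ADJ_MAX;

    if (oom_adj > OOM_ADJ_MAX)
        return OOM_ADJ_MAX;
    if (oom_adj < OOM_ADJ_MIN)
        return OOM_ADJ_MIN;
    return oom_adj;
}
```

The coarser granularity shows up in the round trip: an oom_adj of 8 maps to
470 in this sketch, and 470 maps back to 7 because of integer truncation.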
> 
> Signed-off-by: David Rientjes <rientjes@google.com>

nack



