From: Roman Gushchin <guro@fb.com>
To: <linux-mm@kvack.org>
Cc: Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@kernel.org>, Vladimir Davydov <vdavydov.dev@gmail.com>, Johannes Weiner <hannes@cmpxchg.org>, Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>, David Rientjes <rientjes@google.com>, Andrew Morton <akpm@linux-foundation.org>, Tejun Heo <tj@kernel.org>, <kernel-team@fb.com>, <cgroups@vger.kernel.org>, <linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: [v8 1/4] mm, oom: refactor the oom_kill_process() function
Date: Mon, 11 Sep 2017 14:17:39 +0100
Message-ID: <20170911131742.16482-2-guro@fb.com>
In-Reply-To: <20170911131742.16482-1-guro@fb.com>

The oom_kill_process() function consists of two logical parts:
the first is responsible for considering the task's children as
potential victims and printing the debug information.
The second is responsible for sending SIGKILL to all tasks
sharing the mm struct with the given victim.

This commit splits oom_kill_process() so that the second half
can be reused as __oom_kill_process().

The cgroup-aware OOM killer will kill multiple tasks
belonging to the victim cgroup. We don't need to print
the debug information for each task, nor do we need the
task selection step (considering the task's children),
so we can't use the existing oom_kill_process().

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
Cc: cgroups@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
---
 mm/oom_kill.c | 123 +++++++++++++++++++++++++++++++---------------------------
 1 file changed, 65 insertions(+), 58 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 99736e026712..f061b627092c 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -804,68 +804,12 @@ static bool task_will_free_mem(struct task_struct *task)
 	return ret;
 }
 
-static void oom_kill_process(struct oom_control *oc, const char *message)
+static void __oom_kill_process(struct task_struct *victim)
 {
-	struct task_struct *p = oc->chosen;
-	unsigned int points = oc->chosen_points;
-	struct task_struct *victim = p;
-	struct task_struct *child;
-	struct task_struct *t;
+	struct task_struct *p;
 	struct mm_struct *mm;
-	unsigned int victim_points = 0;
-	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
-					      DEFAULT_RATELIMIT_BURST);
 	bool can_oom_reap = true;
 
-	/*
-	 * If the task is already exiting, don't alarm the sysadmin or kill
-	 * its children or threads, just give it access to memory reserves
-	 * so it can die quickly
-	 */
-	task_lock(p);
-	if (task_will_free_mem(p)) {
-		mark_oom_victim(p);
-		wake_oom_reaper(p);
-		task_unlock(p);
-		put_task_struct(p);
-		return;
-	}
-	task_unlock(p);
-
-	if (__ratelimit(&oom_rs))
-		dump_header(oc, p);
-
-	pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
-		message, task_pid_nr(p), p->comm, points);
-
-	/*
-	 * If any of p's children has a different mm and is eligible for kill,
-	 * the one with the highest oom_badness() score is sacrificed for its
-	 * parent.  This attempts to lose the minimal amount of work done while
-	 * still freeing memory.
-	 */
-	read_lock(&tasklist_lock);
-	for_each_thread(p, t) {
-		list_for_each_entry(child, &t->children, sibling) {
-			unsigned int child_points;
-
-			if (process_shares_mm(child, p->mm))
-				continue;
-			/*
-			 * oom_badness() returns 0 if the thread is unkillable
-			 */
-			child_points = oom_badness(child,
-				oc->memcg, oc->nodemask, oc->totalpages);
-			if (child_points > victim_points) {
-				put_task_struct(victim);
-				victim = child;
-				victim_points = child_points;
-				get_task_struct(victim);
-			}
-		}
-	}
-	read_unlock(&tasklist_lock);
-
 	p = find_lock_task_mm(victim);
 	if (!p) {
 		put_task_struct(victim);
@@ -939,6 +883,69 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 }
 #undef K
 
+static void oom_kill_process(struct oom_control *oc, const char *message)
+{
+	struct task_struct *p = oc->chosen;
+	unsigned int points = oc->chosen_points;
+	struct task_struct *victim = p;
+	struct task_struct *child;
+	struct task_struct *t;
+	unsigned int victim_points = 0;
+	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
+					      DEFAULT_RATELIMIT_BURST);
+
+	/*
+	 * If the task is already exiting, don't alarm the sysadmin or kill
+	 * its children or threads, just give it access to memory reserves
+	 * so it can die quickly
+	 */
+	task_lock(p);
+	if (task_will_free_mem(p)) {
+		mark_oom_victim(p);
+		wake_oom_reaper(p);
+		task_unlock(p);
+		put_task_struct(p);
+		return;
+	}
+	task_unlock(p);
+
+	if (__ratelimit(&oom_rs))
+		dump_header(oc, p);
+
+	pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
+		message, task_pid_nr(p), p->comm, points);
+
+	/*
+	 * If any of p's children has a different mm and is eligible for kill,
+	 * the one with the highest oom_badness() score is sacrificed for its
+	 * parent.  This attempts to lose the minimal amount of work done while
+	 * still freeing memory.
+	 */
+	read_lock(&tasklist_lock);
+	for_each_thread(p, t) {
+		list_for_each_entry(child, &t->children, sibling) {
+			unsigned int child_points;
+
+			if (process_shares_mm(child, p->mm))
+				continue;
+			/*
+			 * oom_badness() returns 0 if the thread is unkillable
+			 */
+			child_points = oom_badness(child,
+				oc->memcg, oc->nodemask, oc->totalpages);
+			if (child_points > victim_points) {
+				put_task_struct(victim);
+				victim = child;
+				victim_points = child_points;
+				get_task_struct(victim);
+			}
+		}
+	}
+	read_unlock(&tasklist_lock);
+
+	__oom_kill_process(victim);
+}
+
 /*
  * Determines whether the kernel must panic because of the panic_on_oom sysctl.
  */
-- 
2.13.5
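
For readers who want to see the shape of the split outside the kernel, here is a minimal userspace sketch. It is not kernel code and not taken from the patch: struct task, kill_shared_mm(), select_victim(), kill_one() and kill_group() are hypothetical names, the score field only stands in for an oom_badness() result, and locking, reference counting and the OOM reaper are left out. It only shows the pattern the commit message describes: a selection-and-reporting wrapper around a kill helper that another caller can reuse directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
	int pid;
	int parent;		/* pid of the parent task, 0 if none */
	int mm_id;		/* tasks with the same mm_id share an address space */
	unsigned int score;	/* stand-in for oom_badness(); 0 means unkillable */
	bool killed;
};

/*
 * Second half (the role of __oom_kill_process() in the patch): mark the
 * victim and every task sharing its mm as killed. No selection, no report.
 */
static size_t kill_shared_mm(struct task *tasks, size_t nr, size_t victim)
{
	size_t killed = 0;
	int mm_id;

	assert(victim < nr);
	mm_id = tasks[victim].mm_id;
	for (size_t i = 0; i < nr; i++) {
		if (tasks[i].mm_id == mm_id && !tasks[i].killed) {
			tasks[i].killed = true;
			killed++;
		}
	}
	return killed;
}

/*
 * Selection step: prefer the highest-scored child that has a different mm
 * than the chosen task; fall back to the chosen task itself.
 */
static size_t select_victim(const struct task *tasks, size_t nr, size_t chosen)
{
	size_t victim = chosen;
	unsigned int victim_points = 0;

	for (size_t i = 0; i < nr; i++) {
		if (tasks[i].parent != tasks[chosen].pid)
			continue;	/* not a child of the chosen task */
		if (tasks[i].mm_id == tasks[chosen].mm_id)
			continue;	/* shares the parent's mm */
		if (tasks[i].score > victim_points) {
			victim = i;
			victim_points = tasks[i].score;
		}
	}
	return victim;
}

/* First half plus second half: report, select, then kill. */
static size_t kill_one(struct task *tasks, size_t nr, size_t chosen,
		       const char *message)
{
	printf("%s: kill process %d score %u or sacrifice child\n",
	       message, tasks[chosen].pid, tasks[chosen].score);
	return kill_shared_mm(tasks, nr, select_victim(tasks, nr, chosen));
}

/* A group-wide caller reuses only the second half for each member. */
static size_t kill_group(struct task *tasks, size_t nr,
			 const size_t *members, size_t nr_members)
{
	size_t killed = 0;

	for (size_t i = 0; i < nr_members; i++)
		killed += kill_shared_mm(tasks, nr, members[i]);
	return killed;
}
```

In this model kill_one() plays the part of oom_kill_process() and kill_group() plays the part of a caller that, like the cgroup-aware killer described above, wants neither the per-task report nor the child selection.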