From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f199.google.com (mail-pf0-f199.google.com [209.85.192.199])
	by kanga.kvack.org (Postfix) with ESMTP id 4E4EA6B0069
	for ; Thu, 30 Nov 2017 10:29:14 -0500 (EST)
Received: by mail-pf0-f199.google.com with SMTP id h18so5140245pfi.2
	for ; Thu, 30 Nov 2017 07:29:14 -0800 (PST)
Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com. [67.231.153.30])
	by mx.google.com with ESMTPS id z6si3159633pgp.262.2017.11.30.07.29.12
	for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Thu, 30 Nov 2017 07:29:13 -0800 (PST)
From: Roman Gushchin
Subject: [PATCH v13 0/7] cgroup-aware OOM killer
Date: Thu, 30 Nov 2017 15:28:17 +0000
Message-ID: <20171130152824.1591-1-guro@fb.com>
MIME-Version: 1.0
Content-Type: text/plain
Sender: owner-linux-mm@kvack.org
List-ID:
To: linux-mm@vger.kernel.org
Cc: Roman Gushchin, Michal Hocko, Vladimir Davydov, Johannes Weiner,
	Tetsuo Handa, David Rientjes, Andrew Morton, Tejun Heo,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org

This patchset makes the OOM killer cgroup-aware.

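For readers new to the series, the selection rules mentioned in the
changelog below (tree-wide search, root memory cgroup scored by the sum
of its tasks' oom scores, tasks with oom_score_adj -1000 never killed,
oom_group killing the whole cgroup) can be modelled roughly as follows.
This is only a user-space sketch in Python, not the kernel code: the
Cgroup and Task classes, the "usage" field, the choice of root plus leaf
cgroups as candidates, and scoring non-root cgroups by "usage" are all
assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional

OOM_SCORE_ADJ_MIN = -1000  # tasks with this oom_score_adj are never killed


@dataclass
class Task:
    name: str
    oom_score: int
    oom_score_adj: int = 0


@dataclass
class Cgroup:
    name: str
    usage: int = 0           # hypothetical: memory charged to this cgroup
    oom_group: bool = False  # models the memory.oom_group knob
    tasks: List[Task] = field(default_factory=list)
    children: List["Cgroup"] = field(default_factory=list)


def killable(cg: Cgroup) -> List[Task]:
    """Tasks that may be killed, i.e. not protected by oom_score_adj -1000."""
    return [t for t in cg.tasks if t.oom_score_adj != OOM_SCORE_ADJ_MIN]


def walk(cg: Cgroup) -> Iterator[Cgroup]:
    """Pre-order walk over the whole cgroup tree."""
    yield cg
    for child in cg.children:
        yield from walk(child)


def score(cg: Cgroup, root: Cgroup) -> int:
    # Root is scored by the sum of its tasks' oom scores (v12 changelog).
    if cg is root:
        return sum(t.oom_score for t in killable(cg))
    # Assumption: any other cgroup is scored by its charged memory.
    return cg.usage


def select_victim(root: Cgroup) -> Optional[Cgroup]:
    """Tree-wide search: compare root and all leaf cgroups directly."""
    candidates = [cg for cg in walk(root)
                  if (cg is root or not cg.children) and killable(cg)]
    if not candidates:
        return None
    return max(candidates, key=lambda cg: score(cg, root))


def tasks_to_kill(victim: Cgroup) -> List[Task]:
    """Whole cgroup if oom_group is set, otherwise only the biggest task."""
    tasks = killable(victim)
    if victim.oom_group:
        return tasks
    return [max(tasks, key=lambda t: t.oom_score)]


def build_example() -> Cgroup:
    """Small hypothetical hierarchy: root -> {a -> {a1, a2}, b}."""
    a1 = Cgroup("a1", usage=300, oom_group=True, tasks=[
        Task("db", 200), Task("helper", 100),
        Task("agent", 5, OOM_SCORE_ADJ_MIN)])
    a2 = Cgroup("a2", usage=120, tasks=[Task("web", 120)])
    a = Cgroup("a", children=[a1, a2])
    b = Cgroup("b", usage=250, tasks=[Task("batch", 250)])
    return Cgroup("root", tasks=[Task("init", 10)], children=[a, b])


if __name__ == "__main__":
    victim = select_victim(build_example())
    print(victim.name, [t.name for t in tasks_to_kill(victim)])
```

In this model the leaf "a1" is compared directly against "a2", "b" and
the root rather than only against its sibling, and because oom_group is
set on it, every killable task in it is chosen while the task with
oom_score_adj -1000 is left alone.
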
v13:
  - Reverted fallback to per-process OOM as in v11 (asked by Michal)
  - Added entry in cgroup features list
  - Added a note about charge migration
  - Rebase

v12:
  - Root memory cgroup is evaluated based on the sum of the oom scores
    of the tasks belonging to it
  - Do not fall back to the per-process behavior if it wasn't possible
    to kill a memcg victim
  - Rebase on top of mm tree

v11:
  - Fixed an issue with skipping the root mem cgroup
    (discovered by Shakeel Butt)
  - Moved a check in __oom_kill_process() to the memory.oom_group patch,
    added corresponding comments
  - Added a note about ignoring tasks with oom_score_adj -1000
    (proposed by Michal Hocko)
  - Rebase on top of mm tree

v10:
  - Separate oom_group introduction into a standalone patch
  - Stop propagating oom_group
  - Make oom_group delegatable
  - Do not try to kill the biggest task first if the whole cgroup
    is going to be killed
  - Stop caching oom_score on struct memcg, optimize victim
    memcg selection
  - Drop dmesg printing (for further refining)
  - Small refactorings and comments added here and there
  - Rebase on top of mm tree

v9:
  - Change siblings-to-siblings comparison to the tree-wide search,
    make related refactorings
  - Make oom_group implicitly propagate down the tree
  - Fix an issue with task selection in root cgroup

v8:
  - Do not kill tasks with OOM_SCORE_ADJ -1000
  - Make the whole thing opt-in with cgroup mount option control
  - Drop oom_priority for further discussions
  - Kill the whole cgroup if oom_group is set and
    its memory.max is reached
  - Update docs and commit messages

v7:
  - __oom_kill_process() drops reference to the victim task
  - oom_score_adj -1000 is always respected
  - Renamed oom_kill_all to oom_group
  - Dropped oom_prio range, converted from short to int
  - Added a cgroup v2 mount option to disable cgroup-aware OOM killer
  - Docs updated
  - Rebased on top of mmotm

v6:
  - Renamed oom_control.chosen to oom_control.chosen_task
  - Renamed oom_kill_all_tasks to oom_kill_all
  - Per-node NR_SLAB_UNRECLAIMABLE accounting
  - Several minor fixes and cleanups
  - Docs updated

v5:
  - Rebased on top of Michal Hocko's patches, which changed the way
    OOM victims gain access to the memory reserves.
    Dropped the corresponding part of this patchset
  - Separated the oom_kill_process() splitting into a standalone commit
  - Added debug output (suggested by David Rientjes)
  - Some minor fixes

v4:
  - Reworked per-cgroup oom_score_adj into oom_priority
    (based on ideas by David Rientjes)
  - Tasks with oom_score_adj -1000 are never selected if
    oom_kill_all_tasks is not set
  - Memcg victim selection code is reworked, and
    synchronization is based on finding tasks with the OOM victim marker,
    rather than on a global counter
  - Debug output is dropped
  - Refactored TIF_MEMDIE usage

v3:
  - Merged commits 1-4 into 6
  - Separated oom_score_adj logic and debug output into separate commits
  - Fixed swap accounting

v2:
  - Reworked victim selection based on feedback
    from Michal Hocko, Vladimir Davydov and Johannes Weiner
  - "Kill all tasks" is now an opt-in option, by default
    only one process will be killed
  - Added per-cgroup oom_score_adj
  - Refined oom score calculations, suggested by Vladimir Davydov
  - Converted to a patchset

v1:
  https://lkml.org/lkml/2017/5/18/969

Cc: Michal Hocko
Cc: Vladimir Davydov
Cc: Johannes Weiner
Cc: Tetsuo Handa
Cc: David Rientjes
Cc: Andrew Morton
Cc: Tejun Heo
Cc: kernel-team@fb.com
Cc: cgroups@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org

Roman Gushchin (7):
  mm, oom: refactor the oom_kill_process() function
  mm: implement mem_cgroup_scan_tasks() for the root memory cgroup
  mm, oom: cgroup-aware OOM killer
  mm, oom: introduce memory.oom_group
  mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer
  mm, oom, docs: describe the cgroup-aware OOM killer
  cgroup: list groupoom in cgroup features

 Documentation/cgroup-v2.txt |  58 ++++++++++
 include/linux/cgroup-defs.h |   5 +
 include/linux/memcontrol.h  |  34 ++++++
 include/linux/oom.h         |  12 ++-
 kernel/cgroup/cgroup.c      |  13 ++-
 mm/memcontrol.c             | 258 +++++++++++++++++++++++++++++++++++++++++++-
 mm/oom_kill.c               | 224 +++++++++++++++++++++++++-------------
 7 files changed, 525 insertions(+), 79 deletions(-)

-- 
2.14.3