From: Michal Hocko
Cc: Tetsuo Handa, David Rientjes, Oleg Nesterov, Vladimir Davydov,
    Andrew Morton, LKML
Subject: [PATCH 0/5] Handle oom bypass more gracefully
Date: Thu, 26 May 2016 14:40:09 +0200
Message-Id: <1464266415-15558-1-git-send-email-mhocko@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,
the following 6 patches should bring some order to the very rare cases
of an mm shared between processes, and finally make the paths which
bypass the oom killer oom reapable and therefore much more reliable.
Even though an mm shared outside of a thread group is rare (either
use_mm by kernel threads or an exotic clone(CLONE_VM) without
CLONE_THREAD or CLONE_SIGHAND), it makes the current oom killer logic
quite hard to follow and evaluate. It is possible to select an oom
victim which shares the mm with an unkillable process, or to bypass the
oom killer even when other processes sharing the mm are still alive,
along with other weird cases.

Patch 1 optimizes oom_kill_task to skip the costly process iteration
when the current oom victim is not sharing its mm with other processes.
Patch 2 is a clean up of the oom_score_adj handling and preparatory
work. Patch 3 enforces oom_score_adj to be consistent between processes
sharing the mm, so that they behave consistently with regular thread
groups. Patch 4 tries to handle vforked tasks better in the oom path,
patch 5 ensures that all tasks sharing the mm are killed, and finally
patch 6 should guarantee that task_will_free_mem always implies a
reapable bypass of the oom killer.
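
To make the sharing cases above concrete, here is a minimal userspace
sketch. It is only a toy model under stated assumptions, not code from
the patches: struct toy_task and both toy_* helpers are hypothetical
names invented for illustration, and an integer mm_id stands in for the
kernel's mm pointer. The first helper answers the question behind patch
1 (does anybody outside the victim's thread group share its mm?), the
second mimics the intent of patch 3 (one oom_score_adj view per mm).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process; NOT the kernel's task_struct. */
struct toy_task {
	int tgid;		/* thread group id */
	int mm_id;		/* identifies the address space (assumption) */
	int oom_score_adj;	/* per-process oom adjustment */
};

/* True if a task from another thread group uses the victim's mm. */
static bool toy_mm_shared_externally(const struct toy_task *tasks,
				     size_t n, size_t victim)
{
	for (size_t i = 0; i < n; i++) {
		if (i == victim)
			continue;
		if (tasks[i].mm_id == tasks[victim].mm_id &&
		    tasks[i].tgid != tasks[victim].tgid)
			return true;
	}
	return false;
}

/*
 * Apply value to the target and to every task sharing its mm.
 * Returns the number of tasks that were updated.
 */
static size_t toy_set_oom_score_adj(struct toy_task *tasks, size_t n,
				    size_t target, int value)
{
	size_t updated = 0;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].mm_id == tasks[target].mm_id) {
			tasks[i].oom_score_adj = value;
			updated++;
		}
	}
	return updated;
}
```

In this model, a victim for which toy_mm_shared_externally() returns
false is the case where the walk over all processes can be skipped
entirely. How the kernel detects that case cheaply is up to the actual
patch and is not modelled here.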

The patchset is based on the current mmotm tree
(mmotm-2016-05-23-16-51). I would really appreciate a deep review as
this area is full of land mines, but I hope I've made the code much
cleaner with fewer kludges.

I am CCing Oleg (sorry, I know you hate this code) but I would feel
much better if you double checked my assumptions about locking and
vfork behavior.

Michal Hocko (6):
      mm, oom: do not loop over all tasks if there are no external tasks sharing mm
      proc, oom_adj: extract oom_score_adj setting into a helper
      mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj
      mm, oom: skip over vforked tasks
      mm, oom: kill all tasks sharing the mm
      mm, oom: fortify task_will_free_mem

 fs/proc/base.c      | 168 +++++++++++++++++++++++++++++----------------------
 include/linux/mm.h  |   2 +
 include/linux/oom.h |  72 ++++++++++++++++++++--
 mm/memcontrol.c     |   4 +-
 mm/oom_kill.c       |  96 ++++++++++--------------------
 5 files changed, 196 insertions(+), 146 deletions(-)