From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753393AbcFFW1j (ORCPT <rfc822;w@1wt.eu>);
	Mon, 6 Jun 2016 18:27:39 -0400
Received: from mail-pf0-f169.google.com ([209.85.192.169]:34930 "EHLO
	mail-pf0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751874AbcFFW1i (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Mon, 6 Jun 2016 18:27:38 -0400
Date: Mon, 6 Jun 2016 15:27:34 -0700 (PDT)
From: David Rientjes <rientjes@google.com>
X-X-Sender: rientjes@chino.kir.corp.google.com
To: Michal Hocko <mhocko@kernel.org>
cc: linux-mm@kvack.org, Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
        Oleg Nesterov <oleg@redhat.com>,
        Vladimir Davydov <vdavydov@parallels.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        LKML <linux-kernel@vger.kernel.org>, Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH 06/10] mm, oom: kill all tasks sharing the mm
In-Reply-To: <1464945404-30157-7-git-send-email-mhocko@kernel.org>
Message-ID: <alpine.DEB.2.10.1606061526440.18843@chino.kir.corp.google.com>
References: <1464945404-30157-1-git-send-email-mhocko@kernel.org> <1464945404-30157-7-git-send-email-mhocko@kernel.org>
User-Agent: Alpine 2.10 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 3 Jun 2016, Michal Hocko wrote:

> From: Michal Hocko <mhocko@suse.com>
> 
> Currently oom_kill_process skips both the oom reaper and SIGKILL if a
> process sharing the same mm is unkillable via OOM_SCORE_ADJ_MIN. After "mm,
> oom_adj: make sure processes sharing mm have same view of oom_score_adj"
> all such processes are sharing the same value so we shouldn't see such a
> task at all (oom_badness would rule them out).
> 
> We can still encounter an oom-disabled vforked task, which has to be
> killed as well if we want the other tasks sharing the mm to be reapable,
> because it can access the memory before doing exec. Killing such a task
> should be acceptable: it is highly unlikely to have done anything useful,
> since it cannot modify any memory before it calls exec. An
> alternative would be to keep the task alive and skip the oom reaper and
> risk all the weird corner cases where the OOM killer cannot make forward
> progress because the oom victim hung somewhere on the way to exit.
> 
> There is a potential race, highly unlikely but possible, where we kill
> an oom-disabled task. It would happen if __set_oom_adj raced with
> select_bad_process, in which case it is OK to consider the old value, or
> with fork, in which case it should be acceptable as well.
> Let's add a little note to the log so that people can tell us whether
> this really happens in real life and whether it matters.
> 

We cannot kill oom-disabled processes at all, little race or otherwise.
We'd rather panic the system than oom kill these processes, and those are
the semantics that the user is basing their decision on.  We cannot suddenly
start allowing them to be SIGKILL'd.