From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1161124AbcE3KyA (ORCPT ); Mon, 30 May 2016 06:54:00 -0400
Received: from mail-wm0-f49.google.com ([74.125.82.49]:33787 "EHLO
	mail-wm0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932065AbcE3Kx7 (ORCPT ); Mon, 30 May 2016 06:53:59 -0400
Date: Mon, 30 May 2016 12:53:56 +0200
From: Michal Hocko
To: Vladimir Davydov
Cc: linux-mm@kvack.org, Tetsuo Handa, David Rientjes, Oleg Nesterov,
	Andrew Morton, LKML
Subject: Re: [PATCH 4/6] mm, oom: skip over vforked tasks
Message-ID: <20160530105356.GP22928@dhcp22.suse.cz>
References: <1464266415-15558-1-git-send-email-mhocko@kernel.org>
	<1464266415-15558-5-git-send-email-mhocko@kernel.org>
	<20160527164830.GF26059@esperanza>
	<20160530071357.GE22928@dhcp22.suse.cz>
	<20160530095212.GO22928@dhcp22.suse.cz>
	<20160530104017.GB8293@esperanza>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160530104017.GB8293@esperanza>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 30-05-16 13:40:17, Vladimir Davydov wrote:
> On Mon, May 30, 2016 at 11:52:12AM +0200, Michal Hocko wrote:
> > On Mon 30-05-16 09:13:57, Michal Hocko wrote:
> > > On Fri 27-05-16 19:48:30, Vladimir Davydov wrote:
> > > > On Thu, May 26, 2016 at 02:40:13PM +0200, Michal Hocko wrote:
> > > [...]
> > > > > @@ -839,6 +841,13 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
> > > > >  	for_each_process(p) {
> > > > >  		if (!process_shares_mm(p, mm))
> > > > >  			continue;
> > > > > +		/*
> > > > > +		 * vforked tasks are ignored because they will drop the mm soon
> > > > > +		 * hopefully and even if not they will not mind being oom
> > > > > +		 * reaped because they cannot touch any memory.
> > > >
> > > > They shouldn't modify memory, but they still can touch it AFAIK.
> > >
> > > You are right. This means that the vforked child might see zero pages.
> > > Let me think whether this is acceptable or not.
> >
> > OK, I was thinking about it some more and I think you have a good point
> > here. I can see two options here:
> > - keep vforked task alive and skip the oom reaper. If the victim exits
> >   normally and the oom wouldn't get resolved the vforked task will be
> >   selected in the next round because the victim would clean up
> >   vfork_done state in wait_for_vfork_done. We are still risking that
> >   the victim gets stuck though
> > - kill vforked task and so it would be reapable.
>
> IMHO it all depends on what we're trying to achieve. If we want per task
> oom, which could make some sense since a task can consume a lot of mem
> via e.g. pipe buffers, we would go with option #1. However, it's rather
> difficult to find out how much of kmem a task consumes w/o using kmemcg,
> so IMHO per-mm approach makes more sense in general. In this case I
> think we should kill both vforked task and its parent if their mm was
> selected provided their oom_score_adj allows that.

Yes, agreed. Going with per-mm is the safer behavior because the vast
majority of the consumed memory should be per mm, not per task_struct.
-- 
Michal Hocko
SUSE Labs
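
For illustration only, the per-mm policy agreed on above can be sketched as
a small userspace model. This is not kernel code and not the patch itself:
struct toy_task, its fields and kill_mm_sharers() are hypothetical names
made up for this sketch. Only the value of OOM_SCORE_ADJ_MIN (-1000) is
taken from the kernel, and treating it as "never kill" is the assumption
behind "provided their oom_score_adj allows that".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel constant; assumed here to mean "never kill". */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical stand-in for task_struct; every field is illustrative. */
struct toy_task {
	int pid;
	int mm_id;		/* tasks with an equal mm_id share one address space */
	int oom_score_adj;
	bool in_vfork;		/* child still borrowing its parent's mm */
	bool killed;
};

/*
 * Per-mm policy: once an mm has been selected, mark every task sharing it
 * as killed unless its oom_score_adj forbids that. in_vfork is deliberately
 * not checked, so a vforked child is killed together with its parent.
 * Returns the number of tasks newly marked as killed.
 */
static size_t kill_mm_sharers(struct toy_task *tasks, size_t n, int victim_mm)
{
	size_t killed = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct toy_task *t = &tasks[i];

		if (t->mm_id != victim_mm || t->killed)
			continue;	/* different mm, or already handled */
		if (t->oom_score_adj == OOM_SCORE_ADJ_MIN)
			continue;	/* protected from the OOM killer */
		t->killed = true;
		killed++;
	}
	return killed;
}
```

With a parent and its vforked child sharing one mm_id, a single call marks
both, while a sharer set to OOM_SCORE_ADJ_MIN and tasks on other mms are
left alone.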