From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933923AbcLTPb4 (ORCPT ); Tue, 20 Dec 2016 10:31:56 -0500
Received: from www262.sakura.ne.jp ([202.181.97.72]:45090 "EHLO
	www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S933423AbcLTPbx (ORCPT );
	Tue, 20 Dec 2016 10:31:53 -0500
To: mhocko@kernel.org, akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, rientjes@google.com, mgorman@suse.de,
	hillf.zj@alibaba-inc.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, mhocko@suse.com
Subject: Re: [PATCH 2/3] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically
From: Tetsuo Handa
References: <20161220134904.21023-1-mhocko@kernel.org>
	<20161220134904.21023-3-mhocko@kernel.org>
In-Reply-To: <20161220134904.21023-3-mhocko@kernel.org>
Message-Id: <201612210031.BFD48914.VMtHSFFJOLQFOO@I-love.SAKURA.ne.jp>
X-Mailer: Winbiff [Version 2.51 PL2]
X-Accept-Language: ja,en,zh
Date: Wed, 21 Dec 2016 00:31:47 +0900
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Michal Hocko wrote:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c8eed66d8abb..2dda7c3eba52 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3098,32 +3098,31 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>  	if (page)
>  		goto out;
>  
> -	if (!(gfp_mask & __GFP_NOFAIL)) {
> -		/* Coredumps can quickly deplete all memory reserves */
> -		if (current->flags & PF_DUMPCORE)
> -			goto out;
> -		/* The OOM killer will not help higher order allocs */
> -		if (order > PAGE_ALLOC_COSTLY_ORDER)
> -			goto out;
> -		/* The OOM killer does not needlessly kill tasks for lowmem */
> -		if (ac->high_zoneidx < ZONE_NORMAL)
> -			goto out;
> -		if (pm_suspended_storage())
> -			goto out;
> -		/*
> -		 * XXX: GFP_NOFS allocations should rather fail than rely on
> -		 * other request to make a forward progress.
> -		 * We are in an unfortunate situation where out_of_memory cannot
> -		 * do much for this context but let's try it to at least get
> -		 * access to memory reserved if the current task is killed (see
> -		 * out_of_memory). Once filesystems are ready to handle allocation
> -		 * failures more gracefully we should just bail out here.
> -		 */
> +	/* Coredumps can quickly deplete all memory reserves */
> +	if (current->flags & PF_DUMPCORE)
> +		goto out;
> +	/* The OOM killer will not help higher order allocs */
> +	if (order > PAGE_ALLOC_COSTLY_ORDER)
> +		goto out;
> +	/* The OOM killer does not needlessly kill tasks for lowmem */
> +	if (ac->high_zoneidx < ZONE_NORMAL)
> +		goto out;
> +	if (pm_suspended_storage())
> +		goto out;
> +	/*
> +	 * XXX: GFP_NOFS allocations should rather fail than rely on
> +	 * other request to make a forward progress.
> +	 * We are in an unfortunate situation where out_of_memory cannot
> +	 * do much for this context but let's try it to at least get
> +	 * access to memory reserved if the current task is killed (see
> +	 * out_of_memory). Once filesystems are ready to handle allocation
> +	 * failures more gracefully we should just bail out here.
> +	 */
> +
> +	/* The OOM killer may not free memory on a specific node */
> +	if (gfp_mask & __GFP_THISNODE)
> +		goto out;
>  
> -		/* The OOM killer may not free memory on a specific node */
> -		if (gfp_mask & __GFP_THISNODE)
> -			goto out;
> -	}
>  	/* Exhausted what can be done so it's blamo time */
>  	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
>  		*did_some_progress = 1;

Why do we need to change this part in this patch?

This change silently prohibits invoking the OOM killer for e.g. costly
GFP_KERNEL allocations. While it would be better if vmalloc() could be
used, there might be users who cannot accept vmalloc() as a fallback
(e.g. CONFIG_MMU=n where vmalloc() == kmalloc() ?).

This change is not "do not enforce OOM killer automatically" but
"never allow OOM killer". No exception is allowed. If we change this
part, the title for this part should be something strong like
"mm,oom: Never allow OOM killer for coredumps, costly allocations,
lowmem etc.".
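
To make the behavioural difference concrete, here is a minimal userspace
sketch of the two decision paths as I read the quoted hunk. It is not
kernel code: struct sketch_req, the SKETCH_* constants and all function
names are hypothetical stand-ins, and only the checks visible in the diff
are modelled (the flag bit values are arbitrary; the costly-order
threshold mirrors PAGE_ALLOC_COSTLY_ORDER, which is 3).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flags; bit values are arbitrary. */
#define SKETCH_GFP_NOFAIL   0x1u
#define SKETCH_GFP_THISNODE 0x2u
/* Mirrors PAGE_ALLOC_COSTLY_ORDER (3 in the kernel). */
#define SKETCH_COSTLY_ORDER 3

/* Hypothetical summary of the state the quoted checks look at. */
struct sketch_req {
	unsigned int gfp_mask;
	unsigned int order;
	bool dumping_core;      /* current->flags & PF_DUMPCORE */
	bool lowmem_zone;       /* ac->high_zoneidx < ZONE_NORMAL */
	bool storage_suspended; /* pm_suspended_storage() */
};

/* True if any of the "goto out" checks from the hunk would trigger. */
bool sketch_bails_out(const struct sketch_req *r)
{
	return r->dumping_core ||
	       r->order > SKETCH_COSTLY_ORDER ||
	       r->lowmem_zone ||
	       r->storage_suspended ||
	       (r->gfp_mask & SKETCH_GFP_THISNODE) != 0;
}

/* Old code: the checks are skipped entirely when __GFP_NOFAIL is set. */
bool sketch_reaches_oom_before(const struct sketch_req *r)
{
	if (!(r->gfp_mask & SKETCH_GFP_NOFAIL) && sketch_bails_out(r))
		return false;
	return true;
}

/* Patched code: the checks apply to every request, __GFP_NOFAIL or not. */
bool sketch_reaches_oom_after(const struct sketch_req *r)
{
	return !sketch_bails_out(r);
}
```

In this model, a request with the NOFAIL bit set and order 4 reaches
out_of_memory() in the old code but not in the patched code, while the
same request without the NOFAIL bit bails out in both. That is the case
the objection above is about.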