From: Michal Hocko <mhocko@kernel.org> To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically Date: Wed, 14 Dec 2016 11:34:18 +0100 [thread overview] Message-ID: <20161214103418.GH25573@dhcp22.suse.cz> (raw) In-Reply-To: <20161212084837.GB18163@dhcp22.suse.cz> On Mon 12-12-16 09:48:37, Michal Hocko wrote: > On Sun 11-12-16 20:23:47, Tetsuo Handa wrote: [...] > > I believe that __GFP_NOFAIL should not imply invocation of the OOM killer. > > Therefore, I want to change __GFP_NOFAIL not to invoke the OOM killer. > > But since currently the OOM killer is not invoked unless either __GFP_FS or > > __GFP_NOFAIL is specified, changing __GFP_NOFAIL not to invoke the OOM > > killer introduces e.g. GFP_NOFS | __GFP_NOFAIL users a risk of livelocking > > by not invoking the OOM killer. Although I can't prove that this change > > never causes livelock, I don't want to provide an alternative flag like > > __GFP_WANT_OOM_KILLER. Therefore, all existing __GFP_NOFAIL users must > > agree with accepting the risk introduced by this change. > > I think you are seriously misled here. First of all, I have gone through > GFP_NOFS | GFP_NOFAIL users and _none_ of them have added the nofail > flag to enforce the OOM killer. Those users just want to express that an > allocation failure is simply not acceptable. Most of them were simply > conversions from the open-conded > do { } while (! (page = page_alloc(GFP_NOFS)); > loops. Which _does_ not invoke the OOM killer. And that is the most > importatnt point here. Why the above open coded (and as you say lockup > prone) loop is OK while GFP_NOFAIL varian should behave any differently? > > > and confirm that all existing __GFP_NOFAIL users are willing to accept > > the risk of livelocking by not invoking the OOM killer. > > > > Unless you do this procedure, I continue: > > > > Nacked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > > I was hoping for some actual arguments but I am afraid this is circling > in a loop. You are still handwaving with theoretical lockups without any > actual proof they are real. While I am not saying the risk is not there > I also say that there are other aspects to consider > - lockups will happen only if there are no other GFP_FS requests > which trigger the OOM which is quite unlikely in most > situations > - triggering oom for GFP_NOFS | GFP_NOFAIL has a non negligible > risk of pre-mature OOM killer invocation for the same reason > we do not trigger oom for GFP_NOFS. Even worse metadata heavy > workloads are much harder to contain so this might be used as > a DoS vector. > - one of the primary point of GFP_NOFAIL existence is to prevent > from open coding endless loops in the code because the page > allocator can handle most situations more gracefully (e.g. > grant access to memory reserves). Having a completely > different OOM killer behavior is both confusing and encourages > abuse. If we have users who definitely need to control the OOM > behavior then we should add a gfp flag for them. But this > needs a strong use case and consider whether there are other > options to go around that. > > I can add the above to the changelog if you think this is helpful but I > still maintain my position that your "this might cause lockups > theoretically" is unfounded and not justified to block the patch. I will > of course retract this patch if you can demonstrate the issue is real or > that any of my argumentation in the changelog is not correct. I was thinking about this some more and realized that there is a different risk which this patch would introduce and have to be considered. Heavy GFP_NOFS | __GFP_NOFAIL users might actually deplete memory reserves. This was less of a problem with the current code because we invoke the oom killer and so at least _some_ memory might be freed. I will think about it some more but I guess I will just allow a partial access in the no-oom case. I will post the patch 1 in the meantime because I believe this is a reasonable cleanup. -- Michal Hocko SUSE Labs
WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org> To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically Date: Wed, 14 Dec 2016 11:34:18 +0100 [thread overview] Message-ID: <20161214103418.GH25573@dhcp22.suse.cz> (raw) In-Reply-To: <20161212084837.GB18163@dhcp22.suse.cz> On Mon 12-12-16 09:48:37, Michal Hocko wrote: > On Sun 11-12-16 20:23:47, Tetsuo Handa wrote: [...] > > I believe that __GFP_NOFAIL should not imply invocation of the OOM killer. > > Therefore, I want to change __GFP_NOFAIL not to invoke the OOM killer. > > But since currently the OOM killer is not invoked unless either __GFP_FS or > > __GFP_NOFAIL is specified, changing __GFP_NOFAIL not to invoke the OOM > > killer introduces e.g. GFP_NOFS | __GFP_NOFAIL users a risk of livelocking > > by not invoking the OOM killer. Although I can't prove that this change > > never causes livelock, I don't want to provide an alternative flag like > > __GFP_WANT_OOM_KILLER. Therefore, all existing __GFP_NOFAIL users must > > agree with accepting the risk introduced by this change. > > I think you are seriously misled here. First of all, I have gone through > GFP_NOFS | GFP_NOFAIL users and _none_ of them have added the nofail > flag to enforce the OOM killer. Those users just want to express that an > allocation failure is simply not acceptable. Most of them were simply > conversions from the open-conded > do { } while (! (page = page_alloc(GFP_NOFS)); > loops. Which _does_ not invoke the OOM killer. And that is the most > importatnt point here. Why the above open coded (and as you say lockup > prone) loop is OK while GFP_NOFAIL varian should behave any differently? > > > and confirm that all existing __GFP_NOFAIL users are willing to accept > > the risk of livelocking by not invoking the OOM killer. > > > > Unless you do this procedure, I continue: > > > > Nacked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > > I was hoping for some actual arguments but I am afraid this is circling > in a loop. You are still handwaving with theoretical lockups without any > actual proof they are real. While I am not saying the risk is not there > I also say that there are other aspects to consider > - lockups will happen only if there are no other GFP_FS requests > which trigger the OOM which is quite unlikely in most > situations > - triggering oom for GFP_NOFS | GFP_NOFAIL has a non negligible > risk of pre-mature OOM killer invocation for the same reason > we do not trigger oom for GFP_NOFS. Even worse metadata heavy > workloads are much harder to contain so this might be used as > a DoS vector. > - one of the primary point of GFP_NOFAIL existence is to prevent > from open coding endless loops in the code because the page > allocator can handle most situations more gracefully (e.g. > grant access to memory reserves). Having a completely > different OOM killer behavior is both confusing and encourages > abuse. If we have users who definitely need to control the OOM > behavior then we should add a gfp flag for them. But this > needs a strong use case and consider whether there are other > options to go around that. > > I can add the above to the changelog if you think this is helpful but I > still maintain my position that your "this might cause lockups > theoretically" is unfounded and not justified to block the patch. I will > of course retract this patch if you can demonstrate the issue is real or > that any of my argumentation in the changelog is not correct. I was thinking about this some more and realized that there is a different risk which this patch would introduce and have to be considered. Heavy GFP_NOFS | __GFP_NOFAIL users might actually deplete memory reserves. This was less of a problem with the current code because we invoke the oom killer and so at least _some_ memory might be freed. I will think about it some more but I guess I will just allow a partial access in the no-oom case. I will post the patch 1 in the meantime because I believe this is a reasonable cleanup. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-12-14 10:34 UTC|newest] Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top 2016-12-01 15:25 [PATCH 0/2] GFP_NOFAIL cleanups Michal Hocko 2016-12-01 15:25 ` Michal Hocko 2016-12-01 15:25 ` [PATCH 1/2] mm: consolidate GFP_NOFAIL checks in the allocator slowpath Michal Hocko 2016-12-01 15:25 ` Michal Hocko 2016-12-01 15:25 ` [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically Michal Hocko 2016-12-01 15:25 ` Michal Hocko 2016-12-02 7:23 ` Vlastimil Babka 2016-12-02 7:23 ` Vlastimil Babka 2016-12-05 13:45 ` Tetsuo Handa 2016-12-05 13:45 ` Tetsuo Handa 2016-12-05 14:10 ` Michal Hocko 2016-12-05 14:10 ` Michal Hocko 2016-12-06 8:27 ` Michal Hocko 2016-12-06 8:27 ` Michal Hocko 2016-12-06 10:38 ` Tetsuo Handa 2016-12-06 10:38 ` Tetsuo Handa 2016-12-06 11:03 ` Vlastimil Babka 2016-12-06 11:03 ` Vlastimil Babka 2016-12-06 19:25 ` Michal Hocko 2016-12-06 19:25 ` Michal Hocko 2016-12-06 19:22 ` Michal Hocko 2016-12-06 19:22 ` Michal Hocko 2016-12-08 12:53 ` Tetsuo Handa 2016-12-08 12:53 ` Tetsuo Handa 2016-12-08 13:47 ` Michal Hocko 2016-12-08 13:47 ` Michal Hocko 2016-12-11 11:23 ` Tetsuo Handa 2016-12-11 11:23 ` Tetsuo Handa 2016-12-11 13:53 ` Tetsuo Handa 2016-12-11 13:53 ` Tetsuo Handa 2016-12-12 8:52 ` Michal Hocko 2016-12-12 8:52 ` Michal Hocko 2016-12-12 8:48 ` Michal Hocko 2016-12-12 8:48 ` Michal Hocko 2016-12-14 10:34 ` Michal Hocko [this message] 2016-12-14 10:34 ` Michal Hocko 2016-12-16 7:39 OOM: Better, but still there on 4.9 Michal Hocko 2016-12-16 15:58 ` OOM: Better, but still there on Michal Hocko 2016-12-16 15:58 ` [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically Michal Hocko 2016-12-16 15:58 ` Michal Hocko 2016-12-16 17:31 ` Johannes Weiner 2016-12-16 17:31 ` Johannes Weiner 2016-12-16 22:12 ` Michal Hocko 2016-12-16 22:12 ` Michal Hocko 2016-12-17 11:17 ` Tetsuo Handa 2016-12-17 11:17 ` Tetsuo Handa 2016-12-18 16:37 ` Michal Hocko 2016-12-18 16:37 ` Michal Hocko
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20161214103418.GH25573@dhcp22.suse.cz \ --to=mhocko@kernel.org \ --cc=akpm@linux-foundation.org \ --cc=hannes@cmpxchg.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mgorman@suse.de \ --cc=penguin-kernel@I-love.SAKURA.ne.jp \ --cc=rientjes@google.com \ --cc=vbabka@suse.cz \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.