All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org,
	mgorman@suse.de, rientjes@google.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically
Date: Mon, 5 Dec 2016 15:10:10 +0100	[thread overview]
Message-ID: <20161205141009.GJ30758@dhcp22.suse.cz> (raw)
In-Reply-To: <201612052245.HDB21880.OHJMOOQFFSVLtF@I-love.SAKURA.ne.jp>

On Mon 05-12-16 22:45:19, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > __alloc_pages_may_oom makes sure to skip the OOM killer depending on
> > the allocation request. This includes lowmem requests, costly high
> > order requests and others. For a long time __GFP_NOFAIL acted as an
> > override for all those rules. This is not documented and it can be quite
> > surprising as well. E.g. GFP_NOFS requests are not invoking the OOM
> > killer but GFP_NOFS|__GFP_NOFAIL does so if we try to convert some of
> > the existing open coded loops around allocator to nofail request (and we
> > have done that in the past) then such a change would have a non trivial
> > side effect which is not obvious. Note that the primary motivation for
> > skipping the OOM killer is to prevent from pre-mature invocation.
> > 
> > The exception has been added by 82553a937f12 ("oom: invoke oom killer
> > for __GFP_NOFAIL"). The changelog points out that the oom killer has to
> > be invoked otherwise the request would be looping for ever. But this
> > argument is rather weak because the OOM killer doesn't really guarantee
> > any forward progress for those exceptional cases - e.g. it will hardly
> > help to form costly order - I believe we certainly do not want to kill
> > all processes and eventually panic the system just because there is a
> > nasty driver asking for order-9 page with GFP_NOFAIL not realizing all
> > the consequences - it is much better this request would loop for ever
> > than the massive system disruption, lowmem is also highly unlikely to be
> > freed during OOM killer and GFP_NOFS request could trigger while there
> > is still a lot of memory pinned by filesystems.
> 
> I disagree. I believe that panic caused by OOM killer is much much better
> than a locked up system. I hate to add new locations that can lockup inside
> page allocator. This is __GFP_NOFAIL and reclaim has failed.

As a matter of fact any __GFP_NOFAIL can lockup inside the allocator.
Full stop. There is no guaranteed way to make a forward progress with
the current page allocator implementation.

So we are somewhere in the middle between pre-mature and pointless
system disruption (GFP_NOFS with a lots of metadata or lowmem request)
where the OOM killer even might not help and potential lockup which is
inevitable with the current design. Dunno about you but I would rather
go with the first option. To be honest I really fail to understand your
line of argumentation. We have this
	do {
		cond_resched();
	} (page = alloc_page(GFP_NOFS));
vs.
	page = alloc_page(GFP_NOFS | __GFP_NOFAIL);

the first one doesn't invoke OOM killer while the later does. This
discrepancy just cannot make any sense... The same is true for

	alloc_page(GFP_DMA) vs alloc_page(GFP_DMA|__GFP_NOFAIL)

Now we can discuss whether it is a _good_ idea to not invoke OOM killer
for those exceptions but whatever we do __GFP_NOFAIL is not a way to
give such a subtle side effect. Or do you disagree even with that?
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org,
	mgorman@suse.de, rientjes@google.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically
Date: Mon, 5 Dec 2016 15:10:10 +0100	[thread overview]
Message-ID: <20161205141009.GJ30758@dhcp22.suse.cz> (raw)
In-Reply-To: <201612052245.HDB21880.OHJMOOQFFSVLtF@I-love.SAKURA.ne.jp>

On Mon 05-12-16 22:45:19, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > __alloc_pages_may_oom makes sure to skip the OOM killer depending on
> > the allocation request. This includes lowmem requests, costly high
> > order requests and others. For a long time __GFP_NOFAIL acted as an
> > override for all those rules. This is not documented and it can be quite
> > surprising as well. E.g. GFP_NOFS requests are not invoking the OOM
> > killer but GFP_NOFS|__GFP_NOFAIL does so if we try to convert some of
> > the existing open coded loops around allocator to nofail request (and we
> > have done that in the past) then such a change would have a non trivial
> > side effect which is not obvious. Note that the primary motivation for
> > skipping the OOM killer is to prevent from pre-mature invocation.
> > 
> > The exception has been added by 82553a937f12 ("oom: invoke oom killer
> > for __GFP_NOFAIL"). The changelog points out that the oom killer has to
> > be invoked otherwise the request would be looping for ever. But this
> > argument is rather weak because the OOM killer doesn't really guarantee
> > any forward progress for those exceptional cases - e.g. it will hardly
> > help to form costly order - I believe we certainly do not want to kill
> > all processes and eventually panic the system just because there is a
> > nasty driver asking for order-9 page with GFP_NOFAIL not realizing all
> > the consequences - it is much better this request would loop for ever
> > than the massive system disruption, lowmem is also highly unlikely to be
> > freed during OOM killer and GFP_NOFS request could trigger while there
> > is still a lot of memory pinned by filesystems.
> 
> I disagree. I believe that panic caused by OOM killer is much much better
> than a locked up system. I hate to add new locations that can lockup inside
> page allocator. This is __GFP_NOFAIL and reclaim has failed.

As a matter of fact any __GFP_NOFAIL can lockup inside the allocator.
Full stop. There is no guaranteed way to make a forward progress with
the current page allocator implementation.

So we are somewhere in the middle between pre-mature and pointless
system disruption (GFP_NOFS with a lots of metadata or lowmem request)
where the OOM killer even might not help and potential lockup which is
inevitable with the current design. Dunno about you but I would rather
go with the first option. To be honest I really fail to understand your
line of argumentation. We have this
	do {
		cond_resched();
	} (page = alloc_page(GFP_NOFS));
vs.
	page = alloc_page(GFP_NOFS | __GFP_NOFAIL);

the first one doesn't invoke OOM killer while the later does. This
discrepancy just cannot make any sense... The same is true for

	alloc_page(GFP_DMA) vs alloc_page(GFP_DMA|__GFP_NOFAIL)

Now we can discuss whether it is a _good_ idea to not invoke OOM killer
for those exceptions but whatever we do __GFP_NOFAIL is not a way to
give such a subtle side effect. Or do you disagree even with that?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-12-05 14:10 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-01 15:25 [PATCH 0/2] GFP_NOFAIL cleanups Michal Hocko
2016-12-01 15:25 ` Michal Hocko
2016-12-01 15:25 ` [PATCH 1/2] mm: consolidate GFP_NOFAIL checks in the allocator slowpath Michal Hocko
2016-12-01 15:25   ` Michal Hocko
2016-12-01 15:25 ` [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically Michal Hocko
2016-12-01 15:25   ` Michal Hocko
2016-12-02  7:23   ` Vlastimil Babka
2016-12-02  7:23     ` Vlastimil Babka
2016-12-05 13:45   ` Tetsuo Handa
2016-12-05 13:45     ` Tetsuo Handa
2016-12-05 14:10     ` Michal Hocko [this message]
2016-12-05 14:10       ` Michal Hocko
2016-12-06  8:27       ` Michal Hocko
2016-12-06  8:27         ` Michal Hocko
2016-12-06 10:38       ` Tetsuo Handa
2016-12-06 10:38         ` Tetsuo Handa
2016-12-06 11:03         ` Vlastimil Babka
2016-12-06 11:03           ` Vlastimil Babka
2016-12-06 19:25           ` Michal Hocko
2016-12-06 19:25             ` Michal Hocko
2016-12-06 19:22         ` Michal Hocko
2016-12-06 19:22           ` Michal Hocko
2016-12-08 12:53           ` Tetsuo Handa
2016-12-08 12:53             ` Tetsuo Handa
2016-12-08 13:47             ` Michal Hocko
2016-12-08 13:47               ` Michal Hocko
2016-12-11 11:23               ` Tetsuo Handa
2016-12-11 11:23                 ` Tetsuo Handa
2016-12-11 13:53                 ` Tetsuo Handa
2016-12-11 13:53                   ` Tetsuo Handa
2016-12-12  8:52                   ` Michal Hocko
2016-12-12  8:52                     ` Michal Hocko
2016-12-12  8:48                 ` Michal Hocko
2016-12-12  8:48                   ` Michal Hocko
2016-12-14 10:34                   ` Michal Hocko
2016-12-14 10:34                     ` Michal Hocko
2016-12-16  7:39 OOM: Better, but still there on 4.9 Michal Hocko
2016-12-16 15:58 ` OOM: Better, but still there on Michal Hocko
2016-12-16 15:58   ` [PATCH 2/2] mm, oom: do not enfore OOM killer for __GFP_NOFAIL automatically Michal Hocko
2016-12-16 15:58     ` Michal Hocko
2016-12-16 17:31     ` Johannes Weiner
2016-12-16 17:31       ` Johannes Weiner
2016-12-16 22:12       ` Michal Hocko
2016-12-16 22:12         ` Michal Hocko
2016-12-17 11:17         ` Tetsuo Handa
2016-12-17 11:17           ` Tetsuo Handa
2016-12-18 16:37           ` Michal Hocko
2016-12-18 16:37             ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161205141009.GJ30758@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.