All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Michal Hocko <mhocko@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Huang Ying <ying.huang@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Theodore Ts'o" <tytso@mit.edu>
Subject: Re: [patch 00/12] mm: page_alloc: improve OOM mechanism and policy
Date: Tue, 7 Apr 2015 10:18:22 -0400	[thread overview]
Message-ID: <20150407141822.GA3262@cmpxchg.org> (raw)
In-Reply-To: <20150401151920.GB23824@dhcp22.suse.cz>

On Wed, Apr 01, 2015 at 05:19:20PM +0200, Michal Hocko wrote:
> On Mon 30-03-15 11:32:40, Dave Chinner wrote:
> > On Fri, Mar 27, 2015 at 11:05:09AM -0400, Johannes Weiner wrote:
> [...]
> > > GFP_NOFS sites are currently one of the sites that can deadlock inside
> > > the allocator, even though many of them seem to have fallback code.
> > > My reasoning here is that if you *have* an exit strategy for failing
> > > allocations that is smarter than hanging, we should probably use that.
> > 
> > We already do that for allocations where we can handle failure in
> > GFP_NOFS conditions. It is, however, somewhat useless if we can't
> > tell the allocator to try really hard if we've already had a failure
> > and we are already in memory reclaim conditions (e.g. a shrinker
> > trying to clean dirty objects so they can be reclaimed).
> > 
> > From that perspective, I think that this patch set aims force us
> > away from handling fallbacks ourselves because a) it makes GFP_NOFS
> > more likely to fail, and b) provides no mechanism to "try harder"
> > when we really need the allocation to succeed.
> 
> You can ask for this "try harder" by __GFP_HIGH flag. Would that help
> in your fallback case?

I would think __GFP_REPEAT would be more suitable here.  From the doc:

 * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
 * _might_ fail.  This depends upon the particular VM implementation.

so we can make the semantics of GFP_NOFS | __GFP_REPEAT such that they
are allowed to use the OOM killer and dip into the OOM reserves.

My question here would be: are there any NOFS allocations that *don't*
want this behavior?  Does it even make sense to require this separate
annotation or should we just make it the default?

The argument here was always that NOFS allocations are very limited in
their reclaim powers and will trigger OOM prematurely.  However, the
way we limit dirty memory these days forces most cache to be clean at
all times, and direct reclaim in general hasn't been allowed to issue
page writeback for quite some time.  So these days, NOFS reclaim isn't
really weaker than regular direct reclaim.  The only exception is that
it might block writeback, so we'd go OOM if the only reclaimables left
were dirty pages against that filesystem.  That should be acceptable.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 47981c5e54c3..fe3cb2b0b85b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2367,16 +2367,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 		/* The OOM killer does not needlessly kill tasks for lowmem */
 		if (ac->high_zoneidx < ZONE_NORMAL)
 			goto out;
-		/* The OOM killer does not compensate for IO-less reclaim */
-		if (!(gfp_mask & __GFP_FS)) {
-			/*
-			 * XXX: Page reclaim didn't yield anything,
-			 * and the OOM killer can't be invoked, but
-			 * keep looping as per tradition.
-			 */
-			*did_some_progress = 1;
-			goto out;
-		}
 		if (pm_suspended_storage())
 			goto out;
 		/* The OOM killer may not free memory on a specific node */

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Michal Hocko <mhocko@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	Huang Ying <ying.huang@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Theodore Ts'o <tytso@mit.edu>
Subject: Re: [patch 00/12] mm: page_alloc: improve OOM mechanism and policy
Date: Tue, 7 Apr 2015 10:18:22 -0400	[thread overview]
Message-ID: <20150407141822.GA3262@cmpxchg.org> (raw)
In-Reply-To: <20150401151920.GB23824@dhcp22.suse.cz>

On Wed, Apr 01, 2015 at 05:19:20PM +0200, Michal Hocko wrote:
> On Mon 30-03-15 11:32:40, Dave Chinner wrote:
> > On Fri, Mar 27, 2015 at 11:05:09AM -0400, Johannes Weiner wrote:
> [...]
> > > GFP_NOFS sites are currently one of the sites that can deadlock inside
> > > the allocator, even though many of them seem to have fallback code.
> > > My reasoning here is that if you *have* an exit strategy for failing
> > > allocations that is smarter than hanging, we should probably use that.
> > 
> > We already do that for allocations where we can handle failure in
> > GFP_NOFS conditions. It is, however, somewhat useless if we can't
> > tell the allocator to try really hard if we've already had a failure
> > and we are already in memory reclaim conditions (e.g. a shrinker
> > trying to clean dirty objects so they can be reclaimed).
> > 
> > From that perspective, I think that this patch set aims force us
> > away from handling fallbacks ourselves because a) it makes GFP_NOFS
> > more likely to fail, and b) provides no mechanism to "try harder"
> > when we really need the allocation to succeed.
> 
> You can ask for this "try harder" by __GFP_HIGH flag. Would that help
> in your fallback case?

I would think __GFP_REPEAT would be more suitable here.  From the doc:

 * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
 * _might_ fail.  This depends upon the particular VM implementation.

so we can make the semantics of GFP_NOFS | __GFP_REPEAT such that they
are allowed to use the OOM killer and dip into the OOM reserves.

My question here would be: are there any NOFS allocations that *don't*
want this behavior?  Does it even make sense to require this separate
annotation or should we just make it the default?

The argument here was always that NOFS allocations are very limited in
their reclaim powers and will trigger OOM prematurely.  However, the
way we limit dirty memory these days forces most cache to be clean at
all times, and direct reclaim in general hasn't been allowed to issue
page writeback for quite some time.  So these days, NOFS reclaim isn't
really weaker than regular direct reclaim.  The only exception is that
it might block writeback, so we'd go OOM if the only reclaimables left
were dirty pages against that filesystem.  That should be acceptable.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 47981c5e54c3..fe3cb2b0b85b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2367,16 +2367,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 		/* The OOM killer does not needlessly kill tasks for lowmem */
 		if (ac->high_zoneidx < ZONE_NORMAL)
 			goto out;
-		/* The OOM killer does not compensate for IO-less reclaim */
-		if (!(gfp_mask & __GFP_FS)) {
-			/*
-			 * XXX: Page reclaim didn't yield anything,
-			 * and the OOM killer can't be invoked, but
-			 * keep looping as per tradition.
-			 */
-			*did_some_progress = 1;
-			goto out;
-		}
 		if (pm_suspended_storage())
 			goto out;
 		/* The OOM killer may not free memory on a specific node */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2015-04-07 14:18 UTC|newest]

Thread overview: 138+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-25  6:17 [patch 00/12] mm: page_alloc: improve OOM mechanism and policy Johannes Weiner
2015-03-25  6:17 ` Johannes Weiner
2015-03-25  6:17 ` [patch 01/12] mm: oom_kill: remove unnecessary locking in oom_enable() Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26  0:51   ` David Rientjes
2015-03-26  0:51     ` David Rientjes
2015-03-26 11:51     ` Michal Hocko
2015-03-26 11:51       ` Michal Hocko
2015-03-26 13:18       ` Michal Hocko
2015-03-26 13:18         ` Michal Hocko
2015-03-26 19:30         ` David Rientjes
2015-03-26 19:30           ` David Rientjes
2015-03-26 11:43   ` Michal Hocko
2015-03-26 11:43     ` Michal Hocko
2015-03-26 20:05   ` David Rientjes
2015-03-26 20:05     ` David Rientjes
2015-03-25  6:17 ` [patch 02/12] mm: oom_kill: clean up victim marking and exiting interfaces Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26  3:34   ` David Rientjes
2015-03-26  3:34     ` David Rientjes
2015-03-26 11:54   ` Michal Hocko
2015-03-26 11:54     ` Michal Hocko
2015-03-25  6:17 ` [patch 03/12] mm: oom_kill: switch test-and-clear of known TIF_MEMDIE to clear Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26  3:31   ` David Rientjes
2015-03-26  3:31     ` David Rientjes
2015-03-26 11:05     ` Johannes Weiner
2015-03-26 11:05       ` Johannes Weiner
2015-03-26 19:50       ` David Rientjes
2015-03-26 19:50         ` David Rientjes
2015-03-30 14:48         ` Michal Hocko
2015-03-30 14:48           ` Michal Hocko
2015-04-02 23:01         ` [patch] android, lmk: avoid setting TIF_MEMDIE if process has already exited David Rientjes
2015-04-02 23:01           ` David Rientjes
2015-04-28 22:50           ` [patch resend] " David Rientjes
2015-04-28 22:50             ` David Rientjes
2015-03-26 11:57   ` [patch 03/12] mm: oom_kill: switch test-and-clear of known TIF_MEMDIE to clear Michal Hocko
2015-03-26 11:57     ` Michal Hocko
2015-03-25  6:17 ` [patch 04/12] mm: oom_kill: remove unnecessary locking in exit_oom_victim() Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26 12:53   ` Michal Hocko
2015-03-26 12:53     ` Michal Hocko
2015-03-26 13:01     ` Michal Hocko
2015-03-26 13:01       ` Michal Hocko
2015-03-26 15:10       ` Johannes Weiner
2015-03-26 15:10         ` Johannes Weiner
2015-03-26 15:04     ` Johannes Weiner
2015-03-26 15:04       ` Johannes Weiner
2015-03-25  6:17 ` [patch 05/12] mm: oom_kill: generalize OOM progress waitqueue Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26 13:03   ` Michal Hocko
2015-03-26 13:03     ` Michal Hocko
2015-03-25  6:17 ` [patch 06/12] mm: oom_kill: simplify OOM killer locking Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26 13:31   ` Michal Hocko
2015-03-26 13:31     ` Michal Hocko
2015-03-26 15:17     ` Johannes Weiner
2015-03-26 15:17       ` Johannes Weiner
2015-03-26 16:07       ` Michal Hocko
2015-03-26 16:07         ` Michal Hocko
2015-03-25  6:17 ` [patch 07/12] mm: page_alloc: inline should_alloc_retry() Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26 14:11   ` Michal Hocko
2015-03-26 14:11     ` Michal Hocko
2015-03-26 15:18     ` Johannes Weiner
2015-03-26 15:18       ` Johannes Weiner
2015-03-25  6:17 ` [patch 08/12] mm: page_alloc: wait for OOM killer progress before retrying Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-25 14:15   ` Tetsuo Handa
2015-03-25 14:15     ` Tetsuo Handa
2015-03-25 17:01     ` Vlastimil Babka
2015-03-25 17:01       ` Vlastimil Babka
2015-03-26 11:28       ` Johannes Weiner
2015-03-26 11:28         ` Johannes Weiner
2015-03-26 11:24     ` Johannes Weiner
2015-03-26 11:24       ` Johannes Weiner
2015-03-26 14:32       ` Michal Hocko
2015-03-26 14:32         ` Michal Hocko
2015-03-26 15:23         ` Johannes Weiner
2015-03-26 15:23           ` Johannes Weiner
2015-03-26 15:38           ` Michal Hocko
2015-03-26 15:38             ` Michal Hocko
2015-03-26 18:17             ` Johannes Weiner
2015-03-26 18:17               ` Johannes Weiner
2015-03-27 14:01             ` [patch 08/12] mm: page_alloc: wait for OOM killer progressbefore retrying Tetsuo Handa
2015-03-27 14:01               ` Tetsuo Handa
2015-03-26 15:58   ` [patch 08/12] mm: page_alloc: wait for OOM killer progress before retrying Michal Hocko
2015-03-26 15:58     ` Michal Hocko
2015-03-26 18:23     ` Johannes Weiner
2015-03-26 18:23       ` Johannes Weiner
2015-03-25  6:17 ` [patch 09/12] mm: page_alloc: private memory reserves for OOM-killing allocations Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-04-14 16:49   ` Michal Hocko
2015-04-14 16:49     ` Michal Hocko
2015-04-24 19:13     ` Johannes Weiner
2015-04-24 19:13       ` Johannes Weiner
2015-03-25  6:17 ` [patch 10/12] mm: page_alloc: emergency reserve access for __GFP_NOFAIL allocations Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-04-14 16:55   ` Michal Hocko
2015-04-14 16:55     ` Michal Hocko
2015-03-25  6:17 ` [patch 11/12] mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26 14:50   ` Michal Hocko
2015-03-26 14:50     ` Michal Hocko
2015-03-25  6:17 ` [patch 12/12] mm: page_alloc: do not lock up low-order " Johannes Weiner
2015-03-25  6:17   ` Johannes Weiner
2015-03-26 15:32   ` Michal Hocko
2015-03-26 15:32     ` Michal Hocko
2015-03-26 19:58 ` [patch 00/12] mm: page_alloc: improve OOM mechanism and policy Dave Chinner
2015-03-26 19:58   ` Dave Chinner
2015-03-27 15:05   ` Johannes Weiner
2015-03-27 15:05     ` Johannes Weiner
2015-03-30  0:32     ` Dave Chinner
2015-03-30  0:32       ` Dave Chinner
2015-03-30 19:31       ` Johannes Weiner
2015-03-30 19:31         ` Johannes Weiner
2015-04-01 15:19       ` Michal Hocko
2015-04-01 15:19         ` Michal Hocko
2015-04-01 21:39         ` Dave Chinner
2015-04-01 21:39           ` Dave Chinner
2015-04-02  7:29           ` Michal Hocko
2015-04-02  7:29             ` Michal Hocko
2015-04-07 14:18         ` Johannes Weiner [this message]
2015-04-07 14:18           ` Johannes Weiner
2015-04-11  7:29           ` Tetsuo Handa
2015-04-11  7:29             ` Tetsuo Handa
2015-04-13 12:49             ` Michal Hocko
2015-04-13 12:49               ` Michal Hocko
2015-04-13 12:46           ` Michal Hocko
2015-04-13 12:46             ` Michal Hocko
2015-04-14  0:11             ` Dave Chinner
2015-04-14  0:11               ` Dave Chinner
2015-04-14  7:20               ` Michal Hocko
2015-04-14  7:20                 ` Michal Hocko
2015-04-14 10:36             ` Johannes Weiner
2015-04-14 10:36               ` Johannes Weiner
2015-04-14 14:23               ` Michal Hocko
2015-04-14 14:23                 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150407141822.GA3262@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.