From: Michal Hocko <mhocko@kernel.org>
To: David Rientjes <rientjes@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [rfc 3/4] mm, page_alloc: avoid expensive reclaim when compaction may not succeed
Date: Thu, 5 Sep 2019 11:00:09 +0200 [thread overview]
Message-ID: <20190905090009.GF3838@dhcp22.suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.21.1909041253390.94813@chino.kir.corp.google.com>
[Ccing Mike for checking on the hugetlb side of this change]
On Wed 04-09-19 12:54:22, David Rientjes wrote:
> Memory compaction has a couple significant drawbacks as the allocation
> order increases, specifically:
>
> - isolate_freepages() is responsible for finding free pages to use as
> migration targets and is implemented as a linear scan of memory
> starting at the end of a zone,
>
> - failing order-0 watermark checks in memory compaction does not account
> for how far below the watermarks the zone actually is: to enable
> migration, there must be *some* free memory available. Per the above,
> watermarks are not always suffficient if isolate_freepages() cannot
> find the free memory but it could require hundreds of MBs of reclaim to
> even reach this threshold (read: potentially very expensive reclaim with
> no indication compaction can be successful), and
>
> - if compaction at this order has failed recently so that it does not even
> run as a result of deferred compaction, looping through reclaim can often
> be pointless.
>
> For hugepage allocations, these are quite substantial drawbacks because
> these are very high order allocations (order-9 on x86) and falling back to
> doing reclaim can potentially be *very* expensive without any indication
> that compaction would even be successful.
>
> Reclaim itself is unlikely to free entire pageblocks and certainly no
> reliance should be put on it to do so in isolation (recall lumpy reclaim).
> This means we should avoid reclaim and simply fail hugepage allocation if
> compaction is deferred.
>
> It is also not helpful to thrash a zone by doing excessive reclaim if
> compaction may not be able to access that memory. If order-0 watermarks
> fail and the allocation order is sufficiently large, it is likely better
> to fail the allocation rather than thrashing the zone.
>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> mm/page_alloc.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4458,6 +4458,28 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> if (page)
> goto got_pg;
>
> + if (order >= pageblock_order && (gfp_mask & __GFP_IO)) {
> + /*
> + * If allocating entire pageblock(s) and compaction
> + * failed because all zones are below low watermarks
> + * or is prohibited because it recently failed at this
> + * order, fail immediately.
> + *
> + * Reclaim is
> + * - potentially very expensive because zones are far
> + * below their low watermarks or this is part of very
> + * bursty high order allocations,
> + * - not guaranteed to help because isolate_freepages()
> + * may not iterate over freed pages as part of its
> + * linear scan, and
> + * - unlikely to make entire pageblocks free on its
> + * own.
> + */
> + if (compact_result == COMPACT_SKIPPED ||
> + compact_result == COMPACT_DEFERRED)
> + goto nopage;
> + }
> +
> /*
> * Checks for costly allocations with __GFP_NORETRY, which
> * includes THP page fault allocations
--
Michal Hocko
SUSE Labs
next prev parent reply other threads:[~2019-09-05 9:00 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-04 19:54 [patch for-5.3 0/4] revert immediate fallback to remote hugepages David Rientjes
2019-09-04 19:54 ` [rfc 3/4] mm, page_alloc: avoid expensive reclaim when compaction may not succeed David Rientjes
2019-09-05 9:00 ` Michal Hocko [this message]
2019-09-05 11:22 ` Vlastimil Babka
2019-09-05 20:53 ` Mike Kravetz
2019-09-06 20:16 ` David Rientjes
2019-09-06 20:49 ` David Rientjes
2019-09-04 20:43 ` [patch for-5.3 0/4] revert immediate fallback to remote hugepages Linus Torvalds
2019-09-05 20:54 ` David Rientjes
2019-09-07 19:51 ` David Rientjes
2019-09-07 19:55 ` Linus Torvalds
2019-09-08 1:50 ` David Rientjes
2019-09-08 12:47 ` Vlastimil Babka
2019-09-08 20:45 ` David Rientjes
2019-09-09 8:37 ` Michal Hocko
2019-09-04 20:55 ` Andrea Arcangeli
2019-09-05 21:06 ` David Rientjes
2019-09-09 19:30 ` Michal Hocko
2019-09-25 7:08 ` Michal Hocko
2019-09-26 19:03 ` David Rientjes
2019-09-27 7:48 ` Michal Hocko
2019-09-28 20:59 ` Linus Torvalds
2019-09-30 11:28 ` Michal Hocko
2019-10-01 5:43 ` Michal Hocko
2019-10-01 8:37 ` Michal Hocko
2019-10-18 14:15 ` Michal Hocko
2019-10-23 11:03 ` Vlastimil Babka
2019-10-24 18:59 ` David Rientjes
2019-10-29 14:14 ` Vlastimil Babka
2019-10-29 15:15 ` Michal Hocko
2019-10-29 21:33 ` Andrew Morton
2019-10-29 21:45 ` Vlastimil Babka
2019-10-29 23:25 ` David Rientjes
2019-11-05 13:02 ` Michal Hocko
2019-11-06 1:01 ` David Rientjes
2019-11-06 7:35 ` Michal Hocko
2019-11-06 21:32 ` David Rientjes
2019-11-13 11:20 ` Mel Gorman
2019-11-25 0:10 ` David Rientjes
2019-11-25 11:47 ` Michal Hocko
2019-11-25 20:38 ` David Rientjes
2019-11-25 21:34 ` Vlastimil Babka
2019-10-01 13:50 ` Vlastimil Babka
2019-10-01 20:31 ` David Rientjes
2019-10-01 21:54 ` Vlastimil Babka
2019-10-02 10:34 ` Michal Hocko
2019-10-02 22:32 ` David Rientjes
2019-10-03 8:00 ` Vlastimil Babka
2019-10-04 12:18 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190905090009.GF3838@dhcp22.suse.cz \
--to=mhocko@kernel.org \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mike.kravetz@oracle.com \
--cc=rientjes@google.com \
--cc=torvalds@linux-foundation.org \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).