* [PATCH 0/8] compaction-related cleanups v4
@ 2016-07-18 11:22 Vlastimil Babka
  2016-07-18 11:22 ` [PATCH 1/8] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode Vlastimil Babka
                   ` (9 more replies)
  0 siblings, 10 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

Hi,

this is the split-off first part of my "make direct compaction more
deterministic" series [1], rebased on mmotm-2016-07-13-16-09-18. For the whole
series it's probably too late for 4.8 given some unresolved feedback, but I
hope this part can go in, as it has been stable for quite some time.

At the very least, the first patch really shouldn't wait any longer.

[1] http://marc.info/?l=linux-mm&m=146676211226806&w=2

Hugh Dickins (1):
  mm, compaction: don't isolate PageWriteback pages in
    MIGRATE_SYNC_LIGHT mode

Vlastimil Babka (7):
  mm, page_alloc: set alloc_flags only once in slowpath
  mm, page_alloc: don't retry initial attempt in slowpath
  mm, page_alloc: restructure direct compaction handling in slowpath
  mm, page_alloc: make THP-specific decisions more generic
  mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
  mm, compaction: introduce direct compaction priority
  mm, compaction: simplify contended compaction handling

 include/linux/compaction.h        |  33 +++---
 include/linux/gfp.h               |  14 +--
 include/trace/events/compaction.h |  12 +--
 include/trace/events/mmflags.h    |   1 +
 mm/compaction.c                   |  83 ++++-----------
 mm/huge_memory.c                  |  29 ++---
 mm/internal.h                     |   5 +-
 mm/khugepaged.c                   |   2 +-
 mm/migrate.c                      |   2 +-
 mm/page_alloc.c                   | 215 +++++++++++++++++---------------------
 tools/perf/builtin-kmem.c         |   1 +
 11 files changed, 164 insertions(+), 233 deletions(-)

-- 
2.9.0


* [PATCH 1/8] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
@ 2016-07-18 11:22 ` Vlastimil Babka
  2016-07-19 22:21   ` David Rientjes
  2016-07-18 11:22 ` [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath Vlastimil Babka
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Hugh Dickins, Vlastimil Babka

From: Hugh Dickins <hughd@google.com>

At present MIGRATE_SYNC_LIGHT allows __isolate_lru_page() to
isolate a PageWriteback page, which __unmap_and_move() then rejects
with -EBUSY: of course the writeback might complete in between, but
that's not what we usually expect, so it's probably better not to isolate it.

When tested with stress-highalloc from mmtests, this reduced the number of
page migration failures by 60-70%.
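
For illustration, the effect of the one-liner below is that MIGRATE_SYNC_LIGHT
compaction now also passes ISOLATE_ASYNC_MIGRATE, so the existing writeback
check in __isolate_lru_page() applies to it as well. Roughly (a simplified
excerpt of that check, not the full function):

	if (mode & ISOLATE_ASYNC_MIGRATE) {
		/* All the caller could do with a PageWriteback page is
		 * block on it, so don't isolate it in the first place */
		if (PageWriteback(page))
			return ret;
		...
	}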

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index cd93ea24c565..892e397655dc 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1200,7 +1200,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 	struct page *page;
 	const isolate_mode_t isolate_mode =
 		(sysctl_compact_unevictable_allowed ? ISOLATE_UNEVICTABLE : 0) |
-		(cc->mode == MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0);
+		(cc->mode != MIGRATE_SYNC ? ISOLATE_ASYNC_MIGRATE : 0);
 
 	/*
 	 * Start at where we last stopped, or beginning of the zone as
-- 
2.9.0


* [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
  2016-07-18 11:22 ` [PATCH 1/8] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode Vlastimil Babka
@ 2016-07-18 11:22 ` Vlastimil Babka
  2016-07-18 11:27   ` Michal Hocko
  2016-07-19 22:28   ` David Rientjes
  2016-07-18 11:22 ` [PATCH 3/8] mm, page_alloc: don't retry initial attempt " Vlastimil Babka
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

In __alloc_pages_slowpath(), alloc_flags doesn't change after it's initialized,
so move the initialization above the retry: label. Also make the comment above
the initialization more descriptive.

The only exception to alloc_flags being constant is ALLOC_NO_WATERMARKS,
which may change due to TIF_MEMDIE being set on the allocating thread. We can
fix this, and make the code simpler and a bit more efficient at the same time,
by moving the part that determines ALLOC_NO_WATERMARKS from
gfp_to_alloc_flags() to gfp_pfmemalloc_allowed(). This means we no longer have
to mask out ALLOC_NO_WATERMARKS in numerous places in __alloc_pages_slowpath().
The only two tests for the flag can instead call gfp_pfmemalloc_allowed().
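
The resulting shape of the slowpath is then roughly the following (a condensed
sketch of the control flow after this patch, not the complete function; the
full diff is below):

	alloc_flags = gfp_to_alloc_flags(gfp_mask);	/* constant from now on */

retry:
	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
		wake_all_kswapds(order, ac);
	...
	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
	if (page)
		goto got_pg;

	/* no more masking out ALLOC_NO_WATERMARKS; ask the helper directly */
	if (gfp_pfmemalloc_allowed(gfp_mask)) {
		page = get_page_from_freelist(gfp_mask, order,
						ALLOC_NO_WATERMARKS, ac);
		if (page)
			goto got_pg;
	}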

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/page_alloc.c | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 571aca8c637a..eb1968a1041e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3119,8 +3119,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	 */
 	count_vm_event(COMPACTSTALL);
 
-	page = get_page_from_freelist(gfp_mask, order,
-					alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
+	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 
 	if (page) {
 		struct zone *zone = page_zone(page);
@@ -3288,8 +3287,7 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 		return NULL;
 
 retry:
-	page = get_page_from_freelist(gfp_mask, order,
-					alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
+	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 
 	/*
 	 * If an allocation failed after direct reclaim, it could be because
@@ -3351,16 +3349,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	} else if (unlikely(rt_task(current)) && !in_interrupt())
 		alloc_flags |= ALLOC_HARDER;
 
-	if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
-		if (gfp_mask & __GFP_MEMALLOC)
-			alloc_flags |= ALLOC_NO_WATERMARKS;
-		else if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
-			alloc_flags |= ALLOC_NO_WATERMARKS;
-		else if (!in_interrupt() &&
-				((current->flags & PF_MEMALLOC) ||
-				 unlikely(test_thread_flag(TIF_MEMDIE))))
-			alloc_flags |= ALLOC_NO_WATERMARKS;
-	}
 #ifdef CONFIG_CMA
 	if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
 		alloc_flags |= ALLOC_CMA;
@@ -3370,7 +3358,19 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 
 bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 {
-	return !!(gfp_to_alloc_flags(gfp_mask) & ALLOC_NO_WATERMARKS);
+	if (unlikely(gfp_mask & __GFP_NOMEMALLOC))
+		return false;
+
+	if (gfp_mask & __GFP_MEMALLOC)
+		return true;
+	if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
+		return true;
+	if (!in_interrupt() &&
+			((current->flags & PF_MEMALLOC) ||
+			 unlikely(test_thread_flag(TIF_MEMDIE))))
+		return true;
+
+	return false;
 }
 
 static inline bool is_thp_gfp_mask(gfp_t gfp_mask)
@@ -3534,36 +3534,36 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 				(__GFP_ATOMIC|__GFP_DIRECT_RECLAIM)))
 		gfp_mask &= ~__GFP_ATOMIC;
 
-retry:
-	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
-		wake_all_kswapds(order, ac);
-
 	/*
-	 * OK, we're below the kswapd watermark and have kicked background
-	 * reclaim. Now things get more complex, so set up alloc_flags according
-	 * to how we want to proceed.
+	 * The fast path uses conservative alloc_flags to succeed only until
+	 * kswapd needs to be woken up, and to avoid the cost of setting up
+	 * alloc_flags precisely. So we do that now.
 	 */
 	alloc_flags = gfp_to_alloc_flags(gfp_mask);
 
+retry:
+	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
+		wake_all_kswapds(order, ac);
+
 	/*
 	 * Reset the zonelist iterators if memory policies can be ignored.
 	 * These allocations are high priority and system rather than user
 	 * orientated.
 	 */
-	if ((alloc_flags & ALLOC_NO_WATERMARKS) || !(alloc_flags & ALLOC_CPUSET)) {
+	if (!(alloc_flags & ALLOC_CPUSET) || gfp_pfmemalloc_allowed(gfp_mask)) {
 		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
 		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
 					ac->high_zoneidx, ac->nodemask);
 	}
 
 	/* This is the last chance, in general, before the goto nopage. */
-	page = get_page_from_freelist(gfp_mask, order,
-				alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
+	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 	if (page)
 		goto got_pg;
 
 	/* Allocate without watermarks if the context allows */
-	if (alloc_flags & ALLOC_NO_WATERMARKS) {
+	if (gfp_pfmemalloc_allowed(gfp_mask)) {
+
 		page = get_page_from_freelist(gfp_mask, order,
 						ALLOC_NO_WATERMARKS, ac);
 		if (page)
-- 
2.9.0


* [PATCH 3/8] mm, page_alloc: don't retry initial attempt in slowpath
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
  2016-07-18 11:22 ` [PATCH 1/8] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode Vlastimil Babka
  2016-07-18 11:22 ` [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath Vlastimil Babka
@ 2016-07-18 11:22 ` Vlastimil Babka
  2016-07-18 11:29   ` Michal Hocko
  2016-07-19 22:36   ` David Rientjes
  2016-07-18 11:22 ` [PATCH 4/8] mm, page_alloc: restructure direct compaction handling " Vlastimil Babka
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

After __alloc_pages_slowpath() sets up the new alloc_flags and wakes up kswapd,
it first tries get_page_from_freelist() with the new alloc_flags, as it may
succeed e.g. due to using the min watermark instead of the low watermark. It
makes sense to do this attempt before adjusting the zonelist based on
alloc_flags/gfp_mask, as it's still a relatively fast path if we just wake up
kswapd and successfully allocate.

This patch therefore moves the initial attempt above the retry label, and
reorganizes the part below the retry label a bit. We still have to attempt
get_page_from_freelist() on each retry, as some allocations cannot do that
as part of direct reclaim or compaction, and yet are not allowed to fail
(even though they do a WARN_ON_ONCE() and thus should not exist). We can reuse
the call meant for the ALLOC_NO_WATERMARKS attempt and just set alloc_flags to
ALLOC_NO_WATERMARKS if the context allows it. As a side-effect, the attempts
from direct reclaim/compaction will also no longer obey watermarks once this
is set, but there's little harm in that.

Kswapd wakeups are also done on each retry to be safe from potential races
resulting in kswapd going to sleep while a process (that may not be able to
reclaim by itself) is still looping.
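
In outline, the reorganized part then looks like this (a condensed sketch, not
the complete function; see the diff below):

	alloc_flags = gfp_to_alloc_flags(gfp_mask);

	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
		wake_all_kswapds(order, ac);

	/* the initial attempt with the adjusted alloc_flags, done only once */
	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
	if (page)
		goto got_pg;

retry:
	/* re-wake kswapd on every retry so it can't race us to sleep */
	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
		wake_all_kswapds(order, ac);

	if (gfp_pfmemalloc_allowed(gfp_mask))
		alloc_flags = ALLOC_NO_WATERMARKS;

	/* one attempt now serves both the normal and no-watermarks cases */
	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
	if (page)
		goto got_pg;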

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/page_alloc.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eb1968a1041e..30443804f156 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3541,35 +3541,42 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 */
 	alloc_flags = gfp_to_alloc_flags(gfp_mask);
 
+	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
+		wake_all_kswapds(order, ac);
+
+	/*
+	 * The adjusted alloc_flags might result in immediate success, so try
+	 * that first
+	 */
+	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
+	if (page)
+		goto got_pg;
+
+
 retry:
+	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
 	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
 		wake_all_kswapds(order, ac);
 
+	if (gfp_pfmemalloc_allowed(gfp_mask))
+		alloc_flags = ALLOC_NO_WATERMARKS;
+
 	/*
 	 * Reset the zonelist iterators if memory policies can be ignored.
 	 * These allocations are high priority and system rather than user
 	 * orientated.
 	 */
-	if (!(alloc_flags & ALLOC_CPUSET) || gfp_pfmemalloc_allowed(gfp_mask)) {
+	if (!(alloc_flags & ALLOC_CPUSET) || (alloc_flags & ALLOC_NO_WATERMARKS)) {
 		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
 		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
 					ac->high_zoneidx, ac->nodemask);
 	}
 
-	/* This is the last chance, in general, before the goto nopage. */
+	/* Attempt with potentially adjusted zonelist and alloc_flags */
 	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 	if (page)
 		goto got_pg;
 
-	/* Allocate without watermarks if the context allows */
-	if (gfp_pfmemalloc_allowed(gfp_mask)) {
-
-		page = get_page_from_freelist(gfp_mask, order,
-						ALLOC_NO_WATERMARKS, ac);
-		if (page)
-			goto got_pg;
-	}
-
 	/* Caller is not willing to reclaim, we can't balance anything */
 	if (!can_direct_reclaim) {
 		/*
-- 
2.9.0


* [PATCH 4/8] mm, page_alloc: restructure direct compaction handling in slowpath
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
                   ` (2 preceding siblings ...)
  2016-07-18 11:22 ` [PATCH 3/8] mm, page_alloc: don't retry initial attempt " Vlastimil Babka
@ 2016-07-18 11:22 ` Vlastimil Babka
  2016-07-19 22:50   ` David Rientjes
  2016-07-18 11:22 ` [PATCH 5/8] mm, page_alloc: make THP-specific decisions more generic Vlastimil Babka
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

The retry loop in __alloc_pages_slowpath is supposed to keep trying reclaim
and compaction (and OOM) until the allocation either succeeds or definitively
fails. Success here is more probable when reclaim precedes compaction,
as certain watermarks have to be met for compaction to even try, and more free
pages increase the probability of compaction success. On the other hand,
starting with light async compaction (if the watermarks allow it) can be
more efficient, especially for smaller orders, if there's enough free memory
that is merely fragmented.

Thus, the current code starts with compaction before reclaim, and to make sure
that the last reclaim is always followed by a final compaction, there's another
direct compaction call at the end of the loop. This makes the code hard to
follow and adds some duplicated handling of migration_mode decisions. It's also
somewhat inefficient that even if reclaim or compaction decides not to retry,
the final compaction is still attempted. Some gfp flag combinations also
shortcut these retry decisions with "goto noretry;", making it even harder to
follow.

This patch attempts to restructure the code with only minimal functional
changes. The call to the first compaction and THP-specific checks are now
placed above the retry loop, and the "noretry" direct compaction is removed.

The initial compaction is additionally restricted only to costly orders, as we
can expect smaller orders to be held back by watermarks, and only larger orders
to suffer primarily from fragmentation. This better matches the checks in
reclaim's shrink_zones().

There are two other smaller functional changes. One is that the upgrade from
async migration to light sync migration will always occur after the initial
compaction. This is how it was until the recent patch "mm, oom: protect
!costly allocations some more", which introduced upgrading the mode based on
the COMPACT_COMPLETE result, but kept the final compaction always upgraded,
which made it even more of a special case. It's better to return to the simpler
handling for now, as migration modes will be further modified later in the
series.

The second change is that once both reclaim and compaction declare it's not
worth retrying the reclaim/compact loop, there is no final compaction attempt.
As argued above, this is intentional. If that final compaction were to succeed,
it would be due to a wrong retry decision, or simply a race with somebody else
freeing memory for us.

The main outcome of this patch should be simpler code. Logically, the initial
compaction without reclaim is the exceptional case in the reclaim/compaction
scheme, but prior to the patch, it was the last loop iteration that was
exceptional. Now the code matches the logic better. The change also enables the
following patches.
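
In outline, the restructured slowpath becomes (a condensed sketch with most
details elided; the full diff is below):

	/* one-off async compaction above the loop, costly orders only */
	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER) {
		page = __alloc_pages_direct_compact(gfp_mask, order,
						alloc_flags, ac,
						MIGRATE_ASYNC,
						&compact_result);
		if (page)
			goto got_pg;

		if (is_thp_gfp_mask(gfp_mask)) {
			/* THP-specific bail-outs and migration mode choice */
		}
	}

retry:
	/* ... kswapd wakeup, freelist attempts, OOM checks ... */

	/* reclaim now always precedes compaction within the loop */
	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
						&did_some_progress);
	if (page)
		goto got_pg;

	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
					migration_mode, &compact_result);
	if (page)
		goto got_pg;

	/* retry decisions follow; the trailing "noretry" compaction is gone */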

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/page_alloc.c | 106 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 54 insertions(+), 52 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 30443804f156..a04a67745927 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3510,7 +3510,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	struct page *page = NULL;
 	unsigned int alloc_flags;
 	unsigned long did_some_progress;
-	enum migrate_mode migration_mode = MIGRATE_ASYNC;
+	enum migrate_mode migration_mode = MIGRATE_SYNC_LIGHT;
 	enum compact_result compact_result;
 	int compaction_retries = 0;
 	int no_progress_loops = 0;
@@ -3552,6 +3552,49 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
+	/*
+	 * For costly allocations, try direct compaction first, as it's likely
+	 * that we have enough base pages and don't need to reclaim.
+	 */
+	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER) {
+		page = __alloc_pages_direct_compact(gfp_mask, order,
+						alloc_flags, ac,
+						MIGRATE_ASYNC,
+						&compact_result);
+		if (page)
+			goto got_pg;
+
+		/* Checks for THP-specific high-order allocations */
+		if (is_thp_gfp_mask(gfp_mask)) {
+			/*
+			 * If compaction is deferred for high-order allocations,
+			 * it is because sync compaction recently failed. If
+			 * this is the case and the caller requested a THP
+			 * allocation, we do not want to heavily disrupt the
+			 * system, so we fail the allocation instead of entering
+			 * direct reclaim.
+			 */
+			if (compact_result == COMPACT_DEFERRED)
+				goto nopage;
+
+			/*
+			 * Compaction is contended so rather back off than cause
+			 * excessive stalls.
+			 */
+			if (compact_result == COMPACT_CONTENDED)
+				goto nopage;
+
+			/*
+			 * It can become very expensive to allocate transparent
+			 * hugepages at fault, so use asynchronous memory
+			 * compaction for THP unless it is khugepaged trying to
+			 * collapse. All other requests should tolerate at
+			 * least light sync migration.
+			 */
+			if (!(current->flags & PF_KTHREAD))
+				migration_mode = MIGRATE_ASYNC;
+		}
+	}
 
 retry:
 	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
@@ -3606,55 +3649,33 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
 		goto nopage;
 
-	/*
-	 * Try direct compaction. The first pass is asynchronous. Subsequent
-	 * attempts after direct reclaim are synchronous
-	 */
+
+	/* Try direct reclaim and then allocating */
+	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
+							&did_some_progress);
+	if (page)
+		goto got_pg;
+
+	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
 					migration_mode,
 					&compact_result);
 	if (page)
 		goto got_pg;
 
-	/* Checks for THP-specific high-order allocations */
-	if (is_thp_gfp_mask(gfp_mask)) {
-		/*
-		 * If compaction is deferred for high-order allocations, it is
-		 * because sync compaction recently failed. If this is the case
-		 * and the caller requested a THP allocation, we do not want
-		 * to heavily disrupt the system, so we fail the allocation
-		 * instead of entering direct reclaim.
-		 */
-		if (compact_result == COMPACT_DEFERRED)
-			goto nopage;
-
-		/*
-		 * Compaction is contended so rather back off than cause
-		 * excessive stalls.
-		 */
-		if(compact_result == COMPACT_CONTENDED)
-			goto nopage;
-	}
-
 	if (order && compaction_made_progress(compact_result))
 		compaction_retries++;
 
-	/* Try direct reclaim and then allocating */
-	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
-							&did_some_progress);
-	if (page)
-		goto got_pg;
-
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
-		goto noretry;
+		goto nopage;
 
 	/*
 	 * Do not retry costly high order allocations unless they are
 	 * __GFP_REPEAT
 	 */
 	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
-		goto noretry;
+		goto nopage;
 
 	/*
 	 * Costly allocations might have made a progress but this doesn't mean
@@ -3693,25 +3714,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto retry;
 	}
 
-noretry:
-	/*
-	 * High-order allocations do not necessarily loop after direct reclaim
-	 * and reclaim/compaction depends on compaction being called after
-	 * reclaim so call directly if necessary.
-	 * It can become very expensive to allocate transparent hugepages at
-	 * fault, so use asynchronous memory compaction for THP unless it is
-	 * khugepaged trying to collapse. All other requests should tolerate
-	 * at least light sync migration.
-	 */
-	if (is_thp_gfp_mask(gfp_mask) && !(current->flags & PF_KTHREAD))
-		migration_mode = MIGRATE_ASYNC;
-	else
-		migration_mode = MIGRATE_SYNC_LIGHT;
-	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags,
-					    ac, migration_mode,
-					    &compact_result);
-	if (page)
-		goto got_pg;
 nopage:
 	warn_alloc_failed(gfp_mask, order, NULL);
 got_pg:
-- 
2.9.0


* [PATCH 5/8] mm, page_alloc: make THP-specific decisions more generic
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
                   ` (3 preceding siblings ...)
  2016-07-18 11:22 ` [PATCH 4/8] mm, page_alloc: restructure direct compaction handling " Vlastimil Babka
@ 2016-07-18 11:22 ` Vlastimil Babka
  2016-07-19 23:10   ` David Rientjes
  2016-07-18 11:23 ` [PATCH 6/8] mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations Vlastimil Babka
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

Since THP allocations during page faults can be costly, extra decisions are
employed for them to avoid excessive reclaim and compaction if the initial
compaction doesn't look promising. The detection has never been perfect, as
there is no gfp flag specific to THP allocations. At the moment it checks the
whole combination of flags that makes up GFP_TRANSHUGE, and hopes that no other
users of such a combination exist, or would mind being treated the same way.
Extra care is also taken to separate allocations from khugepaged, where latency
doesn't matter that much.

It is however possible to distinguish these allocations in a simpler and more
reliable way. The key observation is that after the initial compaction followed
by the first iteration of "standard" reclaim/compaction, both __GFP_NORETRY
allocations and costly allocations without __GFP_REPEAT are declared as
failures:

        /* Do not loop if specifically requested */
        if (gfp_mask & __GFP_NORETRY)
                goto nopage;

        /*
         * Do not retry costly high order allocations unless they are
         * __GFP_REPEAT
         */
        if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
                goto nopage;

This means we can further distinguish allocations that are costly order *and*
additionally include the __GFP_NORETRY flag. As it happens, GFP_TRANSHUGE
allocations do already fall into this category. This will also allow other
costly allocations with similar high-order benefit vs. latency considerations
to use this semantic. Furthermore, we can distinguish THP allocations that
should try a bit harder (such as those from khugepaged) by removing
__GFP_NORETRY, as will be done in the next patch.
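
Concretely, the detection changes from guessing THP by its exact flag
combination to a plain flag test inside the costly-order block (before and
after, excerpted from the diff below):

	/* before: */
	(gfp_mask & (GFP_TRANSHUGE | __GFP_KSWAPD_RECLAIM)) == GFP_TRANSHUGE

	/* after, already within order > PAGE_ALLOC_COSTLY_ORDER: */
	if (gfp_mask & __GFP_NORETRY) {
		...
	}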

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/page_alloc.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a04a67745927..cfefcb98ac59 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3085,7 +3085,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	return page;
 }
 
-
 /*
  * Maximum number of compaction retries wit a progress before OOM
  * killer is consider as the only way to move forward.
@@ -3373,11 +3372,6 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 	return false;
 }
 
-static inline bool is_thp_gfp_mask(gfp_t gfp_mask)
-{
-	return (gfp_mask & (GFP_TRANSHUGE | __GFP_KSWAPD_RECLAIM)) == GFP_TRANSHUGE;
-}
-
 /*
  * Maximum number of reclaim retries without any progress before OOM killer
  * is consider as the only way to move forward.
@@ -3564,8 +3558,11 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		if (page)
 			goto got_pg;
 
-		/* Checks for THP-specific high-order allocations */
-		if (is_thp_gfp_mask(gfp_mask)) {
+		/*
+		 * Checks for costly allocations with __GFP_NORETRY, which
+		 * includes THP page fault allocations
+		 */
+		if (gfp_mask & __GFP_NORETRY) {
 			/*
 			 * If compaction is deferred for high-order allocations,
 			 * it is because sync compaction recently failed. If
@@ -3585,11 +3582,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 				goto nopage;
 
 			/*
-			 * It can become very expensive to allocate transparent
-			 * hugepages at fault, so use asynchronous memory
-			 * compaction for THP unless it is khugepaged trying to
-			 * collapse. All other requests should tolerate at
-			 * least light sync migration.
+			 * Looks like reclaim/compaction is worth trying, but
+			 * sync compaction could be very expensive, so keep
+			 * using async compaction, unless it's khugepaged
+			 * trying to collapse.
 			 */
 			if (!(current->flags & PF_KTHREAD))
 				migration_mode = MIGRATE_ASYNC;
-- 
2.9.0


* [PATCH 6/8] mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
                   ` (4 preceding siblings ...)
  2016-07-18 11:22 ` [PATCH 5/8] mm, page_alloc: make THP-specific decisions more generic Vlastimil Babka
@ 2016-07-18 11:23 ` Vlastimil Babka
  2016-07-18 11:23 ` [PATCH 7/8] mm, compaction: introduce direct compaction priority Vlastimil Babka
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

After the previous patch, we can distinguish costly allocations that should be
really lightweight, such as THP page faults, with __GFP_NORETRY. This means we
don't need to recognize khugepaged allocations via PF_KTHREAD anymore. We can
also change THP page faults in areas where madvise(MADV_HUGEPAGE) was used to
try as hard as khugepaged, as the process has indicated that it benefits from
THPs and is willing to pay some initial latency costs.

We can also make the flags handling less cryptic by distinguishing
GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from
GFP_TRANSHUGE (only direct reclaim, khugepaged default). Adding __GFP_NORETRY
or __GFP_KSWAPD_RECLAIM is done where needed.
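
For the page fault path, the masks produced by the new
alloc_hugepage_direct_gfpmask() then come out as follows per "defrag" setting
(my summary of the diff below):

	"madvise" + madvised vma:  GFP_TRANSHUGE
	"defer":                   GFP_TRANSHUGE_LIGHT | __GFP_KSWAPD_RECLAIM
	"always":                  GFP_TRANSHUGE | (vma_madvised ? 0 : __GFP_NORETRY)
	everything else:           GFP_TRANSHUGE_LIGHT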

The patch effectively changes the current GFP_TRANSHUGE users as follows:

* get_huge_zero_page() - the zero page lifetime should be relatively long and
  it's shared by multiple users, so it's worth spending some effort on it.
  We use GFP_TRANSHUGE, and __GFP_NORETRY is not added. This also restores
  direct reclaim to this allocation, which was unintentionally removed by
  commit e4a49efe4e7e ("mm: thp: set THP defrag by default to madvise and add
  a stall-free defrag option")

* alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency is not
  an issue. So if khugepaged "defrag" is enabled (the default), do reclaim
  via GFP_TRANSHUGE without __GFP_NORETRY. We can remove the PF_KTHREAD check
  from page alloc.
  As a side-effect, khugepaged will now no longer check if the initial
  compaction was deferred or contended. This is OK, as khugepaged sleep times
  between collapse attempts are long enough to prevent noticeable disruption,
  so we should allow it to spend some effort.

* migrate_misplaced_transhuge_page() - was already masking out __GFP_RECLAIM,
  so just convert to GFP_TRANSHUGE_LIGHT, which is equivalent.

* alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise) are
  now allocating without __GFP_NORETRY. Other vma's keep using __GFP_NORETRY
  if direct reclaim/compaction is at all allowed (by default it's allowed only
  for madvised vma's). The rest is conversion to GFP_TRANSHUGE(_LIGHT).

[mhocko@suse.com: suggested GFP_TRANSHUGE_LIGHT]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/gfp.h            | 14 ++++++++------
 include/trace/events/mmflags.h |  1 +
 mm/huge_memory.c               | 29 ++++++++++++++++-------------
 mm/khugepaged.c                |  2 +-
 mm/migrate.c                   |  2 +-
 mm/page_alloc.c                |  6 ++----
 tools/perf/builtin-kmem.c      |  1 +
 7 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index c29e9d347bc6..f8041f9de31e 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -237,9 +237,11 @@ struct vm_area_struct;
  *   are expected to be movable via page reclaim or page migration. Typically,
  *   pages on the LRU would also be allocated with GFP_HIGHUSER_MOVABLE.
  *
- * GFP_TRANSHUGE is used for THP allocations. They are compound allocations
- *   that will fail quickly if memory is not available and will not wake
- *   kswapd on failure.
+ * GFP_TRANSHUGE and GFP_TRANSHUGE_LIGHT are used for THP allocations. They are
+ *   compound allocations that will generally fail quickly if memory is not
+ *   available and will not wake kswapd/kcompactd on failure. The _LIGHT
+ *   version does not attempt reclaim/compaction at all and is by default used
+ *   in page fault path, while the non-light is used by khugepaged.
  */
 #define GFP_ATOMIC	(__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM)
 #define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
@@ -254,9 +256,9 @@ struct vm_area_struct;
 #define GFP_DMA32	__GFP_DMA32
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
 #define GFP_HIGHUSER_MOVABLE	(GFP_HIGHUSER | __GFP_MOVABLE)
-#define GFP_TRANSHUGE	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
-			 __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN) & \
-			 ~__GFP_RECLAIM)
+#define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
+			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
+#define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
 
 /* Convert GFP flags to their corresponding migrate type */
 #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 43cedbf0c759..5a81ab48a2fb 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -11,6 +11,7 @@
 
 #define __def_gfpflag_names						\
 	{(unsigned long)GFP_TRANSHUGE,		"GFP_TRANSHUGE"},	\
+	{(unsigned long)GFP_TRANSHUGE_LIGHT,	"GFP_TRANSHUGE_LIGHT"}, \
 	{(unsigned long)GFP_HIGHUSER_MOVABLE,	"GFP_HIGHUSER_MOVABLE"},\
 	{(unsigned long)GFP_HIGHUSER,		"GFP_HIGHUSER"},	\
 	{(unsigned long)GFP_USER,		"GFP_USER"},		\
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4b8d4e588930..83b88f97cd54 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -539,23 +539,26 @@ static int __do_huge_pmd_anonymous_page(struct fault_env *fe, struct page *page,
 }
 
 /*
- * If THP is set to always then directly reclaim/compact as necessary
- * If set to defer then do no reclaim and defer to khugepaged
+ * If THP defrag is set to always then directly reclaim/compact as necessary
+ * If set to defer then do only background reclaim/compact and defer to khugepaged
  * If set to madvise and the VMA is flagged then directly reclaim/compact
+ * When direct reclaim/compact is allowed, don't retry except for flagged VMA's
  */
 static inline gfp_t alloc_hugepage_direct_gfpmask(struct vm_area_struct *vma)
 {
-	gfp_t reclaim_flags = 0;
-
-	if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags) &&
-	    (vma->vm_flags & VM_HUGEPAGE))
-		reclaim_flags = __GFP_DIRECT_RECLAIM;
-	else if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags))
-		reclaim_flags = __GFP_KSWAPD_RECLAIM;
-	else if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags))
-		reclaim_flags = __GFP_DIRECT_RECLAIM;
-
-	return GFP_TRANSHUGE | reclaim_flags;
+	bool vma_madvised = !!(vma->vm_flags & VM_HUGEPAGE);
+
+	if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG,
+				&transparent_hugepage_flags) && vma_madvised)
+		return GFP_TRANSHUGE;
+	else if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG,
+						&transparent_hugepage_flags))
+		return GFP_TRANSHUGE_LIGHT | __GFP_KSWAPD_RECLAIM;
+	else if (test_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG,
+						&transparent_hugepage_flags))
+		return GFP_TRANSHUGE | (vma_madvised ? 0 : __GFP_NORETRY);
+
+	return GFP_TRANSHUGE_LIGHT;
 }
 
 /* Caller must hold page table lock. */
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index bb49bd1d2d9f..54ec5f8032a3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -694,7 +694,7 @@ static bool khugepaged_scan_abort(int nid)
 /* Defrag for khugepaged will enter direct reclaim/compaction if necessary */
 static inline gfp_t alloc_hugepage_khugepaged_gfpmask(void)
 {
-	return GFP_TRANSHUGE | (khugepaged_defrag() ? __GFP_DIRECT_RECLAIM : 0);
+	return khugepaged_defrag() ? GFP_TRANSHUGE : GFP_TRANSHUGE_LIGHT;
 }
 
 #ifdef CONFIG_NUMA
diff --git a/mm/migrate.c b/mm/migrate.c
index 365153c14cd0..622c7e473464 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1930,7 +1930,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 		goto out_dropref;
 
 	new_page = alloc_pages_node(node,
-		(GFP_TRANSHUGE | __GFP_THISNODE) & ~__GFP_RECLAIM,
+		(GFP_TRANSHUGE_LIGHT | __GFP_THISNODE),
 		HPAGE_PMD_ORDER);
 	if (!new_page)
 		goto out_fail;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cfefcb98ac59..b631f1d94553 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3584,11 +3584,9 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 			/*
 			 * Looks like reclaim/compaction is worth trying, but
 			 * sync compaction could be very expensive, so keep
-			 * using async compaction, unless it's khugepaged
-			 * trying to collapse.
+			 * using async compaction.
 			 */
-			if (!(current->flags & PF_KTHREAD))
-				migration_mode = MIGRATE_ASYNC;
+			migration_mode = MIGRATE_ASYNC;
 		}
 	}
 
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index c9cb3be47cff..0d98182dc159 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -608,6 +608,7 @@ static const struct {
 	const char *compact;
 } gfp_compact_table[] = {
 	{ "GFP_TRANSHUGE",		"THP" },
+	{ "GFP_TRANSHUGE_LIGHT",	"THL" },
 	{ "GFP_HIGHUSER_MOVABLE",	"HUM" },
 	{ "GFP_HIGHUSER",		"HU" },
 	{ "GFP_USER",			"U" },
-- 
2.9.0


* [PATCH 7/8] mm, compaction: introduce direct compaction priority
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
                   ` (5 preceding siblings ...)
  2016-07-18 11:23 ` [PATCH 6/8] mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations Vlastimil Babka
@ 2016-07-18 11:23 ` Vlastimil Babka
  2016-07-18 11:23 ` [PATCH 8/8] mm, compaction: simplify contended compaction handling Vlastimil Babka
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

In the context of direct compaction, for some types of allocations we would
like the compaction to either succeed or definitively fail while trying as hard
as possible. The current async/sync_light migration mode is insufficient, as there
are heuristics such as caching scanner positions, marking pageblocks as
unsuitable or deferring compaction for a zone. At least the final compaction
attempt should be able to override these heuristics.

To communicate how hard compaction should try, we replace migration mode with
a new enum compact_priority and change the relevant function signatures. In
compact_zone_order() where struct compact_control is constructed, the priority
is mapped to suitable control flags. This patch itself has no functional
change, as the current priority levels are mapped back to the same migration
modes as before. Expanding them will be done next.

Note that the !CONFIG_COMPACTION variant of try_to_compact_pages() is removed,
as its only caller exists under CONFIG_COMPACTION.
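
The mapping from priority to migration mode, and the escalation on retry, then
reduce to the following (excerpts from the diff below; a lower numeric value
means a higher priority):

	/* in compact_zone_order(): */
	.mode = (prio == COMPACT_PRIO_ASYNC) ?
				MIGRATE_ASYNC : MIGRATE_SYNC_LIGHT,

	/* in should_compact_retry(): try harder by decrementing the priority */
	if (*compact_priority > MIN_COMPACT_PRIORITY) {
		(*compact_priority)--;
		return true;
	}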

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/compaction.h        | 22 +++++++++++++---------
 include/trace/events/compaction.h | 12 ++++++------
 mm/compaction.c                   | 13 +++++++------
 mm/page_alloc.c                   | 28 ++++++++++++++--------------
 4 files changed, 40 insertions(+), 35 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 1a02dab16646..0980a6ce4436 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -1,6 +1,18 @@
 #ifndef _LINUX_COMPACTION_H
 #define _LINUX_COMPACTION_H
 
+/*
+ * Determines how hard direct compaction should try to succeed.
+ * Lower value means higher priority, analogically to reclaim priority.
+ */
+enum compact_priority {
+	COMPACT_PRIO_SYNC_LIGHT,
+	MIN_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
+	DEF_COMPACT_PRIORITY = COMPACT_PRIO_SYNC_LIGHT,
+	COMPACT_PRIO_ASYNC,
+	INIT_COMPACT_PRIORITY = COMPACT_PRIO_ASYNC
+};
+
 /* Return values for compact_zone() and try_to_compact_pages() */
 /* When adding new states, please adjust include/trace/events/compaction.h */
 enum compact_result {
@@ -66,7 +78,7 @@ extern int fragmentation_index(struct zone *zone, unsigned int order);
 extern enum compact_result try_to_compact_pages(gfp_t gfp_mask,
 			unsigned int order,
 		unsigned int alloc_flags, const struct alloc_context *ac,
-		enum migrate_mode mode, int *contended);
+		enum compact_priority prio, int *contended);
 extern void compact_pgdat(pg_data_t *pgdat, int order);
 extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern enum compact_result compaction_suitable(struct zone *zone, int order,
@@ -151,14 +163,6 @@ extern void kcompactd_stop(int nid);
 extern void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx);
 
 #else
-static inline enum compact_result try_to_compact_pages(gfp_t gfp_mask,
-			unsigned int order, int alloc_flags,
-			const struct alloc_context *ac,
-			enum migrate_mode mode, int *contended)
-{
-	return COMPACT_CONTINUE;
-}
-
 static inline void compact_pgdat(pg_data_t *pgdat, int order)
 {
 }
diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index 36e2d6fb1360..c2ba402ab256 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -226,26 +226,26 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
 	TP_PROTO(
 		int order,
 		gfp_t gfp_mask,
-		enum migrate_mode mode),
+		int prio),
 
-	TP_ARGS(order, gfp_mask, mode),
+	TP_ARGS(order, gfp_mask, prio),
 
 	TP_STRUCT__entry(
 		__field(int, order)
 		__field(gfp_t, gfp_mask)
-		__field(enum migrate_mode, mode)
+		__field(int, prio)
 	),
 
 	TP_fast_assign(
 		__entry->order = order;
 		__entry->gfp_mask = gfp_mask;
-		__entry->mode = mode;
+		__entry->prio = prio;
 	),
 
-	TP_printk("order=%d gfp_mask=0x%x mode=%d",
+	TP_printk("order=%d gfp_mask=0x%x priority=%d",
 		__entry->order,
 		__entry->gfp_mask,
-		(int)__entry->mode)
+		__entry->prio)
 );
 
 DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
diff --git a/mm/compaction.c b/mm/compaction.c
index 892e397655dc..bb969711d979 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1644,7 +1644,7 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro
 }
 
 static enum compact_result compact_zone_order(struct zone *zone, int order,
-		gfp_t gfp_mask, enum migrate_mode mode, int *contended,
+		gfp_t gfp_mask, enum compact_priority prio, int *contended,
 		unsigned int alloc_flags, int classzone_idx)
 {
 	enum compact_result ret;
@@ -1654,7 +1654,8 @@ static enum compact_result compact_zone_order(struct zone *zone, int order,
 		.order = order,
 		.gfp_mask = gfp_mask,
 		.zone = zone,
-		.mode = mode,
+		.mode = (prio == COMPACT_PRIO_ASYNC) ?
+					MIGRATE_ASYNC :	MIGRATE_SYNC_LIGHT,
 		.alloc_flags = alloc_flags,
 		.classzone_idx = classzone_idx,
 		.direct_compaction = true,
@@ -1687,7 +1688,7 @@ int sysctl_extfrag_threshold = 500;
  */
 enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 		unsigned int alloc_flags, const struct alloc_context *ac,
-		enum migrate_mode mode, int *contended)
+		enum compact_priority prio, int *contended)
 {
 	int may_enter_fs = gfp_mask & __GFP_FS;
 	int may_perform_io = gfp_mask & __GFP_IO;
@@ -1702,7 +1703,7 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 	if (!may_enter_fs || !may_perform_io)
 		return COMPACT_SKIPPED;
 
-	trace_mm_compaction_try_to_compact_pages(order, gfp_mask, mode);
+	trace_mm_compaction_try_to_compact_pages(order, gfp_mask, prio);
 
 	/* Compact each zone in the list */
 	for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->high_zoneidx,
@@ -1715,7 +1716,7 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 			continue;
 		}
 
-		status = compact_zone_order(zone, order, gfp_mask, mode,
+		status = compact_zone_order(zone, order, gfp_mask, prio,
 				&zone_contended, alloc_flags,
 				ac_classzone_idx(ac));
 		rc = max(status, rc);
@@ -1749,7 +1750,7 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 			goto break_loop;
 		}
 
-		if (mode != MIGRATE_ASYNC && (status == COMPACT_COMPLETE ||
+		if (prio != COMPACT_PRIO_ASYNC && (status == COMPACT_COMPLETE ||
 					status == COMPACT_PARTIAL_SKIPPED)) {
 			/*
 			 * We think that allocation won't succeed in this zone
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b631f1d94553..04cab9d92e30 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3096,7 +3096,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 static struct page *
 __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 		unsigned int alloc_flags, const struct alloc_context *ac,
-		enum migrate_mode mode, enum compact_result *compact_result)
+		enum compact_priority prio, enum compact_result *compact_result)
 {
 	struct page *page;
 	int contended_compaction;
@@ -3106,7 +3106,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 
 	current->flags |= PF_MEMALLOC;
 	*compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
-						mode, &contended_compaction);
+						prio, &contended_compaction);
 	current->flags &= ~PF_MEMALLOC;
 
 	if (*compact_result <= COMPACT_INACTIVE)
@@ -3160,7 +3160,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 
 static inline bool
 should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
-		     enum compact_result compact_result, enum migrate_mode *migrate_mode,
+		     enum compact_result compact_result,
+		     enum compact_priority *compact_priority,
 		     int compaction_retries)
 {
 	int max_retries = MAX_COMPACT_RETRIES;
@@ -3171,11 +3172,11 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	/*
 	 * compaction considers all the zone as desperately out of memory
 	 * so it doesn't really make much sense to retry except when the
-	 * failure could be caused by weak migration mode.
+	 * failure could be caused by insufficient priority
 	 */
 	if (compaction_failed(compact_result)) {
-		if (*migrate_mode == MIGRATE_ASYNC) {
-			*migrate_mode = MIGRATE_SYNC_LIGHT;
+		if (*compact_priority > MIN_COMPACT_PRIORITY) {
+			(*compact_priority)--;
 			return true;
 		}
 		return false;
@@ -3209,7 +3210,7 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 static inline struct page *
 __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 		unsigned int alloc_flags, const struct alloc_context *ac,
-		enum migrate_mode mode, enum compact_result *compact_result)
+		enum compact_priority prio, enum compact_result *compact_result)
 {
 	*compact_result = COMPACT_SKIPPED;
 	return NULL;
@@ -3218,7 +3219,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 static inline bool
 should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
 		     enum compact_result compact_result,
-		     enum migrate_mode *migrate_mode,
+		     enum compact_priority *compact_priority,
 		     int compaction_retries)
 {
 	struct zone *zone;
@@ -3504,7 +3505,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	struct page *page = NULL;
 	unsigned int alloc_flags;
 	unsigned long did_some_progress;
-	enum migrate_mode migration_mode = MIGRATE_SYNC_LIGHT;
+	enum compact_priority compact_priority = DEF_COMPACT_PRIORITY;
 	enum compact_result compact_result;
 	int compaction_retries = 0;
 	int no_progress_loops = 0;
@@ -3553,7 +3554,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER) {
 		page = __alloc_pages_direct_compact(gfp_mask, order,
 						alloc_flags, ac,
-						MIGRATE_ASYNC,
+						INIT_COMPACT_PRIORITY,
 						&compact_result);
 		if (page)
 			goto got_pg;
@@ -3586,7 +3587,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 			 * sync compaction could be very expensive, so keep
 			 * using async compaction.
 			 */
-			migration_mode = MIGRATE_ASYNC;
+			compact_priority = INIT_COMPACT_PRIORITY;
 		}
 	}
 
@@ -3652,8 +3653,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
-					migration_mode,
-					&compact_result);
+					compact_priority, &compact_result);
 	if (page)
 		goto got_pg;
 
@@ -3693,7 +3693,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 */
 	if (did_some_progress > 0 &&
 			should_compact_retry(ac, order, alloc_flags,
-				compact_result, &migration_mode,
+				compact_result, &compact_priority,
 				compaction_retries))
 		goto retry;
 
-- 
2.9.0


* [PATCH 8/8] mm, compaction: simplify contended compaction handling
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
                   ` (6 preceding siblings ...)
  2016-07-18 11:23 ` [PATCH 7/8] mm, compaction: introduce direct compaction priority Vlastimil Babka
@ 2016-07-18 11:23 ` Vlastimil Babka
  2016-07-18 11:30 ` [PATCH 0/8] compaction-related cleanups v4 Michal Hocko
  2016-07-18 15:41 ` Mel Gorman
  9 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel, Vlastimil Babka

Async compaction detects contention either through a failing trylock on
zone->lock or lru_lock, or through need_resched(). Since 1f9efdef4f3f ("mm,
compaction: khugepaged should not give up due to need_resched()") the code has
grown quite complicated in order to distinguish the two all the way up to the
__alloc_pages_slowpath() level, so that different decisions could be taken for
khugepaged allocations.

After the recent changes, khugepaged allocations no longer check for contended
compaction, so we again don't need to distinguish lock and sched contention,
and can simplify the current convoluted code a lot.
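
With a single bool, the whole signalling collapses to the following (a
condensed sketch of the result; the full diff is below):

	struct compact_control {
		...
		bool contended;	/* lock or sched contention, no longer
				 * distinguished */
	};

	/* try_to_compact_pages() no longer reports contention upwards; async
	 * compaction simply stops iterating over further zones by itself: */
	if ((prio == COMPACT_PRIO_ASYNC && need_resched())
				|| fatal_signal_pending(current))
		break;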

However, I believe it's also possible to simplify even more and completely
remove the check for contended compaction after the initial async compaction
for costly orders, which was originally aimed at THP page fault allocations.
There are several reasons why this can be done now:

- with the new defaults, THP page faults no longer do reclaim/compaction at
  all, unless the system admin has overridden the default, or the application
  has indicated via madvise that it can benefit from THPs. In both cases, it
  means that the potential extra latency is expected and worth the benefits.
- even if reclaim/compaction proceeds after this patch where it previously
  wouldn't, the second compaction attempt is still async and will detect the
  contention and back off, if the contention persists
- there are still heuristics like deferred compaction and pageblock skip bits
  in place that prevent excessive THP page fault latencies

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/compaction.h | 13 ++-------
 mm/compaction.c            | 72 +++++++++-------------------------------------
 mm/internal.h              |  5 +---
 mm/page_alloc.c            | 28 +-----------------
 4 files changed, 17 insertions(+), 101 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 0980a6ce4436..d4e106b5dc27 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -55,14 +55,6 @@ enum compact_result {
 	COMPACT_PARTIAL,
 };
 
-/* Used to signal whether compaction detected need_sched() or lock contention */
-/* No contention detected */
-#define COMPACT_CONTENDED_NONE	0
-/* Either need_sched() was true or fatal signal pending */
-#define COMPACT_CONTENDED_SCHED	1
-/* Zone lock or lru_lock was contended in async compaction */
-#define COMPACT_CONTENDED_LOCK	2
-
 struct alloc_context; /* in mm/internal.h */
 
 #ifdef CONFIG_COMPACTION
@@ -76,9 +68,8 @@ extern int sysctl_compact_unevictable_allowed;
 
 extern int fragmentation_index(struct zone *zone, unsigned int order);
 extern enum compact_result try_to_compact_pages(gfp_t gfp_mask,
-			unsigned int order,
-		unsigned int alloc_flags, const struct alloc_context *ac,
-		enum compact_priority prio, int *contended);
+		unsigned int order, unsigned int alloc_flags,
+		const struct alloc_context *ac, enum compact_priority prio);
 extern void compact_pgdat(pg_data_t *pgdat, int order);
 extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern enum compact_result compaction_suitable(struct zone *zone, int order,
diff --git a/mm/compaction.c b/mm/compaction.c
index bb969711d979..124a5b3384dd 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -331,7 +331,7 @@ static bool compact_trylock_irqsave(spinlock_t *lock, unsigned long *flags,
 {
 	if (cc->mode == MIGRATE_ASYNC) {
 		if (!spin_trylock_irqsave(lock, *flags)) {
-			cc->contended = COMPACT_CONTENDED_LOCK;
+			cc->contended = true;
 			return false;
 		}
 	} else {
@@ -365,13 +365,13 @@ static bool compact_unlock_should_abort(spinlock_t *lock,
 	}
 
 	if (fatal_signal_pending(current)) {
-		cc->contended = COMPACT_CONTENDED_SCHED;
+		cc->contended = true;
 		return true;
 	}
 
 	if (need_resched()) {
 		if (cc->mode == MIGRATE_ASYNC) {
-			cc->contended = COMPACT_CONTENDED_SCHED;
+			cc->contended = true;
 			return true;
 		}
 		cond_resched();
@@ -394,7 +394,7 @@ static inline bool compact_should_abort(struct compact_control *cc)
 	/* async compaction aborts if contended */
 	if (need_resched()) {
 		if (cc->mode == MIGRATE_ASYNC) {
-			cc->contended = COMPACT_CONTENDED_SCHED;
+			cc->contended = true;
 			return true;
 		}
 
@@ -1637,14 +1637,11 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro
 	trace_mm_compaction_end(start_pfn, cc->migrate_pfn,
 				cc->free_pfn, end_pfn, sync, ret);
 
-	if (ret == COMPACT_CONTENDED)
-		ret = COMPACT_PARTIAL;
-
 	return ret;
 }
 
 static enum compact_result compact_zone_order(struct zone *zone, int order,
-		gfp_t gfp_mask, enum compact_priority prio, int *contended,
+		gfp_t gfp_mask, enum compact_priority prio,
 		unsigned int alloc_flags, int classzone_idx)
 {
 	enum compact_result ret;
@@ -1668,7 +1665,6 @@ static enum compact_result compact_zone_order(struct zone *zone, int order,
 	VM_BUG_ON(!list_empty(&cc.freepages));
 	VM_BUG_ON(!list_empty(&cc.migratepages));
 
-	*contended = cc.contended;
 	return ret;
 }
 
@@ -1681,23 +1677,18 @@ int sysctl_extfrag_threshold = 500;
  * @alloc_flags: The allocation flags of the current allocation
  * @ac: The context of current allocation
  * @mode: The migration mode for async, sync light, or sync migration
- * @contended: Return value that determines if compaction was aborted due to
- *	       need_resched() or lock contention
  *
  * This is the main entry point for direct page compaction.
  */
 enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 		unsigned int alloc_flags, const struct alloc_context *ac,
-		enum compact_priority prio, int *contended)
+		enum compact_priority prio)
 {
 	int may_enter_fs = gfp_mask & __GFP_FS;
 	int may_perform_io = gfp_mask & __GFP_IO;
 	struct zoneref *z;
 	struct zone *zone;
 	enum compact_result rc = COMPACT_SKIPPED;
-	int all_zones_contended = COMPACT_CONTENDED_LOCK; /* init for &= op */
-
-	*contended = COMPACT_CONTENDED_NONE;
 
 	/* Check if the GFP flags allow compaction */
 	if (!may_enter_fs || !may_perform_io)
@@ -1709,7 +1700,6 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 	for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->high_zoneidx,
 								ac->nodemask) {
 		enum compact_result status;
-		int zone_contended;
 
 		if (compaction_deferred(zone, order)) {
 			rc = max_t(enum compact_result, COMPACT_DEFERRED, rc);
@@ -1717,14 +1707,8 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 		}
 
 		status = compact_zone_order(zone, order, gfp_mask, prio,
-				&zone_contended, alloc_flags,
-				ac_classzone_idx(ac));
+					alloc_flags, ac_classzone_idx(ac));
 		rc = max(status, rc);
-		/*
-		 * It takes at least one zone that wasn't lock contended
-		 * to clear all_zones_contended.
-		 */
-		all_zones_contended &= zone_contended;
 
 		/* If a normal allocation would succeed, stop compacting */
 		if (zone_watermark_ok(zone, order, low_wmark_pages(zone),
@@ -1736,59 +1720,29 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 			 * succeeds in this zone.
 			 */
 			compaction_defer_reset(zone, order, false);
-			/*
-			 * It is possible that async compaction aborted due to
-			 * need_resched() and the watermarks were ok thanks to
-			 * somebody else freeing memory. The allocation can
-			 * however still fail so we better signal the
-			 * need_resched() contention anyway (this will not
-			 * prevent the allocation attempt).
-			 */
-			if (zone_contended == COMPACT_CONTENDED_SCHED)
-				*contended = COMPACT_CONTENDED_SCHED;
 
-			goto break_loop;
+			break;
 		}
 
 		if (prio != COMPACT_PRIO_ASYNC && (status == COMPACT_COMPLETE ||
-					status == COMPACT_PARTIAL_SKIPPED)) {
+					status == COMPACT_PARTIAL_SKIPPED))
 			/*
 			 * We think that allocation won't succeed in this zone
 			 * so we defer compaction there. If it ends up
 			 * succeeding after all, it will be reset.
 			 */
 			defer_compaction(zone, order);
-		}
 
 		/*
 		 * We might have stopped compacting due to need_resched() in
 		 * async compaction, or due to a fatal signal detected. In that
-		 * case do not try further zones and signal need_resched()
-		 * contention.
-		 */
-		if ((zone_contended == COMPACT_CONTENDED_SCHED)
-					|| fatal_signal_pending(current)) {
-			*contended = COMPACT_CONTENDED_SCHED;
-			goto break_loop;
-		}
-
-		continue;
-break_loop:
-		/*
-		 * We might not have tried all the zones, so  be conservative
-		 * and assume they are not all lock contended.
+		 * case do not try further zones
 		 */
-		all_zones_contended = 0;
-		break;
+		if ((prio == COMPACT_PRIO_ASYNC && need_resched())
+					|| fatal_signal_pending(current))
+			break;
 	}
 
-	/*
-	 * If at least one zone wasn't deferred or skipped, we report if all
-	 * zones that were tried were lock contended.
-	 */
-	if (rc > COMPACT_INACTIVE && all_zones_contended)
-		*contended = COMPACT_CONTENDED_LOCK;
-
 	return rc;
 }
 
diff --git a/mm/internal.h b/mm/internal.h
index 28932cd6a195..1501304f87a4 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -185,10 +185,7 @@ struct compact_control {
 	const unsigned int alloc_flags;	/* alloc flags of a direct compactor */
 	const int classzone_idx;	/* zone index of a direct compactor */
 	struct zone *zone;
-	int contended;			/* Signal need_sched() or lock
-					 * contention detected during
-					 * compaction
-					 */
+	bool contended;			/* Signal lock or sched contention */
 };
 
 unsigned long
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 04cab9d92e30..bb9b4fb66e85 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3099,14 +3099,13 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 		enum compact_priority prio, enum compact_result *compact_result)
 {
 	struct page *page;
-	int contended_compaction;
 
 	if (!order)
 		return NULL;
 
 	current->flags |= PF_MEMALLOC;
 	*compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
-						prio, &contended_compaction);
+									prio);
 	current->flags &= ~PF_MEMALLOC;
 
 	if (*compact_result <= COMPACT_INACTIVE)
@@ -3135,24 +3134,6 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	 */
 	count_vm_event(COMPACTFAIL);
 
-	/*
-	 * In all zones where compaction was attempted (and not
-	 * deferred or skipped), lock contention has been detected.
-	 * For THP allocation we do not want to disrupt the others
-	 * so we fallback to base pages instead.
-	 */
-	if (contended_compaction == COMPACT_CONTENDED_LOCK)
-		*compact_result = COMPACT_CONTENDED;
-
-	/*
-	 * If compaction was aborted due to need_resched(), we do not
-	 * want to further increase allocation latency, unless it is
-	 * khugepaged trying to collapse.
-	 */
-	if (contended_compaction == COMPACT_CONTENDED_SCHED
-		&& !(current->flags & PF_KTHREAD))
-		*compact_result = COMPACT_CONTENDED;
-
 	cond_resched();
 
 	return NULL;
@@ -3576,13 +3557,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 				goto nopage;
 
 			/*
-			 * Compaction is contended so rather back off than cause
-			 * excessive stalls.
-			 */
-			if (compact_result == COMPACT_CONTENDED)
-				goto nopage;
-
-			/*
 			 * Looks like reclaim/compaction is worth trying, but
 			 * sync compaction could be very expensive, so keep
 			 * using async compaction.
-- 
2.9.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread
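
A condensed view of what patch 8/8 above leaves behind: struct
compact_control now carries a single bool for contention, the per-zone
COMPACT_CONTENDED_{SCHED,LOCK} bookkeeping is gone, and only async-priority
direct compaction backs off from further zones. A sketch of the resulting
zone loop, condensed from the patched try_to_compact_pages() (deferred
compaction handling elided):

	for_each_zone_zonelist_nodemask(zone, z, ac->zonelist,
				ac->high_zoneidx, ac->nodemask) {
		status = compact_zone_order(zone, order, gfp_mask, prio,
				alloc_flags, ac_classzone_idx(ac));
		rc = max(status, rc);

		/* A normal allocation would now succeed, stop compacting. */
		if (zone_watermark_ok(zone, order, low_wmark_pages(zone),
				ac_classzone_idx(ac), alloc_flags))
			break;

		/* Only async compaction backs off on need_resched() now. */
		if ((prio == COMPACT_PRIO_ASYNC && need_resched())
					|| fatal_signal_pending(current))
			break;
	}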

* Re: [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath
  2016-07-18 11:22 ` [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath Vlastimil Babka
@ 2016-07-18 11:27   ` Michal Hocko
  2016-07-19 22:28   ` David Rientjes
  1 sibling, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2016-07-18 11:27 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel

On Mon 18-07-16 13:22:56, Vlastimil Babka wrote:
> In __alloc_pages_slowpath(), alloc_flags doesn't change after it's initialized,
> so move the initialization above the retry: label. Also make the comment above
> the initialization more descriptive.
> 
> The only exception to alloc_flags being constant is ALLOC_NO_WATERMARKS,
> which may change due to TIF_MEMDIE being set on the allocating thread. We can
> fix this, and make the code simpler and a bit more efficient at the same time,
> by moving the part that determines ALLOC_NO_WATERMARKS from
> gfp_to_alloc_flags() to gfp_pfmemalloc_allowed(). This means we don't have to
> mask out ALLOC_NO_WATERMARKS in numerous places in __alloc_pages_slowpath()
> anymore. The only two tests for the flag can instead call
> gfp_pfmemalloc_allowed().
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

I've already acked this one AFAIR, but anyway I still agree with this
change.
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/page_alloc.c | 52 ++++++++++++++++++++++++++--------------------------
>  1 file changed, 26 insertions(+), 26 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 571aca8c637a..eb1968a1041e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3119,8 +3119,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>  	 */
>  	count_vm_event(COMPACTSTALL);
>  
> -	page = get_page_from_freelist(gfp_mask, order,
> -					alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
> +	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>  
>  	if (page) {
>  		struct zone *zone = page_zone(page);
> @@ -3288,8 +3287,7 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
>  		return NULL;
>  
>  retry:
> -	page = get_page_from_freelist(gfp_mask, order,
> -					alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
> +	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>  
>  	/*
>  	 * If an allocation failed after direct reclaim, it could be because
> @@ -3351,16 +3349,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	} else if (unlikely(rt_task(current)) && !in_interrupt())
>  		alloc_flags |= ALLOC_HARDER;
>  
> -	if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
> -		if (gfp_mask & __GFP_MEMALLOC)
> -			alloc_flags |= ALLOC_NO_WATERMARKS;
> -		else if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
> -			alloc_flags |= ALLOC_NO_WATERMARKS;
> -		else if (!in_interrupt() &&
> -				((current->flags & PF_MEMALLOC) ||
> -				 unlikely(test_thread_flag(TIF_MEMDIE))))
> -			alloc_flags |= ALLOC_NO_WATERMARKS;
> -	}
>  #ifdef CONFIG_CMA
>  	if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
>  		alloc_flags |= ALLOC_CMA;
> @@ -3370,7 +3358,19 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  
>  bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  {
> -	return !!(gfp_to_alloc_flags(gfp_mask) & ALLOC_NO_WATERMARKS);
> +	if (unlikely(gfp_mask & __GFP_NOMEMALLOC))
> +		return false;
> +
> +	if (gfp_mask & __GFP_MEMALLOC)
> +		return true;
> +	if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
> +		return true;
> +	if (!in_interrupt() &&
> +			((current->flags & PF_MEMALLOC) ||
> +			 unlikely(test_thread_flag(TIF_MEMDIE))))
> +		return true;
> +
> +	return false;
>  }
>  
>  static inline bool is_thp_gfp_mask(gfp_t gfp_mask)
> @@ -3534,36 +3534,36 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  				(__GFP_ATOMIC|__GFP_DIRECT_RECLAIM)))
>  		gfp_mask &= ~__GFP_ATOMIC;
>  
> -retry:
> -	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> -		wake_all_kswapds(order, ac);
> -
>  	/*
> -	 * OK, we're below the kswapd watermark and have kicked background
> -	 * reclaim. Now things get more complex, so set up alloc_flags according
> -	 * to how we want to proceed.
> +	 * The fast path uses conservative alloc_flags to succeed only until
> +	 * kswapd needs to be woken up, and to avoid the cost of setting up
> +	 * alloc_flags precisely. So we do that now.
>  	 */
>  	alloc_flags = gfp_to_alloc_flags(gfp_mask);
>  
> +retry:
> +	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> +		wake_all_kswapds(order, ac);
> +
>  	/*
>  	 * Reset the zonelist iterators if memory policies can be ignored.
>  	 * These allocations are high priority and system rather than user
>  	 * orientated.
>  	 */
> -	if ((alloc_flags & ALLOC_NO_WATERMARKS) || !(alloc_flags & ALLOC_CPUSET)) {
> +	if (!(alloc_flags & ALLOC_CPUSET) || gfp_pfmemalloc_allowed(gfp_mask)) {
>  		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
>  		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
>  					ac->high_zoneidx, ac->nodemask);
>  	}
>  
>  	/* This is the last chance, in general, before the goto nopage. */
> -	page = get_page_from_freelist(gfp_mask, order,
> -				alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
> +	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>  	if (page)
>  		goto got_pg;
>  
>  	/* Allocate without watermarks if the context allows */
> -	if (alloc_flags & ALLOC_NO_WATERMARKS) {
> +	if (gfp_pfmemalloc_allowed(gfp_mask)) {
> +
>  		page = get_page_from_freelist(gfp_mask, order,
>  						ALLOC_NO_WATERMARKS, ac);
>  		if (page)
> -- 
> 2.9.0
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/8] mm, page_alloc: don't retry initial attempt in slowpath
  2016-07-18 11:22 ` [PATCH 3/8] mm, page_alloc: don't retry initial attempt " Vlastimil Babka
@ 2016-07-18 11:29   ` Michal Hocko
  2016-07-18 11:34     ` Vlastimil Babka
  2016-07-19 22:36   ` David Rientjes
  1 sibling, 1 reply; 24+ messages in thread
From: Michal Hocko @ 2016-07-18 11:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel

On Mon 18-07-16 13:22:57, Vlastimil Babka wrote:
> After __alloc_pages_slowpath() sets up new alloc_flags and wakes up kswapd, it
> first tries get_page_from_freelist() with the new alloc_flags, as it may
> succeed e.g. due to using the min watermark instead of the low watermark. It
> makes sense to do this attempt before adjusting the zonelist based on
> alloc_flags/gfp_mask, as it's still a relatively fast path if we just wake up
> kswapd and successfully allocate.
> 
> This patch therefore moves the initial attempt above the retry label and
> reorganizes the part below the retry label a bit. We still have to attempt
> get_page_from_freelist() on each retry, as some allocations cannot do that
> as part of direct reclaim or compaction, and yet are not allowed to fail
> (even though they do a WARN_ON_ONCE() and thus should not exist). We can reuse
> the call meant for the ALLOC_NO_WATERMARKS attempt and just set alloc_flags to
> ALLOC_NO_WATERMARKS if the context allows it. As a side-effect, the attempts
> from direct reclaim/compaction will also no longer obey watermarks once this
> is set, but there's little harm in that.
> 
> Kswapd wakeups are also done on each retry to be safe from potential races
> resulting in kswapd going to sleep while a process (that may not be able to
> reclaim by itself) is still looping.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Same here, my ack still holds
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/page_alloc.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index eb1968a1041e..30443804f156 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3541,35 +3541,42 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	 */
>  	alloc_flags = gfp_to_alloc_flags(gfp_mask);
>  
> +	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> +		wake_all_kswapds(order, ac);
> +
> +	/*
> +	 * The adjusted alloc_flags might result in immediate success, so try
> +	 * that first
> +	 */
> +	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> +	if (page)
> +		goto got_pg;
> +
> +
>  retry:
> +	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
>  	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
>  		wake_all_kswapds(order, ac);
>  
> +	if (gfp_pfmemalloc_allowed(gfp_mask))
> +		alloc_flags = ALLOC_NO_WATERMARKS;
> +
>  	/*
>  	 * Reset the zonelist iterators if memory policies can be ignored.
>  	 * These allocations are high priority and system rather than user
>  	 * orientated.
>  	 */
> -	if (!(alloc_flags & ALLOC_CPUSET) || gfp_pfmemalloc_allowed(gfp_mask)) {
> +	if (!(alloc_flags & ALLOC_CPUSET) || (alloc_flags & ALLOC_NO_WATERMARKS)) {
>  		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
>  		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
>  					ac->high_zoneidx, ac->nodemask);
>  	}
>  
> -	/* This is the last chance, in general, before the goto nopage. */
> +	/* Attempt with potentially adjusted zonelist and alloc_flags */
>  	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>  	if (page)
>  		goto got_pg;
>  
> -	/* Allocate without watermarks if the context allows */
> -	if (gfp_pfmemalloc_allowed(gfp_mask)) {
> -
> -		page = get_page_from_freelist(gfp_mask, order,
> -						ALLOC_NO_WATERMARKS, ac);
> -		if (page)
> -			goto got_pg;
> -	}
> -
>  	/* Caller is not willing to reclaim, we can't balance anything */
>  	if (!can_direct_reclaim) {
>  		/*
> -- 
> 2.9.0
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/8] compaction-related cleanups v4
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
                   ` (7 preceding siblings ...)
  2016-07-18 11:23 ` [PATCH 8/8] mm, compaction: simplify contended compaction handling Vlastimil Babka
@ 2016-07-18 11:30 ` Michal Hocko
  2016-07-18 15:41 ` Mel Gorman
  9 siblings, 0 replies; 24+ messages in thread
From: Michal Hocko @ 2016-07-18 11:30 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel

On Mon 18-07-16 13:22:54, Vlastimil Babka wrote:
> Hi,
> 
> this is the split-off first part of my "make direct compaction more
> deterministic" series [1], rebased on mmotm-2016-07-13-16-09-18. For the whole
> series it's probably too late for 4.8 given some unresolved feedback, but I
> hope this part could go in, as it has been stable for quite some time.
> 
> At the very least, the first patch really shouldn't wait any longer.

I think the rest also looks good to go. It makes the code more readable,
removes some hacks and...

>  11 files changed, 164 insertions(+), 233 deletions(-)

looks promising as well.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/8] mm, page_alloc: don't retry initial attempt in slowpath
  2016-07-18 11:29   ` Michal Hocko
@ 2016-07-18 11:34     ` Vlastimil Babka
  0 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-18 11:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman, Joonsoo Kim,
	David Rientjes, Rik van Riel

On 07/18/2016 01:29 PM, Michal Hocko wrote:
> On Mon 18-07-16 13:22:57, Vlastimil Babka wrote:
>> After __alloc_pages_slowpath() sets up new alloc_flags and wakes up kswapd, it
>> first tries get_page_from_freelist() with the new alloc_flags, as it may
>> succeed e.g. due to using the min watermark instead of the low watermark. It
>> makes sense to do this attempt before adjusting the zonelist based on
>> alloc_flags/gfp_mask, as it's still a relatively fast path if we just wake up
>> kswapd and successfully allocate.
>>
>> This patch therefore moves the initial attempt above the retry label and
>> reorganizes the part below the retry label a bit. We still have to attempt
>> get_page_from_freelist() on each retry, as some allocations cannot do that
>> as part of direct reclaim or compaction, and yet are not allowed to fail
>> (even though they do a WARN_ON_ONCE() and thus should not exist). We can reuse
>> the call meant for the ALLOC_NO_WATERMARKS attempt and just set alloc_flags to
>> ALLOC_NO_WATERMARKS if the context allows it. As a side-effect, the attempts
>> from direct reclaim/compaction will also no longer obey watermarks once this
>> is set, but there's little harm in that.
>>
>> Kswapd wakeups are also done on each retry to be safe from potential races
>> resulting in kswapd going to sleep while a process (that may not be able to
>> reclaim by itself) is still looping.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>
> Same here, my ack still holds
> Acked-by: Michal Hocko <mhocko@suse.com>

Sorry, forgot to add them before sending. Thanks for both!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/8] compaction-related cleanups v4
  2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
                   ` (8 preceding siblings ...)
  2016-07-18 11:30 ` [PATCH 0/8] compaction-related cleanups v4 Michal Hocko
@ 2016-07-18 15:41 ` Mel Gorman
  9 siblings, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2016-07-18 15:41 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Joonsoo Kim,
	David Rientjes, Rik van Riel

On Mon, Jul 18, 2016 at 01:22:54PM +0200, Vlastimil Babka wrote:
> Hi,
> 
> this is the split-off first part of my "make direct compaction more
> deterministic" series [1], rebased on mmotm-2016-07-13-16-09-18. For the whole
> series it's probably too late for 4.8 given some unresolved feedback, but I
> hope this part could go in, as it has been stable for quite some time.
> 
> At the very least, the first patch really shouldn't wait any longer.
> 

I read through the patches but did not have a substantial or useful
comment to make. The compaction priority stuff is interesting and while
it'll take a little getting used to, I think it's a better way of
viewing compaction in general. For the series

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/8] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
  2016-07-18 11:22 ` [PATCH 1/8] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode Vlastimil Babka
@ 2016-07-19 22:21   ` David Rientjes
  0 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2016-07-19 22:21 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel, Hugh Dickins

On Mon, 18 Jul 2016, Vlastimil Babka wrote:

> From: Hugh Dickins <hughd@google.com>
> 
> At present MIGRATE_SYNC_LIGHT is allowing __isolate_lru_page() to
> isolate a PageWriteback page, which __unmap_and_move() then rejects
> with -EBUSY: of course the writeback might complete in between, but
> that's not what we usually expect, so probably better not to isolate it.
> 
> When tested by stress-highalloc from mmtests, this has reduced the number of
> page migrate failures by 60-70%.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Michal Hocko <mhocko@suse.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath
  2016-07-18 11:22 ` [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath Vlastimil Babka
  2016-07-18 11:27   ` Michal Hocko
@ 2016-07-19 22:28   ` David Rientjes
  2016-07-21  7:00     ` Vlastimil Babka
  1 sibling, 1 reply; 24+ messages in thread
From: David Rientjes @ 2016-07-19 22:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On Mon, 18 Jul 2016, Vlastimil Babka wrote:

> In __alloc_pages_slowpath(), alloc_flags doesn't change after it's initialized,
> so move the initialization above the retry: label. Also make the comment above
> the initialization more descriptive.
> 
> The only exception to alloc_flags being constant is ALLOC_NO_WATERMARKS,
> which may change due to TIF_MEMDIE being set on the allocating thread. We can
> fix this, and make the code simpler and a bit more efficient at the same time,
> by moving the part that determines ALLOC_NO_WATERMARKS from
> gfp_to_alloc_flags() to gfp_pfmemalloc_allowed(). This means we don't have to
> mask out ALLOC_NO_WATERMARKS in numerous places in __alloc_pages_slowpath()
> anymore. The only two tests for the flag can instead call
> gfp_pfmemalloc_allowed().
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Acked-by: David Rientjes <rientjes@google.com>

Looks good, although maybe a new name for gfp_pfmemalloc_allowed() would 
be in order.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/8] mm, page_alloc: don't retry initial attempt in slowpath
  2016-07-18 11:22 ` [PATCH 3/8] mm, page_alloc: don't retry initial attempt " Vlastimil Babka
  2016-07-18 11:29   ` Michal Hocko
@ 2016-07-19 22:36   ` David Rientjes
  2016-07-20 15:25     ` Vlastimil Babka
  1 sibling, 1 reply; 24+ messages in thread
From: David Rientjes @ 2016-07-19 22:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On Mon, 18 Jul 2016, Vlastimil Babka wrote:

> After __alloc_pages_slowpath() sets up new alloc_flags and wakes up kswapd, it
> first tries get_page_from_freelist() with the new alloc_flags, as it may
> succeed e.g. due to using the min watermark instead of the low watermark. It
> makes sense to do this attempt before adjusting the zonelist based on
> alloc_flags/gfp_mask, as it's still a relatively fast path if we just wake up
> kswapd and successfully allocate.
> 
> This patch therefore moves the initial attempt above the retry label and
> reorganizes the part below the retry label a bit. We still have to attempt
> get_page_from_freelist() on each retry, as some allocations cannot do that
> as part of direct reclaim or compaction, and yet are not allowed to fail
> (even though they do a WARN_ON_ONCE() and thus should not exist). We can reuse
> the call meant for the ALLOC_NO_WATERMARKS attempt and just set alloc_flags to
> ALLOC_NO_WATERMARKS if the context allows it. As a side-effect, the attempts
> from direct reclaim/compaction will also no longer obey watermarks once this
> is set, but there's little harm in that.
> 
> Kswapd wakeups are also done on each retry to be safe from potential races
> resulting in kswapd going to sleep while a process (that may not be able to
> reclaim by itself) is still looping.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/page_alloc.c | 29 ++++++++++++++++++-----------
>  1 file changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index eb1968a1041e..30443804f156 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3541,35 +3541,42 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	 */
>  	alloc_flags = gfp_to_alloc_flags(gfp_mask);
>  
> +	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> +		wake_all_kswapds(order, ac);
> +
> +	/*
> +	 * The adjusted alloc_flags might result in immediate success, so try
> +	 * that first
> +	 */
> +	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> +	if (page)
> +		goto got_pg;

Any reason to not test gfp_pfmemalloc_allowed() here?  For contexts where 
it returns true, it seems like the above would be an unneeded failure if 
ALLOC_WMARK_MIN would have failed.  No strong opinion.

> +
> +
>  retry:
> +	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
>  	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
>  		wake_all_kswapds(order, ac);
>  
> +	if (gfp_pfmemalloc_allowed(gfp_mask))
> +		alloc_flags = ALLOC_NO_WATERMARKS;
> +
>  	/*
>  	 * Reset the zonelist iterators if memory policies can be ignored.
>  	 * These allocations are high priority and system rather than user
>  	 * orientated.
>  	 */
> -	if (!(alloc_flags & ALLOC_CPUSET) || gfp_pfmemalloc_allowed(gfp_mask)) {
> +	if (!(alloc_flags & ALLOC_CPUSET) || (alloc_flags & ALLOC_NO_WATERMARKS)) {

Do we need to test ALLOC_NO_WATERMARKS here, or is it just for clarity?

Otherwise looks good!

>  		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
>  		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
>  					ac->high_zoneidx, ac->nodemask);
>  	}
>  
> -	/* This is the last chance, in general, before the goto nopage. */
> +	/* Attempt with potentially adjusted zonelist and alloc_flags */
>  	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>  	if (page)
>  		goto got_pg;
>  
> -	/* Allocate without watermarks if the context allows */
> -	if (gfp_pfmemalloc_allowed(gfp_mask)) {
> -
> -		page = get_page_from_freelist(gfp_mask, order,
> -						ALLOC_NO_WATERMARKS, ac);
> -		if (page)
> -			goto got_pg;
> -	}
> -
>  	/* Caller is not willing to reclaim, we can't balance anything */
>  	if (!can_direct_reclaim) {
>  		/*

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/8] mm, page_alloc: restructure direct compaction handling in slowpath
  2016-07-18 11:22 ` [PATCH 4/8] mm, page_alloc: restructure direct compaction handling " Vlastimil Babka
@ 2016-07-19 22:50   ` David Rientjes
  2016-07-20 16:02     ` Vlastimil Babka
  0 siblings, 1 reply; 24+ messages in thread
From: David Rientjes @ 2016-07-19 22:50 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On Mon, 18 Jul 2016, Vlastimil Babka wrote:

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 30443804f156..a04a67745927 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3510,7 +3510,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	struct page *page = NULL;
>  	unsigned int alloc_flags;
>  	unsigned long did_some_progress;
> -	enum migrate_mode migration_mode = MIGRATE_ASYNC;
> +	enum migrate_mode migration_mode = MIGRATE_SYNC_LIGHT;
>  	enum compact_result compact_result;
>  	int compaction_retries = 0;
>  	int no_progress_loops = 0;
> @@ -3552,6 +3552,49 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	if (page)
>  		goto got_pg;
>  
> +	/*
> +	 * For costly allocations, try direct compaction first, as it's likely
> +	 * that we have enough base pages and don't need to reclaim.
> +	 */
> +	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER) {
> +		page = __alloc_pages_direct_compact(gfp_mask, order,
> +						alloc_flags, ac,
> +						MIGRATE_ASYNC,
> +						&compact_result);
> +		if (page)
> +			goto got_pg;
> +
> +		/* Checks for THP-specific high-order allocations */
> +		if (is_thp_gfp_mask(gfp_mask)) {
> +			/*
> +			 * If compaction is deferred for high-order allocations,
> +			 * it is because sync compaction recently failed. If
> +			 * this is the case and the caller requested a THP
> +			 * allocation, we do not want to heavily disrupt the
> +			 * system, so we fail the allocation instead of entering
> +			 * direct reclaim.
> +			 */
> +			if (compact_result == COMPACT_DEFERRED)
> +				goto nopage;
> +
> +			/*
> +			 * Compaction is contended so rather back off than cause
> +			 * excessive stalls.
> +			 */
> +			if (compact_result == COMPACT_CONTENDED)
> +				goto nopage;
> +
> +			/*
> +			 * It can become very expensive to allocate transparent
> +			 * hugepages at fault, so use asynchronous memory
> +			 * compaction for THP unless it is khugepaged trying to
> +			 * collapse. All other requests should tolerate at
> +			 * least light sync migration.
> +			 */
> +			if (!(current->flags & PF_KTHREAD))
> +				migration_mode = MIGRATE_ASYNC;
> +		}
> +	}
>  

If gfp_pfmemalloc_allowed() == true, does this try to do compaction when 
get_page_from_freelist() would have succeeded with no watermarks?

>  retry:
>  	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
> @@ -3606,55 +3649,33 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
>  		goto nopage;
>  
> -	/*
> -	 * Try direct compaction. The first pass is asynchronous. Subsequent
> -	 * attempts after direct reclaim are synchronous
> -	 */
> +
> +	/* Try direct reclaim and then allocating */
> +	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
> +							&did_some_progress);
> +	if (page)
> +		goto got_pg;
> +
> +	/* Try direct compaction and then allocating */
>  	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
>  					migration_mode,
>  					&compact_result);
>  	if (page)
>  		goto got_pg;
>  
> -	/* Checks for THP-specific high-order allocations */
> -	if (is_thp_gfp_mask(gfp_mask)) {
> -		/*
> -		 * If compaction is deferred for high-order allocations, it is
> -		 * because sync compaction recently failed. If this is the case
> -		 * and the caller requested a THP allocation, we do not want
> -		 * to heavily disrupt the system, so we fail the allocation
> -		 * instead of entering direct reclaim.
> -		 */
> -		if (compact_result == COMPACT_DEFERRED)
> -			goto nopage;
> -
> -		/*
> -		 * Compaction is contended so rather back off than cause
> -		 * excessive stalls.
> -		 */
> -		if(compact_result == COMPACT_CONTENDED)
> -			goto nopage;
> -	}
> -
>  	if (order && compaction_made_progress(compact_result))
>  		compaction_retries++;
>  
> -	/* Try direct reclaim and then allocating */
> -	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
> -							&did_some_progress);
> -	if (page)
> -		goto got_pg;
> -
>  	/* Do not loop if specifically requested */
>  	if (gfp_mask & __GFP_NORETRY)
> -		goto noretry;
> +		goto nopage;
>  
>  	/*
>  	 * Do not retry costly high order allocations unless they are
>  	 * __GFP_REPEAT
>  	 */
>  	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
> -		goto noretry;
> +		goto nopage;
>  
>  	/*
>  	 * Costly allocations might have made a progress but this doesn't mean
> @@ -3693,25 +3714,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  		goto retry;
>  	}
>  
> -noretry:
> -	/*
> -	 * High-order allocations do not necessarily loop after direct reclaim
> -	 * and reclaim/compaction depends on compaction being called after
> -	 * reclaim so call directly if necessary.
> -	 * It can become very expensive to allocate transparent hugepages at
> -	 * fault, so use asynchronous memory compaction for THP unless it is
> -	 * khugepaged trying to collapse. All other requests should tolerate
> -	 * at least light sync migration.
> -	 */
> -	if (is_thp_gfp_mask(gfp_mask) && !(current->flags & PF_KTHREAD))
> -		migration_mode = MIGRATE_ASYNC;
> -	else
> -		migration_mode = MIGRATE_SYNC_LIGHT;
> -	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags,
> -					    ac, migration_mode,
> -					    &compact_result);
> -	if (page)
> -		goto got_pg;
>  nopage:
>  	warn_alloc_failed(gfp_mask, order, NULL);
>  got_pg:

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/8] mm, page_alloc: make THP-specific decisions more generic
  2016-07-18 11:22 ` [PATCH 5/8] mm, page_alloc: make THP-specific decisions more generic Vlastimil Babka
@ 2016-07-19 23:10   ` David Rientjes
  2016-07-21  7:13     ` Vlastimil Babka
  0 siblings, 1 reply; 24+ messages in thread
From: David Rientjes @ 2016-07-19 23:10 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On Mon, 18 Jul 2016, Vlastimil Babka wrote:

> Since THP allocations during page faults can be costly, extra decisions are
> employed for them to avoid excessive reclaim and compaction, if the initial
> compaction doesn't look promising. The detection has never been perfect as
> there is no gfp flag specific to THP allocations. At this moment it checks the
> whole combination of flags that makes up GFP_TRANSHUGE, and hopes that no other
> users of such a combination exist, or would mind being treated the same way.
> Extra care is also taken to separate allocations from khugepaged, where latency
> doesn't matter that much.
> 
> It is however possible to distinguish these allocations in a simpler and more
> reliable way. The key observation is that after the initial compaction followed
> by the first iteration of "standard" reclaim/compaction, both __GFP_NORETRY
> allocations and costly allocations without __GFP_REPEAT are declared as
> failures:
> 
>         /* Do not loop if specifically requested */
>         if (gfp_mask & __GFP_NORETRY)
>                 goto nopage;
> 
>         /*
>          * Do not retry costly high order allocations unless they are
>          * __GFP_REPEAT
>          */
>         if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
>                 goto nopage;
> 
> This means we can further distinguish allocations that are costly order *and*
> additionally include the __GFP_NORETRY flag. As it happens, GFP_TRANSHUGE
> allocations do already fall into this category. This will also allow other
> costly allocations with similar high-order benefit vs latency considerations to
> use this semantic. Furthermore, we can distinguish THP allocations that should
> try a bit harder (such as from khugepaged) by removing __GFP_NORETRY, as will
> be done in the next patch.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Michal Hocko <mhocko@suse.com>

I think this is fine, but I would hope that we could check 
gfp_pfmemalloc_allowed() before compacting and failing even for costly 
orders when otherwise the first get_page_from_freelist() in the slowpath 
may have succeeded due to watermarks.
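
For illustration, the classification described in the commit message reduces
to a check along these lines (a sketch; the helper name is hypothetical and
does not appear in the patch):

	/*
	 * Hypothetical helper: a costly-order allocation that also passes
	 * __GFP_NORETRY is an opportunistic high-order attempt that should
	 * fail fast instead of reclaiming aggressively. GFP_TRANSHUGE
	 * page-fault allocations already fall into this category.
	 */
	static inline bool costly_noretry_alloc(gfp_t gfp_mask, unsigned int order)
	{
		return order > PAGE_ALLOC_COSTLY_ORDER &&
			(gfp_mask & __GFP_NORETRY);
	}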

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/8] mm, page_alloc: don't retry initial attempt in slowpath
  2016-07-19 22:36   ` David Rientjes
@ 2016-07-20 15:25     ` Vlastimil Babka
  2016-07-20 22:00       ` David Rientjes
  0 siblings, 1 reply; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-20 15:25 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On 07/20/2016 12:36 AM, David Rientjes wrote:
> On Mon, 18 Jul 2016, Vlastimil Babka wrote:
> 
>> After __alloc_pages_slowpath() sets up new alloc_flags and wakes up kswapd, it
>> first tries get_page_from_freelist() with the new alloc_flags, as it may
>> succeed e.g. due to using the min watermark instead of the low watermark. It
>> makes sense to do this attempt before adjusting the zonelist based on
>> alloc_flags/gfp_mask, as it's still a relatively fast path if we just wake up
>> kswapd and successfully allocate.
>>
>> This patch therefore moves the initial attempt above the retry label and
>> reorganizes the part below the retry label a bit. We still have to attempt
>> get_page_from_freelist() on each retry, as some allocations cannot do that
>> as part of direct reclaim or compaction, and yet are not allowed to fail
>> (even though they do a WARN_ON_ONCE() and thus should not exist). We can reuse
>> the call meant for the ALLOC_NO_WATERMARKS attempt and just set alloc_flags to
>> ALLOC_NO_WATERMARKS if the context allows it. As a side-effect, the attempts
>> from direct reclaim/compaction will also no longer obey watermarks once this
>> is set, but there's little harm in that.
>>
>> Kswapd wakeups are also done on each retry to be safe from potential races
>> resulting in kswapd going to sleep while a process (that may not be able to
>> reclaim by itself) is still looping.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>>  mm/page_alloc.c | 29 ++++++++++++++++++-----------
>>  1 file changed, 18 insertions(+), 11 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index eb1968a1041e..30443804f156 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3541,35 +3541,42 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>>  	 */
>>  	alloc_flags = gfp_to_alloc_flags(gfp_mask);
>>  
>> +	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
>> +		wake_all_kswapds(order, ac);
>> +
>> +	/*
>> +	 * The adjusted alloc_flags might result in immediate success, so try
>> +	 * that first
>> +	 */
>> +	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>> +	if (page)
>> +		goto got_pg;
> 
> Any reason to not test gfp_pfmemalloc_allowed() here?  For contexts where 
> it returns true, it seems like the above would be an unneeded failure if 
> ALLOC_WMARK_MIN would have failed.  No strong opinion.

Yeah, two reasons:
1 - less overhead (for the test) if we went to the slowpath just to wake
up kswapd and then succeed on the min watermark
2 - try all zones with the min watermark before resorting to no
watermarks (if allowed), so we don't needlessly push the first zone in
the zonelist below the min watermark while some later zone would still
be above it
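
For concreteness, the ordering in point 2 maps onto the patched slowpath
roughly like this (a condensed sketch; surrounding logic elided):

	alloc_flags = gfp_to_alloc_flags(gfp_mask);

	/* First attempt: min watermark, trying every eligible zone. */
	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
	if (page)
		goto got_pg;

retry:
	/* Only retries may ignore the watermarks, if the context allows. */
	if (gfp_pfmemalloc_allowed(gfp_mask))
		alloc_flags = ALLOC_NO_WATERMARKS;

	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
	if (page)
		goto got_pg;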

> 
>> +
>> +
>>  retry:
>> +	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
>>  	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
>>  		wake_all_kswapds(order, ac);
>>  
>> +	if (gfp_pfmemalloc_allowed(gfp_mask))
>> +		alloc_flags = ALLOC_NO_WATERMARKS;
>> +
>>  	/*
>>  	 * Reset the zonelist iterators if memory policies can be ignored.
>>  	 * These allocations are high priority and system rather than user
>>  	 * orientated.
>>  	 */
>> -	if (!(alloc_flags & ALLOC_CPUSET) || gfp_pfmemalloc_allowed(gfp_mask)) {
>> +	if (!(alloc_flags & ALLOC_CPUSET) || (alloc_flags & ALLOC_NO_WATERMARKS)) {
> 
> Do we need to test ALLOC_NO_WATERMARKS here, or is it just for clarity?

I didn't realize it's redundant, but I would keep it for clarity and
robustness anyway.

> 
> Otherwise looks good!

Thanks!

>>  		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
>>  		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
>>  					ac->high_zoneidx, ac->nodemask);
>>  	}
>>  
>> -	/* This is the last chance, in general, before the goto nopage. */
>> +	/* Attempt with potentially adjusted zonelist and alloc_flags */
>>  	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>>  	if (page)
>>  		goto got_pg;
>>  
>> -	/* Allocate without watermarks if the context allows */
>> -	if (gfp_pfmemalloc_allowed(gfp_mask)) {
>> -
>> -		page = get_page_from_freelist(gfp_mask, order,
>> -						ALLOC_NO_WATERMARKS, ac);
>> -		if (page)
>> -			goto got_pg;
>> -	}
>> -
>>  	/* Caller is not willing to reclaim, we can't balance anything */
>>  	if (!can_direct_reclaim) {
>>  		/*

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/8] mm, page_alloc: restructure direct compaction handling in slowpath
  2016-07-19 22:50   ` David Rientjes
@ 2016-07-20 16:02     ` Vlastimil Babka
  0 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-20 16:02 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On 07/20/2016 12:50 AM, David Rientjes wrote:
> On Mon, 18 Jul 2016, Vlastimil Babka wrote:
> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 30443804f156..a04a67745927 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3510,7 +3510,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>>  	struct page *page = NULL;
>>  	unsigned int alloc_flags;
>>  	unsigned long did_some_progress;
>> -	enum migrate_mode migration_mode = MIGRATE_ASYNC;
>> +	enum migrate_mode migration_mode = MIGRATE_SYNC_LIGHT;
>>  	enum compact_result compact_result;
>>  	int compaction_retries = 0;
>>  	int no_progress_loops = 0;
>> @@ -3552,6 +3552,49 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>>  	if (page)
>>  		goto got_pg;
>>  
>> +	/*
>> +	 * For costly allocations, try direct compaction first, as it's likely
>> +	 * that we have enough base pages and don't need to reclaim.
>> +	 */
>> +	if (can_direct_reclaim && order > PAGE_ALLOC_COSTLY_ORDER) {
>> +		page = __alloc_pages_direct_compact(gfp_mask, order,
>> +						alloc_flags, ac,
>> +						MIGRATE_ASYNC,
>> +						&compact_result);
>> +		if (page)
>> +			goto got_pg;
>> +
>> +		/* Checks for THP-specific high-order allocations */
>> +		if (is_thp_gfp_mask(gfp_mask)) {
>> +			/*
>> +			 * If compaction is deferred for high-order allocations,
>> +			 * it is because sync compaction recently failed. If
>> +			 * this is the case and the caller requested a THP
>> +			 * allocation, we do not want to heavily disrupt the
>> +			 * system, so we fail the allocation instead of entering
>> +			 * direct reclaim.
>> +			 */
>> +			if (compact_result == COMPACT_DEFERRED)
>> +				goto nopage;
>> +
>> +			/*
>> +			 * Compaction is contended so rather back off than cause
>> +			 * excessive stalls.
>> +			 */
>> +			if (compact_result == COMPACT_CONTENDED)
>> +				goto nopage;
>> +
>> +			/*
>> +			 * It can become very expensive to allocate transparent
>> +			 * hugepages at fault, so use asynchronous memory
>> +			 * compaction for THP unless it is khugepaged trying to
>> +			 * collapse. All other requests should tolerate at
>> +			 * least light sync migration.
>> +			 */
>> +			if (!(current->flags & PF_KTHREAD))
>> +				migration_mode = MIGRATE_ASYNC;
>> +		}
>> +	}
>>  
> 
> If gfp_pfmemalloc_allowed() == true, does this try to do compaction when 
> get_page_from_freelist() would have succeeded with no watermarks?

Yes, but compaction will return immediately with COMPACT_SKIPPED if
we are below the min watermarks. So I don't think it's worth complicating
the code to avoid this?
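
The early bail-out referred to here sits in the compaction suitability
check; roughly, simplified from __compaction_suitable() of this era (not a
verbatim quote, and the exact gap added to the watermark varies between
kernel versions):

	/*
	 * Compaction itself needs free order-0 pages to migrate into, so a
	 * zone below its (slightly raised) low watermark is skipped outright.
	 */
	unsigned long watermark = low_wmark_pages(zone) + (2UL << order);

	if (!zone_watermark_ok(zone, 0, watermark, classzone_idx, alloc_flags))
		return COMPACT_SKIPPED;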

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/8] mm, page_alloc: don't retry initial attempt in slowpath
  2016-07-20 15:25     ` Vlastimil Babka
@ 2016-07-20 22:00       ` David Rientjes
  0 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2016-07-20 22:00 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On Wed, 20 Jul 2016, Vlastimil Babka wrote:

> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index eb1968a1041e..30443804f156 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -3541,35 +3541,42 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >>  	 */
> >>  	alloc_flags = gfp_to_alloc_flags(gfp_mask);
> >>  
> >> +	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> >> +		wake_all_kswapds(order, ac);
> >> +
> >> +	/*
> >> +	 * The adjusted alloc_flags might result in immediate success, so try
> >> +	 * that first
> >> +	 */
> >> +	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> >> +	if (page)
> >> +		goto got_pg;
> > 
> > Any reason to not test gfp_pfmemalloc_allowed() here?  For contexts where 
> > it returns true, it seems like the above would be an unneeded failure if 
> > ALLOC_WMARK_MIN would have failed.  No strong opinion.
> 
> Yeah, two reasons:
> 1 - less overhead (for the test) if we went to slowpath just to wake up
> kswapd and then succeed on min watermark
> 2 - try all zones with min watermark before resorting to no watermark
> (if allowed), so we don't needlessly put below min watermark the first
> zone in zonelist, while some later zone would still be above watermark
> 

The second point makes sense, thanks!

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath
  2016-07-19 22:28   ` David Rientjes
@ 2016-07-21  7:00     ` Vlastimil Babka
  0 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-21  7:00 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On 07/20/2016 12:28 AM, David Rientjes wrote:
> On Mon, 18 Jul 2016, Vlastimil Babka wrote:
>
>> In __alloc_pages_slowpath(), alloc_flags doesn't change after it's initialized,
>> so move the initialization above the retry: label. Also make the comment above
>> the initialization more descriptive.
>>
>> The only exception to alloc_flags being constant is ALLOC_NO_WATERMARKS,
>> which may change due to TIF_MEMDIE being set on the allocating thread. We can
>> fix this, and make the code simpler and a bit more efficient at the same time,
>> by moving the part that determines ALLOC_NO_WATERMARKS from
>> gfp_to_alloc_flags() to gfp_pfmemalloc_allowed(). This means we don't have to
>> mask out ALLOC_NO_WATERMARKS in numerous places in __alloc_pages_slowpath()
>> anymore. The only two tests for the flag can instead call
>> gfp_pfmemalloc_allowed().
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>
> Acked-by: David Rientjes <rientjes@google.com>
>
> Looks good, although maybe a new name for gfp_pfmemalloc_allowed() would
> be in order.

I don't disagree... any good suggestions? :)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/8] mm, page_alloc: make THP-specific decisions more generic
  2016-07-19 23:10   ` David Rientjes
@ 2016-07-21  7:13     ` Vlastimil Babka
  0 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2016-07-21  7:13 UTC (permalink / raw)
  To: David Rientjes
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko, Mel Gorman,
	Joonsoo Kim, Rik van Riel

On 07/20/2016 01:10 AM, David Rientjes wrote:
> On Mon, 18 Jul 2016, Vlastimil Babka wrote:
>
>> This means we can further distinguish allocations that are costly order *and*
>> additionally include the __GFP_NORETRY flag. As it happens, GFP_TRANSHUGE
>> allocations do already fall into this category. This will also allow other
>> costly allocations with similar high-order benefit vs latency considerations to
>> use this semantic. Furthermore, we can distinguish THP allocations that should
>> try a bit harder (such as from khugepaged) by removing __GFP_NORETRY, as will
>> be done in the next patch.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> Acked-by: Michal Hocko <mhocko@suse.com>
>
> I think this is fine, but I would hope that we could check
> gfp_pfmemalloc_allowed() before compacting and failing even for costly
> orders when otherwise the first get_page_from_freelist() in the slowpath
> may have succeeded due to watermarks.

Hm ok, I will add it for the sake of avoiding the goto nopage where
previously it would have tried allocating without watermarks, as that
would be an unintended side-effect of the series... although I have some
doubts about the sanity of such scenarios (wants a costly order, can
reclaim/compact but only with __GFP_NORETRY, yet is allowed to ignore
watermarks?). Do you know of examples of such callers, and do you think
they do the right thing?
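
For reference, the guard being discussed would presumably sit in front of
the costly-order bail-out, along these lines (a sketch only; no such patch
was posted in this thread):

	/* Do not loop if specifically requested... */
	if (gfp_mask & __GFP_NORETRY) {
		/* ...but let watermark-exempt contexts attempt once more. */
		if (gfp_pfmemalloc_allowed(gfp_mask)) {
			page = get_page_from_freelist(gfp_mask, order,
						ALLOC_NO_WATERMARKS, ac);
			if (page)
				goto got_pg;
		}
		goto nopage;
	}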

Thanks,
Vlastimil

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2016-07-21  7:13 UTC | newest]

Thread overview: 24+ messages
-- links below jump to the message on this page --
2016-07-18 11:22 [PATCH 0/8] compaction-related cleanups v4 Vlastimil Babka
2016-07-18 11:22 ` [PATCH 1/8] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode Vlastimil Babka
2016-07-19 22:21   ` David Rientjes
2016-07-18 11:22 ` [PATCH 2/8] mm, page_alloc: set alloc_flags only once in slowpath Vlastimil Babka
2016-07-18 11:27   ` Michal Hocko
2016-07-19 22:28   ` David Rientjes
2016-07-21  7:00     ` Vlastimil Babka
2016-07-18 11:22 ` [PATCH 3/8] mm, page_alloc: don't retry initial attempt " Vlastimil Babka
2016-07-18 11:29   ` Michal Hocko
2016-07-18 11:34     ` Vlastimil Babka
2016-07-19 22:36   ` David Rientjes
2016-07-20 15:25     ` Vlastimil Babka
2016-07-20 22:00       ` David Rientjes
2016-07-18 11:22 ` [PATCH 4/8] mm, page_alloc: restructure direct compaction handling " Vlastimil Babka
2016-07-19 22:50   ` David Rientjes
2016-07-20 16:02     ` Vlastimil Babka
2016-07-18 11:22 ` [PATCH 5/8] mm, page_alloc: make THP-specific decisions more generic Vlastimil Babka
2016-07-19 23:10   ` David Rientjes
2016-07-21  7:13     ` Vlastimil Babka
2016-07-18 11:23 ` [PATCH 6/8] mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations Vlastimil Babka
2016-07-18 11:23 ` [PATCH 7/8] mm, compaction: introduce direct compaction priority Vlastimil Babka
2016-07-18 11:23 ` [PATCH 8/8] mm, compaction: simplify contended compaction handling Vlastimil Babka
2016-07-18 11:30 ` [PATCH 0/8] compaction-related cleanups v4 Michal Hocko
2016-07-18 15:41 ` Mel Gorman
