[PATCH 0/4] followups to reintroduce compaction feedback for OOM decisions

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/4] followups to reintroduce compaction feedback for OOM decisions
@ 2016-09-26 16:20 ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, David Rientjes,
	Hillf Danton, Joonsoo Kim, Mel Gorman, Michal Hocko,
	Michal Hocko, Rik van Riel

Reviews of series "reintroduce compaction feedback for OOM decisions" [1]
resulted in some followup patches and Michal suggested posting them in a new
threads, so here it goes.

Patch 1 is meant to be squashed into the following patch in mmotm:
mm-compaction-more-reliably-increase-direct-compaction-priority.patch

Patch 2 is a cleanup for consistency. Patches 3 and 4 deal with the last
(hopefully) remaining heuristic in the reclaim/compaction-vs-OOM scenario,
which is the fragmentation index.

[1] http://www.spinics.net/lists/linux-mm/msg113133.html

Vlastimil Babka (4):
  mm, compaction: more reliably increase direct compaction priority-fix
  mm, page_alloc: pull no_progress_loops update to
    should_reclaim_retry()
  mm, compaction: ignore fragindex from compaction_zonelist_suitable()
  mm, compaction: restrict fragindex to costly orders

 mm/compaction.c | 42 ++++++++++++++++++++++++------------------
 mm/page_alloc.c | 47 ++++++++++++++++++++++++-----------------------
 2 files changed, 48 insertions(+), 41 deletions(-)

-- 
2.10.0

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/4] followups to reintroduce compaction feedback for OOM decisions
@ 2016-09-26 16:20 ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, David Rientjes,
	Hillf Danton, Joonsoo Kim, Mel Gorman, Michal Hocko,
	Michal Hocko, Rik van Riel

Reviews of series "reintroduce compaction feedback for OOM decisions" [1]
resulted in some followup patches and Michal suggested posting them in a new
threads, so here it goes.

Patch 1 is meant to be squashed into the following patch in mmotm:
mm-compaction-more-reliably-increase-direct-compaction-priority.patch

Patch 2 is a cleanup for consistency. Patches 3 and 4 deal with the last
(hopefully) remaining heuristic in the reclaim/compaction-vs-OOM scenario,
which is the fragmentation index.

[1] http://www.spinics.net/lists/linux-mm/msg113133.html

Vlastimil Babka (4):
  mm, compaction: more reliably increase direct compaction priority-fix
  mm, page_alloc: pull no_progress_loops update to
    should_reclaim_retry()
  mm, compaction: ignore fragindex from compaction_zonelist_suitable()
  mm, compaction: restrict fragindex to costly orders

 mm/compaction.c | 42 ++++++++++++++++++++++++------------------
 mm/page_alloc.c | 47 ++++++++++++++++++++++++-----------------------
 2 files changed, 48 insertions(+), 41 deletions(-)

-- 
2.10.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
  2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20   ` Vlastimil Babka
  -1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
	Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
	Michal Hocko, Michal Hocko

When increasing the compaction priority, also reset retries. Otherwise we can
consume all retries on the lower priorities. Also pull the retries increment
into should_compact_retry() so it counts only the rounds where we actually
rely on it.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 Please squash into
 mm-compaction-more-reliably-increase-direct-compaction-priority.patch

 mm/page_alloc.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5155485057cb..0fd29731ab35 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3162,7 +3162,7 @@ static inline bool
 should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 		     enum compact_result compact_result,
 		     enum compact_priority *compact_priority,
-		     int compaction_retries)
+		     int *compaction_retries)
 {
 	int max_retries = MAX_COMPACT_RETRIES;
 	int min_priority;
@@ -3170,6 +3170,9 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	if (!order)
 		return false;
 
+	if (compaction_made_progress(compact_result))
+		(*compaction_retries)++;
+
 	/*
 	 * compaction considers all the zone as desperately out of memory
 	 * so it doesn't really make much sense to retry except when the
@@ -3197,18 +3200,19 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	 */
 	if (order > PAGE_ALLOC_COSTLY_ORDER)
 		max_retries /= 4;
-	if (compaction_retries <= max_retries)
+	if (*compaction_retries <= max_retries)
 		return true;
 
 	/*
-	 * Make sure there is at least one attempt at the highest priority
-	 * if we exhausted all retries at the lower priorities
+	 * Make sure there are attempts at the highest priority if we exhausted
+	 * all retries or failed at the lower priorities.
 	 */
 check_priority:
 	min_priority = (order > PAGE_ALLOC_COSTLY_ORDER) ?
 			MIN_COMPACT_COSTLY_PRIORITY : MIN_COMPACT_PRIORITY;
 	if (*compact_priority > min_priority) {
 		(*compact_priority)--;
+		*compaction_retries = 0;
 		return true;
 	}
 	return false;
@@ -3227,7 +3231,7 @@ static inline bool
 should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
 		     enum compact_result compact_result,
 		     enum compact_priority *compact_priority,
-		     int compaction_retries)
+		     int *compaction_retries)
 {
 	struct zone *zone;
 	struct zoneref *z;
@@ -3635,9 +3639,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
-	if (order && compaction_made_progress(compact_result))
-		compaction_retries++;
-
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
 		goto nopage;
@@ -3672,7 +3673,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (did_some_progress > 0 &&
 			should_compact_retry(ac, order, alloc_flags,
 				compact_result, &compact_priority,
-				compaction_retries))
+				&compaction_retries))
 		goto retry;
 
 	/* Reclaim has failed us, start killing things */
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
@ 2016-09-26 16:20   ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
	Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
	Michal Hocko, Michal Hocko

When increasing the compaction priority, also reset retries. Otherwise we can
consume all retries on the lower priorities. Also pull the retries increment
into should_compact_retry() so it counts only the rounds where we actually
rely on it.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 Please squash into
 mm-compaction-more-reliably-increase-direct-compaction-priority.patch

 mm/page_alloc.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5155485057cb..0fd29731ab35 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3162,7 +3162,7 @@ static inline bool
 should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 		     enum compact_result compact_result,
 		     enum compact_priority *compact_priority,
-		     int compaction_retries)
+		     int *compaction_retries)
 {
 	int max_retries = MAX_COMPACT_RETRIES;
 	int min_priority;
@@ -3170,6 +3170,9 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	if (!order)
 		return false;
 
+	if (compaction_made_progress(compact_result))
+		(*compaction_retries)++;
+
 	/*
 	 * compaction considers all the zone as desperately out of memory
 	 * so it doesn't really make much sense to retry except when the
@@ -3197,18 +3200,19 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	 */
 	if (order > PAGE_ALLOC_COSTLY_ORDER)
 		max_retries /= 4;
-	if (compaction_retries <= max_retries)
+	if (*compaction_retries <= max_retries)
 		return true;
 
 	/*
-	 * Make sure there is at least one attempt at the highest priority
-	 * if we exhausted all retries at the lower priorities
+	 * Make sure there are attempts at the highest priority if we exhausted
+	 * all retries or failed at the lower priorities.
 	 */
 check_priority:
 	min_priority = (order > PAGE_ALLOC_COSTLY_ORDER) ?
 			MIN_COMPACT_COSTLY_PRIORITY : MIN_COMPACT_PRIORITY;
 	if (*compact_priority > min_priority) {
 		(*compact_priority)--;
+		*compaction_retries = 0;
 		return true;
 	}
 	return false;
@@ -3227,7 +3231,7 @@ static inline bool
 should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
 		     enum compact_result compact_result,
 		     enum compact_priority *compact_priority,
-		     int compaction_retries)
+		     int *compaction_retries)
 {
 	struct zone *zone;
 	struct zoneref *z;
@@ -3635,9 +3639,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
-	if (order && compaction_made_progress(compact_result))
-		compaction_retries++;
-
 	/* Do not loop if specifically requested */
 	if (gfp_mask & __GFP_NORETRY)
 		goto nopage;
@@ -3672,7 +3673,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (did_some_progress > 0 &&
 			should_compact_retry(ac, order, alloc_flags,
 				compact_result, &compact_priority,
-				compaction_retries))
+				&compaction_retries))
 		goto retry;
 
 	/* Reclaim has failed us, start killing things */
-- 
2.10.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/4] mm, page_alloc: pull no_progress_loops update to should_reclaim_retry()
  2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20   ` Vlastimil Babka
  -1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
	Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
	Michal Hocko, Michal Hocko

The should_reclaim_retry() makes decisions based on no_progress_loops, so it
makes sense to also update the counter there. It will be also consistent with
should_compact_retry() and compaction_retries. No functional change.

[hillf.zj@alibaba-inc.com: fix missing pointer dereferences]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 mm/page_alloc.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0fd29731ab35..8ed4f506ae0b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3404,16 +3404,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 static inline bool
 should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		     struct alloc_context *ac, int alloc_flags,
-		     bool did_some_progress, int no_progress_loops)
+		     bool did_some_progress, int *no_progress_loops)
 {
 	struct zone *zone;
 	struct zoneref *z;
 
 	/*
+	 * Costly allocations might have made a progress but this doesn't mean
+	 * their order will become available due to high fragmentation so
+	 * always increment the no progress counter for them
+	 */
+	if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
+		*no_progress_loops = 0;
+	else
+		(*no_progress_loops)++;
+
+	/*
 	 * Make sure we converge to OOM if we cannot make any progress
 	 * several times in the row.
 	 */
-	if (no_progress_loops > MAX_RECLAIM_RETRIES)
+	if (*no_progress_loops > MAX_RECLAIM_RETRIES)
 		return false;
 
 	/*
@@ -3428,7 +3438,7 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		unsigned long reclaimable;
 
 		available = reclaimable = zone_reclaimable_pages(zone);
-		available -= DIV_ROUND_UP(no_progress_loops * available,
+		available -= DIV_ROUND_UP((*no_progress_loops) * available,
 					  MAX_RECLAIM_RETRIES);
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
 
@@ -3650,18 +3660,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
 		goto nopage;
 
-	/*
-	 * Costly allocations might have made a progress but this doesn't mean
-	 * their order will become available due to high fragmentation so
-	 * always increment the no progress counter for them
-	 */
-	if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
-		no_progress_loops = 0;
-	else
-		no_progress_loops++;
-
 	if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
-				 did_some_progress > 0, no_progress_loops))
+				 did_some_progress > 0, &no_progress_loops))
 		goto retry;
 
 	/*
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/4] mm, page_alloc: pull no_progress_loops update to should_reclaim_retry()
@ 2016-09-26 16:20   ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
	Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
	Michal Hocko, Michal Hocko

The should_reclaim_retry() makes decisions based on no_progress_loops, so it
makes sense to also update the counter there. It will be also consistent with
should_compact_retry() and compaction_retries. No functional change.

[hillf.zj@alibaba-inc.com: fix missing pointer dereferences]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 mm/page_alloc.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0fd29731ab35..8ed4f506ae0b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3404,16 +3404,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 static inline bool
 should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		     struct alloc_context *ac, int alloc_flags,
-		     bool did_some_progress, int no_progress_loops)
+		     bool did_some_progress, int *no_progress_loops)
 {
 	struct zone *zone;
 	struct zoneref *z;
 
 	/*
+	 * Costly allocations might have made a progress but this doesn't mean
+	 * their order will become available due to high fragmentation so
+	 * always increment the no progress counter for them
+	 */
+	if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
+		*no_progress_loops = 0;
+	else
+		(*no_progress_loops)++;
+
+	/*
 	 * Make sure we converge to OOM if we cannot make any progress
 	 * several times in the row.
 	 */
-	if (no_progress_loops > MAX_RECLAIM_RETRIES)
+	if (*no_progress_loops > MAX_RECLAIM_RETRIES)
 		return false;
 
 	/*
@@ -3428,7 +3438,7 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		unsigned long reclaimable;
 
 		available = reclaimable = zone_reclaimable_pages(zone);
-		available -= DIV_ROUND_UP(no_progress_loops * available,
+		available -= DIV_ROUND_UP((*no_progress_loops) * available,
 					  MAX_RECLAIM_RETRIES);
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
 
@@ -3650,18 +3660,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
 		goto nopage;
 
-	/*
-	 * Costly allocations might have made a progress but this doesn't mean
-	 * their order will become available due to high fragmentation so
-	 * always increment the no progress counter for them
-	 */
-	if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
-		no_progress_loops = 0;
-	else
-		no_progress_loops++;
-
 	if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
-				 did_some_progress > 0, no_progress_loops))
+				 did_some_progress > 0, &no_progress_loops))
 		goto retry;
 
 	/*
-- 
2.10.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
  2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20   ` Vlastimil Babka
  -1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton, Michal Hocko

The compaction_zonelist_suitable() function tries to determine if compaction
will be able to proceed after sufficient reclaim, i.e. whether there are
enough reclaimable pages to provide enough order-0 freepages for compaction.

This addition of reclaimable pages to the free pages works well for the order-0
watermark check, but in the fragmentation index check we only consider truly
free pages. Thus we can get fragindex value close to 0 which indicates failure
do to lack of memory, and wrongly decide that compaction won't be suitable even
after reclaim.

Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
skip it from compaction_zonelist_suitable().

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 mm/compaction.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 86d4d0bbfc7c..5ff7f801c345 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 					int classzone_idx,
 					unsigned long wmark_target)
 {
-	int fragindex;
 	unsigned long watermark;
 
 	if (is_via_compact_memory(order))
@@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 						ALLOC_CMA, wmark_target))
 		return COMPACT_SKIPPED;
 
+	return COMPACT_CONTINUE;
+}
+
+enum compact_result compaction_suitable(struct zone *zone, int order,
+					unsigned int alloc_flags,
+					int classzone_idx)
+{
+	enum compact_result ret;
+	int fragindex;
+
+	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
+				    zone_page_state(zone, NR_FREE_PAGES));
 	/*
 	 * fragmentation index determines if allocation failures are due to
 	 * low memory or external fragmentation
@@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 	 *
 	 * Only compact if a failure would be due to fragmentation.
 	 */
-	fragindex = fragmentation_index(zone, order);
-	if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
-		return COMPACT_NOT_SUITABLE_ZONE;
-
-	return COMPACT_CONTINUE;
-}
-
-enum compact_result compaction_suitable(struct zone *zone, int order,
-					unsigned int alloc_flags,
-					int classzone_idx)
-{
-	enum compact_result ret;
+	if (ret == COMPACT_CONTINUE) {
+		fragindex = fragmentation_index(zone, order);
+		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
+			return COMPACT_NOT_SUITABLE_ZONE;
+	}
 
-	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
-				    zone_page_state(zone, NR_FREE_PAGES));
 	trace_mm_compaction_suitable(zone, order, ret);
 	if (ret == COMPACT_NOT_SUITABLE_ZONE)
 		ret = COMPACT_SKIPPED;
@@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
 		compact_result = __compaction_suitable(zone, order, alloc_flags,
 				ac_classzone_idx(ac), available);
-		if (compact_result != COMPACT_SKIPPED &&
-				compact_result != COMPACT_NOT_SUITABLE_ZONE)
+		if (compact_result != COMPACT_SKIPPED)
 			return true;
 	}
 
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
@ 2016-09-26 16:20   ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton, Michal Hocko

The compaction_zonelist_suitable() function tries to determine if compaction
will be able to proceed after sufficient reclaim, i.e. whether there are
enough reclaimable pages to provide enough order-0 freepages for compaction.

This addition of reclaimable pages to the free pages works well for the order-0
watermark check, but in the fragmentation index check we only consider truly
free pages. Thus we can get fragindex value close to 0 which indicates failure
do to lack of memory, and wrongly decide that compaction won't be suitable even
after reclaim.

Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
skip it from compaction_zonelist_suitable().

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 mm/compaction.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 86d4d0bbfc7c..5ff7f801c345 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 					int classzone_idx,
 					unsigned long wmark_target)
 {
-	int fragindex;
 	unsigned long watermark;
 
 	if (is_via_compact_memory(order))
@@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 						ALLOC_CMA, wmark_target))
 		return COMPACT_SKIPPED;
 
+	return COMPACT_CONTINUE;
+}
+
+enum compact_result compaction_suitable(struct zone *zone, int order,
+					unsigned int alloc_flags,
+					int classzone_idx)
+{
+	enum compact_result ret;
+	int fragindex;
+
+	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
+				    zone_page_state(zone, NR_FREE_PAGES));
 	/*
 	 * fragmentation index determines if allocation failures are due to
 	 * low memory or external fragmentation
@@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 	 *
 	 * Only compact if a failure would be due to fragmentation.
 	 */
-	fragindex = fragmentation_index(zone, order);
-	if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
-		return COMPACT_NOT_SUITABLE_ZONE;
-
-	return COMPACT_CONTINUE;
-}
-
-enum compact_result compaction_suitable(struct zone *zone, int order,
-					unsigned int alloc_flags,
-					int classzone_idx)
-{
-	enum compact_result ret;
+	if (ret == COMPACT_CONTINUE) {
+		fragindex = fragmentation_index(zone, order);
+		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
+			return COMPACT_NOT_SUITABLE_ZONE;
+	}
 
-	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
-				    zone_page_state(zone, NR_FREE_PAGES));
 	trace_mm_compaction_suitable(zone, order, ret);
 	if (ret == COMPACT_NOT_SUITABLE_ZONE)
 		ret = COMPACT_SKIPPED;
@@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
 		compact_result = __compaction_suitable(zone, order, alloc_flags,
 				ac_classzone_idx(ac), available);
-		if (compact_result != COMPACT_SKIPPED &&
-				compact_result != COMPACT_NOT_SUITABLE_ZONE)
+		if (compact_result != COMPACT_SKIPPED)
 			return true;
 	}
 
-- 
2.10.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
  2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20   ` Vlastimil Babka
  -1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton, Michal Hocko

Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
make any difference for non-costly orders, especially with the default
threshold. But we cannot afford any uncertainty for the non-costly orders where
the only alternative to successful reclaim/compaction is OOM. After the recent
patches we are guaranteed maximum effort without heuristics from compaction
before deciding OOM, and fragindex is the last remaining heuristic. Therefore
skip fragindex altogether for non-costly orders.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 mm/compaction.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 5ff7f801c345..badb92bf14b4 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
 	 * index towards 0 implies failure is due to lack of memory
 	 * index towards 1000 implies failure is due to fragmentation
 	 *
-	 * Only compact if a failure would be due to fragmentation.
+	 * Only compact if a failure would be due to fragmentation. Also
+	 * ignore fragindex for non-costly orders where the alternative to
+	 * a successful reclaim/compaction is OOM. Fragindex and the
+	 * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
+	 * excessive compaction for costly orders, but it should not be at the
+	 * expense of system stability.
 	 */
-	if (ret == COMPACT_CONTINUE) {
+	if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
 		fragindex = fragmentation_index(zone, order);
 		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
 			return COMPACT_NOT_SUITABLE_ZONE;
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
@ 2016-09-26 16:20   ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton, Michal Hocko

Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
make any difference for non-costly orders, especially with the default
threshold. But we cannot afford any uncertainty for the non-costly orders where
the only alternative to successful reclaim/compaction is OOM. After the recent
patches we are guaranteed maximum effort without heuristics from compaction
before deciding OOM, and fragindex is the last remaining heuristic. Therefore
skip fragindex altogether for non-costly orders.

Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
 mm/compaction.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 5ff7f801c345..badb92bf14b4 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
 	 * index towards 0 implies failure is due to lack of memory
 	 * index towards 1000 implies failure is due to fragmentation
 	 *
-	 * Only compact if a failure would be due to fragmentation.
+	 * Only compact if a failure would be due to fragmentation. Also
+	 * ignore fragindex for non-costly orders where the alternative to
+	 * a successful reclaim/compaction is OOM. Fragindex and the
+	 * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
+	 * excessive compaction for costly orders, but it should not be at the
+	 * expense of system stability.
 	 */
-	if (ret == COMPACT_CONTINUE) {
+	if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
 		fragindex = fragmentation_index(zone, order);
 		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
 			return COMPACT_NOT_SUITABLE_ZONE;
-- 
2.10.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
  2016-09-26 16:20   ` Vlastimil Babka
@ 2016-09-26 20:15     ` Michal Hocko
  -1 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:15 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
	Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton

On Mon 26-09-16 18:20:24, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
> 
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
> 
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().

Yes this makes a lot of sense to me. The fragidx should be mostly about
a balance between reclaim and the compaction so it shouldn't be used for
fail or retry decisions.

> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/compaction.c | 35 ++++++++++++++++++-----------------
>  1 file changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 86d4d0bbfc7c..5ff7f801c345 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
>  					int classzone_idx,
>  					unsigned long wmark_target)
>  {
> -	int fragindex;
>  	unsigned long watermark;
>  
>  	if (is_via_compact_memory(order))
> @@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
>  						ALLOC_CMA, wmark_target))
>  		return COMPACT_SKIPPED;
>  
> +	return COMPACT_CONTINUE;
> +}
> +
> +enum compact_result compaction_suitable(struct zone *zone, int order,
> +					unsigned int alloc_flags,
> +					int classzone_idx)
> +{
> +	enum compact_result ret;
> +	int fragindex;
> +
> +	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> +				    zone_page_state(zone, NR_FREE_PAGES));
>  	/*
>  	 * fragmentation index determines if allocation failures are due to
>  	 * low memory or external fragmentation
> @@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
>  	 *
>  	 * Only compact if a failure would be due to fragmentation.
>  	 */
> -	fragindex = fragmentation_index(zone, order);
> -	if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> -		return COMPACT_NOT_SUITABLE_ZONE;
> -
> -	return COMPACT_CONTINUE;
> -}
> -
> -enum compact_result compaction_suitable(struct zone *zone, int order,
> -					unsigned int alloc_flags,
> -					int classzone_idx)
> -{
> -	enum compact_result ret;
> +	if (ret == COMPACT_CONTINUE) {
> +		fragindex = fragmentation_index(zone, order);
> +		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> +			return COMPACT_NOT_SUITABLE_ZONE;
> +	}
>  
> -	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> -				    zone_page_state(zone, NR_FREE_PAGES));
>  	trace_mm_compaction_suitable(zone, order, ret);
>  	if (ret == COMPACT_NOT_SUITABLE_ZONE)
>  		ret = COMPACT_SKIPPED;
> @@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
>  		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
>  		compact_result = __compaction_suitable(zone, order, alloc_flags,
>  				ac_classzone_idx(ac), available);
> -		if (compact_result != COMPACT_SKIPPED &&
> -				compact_result != COMPACT_NOT_SUITABLE_ZONE)
> +		if (compact_result != COMPACT_SKIPPED)
>  			return true;
>  	}
>  
> -- 
> 2.10.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
@ 2016-09-26 20:15     ` Michal Hocko
  0 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:15 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
	Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton

On Mon 26-09-16 18:20:24, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
> 
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
> 
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().

Yes this makes a lot of sense to me. The fragidx should be mostly about
a balance between reclaim and the compaction so it shouldn't be used for
fail or retry decisions.

> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/compaction.c | 35 ++++++++++++++++++-----------------
>  1 file changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 86d4d0bbfc7c..5ff7f801c345 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
>  					int classzone_idx,
>  					unsigned long wmark_target)
>  {
> -	int fragindex;
>  	unsigned long watermark;
>  
>  	if (is_via_compact_memory(order))
> @@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
>  						ALLOC_CMA, wmark_target))
>  		return COMPACT_SKIPPED;
>  
> +	return COMPACT_CONTINUE;
> +}
> +
> +enum compact_result compaction_suitable(struct zone *zone, int order,
> +					unsigned int alloc_flags,
> +					int classzone_idx)
> +{
> +	enum compact_result ret;
> +	int fragindex;
> +
> +	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> +				    zone_page_state(zone, NR_FREE_PAGES));
>  	/*
>  	 * fragmentation index determines if allocation failures are due to
>  	 * low memory or external fragmentation
> @@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
>  	 *
>  	 * Only compact if a failure would be due to fragmentation.
>  	 */
> -	fragindex = fragmentation_index(zone, order);
> -	if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> -		return COMPACT_NOT_SUITABLE_ZONE;
> -
> -	return COMPACT_CONTINUE;
> -}
> -
> -enum compact_result compaction_suitable(struct zone *zone, int order,
> -					unsigned int alloc_flags,
> -					int classzone_idx)
> -{
> -	enum compact_result ret;
> +	if (ret == COMPACT_CONTINUE) {
> +		fragindex = fragmentation_index(zone, order);
> +		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> +			return COMPACT_NOT_SUITABLE_ZONE;
> +	}
>  
> -	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> -				    zone_page_state(zone, NR_FREE_PAGES));
>  	trace_mm_compaction_suitable(zone, order, ret);
>  	if (ret == COMPACT_NOT_SUITABLE_ZONE)
>  		ret = COMPACT_SKIPPED;
> @@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
>  		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
>  		compact_result = __compaction_suitable(zone, order, alloc_flags,
>  				ac_classzone_idx(ac), available);
> -		if (compact_result != COMPACT_SKIPPED &&
> -				compact_result != COMPACT_NOT_SUITABLE_ZONE)
> +		if (compact_result != COMPACT_SKIPPED)
>  			return true;
>  	}
>  
> -- 
> 2.10.0

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
  2016-09-26 16:20   ` Vlastimil Babka
@ 2016-09-26 20:29     ` Michal Hocko
  -1 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
	Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton

On Mon 26-09-16 18:20:25, Vlastimil Babka wrote:
> Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
> to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
> make any difference for non-costly orders, especially with the default
> threshold. But we cannot afford any uncertainty for the non-costly orders where
> the only alternative to successful reclaim/compaction is OOM. After the recent
> patches we are guaranteed maximum effort without heuristics from compaction
> before deciding OOM, and fragindex is the last remaining heuristic. Therefore
> skip fragindex altogether for non-costly orders.

It would be nicer to reduce this just to the highest compaction priority
but as your previous attempt shows this adds a lot of code churn.
Not skipping the compaction for these !costly orders might lead to a
higher latency for the allocation due to pointless zone scanning but
considering that an alternative would be the order-0 reclaim which
doesn't guarantee any larger blocks then doing a more targeted approach
sounds quite reasonable to me.

This patch is not really needed to prevent pre-mature OOMs because
compaction_zonelist_suitable doesn't rely on the fragmentation index
after the previous patch but it makes sense to me regardless. The
fagindex was quite an obscure measure and having !costly order easier to
understand is valuable imho.

> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/compaction.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 5ff7f801c345..badb92bf14b4 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
>  	 * index towards 0 implies failure is due to lack of memory
>  	 * index towards 1000 implies failure is due to fragmentation
>  	 *
> -	 * Only compact if a failure would be due to fragmentation.
> +	 * Only compact if a failure would be due to fragmentation. Also
> +	 * ignore fragindex for non-costly orders where the alternative to
> +	 * a successful reclaim/compaction is OOM. Fragindex and the
> +	 * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
> +	 * excessive compaction for costly orders, but it should not be at the
> +	 * expense of system stability.
>  	 */
> -	if (ret == COMPACT_CONTINUE) {
> +	if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
>  		fragindex = fragmentation_index(zone, order);
>  		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
>  			return COMPACT_NOT_SUITABLE_ZONE;
> -- 
> 2.10.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
@ 2016-09-26 20:29     ` Michal Hocko
  0 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
	Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
	Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
	Hillf Danton

On Mon 26-09-16 18:20:25, Vlastimil Babka wrote:
> Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
> to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
> make any difference for non-costly orders, especially with the default
> threshold. But we cannot afford any uncertainty for the non-costly orders where
> the only alternative to successful reclaim/compaction is OOM. After the recent
> patches we are guaranteed maximum effort without heuristics from compaction
> before deciding OOM, and fragindex is the last remaining heuristic. Therefore
> skip fragindex altogether for non-costly orders.

It would be nicer to reduce this just to the highest compaction priority
but as your previous attempt shows this adds a lot of code churn.
Not skipping the compaction for these !costly orders might lead to a
higher latency for the allocation due to pointless zone scanning but
considering that an alternative would be the order-0 reclaim which
doesn't guarantee any larger blocks then doing a more targeted approach
sounds quite reasonable to me.

This patch is not really needed to prevent pre-mature OOMs because
compaction_zonelist_suitable doesn't rely on the fragmentation index
after the previous patch but it makes sense to me regardless. The
fagindex was quite an obscure measure and having !costly order easier to
understand is valuable imho.

> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/compaction.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 5ff7f801c345..badb92bf14b4 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
>  	 * index towards 0 implies failure is due to lack of memory
>  	 * index towards 1000 implies failure is due to fragmentation
>  	 *
> -	 * Only compact if a failure would be due to fragmentation.
> +	 * Only compact if a failure would be due to fragmentation. Also
> +	 * ignore fragindex for non-costly orders where the alternative to
> +	 * a successful reclaim/compaction is OOM. Fragindex and the
> +	 * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
> +	 * excessive compaction for costly orders, but it should not be at the
> +	 * expense of system stability.
>  	 */
> -	if (ret == COMPACT_CONTINUE) {
> +	if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
>  		fragindex = fragmentation_index(zone, order);
>  		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
>  			return COMPACT_NOT_SUITABLE_ZONE;
> -- 
> 2.10.0

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
  2016-09-26 16:20   ` Vlastimil Babka
@ 2016-09-27  3:25     ` Hillf Danton
  -1 siblings, 0 replies; 18+ messages in thread
From: Hillf Danton @ 2016-09-27  3:25 UTC (permalink / raw)
  To: 'Vlastimil Babka', 'Andrew Morton'
  Cc: linux-mm, linux-kernel, 'Linus Torvalds',
	'Arkadiusz Miskiewicz', 'Ralf-Peter Rohbeck',
	'Olaf Hering', 'Mel Gorman',
	'Joonsoo Kim', 'David Rientjes',
	'Rik van Riel', 'Michal Hocko',
	'Michal Hocko'

On Tuesday, September 27, 2016 12:20 AM Vlastimil Babka wrote 
> 
> When increasing the compaction priority, also reset retries. Otherwise we can
> consume all retries on the lower priorities. Also pull the retries increment
> into should_compact_retry() so it counts only the rounds where we actually
> rely on it.
> 
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
> ---
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
@ 2016-09-27  3:25     ` Hillf Danton
  0 siblings, 0 replies; 18+ messages in thread
From: Hillf Danton @ 2016-09-27  3:25 UTC (permalink / raw)
  To: 'Vlastimil Babka', 'Andrew Morton'
  Cc: linux-mm, linux-kernel, 'Linus Torvalds',
	'Arkadiusz Miskiewicz', 'Ralf-Peter Rohbeck',
	'Olaf Hering', 'Mel Gorman',
	'Joonsoo Kim', 'David Rientjes',
	'Rik van Riel', 'Michal Hocko',
	'Michal Hocko'

On Tuesday, September 27, 2016 12:20 AM Vlastimil Babka wrote 
> 
> When increasing the compaction priority, also reset retries. Otherwise we can
> consume all retries on the lower priorities. Also pull the retries increment
> into should_compact_retry() so it counts only the rounds where we actually
> rely on it.
> 
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
> ---
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
  2016-09-26 16:20   ` Vlastimil Babka
@ 2016-09-29  9:57     ` Vlastimil Babka
  -1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-29  9:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Michal Hocko, Mel Gorman,
	Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
	Michal Hocko

On 09/26/2016 06:20 PM, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
> 
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
> 
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>

Bah, a fix below, sorry.
----8<----
>From 1cd4855305a9c7eef0cf0c1af9f310cecdcc9406 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Thu, 29 Sep 2016 11:54:07 +0200
Subject: [PATCH] mm, compaction: ignore fragindex from
 compaction_zonelist_suitable()-fix

COMPACT_NOT_SUITABLE_ZONE should not leak outside compaction_suitable() and
we should not skip tracepoint.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index badb92bf14b4..0409a4ad6ea1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1445,7 +1445,7 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
 	if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
 		fragindex = fragmentation_index(zone, order);
 		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
-			return COMPACT_NOT_SUITABLE_ZONE;
+			ret = COMPACT_NOT_SUITABLE_ZONE;
 	}
 
 	trace_mm_compaction_suitable(zone, order, ret);
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
@ 2016-09-29  9:57     ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-29  9:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
	Ralf-Peter Rohbeck, Olaf Hering, Michal Hocko, Mel Gorman,
	Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
	Michal Hocko

On 09/26/2016 06:20 PM, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
> 
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
> 
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>

Bah, a fix below, sorry.
----8<----

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-09-29  9:57 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-26 16:20 [PATCH 0/4] followups to reintroduce compaction feedback for OOM decisions Vlastimil Babka
2016-09-26 16:20 ` Vlastimil Babka
2016-09-26 16:20 ` [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix Vlastimil Babka
2016-09-26 16:20   ` Vlastimil Babka
2016-09-27  3:25   ` Hillf Danton
2016-09-27  3:25     ` Hillf Danton
2016-09-26 16:20 ` [PATCH 2/4] mm, page_alloc: pull no_progress_loops update to should_reclaim_retry() Vlastimil Babka
2016-09-26 16:20   ` Vlastimil Babka
2016-09-26 16:20 ` [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable() Vlastimil Babka
2016-09-26 16:20   ` Vlastimil Babka
2016-09-26 20:15   ` Michal Hocko
2016-09-26 20:15     ` Michal Hocko
2016-09-29  9:57   ` Vlastimil Babka
2016-09-29  9:57     ` Vlastimil Babka
2016-09-26 16:20 ` [PATCH 4/4] mm, compaction: restrict fragindex to costly orders Vlastimil Babka
2016-09-26 16:20   ` Vlastimil Babka
2016-09-26 20:29   ` Michal Hocko
2016-09-26 20:29     ` Michal Hocko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.