* [PATCH 0/4] followups to reintroduce compaction feedback for OOM decisions
@ 2016-09-26 16:20 ` Vlastimil Babka
0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, David Rientjes,
Hillf Danton, Joonsoo Kim, Mel Gorman, Michal Hocko,
Michal Hocko, Rik van Riel
Reviews of series "reintroduce compaction feedback for OOM decisions" [1]
resulted in some followup patches and Michal suggested posting them in a new
threads, so here it goes.
Patch 1 is meant to be squashed into the following patch in mmotm:
mm-compaction-more-reliably-increase-direct-compaction-priority.patch
Patch 2 is a cleanup for consistency. Patches 3 and 4 deal with the last
(hopefully) remaining heuristic in the reclaim/compaction-vs-OOM scenario,
which is the fragmentation index.
[1] http://www.spinics.net/lists/linux-mm/msg113133.html
Vlastimil Babka (4):
mm, compaction: more reliably increase direct compaction priority-fix
mm, page_alloc: pull no_progress_loops update to
should_reclaim_retry()
mm, compaction: ignore fragindex from compaction_zonelist_suitable()
mm, compaction: restrict fragindex to costly orders
mm/compaction.c | 42 ++++++++++++++++++++++++------------------
mm/page_alloc.c | 47 ++++++++++++++++++++++++-----------------------
2 files changed, 48 insertions(+), 41 deletions(-)
--
2.10.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20 ` Vlastimil Babka
-1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
Michal Hocko, Michal Hocko
When increasing the compaction priority, also reset retries. Otherwise we can
consume all retries on the lower priorities. Also pull the retries increment
into should_compact_retry() so it counts only the rounds where we actually
rely on it.
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
Please squash into
mm-compaction-more-reliably-increase-direct-compaction-priority.patch
mm/page_alloc.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5155485057cb..0fd29731ab35 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3162,7 +3162,7 @@ static inline bool
should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
enum compact_result compact_result,
enum compact_priority *compact_priority,
- int compaction_retries)
+ int *compaction_retries)
{
int max_retries = MAX_COMPACT_RETRIES;
int min_priority;
@@ -3170,6 +3170,9 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
if (!order)
return false;
+ if (compaction_made_progress(compact_result))
+ (*compaction_retries)++;
+
/*
* compaction considers all the zone as desperately out of memory
* so it doesn't really make much sense to retry except when the
@@ -3197,18 +3200,19 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
*/
if (order > PAGE_ALLOC_COSTLY_ORDER)
max_retries /= 4;
- if (compaction_retries <= max_retries)
+ if (*compaction_retries <= max_retries)
return true;
/*
- * Make sure there is at least one attempt at the highest priority
- * if we exhausted all retries at the lower priorities
+ * Make sure there are attempts at the highest priority if we exhausted
+ * all retries or failed at the lower priorities.
*/
check_priority:
min_priority = (order > PAGE_ALLOC_COSTLY_ORDER) ?
MIN_COMPACT_COSTLY_PRIORITY : MIN_COMPACT_PRIORITY;
if (*compact_priority > min_priority) {
(*compact_priority)--;
+ *compaction_retries = 0;
return true;
}
return false;
@@ -3227,7 +3231,7 @@ static inline bool
should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
enum compact_result compact_result,
enum compact_priority *compact_priority,
- int compaction_retries)
+ int *compaction_retries)
{
struct zone *zone;
struct zoneref *z;
@@ -3635,9 +3639,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if (page)
goto got_pg;
- if (order && compaction_made_progress(compact_result))
- compaction_retries++;
-
/* Do not loop if specifically requested */
if (gfp_mask & __GFP_NORETRY)
goto nopage;
@@ -3672,7 +3673,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if (did_some_progress > 0 &&
should_compact_retry(ac, order, alloc_flags,
compact_result, &compact_priority,
- compaction_retries))
+ &compaction_retries))
goto retry;
/* Reclaim has failed us, start killing things */
--
2.10.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
@ 2016-09-26 16:20 ` Vlastimil Babka
0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
Michal Hocko, Michal Hocko
When increasing the compaction priority, also reset retries. Otherwise we can
consume all retries on the lower priorities. Also pull the retries increment
into should_compact_retry() so it counts only the rounds where we actually
rely on it.
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
Please squash into
mm-compaction-more-reliably-increase-direct-compaction-priority.patch
mm/page_alloc.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5155485057cb..0fd29731ab35 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3162,7 +3162,7 @@ static inline bool
should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
enum compact_result compact_result,
enum compact_priority *compact_priority,
- int compaction_retries)
+ int *compaction_retries)
{
int max_retries = MAX_COMPACT_RETRIES;
int min_priority;
@@ -3170,6 +3170,9 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
if (!order)
return false;
+ if (compaction_made_progress(compact_result))
+ (*compaction_retries)++;
+
/*
* compaction considers all the zone as desperately out of memory
* so it doesn't really make much sense to retry except when the
@@ -3197,18 +3200,19 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
*/
if (order > PAGE_ALLOC_COSTLY_ORDER)
max_retries /= 4;
- if (compaction_retries <= max_retries)
+ if (*compaction_retries <= max_retries)
return true;
/*
- * Make sure there is at least one attempt at the highest priority
- * if we exhausted all retries at the lower priorities
+ * Make sure there are attempts at the highest priority if we exhausted
+ * all retries or failed at the lower priorities.
*/
check_priority:
min_priority = (order > PAGE_ALLOC_COSTLY_ORDER) ?
MIN_COMPACT_COSTLY_PRIORITY : MIN_COMPACT_PRIORITY;
if (*compact_priority > min_priority) {
(*compact_priority)--;
+ *compaction_retries = 0;
return true;
}
return false;
@@ -3227,7 +3231,7 @@ static inline bool
should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
enum compact_result compact_result,
enum compact_priority *compact_priority,
- int compaction_retries)
+ int *compaction_retries)
{
struct zone *zone;
struct zoneref *z;
@@ -3635,9 +3639,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if (page)
goto got_pg;
- if (order && compaction_made_progress(compact_result))
- compaction_retries++;
-
/* Do not loop if specifically requested */
if (gfp_mask & __GFP_NORETRY)
goto nopage;
@@ -3672,7 +3673,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if (did_some_progress > 0 &&
should_compact_retry(ac, order, alloc_flags,
compact_result, &compact_priority,
- compaction_retries))
+ &compaction_retries))
goto retry;
/* Reclaim has failed us, start killing things */
--
2.10.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-27 3:25 ` Hillf Danton
-1 siblings, 0 replies; 18+ messages in thread
From: Hillf Danton @ 2016-09-27 3:25 UTC (permalink / raw)
To: 'Vlastimil Babka', 'Andrew Morton'
Cc: linux-mm, linux-kernel, 'Linus Torvalds',
'Arkadiusz Miskiewicz', 'Ralf-Peter Rohbeck',
'Olaf Hering', 'Mel Gorman',
'Joonsoo Kim', 'David Rientjes',
'Rik van Riel', 'Michal Hocko',
'Michal Hocko'
On Tuesday, September 27, 2016 12:20 AM Vlastimil Babka wrote
>
> When increasing the compaction priority, also reset retries. Otherwise we can
> consume all retries on the lower priorities. Also pull the retries increment
> into should_compact_retry() so it counts only the rounds where we actually
> rely on it.
>
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
> ---
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/4] mm, compaction: more reliably increase direct compaction priority-fix
@ 2016-09-27 3:25 ` Hillf Danton
0 siblings, 0 replies; 18+ messages in thread
From: Hillf Danton @ 2016-09-27 3:25 UTC (permalink / raw)
To: 'Vlastimil Babka', 'Andrew Morton'
Cc: linux-mm, linux-kernel, 'Linus Torvalds',
'Arkadiusz Miskiewicz', 'Ralf-Peter Rohbeck',
'Olaf Hering', 'Mel Gorman',
'Joonsoo Kim', 'David Rientjes',
'Rik van Riel', 'Michal Hocko',
'Michal Hocko'
On Tuesday, September 27, 2016 12:20 AM Vlastimil Babka wrote
>
> When increasing the compaction priority, also reset retries. Otherwise we can
> consume all retries on the lower priorities. Also pull the retries increment
> into should_compact_retry() so it counts only the rounds where we actually
> rely on it.
>
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
> ---
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 2/4] mm, page_alloc: pull no_progress_loops update to should_reclaim_retry()
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20 ` Vlastimil Babka
-1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
Michal Hocko, Michal Hocko
The should_reclaim_retry() makes decisions based on no_progress_loops, so it
makes sense to also update the counter there. It will be also consistent with
should_compact_retry() and compaction_retries. No functional change.
[hillf.zj@alibaba-inc.com: fix missing pointer dereferences]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
mm/page_alloc.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0fd29731ab35..8ed4f506ae0b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3404,16 +3404,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
static inline bool
should_reclaim_retry(gfp_t gfp_mask, unsigned order,
struct alloc_context *ac, int alloc_flags,
- bool did_some_progress, int no_progress_loops)
+ bool did_some_progress, int *no_progress_loops)
{
struct zone *zone;
struct zoneref *z;
/*
+ * Costly allocations might have made a progress but this doesn't mean
+ * their order will become available due to high fragmentation so
+ * always increment the no progress counter for them
+ */
+ if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
+ *no_progress_loops = 0;
+ else
+ (*no_progress_loops)++;
+
+ /*
* Make sure we converge to OOM if we cannot make any progress
* several times in the row.
*/
- if (no_progress_loops > MAX_RECLAIM_RETRIES)
+ if (*no_progress_loops > MAX_RECLAIM_RETRIES)
return false;
/*
@@ -3428,7 +3438,7 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
unsigned long reclaimable;
available = reclaimable = zone_reclaimable_pages(zone);
- available -= DIV_ROUND_UP(no_progress_loops * available,
+ available -= DIV_ROUND_UP((*no_progress_loops) * available,
MAX_RECLAIM_RETRIES);
available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
@@ -3650,18 +3660,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
goto nopage;
- /*
- * Costly allocations might have made a progress but this doesn't mean
- * their order will become available due to high fragmentation so
- * always increment the no progress counter for them
- */
- if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
- no_progress_loops = 0;
- else
- no_progress_loops++;
-
if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
- did_some_progress > 0, no_progress_loops))
+ did_some_progress > 0, &no_progress_loops))
goto retry;
/*
--
2.10.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 2/4] mm, page_alloc: pull no_progress_loops update to should_reclaim_retry()
@ 2016-09-26 16:20 ` Vlastimil Babka
0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Mel Gorman,
Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
Michal Hocko, Michal Hocko
The should_reclaim_retry() makes decisions based on no_progress_loops, so it
makes sense to also update the counter there. It will be also consistent with
should_compact_retry() and compaction_retries. No functional change.
[hillf.zj@alibaba-inc.com: fix missing pointer dereferences]
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
mm/page_alloc.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0fd29731ab35..8ed4f506ae0b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3404,16 +3404,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
static inline bool
should_reclaim_retry(gfp_t gfp_mask, unsigned order,
struct alloc_context *ac, int alloc_flags,
- bool did_some_progress, int no_progress_loops)
+ bool did_some_progress, int *no_progress_loops)
{
struct zone *zone;
struct zoneref *z;
/*
+ * Costly allocations might have made a progress but this doesn't mean
+ * their order will become available due to high fragmentation so
+ * always increment the no progress counter for them
+ */
+ if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
+ *no_progress_loops = 0;
+ else
+ (*no_progress_loops)++;
+
+ /*
* Make sure we converge to OOM if we cannot make any progress
* several times in the row.
*/
- if (no_progress_loops > MAX_RECLAIM_RETRIES)
+ if (*no_progress_loops > MAX_RECLAIM_RETRIES)
return false;
/*
@@ -3428,7 +3438,7 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
unsigned long reclaimable;
available = reclaimable = zone_reclaimable_pages(zone);
- available -= DIV_ROUND_UP(no_progress_loops * available,
+ available -= DIV_ROUND_UP((*no_progress_loops) * available,
MAX_RECLAIM_RETRIES);
available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
@@ -3650,18 +3660,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if (order > PAGE_ALLOC_COSTLY_ORDER && !(gfp_mask & __GFP_REPEAT))
goto nopage;
- /*
- * Costly allocations might have made a progress but this doesn't mean
- * their order will become available due to high fragmentation so
- * always increment the no progress counter for them
- */
- if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
- no_progress_loops = 0;
- else
- no_progress_loops++;
-
if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
- did_some_progress > 0, no_progress_loops))
+ did_some_progress > 0, &no_progress_loops))
goto retry;
/*
--
2.10.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20 ` Vlastimil Babka
-1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton, Michal Hocko
The compaction_zonelist_suitable() function tries to determine if compaction
will be able to proceed after sufficient reclaim, i.e. whether there are
enough reclaimable pages to provide enough order-0 freepages for compaction.
This addition of reclaimable pages to the free pages works well for the order-0
watermark check, but in the fragmentation index check we only consider truly
free pages. Thus we can get fragindex value close to 0 which indicates failure
do to lack of memory, and wrongly decide that compaction won't be suitable even
after reclaim.
Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
skip it from compaction_zonelist_suitable().
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
mm/compaction.c | 35 ++++++++++++++++++-----------------
1 file changed, 18 insertions(+), 17 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 86d4d0bbfc7c..5ff7f801c345 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
int classzone_idx,
unsigned long wmark_target)
{
- int fragindex;
unsigned long watermark;
if (is_via_compact_memory(order))
@@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
ALLOC_CMA, wmark_target))
return COMPACT_SKIPPED;
+ return COMPACT_CONTINUE;
+}
+
+enum compact_result compaction_suitable(struct zone *zone, int order,
+ unsigned int alloc_flags,
+ int classzone_idx)
+{
+ enum compact_result ret;
+ int fragindex;
+
+ ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
+ zone_page_state(zone, NR_FREE_PAGES));
/*
* fragmentation index determines if allocation failures are due to
* low memory or external fragmentation
@@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
*
* Only compact if a failure would be due to fragmentation.
*/
- fragindex = fragmentation_index(zone, order);
- if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
- return COMPACT_NOT_SUITABLE_ZONE;
-
- return COMPACT_CONTINUE;
-}
-
-enum compact_result compaction_suitable(struct zone *zone, int order,
- unsigned int alloc_flags,
- int classzone_idx)
-{
- enum compact_result ret;
+ if (ret == COMPACT_CONTINUE) {
+ fragindex = fragmentation_index(zone, order);
+ if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
+ return COMPACT_NOT_SUITABLE_ZONE;
+ }
- ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
- zone_page_state(zone, NR_FREE_PAGES));
trace_mm_compaction_suitable(zone, order, ret);
if (ret == COMPACT_NOT_SUITABLE_ZONE)
ret = COMPACT_SKIPPED;
@@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
compact_result = __compaction_suitable(zone, order, alloc_flags,
ac_classzone_idx(ac), available);
- if (compact_result != COMPACT_SKIPPED &&
- compact_result != COMPACT_NOT_SUITABLE_ZONE)
+ if (compact_result != COMPACT_SKIPPED)
return true;
}
--
2.10.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
@ 2016-09-26 16:20 ` Vlastimil Babka
0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton, Michal Hocko
The compaction_zonelist_suitable() function tries to determine if compaction
will be able to proceed after sufficient reclaim, i.e. whether there are
enough reclaimable pages to provide enough order-0 freepages for compaction.
This addition of reclaimable pages to the free pages works well for the order-0
watermark check, but in the fragmentation index check we only consider truly
free pages. Thus we can get fragindex value close to 0 which indicates failure
do to lack of memory, and wrongly decide that compaction won't be suitable even
after reclaim.
Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
skip it from compaction_zonelist_suitable().
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
mm/compaction.c | 35 ++++++++++++++++++-----------------
1 file changed, 18 insertions(+), 17 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 86d4d0bbfc7c..5ff7f801c345 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
int classzone_idx,
unsigned long wmark_target)
{
- int fragindex;
unsigned long watermark;
if (is_via_compact_memory(order))
@@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
ALLOC_CMA, wmark_target))
return COMPACT_SKIPPED;
+ return COMPACT_CONTINUE;
+}
+
+enum compact_result compaction_suitable(struct zone *zone, int order,
+ unsigned int alloc_flags,
+ int classzone_idx)
+{
+ enum compact_result ret;
+ int fragindex;
+
+ ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
+ zone_page_state(zone, NR_FREE_PAGES));
/*
* fragmentation index determines if allocation failures are due to
* low memory or external fragmentation
@@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
*
* Only compact if a failure would be due to fragmentation.
*/
- fragindex = fragmentation_index(zone, order);
- if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
- return COMPACT_NOT_SUITABLE_ZONE;
-
- return COMPACT_CONTINUE;
-}
-
-enum compact_result compaction_suitable(struct zone *zone, int order,
- unsigned int alloc_flags,
- int classzone_idx)
-{
- enum compact_result ret;
+ if (ret == COMPACT_CONTINUE) {
+ fragindex = fragmentation_index(zone, order);
+ if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
+ return COMPACT_NOT_SUITABLE_ZONE;
+ }
- ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
- zone_page_state(zone, NR_FREE_PAGES));
trace_mm_compaction_suitable(zone, order, ret);
if (ret == COMPACT_NOT_SUITABLE_ZONE)
ret = COMPACT_SKIPPED;
@@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
compact_result = __compaction_suitable(zone, order, alloc_flags,
ac_classzone_idx(ac), available);
- if (compact_result != COMPACT_SKIPPED &&
- compact_result != COMPACT_NOT_SUITABLE_ZONE)
+ if (compact_result != COMPACT_SKIPPED)
return true;
}
--
2.10.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 20:15 ` Michal Hocko
-1 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:15 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton
On Mon 26-09-16 18:20:24, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
>
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
>
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().
Yes this makes a lot of sense to me. The fragidx should be mostly about
a balance between reclaim and the compaction so it shouldn't be used for
fail or retry decisions.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/compaction.c | 35 ++++++++++++++++++-----------------
> 1 file changed, 18 insertions(+), 17 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 86d4d0bbfc7c..5ff7f801c345 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
> int classzone_idx,
> unsigned long wmark_target)
> {
> - int fragindex;
> unsigned long watermark;
>
> if (is_via_compact_memory(order))
> @@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
> ALLOC_CMA, wmark_target))
> return COMPACT_SKIPPED;
>
> + return COMPACT_CONTINUE;
> +}
> +
> +enum compact_result compaction_suitable(struct zone *zone, int order,
> + unsigned int alloc_flags,
> + int classzone_idx)
> +{
> + enum compact_result ret;
> + int fragindex;
> +
> + ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> + zone_page_state(zone, NR_FREE_PAGES));
> /*
> * fragmentation index determines if allocation failures are due to
> * low memory or external fragmentation
> @@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
> *
> * Only compact if a failure would be due to fragmentation.
> */
> - fragindex = fragmentation_index(zone, order);
> - if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> - return COMPACT_NOT_SUITABLE_ZONE;
> -
> - return COMPACT_CONTINUE;
> -}
> -
> -enum compact_result compaction_suitable(struct zone *zone, int order,
> - unsigned int alloc_flags,
> - int classzone_idx)
> -{
> - enum compact_result ret;
> + if (ret == COMPACT_CONTINUE) {
> + fragindex = fragmentation_index(zone, order);
> + if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> + return COMPACT_NOT_SUITABLE_ZONE;
> + }
>
> - ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> - zone_page_state(zone, NR_FREE_PAGES));
> trace_mm_compaction_suitable(zone, order, ret);
> if (ret == COMPACT_NOT_SUITABLE_ZONE)
> ret = COMPACT_SKIPPED;
> @@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
> available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
> compact_result = __compaction_suitable(zone, order, alloc_flags,
> ac_classzone_idx(ac), available);
> - if (compact_result != COMPACT_SKIPPED &&
> - compact_result != COMPACT_NOT_SUITABLE_ZONE)
> + if (compact_result != COMPACT_SKIPPED)
> return true;
> }
>
> --
> 2.10.0
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
@ 2016-09-26 20:15 ` Michal Hocko
0 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:15 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton
On Mon 26-09-16 18:20:24, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
>
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
>
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().
Yes this makes a lot of sense to me. The fragidx should be mostly about
a balance between reclaim and the compaction so it shouldn't be used for
fail or retry decisions.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/compaction.c | 35 ++++++++++++++++++-----------------
> 1 file changed, 18 insertions(+), 17 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 86d4d0bbfc7c..5ff7f801c345 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
> int classzone_idx,
> unsigned long wmark_target)
> {
> - int fragindex;
> unsigned long watermark;
>
> if (is_via_compact_memory(order))
> @@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
> ALLOC_CMA, wmark_target))
> return COMPACT_SKIPPED;
>
> + return COMPACT_CONTINUE;
> +}
> +
> +enum compact_result compaction_suitable(struct zone *zone, int order,
> + unsigned int alloc_flags,
> + int classzone_idx)
> +{
> + enum compact_result ret;
> + int fragindex;
> +
> + ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> + zone_page_state(zone, NR_FREE_PAGES));
> /*
> * fragmentation index determines if allocation failures are due to
> * low memory or external fragmentation
> @@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
> *
> * Only compact if a failure would be due to fragmentation.
> */
> - fragindex = fragmentation_index(zone, order);
> - if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> - return COMPACT_NOT_SUITABLE_ZONE;
> -
> - return COMPACT_CONTINUE;
> -}
> -
> -enum compact_result compaction_suitable(struct zone *zone, int order,
> - unsigned int alloc_flags,
> - int classzone_idx)
> -{
> - enum compact_result ret;
> + if (ret == COMPACT_CONTINUE) {
> + fragindex = fragmentation_index(zone, order);
> + if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> + return COMPACT_NOT_SUITABLE_ZONE;
> + }
>
> - ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
> - zone_page_state(zone, NR_FREE_PAGES));
> trace_mm_compaction_suitable(zone, order, ret);
> if (ret == COMPACT_NOT_SUITABLE_ZONE)
> ret = COMPACT_SKIPPED;
> @@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
> available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
> compact_result = __compaction_suitable(zone, order, alloc_flags,
> ac_classzone_idx(ac), available);
> - if (compact_result != COMPACT_SKIPPED &&
> - compact_result != COMPACT_NOT_SUITABLE_ZONE)
> + if (compact_result != COMPACT_SKIPPED)
> return true;
> }
>
> --
> 2.10.0
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-29 9:57 ` Vlastimil Babka
-1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-29 9:57 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Michal Hocko, Mel Gorman,
Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
Michal Hocko
On 09/26/2016 06:20 PM, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
>
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
>
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
Bah, a fix below, sorry.
----8<----
>From 1cd4855305a9c7eef0cf0c1af9f310cecdcc9406 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Thu, 29 Sep 2016 11:54:07 +0200
Subject: [PATCH] mm, compaction: ignore fragindex from
compaction_zonelist_suitable()-fix
COMPACT_NOT_SUITABLE_ZONE should not leak outside compaction_suitable() and
we should not skip tracepoint.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/compaction.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index badb92bf14b4..0409a4ad6ea1 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1445,7 +1445,7 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
fragindex = fragmentation_index(zone, order);
if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
- return COMPACT_NOT_SUITABLE_ZONE;
+ ret = COMPACT_NOT_SUITABLE_ZONE;
}
trace_mm_compaction_suitable(zone, order, ret);
--
2.10.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 3/4] mm, compaction: ignore fragindex from compaction_zonelist_suitable()
@ 2016-09-29 9:57 ` Vlastimil Babka
0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-29 9:57 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Michal Hocko, Mel Gorman,
Joonsoo Kim, David Rientjes, Rik van Riel, Hillf Danton,
Michal Hocko
On 09/26/2016 06:20 PM, Vlastimil Babka wrote:
> The compaction_zonelist_suitable() function tries to determine if compaction
> will be able to proceed after sufficient reclaim, i.e. whether there are
> enough reclaimable pages to provide enough order-0 freepages for compaction.
>
> This addition of reclaimable pages to the free pages works well for the order-0
> watermark check, but in the fragmentation index check we only consider truly
> free pages. Thus we can get fragindex value close to 0 which indicates failure
> do to lack of memory, and wrongly decide that compaction won't be suitable even
> after reclaim.
>
> Instead of trying to somehow adjust fragindex for reclaimable pages, let's just
> skip it from compaction_zonelist_suitable().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
Bah, a fix below, sorry.
----8<----
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 16:20 ` Vlastimil Babka
-1 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton, Michal Hocko
Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
make any difference for non-costly orders, especially with the default
threshold. But we cannot afford any uncertainty for the non-costly orders where
the only alternative to successful reclaim/compaction is OOM. After the recent
patches we are guaranteed maximum effort without heuristics from compaction
before deciding OOM, and fragindex is the last remaining heuristic. Therefore
skip fragindex altogether for non-costly orders.
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
mm/compaction.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 5ff7f801c345..badb92bf14b4 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
* index towards 0 implies failure is due to lack of memory
* index towards 1000 implies failure is due to fragmentation
*
- * Only compact if a failure would be due to fragmentation.
+ * Only compact if a failure would be due to fragmentation. Also
+ * ignore fragindex for non-costly orders where the alternative to
+ * a successful reclaim/compaction is OOM. Fragindex and the
+ * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
+ * excessive compaction for costly orders, but it should not be at the
+ * expense of system stability.
*/
- if (ret == COMPACT_CONTINUE) {
+ if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
fragindex = fragmentation_index(zone, order);
if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
return COMPACT_NOT_SUITABLE_ZONE;
--
2.10.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
@ 2016-09-26 16:20 ` Vlastimil Babka
0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2016-09-26 16:20 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Linus Torvalds, Arkadiusz Miskiewicz,
Ralf-Peter Rohbeck, Olaf Hering, Vlastimil Babka, Michal Hocko,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton, Michal Hocko
Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
make any difference for non-costly orders, especially with the default
threshold. But we cannot afford any uncertainty for the non-costly orders where
the only alternative to successful reclaim/compaction is OOM. After the recent
patches we are guaranteed maximum effort without heuristics from compaction
before deciding OOM, and fragindex is the last remaining heuristic. Therefore
skip fragindex altogether for non-costly orders.
Suggested-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
---
mm/compaction.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 5ff7f801c345..badb92bf14b4 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
* index towards 0 implies failure is due to lack of memory
* index towards 1000 implies failure is due to fragmentation
*
- * Only compact if a failure would be due to fragmentation.
+ * Only compact if a failure would be due to fragmentation. Also
+ * ignore fragindex for non-costly orders where the alternative to
+ * a successful reclaim/compaction is OOM. Fragindex and the
+ * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
+ * excessive compaction for costly orders, but it should not be at the
+ * expense of system stability.
*/
- if (ret == COMPACT_CONTINUE) {
+ if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
fragindex = fragmentation_index(zone, order);
if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
return COMPACT_NOT_SUITABLE_ZONE;
--
2.10.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
2016-09-26 16:20 ` Vlastimil Babka
@ 2016-09-26 20:29 ` Michal Hocko
-1 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:29 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton
On Mon 26-09-16 18:20:25, Vlastimil Babka wrote:
> Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
> to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
> make any difference for non-costly orders, especially with the default
> threshold. But we cannot afford any uncertainty for the non-costly orders where
> the only alternative to successful reclaim/compaction is OOM. After the recent
> patches we are guaranteed maximum effort without heuristics from compaction
> before deciding OOM, and fragindex is the last remaining heuristic. Therefore
> skip fragindex altogether for non-costly orders.
It would be nicer to reduce this just to the highest compaction priority
but as your previous attempt shows this adds a lot of code churn.
Not skipping the compaction for these !costly orders might lead to a
higher latency for the allocation due to pointless zone scanning but
considering that an alternative would be the order-0 reclaim which
doesn't guarantee any larger blocks then doing a more targeted approach
sounds quite reasonable to me.
This patch is not really needed to prevent pre-mature OOMs because
compaction_zonelist_suitable doesn't rely on the fragmentation index
after the previous patch but it makes sense to me regardless. The
fagindex was quite an obscure measure and having !costly order easier to
understand is valuable imho.
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/compaction.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 5ff7f801c345..badb92bf14b4 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
> * index towards 0 implies failure is due to lack of memory
> * index towards 1000 implies failure is due to fragmentation
> *
> - * Only compact if a failure would be due to fragmentation.
> + * Only compact if a failure would be due to fragmentation. Also
> + * ignore fragindex for non-costly orders where the alternative to
> + * a successful reclaim/compaction is OOM. Fragindex and the
> + * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
> + * excessive compaction for costly orders, but it should not be at the
> + * expense of system stability.
> */
> - if (ret == COMPACT_CONTINUE) {
> + if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
> fragindex = fragmentation_index(zone, order);
> if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> return COMPACT_NOT_SUITABLE_ZONE;
> --
> 2.10.0
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 4/4] mm, compaction: restrict fragindex to costly orders
@ 2016-09-26 20:29 ` Michal Hocko
0 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2016-09-26 20:29 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-mm, linux-kernel, Linus Torvalds,
Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
Mel Gorman, Joonsoo Kim, David Rientjes, Rik van Riel,
Hillf Danton
On Mon 26-09-16 18:20:25, Vlastimil Babka wrote:
> Fragmentation index and the vm.extfrag_threshold sysctl is meant as a heuristic
> to prevent excessive compaction for costly orders (i.e. THP). It's unlikely to
> make any difference for non-costly orders, especially with the default
> threshold. But we cannot afford any uncertainty for the non-costly orders where
> the only alternative to successful reclaim/compaction is OOM. After the recent
> patches we are guaranteed maximum effort without heuristics from compaction
> before deciding OOM, and fragindex is the last remaining heuristic. Therefore
> skip fragindex altogether for non-costly orders.
It would be nicer to reduce this just to the highest compaction priority
but as your previous attempt shows this adds a lot of code churn.
Not skipping the compaction for these !costly orders might lead to a
higher latency for the allocation due to pointless zone scanning but
considering that an alternative would be the order-0 reclaim which
doesn't guarantee any larger blocks then doing a more targeted approach
sounds quite reasonable to me.
This patch is not really needed to prevent pre-mature OOMs because
compaction_zonelist_suitable doesn't rely on the fragmentation index
after the previous patch but it makes sense to me regardless. The
fagindex was quite an obscure measure and having !costly order easier to
understand is valuable imho.
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/compaction.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 5ff7f801c345..badb92bf14b4 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1435,9 +1435,14 @@ enum compact_result compaction_suitable(struct zone *zone, int order,
> * index towards 0 implies failure is due to lack of memory
> * index towards 1000 implies failure is due to fragmentation
> *
> - * Only compact if a failure would be due to fragmentation.
> + * Only compact if a failure would be due to fragmentation. Also
> + * ignore fragindex for non-costly orders where the alternative to
> + * a successful reclaim/compaction is OOM. Fragindex and the
> + * vm.extfrag_threshold sysctl is meant as a heuristic to prevent
> + * excessive compaction for costly orders, but it should not be at the
> + * expense of system stability.
> */
> - if (ret == COMPACT_CONTINUE) {
> + if (ret == COMPACT_CONTINUE && (order > PAGE_ALLOC_COSTLY_ORDER)) {
> fragindex = fragmentation_index(zone, order);
> if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
> return COMPACT_NOT_SUITABLE_ZONE;
> --
> 2.10.0
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 18+ messages in thread