* [PATCH v2 0/4] enhance compaction success rate
@ 2015-01-30 12:34 Joonsoo Kim
  2015-01-30 12:34 ` [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-01-30 12:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim

This patchset aims to increase the compaction success rate. The changes are
related to the compaction finish condition and the freepage isolation
condition.

With these changes, I ran the stress-highalloc test in mmtests with an
unmovable order-7 allocation configuration. The compaction success rates (%)
are:

Base	Patch-1 Patch-2 Patch-3	Patch-4
18.47	27.13   31.82	--	42.20

Note: the Base number was measured with v1 and the others were measured
freshly. Tests were performed on next-20150103 plus Vlastimil's stealing
logic patches, because the current -next tree is unstable.
Patch-3 isn't tested separately since it makes no functional change.

Joonsoo (3):
  mm/compaction: stop the isolation when we isolate enough freepage
  mm/page_alloc: separate steal decision from steal behaviour part
  mm/compaction: enhance compaction finish condition

Joonsoo Kim (1):
  mm/compaction: fix wrong order check in compact_finished()

 include/linux/mmzone.h |  3 +++
 mm/compaction.c        | 47 ++++++++++++++++++++++++++++++++++++++---------
 mm/internal.h          |  1 +
 mm/page_alloc.c        | 50 ++++++++++++++++++++++++++++++++------------------
 4 files changed, 74 insertions(+), 27 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished()
  2015-01-30 12:34 [PATCH v2 0/4] enhance compaction success rate Joonsoo Kim
@ 2015-01-30 12:34 ` Joonsoo Kim
  2015-01-30 13:27   ` Vlastimil Babka
  2015-01-31  7:38   ` Zhang Yanfei
  2015-01-30 12:34 ` [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-01-30 12:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim, stable

What we want to check here is whether there is a high-order freepage in the
buddy list of another migratetype that we could steal without causing
fragmentation. But the current code checks cc->order, which is the
allocation request order, not the order currently being examined, so the
check is wrong.
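
For reference, the check sits in a loop in __compact_finished() that walks
the free areas upward from the requested order, so 'order' below is the loop
index while cc->order is the original request (simplified sketch, not the
exact code):

	/* Direct compactor: is a suitable page free? */
	for (order = cc->order; order < MAX_ORDER; order++) {
		struct free_area *area = &zone->free_area[order];

		/* Job done if page is free of the right migratetype */
		if (!list_empty(&area->free_list[migratetype]))
			return COMPACT_PARTIAL;

		/* Job done if allocation would set block type */
		if (cc->order >= pageblock_order && area->nr_free)
			return COMPACT_PARTIAL;	/* should test 'order' */
	}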

Without this fix, synchronous compaction for an unmovable allocation below
pageblock order would not stop until compaction completes, because the
migratetype of most pageblocks is movable and the high-order freepages
created by compaction usually sit on the movable buddy lists.

There is a report related to this bug; see the link below.

http://www.spinics.net/lists/linux-mm/msg81666.html

Although the affected system still has load spikes coming from compaction,
this change makes the system completely stable and responsive according to
the reporter.

The stress-highalloc test in mmtests with unmovable order-7 allocations
doesn't show any notable difference in allocation success rate, but it does
show a higher compaction success rate.

Compaction success rate (Compaction success * 100 / Compaction stalls, %)
18.47 : 28.94

Cc: <stable@vger.kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index b68736c..4954e19 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
 			return COMPACT_PARTIAL;
 
 		/* Job done if allocation would set block type */
-		if (cc->order >= pageblock_order && area->nr_free)
+		if (order >= pageblock_order && area->nr_free)
 			return COMPACT_PARTIAL;
 	}
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage
  2015-01-30 12:34 [PATCH v2 0/4] enhance compaction success rate Joonsoo Kim
  2015-01-30 12:34 ` [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
@ 2015-01-30 12:34 ` Joonsoo Kim
  2015-01-30 13:47   ` Vlastimil Babka
  2015-01-31  7:49   ` Zhang Yanfei
  2015-01-30 12:34 ` [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Joonsoo Kim
  2015-01-30 12:34 ` [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
  3 siblings, 2 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-01-30 12:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo <iamjoonsoo.kim@lge.com>

Currently, freepage isolation within one pageblock doesn't consider how many
freepages have already been isolated. When I traced the flow of compaction,
compaction sometimes isolated more than 256 freepages in order to migrate
just 32 pages.

With this patch, freepage isolation stops as soon as we have isolated at
least as many freepages as pages for migration. This slows down the free
page scanner and makes the compaction success rate higher.

The stress-highalloc test in mmtests with unmovable order-7 allocations
shows an increase in compaction success rate.

Compaction success rate (Compaction success * 100 / Compaction stalls, %)
27.13 : 31.82

pfn where both scanners meet when compaction completes
(separate test due to enormous tracepoint buffer)
(zone_start=4096, zone_end=1048576)
586034 : 654378

In fact, I don't fully understand why this patch gives such a good result.
One guess was that the unused freepages are released to the pcp list and on
the next compaction trial we won't isolate them again, so the compaction
success rate would decrease. To check this effect, I tested with pcp drain
code added to release_freepages(), but it made no difference.

Anyway, this patch reduces the time wasted isolating unneeded freepages, so
it seems reasonable.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/compaction.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 4954e19..782772d 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 
 		/* If a page was split, advance to the end of it */
 		if (isolated) {
+			cc->nr_freepages += isolated;
+			if (!strict &&
+				cc->nr_migratepages <= cc->nr_freepages) {
+				blockpfn += isolated;
+				break;
+			}
+
 			blockpfn += isolated - 1;
 			cursor += isolated - 1;
 			continue;
@@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc)
 	unsigned long isolate_start_pfn; /* exact pfn we start at */
 	unsigned long block_end_pfn;	/* end of current pageblock */
 	unsigned long low_pfn;	     /* lowest pfn scanner is able to scan */
-	int nr_freepages = cc->nr_freepages;
 	struct list_head *freelist = &cc->freepages;
 
 	/*
@@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc)
 	 * pages on cc->migratepages. We stop searching if the migrate
 	 * and free page scanners meet or enough free pages are isolated.
 	 */
-	for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
+	for (; block_start_pfn >= low_pfn &&
+			cc->nr_migratepages > cc->nr_freepages;
 				block_end_pfn = block_start_pfn,
 				block_start_pfn -= pageblock_nr_pages,
 				isolate_start_pfn = block_start_pfn) {
-		unsigned long isolated;
 
 		/*
 		 * This can iterate a massively long zone without finding any
@@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc)
 			continue;
 
 		/* Found a block suitable for isolating free pages from. */
-		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
+		isolate_freepages_block(cc, &isolate_start_pfn,
 					block_end_pfn, freelist, false);
-		nr_freepages += isolated;
 
 		/*
 		 * Remember where the free scanner should restart next time,
@@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc)
 	 */
 	if (block_start_pfn < low_pfn)
 		cc->free_pfn = cc->migrate_pfn;
-
-	cc->nr_freepages = nr_freepages;
 }
 
 /*
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part
  2015-01-30 12:34 [PATCH v2 0/4] enhance compaction success rate Joonsoo Kim
  2015-01-30 12:34 ` [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
  2015-01-30 12:34 ` [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
@ 2015-01-30 12:34 ` Joonsoo Kim
  2015-01-30 14:27   ` Vlastimil Babka
  2015-01-31 12:38   ` Zhang Yanfei
  2015-01-30 12:34 ` [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
  3 siblings, 2 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-01-30 12:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo <iamjoonsoo.kim@lge.com>

This is a preparation step for using the page allocator's anti-fragmentation
logic in compaction. This patch just separates the steal decision part from
the actual steal behaviour part, so there is no functional change.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 32 insertions(+), 17 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8d52ab1..ef74750 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page,
 	}
 }
 
+static bool can_steal_freepages(unsigned int order,
+				int start_mt, int fallback_mt)
+{
+	if (is_migrate_cma(fallback_mt))
+		return false;
+
+	if (order >= pageblock_order)
+		return true;
+
+	if (order >= pageblock_order / 2 ||
+		start_mt == MIGRATE_RECLAIMABLE ||
+		start_mt == MIGRATE_UNMOVABLE ||
+		page_group_by_mobility_disabled)
+		return true;
+
+	return false;
+}
+
 /*
  * When we are falling back to another migratetype during allocation, try to
  * steal extra free pages from the same pageblocks to satisfy further
@@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page,
  * as well.
  */
 static void try_to_steal_freepages(struct zone *zone, struct page *page,
-				  int start_type, int fallback_type)
+				  int start_type)
 {
 	int current_order = page_order(page);
+	int pages;
 
 	/* Take ownership for orders >= pageblock_order */
 	if (current_order >= pageblock_order) {
@@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page,
 		return;
 	}
 
-	if (current_order >= pageblock_order / 2 ||
-	    start_type == MIGRATE_RECLAIMABLE ||
-	    start_type == MIGRATE_UNMOVABLE ||
-	    page_group_by_mobility_disabled) {
-		int pages;
+	pages = move_freepages_block(zone, page, start_type);
 
-		pages = move_freepages_block(zone, page, start_type);
-
-		/* Claim the whole block if over half of it is free */
-		if (pages >= (1 << (pageblock_order-1)) ||
-				page_group_by_mobility_disabled)
-			set_pageblock_migratetype(page, start_type);
-	}
+	/* Claim the whole block if over half of it is free */
+	if (pages >= (1 << (pageblock_order-1)) ||
+			page_group_by_mobility_disabled)
+		set_pageblock_migratetype(page, start_type);
 }
 
 /* Remove an element from the buddy allocator from the fallback list */
@@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
 	struct free_area *area;
 	unsigned int current_order;
 	struct page *page;
+	bool can_steal;
 
 	/* Find the largest possible block of pages in the other list */
 	for (current_order = MAX_ORDER-1;
@@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
 					struct page, lru);
 			area->nr_free--;
 
-			if (!is_migrate_cma(migratetype)) {
+			can_steal = can_steal_freepages(current_order,
+					start_migratetype, migratetype);
+			if (can_steal) {
 				try_to_steal_freepages(zone, page,
-							start_migratetype,
-							migratetype);
+							start_migratetype);
 			} else {
 				/*
 				 * When borrowing from MIGRATE_CMA, we need to
@@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
 				 * itself, and we do not try to steal extra
 				 * free pages.
 				 */
-				buddy_type = migratetype;
+				if (is_migrate_cma(migratetype))
+					buddy_type = migratetype;
 			}
 
 			/* Remove the page from the freelists */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/4] mm/compaction: enhance compaction finish condition
  2015-01-30 12:34 [PATCH v2 0/4] enhance compaction success rate Joonsoo Kim
                   ` (2 preceding siblings ...)
  2015-01-30 12:34 ` [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Joonsoo Kim
@ 2015-01-30 12:34 ` Joonsoo Kim
  2015-01-30 14:43   ` Vlastimil Babka
  2015-01-31 15:58   ` Zhang Yanfei
  3 siblings, 2 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-01-30 12:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo <iamjoonsoo.kim@lge.com>

Compaction has an anti-fragmentation algorithm: if we don't find any
freepage in the buddy list of the requested migratetype, a freepage of at
least pageblock order must be available for compaction to finish. This is
meant to mitigate fragmentation, but it lacks any migratetype consideration
and it is far stricter than the page allocator's anti-fragmentation
algorithm.

Not considering migratetype can cause compaction to finish prematurely. For
example, if the allocation request is for the unmovable migratetype, a
freepage with the CMA migratetype doesn't help that allocation, so
compaction should not be stopped. But the current logic regards this
situation as one where compaction is no longer needed and finishes it.

Secondly, the condition is far stricter than the page allocator's logic.
The page allocator can steal freepages from other migratetypes and change a
pageblock's migratetype under more relaxed conditions. That logic is
designed to prevent fragmentation, and we can use it here as well. Imposing
a hard constraint only on compaction doesn't help much in this case, since
the page allocator would cause fragmentation again.

To solve these problems, this patch borrows the anti-fragmentation logic
from the page allocator. It reduces premature compaction finishes in some
cases and reduces excessive compaction work.

The stress-highalloc test in mmtests with unmovable order-7 allocations
shows a considerable increase in compaction success rate.

Compaction success rate (Compaction success * 100 / Compaction stalls, %)
31.82 : 42.20

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/mmzone.h |  3 +++
 mm/compaction.c        | 30 ++++++++++++++++++++++++++++--
 mm/internal.h          |  1 +
 mm/page_alloc.c        |  5 ++---
 4 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f279d9c..a2906bc 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -63,6 +63,9 @@ enum {
 	MIGRATE_TYPES
 };
 
+#define FALLBACK_MIGRATETYPES (4)
+extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES];
+
 #ifdef CONFIG_CMA
 #  define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
 #else
diff --git a/mm/compaction.c b/mm/compaction.c
index 782772d..0460e4b 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
 }
 
+static bool can_steal_fallbacks(struct free_area *area,
+			unsigned int order, int migratetype)
+{
+	int i;
+	int fallback_mt;
+
+	if (area->nr_free == 0)
+		return false;
+
+	for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
+		fallback_mt = fallbacks[migratetype][i];
+		if (fallback_mt == MIGRATE_RESERVE)
+			break;
+
+		if (list_empty(&area->free_list[fallback_mt]))
+			continue;
+
+		if (can_steal_freepages(order, migratetype, fallback_mt))
+			return true;
+	}
+	return false;
+}
+
 static int __compact_finished(struct zone *zone, struct compact_control *cc,
 			    const int migratetype)
 {
@@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
 		if (!list_empty(&area->free_list[migratetype]))
 			return COMPACT_PARTIAL;
 
-		/* Job done if allocation would set block type */
-		if (order >= pageblock_order && area->nr_free)
+		/*
+		 * Job done if allocation would steal freepages from
+		 * other migratetype buddy lists.
+		 */
+		if (can_steal_fallbacks(area, order, migratetype))
 			return COMPACT_PARTIAL;
 	}
 
diff --git a/mm/internal.h b/mm/internal.h
index c4d6c9b..0a89a14 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -201,6 +201,7 @@ unsigned long
 isolate_migratepages_range(struct compact_control *cc,
 			   unsigned long low_pfn, unsigned long end_pfn);
 
+bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt);
 #endif
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ef74750..4c3538b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
  * This array describes the order lists are fallen back to when
  * the free lists for the desirable migrate type are depleted
  */
-static int fallbacks[MIGRATE_TYPES][4] = {
+int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = {
 	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,     MIGRATE_RESERVE },
 	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,     MIGRATE_RESERVE },
 #ifdef CONFIG_CMA
@@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page,
 	}
 }
 
-static bool can_steal_freepages(unsigned int order,
-				int start_mt, int fallback_mt)
+bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt)
 {
 	if (is_migrate_cma(fallback_mt))
 		return false;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished()
  2015-01-30 12:34 ` [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
@ 2015-01-30 13:27   ` Vlastimil Babka
  2015-01-31  7:38   ` Zhang Yanfei
  1 sibling, 0 replies; 19+ messages in thread
From: Vlastimil Babka @ 2015-01-30 13:27 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel,
	Joonsoo Kim, stable

On 01/30/2015 01:34 PM, Joonsoo Kim wrote:
> What we want to check here is whether there is highorder freepage
> in buddy list of other migratetype in order to steal it without
> fragmentation. But, current code just checks cc->order which means
> allocation request order. So, this is wrong.

The bug was introduced by 1fb3f8ca0e92 ("mm: compaction: capture a suitable
high-order page immediately when it is made available") and survived the
later partial revert 8fb74b9fb2b1.

> Without this fix, non-movable synchronous compaction below pageblock order
> would not stopped until compaction is complete, because migratetype of most
> pageblocks are movable and high order freepage made by compaction is usually
> on movable type buddy list.
> 
> There is some report related to this bug. See below link.
> 
> http://www.spinics.net/lists/linux-mm/msg81666.html
> 
> Although the issued system still has load spike comes from compaction,
> this makes that system completely stable and responsive according to
> his report.
> 
> stress-highalloc test in mmtests with non movable order 7 allocation doesn't
> show any notable difference in allocation success rate, but, it shows more
> compaction success rate.
> 
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 18.47 : 28.94
> 
> Cc: <stable@vger.kernel.org>

# v3.7+
Fixes: 1fb3f8ca0e9222535a39b884cb67a34628411b9f

> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Thanks.

> ---
>  mm/compaction.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index b68736c..4954e19 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
>  			return COMPACT_PARTIAL;
>  
>  		/* Job done if allocation would set block type */
> -		if (cc->order >= pageblock_order && area->nr_free)
> +		if (order >= pageblock_order && area->nr_free)
>  			return COMPACT_PARTIAL;
>  	}
>  
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage
  2015-01-30 12:34 ` [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
@ 2015-01-30 13:47   ` Vlastimil Babka
  2015-01-31  7:49   ` Zhang Yanfei
  1 sibling, 0 replies; 19+ messages in thread
From: Vlastimil Babka @ 2015-01-30 13:47 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel,
	Joonsoo Kim

On 01/30/2015 01:34 PM, Joonsoo Kim wrote:
> From: Joonsoo <iamjoonsoo.kim@lge.com>
> 
> Currently, freepage isolation in one pageblock doesn't consider how many
> freepages we isolate. When I traced flow of compaction, compaction
> sometimes isolates more than 256 freepages to migrate just 32 pages.
> 
> In this patch, freepage isolation is stopped at the point that we
> have more isolated freepage than isolated page for migration. This
> results in slowing down free page scanner and make compaction success
> rate higher.
> 
> stress-highalloc test in mmtests with non movable order 7 allocation shows
> increase of compaction success rate.
> 
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 27.13 : 31.82
> 
> pfn where both scanners meets on compaction complete
> (separate test due to enormous tracepoint buffer)
> (zone_start=4096, zone_end=1048576)
> 586034 : 654378

Now that I know that the scanners meeting further into the zone is better
for the success rate, the better success rate makes sense. Still not sure
why they meet further :)

> In fact, I didn't fully understand why this patch results in such good
> result. There was a guess that not used freepages are released to pcp list
> and on next compaction trial we won't isolate them again so compaction
> success rate would decrease. To prevent this effect, I tested with adding
> pcp drain code on release_freepages(), but, it has no good effect.
> 
> Anyway, this patch reduces waste time to isolate unneeded freepages so
> seems reasonable.

I briefly tried it on top of the pivot-changing series, and with order-9
allocations it reduced the free-pages-scanned counter by almost 10%. No
effect on success rates (maybe because the pivot change already took care of
the scanners-meeting problem), but the scanning reduction is good on its
own.

It also explains why e14c720efdd7 ("mm, compaction: remember position within
pageblock in free pages scanner") brought smaller improvements than
expected: it would only actually stop within a pageblock when async
compaction detected contention. I guess that's also why the infinite-loop
problem fixed by 1d5bfe1ffb5b affected relatively few people.

> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks!

> ---
>  mm/compaction.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 4954e19..782772d 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>  
>  		/* If a page was split, advance to the end of it */
>  		if (isolated) {
> +			cc->nr_freepages += isolated;
> +			if (!strict &&
> +				cc->nr_migratepages <= cc->nr_freepages) {
> +				blockpfn += isolated;
> +				break;
> +			}
> +
>  			blockpfn += isolated - 1;
>  			cursor += isolated - 1;
>  			continue;
> @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc)
>  	unsigned long isolate_start_pfn; /* exact pfn we start at */
>  	unsigned long block_end_pfn;	/* end of current pageblock */
>  	unsigned long low_pfn;	     /* lowest pfn scanner is able to scan */
> -	int nr_freepages = cc->nr_freepages;
>  	struct list_head *freelist = &cc->freepages;
>  
>  	/*
> @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc)
>  	 * pages on cc->migratepages. We stop searching if the migrate
>  	 * and free page scanners meet or enough free pages are isolated.
>  	 */
> -	for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
> +	for (; block_start_pfn >= low_pfn &&
> +			cc->nr_migratepages > cc->nr_freepages;
>  				block_end_pfn = block_start_pfn,
>  				block_start_pfn -= pageblock_nr_pages,
>  				isolate_start_pfn = block_start_pfn) {
> -		unsigned long isolated;
>  
>  		/*
>  		 * This can iterate a massively long zone without finding any
> @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc)
>  			continue;
>  
>  		/* Found a block suitable for isolating free pages from. */
> -		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> +		isolate_freepages_block(cc, &isolate_start_pfn,
>  					block_end_pfn, freelist, false);
> -		nr_freepages += isolated;
>  
>  		/*
>  		 * Remember where the free scanner should restart next time,
> @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc)
>  	 */
>  	if (block_start_pfn < low_pfn)
>  		cc->free_pfn = cc->migrate_pfn;
> -
> -	cc->nr_freepages = nr_freepages;
>  }
>  
>  /*
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part
  2015-01-30 12:34 ` [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Joonsoo Kim
@ 2015-01-30 14:27   ` Vlastimil Babka
  2015-02-02  7:02     ` Joonsoo Kim
  2015-01-31 12:38   ` Zhang Yanfei
  1 sibling, 1 reply; 19+ messages in thread
From: Vlastimil Babka @ 2015-01-30 14:27 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel,
	Joonsoo Kim

On 01/30/2015 01:34 PM, Joonsoo Kim wrote:
> From: Joonsoo <iamjoonsoo.kim@lge.com>
> 
> This is preparation step to use page allocator's anti fragmentation logic
> in compaction. This patch just separates steal decision part from actual
> steal behaviour part so there is no functional change.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 32 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 8d52ab1..ef74750 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page,
>  	}
>  }
>  
> +static bool can_steal_freepages(unsigned int order,
> +				int start_mt, int fallback_mt)
> +{
> +	if (is_migrate_cma(fallback_mt))
> +		return false;
> +
> +	if (order >= pageblock_order)
> +		return true;
> +
> +	if (order >= pageblock_order / 2 ||
> +		start_mt == MIGRATE_RECLAIMABLE ||
> +		start_mt == MIGRATE_UNMOVABLE ||
> +		page_group_by_mobility_disabled)
> +		return true;
> +
> +	return false;
> +}
> +
>  /*
>   * When we are falling back to another migratetype during allocation, try to
>   * steal extra free pages from the same pageblocks to satisfy further
> @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page,
>   * as well.
>   */
>  static void try_to_steal_freepages(struct zone *zone, struct page *page,
> -				  int start_type, int fallback_type)
> +				  int start_type)

It's actually not 'try_to_' anymore, is it? But could be, see below.

>  {
>  	int current_order = page_order(page);
> +	int pages;
>  
>  	/* Take ownership for orders >= pageblock_order */
>  	if (current_order >= pageblock_order) {
> @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page,
>  		return;
>  	}
>  
> -	if (current_order >= pageblock_order / 2 ||
> -	    start_type == MIGRATE_RECLAIMABLE ||
> -	    start_type == MIGRATE_UNMOVABLE ||
> -	    page_group_by_mobility_disabled) {
> -		int pages;
> +	pages = move_freepages_block(zone, page, start_type);
>  
> -		pages = move_freepages_block(zone, page, start_type);
> -
> -		/* Claim the whole block if over half of it is free */
> -		if (pages >= (1 << (pageblock_order-1)) ||
> -				page_group_by_mobility_disabled)
> -			set_pageblock_migratetype(page, start_type);
> -	}
> +	/* Claim the whole block if over half of it is free */
> +	if (pages >= (1 << (pageblock_order-1)) ||
> +			page_group_by_mobility_disabled)
> +		set_pageblock_migratetype(page, start_type);
>  }
>  
>  /* Remove an element from the buddy allocator from the fallback list */
> @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
>  	struct free_area *area;
>  	unsigned int current_order;
>  	struct page *page;
> +	bool can_steal;
>  
>  	/* Find the largest possible block of pages in the other list */
>  	for (current_order = MAX_ORDER-1;
> @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
>  					struct page, lru);
>  			area->nr_free--;
>  
> -			if (!is_migrate_cma(migratetype)) {
> +			can_steal = can_steal_freepages(current_order,
> +					start_migratetype, migratetype);
> +			if (can_steal) {

can_steal is only used once, why not do if (can_steal_freepages()) directly?

Or, call can_steal_freepages() from try_to_steal_freepages() and make
try_to_steal_freepages() return its result. Then here it simplifies to:

if (!try_to_steal_freepages(...) && is_migrate_cma(...))
	buddy_type = migratetype;
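
Spelled out, that restructuring might look roughly like this (just a sketch
of the idea, not tested):

static bool try_to_steal_freepages(struct zone *zone, struct page *page,
				   int start_type, int fallback_type)
{
	int current_order = page_order(page);
	int pages;

	/* decision part moved inside, as suggested */
	if (!can_steal_freepages(current_order, start_type, fallback_type))
		return false;

	/* Take ownership for orders >= pageblock_order */
	if (current_order >= pageblock_order) {
		change_pageblock_range(page, current_order, start_type);
		return true;
	}

	pages = move_freepages_block(zone, page, start_type);

	/* Claim the whole block if over half of it is free */
	if (pages >= (1 << (pageblock_order - 1)) ||
			page_group_by_mobility_disabled)
		set_pageblock_migratetype(page, start_type);

	return true;
}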

>  				try_to_steal_freepages(zone, page,
> -							start_migratetype,
> -							migratetype);
> +							start_migratetype);
>  			} else {
>  				/*
>  				 * When borrowing from MIGRATE_CMA, we need to
> @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
>  				 * itself, and we do not try to steal extra
>  				 * free pages.
>  				 */
> -				buddy_type = migratetype;
> +				if (is_migrate_cma(migratetype))
> +					buddy_type = migratetype;
>  			}
>  
>  			/* Remove the page from the freelists */
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition
  2015-01-30 12:34 ` [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
@ 2015-01-30 14:43   ` Vlastimil Babka
  2015-02-02  7:11     ` Joonsoo Kim
  2015-01-31 15:58   ` Zhang Yanfei
  1 sibling, 1 reply; 19+ messages in thread
From: Vlastimil Babka @ 2015-01-30 14:43 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel,
	Joonsoo Kim

On 01/30/2015 01:34 PM, Joonsoo Kim wrote:
> From: Joonsoo <iamjoonsoo.kim@lge.com>
> 
> Compaction has anti fragmentation algorithm. It is that freepage
> should be more than pageblock order to finish the compaction if we don't
> find any freepage in requested migratetype buddy list. This is for
> mitigating fragmentation, but, there is a lack of migratetype
> consideration and it is too excessive compared to page allocator's anti
> fragmentation algorithm.
> 
> Not considering migratetype would cause premature finish of compaction.
> For example, if allocation request is for unmovable migratetype,
> freepage with CMA migratetype doesn't help that allocation and
> compaction should not be stopped. But, current logic regards this
> situation as compaction is no longer needed, so finish the compaction.
> 
> Secondly, condition is too excessive compared to page allocator's logic.
> We can steal freepage from other migratetype and change pageblock
> migratetype on more relaxed conditions in page allocator. This is designed
> to prevent fragmentation and we can use it here. Imposing hard constraint
> only to the compaction doesn't help much in this case since page allocator
> would cause fragmentation again.
> 
> To solve these problems, this patch borrows anti fragmentation logic from
> page allocator. It will reduce premature compaction finish in some cases
> and reduce excessive compaction work.
> 
> stress-highalloc test in mmtests with non movable order 7 allocation shows
> considerable increase of compaction success rate.
> 
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 31.82 : 42.20
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

I have some worries about longer-term fragmentation; running
stress-highalloc several times without restarts could be helpful.

> ---
>  include/linux/mmzone.h |  3 +++
>  mm/compaction.c        | 30 ++++++++++++++++++++++++++++--
>  mm/internal.h          |  1 +
>  mm/page_alloc.c        |  5 ++---
>  4 files changed, 34 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index f279d9c..a2906bc 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -63,6 +63,9 @@ enum {
>  	MIGRATE_TYPES
>  };
>  
> +#define FALLBACK_MIGRATETYPES (4)
> +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES];
> +
>  #ifdef CONFIG_CMA
>  #  define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
>  #else
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 782772d..0460e4b 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
>  	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
>  }
>  
> +static bool can_steal_fallbacks(struct free_area *area,
> +			unsigned int order, int migratetype)

Could you move this to page_alloc.c? Then you wouldn't have to export the
fallbacks array.
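
In other words (illustrative sketch only), fallbacks[] could stay static in
mm/page_alloc.c, with only a helper declared in mm/internal.h, so neither
the array nor a FALLBACK_MIGRATETYPES define needs to be visible outside:

/* mm/page_alloc.c -- fallbacks[] stays static here */
bool can_steal_fallbacks(struct free_area *area, unsigned int order,
			 int migratetype)
{
	int i;

	if (area->nr_free == 0)
		return false;

	for (i = 0; i < ARRAY_SIZE(fallbacks[migratetype]); i++) {
		int fallback_mt = fallbacks[migratetype][i];

		if (fallback_mt == MIGRATE_RESERVE)
			break;
		if (list_empty(&area->free_list[fallback_mt]))
			continue;
		if (can_steal_freepages(order, migratetype, fallback_mt))
			return true;
	}
	return false;
}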

> +{
> +	int i;
> +	int fallback_mt;
> +
> +	if (area->nr_free == 0)
> +		return false;
> +
> +	for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
> +		fallback_mt = fallbacks[migratetype][i];
> +		if (fallback_mt == MIGRATE_RESERVE)
> +			break;
> +
> +		if (list_empty(&area->free_list[fallback_mt]))
> +			continue;
> +
> +		if (can_steal_freepages(order, migratetype, fallback_mt))
> +			return true;
> +	}
> +	return false;
> +}
> +
>  static int __compact_finished(struct zone *zone, struct compact_control *cc,
>  			    const int migratetype)
>  {
> @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
>  		if (!list_empty(&area->free_list[migratetype]))
>  			return COMPACT_PARTIAL;
>  
> -		/* Job done if allocation would set block type */
> -		if (order >= pageblock_order && area->nr_free)
> +		/*
> +		 * Job done if allocation would steal freepages from
> +		 * other migratetype buddy lists.
> +		 */
> +		if (can_steal_fallbacks(area, order, migratetype))
>  			return COMPACT_PARTIAL;

Seems somewhat wasteful in the scenario where we want to satisfy a movable
allocation and it's an async compaction. Then we don't compact
unmovable/reclaimable pageblocks, and yet we keep checking them for
fallbacks. A price to pay for having generic code?

Maybe can_steal_fallbacks() could know the sync/async mode and use
migrate_async_suitable() appropriately. But then I wouldn't move it to
page_alloc.c. A more efficient option could be a separate fallbacks array
for async compaction with the unsuitable types removed. But maybe I'm just
overengineering now.

>  	}
>  
> diff --git a/mm/internal.h b/mm/internal.h
> index c4d6c9b..0a89a14 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -201,6 +201,7 @@ unsigned long
>  isolate_migratepages_range(struct compact_control *cc,
>  			   unsigned long low_pfn, unsigned long end_pfn);
>  
> +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt);
>  #endif
>  
>  /*
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ef74750..4c3538b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
>   * This array describes the order lists are fallen back to when
>   * the free lists for the desirable migrate type are depleted
>   */
> -static int fallbacks[MIGRATE_TYPES][4] = {
> +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = {
>  	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,     MIGRATE_RESERVE },
>  	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,     MIGRATE_RESERVE },
>  #ifdef CONFIG_CMA
> @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page,
>  	}
>  }
>  
> -static bool can_steal_freepages(unsigned int order,
> -				int start_mt, int fallback_mt)
> +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt)
>  {
>  	if (is_migrate_cma(fallback_mt))
>  		return false;
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished()
  2015-01-30 12:34 ` [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
  2015-01-30 13:27   ` Vlastimil Babka
@ 2015-01-31  7:38   ` Zhang Yanfei
  1 sibling, 0 replies; 19+ messages in thread
From: Zhang Yanfei @ 2015-01-31  7:38 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim, stable

Hello,

At 2015/1/30 20:34, Joonsoo Kim wrote:
> What we want to check here is whether there is highorder freepage
> in buddy list of other migratetype in order to steal it without
> fragmentation. But, current code just checks cc->order which means
> allocation request order. So, this is wrong.
> 
> Without this fix, non-movable synchronous compaction below pageblock order
> would not stopped until compaction is complete, because migratetype of most
> pageblocks are movable and high order freepage made by compaction is usually
> on movable type buddy list.
> 
> There is some report related to this bug. See below link.
> 
> http://www.spinics.net/lists/linux-mm/msg81666.html
> 
> Although the issued system still has load spike comes from compaction,
> this makes that system completely stable and responsive according to
> his report.
> 
> stress-highalloc test in mmtests with non movable order 7 allocation doesn't
> show any notable difference in allocation success rate, but, it shows more
> compaction success rate.
> 
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 18.47 : 28.94
> 
> Cc: <stable@vger.kernel.org>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>

> ---
>  mm/compaction.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index b68736c..4954e19 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1173,7 +1173,7 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
>  			return COMPACT_PARTIAL;
>  
>  		/* Job done if allocation would set block type */
> -		if (cc->order >= pageblock_order && area->nr_free)
> +		if (order >= pageblock_order && area->nr_free)
>  			return COMPACT_PARTIAL;
>  	}
>  
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage
  2015-01-30 12:34 ` [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
  2015-01-30 13:47   ` Vlastimil Babka
@ 2015-01-31  7:49   ` Zhang Yanfei
  2015-01-31  8:31     ` Vlastimil Babka
  1 sibling, 1 reply; 19+ messages in thread
From: Zhang Yanfei @ 2015-01-31  7:49 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim

Hello,

At 2015/1/30 20:34, Joonsoo Kim wrote:
> From: Joonsoo <iamjoonsoo.kim@lge.com>
> 
> Currently, freepage isolation in one pageblock doesn't consider how many
> freepages we isolate. When I traced flow of compaction, compaction
> sometimes isolates more than 256 freepages to migrate just 32 pages.
> 
> In this patch, freepage isolation is stopped at the point that we
> have more isolated freepage than isolated page for migration. This
> results in slowing down free page scanner and make compaction success
> rate higher.
> 
> stress-highalloc test in mmtests with non movable order 7 allocation shows
> increase of compaction success rate.
> 
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 27.13 : 31.82
> 
> pfn where both scanners meets on compaction complete
> (separate test due to enormous tracepoint buffer)
> (zone_start=4096, zone_end=1048576)
> 586034 : 654378
> 
> In fact, I didn't fully understand why this patch results in such good
> result. There was a guess that not used freepages are released to pcp list
> and on next compaction trial we won't isolate them again so compaction
> success rate would decrease. To prevent this effect, I tested with adding
> pcp drain code on release_freepages(), but, it has no good effect.
> 
> Anyway, this patch reduces waste time to isolate unneeded freepages so
> seems reasonable.

Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>

IMHO, by making the free scanner move more slowly, the patch makes both
scanners meet further into the zone. Before this patch, if we isolated too
many free pages, then even after we released the unneeded ones later the
free scanner would already be past them and would move forward again next
time -- the free scanner just could not be moved back to grab the free pages
we released, no matter where they ended up, pcp or buddy.

> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/compaction.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 4954e19..782772d 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -490,6 +490,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>  
>  		/* If a page was split, advance to the end of it */
>  		if (isolated) {
> +			cc->nr_freepages += isolated;
> +			if (!strict &&
> +				cc->nr_migratepages <= cc->nr_freepages) {
> +				blockpfn += isolated;
> +				break;
> +			}
> +
>  			blockpfn += isolated - 1;
>  			cursor += isolated - 1;
>  			continue;
> @@ -899,7 +906,6 @@ static void isolate_freepages(struct compact_control *cc)
>  	unsigned long isolate_start_pfn; /* exact pfn we start at */
>  	unsigned long block_end_pfn;	/* end of current pageblock */
>  	unsigned long low_pfn;	     /* lowest pfn scanner is able to scan */
> -	int nr_freepages = cc->nr_freepages;
>  	struct list_head *freelist = &cc->freepages;
>  
>  	/*
> @@ -924,11 +930,11 @@ static void isolate_freepages(struct compact_control *cc)
>  	 * pages on cc->migratepages. We stop searching if the migrate
>  	 * and free page scanners meet or enough free pages are isolated.
>  	 */
> -	for (; block_start_pfn >= low_pfn && cc->nr_migratepages > nr_freepages;
> +	for (; block_start_pfn >= low_pfn &&
> +			cc->nr_migratepages > cc->nr_freepages;
>  				block_end_pfn = block_start_pfn,
>  				block_start_pfn -= pageblock_nr_pages,
>  				isolate_start_pfn = block_start_pfn) {
> -		unsigned long isolated;
>  
>  		/*
>  		 * This can iterate a massively long zone without finding any
> @@ -953,9 +959,8 @@ static void isolate_freepages(struct compact_control *cc)
>  			continue;
>  
>  		/* Found a block suitable for isolating free pages from. */
> -		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> +		isolate_freepages_block(cc, &isolate_start_pfn,
>  					block_end_pfn, freelist, false);
> -		nr_freepages += isolated;
>  
>  		/*
>  		 * Remember where the free scanner should restart next time,
> @@ -987,8 +992,6 @@ static void isolate_freepages(struct compact_control *cc)
>  	 */
>  	if (block_start_pfn < low_pfn)
>  		cc->free_pfn = cc->migrate_pfn;
> -
> -	cc->nr_freepages = nr_freepages;
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage
  2015-01-31  7:49   ` Zhang Yanfei
@ 2015-01-31  8:31     ` Vlastimil Babka
  2015-01-31 10:17       ` Zhang Yanfei
  0 siblings, 1 reply; 19+ messages in thread
From: Vlastimil Babka @ 2015-01-31  8:31 UTC (permalink / raw)
  To: Zhang Yanfei, Joonsoo Kim, Andrew Morton
  Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel,
	Joonsoo Kim

On 01/31/2015 08:49 AM, Zhang Yanfei wrote:
> Hello,
> 
> At 2015/1/30 20:34, Joonsoo Kim wrote:
>
> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
> 
> IMHO, the patch making the free scanner move slower makes both scanners
> meet further. Before this patch, if we isolate too many free pages and even 
> after we release the unneeded free pages later the free scanner still already
> be there and will be moved forward again next time -- the free scanner just
> cannot be moved back to grab the free pages we released before no matter where
> the free pages in, pcp or buddy. 

It can actually be moved back. If we are releasing free pages, it means the
current compaction run is terminating, and it will set
zone->compact_cached_free_pfn back to the position of the released free page
that was furthest back. The next compaction will start from the cached free
pfn.

It is however possible that another compaction runs in parallel and has
progressed further and overwrites the cached free pfn.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage
  2015-01-31  8:31     ` Vlastimil Babka
@ 2015-01-31 10:17       ` Zhang Yanfei
  0 siblings, 0 replies; 19+ messages in thread
From: Zhang Yanfei @ 2015-01-31 10:17 UTC (permalink / raw)
  To: Vlastimil Babka, Joonsoo Kim, Andrew Morton
  Cc: Mel Gorman, David Rientjes, Rik van Riel, linux-mm, linux-kernel,
	Joonsoo Kim

At 2015/1/31 16:31, Vlastimil Babka wrote:
> On 01/31/2015 08:49 AM, Zhang Yanfei wrote:
>> Hello,
>>
>> At 2015/1/30 20:34, Joonsoo Kim wrote:
>>
>> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
>>
>> IMHO, the patch making the free scanner move slower makes both scanners
>> meet further. Before this patch, if we isolate too many free pages and even 
>> after we release the unneeded free pages later the free scanner still already
>> be there and will be moved forward again next time -- the free scanner just
>> cannot be moved back to grab the free pages we released before no matter where
>> the free pages in, pcp or buddy. 
> 
> It can be actually moved back. If we are releasing free pages, it means the
> current compaction is terminating, and it will set zone->compact_cached_free_pfn
> back to the position of the released free page that was furthest back. The next
> compaction will start from the cached free pfn.

Yeah, you are right. I missed the release_freepages(). Thanks!

> 
> It is however possible that another compaction runs in parallel and has
> progressed further and overwrites the cached free pfn.
> 

Hmm, maybe.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part
  2015-01-30 12:34 ` [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Joonsoo Kim
  2015-01-30 14:27   ` Vlastimil Babka
@ 2015-01-31 12:38   ` Zhang Yanfei
  2015-02-02  7:03     ` Joonsoo Kim
  1 sibling, 1 reply; 19+ messages in thread
From: Zhang Yanfei @ 2015-01-31 12:38 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim

At 2015/1/30 20:34, Joonsoo Kim wrote:
> From: Joonsoo <iamjoonsoo.kim@lge.com>
> 
> This is preparation step to use page allocator's anti fragmentation logic
> in compaction. This patch just separates steal decision part from actual
> steal behaviour part so there is no functional change.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 32 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 8d52ab1..ef74750 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page,
>  	}
>  }
>  
> +static bool can_steal_freepages(unsigned int order,
> +				int start_mt, int fallback_mt)
> +{
> +	if (is_migrate_cma(fallback_mt))
> +		return false;
> +
> +	if (order >= pageblock_order)
> +		return true;
> +
> +	if (order >= pageblock_order / 2 ||
> +		start_mt == MIGRATE_RECLAIMABLE ||
> +		start_mt == MIGRATE_UNMOVABLE ||
> +		page_group_by_mobility_disabled)
> +		return true;
> +
> +	return false;
> +}

Some comments explaining in which cases we can or cannot steal freepages
from other migratetypes would be helpful here, IMHO. Actually, we could just
move some of the comments from try_to_steal_freepages() here.
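
For example (comment wording below is only a suggestion, not from the
patch):

static bool can_steal_freepages(unsigned int order,
				int start_mt, int fallback_mt)
{
	/* CMA pageblocks must never change ownership */
	if (is_migrate_cma(fallback_mt))
		return false;

	/* Can take ownership of whole pageblock(s) */
	if (order >= pageblock_order)
		return true;

	/*
	 * Steal for large enough orders, for allocation types that
	 * fragment the worst when spread around (reclaimable and
	 * unmovable), or when grouping by mobility is disabled anyway.
	 */
	if (order >= pageblock_order / 2 ||
	    start_mt == MIGRATE_RECLAIMABLE ||
	    start_mt == MIGRATE_UNMOVABLE ||
	    page_group_by_mobility_disabled)
		return true;

	return false;
}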

Thanks.

> +
>  /*
>   * When we are falling back to another migratetype during allocation, try to
>   * steal extra free pages from the same pageblocks to satisfy further
> @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page,
>   * as well.
>   */
>  static void try_to_steal_freepages(struct zone *zone, struct page *page,
> -				  int start_type, int fallback_type)
> +				  int start_type)
>  {
>  	int current_order = page_order(page);
> +	int pages;
>  
>  	/* Take ownership for orders >= pageblock_order */
>  	if (current_order >= pageblock_order) {
> @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page,
>  		return;
>  	}
>  
> -	if (current_order >= pageblock_order / 2 ||
> -	    start_type == MIGRATE_RECLAIMABLE ||
> -	    start_type == MIGRATE_UNMOVABLE ||
> -	    page_group_by_mobility_disabled) {
> -		int pages;
> +	pages = move_freepages_block(zone, page, start_type);
>  
> -		pages = move_freepages_block(zone, page, start_type);
> -
> -		/* Claim the whole block if over half of it is free */
> -		if (pages >= (1 << (pageblock_order-1)) ||
> -				page_group_by_mobility_disabled)
> -			set_pageblock_migratetype(page, start_type);
> -	}
> +	/* Claim the whole block if over half of it is free */
> +	if (pages >= (1 << (pageblock_order-1)) ||
> +			page_group_by_mobility_disabled)
> +		set_pageblock_migratetype(page, start_type);
>  }
>  
>  /* Remove an element from the buddy allocator from the fallback list */
> @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
>  	struct free_area *area;
>  	unsigned int current_order;
>  	struct page *page;
> +	bool can_steal;
>  
>  	/* Find the largest possible block of pages in the other list */
>  	for (current_order = MAX_ORDER-1;
> @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
>  					struct page, lru);
>  			area->nr_free--;
>  
> -			if (!is_migrate_cma(migratetype)) {
> +			can_steal = can_steal_freepages(current_order,
> +					start_migratetype, migratetype);
> +			if (can_steal) {
>  				try_to_steal_freepages(zone, page,
> -							start_migratetype,
> -							migratetype);
> +							start_migratetype);
>  			} else {
>  				/*
>  				 * When borrowing from MIGRATE_CMA, we need to
> @@ -1203,7 +1217,8 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
>  				 * itself, and we do not try to steal extra
>  				 * free pages.
>  				 */
> -				buddy_type = migratetype;
> +				if (is_migrate_cma(migratetype))
> +					buddy_type = migratetype;
>  			}
>  
>  			/* Remove the page from the freelists */
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition
  2015-01-30 12:34 ` [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
  2015-01-30 14:43   ` Vlastimil Babka
@ 2015-01-31 15:58   ` Zhang Yanfei
  2015-02-02  7:12     ` Joonsoo Kim
  1 sibling, 1 reply; 19+ messages in thread
From: Zhang Yanfei @ 2015-01-31 15:58 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Vlastimil Babka, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel, Joonsoo Kim

At 2015/1/30 20:34, Joonsoo Kim wrote:
> From: Joonsoo <iamjoonsoo.kim@lge.com>
> 
> Compaction has anti fragmentation algorithm. It is that freepage
> should be more than pageblock order to finish the compaction if we don't
> find any freepage in requested migratetype buddy list. This is for
> mitigating fragmentation, but, there is a lack of migratetype
> consideration and it is too excessive compared to page allocator's anti
> fragmentation algorithm.
> 
> Not considering migratetype would cause premature finish of compaction.
> For example, if allocation request is for unmovable migratetype,
> freepage with CMA migratetype doesn't help that allocation and
> compaction should not be stopped. But, current logic regards this
> situation as compaction is no longer needed, so finish the compaction.
> 
> Secondly, condition is too excessive compared to page allocator's logic.
> We can steal freepage from other migratetype and change pageblock
> migratetype on more relaxed conditions in page allocator. This is designed
> to prevent fragmentation and we can use it here. Imposing hard constraint
> only to the compaction doesn't help much in this case since page allocator
> would cause fragmentation again.

Changing both of these behaviours in compaction may change high-order
allocation behaviour in the buddy allocator slowpath, so just as Vlastimil
suggested, some data from the allocator would be necessary and helpful,
IMHO.

Thanks. 

> 
> To solve these problems, this patch borrows anti fragmentation logic from
> page allocator. It will reduce premature compaction finish in some cases
> and reduce excessive compaction work.
> 
> stress-highalloc test in mmtests with non movable order 7 allocation shows
> considerable increase of compaction success rate.
> 
> Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> 31.82 : 42.20
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  include/linux/mmzone.h |  3 +++
>  mm/compaction.c        | 30 ++++++++++++++++++++++++++++--
>  mm/internal.h          |  1 +
>  mm/page_alloc.c        |  5 ++---
>  4 files changed, 34 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index f279d9c..a2906bc 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -63,6 +63,9 @@ enum {
>  	MIGRATE_TYPES
>  };
>  
> +#define FALLBACK_MIGRATETYPES (4)
> +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES];
> +
>  #ifdef CONFIG_CMA
>  #  define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
>  #else
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 782772d..0460e4b 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
>  	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
>  }
>  
> +static bool can_steal_fallbacks(struct free_area *area,
> +			unsigned int order, int migratetype)
> +{
> +	int i;
> +	int fallback_mt;
> +
> +	if (area->nr_free == 0)
> +		return false;
> +
> +	for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
> +		fallback_mt = fallbacks[migratetype][i];
> +		if (fallback_mt == MIGRATE_RESERVE)
> +			break;
> +
> +		if (list_empty(&area->free_list[fallback_mt]))
> +			continue;
> +
> +		if (can_steal_freepages(order, migratetype, fallback_mt))
> +			return true;
> +	}
> +	return false;
> +}
> +
>  static int __compact_finished(struct zone *zone, struct compact_control *cc,
>  			    const int migratetype)
>  {
> @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
>  		if (!list_empty(&area->free_list[migratetype]))
>  			return COMPACT_PARTIAL;
>  
> -		/* Job done if allocation would set block type */
> -		if (order >= pageblock_order && area->nr_free)
> +		/*
> +		 * Job done if allocation would steal freepages from
> +		 * other migratetype buddy lists.
> +		 */
> +		if (can_steal_fallbacks(area, order, migratetype))
>  			return COMPACT_PARTIAL;
>  	}
>  
> diff --git a/mm/internal.h b/mm/internal.h
> index c4d6c9b..0a89a14 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -201,6 +201,7 @@ unsigned long
>  isolate_migratepages_range(struct compact_control *cc,
>  			   unsigned long low_pfn, unsigned long end_pfn);
>  
> +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt);
>  #endif
>  
>  /*
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ef74750..4c3538b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1026,7 +1026,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
>   * This array describes the order lists are fallen back to when
>   * the free lists for the desirable migrate type are depleted
>   */
> -static int fallbacks[MIGRATE_TYPES][4] = {
> +int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES] = {
>  	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,     MIGRATE_RESERVE },
>  	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,     MIGRATE_RESERVE },
>  #ifdef CONFIG_CMA
> @@ -1122,8 +1122,7 @@ static void change_pageblock_range(struct page *pageblock_page,
>  	}
>  }
>  
> -static bool can_steal_freepages(unsigned int order,
> -				int start_mt, int fallback_mt)
> +bool can_steal_freepages(unsigned int order, int start_mt, int fallback_mt)
>  {
>  	if (is_migrate_cma(fallback_mt))
>  		return false;
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part
  2015-01-30 14:27   ` Vlastimil Babka
@ 2015-02-02  7:02     ` Joonsoo Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-02-02  7:02 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel

On Fri, Jan 30, 2015 at 03:27:50PM +0100, Vlastimil Babka wrote:
> On 01/30/2015 01:34 PM, Joonsoo Kim wrote:
> > From: Joonsoo <iamjoonsoo.kim@lge.com>
> > 
> > This is a preparation step for using the page allocator's
> > anti-fragmentation logic in compaction. This patch just separates the
> > steal decision part from the actual steal behaviour part, so there is
> > no functional change.
> > 
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++-----------------
> >  1 file changed, 32 insertions(+), 17 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 8d52ab1..ef74750 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page,
> >  	}
> >  }
> >  
> > +static bool can_steal_freepages(unsigned int order,
> > +				int start_mt, int fallback_mt)
> > +{
> > +	if (is_migrate_cma(fallback_mt))
> > +		return false;
> > +
> > +	if (order >= pageblock_order)
> > +		return true;
> > +
> > +	if (order >= pageblock_order / 2 ||
> > +		start_mt == MIGRATE_RECLAIMABLE ||
> > +		start_mt == MIGRATE_UNMOVABLE ||
> > +		page_group_by_mobility_disabled)
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> >  /*
> >   * When we are falling back to another migratetype during allocation, try to
> >   * steal extra free pages from the same pageblocks to satisfy further
> > @@ -1138,9 +1156,10 @@ static void change_pageblock_range(struct page *pageblock_page,
> >   * as well.
> >   */
> >  static void try_to_steal_freepages(struct zone *zone, struct page *page,
> > -				  int start_type, int fallback_type)
> > +				  int start_type)
> 
> It's actually not 'try_to_' anymore, is it? But could be, see below.
> 
> >  {
> >  	int current_order = page_order(page);
> > +	int pages;
> >  
> >  	/* Take ownership for orders >= pageblock_order */
> >  	if (current_order >= pageblock_order) {
> > @@ -1148,19 +1167,12 @@ static void try_to_steal_freepages(struct zone *zone, struct page *page,
> >  		return;
> >  	}
> >  
> > -	if (current_order >= pageblock_order / 2 ||
> > -	    start_type == MIGRATE_RECLAIMABLE ||
> > -	    start_type == MIGRATE_UNMOVABLE ||
> > -	    page_group_by_mobility_disabled) {
> > -		int pages;
> > +	pages = move_freepages_block(zone, page, start_type);
> >  
> > -		pages = move_freepages_block(zone, page, start_type);
> > -
> > -		/* Claim the whole block if over half of it is free */
> > -		if (pages >= (1 << (pageblock_order-1)) ||
> > -				page_group_by_mobility_disabled)
> > -			set_pageblock_migratetype(page, start_type);
> > -	}
> > +	/* Claim the whole block if over half of it is free */
> > +	if (pages >= (1 << (pageblock_order-1)) ||
> > +			page_group_by_mobility_disabled)
> > +		set_pageblock_migratetype(page, start_type);
> >  }
> >  
> >  /* Remove an element from the buddy allocator from the fallback list */
> > @@ -1170,6 +1182,7 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> >  	struct free_area *area;
> >  	unsigned int current_order;
> >  	struct page *page;
> > +	bool can_steal;
> >  
> >  	/* Find the largest possible block of pages in the other list */
> >  	for (current_order = MAX_ORDER-1;
> > @@ -1192,10 +1205,11 @@ __rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype)
> >  					struct page, lru);
> >  			area->nr_free--;
> >  
> > -			if (!is_migrate_cma(migratetype)) {
> > +			can_steal = can_steal_freepages(current_order,
> > +					start_migratetype, migratetype);
> > +			if (can_steal) {
> 
> can_steal is only used once, why not do if (can_steal_freepages()) directly?
> 
> Or, call can_steal_freepages() from try_to_steal_freepages() and make
> try_to_steal_freepages() return its result. Then here it simplifies to:
> 
> if (!try_to_steal_freepages(...) && is_migrate_cma(...))
> 	buddy_type = migratetype;

You're right, the code you suggested looks better.

Your comments on 3/4 and 4/4 made me reconsider this code factorization,
and I found a better solution. I will send it soon.
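
For reference, my reading of your suggestion is roughly the following (an
untested sketch on top of the code quoted above, assuming the buddy_type
handling you mention):

static bool try_to_steal_freepages(struct zone *zone, struct page *page,
				   int start_type, int fallback_type)
{
	int current_order = page_order(page);
	int pages;

	/* The decision part now lives in can_steal_freepages() */
	if (!can_steal_freepages(current_order, start_type, fallback_type))
		return false;

	/* Take ownership for orders >= pageblock_order */
	if (current_order >= pageblock_order) {
		change_pageblock_range(page, current_order, start_type);
		return true;
	}

	pages = move_freepages_block(zone, page, start_type);

	/* Claim the whole block if over half of it is free */
	if (pages >= (1 << (pageblock_order - 1)) ||
			page_group_by_mobility_disabled)
		set_pageblock_migratetype(page, start_type);

	return true;
}

and then __rmqueue_fallback() keeps the buddy type for CMA only when
stealing was refused:

	if (!try_to_steal_freepages(zone, page, start_migratetype,
					migratetype) &&
			is_migrate_cma(migratetype))
		buddy_type = migratetype;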

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part
  2015-01-31 12:38   ` Zhang Yanfei
@ 2015-02-02  7:03     ` Joonsoo Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-02-02  7:03 UTC (permalink / raw)
  To: Zhang Yanfei
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, David Rientjes,
	Rik van Riel, linux-mm, linux-kernel

On Sat, Jan 31, 2015 at 08:38:10PM +0800, Zhang Yanfei wrote:
> At 2015/1/30 20:34, Joonsoo Kim wrote:
> > From: Joonsoo <iamjoonsoo.kim@lge.com>
> > 
> > This is a preparation step for using the page allocator's
> > anti-fragmentation logic in compaction. This patch just separates the
> > steal decision part from the actual steal behaviour part, so there is
> > no functional change.
> > 
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/page_alloc.c | 49 ++++++++++++++++++++++++++++++++-----------------
> >  1 file changed, 32 insertions(+), 17 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 8d52ab1..ef74750 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1122,6 +1122,24 @@ static void change_pageblock_range(struct page *pageblock_page,
> >  	}
> >  }
> >  
> > +static bool can_steal_freepages(unsigned int order,
> > +				int start_mt, int fallback_mt)
> > +{
> > +	if (is_migrate_cma(fallback_mt))
> > +		return false;
> > +
> > +	if (order >= pageblock_order)
> > +		return true;
> > +
> > +	if (order >= pageblock_order / 2 ||
> > +		start_mt == MIGRATE_RECLAIMABLE ||
> > +		start_mt == MIGRATE_UNMOVABLE ||
> > +		page_group_by_mobility_disabled)
> > +		return true;
> > +
> > +	return false;
> > +}
> 
> So some comments explaining in which cases we can or cannot steal freepages
> from another migratetype are necessary, IMHO. Actually, we can just
> move some of the comments from try_to_steal_freepages() here.

Yes, moving some comments looks sufficient to me. I will fix it.
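
Something along these lines, maybe (just a sketch; the comment text below
is paraphrased from the existing comment above try_to_steal_freepages(),
not copied verbatim):

static bool can_steal_freepages(unsigned int order,
				int start_mt, int fallback_mt)
{
	/* CMA pageblocks must keep their migratetype, never steal from them */
	if (is_migrate_cma(fallback_mt))
		return false;

	/* An allocation of pageblock order or more takes the whole block */
	if (order >= pageblock_order)
		return true;

	/*
	 * Steal when the buddy page is relatively large, since more free
	 * pages are then likely to follow in the same pageblock. For
	 * unmovable and reclaimable requests, steal regardless of order,
	 * because fragmentation caused by those allocations polluting
	 * movable pageblocks is worse than movable allocations stealing
	 * from unmovable/reclaimable pageblocks.
	 */
	if (order >= pageblock_order / 2 ||
		start_mt == MIGRATE_RECLAIMABLE ||
		start_mt == MIGRATE_UNMOVABLE ||
		page_group_by_mobility_disabled)
		return true;

	return false;
}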

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition
  2015-01-30 14:43   ` Vlastimil Babka
@ 2015-02-02  7:11     ` Joonsoo Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-02-02  7:11 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, David Rientjes, Rik van Riel,
	linux-mm, linux-kernel

On Fri, Jan 30, 2015 at 03:43:27PM +0100, Vlastimil Babka wrote:
> On 01/30/2015 01:34 PM, Joonsoo Kim wrote:
> > From: Joonsoo <iamjoonsoo.kim@lge.com>
> > 
> > Compaction has an anti-fragmentation algorithm: if no freepage is found
> > in the requested migratetype's buddy list, compaction only finishes once
> > a freepage of at least pageblock order exists. This is meant to mitigate
> > fragmentation, but it does not take migratetype into account and it is
> > stricter than necessary compared to the page allocator's
> > anti-fragmentation algorithm.
> > 
> > Not considering migratetype can cause compaction to finish prematurely.
> > For example, if the allocation request is for the unmovable migratetype,
> > a freepage with CMA migratetype doesn't help that allocation, so
> > compaction should not be stopped. But the current logic treats this
> > situation as if compaction is no longer needed and finishes it.
> > 
> > Secondly, the condition is stricter than the page allocator's logic.
> > The page allocator can steal freepages from another migratetype and
> > change a pageblock's migratetype under more relaxed conditions. That
> > logic is designed to prevent fragmentation, so we can use it here too.
> > Imposing a hard constraint only on compaction doesn't help much in this
> > case, since the page allocator would cause fragmentation again anyway.
> > 
> > To solve these problems, this patch borrows the anti-fragmentation logic
> > from the page allocator. It reduces premature compaction finishes in
> > some cases and reduces excessive compaction work.
> > 
> > The stress-highalloc test in mmtests with non-movable order-7 allocation
> > shows a considerable increase in compaction success rate.
> > 
> > Compaction success rate (Compaction success * 100 / Compaction stalls, %)
> > 31.82 : 42.20
> > 
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> I have some worries about longer-term fragmentation; running
> stress-highalloc several times without restarts could be helpful.

Okay. I will do it.

> 
> > ---
> >  include/linux/mmzone.h |  3 +++
> >  mm/compaction.c        | 30 ++++++++++++++++++++++++++++--
> >  mm/internal.h          |  1 +
> >  mm/page_alloc.c        |  5 ++---
> >  4 files changed, 34 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index f279d9c..a2906bc 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -63,6 +63,9 @@ enum {
> >  	MIGRATE_TYPES
> >  };
> >  
> > +#define FALLBACK_MIGRATETYPES (4)
> > +extern int fallbacks[MIGRATE_TYPES][FALLBACK_MIGRATETYPES];
> > +
> >  #ifdef CONFIG_CMA
> >  #  define is_migrate_cma(migratetype) unlikely((migratetype) == MIGRATE_CMA)
> >  #else
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index 782772d..0460e4b 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -1125,6 +1125,29 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
> >  	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
> >  }
> >  
> > +static bool can_steal_fallbacks(struct free_area *area,
> > +			unsigned int order, int migratetype)
> 
> Could you move this to page_alloc.c? Then you don't have to export the
> fallbacks array.

Okay.
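
Something like the following, then (a rough sketch; with the helper in
page_alloc.c, fallbacks[] and the FALLBACK_MIGRATETYPES define can stay
local to that file and mm/internal.h only needs the prototype):

/* mm/internal.h */
bool can_steal_fallbacks(struct free_area *area, unsigned int order,
			int migratetype);

/* mm/page_alloc.c */
bool can_steal_fallbacks(struct free_area *area, unsigned int order,
			int migratetype)
{
	int i;
	int fallback_mt;

	if (area->nr_free == 0)
		return false;

	for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
		fallback_mt = fallbacks[migratetype][i];
		if (fallback_mt == MIGRATE_RESERVE)
			break;

		if (list_empty(&area->free_list[fallback_mt]))
			continue;

		if (can_steal_freepages(order, migratetype, fallback_mt))
			return true;
	}

	return false;
}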

> 
> > +{
> > +	int i;
> > +	int fallback_mt;
> > +
> > +	if (area->nr_free == 0)
> > +		return false;
> > +
> > +	for (i = 0; i < FALLBACK_MIGRATETYPES; i++) {
> > +		fallback_mt = fallbacks[migratetype][i];
> > +		if (fallback_mt == MIGRATE_RESERVE)
> > +			break;
> > +
> > +		if (list_empty(&area->free_list[fallback_mt]))
> > +			continue;
> > +
> > +		if (can_steal_freepages(order, migratetype, fallback_mt))
> > +			return true;
> > +	}
> > +	return false;
> > +}
> > +
> >  static int __compact_finished(struct zone *zone, struct compact_control *cc,
> >  			    const int migratetype)
> >  {
> > @@ -1175,8 +1198,11 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
> >  		if (!list_empty(&area->free_list[migratetype]))
> >  			return COMPACT_PARTIAL;
> >  
> > -		/* Job done if allocation would set block type */
> > -		if (order >= pageblock_order && area->nr_free)
> > +		/*
> > +		 * Job done if allocation would steal freepages from
> > +		 * other migratetype buddy lists.
> > +		 */
> > +		if (can_steal_fallbacks(area, order, migratetype))
> >  			return COMPACT_PARTIAL;
> 
> Seems somewhat wasteful in the scenario where we want to satisfy a movable
> allocation with async compaction. There we don't compact within
> unmovable/reclaimable pageblocks, and yet we keep checking them for
> fallbacks. A price to pay for having generic code?

I think there could be a lucky case where a high-order freepage in an
unmovable/reclaimable pageblock is created by concurrent freeing.
In that case, finishing compaction would be a good thing. And this logic
only causes marginal overhead, so the generic code seems justifiable to me.
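
(To spell out the alternative for the record: a hypothetical check like the
one below in the __compact_finished() order loop would skip the fallback
scan for async movable compaction, but given the lucky case above and the
marginal cost, I don't think the special case is worth it.)

		/*
		 * Hypothetical, not part of this patch: async compaction for
		 * a movable allocation doesn't compact unmovable/reclaimable
		 * pageblocks, so the fallback scan could be skipped for it.
		 */
		if (cc->mode == MIGRATE_ASYNC && migratetype == MIGRATE_MOVABLE)
			continue;

		if (can_steal_fallbacks(area, order, migratetype))
			return COMPACT_PARTIAL;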

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/4] mm/compaction: enhance compaction finish condition
  2015-01-31 15:58   ` Zhang Yanfei
@ 2015-02-02  7:12     ` Joonsoo Kim
  0 siblings, 0 replies; 19+ messages in thread
From: Joonsoo Kim @ 2015-02-02  7:12 UTC (permalink / raw)
  To: Zhang Yanfei
  Cc: Andrew Morton, Vlastimil Babka, Mel Gorman, David Rientjes,
	Rik van Riel, linux-mm, linux-kernel

On Sat, Jan 31, 2015 at 11:58:03PM +0800, Zhang Yanfei wrote:
> At 2015/1/30 20:34, Joonsoo Kim wrote:
> > From: Joonsoo <iamjoonsoo.kim@lge.com>
> > 
> > Compaction has an anti-fragmentation algorithm: if no freepage is found
> > in the requested migratetype's buddy list, compaction only finishes once
> > a freepage of at least pageblock order exists. This is meant to mitigate
> > fragmentation, but it does not take migratetype into account and it is
> > stricter than necessary compared to the page allocator's
> > anti-fragmentation algorithm.
> > 
> > Not considering migratetype can cause compaction to finish prematurely.
> > For example, if the allocation request is for the unmovable migratetype,
> > a freepage with CMA migratetype doesn't help that allocation, so
> > compaction should not be stopped. But the current logic treats this
> > situation as if compaction is no longer needed and finishes it.
> > 
> > Secondly, the condition is stricter than the page allocator's logic.
> > The page allocator can steal freepages from another migratetype and
> > change a pageblock's migratetype under more relaxed conditions. That
> > logic is designed to prevent fragmentation, so we can use it here too.
> > Imposing a hard constraint only on compaction doesn't help much in this
> > case, since the page allocator would cause fragmentation again anyway.
> 
> Changing both of these behaviours in compaction may change high-order
> allocation behaviour in the buddy allocator slowpath, so, as Vlastimil
> suggested, some data from the allocator would be necessary and helpful, IMHO.

As Vlastimil said, the fragmentation effect should be checked. I will do
it and report the result in the next version.

Thanks.


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-02-02  7:10 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-30 12:34 [PATCH v2 0/4] enhance compaction success rate Joonsoo Kim
2015-01-30 12:34 ` [PATCH v2 1/4] mm/compaction: fix wrong order check in compact_finished() Joonsoo Kim
2015-01-30 13:27   ` Vlastimil Babka
2015-01-31  7:38   ` Zhang Yanfei
2015-01-30 12:34 ` [PATCH v2 2/4] mm/compaction: stop the isolation when we isolate enough freepage Joonsoo Kim
2015-01-30 13:47   ` Vlastimil Babka
2015-01-31  7:49   ` Zhang Yanfei
2015-01-31  8:31     ` Vlastimil Babka
2015-01-31 10:17       ` Zhang Yanfei
2015-01-30 12:34 ` [PATCH v2 3/4] mm/page_alloc: separate steal decision from steal behaviour part Joonsoo Kim
2015-01-30 14:27   ` Vlastimil Babka
2015-02-02  7:02     ` Joonsoo Kim
2015-01-31 12:38   ` Zhang Yanfei
2015-02-02  7:03     ` Joonsoo Kim
2015-01-30 12:34 ` [PATCH v2 4/4] mm/compaction: enhance compaction finish condition Joonsoo Kim
2015-01-30 14:43   ` Vlastimil Babka
2015-02-02  7:11     ` Joonsoo Kim
2015-01-31 15:58   ` Zhang Yanfei
2015-02-02  7:12     ` Joonsoo Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).