* [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
@ 2015-12-14  5:02 ` Joonsoo Kim
  0 siblings, 0 replies; 24+ messages in thread
From: Joonsoo Kim @ 2015-12-14  5:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, linux-kernel, linux-mm, Joonsoo Kim

free_pfn and compact_cached_free_pfn are the pointers that remember the
restart position of the freepage scanner. When they are reset or invalid,
we set them to zone_end_pfn because the freepage scanner works in the
reverse direction. But, because the zone range is defined as
[zone_start_pfn, zone_end_pfn), zone_end_pfn is invalid to access.
Therefore, we should not store it in free_pfn and compact_cached_free_pfn;
we need to store zone_end_pfn - 1 instead. There is one more thing to
consider. The freepage scanner scans backwards in pageblock units. If
free_pfn and compact_cached_free_pfn are set to the middle of a pageblock,
the scanner regards that situation as having already scanned the front
part of that pageblock, so we lose the opportunity to scan it. To fix this
up, this patch does a round_down() to guarantee that the reset position is
pageblock aligned.
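
As a minimal illustration (a user-space sketch with assumed values, not part
of the patch) of why round_down(zone_end_pfn - 1, ...) is used instead of just
masking zone_end_pfn, consider a zone whose end happens to be pageblock
aligned:

#include <stdio.h>

#define PAGEBLOCK_NR_PAGES	512UL			/* assumed: 2MB pageblock with 4KB pages */
#define round_down(x, y)	((x) & ~((y) - 1))	/* matches the kernel macro for power-of-two y */

int main(void)
{
	/* The zone is [zone_start_pfn, zone_end_pfn), so zone_end_pfn itself is not a valid pfn. */
	unsigned long zone_end_pfn = 0x40000;		/* pageblock-aligned zone end */

	unsigned long old_pfn = zone_end_pfn & ~(PAGEBLOCK_NR_PAGES - 1);	  /* 0x40000: outside the zone */
	unsigned long new_pfn = round_down(zone_end_pfn - 1, PAGEBLOCK_NR_PAGES); /* 0x3fe00: last pageblock */

	printf("old reset pfn %#lx, new reset pfn %#lx\n", old_pfn, new_pfn);
	return 0;
}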

Note that, thanks to the current pageblock_pfn_to_page() implementation,
no actual access to zone_end_pfn has happened so far. But the following
patch will change pageblock_pfn_to_page(), so this patch is needed from
now on.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/compaction.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 585de54..56fa321 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
 {
 	zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
 	zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
-	zone->compact_cached_free_pfn = zone_end_pfn(zone);
+	zone->compact_cached_free_pfn =
+			round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
 }
 
 /*
@@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
 	 */
 	cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
 	cc->free_pfn = zone->compact_cached_free_pfn;
-	if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
-		cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
+	if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
+		cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
 		zone->compact_cached_free_pfn = cc->free_pfn;
 	}
-	if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
+	if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
 		cc->migrate_pfn = start_pfn;
 		zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
 		zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  2015-12-14  5:02 ` Joonsoo Kim
@ 2015-12-14  5:02   ` Joonsoo Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Joonsoo Kim @ 2015-12-14  5:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, linux-kernel, linux-mm, Joonsoo Kim

There is a performance drop report for hugepage allocation where half of
the cpu time is spent in pageblock_pfn_to_page() during compaction [1].
In that workload, compaction is triggered to make hugepages, but most
pageblocks are unavailable for compaction due to their pageblock type and
skip bit, so compaction usually fails. The most costly operation in this
case is finding a valid pageblock while scanning the whole zone range. To
check whether a pageblock is valid to compact, a valid pfn within the
pageblock is required, and we can obtain it by calling
pageblock_pfn_to_page(). This function checks whether the pageblock is in
a single zone and returns a valid pfn if possible. The problem is that we
need to do this check every time before scanning a pageblock, even if we
re-visit it, and this turns out to be very expensive in this workload.
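
For context, the migration scanner's per-pageblock loop looks roughly like
the following (heavily simplified, not the literal mm/compaction.c code),
which is where this repeated check sits:

for (; block_end_pfn <= cc->free_pfn;
		low_pfn = block_end_pfn,
		block_start_pfn = block_end_pfn,
		block_end_pfn += pageblock_nr_pages) {

	/* re-done on every visit, even for pageblocks scanned before */
	page = pageblock_pfn_to_page(block_start_pfn, block_end_pfn, zone);
	if (!page)
		continue;

	/* ... pageblock type / skip-bit checks, then the actual isolation ... */
}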

Although we have no way to skip this pageblock check on a system where
holes exist at arbitrary positions, we can cache whether the zone is
contiguous and just do pfn_to_page() on a system where no hole exists.
This optimization considerably speeds up the above workload.

Before vs After
Max: 1096 MB/s vs 1325 MB/s
Min: 635 MB/s vs 1015 MB/s
Avg: 899 MB/s vs 1194 MB/s

Avg is improved by roughly 30% [2].

To avoid disturbing systems where compaction isn't triggered, the check
is done at the first compaction invocation.

[1]: http://www.spinics.net/lists/linux-mm/msg97378.html
[2]: https://lkml.org/lkml/2015/12/9/23

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/mmzone.h |  1 +
 mm/compaction.c        | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 68cc063..cd3736e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -521,6 +521,7 @@ struct zone {
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
 	/* Set to true when the PG_migrate_skip bits should be cleared */
 	bool			compact_blockskip_flush;
+	bool			contiguous;
 #endif
 
 	ZONE_PADDING(_pad3_)
diff --git a/mm/compaction.c b/mm/compaction.c
index 56fa321..ce60b38 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -88,7 +88,7 @@ static inline bool migrate_async_suitable(int migratetype)
  * the first and last page of a pageblock and avoid checking each individual
  * page in a pageblock.
  */
-static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
+static struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 				unsigned long end_pfn, struct zone *zone)
 {
 	struct page *start_page;
@@ -114,6 +114,51 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 	return start_page;
 }
 
+static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
+				unsigned long end_pfn, struct zone *zone)
+{
+	if (zone->contiguous)
+		return pfn_to_page(start_pfn);
+
+	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
+}
+
+static void check_zone_contiguous(struct zone *zone)
+{
+	unsigned long block_start_pfn = zone->zone_start_pfn;
+	unsigned long block_end_pfn;
+	unsigned long pfn;
+
+	/* Already initialized if cached pfn is non-zero */
+	if (zone->compact_cached_migrate_pfn[0] ||
+		zone->compact_cached_free_pfn)
+		return;
+
+	/* Mark that checking is in progress */
+	zone->compact_cached_free_pfn = ULONG_MAX;
+
+	block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
+	for (; block_start_pfn < zone_end_pfn(zone);
+		block_start_pfn = block_end_pfn,
+		block_end_pfn += pageblock_nr_pages) {
+
+		block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
+
+		if (!__pageblock_pfn_to_page(block_start_pfn,
+					block_end_pfn, zone))
+			return;
+
+		/* Check validity of pfn within pageblock */
+		for (pfn = block_start_pfn; pfn < block_end_pfn; pfn++) {
+			if (!pfn_valid_within(pfn))
+				return;
+		}
+	}
+
+	/* We confirm that there is no hole */
+	zone->contiguous = true;
+}
+
 #ifdef CONFIG_COMPACTION
 
 /* Do not skip compaction more than 64 times */
@@ -1357,6 +1402,8 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
 		;
 	}
 
+	check_zone_contiguous(zone);
+
 	/*
 	 * Clear pageblock skip if there were failures recently and compaction
 	 * is about to be retried after being deferred. kswapd does not do
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
  2015-12-14  5:02 ` Joonsoo Kim
@ 2015-12-14 10:07   ` Vlastimil Babka
  -1 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2015-12-14 10:07 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Aaron Lu, Mel Gorman, Rik van Riel, David Rientjes, linux-kernel,
	linux-mm, Joonsoo Kim

On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
> free_pfn and compact_cached_free_pfn are the pointers that remember the
> restart position of the freepage scanner. When they are reset or invalid,
> we set them to zone_end_pfn because the freepage scanner works in the
> reverse direction. But, because the zone range is defined as
> [zone_start_pfn, zone_end_pfn), zone_end_pfn is invalid to access.
> Therefore, we should not store it in free_pfn and compact_cached_free_pfn;
> we need to store zone_end_pfn - 1 instead. There is one more thing to
> consider. The freepage scanner scans backwards in pageblock units. If
> free_pfn and compact_cached_free_pfn are set to the middle of a pageblock,
> the scanner regards that situation as having already scanned the front
> part of that pageblock, so we lose the opportunity to scan it. To fix this
> up, this patch does a round_down() to guarantee that the reset position is
> pageblock aligned.
>
> Note that, thanks to the current pageblock_pfn_to_page() implementation,
> no actual access to zone_end_pfn has happened so far. But the following
> patch will change pageblock_pfn_to_page(), so this patch is needed from
> now on.
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Note that until now in compaction we've used basically an open-coded 
round_down(), and ALIGN() for rounding up. You introduce a first use of 
round_down(), and it would be nice to standardize on round_down() and 
round_up() everywhere. I think it's more obvious than open-coding and 
ALIGN() (which doesn't tell the reader if it's aligning up or down). 
Hopefully they really do the same thing and there are no caveats...
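
For reference, for a power-of-two alignment such as pageblock_nr_pages the
helpers reduce to roughly the following (a sketch of the
include/linux/kernel.h semantics, not the literal macro text):

	round_down(x, y)  ->  (x) & ~((y) - 1)              /* what compaction open-codes today */
	round_up(x, y)    ->  (((x) - 1) | ((y) - 1)) + 1   /* same result as ALIGN() for 2^n y */
	ALIGN(x, y)       ->  ((x) + (y) - 1) & ~((y) - 1)  /* rounds up */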

> ---
>   mm/compaction.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 585de54..56fa321 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
>   {
>   	zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
>   	zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
> -	zone->compact_cached_free_pfn = zone_end_pfn(zone);
> +	zone->compact_cached_free_pfn =
> +			round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
>   }
>
>   /*
> @@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
>   	 */
>   	cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
>   	cc->free_pfn = zone->compact_cached_free_pfn;
> -	if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
> -		cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
> +	if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
> +		cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
>   		zone->compact_cached_free_pfn = cc->free_pfn;
>   	}
> -	if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
> +	if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
>   		cc->migrate_pfn = start_pfn;
>   		zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
>   		zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  2015-12-14  5:02   ` Joonsoo Kim
@ 2015-12-14 10:29     ` Vlastimil Babka
  -1 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2015-12-14 10:29 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton
  Cc: Aaron Lu, Mel Gorman, Rik van Riel, David Rientjes, linux-kernel,
	linux-mm, Joonsoo Kim

On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
> There is a performance drop report for hugepage allocation where half of
> the cpu time is spent in pageblock_pfn_to_page() during compaction [1].
> In that workload, compaction is triggered to make hugepages, but most
> pageblocks are unavailable for compaction due to their pageblock type and
> skip bit, so compaction usually fails. The most costly operation in this
> case is finding a valid pageblock while scanning the whole zone range. To
> check whether a pageblock is valid to compact, a valid pfn within the
> pageblock is required, and we can obtain it by calling
> pageblock_pfn_to_page(). This function checks whether the pageblock is in
> a single zone and returns a valid pfn if possible. The problem is that we
> need to do this check every time before scanning a pageblock, even if we
> re-visit it, and this turns out to be very expensive in this workload.

Hm I wonder if this is safe wrt memory hotplug? Shouldn't there be a 
rechecking plugged into the appropriate hotplug add/remove callbacks? 
Which would make the whole thing generic too, zone->contiguous 
information doesn't have to be limited to compaction. And it would 
remove the rather ugly part where cached pfn info is used as an 
indication of zone->contiguous being already set...
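
One possible shape of such a hook, purely as a sketch (the callback name and
the idea of simply clearing the flag are hypothetical here, not something this
series implements), would be a memory notifier that invalidates the cached
state so a later compaction run rechecks it:

/* hypothetical sketch only -- not part of this series */
static int zone_contiguous_memory_callback(struct notifier_block *self,
					   unsigned long action, void *arg)
{
	struct memory_notify *mn = arg;
	struct zone *zone = page_zone(pfn_to_page(mn->start_pfn));

	if (action == MEM_ONLINE || action == MEM_OFFLINE)
		zone->contiguous = false;	/* recheck lazily on the next compaction run */

	return NOTIFY_OK;
}

static struct notifier_block zone_contiguous_nb = {
	.notifier_call = zone_contiguous_memory_callback,
};

/* registered once at init time, e.g. register_memory_notifier(&zone_contiguous_nb) */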

> Although we have no way to skip this pageblock check on a system where
> holes exist at arbitrary positions, we can cache whether the zone is
> contiguous and just do pfn_to_page() on a system where no hole exists.
> This optimization considerably speeds up the above workload.
>
> Before vs After
> Max: 1096 MB/s vs 1325 MB/s
> Min: 635 MB/s vs 1015 MB/s
> Avg: 899 MB/s vs 1194 MB/s
>
> Avg is improved by roughly 30% [2].

Unless I'm mistaken, these results also include my RFC series (Aaron can 
you clarify?). These patches should better be tested standalone on top 
of base, as being simpler they will probably be included sooner (the RFC 
series needs reviews at the very least :) - although the memory hotplug 
concerns might make the "sooner" here relative too.

Anyway it's interesting that this patch improved "Min", and variance in 
general (on top of my RFC) so much. I would expect the overhead of 
pageblock_pfn_to_page() to be quite stable, hmm.

> To avoid disturbing systems where compaction isn't triggered, the check
> is done at the first compaction invocation.
>
> [1]: http://www.spinics.net/lists/linux-mm/msg97378.html
> [2]: https://lkml.org/lkml/2015/12/9/23
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>   include/linux/mmzone.h |  1 +
>   mm/compaction.c        | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
>   2 files changed, 49 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 68cc063..cd3736e 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -521,6 +521,7 @@ struct zone {
>   #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>   	/* Set to true when the PG_migrate_skip bits should be cleared */
>   	bool			compact_blockskip_flush;
> +	bool			contiguous;
>   #endif
>
>   	ZONE_PADDING(_pad3_)
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 56fa321..ce60b38 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -88,7 +88,7 @@ static inline bool migrate_async_suitable(int migratetype)
>    * the first and last page of a pageblock and avoid checking each individual
>    * page in a pageblock.
>    */
> -static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
> +static struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
>   				unsigned long end_pfn, struct zone *zone)
>   {
>   	struct page *start_page;
> @@ -114,6 +114,51 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>   	return start_page;
>   }
>
> +static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
> +				unsigned long end_pfn, struct zone *zone)
> +{
> +	if (zone->contiguous)
> +		return pfn_to_page(start_pfn);
> +
> +	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
> +}
> +
> +static void check_zone_contiguous(struct zone *zone)
> +{
> +	unsigned long block_start_pfn = zone->zone_start_pfn;
> +	unsigned long block_end_pfn;
> +	unsigned long pfn;
> +
> +	/* Already initialized if cached pfn is non-zero */
> +	if (zone->compact_cached_migrate_pfn[0] ||
> +		zone->compact_cached_free_pfn)
> +		return;
> +
> +	/* Mark that checking is in progress */
> +	zone->compact_cached_free_pfn = ULONG_MAX;
> +
> +	block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
> +	for (; block_start_pfn < zone_end_pfn(zone);
> +		block_start_pfn = block_end_pfn,
> +		block_end_pfn += pageblock_nr_pages) {
> +
> +		block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
> +
> +		if (!__pageblock_pfn_to_page(block_start_pfn,
> +					block_end_pfn, zone))
> +			return;
> +
> +		/* Check validity of pfn within pageblock */
> +		for (pfn = block_start_pfn; pfn < block_end_pfn; pfn++) {
> +			if (!pfn_valid_within(pfn))
> +				return;
> +		}
> +	}
> +
> +	/* We confirm that there is no hole */
> +	zone->contiguous = true;
> +}
> +
>   #ifdef CONFIG_COMPACTION
>
>   /* Do not skip compaction more than 64 times */
> @@ -1357,6 +1402,8 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
>   		;
>   	}
>
> +	check_zone_contiguous(zone);
> +
>   	/*
>   	 * Clear pageblock skip if there were failures recently and compaction
>   	 * is about to be retried after being deferred. kswapd does not do
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  2015-12-14 10:29     ` Vlastimil Babka
@ 2015-12-14 15:25       ` Joonsoo Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Joonsoo Kim @ 2015-12-14 15:25 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, LKML, Linux Memory Management List, Joonsoo Kim

2015-12-14 19:29 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>:
> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>
>> There is a performance drop report for hugepage allocation where half of
>> the cpu time is spent in pageblock_pfn_to_page() during compaction [1].
>> In that workload, compaction is triggered to make hugepages, but most
>> pageblocks are unavailable for compaction due to their pageblock type and
>> skip bit, so compaction usually fails. The most costly operation in this
>> case is finding a valid pageblock while scanning the whole zone range. To
>> check whether a pageblock is valid to compact, a valid pfn within the
>> pageblock is required, and we can obtain it by calling
>> pageblock_pfn_to_page(). This function checks whether the pageblock is in
>> a single zone and returns a valid pfn if possible. The problem is that we
>> need to do this check every time before scanning a pageblock, even if we
>> re-visit it, and this turns out to be very expensive in this workload.
>
>
> Hm I wonder if this is safe wrt memory hotplug? Shouldn't there be a
> rechecking plugged into the appropriate hotplug add/remove callbacks? Which
> would make the whole thing generic too, zone->contiguous information doesn't
> have to be limited to compaction. And it would remove the rather ugly part
> where cached pfn info is used as an indication of zone->contiguous being
> already set...

Will check it.

>> Although we have no way to skip this pageblock check on a system where
>> holes exist at arbitrary positions, we can cache whether the zone is
>> contiguous and just do pfn_to_page() on a system where no hole exists.
>> This optimization considerably speeds up the above workload.
>>
>> Before vs After
>> Max: 1096 MB/s vs 1325 MB/s
>> Min: 635 MB/s vs 1015 MB/s
>> Avg: 899 MB/s vs 1194 MB/s
>>
>> Avg is improved by roughly 30% [2].
>
>
> Unless I'm mistaken, these results also include my RFC series (Aaron can you
> clarify?). These patches should better be tested standalone on top of base,
> as being simpler they will probably be included sooner (the RFC series needs
> reviews at the very least :) - although the memory hotplug concerns might
> make the "sooner" here relative too.

AFAIK, these patches were tested standalone on top of base. When I sent
them, I asked Aaron to test them on top of base.

Btw, I missed adding a Reported/Tested-by tag for Aaron. I will add it
in the next spin.

> Anyway it's interesting that this patch improved "Min", and variance in
> general (on top of my RFC) so much. I would expect the overhead of
> pageblock_pfn_to_page() to be quite stable, hmm.

Perhaps the overhead of pageblock_pfn_to_page() itself would be stable.
The combination of slow scanning and kswapd's flushing of the skip bits
would result in the unstable results.

Thanks.

>
>> To avoid disturbing systems where compaction isn't triggered, the check
>> is done at the first compaction invocation.
>>
>> [1]: http://www.spinics.net/lists/linux-mm/msg97378.html
>> [2]: https://lkml.org/lkml/2015/12/9/23
>>
>> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>> ---
>>   include/linux/mmzone.h |  1 +
>>   mm/compaction.c        | 49
>> ++++++++++++++++++++++++++++++++++++++++++++++++-
>>   2 files changed, 49 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 68cc063..cd3736e 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -521,6 +521,7 @@ struct zone {
>>   #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>>         /* Set to true when the PG_migrate_skip bits should be cleared */
>>         bool                    compact_blockskip_flush;
>> +       bool                    contiguous;
>>   #endif
>>
>>         ZONE_PADDING(_pad3_)
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 56fa321..ce60b38 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -88,7 +88,7 @@ static inline bool migrate_async_suitable(int
>> migratetype)
>>    * the first and last page of a pageblock and avoid checking each
>> individual
>>    * page in a pageblock.
>>    */
>> -static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>> +static struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
>>                                 unsigned long end_pfn, struct zone *zone)
>>   {
>>         struct page *start_page;
>> @@ -114,6 +114,51 @@ static struct page *pageblock_pfn_to_page(unsigned
>> long start_pfn,
>>         return start_page;
>>   }
>>
>> +static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>> +                               unsigned long end_pfn, struct zone *zone)
>> +{
>> +       if (zone->contiguous)
>> +               return pfn_to_page(start_pfn);
>> +
>> +       return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
>> +}
>> +
>> +static void check_zone_contiguous(struct zone *zone)
>> +{
>> +       unsigned long block_start_pfn = zone->zone_start_pfn;
>> +       unsigned long block_end_pfn;
>> +       unsigned long pfn;
>> +
>> +       /* Already initialized if cached pfn is non-zero */
>> +       if (zone->compact_cached_migrate_pfn[0] ||
>> +               zone->compact_cached_free_pfn)
>> +               return;
>> +
>> +       /* Mark that checking is in progress */
>> +       zone->compact_cached_free_pfn = ULONG_MAX;
>> +
>> +       block_end_pfn = ALIGN(block_start_pfn + 1, pageblock_nr_pages);
>> +       for (; block_start_pfn < zone_end_pfn(zone);
>> +               block_start_pfn = block_end_pfn,
>> +               block_end_pfn += pageblock_nr_pages) {
>> +
>> +               block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));
>> +
>> +               if (!__pageblock_pfn_to_page(block_start_pfn,
>> +                                       block_end_pfn, zone))
>> +                       return;
>> +
>> +               /* Check validity of pfn within pageblock */
>> +               for (pfn = block_start_pfn; pfn < block_end_pfn; pfn++) {
>> +                       if (!pfn_valid_within(pfn))
>> +                               return;
>> +               }
>> +       }
>> +
>> +       /* We confirm that there is no hole */
>> +       zone->contiguous = true;
>> +}
>> +
>>   #ifdef CONFIG_COMPACTION
>>
>>   /* Do not skip compaction more than 64 times */
>> @@ -1357,6 +1402,8 @@ static int compact_zone(struct zone *zone, struct
>> compact_control *cc)
>>                 ;
>>         }
>>
>> +       check_zone_contiguous(zone);
>> +
>>         /*
>>          * Clear pageblock skip if there were failures recently and
>> compaction
>>          * is about to be retried after being deferred. kswapd does not do
>>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
  2015-12-14 10:07   ` Vlastimil Babka
@ 2015-12-14 15:26     ` Joonsoo Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Joonsoo Kim @ 2015-12-14 15:26 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, LKML, Linux Memory Management List, Joonsoo Kim

2015-12-14 19:07 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>:
> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>
>> free_pfn and compact_cached_free_pfn are the pointers that remember the
>> restart position of the freepage scanner. When they are reset or invalid,
>> we set them to zone_end_pfn because the freepage scanner works in the
>> reverse direction. But, because the zone range is defined as
>> [zone_start_pfn, zone_end_pfn), zone_end_pfn is invalid to access.
>> Therefore, we should not store it in free_pfn and compact_cached_free_pfn;
>> we need to store zone_end_pfn - 1 instead. There is one more thing to
>> consider. The freepage scanner scans backwards in pageblock units. If
>> free_pfn and compact_cached_free_pfn are set to the middle of a pageblock,
>> the scanner regards that situation as having already scanned the front
>> part of that pageblock, so we lose the opportunity to scan it. To fix this
>> up, this patch does a round_down() to guarantee that the reset position is
>> pageblock aligned.
>>
>> Note that, thanks to the current pageblock_pfn_to_page() implementation,
>> no actual access to zone_end_pfn has happened so far. But the following
>> patch will change pageblock_pfn_to_page(), so this patch is needed from
>> now on.
>>
>> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>
> Note that until now in compaction we've used basically an open-coded
> round_down(), and ALIGN() for rounding up. You introduce a first use of
> round_down(), and it would be nice to standardize on round_down() and
> round_up() everywhere. I think it's more obvious than open-coding and
> ALIGN() (which doesn't tell the reader if it's aligning up or down).
> Hopefully they really do the same thing and there are no caveats...

Okay. Will send another patch for this clean-up in the next spin.

Thanks.

>
>> ---
>>   mm/compaction.c | 9 +++++----
>>   1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 585de54..56fa321 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
>>   {
>>         zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
>>         zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
>> -       zone->compact_cached_free_pfn = zone_end_pfn(zone);
>> +       zone->compact_cached_free_pfn =
>> +                       round_down(zone_end_pfn(zone) - 1,
>> pageblock_nr_pages);
>>   }
>>
>>   /*
>> @@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct
>> compact_control *cc)
>>          */
>>         cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
>>         cc->free_pfn = zone->compact_cached_free_pfn;
>> -       if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
>> -               cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
>> +       if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
>> +               cc->free_pfn = round_down(end_pfn - 1,
>> pageblock_nr_pages);
>>                 zone->compact_cached_free_pfn = cc->free_pfn;
>>         }
>> -       if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
>> +       if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
>>                 cc->migrate_pfn = start_pfn;
>>                 zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
>>                 zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
>>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
@ 2015-12-14 15:26     ` Joonsoo Kim
  0 siblings, 0 replies; 24+ messages in thread
From: Joonsoo Kim @ 2015-12-14 15:26 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, LKML, Linux Memory Management List, Joonsoo Kim

2015-12-14 19:07 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>:
> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>
>> free_pfn and compact_cached_free_pfn are the pointer that remember
>> restart position of freepage scanner. When they are reset or invalid,
>> we set them to zone_end_pfn because freepage scanner works in reverse
>> direction. But, because zone range is defined as [zone_start_pfn,
>> zone_end_pfn), zone_end_pfn is invalid to access. Therefore, we should
>> not store it to free_pfn and compact_cached_free_pfn. Instead, we need
>> to store zone_end_pfn - 1 to them. There is one more thing we should
>> consider. Freepage scanner scan reversely by pageblock unit. If free_pfn
>> and compact_cached_free_pfn are set to middle of pageblock, it regards
>> that sitiation as that it already scans front part of pageblock so we
>> lose opportunity to scan there. To fix-up, this patch do round_down()
>> to guarantee that reset position will be pageblock aligned.
>>
>> Note that thanks to the current pageblock_pfn_to_page() implementation,
>> actual access to zone_end_pfn doesn't happen until now. But, following
>> patch will change pageblock_pfn_to_page() so this patch is needed
>> from now on.
>>
>> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>
> Note that until now in compaction we've used basically an open-coded
> round_down(), and ALIGN() for rounding up. You introduce a first use of
> round_down(), and it would be nice to standardize on round_down() and
> round_up() everywhere. I think it's more obvious than open-coding and
> ALIGN() (which doesn't tell the reader if it's aligning up or down).
> Hopefully they really do the same thing and there are no caveats...

Okay. Will send another patch for this clean-up on next spin.

Thanks.

>
>> ---
>>   mm/compaction.c | 9 +++++----
>>   1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 585de54..56fa321 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
>>   {
>>         zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
>>         zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
>> -       zone->compact_cached_free_pfn = zone_end_pfn(zone);
>> +       zone->compact_cached_free_pfn =
>> +                       round_down(zone_end_pfn(zone) - 1,
>> pageblock_nr_pages);
>>   }
>>
>>   /*
>> @@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct
>> compact_control *cc)
>>          */
>>         cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
>>         cc->free_pfn = zone->compact_cached_free_pfn;
>> -       if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
>> -               cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
>> +       if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
>> +               cc->free_pfn = round_down(end_pfn - 1,
>> pageblock_nr_pages);
>>                 zone->compact_cached_free_pfn = cc->free_pfn;
>>         }
>> -       if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
>> +       if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
>>                 cc->migrate_pfn = start_pfn;
>>                 zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
>>                 zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
>>
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  2015-12-14 15:25       ` Joonsoo Kim
@ 2015-12-15  1:06         ` Aaron Lu
  -1 siblings, 0 replies; 24+ messages in thread
From: Aaron Lu @ 2015-12-15  1:06 UTC (permalink / raw)
  To: Joonsoo Kim, Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, Rik van Riel, David Rientjes, LKML,
	Linux Memory Management List, Joonsoo Kim

On 12/14/2015 11:25 PM, Joonsoo Kim wrote:
> 2015-12-14 19:29 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>:
>> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>> Before vs After
>>> Max: 1096 MB/s vs 1325 MB/s
>>> Min: 635 MB/s vs 1015 MB/s
>>> Avg: 899 MB/s vs 1194 MB/s
>>>
>>> Avg is improved by roughly 30% [2].
>>
>>
>> Unless I'm mistaken, these results also include my RFC series (Aaron, can
>> you clarify?). These patches are better tested standalone on top of base:
>> being simpler, they will probably be included sooner (the RFC series needs
>> reviews at the very least :) - although the memory hotplug concerns might
>> make the "sooner" here relative too.
> 
> AFAIK, these patches were tested standalone on top of base. When I sent
> them, I asked Aaron to test them on top of base.

Right, it is tested standalone on top of base.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
  2015-12-15  1:06         ` Aaron Lu
@ 2015-12-15  8:24           ` Vlastimil Babka
  -1 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2015-12-15  8:24 UTC (permalink / raw)
  To: Aaron Lu, Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Rik van Riel, David Rientjes, LKML,
	Linux Memory Management List, Joonsoo Kim

On 12/15/2015 02:06 AM, Aaron Lu wrote:
> On 12/14/2015 11:25 PM, Joonsoo Kim wrote:
>> 2015-12-14 19:29 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>:
>>> Unless I'm mistaken, these results also include my RFC series (Aaron, can
>>> you clarify?). These patches are better tested standalone on top of base:
>>> being simpler, they will probably be included sooner (the RFC series needs
>>> reviews at the very least :) - although the memory hotplug concerns might
>>> make the "sooner" here relative too.
>>
>> AFAIK, these patches were tested standalone on top of base. When I sent
>> them, I asked Aaron to test them on top of base.
>
> Right, it is tested standalone on top of base.

Thanks, sorry about the noise then.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
  2015-12-14 15:26     ` Joonsoo Kim
@ 2015-12-15  8:31       ` Vlastimil Babka
  -1 siblings, 0 replies; 24+ messages in thread
From: Vlastimil Babka @ 2015-12-15  8:31 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, LKML, Linux Memory Management List, Joonsoo Kim

On 12/14/2015 04:26 PM, Joonsoo Kim wrote:
> 2015-12-14 19:07 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>:
>> On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
>>>
>>
>> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>>
>> Note that until now in compaction we've used basically an open-coded
>> round_down(), and ALIGN() for rounding up. You introduce a first use of
>> round_down(), and it would be nice to standardize on round_down() and
>> round_up() everywhere. I think it's more obvious than open-coding and
>> ALIGN() (which doesn't tell the reader if it's aligning up or down).
>> Hopefully they really do the same thing and there are no caveats...
>
> Okay. Will send another patch for this clean-up on next spin.

Great, I didn't mean that the cleanup is needed right now, but was asking
whether we agree on an idiom to use whenever making changes from now on.
Maybe it would be best to add some defines at the top of compaction.c
that would also hide away the repeated pageblock_nr_pages everywhere?
Something like:

#define pageblock_start(pfn) round_down(pfn, pageblock_nr_pages)
#define pageblock_end(pfn) round_up((pfn)+1, pageblock_nr_pages)
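
With those, the reset from patch 1/2 could then read, for example (only a
sketch built on the suggested names above, nothing that exists yet):

	zone->compact_cached_free_pfn = pageblock_start(zone_end_pfn(zone) - 1);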


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
  2015-12-15  8:31       ` Vlastimil Babka
@ 2015-12-16  5:44         ` Joonsoo Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Joonsoo Kim @ 2015-12-16  5:44 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, LKML, Linux Memory Management List

On Tue, Dec 15, 2015 at 09:31:39AM +0100, Vlastimil Babka wrote:
> On 12/14/2015 04:26 PM, Joonsoo Kim wrote:
> >2015-12-14 19:07 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>:
> >>On 12/14/2015 06:02 AM, Joonsoo Kim wrote:
> >>>
> >>
> >>Acked-by: Vlastimil Babka <vbabka@suse.cz>
> >>
> >>Note that until now in compaction we've used basically an open-coded
> >>round_down(), and ALIGN() for rounding up. You introduce a first use of
> >>round_down(), and it would be nice to standardize on round_down() and
> >>round_up() everywhere. I think it's more obvious than open-coding and
> >>ALIGN() (which doesn't tell the reader if it's aligning up or down).
> >>Hopefully they really do the same thing and there are no caveats...
> >
> >Okay. Will send another patch for this clean-up on next spin.
> 
> Great, I didn't mean that the cleanup is needed right now, but was
> asking whether we agree on an idiom to use whenever making changes
> from now on.

Okay.

> Maybe it would be best to add some defines at the top of
> compaction.c that would also hide away the repeated
> pageblock_nr_pages everywhere? Something like:
> 
> #define pageblock_start(pfn) round_down(pfn, pageblock_nr_pages)
> #define pageblock_end(pfn) round_up((pfn)+1, pageblock_nr_pages)

A quick grep shows that there are many more places where this new define,
or some variant of it, could be used. It would be a good clean-up. I will
try it separately.
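
For instance (purely illustrative, not a verified list of call sites),
open-coded spots of this shape could then become:

	block_start_pfn = pfn & ~(pageblock_nr_pages - 1);    /* -> pageblock_start(pfn) */
	block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);   /* -> pageblock_end(pfn)   */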

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
  2015-12-21  6:13 ` Joonsoo Kim
@ 2015-12-22 22:05   ` David Rientjes
  -1 siblings, 0 replies; 24+ messages in thread
From: David Rientjes @ 2015-12-22 22:05 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Vlastimil Babka, Aaron Lu, Mel Gorman,
	Rik van Riel, linux-kernel, linux-mm, Joonsoo Kim

On Mon, 21 Dec 2015, Joonsoo Kim wrote:

> free_pfn and compact_cached_free_pfn are the pointers that remember the
> restart position of the freepage scanner. When they are reset or invalid,
> we set them to zone_end_pfn because the freepage scanner works in the
> reverse direction. But, because the zone range is defined as
> [zone_start_pfn, zone_end_pfn), zone_end_pfn is not valid to access.
> Therefore, we should not store it in free_pfn and compact_cached_free_pfn;
> we need to store zone_end_pfn - 1 in them instead. There is one more
> thing to consider. The freepage scanner scans in reverse, one pageblock
> at a time. If free_pfn and compact_cached_free_pfn point into the middle
> of a pageblock, the scanner treats the front part of that pageblock as
> already scanned, so we lose the opportunity to scan it. To fix this up,
> this patch uses round_down() to guarantee that the reset position is
> pageblock aligned.
> 
> Note that, thanks to the current pageblock_pfn_to_page() implementation,
> no actual access to zone_end_pfn happens so far. But the following patch
> will change pageblock_pfn_to_page(), so this fix is needed from now on.
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn
@ 2015-12-21  6:13 ` Joonsoo Kim
  0 siblings, 0 replies; 24+ messages in thread
From: Joonsoo Kim @ 2015-12-21  6:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Aaron Lu, Mel Gorman, Rik van Riel,
	David Rientjes, linux-kernel, linux-mm, Joonsoo Kim

free_pfn and compact_cached_free_pfn are the pointers that remember the
restart position of the freepage scanner. When they are reset or invalid,
we set them to zone_end_pfn because the freepage scanner works in the
reverse direction. But, because the zone range is defined as
[zone_start_pfn, zone_end_pfn), zone_end_pfn is not valid to access.
Therefore, we should not store it in free_pfn and compact_cached_free_pfn;
we need to store zone_end_pfn - 1 in them instead. There is one more
thing to consider. The freepage scanner scans in reverse, one pageblock
at a time. If free_pfn and compact_cached_free_pfn point into the middle
of a pageblock, the scanner treats the front part of that pageblock as
already scanned, so we lose the opportunity to scan it. To fix this up,
this patch uses round_down() to guarantee that the reset position is
pageblock aligned.

Note that, thanks to the current pageblock_pfn_to_page() implementation,
no actual access to zone_end_pfn happens so far. But the following patch
will change pageblock_pfn_to_page(), so this fix is needed from now on.
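
As an illustration with made-up numbers, assume pageblock_nr_pages == 512
and a zone spanning pfns [4096, 1000000):

	/* old: */ compact_cached_free_pfn = 1000000;                 /* == zone_end_pfn, outside the zone */
	/* new: */ compact_cached_free_pfn = round_down(999999, 512); /* == 999936, aligned last pageblock */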

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/compaction.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 585de54..56fa321 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -200,7 +200,8 @@ static void reset_cached_positions(struct zone *zone)
 {
 	zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
 	zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
-	zone->compact_cached_free_pfn = zone_end_pfn(zone);
+	zone->compact_cached_free_pfn =
+			round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
 }
 
 /*
@@ -1371,11 +1372,11 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
 	 */
 	cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
 	cc->free_pfn = zone->compact_cached_free_pfn;
-	if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
-		cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
+	if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
+		cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
 		zone->compact_cached_free_pfn = cc->free_pfn;
 	}
-	if (cc->migrate_pfn < start_pfn || cc->migrate_pfn > end_pfn) {
+	if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
 		cc->migrate_pfn = start_pfn;
 		zone->compact_cached_migrate_pfn[0] = cc->migrate_pfn;
 		zone->compact_cached_migrate_pfn[1] = cc->migrate_pfn;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-12-22 22:05 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-14  5:02 [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn Joonsoo Kim
2015-12-14  5:02 ` Joonsoo Kim
2015-12-14  5:02 ` [PATCH 2/2] mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous Joonsoo Kim
2015-12-14  5:02   ` Joonsoo Kim
2015-12-14 10:29   ` Vlastimil Babka
2015-12-14 10:29     ` Vlastimil Babka
2015-12-14 15:25     ` Joonsoo Kim
2015-12-14 15:25       ` Joonsoo Kim
2015-12-15  1:06       ` Aaron Lu
2015-12-15  1:06         ` Aaron Lu
2015-12-15  8:24         ` Vlastimil Babka
2015-12-15  8:24           ` Vlastimil Babka
2015-12-14 10:07 ` [PATCH 1/2] mm/compaction: fix invalid free_pfn and compact_cached_free_pfn Vlastimil Babka
2015-12-14 10:07   ` Vlastimil Babka
2015-12-14 15:26   ` Joonsoo Kim
2015-12-14 15:26     ` Joonsoo Kim
2015-12-15  8:31     ` Vlastimil Babka
2015-12-15  8:31       ` Vlastimil Babka
2015-12-16  5:44       ` Joonsoo Kim
2015-12-16  5:44         ` Joonsoo Kim
2015-12-21  6:13 Joonsoo Kim
2015-12-21  6:13 ` Joonsoo Kim
2015-12-22 22:05 ` David Rientjes
2015-12-22 22:05   ` David Rientjes
