linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
@ 2013-10-23 21:01 kosaki.motohiro
  2013-10-30 15:19 ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: kosaki.motohiro @ 2013-10-23 21:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, akpm, KOSAKI Motohiro, Mel Gorman, Yasuaki Ishimatsu

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Yasuaki Ithimatsu reported memory hot-add spent more than 5 _hours_
on 9TB memory machine and we found out setup_zone_migrate_reserve
spnet >90% time.

The problem is, setup_zone_migrate_reserve scan all pageblock
unconditionally, but it is only necessary number of reserved block
was reduced (i.e. memory hot remove).
Moreover, maximum MIGRATE_RESERVE per zone are currently 2. It mean,
number of reserved pageblock are almost always unchanged.

This patch adds zone->nr_migrate_reserve_block to maintain number
of MIGRATE_RESERVE pageblock and it reduce an overhead of
setup_zone_migrate_reserve dramatically.

Cc: Mel Gorman <mgorman@suse.de>
Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 include/linux/mmzone.h |    6 ++++++
 mm/page_alloc.c        |   13 +++++++++++++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index bd791e4..67ab5fe 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -490,6 +490,12 @@ struct zone {
 	unsigned long		managed_pages;
 
 	/*
+	 * Number of MIGRATE_RESEVE page block. To maintain for just
+	 * optimization. Protected by zone->lock.
+	 */
+	int			nr_migrate_reserve_block;
+
+	/*
 	 * rarely used fields:
 	 */
 	const char		*name;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 58e67ea..76ca054 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3909,6 +3909,7 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 	struct page *page;
 	unsigned long block_migratetype;
 	int reserve;
+	int old_reserve;
 
 	/*
 	 * Get the start pfn, end pfn and the number of blocks to reserve
@@ -3930,6 +3931,12 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 	 * future allocation of hugepages at runtime.
 	 */
 	reserve = min(2, reserve);
+	old_reserve = zone->nr_migrate_reserve_block;
+
+	/* When memory hot-add, we almost always need to do nothing */
+	if (reserve == old_reserve)
+		return;
+	zone->nr_migrate_reserve_block = reserve;
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
 		if (!pfn_valid(pfn))
@@ -3967,6 +3974,12 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 				reserve--;
 				continue;
 			}
+		} else if (!old_reserve) {
+			/*
+			 * When boot time, we don't need scan whole zone
+			 * for turning off MIGRATE_RESERVE.
+			 */
+			break;
 		}
 
 		/*
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
  2013-10-23 21:01 [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve kosaki.motohiro
@ 2013-10-30 15:19 ` Mel Gorman
  2013-10-30 19:26   ` KOSAKI Motohiro
  2013-10-30 20:19   ` KOSAKI Motohiro
  0 siblings, 2 replies; 7+ messages in thread
From: Mel Gorman @ 2013-10-30 15:19 UTC (permalink / raw)
  To: kosaki.motohiro
  Cc: linux-kernel, linux-mm, akpm, KOSAKI Motohiro, Yasuaki Ishimatsu

On Wed, Oct 23, 2013 at 05:01:32PM -0400, kosaki.motohiro@gmail.com wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> 
> Yasuaki Ithimatsu reported memory hot-add spent more than 5 _hours_
> on 9TB memory machine and we found out setup_zone_migrate_reserve
> spnet >90% time.
> 
> The problem is, setup_zone_migrate_reserve scan all pageblock
> unconditionally, but it is only necessary number of reserved block
> was reduced (i.e. memory hot remove).
> Moreover, maximum MIGRATE_RESERVE per zone are currently 2. It mean,
> number of reserved pageblock are almost always unchanged.
> 
> This patch adds zone->nr_migrate_reserve_block to maintain number
> of MIGRATE_RESERVE pageblock and it reduce an overhead of
> setup_zone_migrate_reserve dramatically.
> 

It seems regrettable to expand the size of struct zone just for this.
You are right that the number of blocks does not exceed 2 because of a
check made in setup_zone_migrate_reserve so it should be possible to
special case this. I didn't test this or think about it particularly
carefully and no doubt there is a nicer way but for illustration
purposes see the patch below.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index dd886fa..1aedddd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3897,6 +3897,8 @@ static int pageblock_is_reserved(unsigned long start_pfn, unsigned long end_pfn)
 	return 0;
 }
 
+#define MAX_MIGRATE_RESERVE_BLOCKS 2
+
 /*
  * Mark a number of pageblocks as MIGRATE_RESERVE. The number
  * of blocks reserved is based on min_wmark_pages(zone). The memory within
@@ -3910,6 +3912,7 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 	struct page *page;
 	unsigned long block_migratetype;
 	int reserve;
+	int found = 0;
 
 	/*
 	 * Get the start pfn, end pfn and the number of blocks to reserve
@@ -3926,11 +3929,11 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 	/*
 	 * Reserve blocks are generally in place to help high-order atomic
 	 * allocations that are short-lived. A min_free_kbytes value that
-	 * would result in more than 2 reserve blocks for atomic allocations
-	 * is assumed to be in place to help anti-fragmentation for the
-	 * future allocation of hugepages at runtime.
+	 * would result in more than MAX_MIGRATE_RESERVE_BLOCKS reserve blocks
+	 * for atomic allocations is assumed to be in place to help
+	 * anti-fragmentation for the future allocation of hugepages at runtime.
 	 */
-	reserve = min(2, reserve);
+	reserve = min(MAX_MIGRATE_RESERVE_BLOCKS, reserve);
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
 		if (!pfn_valid(pfn))
@@ -3956,6 +3959,7 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 			/* If this block is reserved, account for it */
 			if (block_migratetype == MIGRATE_RESERVE) {
 				reserve--;
+				found++;
 				continue;
 			}
 
@@ -3970,6 +3974,10 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 			}
 		}
 
+		/* If all possible reserve blocks have been found, we're done */
+		if (found >= MAX_MIGRATE_RESERVE_BLOCKS)
+			break;
+
 		/*
 		 * If the reserve is met and this is a previous reserved block,
 		 * take it back

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
  2013-10-30 15:19 ` Mel Gorman
@ 2013-10-30 19:26   ` KOSAKI Motohiro
  2013-10-30 20:19   ` KOSAKI Motohiro
  1 sibling, 0 replies; 7+ messages in thread
From: KOSAKI Motohiro @ 2013-10-30 19:26 UTC (permalink / raw)
  To: Mel Gorman
  Cc: kosaki.motohiro, linux-kernel, linux-mm, akpm, KOSAKI Motohiro,
	Yasuaki Ishimatsu

(10/30/13 11:19 AM), Mel Gorman wrote:
> On Wed, Oct 23, 2013 at 05:01:32PM -0400, kosaki.motohiro@gmail.com wrote:
>> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>
>> Yasuaki Ithimatsu reported memory hot-add spent more than 5 _hours_
>> on 9TB memory machine and we found out setup_zone_migrate_reserve
>> spnet >90% time.
>>
>> The problem is, setup_zone_migrate_reserve scan all pageblock
>> unconditionally, but it is only necessary number of reserved block
>> was reduced (i.e. memory hot remove).
>> Moreover, maximum MIGRATE_RESERVE per zone are currently 2. It mean,
>> number of reserved pageblock are almost always unchanged.
>>
>> This patch adds zone->nr_migrate_reserve_block to maintain number
>> of MIGRATE_RESERVE pageblock and it reduce an overhead of
>> setup_zone_migrate_reserve dramatically.
>>
>
> It seems regrettable to expand the size of struct zone just for this.

This is only matter when backporting enterprise distro. But you are right
it would be nice if it's avoidable.

> You are right that the number of blocks does not exceed 2 because of a
> check made in setup_zone_migrate_reserve so it should be possible to
> special case this. I didn't test this or think about it particularly
> carefully and no doubt there is a nicer way but for illustration
> purposes see the patch below.

I'll test. A few days give me please.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
  2013-10-30 15:19 ` Mel Gorman
  2013-10-30 19:26   ` KOSAKI Motohiro
@ 2013-10-30 20:19   ` KOSAKI Motohiro
  2013-10-31 10:15     ` Mel Gorman
  1 sibling, 1 reply; 7+ messages in thread
From: KOSAKI Motohiro @ 2013-10-30 20:19 UTC (permalink / raw)
  To: Mel Gorman
  Cc: kosaki.motohiro, linux-kernel, linux-mm, akpm, KOSAKI Motohiro,
	Yasuaki Ishimatsu

> @@ -3926,11 +3929,11 @@ static void setup_zone_migrate_reserve(struct zone *zone)
>   	/*
>   	 * Reserve blocks are generally in place to help high-order atomic
>   	 * allocations that are short-lived. A min_free_kbytes value that
> -	 * would result in more than 2 reserve blocks for atomic allocations
> -	 * is assumed to be in place to help anti-fragmentation for the
> -	 * future allocation of hugepages at runtime.
> +	 * would result in more than MAX_MIGRATE_RESERVE_BLOCKS reserve blocks
> +	 * for atomic allocations is assumed to be in place to help
> +	 * anti-fragmentation for the future allocation of hugepages at runtime.
>   	 */
> -	reserve = min(2, reserve);
> +	reserve = min(MAX_MIGRATE_RESERVE_BLOCKS, reserve);
>
>   	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
>   		if (!pfn_valid(pfn))
> @@ -3956,6 +3959,7 @@ static void setup_zone_migrate_reserve(struct zone *zone)
>   			/* If this block is reserved, account for it */
>   			if (block_migratetype == MIGRATE_RESERVE) {
>   				reserve--;
> +				found++;
>   				continue;
>   			}
>
> @@ -3970,6 +3974,10 @@ static void setup_zone_migrate_reserve(struct zone *zone)
>   			}
>   		}
>
> +		/* If all possible reserve blocks have been found, we're done */
> +		if (found >= MAX_MIGRATE_RESERVE_BLOCKS)
> +			break;
> +
>   		/*
>   		 * If the reserve is met and this is a previous reserved block,
>   		 * take it back

Nit. I would like to add following hunk. This is just nit because moving
reserve pageblock is extreme rare.

		if (block_migratetype == MIGRATE_RESERVE) {
+                       found++;
			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
			move_freepages_block(zone, page, MIGRATE_MOVABLE);
		}



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
  2013-10-30 20:19   ` KOSAKI Motohiro
@ 2013-10-31 10:15     ` Mel Gorman
  2013-10-31 17:14       ` KOSAKI Motohiro
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2013-10-31 10:15 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: linux-kernel, linux-mm, akpm, KOSAKI Motohiro, Yasuaki Ishimatsu

On Wed, Oct 30, 2013 at 04:19:07PM -0400, KOSAKI Motohiro wrote:
> >@@ -3926,11 +3929,11 @@ static void setup_zone_migrate_reserve(struct zone *zone)
> >  	/*
> >  	 * Reserve blocks are generally in place to help high-order atomic
> >  	 * allocations that are short-lived. A min_free_kbytes value that
> >-	 * would result in more than 2 reserve blocks for atomic allocations
> >-	 * is assumed to be in place to help anti-fragmentation for the
> >-	 * future allocation of hugepages at runtime.
> >+	 * would result in more than MAX_MIGRATE_RESERVE_BLOCKS reserve blocks
> >+	 * for atomic allocations is assumed to be in place to help
> >+	 * anti-fragmentation for the future allocation of hugepages at runtime.
> >  	 */
> >-	reserve = min(2, reserve);
> >+	reserve = min(MAX_MIGRATE_RESERVE_BLOCKS, reserve);
> >
> >  	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
> >  		if (!pfn_valid(pfn))
> >@@ -3956,6 +3959,7 @@ static void setup_zone_migrate_reserve(struct zone *zone)
> >  			/* If this block is reserved, account for it */
> >  			if (block_migratetype == MIGRATE_RESERVE) {
> >  				reserve--;
> >+				found++;
> >  				continue;
> >  			}
> >
> >@@ -3970,6 +3974,10 @@ static void setup_zone_migrate_reserve(struct zone *zone)
> >  			}
> >  		}
> >
> >+		/* If all possible reserve blocks have been found, we're done */
> >+		if (found >= MAX_MIGRATE_RESERVE_BLOCKS)
> >+			break;
> >+
> >  		/*
> >  		 * If the reserve is met and this is a previous reserved block,
> >  		 * take it back
> 
> Nit. I would like to add following hunk. This is just nit because moving
> reserve pageblock is extreme rare.
> 
> 		if (block_migratetype == MIGRATE_RESERVE) {
> +                       found++;
> 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
> 			move_freepages_block(zone, page, MIGRATE_MOVABLE);
> 		}

I don't really see the advantage but if you think it is necessary then I
do not object either.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
  2013-10-31 10:15     ` Mel Gorman
@ 2013-10-31 17:14       ` KOSAKI Motohiro
  2013-11-01  9:00         ` Yasuaki Ishimatsu
  0 siblings, 1 reply; 7+ messages in thread
From: KOSAKI Motohiro @ 2013-10-31 17:14 UTC (permalink / raw)
  To: Mel Gorman
  Cc: kosaki.motohiro, linux-kernel, linux-mm, akpm, KOSAKI Motohiro,
	Yasuaki Ishimatsu

>> Nit. I would like to add following hunk. This is just nit because moving
>> reserve pageblock is extreme rare.
>>
>> 		if (block_migratetype == MIGRATE_RESERVE) {
>> +                       found++;
>> 			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
>> 			move_freepages_block(zone, page, MIGRATE_MOVABLE);
>> 		}
>
> I don't really see the advantage but if you think it is necessary then I
> do not object either.

For example, a zone has five pageblock b1,b2,b3,b4,b5 and b1 has MIGRATE_RESERVE.
When hotremove b1 and hotadd again, your code need to scan all of blocks. But
mine only need to scan b1 and b2. I mean that's a hotplug specific optimization.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve
  2013-10-31 17:14       ` KOSAKI Motohiro
@ 2013-11-01  9:00         ` Yasuaki Ishimatsu
  0 siblings, 0 replies; 7+ messages in thread
From: Yasuaki Ishimatsu @ 2013-11-01  9:00 UTC (permalink / raw)
  To: KOSAKI Motohiro, Mel Gorman; +Cc: linux-kernel, linux-mm, akpm, KOSAKI Motohiro

Hi Mel and Kosaki,

Thank you for posting patches.

I tested your patches. And following table shows time of onlining
a memory section.

Amount of memory        | 128GB | 192GB | 256GB|
------------------------------------------------
linux-3.12-rc7          |  24.3 |  30.2 | 45.6 |
Kosaki's first patch    |   8.3 |   8.3 |  8.6 |
Mel + Kosaki's nit pick |  10.9 |  19.2 | 31.3 |
------------------------------------------------
                                    (millisecond)

128GB : 4 nodes and each node has 32GB of memory
192GB : 6 nodes and each node has 32GB of memory
256GB : 8 nodes and each node has 32GB of memory

In my result, Mel's patch does not seem to fix the problem since time
is increasing with increasing amount of memory.

Thanks,
Yasuaki Ishimatsu

(2013/11/01 2:14), KOSAKI Motohiro wrote:
>>> Nit. I would like to add following hunk. This is just nit because moving
>>> reserve pageblock is extreme rare.
>>>
>>>         if (block_migratetype == MIGRATE_RESERVE) {
>>> +                       found++;
>>>             set_pageblock_migratetype(page, MIGRATE_MOVABLE);
>>>             move_freepages_block(zone, page, MIGRATE_MOVABLE);
>>>         }
>>
>> I don't really see the advantage but if you think it is necessary then I
>> do not object either.
>
> For example, a zone has five pageblock b1,b2,b3,b4,b5 and b1 has MIGRATE_RESERVE.
> When hotremove b1 and hotadd again, your code need to scan all of blocks. But
> mine only need to scan b1 and b2. I mean that's a hotplug specific optimization.
>
>


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-11-01  9:01 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-23 21:01 [PATCH] mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve kosaki.motohiro
2013-10-30 15:19 ` Mel Gorman
2013-10-30 19:26   ` KOSAKI Motohiro
2013-10-30 20:19   ` KOSAKI Motohiro
2013-10-31 10:15     ` Mel Gorman
2013-10-31 17:14       ` KOSAKI Motohiro
2013-11-01  9:00         ` Yasuaki Ishimatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).