All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-06 15:43 ` Wei Yang
  0 siblings, 0 replies; 20+ messages in thread
From: Wei Yang @ 2017-02-06 15:43 UTC (permalink / raw)
  To: akpm, vbabka, mgorman, mhocko; +Cc: linux-mm, linux-kernel, Wei Yang

The whole memory space is divided into several zones and nodes may have no
page in some zones. In this case, the __absent_pages_in_range() would
return 0, since the range it is searching for is an empty range.

Also this happens more often to those nodes with higher memory range when
there are more nodes, which is a trend for future architectures.

This patch checks the zone range after clamp and adjustment, return 0 if
the range is an empty range.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/page_alloc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440e3ae2..51c60c0eadcb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5521,6 +5521,11 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
 	adjust_zone_range_for_zone_movable(nid, zone_type,
 			node_start_pfn, node_end_pfn,
 			&zone_start_pfn, &zone_end_pfn);
+
+	/* If this node has no page within this zone, return 0. */
+	if (zone_start_pfn == zone_end_pfn)
+		return 0;
+
 	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
 
 	/*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-06 15:43 ` Wei Yang
  0 siblings, 0 replies; 20+ messages in thread
From: Wei Yang @ 2017-02-06 15:43 UTC (permalink / raw)
  To: akpm, vbabka, mgorman, mhocko; +Cc: linux-mm, linux-kernel, Wei Yang

The whole memory space is divided into several zones and nodes may have no
page in some zones. In this case, the __absent_pages_in_range() would
return 0, since the range it is searching for is an empty range.

Also this happens more often to those nodes with higher memory range when
there are more nodes, which is a trend for future architectures.

This patch checks the zone range after clamp and adjustment, return 0 if
the range is an empty range.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/page_alloc.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6de9440e3ae2..51c60c0eadcb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5521,6 +5521,11 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
 	adjust_zone_range_for_zone_movable(nid, zone_type,
 			node_start_pfn, node_end_pfn,
 			&zone_start_pfn, &zone_end_pfn);
+
+	/* If this node has no page within this zone, return 0. */
+	if (zone_start_pfn == zone_end_pfn)
+		return 0;
+
 	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
 
 	/*
-- 
2.11.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-06 15:43 ` Wei Yang
@ 2017-02-06 23:29   ` Andrew Morton
  -1 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2017-02-06 23:29 UTC (permalink / raw)
  To: Wei Yang; +Cc: vbabka, mgorman, mhocko, linux-mm, linux-kernel

On Mon,  6 Feb 2017 23:43:14 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:

> The whole memory space is divided into several zones and nodes may have no
> page in some zones. In this case, the __absent_pages_in_range() would
> return 0, since the range it is searching for is an empty range.
> 
> Also this happens more often to those nodes with higher memory range when
> there are more nodes, which is a trend for future architectures.
> 
> This patch checks the zone range after clamp and adjustment, return 0 if
> the range is an empty range.

What are the user-visible runtime effects of this change?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-06 23:29   ` Andrew Morton
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2017-02-06 23:29 UTC (permalink / raw)
  To: Wei Yang; +Cc: vbabka, mgorman, mhocko, linux-mm, linux-kernel

On Mon,  6 Feb 2017 23:43:14 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:

> The whole memory space is divided into several zones and nodes may have no
> page in some zones. In this case, the __absent_pages_in_range() would
> return 0, since the range it is searching for is an empty range.
> 
> Also this happens more often to those nodes with higher memory range when
> there are more nodes, which is a trend for future architectures.
> 
> This patch checks the zone range after clamp and adjustment, return 0 if
> the range is an empty range.

What are the user-visible runtime effects of this change?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-06 15:43 ` Wei Yang
@ 2017-02-07  9:45   ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-07  9:45 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, vbabka, mgorman, linux-mm, linux-kernel

On Mon 06-02-17 23:43:14, Wei Yang wrote:
> The whole memory space is divided into several zones and nodes may have no
> page in some zones. In this case, the __absent_pages_in_range() would
> return 0, since the range it is searching for is an empty range.
> 
> Also this happens more often to those nodes with higher memory range when
> there are more nodes, which is a trend for future architectures.

I do not understand this part. Why would we see more zones with zero pfn
range in higher memory ranges.

> This patch checks the zone range after clamp and adjustment, return 0 if
> the range is an empty range.

I assume the whole point of this patch is to save
__absent_pages_in_range which iterates over all memblock regions, right?
Is there any reason why for_each_mem_pfn_range cannot be changed to
honor the given start/end pfns instead? I can imagine that a small zone
would see a similar pointless iterations...

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  mm/page_alloc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6de9440e3ae2..51c60c0eadcb 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5521,6 +5521,11 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
>  	adjust_zone_range_for_zone_movable(nid, zone_type,
>  			node_start_pfn, node_end_pfn,
>  			&zone_start_pfn, &zone_end_pfn);
> +
> +	/* If this node has no page within this zone, return 0. */
> +	if (zone_start_pfn == zone_end_pfn)
> +		return 0;
> +
>  	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
>  
>  	/*
> -- 
> 2.11.0
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-07  9:45   ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-07  9:45 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, vbabka, mgorman, linux-mm, linux-kernel

On Mon 06-02-17 23:43:14, Wei Yang wrote:
> The whole memory space is divided into several zones and nodes may have no
> page in some zones. In this case, the __absent_pages_in_range() would
> return 0, since the range it is searching for is an empty range.
> 
> Also this happens more often to those nodes with higher memory range when
> there are more nodes, which is a trend for future architectures.

I do not understand this part. Why would we see more zones with zero pfn
range in higher memory ranges.

> This patch checks the zone range after clamp and adjustment, return 0 if
> the range is an empty range.

I assume the whole point of this patch is to save
__absent_pages_in_range which iterates over all memblock regions, right?
Is there any reason why for_each_mem_pfn_range cannot be changed to
honor the given start/end pfns instead? I can imagine that a small zone
would see a similar pointless iterations...

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  mm/page_alloc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6de9440e3ae2..51c60c0eadcb 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5521,6 +5521,11 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
>  	adjust_zone_range_for_zone_movable(nid, zone_type,
>  			node_start_pfn, node_end_pfn,
>  			&zone_start_pfn, &zone_end_pfn);
> +
> +	/* If this node has no page within this zone, return 0. */
> +	if (zone_start_pfn == zone_end_pfn)
> +		return 0;
> +
>  	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
>  
>  	/*
> -- 
> 2.11.0
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-06 23:29   ` Andrew Morton
  (?)
@ 2017-02-07 15:07   ` Wei Yang
  -1 siblings, 0 replies; 20+ messages in thread
From: Wei Yang @ 2017-02-07 15:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Wei Yang, vbabka, mgorman, mhocko, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

On Mon, Feb 06, 2017 at 03:29:32PM -0800, Andrew Morton wrote:
>On Mon,  6 Feb 2017 23:43:14 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>
>> The whole memory space is divided into several zones and nodes may have no
>> page in some zones. In this case, the __absent_pages_in_range() would
>> return 0, since the range it is searching for is an empty range.
>> 
>> Also this happens more often to those nodes with higher memory range when
>> there are more nodes, which is a trend for future architectures.
>> 
>> This patch checks the zone range after clamp and adjustment, return 0 if
>> the range is an empty range.
>
>What are the user-visible runtime effects of this change?

Hmm, for users they may not "see" the effect, while it will save some time in
case there is no overlap between the zone and node.

-- 
Wei Yang
Help you, Help me

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-07  9:45   ` Michal Hocko
  (?)
@ 2017-02-07 15:32   ` Wei Yang
  2017-02-07 15:41       ` Michal Hocko
  -1 siblings, 1 reply; 20+ messages in thread
From: Wei Yang @ 2017-02-07 15:32 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, akpm, vbabka, mgorman, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3197 bytes --]

On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
>On Mon 06-02-17 23:43:14, Wei Yang wrote:
>> The whole memory space is divided into several zones and nodes may have no
>> page in some zones. In this case, the __absent_pages_in_range() would
>> return 0, since the range it is searching for is an empty range.
>> 
>> Also this happens more often to those nodes with higher memory range when
>> there are more nodes, which is a trend for future architectures.
>
>I do not understand this part. Why would we see more zones with zero pfn
>range in higher memory ranges.
>

Based on my understanding, zone boundary is fixed address. For example, on
x84_64, ZONE_DMA is < 16M, ZONE_DMA32 is < 4G. And similar rules apply to
sparc, ia64, s390 as shown in the comment of ZONE definition.

For example, currently we see a server with 8 NUMA nodes and with 4T memory.
Those zone boundaries may all sits in the first node range, so that the nodes
with higher memory range may all sits in the last zone, which is ZONE_NORMAL I
think. During the memory initialization, for each node we still iterate on
each zone and calculate the memory range in each zone. By doing so, those
nodes with higher memory range will see several empty zones.

>> This patch checks the zone range after clamp and adjustment, return 0 if
>> the range is an empty range.
>
>I assume the whole point of this patch is to save
>__absent_pages_in_range which iterates over all memblock regions, right?

Yes, you are right. Since we know there is no overlap, it is not necessary to
do the iteration on memblock.

>Is there any reason why for_each_mem_pfn_range cannot be changed to
>honor the given start/end pfns instead? I can imagine that a small zone
>would see a similar pointless iterations...
>

Hmm... No special reason, just not thought about this implementation. And
actually I just do the similar thing as in zone_spanned_pages_in_node(), in
which also return 0 when there is no overlap.

BTW, I don't get your point. You wish to put the check in
for_each_mem_pfn_range() definition?

>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  mm/page_alloc.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 6de9440e3ae2..51c60c0eadcb 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5521,6 +5521,11 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
>>  	adjust_zone_range_for_zone_movable(nid, zone_type,
>>  			node_start_pfn, node_end_pfn,
>>  			&zone_start_pfn, &zone_end_pfn);
>> +
>> +	/* If this node has no page within this zone, return 0. */
>> +	if (zone_start_pfn == zone_end_pfn)
>> +		return 0;
>> +
>>  	nr_absent = __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
>>  
>>  	/*
>> -- 
>> 2.11.0
>> 
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-07 15:32   ` Wei Yang
@ 2017-02-07 15:41       ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-07 15:41 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, vbabka, mgorman, linux-mm, linux-kernel

On Tue 07-02-17 23:32:47, Wei Yang wrote:
> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
[...]
> >Is there any reason why for_each_mem_pfn_range cannot be changed to
> >honor the given start/end pfns instead? I can imagine that a small zone
> >would see a similar pointless iterations...
> >
> 
> Hmm... No special reason, just not thought about this implementation. And
> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
> which also return 0 when there is no overlap.
> 
> BTW, I don't get your point. You wish to put the check in
> for_each_mem_pfn_range() definition?

My point was that you are handling one special case (an empty zone) but
the underlying problem is that __absent_pages_in_range might be wasting
cycles iterating over memblocks that are way outside of the given pfn
range. At least this is my understanding. If you fix that you do not
need the special case, right?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-07 15:41       ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-07 15:41 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, vbabka, mgorman, linux-mm, linux-kernel

On Tue 07-02-17 23:32:47, Wei Yang wrote:
> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
[...]
> >Is there any reason why for_each_mem_pfn_range cannot be changed to
> >honor the given start/end pfns instead? I can imagine that a small zone
> >would see a similar pointless iterations...
> >
> 
> Hmm... No special reason, just not thought about this implementation. And
> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
> which also return 0 when there is no overlap.
> 
> BTW, I don't get your point. You wish to put the check in
> for_each_mem_pfn_range() definition?

My point was that you are handling one special case (an empty zone) but
the underlying problem is that __absent_pages_in_range might be wasting
cycles iterating over memblocks that are way outside of the given pfn
range. At least this is my understanding. If you fix that you do not
need the special case, right?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-07 15:41       ` Michal Hocko
  (?)
@ 2017-02-08 14:05       ` Wei Yang
  2017-02-08 14:39           ` Michal Hocko
  -1 siblings, 1 reply; 20+ messages in thread
From: Wei Yang @ 2017-02-08 14:05 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, akpm, vbabka, mgorman, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2674 bytes --]

On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
>On Tue 07-02-17 23:32:47, Wei Yang wrote:
>> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
>[...]
>> >Is there any reason why for_each_mem_pfn_range cannot be changed to
>> >honor the given start/end pfns instead? I can imagine that a small zone
>> >would see a similar pointless iterations...
>> >
>> 
>> Hmm... No special reason, just not thought about this implementation. And
>> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
>> which also return 0 when there is no overlap.
>> 
>> BTW, I don't get your point. You wish to put the check in
>> for_each_mem_pfn_range() definition?
>
>My point was that you are handling one special case (an empty zone) but
>the underlying problem is that __absent_pages_in_range might be wasting
>cycles iterating over memblocks that are way outside of the given pfn
>range. At least this is my understanding. If you fix that you do not
>need the special case, right?

Yep, I think this is a good suggestion. By doing do, this could save iterating
cycles in __absent_pages_in_range(). 

Hmm, the case is a little bit different in zone_absent_pages_in_node() in case
there is movable zone in this node. Even __absent_pages_in_range() returns 0,
it is not a proof that this node has no page in this zone. Which means, we
still need to go through the ZONE_MOVABLE handling part, which is a memblock
iteration too.

Let's take a look whether guard __absent_pages_in_range() internally is
necessary now.

The function itself is invoked at three places:

* numa_meminfo_cover_memory()
* zone_absent_pages_in_node()
* absent_pages_in_range()

The first one is invoked on numa_meminfo which is sanitized by
numa_cleanup_meminfo().
The second one is analysed here.

The third one is invoked at two places:
* numa_meminfo_cover_memory()
* mem_hole_size()

At the first place, it is passed with (0, max_pfn) as parameter, which I think
is not common to have max_pfn to be 0.
At the second place, the start_pfn and end_pfn is already guarded.

With all those status, currently I choose to put the check in
zone_absent_pages_in_node().

BTW, the ZONE_MOVABLE handling looks strange to me and the comment "Treat
pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages and vice versa" is
hard to understand. From the code point of view, if zone_type is ZONE_NORMAL,
each memblock region between zone_start_pfn and zone_end_pfn would be treated
as absent pages if it is not mirrored. Do you have some hint on this?

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-08 14:05       ` Wei Yang
@ 2017-02-08 14:39           ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-08 14:39 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, vbabka, mgorman, linux-mm, linux-kernel

On Wed 08-02-17 22:05:18, Wei Yang wrote:
[...]
> BTW, the ZONE_MOVABLE handling looks strange to me and the comment "Treat
> pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages and vice versa" is
> hard to understand. From the code point of view, if zone_type is ZONE_NORMAL,
> each memblock region between zone_start_pfn and zone_end_pfn would be treated
> as absent pages if it is not mirrored. Do you have some hint on this?

Not really, sorry, this area is full of awkward and subtle code when new
changes build on top of previous awkwardness/surprises. Any cleanup
would be really appreciated. That is the reason I didn't like the
initial check all that much.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-08 14:39           ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-08 14:39 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, vbabka, mgorman, linux-mm, linux-kernel

On Wed 08-02-17 22:05:18, Wei Yang wrote:
[...]
> BTW, the ZONE_MOVABLE handling looks strange to me and the comment "Treat
> pages to be ZONE_MOVABLE in ZONE_NORMAL as absent pages and vice versa" is
> hard to understand. From the code point of view, if zone_type is ZONE_NORMAL,
> each memblock region between zone_start_pfn and zone_end_pfn would be treated
> as absent pages if it is not mirrored. Do you have some hint on this?

Not really, sorry, this area is full of awkward and subtle code when new
changes build on top of previous awkwardness/surprises. Any cleanup
would be really appreciated. That is the reason I didn't like the
initial check all that much.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-07 15:41       ` Michal Hocko
  (?)
  (?)
@ 2017-02-09 13:59       ` Wei Yang
  2017-02-22  8:49           ` Michal Hocko
  -1 siblings, 1 reply; 20+ messages in thread
From: Wei Yang @ 2017-02-09 13:59 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, akpm, vbabka, mgorman, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1553 bytes --]

On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
>On Tue 07-02-17 23:32:47, Wei Yang wrote:
>> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
>[...]
>> >Is there any reason why for_each_mem_pfn_range cannot be changed to
>> >honor the given start/end pfns instead? I can imagine that a small zone
>> >would see a similar pointless iterations...
>> >
>> 
>> Hmm... No special reason, just not thought about this implementation. And
>> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
>> which also return 0 when there is no overlap.
>> 
>> BTW, I don't get your point. You wish to put the check in
>> for_each_mem_pfn_range() definition?
>
>My point was that you are handling one special case (an empty zone) but
>the underlying problem is that __absent_pages_in_range might be wasting
>cycles iterating over memblocks that are way outside of the given pfn
>range. At least this is my understanding. If you fix that you do not
>need the special case, right?
>-- 
>Michal Hocko
>SUSE Labs

> Not really, sorry, this area is full of awkward and subtle code when new
> changes build on top of previous awkwardness/surprises. Any cleanup
> would be really appreciated. That is the reason I didn't like the
> initial check all that much.

Looks my fetchmail failed to get your last reply. So I copied it here.

Yes, the change here looks not that nice, while currently this is what I can't
come up with.

Thanks for your review :-)

-- 
Wei Yang
Help you, Help me

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-09 13:59       ` Wei Yang
@ 2017-02-22  8:49           ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-22  8:49 UTC (permalink / raw)
  To: Wei Yang, Andrew Morton; +Cc: vbabka, mgorman, linux-mm, linux-kernel

On Thu 09-02-17 21:59:29, Wei Yang wrote:
> On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
> >On Tue 07-02-17 23:32:47, Wei Yang wrote:
> >> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
> >[...]
> >> >Is there any reason why for_each_mem_pfn_range cannot be changed to
> >> >honor the given start/end pfns instead? I can imagine that a small zone
> >> >would see a similar pointless iterations...
> >> >
> >> 
> >> Hmm... No special reason, just not thought about this implementation. And
> >> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
> >> which also return 0 when there is no overlap.
> >> 
> >> BTW, I don't get your point. You wish to put the check in
> >> for_each_mem_pfn_range() definition?
> >
> >My point was that you are handling one special case (an empty zone) but
> >the underlying problem is that __absent_pages_in_range might be wasting
> >cycles iterating over memblocks that are way outside of the given pfn
> >range. At least this is my understanding. If you fix that you do not
> >need the special case, right?
> >-- 
> >Michal Hocko
> >SUSE Labs
> 
> > Not really, sorry, this area is full of awkward and subtle code when new
> > changes build on top of previous awkwardness/surprises. Any cleanup
> > would be really appreciated. That is the reason I didn't like the
> > initial check all that much.
> 
> Looks my fetchmail failed to get your last reply. So I copied it here.
> 
> Yes, the change here looks not that nice, while currently this is what I can't
> come up with.

THen I will suggest dropping this patch from the mmotm tree because it
doesn't sound like a big improvement and I would encourage you or
anybody else to take a deeper look and unclutter this area to be more
readable and better maintainable.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-22  8:49           ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-22  8:49 UTC (permalink / raw)
  To: Wei Yang, Andrew Morton; +Cc: vbabka, mgorman, linux-mm, linux-kernel

On Thu 09-02-17 21:59:29, Wei Yang wrote:
> On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
> >On Tue 07-02-17 23:32:47, Wei Yang wrote:
> >> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
> >[...]
> >> >Is there any reason why for_each_mem_pfn_range cannot be changed to
> >> >honor the given start/end pfns instead? I can imagine that a small zone
> >> >would see a similar pointless iterations...
> >> >
> >> 
> >> Hmm... No special reason, just not thought about this implementation. And
> >> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
> >> which also return 0 when there is no overlap.
> >> 
> >> BTW, I don't get your point. You wish to put the check in
> >> for_each_mem_pfn_range() definition?
> >
> >My point was that you are handling one special case (an empty zone) but
> >the underlying problem is that __absent_pages_in_range might be wasting
> >cycles iterating over memblocks that are way outside of the given pfn
> >range. At least this is my understanding. If you fix that you do not
> >need the special case, right?
> >-- 
> >Michal Hocko
> >SUSE Labs
> 
> > Not really, sorry, this area is full of awkward and subtle code when new
> > changes build on top of previous awkwardness/surprises. Any cleanup
> > would be really appreciated. That is the reason I didn't like the
> > initial check all that much.
> 
> Looks my fetchmail failed to get your last reply. So I copied it here.
> 
> Yes, the change here looks not that nice, while currently this is what I can't
> come up with.

THen I will suggest dropping this patch from the mmotm tree because it
doesn't sound like a big improvement and I would encourage you or
anybody else to take a deeper look and unclutter this area to be more
readable and better maintainable.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-22  8:49           ` Michal Hocko
  (?)
@ 2017-02-22 10:51           ` Wei Yang
  2017-02-22 11:45               ` Michal Hocko
  -1 siblings, 1 reply; 20+ messages in thread
From: Wei Yang @ 2017-02-22 10:51 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, Andrew Morton, vbabka, mgorman, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2437 bytes --]

On Wed, Feb 22, 2017 at 09:49:47AM +0100, Michal Hocko wrote:
>On Thu 09-02-17 21:59:29, Wei Yang wrote:
>> On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
>> >On Tue 07-02-17 23:32:47, Wei Yang wrote:
>> >> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
>> >[...]
>> >> >Is there any reason why for_each_mem_pfn_range cannot be changed to
>> >> >honor the given start/end pfns instead? I can imagine that a small zone
>> >> >would see a similar pointless iterations...
>> >> >
>> >> 
>> >> Hmm... No special reason, just not thought about this implementation. And
>> >> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
>> >> which also return 0 when there is no overlap.
>> >> 
>> >> BTW, I don't get your point. You wish to put the check in
>> >> for_each_mem_pfn_range() definition?
>> >
>> >My point was that you are handling one special case (an empty zone) but
>> >the underlying problem is that __absent_pages_in_range might be wasting
>> >cycles iterating over memblocks that are way outside of the given pfn
>> >range. At least this is my understanding. If you fix that you do not
>> >need the special case, right?
>> >-- 
>> >Michal Hocko
>> >SUSE Labs
>> 
>> > Not really, sorry, this area is full of awkward and subtle code when new
>> > changes build on top of previous awkwardness/surprises. Any cleanup
>> > would be really appreciated. That is the reason I didn't like the
>> > initial check all that much.
>> 
>> Looks my fetchmail failed to get your last reply. So I copied it here.
>> 
>> Yes, the change here looks not that nice, while currently this is what I can't
>> come up with.
>
>THen I will suggest dropping this patch from the mmotm tree because it
>doesn't sound like a big improvement and I would encourage you or
>anybody else to take a deeper look and unclutter this area to be more
>readable and better maintainable.

Hi, Michal

I don't get your point, which part of the code makes you feel uncomfortable?

The behavior here is similar in zone_spanned_pages_in_node(), in which it also
checks whether this node has pages within the zone's required range. The
improvement may not that much, while the logic here is clear. If the node has
no page in the zone's required range, it is not necessary to continue the
calculation.

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-22 10:51           ` Wei Yang
@ 2017-02-22 11:45               ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-22 11:45 UTC (permalink / raw)
  To: Wei Yang; +Cc: Andrew Morton, vbabka, mgorman, linux-mm, linux-kernel

On Wed 22-02-17 18:51:31, Wei Yang wrote:
> On Wed, Feb 22, 2017 at 09:49:47AM +0100, Michal Hocko wrote:
> >On Thu 09-02-17 21:59:29, Wei Yang wrote:
> >> On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
> >> >On Tue 07-02-17 23:32:47, Wei Yang wrote:
> >> >> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
> >> >[...]
> >> >> >Is there any reason why for_each_mem_pfn_range cannot be changed to
> >> >> >honor the given start/end pfns instead? I can imagine that a small zone
> >> >> >would see a similar pointless iterations...
> >> >> >
> >> >> 
> >> >> Hmm... No special reason, just not thought about this implementation. And
> >> >> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
> >> >> which also return 0 when there is no overlap.
> >> >> 
> >> >> BTW, I don't get your point. You wish to put the check in
> >> >> for_each_mem_pfn_range() definition?
> >> >
> >> >My point was that you are handling one special case (an empty zone) but
> >> >the underlying problem is that __absent_pages_in_range might be wasting
> >> >cycles iterating over memblocks that are way outside of the given pfn
> >> >range. At least this is my understanding. If you fix that you do not
> >> >need the special case, right?
> >> >-- 
> >> >Michal Hocko
> >> >SUSE Labs
> >> 
> >> > Not really, sorry, this area is full of awkward and subtle code when new
> >> > changes build on top of previous awkwardness/surprises. Any cleanup
> >> > would be really appreciated. That is the reason I didn't like the
> >> > initial check all that much.
> >> 
> >> Looks my fetchmail failed to get your last reply. So I copied it here.
> >> 
> >> Yes, the change here looks not that nice, while currently this is what I can't
> >> come up with.
> >
> >THen I will suggest dropping this patch from the mmotm tree because it
> >doesn't sound like a big improvement and I would encourage you or
> >anybody else to take a deeper look and unclutter this area to be more
> >readable and better maintainable.
> 
> Hi, Michal
> 
> I don't get your point, which part of the code makes you feel uncomfortable?

It adds a check which would better be handled at a different level. I've
tried to explain what are my concerns about quick&dirty solutions in
this area. I would agree to add the check as a immediate workaround if
this had some measurable benefits but the changelog doesn't mention
any. So I do not really see this as an improvement in the end. If we
want to address the suboptimal code, let's do it properly rather than
spewing ifs all over the code.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
@ 2017-02-22 11:45               ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-02-22 11:45 UTC (permalink / raw)
  To: Wei Yang; +Cc: Andrew Morton, vbabka, mgorman, linux-mm, linux-kernel

On Wed 22-02-17 18:51:31, Wei Yang wrote:
> On Wed, Feb 22, 2017 at 09:49:47AM +0100, Michal Hocko wrote:
> >On Thu 09-02-17 21:59:29, Wei Yang wrote:
> >> On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
> >> >On Tue 07-02-17 23:32:47, Wei Yang wrote:
> >> >> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
> >> >[...]
> >> >> >Is there any reason why for_each_mem_pfn_range cannot be changed to
> >> >> >honor the given start/end pfns instead? I can imagine that a small zone
> >> >> >would see a similar pointless iterations...
> >> >> >
> >> >> 
> >> >> Hmm... No special reason, just not thought about this implementation. And
> >> >> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
> >> >> which also return 0 when there is no overlap.
> >> >> 
> >> >> BTW, I don't get your point. You wish to put the check in
> >> >> for_each_mem_pfn_range() definition?
> >> >
> >> >My point was that you are handling one special case (an empty zone) but
> >> >the underlying problem is that __absent_pages_in_range might be wasting
> >> >cycles iterating over memblocks that are way outside of the given pfn
> >> >range. At least this is my understanding. If you fix that you do not
> >> >need the special case, right?
> >> >-- 
> >> >Michal Hocko
> >> >SUSE Labs
> >> 
> >> > Not really, sorry, this area is full of awkward and subtle code when new
> >> > changes build on top of previous awkwardness/surprises. Any cleanup
> >> > would be really appreciated. That is the reason I didn't like the
> >> > initial check all that much.
> >> 
> >> Looks my fetchmail failed to get your last reply. So I copied it here.
> >> 
> >> Yes, the change here looks not that nice, while currently this is what I can't
> >> come up with.
> >
> >THen I will suggest dropping this patch from the mmotm tree because it
> >doesn't sound like a big improvement and I would encourage you or
> >anybody else to take a deeper look and unclutter this area to be more
> >readable and better maintainable.
> 
> Hi, Michal
> 
> I don't get your point, which part of the code makes you feel uncomfortable?

It adds a check which would better be handled at a different level. I've
tried to explain what are my concerns about quick&dirty solutions in
this area. I would agree to add the check as a immediate workaround if
this had some measurable benefits but the changelog doesn't mention
any. So I do not really see this as an improvement in the end. If we
want to address the suboptimal code, let's do it properly rather than
spewing ifs all over the code.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone
  2017-02-22 11:45               ` Michal Hocko
  (?)
@ 2017-02-22 14:18               ` Wei Yang
  -1 siblings, 0 replies; 20+ messages in thread
From: Wei Yang @ 2017-02-22 14:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, Andrew Morton, vbabka, mgorman, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4977 bytes --]

On Wed, Feb 22, 2017 at 12:45:22PM +0100, Michal Hocko wrote:
>On Wed 22-02-17 18:51:31, Wei Yang wrote:
>> On Wed, Feb 22, 2017 at 09:49:47AM +0100, Michal Hocko wrote:
>> >On Thu 09-02-17 21:59:29, Wei Yang wrote:
>> >> On Tue, Feb 07, 2017 at 04:41:21PM +0100, Michal Hocko wrote:
>> >> >On Tue 07-02-17 23:32:47, Wei Yang wrote:
>> >> >> On Tue, Feb 07, 2017 at 10:45:57AM +0100, Michal Hocko wrote:
>> >> >[...]
>> >> >> >Is there any reason why for_each_mem_pfn_range cannot be changed to
>> >> >> >honor the given start/end pfns instead? I can imagine that a small zone
>> >> >> >would see a similar pointless iterations...
>> >> >> >
>> >> >> 
>> >> >> Hmm... No special reason, just not thought about this implementation. And
>> >> >> actually I just do the similar thing as in zone_spanned_pages_in_node(), in
>> >> >> which also return 0 when there is no overlap.
>> >> >> 
>> >> >> BTW, I don't get your point. You wish to put the check in
>> >> >> for_each_mem_pfn_range() definition?
>> >> >
>> >> >My point was that you are handling one special case (an empty zone) but
>> >> >the underlying problem is that __absent_pages_in_range might be wasting
>> >> >cycles iterating over memblocks that are way outside of the given pfn
>> >> >range. At least this is my understanding. If you fix that you do not
>> >> >need the special case, right?
>> >> >-- 
>> >> >Michal Hocko
>> >> >SUSE Labs
>> >> 
>> >> > Not really, sorry, this area is full of awkward and subtle code when new
>> >> > changes build on top of previous awkwardness/surprises. Any cleanup
>> >> > would be really appreciated. That is the reason I didn't like the
>> >> > initial check all that much.
>> >> 
>> >> Looks my fetchmail failed to get your last reply. So I copied it here.
>> >> 
>> >> Yes, the change here looks not that nice, while currently this is what I can't
>> >> come up with.
>> >
>> >THen I will suggest dropping this patch from the mmotm tree because it
>> >doesn't sound like a big improvement and I would encourage you or
>> >anybody else to take a deeper look and unclutter this area to be more
>> >readable and better maintainable.
>> 
>> Hi, Michal
>> 
>> I don't get your point, which part of the code makes you feel uncomfortable?
>
>It adds a check which would better be handled at a different level. I've
>tried to explain what are my concerns about quick&dirty solutions in
>this area. I would agree to add the check as a immediate workaround if
>this had some measurable benefits but the changelog doesn't mention
>any. So I do not really see this as an improvement in the end. If we
>want to address the suboptimal code, let's do it properly rather than
>spewing ifs all over the code.

Yep, I agree that to pursuit a better solution in the project is our ultimate
goal.

First let me explain why it is not enough to add it in the "different level" .
As you mentioned, it would be better to add this check in
__absent_pages_in_range(). Yes, I agree this would be proper place to add
this check at first sight. While __absent_pages_in_range() return 0 is not a
guarantee the ZONE_MOVEABLE handling would get 0 absent_page . So if we add the
check in __absent_pages_in_range(), we still need to add a check before
ZONE_MOVEABLE handling to avoid the iteration in this loop.

Here is a code snippet, I could come up with your suggestion.

	zone_absent_pages_in_node()
		
		__absent_pages_in_range()
			check zone and node overlap

		check zone and node overlap
		handle ZONE_MOVEABLE

Then let me explain why it is not necessary to add the check in
__absent_pages_in_range() now. Hmm... this looks a very good place to add this
check, since it would guard all cases to avoid these invalid
iterations. While in current implementation zone_absent_pages_in_node() is the
only place where the (start_pfn == end_pfn) could happen.

The __absent_pages_in_range() is invoked at three places:

* numa_meminfo_cover_memory()
* zone_absent_pages_in_node()
* absent_pages_in_range()

And looks the other two would have no chance to pass two equal parameters,
which falls into the check. So it looks not necessary to add a check here for
more general cases. The detailed explanation is posted in this mail,
https://lkml.org/lkml/2017/2/8/337

Based on these two analysis, I choose to put the check outside
__absent_pages_in_range(), which makes the code look like this:

	zone_absent_pages_in_node()
		
		check zone and node overlap
		__absent_pages_in_range()

		handle ZONE_MOVEABLE

So only one check instead of two.

Last but not the least, yes, I agree with you that this check is better to
be put in a different level while it may not as good as we think for current
implementation.

Glad to discuss with you about these details. Not sure which one you like or
you don't like any of them?

>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-02-22 14:18 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-06 15:43 [PATCH] mm/page_alloc: return 0 in case this node has no page within the zone Wei Yang
2017-02-06 15:43 ` Wei Yang
2017-02-06 23:29 ` Andrew Morton
2017-02-06 23:29   ` Andrew Morton
2017-02-07 15:07   ` Wei Yang
2017-02-07  9:45 ` Michal Hocko
2017-02-07  9:45   ` Michal Hocko
2017-02-07 15:32   ` Wei Yang
2017-02-07 15:41     ` Michal Hocko
2017-02-07 15:41       ` Michal Hocko
2017-02-08 14:05       ` Wei Yang
2017-02-08 14:39         ` Michal Hocko
2017-02-08 14:39           ` Michal Hocko
2017-02-09 13:59       ` Wei Yang
2017-02-22  8:49         ` Michal Hocko
2017-02-22  8:49           ` Michal Hocko
2017-02-22 10:51           ` Wei Yang
2017-02-22 11:45             ` Michal Hocko
2017-02-22 11:45               ` Michal Hocko
2017-02-22 14:18               ` Wei Yang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.