* [Patch v2 1/2] mm/page_alloc: adding the same penalty is enough to get round-robin order
@ 2022-04-08  2:59 Wei Yang
  2022-04-08  2:59 ` [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD Wei Yang
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Yang @ 2022-04-08  2:59 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, Wei Yang, Vlastimil Babka, David Hildenbrand, Oscar Salvador

To make the node order round-robin within the same distance group, we
add a penalty to the first node we pick in each round.

To get a round-robin order within the same distance group, we don't
need to decrease the penalty each round, since:

  * find_next_best_node() always iterates nodes in the same order
  * distance matters more than penalty in find_next_best_node()
  * among nodes with the same distance, the first one is picked

So it is fine to add the same constant penalty each time we pick the
first node of a distance group.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
CC: Vlastimil Babka <vbabka@suse.cz>
CC: David Hildenbrand <david@redhat.com>
CC: Oscar Salvador <osalvador@suse.de>
---
v2: adjust constant penalty to 1
---
 mm/page_alloc.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7ccd9aced0ca..86b6573fbeb5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6283,13 +6283,12 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
 static void build_zonelists(pg_data_t *pgdat)
 {
 	static int node_order[MAX_NUMNODES];
-	int node, load, nr_nodes = 0;
+	int node, nr_nodes = 0;
 	nodemask_t used_mask = NODE_MASK_NONE;
 	int local_node, prev_node;
 
 	/* NUMA-aware ordering of nodes */
 	local_node = pgdat->node_id;
-	load = nr_online_nodes;
 	prev_node = local_node;
 
 	memset(node_order, 0, sizeof(node_order));
@@ -6301,11 +6300,10 @@ static void build_zonelists(pg_data_t *pgdat)
 		 */
 		if (node_distance(local_node, node) !=
 		    node_distance(local_node, prev_node))
-			node_load[node] += load;
+			node_load[node] += 1;
 
 		node_order[nr_nodes++] = node;
 		prev_node = node;
-		load--;
 	}
 
 	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
-- 
2.33.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD
  2022-04-08  2:59 [Patch v2 1/2] mm/page_alloc: adding the same penalty is enough to get round-robin order Wei Yang
@ 2022-04-08  2:59 ` Wei Yang
  2022-04-08  8:09   ` David Hildenbrand
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Yang @ 2022-04-08  2:59 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, Wei Yang, Vlastimil Babka, David Hildenbrand, Oscar Salvador

Since we now increase the node penalty by a constant of 1, it is no
longer necessary to multiply by MAX_NODE_LOAD for the preference
calculation.

This patch also removes the definition.

[vbabka@suse.cz: suggested]

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
CC: Vlastimil Babka <vbabka@suse.cz>
CC: David Hildenbrand <david@redhat.com>
CC: Oscar Salvador <osalvador@suse.de>
---
 mm/page_alloc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 86b6573fbeb5..ca6a127bbc26 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6170,7 +6170,6 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
 }
 
 
-#define MAX_NODE_LOAD (nr_online_nodes)
 static int node_load[MAX_NUMNODES];
 
 /**
@@ -6217,7 +6216,7 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
 			val += PENALTY_FOR_NODE_WITH_CPUS;
 
 		/* Slight preference for less loaded node */
-		val *= (MAX_NODE_LOAD*MAX_NUMNODES);
+		val *= MAX_NUMNODES;
 		val += node_load[n];
 
 		if (val < min_val) {
-- 
2.33.1




* Re: [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD
  2022-04-08  2:59 ` [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD Wei Yang
@ 2022-04-08  8:09   ` David Hildenbrand
  2022-04-08 23:07     ` Wei Yang
  0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2022-04-08  8:09 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, Vlastimil Babka, Oscar Salvador

On 08.04.22 04:59, Wei Yang wrote:
> Since we now increase the node penalty by a constant of 1, it is no
> longer necessary to multiply by MAX_NODE_LOAD for the preference
> calculation.
> 
> This patch also removes the definition.
> 
> [vbabka@suse.cz: suggested]
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> CC: Vlastimil Babka <vbabka@suse.cz>
> CC: David Hildenbrand <david@redhat.com>
> CC: Oscar Salvador <osalvador@suse.de>
> ---
>  mm/page_alloc.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 86b6573fbeb5..ca6a127bbc26 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6170,7 +6170,6 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
>  }
>  
>  
> -#define MAX_NODE_LOAD (nr_online_nodes)
>  static int node_load[MAX_NUMNODES];
>  
>  /**
> @@ -6217,7 +6216,7 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
>  			val += PENALTY_FOR_NODE_WITH_CPUS;
>  
>  		/* Slight preference for less loaded node */
> -		val *= (MAX_NODE_LOAD*MAX_NUMNODES);
> +		val *= MAX_NUMNODES;
>  		val += node_load[n];
>  
>  		if (val < min_val) {

I feel like this should be squashed into the previous patch. It has the
same effect of making this code independent of nr_online_nodes, and I
had to scratch my head a couple of times over why the change in patch
#1 is fine with this multiplication remaining in place.


Having said that, I consider this code unnecessarily over-complicated
at first sight. Removing some of the magic is most certainly very
welcome.

The semantics of the global variable node_load[] remain mostly
mysterious to me.

-- 
Thanks,

David / dhildenb




* Re: [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD
  2022-04-08  8:09   ` David Hildenbrand
@ 2022-04-08 23:07     ` Wei Yang
  2022-04-11 10:52       ` Vlastimil Babka
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Yang @ 2022-04-08 23:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Wei Yang, akpm, linux-mm, Vlastimil Babka, Oscar Salvador

On Fri, Apr 08, 2022 at 10:09:48AM +0200, David Hildenbrand wrote:
>On 08.04.22 04:59, Wei Yang wrote:
>> Since we now increase the node penalty by a constant of 1, it is no
>> longer necessary to multiply by MAX_NODE_LOAD for the preference
>> calculation.
>> 
>> This patch also removes the definition.
>> 
>> [vbabka@suse.cz: suggested]
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> CC: Vlastimil Babka <vbabka@suse.cz>
>> CC: David Hildenbrand <david@redhat.com>
>> CC: Oscar Salvador <osalvador@suse.de>
>> ---
>>  mm/page_alloc.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 86b6573fbeb5..ca6a127bbc26 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6170,7 +6170,6 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
>>  }
>>  
>>  
>> -#define MAX_NODE_LOAD (nr_online_nodes)
>>  static int node_load[MAX_NUMNODES];
>>  
>>  /**
>> @@ -6217,7 +6216,7 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
>>  			val += PENALTY_FOR_NODE_WITH_CPUS;
>>  
>>  		/* Slight preference for less loaded node */
>> -		val *= (MAX_NODE_LOAD*MAX_NUMNODES);
>> +		val *= MAX_NUMNODES;
>>  		val += node_load[n];
>>  
>>  		if (val < min_val) {
>
>I feel like this should be squashed into the previous patch. It has the
>same effect of making this code independent of nr_online_nodes, and I
>had to scratch my head a couple of times over why the change in patch
>#1 is fine with this multiplication remaining in place.
>
>
>Having said that, I consider this code unnecessarily over-complicated
>at first sight. Removing some of the magic is most certainly very
>welcome.
>
>The semantics of the global variable node_load[] remain mostly
>mysterious to me.
>

So the suggestion is a v3 with #1 and #2 squashed?

>-- 
>Thanks,
>
>David / dhildenb

-- 
Wei Yang
Help you, Help me



* Re: [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD
  2022-04-08 23:07     ` Wei Yang
@ 2022-04-11 10:52       ` Vlastimil Babka
  2022-04-12  0:02         ` Wei Yang
  2022-04-12  7:53         ` David Hildenbrand
  0 siblings, 2 replies; 7+ messages in thread
From: Vlastimil Babka @ 2022-04-11 10:52 UTC (permalink / raw)
  To: Wei Yang, David Hildenbrand; +Cc: akpm, linux-mm, Oscar Salvador

On 4/9/22 01:07, Wei Yang wrote:
> On Fri, Apr 08, 2022 at 10:09:48AM +0200, David Hildenbrand wrote:
>>On 08.04.22 04:59, Wei Yang wrote:
>>> Since we now increase the node penalty by a constant of 1, it is no
>>> longer necessary to multiply by MAX_NODE_LOAD for the preference
>>> calculation.
>>> 
>>> This patch also removes the definition.
>>> 
>>> [vbabka@suse.cz: suggested]
>>> 
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> CC: Vlastimil Babka <vbabka@suse.cz>
>>> CC: David Hildenbrand <david@redhat.com>
>>> CC: Oscar Salvador <osalvador@suse.de>
>>> ---
>>>  mm/page_alloc.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>> 
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 86b6573fbeb5..ca6a127bbc26 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -6170,7 +6170,6 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
>>>  }
>>>  
>>>  
>>> -#define MAX_NODE_LOAD (nr_online_nodes)
>>>  static int node_load[MAX_NUMNODES];
>>>  
>>>  /**
>>> @@ -6217,7 +6216,7 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
>>>  			val += PENALTY_FOR_NODE_WITH_CPUS;
>>>  
>>>  		/* Slight preference for less loaded node */
>>> -		val *= (MAX_NODE_LOAD*MAX_NUMNODES);
>>> +		val *= MAX_NUMNODES;
>>>  		val += node_load[n];
>>>  
>>>  		if (val < min_val) {
>>
>>I feel like this should be squashed into the previous patch. It has the
>>same effect of making this code independent of nr_online_nodes, and I
>>had to scratch my head a couple of times over why the change in patch
>>#1 is fine with this multiplication remaining in place.
>>
>>
>>Having said that, I consider this code unnecessarily over-complicated
>>at first sight. Removing some of the magic is most certainly very
>>welcome.
>>
>>The semantics of the global variable node_load[] remain mostly
>>mysterious to me.

Looks like after these patches, it would be "how many times was this node
picked as the first fallback out of nodes with the same distance"?

>>
> 
> So the suggestion is a v3 with #1 and #2 squashed?

Yes, and I agree with the suggestion.

>>-- 
>>Thanks,
>>
>>David / dhildenb
> 




* Re: [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD
  2022-04-11 10:52       ` Vlastimil Babka
@ 2022-04-12  0:02         ` Wei Yang
  2022-04-12  7:53         ` David Hildenbrand
  1 sibling, 0 replies; 7+ messages in thread
From: Wei Yang @ 2022-04-12  0:02 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Wei Yang, David Hildenbrand, akpm, linux-mm, Oscar Salvador

On Mon, Apr 11, 2022 at 12:52:33PM +0200, Vlastimil Babka wrote:
>On 4/9/22 01:07, Wei Yang wrote:
>> On Fri, Apr 08, 2022 at 10:09:48AM +0200, David Hildenbrand wrote:
>>>On 08.04.22 04:59, Wei Yang wrote:
>>>> Since we now increase the node penalty by a constant of 1, it is no
>>>> longer necessary to multiply by MAX_NODE_LOAD for the preference
>>>> calculation.
>>>> 
>>>> This patch also removes the definition.
>>>> 
>>>> [vbabka@suse.cz: suggested]
>>>> 
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> CC: Vlastimil Babka <vbabka@suse.cz>
>>>> CC: David Hildenbrand <david@redhat.com>
>>>> CC: Oscar Salvador <osalvador@suse.de>
>>>> ---
>>>>  mm/page_alloc.c | 3 +--
>>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>> 
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 86b6573fbeb5..ca6a127bbc26 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -6170,7 +6170,6 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
>>>>  }
>>>>  
>>>>  
>>>> -#define MAX_NODE_LOAD (nr_online_nodes)
>>>>  static int node_load[MAX_NUMNODES];
>>>>  
>>>>  /**
>>>> @@ -6217,7 +6216,7 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
>>>>  			val += PENALTY_FOR_NODE_WITH_CPUS;
>>>>  
>>>>  		/* Slight preference for less loaded node */
>>>> -		val *= (MAX_NODE_LOAD*MAX_NUMNODES);
>>>> +		val *= MAX_NUMNODES;
>>>>  		val += node_load[n];
>>>>  
>>>>  		if (val < min_val) {
>>>
>>>I feel like this should be squashed into the previous patch. It has the
>>>same effect of making this code independent of nr_online_nodes, and I
>>>had to scratch my head a couple of times over why the change in patch
>>>#1 is fine with this multiplication remaining in place.
>>>
>>>
>>>Having said that, I consider this code unnecessarily over-complicated
>>>at first sight. Removing some of the magic is most certainly very
>>>welcome.
>>>
>>>The semantics of the global variable node_load[] remain mostly
>>>mysterious to me.
>
>Looks like after this patch(es), it would be "how many times was this node
>picked as the first fallback out of nodes with the same distance"?
>
>>>
>> 
>> So the suggestion is a v3 with #1 and #2 squashed?
>
>Yes, and I agree with the suggestion.

No problem.

>
>>>-- 
>>>Thanks,
>>>
>>>David / dhildenb
>> 

-- 
Wei Yang
Help you, Help me



* Re: [Patch v2 2/2] mm/page_alloc: not necessary to multiply MAX_NODE_LOAD
  2022-04-11 10:52       ` Vlastimil Babka
  2022-04-12  0:02         ` Wei Yang
@ 2022-04-12  7:53         ` David Hildenbrand
  1 sibling, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2022-04-12  7:53 UTC (permalink / raw)
  To: Vlastimil Babka, Wei Yang; +Cc: akpm, linux-mm, Oscar Salvador

On 11.04.22 12:52, Vlastimil Babka wrote:
> On 4/9/22 01:07, Wei Yang wrote:
>> On Fri, Apr 08, 2022 at 10:09:48AM +0200, David Hildenbrand wrote:
>>> On 08.04.22 04:59, Wei Yang wrote:
>>>> Since we now increase the node penalty by a constant of 1, it is no
>>>> longer necessary to multiply by MAX_NODE_LOAD for the preference
>>>> calculation.
>>>>
>>>> This patch also removes the definition.
>>>>
>>>> [vbabka@suse.cz: suggested]
>>>>
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> CC: Vlastimil Babka <vbabka@suse.cz>
>>>> CC: David Hildenbrand <david@redhat.com>
>>>> CC: Oscar Salvador <osalvador@suse.de>
>>>> ---
>>>>  mm/page_alloc.c | 3 +--
>>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 86b6573fbeb5..ca6a127bbc26 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -6170,7 +6170,6 @@ int numa_zonelist_order_handler(struct ctl_table *table, int write,
>>>>  }
>>>>  
>>>>  
>>>> -#define MAX_NODE_LOAD (nr_online_nodes)
>>>>  static int node_load[MAX_NUMNODES];
>>>>  
>>>>  /**
>>>> @@ -6217,7 +6216,7 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
>>>>  			val += PENALTY_FOR_NODE_WITH_CPUS;
>>>>  
>>>>  		/* Slight preference for less loaded node */
>>>> -		val *= (MAX_NODE_LOAD*MAX_NUMNODES);
>>>> +		val *= MAX_NUMNODES;
>>>>  		val += node_load[n];
>>>>  
>>>>  		if (val < min_val) {
>>>
>>> I feel like this should be squashed into the previous patch. It has the
>>> same effect of making this code independent of nr_online_nodes, and I
>>> had to scratch my head a couple of times over why the change in patch
>>> #1 is fine with this multiplication remaining in place.
>>>
>>>
>>> Having said that, I consider this code unnecessarily over-complicated
>>> at first sight. Removing some of the magic is most certainly very
>>> welcome.
>>>
>>> The semantics of the global variable node_load[] remain mostly
>>> mysterious to me.
> 
> Looks like after these patches, it would be "how many times was this node
> picked as the first fallback out of nodes with the same distance"?

Yeah, that makes sense. I'd appreciate it if we just stopped using a
global variable here and passed it as a parameter. But that's obviously
an independent cleanup.


-- 
Thanks,

David / dhildenb



