All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch v2 1/2] mm/page_alloc.c: use NODE_MASK_NONE define used_mask
@ 2020-03-27 22:01 Wei Yang
  2020-03-27 22:01 ` [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero Wei Yang
  0 siblings, 1 reply; 14+ messages in thread
From: Wei Yang @ 2020-03-27 22:01 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, linux-kernel, jgg, david, Wei Yang

For all 0 nodemask_t, we have already define macro NODE_MASK_NONE.
Leverage this instead of clear it at run time.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

---
v2: use NODE_MASK_NONE as suggested by David Hildenbrand
---
 mm/page_alloc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ef790dfad6aa..dfcf2682ed40 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5587,14 +5587,13 @@ static void build_zonelists(pg_data_t *pgdat)
 {
 	static int node_order[MAX_NUMNODES];
 	int node, load, nr_nodes = 0;
-	nodemask_t used_mask;
+	nodemask_t used_mask = NODE_MASK_NONE;
 	int local_node, prev_node;
 
 	/* NUMA-aware ordering of nodes */
 	local_node = pgdat->node_id;
 	load = nr_online_nodes;
 	prev_node = local_node;
-	nodes_clear(used_mask);
 
 	memset(node_order, 0, sizeof(node_order));
 	while ((node = find_next_best_node(local_node, &used_mask)) >= 0) {
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-27 22:01 [Patch v2 1/2] mm/page_alloc.c: use NODE_MASK_NONE define used_mask Wei Yang
@ 2020-03-27 22:01 ` Wei Yang
  2020-03-27 22:37   ` John Hubbard
  0 siblings, 1 reply; 14+ messages in thread
From: Wei Yang @ 2020-03-27 22:01 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, linux-kernel, jgg, david, Wei Yang

Since we always clear node_order before getting it, we can leverage
compiler to do this instead of at run time.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/page_alloc.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index dfcf2682ed40..49dd1f25c000 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
 
 static void build_zonelists(pg_data_t *pgdat)
 {
-	static int node_order[MAX_NUMNODES];
+	static int node_order[MAX_NUMNODES] = {0};
 	int node, load, nr_nodes = 0;
 	nodemask_t used_mask = NODE_MASK_NONE;
 	int local_node, prev_node;
@@ -5595,7 +5595,6 @@ static void build_zonelists(pg_data_t *pgdat)
 	load = nr_online_nodes;
 	prev_node = local_node;
 
-	memset(node_order, 0, sizeof(node_order));
 	while ((node = find_next_best_node(local_node, &used_mask)) >= 0) {
 		/*
 		 * We don't want to pressure a particular node.
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-27 22:01 ` [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero Wei Yang
@ 2020-03-27 22:37   ` John Hubbard
  2020-03-27 23:18     ` Jason Gunthorpe
  2020-03-28  0:26     ` Wei Yang
  0 siblings, 2 replies; 14+ messages in thread
From: John Hubbard @ 2020-03-27 22:37 UTC (permalink / raw)
  To: Wei Yang, akpm; +Cc: linux-mm, linux-kernel, jgg, david

On 3/27/20 3:01 PM, Wei Yang wrote:
> Since we always clear node_order before getting it, we can leverage
> compiler to do this instead of at run time.
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>   mm/page_alloc.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index dfcf2682ed40..49dd1f25c000 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
>   
>   static void build_zonelists(pg_data_t *pgdat)
>   {
> -	static int node_order[MAX_NUMNODES];
> +	static int node_order[MAX_NUMNODES] = {0};


Looks wrong: now the single instance of node_order is initialized just once by
the compiler. And that means that only the first caller of this function
gets a zeroed node_order array...


>   	int node, load, nr_nodes = 0;
>   	nodemask_t used_mask = NODE_MASK_NONE;
>   	int local_node, prev_node;
> @@ -5595,7 +5595,6 @@ static void build_zonelists(pg_data_t *pgdat)
>   	load = nr_online_nodes;
>   	prev_node = local_node;
>   
> -	memset(node_order, 0, sizeof(node_order));

...and all subsequent callers are left with whatever debris is remaining in
node_order. So this is not good.

The reason that memset() was used here, is that there aren't many other ways
to get node_order zeroed, given that it is a statically defined variable.


>   	while ((node = find_next_best_node(local_node, &used_mask)) >= 0) {
>   		/*
>   		 * We don't want to pressure a particular node.
> 



thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-27 22:37   ` John Hubbard
@ 2020-03-27 23:18     ` Jason Gunthorpe
  2020-03-28  0:27       ` Wei Yang
  2020-03-28  0:26     ` Wei Yang
  1 sibling, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2020-03-27 23:18 UTC (permalink / raw)
  To: John Hubbard; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, david

On Fri, Mar 27, 2020 at 03:37:57PM -0700, John Hubbard wrote:
> On 3/27/20 3:01 PM, Wei Yang wrote:
> > Since we always clear node_order before getting it, we can leverage
> > compiler to do this instead of at run time.
> > 
> > Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >   mm/page_alloc.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index dfcf2682ed40..49dd1f25c000 100644
> > +++ b/mm/page_alloc.c
> > @@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
> >   static void build_zonelists(pg_data_t *pgdat)
> >   {
> > -	static int node_order[MAX_NUMNODES];
> > +	static int node_order[MAX_NUMNODES] = {0};
> 
> 
> Looks wrong: now the single instance of node_order is initialized just once by
> the compiler. And that means that only the first caller of this function
> gets a zeroed node_order array...

It is also redundant, all static data is 0 initialized in Linux and
should not be explicitly initialized so it can remain in .bss

> > @@ -5595,7 +5595,6 @@ static void build_zonelists(pg_data_t *pgdat)
> >   	load = nr_online_nodes;
> >   	prev_node = local_node;
> > -	memset(node_order, 0, sizeof(node_order));
> 
> ...and all subsequent callers are left with whatever debris is remaining in
> node_order. So this is not good.

Indeed

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-27 22:37   ` John Hubbard
  2020-03-27 23:18     ` Jason Gunthorpe
@ 2020-03-28  0:26     ` Wei Yang
  2020-03-28  0:51       ` Baoquan He
  2020-03-28  0:59       ` John Hubbard
  1 sibling, 2 replies; 14+ messages in thread
From: Wei Yang @ 2020-03-28  0:26 UTC (permalink / raw)
  To: John Hubbard; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, jgg, david

On Fri, Mar 27, 2020 at 03:37:57PM -0700, John Hubbard wrote:
>On 3/27/20 3:01 PM, Wei Yang wrote:
>> Since we always clear node_order before getting it, we can leverage
>> compiler to do this instead of at run time.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>   mm/page_alloc.c | 3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index dfcf2682ed40..49dd1f25c000 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
>>   static void build_zonelists(pg_data_t *pgdat)
>>   {
>> -	static int node_order[MAX_NUMNODES];
>> +	static int node_order[MAX_NUMNODES] = {0};
>
>
>Looks wrong: now the single instance of node_order is initialized just once by
>the compiler. And that means that only the first caller of this function
>gets a zeroed node_order array...
>

What a shame on me. You are right, I miss the static word. 

Well, then I am curious about why we want to define it as static. Each time we
call this function, node_order would be set to 0 and find_next_best_node()
would sort a proper value into it. I don't see the reason to reserve it in a
global area and be used next time.

My suggestion is to remove the static and define it {0} instead of memset
every time. Is my understanding correct here?

>
>>   	int node, load, nr_nodes = 0;
>>   	nodemask_t used_mask = NODE_MASK_NONE;
>>   	int local_node, prev_node;
>> @@ -5595,7 +5595,6 @@ static void build_zonelists(pg_data_t *pgdat)
>>   	load = nr_online_nodes;
>>   	prev_node = local_node;
>> -	memset(node_order, 0, sizeof(node_order));
>
>...and all subsequent callers are left with whatever debris is remaining in
>node_order. So this is not good.
>
>The reason that memset() was used here, is that there aren't many other ways
>to get node_order zeroed, given that it is a statically defined variable.
>
>
>>   	while ((node = find_next_best_node(local_node, &used_mask)) >= 0) {
>>   		/*
>>   		 * We don't want to pressure a particular node.
>> 
>
>
>
>thanks,
>-- 
>John Hubbard
>NVIDIA

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-27 23:18     ` Jason Gunthorpe
@ 2020-03-28  0:27       ` Wei Yang
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Yang @ 2020-03-28  0:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: John Hubbard, Wei Yang, akpm, linux-mm, linux-kernel, david

On Fri, Mar 27, 2020 at 08:18:20PM -0300, Jason Gunthorpe wrote:
>On Fri, Mar 27, 2020 at 03:37:57PM -0700, John Hubbard wrote:
>> On 3/27/20 3:01 PM, Wei Yang wrote:
>> > Since we always clear node_order before getting it, we can leverage
>> > compiler to do this instead of at run time.
>> > 
>> > Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> >   mm/page_alloc.c | 3 +--
>> >   1 file changed, 1 insertion(+), 2 deletions(-)
>> > 
>> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> > index dfcf2682ed40..49dd1f25c000 100644
>> > +++ b/mm/page_alloc.c
>> > @@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
>> >   static void build_zonelists(pg_data_t *pgdat)
>> >   {
>> > -	static int node_order[MAX_NUMNODES];
>> > +	static int node_order[MAX_NUMNODES] = {0};
>> 
>> 
>> Looks wrong: now the single instance of node_order is initialized just once by
>> the compiler. And that means that only the first caller of this function
>> gets a zeroed node_order array...
>
>It is also redundant, all static data is 0 initialized in Linux and
>should not be explicitly initialized so it can remain in .bss
>

Yeah, you are right. I missed the static word.

>> > @@ -5595,7 +5595,6 @@ static void build_zonelists(pg_data_t *pgdat)
>> >   	load = nr_online_nodes;
>> >   	prev_node = local_node;
>> > -	memset(node_order, 0, sizeof(node_order));
>> 
>> ...and all subsequent callers are left with whatever debris is remaining in
>> node_order. So this is not good.
>
>Indeed
>
>Jason

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-28  0:26     ` Wei Yang
@ 2020-03-28  0:51       ` Baoquan He
  2020-03-28  0:59       ` John Hubbard
  1 sibling, 0 replies; 14+ messages in thread
From: Baoquan He @ 2020-03-28  0:51 UTC (permalink / raw)
  To: Wei Yang; +Cc: John Hubbard, akpm, linux-mm, linux-kernel, jgg, david

On 03/28/20 at 12:26am, Wei Yang wrote:
> On Fri, Mar 27, 2020 at 03:37:57PM -0700, John Hubbard wrote:
> >On 3/27/20 3:01 PM, Wei Yang wrote:
> >> Since we always clear node_order before getting it, we can leverage
> >> compiler to do this instead of at run time.
> >> 
> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >> ---
> >>   mm/page_alloc.c | 3 +--
> >>   1 file changed, 1 insertion(+), 2 deletions(-)
> >> 
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index dfcf2682ed40..49dd1f25c000 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
> >>   static void build_zonelists(pg_data_t *pgdat)
> >>   {
> >> -	static int node_order[MAX_NUMNODES];
> >> +	static int node_order[MAX_NUMNODES] = {0};
> >
> >
> >Looks wrong: now the single instance of node_order is initialized just once by
> >the compiler. And that means that only the first caller of this function
> >gets a zeroed node_order array...
> >
> 
> What a shame on me. You are right, I miss the static word. 
> 
> Well, then I am curious about why we want to define it as static. Each time we
> call this function, node_order would be set to 0 and find_next_best_node()
> would sort a proper value into it. I don't see the reason to reserve it in a
> global area and be used next time.
> 
> My suggestion is to remove the static and define it {0} instead of memset
> every time. Is my understanding correct here?

Removing static looks good, the node_order is calculated on the basis of
each node, it's useless for other node. It may be inherited from the old
code where it's a static global variable.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-28  0:26     ` Wei Yang
  2020-03-28  0:51       ` Baoquan He
@ 2020-03-28  0:59       ` John Hubbard
  2020-03-28  1:10         ` Wei Yang
  1 sibling, 1 reply; 14+ messages in thread
From: John Hubbard @ 2020-03-28  0:59 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, linux-mm, linux-kernel, jgg, david

On 3/27/20 5:26 PM, Wei Yang wrote:
> On Fri, Mar 27, 2020 at 03:37:57PM -0700, John Hubbard wrote:
>> On 3/27/20 3:01 PM, Wei Yang wrote:
>>> Since we always clear node_order before getting it, we can leverage
>>> compiler to do this instead of at run time.
>>>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> ---
>>>    mm/page_alloc.c | 3 +--
>>>    1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index dfcf2682ed40..49dd1f25c000 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
>>>    static void build_zonelists(pg_data_t *pgdat)
>>>    {
>>> -	static int node_order[MAX_NUMNODES];
>>> +	static int node_order[MAX_NUMNODES] = {0};
>>
>>
>> Looks wrong: now the single instance of node_order is initialized just once by
>> the compiler. And that means that only the first caller of this function
>> gets a zeroed node_order array...
>>
> 
> What a shame on me. You are right, I miss the static word.
> 
> Well, then I am curious about why we want to define it as static. Each time we
> call this function, node_order would be set to 0 and find_next_best_node()
> would sort a proper value into it. I don't see the reason to reserve it in a
> global area and be used next time.

It's not just about preserving the value. Sometimes it's about stack space.
Here's the trade-offs for static variables within a function:

Advantages of static variables within a function (compared to non-static
variables, also within a function):
-----------------------------------

* Doesn't use any of the scarce kernel stack space
* Preserves values (not always necessarily and advantage)

Disadvantages:
-----------------------------------

* Removes basic thread safety: multiple threads can no longer independently
   call the function without getting interaction, and generally that means
   data corruption.

So here, I suspect that the original motivation was probably to conserve stack
space, and the author likely observed that there was no concurrency to worry
about: the function was only being called by one thread at a time.  Given those
constraints (which I haven't confirmed just yet, btw), a static function variable
fits well.

> 
> My suggestion is to remove the static and define it {0} instead of memset
> every time. Is my understanding correct here?


Not completely:

a) First of all, "instead of memset every time" is a misconception, because
    there is still a memset happening every time with {0}. It's just that the
    compiler silently writes that code for you, and you don't see it on the
    screen. But it's still there.

b) Switching away from a static to an on-stack variable requires that you first
    verify that stack space is not an issue. Or, if you determine that this
    function needs the per-thread isolation that a non-static variable provides,
    then you can switch to either an on-stack variable, or a *alloc() function.



thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-28  0:59       ` John Hubbard
@ 2020-03-28  1:10         ` Wei Yang
  2020-03-28  1:28           ` John Hubbard
  0 siblings, 1 reply; 14+ messages in thread
From: Wei Yang @ 2020-03-28  1:10 UTC (permalink / raw)
  To: John Hubbard; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, jgg, david

On Fri, Mar 27, 2020 at 05:59:30PM -0700, John Hubbard wrote:
>On 3/27/20 5:26 PM, Wei Yang wrote:
>> On Fri, Mar 27, 2020 at 03:37:57PM -0700, John Hubbard wrote:
>> > On 3/27/20 3:01 PM, Wei Yang wrote:
>> > > Since we always clear node_order before getting it, we can leverage
>> > > compiler to do this instead of at run time.
>> > > 
>> > > Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> > > ---
>> > >    mm/page_alloc.c | 3 +--
>> > >    1 file changed, 1 insertion(+), 2 deletions(-)
>> > > 
>> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> > > index dfcf2682ed40..49dd1f25c000 100644
>> > > --- a/mm/page_alloc.c
>> > > +++ b/mm/page_alloc.c
>> > > @@ -5585,7 +5585,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
>> > >    static void build_zonelists(pg_data_t *pgdat)
>> > >    {
>> > > -	static int node_order[MAX_NUMNODES];
>> > > +	static int node_order[MAX_NUMNODES] = {0};
>> > 
>> > 
>> > Looks wrong: now the single instance of node_order is initialized just once by
>> > the compiler. And that means that only the first caller of this function
>> > gets a zeroed node_order array...
>> > 
>> 
>> What a shame on me. You are right, I miss the static word.
>> 
>> Well, then I am curious about why we want to define it as static. Each time we
>> call this function, node_order would be set to 0 and find_next_best_node()
>> would sort a proper value into it. I don't see the reason to reserve it in a
>> global area and be used next time.
>
>It's not just about preserving the value. Sometimes it's about stack space.
>Here's the trade-offs for static variables within a function:
>
>Advantages of static variables within a function (compared to non-static
>variables, also within a function):
>-----------------------------------
>
>* Doesn't use any of the scarce kernel stack space
>* Preserves values (not always necessarily and advantage)
>
>Disadvantages:
>-----------------------------------
>
>* Removes basic thread safety: multiple threads can no longer independently
>  call the function without getting interaction, and generally that means
>  data corruption.
>
>So here, I suspect that the original motivation was probably to conserve stack
>space, and the author likely observed that there was no concurrency to worry
>about: the function was only being called by one thread at a time.  Given those
>constraints (which I haven't confirmed just yet, btw), a static function variable
>fits well.
>
>> 
>> My suggestion is to remove the static and define it {0} instead of memset
>> every time. Is my understanding correct here?
>
>
>Not completely:
>
>a) First of all, "instead of memset every time" is a misconception, because
>   there is still a memset happening every time with {0}. It's just that the
>   compiler silently writes that code for you, and you don't see it on the
>   screen. But it's still there.
>
>b) Switching away from a static to an on-stack variable requires that you first
>   verify that stack space is not an issue. Or, if you determine that this
>   function needs the per-thread isolation that a non-static variable provides,
>   then you can switch to either an on-stack variable, or a *alloc() function.
>

I think you get some point. While one more question about stack and static. If
one function is thread safe, which factor determines whether we choose on
stack value or static? Any reference size? It looks currently we don't have a
guide line for this.

>
>
>thanks,
>-- 
>John Hubbard
>NVIDIA

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-28  1:10         ` Wei Yang
@ 2020-03-28  1:28           ` John Hubbard
  2020-03-28  2:56             ` Wei Yang
  2020-03-28 11:25             ` Baoquan He
  0 siblings, 2 replies; 14+ messages in thread
From: John Hubbard @ 2020-03-28  1:28 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, linux-mm, linux-kernel, jgg, david

On 3/27/20 6:10 PM, Wei Yang wrote:
...
>> It's not just about preserving the value. Sometimes it's about stack space.
>> Here's the trade-offs for static variables within a function:
>>
>> Advantages of static variables within a function (compared to non-static
>> variables, also within a function):
>> -----------------------------------
>>
>> * Doesn't use any of the scarce kernel stack space
>> * Preserves values (not always necessarily and advantage)
>>
>> Disadvantages:
>> -----------------------------------
>>
>> * Removes basic thread safety: multiple threads can no longer independently
>>   call the function without getting interaction, and generally that means
>>   data corruption.
>>
>> So here, I suspect that the original motivation was probably to conserve stack
>> space, and the author likely observed that there was no concurrency to worry
>> about: the function was only being called by one thread at a time.  Given those
>> constraints (which I haven't confirmed just yet, btw), a static function variable
>> fits well.
>>
>>>
>>> My suggestion is to remove the static and define it {0} instead of memset
>>> every time. Is my understanding correct here?
>>
>>
>> Not completely:
>>
>> a) First of all, "instead of memset every time" is a misconception, because
>>    there is still a memset happening every time with {0}. It's just that the
>>    compiler silently writes that code for you, and you don't see it on the
>>    screen. But it's still there.
>>
>> b) Switching away from a static to an on-stack variable requires that you first
>>    verify that stack space is not an issue. Or, if you determine that this
>>    function needs the per-thread isolation that a non-static variable provides,
>>    then you can switch to either an on-stack variable, or a *alloc() function.
>>
> 
> I think you get some point. While one more question about stack and static. If
> one function is thread safe, which factor determines whether we choose on
> stack value or static? Any reference size? It looks currently we don't have a
> guide line for this.
> 


There's not really any general guideline, but applying the points above (plus keeping
in mind that kernel stack space is quite small) to each case, you'll come to a good
answer.

In this case, if we really are only ever calling this function in one thread at a time,
then it's probably best to let the "conserve stack space" point win. Which leads to
just leaving the code nearly as-is. The only thing left to do would be to (optionally,
because this is an exceedingly minor point) delete the arguably misleading "= {0}" part.
And as Jason points out, doing so also moves node_order into .bss :

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4bd35eb83d34..cb4b07458249 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5607,7 +5607,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
  
  static void build_zonelists(pg_data_t *pgdat)
  {
-       static int node_order[MAX_NUMNODES] = {0};
+       static int node_order[MAX_NUMNODES];
         int node, load, nr_nodes = 0;
         nodemask_t used_mask = NODE_MASK_NONE;
         int local_node, prev_node;



Further note: On my current testing .config, I've got MAX_NUMNODES set to 64, which makes
256 bytes required for node_order array. 256 bytes on a 16KB stack is a little bit above
my mental watermark for "that's too much in today's kernels".


thanks,
-- 
John Hubbard
NVIDIA



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-28  1:28           ` John Hubbard
@ 2020-03-28  2:56             ` Wei Yang
  2020-03-29  1:30               ` John Hubbard
  2020-03-28 11:25             ` Baoquan He
  1 sibling, 1 reply; 14+ messages in thread
From: Wei Yang @ 2020-03-28  2:56 UTC (permalink / raw)
  To: John Hubbard; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, jgg, david

On Fri, Mar 27, 2020 at 06:28:36PM -0700, John Hubbard wrote:
>On 3/27/20 6:10 PM, Wei Yang wrote:
>...
>> > It's not just about preserving the value. Sometimes it's about stack space.
>> > Here's the trade-offs for static variables within a function:
>> > 
>> > Advantages of static variables within a function (compared to non-static
>> > variables, also within a function):
>> > -----------------------------------
>> > 
>> > * Doesn't use any of the scarce kernel stack space
>> > * Preserves values (not always necessarily and advantage)
>> > 
>> > Disadvantages:
>> > -----------------------------------
>> > 
>> > * Removes basic thread safety: multiple threads can no longer independently
>> >   call the function without getting interaction, and generally that means
>> >   data corruption.
>> > 
>> > So here, I suspect that the original motivation was probably to conserve stack
>> > space, and the author likely observed that there was no concurrency to worry
>> > about: the function was only being called by one thread at a time.  Given those
>> > constraints (which I haven't confirmed just yet, btw), a static function variable
>> > fits well.
>> > 
>> > > 
>> > > My suggestion is to remove the static and define it {0} instead of memset
>> > > every time. Is my understanding correct here?
>> > 
>> > 
>> > Not completely:
>> > 
>> > a) First of all, "instead of memset every time" is a misconception, because
>> >    there is still a memset happening every time with {0}. It's just that the
>> >    compiler silently writes that code for you, and you don't see it on the
>> >    screen. But it's still there.
>> > 
>> > b) Switching away from a static to an on-stack variable requires that you first
>> >    verify that stack space is not an issue. Or, if you determine that this
>> >    function needs the per-thread isolation that a non-static variable provides,
>> >    then you can switch to either an on-stack variable, or a *alloc() function.
>> > 
>> 
>> I think you get some point. While one more question about stack and static. If
>> one function is thread safe, which factor determines whether we choose on
>> stack value or static? Any reference size? It looks currently we don't have a
>> guide line for this.
>> 
>
>
>There's not really any general guideline, but applying the points above (plus keeping
>in mind that kernel stack space is quite small) to each case, you'll come to a good
>answer.
>
>In this case, if we really are only ever calling this function in one thread at a time,
>then it's probably best to let the "conserve stack space" point win. Which leads to
>just leaving the code nearly as-is. The only thing left to do would be to (optionally,
>because this is an exceedingly minor point) delete the arguably misleading "= {0}" part.
>And as Jason points out, doing so also moves node_order into .bss :
>
>diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>index 4bd35eb83d34..cb4b07458249 100644
>--- a/mm/page_alloc.c
>+++ b/mm/page_alloc.c
>@@ -5607,7 +5607,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
> static void build_zonelists(pg_data_t *pgdat)
> {
>-       static int node_order[MAX_NUMNODES] = {0};

This is what I added, so I would drop this one.

>+       static int node_order[MAX_NUMNODES];
>        int node, load, nr_nodes = 0;
>        nodemask_t used_mask = NODE_MASK_NONE;
>        int local_node, prev_node;
>
>
>
>Further note: On my current testing .config, I've got MAX_NUMNODES set to 64, which makes
>256 bytes required for node_order array. 256 bytes on a 16KB stack is a little bit above
>my mental watermark for "that's too much in today's kernels".
>

Thanks for your explanation. I would keep this in mind.

Now I have one more question, hope it won't sound silly. (16KB / 256) = 64,
this means if each function call takes 256 space on stack, the max call depth
is 64. So how deep a kernel function call would be? or expected to be?

Also because of the limit space on stack, recursive function is not welcome in
kernel neither. Am I right?

>
>thanks,
>-- 
>John Hubbard
>NVIDIA
>

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-28  1:28           ` John Hubbard
  2020-03-28  2:56             ` Wei Yang
@ 2020-03-28 11:25             ` Baoquan He
  1 sibling, 0 replies; 14+ messages in thread
From: Baoquan He @ 2020-03-28 11:25 UTC (permalink / raw)
  To: John Hubbard; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, jgg, david

On 03/27/20 at 06:28pm, John Hubbard wrote:
> On 3/27/20 6:10 PM, Wei Yang wrote:
> ...
> > > It's not just about preserving the value. Sometimes it's about stack space.
> > > Here's the trade-offs for static variables within a function:
> > > 
> > > Advantages of static variables within a function (compared to non-static
> > > variables, also within a function):
> > > -----------------------------------
> > > 
> > > * Doesn't use any of the scarce kernel stack space
> > > * Preserves values (not always necessarily and advantage)
> > > 
> > > Disadvantages:
> > > -----------------------------------
> > > 
> > > * Removes basic thread safety: multiple threads can no longer independently
> > >   call the function without getting interaction, and generally that means
> > >   data corruption.
> > > 
> > > So here, I suspect that the original motivation was probably to conserve stack
> > > space, and the author likely observed that there was no concurrency to worry
> > > about: the function was only being called by one thread at a time.  Given those
> > > constraints (which I haven't confirmed just yet, btw), a static function variable
> > > fits well.
> > > 
> > > > 
> > > > My suggestion is to remove the static and define it {0} instead of memset
> > > > every time. Is my understanding correct here?
> > > 
> > > 
> > > Not completely:
> > > 
> > > a) First of all, "instead of memset every time" is a misconception, because
> > >    there is still a memset happening every time with {0}. It's just that the
> > >    compiler silently writes that code for you, and you don't see it on the
> > >    screen. But it's still there.
> > > 
> > > b) Switching away from a static to an on-stack variable requires that you first
> > >    verify that stack space is not an issue. Or, if you determine that this
> > >    function needs the per-thread isolation that a non-static variable provides,
> > >    then you can switch to either an on-stack variable, or a *alloc() function.
> > > 
> > 
> > I think you get some point. While one more question about stack and static. If
> > one function is thread safe, which factor determines whether we choose on
> > stack value or static? Any reference size? It looks currently we don't have a
> > guide line for this.
> > 
> 
> 
> There's not really any general guideline, but applying the points above (plus keeping
> in mind that kernel stack space is quite small) to each case, you'll come to a good
> answer.
> 
> In this case, if we really are only ever calling this function in one thread at a time,
> then it's probably best to let the "conserve stack space" point win. Which leads to
> just leaving the code nearly as-is. The only thing left to do would be to (optionally,
> because this is an exceedingly minor point) delete the arguably misleading "= {0}" part.
> And as Jason points out, doing so also moves node_order into .bss :
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4bd35eb83d34..cb4b07458249 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5607,7 +5607,7 @@ static void build_thisnode_zonelists(pg_data_t *pgdat)
>  static void build_zonelists(pg_data_t *pgdat)
>  {
> -       static int node_order[MAX_NUMNODES] = {0};
> +       static int node_order[MAX_NUMNODES];
>         int node, load, nr_nodes = 0;
>         nodemask_t used_mask = NODE_MASK_NONE;
>         int local_node, prev_node;
> 
> 
> 
> Further note: On my current testing .config, I've got MAX_NUMNODES set to 64, which makes
> 256 bytes required for node_order array. 256 bytes on a 16KB stack is a little bit above
> my mental watermark for "that's too much in today's kernels".

Oh, so Michal was deliberate to do so. I have CONFIG_NODES_SHIFT as 10
in my laptop config. That truly will cost much kernel stack. Thanks for
telling this.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-28  2:56             ` Wei Yang
@ 2020-03-29  1:30               ` John Hubbard
  2020-03-29  2:31                 ` Wei Yang
  0 siblings, 1 reply; 14+ messages in thread
From: John Hubbard @ 2020-03-29  1:30 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, linux-mm, linux-kernel, jgg, david

On 3/27/20 7:56 PM, Wei Yang wrote:
...
>> Further note: On my current testing .config, I've got MAX_NUMNODES set to 64, which makes
>> 256 bytes required for node_order array. 256 bytes on a 16KB stack is a little bit above
>> my mental watermark for "that's too much in today's kernels".
>>
> 
> Thanks for your explanation. I would keep this in mind.
> 
> Now I have one more question, hope it won't sound silly. (16KB / 256) = 64,
> this means if each function call takes 256 space on stack, the max call depth
> is 64. So how deep a kernel function call would be? or expected to be?
>

64 is quite a bit deeper call depth than we expect to see. So 256 bytes on the stack
is not completely indefensible, but it's getting close. But worse, that's just an
example based on my .config choices. And (as Baoquan just pointed out) it can be much
bigger. (And the .config variable is an exponent, not linear, so it gets exciting fast.)
In fact, you could overrun the stack directly, with say CONFIG_NODES_SHIFT = 14.

  
> Also because of the limit space on stack, recursive function is not welcome in
> kernel neither. Am I right?
> 
Yes, that is correct.

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero
  2020-03-29  1:30               ` John Hubbard
@ 2020-03-29  2:31                 ` Wei Yang
  0 siblings, 0 replies; 14+ messages in thread
From: Wei Yang @ 2020-03-29  2:31 UTC (permalink / raw)
  To: John Hubbard; +Cc: Wei Yang, akpm, linux-mm, linux-kernel, jgg, david

On Sat, Mar 28, 2020 at 06:30:00PM -0700, John Hubbard wrote:
>On 3/27/20 7:56 PM, Wei Yang wrote:
>...
>> > Further note: On my current testing .config, I've got MAX_NUMNODES set to 64, which makes
>> > 256 bytes required for node_order array. 256 bytes on a 16KB stack is a little bit above
>> > my mental watermark for "that's too much in today's kernels".
>> > 
>> 
>> Thanks for your explanation. I would keep this in mind.
>> 
>> Now I have one more question, hope it won't sound silly. (16KB / 256) = 64,
>> this means if each function call takes 256 space on stack, the max call depth
>> is 64. So how deep a kernel function call would be? or expected to be?
>> 
>
>64 is quite a bit deeper call depth than we expect to see. So 256 bytes on the stack
>is not completely indefensible, but it's getting close. But worse, that's just an
>example based on my .config choices. And (as Baoquan just pointed out) it can be much
>bigger. (And the .config variable is an exponent, not linear, so it gets exciting fast.)
>In fact, you could overrun the stack directly, with say CONFIG_NODES_SHIFT = 14.
>

Thanks :-)

This is better not to use "big" structure on stack.

>> Also because of the limit space on stack, recursive function is not welcome in
>> kernel neither. Am I right?
>> 
>Yes, that is correct.
>
>thanks,
>-- 
>John Hubbard
>NVIDIA

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-03-29  2:31 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-27 22:01 [Patch v2 1/2] mm/page_alloc.c: use NODE_MASK_NONE define used_mask Wei Yang
2020-03-27 22:01 ` [Patch v2 2/2] mm/page_alloc.c: define node_order with all zero Wei Yang
2020-03-27 22:37   ` John Hubbard
2020-03-27 23:18     ` Jason Gunthorpe
2020-03-28  0:27       ` Wei Yang
2020-03-28  0:26     ` Wei Yang
2020-03-28  0:51       ` Baoquan He
2020-03-28  0:59       ` John Hubbard
2020-03-28  1:10         ` Wei Yang
2020-03-28  1:28           ` John Hubbard
2020-03-28  2:56             ` Wei Yang
2020-03-29  1:30               ` John Hubbard
2020-03-29  2:31                 ` Wei Yang
2020-03-28 11:25             ` Baoquan He

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.