linux-mm.kvack.org archive mirror
* [PATCH 0/3] trivial code refine for sparsemem
@ 2018-08-23 13:07 Wei Yang
  2018-08-23 13:07 ` [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init() Wei Yang
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Wei Yang @ 2018-08-23 13:07 UTC (permalink / raw)
  To: akpm, mhocko, rientjes; +Cc: linux-mm, kirill.shutemov, bob.picco, Wei Yang

Here are three trivial refinement patches for sparsemem.

Wei Yang (3):
  mm/sparse: add likely to mem_section[root] check in
    sparse_index_init()
  mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in
    __nr_to_section()
  mm/sparse: use __highest_present_section_nr as the boundary for pfn
    check

 include/linux/mmzone.h | 6 +++---
 mm/sparse.c            | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

-- 
2.15.1


* [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-08-23 13:07 [PATCH 0/3] trivial code refine for sparsemem Wei Yang
@ 2018-08-23 13:07 ` Wei Yang
  2018-08-23 13:13   ` Michal Hocko
  2018-08-24  0:11   ` Dave Hansen
  2018-08-23 13:07 ` [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section() Wei Yang
  2018-08-23 13:07 ` [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check Wei Yang
  2 siblings, 2 replies; 23+ messages in thread
From: Wei Yang @ 2018-08-23 13:07 UTC (permalink / raw)
  To: akpm, mhocko, rientjes; +Cc: linux-mm, kirill.shutemov, bob.picco, Wei Yang

Each time mem_section[root] is found to be NULL, a block of SECTIONS_PER_ROOT
mem_section entries is allocated. This means the mem_section[root] check is
false with only a (1 / SECTIONS_PER_ROOT) probability.

This patch adds likely() to the check to optimize this a little.
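
For reference, likely() is a static branch-prediction hint: in a
non-instrumented kernel build it expands to __builtin_expect() (see
include/linux/compiler.h). A minimal standalone sketch, compilable in
userspace with gcc/clang:

    #include <stdio.h>

    /* Same expansion as the kernel's likely()/unlikely(). */
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    int main(void)
    {
            int populated = 1;      /* stand-in for mem_section[root] */

            /* the compiler lays out the hinted path as the fall-through */
            if (likely(populated))
                    printf("root already populated\n");
            return 0;
    }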

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/sparse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 10b07eea9a6e..90bab7f03757 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -78,7 +78,7 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
 	struct mem_section *section;
 
-	if (mem_section[root])
+	if (likely(mem_section[root]))
 		return -EEXIST;
 
 	section = sparse_index_alloc(nid);
-- 
2.15.1


* [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section()
  2018-08-23 13:07 [PATCH 0/3] trivial code refine for sparsemem Wei Yang
  2018-08-23 13:07 ` [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init() Wei Yang
@ 2018-08-23 13:07 ` Wei Yang
  2018-08-23 13:21   ` Michal Hocko
  2018-08-23 13:07 ` [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check Wei Yang
  2 siblings, 1 reply; 23+ messages in thread
From: Wei Yang @ 2018-08-23 13:07 UTC (permalink / raw)
  To: akpm, mhocko, rientjes; +Cc: linux-mm, kirill.shutemov, bob.picco, Wei Yang

When CONFIG_SPARSEMEM_EXTREME is not defined, mem_section is a static
two-dimensional array. This means !mem_section[SECTION_NR_TO_ROOT(nr)] is
always false.

This patch expands the CONFIG_SPARSEMEM_EXTREME range so that a proper
mem_section is returned when CONFIG_SPARSEMEM_EXTREME is not defined.
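
For context, a trimmed sketch of the two shapes mem_section takes, mirroring
include/linux/mmzone.h and mm/sparse.c of this era:

    #ifdef CONFIG_SPARSEMEM_EXTREME
    /* dynamically allocated array of root pointers; an entry may be NULL
     * until sparse_index_init() populates it */
    struct mem_section **mem_section;
    #else
    /* static two-dimensional array; mem_section[root] decays to the address
     * of a row, which can never be NULL */
    struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
    #endif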

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 include/linux/mmzone.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 32699b2dc52a..33086f86d1a7 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1155,9 +1155,9 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
 #ifdef CONFIG_SPARSEMEM_EXTREME
 	if (!mem_section)
 		return NULL;
-#endif
 	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
 		return NULL;
+#endif
 	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
 }
 extern int __section_nr(struct mem_section* ms);
-- 
2.15.1


* [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check
  2018-08-23 13:07 [PATCH 0/3] trivial code refine for sparsemem Wei Yang
  2018-08-23 13:07 ` [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init() Wei Yang
  2018-08-23 13:07 ` [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section() Wei Yang
@ 2018-08-23 13:07 ` Wei Yang
  2018-08-23 13:25   ` Michal Hocko
  2018-08-24  0:15   ` Dave Hansen
  2 siblings, 2 replies; 23+ messages in thread
From: Wei Yang @ 2018-08-23 13:07 UTC (permalink / raw)
  To: akpm, mhocko, rientjes; +Cc: linux-mm, kirill.shutemov, bob.picco, Wei Yang

It is known that __highest_present_section_nr is a stricter boundary
than NR_MEM_SECTIONS.

This patch uses __highest_present_section_nr to check for a valid pfn.
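
For reference, a standalone sketch of the pfn-to-section mapping these
checks rely on, using x86_64 values; the macros mirror
include/linux/mmzone.h:

    #include <stdio.h>

    #define PAGE_SHIFT             12
    #define SECTION_SIZE_BITS      27   /* 128MB sections on x86_64 */
    #define PFN_SECTION_SHIFT      (SECTION_SIZE_BITS - PAGE_SHIFT)
    #define pfn_to_section_nr(pfn) ((pfn) >> PFN_SECTION_SHIFT)

    int main(void)
    {
            /* the pfn at the 4GB boundary lands in section 32 */
            unsigned long pfn = (4UL << 30) >> PAGE_SHIFT;

            printf("pfn %lu -> section %lu\n", pfn, pfn_to_section_nr(pfn));
            return 0;
    }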

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 include/linux/mmzone.h | 4 ++--
 mm/sparse.c            | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 33086f86d1a7..5138efde11ae 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1237,7 +1237,7 @@ extern int __highest_present_section_nr;
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
 static inline int pfn_valid(unsigned long pfn)
 {
-	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+	if (pfn_to_section_nr(pfn) > __highest_present_section_nr)
 		return 0;
 	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
 }
@@ -1245,7 +1245,7 @@ static inline int pfn_valid(unsigned long pfn)
 
 static inline int pfn_present(unsigned long pfn)
 {
-	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+	if (pfn_to_section_nr(pfn) > __highest_present_section_nr)
 		return 0;
 	return present_section(__nr_to_section(pfn_to_section_nr(pfn)));
 }
diff --git a/mm/sparse.c b/mm/sparse.c
index 90bab7f03757..a9c55c8da11f 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -174,6 +174,7 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
  * those loops early.
  */
 int __highest_present_section_nr;
+EXPORT_SYMBOL(__highest_present_section_nr);
 static void section_mark_present(struct mem_section *ms)
 {
 	int section_nr = __section_nr(ms);
-- 
2.15.1


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-08-23 13:07 ` [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init() Wei Yang
@ 2018-08-23 13:13   ` Michal Hocko
  2018-08-23 22:57     ` Wei Yang
  2018-08-24  0:11   ` Dave Hansen
  1 sibling, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2018-08-23 13:13 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu 23-08-18 21:07:30, Wei Yang wrote:
> Each time mem_section[root] is found to be NULL, a block of SECTIONS_PER_ROOT
> mem_section entries is allocated. This means the mem_section[root] check is
> false with only a (1 / SECTIONS_PER_ROOT) probability.
> 
> This patch adds likely() to the check to optimize this a little.

Could you evaluate how much does this help if any? Does this have any
impact on the initialization path at all?

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  mm/sparse.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 10b07eea9a6e..90bab7f03757 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -78,7 +78,7 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
>  	struct mem_section *section;
>  
> -	if (mem_section[root])
> +	if (likely(mem_section[root]))
>  		return -EEXIST;
>  
>  	section = sparse_index_alloc(nid);
> -- 
> 2.15.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section()
  2018-08-23 13:07 ` [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section() Wei Yang
@ 2018-08-23 13:21   ` Michal Hocko
  2018-08-23 23:03     ` Wei Yang
  2018-08-24  0:09     ` Dave Hansen
  0 siblings, 2 replies; 23+ messages in thread
From: Michal Hocko @ 2018-08-23 13:21 UTC (permalink / raw)
  To: Wei Yang
  Cc: akpm, rientjes, linux-mm, kirill.shutemov, bob.picco, Dave Hansen

[Cc Dave]

On Thu 23-08-18 21:07:31, Wei Yang wrote:
> When CONFIG_SPARSEMEM_EXTREME is not defined, mem_section is a static
> two-dimensional array. This means !mem_section[SECTION_NR_TO_ROOT(nr)] is
> always false.
> 
> This patch expands the CONFIG_SPARSEMEM_EXTREME range so that a proper
> mem_section is returned when CONFIG_SPARSEMEM_EXTREME is not defined.

As long as all callers provide a valid section number then yes. I am not
really sure this is the case though.

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  include/linux/mmzone.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 32699b2dc52a..33086f86d1a7 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1155,9 +1155,9 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
>  #ifdef CONFIG_SPARSEMEM_EXTREME
>  	if (!mem_section)
>  		return NULL;
> -#endif
>  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>  		return NULL;
> +#endif
>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>  }
>  extern int __section_nr(struct mem_section* ms);
> -- 
> 2.15.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check
  2018-08-23 13:07 ` [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check Wei Yang
@ 2018-08-23 13:25   ` Michal Hocko
  2018-08-23 14:00     ` Oscar Salvador
  2018-08-24  0:15   ` Dave Hansen
  1 sibling, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2018-08-23 13:25 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu 23-08-18 21:07:32, Wei Yang wrote:
> It is known that __highest_present_section_nr is a stricter boundary
> than NR_MEM_SECTIONS.
> 
> This patch uses __highest_present_section_nr to check for a valid pfn.

But why is this an improvement? Sure, when you loop over all sections
then __highest_present_section_nr makes a lot of sense. But all the
updated functions perform a trivial comparison.

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  include/linux/mmzone.h | 4 ++--
>  mm/sparse.c            | 1 +
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 33086f86d1a7..5138efde11ae 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1237,7 +1237,7 @@ extern int __highest_present_section_nr;
>  #ifndef CONFIG_HAVE_ARCH_PFN_VALID
>  static inline int pfn_valid(unsigned long pfn)
>  {
> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> +	if (pfn_to_section_nr(pfn) > __highest_present_section_nr)
>  		return 0;
>  	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
>  }
> @@ -1245,7 +1245,7 @@ static inline int pfn_valid(unsigned long pfn)
>  
>  static inline int pfn_present(unsigned long pfn)
>  {
> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> +	if (pfn_to_section_nr(pfn) > __highest_present_section_nr)
>  		return 0;
>  	return present_section(__nr_to_section(pfn_to_section_nr(pfn)));
>  }
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 90bab7f03757..a9c55c8da11f 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -174,6 +174,7 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
>   * those loops early.
>   */
>  int __highest_present_section_nr;
> +EXPORT_SYMBOL(__highest_present_section_nr);
>  static void section_mark_present(struct mem_section *ms)
>  {
>  	int section_nr = __section_nr(ms);
> -- 
> 2.15.1
> 

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check
  2018-08-23 13:25   ` Michal Hocko
@ 2018-08-23 14:00     ` Oscar Salvador
  2018-08-23 19:17       ` Michal Hocko
  0 siblings, 1 reply; 23+ messages in thread
From: Oscar Salvador @ 2018-08-23 14:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu, Aug 23, 2018 at 03:25:26PM +0200, Michal Hocko wrote:
> On Thu 23-08-18 21:07:32, Wei Yang wrote:
> > It is known that __highest_present_section_nr is a stricter boundary
> > than NR_MEM_SECTIONS.
> > 
> > This patch uses __highest_present_section_nr to check for a valid pfn.
> 
> But why is this an improvement? Sure, when you loop over all sections
> then __highest_present_section_nr makes a lot of sense. But all the
> updated functions perform a trivial comparison.

I think it makes some sense.
NR_MEM_SECTIONS can be a big number, but we might not be using
all the sections, so __highest_present_section_nr ends up being a much lower
value.

I think that we want to compare the pfn's section_nr with our current limit
of present sections.
Sections beyond that do not really exist for us, so there is no point looking
for them in __nr_to_section()/valid_section().

It might not be a big improvement, but I think that, given the nature of
pfn_valid()/pfn_present(), comparing against __highest_present_section_nr is a
better fit.
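
For reference, a trimmed sketch of the two predicates being discussed,
mirroring include/linux/mmzone.h of this era (the real struct carries more
fields):

    struct mem_section {
            unsigned long section_mem_map;  /* map pointer plus flag bits */
    };

    #define SECTION_MARKED_PRESENT  (1UL << 0)
    #define SECTION_HAS_MEM_MAP     (1UL << 1)

    static inline int present_section(struct mem_section *section)
    {
            return (section && (section->section_mem_map & SECTION_MARKED_PRESENT));
    }

    static inline int valid_section(struct mem_section *section)
    {
            return (section && (section->section_mem_map & SECTION_HAS_MEM_MAP));
    }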

-- 
Oscar Salvador
SUSE L3


* Re: [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check
  2018-08-23 14:00     ` Oscar Salvador
@ 2018-08-23 19:17       ` Michal Hocko
  2018-08-23 20:52         ` Oscar Salvador
  0 siblings, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2018-08-23 19:17 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Wei Yang, akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu 23-08-18 16:00:53, Oscar Salvador wrote:
> On Thu, Aug 23, 2018 at 03:25:26PM +0200, Michal Hocko wrote:
> > On Thu 23-08-18 21:07:32, Wei Yang wrote:
> > > It is known that __highest_present_section_nr is a stricter boundary
> > > than NR_MEM_SECTIONS.
> > > 
> > > This patch uses __highest_present_section_nr to check for a valid pfn.
> > 
> > But why is this an improvement? Sure, when you loop over all sections
> > then __highest_present_section_nr makes a lot of sense. But all the
> > updated functions perform a trivial comparison.
> 
> I think it makes some sense.
> NR_MEM_SECTIONS can be a big number, but we might not be using
> all the sections, so __highest_present_section_nr ends up being a much lower
> value.

And how exactly does it help to check against the smaller vs. the larger
number? Both are O(1) operations AFAICS. __highest_present_section_nr makes
perfect sense when we iterate over all sections, or for similar operations
where a smaller number of iterations really matters.

I am not saying the patch is wrong but I just do not see this being an
improvement. You have to export an internal symbol to achieve something
that is hardly an optimization.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check
  2018-08-23 19:17       ` Michal Hocko
@ 2018-08-23 20:52         ` Oscar Salvador
  0 siblings, 0 replies; 23+ messages in thread
From: Oscar Salvador @ 2018-08-23 20:52 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu, Aug 23, 2018 at 09:17:29PM +0200, Michal Hocko wrote:
> And how exactly does it help to check against the smaller vs. the larger
> number? Both are O(1) operations AFAICS. __highest_present_section_nr makes
> perfect sense when we iterate over all sections, or for similar operations
> where a smaller number of iterations really matters.

Sure, improvement/optimization was not really my point; a comparison is
a comparison.
The gain, if any, would come from catching non-present sections sooner,
before calling present_section().
That is, in the case where __highest_present_section_nr differs from
NR_MEM_SECTIONS, of course.

I thought it would make more sense given the nature of the function itself.

The only thing I did not like much was that we need to export the symbol.
So, as you said, the price might be too high for what we get.

-- 
Oscar Salvador
SUSE L3


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-08-23 13:13   ` Michal Hocko
@ 2018-08-23 22:57     ` Wei Yang
  2018-08-24  7:31       ` Michal Hocko
  0 siblings, 1 reply; 23+ messages in thread
From: Wei Yang @ 2018-08-23 22:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu, Aug 23, 2018 at 03:13:39PM +0200, Michal Hocko wrote:
>On Thu 23-08-18 21:07:30, Wei Yang wrote:
>> Each time mem_section[root] is found to be NULL, a block of SECTIONS_PER_ROOT
>> mem_section entries is allocated. This means the mem_section[root] check is
>> false with only a (1 / SECTIONS_PER_ROOT) probability.
>> 
>> This patch adds likely() to the check to optimize this a little.
>
>Could you evaluate how much does this help if any? Does this have any
>impact on the initialization path at all?

Let me test on my 4G machine with this patch :-)

>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  mm/sparse.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/mm/sparse.c b/mm/sparse.c
>> index 10b07eea9a6e..90bab7f03757 100644
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -78,7 +78,7 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
>>  	struct mem_section *section;
>>  
>> -	if (mem_section[root])
>> +	if (likely(mem_section[root]))
>>  		return -EEXIST;
>>  
>>  	section = sparse_index_alloc(nid);
>> -- 
>> 2.15.1
>> 
>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section()
  2018-08-23 13:21   ` Michal Hocko
@ 2018-08-23 23:03     ` Wei Yang
  2018-08-24  0:09     ` Dave Hansen
  1 sibling, 0 replies; 23+ messages in thread
From: Wei Yang @ 2018-08-23 23:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, akpm, rientjes, linux-mm, kirill.shutemov, bob.picco,
	Dave Hansen

On Thu, Aug 23, 2018 at 03:21:12PM +0200, Michal Hocko wrote:
>[Cc Dave]
>
>On Thu 23-08-18 21:07:31, Wei Yang wrote:
>> When CONFIG_SPARSEMEM_EXTREME is not defined, mem_section is a static
>> two-dimensional array. This means !mem_section[SECTION_NR_TO_ROOT(nr)] is
>> always false.
>> 
>> This patch expands the CONFIG_SPARSEMEM_EXTREME range so that a proper
>> mem_section is returned when CONFIG_SPARSEMEM_EXTREME is not defined.
>
>As long as all callers provide a valid section number then yes. I am not
>really sure this is the case though.
>

I don't get your point.

When CONFIG_SPARSEMEM_EXTREME is not defined, every section number is a
valid one in this context, because for every section number in
[0, NR_MEM_SECTIONS - 1] there is a mem_section structure.

This patch helps to remove a meaningless check when
CONFIG_SPARSEMEM_EXTREME=n.

>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  include/linux/mmzone.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 32699b2dc52a..33086f86d1a7 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1155,9 +1155,9 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
>>  #ifdef CONFIG_SPARSEMEM_EXTREME
>>  	if (!mem_section)
>>  		return NULL;
>> -#endif
>>  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>>  		return NULL;
>> +#endif
>>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>>  }
>>  extern int __section_nr(struct mem_section* ms);
>> -- 
>> 2.15.1
>> 
>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section()
  2018-08-23 13:21   ` Michal Hocko
  2018-08-23 23:03     ` Wei Yang
@ 2018-08-24  0:09     ` Dave Hansen
  2018-08-24 15:24       ` Wei Yang
  1 sibling, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2018-08-24  0:09 UTC (permalink / raw)
  To: Michal Hocko, Wei Yang
  Cc: akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On 08/23/2018 06:21 AM, Michal Hocko wrote:
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1155,9 +1155,9 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
>  #ifdef CONFIG_SPARSEMEM_EXTREME
>  	if (!mem_section)
>  		return NULL;
> -#endif
>  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>  		return NULL;
> +#endif
>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>  }

This patch has no practical effect and only adds unnecessary churn.

#ifdef CONFIG_SPARSEMEM_EXTREME
...
#else
struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
#endif

The compiler knows that SECTIONS_PER_ROOT == 1 and that
!mem_section[SECTION_NR_TO_ROOT(nr)] is always false.  It doesn't need
our help.

My goal with the sparsemem code, and code in general, is to avoid #ifdefs
whenever possible and to limit their scope to the smallest possible area.
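
A standalone sketch of the constant folding described above, with a
hypothetical static array standing in for mem_section:

    /* arr[i] decays to the address of row i of a static array, which is
     * never NULL, so gcc/clang fold the check to 0 and drop the branch. */
    static int arr[16][4];

    int row_is_null(unsigned long i)
    {
            return !arr[i];         /* compiles to 'return 0' */
    }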


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-08-23 13:07 ` [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init() Wei Yang
  2018-08-23 13:13   ` Michal Hocko
@ 2018-08-24  0:11   ` Dave Hansen
  2018-08-24 15:07     ` Wei Yang
  1 sibling, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2018-08-24  0:11 UTC (permalink / raw)
  To: Wei Yang, akpm, mhocko, rientjes; +Cc: linux-mm, kirill.shutemov, bob.picco

On 08/23/2018 06:07 AM, Wei Yang wrote:
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -78,7 +78,7 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>  	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
>  	struct mem_section *section;
>  
> -	if (mem_section[root])
> +	if (likely(mem_section[root]))
>  		return -EEXIST;

We could add likely()/unlikely() to approximately a billion if()s around
the kernel if we felt like it.  We don't because it's messy and it
actually takes away choices from the compiler.

Please don't send patches like this unless you have some *actual*
analysis that shows the benefit of the patch.  Performance numbers are best.


* Re: [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check
  2018-08-23 13:07 ` [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check Wei Yang
  2018-08-23 13:25   ` Michal Hocko
@ 2018-08-24  0:15   ` Dave Hansen
  2018-08-24 18:11     ` Wei Yang
  1 sibling, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2018-08-24  0:15 UTC (permalink / raw)
  To: Wei Yang, akpm, mhocko, rientjes; +Cc: linux-mm, kirill.shutemov, bob.picco

On 08/23/2018 06:07 AM, Wei Yang wrote:
> It is known that __highest_present_section_nr is a stricter boundary
> than NR_MEM_SECTIONS.

What is the benefit of this patch?

You're adding a "stricter" boundary, but you're also adding a
potential cacheline miss and removing optimizations that the compiler
can make with a constant vs. a variable.  Providing absolute bounds
that the compiler knows about can actually be pretty handy for its
optimizations.

Do you have *any* analysis to show that this has a benefit?  What does
it do to text size, for instance?


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-08-23 22:57     ` Wei Yang
@ 2018-08-24  7:31       ` Michal Hocko
  0 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2018-08-24  7:31 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu 23-08-18 22:57:42, Wei Yang wrote:
> On Thu, Aug 23, 2018 at 03:13:39PM +0200, Michal Hocko wrote:
> >On Thu 23-08-18 21:07:30, Wei Yang wrote:
> >> Each time mem_section[root] is found to be NULL, a block of SECTIONS_PER_ROOT
> >> mem_section entries is allocated. This means the mem_section[root] check is
> >> false with only a (1 / SECTIONS_PER_ROOT) probability.
> >> 
> >> This patch adds likely() to the check to optimize this a little.
> >
> >Could you evaluate how much does this help if any? Does this have any
> >impact on the initialization path at all?
> 
> Let me test on my 4G machine with this patch :-)

Well, this should have been done before posting the patch. In general,
though, you should have some convincing numbers in order to add new
likely/unlikely annotations. They are rarely helpful and they tend to
rot over time. Steven Rostedt ran some tests a few years back and
found that a vast majority of those annotations were simply wrong.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-08-24  0:11   ` Dave Hansen
@ 2018-08-24 15:07     ` Wei Yang
  2018-09-03 22:27       ` Wei Yang
  0 siblings, 1 reply; 23+ messages in thread
From: Wei Yang @ 2018-08-24 15:07 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Wei Yang, akpm, mhocko, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu, Aug 23, 2018 at 05:11:48PM -0700, Dave Hansen wrote:
>On 08/23/2018 06:07 AM, Wei Yang wrote:
>> --- a/mm/sparse.c
>> +++ b/mm/sparse.c
>> @@ -78,7 +78,7 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>  	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
>>  	struct mem_section *section;
>>  
>> -	if (mem_section[root])
>> +	if (likely(mem_section[root]))
>>  		return -EEXIST;
>
>We could add likely()/unlikely() to approximately a billion if()s around
>the kernel if we felt like it.  We don't because it's messy and it
>actually takes away choices from the compiler.
>
>Please don't send patches like this unless you have some *actual*
>analysis that shows the benefit of the patch.  Performance numbers are best.

Thanks all for your comments, Michal, Dave and Oscar.

Well, maybe I took it for granted, so let me put more words into this. To be
honest, my analysis may be only partially valid, so if the cost is higher than
the gain, please let me know.

Below are my analysis and test results for this patch.
------------------------------------------------------

During bootup, the call flow looks like this.

    sparse_memory_present_with_active_regions()
        memory_present()
            sparse_index_init()

sparse_memory_present_with_active_regions() iterates over pfns continuously
for the whole of system RAM, which means sparse_index_init() iterates over
section_nr continuously. Usually, we don't expect many large holes, right?

Each time mem_section[root] is found to be NULL, a block of SECTIONS_PER_ROOT
mem_section entries is allocated. This means that, out of SECTIONS_PER_ROOT
checks, only the first one is false. So the probability of the check being
false is (1 / SECTIONS_PER_ROOT).

SECTIONS_PER_ROOT is defined as (PAGE_SIZE / sizeof (struct mem_section)).

On my x86_64 machine, PAGE_SIZE is 4KB and mem_section is 16B.

    SECTIONS_PER_ROOT = 4K / 16 = 256.

So the check on mem_section[root] has a (1 / 256) chance of being false and a
(255 / 256) chance of being true. In theory, this looks like a "likely" case
to me.

In practice, when the system RAM is a multiple of
((1 << SECTION_SIZE_BITS) * SECTIONS_PER_ROOT), the "likely" chance is
(255 / 256); otherwise the chance is lower.

On my x86_64 machine, SECTION_SIZE_BITS is defined as 27.

    ((1 << SECTION_SIZE_BITS) * SECTIONS_PER_ROOT) = 32GB

          System RAM size       32G         16G        8G         4G
          Probability      (255 / 256) (127 / 128) (63 / 64)  (31 / 32)
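
As a cross-check of the table, a small standalone program using the x86_64
values above (one allocation, hence one false check, per root touched):

    #include <stdio.h>

    int main(void)
    {
            const unsigned long section_bits = 27;          /* SECTION_SIZE_BITS */
            const unsigned long secs_per_root = 4096 / 16;  /* 256 */
            const unsigned long ram_gb[] = { 32, 16, 8, 4 };

            for (int i = 0; i < 4; i++) {
                    unsigned long sections = (ram_gb[i] << 30) >> section_bits;
                    unsigned long roots =
                            (sections + secs_per_root - 1) / secs_per_root;

                    printf("%2luG: %lu/%lu checks true\n",
                           ram_gb[i], sections - roots, sections);
            }
            return 0;
    }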

Generally, in my mind, if we iterate over pfns continuously and there are no
large holes, the check on mem_section[root] is likely to be true.

Finally, here is the test result on my 4G virtual machine. I added printk
before and after sparse_memory_present_with_active_regions() and tested three
times with and without "likely":

                without      with
     Elapsed   0.000252     0.000250   -0.8%

The benefit seems too small on a 4G virtual machine, and the result may not
even be stable. I am not sure we would see a visible effect on a 32G machine.


Well, the above is all of my analysis and test results. I did the
optimization based on my own experience and understanding. If this is not
convincing, I would be glad to hear your view, so that I can learn more from
your experience.

Thanks all for your comments again :-)

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section()
  2018-08-24  0:09     ` Dave Hansen
@ 2018-08-24 15:24       ` Wei Yang
  0 siblings, 0 replies; 23+ messages in thread
From: Wei Yang @ 2018-08-24 15:24 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Michal Hocko, Wei Yang, akpm, rientjes, linux-mm,
	kirill.shutemov, bob.picco

On Thu, Aug 23, 2018 at 05:09:12PM -0700, Dave Hansen wrote:
>On 08/23/2018 06:21 AM, Michal Hocko wrote:
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1155,9 +1155,9 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
>>  #ifdef CONFIG_SPARSEMEM_EXTREME
>>  	if (!mem_section)
>>  		return NULL;
>> -#endif
>>  	if (!mem_section[SECTION_NR_TO_ROOT(nr)])
>>  		return NULL;
>> +#endif
>>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>>  }
>
>This patch has no practical effect and only adds unnecessary churn.
>
>#ifdef CONFIG_SPARSEMEM_EXTREME
>...
>#else
>struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
>#endif
>
>The compiler knows that SECTIONS_PER_ROOT == 1 and that
>!mem_section[SECTION_NR_TO_ROOT(nr)] is always false.  It doesn't need
>our help.
>

I didn't know the compiler would optimize the check away when mem_section is
a static array. I just wrote some test code, and the generated assembly looks
the same.

Thanks for pointing that out.

>My goal with the sparsemem code, and code in general, is to avoid #ifdefs
>whenever possible and to limit their scope to the smallest possible area.

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check
  2018-08-24  0:15   ` Dave Hansen
@ 2018-08-24 18:11     ` Wei Yang
  0 siblings, 0 replies; 23+ messages in thread
From: Wei Yang @ 2018-08-24 18:11 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Wei Yang, akpm, mhocko, rientjes, linux-mm, kirill.shutemov, bob.picco

On Thu, Aug 23, 2018 at 05:15:39PM -0700, Dave Hansen wrote:
>On 08/23/2018 06:07 AM, Wei Yang wrote:
>> It is known that __highest_present_section_nr is a stricter boundary
>> than NR_MEM_SECTIONS.
>
>What is the benefit of this patch?
>

The original idea is simple: with a stricter boundary, there is less
computation.

>You're adding a "stricter" boundary, but you're also adding a
>potential cacheline miss and removing optimizations that the compiler
>can make with a constant vs. a variable.  Providing absolute bounds
>that the compiler knows about can actually be pretty handy for its
>optimizations.

You are right.

I haven't thought about the compiler optimization case.

>
>Do you have *any* analysis to show that this has a benefit?  What does
>it do to text size, for instance?

I don't have more analysis on this. Originally, I thought a stricter
boundary would be a benefit. But as you mentioned, it defeats the compiler
optimization.

BTW, I did some tests and found that during a normal boot, all the section
numbers are within NR_MEM_SECTIONS. I am wondering in which case this check
would trigger and return 0?

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-08-24 15:07     ` Wei Yang
@ 2018-09-03 22:27       ` Wei Yang
  2018-09-09  1:38         ` Wei Yang
  0 siblings, 1 reply; 23+ messages in thread
From: Wei Yang @ 2018-09-03 22:27 UTC (permalink / raw)
  To: Wei Yang; +Cc: Dave Hansen, akpm, mhocko, rientjes, linux-mm, kirill.shutemov

On Fri, Aug 24, 2018 at 11:07:17PM +0800, Wei Yang wrote:
>On Thu, Aug 23, 2018 at 05:11:48PM -0700, Dave Hansen wrote:
>>On 08/23/2018 06:07 AM, Wei Yang wrote:
>>> --- a/mm/sparse.c
>>> +++ b/mm/sparse.c
>>> @@ -78,7 +78,7 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>>  	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
>>>  	struct mem_section *section;
>>>  
>>> -	if (mem_section[root])
>>> +	if (likely(mem_section[root]))
>>>  		return -EEXIST;
>>
>>We could add likely()/unlikely() to approximately a billion if()s around
>>the kernel if we felt like it.  We don't because it's messy and it
>>actually takes away choices from the compiler.
>>
>>Please don't send patches like this unless you have some *actual*
>>analysis that shows the benefit of the patch.  Performance numbers are best.
>

Hi, 

Is my analysis reasonable? Or which part is not valid?

> [full analysis and test results quoted from the previous mail; snipped]

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-09-03 22:27       ` Wei Yang
@ 2018-09-09  1:38         ` Wei Yang
  2018-09-10 20:30           ` Dave Hansen
  0 siblings, 1 reply; 23+ messages in thread
From: Wei Yang @ 2018-09-09  1:38 UTC (permalink / raw)
  To: Wei Yang; +Cc: Dave Hansen, akpm, mhocko, rientjes, linux-mm, kirill.shutemov

On Mon, Sep 03, 2018 at 10:27:32PM +0000, Wei Yang wrote:
>On Fri, Aug 24, 2018 at 11:07:17PM +0800, Wei Yang wrote:
>>On Thu, Aug 23, 2018 at 05:11:48PM -0700, Dave Hansen wrote:
>>>On 08/23/2018 06:07 AM, Wei Yang wrote:
>>>> --- a/mm/sparse.c
>>>> +++ b/mm/sparse.c
>>>> @@ -78,7 +78,7 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
>>>>  	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
>>>>  	struct mem_section *section;
>>>>  
>>>> -	if (mem_section[root])
>>>> +	if (likely(mem_section[root]))
>>>>  		return -EEXIST;
>>>
>>>We could add likely()/unlikely() to approximately a billion if()s around
>>>the kernel if we felt like it.  We don't because it's messy and it
>>>actually takes away choices from the compiler.
>>>
>>>Please don't send patches like this unless you have some *actual*
>>>analysis that shows the benefit of the patch.  Performance numbers are best.
>>
>
>Hi, 
>
>Is my analysis reasonable? Or which part is not valid?
>

Would someone share some thoughts on my analysis?

>> [full analysis and test results quoted from the earlier mail; snipped]

-- 
Wei Yang
Help you, Help me


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-09-09  1:38         ` Wei Yang
@ 2018-09-10 20:30           ` Dave Hansen
  2018-09-11 15:00             ` Wei Yang
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2018-09-10 20:30 UTC (permalink / raw)
  To: Wei Yang; +Cc: akpm, mhocko, rientjes, linux-mm, kirill.shutemov

On 09/08/2018 06:38 PM, owner-linux-mm@kvack.org wrote:
> 
> Finally, here is the test result on my 4G virtual machine. I added printk
> before and after sparse_memory_present_with_active_regions() and tested three
> times with and without "likely":
> 
>                without      with
>     Elapsed   0.000252     0.000250   -0.8%
> 
> The benefit seems too small on a 4G virtual machine, and the result may not
> even be stable. I am not sure we would see a visible effect on a 32G machine.

I think it's highly unlikely you have found something significant here.
It's one system, in a VM, and it's not being measured using a mechanism
that is suitable for benchmarking (the kernel dmesg timestamps).

Plus, if this is a really tight loop, the CPU's branch predictors will
be good at it.


* Re: [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init()
  2018-09-10 20:30           ` Dave Hansen
@ 2018-09-11 15:00             ` Wei Yang
  0 siblings, 0 replies; 23+ messages in thread
From: Wei Yang @ 2018-09-11 15:00 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Wei Yang, akpm, mhocko, rientjes, linux-mm, kirill.shutemov

On Mon, Sep 10, 2018 at 01:30:11PM -0700, Dave Hansen wrote:
>On 09/08/2018 06:38 PM, owner-linux-mm@kvack.org wrote:
>> 
>> At last, here is the test result on my 4G virtual machine. I added printk
>> before and after sparse_memory_present_with_active_regions() and tested three
>> times with/without "likely".
>> 
>>                without      with
>>     Elapsed   0.000252     0.000250   -0.8%
>> 
>> The benefit seems to be too small on a 4G virtual machine or even this is not
>> stable. Not sure we can see some visible effect on a 32G machine.
>
>I think it's highly unlikely you have found something significant here.
>It's one system, in a VM and it's not being measured using a mechanism
>that is suitable for benchmarking (the kernel dmesg timestamps).
>
>Plus, if this is a really tight loop, the cpu's branch predictors will
>be good at it.

Hi, Dave

Thanks for your reply.

I think you are right. This part is not significant, and the CPU may do its
job well.

Hmm... I am still willing to hear your opinion on my analysis of this
situation: in which cases should we use likely/unlikely?

For example, in this case the probability is (255 / 256) if the system
has 32G of RAM. Do we have a threshold probability for using
likely/unlikely? Or would we prefer not to use them any more and let the
compiler and CPU do their job?

Looking forward to your insights.

-- 
Wei Yang
Help you, Help me



Thread overview: 23+ messages
2018-08-23 13:07 [PATCH 0/3] trivial code refine for sparsemem Wei Yang
2018-08-23 13:07 ` [PATCH 1/3] mm/sparse: add likely to mem_section[root] check in sparse_index_init() Wei Yang
2018-08-23 13:13   ` Michal Hocko
2018-08-23 22:57     ` Wei Yang
2018-08-24  7:31       ` Michal Hocko
2018-08-24  0:11   ` Dave Hansen
2018-08-24 15:07     ` Wei Yang
2018-09-03 22:27       ` Wei Yang
2018-09-09  1:38         ` Wei Yang
2018-09-10 20:30           ` Dave Hansen
2018-09-11 15:00             ` Wei Yang
2018-08-23 13:07 ` [PATCH 2/3] mm/sparse: expand the CONFIG_SPARSEMEM_EXTREME range in __nr_to_section() Wei Yang
2018-08-23 13:21   ` Michal Hocko
2018-08-23 23:03     ` Wei Yang
2018-08-24  0:09     ` Dave Hansen
2018-08-24 15:24       ` Wei Yang
2018-08-23 13:07 ` [PATCH 3/3] mm/sparse: use __highest_present_section_nr as the boundary for pfn check Wei Yang
2018-08-23 13:25   ` Michal Hocko
2018-08-23 14:00     ` Oscar Salvador
2018-08-23 19:17       ` Michal Hocko
2018-08-23 20:52         ` Oscar Salvador
2018-08-24  0:15   ` Dave Hansen
2018-08-24 18:11     ` Wei Yang
