linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] mm: memmap_init_zone() performance improvement
       [not found] <1349276174-8398-1-git-send-email-mike.yoknis@hp.com>
@ 2012-10-06 23:59 ` Ni zhan Chen
  2012-10-08 15:16 ` Mel Gorman
  1 sibling, 0 replies; 13+ messages in thread
From: Ni zhan Chen @ 2012-10-06 23:59 UTC (permalink / raw)
  To: Mike Yoknis
  Cc: mgorman, mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam,
	minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel,
	linux-mm

On 10/03/2012 10:56 PM, Mike Yoknis wrote:
> memmap_init_zone() loops through every Page Frame Number (pfn),
> including pfn values that are within the gaps between existing
> memory sections.  The unneeded looping will become a boot
> performance issue when machines configure larger memory ranges
> that will contain larger and more numerous gaps.
>
> The code will skip across invalid sections to reduce the
> number of loops executed.

looks reasonable to me.

>
> Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
> ---
>   arch/x86/include/asm/mmzone_32.h     |    2 ++
>   arch/x86/include/asm/page_32.h       |    1 +
>   arch/x86/include/asm/page_64_types.h |    3 ++-
>   include/asm-generic/page.h           |    1 +
>   include/linux/mmzone.h               |    6 ++++++
>   mm/page_alloc.c                      |    5 ++++-
>   6 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/mmzone_32.h b/arch/x86/include/asm/mmzone_32.h
> index eb05fb3..73c5c74 100644
> --- a/arch/x86/include/asm/mmzone_32.h
> +++ b/arch/x86/include/asm/mmzone_32.h
> @@ -48,6 +48,8 @@ static inline int pfn_to_nid(unsigned long pfn)
>   #endif
>   }
>   
> +#define next_pfn_try(pfn)	((pfn)+1)
> +
>   static inline int pfn_valid(int pfn)
>   {
>   	int nid = pfn_to_nid(pfn);
> diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h
> index da4e762..e2c4cfc 100644
> --- a/arch/x86/include/asm/page_32.h
> +++ b/arch/x86/include/asm/page_32.h
> @@ -19,6 +19,7 @@ extern unsigned long __phys_addr(unsigned long);
>   
>   #ifdef CONFIG_FLATMEM
>   #define pfn_valid(pfn)		((pfn) < max_mapnr)
> +#define next_pfn_try(pfn)	((pfn)+1)
>   #endif /* CONFIG_FLATMEM */
>   
>   #ifdef CONFIG_X86_USE_3DNOW
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index 320f7bb..02d82e5 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -69,7 +69,8 @@ extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
>   #endif	/* !__ASSEMBLY__ */
>   
>   #ifdef CONFIG_FLATMEM
> -#define pfn_valid(pfn)          ((pfn) < max_pfn)
> +#define pfn_valid(pfn)		((pfn) < max_pfn)
> +#define next_pfn_try(pfn)	((pfn)+1)
>   #endif
>   
>   #endif /* _ASM_X86_PAGE_64_DEFS_H */
> diff --git a/include/asm-generic/page.h b/include/asm-generic/page.h
> index 37d1fe2..316200d 100644
> --- a/include/asm-generic/page.h
> +++ b/include/asm-generic/page.h
> @@ -91,6 +91,7 @@ extern unsigned long memory_end;
>   #endif
>   
>   #define pfn_valid(pfn)		((pfn) >= ARCH_PFN_OFFSET && ((pfn) - ARCH_PFN_OFFSET) < max_mapnr)
> +#define next_pfn_try(pfn)	((pfn)+1)
>   
>   #define	virt_addr_valid(kaddr)	(((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
>   				((void *)(kaddr) < (void *)memory_end))
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index f7d88ba..04d3c39 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1166,6 +1166,12 @@ static inline int pfn_valid(unsigned long pfn)
>   		return 0;
>   	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
>   }
> +
> +static inline unsigned long next_pfn_try(unsigned long pfn)
> +{
> +	/* Skip entire section, because all of it is invalid. */
> +	return section_nr_to_pfn(pfn_to_section_nr(pfn) + 1);
> +}
>   #endif
>   
>   static inline int pfn_present(unsigned long pfn)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5b6b6b1..dd2af8b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3798,8 +3798,11 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>   		 * exist on hotplugged memory.
>   		 */
>   		if (context == MEMMAP_EARLY) {
> -			if (!early_pfn_valid(pfn))
> +			if (!early_pfn_valid(pfn)) {
> +				pfn = next_pfn_try(pfn);
> +				pfn--;
>   				continue;
> +			}
>   			if (!early_pfn_in_nid(pfn, nid))
>   				continue;
>   		}


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] mm: memmap_init_zone() performance improvement
       [not found] <1349276174-8398-1-git-send-email-mike.yoknis@hp.com>
  2012-10-06 23:59 ` [PATCH] mm: memmap_init_zone() performance improvement Ni zhan Chen
@ 2012-10-08 15:16 ` Mel Gorman
  2012-10-09  0:42   ` Ni zhan Chen
  2012-10-09 14:56   ` Mike Yoknis
  1 sibling, 2 replies; 13+ messages in thread
From: Mel Gorman @ 2012-10-08 15:16 UTC (permalink / raw)
  To: Mike Yoknis
  Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan,
	kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm

On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote:
> memmap_init_zone() loops through every Page Frame Number (pfn),
> including pfn values that are within the gaps between existing
> memory sections.  The unneeded looping will become a boot
> performance issue when machines configure larger memory ranges
> that will contain larger and more numerous gaps.
> 
> The code will skip across invalid sections to reduce the
> number of loops executed.
> 
> Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>

This only helps SPARSEMEM and changes more headers than should be
necessary. It would have been easier to do something simple like

if (!early_pfn_valid(pfn)) {
	pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
	continue;
}

because that would obey the expectation that pages within a
MAX_ORDER_NR_PAGES-aligned range are all valid or all invalid (ARM is the
exception that breaks this rule). It would be less efficient on
SPARSEMEM than what you're trying to merge but I do not see the need for
the additional complexity unless you can show it makes a big difference
to boot times.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-08 15:16 ` Mel Gorman
@ 2012-10-09  0:42   ` Ni zhan Chen
  2012-10-09 14:56   ` Mike Yoknis
  1 sibling, 0 replies; 13+ messages in thread
From: Ni zhan Chen @ 2012-10-09  0:42 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Mike Yoknis, mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd,
	sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild,
	linux-kernel, linux-mm

On 10/08/2012 11:16 PM, Mel Gorman wrote:
> On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote:
>> memmap_init_zone() loops through every Page Frame Number (pfn),
>> including pfn values that are within the gaps between existing
>> memory sections.  The unneeded looping will become a boot
>> performance issue when machines configure larger memory ranges
>> that will contain larger and more numerous gaps.
>>
>> The code will skip across invalid sections to reduce the
>> number of loops executed.
>>
>> Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
> This only helps SPARSEMEM and changes more headers than should be
> necessary. It would have been easier to do something simple like
>
> if (!early_pfn_valid(pfn)) {
> 	pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
> 	continue;
> }

So can a present memory section in sparsemem have a 
MAX_ORDER_NR_PAGES-aligned range that is entirely invalid?
If the answer is yes, when would this happen?

>
> because that would obey the expectation that pages within a
> MAX_ORDER_NR_PAGES-aligned range are all valid or all invalid (ARM is the
> exception that breaks this rule). It would be less efficient on
> SPARSEMEM than what you're trying to merge but I do not see the need for
> the additional complexity unless you can show it makes a big difference
> to boot times.
>



* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-08 15:16 ` Mel Gorman
  2012-10-09  0:42   ` Ni zhan Chen
@ 2012-10-09 14:56   ` Mike Yoknis
  2012-10-19 19:53     ` Mike Yoknis
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Yoknis @ 2012-10-09 14:56 UTC (permalink / raw)
  To: Mel Gorman
  Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan,
	kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm

On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote:
> On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote:
> > memmap_init_zone() loops through every Page Frame Number (pfn),
> > including pfn values that are within the gaps between existing
> > memory sections.  The unneeded looping will become a boot
> > performance issue when machines configure larger memory ranges
> > that will contain larger and more numerous gaps.
> > 
> > The code will skip across invalid sections to reduce the
> > number of loops executed.
> > 
> > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
> 
> This only helps SPARSEMEM and changes more headers than should be
> necessary. It would have been easier to do something simple like
> 
> if (!early_pfn_valid(pfn)) {
> 	pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
> 	continue;
> }
> 
> because that would obey the expectation that pages within a
> MAX_ORDER_NR_PAGES-aligned range are all valid or all invalid (ARM is the
> exception that breaks this rule). It would be less efficient on
> SPARSEMEM than what you're trying to merge but I do not see the need for
> the additional complexity unless you can show it makes a big difference
> to boot times.
> 

Mel,
I, too, was concerned that pfn_valid() was defined in so many header
files.  But, I did not feel that it was appropriate for me to try to
restructure things to consolidate those definitions just to add this one
new function.  Being a kernel newbie I did not believe that I had a good
enough understanding of what combinations and permutations of CONFIG and
architecture may have made all of those different definitions necessary,
so I left them in.

Yes, indeed, this fix is targeted for systems that have holes in memory.
That is where we see the problem.  We are creating large computer
systems and we would like for those machines to perform well, including
boot times.

Let me pass along the numbers I have.  We have what we call an
"architectural simulator".  It is a computer program that pretends that
it is a computer system.  We use it to test the firmware before real
hardware is available.  We have booted Linux on our simulator.  As you
would expect it takes longer to boot on the simulator than it does on
real hardware.

With my patch - boot time 41 minutes
Without patch - boot time 94 minutes

These numbers do not scale linearly to real hardware.  But they indicate
to me a place where Linux can be improved.

Mike Yoknis




* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-09 14:56   ` Mike Yoknis
@ 2012-10-19 19:53     ` Mike Yoknis
  2012-10-20  8:29       ` Mel Gorman
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Yoknis @ 2012-10-19 19:53 UTC (permalink / raw)
  To: Mel Gorman
  Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan,
	kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm

On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote:
> On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote:
> > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote:
> > > memmap_init_zone() loops through every Page Frame Number (pfn),
> > > including pfn values that are within the gaps between existing
> > > memory sections.  The unneeded looping will become a boot
> > > performance issue when machines configure larger memory ranges
> > > that will contain larger and more numerous gaps.
> > > 
> > > The code will skip across invalid sections to reduce the
> > > number of loops executed.
> > > 
> > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
> > 
> > I do not see the need for
> > the additional complexity unless you can show it makes a big difference
> > to boot times.
> > 
> 
> Mel,
> 
> Let me pass along the numbers I have.  We have what we call an
> "architectural simulator".  It is a computer program that pretends that
> it is a computer system.  We use it to test the firmware before real
> hardware is available.  We have booted Linux on our simulator.  As you
> would expect it takes longer to boot on the simulator than it does on
> real hardware.
> 
> With my patch - boot time 41 minutes
> Without patch - boot time 94 minutes
> 
> These numbers do not scale linearly to real hardware.  But indicate to
> me a place where Linux can be improved.
> 
> Mike Yoknis
> 
Mel,
I finally got access to prototype hardware.  
It is a relatively small machine with only 64GB of RAM.
 
I put in a time measurement by reading the TSC register.
I booted both with and without my patch -
 
Without patch -
[    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
[    0.000000] memmap_init_zone() enter 1404184834218
[    0.000000] memmap_init_zone() exit  1411174884438  diff = 6990050220
 
With patch -
[    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
[    0.000000] memmap_init_zone() enter 1555530050778
[    0.000000] memmap_init_zone() exit  1559379204643  diff = 3849153865
 
This shows that without the patch the routine spends 45% 
of its time spinning unnecessarily.
 
Mike Yoknis




* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-19 19:53     ` Mike Yoknis
@ 2012-10-20  8:29       ` Mel Gorman
  2012-10-24 15:47         ` Mike Yoknis
  2012-10-30 15:14         ` [PATCH] " Dave Hansen
  0 siblings, 2 replies; 13+ messages in thread
From: Mel Gorman @ 2012-10-20  8:29 UTC (permalink / raw)
  To: Mike Yoknis
  Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan,
	kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm

On Fri, Oct 19, 2012 at 01:53:18PM -0600, Mike Yoknis wrote:
> On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote:
> > On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote:
> > > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote:
> > > > memmap_init_zone() loops through every Page Frame Number (pfn),
> > > > including pfn values that are within the gaps between existing
> > > > memory sections.  The unneeded looping will become a boot
> > > > performance issue when machines configure larger memory ranges
> > > > that will contain larger and more numerous gaps.
> > > > 
> > > > The code will skip across invalid sections to reduce the
> > > > number of loops executed.
> > > > 
> > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
> > > 
> > > I do not see the need for
> > > the additional complexity unless you can show it makes a big difference
> > > to boot times.
> > > 
> > 
> > Mel,
> > 
> > Let me pass along the numbers I have.  We have what we call an
> > "architectural simulator".  It is a computer program that pretends that
> > it is a computer system.  We use it to test the firmware before real
> > hardware is available.  We have booted Linux on our simulator.  As you
> > would expect it takes longer to boot on the simulator than it does on
> > real hardware.
> > 
> > With my patch - boot time 41 minutes
> > Without patch - boot time 94 minutes
> > 
> > These numbers do not scale linearly to real hardware.  But indicate to
> > me a place where Linux can be improved.
> > 
> > Mike Yoknis
> > 
> Mel,
> I finally got access to prototype hardware.  
> It is a relatively small machine with only 64GB of RAM.
>  
> I put in a time measurement by reading the TSC register.
> I booted both with and without my patch -
>  
> Without patch -
> [    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
> [    0.000000] memmap_init_zone() enter 1404184834218
> [    0.000000] memmap_init_zone() exit  1411174884438  diff = 6990050220
>  
> With patch -
> [    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
> [    0.000000] memmap_init_zone() enter 1555530050778
> [    0.000000] memmap_init_zone() exit  1559379204643  diff = 3849153865
>  
> This shows that without the patch the routine spends 45% 
> of its time spinning unnecessarily.
>  

I'm travelling at the moment so apologies that I have not followed up on
this. My problem is still the same with the patch - it changes more
headers than is necessary and it is sparsemem specific. At minimum, try
the suggestion of 

if (!early_pfn_valid(pfn)) {
      pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
      continue;
}

and see how much it gains you as it should work on all memory models. If
it turns out that you really need to skip whole sections then the stride
could be MAX_ORDER_NR_PAGES on all memory models except sparsemem, where
the stride would be PAGES_PER_SECTION.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-20  8:29       ` Mel Gorman
@ 2012-10-24 15:47         ` Mike Yoknis
  2012-10-25  9:44           ` Mel Gorman
  2012-10-30 15:14         ` [PATCH] " Dave Hansen
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Yoknis @ 2012-10-24 15:47 UTC (permalink / raw)
  To: Mel Gorman
  Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan,
	kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm

On Sat, 2012-10-20 at 09:29 +0100, Mel Gorman wrote:
> On Fri, Oct 19, 2012 at 01:53:18PM -0600, Mike Yoknis wrote:
> > On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote:
> > > On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote:
> > > > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote:
> > > > > memmap_init_zone() loops through every Page Frame Number (pfn),
> > > > > including pfn values that are within the gaps between existing
> > > > > memory sections.  The unneeded looping will become a boot
> > > > > performance issue when machines configure larger memory ranges
> > > > > that will contain larger and more numerous gaps.
> > > > > 
> > > > > The code will skip across invalid sections to reduce the
> > > > > number of loops executed.
> > > > > 
> > > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
> > > > 
> > > > I do not see the need for
> > > > the additional complexity unless you can show it makes a big difference
> > > > to boot times.
> > > > 
> > > 
> > > Mel,
> > > 
> > > Let me pass along the numbers I have.  We have what we call an
> > > "architectural simulator".  It is a computer program that pretends that
> > > it is a computer system.  We use it to test the firmware before real
> > > hardware is available.  We have booted Linux on our simulator.  As you
> > > would expect it takes longer to boot on the simulator than it does on
> > > real hardware.
> > > 
> > > With my patch - boot time 41 minutes
> > > Without patch - boot time 94 minutes
> > > 
> > > These numbers do not scale linearly to real hardware.  But indicate to
> > > me a place where Linux can be improved.
> > > 
> > > Mike Yoknis
> > > 
> > Mel,
> > I finally got access to prototype hardware.  
> > It is a relatively small machine with only 64GB of RAM.
> >  
> > I put in a time measurement by reading the TSC register.
> > I booted both with and without my patch -
> >  
> > Without patch -
> > [    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
> > [    0.000000] memmap_init_zone() enter 1404184834218
> > [    0.000000] memmap_init_zone() exit  1411174884438  diff = 6990050220
> >  
> > With patch -
> > [    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
> > [    0.000000] memmap_init_zone() enter 1555530050778
> > [    0.000000] memmap_init_zone() exit  1559379204643  diff = 3849153865
> >  
> > This shows that without the patch the routine spends 45% 
> > of its time spinning unnecessarily.
> >  
> 
> I'm travelling at the moment so apologies that I have not followed up on
> this. My problem is still the same with the patch - it changes more
> headers than is necessary and it is sparsemem specific. At minimum, try
> the suggestion of 
> 
> if (!early_pfn_valid(pfn)) {
>       pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
>       continue;
> }
> 
> and see how much it gains you as it should work on all memory models. If
> it turns out that you really need to skip whole sections then the stride
> could be MAX_ORDER_NR_PAGES on all memory models except sparsemem, where
> the stride would be PAGES_PER_SECTION.
> 
Mel,
I tried your suggestion.  I re-ran all 3 methods on our latest firmware.

The following are TSC difference numbers (*10^6) to execute
memmap_init_zone() -

No patch   - 7010
Mel's patch- 3918
My patch   - 3847

The incremental improvement of my method is not significant vs. yours.

If you believe your suggested change is worthwhile I will create a v2
patch.
Mike Y




* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-24 15:47         ` Mike Yoknis
@ 2012-10-25  9:44           ` Mel Gorman
  2012-10-26 22:47             ` [PATCH v2] " Mike Yoknis
  0 siblings, 1 reply; 13+ messages in thread
From: Mel Gorman @ 2012-10-25  9:44 UTC (permalink / raw)
  To: Mike Yoknis
  Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan,
	kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm

On Wed, Oct 24, 2012 at 09:47:47AM -0600, Mike Yoknis wrote:
> On Sat, 2012-10-20 at 09:29 +0100, Mel Gorman wrote:
> > On Fri, Oct 19, 2012 at 01:53:18PM -0600, Mike Yoknis wrote:
> > > On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote:
> > > > On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote:
> > > > > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote:
> > > > > > memmap_init_zone() loops through every Page Frame Number (pfn),
> > > > > > including pfn values that are within the gaps between existing
> > > > > > memory sections.  The unneeded looping will become a boot
> > > > > > performance issue when machines configure larger memory ranges
> > > > > > that will contain larger and more numerous gaps.
> > > > > > 
> > > > > > The code will skip across invalid sections to reduce the
> > > > > > number of loops executed.
> > > > > > 
> > > > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
> > > > > 
> > > > > I do not see the need for
> > > > > the additional complexity unless you can show it makes a big difference
> > > > > to boot times.
> > > > > 
> > > > 
> > > > Mel,
> > > > 
> > > > Let me pass along the numbers I have.  We have what we call an
> > > > "architectural simulator".  It is a computer program that pretends that
> > > > it is a computer system.  We use it to test the firmware before real
> > > > hardware is available.  We have booted Linux on our simulator.  As you
> > > > would expect it takes longer to boot on the simulator than it does on
> > > > real hardware.
> > > > 
> > > > With my patch - boot time 41 minutes
> > > > Without patch - boot time 94 minutes
> > > > 
> > > > These numbers do not scale linearly to real hardware.  But indicate to
> > > > me a place where Linux can be improved.
> > > > 
> > > > Mike Yoknis
> > > > 
> > > Mel,
> > > I finally got access to prototype hardware.  
> > > It is a relatively small machine with only 64GB of RAM.
> > >  
> > > I put in a time measurement by reading the TSC register.
> > > I booted both with and without my patch -
> > >  
> > > Without patch -
> > > [    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
> > > [    0.000000] memmap_init_zone() enter 1404184834218
> > > [    0.000000] memmap_init_zone() exit  1411174884438  diff = 6990050220
> > >  
> > > With patch -
> > > [    0.000000]   Normal zone: 13400064 pages, LIFO batch:31
> > > [    0.000000] memmap_init_zone() enter 1555530050778
> > > [    0.000000] memmap_init_zone() exit  1559379204643  diff = 3849153865
> > >  
> > > This shows that without the patch the routine spends 45% 
> > > of its time spinning unnecessarily.
> > >  
> > 
> > I'm travelling at the moment so apologies that I have not followed up on
> > this. My problem is still the same with the patch - it changes more
> > headers than is necessary and it is sparsemem specific. At minimum, try
> > the suggestion of 
> > 
> > if (!early_pfn_valid(pfn)) {
> >       pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
> >       continue;
> > }
> > 
> > and see how much it gains you as it should work on all memory models. If
> > it turns out that you really need to skip whole sections then the stride
> > could be MAX_ORDER_NR_PAGES on all memory models except sparsemem, where
> > the stride would be PAGES_PER_SECTION.
> > 
> Mel,
> I tried your suggestion.  I re-ran all 3 methods on our latest firmware.
> 
> The following are TSC difference numbers (*10^6) to execute
> memmap_init_zone() -
> 
> No patch   - 7010
> Mel's patch- 3918
> My patch   - 3847
> 
> The incremental improvement of my method is not significant vs. yours.
> 
> If you believe your suggested change is worthwhile I will create a v2
> patch.

I think it is a reasonable change and I prefer my suggestion because it
should work for all memory models. Please do a V2 of the patch. I'm still
travelling at the moment (writing this from an airport) but I'll be back
online next Tuesday and will review it when I can.

-- 
Mel Gorman
SUSE Labs


* [PATCH v2] mm: memmap_init_zone() performance improvement
  2012-10-25  9:44           ` Mel Gorman
@ 2012-10-26 22:47             ` Mike Yoknis
  2012-10-30 22:31               ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Yoknis @ 2012-10-26 22:47 UTC (permalink / raw)
  To: Mel Gorman
  Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan,
	kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm

memmap_init_zone() loops through every Page Frame Number (pfn),
including pfn values that are within the gaps between existing
memory sections.  The unneeded looping will become a boot
performance issue when machines configure larger memory ranges
that will contain larger and more numerous gaps.

The code will skip across invalid pfn values to reduce the
number of loops executed.

Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
---
 mm/page_alloc.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 45c916b..9f9c1a6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3857,8 +3857,11 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		 * exist on hotplugged memory.
 		 */
 		if (context == MEMMAP_EARLY) {
-			if (!early_pfn_valid(pfn))
+			if (!early_pfn_valid(pfn)) {
+				pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES,
+						MAX_ORDER_NR_PAGES) - 1;
 				continue;
+			}
 			if (!early_pfn_in_nid(pfn, nid))
 				continue;
 		}
-- 
1.7.11.3




* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-20  8:29       ` Mel Gorman
  2012-10-24 15:47         ` Mike Yoknis
@ 2012-10-30 15:14         ` Dave Hansen
  2012-11-06 16:03           ` Mike Yoknis
  1 sibling, 1 reply; 13+ messages in thread
From: Dave Hansen @ 2012-10-30 15:14 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Mike Yoknis, mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd,
	sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild,
	linux-kernel, linux-mm

On 10/20/2012 01:29 AM, Mel Gorman wrote:
> I'm travelling at the moment so apologies that I have not followed up on
> this. My problem is still the same with the patch - it changes more
> headers than is necessary and it is sparsemem specific. At minimum, try
> the suggestion of 
> 
> if (!early_pfn_valid(pfn)) {
>       pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
>       continue;
> }

Sorry I didn't catch this until v2...

Is that ALIGN() correct?  If pfn=3, then it would expand to:

(3+MAX_ORDER_NR_PAGES+MAX_ORDER_NR_PAGES-1) & ~(MAX_ORDER_NR_PAGES-1)

You would end up skipping the current MAX_ORDER_NR_PAGES area, and then
one _extra_ because ALIGN() aligns up, and you're adding
MAX_ORDER_NR_PAGES too.  It doesn't matter unless you run into a
!early_pfn_valid() in the middle of a MAX_ORDER area, I guess.

I think this would work, plus be a bit smaller:

	pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES) - 1;



* Re: [PATCH v2] mm: memmap_init_zone() performance improvement
  2012-10-26 22:47             ` [PATCH v2] " Mike Yoknis
@ 2012-10-30 22:31               ` Andrew Morton
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2012-10-30 22:31 UTC (permalink / raw)
  To: mike.yoknis
  Cc: Mel Gorman, mingo, linux-arch, mmarek, tglx, hpa, arnd, sam,
	minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel,
	linux-mm

On Fri, 26 Oct 2012 16:47:47 -0600
Mike Yoknis <mike.yoknis@hp.com> wrote:

> memmap_init_zone() loops through every Page Frame Number (pfn),
> including pfn values that are within the gaps between existing
> memory sections.  The unneeded looping will become a boot
> performance issue when machines configure larger memory ranges
> that will contain larger and more numerous gaps.
> 
> The code will skip across invalid pfn values to reduce the
> number of loops executed.
> 

So I was wondering how much difference this makes.  Then I see Mel
already asked and was answered.  The lesson: please treat a reviewer
question as a sign that the changelog needs more information!  I added
this text to the changelog:

: We have what we call an "architectural simulator".  It is a computer
: program that pretends that it is a computer system.  We use it to test the
: firmware before real hardware is available.  We have booted Linux on our
: simulator.  As you would expect it takes longer to boot on the simulator
: than it does on real hardware.
: 
: With my patch - boot time 41 minutes
: Without patch - boot time 94 minutes
: 
: These numbers do not scale linearly to real hardware.  But indicate to me
: a place where Linux can be improved.

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3857,8 +3857,11 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  		 * exist on hotplugged memory.
>  		 */
>  		if (context == MEMMAP_EARLY) {
> -			if (!early_pfn_valid(pfn))
> +			if (!early_pfn_valid(pfn)) {
> +				pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES,
> +						MAX_ORDER_NR_PAGES) - 1;
>  				continue;
> +			}
>  			if (!early_pfn_in_nid(pfn, nid))
>  				continue;
>  		}

So what is the assumption here?  That each zone's first page has a pfn
which is a multiple of MAX_ORDER_NR_PAGES?

That seems reasonable, but is it actually true, for all architectures
and for all time?  Where did this come from?



* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-10-30 15:14         ` [PATCH] " Dave Hansen
@ 2012-11-06 16:03           ` Mike Yoknis
  2012-12-18 23:03             ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Yoknis @ 2012-11-06 16:03 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Mel Gorman, mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd,
	sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild,
	linux-kernel, linux-mm

On Tue, 2012-10-30 at 09:14 -0600, Dave Hansen wrote:
> On 10/20/2012 01:29 AM, Mel Gorman wrote:
> > I'm travelling at the moment so apologies that I have not followed up on
> > this. My problem is still the same with the patch - it changes more
> > headers than is necessary and it is sparsemem specific. At minimum, try
> > the suggestion of
> >
> > if (!early_pfn_valid(pfn)) {
> >       pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
> >       continue;
> > }
> 
> Sorry I didn't catch this until v2...
> 
> Is that ALIGN() correct?  If pfn=3, then it would expand to:
> 
> (3+MAX_ORDER_NR_PAGES+MAX_ORDER_NR_PAGES-1) & ~(MAX_ORDER_NR_PAGES-1)
> 
> You would end up skipping the current MAX_ORDER_NR_PAGES area, and then
> one _extra_ because ALIGN() aligns up, and you're adding
> MAX_ORDER_NR_PAGES too.  It doesn't matter unless you run into a
> !early_pfn_valid() in the middle of a MAX_ORDER area, I guess.
> 
> I think this would work, plus be a bit smaller:
> 
>         pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES) - 1;
> 
Dave,
I see your point about "rounding-up".  But, I favor the way Mel
suggested it.  It more clearly shows the intent, which is to move up by
MAX_ORDER_NR_PAGES.  The "pfn+1" may suggest that there is some
significance to the next pfn, but there is not.
I find Mel's way easier to understand.
Mike Y




* Re: [PATCH] mm: memmap_init_zone() performance improvement
  2012-11-06 16:03           ` Mike Yoknis
@ 2012-12-18 23:03             ` Andrew Morton
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2012-12-18 23:03 UTC (permalink / raw)
  To: mike.yoknis
  Cc: Dave Hansen, Mel Gorman, mingo, linux-arch, mmarek, tglx, hpa,
	arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild,
	linux-kernel, linux-mm

On Tue, 06 Nov 2012 09:03:26 -0700
Mike Yoknis <mike.yoknis@hp.com> wrote:

> On Tue, 2012-10-30 at 09:14 -0600, Dave Hansen wrote:
> > On 10/20/2012 01:29 AM, Mel Gorman wrote:
> > > I'm travelling at the moment so apologies that I have not followed up on
> > > this. My problem is still the same with the patch - it changes more
> > > headers than is necessary and it is sparsemem specific. At minimum, try
> > > the suggestion of
> > >
> > > if (!early_pfn_valid(pfn)) {
> > >       pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
> > >       continue;
> > > }
> > 
> > Sorry I didn't catch this until v2...
> > 
> > Is that ALIGN() correct?  If pfn=3, then it would expand to:
> > 
> > (3+MAX_ORDER_NR_PAGES+MAX_ORDER_NR_PAGES-1) & ~(MAX_ORDER_NR_PAGES-1)
> > 
> > You would end up skipping the current MAX_ORDER_NR_PAGES area, and then
> > one _extra_ because ALIGN() aligns up, and you're adding
> > MAX_ORDER_NR_PAGES too.  It doesn't matter unless you run into a
> > !early_pfn_valid() in the middle of a MAX_ORDER area, I guess.
> > 
> > I think this would work, plus be a bit smaller:
> > 
> >         pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES) - 1;
> > 
> Dave,
> I see your point about "rounding-up".  But, I favor the way Mel
> suggested it.  It more clearly shows the intent, which is to move up by
> MAX_ORDER_NR_PAGES.  The "pfn+1" may suggest that there is some
> significance to the next pfn, but there is not.
> I find Mel's way easier to understand.

I don't think that really answers Dave's question.  What happens if we
"run into a !early_pfn_valid()" in the middle of a MAX_ORDER area?




end of thread, other threads:[~2012-12-18 23:03 UTC | newest]

Thread overview: 13+ messages
     [not found] <1349276174-8398-1-git-send-email-mike.yoknis@hp.com>
2012-10-06 23:59 ` [PATCH] mm: memmap_init_zone() performance improvement Ni zhan Chen
2012-10-08 15:16 ` Mel Gorman
2012-10-09  0:42   ` Ni zhan Chen
2012-10-09 14:56   ` Mike Yoknis
2012-10-19 19:53     ` Mike Yoknis
2012-10-20  8:29       ` Mel Gorman
2012-10-24 15:47         ` Mike Yoknis
2012-10-25  9:44           ` Mel Gorman
2012-10-26 22:47             ` [PATCH v2] " Mike Yoknis
2012-10-30 22:31               ` Andrew Morton
2012-10-30 15:14         ` [PATCH] " Dave Hansen
2012-11-06 16:03           ` Mike Yoknis
2012-12-18 23:03             ` Andrew Morton
