linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Weirdness in __alloc_bootmem_node_high
@ 2012-04-17 15:55 Michal Hocko
  2012-04-17 17:12 ` Yinghai Lu
  2012-04-20 18:29 ` Tejun Heo
  0 siblings, 2 replies; 21+ messages in thread
From: Michal Hocko @ 2012-04-17 15:55 UTC (permalink / raw)
  To: yinghai; +Cc: linux-mm, LKML

Hi,
I just come across the following condition in __alloc_bootmem_node_high
which I have hard times to understand. I guess it is a bug and we need
something like the following. But, to be honest, I have no idea why we
care about those 128MB above MAX_DMA32_PFN.
---
 mm/bootmem.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 0131170..5adb072 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -737,7 +737,7 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
 	/* update goal according ...MAX_DMA32_PFN */
 	end_pfn = pgdat->node_start_pfn + pgdat->node_spanned_pages;
 
-	if (end_pfn > MAX_DMA32_PFN + (128 >> (20 - PAGE_SHIFT)) &&
+	if (end_pfn > MAX_DMA32_PFN + (128 << (20 - PAGE_SHIFT)) &&
 	    (goal >> PAGE_SHIFT) < MAX_DMA32_PFN) {
 		void *ptr;
 		unsigned long new_goal;
-- 
1.7.9.5

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-17 15:55 Weirdness in __alloc_bootmem_node_high Michal Hocko
@ 2012-04-17 17:12 ` Yinghai Lu
  2012-04-17 17:32   ` Michal Hocko
  2012-04-20 18:29 ` Tejun Heo
  1 sibling, 1 reply; 21+ messages in thread
From: Yinghai Lu @ 2012-04-17 17:12 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Tue, Apr 17, 2012 at 8:55 AM, Michal Hocko <mhocko@suse.cz> wrote:
> Hi,
> I just come across the following condition in __alloc_bootmem_node_high
> which I have hard times to understand. I guess it is a bug and we need
> something like the following. But, to be honest, I have no idea why we
> care about those 128MB above MAX_DMA32_PFN.
> ---
>  mm/bootmem.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/bootmem.c b/mm/bootmem.c
> index 0131170..5adb072 100644
> --- a/mm/bootmem.c
> +++ b/mm/bootmem.c
> @@ -737,7 +737,7 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
>        /* update goal according ...MAX_DMA32_PFN */
>        end_pfn = pgdat->node_start_pfn + pgdat->node_spanned_pages;
>
> -       if (end_pfn > MAX_DMA32_PFN + (128 >> (20 - PAGE_SHIFT)) &&
> +       if (end_pfn > MAX_DMA32_PFN + (128 << (20 - PAGE_SHIFT)) &&
>            (goal >> PAGE_SHIFT) < MAX_DMA32_PFN) {
>                void *ptr;
>                unsigned long new_goal;
> --

We are not using bootmem with x86 now, so could remove those workaround now.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-17 17:12 ` Yinghai Lu
@ 2012-04-17 17:32   ` Michal Hocko
  2012-04-17 18:07     ` Yinghai Lu
  0 siblings, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2012-04-17 17:32 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: linux-mm, LKML

On Tue 17-04-12 10:12:30, Yinghai Lu wrote:
> On Tue, Apr 17, 2012 at 8:55 AM, Michal Hocko <mhocko@suse.cz> wrote:
> > Hi,
> > I just come across the following condition in __alloc_bootmem_node_high
> > which I have hard times to understand. I guess it is a bug and we need
> > something like the following. But, to be honest, I have no idea why we
> > care about those 128MB above MAX_DMA32_PFN.
> > ---
> >  mm/bootmem.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/bootmem.c b/mm/bootmem.c
> > index 0131170..5adb072 100644
> > --- a/mm/bootmem.c
> > +++ b/mm/bootmem.c
> > @@ -737,7 +737,7 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
> >        /* update goal according ...MAX_DMA32_PFN */
> >        end_pfn = pgdat->node_start_pfn + pgdat->node_spanned_pages;
> >
> > -       if (end_pfn > MAX_DMA32_PFN + (128 >> (20 - PAGE_SHIFT)) &&
> > +       if (end_pfn > MAX_DMA32_PFN + (128 << (20 - PAGE_SHIFT)) &&
> >            (goal >> PAGE_SHIFT) < MAX_DMA32_PFN) {
> >                void *ptr;
> >                unsigned long new_goal;
> > --
> 
> We are not using bootmem with x86 now, so could remove those workaround now.

Could you be more specific about what the workaround is used for?

Thanks

> 
> Thanks
> 
> Yinghai
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-17 17:32   ` Michal Hocko
@ 2012-04-17 18:07     ` Yinghai Lu
  2012-04-17 18:30       ` Sam Ravnborg
  2012-04-19 12:50       ` Michal Hocko
  0 siblings, 2 replies; 21+ messages in thread
From: Yinghai Lu @ 2012-04-17 18:07 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, LKML

On Tue, Apr 17, 2012 at 10:32 AM, Michal Hocko <mhocko@suse.cz> wrote:
> On Tue 17-04-12 10:12:30, Yinghai Lu wrote:
>>
>> We are not using bootmem with x86 now, so could remove those workaround now.
>
> Could you be more specific about what the workaround is used for?

Don't bootmem allocating too low to use up all low memory. like for
system with lots of memory for sparse vmemmap.

when nobootmem.c is used, __alloc_bootmem_node_high is the same as
__alloc_bootmem_node.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-17 18:07     ` Yinghai Lu
@ 2012-04-17 18:30       ` Sam Ravnborg
  2012-04-17 21:33         ` Tejun Heo
  2012-04-19 12:50       ` Michal Hocko
  1 sibling, 1 reply; 21+ messages in thread
From: Sam Ravnborg @ 2012-04-17 18:30 UTC (permalink / raw)
  To: Yinghai Lu, Tejun Heo; +Cc: Michal Hocko, linux-mm, LKML

On Tue, Apr 17, 2012 at 11:07:10AM -0700, Yinghai Lu wrote:
> On Tue, Apr 17, 2012 at 10:32 AM, Michal Hocko <mhocko@suse.cz> wrote:
> > On Tue 17-04-12 10:12:30, Yinghai Lu wrote:
> >>
> >> We are not using bootmem with x86 now, so could remove those workaround now.
> >
> > Could you be more specific about what the workaround is used for?
> 
> Don't bootmem allocating too low to use up all low memory. like for
> system with lots of memory for sparse vmemmap.
> 
> when nobootmem.c is used, __alloc_bootmem_node_high is the same as
> __alloc_bootmem_node.

It would be nice if someone familiar with the memblock/bootmem
internals could cleans up the leftovers from the migration
of x86 to memblock / nobootmem.

This would be less to be confused about when other migrate to
use memblock.

	Sam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-17 18:30       ` Sam Ravnborg
@ 2012-04-17 21:33         ` Tejun Heo
  0 siblings, 0 replies; 21+ messages in thread
From: Tejun Heo @ 2012-04-17 21:33 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Yinghai Lu, Michal Hocko, linux-mm, LKML

Hello, Sam.

On Tue, Apr 17, 2012 at 08:30:42PM +0200, Sam Ravnborg wrote:
> It would be nice if someone familiar with the memblock/bootmem
> internals could cleans up the leftovers from the migration
> of x86 to memblock / nobootmem.
> 
> This would be less to be confused about when other migrate to
> use memblock.

I can't remember the details now (my memory sucks big time) but there
were some obstacles and I decided to defer cleaning up.  I'm kinda
overwhelmed in other areas so if anyone is interested in cleaning up,
I'll be happy to help.  If not, I'll try to get to it.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-17 18:07     ` Yinghai Lu
  2012-04-17 18:30       ` Sam Ravnborg
@ 2012-04-19 12:50       ` Michal Hocko
  1 sibling, 0 replies; 21+ messages in thread
From: Michal Hocko @ 2012-04-19 12:50 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: linux-mm, LKML

On Tue 17-04-12 11:07:10, Yinghai Lu wrote:
> On Tue, Apr 17, 2012 at 10:32 AM, Michal Hocko <mhocko@suse.cz> wrote:
> > On Tue 17-04-12 10:12:30, Yinghai Lu wrote:
> >>
> >> We are not using bootmem with x86 now, so could remove those workaround now.
> >
> > Could you be more specific about what the workaround is used for?
> 
> Don't bootmem allocating too low to use up all low memory. like for
> system with lots of memory for sparse vmemmap.

OK I see. Thanks for clarification.
I guess it doesn't make much sense to fix this particular thing now and
rather let it to a bigger clean up. If people think otherwise I can send
a patch though.


> 
> when nobootmem.c is used, __alloc_bootmem_node_high is the same as
> __alloc_bootmem_node.
> 
> Thanks
> 
> Yinghai
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-17 15:55 Weirdness in __alloc_bootmem_node_high Michal Hocko
  2012-04-17 17:12 ` Yinghai Lu
@ 2012-04-20 18:29 ` Tejun Heo
  2012-04-20 19:14   ` Sam Ravnborg
  1 sibling, 1 reply; 21+ messages in thread
From: Tejun Heo @ 2012-04-20 18:29 UTC (permalink / raw)
  To: Michal Hocko; +Cc: yinghai, linux-mm, LKML

On Tue, Apr 17, 2012 at 05:55:02PM +0200, Michal Hocko wrote:
> Hi,
> I just come across the following condition in __alloc_bootmem_node_high
> which I have hard times to understand. I guess it is a bug and we need
> something like the following. But, to be honest, I have no idea why we
> care about those 128MB above MAX_DMA32_PFN.
> ---
>  mm/bootmem.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/bootmem.c b/mm/bootmem.c
> index 0131170..5adb072 100644
> --- a/mm/bootmem.c
> +++ b/mm/bootmem.c
> @@ -737,7 +737,7 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
>  	/* update goal according ...MAX_DMA32_PFN */
>  	end_pfn = pgdat->node_start_pfn + pgdat->node_spanned_pages;
>  
> -	if (end_pfn > MAX_DMA32_PFN + (128 >> (20 - PAGE_SHIFT)) &&
> +	if (end_pfn > MAX_DMA32_PFN + (128 << (20 - PAGE_SHIFT)) &&
>  	    (goal >> PAGE_SHIFT) < MAX_DMA32_PFN) {
>  		void *ptr;
>  		unsigned long new_goal;

Regardless of x86 not using it, this is a bug fix and this code path
seems to be used by mips at least.  Michal, can you please post proper
signed-off patch?  The code is simply trying to use memory above DMA32
limit if there seems to be enough space (128M) to avoid unnecessarily
using DMA32 memory.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-20 18:29 ` Tejun Heo
@ 2012-04-20 19:14   ` Sam Ravnborg
  2012-04-20 19:29     ` Michal Hocko
  2012-04-20 19:30     ` Yinghai Lu
  0 siblings, 2 replies; 21+ messages in thread
From: Sam Ravnborg @ 2012-04-20 19:14 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Michal Hocko, yinghai, linux-mm, LKML

On Fri, Apr 20, 2012 at 11:29:07AM -0700, Tejun Heo wrote:
> On Tue, Apr 17, 2012 at 05:55:02PM +0200, Michal Hocko wrote:
> > Hi,
> > I just come across the following condition in __alloc_bootmem_node_high
> > which I have hard times to understand. I guess it is a bug and we need
> > something like the following. But, to be honest, I have no idea why we
> > care about those 128MB above MAX_DMA32_PFN.
> > ---
> >  mm/bootmem.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/bootmem.c b/mm/bootmem.c
> > index 0131170..5adb072 100644
> > --- a/mm/bootmem.c
> > +++ b/mm/bootmem.c
> > @@ -737,7 +737,7 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
> >  	/* update goal according ...MAX_DMA32_PFN */
> >  	end_pfn = pgdat->node_start_pfn + pgdat->node_spanned_pages;
> >  
> > -	if (end_pfn > MAX_DMA32_PFN + (128 >> (20 - PAGE_SHIFT)) &&
> > +	if (end_pfn > MAX_DMA32_PFN + (128 << (20 - PAGE_SHIFT)) &&
> >  	    (goal >> PAGE_SHIFT) < MAX_DMA32_PFN) {
> >  		void *ptr;
> >  		unsigned long new_goal;
> 
> Regardless of x86 not using it, this is a bug fix and this code path
> seems to be used by mips at least.

I took a quick look at this.
__alloc_bootmem_node_high() is used in mm/sparse.c - but only
if SPARSEMEM_VMEMMAP is enabled.

mips has this:

config ARCH_SPARSEMEM_ENABLE
        bool
        select SPARSEMEM_STATIC

So SPARSEMEM_VMEMMAP is not enabled.

__alloc_bootmem_node_high() is used in mm/sparse-vmemmap.c which
also depends on CONFIG_SPARSEMEM_VMEMMAP.


So I really do not see the logic in __alloc_bootmem_node_high()
being used anymore and it can be replaced by __alloc_bootmem_node()

	Sam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-20 19:14   ` Sam Ravnborg
@ 2012-04-20 19:29     ` Michal Hocko
  2012-04-20 19:32       ` Yinghai Lu
  2012-04-20 19:30     ` Yinghai Lu
  1 sibling, 1 reply; 21+ messages in thread
From: Michal Hocko @ 2012-04-20 19:29 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Tejun Heo, yinghai, linux-mm, LKML

On Fri 20-04-12 21:14:18, Sam Ravnborg wrote:
> On Fri, Apr 20, 2012 at 11:29:07AM -0700, Tejun Heo wrote:
> > On Tue, Apr 17, 2012 at 05:55:02PM +0200, Michal Hocko wrote:
> > > Hi,
> > > I just come across the following condition in __alloc_bootmem_node_high
> > > which I have hard times to understand. I guess it is a bug and we need
> > > something like the following. But, to be honest, I have no idea why we
> > > care about those 128MB above MAX_DMA32_PFN.
> > > ---
> > >  mm/bootmem.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/bootmem.c b/mm/bootmem.c
> > > index 0131170..5adb072 100644
> > > --- a/mm/bootmem.c
> > > +++ b/mm/bootmem.c
> > > @@ -737,7 +737,7 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
> > >  	/* update goal according ...MAX_DMA32_PFN */
> > >  	end_pfn = pgdat->node_start_pfn + pgdat->node_spanned_pages;
> > >  
> > > -	if (end_pfn > MAX_DMA32_PFN + (128 >> (20 - PAGE_SHIFT)) &&
> > > +	if (end_pfn > MAX_DMA32_PFN + (128 << (20 - PAGE_SHIFT)) &&
> > >  	    (goal >> PAGE_SHIFT) < MAX_DMA32_PFN) {
> > >  		void *ptr;
> > >  		unsigned long new_goal;
> > 
> > Regardless of x86 not using it, this is a bug fix and this code path
> > seems to be used by mips at least.
> 
> I took a quick look at this.
> __alloc_bootmem_node_high() is used in mm/sparse.c - but only
> if SPARSEMEM_VMEMMAP is enabled.

This is what I can see in the current (Linus) git:
./arch/sparc/Kconfig:   select SPARSEMEM_VMEMMAP_ENABLE
./arch/powerpc/Kconfig: select SPARSEMEM_VMEMMAP_ENABLE
./arch/ia64/Kconfig:    select SPARSEMEM_VMEMMAP_ENABLE
./arch/s390/Kconfig:    select SPARSEMEM_VMEMMAP_ENABLE
./arch/s390/Kconfig:    select SPARSEMEM_VMEMMAP
./arch/x86/Kconfig:     select SPARSEMEM_VMEMMAP_ENABLE if X86_64

So there are more arches which enable SPARSEMEM_VMEMMAP so the function
is used. Or am I missing something?

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-20 19:14   ` Sam Ravnborg
  2012-04-20 19:29     ` Michal Hocko
@ 2012-04-20 19:30     ` Yinghai Lu
  2012-04-20 19:43       ` Sam Ravnborg
  1 sibling, 1 reply; 21+ messages in thread
From: Yinghai Lu @ 2012-04-20 19:30 UTC (permalink / raw)
  To: Sam Ravnborg; +Cc: Tejun Heo, Michal Hocko, linux-mm, LKML

On Fri, Apr 20, 2012 at 12:14 PM, Sam Ravnborg <sam@ravnborg.org> wrote:
>
> I took a quick look at this.
> __alloc_bootmem_node_high() is used in mm/sparse.c - but only
> if SPARSEMEM_VMEMMAP is enabled.
>
> mips has this:
>
> config ARCH_SPARSEMEM_ENABLE
>        bool
>        select SPARSEMEM_STATIC
>
> So SPARSEMEM_VMEMMAP is not enabled.
>
> __alloc_bootmem_node_high() is used in mm/sparse-vmemmap.c which
> also depends on CONFIG_SPARSEMEM_VMEMMAP.
>
>
> So I really do not see the logic in __alloc_bootmem_node_high()
> being used anymore and it can be replaced by __alloc_bootmem_node()

Yes, you are right. __alloc_bootmem_node_high could be removed.

BTW, x86 is still the only one that use NO_BOOTMEM.

Are you working on making sparc to use NO_BOOTMEM?

Yinghai

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-20 19:29     ` Michal Hocko
@ 2012-04-20 19:32       ` Yinghai Lu
  2012-04-20 19:41         ` Michal Hocko
  0 siblings, 1 reply; 21+ messages in thread
From: Yinghai Lu @ 2012-04-20 19:32 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Sam Ravnborg, Tejun Heo, linux-mm, LKML

On Fri, Apr 20, 2012 at 12:29 PM, Michal Hocko <mhocko@suse.cz> wrote:
> This is what I can see in the current (Linus) git:
> ./arch/sparc/Kconfig:   select SPARSEMEM_VMEMMAP_ENABLE
> ./arch/powerpc/Kconfig: select SPARSEMEM_VMEMMAP_ENABLE
> ./arch/ia64/Kconfig:    select SPARSEMEM_VMEMMAP_ENABLE
> ./arch/s390/Kconfig:    select SPARSEMEM_VMEMMAP_ENABLE
> ./arch/s390/Kconfig:    select SPARSEMEM_VMEMMAP
> ./arch/x86/Kconfig:     select SPARSEMEM_VMEMMAP_ENABLE if X86_64
>
> So there are more arches which enable SPARSEMEM_VMEMMAP so the function
> is used. Or am I missing something?

MAX_DMA32_PFN is not defined for them.

I was think only x86 have that define. Actually mips have that defined too.

Yinghai

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-20 19:32       ` Yinghai Lu
@ 2012-04-20 19:41         ` Michal Hocko
  0 siblings, 0 replies; 21+ messages in thread
From: Michal Hocko @ 2012-04-20 19:41 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Sam Ravnborg, Tejun Heo, linux-mm, LKML

On Fri 20-04-12 12:32:38, Yinghai Lu wrote:
> On Fri, Apr 20, 2012 at 12:29 PM, Michal Hocko <mhocko@suse.cz> wrote:
> > This is what I can see in the current (Linus) git:
> > ./arch/sparc/Kconfig:   select SPARSEMEM_VMEMMAP_ENABLE
> > ./arch/powerpc/Kconfig: select SPARSEMEM_VMEMMAP_ENABLE
> > ./arch/ia64/Kconfig:    select SPARSEMEM_VMEMMAP_ENABLE
> > ./arch/s390/Kconfig:    select SPARSEMEM_VMEMMAP_ENABLE
> > ./arch/s390/Kconfig:    select SPARSEMEM_VMEMMAP
> > ./arch/x86/Kconfig:     select SPARSEMEM_VMEMMAP_ENABLE if X86_64
> >
> > So there are more arches which enable SPARSEMEM_VMEMMAP so the function
> > is used. Or am I missing something?
> 
> MAX_DMA32_PFN is not defined for them.

Ahh, you are right except that it is defined for x86 but that one uses
nobootmem. I missed that point.

Thanks
-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-20 19:30     ` Yinghai Lu
@ 2012-04-20 19:43       ` Sam Ravnborg
  2012-04-22 19:22         ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Sam Ravnborg @ 2012-04-20 19:43 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Tejun Heo, Michal Hocko, linux-mm, LKML

On Fri, Apr 20, 2012 at 12:30:54PM -0700, Yinghai Lu wrote:
> On Fri, Apr 20, 2012 at 12:14 PM, Sam Ravnborg <sam@ravnborg.org> wrote:
> >
> > I took a quick look at this.
> > __alloc_bootmem_node_high() is used in mm/sparse.c - but only
> > if SPARSEMEM_VMEMMAP is enabled.
> >
> > mips has this:
> >
> > config ARCH_SPARSEMEM_ENABLE
> >        bool
> >        select SPARSEMEM_STATIC
> >
> > So SPARSEMEM_VMEMMAP is not enabled.
> >
> > __alloc_bootmem_node_high() is used in mm/sparse-vmemmap.c which
> > also depends on CONFIG_SPARSEMEM_VMEMMAP.
> >
> >
> > So I really do not see the logic in __alloc_bootmem_node_high()
> > being used anymore and it can be replaced by __alloc_bootmem_node()
> 
> Yes, you are right. __alloc_bootmem_node_high could be removed.
> 
> BTW, x86 is still the only one that use NO_BOOTMEM.
> 
> Are you working on making sparc to use NO_BOOTMEM?

For now I am trying to convert sparc32 to
use memblock and NO_BOOTMEM in one step.

I have it almost finished - except that it does not work :-(
We have limitations in what area we can allocate very early,
and here I had to use the alloc_bootmem_low() variant.
I had preferred a variant that allowed me to allocate
bottom-up in this case.

For now I assume something is fishy in my code where I
hand over memory to the buddyallocator.
But before posting anything I need time to go through
my code and divide it up in smaller patches.

There is so far no changes to nobootmem / memblock code.

I will most likely convert sparc64 to NO_BOOTMEM next,
if it looks reasonable simple that is.
But first step is to get sparc32 working.

	Sam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-20 19:43       ` Sam Ravnborg
@ 2012-04-22 19:22         ` David Miller
  2012-04-22 20:05           ` Sam Ravnborg
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2012-04-22 19:22 UTC (permalink / raw)
  To: sam; +Cc: yinghai, tj, mhocko, linux-mm, linux-kernel

From: Sam Ravnborg <sam@ravnborg.org>
Date: Fri, 20 Apr 2012 21:43:09 +0200

> I have it almost finished - except that it does not work :-(
> We have limitations in what area we can allocate very early,
> and here I had to use the alloc_bootmem_low() variant.
> I had preferred a variant that allowed me to allocate
> bottom-up in this case.

I think you're going to have to bear down and map all of linear kernel
mappings before you start using the bootmem code rather than
afterwards.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-22 19:22         ` David Miller
@ 2012-04-22 20:05           ` Sam Ravnborg
  2012-04-23  2:00             ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Sam Ravnborg @ 2012-04-22 20:05 UTC (permalink / raw)
  To: David Miller; +Cc: yinghai, tj, mhocko, linux-mm, linux-kernel

On Sun, Apr 22, 2012 at 03:22:10PM -0400, David Miller wrote:
> From: Sam Ravnborg <sam@ravnborg.org>
> Date: Fri, 20 Apr 2012 21:43:09 +0200
> 
> > I have it almost finished - except that it does not work :-(
> > We have limitations in what area we can allocate very early,
> > and here I had to use the alloc_bootmem_low() variant.
> > I had preferred a variant that allowed me to allocate
> > bottom-up in this case.
> 
> I think you're going to have to bear down and map all of linear kernel
> mappings before you start using the bootmem code rather than
> afterwards.

Thanks - I will try to do so.
But I must admit I do not (yet) see any big difference in the
approach before/after.

I digged up an old todo regarding final link of vmlinux.
Will look at this and then re-visit memblock at sparc32.

	Sam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-22 20:05           ` Sam Ravnborg
@ 2012-04-23  2:00             ` David Miller
  2012-04-23  5:12               ` Sam Ravnborg
  2012-04-24  6:32               ` Sam Ravnborg
  0 siblings, 2 replies; 21+ messages in thread
From: David Miller @ 2012-04-23  2:00 UTC (permalink / raw)
  To: sam; +Cc: yinghai, tj, mhocko, linux-mm, linux-kernel


So here is a sparc64 conversion to NO_BOOTMEM.

Critically, I had to fix __alloc_bootmem_node to do what it promised
to do.  Which is retry the allocation without the goal if doing so
with the goal fails.

Otherwise all bootmem allocations with goal set to
__pa(MAX_DMA_ADDRESS) fail on sparc64, because we set that to ~0 so
effectively all such allocations evaluate roughly to "goal=~0,
limit=~0" which can never succeed.

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index db4e821..3763302 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -109,6 +109,9 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
 config NEED_PER_CPU_PAGE_FIRST_CHUNK
 	def_bool y if SPARC64
 
+config NO_BOOTMEM
+	def_bool y if SPARC64
+
 config MMU
 	bool
 	default y
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 21faaee..5b84559 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -741,7 +741,6 @@ static void __init find_ramdisk(unsigned long phys_base)
 struct node_mem_mask {
 	unsigned long mask;
 	unsigned long val;
-	unsigned long bootmem_paddr;
 };
 static struct node_mem_mask node_masks[MAX_NUMNODES];
 static int num_node_masks;
@@ -820,7 +819,7 @@ static u64 memblock_nid_range(u64 start, u64 end, int *nid)
  */
 static void __init allocate_node_data(int nid)
 {
-	unsigned long paddr, num_pages, start_pfn, end_pfn;
+	unsigned long paddr, start_pfn, end_pfn;
 	struct pglist_data *p;
 
 #ifdef CONFIG_NEED_MULTIPLE_NODES
@@ -832,7 +831,7 @@ static void __init allocate_node_data(int nid)
 	NODE_DATA(nid) = __va(paddr);
 	memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
 
-	NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
+	NODE_DATA(nid)->node_id = nid;
 #endif
 
 	p = NODE_DATA(nid);
@@ -840,18 +839,6 @@ static void __init allocate_node_data(int nid)
 	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
 	p->node_start_pfn = start_pfn;
 	p->node_spanned_pages = end_pfn - start_pfn;
-
-	if (p->node_spanned_pages) {
-		num_pages = bootmem_bootmap_pages(p->node_spanned_pages);
-
-		paddr = memblock_alloc_try_nid(num_pages << PAGE_SHIFT, PAGE_SIZE, nid);
-		if (!paddr) {
-			prom_printf("Cannot allocate bootmap for nid[%d]\n",
-				  nid);
-			prom_halt();
-		}
-		node_masks[nid].bootmem_paddr = paddr;
-	}
 }
 
 static void init_node_masks_nonnuma(void)
@@ -1292,75 +1279,9 @@ static void __init bootmem_init_nonnuma(void)
 	node_set_online(0);
 }
 
-static void __init reserve_range_in_node(int nid, unsigned long start,
-					 unsigned long end)
-{
-	numadbg("    reserve_range_in_node(nid[%d],start[%lx],end[%lx]\n",
-		nid, start, end);
-	while (start < end) {
-		unsigned long this_end;
-		int n;
-
-		this_end = memblock_nid_range(start, end, &n);
-		if (n == nid) {
-			numadbg("      MATCH reserving range [%lx:%lx]\n",
-				start, this_end);
-			reserve_bootmem_node(NODE_DATA(nid), start,
-					     (this_end - start), BOOTMEM_DEFAULT);
-		} else
-			numadbg("      NO MATCH, advancing start to %lx\n",
-				this_end);
-
-		start = this_end;
-	}
-}
-
-static void __init trim_reserved_in_node(int nid)
-{
-	struct memblock_region *reg;
-
-	numadbg("  trim_reserved_in_node(%d)\n", nid);
-
-	for_each_memblock(reserved, reg)
-		reserve_range_in_node(nid, reg->base, reg->base + reg->size);
-}
-
-static void __init bootmem_init_one_node(int nid)
-{
-	struct pglist_data *p;
-
-	numadbg("bootmem_init_one_node(%d)\n", nid);
-
-	p = NODE_DATA(nid);
-
-	if (p->node_spanned_pages) {
-		unsigned long paddr = node_masks[nid].bootmem_paddr;
-		unsigned long end_pfn;
-
-		end_pfn = p->node_start_pfn + p->node_spanned_pages;
-
-		numadbg("  init_bootmem_node(%d, %lx, %lx, %lx)\n",
-			nid, paddr >> PAGE_SHIFT, p->node_start_pfn, end_pfn);
-
-		init_bootmem_node(p, paddr >> PAGE_SHIFT,
-				  p->node_start_pfn, end_pfn);
-
-		numadbg("  free_bootmem_with_active_regions(%d, %lx)\n",
-			nid, end_pfn);
-		free_bootmem_with_active_regions(nid, end_pfn);
-
-		trim_reserved_in_node(nid);
-
-		numadbg("  sparse_memory_present_with_active_regions(%d)\n",
-			nid);
-		sparse_memory_present_with_active_regions(nid);
-	}
-}
-
 static unsigned long __init bootmem_init(unsigned long phys_base)
 {
 	unsigned long end_pfn;
-	int nid;
 
 	end_pfn = memblock_end_of_DRAM() >> PAGE_SHIFT;
 	max_pfn = max_low_pfn = end_pfn;
@@ -1369,11 +1290,12 @@ static unsigned long __init bootmem_init(unsigned long phys_base)
 	if (bootmem_init_numa() < 0)
 		bootmem_init_nonnuma();
 
-	/* XXX cpu notifier XXX */
+	/* Dump memblock with node info. */
+	memblock_dump_all();
 
-	for_each_online_node(nid)
-		bootmem_init_one_node(nid);
+	/* XXX cpu notifier XXX */
 
+	sparse_memory_present_with_active_regions(MAX_NUMNODES);
 	sparse_init();
 
 	return end_pfn;
@@ -1973,6 +1895,7 @@ void __init mem_init(void)
 					free_all_bootmem_node(NODE_DATA(i));
 			}
 		}
+		totalram_pages += free_low_memory_core_early(MAX_NUMNODES);
 	}
 #else
 	totalram_pages = free_all_bootmem();
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 24f0fc1..e53bb8a 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -298,13 +298,19 @@ void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
 	if (WARN_ON_ONCE(slab_is_available()))
 		return kzalloc_node(size, GFP_NOWAIT, pgdat->node_id);
 
+again:
 	ptr = __alloc_memory_core_early(pgdat->node_id, size, align,
 					 goal, -1ULL);
 	if (ptr)
 		return ptr;
 
-	return __alloc_memory_core_early(MAX_NUMNODES, size, align,
-					 goal, -1ULL);
+	ptr = __alloc_memory_core_early(MAX_NUMNODES, size, align,
+					goal, -1ULL);
+	if (!ptr && goal) {
+		goal = 0;
+		goto again;
+	}
+	return ptr;
 }
 
 void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-23  2:00             ` David Miller
@ 2012-04-23  5:12               ` Sam Ravnborg
  2012-04-24  6:32               ` Sam Ravnborg
  1 sibling, 0 replies; 21+ messages in thread
From: Sam Ravnborg @ 2012-04-23  5:12 UTC (permalink / raw)
  To: David Miller; +Cc: yinghai, tj, mhocko, linux-mm, linux-kernel

On Sun, Apr 22, 2012 at 10:00:54PM -0400, David Miller wrote:
> 
> So here is a sparc64 conversion to NO_BOOTMEM.
Great!

I looked briefly through the patch - no comments.
I will take a more carefull look tonight.

	Sam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-23  2:00             ` David Miller
  2012-04-23  5:12               ` Sam Ravnborg
@ 2012-04-24  6:32               ` Sam Ravnborg
  2012-04-24  7:00                 ` David Miller
  1 sibling, 1 reply; 21+ messages in thread
From: Sam Ravnborg @ 2012-04-24  6:32 UTC (permalink / raw)
  To: David Miller; +Cc: yinghai, tj, mhocko, linux-mm, linux-kernel

On Sun, Apr 22, 2012 at 10:00:54PM -0400, David Miller wrote:
> 
> So here is a sparc64 conversion to NO_BOOTMEM.
> 
> Critically, I had to fix __alloc_bootmem_node to do what it promised
> to do.  Which is retry the allocation without the goal if doing so
> with the goal fails.
> 
> Otherwise all bootmem allocations with goal set to
> __pa(MAX_DMA_ADDRESS) fail on sparc64, because we set that to ~0 so
> effectively all such allocations evaluate roughly to "goal=~0,
> limit=~0" which can never succeed.
> 
> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
> index db4e821..3763302 100644
> --- a/arch/sparc/Kconfig
> +++ b/arch/sparc/Kconfig
> @@ -109,6 +109,9 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>  config NEED_PER_CPU_PAGE_FIRST_CHUNK
>  	def_bool y if SPARC64
>  
> +config NO_BOOTMEM
> +	def_bool y if SPARC64

mm/Kconfig define NO_BOOTMEM so you can just add a "select NO_BOOTMEM"
to SPARC64.

	Sam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-24  6:32               ` Sam Ravnborg
@ 2012-04-24  7:00                 ` David Miller
  2012-04-27  3:32                   ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: David Miller @ 2012-04-24  7:00 UTC (permalink / raw)
  To: sam; +Cc: yinghai, tj, mhocko, linux-mm, linux-kernel

From: Sam Ravnborg <sam@ravnborg.org>
Date: Tue, 24 Apr 2012 08:32:36 +0200

> On Sun, Apr 22, 2012 at 10:00:54PM -0400, David Miller wrote:
>> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
>> index db4e821..3763302 100644
>> --- a/arch/sparc/Kconfig
>> +++ b/arch/sparc/Kconfig
>> @@ -109,6 +109,9 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>>  config NEED_PER_CPU_PAGE_FIRST_CHUNK
>>  	def_bool y if SPARC64
>>  
>> +config NO_BOOTMEM
>> +	def_bool y if SPARC64
> 
> mm/Kconfig define NO_BOOTMEM so you can just add a "select NO_BOOTMEM"
> to SPARC64.

I was merely following the lead on x86 :-) but yes it should
probably be a select.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Weirdness in __alloc_bootmem_node_high
  2012-04-24  7:00                 ` David Miller
@ 2012-04-27  3:32                   ` David Miller
  0 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2012-04-27  3:32 UTC (permalink / raw)
  To: sam; +Cc: yinghai, tj, mhocko, linux-mm, linux-kernel

From: David Miller <davem@davemloft.net>
Date: Tue, 24 Apr 2012 03:00:50 -0400 (EDT)

> From: Sam Ravnborg <sam@ravnborg.org>
> Date: Tue, 24 Apr 2012 08:32:36 +0200
> 
>> On Sun, Apr 22, 2012 at 10:00:54PM -0400, David Miller wrote:
>>> diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
>>> index db4e821..3763302 100644
>>> --- a/arch/sparc/Kconfig
>>> +++ b/arch/sparc/Kconfig
>>> @@ -109,6 +109,9 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>>>  config NEED_PER_CPU_PAGE_FIRST_CHUNK
>>>  	def_bool y if SPARC64
>>>  
>>> +config NO_BOOTMEM
>>> +	def_bool y if SPARC64
>> 
>> mm/Kconfig define NO_BOOTMEM so you can just add a "select NO_BOOTMEM"
>> to SPARC64.
> 
> I was merely following the lead on x86 :-) but yes it should
> probably be a select.

So I merged mainline into sparc-next to get the mm/nobootmem.c fix,
and then added in the sparc64 NO_BOOTMEM conversion.

Just FYI.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-04-27  3:34 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-17 15:55 Weirdness in __alloc_bootmem_node_high Michal Hocko
2012-04-17 17:12 ` Yinghai Lu
2012-04-17 17:32   ` Michal Hocko
2012-04-17 18:07     ` Yinghai Lu
2012-04-17 18:30       ` Sam Ravnborg
2012-04-17 21:33         ` Tejun Heo
2012-04-19 12:50       ` Michal Hocko
2012-04-20 18:29 ` Tejun Heo
2012-04-20 19:14   ` Sam Ravnborg
2012-04-20 19:29     ` Michal Hocko
2012-04-20 19:32       ` Yinghai Lu
2012-04-20 19:41         ` Michal Hocko
2012-04-20 19:30     ` Yinghai Lu
2012-04-20 19:43       ` Sam Ravnborg
2012-04-22 19:22         ` David Miller
2012-04-22 20:05           ` Sam Ravnborg
2012-04-23  2:00             ` David Miller
2012-04-23  5:12               ` Sam Ravnborg
2012-04-24  6:32               ` Sam Ravnborg
2012-04-24  7:00                 ` David Miller
2012-04-27  3:32                   ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).