All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-20 14:14 ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-08-20 14:14 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Andi Kleen, Wu Fengguang, Haicheng Li,
	Christoph Lameter, linux-kernel

Hi,
what do you think about the patch below?

>From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 20 Aug 2010 15:39:16 +0200
Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code

Currently is_mem_section_removable checks whether each pageblock from
the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
are false then the range is considered non removable.

On the other hand, offlining code (more specifically
set_migratetype_isolate) doesn't care whether a page is free and instead
it just checks the migrate type of the page and whether the page's zone
is movable.

This can lead into a situation when a node is marked as removable even
though all pages are neither MIGRATE_MOVABLE nor the zone is
ZONE_MOVABLE.

Also we can mark a node as not removable just because a pageblock is
MIGRATE_RESERVE and not free (and this situation is much more probable).
---
 mm/memory_hotplug.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..da20568 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 		type = get_pageblock_migratetype(page);
 
 		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
+		 * A pageblock containing MOVABLE or page from movable
+		 * zone are considered removable
 		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
 			return 0;
 
 		/*
-- 
1.7.1


-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-20 14:14 ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-08-20 14:14 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Andi Kleen, Wu Fengguang, Haicheng Li,
	Christoph Lameter, linux-kernel

Hi,
what do you think about the patch below?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-20 14:14 ` Michal Hocko
@ 2010-08-22  0:42   ` Wu Fengguang
  -1 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-22  0:42 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Andi Kleen, Haicheng Li,
	Christoph Lameter, linux-kernel

Hi Michal,

It helps to explain in changelog/code

- in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
  pages? And why the MIGRATE_MOVABLE test is still necessary given the
  ZONE_MOVABLE check?

- why do you think free pages are not removeable? Simply to cater for
  the set_migratetype_isolate() logic, or there are more fundamental
  reasons?

Thanks,
Fengguang

On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> Hi,
> what do you think about the patch below?
> 
> >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> 
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
> 
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
> 
> This can lead into a situation when a node is marked as removable even
> though all pages are neither MIGRATE_MOVABLE nor the zone is
> ZONE_MOVABLE.
> 
> Also we can mark a node as not removable just because a pageblock is
> MIGRATE_RESERVE and not free (and this situation is much more probable).
> ---
>  mm/memory_hotplug.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a4cfcdc..da20568 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
>  		type = get_pageblock_migratetype(page);
>  
>  		/*
> -		 * A pageblock containing MOVABLE or free pages is considered
> -		 * removable
> +		 * A pageblock containing MOVABLE or page from movable
> +		 * zone are considered removable
>  		 */
> -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
>  			return 0;
>  
>  		/*
> -- 
> 1.7.1
> 
> 
> -- 
> Michal Hocko
> L3 team 
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9    
> Czech Republic
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-22  0:42   ` Wu Fengguang
  0 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-22  0:42 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Andi Kleen, Haicheng Li,
	Christoph Lameter, linux-kernel

Hi Michal,

It helps to explain in changelog/code

- in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
  pages? And why the MIGRATE_MOVABLE test is still necessary given the
  ZONE_MOVABLE check?

- why do you think free pages are not removeable? Simply to cater for
  the set_migratetype_isolate() logic, or there are more fundamental
  reasons?

Thanks,
Fengguang

On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> Hi,
> what do you think about the patch below?
> 
> >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> 
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
> 
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
> 
> This can lead into a situation when a node is marked as removable even
> though all pages are neither MIGRATE_MOVABLE nor the zone is
> ZONE_MOVABLE.
> 
> Also we can mark a node as not removable just because a pageblock is
> MIGRATE_RESERVE and not free (and this situation is much more probable).
> ---
>  mm/memory_hotplug.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a4cfcdc..da20568 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
>  		type = get_pageblock_migratetype(page);
>  
>  		/*
> -		 * A pageblock containing MOVABLE or free pages is considered
> -		 * removable
> +		 * A pageblock containing MOVABLE or page from movable
> +		 * zone are considered removable
>  		 */
> -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
>  			return 0;
>  
>  		/*
> -- 
> 1.7.1
> 
> 
> -- 
> Michal Hocko
> L3 team 
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9    
> Czech Republic
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-22  0:42   ` Wu Fengguang
@ 2010-08-23  9:22     ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-08-23  9:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Andi Kleen, Haicheng Li,
	Christoph Lameter, linux-kernel

On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> Hi Michal,

Hi,

> 
> It helps to explain in changelog/code
> 
> - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
>   pages? 

page can be MIGRATE_RESERVE IIUC.

>   And why the MIGRATE_MOVABLE test is still necessary given the
>   ZONE_MOVABLE check?

I would assume that the MIGRATE_MOVABLE test is not necessary (given that
the whole zone is set as movable) but this test is used also in the
offlining path (in set_migratetype_isolate) and the primary reason for
this patch is to sync those two checks. 

I am not familiar with all the possible cases for migrate flags so the
test reduction should be better done by someone more familiar with the
code (the zone flag test is much more easier than the whole
get_pageblock_migratetype so this could be a win in the end).

> 
> - why do you think free pages are not removeable? Simply to cater for
>   the set_migratetype_isolate() logic, or there are more fundamental
>   reasons?

Free pages can be from non movable zone, right? I know that having a
zone with the free page blocks in non-movable zone is extremely 
improbable but what is the point of this check anyway? So yes, this is
more to be in sync than anything more fundamental.

> 
> Thanks,
> Fengguang
> 
> On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> > Hi,
> > what do you think about the patch below?
> > 
> > >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.cz>
> > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > 
> > Currently is_mem_section_removable checks whether each pageblock from
> > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > are false then the range is considered non removable.
> > 
> > On the other hand, offlining code (more specifically
> > set_migratetype_isolate) doesn't care whether a page is free and instead
> > it just checks the migrate type of the page and whether the page's zone
> > is movable.
> > 
> > This can lead into a situation when a node is marked as removable even
> > though all pages are neither MIGRATE_MOVABLE nor the zone is
> > ZONE_MOVABLE.
> > 
> > Also we can mark a node as not removable just because a pageblock is
> > MIGRATE_RESERVE and not free (and this situation is much more probable).
> > ---
> >  mm/memory_hotplug.c |    6 +++---
> >  1 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index a4cfcdc..da20568 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> >  		type = get_pageblock_migratetype(page);
> >  
> >  		/*
> > -		 * A pageblock containing MOVABLE or free pages is considered
> > -		 * removable
> > +		 * A pageblock containing MOVABLE or page from movable
> > +		 * zone are considered removable
> >  		 */
> > -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> > +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
> >  			return 0;
> >  
> >  		/*
> > -- 
> > 1.7.1
> > 
> > 
> > -- 
> > Michal Hocko
> > L3 team 
> > SUSE LINUX s.r.o.
> > Lihovarska 1060/12
> > 190 00 Praha 9    
> > Czech Republic
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-23  9:22     ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-08-23  9:22 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Andi Kleen, Haicheng Li,
	Christoph Lameter, linux-kernel

On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> Hi Michal,

Hi,

> 
> It helps to explain in changelog/code
> 
> - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
>   pages? 

page can be MIGRATE_RESERVE IIUC.

>   And why the MIGRATE_MOVABLE test is still necessary given the
>   ZONE_MOVABLE check?

I would assume that the MIGRATE_MOVABLE test is not necessary (given that
the whole zone is set as movable) but this test is used also in the
offlining path (in set_migratetype_isolate) and the primary reason for
this patch is to sync those two checks. 

I am not familiar with all the possible cases for migrate flags so the
test reduction should be better done by someone more familiar with the
code (the zone flag test is much more easier than the whole
get_pageblock_migratetype so this could be a win in the end).

> 
> - why do you think free pages are not removeable? Simply to cater for
>   the set_migratetype_isolate() logic, or there are more fundamental
>   reasons?

Free pages can be from non movable zone, right? I know that having a
zone with the free page blocks in non-movable zone is extremely 
improbable but what is the point of this check anyway? So yes, this is
more to be in sync than anything more fundamental.

> 
> Thanks,
> Fengguang
> 
> On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> > Hi,
> > what do you think about the patch below?
> > 
> > >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.cz>
> > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > 
> > Currently is_mem_section_removable checks whether each pageblock from
> > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > are false then the range is considered non removable.
> > 
> > On the other hand, offlining code (more specifically
> > set_migratetype_isolate) doesn't care whether a page is free and instead
> > it just checks the migrate type of the page and whether the page's zone
> > is movable.
> > 
> > This can lead into a situation when a node is marked as removable even
> > though all pages are neither MIGRATE_MOVABLE nor the zone is
> > ZONE_MOVABLE.
> > 
> > Also we can mark a node as not removable just because a pageblock is
> > MIGRATE_RESERVE and not free (and this situation is much more probable).
> > ---
> >  mm/memory_hotplug.c |    6 +++---
> >  1 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index a4cfcdc..da20568 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> >  		type = get_pageblock_migratetype(page);
> >  
> >  		/*
> > -		 * A pageblock containing MOVABLE or free pages is considered
> > -		 * removable
> > +		 * A pageblock containing MOVABLE or page from movable
> > +		 * zone are considered removable
> >  		 */
> > -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> > +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
> >  			return 0;
> >  
> >  		/*
> > -- 
> > 1.7.1
> > 
> > 
> > -- 
> > Michal Hocko
> > L3 team 
> > SUSE LINUX s.r.o.
> > Lihovarska 1060/12
> > 190 00 Praha 9    
> > Czech Republic
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-23  9:22     ` Michal Hocko
@ 2010-08-31 12:30       ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-08-31 12:30 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Andi Kleen, Haicheng Li,
	Christoph Lameter, linux-kernel

Hi,

On Mon 23-08-10 11:22:46, Michal Hocko wrote:
> On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > Hi Michal,
> 
> Hi,
> 
> > 
> > It helps to explain in changelog/code
> > 
> > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> >   pages? 
> 
> page can be MIGRATE_RESERVE IIUC.
> 
> >   And why the MIGRATE_MOVABLE test is still necessary given the
> >   ZONE_MOVABLE check?
> 
> I would assume that the MIGRATE_MOVABLE test is not necessary (given that
> the whole zone is set as movable) but this test is used also in the
> offlining path (in set_migratetype_isolate) and the primary reason for
> this patch is to sync those two checks. 
> 
> I am not familiar with all the possible cases for migrate flags so the
> test reduction should be better done by someone more familiar with the
> code (the zone flag test is much more easier than the whole
> get_pageblock_migratetype so this could be a win in the end).
> 
> > 
> > - why do you think free pages are not removeable? Simply to cater for
> >   the set_migratetype_isolate() logic, or there are more fundamental
> >   reasons?
> 
> Free pages can be from non movable zone, right? I know that having a
> zone with the free page blocks in non-movable zone is extremely 
> improbable but what is the point of this check anyway? So yes, this is
> more to be in sync than anything more fundamental.

Are there any other comments on this? Is the patch reasonable at all?

> 
> > 
> > Thanks,
> > Fengguang
> > 
> > On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> > > Hi,
> > > what do you think about the patch below?
> > > 
> > > >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.cz>
> > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > 
> > > Currently is_mem_section_removable checks whether each pageblock from
> > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > are false then the range is considered non removable.
> > > 
> > > On the other hand, offlining code (more specifically
> > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > it just checks the migrate type of the page and whether the page's zone
> > > is movable.
> > > 
> > > This can lead into a situation when a node is marked as removable even
> > > though all pages are neither MIGRATE_MOVABLE nor the zone is
> > > ZONE_MOVABLE.
> > > 
> > > Also we can mark a node as not removable just because a pageblock is
> > > MIGRATE_RESERVE and not free (and this situation is much more probable).
> > > ---
> > >  mm/memory_hotplug.c |    6 +++---
> > >  1 files changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > > index a4cfcdc..da20568 100644
> > > --- a/mm/memory_hotplug.c
> > > +++ b/mm/memory_hotplug.c
> > > @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> > >  		type = get_pageblock_migratetype(page);
> > >  
> > >  		/*
> > > -		 * A pageblock containing MOVABLE or free pages is considered
> > > -		 * removable
> > > +		 * A pageblock containing MOVABLE or page from movable
> > > +		 * zone are considered removable
> > >  		 */
> > > -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> > > +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
> > >  			return 0;
> > >  
> > >  		/*

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-31 12:30       ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-08-31 12:30 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Andi Kleen, Haicheng Li,
	Christoph Lameter, linux-kernel

Hi,

On Mon 23-08-10 11:22:46, Michal Hocko wrote:
> On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > Hi Michal,
> 
> Hi,
> 
> > 
> > It helps to explain in changelog/code
> > 
> > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> >   pages? 
> 
> page can be MIGRATE_RESERVE IIUC.
> 
> >   And why the MIGRATE_MOVABLE test is still necessary given the
> >   ZONE_MOVABLE check?
> 
> I would assume that the MIGRATE_MOVABLE test is not necessary (given that
> the whole zone is set as movable) but this test is used also in the
> offlining path (in set_migratetype_isolate) and the primary reason for
> this patch is to sync those two checks. 
> 
> I am not familiar with all the possible cases for migrate flags so the
> test reduction should be better done by someone more familiar with the
> code (the zone flag test is much more easier than the whole
> get_pageblock_migratetype so this could be a win in the end).
> 
> > 
> > - why do you think free pages are not removeable? Simply to cater for
> >   the set_migratetype_isolate() logic, or there are more fundamental
> >   reasons?
> 
> Free pages can be from non movable zone, right? I know that having a
> zone with the free page blocks in non-movable zone is extremely 
> improbable but what is the point of this check anyway? So yes, this is
> more to be in sync than anything more fundamental.

Are there any other comments on this? Is the patch reasonable at all?

> 
> > 
> > Thanks,
> > Fengguang
> > 
> > On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> > > Hi,
> > > what do you think about the patch below?
> > > 
> > > >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.cz>
> > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > 
> > > Currently is_mem_section_removable checks whether each pageblock from
> > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > are false then the range is considered non removable.
> > > 
> > > On the other hand, offlining code (more specifically
> > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > it just checks the migrate type of the page and whether the page's zone
> > > is movable.
> > > 
> > > This can lead into a situation when a node is marked as removable even
> > > though all pages are neither MIGRATE_MOVABLE nor the zone is
> > > ZONE_MOVABLE.
> > > 
> > > Also we can mark a node as not removable just because a pageblock is
> > > MIGRATE_RESERVE and not free (and this situation is much more probable).
> > > ---
> > >  mm/memory_hotplug.c |    6 +++---
> > >  1 files changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > > index a4cfcdc..da20568 100644
> > > --- a/mm/memory_hotplug.c
> > > +++ b/mm/memory_hotplug.c
> > > @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> > >  		type = get_pageblock_migratetype(page);
> > >  
> > >  		/*
> > > -		 * A pageblock containing MOVABLE or free pages is considered
> > > -		 * removable
> > > +		 * A pageblock containing MOVABLE or page from movable
> > > +		 * zone are considered removable
> > >  		 */
> > > -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> > > +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
> > >  			return 0;
> > >  
> > >  		/*

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-23  9:22     ` Michal Hocko
@ 2010-08-31 14:19       ` Wu Fengguang
  -1 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-31 14:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > Hi Michal,
> 
> Hi,
> 
> > 
> > It helps to explain in changelog/code
> > 
> > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> >   pages? 
> 
> page can be MIGRATE_RESERVE IIUC.

Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().

> >   And why the MIGRATE_MOVABLE test is still necessary given the
> >   ZONE_MOVABLE check?
> 
> I would assume that the MIGRATE_MOVABLE test is not necessary (given that
> the whole zone is set as movable) but this test is used also in the
> offlining path (in set_migratetype_isolate) and the primary reason for
> this patch is to sync those two checks. 

Merge the two checks into an inline function?

> I am not familiar with all the possible cases for migrate flags so the
> test reduction should be better done by someone more familiar with the
> code (the zone flag test is much more easier than the whole
> get_pageblock_migratetype so this could be a win in the end).

Feel free to swap the order of tests :)

> > 
> > - why do you think free pages are not removeable? Simply to cater for
> >   the set_migratetype_isolate() logic, or there are more fundamental
> >   reasons?
> 
> Free pages can be from non movable zone, right? I know that having a
> zone with the free page blocks in non-movable zone is extremely 
> improbable but what is the point of this check anyway? So yes, this is
> more to be in sync than anything more fundamental.

You don't have strong reasons to remove the free pages test, so why
not keep it? We never know what the user will do. He may regretted
immediately after onlining a node, and want to offline it..  Some
hackers may want to offline some 128MB memory blocks (by chance) with
the help of drop_caches.

Thanks,
Fengguang

> > On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> > > Hi,
> > > what do you think about the patch below?
> > > 
> > > >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.cz>
> > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > 
> > > Currently is_mem_section_removable checks whether each pageblock from
> > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > are false then the range is considered non removable.
> > > 
> > > On the other hand, offlining code (more specifically
> > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > it just checks the migrate type of the page and whether the page's zone
> > > is movable.
> > > 
> > > This can lead into a situation when a node is marked as removable even
> > > though all pages are neither MIGRATE_MOVABLE nor the zone is
> > > ZONE_MOVABLE.
> > > 
> > > Also we can mark a node as not removable just because a pageblock is
> > > MIGRATE_RESERVE and not free (and this situation is much more probable).
> > > ---
> > >  mm/memory_hotplug.c |    6 +++---
> > >  1 files changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > > index a4cfcdc..da20568 100644
> > > --- a/mm/memory_hotplug.c
> > > +++ b/mm/memory_hotplug.c
> > > @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> > >  		type = get_pageblock_migratetype(page);
> > >  
> > >  		/*
> > > -		 * A pageblock containing MOVABLE or free pages is considered
> > > -		 * removable
> > > +		 * A pageblock containing MOVABLE or page from movable
> > > +		 * zone are considered removable
> > >  		 */
> > > -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> > > +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
> > >  			return 0;
> > >  
> > >  		/*
> > > -- 
> > > 1.7.1
> > > 
> > > 
> > > -- 
> > > Michal Hocko
> > > L3 team 
> > > SUSE LINUX s.r.o.
> > > Lihovarska 1060/12
> > > 190 00 Praha 9    
> > > Czech Republic
> > > 
> > > --
> > > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > > the body to majordomo@kvack.org.  For more info on Linux MM,
> > > see: http://www.linux-mm.org/ .
> > > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 
> -- 
> Michal Hocko
> L3 team 
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9    
> Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-31 14:19       ` Wu Fengguang
  0 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-31 14:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > Hi Michal,
> 
> Hi,
> 
> > 
> > It helps to explain in changelog/code
> > 
> > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> >   pages? 
> 
> page can be MIGRATE_RESERVE IIUC.

Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().

> >   And why the MIGRATE_MOVABLE test is still necessary given the
> >   ZONE_MOVABLE check?
> 
> I would assume that the MIGRATE_MOVABLE test is not necessary (given that
> the whole zone is set as movable) but this test is used also in the
> offlining path (in set_migratetype_isolate) and the primary reason for
> this patch is to sync those two checks. 

Merge the two checks into an inline function?

> I am not familiar with all the possible cases for migrate flags so the
> test reduction should be better done by someone more familiar with the
> code (the zone flag test is much more easier than the whole
> get_pageblock_migratetype so this could be a win in the end).

Feel free to swap the order of tests :)

> > 
> > - why do you think free pages are not removeable? Simply to cater for
> >   the set_migratetype_isolate() logic, or there are more fundamental
> >   reasons?
> 
> Free pages can be from non movable zone, right? I know that having a
> zone with the free page blocks in non-movable zone is extremely 
> improbable but what is the point of this check anyway? So yes, this is
> more to be in sync than anything more fundamental.

You don't have strong reasons to remove the free pages test, so why
not keep it? We never know what the user will do. He may regretted
immediately after onlining a node, and want to offline it..  Some
hackers may want to offline some 128MB memory blocks (by chance) with
the help of drop_caches.

Thanks,
Fengguang

> > On Fri, Aug 20, 2010 at 04:14:00PM +0200, Michal Hocko wrote:
> > > Hi,
> > > what do you think about the patch below?
> > > 
> > > >From b983695b92b5be58f31c719fada1d3245f7b6768 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.cz>
> > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > 
> > > Currently is_mem_section_removable checks whether each pageblock from
> > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > are false then the range is considered non removable.
> > > 
> > > On the other hand, offlining code (more specifically
> > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > it just checks the migrate type of the page and whether the page's zone
> > > is movable.
> > > 
> > > This can lead into a situation when a node is marked as removable even
> > > though all pages are neither MIGRATE_MOVABLE nor the zone is
> > > ZONE_MOVABLE.
> > > 
> > > Also we can mark a node as not removable just because a pageblock is
> > > MIGRATE_RESERVE and not free (and this situation is much more probable).
> > > ---
> > >  mm/memory_hotplug.c |    6 +++---
> > >  1 files changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > > index a4cfcdc..da20568 100644
> > > --- a/mm/memory_hotplug.c
> > > +++ b/mm/memory_hotplug.c
> > > @@ -611,10 +611,10 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> > >  		type = get_pageblock_migratetype(page);
> > >  
> > >  		/*
> > > -		 * A pageblock containing MOVABLE or free pages is considered
> > > -		 * removable
> > > +		 * A pageblock containing MOVABLE or page from movable
> > > +		 * zone are considered removable
> > >  		 */
> > > -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> > > +		if (type != MIGRATE_MOVABLE && zone_idx(page) != ZONE_MOVABLE)
> > >  			return 0;
> > >  
> > >  		/*
> > > -- 
> > > 1.7.1
> > > 
> > > 
> > > -- 
> > > Michal Hocko
> > > L3 team 
> > > SUSE LINUX s.r.o.
> > > Lihovarska 1060/12
> > > 190 00 Praha 9    
> > > Czech Republic
> > > 
> > > --
> > > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > > the body to majordomo@kvack.org.  For more info on Linux MM,
> > > see: http://www.linux-mm.org/ .
> > > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 
> -- 
> Michal Hocko
> L3 team 
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9    
> Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-31 14:19       ` Wu Fengguang
@ 2010-08-31 14:36         ` Wu Fengguang
  -1 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-31 14:36 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Tue, Aug 31, 2010 at 10:19:42PM +0800, Wu Fengguang wrote:
> On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > Hi Michal,
> > 
> > Hi,
> > 
> > > 
> > > It helps to explain in changelog/code
> > > 
> > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > >   pages? 
> > 
> > page can be MIGRATE_RESERVE IIUC.
> 
> Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().

Ah a non-movable page allocation could fall back into the movable
zone. See __rmqueue_fallback() and the fallbacks[][] array. So the

        if (type != MIGRATE_MOVABLE && !pageblock_free(page))

check in is_mem_section_removable() is correct. It is
set_migratetype_isolate() that should be fixed to use the same check.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-31 14:36         ` Wu Fengguang
  0 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-31 14:36 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Tue, Aug 31, 2010 at 10:19:42PM +0800, Wu Fengguang wrote:
> On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > Hi Michal,
> > 
> > Hi,
> > 
> > > 
> > > It helps to explain in changelog/code
> > > 
> > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > >   pages? 
> > 
> > page can be MIGRATE_RESERVE IIUC.
> 
> Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().

Ah a non-movable page allocation could fall back into the movable
zone. See __rmqueue_fallback() and the fallbacks[][] array. So the

        if (type != MIGRATE_MOVABLE && !pageblock_free(page))

check in is_mem_section_removable() is correct. It is
set_migratetype_isolate() that should be fixed to use the same check.

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-31 14:36         ` Wu Fengguang
@ 2010-08-31 14:59           ` Wu Fengguang
  -1 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-31 14:59 UTC (permalink / raw)
  To: Mel Gorman, Michal Hocko
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel

On Tue, Aug 31, 2010 at 10:36:49PM +0800, Wu Fengguang wrote:
> On Tue, Aug 31, 2010 at 10:19:42PM +0800, Wu Fengguang wrote:
> > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > Hi Michal,
> > > 
> > > Hi,
> > > 
> > > > 
> > > > It helps to explain in changelog/code
> > > > 
> > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > >   pages? 
> > > 
> > > page can be MIGRATE_RESERVE IIUC.
> > 
> > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> 
> Ah a non-movable page allocation could fall back into the movable
> zone. See __rmqueue_fallback() and the fallbacks[][] array. So the
> 
>         if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> 
> check in is_mem_section_removable() is correct. It is
> set_migratetype_isolate() that should be fixed to use the same check.

Mel, I'm not familiar with the code, but it seems that in
__rmqueue_fallback(), set_pageblock_migratetype() is not always called
after move_freepages_block() successfully moved some pages. Then it's
possible a MIGRATE_MOVABLE page block will end up containing
non-movable pages?

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-08-31 14:59           ` Wu Fengguang
  0 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-08-31 14:59 UTC (permalink / raw)
  To: Mel Gorman, Michal Hocko
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel

On Tue, Aug 31, 2010 at 10:36:49PM +0800, Wu Fengguang wrote:
> On Tue, Aug 31, 2010 at 10:19:42PM +0800, Wu Fengguang wrote:
> > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > Hi Michal,
> > > 
> > > Hi,
> > > 
> > > > 
> > > > It helps to explain in changelog/code
> > > > 
> > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > >   pages? 
> > > 
> > > page can be MIGRATE_RESERVE IIUC.
> > 
> > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> 
> Ah a non-movable page allocation could fall back into the movable
> zone. See __rmqueue_fallback() and the fallbacks[][] array. So the
> 
>         if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> 
> check in is_mem_section_removable() is correct. It is
> set_migratetype_isolate() that should be fixed to use the same check.

Mel, I'm not familiar with the code, but it seems that in
__rmqueue_fallback(), set_pageblock_migratetype() is not always called
after move_freepages_block() successfully moved some pages. Then it's
possible a MIGRATE_MOVABLE page block will end up containing
non-movable pages?

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-31 14:36         ` Wu Fengguang
@ 2010-09-01  1:19           ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-01  1:19 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Michal Hocko, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Tue, 31 Aug 2010 22:36:49 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Tue, Aug 31, 2010 at 10:19:42PM +0800, Wu Fengguang wrote:
> > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > Hi Michal,
> > > 
> > > Hi,
> > > 
> > > > 
> > > > It helps to explain in changelog/code
> > > > 
> > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > >   pages? 
> > > 
> > > page can be MIGRATE_RESERVE IIUC.
> > 
> > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> 
> Ah a non-movable page allocation could fall back into the movable
> zone. See __rmqueue_fallback() and the fallbacks[][] array. So the
> 
>         if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> 
> check in is_mem_section_removable() is correct. It is
> set_migratetype_isolate() that should be fixed to use the same check.
> 

Here is a patch for set_migratetype_isolate(). This is not a _fix_ but an
improvement.

==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

At memory hotplug, we set pageblock's movable-type as ISOLATE. At failure,
we make it back to be MOVABLE. So,
	- pageblock's type shoule be movable before isolation.
	- pageblock's contents should be really movable before isolation.

Add document about it and add pageblock_free() call for fast-path.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 include/linux/mmzone.h |    4 ++++
 mm/memory_hotplug.c    |    2 +-
 mm/page_alloc.c        |   17 +++++++++++++----
 3 files changed, 18 insertions(+), 5 deletions(-)

Index: mmotm-0827/mm/page_alloc.c
===================================================================
--- mmotm-0827.orig/mm/page_alloc.c
+++ mmotm-0827/mm/page_alloc.c
@@ -5282,19 +5282,28 @@ int set_migratetype_isolate(struct page 
 	unsigned long immobile = 0;
 	struct memory_isolate_notify arg;
 	int notifier_ret;
-	int ret = -EBUSY;
+	int ret = 0;
 	int zone_idx;
 
 	zone = page_zone(page);
 	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
+	/*
+	 * At failure of hotplug, we turns this block to be MOVABLE if
+	 * isolation has been successfully done. So, if page-block is movable
+	 * or freed, we can try this without check contents..
+	 */
 	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
-		ret = 0;
+	    pageblock_free(page) ||
+	    zone_idx == ZONE_MOVABLE)
 		goto out;
-	}
 
+	ret = -EBUSY;
+	/*
+	 * check contents, because we move this pageblocks type to be MOVABLE
+	 * at failure, the contents should be movable.
+	 */
 	pfn = page_to_pfn(page);
 	arg.start_pfn = pfn;
 	arg.nr_pages = pageblock_nr_pages;
Index: mmotm-0827/include/linux/mmzone.h
===================================================================
--- mmotm-0827.orig/include/linux/mmzone.h
+++ mmotm-0827/include/linux/mmzone.h
@@ -54,6 +54,10 @@ static inline int get_pageblock_migratet
 	return get_pageblock_flags_group(page, PB_migrate, PB_migrate_end);
 }
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
+extern int pageblock_free(struct page *page);
+#endif
+
 struct free_area {
 	struct list_head	free_list[MIGRATE_TYPES];
 	unsigned long		nr_free;
Index: mmotm-0827/mm/memory_hotplug.c
===================================================================
--- mmotm-0827.orig/mm/memory_hotplug.c
+++ mmotm-0827/mm/memory_hotplug.c
@@ -576,7 +576,7 @@ EXPORT_SYMBOL_GPL(add_memory);
  * Due to buddy contraints, a free page at least the size of a pageblock will
  * be located at the start of the pageblock
  */
-static inline int pageblock_free(struct page *page)
+int pageblock_free(struct page *page)
 {
 	return PageBuddy(page) && page_order(page) >= pageblock_order;
 }


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-01  1:19           ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-01  1:19 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Michal Hocko, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Tue, 31 Aug 2010 22:36:49 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Tue, Aug 31, 2010 at 10:19:42PM +0800, Wu Fengguang wrote:
> > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > Hi Michal,
> > > 
> > > Hi,
> > > 
> > > > 
> > > > It helps to explain in changelog/code
> > > > 
> > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > >   pages? 
> > > 
> > > page can be MIGRATE_RESERVE IIUC.
> > 
> > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> 
> Ah a non-movable page allocation could fall back into the movable
> zone. See __rmqueue_fallback() and the fallbacks[][] array. So the
> 
>         if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> 
> check in is_mem_section_removable() is correct. It is
> set_migratetype_isolate() that should be fixed to use the same check.
> 

Here is a patch for set_migratetype_isolate(). This is not a _fix_ but an
improvement.

==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

At memory hotplug, we set pageblock's movable-type as ISOLATE. At failure,
we make it back to be MOVABLE. So,
	- pageblock's type shoule be movable before isolation.
	- pageblock's contents should be really movable before isolation.

Add document about it and add pageblock_free() call for fast-path.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 include/linux/mmzone.h |    4 ++++
 mm/memory_hotplug.c    |    2 +-
 mm/page_alloc.c        |   17 +++++++++++++----
 3 files changed, 18 insertions(+), 5 deletions(-)

Index: mmotm-0827/mm/page_alloc.c
===================================================================
--- mmotm-0827.orig/mm/page_alloc.c
+++ mmotm-0827/mm/page_alloc.c
@@ -5282,19 +5282,28 @@ int set_migratetype_isolate(struct page 
 	unsigned long immobile = 0;
 	struct memory_isolate_notify arg;
 	int notifier_ret;
-	int ret = -EBUSY;
+	int ret = 0;
 	int zone_idx;
 
 	zone = page_zone(page);
 	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
+	/*
+	 * At failure of hotplug, we turns this block to be MOVABLE if
+	 * isolation has been successfully done. So, if page-block is movable
+	 * or freed, we can try this without check contents..
+	 */
 	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
-		ret = 0;
+	    pageblock_free(page) ||
+	    zone_idx == ZONE_MOVABLE)
 		goto out;
-	}
 
+	ret = -EBUSY;
+	/*
+	 * check contents, because we move this pageblocks type to be MOVABLE
+	 * at failure, the contents should be movable.
+	 */
 	pfn = page_to_pfn(page);
 	arg.start_pfn = pfn;
 	arg.nr_pages = pageblock_nr_pages;
Index: mmotm-0827/include/linux/mmzone.h
===================================================================
--- mmotm-0827.orig/include/linux/mmzone.h
+++ mmotm-0827/include/linux/mmzone.h
@@ -54,6 +54,10 @@ static inline int get_pageblock_migratet
 	return get_pageblock_flags_group(page, PB_migrate, PB_migrate_end);
 }
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
+extern int pageblock_free(struct page *page);
+#endif
+
 struct free_area {
 	struct list_head	free_list[MIGRATE_TYPES];
 	unsigned long		nr_free;
Index: mmotm-0827/mm/memory_hotplug.c
===================================================================
--- mmotm-0827.orig/mm/memory_hotplug.c
+++ mmotm-0827/mm/memory_hotplug.c
@@ -576,7 +576,7 @@ EXPORT_SYMBOL_GPL(add_memory);
  * Due to buddy contraints, a free page at least the size of a pageblock will
  * be located at the start of the pageblock
  */
-static inline int pageblock_free(struct page *page)
+int pageblock_free(struct page *page)
 {
 	return PageBuddy(page) && page_order(page) >= pageblock_order;
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-08-31 14:19       ` Wu Fengguang
@ 2010-09-01 12:19         ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-01 12:19 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Tue 31-08-10 22:19:42, Wu Fengguang wrote:
> On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > Hi Michal,
> > 
> > Hi,
> > 
> > > 
> > > It helps to explain in changelog/code
> > > 
> > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > >   pages? 
> > 
> > page can be MIGRATE_RESERVE IIUC.
> 
> Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().

Doesn't it make sense to check for !MIGRATE_UNMOVABLE then?

> 
> > >   And why the MIGRATE_MOVABLE test is still necessary given the
> > >   ZONE_MOVABLE check?
> > 
> > I would assume that the MIGRATE_MOVABLE test is not necessary (given that
> > the whole zone is set as movable) but this test is used also in the
> > offlining path (in set_migratetype_isolate) and the primary reason for
> > this patch is to sync those two checks. 
> 
> Merge the two checks into an inline function?

This sounds reasonable. I will update my patch as soon as I find a
proper place for the function (I guess include/linux/mmzone.h is the
best place).

> 
> > I am not familiar with all the possible cases for migrate flags so the
> > test reduction should be better done by someone more familiar with the
> > code (the zone flag test is much more easier than the whole
> > get_pageblock_migratetype so this could be a win in the end).
> 
> Feel free to swap the order of tests :)
> 
> > > 
> > > - why do you think free pages are not removeable? Simply to cater for
> > >   the set_migratetype_isolate() logic, or there are more fundamental
> > >   reasons?
> > 
> > Free pages can be from non movable zone, right? I know that having a
> > zone with the free page blocks in non-movable zone is extremely 
> > improbable but what is the point of this check anyway? So yes, this is
> > more to be in sync than anything more fundamental.
> 
> You don't have strong reasons to remove the free pages test, so why
> not keep it? 

OK, I think I do understand the free pages test. It just says that
everyting that is free is potentially movable by definition because
nobody uses this memory, right?

> We never know what the user will do. He may regretted
> immediately after onlining a node, and want to offline it..  Some
> hackers may want to offline some 128MB memory blocks (by chance) with
> the help of drop_caches.

I am not sure I understand what you are saying here.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-01 12:19         ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-01 12:19 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Tue 31-08-10 22:19:42, Wu Fengguang wrote:
> On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > Hi Michal,
> > 
> > Hi,
> > 
> > > 
> > > It helps to explain in changelog/code
> > > 
> > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > >   pages? 
> > 
> > page can be MIGRATE_RESERVE IIUC.
> 
> Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().

Doesn't it make sense to check for !MIGRATE_UNMOVABLE then?

> 
> > >   And why the MIGRATE_MOVABLE test is still necessary given the
> > >   ZONE_MOVABLE check?
> > 
> > I would assume that the MIGRATE_MOVABLE test is not necessary (given that
> > the whole zone is set as movable) but this test is used also in the
> > offlining path (in set_migratetype_isolate) and the primary reason for
> > this patch is to sync those two checks. 
> 
> Merge the two checks into an inline function?

This sounds reasonable. I will update my patch as soon as I find a
proper place for the function (I guess include/linux/mmzone.h is the
best place).

> 
> > I am not familiar with all the possible cases for migrate flags so the
> > test reduction should be better done by someone more familiar with the
> > code (the zone flag test is much more easier than the whole
> > get_pageblock_migratetype so this could be a win in the end).
> 
> Feel free to swap the order of tests :)
> 
> > > 
> > > - why do you think free pages are not removeable? Simply to cater for
> > >   the set_migratetype_isolate() logic, or there are more fundamental
> > >   reasons?
> > 
> > Free pages can be from non movable zone, right? I know that having a
> > zone with the free page blocks in non-movable zone is extremely 
> > improbable but what is the point of this check anyway? So yes, this is
> > more to be in sync than anything more fundamental.
> 
> You don't have strong reasons to remove the free pages test, so why
> not keep it? 

OK, I think I do understand the free pages test. It just says that
everyting that is free is potentially movable by definition because
nobody uses this memory, right?

> We never know what the user will do. He may regretted
> immediately after onlining a node, and want to offline it..  Some
> hackers may want to offline some 128MB memory blocks (by chance) with
> the help of drop_caches.

I am not sure I understand what you are saying here.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-01 12:19         ` Michal Hocko
@ 2010-09-01 12:41           ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-01 12:41 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Wed 01-09-10 14:19:51, Michal Hocko wrote:
> On Tue 31-08-10 22:19:42, Wu Fengguang wrote:
> > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > Hi Michal,
> > > 
> > > Hi,
> > > 
> > > > 
> > > > It helps to explain in changelog/code
> > > > 
> > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > >   pages? 
> > > 
> > > page can be MIGRATE_RESERVE IIUC.
> > 
> > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> 
> Doesn't it make sense to check for !MIGRATE_UNMOVABLE then?

Something like the following patch.


>From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 20 Aug 2010 15:39:16 +0200
Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code

Currently is_mem_section_removable checks whether each pageblock from
the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
are false then the range is considered non removable.

On the other hand, offlining code (more specifically
set_migratetype_isolate) doesn't care whether a page is free and instead
it just checks the migrate type of the page and whether the page's zone
is movable.

This can lead into a situation when we can mark a node as not removable
just because a pageblock is MIGRATE_RESERVE and it is not free.

Let's make a common helper is_page_removable which unifies both tests
at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
possible MIGRATEable types.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/mmzone.h |   26 ++++++++++++++++++++++++++
 mm/memory_hotplug.c    |   19 +------------------
 mm/page_alloc.c        |    5 +----
 3 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6e6e626..7aaa272 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -669,6 +669,32 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
  */
 #define zone_idx(zone)		((zone) - (zone)->zone_pgdat->node_zones)
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/*
+ * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
+ * set and the size of the free page is given by page_order(). Using this,
+ * the function determines if the pageblock contains only free pages.
+ * Due to buddy contraints, a free page at least the size of a pageblock will
+ * be located at the start of the pageblock
+ */
+static inline int pageblock_free(struct page *page)
+{
+	return PageBuddy(page) && page_order(page) >= pageblock_order;
+}
+
+/*
+ * A free pageblock or the one which is movable or reclaimable is
+ * considered to be removable
+ */
+static inline bool is_page_removable(struct page *page)
+{
+	return get_pageblock_migratetype(page) != MIGRATE_UNMOVABLE
+		|| pageblock_free(page);
+}
+#else
+#define is_page_removable(p) 0
+#endif
+
 static inline int populated_zone(struct zone *zone)
 {
 	return (!!zone->present_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..66195b8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -569,17 +569,6 @@ out:
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-/*
- * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
- * set and the size of the free page is given by page_order(). Using this,
- * the function determines if the pageblock contains only free pages.
- * Due to buddy contraints, a free page at least the size of a pageblock will
- * be located at the start of the pageblock
- */
-static inline int pageblock_free(struct page *page)
-{
-	return PageBuddy(page) && page_order(page) >= pageblock_order;
-}
 
 /* Return the start of the next active pageblock after a given page */
 static struct page *next_active_pageblock(struct page *page)
@@ -608,13 +597,7 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_page_removable(page))
 			return 0;
 
 		/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a9649f4..c2e2576 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
 	struct memory_isolate_notify arg;
 	int notifier_ret;
 	int ret = -EBUSY;
-	int zone_idx;
 
 	zone = page_zone(page);
-	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
+	if (is_page_removable(page)) {
 		ret = 0;
 		goto out;
 	}
-- 
1.7.1


-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-01 12:41           ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-01 12:41 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Wed 01-09-10 14:19:51, Michal Hocko wrote:
> On Tue 31-08-10 22:19:42, Wu Fengguang wrote:
> > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > Hi Michal,
> > > 
> > > Hi,
> > > 
> > > > 
> > > > It helps to explain in changelog/code
> > > > 
> > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > >   pages? 
> > > 
> > > page can be MIGRATE_RESERVE IIUC.
> > 
> > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> 
> Doesn't it make sense to check for !MIGRATE_UNMOVABLE then?

Something like the following patch.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-01 12:41           ` Michal Hocko
@ 2010-09-02  5:45             ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-02  5:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Wed, 1 Sep 2010 14:41:38 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Wed 01-09-10 14:19:51, Michal Hocko wrote:
> > On Tue 31-08-10 22:19:42, Wu Fengguang wrote:
> > > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > > Hi Michal,
> > > > 
> > > > Hi,
> > > > 
> > > > > 
> > > > > It helps to explain in changelog/code
> > > > > 
> > > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > > >   pages? 
> > > > 
> > > > page can be MIGRATE_RESERVE IIUC.
> > > 
> > > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> > 
> > Doesn't it make sense to check for !MIGRATE_UNMOVABLE then?
> 
> Something like the following patch.
> 
> 
> From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> 
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
> 
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
> 
> This can lead into a situation when we can mark a node as not removable
> just because a pageblock is MIGRATE_RESERVE and it is not free.
> 
> Let's make a common helper is_page_removable which unifies both tests
> at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> possible MIGRATEable types.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>

Hmm..Why MIGRATE_RECLAIMABLE is included ?

If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
failure of memory hotplug.

Original code checks.

 - the range is MIGRAGE_MOVABLE or
 - the range includes only free pages and LRU pages.

Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
leads us to more fragmentated situation ?

Thanks,
-Kame



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02  5:45             ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-02  5:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Wed, 1 Sep 2010 14:41:38 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Wed 01-09-10 14:19:51, Michal Hocko wrote:
> > On Tue 31-08-10 22:19:42, Wu Fengguang wrote:
> > > On Mon, Aug 23, 2010 at 05:22:46PM +0800, Michal Hocko wrote:
> > > > On Sun 22-08-10 08:42:32, Wu Fengguang wrote:
> > > > > Hi Michal,
> > > > 
> > > > Hi,
> > > > 
> > > > > 
> > > > > It helps to explain in changelog/code
> > > > > 
> > > > > - in what situation a ZONE_MOVABLE will contain !MIGRATE_MOVABLE
> > > > >   pages? 
> > > > 
> > > > page can be MIGRATE_RESERVE IIUC.
> > > 
> > > Yup, it may also be set to MIGRATE_ISOLATE by soft_offline_page().
> > 
> > Doesn't it make sense to check for !MIGRATE_UNMOVABLE then?
> 
> Something like the following patch.
> 
> 
> From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> 
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
> 
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
> 
> This can lead into a situation when we can mark a node as not removable
> just because a pageblock is MIGRATE_RESERVE and it is not free.
> 
> Let's make a common helper is_page_removable which unifies both tests
> at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> possible MIGRATEable types.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>

Hmm..Why MIGRATE_RECLAIMABLE is included ?

If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
failure of memory hotplug.

Original code checks.

 - the range is MIGRAGE_MOVABLE or
 - the range includes only free pages and LRU pages.

Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
leads us to more fragmentated situation ?

Thanks,
-Kame


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02  5:45             ` KAMEZAWA Hiroyuki
@ 2010-09-02  8:28               ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02  8:28 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
> On Wed, 1 Sep 2010 14:41:38 +0200
[...]
> > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.cz>
> > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > 
> > Currently is_mem_section_removable checks whether each pageblock from
> > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > are false then the range is considered non removable.
> > 
> > On the other hand, offlining code (more specifically
> > set_migratetype_isolate) doesn't care whether a page is free and instead
> > it just checks the migrate type of the page and whether the page's zone
> > is movable.
> > 
> > This can lead into a situation when we can mark a node as not removable
> > just because a pageblock is MIGRATE_RESERVE and it is not free.
> > 
> > Let's make a common helper is_page_removable which unifies both tests
> > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> > possible MIGRATEable types.
> > 
> > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> 
> Hmm..Why MIGRATE_RECLAIMABLE is included ?

AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
Why should we prevent from memory offlining if we have some reclaimable
pages? Or am I totally misinterpreting the meaning of this flag?

> 
> If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
> range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
> failure of memory hotplug.
> 
> Original code checks.
> 
>  - the range is MIGRAGE_MOVABLE or
>  - the range includes only free pages and LRU pages.
> 
> Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
> Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
> leads us to more fragmentated situation ?

Just to be sure that I understand you concern. We are talking about hot
remove failure which can lead to higher fragmentation, right? 

By the higher fragmentation you mean that all movable pageblocks (even
reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
worst case, if we fail near the end of the zone then there is imbalance
in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
thinking of? Doesn't this just gets the zone to the state after
onlining? Or is the problem if we fail somewhere in the middle?

> 
> Thanks,
> -Kame
> 
> 

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02  8:28               ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02  8:28 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
> On Wed, 1 Sep 2010 14:41:38 +0200
[...]
> > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.cz>
> > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > 
> > Currently is_mem_section_removable checks whether each pageblock from
> > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > are false then the range is considered non removable.
> > 
> > On the other hand, offlining code (more specifically
> > set_migratetype_isolate) doesn't care whether a page is free and instead
> > it just checks the migrate type of the page and whether the page's zone
> > is movable.
> > 
> > This can lead into a situation when we can mark a node as not removable
> > just because a pageblock is MIGRATE_RESERVE and it is not free.
> > 
> > Let's make a common helper is_page_removable which unifies both tests
> > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> > possible MIGRATEable types.
> > 
> > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> 
> Hmm..Why MIGRATE_RECLAIMABLE is included ?

AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
Why should we prevent from memory offlining if we have some reclaimable
pages? Or am I totally misinterpreting the meaning of this flag?

> 
> If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
> range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
> failure of memory hotplug.
> 
> Original code checks.
> 
>  - the range is MIGRAGE_MOVABLE or
>  - the range includes only free pages and LRU pages.
> 
> Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
> Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
> leads us to more fragmentated situation ?

Just to be sure that I understand you concern. We are talking about hot
remove failure which can lead to higher fragmentation, right? 

By the higher fragmentation you mean that all movable pageblocks (even
reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
worst case, if we fail near the end of the zone then there is imbalance
in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
thinking of? Doesn't this just gets the zone to the state after
onlining? Or is the problem if we fail somewhere in the middle?

> 
> Thanks,
> -Kame
> 
> 

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02  8:28               ` Michal Hocko
@ 2010-09-02  9:03                 ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-02  9:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Thu, 2 Sep 2010 10:28:29 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
> > On Wed, 1 Sep 2010 14:41:38 +0200
> [...]
> > > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.cz>
> > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > 
> > > Currently is_mem_section_removable checks whether each pageblock from
> > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > are false then the range is considered non removable.
> > > 
> > > On the other hand, offlining code (more specifically
> > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > it just checks the migrate type of the page and whether the page's zone
> > > is movable.
> > > 
> > > This can lead into a situation when we can mark a node as not removable
> > > just because a pageblock is MIGRATE_RESERVE and it is not free.
> > > 
> > > Let's make a common helper is_page_removable which unifies both tests
> > > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> > > possible MIGRATEable types.
> > > 
> > > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> > 
> > Hmm..Why MIGRATE_RECLAIMABLE is included ?
> 
> AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
> is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
> Why should we prevent from memory offlining if we have some reclaimable
> pages? Or am I totally misinterpreting the meaning of this flag?
> 

RECLAIMABLE cannot be 100% reclaimable. Then, for memory hotlug,
I intentionally skips it and check free_area[] and LRU.


> > 
> > If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
> > range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
> > failure of memory hotplug.
> > 
> > Original code checks.
> > 
> >  - the range is MIGRAGE_MOVABLE or
> >  - the range includes only free pages and LRU pages.
> > 
> > Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
> > Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
> > leads us to more fragmentated situation ?
> 
> Just to be sure that I understand you concern. We are talking about hot
> remove failure which can lead to higher fragmentation, right? 
> 
right. 

> By the higher fragmentation you mean that all movable pageblocks (even
> reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
> worst case, if we fail near the end of the zone then there is imbalance
> in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
> thinking of? Doesn't this just gets the zone to the state after
> onlining? Or is the problem if we fail somewhere in the middle?
> 

No. My concern is pageblock type changes before/after memory hotplug failure.
	before isolation: MIGRATE_RECLAIMABLE
	after isolation failure : MIGRATE_MOVABLE

Then, the section which was RECALAIMABLE (but caused memory hotplug failure)
turns to be MIGRATE_MOVABLE and will continue to cause memory hotplug failure.
(Because it contains unreclaimable(still-in-use) slab.)

That means memory-hotplug success-rate goes down because of not-important check,
and (your) customer believe "memory hotplug never works well hahaha."

The old code checks RECLAIMABLE pageblock only contains free pages or LRU pages,
In that meaning, MIGRATE_MOVABLE check itself should be removed. It's my fault.


Thanks,
-Kame


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02  9:03                 ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-02  9:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Thu, 2 Sep 2010 10:28:29 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
> > On Wed, 1 Sep 2010 14:41:38 +0200
> [...]
> > > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@suse.cz>
> > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > 
> > > Currently is_mem_section_removable checks whether each pageblock from
> > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > are false then the range is considered non removable.
> > > 
> > > On the other hand, offlining code (more specifically
> > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > it just checks the migrate type of the page and whether the page's zone
> > > is movable.
> > > 
> > > This can lead into a situation when we can mark a node as not removable
> > > just because a pageblock is MIGRATE_RESERVE and it is not free.
> > > 
> > > Let's make a common helper is_page_removable which unifies both tests
> > > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> > > possible MIGRATEable types.
> > > 
> > > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> > 
> > Hmm..Why MIGRATE_RECLAIMABLE is included ?
> 
> AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
> is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
> Why should we prevent from memory offlining if we have some reclaimable
> pages? Or am I totally misinterpreting the meaning of this flag?
> 

RECLAIMABLE cannot be 100% reclaimable. Then, for memory hotlug,
I intentionally skips it and check free_area[] and LRU.


> > 
> > If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
> > range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
> > failure of memory hotplug.
> > 
> > Original code checks.
> > 
> >  - the range is MIGRAGE_MOVABLE or
> >  - the range includes only free pages and LRU pages.
> > 
> > Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
> > Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
> > leads us to more fragmentated situation ?
> 
> Just to be sure that I understand you concern. We are talking about hot
> remove failure which can lead to higher fragmentation, right? 
> 
right. 

> By the higher fragmentation you mean that all movable pageblocks (even
> reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
> worst case, if we fail near the end of the zone then there is imbalance
> in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
> thinking of? Doesn't this just gets the zone to the state after
> onlining? Or is the problem if we fail somewhere in the middle?
> 

No. My concern is pageblock type changes before/after memory hotplug failure.
	before isolation: MIGRATE_RECLAIMABLE
	after isolation failure : MIGRATE_MOVABLE

Then, the section which was RECALAIMABLE (but caused memory hotplug failure)
turns to be MIGRATE_MOVABLE and will continue to cause memory hotplug failure.
(Because it contains unreclaimable(still-in-use) slab.)

That means memory-hotplug success-rate goes down because of not-important check,
and (your) customer believe "memory hotplug never works well hahaha."

The old code checks RECLAIMABLE pageblock only contains free pages or LRU pages,
In that meaning, MIGRATE_MOVABLE check itself should be removed. It's my fault.


Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02  9:03                 ` KAMEZAWA Hiroyuki
@ 2010-09-02  9:24                   ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02  9:24 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 18:03:43, KAMEZAWA Hiroyuki wrote:
> On Thu, 2 Sep 2010 10:28:29 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> > On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
> > > On Wed, 1 Sep 2010 14:41:38 +0200
> > [...]
> > > > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> > > > From: Michal Hocko <mhocko@suse.cz>
> > > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > > 
> > > > Currently is_mem_section_removable checks whether each pageblock from
> > > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > > are false then the range is considered non removable.
> > > > 
> > > > On the other hand, offlining code (more specifically
> > > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > > it just checks the migrate type of the page and whether the page's zone
> > > > is movable.
> > > > 
> > > > This can lead into a situation when we can mark a node as not removable
> > > > just because a pageblock is MIGRATE_RESERVE and it is not free.
> > > > 
> > > > Let's make a common helper is_page_removable which unifies both tests
> > > > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> > > > possible MIGRATEable types.
> > > > 
> > > > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> > > 
> > > Hmm..Why MIGRATE_RECLAIMABLE is included ?
> > 
> > AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
> > is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
> > Why should we prevent from memory offlining if we have some reclaimable
> > pages? Or am I totally misinterpreting the meaning of this flag?
> > 
> 
> RECLAIMABLE cannot be 100% reclaimable. 

OK, I see. The name is little bit misleading then. Should we comment
that?

> Then, for memory hotlug,
> I intentionally skips it and check free_area[] and LRU.
> 
> 
> > > 
> > > If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
> > > range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
> > > failure of memory hotplug.
> > > 
> > > Original code checks.
> > > 
> > >  - the range is MIGRAGE_MOVABLE or
> > >  - the range includes only free pages and LRU pages.
> > > 
> > > Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
> > > Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
> > > leads us to more fragmentated situation ?
> > 
> > Just to be sure that I understand you concern. We are talking about hot
> > remove failure which can lead to higher fragmentation, right? 
> > 
> right. 
> 
> > By the higher fragmentation you mean that all movable pageblocks (even
> > reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
> > worst case, if we fail near the end of the zone then there is imbalance
> > in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
> > thinking of? Doesn't this just gets the zone to the state after
> > onlining? Or is the problem if we fail somewhere in the middle?
> > 
> 
> No. My concern is pageblock type changes before/after memory hotplug failure.
> 	before isolation: MIGRATE_RECLAIMABLE
> 	after isolation failure : MIGRATE_MOVABLE

Ahh, OK I can see your point now. unset_migratetype_isolate called on
the failure path sets migrate type unconditionally as it cannot know
what was the original migration type.

What about MIGRATE_RESERVE? Is there anything that can make those
allocations fail offlining?

Thanks!

> 
> Then, the section which was RECALAIMABLE (but caused memory hotplug failure)
> turns to be MIGRATE_MOVABLE and will continue to cause memory hotplug failure.
> (Because it contains unreclaimable(still-in-use) slab.)
> 
> That means memory-hotplug success-rate goes down because of not-important check,
> and (your) customer believe "memory hotplug never works well hahaha."
> 
> The old code checks RECLAIMABLE pageblock only contains free pages or LRU pages,
> In that meaning, MIGRATE_MOVABLE check itself should be removed. It's my fault.
> 
> 
> Thanks,
> -Kame
> 

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02  9:24                   ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02  9:24 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Wu Fengguang, linux-mm, Andrew Morton, Kleen, Andi, Haicheng Li,
	Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 18:03:43, KAMEZAWA Hiroyuki wrote:
> On Thu, 2 Sep 2010 10:28:29 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> > On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
> > > On Wed, 1 Sep 2010 14:41:38 +0200
> > [...]
> > > > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
> > > > From: Michal Hocko <mhocko@suse.cz>
> > > > Date: Fri, 20 Aug 2010 15:39:16 +0200
> > > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> > > > 
> > > > Currently is_mem_section_removable checks whether each pageblock from
> > > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> > > > are false then the range is considered non removable.
> > > > 
> > > > On the other hand, offlining code (more specifically
> > > > set_migratetype_isolate) doesn't care whether a page is free and instead
> > > > it just checks the migrate type of the page and whether the page's zone
> > > > is movable.
> > > > 
> > > > This can lead into a situation when we can mark a node as not removable
> > > > just because a pageblock is MIGRATE_RESERVE and it is not free.
> > > > 
> > > > Let's make a common helper is_page_removable which unifies both tests
> > > > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
> > > > possible MIGRATEable types.
> > > > 
> > > > Signed-off-by: Michal Hocko <mhocko@suse.cz>
> > > 
> > > Hmm..Why MIGRATE_RECLAIMABLE is included ?
> > 
> > AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
> > is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
> > Why should we prevent from memory offlining if we have some reclaimable
> > pages? Or am I totally misinterpreting the meaning of this flag?
> > 
> 
> RECLAIMABLE cannot be 100% reclaimable. 

OK, I see. The name is little bit misleading then. Should we comment
that?

> Then, for memory hotlug,
> I intentionally skips it and check free_area[] and LRU.
> 
> 
> > > 
> > > If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
> > > range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
> > > failure of memory hotplug.
> > > 
> > > Original code checks.
> > > 
> > >  - the range is MIGRAGE_MOVABLE or
> > >  - the range includes only free pages and LRU pages.
> > > 
> > > Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
> > > Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
> > > leads us to more fragmentated situation ?
> > 
> > Just to be sure that I understand you concern. We are talking about hot
> > remove failure which can lead to higher fragmentation, right? 
> > 
> right. 
> 
> > By the higher fragmentation you mean that all movable pageblocks (even
> > reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
> > worst case, if we fail near the end of the zone then there is imbalance
> > in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
> > thinking of? Doesn't this just gets the zone to the state after
> > onlining? Or is the problem if we fail somewhere in the middle?
> > 
> 
> No. My concern is pageblock type changes before/after memory hotplug failure.
> 	before isolation: MIGRATE_RECLAIMABLE
> 	after isolation failure : MIGRATE_MOVABLE

Ahh, OK I can see your point now. unset_migratetype_isolate called on
the failure path sets migrate type unconditionally as it cannot know
what was the original migration type.

What about MIGRATE_RESERVE? Is there anything that can make those
allocations fail offlining?

Thanks!

> 
> Then, the section which was RECALAIMABLE (but caused memory hotplug failure)
> turns to be MIGRATE_MOVABLE and will continue to cause memory hotplug failure.
> (Because it contains unreclaimable(still-in-use) slab.)
> 
> That means memory-hotplug success-rate goes down because of not-important check,
> and (your) customer believe "memory hotplug never works well hahaha."
> 
> The old code checks RECLAIMABLE pageblock only contains free pages or LRU pages,
> In that meaning, MIGRATE_MOVABLE check itself should be removed. It's my fault.
> 
> 
> Thanks,
> -Kame
> 

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02  9:24                   ` Michal Hocko
@ 2010-09-02 11:19                     ` Hiroyuki Kamezawa
  -1 siblings, 0 replies; 72+ messages in thread
From: Hiroyuki Kamezawa @ 2010-09-02 11:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

2010/9/2 Michal Hocko <mhocko@suse.cz>:
> On Thu 02-09-10 18:03:43, KAMEZAWA Hiroyuki wrote:
>> On Thu, 2 Sep 2010 10:28:29 +0200
>> Michal Hocko <mhocko@suse.cz> wrote:
>>
>> > On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
>> > > On Wed, 1 Sep 2010 14:41:38 +0200
>> > [...]
>> > > > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
>> > > > From: Michal Hocko <mhocko@suse.cz>
>> > > > Date: Fri, 20 Aug 2010 15:39:16 +0200
>> > > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
>> > > >
>> > > > Currently is_mem_section_removable checks whether each pageblock from
>> > > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
>> > > > are false then the range is considered non removable.
>> > > >
>> > > > On the other hand, offlining code (more specifically
>> > > > set_migratetype_isolate) doesn't care whether a page is free and instead
>> > > > it just checks the migrate type of the page and whether the page's zone
>> > > > is movable.
>> > > >
>> > > > This can lead into a situation when we can mark a node as not removable
>> > > > just because a pageblock is MIGRATE_RESERVE and it is not free.
>> > > >
>> > > > Let's make a common helper is_page_removable which unifies both tests
>> > > > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
>> > > > possible MIGRATEable types.
>> > > >
>> > > > Signed-off-by: Michal Hocko <mhocko@suse.cz>
>> > >
>> > > Hmm..Why MIGRATE_RECLAIMABLE is included ?
>> >
>> > AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
>> > is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
>> > Why should we prevent from memory offlining if we have some reclaimable
>> > pages? Or am I totally misinterpreting the meaning of this flag?
>> >
>>
>> RECLAIMABLE cannot be 100% reclaimable.
>
> OK, I see. The name is little bit misleading then. Should we comment
> that?
>
>> Then, for memory hotlug,
>> I intentionally skips it and check free_area[] and LRU.
>>
>>
>> > >
>> > > If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
>> > > range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
>> > > failure of memory hotplug.
>> > >
>> > > Original code checks.
>> > >
>> > >  - the range is MIGRAGE_MOVABLE or
>> > >  - the range includes only free pages and LRU pages.
>> > >
>> > > Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
>> > > Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
>> > > leads us to more fragmentated situation ?
>> >
>> > Just to be sure that I understand you concern. We are talking about hot
>> > remove failure which can lead to higher fragmentation, right?
>> >
>> right.
>>
>> > By the higher fragmentation you mean that all movable pageblocks (even
>> > reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
>> > worst case, if we fail near the end of the zone then there is imbalance
>> > in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
>> > thinking of? Doesn't this just gets the zone to the state after
>> > onlining? Or is the problem if we fail somewhere in the middle?
>> >
>>
>> No. My concern is pageblock type changes before/after memory hotplug failure.
>>       before isolation: MIGRATE_RECLAIMABLE
>>       after isolation failure : MIGRATE_MOVABLE
>
> Ahh, OK I can see your point now. unset_migratetype_isolate called on
> the failure path sets migrate type unconditionally as it cannot know
> what was the original migration type.
>
Right.

> What about MIGRATE_RESERVE? Is there anything that can make those
> allocations fail offlining?
>
MIGRATE_RESERVE can contain several typs of pages, mixture of movable/unmovable
pages.

IIRC, my 1st version of code of set_migratetype_isolate() just checks
zone_idx and
I think checking MIGRATE_TYPE is my mistake.
(As Mel explained, it can be a mixture of several types.)

So, how about using the latter half of set_migratetype_isolate()'s check ?
It checks that the given range just includes free pages and LRU pages.
It's 100% accurate and more trustable than migrate_type check.

Whatever migratetype the pageblock has, if the block only contains free pages
and lru pages, changing the type as MOVABLE (at failure) is not very bad.

(Or, checking contents of pageblock in failure path and set proper
MIGRATE type.)

Anyway, not very difficult. Just a bit larger patch than you have.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02 11:19                     ` Hiroyuki Kamezawa
  0 siblings, 0 replies; 72+ messages in thread
From: Hiroyuki Kamezawa @ 2010-09-02 11:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

2010/9/2 Michal Hocko <mhocko@suse.cz>:
> On Thu 02-09-10 18:03:43, KAMEZAWA Hiroyuki wrote:
>> On Thu, 2 Sep 2010 10:28:29 +0200
>> Michal Hocko <mhocko@suse.cz> wrote:
>>
>> > On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
>> > > On Wed, 1 Sep 2010 14:41:38 +0200
>> > [...]
>> > > > From de85f1aa42115678d3340f0448cd798577036496 Mon Sep 17 00:00:00 2001
>> > > > From: Michal Hocko <mhocko@suse.cz>
>> > > > Date: Fri, 20 Aug 2010 15:39:16 +0200
>> > > > Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
>> > > >
>> > > > Currently is_mem_section_removable checks whether each pageblock from
>> > > > the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
>> > > > are false then the range is considered non removable.
>> > > >
>> > > > On the other hand, offlining code (more specifically
>> > > > set_migratetype_isolate) doesn't care whether a page is free and instead
>> > > > it just checks the migrate type of the page and whether the page's zone
>> > > > is movable.
>> > > >
>> > > > This can lead into a situation when we can mark a node as not removable
>> > > > just because a pageblock is MIGRATE_RESERVE and it is not free.
>> > > >
>> > > > Let's make a common helper is_page_removable which unifies both tests
>> > > > at one place. Also let's check for MIGRATE_UNMOVABLE rather than all
>> > > > possible MIGRATEable types.
>> > > >
>> > > > Signed-off-by: Michal Hocko <mhocko@suse.cz>
>> > >
>> > > Hmm..Why MIGRATE_RECLAIMABLE is included ?
>> >
>> > AFAIU the code, MIGRATE_RECLAIMABLE are movable as well (at least that
>> > is how I interpret #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)).
>> > Why should we prevent from memory offlining if we have some reclaimable
>> > pages? Or am I totally misinterpreting the meaning of this flag?
>> >
>>
>> RECLAIMABLE cannot be 100% reclaimable.
>
> OK, I see. The name is little bit misleading then. Should we comment
> that?
>
>> Then, for memory hotlug,
>> I intentionally skips it and check free_area[] and LRU.
>>
>>
>> > >
>> > > If MIGRATE_RCLAIMABLE is included, set_migrate_type() should check the
>> > > range of pages. Because it makes the pageblock as MIGRAGE_MOVABLE after
>> > > failure of memory hotplug.
>> > >
>> > > Original code checks.
>> > >
>> > >  - the range is MIGRAGE_MOVABLE or
>> > >  - the range includes only free pages and LRU pages.
>> > >
>> > > Then, moving them back to MIGRAGE_MOVABLE after failure was correct.
>> > > Doesn't this makes changes MIGRATE_RECALIMABLE to be MIGRATE_MOVABLE and
>> > > leads us to more fragmentated situation ?
>> >
>> > Just to be sure that I understand you concern. We are talking about hot
>> > remove failure which can lead to higher fragmentation, right?
>> >
>> right.
>>
>> > By the higher fragmentation you mean that all movable pageblocks (even
>> > reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
>> > worst case, if we fail near the end of the zone then there is imbalance
>> > in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
>> > thinking of? Doesn't this just gets the zone to the state after
>> > onlining? Or is the problem if we fail somewhere in the middle?
>> >
>>
>> No. My concern is pageblock type changes before/after memory hotplug failure.
>>       before isolation: MIGRATE_RECLAIMABLE
>>       after isolation failure : MIGRATE_MOVABLE
>
> Ahh, OK I can see your point now. unset_migratetype_isolate called on
> the failure path sets migrate type unconditionally as it cannot know
> what was the original migration type.
>
Right.

> What about MIGRATE_RESERVE? Is there anything that can make those
> allocations fail offlining?
>
MIGRATE_RESERVE can contain several typs of pages, mixture of movable/unmovable
pages.

IIRC, my 1st version of code of set_migratetype_isolate() just checks
zone_idx and
I think checking MIGRATE_TYPE is my mistake.
(As Mel explained, it can be a mixture of several types.)

So, how about using the latter half of set_migratetype_isolate()'s check ?
It checks that the given range just includes free pages and LRU pages.
It's 100% accurate and more trustable than migrate_type check.

Whatever migratetype the pageblock has, if the block only contains free pages
and lru pages, changing the type as MOVABLE (at failure) is not very bad.

(Or, checking contents of pageblock in failure path and set proper
MIGRATE type.)

Anyway, not very difficult. Just a bit larger patch than you have.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02 11:19                     ` Hiroyuki Kamezawa
@ 2010-09-02 13:18                       ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02 13:18 UTC (permalink / raw)
  To: Hiroyuki Kamezawa
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 20:19:45, Hiroyuki Kamezawa wrote:
> 2010/9/2 Michal Hocko <mhocko@suse.cz>:
> > On Thu 02-09-10 18:03:43, KAMEZAWA Hiroyuki wrote:
> >> On Thu, 2 Sep 2010 10:28:29 +0200
> >> Michal Hocko <mhocko@suse.cz> wrote:
> >>
> >> > On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
[...]
> >> > By the higher fragmentation you mean that all movable pageblocks (even
> >> > reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
> >> > worst case, if we fail near the end of the zone then there is imbalance
> >> > in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
> >> > thinking of? Doesn't this just gets the zone to the state after
> >> > onlining? Or is the problem if we fail somewhere in the middle?
> >> >
> >>
> >> No. My concern is pageblock type changes before/after memory hotplug failure.
> >> ? ? ? before isolation: MIGRATE_RECLAIMABLE
> >> ? ? ? after isolation failure : MIGRATE_MOVABLE
> >
> > Ahh, OK I can see your point now. unset_migratetype_isolate called on
> > the failure path sets migrate type unconditionally as it cannot know
> > what was the original migration type.
> >
> Right.
> 
> > What about MIGRATE_RESERVE? Is there anything that can make those
> > allocations fail offlining?
> >
> MIGRATE_RESERVE can contain several typs of pages, mixture of movable/unmovable
> pages.

Ahh, ok. This is just a fallback zone. I see.

> 
> IIRC, my 1st version of code of set_migratetype_isolate() just checks
> zone_idx and
> I think checking MIGRATE_TYPE is my mistake.
> (As Mel explained, it can be a mixture of several types.)
> 
> So, how about using the latter half of set_migratetype_isolate()'s check ?
> It checks that the given range just includes free pages and LRU pages.
> It's 100% accurate and more trustable than migrate_type check.
> 
> Whatever migratetype the pageblock has, if the block only contains free pages
> and lru pages, changing the type as MOVABLE (at failure) is not very bad.
> 
> (Or, checking contents of pageblock in failure path and set proper
> MIGRATE type.)
> 
> Anyway, not very difficult. Just a bit larger patch than you have.

What about this? Just compile tested.

---
>From a2aaeafbaeb5b195b699df25060128b9e547949c Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 20 Aug 2010 15:39:16 +0200
Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code

Currently is_mem_section_removable checks whether each pageblock from
the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
are false then the range is considered non removable.

On the other hand, offlining code (more specifically
set_migratetype_isolate) doesn't care whether a page is free and instead
it just checks the migrate type of the page and whether the page's zone
is movable.

This can lead into a situation when we can mark a node as not removable
just because a pageblock is MIGRATE_RESERVE and it is not free but still
movable.

Let's make a common helper is_page_removable which unifies both tests
at one place.

Do not rely on any of MIGRATE_* types as all others than MIGRATE_MOVABLE
may be tricky. MIGRATE_RESERVE can be anything that just happened to
fallback to that allocation, MIGRATE_RECLAIMABLE can be unmovable
because slab (or what ever) has this page currently in use. If we tried
to remove those pages and the isolation failed then those blocks
would get to the MIRAGTE_MOVABLE list and we will end up with the
unmovable pages in the movable zone.

Let's, instead, check just whether a pageblock contains free or LRU
pages.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/mmzone.h |   24 ++++++++++++++++++++++++
 mm/memory_hotplug.c    |   19 +------------------
 mm/page_alloc.c        |    5 +----
 3 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6e6e626..0bd941b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -669,6 +669,30 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
  */
 #define zone_idx(zone)		((zone) - (zone)->zone_pgdat->node_zones)
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/*
+ * A free or LRU pages block are removable
+ * Do not use MIGRATE_MOVABLE because it can be insufficient and
+ * other MIGRATE types are tricky.
+ */
+static inline bool is_page_removable(struct page *page)
+{
+	int page_block = 1 << pageblock_order;
+	for (page_block > 0) {
+		if (PageBuddy(page)) {
+			page_block -= page_order(page);
+		}else if (PageLRU(page))
+			page_block--;
+		else 
+			return false;
+	}
+
+	return true;
+}
+#else
+#define is_page_removable(p) 0
+#endif
+
 static inline int populated_zone(struct zone *zone)
 {
 	return (!!zone->present_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..66195b8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -569,17 +569,6 @@ out:
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-/*
- * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
- * set and the size of the free page is given by page_order(). Using this,
- * the function determines if the pageblock contains only free pages.
- * Due to buddy contraints, a free page at least the size of a pageblock will
- * be located at the start of the pageblock
- */
-static inline int pageblock_free(struct page *page)
-{
-	return PageBuddy(page) && page_order(page) >= pageblock_order;
-}
 
 /* Return the start of the next active pageblock after a given page */
 static struct page *next_active_pageblock(struct page *page)
@@ -608,13 +597,7 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_page_removable(page))
 			return 0;
 
 		/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a9649f4..c2e2576 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
 	struct memory_isolate_notify arg;
 	int notifier_ret;
 	int ret = -EBUSY;
-	int zone_idx;
 
 	zone = page_zone(page);
-	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
+	if (is_page_removable(page)) {
 		ret = 0;
 		goto out;
 	}
-- 
1.7.1


-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02 13:18                       ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02 13:18 UTC (permalink / raw)
  To: Hiroyuki Kamezawa
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 20:19:45, Hiroyuki Kamezawa wrote:
> 2010/9/2 Michal Hocko <mhocko@suse.cz>:
> > On Thu 02-09-10 18:03:43, KAMEZAWA Hiroyuki wrote:
> >> On Thu, 2 Sep 2010 10:28:29 +0200
> >> Michal Hocko <mhocko@suse.cz> wrote:
> >>
> >> > On Thu 02-09-10 14:45:00, KAMEZAWA Hiroyuki wrote:
[...]
> >> > By the higher fragmentation you mean that all movable pageblocks (even
> >> > reclaimable) gets to MIGRATE_MOVABLE until we get first failure. In the
> >> > worst case, if we fail near the end of the zone then there is imbalance
> >> > in MIGRATE_MOVABLE vs. MIGRATE_RECALIMABLE. Is that what you are
> >> > thinking of? Doesn't this just gets the zone to the state after
> >> > onlining? Or is the problem if we fail somewhere in the middle?
> >> >
> >>
> >> No. My concern is pageblock type changes before/after memory hotplug failure.
> >> ? ? ? before isolation: MIGRATE_RECLAIMABLE
> >> ? ? ? after isolation failure : MIGRATE_MOVABLE
> >
> > Ahh, OK I can see your point now. unset_migratetype_isolate called on
> > the failure path sets migrate type unconditionally as it cannot know
> > what was the original migration type.
> >
> Right.
> 
> > What about MIGRATE_RESERVE? Is there anything that can make those
> > allocations fail offlining?
> >
> MIGRATE_RESERVE can contain several typs of pages, mixture of movable/unmovable
> pages.

Ahh, ok. This is just a fallback zone. I see.

> 
> IIRC, my 1st version of code of set_migratetype_isolate() just checks
> zone_idx and
> I think checking MIGRATE_TYPE is my mistake.
> (As Mel explained, it can be a mixture of several types.)
> 
> So, how about using the latter half of set_migratetype_isolate()'s check ?
> It checks that the given range just includes free pages and LRU pages.
> It's 100% accurate and more trustable than migrate_type check.
> 
> Whatever migratetype the pageblock has, if the block only contains free pages
> and lru pages, changing the type as MOVABLE (at failure) is not very bad.
> 
> (Or, checking contents of pageblock in failure path and set proper
> MIGRATE type.)
> 
> Anyway, not very difficult. Just a bit larger patch than you have.

What about this? Just compile tested.

---

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02 13:18                       ` Michal Hocko
@ 2010-09-02 14:19                         ` Hiroyuki Kamezawa
  -1 siblings, 0 replies; 72+ messages in thread
From: Hiroyuki Kamezawa @ 2010-09-02 14:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

2010/9/2 Michal Hocko <mhocko@suse.cz>:
> What about this? Just compile tested.
>
> ---
> From a2aaeafbaeb5b195b699df25060128b9e547949c Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
>
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
>
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
>
> This can lead into a situation when we can mark a node as not removable
> just because a pageblock is MIGRATE_RESERVE and it is not free but still
> movable.
>
> Let's make a common helper is_page_removable which unifies both tests
> at one place.
>
> Do not rely on any of MIGRATE_* types as all others than MIGRATE_MOVABLE
> may be tricky. MIGRATE_RESERVE can be anything that just happened to
> fallback to that allocation, MIGRATE_RECLAIMABLE can be unmovable
> because slab (or what ever) has this page currently in use. If we tried
> to remove those pages and the isolation failed then those blocks
> would get to the MIRAGTE_MOVABLE list and we will end up with the
> unmovable pages in the movable zone.
>
> Let's, instead, check just whether a pageblock contains free or LRU
> pages.
>
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  include/linux/mmzone.h |   24 ++++++++++++++++++++++++
>  mm/memory_hotplug.c    |   19 +------------------
>  mm/page_alloc.c        |    5 +----
>  3 files changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 6e6e626..0bd941b 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -669,6 +669,30 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
>  */
>  #define zone_idx(zone)         ((zone) - (zone)->zone_pgdat->node_zones)
>
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +/*
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
> + */
> +static inline bool is_page_removable(struct page *page)
> +{
> +       int page_block = 1 << pageblock_order;
> +       for (page_block > 0) {

for ?

> +               if (PageBuddy(page)) {
> +                       page_block -= page_order(page);
> +               }else if (PageLRU(page))
> +                       page_block--;
> +               else
> +                       return false;
> +       }
> +
> +       return true;
> +}

Hmm. above for is intending to check all pages in the block ?
I'll look into details, tomorrow.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02 14:19                         ` Hiroyuki Kamezawa
  0 siblings, 0 replies; 72+ messages in thread
From: Hiroyuki Kamezawa @ 2010-09-02 14:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

2010/9/2 Michal Hocko <mhocko@suse.cz>:
> What about this? Just compile tested.
>
> ---
> From a2aaeafbaeb5b195b699df25060128b9e547949c Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
>
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
>
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
>
> This can lead into a situation when we can mark a node as not removable
> just because a pageblock is MIGRATE_RESERVE and it is not free but still
> movable.
>
> Let's make a common helper is_page_removable which unifies both tests
> at one place.
>
> Do not rely on any of MIGRATE_* types as all others than MIGRATE_MOVABLE
> may be tricky. MIGRATE_RESERVE can be anything that just happened to
> fallback to that allocation, MIGRATE_RECLAIMABLE can be unmovable
> because slab (or what ever) has this page currently in use. If we tried
> to remove those pages and the isolation failed then those blocks
> would get to the MIRAGTE_MOVABLE list and we will end up with the
> unmovable pages in the movable zone.
>
> Let's, instead, check just whether a pageblock contains free or LRU
> pages.
>
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  include/linux/mmzone.h |   24 ++++++++++++++++++++++++
>  mm/memory_hotplug.c    |   19 +------------------
>  mm/page_alloc.c        |    5 +----
>  3 files changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 6e6e626..0bd941b 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -669,6 +669,30 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
>  */
>  #define zone_idx(zone)         ((zone) - (zone)->zone_pgdat->node_zones)
>
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +/*
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
> + */
> +static inline bool is_page_removable(struct page *page)
> +{
> +       int page_block = 1 << pageblock_order;
> +       for (page_block > 0) {

for ?

> +               if (PageBuddy(page)) {
> +                       page_block -= page_order(page);
> +               }else if (PageLRU(page))
> +                       page_block--;
> +               else
> +                       return false;
> +       }
> +
> +       return true;
> +}

Hmm. above for is intending to check all pages in the block ?
I'll look into details, tomorrow.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02 14:19                         ` Hiroyuki Kamezawa
@ 2010-09-02 14:39                           ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02 14:39 UTC (permalink / raw)
  To: Hiroyuki Kamezawa
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 23:19:18, Hiroyuki Kamezawa wrote:
[...]
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 6e6e626..0bd941b 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -669,6 +669,30 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
> > ?*/
> > ?#define zone_idx(zone) ? ? ? ? ((zone) - (zone)->zone_pgdat->node_zones)
> >
> > +#ifdef CONFIG_MEMORY_HOTREMOVE
> > +/*
> > + * A free or LRU pages block are removable
> > + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> > + * other MIGRATE types are tricky.
> > + */
> > +static inline bool is_page_removable(struct page *page)
> > +{
> > + ? ? ? int page_block = 1 << pageblock_order;
> > + ? ? ? for (page_block > 0) {
> 
> for ?

Bahh. The old and half backed patch. See the up-to-date one bellow.

> > + ? ? ? ? ? ? ? if (PageBuddy(page)) {
> > + ? ? ? ? ? ? ? ? ? ? ? page_block -= page_order(page);
> > + ? ? ? ? ? ? ? }else if (PageLRU(page))
> > + ? ? ? ? ? ? ? ? ? ? ? page_block--;
> > + ? ? ? ? ? ? ? else
> > + ? ? ? ? ? ? ? ? ? ? ? return false;
> > + ? ? ? }
> > +
> > + ? ? ? return true;
> > +}
> 
> Hmm. above for is intending to check all pages in the block ?

Yes, by their orders.

> I'll look into details, tomorrow.

Thanks!

---
>From 0a8650594280446ecfa2ae8b609f25877801c55d Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 20 Aug 2010 15:39:16 +0200
Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code

Currently is_mem_section_removable checks whether each pageblock from
the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
are false then the range is considered non removable.

On the other hand, offlining code (more specifically
set_migratetype_isolate) doesn't care whether a page is free and instead
it just checks the migrate type of the page and whether the page's zone
is movable.

This can lead into a situation when we can mark a node as not removable
just because a pageblock is MIGRATE_RESERVE and it is not free but still
movable.

Let's make a common helper is_page_removable which unifies both tests
at one place.

Do not rely on any of MIGRATE_* types as all others but MIGRATE_MOVABLE
may be tricky. MIGRATE_RESERVE can be anything that just happened to
fallback to that allocation. MIGRATE_RECLAIMABLE can be unmovable
because slab (or what ever) has this page currently in use and cannot
release it.  If we tried to remove those pages and the isolation failed
then those blocks would get into the MIRAGTE_MOVABLE list
unconditionally and we will end up having unmovable pages in the movable
list.

Let's, instead, check just whether a pageblock contains only free or LRU
pages.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/mmzone.h |   24 ++++++++++++++++++++++++
 mm/memory_hotplug.c    |   19 +------------------
 mm/page_alloc.c        |    5 +----
 3 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6e6e626..f64297e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -669,6 +669,30 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
  */
 #define zone_idx(zone)		((zone) - (zone)->zone_pgdat->node_zones)
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/*
+ * A free or LRU pages block are removable
+ * Do not use MIGRATE_MOVABLE because it can be insufficient and
+ * other MIGRATE types are tricky.
+ */
+static inline bool is_page_removable(struct page *page)
+{
+	int page_block = 1 << pageblock_order;
+	while (page_block > 0) {
+		if (PageBuddy(page)) {
+			page_block -= page_order(page);
+		}else if (PageLRU(page))
+			page_block--;
+		else 
+			return false;
+	}
+
+	return true;
+}
+#else
+#define is_page_removable(p) 0
+#endif
+
 static inline int populated_zone(struct zone *zone)
 {
 	return (!!zone->present_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..66195b8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -569,17 +569,6 @@ out:
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-/*
- * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
- * set and the size of the free page is given by page_order(). Using this,
- * the function determines if the pageblock contains only free pages.
- * Due to buddy contraints, a free page at least the size of a pageblock will
- * be located at the start of the pageblock
- */
-static inline int pageblock_free(struct page *page)
-{
-	return PageBuddy(page) && page_order(page) >= pageblock_order;
-}
 
 /* Return the start of the next active pageblock after a given page */
 static struct page *next_active_pageblock(struct page *page)
@@ -608,13 +597,7 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_page_removable(page))
 			return 0;
 
 		/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a9649f4..c2e2576 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
 	struct memory_isolate_notify arg;
 	int notifier_ret;
 	int ret = -EBUSY;
-	int zone_idx;
 
 	zone = page_zone(page);
-	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
+	if (is_page_removable(page)) {
 		ret = 0;
 		goto out;
 	}
-- 
1.7.1


-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02 14:39                           ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02 14:39 UTC (permalink / raw)
  To: Hiroyuki Kamezawa
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Thu 02-09-10 23:19:18, Hiroyuki Kamezawa wrote:
[...]
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 6e6e626..0bd941b 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -669,6 +669,30 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
> > ?*/
> > ?#define zone_idx(zone) ? ? ? ? ((zone) - (zone)->zone_pgdat->node_zones)
> >
> > +#ifdef CONFIG_MEMORY_HOTREMOVE
> > +/*
> > + * A free or LRU pages block are removable
> > + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> > + * other MIGRATE types are tricky.
> > + */
> > +static inline bool is_page_removable(struct page *page)
> > +{
> > + ? ? ? int page_block = 1 << pageblock_order;
> > + ? ? ? for (page_block > 0) {
> 
> for ?

Bahh. The old and half backed patch. See the up-to-date one bellow.

> > + ? ? ? ? ? ? ? if (PageBuddy(page)) {
> > + ? ? ? ? ? ? ? ? ? ? ? page_block -= page_order(page);
> > + ? ? ? ? ? ? ? }else if (PageLRU(page))
> > + ? ? ? ? ? ? ? ? ? ? ? page_block--;
> > + ? ? ? ? ? ? ? else
> > + ? ? ? ? ? ? ? ? ? ? ? return false;
> > + ? ? ? }
> > +
> > + ? ? ? return true;
> > +}
> 
> Hmm. above for is intending to check all pages in the block ?

Yes, by their orders.

> I'll look into details, tomorrow.

Thanks!

---

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
  2010-09-02 14:39                           ` Michal Hocko
@ 2010-09-02 15:05                             ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02 15:05 UTC (permalink / raw)
  To: Hiroyuki Kamezawa
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

I am sorry for re-posting the patch again but I found out that my config
didn't have CONFIG_HOTREMOVE so the code wasn't compiled and I didn't
catch include problems (missing symbols etc.). The patch compiles now,
finally.

---
>From 8f29b72c3318a0c9252471c4b8ffe31dce10eb6b Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 20 Aug 2010 15:39:16 +0200
Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code

Currently is_mem_section_removable checks whether each pageblock from
the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
are false then the range is considered non removable.

On the other hand, offlining code (more specifically
set_migratetype_isolate) doesn't care whether a page is free and instead
it just checks the migrate type of the page and whether the page's zone
is movable.

This can lead into a situation when we can mark a node as not removable
just because a pageblock is MIGRATE_RESERVE and it is not free but still
movable.

Let's make a common helper is_page_removable which unifies both tests
at one place.

Do not rely on any of MIGRATE_* types as all others but MIGRATE_MOVABLE
may be tricky. MIGRATE_RESERVE can be anything that just happened to
fallback to that allocation. MIGRATE_RECLAIMABLE can be unmovable
because slab (or what ever) has this page currently in use and cannot
release it.  If we tried to remove those pages and the isolation failed
then those blocks would get into the MIRAGTE_MOVABLE list
unconditionally and we will end up having unmovable pages in the movable
list.

Let's, instead, check just whether a pageblock contains only free or LRU
pages.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/memory_hotplug.h |    4 ++++
 mm/memory_hotplug.c            |   34 +++++++++++++++++++---------------
 mm/page_alloc.c                |    5 +----
 3 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 864035f..5c448f7 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -194,12 +194,16 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 
 extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 
+bool is_page_removable(struct page *page);
+
 #else
 static inline int is_mem_section_removable(unsigned long pfn,
 					unsigned long nr_pages)
 {
 	return 0;
 }
+
+#define is_page_removable(page) 0
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern int mem_online_node(int nid);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..2b736ed 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -569,16 +569,25 @@ out:
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
+
 /*
- * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
- * set and the size of the free page is given by page_order(). Using this,
- * the function determines if the pageblock contains only free pages.
- * Due to buddy contraints, a free page at least the size of a pageblock will
- * be located at the start of the pageblock
+ * A free or LRU pages block are removable
+ * Do not use MIGRATE_MOVABLE because it can be insufficient and
+ * other MIGRATE types are tricky.
  */
-static inline int pageblock_free(struct page *page)
-{
-	return PageBuddy(page) && page_order(page) >= pageblock_order;
+bool is_page_removable(struct page *page)
+{
+	int page_block = 1 << pageblock_order;
+	while (page_block > 0) {
+		if (PageBuddy(page)) {
+			page_block -= page_order(page);
+		} else if (PageLRU(page))
+			page_block--;
+		else 
+			return false;
+	}
+
+	return true;
 }
 
 /* Return the start of the next active pageblock after a given page */
@@ -608,13 +617,7 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_page_removable(page))
 			return 0;
 
 		/*
@@ -770,6 +773,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
 	return ret;
 }
 
+
 static long
 check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a9649f4..c2e2576 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
 	struct memory_isolate_notify arg;
 	int notifier_ret;
 	int ret = -EBUSY;
-	int zone_idx;
 
 	zone = page_zone(page);
-	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
+	if (is_page_removable(page)) {
 		ret = 0;
 		goto out;
 	}
-- 
1.7.1


-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-02 15:05                             ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-02 15:05 UTC (permalink / raw)
  To: Hiroyuki Kamezawa
  Cc: KAMEZAWA Hiroyuki, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

I am sorry for re-posting the patch again but I found out that my config
didn't have CONFIG_HOTREMOVE so the code wasn't compiled and I didn't
catch include problems (missing symbols etc.). The patch compiles now,
finally.

---

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 0/2 v2] Make is_mem_section_removable more conformable with offlining code
  2010-09-02 15:05                             ` Michal Hocko
@ 2010-09-03  3:10                               ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  3:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Thu, 2 Sep 2010 17:05:54 +0200
Michal Hocko <mhocko@suse.cz> wrote:

>  extern int mem_online_node(int nid);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a4cfcdc..2b736ed 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -569,16 +569,25 @@ out:
>  EXPORT_SYMBOL_GPL(add_memory);
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
> +
>  /*
> - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> - * set and the size of the free page is given by page_order(). Using this,
> - * the function determines if the pageblock contains only free pages.
> - * Due to buddy contraints, a free page at least the size of a pageblock will
> - * be located at the start of the pageblock
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
>   */
> -static inline int pageblock_free(struct page *page)
> -{
> -	return PageBuddy(page) && page_order(page) >= pageblock_order;
> +bool is_page_removable(struct page *page)
> +{
> +	int page_block = 1 << pageblock_order;
> +	while (page_block > 0) {
> +		if (PageBuddy(page)) {
> +			page_block -= page_order(page);
> +		} else if (PageLRU(page))
> +			page_block--;
> +		else 
> +			return false;
> +	}

still seems wrong..."page" pointer should be updated.

Ok, here is my patch in reply to this mail. (changed the subject as v2.)

1. bugfix for current code.
2. show precise removable information.

Tested and seems to work well.

Thanks,
-Kame






^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 0/2 v2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  3:10                               ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  3:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Thu, 2 Sep 2010 17:05:54 +0200
Michal Hocko <mhocko@suse.cz> wrote:

>  extern int mem_online_node(int nid);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a4cfcdc..2b736ed 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -569,16 +569,25 @@ out:
>  EXPORT_SYMBOL_GPL(add_memory);
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
> +
>  /*
> - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> - * set and the size of the free page is given by page_order(). Using this,
> - * the function determines if the pageblock contains only free pages.
> - * Due to buddy contraints, a free page at least the size of a pageblock will
> - * be located at the start of the pageblock
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
>   */
> -static inline int pageblock_free(struct page *page)
> -{
> -	return PageBuddy(page) && page_order(page) >= pageblock_order;
> +bool is_page_removable(struct page *page)
> +{
> +	int page_block = 1 << pageblock_order;
> +	while (page_block > 0) {
> +		if (PageBuddy(page)) {
> +			page_block -= page_order(page);
> +		} else if (PageLRU(page))
> +			page_block--;
> +		else 
> +			return false;
> +	}

still seems wrong..."page" pointer should be updated.

Ok, here is my patch in reply to this mail. (changed the subject as v2.)

1. bugfix for current code.
2. show precise removable information.

Tested and seems to work well.

Thanks,
-Kame





--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 1/2][BUGFIX] fix next active pageblock calculation
  2010-09-03  3:10                               ` KAMEZAWA Hiroyuki
@ 2010-09-03  3:11                                 ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  3:11 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm,
	Andrew Morton, Kleen, Andi, Haicheng Li, Christoph Lameter,
	linux-kernel, Mel Gorman

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

next_active_pageblock() is for finding next _used_ freeblock. It skips
several blocks when it finds there are a chunk of free pages lager than
pageblock. But it has 2 bugs.

  1. We have no lock. page_order(page) - pageblock_order can be minus.
  2. pageblocks_stride += is wrong. it should skip page_order(p) of pages.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memory_hotplug.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Index: mmotm-0827/mm/memory_hotplug.c
===================================================================
--- mmotm-0827.orig/mm/memory_hotplug.c
+++ mmotm-0827/mm/memory_hotplug.c
@@ -584,19 +584,19 @@ static inline int pageblock_free(struct 
 /* Return the start of the next active pageblock after a given page */
 static struct page *next_active_pageblock(struct page *page)
 {
-	int pageblocks_stride;
-
 	/* Ensure the starting page is pageblock-aligned */
 	BUG_ON(page_to_pfn(page) & (pageblock_nr_pages - 1));
 
-	/* Move forward by at least 1 * pageblock_nr_pages */
-	pageblocks_stride = 1;
-
 	/* If the entire pageblock is free, move to the end of free page */
-	if (pageblock_free(page))
-		pageblocks_stride += page_order(page) - pageblock_order;
+	if (pageblock_free(page)) {
+		int order;
+		/* be careful. we don't have locks, page_order can be changed.*/
+		order = page_order(page);
+		if (order > pageblock_order)
+			return page + (1 << order);
+	}
 
-	return page + (pageblocks_stride * pageblock_nr_pages);
+	return page + pageblock_nr_pages;
 }
 
 /* Checks if this range of memory is likely to be hot-removable. */


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 1/2][BUGFIX] fix next active pageblock calculation
@ 2010-09-03  3:11                                 ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  3:11 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm,
	Andrew Morton, Kleen, Andi, Haicheng Li, Christoph Lameter,
	linux-kernel, Mel Gorman

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

next_active_pageblock() is for finding next _used_ freeblock. It skips
several blocks when it finds there are a chunk of free pages lager than
pageblock. But it has 2 bugs.

  1. We have no lock. page_order(page) - pageblock_order can be minus.
  2. pageblocks_stride += is wrong. it should skip page_order(p) of pages.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memory_hotplug.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Index: mmotm-0827/mm/memory_hotplug.c
===================================================================
--- mmotm-0827.orig/mm/memory_hotplug.c
+++ mmotm-0827/mm/memory_hotplug.c
@@ -584,19 +584,19 @@ static inline int pageblock_free(struct 
 /* Return the start of the next active pageblock after a given page */
 static struct page *next_active_pageblock(struct page *page)
 {
-	int pageblocks_stride;
-
 	/* Ensure the starting page is pageblock-aligned */
 	BUG_ON(page_to_pfn(page) & (pageblock_nr_pages - 1));
 
-	/* Move forward by at least 1 * pageblock_nr_pages */
-	pageblocks_stride = 1;
-
 	/* If the entire pageblock is free, move to the end of free page */
-	if (pageblock_free(page))
-		pageblocks_stride += page_order(page) - pageblock_order;
+	if (pageblock_free(page)) {
+		int order;
+		/* be careful. we don't have locks, page_order can be changed.*/
+		order = page_order(page);
+		if (order > pageblock_order)
+			return page + (1 << order);
+	}
 
-	return page + (pageblocks_stride * pageblock_nr_pages);
+	return page + pageblock_nr_pages;
 }
 
 /* Checks if this range of memory is likely to be hot-removable. */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  3:10                               ` KAMEZAWA Hiroyuki
@ 2010-09-03  3:14                                 ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  3:14 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm,
	Andrew Morton, Kleen, Andi, Haicheng Li, Christoph Lameter,
	linux-kernel, Mel Gorman


Now, sysfs interface of memory hotplug shows whether the section is
removable or not. But it checks only migrateype of pages and doesn't
check details of cluster of pages.

Next, memory hotplug's set_migratetype_isolate() has the same kind
of check, too. But the migrate-type is just a "hint" and the pageblock
can contain several types of pages if fragmentation is very heavy.

To get precise information, we need to check
 - the pageblock only contains free pages or LRU pages.

This patch adds the function __count_unmovable_pages() and makes
above 2 checks to use the same logic. This will improve user experience
of memory hotplug because sysfs interface tells accurate information.

Note:
it may be better to check MIGRATE_UNMOVABLE for making failure case quick.
Note2:
I'm not sure but notifer should be called ??

Reported-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 include/linux/memory_hotplug.h |    1 
 mm/memory_hotplug.c            |   15 --------
 mm/page_alloc.c                |   76 ++++++++++++++++++++++++++++++-----------
 3 files changed, 59 insertions(+), 33 deletions(-)

Index: mmotm-0827/mm/page_alloc.c
===================================================================
--- mmotm-0827.orig/mm/page_alloc.c
+++ mmotm-0827/mm/page_alloc.c
@@ -5274,11 +5274,63 @@ void set_pageblock_flags_group(struct pa
  * page allocater never alloc memory from ISOLATE block.
  */
 
+static int __count_unmovable_pages(struct zone *zone, struct page *page)
+{
+	unsigned long pfn, iter, found;
+	/*
+	 * For avoiding noise data, lru_add_drain_all() should be called.
+ 	 * before this.
+ 	 */
+	if (zone_idx(zone) == ZONE_MOVABLE)
+		return 0;
+
+	pfn = page_to_pfn(page);
+	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
+		unsigned long check = pfn + iter;
+
+		if (!pfn_valid_within(check)) {
+			iter++;
+			continue;
+		}
+		page = pfn_to_page(check);
+		if (!page_count(page)) {
+			if (PageBuddy(page))
+				iter += (1 << page_order(page)) - 1;
+			continue;
+		}
+		if (!PageLRU(page))
+			found++;
+		/*
+		 * If the page is not RAM, page_count()should be 0.
+		 * we don't need more check. This is an _used_ not-movable page.
+		 *
+		 * The problematic thing here is PG_reserved pages. But if
+		 * a PG_reserved page is _used_ (at boot), page_count > 1.
+		 * But...is there PG_reserved && page_count(page)==0 page ?
+		 */
+	}
+	return found;
+}
+
+bool is_pageblock_removable(struct page *page)
+{
+	struct zone *zone = page_zone(page);
+	unsigned long flags;
+	int num;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	num = __count_unmovable_pages(zone, page);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	if (num)
+		return false;
+	return true;
+}
+
 int set_migratetype_isolate(struct page *page)
 {
 	struct zone *zone;
-	struct page *curr_page;
-	unsigned long flags, pfn, iter;
+	unsigned long flags, pfn;
 	unsigned long immobile = 0;
 	struct memory_isolate_notify arg;
 	int notifier_ret;
@@ -5289,11 +5341,6 @@ int set_migratetype_isolate(struct page 
 	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
-		ret = 0;
-		goto out;
-	}
 
 	pfn = page_to_pfn(page);
 	arg.start_pfn = pfn;
@@ -5315,19 +5362,9 @@ int set_migratetype_isolate(struct page 
 	notifier_ret = notifier_to_errno(notifier_ret);
 	if (notifier_ret || !arg.pages_found)
 		goto out;
+	immobile = __count_unmovable_pages(zone ,page);
 
-	for (iter = pfn; iter < (pfn + pageblock_nr_pages); iter++) {
-		if (!pfn_valid_within(pfn))
-			continue;
-
-		curr_page = pfn_to_page(iter);
-		if (!page_count(curr_page) || PageLRU(curr_page))
-			continue;
-
-		immobile++;
-	}
-
-	if (arg.pages_found == immobile)
+	if (!immobile || arg.pages_found == immobile)
 		ret = 0;
 
 out:
Index: mmotm-0827/mm/memory_hotplug.c
===================================================================
--- mmotm-0827.orig/mm/memory_hotplug.c
+++ mmotm-0827/mm/memory_hotplug.c
@@ -602,27 +602,14 @@ static struct page *next_active_pagebloc
 /* Checks if this range of memory is likely to be hot-removable. */
 int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 {
-	int type;
 	struct page *page = pfn_to_page(start_pfn);
 	struct page *end_page = page + nr_pages;
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_pageblock_removable(page))
 			return 0;
 
-		/*
-		 * A pageblock starting with a PageReserved page is not
-		 * considered removable.
-		 */
-		if (PageReserved(page))
-			return 0;
 	}
 
 	/* All pageblocks in the memory block are likely to be hot-removable */
Index: mmotm-0827/include/linux/memory_hotplug.h
===================================================================
--- mmotm-0827.orig/include/linux/memory_hotplug.h
+++ mmotm-0827/include/linux/memory_hotplug.h
@@ -76,6 +76,7 @@ extern int __add_pages(int nid, struct z
 extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
 	unsigned long nr_pages);
 
+extern bool is_pageblock_removable(struct page *page);
 #ifdef CONFIG_NUMA
 extern int memory_add_physaddr_to_nid(u64 start);
 #else


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  3:14                                 ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  3:14 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm,
	Andrew Morton, Kleen, Andi, Haicheng Li, Christoph Lameter,
	linux-kernel, Mel Gorman


Now, sysfs interface of memory hotplug shows whether the section is
removable or not. But it checks only migrateype of pages and doesn't
check details of cluster of pages.

Next, memory hotplug's set_migratetype_isolate() has the same kind
of check, too. But the migrate-type is just a "hint" and the pageblock
can contain several types of pages if fragmentation is very heavy.

To get precise information, we need to check
 - the pageblock only contains free pages or LRU pages.

This patch adds the function __count_unmovable_pages() and makes
above 2 checks to use the same logic. This will improve user experience
of memory hotplug because sysfs interface tells accurate information.

Note:
it may be better to check MIGRATE_UNMOVABLE for making failure case quick.
Note2:
I'm not sure but notifer should be called ??

Reported-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 include/linux/memory_hotplug.h |    1 
 mm/memory_hotplug.c            |   15 --------
 mm/page_alloc.c                |   76 ++++++++++++++++++++++++++++++-----------
 3 files changed, 59 insertions(+), 33 deletions(-)

Index: mmotm-0827/mm/page_alloc.c
===================================================================
--- mmotm-0827.orig/mm/page_alloc.c
+++ mmotm-0827/mm/page_alloc.c
@@ -5274,11 +5274,63 @@ void set_pageblock_flags_group(struct pa
  * page allocater never alloc memory from ISOLATE block.
  */
 
+static int __count_unmovable_pages(struct zone *zone, struct page *page)
+{
+	unsigned long pfn, iter, found;
+	/*
+	 * For avoiding noise data, lru_add_drain_all() should be called.
+ 	 * before this.
+ 	 */
+	if (zone_idx(zone) == ZONE_MOVABLE)
+		return 0;
+
+	pfn = page_to_pfn(page);
+	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
+		unsigned long check = pfn + iter;
+
+		if (!pfn_valid_within(check)) {
+			iter++;
+			continue;
+		}
+		page = pfn_to_page(check);
+		if (!page_count(page)) {
+			if (PageBuddy(page))
+				iter += (1 << page_order(page)) - 1;
+			continue;
+		}
+		if (!PageLRU(page))
+			found++;
+		/*
+		 * If the page is not RAM, page_count()should be 0.
+		 * we don't need more check. This is an _used_ not-movable page.
+		 *
+		 * The problematic thing here is PG_reserved pages. But if
+		 * a PG_reserved page is _used_ (at boot), page_count > 1.
+		 * But...is there PG_reserved && page_count(page)==0 page ?
+		 */
+	}
+	return found;
+}
+
+bool is_pageblock_removable(struct page *page)
+{
+	struct zone *zone = page_zone(page);
+	unsigned long flags;
+	int num;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	num = __count_unmovable_pages(zone, page);
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	if (num)
+		return false;
+	return true;
+}
+
 int set_migratetype_isolate(struct page *page)
 {
 	struct zone *zone;
-	struct page *curr_page;
-	unsigned long flags, pfn, iter;
+	unsigned long flags, pfn;
 	unsigned long immobile = 0;
 	struct memory_isolate_notify arg;
 	int notifier_ret;
@@ -5289,11 +5341,6 @@ int set_migratetype_isolate(struct page 
 	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
-		ret = 0;
-		goto out;
-	}
 
 	pfn = page_to_pfn(page);
 	arg.start_pfn = pfn;
@@ -5315,19 +5362,9 @@ int set_migratetype_isolate(struct page 
 	notifier_ret = notifier_to_errno(notifier_ret);
 	if (notifier_ret || !arg.pages_found)
 		goto out;
+	immobile = __count_unmovable_pages(zone ,page);
 
-	for (iter = pfn; iter < (pfn + pageblock_nr_pages); iter++) {
-		if (!pfn_valid_within(pfn))
-			continue;
-
-		curr_page = pfn_to_page(iter);
-		if (!page_count(curr_page) || PageLRU(curr_page))
-			continue;
-
-		immobile++;
-	}
-
-	if (arg.pages_found == immobile)
+	if (!immobile || arg.pages_found == immobile)
 		ret = 0;
 
 out:
Index: mmotm-0827/mm/memory_hotplug.c
===================================================================
--- mmotm-0827.orig/mm/memory_hotplug.c
+++ mmotm-0827/mm/memory_hotplug.c
@@ -602,27 +602,14 @@ static struct page *next_active_pagebloc
 /* Checks if this range of memory is likely to be hot-removable. */
 int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 {
-	int type;
 	struct page *page = pfn_to_page(start_pfn);
 	struct page *end_page = page + nr_pages;
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_pageblock_removable(page))
 			return 0;
 
-		/*
-		 * A pageblock starting with a PageReserved page is not
-		 * considered removable.
-		 */
-		if (PageReserved(page))
-			return 0;
 	}
 
 	/* All pageblocks in the memory block are likely to be hot-removable */
Index: mmotm-0827/include/linux/memory_hotplug.h
===================================================================
--- mmotm-0827.orig/include/linux/memory_hotplug.h
+++ mmotm-0827/include/linux/memory_hotplug.h
@@ -76,6 +76,7 @@ extern int __add_pages(int nid, struct z
 extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
 	unsigned long nr_pages);
 
+extern bool is_pageblock_removable(struct page *page);
 #ifdef CONFIG_NUMA
 extern int memory_add_physaddr_to_nid(u64 start);
 #else

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 0/2 v2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  3:10                               ` KAMEZAWA Hiroyuki
@ 2010-09-03  7:54                                 ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  7:54 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 12:10:03, KAMEZAWA Hiroyuki wrote:
> On Thu, 2 Sep 2010 17:05:54 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> >  extern int mem_online_node(int nid);
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index a4cfcdc..2b736ed 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -569,16 +569,25 @@ out:
> >  EXPORT_SYMBOL_GPL(add_memory);
> >  
> >  #ifdef CONFIG_MEMORY_HOTREMOVE
> > +
> >  /*
> > - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> > - * set and the size of the free page is given by page_order(). Using this,
> > - * the function determines if the pageblock contains only free pages.
> > - * Due to buddy contraints, a free page at least the size of a pageblock will
> > - * be located at the start of the pageblock
> > + * A free or LRU pages block are removable
> > + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> > + * other MIGRATE types are tricky.
> >   */
> > -static inline int pageblock_free(struct page *page)
> > -{
> > -	return PageBuddy(page) && page_order(page) >= pageblock_order;
> > +bool is_page_removable(struct page *page)
> > +{
> > +	int page_block = 1 << pageblock_order;
> > +	while (page_block > 0) {
> > +		if (PageBuddy(page)) {
> > +			page_block -= page_order(page);
> > +		} else if (PageLRU(page))
> > +			page_block--;
> > +		else 
> > +			return false;
> > +	}
> 
> still seems wrong..."page" pointer should be updated.

You are right. I should have double checked that (never send patches
before leaving from the office...).

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 0/2 v2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  7:54                                 ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  7:54 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 12:10:03, KAMEZAWA Hiroyuki wrote:
> On Thu, 2 Sep 2010 17:05:54 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> >  extern int mem_online_node(int nid);
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index a4cfcdc..2b736ed 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -569,16 +569,25 @@ out:
> >  EXPORT_SYMBOL_GPL(add_memory);
> >  
> >  #ifdef CONFIG_MEMORY_HOTREMOVE
> > +
> >  /*
> > - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> > - * set and the size of the free page is given by page_order(). Using this,
> > - * the function determines if the pageblock contains only free pages.
> > - * Due to buddy contraints, a free page at least the size of a pageblock will
> > - * be located at the start of the pageblock
> > + * A free or LRU pages block are removable
> > + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> > + * other MIGRATE types are tricky.
> >   */
> > -static inline int pageblock_free(struct page *page)
> > -{
> > -	return PageBuddy(page) && page_order(page) >= pageblock_order;
> > +bool is_page_removable(struct page *page)
> > +{
> > +	int page_block = 1 << pageblock_order;
> > +	while (page_block > 0) {
> > +		if (PageBuddy(page)) {
> > +			page_block -= page_order(page);
> > +		} else if (PageLRU(page))
> > +			page_block--;
> > +		else 
> > +			return false;
> > +	}
> 
> still seems wrong..."page" pointer should be updated.

You are right. I should have double checked that (never send patches
before leaving from the office...).

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 3/2][BUGFIX] fix memory isolation notifier return value check
  2010-09-03  3:10                               ` KAMEZAWA Hiroyuki
@ 2010-09-03  7:57                                 ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  7:57 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm,
	Andrew Morton, Kleen, Andi, Haicheng Li, Christoph Lameter,
	linux-kernel, Mel Gorman

Sorry, the 3rd patch for this set.
==

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Even if notifier cannot find any pages, it doesn't mean
no pages are available...(For example, there is no notifier.)
Anyway, we check all pages in the range, later.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: mmotm-0827/mm/page_alloc.c
===================================================================
--- mmotm-0827.orig/mm/page_alloc.c
+++ mmotm-0827/mm/page_alloc.c
@@ -5360,7 +5360,7 @@ int set_migratetype_isolate(struct page 
 	 */
 	notifier_ret = memory_isolate_notify(MEM_ISOLATE_COUNT, &arg);
 	notifier_ret = notifier_to_errno(notifier_ret);
-	if (notifier_ret || !arg.pages_found)
+	if (notifier_ret)
 		goto out;
 	immobile = __count_unmovable_pages(zone ,page);
 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 3/2][BUGFIX] fix memory isolation notifier return value check
@ 2010-09-03  7:57                                 ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  7:57 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm,
	Andrew Morton, Kleen, Andi, Haicheng Li, Christoph Lameter,
	linux-kernel, Mel Gorman

Sorry, the 3rd patch for this set.
==

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Even if notifier cannot find any pages, it doesn't mean
no pages are available...(For example, there is no notifier.)
Anyway, we check all pages in the range, later.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: mmotm-0827/mm/page_alloc.c
===================================================================
--- mmotm-0827.orig/mm/page_alloc.c
+++ mmotm-0827/mm/page_alloc.c
@@ -5360,7 +5360,7 @@ int set_migratetype_isolate(struct page 
 	 */
 	notifier_ret = memory_isolate_notify(MEM_ISOLATE_COUNT, &arg);
 	notifier_ret = notifier_to_errno(notifier_ret);
-	if (notifier_ret || !arg.pages_found)
+	if (notifier_ret)
 		goto out;
 	immobile = __count_unmovable_pages(zone ,page);
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  3:14                                 ` KAMEZAWA Hiroyuki
@ 2010-09-03  8:25                                   ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  8:25 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
[...]
> ---
>  include/linux/memory_hotplug.h |    1 
>  mm/memory_hotplug.c            |   15 --------
>  mm/page_alloc.c                |   76 ++++++++++++++++++++++++++++++-----------
>  3 files changed, 59 insertions(+), 33 deletions(-)
> 
> Index: mmotm-0827/mm/page_alloc.c
> ===================================================================
> --- mmotm-0827.orig/mm/page_alloc.c
> +++ mmotm-0827/mm/page_alloc.c
> @@ -5274,11 +5274,63 @@ void set_pageblock_flags_group(struct pa
>   * page allocater never alloc memory from ISOLATE block.
>   */
>  
> +static int __count_unmovable_pages(struct zone *zone, struct page *page)
> +{
> +	unsigned long pfn, iter, found;
> +	/*
> +	 * For avoiding noise data, lru_add_drain_all() should be called.
> + 	 * before this.
> + 	 */
> +	if (zone_idx(zone) == ZONE_MOVABLE)
> +		return 0;

Cannot ZONE_MOVABLE contain different MIGRATE_types?

> +
> +	pfn = page_to_pfn(page);
> +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> +		unsigned long check = pfn + iter;
> +
> +		if (!pfn_valid_within(check)) {
> +			iter++;
> +			continue;
> +		}
> +		page = pfn_to_page(check);
> +		if (!page_count(page)) {
> +			if (PageBuddy(page))

Why do you check page_count as well? PageBuddy has alwyas count==0,
right?

> +				iter += (1 << page_order(page)) - 1;
> +			continue;
> +		}
> +		if (!PageLRU(page))
> +			found++;
> +		/*
> +		 * If the page is not RAM, page_count()should be 0.
> +		 * we don't need more check. This is an _used_ not-movable page.
> +		 *
> +		 * The problematic thing here is PG_reserved pages. But if
> +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> +		 * But...is there PG_reserved && page_count(page)==0 page ?

Can we have PG_reserved && PG_lru? I also quite don't understand the
comment. At this place we are sure that the page is valid and neither
free nor LRU.

> +		 */
> +	}
> +	return found;
> +}
> +
> +bool is_pageblock_removable(struct page *page)
> +{
> +	struct zone *zone = page_zone(page);
> +	unsigned long flags;
> +	int num;
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +	num = __count_unmovable_pages(zone, page);
> +	spin_unlock_irqrestore(&zone->lock, flags);

Isn't this a problem? The function is triggered from userspace by sysfs
(0444 file) and holds the lock for pageblock_nr_pages. So someone can
simply read the file and block the zone->lock preventing/delaying
allocations for the rest of the system.

I think that the function should rather bail out as soon as possible.

[...]

>  	/* All pageblocks in the memory block are likely to be hot-removable */
> Index: mmotm-0827/include/linux/memory_hotplug.h
> ===================================================================
> --- mmotm-0827.orig/include/linux/memory_hotplug.h
> +++ mmotm-0827/include/linux/memory_hotplug.h
> @@ -76,6 +76,7 @@ extern int __add_pages(int nid, struct z
>  extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
>  	unsigned long nr_pages);
>  
> +extern bool is_pageblock_removable(struct page *page);
>  #ifdef CONFIG_NUMA
>  extern int memory_add_physaddr_to_nid(u64 start);
>  #else

Shouldn't this go rather under CONFIG_MEMORY_HOTREMOVE?

Thanks!
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  8:25                                   ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  8:25 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
[...]
> ---
>  include/linux/memory_hotplug.h |    1 
>  mm/memory_hotplug.c            |   15 --------
>  mm/page_alloc.c                |   76 ++++++++++++++++++++++++++++++-----------
>  3 files changed, 59 insertions(+), 33 deletions(-)
> 
> Index: mmotm-0827/mm/page_alloc.c
> ===================================================================
> --- mmotm-0827.orig/mm/page_alloc.c
> +++ mmotm-0827/mm/page_alloc.c
> @@ -5274,11 +5274,63 @@ void set_pageblock_flags_group(struct pa
>   * page allocater never alloc memory from ISOLATE block.
>   */
>  
> +static int __count_unmovable_pages(struct zone *zone, struct page *page)
> +{
> +	unsigned long pfn, iter, found;
> +	/*
> +	 * For avoiding noise data, lru_add_drain_all() should be called.
> + 	 * before this.
> + 	 */
> +	if (zone_idx(zone) == ZONE_MOVABLE)
> +		return 0;

Cannot ZONE_MOVABLE contain different MIGRATE_types?

> +
> +	pfn = page_to_pfn(page);
> +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> +		unsigned long check = pfn + iter;
> +
> +		if (!pfn_valid_within(check)) {
> +			iter++;
> +			continue;
> +		}
> +		page = pfn_to_page(check);
> +		if (!page_count(page)) {
> +			if (PageBuddy(page))

Why do you check page_count as well? PageBuddy has alwyas count==0,
right?

> +				iter += (1 << page_order(page)) - 1;
> +			continue;
> +		}
> +		if (!PageLRU(page))
> +			found++;
> +		/*
> +		 * If the page is not RAM, page_count()should be 0.
> +		 * we don't need more check. This is an _used_ not-movable page.
> +		 *
> +		 * The problematic thing here is PG_reserved pages. But if
> +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> +		 * But...is there PG_reserved && page_count(page)==0 page ?

Can we have PG_reserved && PG_lru? I also quite don't understand the
comment. At this place we are sure that the page is valid and neither
free nor LRU.

> +		 */
> +	}
> +	return found;
> +}
> +
> +bool is_pageblock_removable(struct page *page)
> +{
> +	struct zone *zone = page_zone(page);
> +	unsigned long flags;
> +	int num;
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +	num = __count_unmovable_pages(zone, page);
> +	spin_unlock_irqrestore(&zone->lock, flags);

Isn't this a problem? The function is triggered from userspace by sysfs
(0444 file) and holds the lock for pageblock_nr_pages. So someone can
simply read the file and block the zone->lock preventing/delaying
allocations for the rest of the system.

I think that the function should rather bail out as soon as possible.

[...]

>  	/* All pageblocks in the memory block are likely to be hot-removable */
> Index: mmotm-0827/include/linux/memory_hotplug.h
> ===================================================================
> --- mmotm-0827.orig/include/linux/memory_hotplug.h
> +++ mmotm-0827/include/linux/memory_hotplug.h
> @@ -76,6 +76,7 @@ extern int __add_pages(int nid, struct z
>  extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
>  	unsigned long nr_pages);
>  
> +extern bool is_pageblock_removable(struct page *page);
>  #ifdef CONFIG_NUMA
>  extern int memory_add_physaddr_to_nid(u64 start);
>  #else

Shouldn't this go rather under CONFIG_MEMORY_HOTREMOVE?

Thanks!
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  8:25                                   ` Michal Hocko
@ 2010-09-03  9:13                                     ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  9:13 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 10:25:58 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> [...]
> > ---
> >  include/linux/memory_hotplug.h |    1 
> >  mm/memory_hotplug.c            |   15 --------
> >  mm/page_alloc.c                |   76 ++++++++++++++++++++++++++++++-----------
> >  3 files changed, 59 insertions(+), 33 deletions(-)
> > 
> > Index: mmotm-0827/mm/page_alloc.c
> > ===================================================================
> > --- mmotm-0827.orig/mm/page_alloc.c
> > +++ mmotm-0827/mm/page_alloc.c
> > @@ -5274,11 +5274,63 @@ void set_pageblock_flags_group(struct pa
> >   * page allocater never alloc memory from ISOLATE block.
> >   */
> >  
> > +static int __count_unmovable_pages(struct zone *zone, struct page *page)
> > +{
> > +	unsigned long pfn, iter, found;
> > +	/*
> > +	 * For avoiding noise data, lru_add_drain_all() should be called.
> > + 	 * before this.
> > + 	 */
> > +	if (zone_idx(zone) == ZONE_MOVABLE)
> > +		return 0;
> 
> Cannot ZONE_MOVABLE contain different MIGRATE_types?
> 
never.

> > +
> > +	pfn = page_to_pfn(page);
> > +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> > +		unsigned long check = pfn + iter;
> > +
> > +		if (!pfn_valid_within(check)) {
> > +			iter++;
> > +			continue;
> > +		}
> > +		page = pfn_to_page(check);
> > +		if (!page_count(page)) {
> > +			if (PageBuddy(page))
> 
> Why do you check page_count as well? PageBuddy has alwyas count==0,
> right?
> 

But PageBuddy() flag is considered to be valid only when page_count()==0.
This is for safe handling.


> > +				iter += (1 << page_order(page)) - 1;
> > +			continue;
> > +		}
> > +		if (!PageLRU(page))
> > +			found++;
> > +		/*
> > +		 * If the page is not RAM, page_count()should be 0.
> > +		 * we don't need more check. This is an _used_ not-movable page.
> > +		 *
> > +		 * The problematic thing here is PG_reserved pages. But if
> > +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> > +		 * But...is there PG_reserved && page_count(page)==0 page ?
> 
> Can we have PG_reserved && PG_lru? 

I think never.

> I also quite don't understand the comment. 

There an issue that "remove an memory section which includes memory hole".
Then,

   a page used by bootmem .... PG_reserved.
   a page of memory hole  .... PG_reserved.

We need to call page_is_ram() or some for handling this mess.


> At this place we are sure that the page is valid and neither
> free nor LRU.
> 


> > +		 */
> > +	}
> > +	return found;
> > +}
> > +
> > +bool is_pageblock_removable(struct page *page)
> > +{
> > +	struct zone *zone = page_zone(page);
> > +	unsigned long flags;
> > +	int num;
> > +
> > +	spin_lock_irqsave(&zone->lock, flags);
> > +	num = __count_unmovable_pages(zone, page);
> > +	spin_unlock_irqrestore(&zone->lock, flags);
> 
> Isn't this a problem? The function is triggered from userspace by sysfs
> (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> simply read the file and block the zone->lock preventing/delaying
> allocations for the rest of the system.
> 
But we need to take this. Maybe no panic you'll see even if no-lock.

> I think that the function should rather bail out as soon as possible.
> 

I did this for 100% accuracy, but ok, will remove this lock and see what happens.

> [...]
> 
> >  	/* All pageblocks in the memory block are likely to be hot-removable */
> > Index: mmotm-0827/include/linux/memory_hotplug.h
> > ===================================================================
> > --- mmotm-0827.orig/include/linux/memory_hotplug.h
> > +++ mmotm-0827/include/linux/memory_hotplug.h
> > @@ -76,6 +76,7 @@ extern int __add_pages(int nid, struct z
> >  extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
> >  	unsigned long nr_pages);
> >  
> > +extern bool is_pageblock_removable(struct page *page);
> >  #ifdef CONFIG_NUMA
> >  extern int memory_add_physaddr_to_nid(u64 start);
> >  #else
> 
> Shouldn't this go rather under CONFIG_MEMORY_HOTREMOVE?
> 

Hmm. maybe. will post udpated one.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  9:13                                     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  9:13 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 10:25:58 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> [...]
> > ---
> >  include/linux/memory_hotplug.h |    1 
> >  mm/memory_hotplug.c            |   15 --------
> >  mm/page_alloc.c                |   76 ++++++++++++++++++++++++++++++-----------
> >  3 files changed, 59 insertions(+), 33 deletions(-)
> > 
> > Index: mmotm-0827/mm/page_alloc.c
> > ===================================================================
> > --- mmotm-0827.orig/mm/page_alloc.c
> > +++ mmotm-0827/mm/page_alloc.c
> > @@ -5274,11 +5274,63 @@ void set_pageblock_flags_group(struct pa
> >   * page allocater never alloc memory from ISOLATE block.
> >   */
> >  
> > +static int __count_unmovable_pages(struct zone *zone, struct page *page)
> > +{
> > +	unsigned long pfn, iter, found;
> > +	/*
> > +	 * For avoiding noise data, lru_add_drain_all() should be called.
> > + 	 * before this.
> > + 	 */
> > +	if (zone_idx(zone) == ZONE_MOVABLE)
> > +		return 0;
> 
> Cannot ZONE_MOVABLE contain different MIGRATE_types?
> 
never.

> > +
> > +	pfn = page_to_pfn(page);
> > +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> > +		unsigned long check = pfn + iter;
> > +
> > +		if (!pfn_valid_within(check)) {
> > +			iter++;
> > +			continue;
> > +		}
> > +		page = pfn_to_page(check);
> > +		if (!page_count(page)) {
> > +			if (PageBuddy(page))
> 
> Why do you check page_count as well? PageBuddy has alwyas count==0,
> right?
> 

But PageBuddy() flag is considered to be valid only when page_count()==0.
This is for safe handling.


> > +				iter += (1 << page_order(page)) - 1;
> > +			continue;
> > +		}
> > +		if (!PageLRU(page))
> > +			found++;
> > +		/*
> > +		 * If the page is not RAM, page_count()should be 0.
> > +		 * we don't need more check. This is an _used_ not-movable page.
> > +		 *
> > +		 * The problematic thing here is PG_reserved pages. But if
> > +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> > +		 * But...is there PG_reserved && page_count(page)==0 page ?
> 
> Can we have PG_reserved && PG_lru? 

I think never.

> I also quite don't understand the comment. 

There an issue that "remove an memory section which includes memory hole".
Then,

   a page used by bootmem .... PG_reserved.
   a page of memory hole  .... PG_reserved.

We need to call page_is_ram() or some for handling this mess.


> At this place we are sure that the page is valid and neither
> free nor LRU.
> 


> > +		 */
> > +	}
> > +	return found;
> > +}
> > +
> > +bool is_pageblock_removable(struct page *page)
> > +{
> > +	struct zone *zone = page_zone(page);
> > +	unsigned long flags;
> > +	int num;
> > +
> > +	spin_lock_irqsave(&zone->lock, flags);
> > +	num = __count_unmovable_pages(zone, page);
> > +	spin_unlock_irqrestore(&zone->lock, flags);
> 
> Isn't this a problem? The function is triggered from userspace by sysfs
> (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> simply read the file and block the zone->lock preventing/delaying
> allocations for the rest of the system.
> 
But we need to take this. Maybe no panic you'll see even if no-lock.

> I think that the function should rather bail out as soon as possible.
> 

I did this for 100% accuracy, but ok, will remove this lock and see what happens.

> [...]
> 
> >  	/* All pageblocks in the memory block are likely to be hot-removable */
> > Index: mmotm-0827/include/linux/memory_hotplug.h
> > ===================================================================
> > --- mmotm-0827.orig/include/linux/memory_hotplug.h
> > +++ mmotm-0827/include/linux/memory_hotplug.h
> > @@ -76,6 +76,7 @@ extern int __add_pages(int nid, struct z
> >  extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
> >  	unsigned long nr_pages);
> >  
> > +extern bool is_pageblock_removable(struct page *page);
> >  #ifdef CONFIG_NUMA
> >  extern int memory_add_physaddr_to_nid(u64 start);
> >  #else
> 
> Shouldn't this go rather under CONFIG_MEMORY_HOTREMOVE?
> 

Hmm. maybe. will post udpated one.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  8:25                                   ` Michal Hocko
@ 2010-09-03  9:15                                     ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  9:15 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

Just in case that my old (buggy) approach still matters, here is the
updated (and hopefully fixed) patch.

---

>From 5e0436a854d7103eba53f173f6839032a2f43c21 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 20 Aug 2010 15:39:16 +0200
Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code

Currently is_mem_section_removable checks whether each pageblock from
the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
are false then the range is considered non removable.

On the other hand, offlining code (more specifically
set_migratetype_isolate) doesn't care whether a page is free and instead
it just checks the migrate type of the page and whether the page's zone
is movable.

This can lead into a situation when we can mark a node as not removable
just because a pageblock is MIGRATE_RESERVE and it is not free but still
movable.

Let's make a common helper is_page_removable which unifies both tests
at one place.

Do not rely on any of MIGRATE_* types as all others but MIGRATE_MOVABLE
may be tricky. MIGRATE_RESERVE can be anything that just happened to
fallback to that allocation. MIGRATE_RECLAIMABLE can be unmovable
because slab (or what ever) has this page currently in use and cannot
release it.  If we tried to remove those pages and the isolation failed
then those blocks would get into the MIRAGTE_MOVABLE list
unconditionally and we will end up having unmovable pages in the movable
list.

Let's, instead, check just whether a pageblock contains only free or LRU
pages.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/memory_hotplug.h |    4 +++
 mm/memory_hotplug.c            |   42 +++++++++++++++++++++++++++++++++------
 mm/page_alloc.c                |    5 +---
 3 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 864035f..5c448f7 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -194,12 +194,16 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 
 extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 
+bool is_page_removable(struct page *page);
+
 #else
 static inline int is_mem_section_removable(unsigned long pfn,
 					unsigned long nr_pages)
 {
 	return 0;
 }
+
+#define is_page_removable(page) 0
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern int mem_online_node(int nid);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..ccd927d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -581,6 +581,39 @@ static inline int pageblock_free(struct page *page)
 	return PageBuddy(page) && page_order(page) >= pageblock_order;
 }
 
+/*
+ * A free or LRU pages block are removable
+ * Do not use MIGRATE_MOVABLE because it can be insufficient and
+ * other MIGRATE types are tricky.
+ * Do not hold zone->lock as this is used from user space by the
+ * sysfs interface.
+ */
+bool is_page_removable(struct page *page)
+{
+	int page_block = 1 << pageblock_order;
+	while (page_block > 0) {
+		int order = 0;
+
+		if (pfn_valid_within(page_to_pfn(page))) {
+			if (PageBuddy(page)) {
+				order = page_order(page);
+			} else if (!PageLRU(page))
+				return false;
+		}
+
+		/* We are not holding zone lock so the page
+		 * might get used since we tested it for buddy
+		 * flag. This is just a informative check so
+		 * live with that and rely that we catch this
+		 * in the page_block test.
+		 */
+		page_block -= 1 << order;
+		page += 1 << order;
+	}
+
+	return true;
+}
+
 /* Return the start of the next active pageblock after a given page */
 static struct page *next_active_pageblock(struct page *page)
 {
@@ -608,13 +641,7 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_page_removable(page))
 			return 0;
 
 		/*
@@ -770,6 +797,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
 	return ret;
 }
 
+
 static long
 check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a9649f4..c2e2576 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
 	struct memory_isolate_notify arg;
 	int notifier_ret;
 	int ret = -EBUSY;
-	int zone_idx;
 
 	zone = page_zone(page);
-	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
+	if (is_page_removable(page)) {
 		ret = 0;
 		goto out;
 	}
-- 
1.7.1

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  9:15                                     ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  9:15 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

Just in case that my old (buggy) approach still matters, here is the
updated (and hopefully fixed) patch.

---

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  9:15                                     ` Michal Hocko
@ 2010-09-03  9:24                                       ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  9:24 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 11:15:09 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> Just in case that my old (buggy) approach still matters, here is the
> updated (and hopefully fixed) patch.
> 

as my patch shows, you can drop more codes from set_pagetype_migrate().
is_removable() and that should use the same logic.
...

And there are some bugs should be fixed as I shown.

If you get ack from someone, please go ahead. I'll make add-on.

Thanks,
-Kame
> ---
> 
> From 5e0436a854d7103eba53f173f6839032a2f43c21 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> 
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
> 
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
> 
> This can lead into a situation when we can mark a node as not removable
> just because a pageblock is MIGRATE_RESERVE and it is not free but still
> movable.
> 
> Let's make a common helper is_page_removable which unifies both tests
> at one place.
> 
> Do not rely on any of MIGRATE_* types as all others but MIGRATE_MOVABLE
> may be tricky. MIGRATE_RESERVE can be anything that just happened to
> fallback to that allocation. MIGRATE_RECLAIMABLE can be unmovable
> because slab (or what ever) has this page currently in use and cannot
> release it.  If we tried to remove those pages and the isolation failed
> then those blocks would get into the MIRAGTE_MOVABLE list
> unconditionally and we will end up having unmovable pages in the movable
> list.
> 
> Let's, instead, check just whether a pageblock contains only free or LRU
> pages.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  include/linux/memory_hotplug.h |    4 +++
>  mm/memory_hotplug.c            |   42 +++++++++++++++++++++++++++++++++------
>  mm/page_alloc.c                |    5 +---
>  3 files changed, 40 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 864035f..5c448f7 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -194,12 +194,16 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
>  
>  extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
>  
> +bool is_page_removable(struct page *page);
> +
>  #else
>  static inline int is_mem_section_removable(unsigned long pfn,
>  					unsigned long nr_pages)
>  {
>  	return 0;
>  }
> +
> +#define is_page_removable(page) 0
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>  
>  extern int mem_online_node(int nid);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a4cfcdc..ccd927d 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -581,6 +581,39 @@ static inline int pageblock_free(struct page *page)
>  	return PageBuddy(page) && page_order(page) >= pageblock_order;
>  }
>  
> +/*
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
> + * Do not hold zone->lock as this is used from user space by the
> + * sysfs interface.
> + */
> +bool is_page_removable(struct page *page)
> +{
> +	int page_block = 1 << pageblock_order;
> +	while (page_block > 0) {
> +		int order = 0;
> +
> +		if (pfn_valid_within(page_to_pfn(page))) {
> +			if (PageBuddy(page)) {
> +				order = page_order(page);
> +			} else if (!PageLRU(page))
> +				return false;
> +		}
> +
> +		/* We are not holding zone lock so the page
> +		 * might get used since we tested it for buddy
> +		 * flag. This is just a informative check so
> +		 * live with that and rely that we catch this
> +		 * in the page_block test.
> +		 */
> +		page_block -= 1 << order;
> +		page += 1 << order;
> +	}
> +
> +	return true;
> +}
> +
>  /* Return the start of the next active pageblock after a given page */
>  static struct page *next_active_pageblock(struct page *page)
>  {
> @@ -608,13 +641,7 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
>  
>  	/* Check the starting page of each pageblock within the range */
>  	for (; page < end_page; page = next_active_pageblock(page)) {
> -		type = get_pageblock_migratetype(page);
> -
> -		/*
> -		 * A pageblock containing MOVABLE or free pages is considered
> -		 * removable
> -		 */
> -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> +		if (!is_page_removable(page))
>  			return 0;
>  
>  		/*
> @@ -770,6 +797,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
>  	return ret;
>  }
>  
> +
>  static long
>  check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
>  {
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a9649f4..c2e2576 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
>  	struct memory_isolate_notify arg;
>  	int notifier_ret;
>  	int ret = -EBUSY;
> -	int zone_idx;
>  
>  	zone = page_zone(page);
> -	zone_idx = zone_idx(zone);
>  
>  	spin_lock_irqsave(&zone->lock, flags);
> -	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
> -	    zone_idx == ZONE_MOVABLE) {
> +	if (is_page_removable(page)) {
>  		ret = 0;
>  		goto out;
>  	}
> -- 
> 1.7.1
> 
> -- 
> Michal Hocko
> L3 team 
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9    
> Czech Republic
> 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  9:24                                       ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03  9:24 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 11:15:09 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> Just in case that my old (buggy) approach still matters, here is the
> updated (and hopefully fixed) patch.
> 

as my patch shows, you can drop more codes from set_pagetype_migrate().
is_removable() and that should use the same logic.
...

And there are some bugs should be fixed as I shown.

If you get ack from someone, please go ahead. I'll make add-on.

Thanks,
-Kame
> ---
> 
> From 5e0436a854d7103eba53f173f6839032a2f43c21 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 20 Aug 2010 15:39:16 +0200
> Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code
> 
> Currently is_mem_section_removable checks whether each pageblock from
> the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
> are false then the range is considered non removable.
> 
> On the other hand, offlining code (more specifically
> set_migratetype_isolate) doesn't care whether a page is free and instead
> it just checks the migrate type of the page and whether the page's zone
> is movable.
> 
> This can lead into a situation when we can mark a node as not removable
> just because a pageblock is MIGRATE_RESERVE and it is not free but still
> movable.
> 
> Let's make a common helper is_page_removable which unifies both tests
> at one place.
> 
> Do not rely on any of MIGRATE_* types as all others but MIGRATE_MOVABLE
> may be tricky. MIGRATE_RESERVE can be anything that just happened to
> fallback to that allocation. MIGRATE_RECLAIMABLE can be unmovable
> because slab (or what ever) has this page currently in use and cannot
> release it.  If we tried to remove those pages and the isolation failed
> then those blocks would get into the MIRAGTE_MOVABLE list
> unconditionally and we will end up having unmovable pages in the movable
> list.
> 
> Let's, instead, check just whether a pageblock contains only free or LRU
> pages.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  include/linux/memory_hotplug.h |    4 +++
>  mm/memory_hotplug.c            |   42 +++++++++++++++++++++++++++++++++------
>  mm/page_alloc.c                |    5 +---
>  3 files changed, 40 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 864035f..5c448f7 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -194,12 +194,16 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
>  
>  extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
>  
> +bool is_page_removable(struct page *page);
> +
>  #else
>  static inline int is_mem_section_removable(unsigned long pfn,
>  					unsigned long nr_pages)
>  {
>  	return 0;
>  }
> +
> +#define is_page_removable(page) 0
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>  
>  extern int mem_online_node(int nid);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index a4cfcdc..ccd927d 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -581,6 +581,39 @@ static inline int pageblock_free(struct page *page)
>  	return PageBuddy(page) && page_order(page) >= pageblock_order;
>  }
>  
> +/*
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
> + * Do not hold zone->lock as this is used from user space by the
> + * sysfs interface.
> + */
> +bool is_page_removable(struct page *page)
> +{
> +	int page_block = 1 << pageblock_order;
> +	while (page_block > 0) {
> +		int order = 0;
> +
> +		if (pfn_valid_within(page_to_pfn(page))) {
> +			if (PageBuddy(page)) {
> +				order = page_order(page);
> +			} else if (!PageLRU(page))
> +				return false;
> +		}
> +
> +		/* We are not holding zone lock so the page
> +		 * might get used since we tested it for buddy
> +		 * flag. This is just a informative check so
> +		 * live with that and rely that we catch this
> +		 * in the page_block test.
> +		 */
> +		page_block -= 1 << order;
> +		page += 1 << order;
> +	}
> +
> +	return true;
> +}
> +
>  /* Return the start of the next active pageblock after a given page */
>  static struct page *next_active_pageblock(struct page *page)
>  {
> @@ -608,13 +641,7 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
>  
>  	/* Check the starting page of each pageblock within the range */
>  	for (; page < end_page; page = next_active_pageblock(page)) {
> -		type = get_pageblock_migratetype(page);
> -
> -		/*
> -		 * A pageblock containing MOVABLE or free pages is considered
> -		 * removable
> -		 */
> -		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
> +		if (!is_page_removable(page))
>  			return 0;
>  
>  		/*
> @@ -770,6 +797,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
>  	return ret;
>  }
>  
> +
>  static long
>  check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
>  {
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a9649f4..c2e2576 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
>  	struct memory_isolate_notify arg;
>  	int notifier_ret;
>  	int ret = -EBUSY;
> -	int zone_idx;
>  
>  	zone = page_zone(page);
> -	zone_idx = zone_idx(zone);
>  
>  	spin_lock_irqsave(&zone->lock, flags);
> -	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
> -	    zone_idx == ZONE_MOVABLE) {
> +	if (is_page_removable(page)) {
>  		ret = 0;
>  		goto out;
>  	}
> -- 
> 1.7.1
> 
> -- 
> Michal Hocko
> L3 team 
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9    
> Czech Republic
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  9:13                                     ` KAMEZAWA Hiroyuki
@ 2010-09-03  9:50                                       ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  9:50 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 18:13:27, KAMEZAWA Hiroyuki wrote:
> On Fri, 3 Sep 2010 10:25:58 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> > On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> > [...]
[...]
> > Cannot ZONE_MOVABLE contain different MIGRATE_types?
> > 
> never.

Then I am terribly missing something. Zone contains free lists for
different MIGRATE_TYPES, doesn't it? Pages allocated from those free
lists keep the migration type of the list, right?

ZONE_MOVABLE just says whether it makes sense to move pages in that zone
at all, right?

> 
> > > +
> > > +	pfn = page_to_pfn(page);
> > > +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> > > +		unsigned long check = pfn + iter;
> > > +
> > > +		if (!pfn_valid_within(check)) {
> > > +			iter++;
> > > +			continue;
> > > +		}
> > > +		page = pfn_to_page(check);
> > > +		if (!page_count(page)) {
> > > +			if (PageBuddy(page))
> > 
> > Why do you check page_count as well? PageBuddy has alwyas count==0,
> > right?
> > 
> 
> But PageBuddy() flag is considered to be valid only when page_count()==0.
> This is for safe handling.

OK. I don't see that documented anywhere but it makes sense. Anyway
there are some places which don't do this test (e.g.
isolate_freepages_block, suitable_migration_target, etc.).

> 
> 
> > > +				iter += (1 << page_order(page)) - 1;
> > > +			continue;
> > > +		}
> > > +		if (!PageLRU(page))
> > > +			found++;
> > > +		/*
> > > +		 * If the page is not RAM, page_count()should be 0.
> > > +		 * we don't need more check. This is an _used_ not-movable page.
> > > +		 *
> > > +		 * The problematic thing here is PG_reserved pages. But if
> > > +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> > > +		 * But...is there PG_reserved && page_count(page)==0 page ?
> > 
> > Can we have PG_reserved && PG_lru? 
> 
> I think never.
> 
> > I also quite don't understand the comment. 
> 
> There an issue that "remove an memory section which includes memory hole".
> Then,
> 
>    a page used by bootmem .... PG_reserved.
>    a page of memory hole  .... PG_reserved.
> 
> We need to call page_is_ram() or some for handling this mess.

OK, I see.

> 
> 
> > At this place we are sure that the page is valid and neither
> > free nor LRU.
> > 
[...]
> > > +bool is_pageblock_removable(struct page *page)
> > > +{
> > > +	struct zone *zone = page_zone(page);
> > > +	unsigned long flags;
> > > +	int num;
> > > +
> > > +	spin_lock_irqsave(&zone->lock, flags);
> > > +	num = __count_unmovable_pages(zone, page);
> > > +	spin_unlock_irqrestore(&zone->lock, flags);
> > 
> > Isn't this a problem? The function is triggered from userspace by sysfs
> > (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> > simply read the file and block the zone->lock preventing/delaying
> > allocations for the rest of the system.
> > 
> But we need to take this. Maybe no panic you'll see even if no-lock.

Yes, I think that this can only lead to a false possitive in sysfs
interface. Isolating code holds the lock.

Thanks
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03  9:50                                       ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03  9:50 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 18:13:27, KAMEZAWA Hiroyuki wrote:
> On Fri, 3 Sep 2010 10:25:58 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> > On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> > [...]
[...]
> > Cannot ZONE_MOVABLE contain different MIGRATE_types?
> > 
> never.

Then I am terribly missing something. Zone contains free lists for
different MIGRATE_TYPES, doesn't it? Pages allocated from those free
lists keep the migration type of the list, right?

ZONE_MOVABLE just says whether it makes sense to move pages in that zone
at all, right?

> 
> > > +
> > > +	pfn = page_to_pfn(page);
> > > +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> > > +		unsigned long check = pfn + iter;
> > > +
> > > +		if (!pfn_valid_within(check)) {
> > > +			iter++;
> > > +			continue;
> > > +		}
> > > +		page = pfn_to_page(check);
> > > +		if (!page_count(page)) {
> > > +			if (PageBuddy(page))
> > 
> > Why do you check page_count as well? PageBuddy has alwyas count==0,
> > right?
> > 
> 
> But PageBuddy() flag is considered to be valid only when page_count()==0.
> This is for safe handling.

OK. I don't see that documented anywhere but it makes sense. Anyway
there are some places which don't do this test (e.g.
isolate_freepages_block, suitable_migration_target, etc.).

> 
> 
> > > +				iter += (1 << page_order(page)) - 1;
> > > +			continue;
> > > +		}
> > > +		if (!PageLRU(page))
> > > +			found++;
> > > +		/*
> > > +		 * If the page is not RAM, page_count()should be 0.
> > > +		 * we don't need more check. This is an _used_ not-movable page.
> > > +		 *
> > > +		 * The problematic thing here is PG_reserved pages. But if
> > > +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> > > +		 * But...is there PG_reserved && page_count(page)==0 page ?
> > 
> > Can we have PG_reserved && PG_lru? 
> 
> I think never.
> 
> > I also quite don't understand the comment. 
> 
> There an issue that "remove an memory section which includes memory hole".
> Then,
> 
>    a page used by bootmem .... PG_reserved.
>    a page of memory hole  .... PG_reserved.
> 
> We need to call page_is_ram() or some for handling this mess.

OK, I see.

> 
> 
> > At this place we are sure that the page is valid and neither
> > free nor LRU.
> > 
[...]
> > > +bool is_pageblock_removable(struct page *page)
> > > +{
> > > +	struct zone *zone = page_zone(page);
> > > +	unsigned long flags;
> > > +	int num;
> > > +
> > > +	spin_lock_irqsave(&zone->lock, flags);
> > > +	num = __count_unmovable_pages(zone, page);
> > > +	spin_unlock_irqrestore(&zone->lock, flags);
> > 
> > Isn't this a problem? The function is triggered from userspace by sysfs
> > (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> > simply read the file and block the zone->lock preventing/delaying
> > allocations for the rest of the system.
> > 
> But we need to take this. Maybe no panic you'll see even if no-lock.

Yes, I think that this can only lead to a false possitive in sysfs
interface. Isolating code holds the lock.

Thanks
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03  9:50                                       ` Michal Hocko
@ 2010-09-03 10:05                                         ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03 10:05 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 11:50:49 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Fri 03-09-10 18:13:27, KAMEZAWA Hiroyuki wrote:
> > On Fri, 3 Sep 2010 10:25:58 +0200
> > Michal Hocko <mhocko@suse.cz> wrote:
> > 
> > > On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> > > [...]
> [...]
> > > Cannot ZONE_MOVABLE contain different MIGRATE_types?
> > > 
> > never.
> 
> Then I am terribly missing something. Zone contains free lists for
> different MIGRATE_TYPES, doesn't it? Pages allocated from those free
> lists keep the migration type of the list, right?
> 
> ZONE_MOVABLE just says whether it makes sense to move pages in that zone
> at all, right?
> 

 ZONE_MOVABLE means "it only contains MOVABLE pages."
 So, we can ignore migrate-type.





> > 
> > > > +
> > > > +	pfn = page_to_pfn(page);
> > > > +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> > > > +		unsigned long check = pfn + iter;
> > > > +
> > > > +		if (!pfn_valid_within(check)) {
> > > > +			iter++;
> > > > +			continue;
> > > > +		}
> > > > +		page = pfn_to_page(check);
> > > > +		if (!page_count(page)) {
> > > > +			if (PageBuddy(page))
> > > 
> > > Why do you check page_count as well? PageBuddy has alwyas count==0,
> > > right?
> > > 
> > 
> > But PageBuddy() flag is considered to be valid only when page_count()==0.
> > This is for safe handling.
> 
> OK. I don't see that documented anywhere but it makes sense. Anyway
> there are some places which don't do this test (e.g.
> isolate_freepages_block, suitable_migration_target, etc.).
> 
> > 
> > 
> > > > +				iter += (1 << page_order(page)) - 1;
> > > > +			continue;
> > > > +		}
> > > > +		if (!PageLRU(page))
> > > > +			found++;
> > > > +		/*
> > > > +		 * If the page is not RAM, page_count()should be 0.
> > > > +		 * we don't need more check. This is an _used_ not-movable page.
> > > > +		 *
> > > > +		 * The problematic thing here is PG_reserved pages. But if
> > > > +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> > > > +		 * But...is there PG_reserved && page_count(page)==0 page ?
> > > 
> > > Can we have PG_reserved && PG_lru? 
> > 
> > I think never.
> > 
> > > I also quite don't understand the comment. 
> > 
> > There an issue that "remove an memory section which includes memory hole".
> > Then,
> > 
> >    a page used by bootmem .... PG_reserved.
> >    a page of memory hole  .... PG_reserved.
> > 
> > We need to call page_is_ram() or some for handling this mess.
> 
> OK, I see.
> 
> > 
> > 
> > > At this place we are sure that the page is valid and neither
> > > free nor LRU.
> > > 
> [...]
> > > > +bool is_pageblock_removable(struct page *page)
> > > > +{
> > > > +	struct zone *zone = page_zone(page);
> > > > +	unsigned long flags;
> > > > +	int num;
> > > > +
> > > > +	spin_lock_irqsave(&zone->lock, flags);
> > > > +	num = __count_unmovable_pages(zone, page);
> > > > +	spin_unlock_irqrestore(&zone->lock, flags);
> > > 
> > > Isn't this a problem? The function is triggered from userspace by sysfs
> > > (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> > > simply read the file and block the zone->lock preventing/delaying
> > > allocations for the rest of the system.
> > > 
> > But we need to take this. Maybe no panic you'll see even if no-lock.
> 
> Yes, I think that this can only lead to a false possitive in sysfs
> interface. Isolating code holds the lock.
> 

ok, let's go step by step.

I'm ok that your new patch to be merged. I'll post some clean up and small
bugfix (not related to your patch), later.
(I'll be very busy in this weekend, sorry.)


Thanks,
-Kame


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03 10:05                                         ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 72+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-03 10:05 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 11:50:49 +0200
Michal Hocko <mhocko@suse.cz> wrote:

> On Fri 03-09-10 18:13:27, KAMEZAWA Hiroyuki wrote:
> > On Fri, 3 Sep 2010 10:25:58 +0200
> > Michal Hocko <mhocko@suse.cz> wrote:
> > 
> > > On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> > > [...]
> [...]
> > > Cannot ZONE_MOVABLE contain different MIGRATE_types?
> > > 
> > never.
> 
> Then I am terribly missing something. Zone contains free lists for
> different MIGRATE_TYPES, doesn't it? Pages allocated from those free
> lists keep the migration type of the list, right?
> 
> ZONE_MOVABLE just says whether it makes sense to move pages in that zone
> at all, right?
> 

 ZONE_MOVABLE means "it only contains MOVABLE pages."
 So, we can ignore migrate-type.





> > 
> > > > +
> > > > +	pfn = page_to_pfn(page);
> > > > +	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> > > > +		unsigned long check = pfn + iter;
> > > > +
> > > > +		if (!pfn_valid_within(check)) {
> > > > +			iter++;
> > > > +			continue;
> > > > +		}
> > > > +		page = pfn_to_page(check);
> > > > +		if (!page_count(page)) {
> > > > +			if (PageBuddy(page))
> > > 
> > > Why do you check page_count as well? PageBuddy has alwyas count==0,
> > > right?
> > > 
> > 
> > But PageBuddy() flag is considered to be valid only when page_count()==0.
> > This is for safe handling.
> 
> OK. I don't see that documented anywhere but it makes sense. Anyway
> there are some places which don't do this test (e.g.
> isolate_freepages_block, suitable_migration_target, etc.).
> 
> > 
> > 
> > > > +				iter += (1 << page_order(page)) - 1;
> > > > +			continue;
> > > > +		}
> > > > +		if (!PageLRU(page))
> > > > +			found++;
> > > > +		/*
> > > > +		 * If the page is not RAM, page_count()should be 0.
> > > > +		 * we don't need more check. This is an _used_ not-movable page.
> > > > +		 *
> > > > +		 * The problematic thing here is PG_reserved pages. But if
> > > > +		 * a PG_reserved page is _used_ (at boot), page_count > 1.
> > > > +		 * But...is there PG_reserved && page_count(page)==0 page ?
> > > 
> > > Can we have PG_reserved && PG_lru? 
> > 
> > I think never.
> > 
> > > I also quite don't understand the comment. 
> > 
> > There an issue that "remove an memory section which includes memory hole".
> > Then,
> > 
> >    a page used by bootmem .... PG_reserved.
> >    a page of memory hole  .... PG_reserved.
> > 
> > We need to call page_is_ram() or some for handling this mess.
> 
> OK, I see.
> 
> > 
> > 
> > > At this place we are sure that the page is valid and neither
> > > free nor LRU.
> > > 
> [...]
> > > > +bool is_pageblock_removable(struct page *page)
> > > > +{
> > > > +	struct zone *zone = page_zone(page);
> > > > +	unsigned long flags;
> > > > +	int num;
> > > > +
> > > > +	spin_lock_irqsave(&zone->lock, flags);
> > > > +	num = __count_unmovable_pages(zone, page);
> > > > +	spin_unlock_irqrestore(&zone->lock, flags);
> > > 
> > > Isn't this a problem? The function is triggered from userspace by sysfs
> > > (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> > > simply read the file and block the zone->lock preventing/delaying
> > > allocations for the rest of the system.
> > > 
> > But we need to take this. Maybe no panic you'll see even if no-lock.
> 
> Yes, I think that this can only lead to a false possitive in sysfs
> interface. Isolating code holds the lock.
> 

ok, let's go step by step.

I'm ok that your new patch to be merged. I'll post some clean up and small
bugfix (not related to your patch), later.
(I'll be very busy in this weekend, sorry.)


Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
  2010-09-03 10:05                                         ` KAMEZAWA Hiroyuki
@ 2010-09-03 11:01                                           ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03 11:01 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 19:05:20, KAMEZAWA Hiroyuki wrote:
> On Fri, 3 Sep 2010 11:50:49 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> > On Fri 03-09-10 18:13:27, KAMEZAWA Hiroyuki wrote:
> > > On Fri, 3 Sep 2010 10:25:58 +0200
> > > Michal Hocko <mhocko@suse.cz> wrote:
> > > 
> > > > On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> > > > [...]
> > [...]
> > > > Cannot ZONE_MOVABLE contain different MIGRATE_types?
> > > > 
> > > never.
> > 
> > Then I am terribly missing something. Zone contains free lists for
> > different MIGRATE_TYPES, doesn't it? Pages allocated from those free
> > lists keep the migration type of the list, right?
> > 
> > ZONE_MOVABLE just says whether it makes sense to move pages in that zone
> > at all, right?
> > 
> 
>  ZONE_MOVABLE means "it only contains MOVABLE pages."
>  So, we can ignore migrate-type.

Ahh. OK, it seems that I have to look over movable related code once
again. Maybe I just got confused from my experience when ZONE_MOVABLE
zone contained MIGRATE_RESERVE pages but now I can see that this is OK
and doesn't have anything to do with other migrate types.

Anyway, thank you for clarification.

[...]

> > > > Isn't this a problem? The function is triggered from userspace by sysfs
> > > > (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> > > > simply read the file and block the zone->lock preventing/delaying
> > > > allocations for the rest of the system.
> > > > 
> > > But we need to take this. Maybe no panic you'll see even if no-lock.
> > 
> > Yes, I think that this can only lead to a false possitive in sysfs
> > interface. Isolating code holds the lock.
> > 
> 
> ok, let's go step by step.
> 
> I'm ok that your new patch to be merged. I'll post some clean up and small
> bugfix (not related to your patch), later.
> (I'll be very busy in this weekend, sorry.)

OK, no problem, we are not in hurry ;)

> 
> 
> Thanks,
> -Kame
> 

Thanks!

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code
@ 2010-09-03 11:01                                           ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03 11:01 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri 03-09-10 19:05:20, KAMEZAWA Hiroyuki wrote:
> On Fri, 3 Sep 2010 11:50:49 +0200
> Michal Hocko <mhocko@suse.cz> wrote:
> 
> > On Fri 03-09-10 18:13:27, KAMEZAWA Hiroyuki wrote:
> > > On Fri, 3 Sep 2010 10:25:58 +0200
> > > Michal Hocko <mhocko@suse.cz> wrote:
> > > 
> > > > On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> > > > [...]
> > [...]
> > > > Cannot ZONE_MOVABLE contain different MIGRATE_types?
> > > > 
> > > never.
> > 
> > Then I am terribly missing something. Zone contains free lists for
> > different MIGRATE_TYPES, doesn't it? Pages allocated from those free
> > lists keep the migration type of the list, right?
> > 
> > ZONE_MOVABLE just says whether it makes sense to move pages in that zone
> > at all, right?
> > 
> 
>  ZONE_MOVABLE means "it only contains MOVABLE pages."
>  So, we can ignore migrate-type.

Ahh. OK, it seems that I have to look over movable related code once
again. Maybe I just got confused from my experience when ZONE_MOVABLE
zone contained MIGRATE_RESERVE pages but now I can see that this is OK
and doesn't have anything to do with other migrate types.

Anyway, thank you for clarification.

[...]

> > > > Isn't this a problem? The function is triggered from userspace by sysfs
> > > > (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> > > > simply read the file and block the zone->lock preventing/delaying
> > > > allocations for the rest of the system.
> > > > 
> > > But we need to take this. Maybe no panic you'll see even if no-lock.
> > 
> > Yes, I think that this can only lead to a false possitive in sysfs
> > interface. Isolating code holds the lock.
> > 
> 
> ok, let's go step by step.
> 
> I'm ok that your new patch to be merged. I'll post some clean up and small
> bugfix (not related to your patch), later.
> (I'll be very busy in this weekend, sorry.)

OK, no problem, we are not in hurry ;)

> 
> 
> Thanks,
> -Kame
> 

Thanks!

-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code v3
  2010-09-03 10:05                                         ` KAMEZAWA Hiroyuki
@ 2010-09-03 11:42                                           ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03 11:42 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

Here is the updated version of my original patch based on the
KAMEZAWA Hiroyuki feedback.

What do other people think about that?

On Fri 03-09-10 19:05:20, KAMEZAWA Hiroyuki wrote:
[...]
> ok, let's go step by step.
> 
> I'm ok that your new patch to be merged. I'll post some clean up and small
> bugfix (not related to your patch), later.
> (I'll be very busy in this weekend, sorry.)

---

>From 1080064212f2a4efce9b15e0bfc5471c2d68e475 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Fri, 20 Aug 2010 15:39:16 +0200
Subject: [PATCH] Make is_mem_section_removable more conformable with offlining code

Currently is_mem_section_removable checks whether each pageblock from
the given pfn range is of MIGRATE_MOVABLE type or if it is free. If both
are false then the range is considered non removable.

On the other hand, offlining code (more specifically
set_migratetype_isolate) doesn't care whether a page is free and instead
it just checks the migrate type of the page and whether the page's zone
is movable.

This can lead into a situation when we can mark a node as not removable
just because a pageblock is MIGRATE_RESERVE and it is not free but still
movable.

Let's make a common helper is_page_removable which unifies both tests
at one place.

Do not rely on any of MIGRATE_* types as all others but MIGRATE_MOVABLE
may be tricky. MIGRATE_RESERVE can be anything that just happened to
fallback to that allocation. MIGRATE_RECLAIMABLE can be unmovable
because slab (or what ever) has this page currently in use and cannot
release it.  If we tried to remove those pages and the isolation failed
then those blocks would get into the MIRAGTE_MOVABLE list
unconditionally and we will end up having unmovable pages in the movable
list.

Let's, instead, check just whether a pageblock comes from ZONE_MOVABLE
or it contains only free or LRU pages.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/memory_hotplug.h |    4 +++
 mm/memory_hotplug.c            |   48 +++++++++++++++++++++++++++++++++------
 mm/page_alloc.c                |    5 +---
 3 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 864035f..5c448f7 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -194,12 +194,16 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 
 extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 
+bool is_page_removable(struct page *page);
+
 #else
 static inline int is_mem_section_removable(unsigned long pfn,
 					unsigned long nr_pages)
 {
 	return 0;
 }
+
+#define is_page_removable(page) 0
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern int mem_online_node(int nid);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a4cfcdc..c2e54e1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -581,6 +581,44 @@ static inline int pageblock_free(struct page *page)
 	return PageBuddy(page) && page_order(page) >= pageblock_order;
 }
 
+/*
+ * A free or LRU pages block are removable
+ * Do not use MIGRATE_MOVABLE because it can be insufficient and
+ * other MIGRATE types are tricky.
+ * Do not hold zone->lock as this is used from user space by the
+ * sysfs interface.
+ */
+bool is_page_removable(struct page *page)
+{
+	int page_block = 1 << pageblock_order;
+
+	/* All pages from the MOVABLE zone are movable */
+	if (zone_idx(page_zone(page)) == ZONE_MOVABLE)
+		return true;
+
+	while (page_block > 0) {
+		int order = 0;
+
+		if (pfn_valid_within(page_to_pfn(page))) {
+			if (!page_count(page) && PageBuddy(page)) {
+				order = page_order(page);
+			} else if (!PageLRU(page))
+				return false;
+		}
+
+		/* We are not holding zone lock so the page might get used
+		 * since we tested it for buddy flag.  
+		 * This is just a informative check for is_mem_section_removable
+		 * so live with that and rely that we catch this in the 
+		 * page_block test and set_migratetype_isolate holds the lock.
+		 */
+		page_block -= 1 << order;
+		page += 1 << order;
+	}
+
+	return true;
+}
+
 /* Return the start of the next active pageblock after a given page */
 static struct page *next_active_pageblock(struct page *page)
 {
@@ -602,19 +640,12 @@ static struct page *next_active_pageblock(struct page *page)
 /* Checks if this range of memory is likely to be hot-removable. */
 int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 {
-	int type;
 	struct page *page = pfn_to_page(start_pfn);
 	struct page *end_page = page + nr_pages;
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		type = get_pageblock_migratetype(page);
-
-		/*
-		 * A pageblock containing MOVABLE or free pages is considered
-		 * removable
-		 */
-		if (type != MIGRATE_MOVABLE && !pageblock_free(page))
+		if (!is_page_removable(page))
 			return 0;
 
 		/*
@@ -770,6 +801,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
 	return ret;
 }
 
+
 static long
 check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a9649f4..c2e2576 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
 	struct memory_isolate_notify arg;
 	int notifier_ret;
 	int ret = -EBUSY;
-	int zone_idx;
 
 	zone = page_zone(page);
-	zone_idx = zone_idx(zone);
 
 	spin_lock_irqsave(&zone->lock, flags);
-	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
-	    zone_idx == ZONE_MOVABLE) {
+	if (is_page_removable(page)) {
 		ret = 0;
 		goto out;
 	}
-- 
1.7.1


-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code v3
@ 2010-09-03 11:42                                           ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-03 11:42 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Andrew Morton, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

Here is the updated version of my original patch based on the
KAMEZAWA Hiroyuki feedback.

What do other people think about that?

On Fri 03-09-10 19:05:20, KAMEZAWA Hiroyuki wrote:
[...]
> ok, let's go step by step.
> 
> I'm ok that your new patch to be merged. I'll post some clean up and small
> bugfix (not related to your patch), later.
> (I'll be very busy in this weekend, sorry.)

---

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 3/2][BUGFIX] fix memory isolation notifier return value check
  2010-09-03  7:57                                 ` KAMEZAWA Hiroyuki
@ 2010-09-03 20:48                                   ` Andrew Morton
  -1 siblings, 0 replies; 72+ messages in thread
From: Andrew Morton @ 2010-09-03 20:48 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 16:57:13 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> Sorry, the 3rd patch for this set.

What happened with "[PATCH 2/2] Make is_mem_section_removable more
conformable with offlining code"?  You mentioned sending an updated
one, but I can't immediately find it.

Also, please do describe the impact of the problems which are being
fixed.  It helps me decide on priority and on
which-kernels-need-the-patch and it helps others when deciding
should-i-backport-this-into-my-kernel.

I think it'd be best to resend all of this, please.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 3/2][BUGFIX] fix memory isolation notifier return value check
@ 2010-09-03 20:48                                   ` Andrew Morton
  0 siblings, 0 replies; 72+ messages in thread
From: Andrew Morton @ 2010-09-03 20:48 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Michal Hocko, Hiroyuki Kamezawa, Wu Fengguang, linux-mm, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

On Fri, 3 Sep 2010 16:57:13 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> Sorry, the 3rd patch for this set.

What happened with "[PATCH 2/2] Make is_mem_section_removable more
conformable with offlining code"?  You mentioned sending an updated
one, but I can't immediately find it.

Also, please do describe the impact of the problems which are being
fixed.  It helps me decide on priority and on
which-kernels-need-the-patch and it helps others when deciding
should-i-backport-this-into-my-kernel.

I think it'd be best to resend all of this, please.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 3/2][BUGFIX] fix memory isolation notifier return value check
  2010-09-03 20:48                                   ` Andrew Morton
@ 2010-09-03 22:05                                     ` Hiroyuki Kamezawa
  -1 siblings, 0 replies; 72+ messages in thread
From: Hiroyuki Kamezawa @ 2010-09-03 22:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Wu Fengguang, linux-mm, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

2010/9/4 Andrew Morton <akpm@linux-foundation.org>:
> On Fri, 3 Sep 2010 16:57:13 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
>> Sorry, the 3rd patch for this set.
>
> What happened with "[PATCH 2/2] Make is_mem_section_removable more
> conformable with offlining code"?  You mentioned sending an updated
> one, but I can't immediately find it.
>
Sorry, I couldn't. (and I will not able to do until Monday.)

> Also, please do describe the impact of the problems which are being
> fixed.  It helps me decide on priority and on
> which-kernels-need-the-patch and it helps others when deciding
> should-i-backport-this-into-my-kernel.
>
Ah,yes
  - Before the patch [2/2], the code is buggy but works.
    (Because of not-precise test of pre-memory-hotplug.)

    IOW, patch [2/2] is not buggy but make the bug  be apparent and
    evenryone will hit this.

    Influence is very small and maybe no need for backport.


> I think it'd be best to resend all of this, please.
>
I'll do in the next week. Sorry for annoying.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 3/2][BUGFIX] fix memory isolation notifier return value check
@ 2010-09-03 22:05                                     ` Hiroyuki Kamezawa
  0 siblings, 0 replies; 72+ messages in thread
From: Hiroyuki Kamezawa @ 2010-09-03 22:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Wu Fengguang, linux-mm, Kleen,
	Andi, Haicheng Li, Christoph Lameter, linux-kernel, Mel Gorman

2010/9/4 Andrew Morton <akpm@linux-foundation.org>:
> On Fri, 3 Sep 2010 16:57:13 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
>> Sorry, the 3rd patch for this set.
>
> What happened with "[PATCH 2/2] Make is_mem_section_removable more
> conformable with offlining code"?  You mentioned sending an updated
> one, but I can't immediately find it.
>
Sorry, I couldn't. (and I will not able to do until Monday.)

> Also, please do describe the impact of the problems which are being
> fixed.  It helps me decide on priority and on
> which-kernels-need-the-patch and it helps others when deciding
> should-i-backport-this-into-my-kernel.
>
Ah,yes
  - Before the patch [2/2], the code is buggy but works.
    (Because of not-precise test of pre-memory-hotplug.)

    IOW, patch [2/2] is not buggy but make the bug  be apparent and
    evenryone will hit this.

    Influence is very small and maybe no need for backport.


> I think it'd be best to resend all of this, please.
>
I'll do in the next week. Sorry for annoying.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code v3
  2010-09-03 11:42                                           ` Michal Hocko
@ 2010-09-04  2:55                                             ` Wu Fengguang
  -1 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-09-04  2:55 UTC (permalink / raw)
  To: Michal Hocko
  Cc: KAMEZAWA Hiroyuki, Hiroyuki Kamezawa, linux-mm, Andrew Morton,
	Kleen, Andi, Haicheng Li, Christoph Lameter, linux-kernel,
	Mel Gorman

On Fri, Sep 03, 2010 at 07:42:13PM +0800, Michal Hocko wrote:

> +/*
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
> + * Do not hold zone->lock as this is used from user space by the
> + * sysfs interface.
> + */
> +bool is_page_removable(struct page *page)
> +{
> +	int page_block = 1 << pageblock_order;
> +
> +	/* All pages from the MOVABLE zone are movable */
> +	if (zone_idx(page_zone(page)) == ZONE_MOVABLE)
> +		return true;
> +
> +	while (page_block > 0) {
> +		int order = 0;
> +
> +		if (pfn_valid_within(page_to_pfn(page))) {
> +			if (!page_count(page) && PageBuddy(page)) {

PageBuddy() is true only for the head page and false for all tail
pages. So when is_page_removable() is given a random 4k page
(get_any_page() will exactly do that), the above test is not enough.

It's recommended to reuse is_free_buddy_page(). (Need to do some
cleanup work first: remove the "#ifdef CONFIG_MEMORY_FAILURE" and
abstract out an __is_free_buddy_page() that takes no lock.)

> @@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
>  	struct memory_isolate_notify arg;
>  	int notifier_ret;
>  	int ret = -EBUSY;
> -	int zone_idx;
>  
>  	zone = page_zone(page);
> -	zone_idx = zone_idx(zone);
>  
>  	spin_lock_irqsave(&zone->lock, flags);
> -	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
> -	    zone_idx == ZONE_MOVABLE) {
> +	if (is_page_removable(page)) {
>  		ret = 0;
>  		goto out;

The above check only applies to the first page in the page block.
The following "if (!page_count(curr_page) || PageLRU(curr_page))"
check in the same function should be converted too (and that's another
reason to use __is_free_buddy_page(): it will be tested for every 4k
pages, including both the head and tail pages).

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code v3
@ 2010-09-04  2:55                                             ` Wu Fengguang
  0 siblings, 0 replies; 72+ messages in thread
From: Wu Fengguang @ 2010-09-04  2:55 UTC (permalink / raw)
  To: Michal Hocko
  Cc: KAMEZAWA Hiroyuki, Hiroyuki Kamezawa, linux-mm, Andrew Morton,
	Kleen, Andi, Haicheng Li, Christoph Lameter, linux-kernel,
	Mel Gorman

On Fri, Sep 03, 2010 at 07:42:13PM +0800, Michal Hocko wrote:

> +/*
> + * A free or LRU pages block are removable
> + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> + * other MIGRATE types are tricky.
> + * Do not hold zone->lock as this is used from user space by the
> + * sysfs interface.
> + */
> +bool is_page_removable(struct page *page)
> +{
> +	int page_block = 1 << pageblock_order;
> +
> +	/* All pages from the MOVABLE zone are movable */
> +	if (zone_idx(page_zone(page)) == ZONE_MOVABLE)
> +		return true;
> +
> +	while (page_block > 0) {
> +		int order = 0;
> +
> +		if (pfn_valid_within(page_to_pfn(page))) {
> +			if (!page_count(page) && PageBuddy(page)) {

PageBuddy() is true only for the head page and false for all tail
pages. So when is_page_removable() is given a random 4k page
(get_any_page() will exactly do that), the above test is not enough.

It's recommended to reuse is_free_buddy_page(). (Need to do some
cleanup work first: remove the "#ifdef CONFIG_MEMORY_FAILURE" and
abstract out an __is_free_buddy_page() that takes no lock.)

> @@ -5277,14 +5277,11 @@ int set_migratetype_isolate(struct page *page)
>  	struct memory_isolate_notify arg;
>  	int notifier_ret;
>  	int ret = -EBUSY;
> -	int zone_idx;
>  
>  	zone = page_zone(page);
> -	zone_idx = zone_idx(zone);
>  
>  	spin_lock_irqsave(&zone->lock, flags);
> -	if (get_pageblock_migratetype(page) == MIGRATE_MOVABLE ||
> -	    zone_idx == ZONE_MOVABLE) {
> +	if (is_page_removable(page)) {
>  		ret = 0;
>  		goto out;

The above check only applies to the first page in the page block.
The following "if (!page_count(curr_page) || PageLRU(curr_page))"
check in the same function should be converted too (and that's another
reason to use __is_free_buddy_page(): it will be tested for every 4k
pages, including both the head and tail pages).

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code v3
  2010-09-04  2:55                                             ` Wu Fengguang
@ 2010-09-06  9:16                                               ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-06  9:16 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: KAMEZAWA Hiroyuki, Hiroyuki Kamezawa, linux-mm, Andrew Morton,
	Kleen, Andi, Haicheng Li, Christoph Lameter, linux-kernel,
	Mel Gorman

On Sat 04-09-10 10:55:16, Wu Fengguang wrote:
> On Fri, Sep 03, 2010 at 07:42:13PM +0800, Michal Hocko wrote:
> 
> > +/*
> > + * A free or LRU pages block are removable
> > + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> > + * other MIGRATE types are tricky.
> > + * Do not hold zone->lock as this is used from user space by the
> > + * sysfs interface.
> > + */
> > +bool is_page_removable(struct page *page)
> > +{
> > +	int page_block = 1 << pageblock_order;
> > +
> > +	/* All pages from the MOVABLE zone are movable */
> > +	if (zone_idx(page_zone(page)) == ZONE_MOVABLE)
> > +		return true;
> > +
> > +	while (page_block > 0) {
> > +		int order = 0;
> > +
> > +		if (pfn_valid_within(page_to_pfn(page))) {
> > +			if (!page_count(page) && PageBuddy(page)) {
> 
> PageBuddy() is true only for the head page and false for all tail
> pages. So when is_page_removable() is given a random 4k page
> (get_any_page() will exactly do that), the above test is not enough.

OK, I haven't noticed that set_migratetype_isolate (which calls
is_page_removable in my patch - bellow) is called from that context.
is_mem_section_removable goes by pageblocks so we are always checking
the head.

Anyway, I can see why you are counting unmovable pages in your patch
now. You need it for notifier logic, so my approach is not usable.

Thanks!
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code v3
@ 2010-09-06  9:16                                               ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2010-09-06  9:16 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: KAMEZAWA Hiroyuki, Hiroyuki Kamezawa, linux-mm, Andrew Morton,
	Kleen, Andi, Haicheng Li, Christoph Lameter, linux-kernel,
	Mel Gorman

On Sat 04-09-10 10:55:16, Wu Fengguang wrote:
> On Fri, Sep 03, 2010 at 07:42:13PM +0800, Michal Hocko wrote:
> 
> > +/*
> > + * A free or LRU pages block are removable
> > + * Do not use MIGRATE_MOVABLE because it can be insufficient and
> > + * other MIGRATE types are tricky.
> > + * Do not hold zone->lock as this is used from user space by the
> > + * sysfs interface.
> > + */
> > +bool is_page_removable(struct page *page)
> > +{
> > +	int page_block = 1 << pageblock_order;
> > +
> > +	/* All pages from the MOVABLE zone are movable */
> > +	if (zone_idx(page_zone(page)) == ZONE_MOVABLE)
> > +		return true;
> > +
> > +	while (page_block > 0) {
> > +		int order = 0;
> > +
> > +		if (pfn_valid_within(page_to_pfn(page))) {
> > +			if (!page_count(page) && PageBuddy(page)) {
> 
> PageBuddy() is true only for the head page and false for all tail
> pages. So when is_page_removable() is given a random 4k page
> (get_any_page() will exactly do that), the above test is not enough.

OK, I haven't noticed that set_migratetype_isolate (which calls
is_page_removable in my patch - bellow) is called from that context.
is_mem_section_removable goes by pageblocks so we are always checking
the head.

Anyway, I can see why you are counting unmovable pages in your patch
now. You need it for notifier logic, so my approach is not usable.

Thanks!
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2010-09-06  9:16 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-20 14:14 [PATCH] Make is_mem_section_removable more conformable with offlining code Michal Hocko
2010-08-20 14:14 ` Michal Hocko
2010-08-22  0:42 ` Wu Fengguang
2010-08-22  0:42   ` Wu Fengguang
2010-08-23  9:22   ` Michal Hocko
2010-08-23  9:22     ` Michal Hocko
2010-08-31 12:30     ` Michal Hocko
2010-08-31 12:30       ` Michal Hocko
2010-08-31 14:19     ` Wu Fengguang
2010-08-31 14:19       ` Wu Fengguang
2010-08-31 14:36       ` Wu Fengguang
2010-08-31 14:36         ` Wu Fengguang
2010-08-31 14:59         ` Wu Fengguang
2010-08-31 14:59           ` Wu Fengguang
2010-09-01  1:19         ` KAMEZAWA Hiroyuki
2010-09-01  1:19           ` KAMEZAWA Hiroyuki
2010-09-01 12:19       ` Michal Hocko
2010-09-01 12:19         ` Michal Hocko
2010-09-01 12:41         ` Michal Hocko
2010-09-01 12:41           ` Michal Hocko
2010-09-02  5:45           ` KAMEZAWA Hiroyuki
2010-09-02  5:45             ` KAMEZAWA Hiroyuki
2010-09-02  8:28             ` Michal Hocko
2010-09-02  8:28               ` Michal Hocko
2010-09-02  9:03               ` KAMEZAWA Hiroyuki
2010-09-02  9:03                 ` KAMEZAWA Hiroyuki
2010-09-02  9:24                 ` Michal Hocko
2010-09-02  9:24                   ` Michal Hocko
2010-09-02 11:19                   ` Hiroyuki Kamezawa
2010-09-02 11:19                     ` Hiroyuki Kamezawa
2010-09-02 13:18                     ` Michal Hocko
2010-09-02 13:18                       ` Michal Hocko
2010-09-02 14:19                       ` Hiroyuki Kamezawa
2010-09-02 14:19                         ` Hiroyuki Kamezawa
2010-09-02 14:39                         ` Michal Hocko
2010-09-02 14:39                           ` Michal Hocko
2010-09-02 15:05                           ` Michal Hocko
2010-09-02 15:05                             ` Michal Hocko
2010-09-03  3:10                             ` [PATCH 0/2 v2] " KAMEZAWA Hiroyuki
2010-09-03  3:10                               ` KAMEZAWA Hiroyuki
2010-09-03  3:11                               ` [PATCH 1/2][BUGFIX] fix next active pageblock calculation KAMEZAWA Hiroyuki
2010-09-03  3:11                                 ` KAMEZAWA Hiroyuki
2010-09-03  3:14                               ` [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code KAMEZAWA Hiroyuki
2010-09-03  3:14                                 ` KAMEZAWA Hiroyuki
2010-09-03  8:25                                 ` Michal Hocko
2010-09-03  8:25                                   ` Michal Hocko
2010-09-03  9:13                                   ` KAMEZAWA Hiroyuki
2010-09-03  9:13                                     ` KAMEZAWA Hiroyuki
2010-09-03  9:50                                     ` Michal Hocko
2010-09-03  9:50                                       ` Michal Hocko
2010-09-03 10:05                                       ` KAMEZAWA Hiroyuki
2010-09-03 10:05                                         ` KAMEZAWA Hiroyuki
2010-09-03 11:01                                         ` Michal Hocko
2010-09-03 11:01                                           ` Michal Hocko
2010-09-03 11:42                                         ` [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code v3 Michal Hocko
2010-09-03 11:42                                           ` Michal Hocko
2010-09-04  2:55                                           ` Wu Fengguang
2010-09-04  2:55                                             ` Wu Fengguang
2010-09-06  9:16                                             ` Michal Hocko
2010-09-06  9:16                                               ` Michal Hocko
2010-09-03  9:15                                   ` [PATCH 2/2] Make is_mem_section_removable more conformable with offlining code Michal Hocko
2010-09-03  9:15                                     ` Michal Hocko
2010-09-03  9:24                                     ` KAMEZAWA Hiroyuki
2010-09-03  9:24                                       ` KAMEZAWA Hiroyuki
2010-09-03  7:54                               ` [PATCH 0/2 v2] " Michal Hocko
2010-09-03  7:54                                 ` Michal Hocko
2010-09-03  7:57                               ` [PATCH 3/2][BUGFIX] fix memory isolation notifier return value check KAMEZAWA Hiroyuki
2010-09-03  7:57                                 ` KAMEZAWA Hiroyuki
2010-09-03 20:48                                 ` Andrew Morton
2010-09-03 20:48                                   ` Andrew Morton
2010-09-03 22:05                                   ` Hiroyuki Kamezawa
2010-09-03 22:05                                     ` Hiroyuki Kamezawa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.