linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
@ 2008-10-27 16:49 Gerald Schaefer
  2008-10-27 17:17 ` Gerald Schaefer
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 16:49 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
	kamezawa.hiroyu, y-goto

From: Gerald Schaefer <gerald.schaefer@de.ibm.com>

My last bugfix here (adding zone->lock) introduced a new problem: Using
pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
points to the first pfn after end_pfn, which may be in a different zone
or not present at all. This may lead to an addressing exception in
page_zone() or spin_lock_irqsave().

Using the last valid page that was found inside the for() loop, instead
of pfn_to_page(), should fix this.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

---
mm/page_isolation.c |    6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6/mm/page_isolation.c
===================================================================
--- linux-2.6.orig/mm/page_isolation.c
+++ linux-2.6/mm/page_isolation.c
@@ -115,7 +115,7 @@ __test_page_isolated_in_pageblock(unsign
int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn, flags;
-	struct page *page;
+	struct page *page = NULL;
	struct zone *zone;
	int ret;

@@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
			break;
	}
-	if (pfn < end_pfn)
+	if ((pfn < end_pfn) || !page)
		return -EBUSY;
	/* Check all pages are free or Marked as ISOLATED */
-	zone = page_zone(pfn_to_page(pfn));
+	zone = page_zone(page);
	spin_lock_irqsave(&zone->lock, flags);
	ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
	spin_unlock_irqrestore(&zone->lock, flags);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
@ 2008-10-27 17:17 ` Gerald Schaefer
  2008-10-27 17:19 ` Gerald Schaefer
  2008-10-27 17:25 ` Dave Hansen
  2 siblings, 0 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 17:17 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
	kamezawa.hiroyu, y-goto

Ouch, stupid Thunderbird broke the patch (or stupid me used Thunderbird...),
will send a new one.

Gerald



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
  2008-10-27 17:17 ` Gerald Schaefer
@ 2008-10-27 17:19 ` Gerald Schaefer
  2008-10-27 17:25 ` Dave Hansen
  2 siblings, 0 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 17:19 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
	kamezawa.hiroyu, y-goto

From: Gerald Schaefer <gerald.schaefer@de.ibm.com>

My last bugfix here (adding zone->lock) introduced a new problem: Using
pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
points to the first pfn after end_pfn, which may be in a different zone
or not present at all. This may lead to an addressing exception in
page_zone() or spin_lock_irqsave().

Using the last valid page that was found inside the for() loop, instead
of pfn_to_page(), should fix this.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

---
 mm/page_isolation.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6/mm/page_isolation.c
===================================================================
--- linux-2.6.orig/mm/page_isolation.c
+++ linux-2.6/mm/page_isolation.c
@@ -115,7 +115,7 @@ __test_page_isolated_in_pageblock(unsign
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long pfn, flags;
-	struct page *page;
+	struct page *page = NULL;
 	struct zone *zone;
 	int ret;
 
@@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
 			break;
 	}
-	if (pfn < end_pfn)
+	if ((pfn < end_pfn) || !page)
 		return -EBUSY;
 	/* Check all pages are free or Marked as ISOLATED */
-	zone = page_zone(pfn_to_page(pfn));
+	zone = page_zone(page);
 	spin_lock_irqsave(&zone->lock, flags);
 	ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
 	spin_unlock_irqrestore(&zone->lock, flags);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
  2008-10-27 17:17 ` Gerald Schaefer
  2008-10-27 17:19 ` Gerald Schaefer
@ 2008-10-27 17:25 ` Dave Hansen
  2008-10-27 17:59   ` Gerald Schaefer
  2 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2008-10-27 17:25 UTC (permalink / raw)
  To: gerald.schaefer
  Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens,
	kamezawa.hiroyu, y-goto

On Mon, 2008-10-27 at 17:49 +0100, Gerald Schaefer wrote:
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
> points to the first pfn after end_pfn, which may be in a different zone
> or not present at all. This may lead to an addressing exception in
> page_zone() or spin_lock_irqsave().

I'm not sure I follow.  Let's look at the code, pre-patch:

> 	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
>                 page = __first_valid_page(pfn, pageblock_nr_pages);
>                 if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
>                         break;
>         }
>         if (pfn < end_pfn)
>                 return -EBUSY;

We have two ways out of the loop:
1. 'page' is valid, and not isolated, so we did a 'break'
2. No page hit (1) in the range and we broke out of the loop because
   of the for() condition: (pfn < end_pfn).  

So, when the condition happens that you mentioned in your changelog
above: "pfn then points to the first pfn after end_pfn", we jump out at
the 'return -EBUSY;'.  We don't ever do pfn_to_page() in that case since
we've returned befoer.

Either 'page' is valid *OR* you return -EBUSY.  I don't think you need
to check both.

> Using the last valid page that was found inside the for() loop, instead
> of pfn_to_page(), should fix this.
> @@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
> 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> 			break;
> 	}
> -	if (pfn < end_pfn)
> +	if ((pfn < end_pfn) || !page)
> 		return -EBUSY;
> 	/* Check all pages are free or Marked as ISOLATED */
> -	zone = page_zone(pfn_to_page(pfn));
> +	zone = page_zone(page);

I think this patch fixes the bug, but for reasons other than what you
said. :)

The trouble here is that the 'pfn' could have been in the middle of a
hole somewhere, which __first_valid_page() worked around.  Since you
saved off the result of __first_valid_page(), it ends up being OK with
your patch.

Instead of using pfn_to_page() you could also have just called
__first_valid_page() again.  But, that would have duplicated a bit of
work, even though not much in practice because the caches are still hot.

Technically, you wouldn't even need to check the return from
__first_valid_page() since you know it has a valid result because you
made the exact same call a moment before.

Anyway, can you remove the !page check, fix up the changelog and resend?

-- Dave


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-27 17:25 ` Dave Hansen
@ 2008-10-27 17:59   ` Gerald Schaefer
  2008-10-28  0:32     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 17:59 UTC (permalink / raw)
  To: Dave Hansen
  Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens,
	kamezawa.hiroyu, y-goto

On Mon, 2008-10-27 at 10:25 -0700, Dave Hansen wrote:
> I'm not sure I follow.  Let's look at the code, pre-patch:
> 
> > 	for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
> >                 page = __first_valid_page(pfn, pageblock_nr_pages);
> >                 if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> >                         break;
> >         }
> >         if (pfn < end_pfn)
> >                 return -EBUSY;
> 
> We have two ways out of the loop:
> 1. 'page' is valid, and not isolated, so we did a 'break'
> 2. No page hit (1) in the range and we broke out of the loop because
>    of the for() condition: (pfn < end_pfn).  
> 
> So, when the condition happens that you mentioned in your changelog
> above: "pfn then points to the first pfn after end_pfn", we jump out at
> the 'return -EBUSY;'.  We don't ever do pfn_to_page() in that case since
> we've returned befoer.
> 
> Either 'page' is valid *OR* you return -EBUSY.  I don't think you need
> to check both.

We only return -EBUSY if pfn < end_pfn, but after completing the loop w/o
a break pfn will be > end_pfn. Also, the last call to __first_valid_page()
may return NULL w/o causing a break, so page may also be invalid after the
loop.

> > Using the last valid page that was found inside the for() loop, instead
> > of pfn_to_page(), should fix this.
> > @@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
> > 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> > 			break;
> > 	}
> > -	if (pfn < end_pfn)
> > +	if ((pfn < end_pfn) || !page)
> > 		return -EBUSY;
> > 	/* Check all pages are free or Marked as ISOLATED */
> > -	zone = page_zone(pfn_to_page(pfn));
> > +	zone = page_zone(page);
> 
> I think this patch fixes the bug, but for reasons other than what you
> said. :)
> 
> The trouble here is that the 'pfn' could have been in the middle of a
> hole somewhere, which __first_valid_page() worked around.  Since you
> saved off the result of __first_valid_page(), it ends up being OK with
> your patch.

I think pfn will always be > end_pfn if we complete the loop. And breaking
out of the loop earlier will always return -EBUSY.

> Instead of using pfn_to_page() you could also have just called
> __first_valid_page() again.  But, that would have duplicated a bit of
> work, even though not much in practice because the caches are still hot.
> 
> Technically, you wouldn't even need to check the return from
> __first_valid_page() since you know it has a valid result because you
> made the exact same call a moment before.
> 
> Anyway, can you remove the !page check, fix up the changelog and resend?

Calling __first_valid_page() again might be a good idea. Thinking about it
now, I guess there is still a problem left with my patch, but for reasons
other than what you said :) If the loop is completed with page == NULL,
we will return -EBUSY with the new patch. But there may have been valid
pages before, and only some memory hole at the end. In this case, returning
-EBUSY would probably be wrong.

Kamezawa, this loop/function was added by you, what do you think?

--
Thanks,
Gerald



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-27 17:59   ` Gerald Schaefer
@ 2008-10-28  0:32     ` KAMEZAWA Hiroyuki
  2008-10-28 13:00       ` Gerald Schaefer
  0 siblings, 1 reply; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-28  0:32 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: Dave Hansen, akpm, linux-kernel, linux-mm, schwidefsky,
	heiko.carstens, y-goto

On Mon, 27 Oct 2008 18:59:29 +0100
Gerald Schaefer <gerald.schaefer@de.ibm.com> wrote:
> Instead of using pfn_to_page() you could also have just called
> > __first_valid_page() again.  But, that would have duplicated a bit of
> > work, even though not much in practice because the caches are still hot.
> > 
> > Technically, you wouldn't even need to check the return from
> > __first_valid_page() since you know it has a valid result because you
> > made the exact same call a moment before.
> > 
> > Anyway, can you remove the !page check, fix up the changelog and resend?
> 
> Calling __first_valid_page() again might be a good idea. Thinking about it
> now, I guess there is still a problem left with my patch, but for reasons
> other than what you said :) If the loop is completed with page == NULL,
> we will return -EBUSY with the new patch. But there may have been valid
> pages before, and only some memory hole at the end. In this case, returning
> -EBUSY would probably be wrong.
> 
> Kamezawa, this loop/function was added by you, what do you think?
> 

I think there is a bug, as you wrote.
But
 - "pfn" and "end_pfn" (and pfn in the middle of them) can be in different zone on strange machine.

Now: test_pages_isolated() is called in following sequence.
  
  check_page_isolated()
     walk_memory_resource()			# read resource range and get start/end of pfn
         -> chcek_page_isolated_cb()
		-> test_page_isolated().

I think all pages within [start, end) passed to test_pages_isolated() should be in the same zone.

please change this to
  check_page_isolated()
     walk_memory_resource()
         -> check_page_isolated_cb()
                 -> walk_page_range_in_same_zone()  # get page range in the same zone.
                        -> test_page_isolated().

Could you try ?

Thanks,
-Kame
 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-28  0:32     ` KAMEZAWA Hiroyuki
@ 2008-10-28 13:00       ` Gerald Schaefer
  0 siblings, 0 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-28 13:00 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Dave Hansen, akpm, linux-kernel, linux-mm, schwidefsky,
	heiko.carstens, y-goto

On Tue, 2008-10-28 at 09:32 +0900, KAMEZAWA Hiroyuki wrote:
> But
>  - "pfn" and "end_pfn" (and pfn in the middle of them) can be in different zone on strange machine.
> 
> Now: test_pages_isolated() is called in following sequence.
>   
>   check_page_isolated()
>      walk_memory_resource()			# read resource range and get start/end of pfn
>          -> chcek_page_isolated_cb()
> 		-> test_page_isolated().
> 
> I think all pages within [start, end) passed to test_pages_isolated() should be in the same zone.
> 
> please change this to
>   check_page_isolated()
>      walk_memory_resource()
>          -> check_page_isolated_cb()
>                  -> walk_page_range_in_same_zone()  # get page range in the same zone.
>                         -> test_page_isolated().
> 
> Could you try ?

There is already a "same zone" check at the beginning of offline_pages():

>	if (!test_pages_in_a_zone(start_pfn, end_pfn))
>		return -EINVAL;

So we should be safe here, the only problem that I see is that my
zone->lock patch in test_pages_isolated() is broken. As explained,
the pfn used in my page_zone(pfn_to_page(pfn)) is >= end_pfn.

I'll send a new patch to fix this, using __first_valid_page() again,
as described in my reply to Daves mail. The only other solution that
I see would be to remember the first/last !NULL page that was found
inside the for() loop. Not sure which is better, but I think I like
the first one more. Any other ideas?

Thanks,
Gerald



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-29 14:25 Gerald Schaefer
  2008-10-29 18:00 ` Nathan Fontenot
@ 2008-10-30  0:09 ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-30  0:09 UTC (permalink / raw)
  To: gerald.schaefer
  Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, y-goto, dave

On Wed, 29 Oct 2008 15:25:30 +0100
Gerald Schaefer <gerald.schaefer@de.ibm.com> wrote:

> From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> 
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
> pfn will then be >= end_pfn, which may be in a different zone or not
> present at all. This may lead to an addressing exception in page_zone()
> or spin_lock_irqsave().
> 
> Now I use __first_valid_page() again after the loop to find a valid page
> for page_zone().
> 
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> 
Thanks.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

> ---
>  mm/page_isolation.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/mm/page_isolation.c
> ===================================================================
> --- linux-2.6.orig/mm/page_isolation.c
> +++ linux-2.6/mm/page_isolation.c
> @@ -130,10 +130,11 @@ int test_pages_isolated(unsigned long st
>  		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
>  			break;
>  	}
> -	if (pfn < end_pfn)
> +	page = __first_valid_page(start_pfn, end_pfn - start_pfn);
> +	if ((pfn < end_pfn) || !page)
>  		return -EBUSY;
>  	/* Check all pages are free or Marked as ISOLATED */
> -	zone = page_zone(pfn_to_page(pfn));
> +	zone = page_zone(page);
>  	spin_lock_irqsave(&zone->lock, flags);
>  	ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
>  	spin_unlock_irqrestore(&zone->lock, flags);
> 
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
  2008-10-29 14:25 Gerald Schaefer
@ 2008-10-29 18:00 ` Nathan Fontenot
  2008-10-30  0:09 ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 10+ messages in thread
From: Nathan Fontenot @ 2008-10-29 18:00 UTC (permalink / raw)
  To: gerald.schaefer
  Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens,
	kamezawa.hiroyu, y-goto, dave

Looks like you beat me to it, and with a nicer fix too.

Gerald Schaefer wrote:
> From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
> 
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
> pfn will then be >= end_pfn, which may be in a different zone or not
> present at all. This may lead to an addressing exception in page_zone()
> or spin_lock_irqsave().
> 
> Now I use __first_valid_page() again after the loop to find a valid page
> for page_zone().
> 
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

Acked-by: Nathan Fontenot <nfont@austin.ibm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
@ 2008-10-29 14:25 Gerald Schaefer
  2008-10-29 18:00 ` Nathan Fontenot
  2008-10-30  0:09 ` KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-29 14:25 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
	kamezawa.hiroyu, y-goto, dave

From: Gerald Schaefer <gerald.schaefer@de.ibm.com>

My last bugfix here (adding zone->lock) introduced a new problem: Using
page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
pfn will then be >= end_pfn, which may be in a different zone or not
present at all. This may lead to an addressing exception in page_zone()
or spin_lock_irqsave().

Now I use __first_valid_page() again after the loop to find a valid page
for page_zone().

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>

---
 mm/page_isolation.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/page_isolation.c
===================================================================
--- linux-2.6.orig/mm/page_isolation.c
+++ linux-2.6/mm/page_isolation.c
@@ -130,10 +130,11 @@ int test_pages_isolated(unsigned long st
 		if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
 			break;
 	}
-	if (pfn < end_pfn)
+	page = __first_valid_page(start_pfn, end_pfn - start_pfn);
+	if ((pfn < end_pfn) || !page)
 		return -EBUSY;
 	/* Check all pages are free or Marked as ISOLATED */
-	zone = page_zone(pfn_to_page(pfn));
+	zone = page_zone(page);
 	spin_lock_irqsave(&zone->lock, flags);
 	ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
 	spin_unlock_irqrestore(&zone->lock, flags);


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2008-10-30  0:10 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
2008-10-27 17:17 ` Gerald Schaefer
2008-10-27 17:19 ` Gerald Schaefer
2008-10-27 17:25 ` Dave Hansen
2008-10-27 17:59   ` Gerald Schaefer
2008-10-28  0:32     ` KAMEZAWA Hiroyuki
2008-10-28 13:00       ` Gerald Schaefer
2008-10-29 14:25 Gerald Schaefer
2008-10-29 18:00 ` Nathan Fontenot
2008-10-30  0:09 ` KAMEZAWA Hiroyuki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).