[03/10] mm/page_alloc: handle page on pcp correctly if it's pageblock is isolated
diff mbox series

Message ID 1404460675-24456-4-git-send-email-iamjoonsoo.kim@lge.com
State New, archived
Headers show
Series
  • fix freepage count problems due to memory isolation
Related show

Commit Message

Joonsoo Kim July 4, 2014, 7:57 a.m. UTC
If pageblock of page on pcp are isolated now, we should free it to isolate
buddy list to prevent future allocation on it. But current code doesn't
do this.

Moreover, there is a freepage counting problem on current code. Although
pageblock of page on pcp are isolated now, it could go normal buddy list,
because get_onpcp_migratetype() will return non-isolate migratetype.
In this case, we should do either adding freepage count or changing
migratetype to MIGRATE_ISOLATE, but, current code do neither.

This patch fixes these two problems by handling pageblock migratetype
before calling __free_one_page(). And, if we find the page on isolated
pageblock, change migratetype to MIGRATE_ISOLATE to prevent future
allocation of this page and freepage counting problem.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/page_alloc.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

Comments

Gioh Kim July 7, 2014, 8:25 a.m. UTC | #1
2014-07-04 오후 4:57, Joonsoo Kim 쓴 글:
> If pageblock of page on pcp are isolated now, we should free it to isolate
> buddy list to prevent future allocation on it. But current code doesn't
> do this.

I think it is strage that pcp can have isolated page.
I remember that The purpose of pcp is having frequently used pages (hot for cache),
but isolated page is not for frequently allocated and freed.

(Above is my guess. It can be wrong. I know CMA and hot-plug features are using isolated pages
so that I guess isolated pages are not for frequently allocated and freed memory.)

Anyway if isolated page is not good for pcp, what about preventing isolated page from inserting pcp?
I think fix of free_hot_cold_page() can be one of the preventing ways.
The free_hot_cold_page() checks migratetype and inserts CMA page into pcp.
Am I correct?

If I am correct I think inserting CMA page into pcp is harmful for both of CMA and pcp.
If CMA page is on pcp it will be used frequently and prevent making contiguous memory.
What about avoiding CMA page for pcp like following?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8c9eeec..1cbb816 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1413,13 +1413,8 @@ void free_hot_cold_page(struct page *page, bool cold)
         * areas back if necessary. Otherwise, we may have to free
         * excessively into the page allocator
         */
-       if (migratetype >= MIGRATE_PCPTYPES) {
-               if (unlikely(is_migrate_isolate(migratetype))) {
-                       free_one_page(zone, page, pfn, 0, migratetype);
-                       goto out;
-               }
-               migratetype = MIGRATE_MOVABLE;
-       }
+       if (migratetype > MIGRATE_PCPTYPES)
+               goto out;

        pcp = &this_cpu_ptr(zone->pageset)->pcp;
        if (!cold)


> 
> Moreover, there is a freepage counting problem on current code. Although
> pageblock of page on pcp are isolated now, it could go normal buddy list,
> because get_onpcp_migratetype() will return non-isolate migratetype.

I cannot find get_onpcp_migratetype() in v3.16.0-rc3 and also cannot google it.
Is it typo?

> In this case, we should do either adding freepage count or changing
> migratetype to MIGRATE_ISOLATE, but, current code do neither.
> 
> This patch fixes these two problems by handling pageblock migratetype
> before calling __free_one_page(). And, if we find the page on isolated
> pageblock, change migratetype to MIGRATE_ISOLATE to prevent future
> allocation of this page and freepage counting problem.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>   mm/page_alloc.c |   14 ++++++++------
>   1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index aeb51d1..99c05f7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -719,15 +719,17 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>   			page = list_entry(list->prev, struct page, lru);
>   			/* must delete as __free_one_page list manipulates */
>   			list_del(&page->lru);
> -			mt = get_freepage_migratetype(page);
> +
> +			if (unlikely(is_migrate_isolate_page(page))) {
> +				mt = MIGRATE_ISOLATE;
> +			} else {
> +				mt = get_freepage_migratetype(page);
> +				__mod_zone_freepage_state(zone, 1, mt);
> +			}
> +
>   			/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
>   			__free_one_page(page, page_to_pfn(page), zone, 0, mt);
>   			trace_mm_page_pcpu_drain(page, 0, mt);
> -			if (likely(!is_migrate_isolate_page(page))) {
> -				__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
> -				if (is_migrate_cma(mt))
> -					__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
> -			}
>   		} while (--to_free && --batch_free && !list_empty(list));
>   	}
>   	spin_unlock(&zone->lock);
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Vlastimil Babka July 7, 2014, 3:19 p.m. UTC | #2
On 07/04/2014 09:57 AM, Joonsoo Kim wrote:
> If pageblock of page on pcp are isolated now, we should free it to isolate
> buddy list to prevent future allocation on it. But current code doesn't
> do this.
>
> Moreover, there is a freepage counting problem on current code. Although
> pageblock of page on pcp are isolated now, it could go normal buddy list,
> because get_onpcp_migratetype() will return non-isolate migratetype.

get_onpcp_migratetype() is only introduced in later patch.

> In this case, we should do either adding freepage count or changing
> migratetype to MIGRATE_ISOLATE, but, current code do neither.

I wouldn't say it "do neither". It already limits the freepage counting 
to !MIGRATE_ISOLATE case (and it's not converted to 
__mod_zone_freepage_state for some reason). So there's accounting 
mismatch in addition to buddy list misplacement.

> This patch fixes these two problems by handling pageblock migratetype
> before calling __free_one_page(). And, if we find the page on isolated
> pageblock, change migratetype to MIGRATE_ISOLATE to prevent future
> allocation of this page and freepage counting problem.

So although this is not an addition of a new pageblock migratetype check 
to the fast path (the check is already there), I would prefer removing 
the check :) With the approach of pcplists draining outlined in my reply 
to 00/10, we would allow a misplacement to happen (and the page 
accounted as freepage) immediately followed by move_frepages_block which 
would place the page onto isolate freelist with the rest. Anything newly 
freed will get isolate_migratetype determined in free_hot_cold_page or 
__free_pages_ok (where it would need moving the migratepage check under 
the disabled irq part) and be placed and buddy-merged properly.

> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>   mm/page_alloc.c |   14 ++++++++------
>   1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index aeb51d1..99c05f7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -719,15 +719,17 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>   			page = list_entry(list->prev, struct page, lru);
>   			/* must delete as __free_one_page list manipulates */
>   			list_del(&page->lru);
> -			mt = get_freepage_migratetype(page);
> +
> +			if (unlikely(is_migrate_isolate_page(page))) {
> +				mt = MIGRATE_ISOLATE;
> +			} else {
> +				mt = get_freepage_migratetype(page);
> +				__mod_zone_freepage_state(zone, 1, mt);
> +			}
> +
>   			/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
>   			__free_one_page(page, page_to_pfn(page), zone, 0, mt);
>   			trace_mm_page_pcpu_drain(page, 0, mt);
> -			if (likely(!is_migrate_isolate_page(page))) {
> -				__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
> -				if (is_migrate_cma(mt))
> -					__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
> -			}
>   		} while (--to_free && --batch_free && !list_empty(list));
>   	}
>   	spin_unlock(&zone->lock);
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Vlastimil Babka July 8, 2014, 7:18 a.m. UTC | #3
On 07/07/2014 10:25 AM, Gioh Kim wrote:
> 
> 
> 2014-07-04 오후 4:57, Joonsoo Kim 쓴 글:
>> If pageblock of page on pcp are isolated now, we should free it to isolate
>> buddy list to prevent future allocation on it. But current code doesn't
>> do this.
> 
> I think it is strage that pcp can have isolated page.
> I remember that The purpose of pcp is having frequently used pages (hot for cache),
> but isolated page is not for frequently allocated and freed.

It can happen that a page was placed on pcplist *before* the pageblock
was isolated.

> (Above is my guess. It can be wrong. I know CMA and hot-plug features are using isolated pages
> so that I guess isolated pages are not for frequently allocated and freed memory.)
> 
> Anyway if isolated page is not good for pcp, what about preventing isolated page from inserting pcp?
> I think fix of free_hot_cold_page() can be one of the preventing ways.
> The free_hot_cold_page() checks migratetype and inserts CMA page into pcp.
> Am I correct?
> 
> If I am correct I think inserting CMA page into pcp is harmful for both of CMA and pcp.
> If CMA page is on pcp it will be used frequently and prevent making contiguous memory.
> What about avoiding CMA page for pcp like following?

CMA is not the only user of memory isolation. There's also memory
hot-remove. Also CMA tries to make the CMA-marked pages usable as normal
MOVABLE pages, until a CMA allocation is needed. Here you would restrict
their usage outside of pcplists.

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 8c9eeec..1cbb816 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1413,13 +1413,8 @@ void free_hot_cold_page(struct page *page, bool cold)
>           * areas back if necessary. Otherwise, we may have to free
>           * excessively into the page allocator
>           */
> -       if (migratetype >= MIGRATE_PCPTYPES) {
> -               if (unlikely(is_migrate_isolate(migratetype))) {
> -                       free_one_page(zone, page, pfn, 0, migratetype);
> -                       goto out;
> -               }
> -               migratetype = MIGRATE_MOVABLE;
> -       }
> +       if (migratetype > MIGRATE_PCPTYPES)
> +               goto out;

Certainly not, for the reasons above and also:
- by changing >= to > you stopped handling RESERVE pages specially
- by removing free_one_page() call you are now leaking the CMA and
ISOLATE pages.

>          pcp = &this_cpu_ptr(zone->pageset)->pcp;
>          if (!cold)
> 
> 
>>
>> Moreover, there is a freepage counting problem on current code. Although
>> pageblock of page on pcp are isolated now, it could go normal buddy list,
>> because get_onpcp_migratetype() will return non-isolate migratetype.
> 
> I cannot find get_onpcp_migratetype() in v3.16.0-rc3 and also cannot google it.
> Is it typo?
> 
>> In this case, we should do either adding freepage count or changing
>> migratetype to MIGRATE_ISOLATE, but, current code do neither.
>>
>> This patch fixes these two problems by handling pageblock migratetype
>> before calling __free_one_page(). And, if we find the page on isolated
>> pageblock, change migratetype to MIGRATE_ISOLATE to prevent future
>> allocation of this page and freepage counting problem.
>>
>> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>> ---
>>    mm/page_alloc.c |   14 ++++++++------
>>    1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index aeb51d1..99c05f7 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -719,15 +719,17 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>>    			page = list_entry(list->prev, struct page, lru);
>>    			/* must delete as __free_one_page list manipulates */
>>    			list_del(&page->lru);
>> -			mt = get_freepage_migratetype(page);
>> +
>> +			if (unlikely(is_migrate_isolate_page(page))) {
>> +				mt = MIGRATE_ISOLATE;
>> +			} else {
>> +				mt = get_freepage_migratetype(page);
>> +				__mod_zone_freepage_state(zone, 1, mt);
>> +			}
>> +
>>    			/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
>>    			__free_one_page(page, page_to_pfn(page), zone, 0, mt);
>>    			trace_mm_page_pcpu_drain(page, 0, mt);
>> -			if (likely(!is_migrate_isolate_page(page))) {
>> -				__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
>> -				if (is_migrate_cma(mt))
>> -					__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
>> -			}
>>    		} while (--to_free && --batch_free && !list_empty(list));
>>    	}
>>    	spin_unlock(&zone->lock);
>>
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Joonsoo Kim July 14, 2014, 6:24 a.m. UTC | #4
On Mon, Jul 07, 2014 at 05:19:48PM +0200, Vlastimil Babka wrote:
> On 07/04/2014 09:57 AM, Joonsoo Kim wrote:
> >If pageblock of page on pcp are isolated now, we should free it to isolate
> >buddy list to prevent future allocation on it. But current code doesn't
> >do this.
> >
> >Moreover, there is a freepage counting problem on current code. Although
> >pageblock of page on pcp are isolated now, it could go normal buddy list,
> >because get_onpcp_migratetype() will return non-isolate migratetype.
> 
> get_onpcp_migratetype() is only introduced in later patch.

Yes, I will fix it.

> 
> >In this case, we should do either adding freepage count or changing
> >migratetype to MIGRATE_ISOLATE, but, current code do neither.
> 
> I wouldn't say it "do neither". It already limits the freepage
> counting to !MIGRATE_ISOLATE case (and it's not converted to
> __mod_zone_freepage_state for some reason). So there's accounting
> mismatch in addition to buddy list misplacement.

Okay.

> 
> >This patch fixes these two problems by handling pageblock migratetype
> >before calling __free_one_page(). And, if we find the page on isolated
> >pageblock, change migratetype to MIGRATE_ISOLATE to prevent future
> >allocation of this page and freepage counting problem.
> 
> So although this is not an addition of a new pageblock migratetype
> check to the fast path (the check is already there), I would prefer
> removing the check :) 

Yes, I want to do it if possible. :)

> With the approach of pcplists draining
> outlined in my reply to 00/10, we would allow a misplacement to
> happen (and the page accounted as freepage) immediately followed by
> move_frepages_block which would place the page onto isolate freelist
> with the rest. Anything newly freed will get isolate_migratetype
> determined in free_hot_cold_page or __free_pages_ok (where it would
> need moving the migratepage check under the disabled irq part) and
> be placed and buddy-merged properly.

I explained the problem of this approach in 00/10.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Patch
diff mbox series

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aeb51d1..99c05f7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -719,15 +719,17 @@  static void free_pcppages_bulk(struct zone *zone, int count,
 			page = list_entry(list->prev, struct page, lru);
 			/* must delete as __free_one_page list manipulates */
 			list_del(&page->lru);
-			mt = get_freepage_migratetype(page);
+
+			if (unlikely(is_migrate_isolate_page(page))) {
+				mt = MIGRATE_ISOLATE;
+			} else {
+				mt = get_freepage_migratetype(page);
+				__mod_zone_freepage_state(zone, 1, mt);
+			}
+
 			/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
 			__free_one_page(page, page_to_pfn(page), zone, 0, mt);
 			trace_mm_page_pcpu_drain(page, 0, mt);
-			if (likely(!is_migrate_isolate_page(page))) {
-				__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
-				if (is_migrate_cma(mt))
-					__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
-			}
 		} while (--to_free && --batch_free && !list_empty(list));
 	}
 	spin_unlock(&zone->lock);