linux-kernel.vger.kernel.org archive mirror
* [PATCH] [3.7-rc] fix incorrect NR_FREE_PAGES accounting (appears like memory leak)
@ 2012-11-21 19:21 Dave Hansen
  2012-11-26 11:23 ` [PATCH] mm: compaction: Fix return value of capture_free_page Mel Gorman
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Hansen @ 2012-11-21 19:21 UTC (permalink / raw)
  To: mgorman; +Cc: akpm, linux-kernel, torvalds, linux-mm, Dave Hansen


This needs to make it in before 3.7 is released.

--

There have been some 3.7-rc reports of vm issues, including some
kswapd bugs and, more importantly, some memory "leaks":

	http://www.spinics.net/lists/linux-mm/msg46187.html
	https://bugzilla.kernel.org/show_bug.cgi?id=50181

The post-3.6 commit 1fb3f8ca took split_free_page() and reused
it for the compaction code.  It does something curious with
capture_free_page() (previously known as split_free_page()):

int capture_free_page(struct page *page, int alloc_order,
...
        __mod_zone_page_state(zone, NR_FREE_PAGES, -(1UL << order));

-       /* Split into individual pages */
-       set_page_refcounted(page);
-       split_page(page, order);
+       if (alloc_order != order)
+               expand(zone, page, alloc_order, order,
+                       &zone->free_area[order], migratetype);

Note that expand() puts the pages _back_ in the allocator, but it
does not bump NR_FREE_PAGES.  We "return" 'alloc_order' worth of
pages, but we accounted for removing 'order' in the
__mod_zone_page_state() call.  For the old split_page()-style use
(order==alloc_order) the bug will not trigger.  But, when called
from the compaction code where we occasionally get a larger page
out of the buddy allocator than we need, we will run into this.

This patch simply changes the NR_FREE_PAGES manipulation to the
correct 'alloc_order' instead of 'order'.
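
As a rough illustration of the accounting argument above, here is a
userspace toy model, not kernel code.  The names toy_zone, toy_capture()
and toy_drift() are made up for this sketch; it only assumes what the
text states: the whole 'order' block leaves the free lists, expand()
hands the unused tail back, and the counter is not touched by expand().

```c
#include <assert.h>

/* Toy model of one zone. Hypothetical type, not the kernel's struct zone. */
struct toy_zone {
	long nr_free_pages;   /* stands in for the NR_FREE_PAGES counter */
	long pages_in_buddy;  /* pages actually sitting on the free lists */
};

/*
 * Model of capturing a free block of size 'order' while keeping only
 * 'alloc_order' worth of pages. 'counter_order' is the order used for
 * the counter adjustment: 'order' before the patch, 'alloc_order' after.
 */
static void toy_capture(struct toy_zone *z, unsigned int order,
			unsigned int alloc_order, unsigned int counter_order)
{
	assert(alloc_order <= order);

	/* The whole block is taken off the free lists... */
	z->pages_in_buddy -= 1L << order;
	/* ...and the counter is adjusted by whatever order the code uses. */
	z->nr_free_pages -= 1L << counter_order;
	/* expand() puts the unused tail back without bumping the counter. */
	z->pages_in_buddy += (1L << order) - (1L << alloc_order);
}

/* Pages that look "leaked": on the free lists but not counted as free. */
static long toy_drift(unsigned int order, unsigned int alloc_order, int fixed)
{
	struct toy_zone z = { .nr_free_pages = 1024, .pages_in_buddy = 1024 };

	toy_capture(&z, order, alloc_order, fixed ? alloc_order : order);
	return z.pages_in_buddy - z.nr_free_pages;
}
```

In this model, toy_drift(4, 2, 0) leaves 12 pages uncounted, while
toy_drift(4, 2, 1) and the order==alloc_order case both leave none.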

I've been able to repeatedly trigger this in my testing
environment.  The amount "leaked" very closely tracks the
imbalance I see in buddy pages vs. NR_FREE_PAGES.  I have
confirmed that this patch fixes the imbalance.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
---

 linux-2.6.git-dave/mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN mm/page_alloc.c~leak-fix-20121120-2 mm/page_alloc.c
--- linux-2.6.git/mm/page_alloc.c~leak-fix-20121120-2	2012-11-21 14:14:52.053714749 -0500
+++ linux-2.6.git-dave/mm/page_alloc.c	2012-11-21 14:14:52.069714883 -0500
@@ -1405,7 +1405,7 @@ int capture_free_page(struct page *page,
 
 	mt = get_pageblock_migratetype(page);
 	if (unlikely(mt != MIGRATE_ISOLATE))
-		__mod_zone_freepage_state(zone, -(1UL << order), mt);
+		__mod_zone_freepage_state(zone, -(1UL << alloc_order), mt);
 
 	if (alloc_order != order)
 		expand(zone, page, alloc_order, order,
_



* [PATCH] mm: compaction: Fix return value of capture_free_page
  2012-11-21 19:21 [PATCH] [3.7-rc] fix incorrect NR_FREE_PAGES accounting (appears like memory leak) Dave Hansen
@ 2012-11-26 11:23 ` Mel Gorman
  2012-11-26 15:06   ` Dave Hansen
  0 siblings, 1 reply; 4+ messages in thread
From: Mel Gorman @ 2012-11-26 11:23 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, torvalds, linux-mm

On Wed, Nov 21, 2012 at 02:21:51PM -0500, Dave Hansen wrote:
> 
> This needs to make it in before 3.7 is released.
> 

This is also required. Dave, can you double check? The surprise is that
this does not blow up very obviously.

---8<---
From: Mel Gorman <mgorman@suse.de>
Subject: [PATCH] mm: compaction: Fix return value of capture_free_page

Commit ef6c5be6 (fix incorrect NR_FREE_PAGES accounting (appears like
memory leak)) fixes an NR_FREE_PAGES accounting leak but missed the return
value, which this reviewer also missed until today. Compaction uses that
return value when adding pages to a list of isolated free pages, so
without this follow-up fix there is a risk of free list corruption.

Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bcb72c6..8193809 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1422,7 +1422,7 @@ int capture_free_page(struct page *page, int alloc_order, int migratetype)
 		}
 	}
 
-	return 1UL << order;
+	return 1UL << alloc_order;
 }
 
 /*


* Re: [PATCH] mm: compaction: Fix return value of capture_free_page
  2012-11-26 11:23 ` [PATCH] mm: compaction: Fix return value of capture_free_page Mel Gorman
@ 2012-11-26 15:06   ` Dave Hansen
  2012-11-27 11:07     ` Mel Gorman
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Hansen @ 2012-11-26 15:06 UTC (permalink / raw)
  To: Mel Gorman; +Cc: akpm, linux-kernel, torvalds, linux-mm

On 11/26/2012 03:23 AM, Mel Gorman wrote:
> On Wed, Nov 21, 2012 at 02:21:51PM -0500, Dave Hansen wrote:
>>
>> This needs to make it in before 3.7 is released.
>>
> 
> This is also required. Dave, can you double check? The surprise is that
> this does not blow up very obviously.
...
> @@ -1422,7 +1422,7 @@ int capture_free_page(struct page *page, int alloc_order, int migratetype)
>  		}
>  	}
> 
> -	return 1UL << order;
> +	return 1UL << alloc_order;
>  }

compact_capture_page() only looks at the boolean return value out of
capture_free_page(), so it wouldn't notice.  split_free_page() does.
But, when it calls capture_free_page(), order==alloc_order, so it
wouldn't make a difference.  So, there's probably no actual bug here,
but it's certainly a wrong return value.

We should probably also fix the set_pageblock_migratetype() loop in
there while we're at it.  I think it's potentially trampling on the
migration type of pages currently in the allocator.  I _think_ that
completes the list of things that need to get audited in there.



* Re: [PATCH] mm: compaction: Fix return value of capture_free_page
  2012-11-26 15:06   ` Dave Hansen
@ 2012-11-27 11:07     ` Mel Gorman
  0 siblings, 0 replies; 4+ messages in thread
From: Mel Gorman @ 2012-11-27 11:07 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, torvalds, linux-mm

On Mon, Nov 26, 2012 at 07:06:53AM -0800, Dave Hansen wrote:
> On 11/26/2012 03:23 AM, Mel Gorman wrote:
> > On Wed, Nov 21, 2012 at 02:21:51PM -0500, Dave Hansen wrote:
> >>
> >> This needs to make it in before 3.7 is released.
> >>
> > 
> > This is also required. Dave, can you double check? The surprise is that
> > this does not blow up very obviously.
> ...
> > @@ -1422,7 +1422,7 @@ int capture_free_page(struct page *page, int alloc_order, int migratetype)
> >  		}
> >  	}
> > 
> > -	return 1UL << order;
> > +	return 1UL << alloc_order;
> >  }
> 
> compact_capture_page() only looks at the boolean return value out of
> capture_free_page(), so it wouldn't notice.  split_free_page() does.
> But, when it calls capture_free_page(), order==alloc_order, so it
> wouldn't make a difference.  So, there's probably no actual bug here,
> but it's certainly a wrong return value.
> 

I don't think it is fine in this case.

isolate_freepages_block
isolated = split_free_page(page);
  -> split_free_page
     nr_pages = capture_free_page(page, order, 0);
     -> capture_free_page (returns wrong value of too many pages)
     return nr_pages;

So isolate_freepages_block() now receives a page count larger than the
number of pages that were really isolated, and then does this:

                for (i = 0; i < isolated; i++) {
                        list_add(&page->lru, freelist);
                        page++;
                }

So potentially it adds pages that are still on the buddy free lists to
the local free list, and "fun" ensues.
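
To make the failure mode concrete, here is a small userspace sketch, not
kernel code.  toy_page and toy_isolate() are hypothetical names, and the
sketch assumes a call where alloc_order < order; as noted earlier in the
thread, split_free_page() itself passes the same order for both, in
which case the two return values agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 16

/* Toy page descriptor. Hypothetical, not the kernel's struct page. */
struct toy_page {
	bool in_buddy;     /* still owned by the buddy allocator */
	bool on_freelist;  /* added to the caller's local free list */
};

/*
 * Model of a caller that trusts the returned page count and adds that
 * many consecutive pages to a local list. Returns how many of the added
 * pages were still owned by the buddy allocator.
 */
static size_t toy_isolate(unsigned int order, unsigned int alloc_order,
			  bool fixed_return)
{
	struct toy_page pages[TOY_NPAGES] = { 0 };
	size_t block = (size_t)1 << order;
	size_t captured = (size_t)1 << alloc_order;
	size_t isolated = fixed_return ? captured : block;
	size_t bad = 0;
	size_t i;

	assert(alloc_order <= order && block <= TOY_NPAGES);

	/* expand() gave the tail of the block back to the free lists. */
	for (i = captured; i < block; i++)
		pages[i].in_buddy = true;

	/* Same shape as the loop quoted above. */
	for (i = 0; i < isolated; i++) {
		pages[i].on_freelist = true;
		if (pages[i].in_buddy)
			bad++;
	}
	return bad;
}
```

With order 3 and alloc_order 1, the unfixed return value puts 6 pages on
the local list that the buddy allocator still owns; with the fixed
return value the count is 0.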

-- 
Mel Gorman
SUSE Labs

