linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm/memory-failure.c: fix memory leak by race between poison and unpoison
From: Naoya Horiguchi @ 2014-05-14 15:21 UTC (permalink / raw)
  To: Andrew Morton, Andi Kleen; +Cc: Wu Fengguang, linux-kernel, linux-mm

When a memory error happens on an in-use page or a (free or in-use) hugepage,
the victim page is isolated with its refcount set to one. When you try to
unpoison it later, unpoison_memory() calls put_page() on it twice in order to
bring the page back to the free page pool (buddy or free hugepage list).
However, if another memory error occurs on the page while we are unpoisoning
it, memory_failure() returns without releasing the refcount that it took
earlier in the same call, which results in a memory leak and inconsistent
num_poisoned_pages statistics. This patch fixes it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>    [2.6.32+]
---
 mm/memory-failure.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git next-20140512.orig/mm/memory-failure.c next-20140512/mm/memory-failure.c
index 9872af1b1e9d..93a08bd78c78 100644
--- next-20140512.orig/mm/memory-failure.c
+++ next-20140512/mm/memory-failure.c
@@ -1153,6 +1153,8 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	 */
 	if (!PageHWPoison(p)) {
 		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
+		atomic_long_sub(nr_pages, &num_poisoned_pages);
+		put_page(hpage);
 		res = 0;
 		goto out;
 	}
-- 
1.9.0
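
For readers following the refcount arithmetic in the changelog, here is a minimal userspace sketch of the interleaving it describes. Everything named model_* is hypothetical and exists only for illustration; the model collapses the hugepage to a single flag, counts in units of one page (nr_pages == 1), and assumes unpoison takes its own reference before the two put_page() calls. It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_state {
	int  refcount;      /* stands in for the (huge)page refcount */
	bool hwpoison;      /* stands in for the PageHWPoison flag */
	long num_poisoned;  /* stands in for num_poisoned_pages, nr_pages == 1 */
};

/* State after a first error was handled: page isolated, one reference. */
static struct model_state model_after_first_error(void)
{
	struct model_state s = { .refcount = 1, .hwpoison = true, .num_poisoned = 1 };
	return s;
}

/* Early part of the second error handler: count the page, take a reference. */
static void model_failure_begin(struct model_state *s)
{
	s->num_poisoned++;
	s->refcount++;
}

/* Modeled unpoison: own get (assumed), clear the flag, then put twice. */
static void model_unpoison(struct model_state *s)
{
	s->refcount++;
	if (s->hwpoison) {
		s->hwpoison = false;
		s->num_poisoned--;
		s->refcount--;  /* extra put that releases the isolation reference */
	}
	s->refcount--;          /* put matching this function's own get */
}

/* Late part of the handler: the "just unpoisoned" branch, with or without the fix. */
static void model_failure_finish(struct model_state *s, bool with_fix)
{
	if (!s->hwpoison && with_fix) {
		s->num_poisoned--;  /* mirrors atomic_long_sub() in the patch */
		s->refcount--;      /* mirrors put_page(hpage) in the patch */
	}
}

/* Run the interleaving from the changelog and return the final state. */
static struct model_state model_run_race(bool with_fix)
{
	struct model_state s = model_after_first_error();

	model_failure_begin(&s);
	model_unpoison(&s);
	model_failure_finish(&s, with_fix);
	assert(!s.hwpoison);
	return s;
}
```

Calling model_run_race(false) leaves one reference and one counted page behind even though the flag is clear, which is the leak described above; model_run_race(true) brings both back to zero.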



* Re: [PATCH] mm/memory-failure.c: fix memory leak by race between poison and unpoison
From: Andrew Morton @ 2014-05-14 22:10 UTC (permalink / raw)
  To: Naoya Horiguchi; +Cc: Andi Kleen, Wu Fengguang, linux-kernel, linux-mm

On Wed, 14 May 2014 11:21:31 -0400 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> When a memory error happens on an in-use page or a (free or in-use) hugepage,
> the victim page is isolated with its refcount set to one. When you try to
> unpoison it later, unpoison_memory() calls put_page() on it twice in order to
> bring the page back to the free page pool (buddy or free hugepage list).
> However, if another memory error occurs on the page while we are unpoisoning
> it, memory_failure() returns without releasing the refcount that it took
> earlier in the same call, which results in a memory leak and inconsistent
> num_poisoned_pages statistics. This patch fixes it.
> 
> ...
>
> --- next-20140512.orig/mm/memory-failure.c
> +++ next-20140512/mm/memory-failure.c
> @@ -1153,6 +1153,8 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
>  	 */
>  	if (!PageHWPoison(p)) {
>  		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
> +		atomic_long_sub(nr_pages, &num_poisoned_pages);
> +		put_page(hpage);
>  		res = 0;
>  		goto out;
>  	}

Looking at the surrounding code...

	/*
	 * Lock the page and wait for writeback to finish.
	 * It's very difficult to mess with pages currently under IO
	 * and in many cases impossible, so we just avoid it here.
	 */
	lock_page(hpage);


lock_page() doesn't wait for writeback to finish -
wait_on_page_writeback() does that.  Either the code or the comment
could do with fixing.
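
To make the distinction concrete, here is a small userspace sketch in which the lock bit and the writeback bit are independent, so taking the lock says nothing about pending I/O. The model_* names are hypothetical stand-ins for the PG_locked and PG_writeback page flags and the calls that act on them; this is an illustration under those assumptions, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PG_locked and PG_writeback page flags. */
struct model_flags {
	bool locked;
	bool writeback;
};

/* Modeled lock_page(): only acquires the lock bit; it ignores writeback. */
static bool model_lock_page(struct model_flags *f)
{
	if (f->locked)
		return false;   /* the real call would sleep here instead */
	f->locked = true;
	return true;
}

/* Modeled I/O completion: the only thing that clears the writeback bit. */
static void model_end_writeback(struct model_flags *f)
{
	assert(f->writeback);
	f->writeback = false;
}

/* The condition a wait_on_page_writeback()-style call would sleep on. */
static bool model_writeback_pending(const struct model_flags *f)
{
	return f->writeback;
}
```

In this model a page that starts with writeback set can be locked successfully while model_writeback_pending() still returns true; only model_end_writeback() changes that.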




* Re: [PATCH] mm/memory-failure.c: fix memory leak by race between poison and unpoison
From: cyc @ 2014-05-15  3:34 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Andi Kleen, Wu Fengguang, linux-kernel, linux-mm

On Wed, 2014-05-14 at 11:21 -0400, Naoya Horiguchi wrote:
> When a memory error happens on an in-use page or a (free or in-use) hugepage,
> the victim page is isolated with its refcount set to one. When you try to
> unpoison it later, unpoison_memory() calls put_page() on it twice in order to
> bring the page back to the free page pool (buddy or free hugepage list).
> However, if another memory error occurs on the page while we are unpoisoning
> it, memory_failure() returns without releasing the refcount that it took
> earlier in the same call, which results in a memory leak and inconsistent
> num_poisoned_pages statistics. This patch fixes it.

Assume that a new memory error occurs on the hugepage that we are
unpoisoning.

          A   unpoisoned  B    poisoned    C          
hugepage: |---------------+++++++++++++++++|

There are two cases, as shown above:
  1. If the victim page lies in A-B, memory_failure() will be blocked
by lock_page() until unpoison_memory() calls unlock_page().
  2. If the victim page lies in B-C, memory_failure() will return
almost immediately, at the beginning of the function.

So the new memory error will not have the effect you describe.

thx!
cyc 

> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: <stable@vger.kernel.org>    [2.6.32+]
> ---
>  mm/memory-failure.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git next-20140512.orig/mm/memory-failure.c next-20140512/mm/memory-failure.c
> index 9872af1b1e9d..93a08bd78c78 100644
> --- next-20140512.orig/mm/memory-failure.c
> +++ next-20140512/mm/memory-failure.c
> @@ -1153,6 +1153,8 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
>  	 */
>  	if (!PageHWPoison(p)) {
>  		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
> +		atomic_long_sub(nr_pages, &num_poisoned_pages);
> +		put_page(hpage);
>  		res = 0;
>  		goto out;
>  	}



