All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page
  2015-07-31  6:46 ` Naoya Horiguchi
@ 2015-07-31  6:46   ` Naoya Horiguchi
  -1 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

In "just unpoisoned" path, we do put_page and then unlock_page, which is a
wrong order and causes "freeing locked page" bug. So let's fix it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index c53543d89282..04d677048af7 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	if (!PageHWPoison(p)) {
 		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
 		atomic_long_sub(nr_pages, &num_poisoned_pages);
+		unlock_page(hpage);
 		put_page(hpage);
-		res = 0;
-		goto out;
+		return 0;
 	}
 	if (hwpoison_filter(p)) {
 		if (TestClearPageHWPoison(p))
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4
@ 2015-07-31  6:46 ` Naoya Horiguchi
  0 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

This is v2 of hwpoison fix series for v4.2.
I reflected the feedback for v1, and tried another solution for "reuse just
after soft-offline" problem (see patch 5/5.)

General description (mostly identical to v1)
===================

Recently I addressed a few of hwpoison race problems and the patches are merged
on v4.2-rc1. It made progress, but unfortunately some problems still remain due
to less coverage of my testing. So I'm trying to fix or avoid them in this series.

One point I'm expecting to discuss is that patch 4/5 changes the page flag set
to be checked on free time. In current behavior, __PG_HWPOISON is not supposed
to be set when the page is freed. I think that there is no strong reason for this
behavior, and it causes a problem hard to fix only in error handler side (because
__PG_HWPOISON could be set at arbitrary timing.) So I suggest to change it.

Test
====

With this patchset, hwpoison stress testing in official mce-test testsuite
(which previously failed) passes.

Thanks,
Naoya Horiguchi
---
Tree: https://github.com/Naoya-Horiguchi/linux/tree/v4.2-rc4/hwpoison.v2
---
Summary:

Naoya Horiguchi (5):
      mm/memory-failure: unlock_page before put_page
      mm/memory-failure: fix race in counting num_poisoned_pages
      mm/memory-failure: give up error handling for non-tail-refcounted thp
      mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
      mm/memory-failure: set PageHWPoison before migrate_pages()

 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/memory-failure.c        | 32 ++++++++++++++++++--------------
 mm/migrate.c               |  8 ++++++--
 mm/page_alloc.c            |  4 ++++
 5 files changed, 36 insertions(+), 25 deletions(-)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4
@ 2015-07-31  6:46 ` Naoya Horiguchi
  0 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

This is v2 of hwpoison fix series for v4.2.
I reflected the feedback for v1, and tried another solution for "reuse just
after soft-offline" problem (see patch 5/5.)

General description (mostly identical to v1)
===================

Recently I addressed a few of hwpoison race problems and the patches are merged
on v4.2-rc1. It made progress, but unfortunately some problems still remain due
to less coverage of my testing. So I'm trying to fix or avoid them in this series.

One point I'm expecting to discuss is that patch 4/5 changes the page flag set
to be checked on free time. In current behavior, __PG_HWPOISON is not supposed
to be set when the page is freed. I think that there is no strong reason for this
behavior, and it causes a problem hard to fix only in error handler side (because
__PG_HWPOISON could be set at arbitrary timing.) So I suggest to change it.

Test
====

With this patchset, hwpoison stress testing in official mce-test testsuite
(which previously failed) passes.

Thanks,
Naoya Horiguchi
---
Tree: https://github.com/Naoya-Horiguchi/linux/tree/v4.2-rc4/hwpoison.v2
---
Summary:

Naoya Horiguchi (5):
      mm/memory-failure: unlock_page before put_page
      mm/memory-failure: fix race in counting num_poisoned_pages
      mm/memory-failure: give up error handling for non-tail-refcounted thp
      mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
      mm/memory-failure: set PageHWPoison before migrate_pages()

 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/memory-failure.c        | 32 ++++++++++++++++++--------------
 mm/migrate.c               |  8 ++++++--
 mm/page_alloc.c            |  4 ++++
 5 files changed, 36 insertions(+), 25 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page
@ 2015-07-31  6:46   ` Naoya Horiguchi
  0 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

In "just unpoisoned" path, we do put_page and then unlock_page, which is a
wrong order and causes "freeing locked page" bug. So let's fix it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index c53543d89282..04d677048af7 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	if (!PageHWPoison(p)) {
 		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
 		atomic_long_sub(nr_pages, &num_poisoned_pages);
+		unlock_page(hpage);
 		put_page(hpage);
-		res = 0;
-		goto out;
+		return 0;
 	}
 	if (hwpoison_filter(p)) {
 		if (TestClearPageHWPoison(p))
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 4/5] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
  2015-07-31  6:46 ` Naoya Horiguchi
@ 2015-07-31  6:46   ` Naoya Horiguchi
  -1 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
free target page in successful page migration") was not closed completely,
because that can happen not only for soft-offline, but also for hard-offline.
Consider that a slab page is about to be freed into buddy pool, and then an
uncorrected memory error hits the page just after entering __free_one_page(),
then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
despite the fact that it's not necessary because the data on the affected
page is not consumed.

To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time. I think it's justified because __PG_HWPOISON flags is
defined to prevent the page from being reused, and setting it outside the
page's alloc-free cycle is a designed behavior (not a bug.)

For recent months, I was annoyed about BUG_ON when soft-offlined page remains
on lru cache list for a while, which is avoided by calling put_page() instead
of putback_lru_page() in page migration's success path. This means that this
patch reverts a major change from commit add05cecef80 about the new refcounting
rule of soft-offlined pages, so "reuse window" revives. This will be closed
by a subsequent patch.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
v1 -> v2:
- call put_page() when migration succeeded in reason == MR_MEMORY_FAILURE
- refrain from reviving "reuse prevention" by MIGRATE_ISOLATE flag
---
 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/migrate.c               |  5 ++++-
 mm/page_alloc.c            |  4 ++++
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git v4.2-rc4.orig/include/linux/page-flags.h v4.2-rc4/include/linux/page-flags.h
index f34e040b34e9..41c93844fb1d 100644
--- v4.2-rc4.orig/include/linux/page-flags.h
+++ v4.2-rc4/include/linux/page-flags.h
@@ -631,15 +631,19 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
 	 1 << PG_private | 1 << PG_private_2 | \
 	 1 << PG_writeback | 1 << PG_reserved | \
 	 1 << PG_slab	 | 1 << PG_swapcache | 1 << PG_active | \
-	 1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
+	 1 << PG_unevictable | __PG_MLOCKED | \
 	 __PG_COMPOUND_LOCK)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have any flags set.  It they are set,
+ * Pages being prepped should not have these flags set.  It they are set,
  * there has been a kernel bug or struct page corruption.
+ *
+ * __PG_HWPOISON is exceptional because it needs to be kept beyond page's
+ * alloc-free cycle to prevent from reusing the page.
  */
-#define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_CHECK_AT_PREP	\
+	(((1 << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)
diff --git v4.2-rc4.orig/mm/huge_memory.c v4.2-rc4/mm/huge_memory.c
index c107094f79ba..097c7a4bfbd9 100644
--- v4.2-rc4.orig/mm/huge_memory.c
+++ v4.2-rc4/mm/huge_memory.c
@@ -1676,12 +1676,7 @@ static void __split_huge_page_refcount(struct page *page,
 		/* after clearing PageTail the gup refcount can be released */
 		smp_mb__after_atomic();
 
-		/*
-		 * retain hwpoison flag of the poisoned tail page:
-		 *   fix for the unsuitable process killed on Guest Machine(KVM)
-		 *   by the memory-failure.
-		 */
-		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP | __PG_HWPOISON;
+		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		page_tail->flags |= (page->flags &
 				     ((1L << PG_referenced) |
 				      (1L << PG_swapbacked) |
diff --git v4.2-rc4.orig/mm/migrate.c v4.2-rc4/mm/migrate.c
index ee401e4e5ef1..f2415be7d93b 100644
--- v4.2-rc4.orig/mm/migrate.c
+++ v4.2-rc4/mm/migrate.c
@@ -950,7 +950,10 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		if (reason != MR_MEMORY_FAILURE)
+		/* Soft-offlined page shouldn't go through lru cache list */
+		if (reason == MR_MEMORY_FAILURE)
+			put_page(page);
+		else
 			putback_lru_page(page);
 	}
 
diff --git v4.2-rc4.orig/mm/page_alloc.c v4.2-rc4/mm/page_alloc.c
index ef19f22b2b7d..775c254648f7 100644
--- v4.2-rc4.orig/mm/page_alloc.c
+++ v4.2-rc4/mm/page_alloc.c
@@ -1285,6 +1285,10 @@ static inline int check_new_page(struct page *page)
 		bad_reason = "non-NULL mapping";
 	if (unlikely(atomic_read(&page->_count) != 0))
 		bad_reason = "nonzero _count";
+	if (unlikely(page->flags & __PG_HWPOISON)) {
+		bad_reason = "HWPoisoned (hardware-corrupted)";
+		bad_flags = __PG_HWPOISON;
+	}
 	if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
 		bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
 		bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/5] mm/memory-failure: give up error handling for non-tail-refcounted thp
  2015-07-31  6:46 ` Naoya Horiguchi
@ 2015-07-31  6:46   ` Naoya Horiguchi
  -1 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

"non anonymous thp" case is still racy with freeing thp, which causes panic
due to put_page() for refcount-0 page. It seems that closing up this race
might be hard (and/or not worth doing,) so let's give up the error handling
for this case.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
v1 -> v2:
- keep pr_err() message
---
 mm/memory-failure.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index f72d2fad0b90..cd985530f102 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -909,6 +909,18 @@ int get_hwpoison_page(struct page *page)
 	 * directly for tail pages.
 	 */
 	if (PageTransHuge(head)) {
+		/*
+		 * Non anonymous thp exists only in allocation/free time. We
+		 * can't handle such a case correctly, so let's give it up.
+		 * This should be better than triggering BUG_ON when kernel
+		 * tries to touch the "partially handled" page.
+		 */
+		if (!PageAnon(head)) {
+			pr_err("MCE: %#lx: non anonymous thp\n",
+				page_to_pfn(page));
+			return 0;
+		}
+
 		if (get_page_unless_zero(head)) {
 			if (PageTail(page))
 				get_page(page);
@@ -1134,15 +1146,6 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	}
 
 	if (!PageHuge(p) && PageTransHuge(hpage)) {
-		if (!PageAnon(hpage)) {
-			pr_err("MCE: %#lx: non anonymous thp\n", pfn);
-			if (TestClearPageHWPoison(p))
-				atomic_long_sub(nr_pages, &num_poisoned_pages);
-			put_page(p);
-			if (p != hpage)
-				put_page(hpage);
-			return -EBUSY;
-		}
 		if (unlikely(split_huge_page(hpage))) {
 			pr_err("MCE: %#lx: thp split failed\n", pfn);
 			if (TestClearPageHWPoison(p))
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/5] mm/memory-failure: fix race in counting num_poisoned_pages
  2015-07-31  6:46 ` Naoya Horiguchi
@ 2015-07-31  6:46   ` Naoya Horiguchi
  -1 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

When memory_failure() is called on a page which are just freed after page
migration from soft offlining, the counter num_poisoned_pages is raised twice.
So let's fix it with using TestSetPageHWPoison.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index 04d677048af7..f72d2fad0b90 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1671,8 +1671,8 @@ static int __soft_offline_page(struct page *page, int flags)
 			if (ret > 0)
 				ret = -EIO;
 		} else {
-			SetPageHWPoison(page);
-			atomic_long_inc(&num_poisoned_pages);
+			if (!TestSetPageHWPoison(page))
+				atomic_long_inc(&num_poisoned_pages);
 		}
 	} else {
 		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/5] mm/memory-failure: give up error handling for non-tail-refcounted thp
@ 2015-07-31  6:46   ` Naoya Horiguchi
  0 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

"non anonymous thp" case is still racy with freeing thp, which causes panic
due to put_page() for refcount-0 page. It seems that closing up this race
might be hard (and/or not worth doing,) so let's give up the error handling
for this case.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
v1 -> v2:
- keep pr_err() message
---
 mm/memory-failure.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index f72d2fad0b90..cd985530f102 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -909,6 +909,18 @@ int get_hwpoison_page(struct page *page)
 	 * directly for tail pages.
 	 */
 	if (PageTransHuge(head)) {
+		/*
+		 * Non anonymous thp exists only in allocation/free time. We
+		 * can't handle such a case correctly, so let's give it up.
+		 * This should be better than triggering BUG_ON when kernel
+		 * tries to touch the "partially handled" page.
+		 */
+		if (!PageAnon(head)) {
+			pr_err("MCE: %#lx: non anonymous thp\n",
+				page_to_pfn(page));
+			return 0;
+		}
+
 		if (get_page_unless_zero(head)) {
 			if (PageTail(page))
 				get_page(page);
@@ -1134,15 +1146,6 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	}
 
 	if (!PageHuge(p) && PageTransHuge(hpage)) {
-		if (!PageAnon(hpage)) {
-			pr_err("MCE: %#lx: non anonymous thp\n", pfn);
-			if (TestClearPageHWPoison(p))
-				atomic_long_sub(nr_pages, &num_poisoned_pages);
-			put_page(p);
-			if (p != hpage)
-				put_page(hpage);
-			return -EBUSY;
-		}
 		if (unlikely(split_huge_page(hpage))) {
 			pr_err("MCE: %#lx: thp split failed\n", pfn);
 			if (TestClearPageHWPoison(p))
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 4/5] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
@ 2015-07-31  6:46   ` Naoya Horiguchi
  0 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
free target page in successful page migration") was not closed completely,
because that can happen not only for soft-offline, but also for hard-offline.
Consider that a slab page is about to be freed into buddy pool, and then an
uncorrected memory error hits the page just after entering __free_one_page(),
then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
despite the fact that it's not necessary because the data on the affected
page is not consumed.

To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time. I think it's justified because __PG_HWPOISON flags is
defined to prevent the page from being reused, and setting it outside the
page's alloc-free cycle is a designed behavior (not a bug.)

For recent months, I was annoyed about BUG_ON when soft-offlined page remains
on lru cache list for a while, which is avoided by calling put_page() instead
of putback_lru_page() in page migration's success path. This means that this
patch reverts a major change from commit add05cecef80 about the new refcounting
rule of soft-offlined pages, so "reuse window" revives. This will be closed
by a subsequent patch.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
v1 -> v2:
- call put_page() when migration succeeded in reason == MR_MEMORY_FAILURE
- refrain from reviving "reuse prevention" by MIGRATE_ISOLATE flag
---
 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/migrate.c               |  5 ++++-
 mm/page_alloc.c            |  4 ++++
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git v4.2-rc4.orig/include/linux/page-flags.h v4.2-rc4/include/linux/page-flags.h
index f34e040b34e9..41c93844fb1d 100644
--- v4.2-rc4.orig/include/linux/page-flags.h
+++ v4.2-rc4/include/linux/page-flags.h
@@ -631,15 +631,19 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
 	 1 << PG_private | 1 << PG_private_2 | \
 	 1 << PG_writeback | 1 << PG_reserved | \
 	 1 << PG_slab	 | 1 << PG_swapcache | 1 << PG_active | \
-	 1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
+	 1 << PG_unevictable | __PG_MLOCKED | \
 	 __PG_COMPOUND_LOCK)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have any flags set.  It they are set,
+ * Pages being prepped should not have these flags set.  It they are set,
  * there has been a kernel bug or struct page corruption.
+ *
+ * __PG_HWPOISON is exceptional because it needs to be kept beyond page's
+ * alloc-free cycle to prevent from reusing the page.
  */
-#define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_CHECK_AT_PREP	\
+	(((1 << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)
diff --git v4.2-rc4.orig/mm/huge_memory.c v4.2-rc4/mm/huge_memory.c
index c107094f79ba..097c7a4bfbd9 100644
--- v4.2-rc4.orig/mm/huge_memory.c
+++ v4.2-rc4/mm/huge_memory.c
@@ -1676,12 +1676,7 @@ static void __split_huge_page_refcount(struct page *page,
 		/* after clearing PageTail the gup refcount can be released */
 		smp_mb__after_atomic();
 
-		/*
-		 * retain hwpoison flag of the poisoned tail page:
-		 *   fix for the unsuitable process killed on Guest Machine(KVM)
-		 *   by the memory-failure.
-		 */
-		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP | __PG_HWPOISON;
+		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		page_tail->flags |= (page->flags &
 				     ((1L << PG_referenced) |
 				      (1L << PG_swapbacked) |
diff --git v4.2-rc4.orig/mm/migrate.c v4.2-rc4/mm/migrate.c
index ee401e4e5ef1..f2415be7d93b 100644
--- v4.2-rc4.orig/mm/migrate.c
+++ v4.2-rc4/mm/migrate.c
@@ -950,7 +950,10 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		if (reason != MR_MEMORY_FAILURE)
+		/* Soft-offlined page shouldn't go through lru cache list */
+		if (reason == MR_MEMORY_FAILURE)
+			put_page(page);
+		else
 			putback_lru_page(page);
 	}
 
diff --git v4.2-rc4.orig/mm/page_alloc.c v4.2-rc4/mm/page_alloc.c
index ef19f22b2b7d..775c254648f7 100644
--- v4.2-rc4.orig/mm/page_alloc.c
+++ v4.2-rc4/mm/page_alloc.c
@@ -1285,6 +1285,10 @@ static inline int check_new_page(struct page *page)
 		bad_reason = "non-NULL mapping";
 	if (unlikely(atomic_read(&page->_count) != 0))
 		bad_reason = "nonzero _count";
+	if (unlikely(page->flags & __PG_HWPOISON)) {
+		bad_reason = "HWPoisoned (hardware-corrupted)";
+		bad_flags = __PG_HWPOISON;
+	}
 	if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
 		bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
 		bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/5] mm/memory-failure: fix race in counting num_poisoned_pages
@ 2015-07-31  6:46   ` Naoya Horiguchi
  0 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

When memory_failure() is called on a page which are just freed after page
migration from soft offlining, the counter num_poisoned_pages is raised twice.
So let's fix it with using TestSetPageHWPoison.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index 04d677048af7..f72d2fad0b90 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1671,8 +1671,8 @@ static int __soft_offline_page(struct page *page, int flags)
 			if (ret > 0)
 				ret = -EIO;
 		} else {
-			SetPageHWPoison(page);
-			atomic_long_inc(&num_poisoned_pages);
+			if (!TestSetPageHWPoison(page))
+				atomic_long_inc(&num_poisoned_pages);
 		}
 	} else {
 		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 5/5] mm/memory-failure: set PageHWPoison before migrate_pages()
  2015-07-31  6:46 ` Naoya Horiguchi
@ 2015-07-31  6:46   ` Naoya Horiguchi
  -1 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

Now page freeing code doesn't consider PageHWPoison as a bad page, so by
setting it before completing the page containment, we can prevent the error
page from being reused just after successful page migration.

I added TTU_IGNORE_HWPOISON for try_to_unmap() to make sure that the page
table entry is transformed into migration entry, not to hwpoison entry.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 7 ++++---
 mm/migrate.c        | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index cd985530f102..ea5a93659488 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1659,6 +1659,8 @@ static int __soft_offline_page(struct page *page, int flags)
 		inc_zone_page_state(page, NR_ISOLATED_ANON +
 					page_is_file_cache(page));
 		list_add(&page->lru, &pagelist);
+		if (!TestSetPageHWPoison(page))
+			atomic_long_inc(&num_poisoned_pages);
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (ret) {
@@ -1673,9 +1675,8 @@ static int __soft_offline_page(struct page *page, int flags)
 				pfn, ret, page->flags);
 			if (ret > 0)
 				ret = -EIO;
-		} else {
-			if (!TestSetPageHWPoison(page))
-				atomic_long_inc(&num_poisoned_pages);
+			if (TestClearPageHWPoison(page))
+				atomic_long_dec(&num_poisoned_pages);
 		}
 	} else {
 		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
diff --git v4.2-rc4.orig/mm/migrate.c v4.2-rc4/mm/migrate.c
index f2415be7d93b..eb4267107d1f 100644
--- v4.2-rc4.orig/mm/migrate.c
+++ v4.2-rc4/mm/migrate.c
@@ -880,7 +880,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	/* Establish migration ptes or remove ptes */
 	if (page_mapped(page)) {
 		try_to_unmap(page,
-			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
+			TTU_IGNORE_HWPOISON);
 		page_was_mapped = 1;
 	}
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 5/5] mm/memory-failure: set PageHWPoison before migrate_pages()
@ 2015-07-31  6:46   ` Naoya Horiguchi
  0 siblings, 0 replies; 14+ messages in thread
From: Naoya Horiguchi @ 2015-07-31  6:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
	Naoya Horiguchi

Now page freeing code doesn't consider PageHWPoison as a bad page, so by
setting it before completing the page containment, we can prevent the error
page from being reused just after successful page migration.

I added TTU_IGNORE_HWPOISON for try_to_unmap() to make sure that the page
table entry is transformed into migration entry, not to hwpoison entry.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 7 ++++---
 mm/migrate.c        | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index cd985530f102..ea5a93659488 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1659,6 +1659,8 @@ static int __soft_offline_page(struct page *page, int flags)
 		inc_zone_page_state(page, NR_ISOLATED_ANON +
 					page_is_file_cache(page));
 		list_add(&page->lru, &pagelist);
+		if (!TestSetPageHWPoison(page))
+			atomic_long_inc(&num_poisoned_pages);
 		ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
 					MIGRATE_SYNC, MR_MEMORY_FAILURE);
 		if (ret) {
@@ -1673,9 +1675,8 @@ static int __soft_offline_page(struct page *page, int flags)
 				pfn, ret, page->flags);
 			if (ret > 0)
 				ret = -EIO;
-		} else {
-			if (!TestSetPageHWPoison(page))
-				atomic_long_inc(&num_poisoned_pages);
+			if (TestClearPageHWPoison(page))
+				atomic_long_dec(&num_poisoned_pages);
 		}
 	} else {
 		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
diff --git v4.2-rc4.orig/mm/migrate.c v4.2-rc4/mm/migrate.c
index f2415be7d93b..eb4267107d1f 100644
--- v4.2-rc4.orig/mm/migrate.c
+++ v4.2-rc4/mm/migrate.c
@@ -880,7 +880,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	/* Establish migration ptes or remove ptes */
 	if (page_mapped(page)) {
 		try_to_unmap(page,
-			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
+			TTU_IGNORE_HWPOISON);
 		page_was_mapped = 1;
 	}
 
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page
  2015-07-31  6:46   ` Naoya Horiguchi
@ 2015-07-31 20:54     ` David Rientjes
  -1 siblings, 0 replies; 14+ messages in thread
From: David Rientjes @ 2015-07-31 20:54 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Andi Kleen, Dean Nelson, Tony Luck,
	Kirill A. Shutemov, Hugh Dickins, linux-mm, linux-kernel,
	Naoya Horiguchi

On Fri, 31 Jul 2015, Naoya Horiguchi wrote:

> In "just unpoisoned" path, we do put_page and then unlock_page, which is a
> wrong order and causes "freeing locked page" bug. So let's fix it.
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> ---
>  mm/memory-failure.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
> index c53543d89282..04d677048af7 100644
> --- v4.2-rc4.orig/mm/memory-failure.c
> +++ v4.2-rc4/mm/memory-failure.c
> @@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
>  	if (!PageHWPoison(p)) {
>  		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
>  		atomic_long_sub(nr_pages, &num_poisoned_pages);
> +		unlock_page(hpage);
>  		put_page(hpage);
> -		res = 0;
> -		goto out;
> +		return 0;
>  	}
>  	if (hwpoison_filter(p)) {
>  		if (TestClearPageHWPoison(p))

Looks like you could do the unlock_page() before either the printk or 
atomic_long_sub(), but probably not important.

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page
@ 2015-07-31 20:54     ` David Rientjes
  0 siblings, 0 replies; 14+ messages in thread
From: David Rientjes @ 2015-07-31 20:54 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Andi Kleen, Dean Nelson, Tony Luck,
	Kirill A. Shutemov, Hugh Dickins, linux-mm, linux-kernel,
	Naoya Horiguchi

On Fri, 31 Jul 2015, Naoya Horiguchi wrote:

> In "just unpoisoned" path, we do put_page and then unlock_page, which is a
> wrong order and causes "freeing locked page" bug. So let's fix it.
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> ---
>  mm/memory-failure.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
> index c53543d89282..04d677048af7 100644
> --- v4.2-rc4.orig/mm/memory-failure.c
> +++ v4.2-rc4/mm/memory-failure.c
> @@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
>  	if (!PageHWPoison(p)) {
>  		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
>  		atomic_long_sub(nr_pages, &num_poisoned_pages);
> +		unlock_page(hpage);
>  		put_page(hpage);
> -		res = 0;
> -		goto out;
> +		return 0;
>  	}
>  	if (hwpoison_filter(p)) {
>  		if (TestClearPageHWPoison(p))

Looks like you could do the unlock_page() before either the printk or 
atomic_long_sub(), but probably not important.

Acked-by: David Rientjes <rientjes@google.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-07-31 20:54 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-31  6:46 [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4 Naoya Horiguchi
2015-07-31  6:46 ` Naoya Horiguchi
2015-07-31  6:46 ` [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page Naoya Horiguchi
2015-07-31  6:46   ` Naoya Horiguchi
2015-07-31 20:54   ` David Rientjes
2015-07-31 20:54     ` David Rientjes
2015-07-31  6:46 ` [PATCH v2 3/5] mm/memory-failure: give up error handling for non-tail-refcounted thp Naoya Horiguchi
2015-07-31  6:46   ` Naoya Horiguchi
2015-07-31  6:46 ` [PATCH v2 4/5] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* Naoya Horiguchi
2015-07-31  6:46   ` Naoya Horiguchi
2015-07-31  6:46 ` [PATCH v2 2/5] mm/memory-failure: fix race in counting num_poisoned_pages Naoya Horiguchi
2015-07-31  6:46   ` Naoya Horiguchi
2015-07-31  6:46 ` [PATCH v2 5/5] mm/memory-failure: set PageHWPoison before migrate_pages() Naoya Horiguchi
2015-07-31  6:46   ` Naoya Horiguchi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.