All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/4] hwpoison: fixes on v4.2-rc2
@ 2015-07-16  1:41 ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

Recently I addressed a few of hwpoison race problems and the patches are merged
on v4.2-rc1. It made progress, but unfortunately some problems still remain due
to less coverage of my testing. So I'm trying to fix or avoid them in this series.

One point I'm expecting to discuss is that patch 4/4 changes the page flag set
to be checked on free time. In current behavior, __PG_HWPOISON is not supposed
to be set when the page is freed. I think that there is no strong reason for this
behavior, and it causes a problem hard to fix only in error handler side (because
__PG_HWPOISON could be set at arbitrary timing.) So I suggest to change it.

With this patchset, the stress testing in official mce-test testsuite passes.

Thanks,
Naoya Horiguchi
---
Tree: https://github.com/Naoya-Horiguchi/linux/tree/v4.2-rc2/hwpoison.v1
---
Summary:

Naoya Horiguchi (4):
      mm/memory-failure: unlock_page before put_page
      mm/memory-failure: fix race in counting num_poisoned_pages
      mm/memory-failure: give up error handling for non-tail-refcounted thp
      mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*

 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/memory-failure.c        | 32 ++++++++++++++++++--------------
 mm/migrate.c               |  9 +++------
 mm/page_alloc.c            |  4 ++++
 5 files changed, 33 insertions(+), 29 deletions(-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v1 0/4] hwpoison: fixes on v4.2-rc2
@ 2015-07-16  1:41 ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

Recently I addressed a few of hwpoison race problems and the patches are merged
on v4.2-rc1. It made progress, but unfortunately some problems still remain due
to less coverage of my testing. So I'm trying to fix or avoid them in this series.

One point I'm expecting to discuss is that patch 4/4 changes the page flag set
to be checked on free time. In current behavior, __PG_HWPOISON is not supposed
to be set when the page is freed. I think that there is no strong reason for this
behavior, and it causes a problem hard to fix only in error handler side (because
__PG_HWPOISON could be set at arbitrary timing.) So I suggest to change it.

With this patchset, the stress testing in official mce-test testsuite passes.

Thanks,
Naoya Horiguchi
---
Tree: https://github.com/Naoya-Horiguchi/linux/tree/v4.2-rc2/hwpoison.v1
---
Summary:

Naoya Horiguchi (4):
      mm/memory-failure: unlock_page before put_page
      mm/memory-failure: fix race in counting num_poisoned_pages
      mm/memory-failure: give up error handling for non-tail-refcounted thp
      mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*

 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/memory-failure.c        | 32 ++++++++++++++++++--------------
 mm/migrate.c               |  9 +++------
 mm/page_alloc.c            |  4 ++++
 5 files changed, 33 insertions(+), 29 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
  2015-07-16  1:41 ` Naoya Horiguchi
@ 2015-07-16  1:41   ` Naoya Horiguchi
  -1 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
free target page in successful page migration") was not closed completely,
because that can happen not only for soft-offline, but also for hard-offline.
Consider that a slab page is about to be freed into buddy pool, and then an
uncorrected memory error hits the page just after entering __free_one_page(),
then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
despite the fact that it's not necessary because the data on the affected
page is not consumed.

To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time. I think it's justified because __PG_HWPOISON flags is
defined to prevent the page from being reused and setting it outside the
page's alloc-free cycle is a designed behavior (not a bug.)

And the patch reverts most of the changes from commit add05cecef80 about
the new refcounting rule of soft-offlined pages, which is no longer necessary.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/memory-failure.c        |  6 +++++-
 mm/migrate.c               |  9 +++------
 mm/page_alloc.c            |  4 ++++
 5 files changed, 20 insertions(+), 16 deletions(-)

diff --git v4.2-rc2.orig/include/linux/page-flags.h v4.2-rc2/include/linux/page-flags.h
index f34e040b34e9..53400f101f2d 100644
--- v4.2-rc2.orig/include/linux/page-flags.h
+++ v4.2-rc2/include/linux/page-flags.h
@@ -631,15 +631,19 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
 	 1 << PG_private | 1 << PG_private_2 | \
 	 1 << PG_writeback | 1 << PG_reserved | \
 	 1 << PG_slab	 | 1 << PG_swapcache | 1 << PG_active | \
-	 1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
+	 1 << PG_unevictable | __PG_MLOCKED | \
 	 __PG_COMPOUND_LOCK)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have any flags set.  It they are set,
+ * Pages being prepped should not have these flags set.  It they are set,
  * there has been a kernel bug or struct page corruption.
+ *
+ * __PG_HWPOISON is exceptional because it need to be kept beyond page's
+ * alloc-free cycle to prevent from reusing the page.
  */
-#define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_CHECK_AT_PREP	\
+	(((1 << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)
diff --git v4.2-rc2.orig/mm/huge_memory.c v4.2-rc2/mm/huge_memory.c
index c107094f79ba..097c7a4bfbd9 100644
--- v4.2-rc2.orig/mm/huge_memory.c
+++ v4.2-rc2/mm/huge_memory.c
@@ -1676,12 +1676,7 @@ static void __split_huge_page_refcount(struct page *page,
 		/* after clearing PageTail the gup refcount can be released */
 		smp_mb__after_atomic();
 
-		/*
-		 * retain hwpoison flag of the poisoned tail page:
-		 *   fix for the unsuitable process killed on Guest Machine(KVM)
-		 *   by the memory-failure.
-		 */
-		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP | __PG_HWPOISON;
+		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		page_tail->flags |= (page->flags &
 				     ((1L << PG_referenced) |
 				      (1L << PG_swapbacked) |
diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index 421d7c9b30f4..755f87e4ec64 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -1723,6 +1723,9 @@ int soft_offline_page(struct page *page, int flags)
 
 	get_online_mems();
 
+	if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
+		set_migratetype_isolate(page, true);
+
 	ret = get_any_page(page, pfn, flags);
 	put_online_mems();
 	if (ret > 0) { /* for in-use pages */
@@ -1730,7 +1733,7 @@ int soft_offline_page(struct page *page, int flags)
 			ret = soft_offline_huge_page(page, flags);
 		else
 			ret = __soft_offline_page(page, flags);
-	} else if (ret == 0) { /* for free pages */
+	} else if (ret == 0) {
 		if (PageHuge(page)) {
 			set_page_hwpoison_huge_page(hpage);
 			if (!dequeue_hwpoisoned_huge_page(hpage))
@@ -1741,5 +1744,6 @@ int soft_offline_page(struct page *page, int flags)
 				atomic_long_inc(&num_poisoned_pages);
 		}
 	}
+	unset_migratetype_isolate(page, MIGRATE_MOVABLE);
 	return ret;
 }
diff --git v4.2-rc2.orig/mm/migrate.c v4.2-rc2/mm/migrate.c
index ee401e4e5ef1..c37d5772767b 100644
--- v4.2-rc2.orig/mm/migrate.c
+++ v4.2-rc2/mm/migrate.c
@@ -918,8 +918,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 				   free_page_t put_new_page,
 				   unsigned long private, struct page *page,
-				   int force, enum migrate_mode mode,
-				   enum migrate_reason reason)
+				   int force, enum migrate_mode mode)
 {
 	int rc = 0;
 	int *result = NULL;
@@ -950,8 +949,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		if (reason != MR_MEMORY_FAILURE)
-			putback_lru_page(page);
+		putback_lru_page(page);
 	}
 
 	/*
@@ -1124,8 +1122,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 						pass > 2, mode);
 			else
 				rc = unmap_and_move(get_new_page, put_new_page,
-						private, page, pass > 2, mode,
-						reason);
+						private, page, pass > 2, mode);
 
 			switch(rc) {
 			case -ENOMEM:
diff --git v4.2-rc2.orig/mm/page_alloc.c v4.2-rc2/mm/page_alloc.c
index 506eac8b38af..e32d58ce5d2f 100644
--- v4.2-rc2.orig/mm/page_alloc.c
+++ v4.2-rc2/mm/page_alloc.c
@@ -1287,6 +1287,10 @@ static inline int check_new_page(struct page *page)
 		bad_reason = "non-NULL mapping";
 	if (unlikely(atomic_read(&page->_count) != 0))
 		bad_reason = "nonzero _count";
+	if (unlikely(page->flags & __PG_HWPOISON)) {
+		bad_reason = "HWPoisoned (hardware-corrupted)";
+		bad_flags = __PG_HWPOISON;
+	}
 	if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
 		bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
 		bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 2/4] mm/memory-failure: fix race in counting num_poisoned_pages
  2015-07-16  1:41 ` Naoya Horiguchi
@ 2015-07-16  1:41   ` Naoya Horiguchi
  -1 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

When memory_failure() is called on a page which are just freed after page
migration from soft offlining, the counter num_poisoned_pages is raised twice.
So let's fix it with using TestSetPageHWPoison.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index 04d677048af7..f72d2fad0b90 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -1671,8 +1671,8 @@ static int __soft_offline_page(struct page *page, int flags)
 			if (ret > 0)
 				ret = -EIO;
 		} else {
-			SetPageHWPoison(page);
-			atomic_long_inc(&num_poisoned_pages);
+			if (!TestSetPageHWPoison(page))
+				atomic_long_inc(&num_poisoned_pages);
 		}
 	} else {
 		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
  2015-07-16  1:41 ` Naoya Horiguchi
@ 2015-07-16  1:41   ` Naoya Horiguchi
  -1 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

"non anonymous thp" case is still racy with freeing thp, which causes panic
due to put_page() for refcount-0 page. It seems that closing up this race
might be hard (and/or not worth doing,) so let's give up the error handling
for this case.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index f72d2fad0b90..421d7c9b30f4 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
 	 * directly for tail pages.
 	 */
 	if (PageTransHuge(head)) {
+		/*
+		 * Non anonymous thp exists only in allocation/free time. We
+		 * can't handle such a case correctly, so let's give it up.
+		 * This should be better than triggering BUG_ON when kernel
+		 * tries to touch a "partially handled" page.
+		 */
+		if (!PageAnon(head))
+			return 0;
+
 		if (get_page_unless_zero(head)) {
 			if (PageTail(page))
 				get_page(page);
@@ -1134,15 +1143,6 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	}
 
 	if (!PageHuge(p) && PageTransHuge(hpage)) {
-		if (!PageAnon(hpage)) {
-			pr_err("MCE: %#lx: non anonymous thp\n", pfn);
-			if (TestClearPageHWPoison(p))
-				atomic_long_sub(nr_pages, &num_poisoned_pages);
-			put_page(p);
-			if (p != hpage)
-				put_page(hpage);
-			return -EBUSY;
-		}
 		if (unlikely(split_huge_page(hpage))) {
 			pr_err("MCE: %#lx: thp split failed\n", pfn);
 			if (TestClearPageHWPoison(p))
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 1/4] mm/memory-failure: unlock_page before put_page
  2015-07-16  1:41 ` Naoya Horiguchi
@ 2015-07-16  1:41   ` Naoya Horiguchi
  -1 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

In "just unpoisoned" path, we do put_page and then unlock_page, which is a
wrong order and causes "freeing locked page" bug. So let's fix it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index c53543d89282..04d677048af7 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	if (!PageHWPoison(p)) {
 		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
 		atomic_long_sub(nr_pages, &num_poisoned_pages);
+		unlock_page(hpage);
 		put_page(hpage);
-		res = 0;
-		goto out;
+		return 0;
 	}
 	if (hwpoison_filter(p)) {
 		if (TestClearPageHWPoison(p))
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 1/4] mm/memory-failure: unlock_page before put_page
@ 2015-07-16  1:41   ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

In "just unpoisoned" path, we do put_page and then unlock_page, which is a
wrong order and causes "freeing locked page" bug. So let's fix it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index c53543d89282..04d677048af7 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	if (!PageHWPoison(p)) {
 		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
 		atomic_long_sub(nr_pages, &num_poisoned_pages);
+		unlock_page(hpage);
 		put_page(hpage);
-		res = 0;
-		goto out;
+		return 0;
 	}
 	if (hwpoison_filter(p)) {
 		if (TestClearPageHWPoison(p))
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 2/4] mm/memory-failure: fix race in counting num_poisoned_pages
@ 2015-07-16  1:41   ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

When memory_failure() is called on a page which are just freed after page
migration from soft offlining, the counter num_poisoned_pages is raised twice.
So let's fix it with using TestSetPageHWPoison.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index 04d677048af7..f72d2fad0b90 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -1671,8 +1671,8 @@ static int __soft_offline_page(struct page *page, int flags)
 			if (ret > 0)
 				ret = -EIO;
 		} else {
-			SetPageHWPoison(page);
-			atomic_long_inc(&num_poisoned_pages);
+			if (!TestSetPageHWPoison(page))
+				atomic_long_inc(&num_poisoned_pages);
 		}
 	} else {
 		pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
@ 2015-07-16  1:41   ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
free target page in successful page migration") was not closed completely,
because that can happen not only for soft-offline, but also for hard-offline.
Consider that a slab page is about to be freed into buddy pool, and then an
uncorrected memory error hits the page just after entering __free_one_page(),
then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
despite the fact that it's not necessary because the data on the affected
page is not consumed.

To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time. I think it's justified because __PG_HWPOISON flags is
defined to prevent the page from being reused and setting it outside the
page's alloc-free cycle is a designed behavior (not a bug.)

And the patch reverts most of the changes from commit add05cecef80 about
the new refcounting rule of soft-offlined pages, which is no longer necessary.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 include/linux/page-flags.h | 10 +++++++---
 mm/huge_memory.c           |  7 +------
 mm/memory-failure.c        |  6 +++++-
 mm/migrate.c               |  9 +++------
 mm/page_alloc.c            |  4 ++++
 5 files changed, 20 insertions(+), 16 deletions(-)

diff --git v4.2-rc2.orig/include/linux/page-flags.h v4.2-rc2/include/linux/page-flags.h
index f34e040b34e9..53400f101f2d 100644
--- v4.2-rc2.orig/include/linux/page-flags.h
+++ v4.2-rc2/include/linux/page-flags.h
@@ -631,15 +631,19 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
 	 1 << PG_private | 1 << PG_private_2 | \
 	 1 << PG_writeback | 1 << PG_reserved | \
 	 1 << PG_slab	 | 1 << PG_swapcache | 1 << PG_active | \
-	 1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
+	 1 << PG_unevictable | __PG_MLOCKED | \
 	 __PG_COMPOUND_LOCK)
 
 /*
  * Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have any flags set.  It they are set,
+ * Pages being prepped should not have these flags set.  It they are set,
  * there has been a kernel bug or struct page corruption.
+ *
+ * __PG_HWPOISON is exceptional because it need to be kept beyond page's
+ * alloc-free cycle to prevent from reusing the page.
  */
-#define PAGE_FLAGS_CHECK_AT_PREP	((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_CHECK_AT_PREP	\
+	(((1 << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
 
 #define PAGE_FLAGS_PRIVATE				\
 	(1 << PG_private | 1 << PG_private_2)
diff --git v4.2-rc2.orig/mm/huge_memory.c v4.2-rc2/mm/huge_memory.c
index c107094f79ba..097c7a4bfbd9 100644
--- v4.2-rc2.orig/mm/huge_memory.c
+++ v4.2-rc2/mm/huge_memory.c
@@ -1676,12 +1676,7 @@ static void __split_huge_page_refcount(struct page *page,
 		/* after clearing PageTail the gup refcount can be released */
 		smp_mb__after_atomic();
 
-		/*
-		 * retain hwpoison flag of the poisoned tail page:
-		 *   fix for the unsuitable process killed on Guest Machine(KVM)
-		 *   by the memory-failure.
-		 */
-		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP | __PG_HWPOISON;
+		page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		page_tail->flags |= (page->flags &
 				     ((1L << PG_referenced) |
 				      (1L << PG_swapbacked) |
diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index 421d7c9b30f4..755f87e4ec64 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -1723,6 +1723,9 @@ int soft_offline_page(struct page *page, int flags)
 
 	get_online_mems();
 
+	if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
+		set_migratetype_isolate(page, true);
+
 	ret = get_any_page(page, pfn, flags);
 	put_online_mems();
 	if (ret > 0) { /* for in-use pages */
@@ -1730,7 +1733,7 @@ int soft_offline_page(struct page *page, int flags)
 			ret = soft_offline_huge_page(page, flags);
 		else
 			ret = __soft_offline_page(page, flags);
-	} else if (ret == 0) { /* for free pages */
+	} else if (ret == 0) {
 		if (PageHuge(page)) {
 			set_page_hwpoison_huge_page(hpage);
 			if (!dequeue_hwpoisoned_huge_page(hpage))
@@ -1741,5 +1744,6 @@ int soft_offline_page(struct page *page, int flags)
 				atomic_long_inc(&num_poisoned_pages);
 		}
 	}
+	unset_migratetype_isolate(page, MIGRATE_MOVABLE);
 	return ret;
 }
diff --git v4.2-rc2.orig/mm/migrate.c v4.2-rc2/mm/migrate.c
index ee401e4e5ef1..c37d5772767b 100644
--- v4.2-rc2.orig/mm/migrate.c
+++ v4.2-rc2/mm/migrate.c
@@ -918,8 +918,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 				   free_page_t put_new_page,
 				   unsigned long private, struct page *page,
-				   int force, enum migrate_mode mode,
-				   enum migrate_reason reason)
+				   int force, enum migrate_mode mode)
 {
 	int rc = 0;
 	int *result = NULL;
@@ -950,8 +949,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		if (reason != MR_MEMORY_FAILURE)
-			putback_lru_page(page);
+		putback_lru_page(page);
 	}
 
 	/*
@@ -1124,8 +1122,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 						pass > 2, mode);
 			else
 				rc = unmap_and_move(get_new_page, put_new_page,
-						private, page, pass > 2, mode,
-						reason);
+						private, page, pass > 2, mode);
 
 			switch(rc) {
 			case -ENOMEM:
diff --git v4.2-rc2.orig/mm/page_alloc.c v4.2-rc2/mm/page_alloc.c
index 506eac8b38af..e32d58ce5d2f 100644
--- v4.2-rc2.orig/mm/page_alloc.c
+++ v4.2-rc2/mm/page_alloc.c
@@ -1287,6 +1287,10 @@ static inline int check_new_page(struct page *page)
 		bad_reason = "non-NULL mapping";
 	if (unlikely(atomic_read(&page->_count) != 0))
 		bad_reason = "nonzero _count";
+	if (unlikely(page->flags & __PG_HWPOISON)) {
+		bad_reason = "HWPoisoned (hardware-corrupted)";
+		bad_flags = __PG_HWPOISON;
+	}
 	if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
 		bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
 		bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
@ 2015-07-16  1:41   ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  1:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

"non anonymous thp" case is still racy with freeing thp, which causes panic
due to put_page() for refcount-0 page. It seems that closing up this race
might be hard (and/or not worth doing,) so let's give up the error handling
for this case.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/memory-failure.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git v4.2-rc2.orig/mm/memory-failure.c v4.2-rc2/mm/memory-failure.c
index f72d2fad0b90..421d7c9b30f4 100644
--- v4.2-rc2.orig/mm/memory-failure.c
+++ v4.2-rc2/mm/memory-failure.c
@@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
 	 * directly for tail pages.
 	 */
 	if (PageTransHuge(head)) {
+		/*
+		 * Non anonymous thp exists only in allocation/free time. We
+		 * can't handle such a case correctly, so let's give it up.
+		 * This should be better than triggering BUG_ON when kernel
+		 * tries to touch a "partially handled" page.
+		 */
+		if (!PageAnon(head))
+			return 0;
+
 		if (get_page_unless_zero(head)) {
 			if (PageTail(page))
 				get_page(page);
@@ -1134,15 +1143,6 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
 	}
 
 	if (!PageHuge(p) && PageTransHuge(hpage)) {
-		if (!PageAnon(hpage)) {
-			pr_err("MCE: %#lx: non anonymous thp\n", pfn);
-			if (TestClearPageHWPoison(p))
-				atomic_long_sub(nr_pages, &num_poisoned_pages);
-			put_page(p);
-			if (p != hpage)
-				put_page(hpage);
-			return -EBUSY;
-		}
 		if (unlikely(split_huge_page(hpage))) {
 			pr_err("MCE: %#lx: thp split failed\n", pfn);
 			if (TestClearPageHWPoison(p))
-- 
2.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
  2015-07-16  1:41   ` Naoya Horiguchi
@ 2015-07-16  2:33     ` Andi Kleen
  -1 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2015-07-16  2:33 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Andi Kleen, Dean Nelson, Tony Luck,
	Kirill A. Shutemov, Hugh Dickins, linux-mm, linux-kernel,
	Naoya Horiguchi

> @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
>  	 * directly for tail pages.
>  	 */
>  	if (PageTransHuge(head)) {
> +		/*
> +		 * Non anonymous thp exists only in allocation/free time. We
> +		 * can't handle such a case correctly, so let's give it up.
> +		 * This should be better than triggering BUG_ON when kernel
> +		 * tries to touch a "partially handled" page.
> +		 */
> +		if (!PageAnon(head))
> +			return 0;

Please print a message for this case. In the future there will be
likely more non anonymous THP pages from Kirill's large page cache work
(so eventually we'll need it)

-Andi


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
@ 2015-07-16  2:33     ` Andi Kleen
  0 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2015-07-16  2:33 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Andi Kleen, Dean Nelson, Tony Luck,
	Kirill A. Shutemov, Hugh Dickins, linux-mm, linux-kernel,
	Naoya Horiguchi

> @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
>  	 * directly for tail pages.
>  	 */
>  	if (PageTransHuge(head)) {
> +		/*
> +		 * Non anonymous thp exists only in allocation/free time. We
> +		 * can't handle such a case correctly, so let's give it up.
> +		 * This should be better than triggering BUG_ON when kernel
> +		 * tries to touch a "partially handled" page.
> +		 */
> +		if (!PageAnon(head))
> +			return 0;

Please print a message for this case. In the future there will be
likely more non anonymous THP pages from Kirill's large page cache work
(so eventually we'll need it)

-Andi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
  2015-07-16  2:33     ` Andi Kleen
@ 2015-07-16  2:41       ` Naoya Horiguchi
  -1 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  2:41 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

On Thu, Jul 16, 2015 at 04:33:07AM +0200, Andi Kleen wrote:
> > @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
> >  	 * directly for tail pages.
> >  	 */
> >  	if (PageTransHuge(head)) {
> > +		/*
> > +		 * Non anonymous thp exists only in allocation/free time. We
> > +		 * can't handle such a case correctly, so let's give it up.
> > +		 * This should be better than triggering BUG_ON when kernel
> > +		 * tries to touch a "partially handled" page.
> > +		 */
> > +		if (!PageAnon(head))
> > +			return 0;
> 
> Please print a message for this case. In the future there will be
> likely more non anonymous THP pages from Kirill's large page cache work
> (so eventually we'll need it)

OK, I'll do this.

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp
@ 2015-07-16  2:41       ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-16  2:41 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

On Thu, Jul 16, 2015 at 04:33:07AM +0200, Andi Kleen wrote:
> > @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
> >  	 * directly for tail pages.
> >  	 */
> >  	if (PageTransHuge(head)) {
> > +		/*
> > +		 * Non anonymous thp exists only in allocation/free time. We
> > +		 * can't handle such a case correctly, so let's give it up.
> > +		 * This should be better than triggering BUG_ON when kernel
> > +		 * tries to touch a "partially handled" page.
> > +		 */
> > +		if (!PageAnon(head))
> > +			return 0;
> 
> Please print a message for this case. In the future there will be
> likely more non anonymous THP pages from Kirill's large page cache work
> (so eventually we'll need it)

OK, I'll do this.

Thanks,
Naoya Horiguchi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
  2015-07-16  1:41   ` Naoya Horiguchi
@ 2015-07-23 20:37     ` Andrew Morton
  -1 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2015-07-23 20:37 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

On Thu, 16 Jul 2015 01:41:56 +0000 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
> free target page in successful page migration") was not closed completely,
> because that can happen not only for soft-offline, but also for hard-offline.
> Consider that a slab page is about to be freed into buddy pool, and then an
> uncorrected memory error hits the page just after entering __free_one_page(),
> then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
> despite the fact that it's not necessary because the data on the affected
> page is not consumed.
> 
> To solve it, this patch drops __PG_HWPOISON from page flag checks at
> allocation/free time. I think it's justified because __PG_HWPOISON flags is
> defined to prevent the page from being reused and setting it outside the
> page's alloc-free cycle is a designed behavior (not a bug.)
> 
> And the patch reverts most of the changes from commit add05cecef80 about
> the new refcounting rule of soft-offlined pages, which is no longer necessary.
> 
> ...
>
> --- v4.2-rc2.orig/mm/memory-failure.c
> +++ v4.2-rc2/mm/memory-failure.c
> @@ -1723,6 +1723,9 @@ int soft_offline_page(struct page *page, int flags)
>  
>  	get_online_mems();
>  
> +	if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> +		set_migratetype_isolate(page, true);
> +
>  	ret = get_any_page(page, pfn, flags);
>  	put_online_mems();
>  	if (ret > 0) { /* for in-use pages */

This patch gets build-broken by your
mm-page_isolation-make-set-unset_migratetype_isolate-file-local.patch,
which I shall drop.

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm, page_isolation: make set/unset_migratetype_isolate() file-local

Nowaday, set/unset_migratetype_isolate() is defined and used only in
mm/page_isolation, so let's limit the scope within the file.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/page-isolation.h |    5 -----
 mm/page_isolation.c            |    5 +++--
 2 files changed, 3 insertions(+), 7 deletions(-)

diff -puN include/linux/page-isolation.h~mm-page_isolation-make-set-unset_migratetype_isolate-file-local include/linux/page-isolation.h
--- a/include/linux/page-isolation.h~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
+++ a/include/linux/page-isolation.h
@@ -65,11 +65,6 @@ undo_isolate_page_range(unsigned long st
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 			bool skip_hwpoisoned_pages);
 
-/*
- * Internal functions. Changes pageblock's migrate type.
- */
-int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages);
-void unset_migratetype_isolate(struct page *page, unsigned migratetype);
 struct page *alloc_migrate_target(struct page *page, unsigned long private,
 				int **resultp);
 
diff -puN mm/page_isolation.c~mm-page_isolation-make-set-unset_migratetype_isolate-file-local mm/page_isolation.c
--- a/mm/page_isolation.c~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
+++ a/mm/page_isolation.c
@@ -9,7 +9,8 @@
 #include <linux/hugetlb.h>
 #include "internal.h"
 
-int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages)
+static int set_migratetype_isolate(struct page *page,
+				bool skip_hwpoisoned_pages)
 {
 	struct zone *zone;
 	unsigned long flags, pfn;
@@ -72,7 +73,7 @@ out:
 	return ret;
 }
 
-void unset_migratetype_isolate(struct page *page, unsigned migratetype)
+static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 {
 	struct zone *zone;
 	unsigned long flags, nr_pages;
_


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
@ 2015-07-23 20:37     ` Andrew Morton
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2015-07-23 20:37 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

On Thu, 16 Jul 2015 01:41:56 +0000 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
> free target page in successful page migration") was not closed completely,
> because that can happen not only for soft-offline, but also for hard-offline.
> Consider that a slab page is about to be freed into buddy pool, and then an
> uncorrected memory error hits the page just after entering __free_one_page(),
> then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
> despite the fact that it's not necessary because the data on the affected
> page is not consumed.
> 
> To solve it, this patch drops __PG_HWPOISON from page flag checks at
> allocation/free time. I think it's justified because __PG_HWPOISON flags is
> defined to prevent the page from being reused and setting it outside the
> page's alloc-free cycle is a designed behavior (not a bug.)
> 
> And the patch reverts most of the changes from commit add05cecef80 about
> the new refcounting rule of soft-offlined pages, which is no longer necessary.
> 
> ...
>
> --- v4.2-rc2.orig/mm/memory-failure.c
> +++ v4.2-rc2/mm/memory-failure.c
> @@ -1723,6 +1723,9 @@ int soft_offline_page(struct page *page, int flags)
>  
>  	get_online_mems();
>  
> +	if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> +		set_migratetype_isolate(page, true);
> +
>  	ret = get_any_page(page, pfn, flags);
>  	put_online_mems();
>  	if (ret > 0) { /* for in-use pages */

This patch gets build-broken by your
mm-page_isolation-make-set-unset_migratetype_isolate-file-local.patch,
which I shall drop.

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: mm, page_isolation: make set/unset_migratetype_isolate() file-local

Nowaday, set/unset_migratetype_isolate() is defined and used only in
mm/page_isolation, so let's limit the scope within the file.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/page-isolation.h |    5 -----
 mm/page_isolation.c            |    5 +++--
 2 files changed, 3 insertions(+), 7 deletions(-)

diff -puN include/linux/page-isolation.h~mm-page_isolation-make-set-unset_migratetype_isolate-file-local include/linux/page-isolation.h
--- a/include/linux/page-isolation.h~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
+++ a/include/linux/page-isolation.h
@@ -65,11 +65,6 @@ undo_isolate_page_range(unsigned long st
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 			bool skip_hwpoisoned_pages);
 
-/*
- * Internal functions. Changes pageblock's migrate type.
- */
-int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages);
-void unset_migratetype_isolate(struct page *page, unsigned migratetype);
 struct page *alloc_migrate_target(struct page *page, unsigned long private,
 				int **resultp);
 
diff -puN mm/page_isolation.c~mm-page_isolation-make-set-unset_migratetype_isolate-file-local mm/page_isolation.c
--- a/mm/page_isolation.c~mm-page_isolation-make-set-unset_migratetype_isolate-file-local
+++ a/mm/page_isolation.c
@@ -9,7 +9,8 @@
 #include <linux/hugetlb.h>
 #include "internal.h"
 
-int set_migratetype_isolate(struct page *page, bool skip_hwpoisoned_pages)
+static int set_migratetype_isolate(struct page *page,
+				bool skip_hwpoisoned_pages)
 {
 	struct zone *zone;
 	unsigned long flags, pfn;
@@ -72,7 +73,7 @@ out:
 	return ret;
 }
 
-void unset_migratetype_isolate(struct page *page, unsigned migratetype)
+static void unset_migratetype_isolate(struct page *page, unsigned migratetype)
 {
 	struct zone *zone;
 	unsigned long flags, nr_pages;
_

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
  2015-07-23 20:37     ` Andrew Morton
@ 2015-07-23 23:13       ` Naoya Horiguchi
  -1 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-23 23:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

On Thu, Jul 23, 2015 at 01:37:02PM -0700, Andrew Morton wrote:
> On Thu, 16 Jul 2015 01:41:56 +0000 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
> > free target page in successful page migration") was not closed completely,
> > because that can happen not only for soft-offline, but also for hard-offline.
> > Consider that a slab page is about to be freed into buddy pool, and then an
> > uncorrected memory error hits the page just after entering __free_one_page(),
> > then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
> > despite the fact that it's not necessary because the data on the affected
> > page is not consumed.
> > 
> > To solve it, this patch drops __PG_HWPOISON from page flag checks at
> > allocation/free time. I think it's justified because __PG_HWPOISON flags is
> > defined to prevent the page from being reused and setting it outside the
> > page's alloc-free cycle is a designed behavior (not a bug.)
> > 
> > And the patch reverts most of the changes from commit add05cecef80 about
> > the new refcounting rule of soft-offlined pages, which is no longer necessary.
> > 
> > ...
> >
> > --- v4.2-rc2.orig/mm/memory-failure.c
> > +++ v4.2-rc2/mm/memory-failure.c
> > @@ -1723,6 +1723,9 @@ int soft_offline_page(struct page *page, int flags)
> >  
> >  	get_online_mems();
> >  
> > +	if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> > +		set_migratetype_isolate(page, true);
> > +
> >  	ret = get_any_page(page, pfn, flags);
> >  	put_online_mems();
> >  	if (ret > 0) { /* for in-use pages */
> 
> This patch gets build-broken by your
> mm-page_isolation-make-set-unset_migratetype_isolate-file-local.patch,
> which I shall drop.

I apologize this build failure. At first I planned to add another hwpoison patch
after this to remove this migratetype thing separately, but I was not 100% sure
of the correctness, so I did not include it in this version.
But Vlastimil's cleanup patch showed me that using MIGRATE_ISOLATE at free time
(, which is what soft offline code does now,) is wrong (or not an expected usage).
So I shouldn't have reverted the above part.

So I want the patch "mm, page_isolation: make set/unset_migratetype_isolate()
file-local" to be merged first, and I'd like to update this hwpoison before
going into mmotm. Could you drop this series from your tree for now?
I'll repost the next version probably next week.

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
@ 2015-07-23 23:13       ` Naoya Horiguchi
  0 siblings, 0 replies; 18+ messages in thread
From: Naoya Horiguchi @ 2015-07-23 23:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
	Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi

On Thu, Jul 23, 2015 at 01:37:02PM -0700, Andrew Morton wrote:
> On Thu, 16 Jul 2015 01:41:56 +0000 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
> > free target page in successful page migration") was not closed completely,
> > because that can happen not only for soft-offline, but also for hard-offline.
> > Consider that a slab page is about to be freed into buddy pool, and then an
> > uncorrected memory error hits the page just after entering __free_one_page(),
> > then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
> > despite the fact that it's not necessary because the data on the affected
> > page is not consumed.
> > 
> > To solve it, this patch drops __PG_HWPOISON from page flag checks at
> > allocation/free time. I think it's justified because __PG_HWPOISON flags is
> > defined to prevent the page from being reused and setting it outside the
> > page's alloc-free cycle is a designed behavior (not a bug.)
> > 
> > And the patch reverts most of the changes from commit add05cecef80 about
> > the new refcounting rule of soft-offlined pages, which is no longer necessary.
> > 
> > ...
> >
> > --- v4.2-rc2.orig/mm/memory-failure.c
> > +++ v4.2-rc2/mm/memory-failure.c
> > @@ -1723,6 +1723,9 @@ int soft_offline_page(struct page *page, int flags)
> >  
> >  	get_online_mems();
> >  
> > +	if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> > +		set_migratetype_isolate(page, true);
> > +
> >  	ret = get_any_page(page, pfn, flags);
> >  	put_online_mems();
> >  	if (ret > 0) { /* for in-use pages */
> 
> This patch gets build-broken by your
> mm-page_isolation-make-set-unset_migratetype_isolate-file-local.patch,
> which I shall drop.

I apologize this build failure. At first I planned to add another hwpoison patch
after this to remove this migratetype thing separately, but I was not 100% sure
of the correctness, so I did not include it in this version.
But Vlastimil's cleanup patch showed me that using MIGRATE_ISOLATE at free time
(, which is what soft offline code does now,) is wrong (or not an expected usage).
So I shouldn't have reverted the above part.

So I want the patch "mm, page_isolation: make set/unset_migratetype_isolate()
file-local" to be merged first, and I'd like to update this hwpoison before
going into mmotm. Could you drop this series from your tree for now?
I'll repost the next version probably next week.

Thanks,
Naoya Horiguchi
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-07-23 23:17 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-16  1:41 [PATCH v1 0/4] hwpoison: fixes on v4.2-rc2 Naoya Horiguchi
2015-07-16  1:41 ` Naoya Horiguchi
2015-07-16  1:41 ` [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp Naoya Horiguchi
2015-07-16  1:41   ` Naoya Horiguchi
2015-07-16  2:33   ` Andi Kleen
2015-07-16  2:33     ` Andi Kleen
2015-07-16  2:41     ` Naoya Horiguchi
2015-07-16  2:41       ` Naoya Horiguchi
2015-07-16  1:41 ` [PATCH v1 2/4] mm/memory-failure: fix race in counting num_poisoned_pages Naoya Horiguchi
2015-07-16  1:41   ` Naoya Horiguchi
2015-07-16  1:41 ` [PATCH v1 4/4] mm/memory-failure: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* Naoya Horiguchi
2015-07-16  1:41   ` Naoya Horiguchi
2015-07-23 20:37   ` Andrew Morton
2015-07-23 20:37     ` Andrew Morton
2015-07-23 23:13     ` Naoya Horiguchi
2015-07-23 23:13       ` Naoya Horiguchi
2015-07-16  1:41 ` [PATCH v1 1/4] mm/memory-failure: unlock_page before put_page Naoya Horiguchi
2015-07-16  1:41   ` Naoya Horiguchi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.