All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yang Shi <shy828301@gmail.com>
To: naoya.horiguchi@nec.com, hughd@google.com,
	kirill.shutemov@linux.intel.com, willy@infradead.org,
	peterx@redhat.com, osalvador@suse.de, akpm@linux-foundation.org
Cc: shy828301@gmail.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
Date: Thu, 30 Sep 2021 14:53:08 -0700	[thread overview]
Message-ID: <20210930215311.240774-3-shy828301@gmail.com> (raw)
In-Reply-To: <20210930215311.240774-1-shy828301@gmail.com>

When handling shmem page fault the THP with corrupted subpage could be PMD
mapped if certain conditions are satisfied.  But kernel is supposed to
send SIGBUS when trying to map hwpoisoned page.

There are two paths which may do PMD map: fault around and regular fault.

Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
the thing was even worse in fault around path.  The THP could be PMD mapped as
long as the VMA fits regardless what subpage is accessed and corrupted.  After
this commit as long as head page is not corrupted the THP could be PMD mapped.

In the regular fault path the THP could be PMD mapped as long as the corrupted
page is not accessed and the VMA fits.

This loophole could be fixed by iterating every subpage to check if any
of them is hwpoisoned or not, but it is somewhat costly in page fault path.

So introduce a new page flag called HasHWPoisoned on the first tail page.  It
indicates the THP has hwpoisoned subpage(s).  It is set if any subpage of THP
is found hwpoisoned by memory failure and cleared when the THP is freed or
split.

Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Cc: <stable@vger.kernel.org>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 include/linux/page-flags.h | 19 +++++++++++++++++++
 mm/filemap.c               | 12 ++++++------
 mm/huge_memory.c           |  2 ++
 mm/memory-failure.c        |  6 +++++-
 mm/memory.c                |  9 +++++++++
 mm/page_alloc.c            |  4 +++-
 6 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index a558d67ee86f..a357b41b3057 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -171,6 +171,11 @@ enum pageflags {
 	/* Compound pages. Stored in first tail page's flags */
 	PG_double_map = PG_workingset,
 
+#ifdef CONFIG_MEMORY_FAILURE
+	/* Compound pages. Stored in first tail page's flags */
+	PG_has_hwpoisoned = PG_mappedtodisk,
+#endif
+
 	/* non-lru isolated movable page */
 	PG_isolated = PG_reclaim,
 
@@ -668,6 +673,20 @@ PAGEFLAG_FALSE(DoubleMap)
 	TESTSCFLAG_FALSE(DoubleMap)
 #endif
 
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+/*
+ * PageHasPoisoned indicates that at least on subpage is hwpoisoned in the
+ * compound page.
+ *
+ * This flag is set by hwpoison handler.  Cleared by THP split or free page.
+ */
+PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+	TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+#else
+PAGEFLAG_FALSE(HasHWPoisoned)
+	TESTSCFLAG_FALSE(HasHWPoisoned)
+#endif
+
 /*
  * Check if a page is currently marked HWPoisoned. Note that this check is
  * best effort only and inherently racy: there is no way to synchronize with
diff --git a/mm/filemap.c b/mm/filemap.c
index dae481293b5d..2acc2b977f66 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3195,12 +3195,12 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
 	}
 
 	if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
-	    vm_fault_t ret = do_set_pmd(vmf, page);
-	    if (!ret) {
-		    /* The page is mapped successfully, reference consumed. */
-		    unlock_page(page);
-		    return true;
-	    }
+		vm_fault_t ret = do_set_pmd(vmf, page);
+		if (!ret) {
+			/* The page is mapped successfully, reference consumed. */
+			unlock_page(page);
+			return true;
+		}
 	}
 
 	if (pmd_none(*vmf->pmd)) {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5e9ef0fc261e..0574b1613714 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2426,6 +2426,8 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
 	lruvec = lock_page_lruvec(head);
 
+	ClearPageHasHWPoisoned(head);
+
 	for (i = nr - 1; i >= 1; i--) {
 		__split_huge_page_tail(head, i, lruvec, list);
 		/* Some pages can be beyond EOF: drop them from page cache */
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ed28eba50f98..a79a38374a14 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1148,8 +1148,12 @@ static int __get_hwpoison_page(struct page *page)
 		return -EBUSY;
 
 	if (get_page_unless_zero(head)) {
-		if (head == compound_head(page))
+		if (head == compound_head(page)) {
+			if (PageTransHuge(head))
+				SetPageHasHWPoisoned(head);
+
 			return 1;
+		}
 
 		pr_info("Memory failure: %#lx cannot catch tail\n",
 			page_to_pfn(page));
diff --git a/mm/memory.c b/mm/memory.c
index adf9b9ef8277..c52be6d6b605 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3906,6 +3906,15 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 	if (compound_order(page) != HPAGE_PMD_ORDER)
 		return ret;
 
+	/*
+	 * Just backoff if any subpage of a THP is corrupted otherwise
+	 * the corrupted page may mapped by PMD silently to escape the
+	 * check.  This kind of THP just can be PTE mapped.  Access to
+	 * the corrupted subpage should trigger SIGBUS as expected.
+	 */
+	if (unlikely(PageHasHWPoisoned(page)))
+		return ret;
+
 	/*
 	 * Archs like ppc64 need additional space to store information
 	 * related to pte entry. Use the preallocated table for that.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b37435c274cf..7f37652f0287 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1312,8 +1312,10 @@ static __always_inline bool free_pages_prepare(struct page *page,
 
 		VM_BUG_ON_PAGE(compound && compound_order(page) != order, page);
 
-		if (compound)
+		if (compound) {
 			ClearPageDoubleMap(page);
+			ClearPageHasHWPoisoned(page);
+		}
 		for (i = 1; i < (1 << order); i++) {
 			if (compound)
 				bad += free_tail_pages_check(page, page + i);
-- 
2.26.2


  parent reply	other threads:[~2021-09-30 21:53 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-30 21:53 [RFC v3 PATCH 0/5] Solve silent data loss caused by poisoned page cache (shmem/tmpfs) Yang Shi
2021-09-30 21:53 ` [v3 PATCH 1/5] mm: hwpoison: remove the unnecessary THP check Yang Shi
2021-10-06  2:35   ` Yang Shi
2021-10-06  4:00     ` Naoya Horiguchi
2021-10-06 17:56       ` Yang Shi
2021-09-30 21:53 ` Yang Shi [this message]
2021-10-01  7:23   ` [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault Naoya Horiguchi
2021-10-01 21:07     ` Yang Shi
2021-10-04 14:06   ` Kirill A. Shutemov
2021-10-04 18:17     ` Yang Shi
2021-10-04 19:41       ` Kirill A. Shutemov
2021-10-04 20:13         ` Yang Shi
2021-10-06 19:54           ` Peter Xu
2021-10-06 23:41             ` Yang Shi
2021-10-07 16:14               ` Peter Xu
2021-10-07 18:28                 ` Yang Shi
2021-10-08  9:35             ` Kirill A. Shutemov
2021-10-11 22:57               ` Peter Xu
2021-10-06 20:15   ` Peter Xu
2021-10-06 23:57     ` Yang Shi
2021-10-07 16:06       ` Peter Xu
2021-10-07 18:19         ` Yang Shi
2021-10-07 20:27           ` Yang Shi
2021-10-07 21:28       ` Yang Shi
2021-10-12  0:55         ` Peter Xu
2021-10-12  1:44           ` Peter Xu
2021-10-12 18:02             ` Yang Shi
2021-10-12 22:10               ` Peter Xu
2021-10-13  2:48                 ` Yang Shi
2021-10-13  3:01                   ` Peter Xu
2021-10-13  3:27                     ` Yang Shi
2021-10-13  3:41                       ` Peter Xu
2021-10-13 21:42                         ` Yang Shi
2021-10-13 23:13                           ` Peter Xu
2021-10-14  6:54                     ` Naoya Horiguchi
2021-10-06 20:18   ` Peter Xu
2021-10-07  2:49     ` Yang Shi
2021-11-01 19:05   ` Naresh Kamboju
2021-11-01 19:26     ` Yang Shi
2021-09-30 21:53 ` [v3 PATCH 3/5] mm: hwpoison: refactor refcount check handling Yang Shi
2021-10-06 22:01   ` Peter Xu
2021-10-07  2:47     ` Yang Shi
2021-10-07 16:18       ` Peter Xu
2021-09-30 21:53 ` [v3 PATCH 4/5] mm: shmem: don't truncate page if memory failure happens Yang Shi
2021-10-01  7:05   ` Naoya Horiguchi
2021-10-01 21:08     ` Yang Shi
2021-10-12  1:57   ` Peter Xu
2021-10-12 19:17     ` Yang Shi
2021-10-12 22:26       ` Peter Xu
2021-10-13  3:00         ` Yang Shi
2021-10-13  3:06           ` Peter Xu
2021-10-13  3:29             ` Yang Shi
2021-09-30 21:53 ` [v3 PATCH 5/5] mm: hwpoison: handle non-anonymous THP correctly Yang Shi
2021-10-01  7:06   ` Naoya Horiguchi
2021-10-01 21:09     ` Yang Shi
2021-10-13  2:40 ` [RFC v3 PATCH 0/5] Solve silent data loss caused by poisoned page cache (shmem/tmpfs) Peter Xu
2021-10-13  3:09   ` Yang Shi
2021-10-13  3:24     ` Peter Xu
2021-10-14  6:54     ` Naoya Horiguchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210930215311.240774-3-shy828301@gmail.com \
    --to=shy828301@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=peterx@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.