From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Yang Shi <shy828301@gmail.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Hugh Dickins <hughd@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Oscar Salvador <osalvador@suse.de>, Peter Xu <peterx@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.10 05/21] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
Date: Wed, 10 Nov 2021 19:43:51 +0100
Message-ID: <20211110182003.138254569@linuxfoundation.org>
In-Reply-To: <20211110182002.964190708@linuxfoundation.org>

From: Yang Shi <shy828301@gmail.com>

commit eac96c3efdb593df1a57bb5b95dbe037bfa9a522 upstream.

When handling a shmem page fault, a THP with a corrupted subpage could be
PMD mapped if certain conditions are satisfied.  But the kernel is supposed
to send SIGBUS when trying to map a hwpoisoned page.

There are two paths that may PMD map the THP: fault around and regular
fault.

Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") the situation was even worse in the fault around path.  The
THP could be PMD mapped as long as the VMA fits, regardless of which
subpage is accessed and which is corrupted.  After this commit, the THP
could be PMD mapped as long as the head page is not corrupted.

In the regular fault path the THP could be PMD mapped as long as the
corrupted page is not accessed and the VMA fits.

This loophole could be fixed by iterating over every subpage to check
whether any of them is hwpoisoned, but that is somewhat costly in the page
fault path.
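
The regular-fault loophole and the exhaustive alternative can be sketched
in a small userspace model.  This is an illustration only, not kernel
code: the loophole_* names, the bool array standing in for per-subpage
PG_hwpoison, and the 512-subpage size (a 2 MiB THP over 4 KiB base pages)
are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 2 MiB THP built from 4 KiB base pages. */
#define LOOPHOLE_NR_SUBPAGES 512

/* Hypothetical stand-in for a THP: poisoned[i] models PG_hwpoison on subpage i. */
struct loophole_thp {
	bool poisoned[LOOPHOLE_NR_SUBPAGES];
};

/*
 * Pre-fix regular fault path as the text describes it: only the
 * faulting subpage is looked at, so poison elsewhere goes unnoticed.
 */
static bool loophole_regular_fault_pmd_maps(const struct loophole_thp *thp,
					    size_t fault_idx, bool vma_fits)
{
	return vma_fits && !thp->poisoned[fault_idx];
}

/* Exhaustive check: correct, but walks every subpage on each fault. */
static bool loophole_any_subpage_poisoned(const struct loophole_thp *thp)
{
	for (size_t i = 0; i < LOOPHOLE_NR_SUBPAGES; i++)
		if (thp->poisoned[i])
			return true;
	return false;
}
```

With subpage 7 poisoned, a fault on subpage 3 would still PMD map the
whole THP in this model, while the full scan catches the poison at the
price of up to 512 checks per fault.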

So introduce a new page flag called HasHWPoisoned on the first tail
page.  It indicates that the THP has hwpoisoned subpage(s).  It is set
when memory failure finds any subpage of the THP hwpoisoned, after the
refcount is bumped successfully, and it is cleared when the THP is freed
or split.
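
The lifecycle of the flag can be sketched in userspace as follows.  This
is a simplified model, not the kernel implementation: struct model_page,
the MODEL_* bit values and the model_* helpers are hypothetical, and
refcounting and locking are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 2 MiB THP built from 4 KiB base pages. */
#define MODEL_NR_SUBPAGES 512

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_HWPOISON       (1u << 0)
#define MODEL_HAS_HWPOISONED (1u << 1)

struct model_page {
	unsigned int flags;
};

/* pages[0] is the head page, pages[1] the first tail page. */
struct model_thp {
	struct model_page pages[MODEL_NR_SUBPAGES];
};

/* Memory failure: poison one subpage and record it on the first tail page. */
static void model_memory_failure(struct model_thp *thp, size_t idx)
{
	thp->pages[idx].flags |= MODEL_HWPOISON;
	thp->pages[1].flags |= MODEL_HAS_HWPOISONED;
}

/* Constant-time check, independent of which subpage is poisoned. */
static bool model_has_hwpoisoned(const struct model_thp *thp)
{
	return (thp->pages[1].flags & MODEL_HAS_HWPOISONED) != 0;
}

/* Fault path: refuse the PMD mapping when any subpage is poisoned. */
static bool model_can_pmd_map(const struct model_thp *thp, bool vma_fits)
{
	return vma_fits && !model_has_hwpoisoned(thp);
}

/* Split or free: drop the summary flag; per-subpage poison bits remain. */
static void model_split_or_free(struct model_thp *thp)
{
	thp->pages[1].flags &= ~MODEL_HAS_HWPOISONED;
}
```

In this model the fault path reads a single flag word instead of walking
all subpages, and the per-subpage poison bit survives a split so that an
access to the corrupted subpage can still be caught.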

The soft offline path doesn't need this since the soft offline handler
marks a subpage hwpoisoned only once the subpage has been migrated
successfully.  But a shmem THP does not get split and then migrated at all.

Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/page-flags.h |   23 +++++++++++++++++++++++
 mm/huge_memory.c           |    2 ++
 mm/memory-failure.c        |   14 ++++++++++++++
 mm/memory.c                |    9 +++++++++
 mm/page_alloc.c            |    4 +++-
 5 files changed, 51 insertions(+), 1 deletion(-)

--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -169,6 +169,15 @@ enum pageflags {
 	/* Compound pages. Stored in first tail page's flags */
 	PG_double_map = PG_workingset,
 
+#ifdef CONFIG_MEMORY_FAILURE
+	/*
+	 * Compound pages. Stored in first tail page's flags.
+	 * Indicates that at least one subpage is hwpoisoned in the
+	 * THP.
+	 */
+	PG_has_hwpoisoned = PG_mappedtodisk,
+#endif
+
 	/* non-lru isolated movable page */
 	PG_isolated = PG_reclaim,
 
@@ -701,6 +710,20 @@ PAGEFLAG_FALSE(DoubleMap)
 	TESTSCFLAG_FALSE(DoubleMap)
 #endif
 
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+/*
+ * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the
+ * compound page.
+ *
+ * This flag is set by hwpoison handler.  Cleared by THP split or free page.
+ */
+PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+	TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+#else
+PAGEFLAG_FALSE(HasHWPoisoned)
+	TESTSCFLAG_FALSE(HasHWPoisoned)
+#endif
+
 /*
  * For pages that are never mapped to userspace (and aren't PageSlab),
  * page_type may be used.  Because it is initialised to -1, we invert the
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2464,6 +2464,8 @@ static void __split_huge_page(struct pag
 		xa_lock(&swap_cache->i_pages);
 	}
 
+	ClearPageHasHWPoisoned(head);
+
 	for (i = nr - 1; i >= 1; i--) {
 		__split_huge_page_tail(head, i, lruvec, list);
 		/* Some pages can be beyond i_size: drop them from page cache */
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1367,6 +1367,20 @@ int memory_failure(unsigned long pfn, in
 	}
 
 	if (PageTransHuge(hpage)) {
+		/*
+		 * The flag must be set after the refcount is bumped
+		 * otherwise it may race with THP split.
+		 * And the flag can't be set in get_hwpoison_page() since
+		 * it is called by soft offline too and it is just called
+		 * for !MF_COUNT_INCREASE.  So here seems to be the best
+		 * place.
+		 *
+		 * Don't need care about the above error handling paths for
+		 * get_hwpoison_page() since they handle either free page
+		 * or unhandlable page.  The refcount is bumped iff the
+		 * page is a valid handlable page.
+		 */
+		SetPageHasHWPoisoned(hpage);
 		if (try_to_split_thp_page(p, "Memory Failure") < 0) {
 			action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
 			return -EBUSY;
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3921,6 +3921,15 @@ vm_fault_t finish_fault(struct vm_fault
 		page = vmf->page;
 
 	/*
+	 * Just backoff if any subpage of a THP is corrupted otherwise
+	 * the corrupted page may mapped by PMD silently to escape the
+	 * check.  This kind of THP just can be PTE mapped.  Access to
+	 * the corrupted subpage should trigger SIGBUS as expected.
+	 */
+	if (unlikely(PageHasHWPoisoned(page)))
+		return ret;
+
+	/*
 	 * check even for read faults because we might have lost our CoWed
 	 * page
 	 */
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1232,8 +1232,10 @@ static __always_inline bool free_pages_p
 
 		VM_BUG_ON_PAGE(compound && compound_order(page) != order, page);
 
-		if (compound)
+		if (compound) {
 			ClearPageDoubleMap(page);
+			ClearPageHasHWPoisoned(page);
+		}
 		for (i = 1; i < (1 << order); i++) {
 			if (compound)
 				bad += free_tail_pages_check(page, page + i);



