All of lore.kernel.org
 help / color / mirror / Atom feed
* + mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch added to -mm tree
@ 2022-02-14 23:24 Andrew Morton
  0 siblings, 0 replies; only message in thread
From: Andrew Morton @ 2022-02-14 23:24 UTC (permalink / raw)
  To: mm-commits, willy, naoya.horiguchi, mgorman, linmiaohe, hannes,
	riel, akpm


The patch titled
     Subject: mm: invalidate hwpoison page cache page in fault path
has been added to the -mm tree.  Its filename is
     mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Rik van Riel <riel@surriel.com>
Subject: mm: invalidate hwpoison page cache page in fault path

Sometimes the page offlining code can leave behind a hwpoisoned clean page
cache page.  This can lead to programs being killed over and over and over
again as they fault in the hwpoisoned page, get killed, and then get
re-spawned by whatever wanted to run them.

This is particularly embarrassing when the page was offlined due to having
too many corrected memory errors.  Now we are killing tasks due to them
trying to access memory that probably isn't even corrupted.

This problem can be avoided by invalidating the page from the page fault
handler, which already has a branch for dealing with these kinds of pages.
With this patch we simply pretend the page fault was successful if the
page was invalidated, return to userspace, incur another page fault, read
in the file from disk (to a new memory page), and then everything works
again.

Link: https://lkml.kernel.org/r/20220212213740.423efcea@imladris.surriel.com
Signed-off-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/mm/memory.c~mm-clean-up-hwpoison-page-cache-page-in-fault-path
+++ a/mm/memory.c
@@ -3926,11 +3926,16 @@ static vm_fault_t __do_fault(struct vm_f
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf->page))) {
-		if (ret & VM_FAULT_LOCKED)
+		vm_fault_t poisonret = VM_FAULT_HWPOISON;
+		if (ret & VM_FAULT_LOCKED) {
+			/* Retry if a clean page was removed from the cache. */
+			if (invalidate_inode_page(vmf->page))
+				poisonret = 0;
 			unlock_page(vmf->page);
+		}
 		put_page(vmf->page);
 		vmf->page = NULL;
-		return VM_FAULT_HWPOISON;
+		return poisonret;
 	}
 
 	if (unlikely(!(ret & VM_FAULT_LOCKED)))
_

Patches currently in -mm which might be from riel@surriel.com are

mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2022-02-14 23:24 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-14 23:24 + mm-clean-up-hwpoison-page-cache-page-in-fault-path.patch added to -mm tree Andrew Morton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.