linux-mm.kvack.org archive mirror
From: Aili Yao <yaoaili@kingsoft.com>
To: David Hildenbrand <david@redhat.com>
Cc: <akpm@linux-foundation.org>, <naoya.horiguchi@nec.com>,
	<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	<yangfeng1@kingsoft.com>, <sunhao2@kingsoft.com>,
	Oscar Salvador <osalvador@suse.de>,
	Mike Kravetz <mike.kravetz@oracle.com>, <yaoaili@kingsoft.com>
Subject: [PATCH v2] mm/gup: check page posion status for coredump.
Date: Thu, 18 Mar 2021 11:18:45 +0800
Message-ID: <20210318111845.3d06141c@alex-virtual-machine>
In-Reply-To: <20a0d078-f49d-54d6-9f04-f6b41dd51e5f@redhat.com>

When we dump core for a user process killed by a signal, that signal may
be a SIGBUS with si_code BUS_MCEERR_AR or BUS_MCEERR_AO, meaning it was
raised by an ECC memory failure such as SRAR or SRAO. We expect memory
recovery to have completed correctly, in which case get_dump_page() will
not return the faulty page, because memory_failure() has already
invalidated the process's pte for it.

But memory_failure() may fail, leaving the process's pte for that page
valid. With the current code, get_dump_page() then returns the poisoned
page, the coredump reads it, and the system panics because the access
happens in kernel code.

So check the poison status in get_dump_page(), and return NULL if the
page is poisoned.

There may be other scenarios where it is also better to check the poison
status than to panic, so add a wrapper for this check, as suggested by
David Hildenbrand <david@redhat.com>.

Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
---
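For readers unfamiliar with the two si_code values named in the changelog,
the following user-space sketch shows how a process might tell them apart.
It is only an illustration under stated assumptions: sigbus_code_name(),
on_sigbus() and install_sigbus_handler() are hypothetical names, not part
of this patch or of any library, and the fallback values 4 and 5 are the
Linux uapi definitions, used only if the C library headers do not provide
them.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Fallbacks (Linux uapi values) if the C library does not expose them. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical helper: map a SIGBUS si_code to a short description. */
static const char *sigbus_code_name(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AR:
		return "action required";
	case BUS_MCEERR_AO:
		return "action optional";
	default:
		return "not a memory error";
	}
}

/* Last si_code seen by the handler; sig_atomic_t is safe to write here. */
static volatile sig_atomic_t last_sigbus_code;

/*
 * Hypothetical handler: only records si_code. A real handler for
 * BUS_MCEERR_AR must not simply return, because the faulting access
 * would be retried on the same poisoned address.
 */
static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	last_sigbus_code = info->si_code;
}

/* Install the handler with SA_SIGINFO so that si_code is delivered. */
static int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

A caller would run install_sigbus_handler() once at startup and later pass
last_sigbus_code to sigbus_code_name(). A SIGBUS sent with raise() carries
neither memory-error code, so it maps to "not a memory error".
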
 mm/gup.c      |  4 ++++
 mm/internal.h | 21 +++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index e4c224c..3b4703a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1536,6 +1536,10 @@ struct page *get_dump_page(unsigned long addr)
 				      FOLL_FORCE | FOLL_DUMP | FOLL_GET);
 	if (locked)
 		mmap_read_unlock(mm);
+
+	if (ret == 1 && check_user_page_poison(page))
+		return NULL;
+
 	return (ret == 1) ? page : NULL;
 }
 #endif /* CONFIG_ELF_CORE */
diff --git a/mm/internal.h b/mm/internal.h
index 25d2b2439..777b3e5 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -97,6 +97,27 @@ static inline void set_page_refcounted(struct page *page)
 	set_page_count(page, 1);
 }
 
+/*
+ * When the kernel touches a user page, that page may already be marked
+ * hwpoisoned yet still be mapped in user space. If the kernel can
+ * guarantee data integrity and complete the operation without reading
+ * the page, it is better to check the poison status first and avoid
+ * touching the page rather than panic. Dumping core for a fatal signal
+ * is one such case. If the kernel cannot guarantee data integrity
+ * without the page, do not call this function; let the kernel touch the
+ * poisoned page and panic.
+ */
+static inline bool check_user_page_poison(struct page *page)
+{
+	if (IS_ENABLED(CONFIG_MEMORY_FAILURE) && page != NULL) {
+		if (unlikely(PageHuge(page) && PageHWPoison(compound_head(page))))
+			return true;
+		else if (unlikely(PageHWPoison(page)))
+			return true;
+	}
+	return false;
+}
+
 extern unsigned long highest_memmap_pfn;
 
 /*
-- 
1.8.3.1
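
The decision made by the patch can be modelled in plain user-space C. This
is a rough sketch, not kernel code: struct model_page and both functions
are hypothetical stand-ins for struct page, check_user_page_poison() and
the tail of get_dump_page(). The model assumes a non-compound page's head
pointer refers to the page itself, and it ignores CONFIG_MEMORY_FAILURE
and page reference counting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; NOT the kernel definition. */
struct model_page {
	bool hwpoison;			/* models PageHWPoison() */
	bool huge;			/* models PageHuge() */
	struct model_page *head;	/* models compound_head() */
};

/*
 * Models check_user_page_poison(): for a huge page the head page's
 * poison flag is consulted first, then the page's own flag.
 */
static bool model_page_poisoned(const struct model_page *page)
{
	if (page == NULL)
		return false;
	if (page->huge && page->head->hwpoison)
		return true;
	return page->hwpoison;
}

/*
 * Models the end of get_dump_page(): ret is the number of pages the
 * lookup pinned; a poisoned page is reported as "no page".
 */
static const struct model_page *
model_get_dump_page(const struct model_page *page, long ret)
{
	if (ret == 1 && model_page_poisoned(page))
		return NULL;
	return (ret == 1) ? page : NULL;
}
```

With a poisoned head page, a tail page of the same huge page is reported as
poisoned and model_get_dump_page() returns NULL for it, while a clean page
is returned unchanged.
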



Thread overview: 27+ messages
2021-03-17  8:37 [PATCH] mm/gup: check page posion status for coredump Aili Yao
2021-03-17  9:12 ` David Hildenbrand
2021-03-18  3:15   ` Aili Yao
2021-03-18  3:18   ` Aili Yao [this message]
2021-03-18  4:46   ` Matthew Wilcox
2021-03-18  5:34     ` Aili Yao
2021-03-19  2:44       ` [PATCH v3] " Aili Yao
2021-03-20  0:35         ` Matthew Wilcox
2021-03-22  3:40           ` Aili Yao
2021-03-22 11:33           ` [PATCH v5] mm/gup: check page hwposion " Aili Yao
2021-03-26 14:09             ` David Hildenbrand
2021-03-26 14:22               ` David Hildenbrand
2021-03-31  1:52                 ` HORIGUCHI NAOYA(堀口 直也)
2021-03-31  2:43                   ` Aili Yao
2021-03-31  4:32                     ` HORIGUCHI NAOYA(堀口 直也)
2021-03-31  6:44                       ` David Hildenbrand
2021-03-31  7:07                         ` Aili Yao
2021-04-01  2:31                         ` Aili Yao
2021-04-06  2:23                         ` [PATCH v6] mm/gup: check page hwpoison status for memory recovery failures Aili Yao
2021-04-06  2:41                           ` [PATCH v7] " Aili Yao
2021-04-07  1:54                             ` HORIGUCHI NAOYA(堀口 直也)
2021-04-07  7:48                               ` Aili Yao
2021-05-10  3:13                             ` Aili Yao
2021-03-31  6:07                   ` [PATCH v5] mm/gup: check page hwposion status for coredump Matthew Wilcox
2021-03-31  6:53                     ` HORIGUCHI NAOYA(堀口 直也)
2021-03-31  7:05                       ` David Hildenbrand
2021-03-18  8:14     ` [PATCH] mm/gup: check page posion " David Hildenbrand
