linux-mm.kvack.org archive mirror
From: Lv Ying <lvying6@huawei.com>
To: <rafael@kernel.org>, <lenb@kernel.org>, <james.morse@arm.com>,
	<tony.luck@intel.com>, <bp@alien8.de>, <naoya.horiguchi@nec.com>,
	<linmiaohe@huawei.com>, <akpm@linux-foundation.org>,
	<xueshuai@linux.alibaba.com>, <ashish.kalra@amd.com>
Cc: <xiezhipeng1@huawei.com>, <wangkefeng.wang@huawei.com>,
	<xiexiuqi@huawei.com>, <tanxiaofei@huawei.com>,
	<lvying6@huawei.com>, <linux-acpi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>
Subject: [RFC PATCH v1 2/2] ACPI: APEI: fix reboot caused by synchronous error loop because of memory_failure() failed
Date: Wed, 7 Dec 2022 17:39:35 +0800
Message-ID: <20221207093935.1972530-3-lvying6@huawei.com>
In-Reply-To: <20221207093935.1972530-1-lvying6@huawei.com>

When user-space accesses a corrupt memory location, the error is
detected synchronously and the CPU may take an abort. On arm64 this is
a 'synchronous external abort', which can be notified by SEA.

If memory_failure() fails, returning to user-space triggers the SEA
again. Such a loop may cause platform firmware to exceed some threshold
and reboot when Linux could have recovered from this error. Not every
memory_failure() failure leads to the reboot: VM_FAULT_HWPOISON[_LARGE]
handling in the arm64 page fault path sends SIGBUS to the accessing
user-space process, which terminates the loop.

However, if the process still maps the faulty page because
memory_failure() returned abnormally before try_to_unmap(), for example
when the faulty page is a KSM page, arm64 cannot rely on the page fault
path to terminate the loop.
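
For illustration only (not part of the patch), the loop can be modelled
in user-space C roughly as below. This is a sketch under assumptions:
model_sea_loop(), the outcome names and the numeric firmware threshold
are all hypothetical, and memory_failure() is assumed to fail on every
pass, as in the KSM case described above.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the modelled loop; not kernel identifiers. */
enum loop_outcome {
	OUTCOME_TASK_KILLED,      /* SIGBUS delivered, loop terminated */
	OUTCOME_FIRMWARE_REBOOT,  /* firmware threshold exceeded */
};

/*
 * Model: each return to user-space re-executes the faulting access and
 * raises another SEA. memory_failure() is assumed to fail every time.
 * fw_threshold is an assumed count of SEAs the firmware tolerates.
 * *sea_count receives the number of SEAs taken before the loop ended.
 */
enum loop_outcome model_sea_loop(bool sigbus_on_failure,
				 unsigned int fw_threshold,
				 unsigned int *sea_count)
{
	unsigned int count = 0;

	for (;;) {
		count++;	/* task touches the poisoned location: SEA */

		if (count > fw_threshold) {
			*sea_count = count;
			return OUTCOME_FIRMWARE_REBOOT;
		}

		/* memory_failure() fails here; the page stays mapped */
		if (sigbus_on_failure) {
			*sea_count = count;
			return OUTCOME_TASK_KILLED;
		}
		/* otherwise return to user-space and fault again */
	}
}
```

With sigbus_on_failure set, the model ends after the first SEA; without
it, the count keeps growing until the assumed threshold is exceeded.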

Check the result of memory_failure() in task_work before returning to
user-space. If memory_failure() failed, send SIGBUS to the current
process to avoid the SEA loop.
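
As a rough, stand-alone model of the intended decision (a user-space
sketch, not the kernel code), the dispatch can be written as a pure
function. The MODEL_* names and flag values are hypothetical, and the
result of memory_failure() is passed in as mf_ret. The sketch assumes
that a return value of 0 (success) also counts as handled, in line with
sending SIGBUS only when memory_failure() failed.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for other platforms */
#endif

/* Hypothetical stand-ins for the kernel flag bits; values illustrative. */
#define MODEL_MF_ACTION_REQUIRED (1u << 1)
#define MODEL_MF_SOFT_OFFLINE    (1u << 3)

enum model_action {
	MODEL_SOFT_OFFLINE,	/* soft_offline_page() path */
	MODEL_ASYNC_HANDLE,	/* workqueue: memory_failure(), result unused */
	MODEL_SKIP,		/* sync context without MF_ACTION_REQUIRED */
	MODEL_RETURN,		/* handled or already poisoned: no signal */
	MODEL_SIGBUS,		/* not recovered: force_sig(SIGBUS) */
};

/* mf_ret models what memory_failure() would have returned. */
enum model_action model_dispatch(bool sync, unsigned int flags, int mf_ret)
{
	if (flags & MODEL_MF_SOFT_OFFLINE)
		return MODEL_SOFT_OFFLINE;
	if (!sync)
		return MODEL_ASYNC_HANDLE;
	if (!(flags & MODEL_MF_ACTION_REQUIRED))
		return MODEL_SKIP;
	if (mf_ret == 0 || mf_ret == -EHWPOISON || mf_ret == -EOPNOTSUPP)
		return MODEL_RETURN;
	return MODEL_SIGBUS;
}
```

For example, model_dispatch(true, MODEL_MF_ACTION_REQUIRED, -EBUSY)
yields MODEL_SIGBUS, while the same call with -EHWPOISON yields
MODEL_RETURN.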

Signed-off-by: Lv Ying <lvying6@huawei.com>
---
 mm/memory-failure.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3b6ac3694b8d..07ec7b62f330 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2255,7 +2255,7 @@ static void __memory_failure_work_func(struct work_struct *work, bool sync)
 	struct memory_failure_cpu *mf_cpu;
 	struct memory_failure_entry entry = { 0, };
 	unsigned long proc_flags;
-	int gotten;
+	int gotten, ret;
 
 	mf_cpu = container_of(work, struct memory_failure_cpu, work);
 	for (;;) {
@@ -2266,7 +2266,16 @@ static void __memory_failure_work_func(struct work_struct *work, bool sync)
 			break;
 		if (entry.flags & MF_SOFT_OFFLINE)
 			soft_offline_page(entry.pfn, entry.flags);
-		else if (!sync || (entry.flags & MF_ACTION_REQUIRED))
+		else if (sync) {
+			if (entry.flags & MF_ACTION_REQUIRED) {
+				ret = memory_failure(entry.pfn, entry.flags);
+				if (!ret || ret == -EHWPOISON || ret == -EOPNOTSUPP)
+					return;
+
+				pr_err("Memory error not recovered\n");
+				force_sig(SIGBUS);
+			}
+		} else
 			memory_failure(entry.pfn, entry.flags);
 	}
 }
-- 
2.36.1


Thread overview:
2022-12-07  9:39 [RFC PATCH v1 0/2] ACPI: APEI: Make synchronization errors call Lv Ying
2022-12-07  9:39 ` [RFC PATCH v1 1/2] ACPI: APEI: Make memory_failure() triggered by synchronization errors execute in the current context Lv Ying
2022-12-07  9:39 ` Lv Ying [this message]
