All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xu Yu <xuyu@linux.alibaba.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org
Subject: [PATCH] mm/memory-failure.c: bail out early if huge zero page
Date: Sun, 10 Apr 2022 23:22:34 +0800	[thread overview]
Message-ID: <49273e6688d7571756603dac996692a15f245d58.1649603963.git.xuyu@linux.alibaba.com> (raw)

Kernel panic when injecting memory_failure for the global huge_zero_page,
when CONFIG_DEBUG_VM is enabled, as follows.

[    5.582720] Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
[    5.583786] page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
[    5.584900] head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
[    5.585796] flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
[    5.586712] raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
[    5.587640] raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
[    5.588565] page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
[    5.589398] ------------[ cut here ]------------
[    5.589952] kernel BUG at mm/huge_memory.c:2499!
[    5.590516] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[    5.591120] CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
[    5.591904] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
[    5.592817] RIP: 0010:split_huge_page_to_list+0x66a/0x880
[    5.593469] Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
[    5.595806] RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
[    5.596434] RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
[    5.597322] RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
[    5.598162] RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
[    5.598999] R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
[    5.599849] R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
[    5.600693] FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
[    5.601640] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.602304] CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
[    5.603139] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    5.603977] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    5.604806] Call Trace:
[    5.605101]  <TASK>
[    5.605357]  ? __irq_work_queue_local+0x39/0x70
[    5.605904]  try_to_split_thp_page+0x3a/0x130
[    5.606430]  memory_failure+0x128/0x800
[    5.606888]  madvise_inject_error.cold+0x8b/0xa1
[    5.607444]  __x64_sys_madvise+0x54/0x60
[    5.607915]  do_syscall_64+0x35/0x80
[    5.608347]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    5.608949] RIP: 0033:0x7fc3754f8bf9
[    5.609374] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
[    5.611554] RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
[    5.612441] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
[    5.613269] RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
[    5.614108] RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
[    5.614946] R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
[    5.615787] R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000
[    5.616626]  </TASK>

In fact, huge_zero_page is unhandlable currently in either soft offline
or memory failure injection.  With CONFIG_DEBUG_VM disabled,
huge_zero_page is bailed out when checking HWPoisonHandlable() in
get_any_page(), or checking page mapping in split_huge_page_to_list().

This makes huge_zero_page bail out early in madvise_inject_error(), and
panic above won't happen again.

Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
---
 mm/madvise.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 1873616a37d2..03ad50d222e0 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1079,7 +1079,7 @@ static int madvise_inject_error(int behavior,
 
 	for (; start < end; start += size) {
 		unsigned long pfn;
-		struct page *page;
+		struct page *page, *head;
 		int ret;
 
 		ret = get_user_pages_fast(start, 1, 0, &page);
@@ -1087,12 +1087,21 @@ static int madvise_inject_error(int behavior,
 			return ret;
 		pfn = page_to_pfn(page);
 
+		head = compound_head(page);
+		if (unlikely(is_huge_zero_page(head))) {
+			pr_warn("Unhandlable attempt to %s pfn %#lx at process virtual address %#lx\n",
+				behavior == MADV_SOFT_OFFLINE ? "soft offline" :
+								"inject memory failure for",
+				pfn, start);
+			return -EINVAL;
+		}
+
 		/*
 		 * When soft offlining hugepages, after migrating the page
 		 * we dissolve it, therefore in the second loop "page" will
 		 * no longer be a compound page.
 		 */
-		size = page_size(compound_head(page));
+		size = page_size(head);
 
 		if (behavior == MADV_SOFT_OFFLINE) {
 			pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n",
-- 
2.20.1.2432.ga663e714



             reply	other threads:[~2022-04-10 15:23 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-10 15:22 Xu Yu [this message]
2022-04-11  2:18 ` [PATCH] mm/memory-failure.c: bail out early if huge zero page Miaohe Lin
2022-04-12  9:09   ` Naoya Horiguchi
2022-04-12  9:45     ` Yu Xu
2022-04-12 10:00       ` Yu Xu
2022-04-12 11:11         ` HORIGUCHI NAOYA(堀口 直也)
2022-04-12 11:08     ` Miaohe Lin
2022-04-13  8:36       ` HORIGUCHI NAOYA(堀口 直也)
2022-04-13  9:03         ` Miaohe Lin
2022-04-12  8:31 ` Oscar Salvador
2022-04-12  9:25   ` Miaohe Lin
2022-04-12  9:30     ` Oscar Salvador
2022-04-12  9:47       ` Yu Xu
2022-04-12 10:58       ` Miaohe Lin
2022-04-12  8:59 ` Yu Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49273e6688d7571756603dac996692a15f245d58.1649603963.git.xuyu@linux.alibaba.com \
    --to=xuyu@linux.alibaba.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.