From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: <willy@infradead.org>, David Hildenbrand <david@redhat.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Oscar Salvador <osalvador@suse.de>, Zi Yan <ziy@nvidia.com>,
	Hugh Dickins <hughd@google.com>, Jonathan Corbet <corbet@lwn.net>,
	<linux-mm@kvack.org>, Vishal Moola <vishal.moola@gmail.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: [PATCH v2 01/10] mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range()
Date: Thu, 25 Apr 2024 16:40:19 +0800	[thread overview]
Message-ID: <20240425084028.3888403-2-wangkefeng.wang@huawei.com> (raw)
In-Reply-To: <20240425084028.3888403-1-wangkefeng.wang@huawei.com>

Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
pages to be offlined") does not handle hugetlb pages, so offlining a
hwpoisoned hugetlb page could still trigger an endless loop. Luckily,
since commit e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug:
hotremove memory section with hwpoisoned hugepage"), the
HPageMigratable flag of a hwpoisoned hugetlb page is cleared, so such
a page is skipped in scan_movable_pages() and the endless-loop issue
is fixed.

However, if the HPageMigratable() check passes (it is performed
without a reference or the page lock), the hugetlb page may still
become hwpoisoned. This causes no real problem, since the hwpoisoned
page is handled correctly in the next scan_movable_pages() loop, but
it is needlessly isolated in do_migrate_range() and then fails to
migrate.

In order to avoid this pointless isolation and to unify the handling
of all hwpoisoned pages, unconditionally check for hwpoison first. If
the page is a hwpoisoned hugetlb page, try to unmap it as the
catch-all safety net, just as is done for a normal page, and warn when
the folio is still mapped afterwards.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/memory_hotplug.c | 62 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 48 insertions(+), 14 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 431b1f6753c0..1985caf73e5a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1772,6 +1772,35 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
 	return 0;
 }
 
+static bool isolate_and_unmap_hwpoison_folio(struct folio *folio)
+{
+	if (WARN_ON(folio_test_lru(folio)))
+		folio_isolate_lru(folio);
+
+	if (!folio_mapped(folio))
+		return false;
+
+	if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
+		struct address_space *mapping;
+
+		mapping = hugetlb_page_mapping_lock_write(&folio->page);
+		if (mapping) {
+			/*
+			 * In shared mappings, try_to_unmap could potentially
+			 * call huge_pmd_unshare.  Because of this, take
+			 * semaphore in write mode here and set TTU_RMAP_LOCKED
+			 * to let lower levels know we have taken the lock.
+			 */
+			try_to_unmap(folio, TTU_IGNORE_MLOCK | TTU_RMAP_LOCKED);
+			i_mmap_unlock_write(mapping);
+		}
+	} else {
+		try_to_unmap(folio, TTU_IGNORE_MLOCK);
+	}
+
+	return folio_mapped(folio);
+}
+
 static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long pfn;
@@ -1790,28 +1819,33 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		folio = page_folio(page);
 		head = &folio->page;
 
-		if (PageHuge(page)) {
-			pfn = page_to_pfn(head) + compound_nr(head) - 1;
-			isolate_hugetlb(folio, &source);
-			continue;
-		} else if (PageTransHuge(page))
-			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
-
 		/*
 		 * HWPoison pages have elevated reference counts so the migration would
 		 * fail on them. It also doesn't make any sense to migrate them in the
 		 * first place. Still try to unmap such a page in case it is still mapped
-		 * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
-		 * the unmap as the catch all safety net).
+		 * (keep the unmap as the catch all safety net).
 		 */
-		if (PageHWPoison(page)) {
-			if (WARN_ON(folio_test_lru(folio)))
-				folio_isolate_lru(folio);
-			if (folio_mapped(folio))
-				try_to_unmap(folio, TTU_IGNORE_MLOCK);
+		if (unlikely(PageHWPoison(page))) {
+			folio = page_folio(page);
+			if (isolate_and_unmap_hwpoison_folio(folio)) {
+				if (__ratelimit(&migrate_rs)) {
+					pr_warn("%#lx: failed to unmap hwpoison folio\n",
+						pfn);
+				}
+			}
+
+			if (folio_test_large(folio))
+				pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
 			continue;
 		}
 
+		if (PageHuge(page)) {
+			pfn = page_to_pfn(head) + compound_nr(head) - 1;
+			isolate_hugetlb(folio, &source);
+			continue;
+		} else if (PageTransHuge(page))
+			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
+
 		if (!get_page_unless_zero(page))
 			continue;
 		/*
-- 
2.27.0



Thread overview: 17+ messages
2024-04-25  8:40 [PATCH v2 00/10] mm: remove isolate_lru_page() and isolate_movable_page() Kefeng Wang
2024-04-25  8:40 ` Kefeng Wang [this message]
2024-04-27  3:57   ` [PATCH v2 01/10] mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range() kernel test robot
2024-04-28  0:49     ` Kefeng Wang
2024-04-27  5:40   ` kernel test robot
2024-04-27  7:23   ` kernel test robot
2024-04-25  8:40 ` [PATCH v2 02/10] mm: add isolate_folio_to_list() Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 03/10] mm: memory_hotplug: unify Huge/LRU/non-LRU folio isolation Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 04/10] mm: compaction: try get reference before non-lru movable folio migration Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 05/10] mm: migrate: add folio_isolate_movable() Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 06/10] mm: compaction: use folio_isolate_movable() Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 07/10] mm: migrate: " Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 08/10] mm: migrate: remove isolate_movable_page() Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 09/10] mm: migrate_device: use more folio in migrate_device_unmap() Kefeng Wang
2024-04-25  9:31   ` David Hildenbrand
2024-04-25 11:05     ` Kefeng Wang
2024-04-25  8:40 ` [PATCH v2 10/10] mm: remove isolate_lru_page() Kefeng Wang
