linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all()
@ 2022-09-08  4:11 Naoya Horiguchi
  2022-09-08 12:23 ` Kirill A. Shutemov
  0 siblings, 1 reply; 3+ messages in thread
From: Naoya Horiguchi @ 2022-09-08  4:11 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, David Hildenbrand, Muchun Song, Miaohe Lin,
	Matthew Wilcox, Michal Hocko, Yang Shi, Naoya Horiguchi,
	linux-kernel

From: Naoya Horiguchi <naoya.horiguchi@nec.com>

NULL pointer dereference is triggered when calling thp split via debugfs
on the system with offlined memory blocks.  With debug option enabled,
the following kernel messages are printed out:

  page:00000000467f4890 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x121c000
  flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff)
  raw: 0017fffc00000000 0000000000000000 dead000000000122 0000000000000000
  raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:000000007d7ab72e is uninitialized and poisoned
  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
  ------------[ cut here ]------------
  kernel BUG at include/linux/mm.h:1248!
  invalid opcode: 0000 [#1] PREEMPT SMP PTI
  CPU: 16 PID: 20964 Comm: bash Tainted: G          I        6.0.0-rc3-foll-numa+ #41
  ...
  RIP: 0010:split_huge_pages_write+0xcf4/0xe30

This shows that page_to_nid() in page_zone() is unexpectedly called for an
offlined memmap.

Use pfn_to_online_page() to get struct page in PFN walker.

Fixes: 49071d436b51 ("thp: add debugfs handle to split all huge pages")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Co-developed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org> # 5.10+
---
 mm/huge_memory.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5fa2ba86dae4..730eb6d6836b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2894,11 +2894,9 @@ static void split_huge_pages_all(void)
 		max_zone_pfn = zone_end_pfn(zone);
 		for (pfn = zone->zone_start_pfn; pfn < max_zone_pfn; pfn++) {
 			int nr_pages;
-			if (!pfn_valid(pfn))
-				continue;
 
-			page = pfn_to_page(pfn);
-			if (!get_page_unless_zero(page))
+			page = pfn_to_online_page(pfn);
+			if (!page || !get_page_unless_zero(page))
 				continue;
 
 			if (zone != page_zone(page))
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all()
  2022-09-08  4:11 [PATCH v2] mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all() Naoya Horiguchi
@ 2022-09-08 12:23 ` Kirill A. Shutemov
  2022-09-08 16:48   ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: Kirill A. Shutemov @ 2022-09-08 12:23 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrew Morton, David Hildenbrand, Muchun Song,
	Miaohe Lin, Matthew Wilcox, Michal Hocko, Yang Shi,
	Naoya Horiguchi, linux-kernel

On Thu, Sep 08, 2022 at 01:11:50PM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> NULL pointer dereference is triggered when calling thp split via debugfs
> on the system with offlined memory blocks.  With debug option enabled,
> the following kernel messages are printed out:
> 
>   page:00000000467f4890 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x121c000
>   flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff)
>   raw: 0017fffc00000000 0000000000000000 dead000000000122 0000000000000000
>   raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:000000007d7ab72e is uninitialized and poisoned
>   page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
>   ------------[ cut here ]------------
>   kernel BUG at include/linux/mm.h:1248!
>   invalid opcode: 0000 [#1] PREEMPT SMP PTI
>   CPU: 16 PID: 20964 Comm: bash Tainted: G          I        6.0.0-rc3-foll-numa+ #41
>   ...
>   RIP: 0010:split_huge_pages_write+0xcf4/0xe30
> 
> This shows that page_to_nid() in page_zone() is unexpectedly called for an
> offlined memmap.
> 
> Use pfn_to_online_page() to get struct page in PFN walker.
> 
> Fixes: 49071d436b51 ("thp: add debugfs handle to split all huge pages")
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Co-developed-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Yang Shi <shy828301@gmail.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> Cc: <stable@vger.kernel.org> # 5.10+

Looks good:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

But it makes me think if there's other similar cases. "page is offline" is
rather obscure case that rarely covered by routine testing. Otherwise the
bug would not survive for 6 years.

After quick look, kvm_pfn_to_refcounted_page() looks suspicious.
kdb_getphys() too.

Maybe we should make pfn_valid() false for offline pages and introduce
other check that allows offline pages which can be used in codepaths that
deal with offline pages explicitly.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all()
  2022-09-08 12:23 ` Kirill A. Shutemov
@ 2022-09-08 16:48   ` Michal Hocko
  0 siblings, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2022-09-08 16:48 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, David Hildenbrand,
	Muchun Song, Miaohe Lin, Matthew Wilcox, Yang Shi,
	Naoya Horiguchi, linux-kernel

On Thu 08-09-22 15:23:03, Kirill A. Shutemov wrote:
[...]
> But it makes me think if there's other similar cases. "page is offline" is
> rather obscure case that rarely covered by routine testing. Otherwise the
> bug would not survive for 6 years.
> 
> After quick look, kvm_pfn_to_refcounted_page() looks suspicious.

this one is hard to judge for me. Is this ever used for something that
could be offlined?

> kdb_getphys() too.

this one looks it needs a fix
 
> Maybe we should make pfn_valid() false for offline pages and introduce
> other check that allows offline pages which can be used in codepaths that
> deal with offline pages explicitly.

Back when pfn_to_online page was introduced altering pfn_valid was
considered but the semantic is not the same. offline pages could be
still pfn_valid. The discussion should be in archives. Sorry do not have
the link handy. We have also considered changing pfn_to_page but that
would add an overhead for unlikely case to everybody. So the conclusion
was that only pfn walkers should care. Most other users just translate
pfn to page when the page cannot be offlined (e.g. because it is
referenced).

I do realize this is fragile but we couldn't come up with something more
clever without introducing overhead all over the place.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-09-08 16:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-08  4:11 [PATCH v2] mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all() Naoya Horiguchi
2022-09-08 12:23 ` Kirill A. Shutemov
2022-09-08 16:48   ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).