linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] mm/hotplug: invalid PFNs from pfn_to_online_page()
@ 2019-01-21 21:27 Qian Cai
  2019-01-22  6:58 ` Michal Hocko
  2019-01-22 10:46 ` Oscar Salvador
  0 siblings, 2 replies; 3+ messages in thread
From: Qian Cai @ 2019-01-21 21:27 UTC (permalink / raw)
  To: akpm
  Cc: osalvador, catalin.marinas, vbabka, mhocko, linux-mm,
	linux-kernel, Qian Cai

On an arm64 ThunderX2 server, the first kmemleak scan would crash [1]
with CONFIG_DEBUG_VM_PGFLAGS=y due to page_to_nid() found a pfn that is
not directly mapped (MEMBLOCK_NOMAP). Hence, the page->flags is
uninitialized.

This is due to the commit 9f1eb38e0e11 ("mm, kmemleak: little
optimization while scanning") starts to use pfn_to_online_page() instead
of pfn_valid(). However, in the CONFIG_MEMORY_HOTPLUG=y case,
pfn_to_online_page() does not call memblock_is_map_memory() while
pfn_valid() does.

Historically, the commit 68709f45385a ("arm64: only consider memblocks
with NOMAP cleared for linear mapping") causes pages marked as nomap
being no long reassigned to the new zone in memmap_init_zone() by
calling __init_single_page().

Since the commit 2d070eab2e82 ("mm: consider zone which is not fully
populated to have holes") introduced pfn_to_online_page() and was
designed to return a valid pfn only, but it is clearly broken on arm64.

Therefore, let pfn_to_online_page() call pfn_valid_within(), so it can
handle nomap thanks to the commit f52bb98f5ade ("arm64: mm: always
enable CONFIG_HOLES_IN_ZONE"), while it will be optimized away on
architectures where have no HOLES_IN_ZONE.

[1]
[  102.195320] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006
[  102.204113] Mem abort info:
[  102.206921]   ESR = 0x96000005
[  102.209997]   Exception class = DABT (current EL), IL = 32 bits
[  102.215926]   SET = 0, FnV = 0
[  102.218993]   EA = 0, S1PTW = 0
[  102.222150] Data abort info:
[  102.225047]   ISV = 0, ISS = 0x00000005
[  102.228887]   CM = 0, WnR = 0
[  102.231866] user pgtable: 64k pages, 48-bit VAs, pgdp = (____ptrval____)
[  102.238572] [0000000000000006] pgd=0000000000000000, pud=0000000000000000
[  102.245448] Internal error: Oops: 96000005 [#1] SMP
[  102.264062] CPU: 60 PID: 1408 Comm: kmemleak Not tainted 5.0.0-rc2+ #8
[  102.280403] pstate: 60400009 (nZCv daif +PAN -UAO)
[  102.280409] pc : page_mapping+0x24/0x144
[  102.280415] lr : __dump_page+0x34/0x3dc
[  102.292923] sp : ffff00003a5cfd10
[  102.296229] x29: ffff00003a5cfd10 x28: 000000000000802f
[  102.301533] x27: 0000000000000000 x26: 0000000000277d00
[  102.306835] x25: ffff000010791f56 x24: ffff7fe000000000
[  102.312138] x23: ffff000010772f8b x22: ffff00001125f670
[  102.317442] x21: ffff000011311000 x20: ffff000010772f8b
[  102.322747] x19: fffffffffffffffe x18: 0000000000000000
[  102.328049] x17: 0000000000000000 x16: 0000000000000000
[  102.333352] x15: 0000000000000000 x14: ffff802698b19600
[  102.338654] x13: ffff802698b1a200 x12: ffff802698b16f00
[  102.343956] x11: ffff802698b1a400 x10: 0000000000001400
[  102.349260] x9 : 0000000000000001 x8 : ffff00001121a000
[  102.354563] x7 : 0000000000000000 x6 : ffff0000102c53b8
[  102.359868] x5 : 0000000000000000 x4 : 0000000000000003
[  102.365173] x3 : 0000000000000100 x2 : 0000000000000000
[  102.370476] x1 : ffff000010772f8b x0 : ffffffffffffffff
[  102.375782] Process kmemleak (pid: 1408, stack limit = 0x(____ptrval____))
[  102.382648] Call trace:
[  102.385091]  page_mapping+0x24/0x144
[  102.388659]  __dump_page+0x34/0x3dc
[  102.392140]  dump_page+0x28/0x4c
[  102.395363]  kmemleak_scan+0x4ac/0x680
[  102.399106]  kmemleak_scan_thread+0xb4/0xdc
[  102.403285]  kthread+0x12c/0x13c
[  102.406509]  ret_from_fork+0x10/0x18
[  102.410080] Code: d503201f f9400660 36000040 d1000413 (f9400661)
[  102.416357] ---[ end trace 4d4bd7f573490c8e ]---
[  102.420966] Kernel panic - not syncing: Fatal exception
[  102.426293] SMP: stopping secondary CPUs
[  102.431830] Kernel Offset: disabled
[  102.435311] CPU features: 0x002,20000c38
[  102.439223] Memory Limit: none
[  102.442384] ---[ end Kernel panic - not syncing: Fatal exception ]---

Fixes: 9f1eb38e0e11 ("mm, kmemleak: little optimization while scanning")
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Qian Cai <cai@lca.pw>
---

v3: change the "Fixes" line.
v2: update the changelog; keep the bound check; use pfn_valid_within().

 include/linux/memory_hotplug.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 07da5c6c5ba0..cdeecd9bd87e 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -21,14 +21,15 @@ struct vmem_altmap;
  * walkers which rely on the fully initialized page->flags and others
  * should use this rather than pfn_valid && pfn_to_page
  */
-#define pfn_to_online_page(pfn)				\
-({							\
-	struct page *___page = NULL;			\
-	unsigned long ___nr = pfn_to_section_nr(pfn);	\
-							\
-	if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr))\
-		___page = pfn_to_page(pfn);		\
-	___page;					\
+#define pfn_to_online_page(pfn)					   \
+({								   \
+	struct page *___page = NULL;				   \
+	unsigned long ___nr = pfn_to_section_nr(pfn);		   \
+								   \
+	if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
+	    pfn_valid_within(pfn))				   \
+		___page = pfn_to_page(pfn);			   \
+	___page;						   \
 })
 
 /*
-- 
2.17.2 (Apple Git-113)


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v3] mm/hotplug: invalid PFNs from pfn_to_online_page()
  2019-01-21 21:27 [PATCH v3] mm/hotplug: invalid PFNs from pfn_to_online_page() Qian Cai
@ 2019-01-22  6:58 ` Michal Hocko
  2019-01-22 10:46 ` Oscar Salvador
  1 sibling, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2019-01-22  6:58 UTC (permalink / raw)
  To: Qian Cai; +Cc: akpm, osalvador, catalin.marinas, vbabka, linux-mm, linux-kernel

On Mon 21-01-19 16:27:47, Qian Cai wrote:
[...]

Sorry to miss this before but you want to

> +#define pfn_to_online_page(pfn)					   \
> +({								   \
> +	struct page *___page = NULL;				   \
	unsigned long ___pfn = pfn;
> +	unsigned long ___nr = pfn_to_section_nr(pfn);		   \
			      pfn_to_section_nr(___pfn);
> +								   \
> +	if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
> +	    pfn_valid_within(pfn))				   \
	    pfn_valid_within(___pfn))
	
> +		___page = pfn_to_page(pfn);			   \
> +	___page;						   \
>  })

to prevent from issues when pfn expression has side effects.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v3] mm/hotplug: invalid PFNs from pfn_to_online_page()
  2019-01-21 21:27 [PATCH v3] mm/hotplug: invalid PFNs from pfn_to_online_page() Qian Cai
  2019-01-22  6:58 ` Michal Hocko
@ 2019-01-22 10:46 ` Oscar Salvador
  1 sibling, 0 replies; 3+ messages in thread
From: Oscar Salvador @ 2019-01-22 10:46 UTC (permalink / raw)
  To: Qian Cai; +Cc: akpm, catalin.marinas, vbabka, mhocko, linux-mm, linux-kernel

On Mon, Jan 21, 2019 at 04:27:47PM -0500, Qian Cai wrote:
 
> Fixes: 9f1eb38e0e11 ("mm, kmemleak: little optimization while scanning")
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Qian Cai <cai@lca.pw>

Heh, I guess that it comes in handy to have a machine with CONFIG_HOLES_IN_ZONE
enabled.
I totally missed the fact that systems with such configuration can have
uninitialized pages even if the section is online.
To be honest, I blindly thought that if a section was online, that means
that all its pages were initialized properly.

Thanks for fixing it:

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> ---
> 
> v3: change the "Fixes" line.
> v2: update the changelog; keep the bound check; use pfn_valid_within().
> 
>  include/linux/memory_hotplug.h | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 07da5c6c5ba0..cdeecd9bd87e 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -21,14 +21,15 @@ struct vmem_altmap;
>   * walkers which rely on the fully initialized page->flags and others
>   * should use this rather than pfn_valid && pfn_to_page
>   */
> -#define pfn_to_online_page(pfn)				\
> -({							\
> -	struct page *___page = NULL;			\
> -	unsigned long ___nr = pfn_to_section_nr(pfn);	\
> -							\
> -	if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr))\
> -		___page = pfn_to_page(pfn);		\
> -	___page;					\
> +#define pfn_to_online_page(pfn)					   \
> +({								   \
> +	struct page *___page = NULL;				   \
> +	unsigned long ___nr = pfn_to_section_nr(pfn);		   \
> +								   \
> +	if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
> +	    pfn_valid_within(pfn))				   \
> +		___page = pfn_to_page(pfn);			   \
> +	___page;						   \
>  })
>  
>  /*
> -- 
> 2.17.2 (Apple Git-113)
> 

-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-01-22 10:46 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-21 21:27 [PATCH v3] mm/hotplug: invalid PFNs from pfn_to_online_page() Qian Cai
2019-01-22  6:58 ` Michal Hocko
2019-01-22 10:46 ` Oscar Salvador

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).