linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] memblock, arm: fixes for freeing of the memory map
@ 2021-05-18  9:06 Mike Rapoport
  2021-05-18  9:06 ` [PATCH 1/3] memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER Mike Rapoport
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-05-18  9:06 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Kefeng Wang, Mike Rapoport, Mike Rapoport,
	Russell King, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

The coordination between freeing of the unused memory map, pfn_valid() and
core mm assumptions about the validity of the memory map in various ranges
was not designed for complex physical memory layouts with a lot of holes
all over the place.

Kefeng Wang reported crashes in move_freepages() [1] on a system with the
following memory layout:

  node   0: [mem 0x0000000080a00000-0x00000000855fffff]
  node   0: [mem 0x0000000086a00000-0x0000000087dfffff]
  node   0: [mem 0x000000008bd00000-0x000000008c4fffff]
  node   0: [mem 0x000000008e300000-0x000000008ecfffff]
  node   0: [mem 0x0000000090d00000-0x00000000bfffffff]
  node   0: [mem 0x00000000cc000000-0x00000000dc9fffff]
  node   0: [mem 0x00000000de700000-0x00000000de9fffff]
  node   0: [mem 0x00000000e0800000-0x00000000e0bfffff]
  node   0: [mem 0x00000000f4b00000-0x00000000f6ffffff]
  node   0: [mem 0x00000000fda00000-0x00000000ffffefff]
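
For context, the crashing function, move_freepages(), walks the memory map
for an entire PFN range and relies on pfn_valid_within() to skip holes; a
simplified sketch of the loop (roughly mm/page_alloc.c of that era):

	for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
		if (!pfn_valid_within(pfn))	/* (1) without CONFIG_HOLES_IN_ZONE */
			continue;
		page = pfn_to_page(pfn);
		/* dereferences struct page: oopses if the hole has no memory map */
		if (!PageBuddy(page))
			continue;
		order = buddy_order(page);
		move_to_free_list(page, zone, order, migratetype);
		pfn += (1 << order) - 1;
	}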

The crashes can be mitigated by enabling CONFIG_HOLES_IN_ZONE, which
essentially turns pfn_valid_within() into pfn_valid() instead of having it
hardwired to 1.
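
For reference, pfn_valid_within() is defined roughly as follows in
include/linux/mmzone.h:

	#ifdef CONFIG_HOLES_IN_ZONE
	#define pfn_valid_within(pfn)	pfn_valid(pfn)
	#else
	#define pfn_valid_within(pfn)	(1)
	#endif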

Alternatively, we can update ARM's implementation of pfn_valid() to take
into account the rounding of the freed memory map to pageblock boundaries
and make sure it returns true for PFNs that have memory map entries even if
there is no physical memory populated there.

I can take the entire series via the memblock tree.

[1] https://lore.kernel.org/lkml/2a1592ad-bc9d-4664-fd19-f7448a37edc0@huawei.com

Mike Rapoport (3):
  memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER
  memblock: align freed memory map on pageblock boundaries with SPARSEMEM
  arm: extend pfn_valid to take into account freed memory map alignment

 arch/arm/mm/init.c | 15 ++++++++++++++-
 mm/memblock.c      | 23 ++++++++++++-----------
 2 files changed, 26 insertions(+), 12 deletions(-)


base-commit: d07f6ca923ea0927a1024dfccafc5b53b61cfecc
-- 
2.28.0



* [PATCH 1/3] memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER
  2021-05-18  9:06 [PATCH 0/3] memblock, arm: fixes for freeing of the memory map Mike Rapoport
@ 2021-05-18  9:06 ` Mike Rapoport
  2021-05-18  9:06 ` [PATCH 2/3] memblock: align freed memory map on pageblock boundaries with SPARSEMEM Mike Rapoport
  2021-05-18  9:06 ` [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-05-18  9:06 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Kefeng Wang, Mike Rapoport, Mike Rapoport,
	Russell King, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

The code that frees the unused memory map rounds the start and end of the
holes that are freed to MAX_ORDER_NR_PAGES to preserve continuity of the
memory map for MAX_ORDER regions.

Lots of core memory management functionality relies on homogeneity of the
memory map within each pageblock, whose size may differ from
MAX_ORDER_NR_PAGES in certain configurations.

Although currently, for the architectures that use free_unused_memmap(),
pageblock_order and MAX_ORDER are equivalent, it is cleaner to have common
notation throughout the mm code.
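
For reference, the pageblock size is defined along these lines in
include/linux/pageblock-flags.h (simplified; with CONFIG_HUGETLB_PAGE it
follows HUGETLB_PAGE_ORDER instead):

	#define pageblock_order		(MAX_ORDER - 1)
	#define pageblock_nr_pages	(1UL << pageblock_order)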

Replace MAX_ORDER_NR_PAGES with pageblock_nr_pages and update the comments
to make it more clear why the alignment to pageblock boundaries is
required.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 mm/memblock.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index afaefa8fc6ab..97fa87541b5f 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1943,11 +1943,11 @@ static void __init free_unused_memmap(void)
 		start = min(start, ALIGN(prev_end, PAGES_PER_SECTION));
 #else
 		/*
-		 * Align down here since the VM subsystem insists that the
-		 * memmap entries are valid from the bank start aligned to
-		 * MAX_ORDER_NR_PAGES.
+		 * Align down here since many operations in VM subsystem
+		 * presume that there are no holes in the memory map inside
+		 * a pageblock
 		 */
-		start = round_down(start, MAX_ORDER_NR_PAGES);
+		start = round_down(start, pageblock_nr_pages);
 #endif
 
 		/*
@@ -1958,11 +1958,11 @@ static void __init free_unused_memmap(void)
 			free_memmap(prev_end, start);
 
 		/*
-		 * Align up here since the VM subsystem insists that the
-		 * memmap entries are valid from the bank end aligned to
-		 * MAX_ORDER_NR_PAGES.
+		 * Align up here since many operations in VM subsystem
+		 * presume that there are no holes in the memory map inside
+		 * a pageblock
 		 */
-		prev_end = ALIGN(end, MAX_ORDER_NR_PAGES);
+		prev_end = ALIGN(end, pageblock_nr_pages);
 	}
 
 #ifdef CONFIG_SPARSEMEM
-- 
2.28.0



* [PATCH 2/3] memblock: align freed memory map on pageblock boundaries with SPARSEMEM
  2021-05-18  9:06 [PATCH 0/3] memblock, arm: fixes for freeing of the memory map Mike Rapoport
  2021-05-18  9:06 ` [PATCH 1/3] memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER Mike Rapoport
@ 2021-05-18  9:06 ` Mike Rapoport
  2021-05-18  9:06 ` [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-05-18  9:06 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Kefeng Wang, Mike Rapoport, Mike Rapoport,
	Russell King, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

When CONFIG_SPARSEMEM=y the ranges of the memory map that are freed are not
aligned to pageblock boundaries, which breaks assumptions about homogeneity
of the memory map throughout core mm code.

Make sure that the freed memory map is always aligned on pageblock
boundaries regardless of the memory model selection.
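
As an illustration, assuming 4M pageblocks (pageblock_nr_pages == 0x400
with 4K pages), the hole between the banks ending at pfn 0x85600 and
starting at pfn 0x86a00 from the report is now freed as

	prev_end = ALIGN(0x85600, pageblock_nr_pages);	 /* 0x85800 */
	start = round_down(0x86a00, pageblock_nr_pages); /* 0x86800 */
	free_memmap(prev_end, start);

so the memory map entries at the pageblock edges of the hole are kept with
SPARSEMEM as well.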

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 mm/memblock.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 97fa87541b5f..2e25d69739e0 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1941,14 +1941,13 @@ static void __init free_unused_memmap(void)
 		 * due to SPARSEMEM sections which aren't present.
 		 */
 		start = min(start, ALIGN(prev_end, PAGES_PER_SECTION));
-#else
+#endif
 		/*
 		 * Align down here since many operations in VM subsystem
 		 * presume that there are no holes in the memory map inside
 		 * a pageblock
 		 */
 		start = round_down(start, pageblock_nr_pages);
-#endif
 
 		/*
 		 * If we had a previous bank, and there is a space
@@ -1966,8 +1965,10 @@ static void __init free_unused_memmap(void)
 	}
 
 #ifdef CONFIG_SPARSEMEM
-	if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION))
+	if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION)) {
+		prev_end = ALIGN(end, pageblock_nr_pages);
 		free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
+	}
 #endif
 }
 
-- 
2.28.0



* [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment
  2021-05-18  9:06 [PATCH 0/3] memblock, arm: fixes for freeing of the memory map Mike Rapoport
  2021-05-18  9:06 ` [PATCH 1/3] memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER Mike Rapoport
  2021-05-18  9:06 ` [PATCH 2/3] memblock: align freed memory map on pageblock boundaries with SPARSEMEM Mike Rapoport
@ 2021-05-18  9:06 ` Mike Rapoport
  2021-05-18  9:44   ` Russell King (Oracle)
  2021-05-18 12:49   ` Kefeng Wang
  2 siblings, 2 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-05-18  9:06 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Kefeng Wang, Mike Rapoport, Mike Rapoport,
	Russell King, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

When unused memory map is freed the preserved part of the memory map is
extended to match pageblock boundaries because lots of core mm
functionality relies on homogeneity of the memory map within pageblock
boundaries.

Since pfn_valid() is used to check whether there is a valid memory map
entry for a PFN, make it return true also for PFNs that have memory map
entries even if there is no actual memory populated there.
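
For example, assuming 4M pageblocks, pfn 0x85700 falls into a hole in the
layout from the cover letter, but the pageblock containing it, pfns
[0x85400, 0x85800), overlaps the bank that ends at pfn 0x85600; its memory
map entry was therefore preserved and pfn_valid() should return 1 for it.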

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm/mm/init.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 9d4744a632c6..bb678c0ba143 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -125,11 +125,24 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
 int pfn_valid(unsigned long pfn)
 {
 	phys_addr_t addr = __pfn_to_phys(pfn);
+	unsigned long pageblock_size = PAGE_SIZE * pageblock_nr_pages;
 
 	if (__phys_to_pfn(addr) != pfn)
 		return 0;
 
-	return memblock_is_map_memory(addr);
+	if (memblock_is_map_memory(addr))
+		return 1;
+
+	/*
+	 * If address less than pageblock_size bytes away from a present
+	 * memory chunk there still will be a memory map entry for it
+	 * because we round freed memory map to the pageblock boundaries
+	 */
+	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
+	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
+		return 1;
+
+	return 0;
 }
 EXPORT_SYMBOL(pfn_valid);
 #endif
-- 
2.28.0



* Re: [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment
  2021-05-18  9:06 ` [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment Mike Rapoport
@ 2021-05-18  9:44   ` Russell King (Oracle)
  2021-05-18 10:53     ` Mike Rapoport
  2021-05-18 12:49   ` Kefeng Wang
  1 sibling, 1 reply; 10+ messages in thread
From: Russell King (Oracle) @ 2021-05-18  9:44 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-arm-kernel, Andrew Morton, Kefeng Wang, Mike Rapoport,
	linux-kernel, linux-mm

On Tue, May 18, 2021 at 12:06:13PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When unused memory map is freed the preserved part of the memory map is
> extended to match pageblock boundaries because lots of core mm
> functionality relies on homogeneity of the memory map within pageblock
> boundaries.
> 
> Since pfn_valid() is used to check whether there is a valid memory map
> entry for a PFN, make it return true also for PFNs that have memory map
> entries even if there is no actual memory populated there.

I thought pfn_valid() was a particularly hot path... do we really want
to be doing multiple lookups here? Is there no better solution?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment
  2021-05-18  9:44   ` Russell King (Oracle)
@ 2021-05-18 10:53     ` Mike Rapoport
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-05-18 10:53 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: linux-arm-kernel, Andrew Morton, Kefeng Wang, Mike Rapoport,
	linux-kernel, linux-mm

On Tue, May 18, 2021 at 10:44:27AM +0100, Russell King (Oracle) wrote:
> On Tue, May 18, 2021 at 12:06:13PM +0300, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > When unused memory map is freed the preserved part of the memory map is
> > extended to match pageblock boundaries because lots of core mm
> > functionality relies on homogeneity of the memory map within pageblock
> > boundaries.
> > 
> > Since pfn_valid() is used to check whether there is a valid memory map
> > entry for a PFN, make it return true also for PFNs that have memory map
> > entries even if there is no actual memory populated there.
> 
> I thought pfn_valid() was a particularly hot path... do we really want
> to be doing multiple lookups here? Is there no better solution?

It is hot, but for more, hmm, straightforward memory layouts it'll take the

	if (memblock_is_map_memory(addr))
		return 1;

branch, I think.

Most mm operations are on pages that are fed into the buddy allocator, and
if there are no holes with weird alignment, pfn_valid() will return 1 right
away.

Now thinking about it, with the patch that marks NOMAP areas reserved in
the memory map [1], we could also use
	
	memblock_overlaps_region(&memblock.memory,
				 ALIGN_DOWN(addr, pageblock_size),
				 pageblock_size)
to have only one lookup.

A completely different approach would be to simply stop freeing the memory
map with SPARSEMEM. For systems like the one Kefeng is using, it would
waste less than 2M out of 1.5G.
It is worse of course for old systems with small memories. The worst case
would be mach-ep93xx with a section size of 256M, where I presume 16M of
memory per section would be normal for such machines.

[1] https://lore.kernel.org/lkml/20210511100550.28178-3-rppt@kernel.org

-- 
Sincerely yours,
Mike.


* Re: [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment
  2021-05-18  9:06 ` [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment Mike Rapoport
  2021-05-18  9:44   ` Russell King (Oracle)
@ 2021-05-18 12:49   ` Kefeng Wang
  2021-05-18 15:52     ` Mike Rapoport
  1 sibling, 1 reply; 10+ messages in thread
From: Kefeng Wang @ 2021-05-18 12:49 UTC (permalink / raw)
  To: Mike Rapoport, linux-arm-kernel
  Cc: Andrew Morton, Mike Rapoport, Russell King, linux-kernel, linux-mm



On 2021/5/18 17:06, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When unused memory map is freed the preserved part of the memory map is
> extended to match pageblock boundaries because lots of core mm
> functionality relies on homogeneity of the memory map within pageblock
> boundaries.
> 
> Since pfn_valid() is used to check whether there is a valid memory map
> entry for a PFN, make it return true also for PFNs that have memory map
> entries even if there is no actual memory populated there.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>   arch/arm/mm/init.c | 15 ++++++++++++++-
>   1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 9d4744a632c6..bb678c0ba143 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -125,11 +125,24 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>   int pfn_valid(unsigned long pfn)
>   {
>   	phys_addr_t addr = __pfn_to_phys(pfn);
> +	unsigned long pageblock_size = PAGE_SIZE * pageblock_nr_pages;
>   
>   	if (__phys_to_pfn(addr) != pfn)
>   		return 0;
>   
> -	return memblock_is_map_memory(addr);
> +	if (memblock_is_map_memory(addr))
> +		return 1;
> +
> +	/*
> +	 * If address less than pageblock_size bytes away from a present
> +	 * memory chunk there still will be a memory map entry for it
> +	 * because we round freed memory map to the pageblock boundaries
> +	 */
> +	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
> +	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
> +		return 1;

Hi Mike, with patch3, the system won't boot.

...
Initmem setup node 0 [mem 0x0000000080a00000-0x00000000ffffefff]
   Normal zone: 18176 pages in unavailable ranges
8<--- cut here ---
Unable to handle kernel paging request at virtual address 0197c000
pgd = (ptrval)
[0197c000] *pgd=00000000
Internal error: Oops: 805 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #132
Hardware name: Hisilicon A9
PC is at mmioset+0x4c/0xa8
LR is at 0x0
pc : [<c033930c>]    lr : [<00000000>]    psr: 00000293
sp : c0801d88  ip : 0197c000  fp : 000001ff
r10: 00000000  r9 : 00000001  r8 : 00000000
r7 : 00000032  r6 : 00000032  r5 : 01000000  r4 : 0197c000
r3 : 00000000  r2 : ffffffe0  r1 : 00000000  r0 : 0197c000
Flags: nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 1ac5387d  Table: 80a0404a  DAC: 55555555
Process swapper (pid: 0, stack limit = 0x(ptrval))
Stack: (0xc0801d88 to 0xc0802000)
1d80:                   000cc000 c04b5764 000cbe00 0197c000 c0885040 c04b580c
1da0: 00000000 00000000 00000001 000fffff 00000000 00000000 00000000 000b0200
1dc0: 00000001 00000000 c0880f38 c04b5d4c 000fffff 00000000 00000000 00000001
1de0: 000cc000 000dca00 00000005 c0813d28 000259ff 000259ff 00000001 c072e4c4
1e00: 0004fdff 000b0200 c050903c 00000004 c0863280 c0712ea4 00000000 ffffefff
1e20: 00000000 c0862fa0 0007f5ff 00080a00 00000000 00080a00 000fffff c0813d28
1e40: 00000000 00000001 c05b046b 00000000 fffff000 c0842780 c0801e90 00000000
1e60: fffff000 c071336c c0801e90 ffffefff 00000000 00000001 000fda00 000fffff
1e80: ffffffff 000fda00 000fffff ffffffff 00000000 c0813d28 00000000 c0881bb8
1ea0: c0881bac c0817900 c05df034 c0842780 c0818614 c086e3fc ffe00000 c0705d50
1ec0: 000b0200 000fffff 00000000 c0813d28 c0817900 00000000 ef7fa000 c07076ac
1ee0: 00000000 c0801f00 c0801f04 00000000 b0200000 00080b00 00081200 c072b000
1f00: cc000000 b0200000 00000000 00000006 fda00000 ffff1000 000afe01 00001000
1f20: 00000007 c0813d28 c0598da7 c0727be8 c08178e4 c086e1c0 c08e8058 c0008000
1f40: c0801fcc 1ac5387d c05df0f8 c07044e8 00000000 00117680 c0813d20 12c0387d
1f60: 40000193 80a01000 413fc090 1ac5387d 00000000 c0166978 c0801fac c0801fac
1f80: 00000000 c04b07e0 c0801fac c0813d28 00000080 00117680 c0813d20 12c0387d
1fa0: 40000193 80a01000 413fc090 1ac5387d 00000000 c0700940 00000000 00000000
1fc0: 00000000 00000000 00000000 c072aa64 00000000 c0813d28 00000000 00117680
1fe0: 55555555 12c0387d 40000193 80a01000 413fc090 00000000 00000000 00000000
[<c033930c>] (mmioset) from [<c04b5764>] (__init_single_page+0x50/0x7c)
[<c04b5764>] (__init_single_page) from [<c04b580c>] (init_unavailable_range+0x7c/0xec)
[<c04b580c>] (init_unavailable_range) from [<c04b5d4c>] (memmap_init+0x144/0x188)
[<c04b5d4c>] (memmap_init) from [<c0712ea4>] (free_area_init_node+0x3b8/0x54c)
[<c0712ea4>] (free_area_init_node) from [<c071336c>] (free_area_init+0x214/0x5a4)
[<c071336c>] (free_area_init) from [<c0705d50>] (bootmem_init+0x78/0xac)
[<c0705d50>] (bootmem_init) from [<c07076ac>] (paging_init+0x410/0x6bc)
[<c07076ac>] (paging_init) from [<c07044e8>] (setup_arch+0x10c/0x5c4)
[<c07044e8>] (setup_arch) from [<c0700940>] (start_kernel+0x60/0x5c0)
[<c0700940>] (start_kernel) from [<00000000>] (0x0)
Code: 0a41aca8 f9ffffca 0081bd08 200012e3 (0a41ac18)



* Re: [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment
  2021-05-18 12:49   ` Kefeng Wang
@ 2021-05-18 15:52     ` Mike Rapoport
  2021-05-19  1:50       ` Kefeng Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Rapoport @ 2021-05-18 15:52 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: linux-arm-kernel, Andrew Morton, Mike Rapoport, Russell King,
	linux-kernel, linux-mm

On Tue, May 18, 2021 at 08:49:43PM +0800, Kefeng Wang wrote:
> 
> 
> On 2021/5/18 17:06, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > When unused memory map is freed the preserved part of the memory map is
> > extended to match pageblock boundaries because lots of core mm
> > functionality relies on homogeneity of the memory map within pageblock
> > boundaries.
> > 
> > Since pfn_valid() is used to check whether there is a valid memory map
> > entry for a PFN, make it return true also for PFNs that have memory map
> > entries even if there is no actual memory populated there.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >   arch/arm/mm/init.c | 15 ++++++++++++++-
> >   1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> > index 9d4744a632c6..bb678c0ba143 100644
> > --- a/arch/arm/mm/init.c
> > +++ b/arch/arm/mm/init.c
> > @@ -125,11 +125,24 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
> >   int pfn_valid(unsigned long pfn)
> >   {
> >   	phys_addr_t addr = __pfn_to_phys(pfn);
> > +	unsigned long pageblock_size = PAGE_SIZE * pageblock_nr_pages;
> >   	if (__phys_to_pfn(addr) != pfn)
> >   		return 0;
> > -	return memblock_is_map_memory(addr);
> > +	if (memblock_is_map_memory(addr))
> > +		return 1;
> > +
> > +	/*
> > +	 * If address less than pageblock_size bytes away from a present
> > +	 * memory chunk there still will be a memory map entry for it
> > +	 * because we round freed memory map to the pageblock boundaries
> > +	 */
> > +	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
> > +	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
> > +		return 1;
> 
> Hi Mike, with patch3, the system won't boot.

Hmm, apparently I've miscalculated the ranges...

Can you please check with the below patch on top of this series:

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index bb678c0ba143..2fafbbc8e73b 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -138,8 +138,9 @@ int pfn_valid(unsigned long pfn)
 	 * memory chunk there still will be a memory map entry for it
 	 * because we round freed memory map to the pageblock boundaries
 	 */
-	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
-	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
+	if (memblock_overlaps_region(&memblock.memory,
+				     ALIGN_DOWN(addr, pageblock_size),
+				     pageblock_size))
 		return 1;
 
 	return 0;

-- 
Sincerely yours,
Mike.


* Re: [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment
  2021-05-18 15:52     ` Mike Rapoport
@ 2021-05-19  1:50       ` Kefeng Wang
  2021-05-19 13:25         ` Mike Rapoport
  0 siblings, 1 reply; 10+ messages in thread
From: Kefeng Wang @ 2021-05-19  1:50 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-arm-kernel, Andrew Morton, Mike Rapoport, Russell King,
	linux-kernel, linux-mm



On 2021/5/18 23:52, Mike Rapoport wrote:
> On Tue, May 18, 2021 at 08:49:43PM +0800, Kefeng Wang wrote:
>>
>>
>> On 2021/5/18 17:06, Mike Rapoport wrote:
>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>> When unused memory map is freed the preserved part of the memory map is
>>> extended to match pageblock boundaries because lots of core mm
>>> functionality relies on homogeneity of the memory map within pageblock
>>> boundaries.
>>>
>>> Since pfn_valid() is used to check whether there is a valid memory map
>>> entry for a PFN, make it return true also for PFNs that have memory map
>>> entries even if there is no actual memory populated there.
>>>
>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>> ---
>>>    arch/arm/mm/init.c | 15 ++++++++++++++-
>>>    1 file changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>>> index 9d4744a632c6..bb678c0ba143 100644
>>> --- a/arch/arm/mm/init.c
>>> +++ b/arch/arm/mm/init.c
>>> @@ -125,11 +125,24 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>>>    int pfn_valid(unsigned long pfn)
>>>    {
>>>    	phys_addr_t addr = __pfn_to_phys(pfn);
>>> +	unsigned long pageblock_size = PAGE_SIZE * pageblock_nr_pages;
>>>    	if (__phys_to_pfn(addr) != pfn)
>>>    		return 0;
>>> -	return memblock_is_map_memory(addr);
>>> +	if (memblock_is_map_memory(addr))
>>> +		return 1;
>>> +
>>> +	/*
>>> +	 * If address less than pageblock_size bytes away from a present
>>> +	 * memory chunk there still will be a memory map entry for it
>>> +	 * because we round freed memory map to the pageblock boundaries
>>> +	 */
>>> +	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
>>> +	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
>>> +		return 1;
>>
>> Hi Mike, with patch3, the system won't boot.
> 
> Hmm, apparently I've miscalculated the ranges...
> 
> Can you please check with the below patch on top of this series:

Yes, it works,

On node 0 totalpages: 311551
   Normal zone: 1230 pages used for memmap
   Normal zone: 0 pages reserved
   Normal zone: 157440 pages, LIFO batch:31
   Normal zone: 17152 pages in unavailable ranges
   HighMem zone: 154111 pages, LIFO batch:31
   HighMem zone: 513 pages in unavailable ranges

and the oom testcase could pass.

Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>


There is memblock_is_region_reserved() (it checks whether a region
intersects reserved memory, and it also checks the size); should we add a
similar func?

> 
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index bb678c0ba143..2fafbbc8e73b 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -138,8 +138,9 @@ int pfn_valid(unsigned long pfn)
>   	 * memory chunk there still will be a memory map entry for it
>   	 * because we round freed memory map to the pageblock boundaries
>   	 */
> -	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
> -	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
> +	if (memblock_overlaps_region(&memblock.memory,
> +				     ALIGN_DOWN(addr, pageblock_size),
> +				     pageblock_size);
>   		return 1;
>   
>   	return 0;
> 


* Re: [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment
  2021-05-19  1:50       ` Kefeng Wang
@ 2021-05-19 13:25         ` Mike Rapoport
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-05-19 13:25 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: linux-arm-kernel, Andrew Morton, Mike Rapoport, Russell King,
	linux-kernel, linux-mm

On Wed, May 19, 2021 at 09:50:46AM +0800, Kefeng Wang wrote:
> 
> On 2021/5/18 23:52, Mike Rapoport wrote:
> > On Tue, May 18, 2021 at 08:49:43PM +0800, Kefeng Wang wrote:
> > > 
> > > 
> > > On 2021/5/18 17:06, Mike Rapoport wrote:
> > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > 
> > > > When unused memory map is freed the preserved part of the memory map is
> > > > extended to match pageblock boundaries because lots of core mm
> > > > functionality relies on homogeneity of the memory map within pageblock
> > > > boundaries.
> > > > 
> > > > Since pfn_valid() is used to check whether there is a valid memory map
> > > > entry for a PFN, make it return true also for PFNs that have memory map
> > > > entries even if there is no actual memory populated there.
> > > > 
> > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > > > ---
> > > >    arch/arm/mm/init.c | 15 ++++++++++++++-
> > > >    1 file changed, 14 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> > > > index 9d4744a632c6..bb678c0ba143 100644
> > > > --- a/arch/arm/mm/init.c
> > > > +++ b/arch/arm/mm/init.c
> > > > @@ -125,11 +125,24 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
> > > >    int pfn_valid(unsigned long pfn)
> > > >    {
> > > >    	phys_addr_t addr = __pfn_to_phys(pfn);
> > > > +	unsigned long pageblock_size = PAGE_SIZE * pageblock_nr_pages;
> > > >    	if (__phys_to_pfn(addr) != pfn)
> > > >    		return 0;
> > > > -	return memblock_is_map_memory(addr);
> > > > +	if (memblock_is_map_memory(addr))
> > > > +		return 1;
> > > > +
> > > > +	/*
> > > > +	 * If address less than pageblock_size bytes away from a present
> > > > +	 * memory chunk there still will be a memory map entry for it
> > > > +	 * because we round freed memory map to the pageblock boundaries
> > > > +	 */
> > > > +	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
> > > > +	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
> > > > +		return 1;
> > > 
> > > Hi Mike, with patch3, the system won't boot.
> > 
> > Hmm, apparently I've miscalculated the ranges...
> > 
> > Can you please check with the below patch on top of this series:
> 
> Yes, it works,
> 
> On node 0 totalpages: 311551
>   Normal zone: 1230 pages used for memmap
>   Normal zone: 0 pages reserved
>   Normal zone: 157440 pages, LIFO batch:31
>   Normal zone: 17152 pages in unavailable ranges
>   HighMem zone: 154111 pages, LIFO batch:31
>   HighMem zone: 513 pages in unavailable ranges
> 
> and the oom testcase could pass.
> 
> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> 
> 
> There is memblock_is_region_reserved() (it checks whether a region
> intersects reserved memory, and it also checks the size); should we add a
> similar func?

We already have memblock_is_region_memory() that checks if the entire
region is a subset of memory. :(
 
Let's deal with this mess afterwards.
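
For reference, the two existing helpers have different semantics; roughly:

	/* true if [base, base + size) intersects any reserved region */
	bool memblock_is_region_reserved(phys_addr_t base, phys_addr_t size)
	{
		return memblock_overlaps_region(&memblock.reserved, base, size);
	}

	/* true only if [base, base + size) lies entirely inside one memory region */
	bool memblock_is_region_memory(phys_addr_t base, phys_addr_t size);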
 
> > 
> > diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> > index bb678c0ba143..2fafbbc8e73b 100644
> > --- a/arch/arm/mm/init.c
> > +++ b/arch/arm/mm/init.c
> > @@ -138,8 +138,9 @@ int pfn_valid(unsigned long pfn)
> >   	 * memory chunk there still will be a memory map entry for it
> >   	 * because we round freed memory map to the pageblock boundaries
> >   	 */
> > -	if (memblock_is_map_memory(ALIGN(addr + 1, pageblock_size)) ||
> > -	    memblock_is_map_memory(ALIGN_DOWN(addr, pageblock_size)))
> > +	if (memblock_overlaps_region(&memblock.memory,
> > +				     ALIGN_DOWN(addr, pageblock_size),
> > +				     pageblock_size))
> >   		return 1;
> >   	return 0;
> > 

-- 
Sincerely yours,
Mike.


Thread overview: 10+ messages
2021-05-18  9:06 [PATCH 0/3] memblock, arm: fixes for freeing of the memory map Mike Rapoport
2021-05-18  9:06 ` [PATCH 1/3] memblock: free_unused_memmap: use pageblock units instead of MAX_ORDER Mike Rapoport
2021-05-18  9:06 ` [PATCH 2/3] memblock: align freed memory map on pageblock boundaries with SPARSEMEM Mike Rapoport
2021-05-18  9:06 ` [PATCH 3/3] arm: extend pfn_valid to take into account freed memory map alignment Mike Rapoport
2021-05-18  9:44   ` Russell King (Oracle)
2021-05-18 10:53     ` Mike Rapoport
2021-05-18 12:49   ` Kefeng Wang
2021-05-18 15:52     ` Mike Rapoport
2021-05-19  1:50       ` Kefeng Wang
2021-05-19 13:25         ` Mike Rapoport
