* [PATCH v2 00/20] mm: rework free_area_init*() funcitons
@ 2020-04-29 12:11 Mike Rapoport
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Baoquan He, Brian Cain, Catalin Marinas,
	David S. Miller, Geert Uytterhoeven, Greentime Hu, Greg Ungerer,
	Guan Xuetao, Guo Ren, Heiko Carstens, Helge Deller, Hoan Tran,
	James E.J. Bottomley, Jonathan Corbet, Ley Foon Tan, Mark Salter,
	Matt Turner, Max Filippov, Michael Ellerman, Michal Hocko,
	Michal Simek, Nick Hu, Paul Walmsley, Qian Cai,
	Richard Weinberger, Rich Felker, Russell King, Stafford Horne,
	Thomas Bogendoerfer, Tony Luck, Vineet Gupta, x86,
	Yoshinori Sato, linux-alpha, linux-arch, linux-arm-kernel,
	linux-c6x-dev, linux-csky, linux-doc, linux-hexagon, linux-ia64,
	linux-m68k, linux-mips, linux-mm, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, linux-snps-arc, linux-um,
	linux-xtensa, openrisc, sparclinux, uclinux-h8-devel,
	Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

After the discussion [1] about removing the CONFIG_NODES_SPAN_OTHER_NODES
and CONFIG_HAVE_MEMBLOCK_NODE_MAP options, I took it a bit further and
updated the node/zone initialization.

Since all architectures have memblock, it is possible to use only the newer
version of free_area_init_node() that calculates the zone and node
boundaries based on memblock node mapping and architectural limits on
possible zone PFNs. 
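
As a rough illustration of the idea (not the kernel code), the sketch
below derives per-zone PFN ranges for a single node from the node's span
and an array of architectural zone limits. The zone list, struct
pfn_range, clamp_pfn() and zone_ranges_for_node() are hypothetical names
made up for this example, and it assumes the limits are given in
ascending order and ignores holes and ZONE_MOVABLE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified zone list; the kernel's enum zone_type differs. */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, NR_ZONES };

struct pfn_range {
	unsigned long start;	/* first PFN in the range */
	unsigned long end;	/* one past the last PFN */
};

static unsigned long clamp_pfn(unsigned long pfn, unsigned long lo,
			       unsigned long hi)
{
	if (pfn < lo)
		return lo;
	if (pfn > hi)
		return hi;
	return pfn;
}

/*
 * Derive the PFN range of every zone on one node. Each zone spans from the
 * previous zone's limit up to its own limit, clamped to the node's span.
 * A zone whose limit does not exceed the previous one ends up empty.
 */
static void zone_ranges_for_node(const unsigned long max_zone_pfn[NR_ZONES],
				 struct pfn_range node,
				 struct pfn_range zones[NR_ZONES])
{
	unsigned long zone_low = 0;

	assert(node.start <= node.end);

	for (size_t z = 0; z < NR_ZONES; z++) {
		unsigned long zone_high = max_zone_pfn[z];

		if (zone_high < zone_low)
			zone_high = zone_low;	/* unused zone: keep it empty */

		zones[z].start = clamp_pfn(zone_low, node.start, node.end);
		zones[z].end = clamp_pfn(zone_high, node.start, node.end);
		zone_low = zone_high;
	}
}
```

With limits {4096, 262144, 262144} and a node spanning PFNs
[1024, 131072), this yields [1024, 4096) for the first zone,
[4096, 131072) for the second and an empty third zone.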

The architectures that still determine zone and hole sizes themselves can
be switched to the generic code, and the old code that took those zone and
hole sizes can simply be removed.

And, since it all started from the removal of
CONFIG_NODES_SPAN_OTHER_NODES, memmap_init() is now updated to iterate
over memblock regions, so it no longer needs to perform an
early_pfn_to_nid() query for every PFN.
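
To make the difference concrete, here is a small user-space model (an
illustration only, not the code from the series) that counts pages per
node in a PFN range in two ways: by looking up the node of every PFN, and
by walking the regions once and clamping each to the range. struct
pfn_region, MAX_NODES and both helpers are hypothetical names for this
sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for a memblock region, expressed in PFNs. */
struct pfn_region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	int nid;
};

/* Per-PFN approach: search the region list for every single PFN. */
static void count_pages_per_pfn(const struct pfn_region *regs, size_t nr,
				unsigned long start, unsigned long end,
				unsigned long pages[MAX_NODES])
{
	for (unsigned long pfn = start; pfn < end; pfn++) {
		for (size_t i = 0; i < nr; i++) {
			if (pfn >= regs[i].start_pfn && pfn < regs[i].end_pfn) {
				assert(regs[i].nid >= 0 && regs[i].nid < MAX_NODES);
				pages[regs[i].nid]++;
				break;
			}
		}
	}
}

/* Per-region approach: one step per region, clamped to [start, end). */
static void count_pages_per_region(const struct pfn_region *regs, size_t nr,
				   unsigned long start, unsigned long end,
				   unsigned long pages[MAX_NODES])
{
	for (size_t i = 0; i < nr; i++) {
		unsigned long s = regs[i].start_pfn > start ?
				  regs[i].start_pfn : start;
		unsigned long e = regs[i].end_pfn < end ?
				  regs[i].end_pfn : end;

		assert(regs[i].nid >= 0 && regs[i].nid < MAX_NODES);
		if (s < e)
			pages[regs[i].nid] += e - s;
	}
}
```

Both helpers produce the same counts; the second one does work
proportional to the number of regions rather than the number of PFNs.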

v2 changes:
* move deletion of one of '#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP' from
  patch 2 to patch 3 where it should have been from the beginning
* drop patch that introduced a free_area_init_memoryless_node() wrapper
  for free_area_init_node()
* remove unused next_pfn(), thanks Qian
* drop stale comment in memmap_init_zone(), as per David

--
Sincerely yours,
Mike.

[1] https://lore.kernel.org/lkml/1585420282-25630-1-git-send-email-Hoan@os.amperecomputing.com

Baoquan He (1):
  mm: memmap_init: iterate over memblock regions rather that check each PFN

Mike Rapoport (19):
  mm: memblock: replace dereferences of memblock_region.nid with API calls
  mm: make early_pfn_to_nid() and related defintions close to each other
  mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option
  mm: free_area_init: use maximal zone PFNs rather than zone sizes
  mm: use free_area_init() instead of free_area_init_nodes()
  alpha: simplify detection of memory zone boundaries
  arm: simplify detection of memory zone boundaries
  arm64: simplify detection of memory zone boundaries for UMA configs
  csky: simplify detection of memory zone boundaries
  m68k: mm: simplify detection of memory zone boundaries
  parisc: simplify detection of memory zone boundaries
  sparc32: simplify detection of memory zone boundaries
  unicore32: simplify detection of memory zone boundaries
  xtensa: simplify detection of memory zone boundaries
  mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
  mm: free_area_init: allow defining max_zone_pfn in descending order
  mm: clean up free_area_init_node() and its helpers
  mm: simplify find_min_pfn_with_active_regions()
  docs/vm: update memory-models documentation

 .../vm/numa-memblock/arch-support.txt         |  34 ---
 Documentation/vm/memory-model.rst             |   9 +-
 arch/alpha/mm/init.c                          |  16 +-
 arch/alpha/mm/numa.c                          |  22 +-
 arch/arc/mm/init.c                            |  36 +--
 arch/arm/mm/init.c                            |  66 +----
 arch/arm64/Kconfig                            |   1 -
 arch/arm64/mm/init.c                          |  56 +---
 arch/arm64/mm/numa.c                          |   9 +-
 arch/c6x/mm/init.c                            |   8 +-
 arch/csky/kernel/setup.c                      |  26 +-
 arch/h8300/mm/init.c                          |   6 +-
 arch/hexagon/mm/init.c                        |   6 +-
 arch/ia64/Kconfig                             |   1 -
 arch/ia64/mm/contig.c                         |   2 +-
 arch/ia64/mm/discontig.c                      |   2 +-
 arch/m68k/mm/init.c                           |   6 +-
 arch/m68k/mm/mcfmmu.c                         |   9 +-
 arch/m68k/mm/motorola.c                       |  15 +-
 arch/m68k/mm/sun3mmu.c                        |  10 +-
 arch/microblaze/Kconfig                       |   1 -
 arch/microblaze/mm/init.c                     |   2 +-
 arch/mips/Kconfig                             |   1 -
 arch/mips/loongson64/numa.c                   |   2 +-
 arch/mips/mm/init.c                           |   2 +-
 arch/mips/sgi-ip27/ip27-memory.c              |   2 +-
 arch/nds32/mm/init.c                          |  11 +-
 arch/nios2/mm/init.c                          |   8 +-
 arch/openrisc/mm/init.c                       |   9 +-
 arch/parisc/mm/init.c                         |  22 +-
 arch/powerpc/Kconfig                          |  10 -
 arch/powerpc/mm/mem.c                         |   2 +-
 arch/riscv/Kconfig                            |   1 -
 arch/riscv/mm/init.c                          |   2 +-
 arch/s390/Kconfig                             |   1 -
 arch/s390/mm/init.c                           |   2 +-
 arch/sh/Kconfig                               |   1 -
 arch/sh/mm/init.c                             |   2 +-
 arch/sparc/Kconfig                            |  10 -
 arch/sparc/mm/init_64.c                       |   2 +-
 arch/sparc/mm/srmmu.c                         |  21 +-
 arch/um/kernel/mem.c                          |  12 +-
 arch/unicore32/include/asm/memory.h           |   2 +-
 arch/unicore32/include/mach/memory.h          |   6 +-
 arch/unicore32/kernel/pci.c                   |  14 +-
 arch/unicore32/mm/init.c                      |  43 +--
 arch/x86/Kconfig                              |  10 -
 arch/x86/mm/init.c                            |   2 +-
 arch/x86/mm/numa.c                            |   8 +-
 arch/xtensa/mm/init.c                         |   8 +-
 include/linux/memblock.h                      |   8 +-
 include/linux/mm.h                            |  28 +-
 include/linux/mmzone.h                        |  11 +-
 mm/Kconfig                                    |   3 -
 mm/memblock.c                                 |  19 +-
 mm/memory_hotplug.c                           |   4 -
 mm/page_alloc.c                               | 278 ++++++------------
 57 files changed, 243 insertions(+), 667 deletions(-)
 delete mode 100644 Documentation/features/vm/numa-memblock/arch-support.txt

-- 
2.26.1




* [PATCH v2 01/20] mm: memblock: replace dereferences of memblock_region.nid with API calls
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

There are several places in the code that directly dereference
memblock_region.nid despite this field being defined only when
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y.

Replace these with calls to memblock_get_region_node() to improve code
robustness and to avoid possible breakage when
CONFIG_HAVE_MEMBLOCK_NODE_MAP is removed.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/mm/numa.c | 9 ++++++---
 arch/x86/mm/numa.c   | 6 ++++--
 mm/memblock.c        | 8 +++++---
 mm/page_alloc.c      | 4 ++--
 4 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index 4decf1659700..aafcee3e3f7e 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -350,13 +350,16 @@ static int __init numa_register_nodes(void)
 	struct memblock_region *mblk;
 
 	/* Check that valid nid is set to memblks */
-	for_each_memblock(memory, mblk)
-		if (mblk->nid == NUMA_NO_NODE || mblk->nid >= MAX_NUMNODES) {
+	for_each_memblock(memory, mblk) {
+		int mblk_nid = memblock_get_region_node(mblk);
+
+		if (mblk_nid == NUMA_NO_NODE || mblk_nid >= MAX_NUMNODES) {
 			pr_warn("Warning: invalid memblk node %d [mem %#010Lx-%#010Lx]\n",
-				mblk->nid, mblk->base,
+				mblk_nid, mblk->base,
 				mblk->base + mblk->size - 1);
 			return -EINVAL;
 		}
+	}
 
 	/* Finally register nodes. */
 	for_each_node_mask(nid, numa_nodes_parsed) {
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 59ba008504dc..fe024b2ac796 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -517,8 +517,10 @@ static void __init numa_clear_kernel_node_hotplug(void)
 	 *   reserve specific pages for Sandy Bridge graphics. ]
 	 */
 	for_each_memblock(reserved, mb_region) {
-		if (mb_region->nid != MAX_NUMNODES)
-			node_set(mb_region->nid, reserved_nodemask);
+		int nid = memblock_get_region_node(mb_region);
+
+		if (nid != MAX_NUMNODES)
+			node_set(nid, reserved_nodemask);
 	}
 
 	/*
diff --git a/mm/memblock.c b/mm/memblock.c
index c79ba6f9920c..43e2fd3006c1 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1207,13 +1207,15 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 {
 	struct memblock_type *type = &memblock.memory;
 	struct memblock_region *r;
+	int r_nid;
 
 	while (++*idx < type->cnt) {
 		r = &type->regions[*idx];
+		r_nid = memblock_get_region_node(r);
 
 		if (PFN_UP(r->base) >= PFN_DOWN(r->base + r->size))
 			continue;
-		if (nid == MAX_NUMNODES || nid == r->nid)
+		if (nid == MAX_NUMNODES || nid == r_nid)
 			break;
 	}
 	if (*idx >= type->cnt) {
@@ -1226,7 +1228,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 	if (out_end_pfn)
 		*out_end_pfn = PFN_DOWN(r->base + r->size);
 	if (out_nid)
-		*out_nid = r->nid;
+		*out_nid = r_nid;
 }
 
 /**
@@ -1810,7 +1812,7 @@ int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
 	*start_pfn = PFN_DOWN(type->regions[mid].base);
 	*end_pfn = PFN_DOWN(type->regions[mid].base + type->regions[mid].size);
 
-	return type->regions[mid].nid;
+	return memblock_get_region_node(&type->regions[mid]);
 }
 #endif
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 69827d4fa052..0d012eda1694 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7208,7 +7208,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 			if (!memblock_is_hotpluggable(r))
 				continue;
 
-			nid = r->nid;
+			nid = memblock_get_region_node(r);
 
 			usable_startpfn = PFN_DOWN(r->base);
 			zone_movable_pfn[nid] = zone_movable_pfn[nid] ?
@@ -7229,7 +7229,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 			if (memblock_is_mirror(r))
 				continue;
 
-			nid = r->nid;
+			nid = memblock_get_region_node(r);
 
 			usable_startpfn = memblock_region_memory_base_pfn(r);
 
-- 
2.26.1




* [PATCH v2 02/20] mm: make early_pfn_to_nid() and related defintions close to each other
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The early_pfn_to_nid() and its helper __early_pfn_to_nid() are spread
around include/linux/mm.h, include/linux/mmzone.h and mm/page_alloc.c.

Drop unused stub for __early_pfn_to_nid() and move its actual generic
implementation close to its users.
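
For readers unfamiliar with the helper being moved, the following
user-space model sketches how a cached PFN-to-node lookup of this kind
behaves. It is an approximation under stated assumptions: struct
pfn_region, search_pfn_nid(), cached_pfn_to_nid(), NO_NODE and the
nr_searches counter are hypothetical names for this example, and the
regions are assumed to be sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>

#define NO_NODE (-1)	/* stand-in for the kernel's NUMA_NO_NODE */

struct pfn_region {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
	int nid;
};

/* Plays the role of struct mminit_pfnnid_cache: remembers the last hit. */
struct pfnnid_cache {
	unsigned long last_start;
	unsigned long last_end;
	int last_nid;
};

static unsigned long nr_searches;	/* slow-path lookups, for the demo only */

/* Binary search over sorted regions; returns NO_NODE when pfn is in a hole. */
static int search_pfn_nid(const struct pfn_region *regs, size_t nr,
			  unsigned long pfn, unsigned long *start_pfn,
			  unsigned long *end_pfn)
{
	size_t left = 0, right = nr;

	while (left < right) {
		size_t mid = left + (right - left) / 2;

		if (pfn < regs[mid].start_pfn) {
			right = mid;
		} else if (pfn >= regs[mid].end_pfn) {
			left = mid + 1;
		} else {
			*start_pfn = regs[mid].start_pfn;
			*end_pfn = regs[mid].end_pfn;
			return regs[mid].nid;
		}
	}
	return NO_NODE;
}

/* Answer from the cache when possible, otherwise search and refill it. */
static int cached_pfn_to_nid(const struct pfn_region *regs, size_t nr,
			     unsigned long pfn, struct pfnnid_cache *state)
{
	unsigned long start_pfn, end_pfn;
	int nid;

	if (state->last_start <= pfn && pfn < state->last_end)
		return state->last_nid;

	nr_searches++;
	nid = search_pfn_nid(regs, nr, pfn, &start_pfn, &end_pfn);
	if (nid != NO_NODE) {
		state->last_start = start_pfn;
		state->last_end = end_pfn;
		state->last_nid = nid;
	}
	return nid;
}
```

Consecutive queries that fall into the same region are served from the
cache, and a miss leaves the cached range untouched.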

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 include/linux/mm.h     |  4 ++--
 include/linux/mmzone.h |  9 --------
 mm/page_alloc.c        | 49 +++++++++++++++++++++---------------------
 3 files changed, 27 insertions(+), 35 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5a323422d783..a404026d14d4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2388,9 +2388,9 @@ extern void sparse_memory_present_with_active_regions(int nid);
 
 #if !defined(CONFIG_HAVE_MEMBLOCK_NODE_MAP) && \
     !defined(CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID)
-static inline int __early_pfn_to_nid(unsigned long pfn,
-					struct mminit_pfnnid_cache *state)
+static inline int early_pfn_to_nid(unsigned long pfn)
 {
+	BUILD_BUG_ON(IS_ENABLED(CONFIG_NUMA));
 	return 0;
 }
 #else
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1b9de7d220fb..7b5b6eba402f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1078,15 +1078,6 @@ static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
 #include <asm/sparsemem.h>
 #endif
 
-#if !defined(CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID) && \
-	!defined(CONFIG_HAVE_MEMBLOCK_NODE_MAP)
-static inline unsigned long early_pfn_to_nid(unsigned long pfn)
-{
-	BUILD_BUG_ON(IS_ENABLED(CONFIG_NUMA));
-	return 0;
-}
-#endif
-
 #ifdef CONFIG_FLATMEM
 #define pfn_to_nid(pfn)		(0)
 #endif
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0d012eda1694..a802ee47e715 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1504,6 +1504,31 @@ void __free_pages_core(struct page *page, unsigned int order)
 
 static struct mminit_pfnnid_cache early_pfnnid_cache __meminitdata;
 
+#ifndef CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
+
+/*
+ * Required by SPARSEMEM. Given a PFN, return what node the PFN is on.
+ */
+int __meminit __early_pfn_to_nid(unsigned long pfn,
+					struct mminit_pfnnid_cache *state)
+{
+	unsigned long start_pfn, end_pfn;
+	int nid;
+
+	if (state->last_start <= pfn && pfn < state->last_end)
+		return state->last_nid;
+
+	nid = memblock_search_pfn_nid(pfn, &start_pfn, &end_pfn);
+	if (nid != NUMA_NO_NODE) {
+		state->last_start = start_pfn;
+		state->last_end = end_pfn;
+		state->last_nid = nid;
+	}
+
+	return nid;
+}
+#endif /* CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID */
+
 int __meminit early_pfn_to_nid(unsigned long pfn)
 {
 	static DEFINE_SPINLOCK(early_pfn_lock);
@@ -6299,30 +6324,6 @@ void __meminit init_currently_empty_zone(struct zone *zone,
 }
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
-#ifndef CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
-
-/*
- * Required by SPARSEMEM. Given a PFN, return what node the PFN is on.
- */
-int __meminit __early_pfn_to_nid(unsigned long pfn,
-					struct mminit_pfnnid_cache *state)
-{
-	unsigned long start_pfn, end_pfn;
-	int nid;
-
-	if (state->last_start <= pfn && pfn < state->last_end)
-		return state->last_nid;
-
-	nid = memblock_search_pfn_nid(pfn, &start_pfn, &end_pfn);
-	if (nid != NUMA_NO_NODE) {
-		state->last_start = start_pfn;
-		state->last_end = end_pfn;
-		state->last_nid = nid;
-	}
-
-	return nid;
-}
-#endif /* CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID */
 
 /**
  * free_bootmem_with_active_regions - Call memblock_free_early_nid for each active range
-- 
2.26.1




* [PATCH v2 03/20] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

CONFIG_HAVE_MEMBLOCK_NODE_MAP is used to differentiate the initialization
of node and zone structures between systems that have a region-to-node
mapping in memblock and those that don't.

Currently all the NUMA architectures enable this option and for the
non-NUMA systems we can presume that all the memory belongs to node 0 and
therefore the compile time configuration option is not required.

The remaining few architectures that use DISCONTIGMEM without NUMA are
easily updated to use memblock_add_node() instead of memblock_add() and
thus have proper correspondence of memblock regions to NUMA nodes.

Still, free_area_init_node() must have a backward compatible version
because its semantics with and without CONFIG_HAVE_MEMBLOCK_NODE_MAP are
different. Once all the architectures use the new semantics, the entire
compatibility layer can be dropped.

To avoid spending extra run-time memory on storing the node id for
architectures that keep memblock but have only a single node, the node id
field of the memblock_region is guarded by CONFIG_NEED_MULTIPLE_NODES and
the corresponding accessors presume that in those cases it is always 0.
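
The pattern can be sketched in plain C as follows. This is a simplified
model rather than the kernel definitions: struct region, the
MULTIPLE_NODES macro (standing in for CONFIG_NEED_MULTIPLE_NODES),
region_set_node(), region_get_node() and regions_on_node() are
hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Build with -DMULTIPLE_NODES to keep the per-region node id. */
struct region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
#ifdef MULTIPLE_NODES
	int nid;
#endif
};

#ifdef MULTIPLE_NODES
static inline void region_set_node(struct region *r, int nid)
{
	r->nid = nid;
}

static inline int region_get_node(const struct region *r)
{
	return r->nid;
}
#else
/* Single-node build: nothing to store, every region is on node 0. */
static inline void region_set_node(struct region *r, int nid)
{
	(void)r;
	(void)nid;
}

static inline int region_get_node(const struct region *r)
{
	(void)r;
	return 0;
}
#endif

/* Callers that only use the accessors build in both configurations. */
static int regions_on_node(const struct region *regs, int nr, int nid)
{
	int count = 0;

	assert(nr >= 0);
	for (int i = 0; i < nr; i++)
		if (region_get_node(&regs[i]) == nid)
			count++;
	return count;
}
```

Code that dereferences the field directly would fail to build once the
field is compiled out, which is why callers go through the accessors.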

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 .../vm/numa-memblock/arch-support.txt         |  34 ------
 arch/alpha/mm/numa.c                          |   4 +-
 arch/arm64/Kconfig                            |   1 -
 arch/ia64/Kconfig                             |   1 -
 arch/m68k/mm/motorola.c                       |   4 +-
 arch/microblaze/Kconfig                       |   1 -
 arch/mips/Kconfig                             |   1 -
 arch/powerpc/Kconfig                          |   1 -
 arch/riscv/Kconfig                            |   1 -
 arch/s390/Kconfig                             |   1 -
 arch/sh/Kconfig                               |   1 -
 arch/sparc/Kconfig                            |   1 -
 arch/x86/Kconfig                              |   1 -
 include/linux/memblock.h                      |   8 +-
 include/linux/mm.h                            |  12 +-
 include/linux/mmzone.h                        |   2 +-
 mm/Kconfig                                    |   3 -
 mm/memblock.c                                 |  11 +-
 mm/memory_hotplug.c                           |   4 -
 mm/page_alloc.c                               | 103 ++++++++++--------
 20 files changed, 74 insertions(+), 121 deletions(-)
 delete mode 100644 Documentation/features/vm/numa-memblock/arch-support.txt

diff --git a/Documentation/features/vm/numa-memblock/arch-support.txt b/Documentation/features/vm/numa-memblock/arch-support.txt
deleted file mode 100644
index 3004beb0fd71..000000000000
--- a/Documentation/features/vm/numa-memblock/arch-support.txt
+++ /dev/null
@@ -1,34 +0,0 @@
-#
-# Feature name:          numa-memblock
-#         Kconfig:       HAVE_MEMBLOCK_NODE_MAP
-#         description:   arch supports NUMA aware memblocks
-#
-    -----------------------
-    |         arch |status|
-    -----------------------
-    |       alpha: | TODO |
-    |         arc: |  ..  |
-    |         arm: |  ..  |
-    |       arm64: |  ok  |
-    |         c6x: |  ..  |
-    |        csky: |  ..  |
-    |       h8300: |  ..  |
-    |     hexagon: |  ..  |
-    |        ia64: |  ok  |
-    |        m68k: |  ..  |
-    |  microblaze: |  ok  |
-    |        mips: |  ok  |
-    |       nds32: | TODO |
-    |       nios2: |  ..  |
-    |    openrisc: |  ..  |
-    |      parisc: |  ..  |
-    |     powerpc: |  ok  |
-    |       riscv: |  ok  |
-    |        s390: |  ok  |
-    |          sh: |  ok  |
-    |       sparc: |  ok  |
-    |          um: |  ..  |
-    |   unicore32: |  ..  |
-    |         x86: |  ok  |
-    |      xtensa: |  ..  |
-    -----------------------
diff --git a/arch/alpha/mm/numa.c b/arch/alpha/mm/numa.c
index d0b73371e985..a24cd13e71cb 100644
--- a/arch/alpha/mm/numa.c
+++ b/arch/alpha/mm/numa.c
@@ -144,8 +144,8 @@ setup_memory_node(int nid, void *kernel_end)
 	if (!nid && (node_max_pfn < end_kernel_pfn || node_min_pfn > start_kernel_pfn))
 		panic("kernel loaded out of ram");
 
-	memblock_add(PFN_PHYS(node_min_pfn),
-		     (node_max_pfn - node_min_pfn) << PAGE_SHIFT);
+	memblock_add_node(PFN_PHYS(node_min_pfn),
+			  (node_max_pfn - node_min_pfn) << PAGE_SHIFT, nid);
 
 	/* Zone start phys-addr must be 2^(MAX_ORDER-1) aligned.
 	   Note that we round this down, not up - node memory
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 40fb05d96c60..957151013d10 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -156,7 +156,6 @@ config ARM64
 	select HAVE_GCC_PLUGINS
 	select HAVE_HW_BREAKPOINT if PERF_EVENTS
 	select HAVE_IRQ_TIME_ACCOUNTING
-	select HAVE_MEMBLOCK_NODE_MAP if NUMA
 	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index bab7cd878464..88b05b5256a9 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -31,7 +31,6 @@ config IA64
 	select HAVE_FUNCTION_TRACER
 	select TTY
 	select HAVE_ARCH_TRACEHOOK
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select DMA_NONCOHERENT_MMAP
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index fc16190ec2d6..84ab5963cabb 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -386,7 +386,7 @@ void __init paging_init(void)
 
 	min_addr = m68k_memory[0].addr;
 	max_addr = min_addr + m68k_memory[0].size;
-	memblock_add(m68k_memory[0].addr, m68k_memory[0].size);
+	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0);
 	for (i = 1; i < m68k_num_memory;) {
 		if (m68k_memory[i].addr < min_addr) {
 			printk("Ignoring memory chunk at 0x%lx:0x%lx before the first chunk\n",
@@ -397,7 +397,7 @@ void __init paging_init(void)
 				(m68k_num_memory - i) * sizeof(struct m68k_mem_info));
 			continue;
 		}
-		memblock_add(m68k_memory[i].addr, m68k_memory[i].size);
+		memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i);
 		addr = m68k_memory[i].addr + m68k_memory[i].size;
 		if (addr > max_addr)
 			max_addr = addr;
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 9606c244b5b8..d262ac0c8714 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -32,7 +32,6 @@ config MICROBLAZE
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACER
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_OPROFILE
 	select HAVE_PCI
 	select IRQ_DOMAIN
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 690718b3701a..94a91b5b7759 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -72,7 +72,6 @@ config MIPS
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAVE_NMI
 	select HAVE_OPROFILE
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 924c541a9260..5f86b22b7d2c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -210,7 +210,6 @@ config PPC
 	select HAVE_KRETPROBES
 	select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
 	select HAVE_LIVEPATCH			if HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAVE_NMI				if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
 	select HAVE_HARDLOCKUP_DETECTOR_ARCH	if (PPC64 && PPC_BOOK3S)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 62f7bfeb709e..e22858e8f88e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -32,7 +32,6 @@ config RISCV
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ASM_MODVERSIONS
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_DMA_CONTIGUOUS if MMU
 	select HAVE_FUTEX_CMPXCHG if FUTEX
 	select HAVE_PERF_EVENTS
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 2167bce993ff..d6dc6933adc2 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -162,7 +162,6 @@ config S390
 	select HAVE_LIVEPATCH
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_MEMBLOCK_PHYS_MAP
 	select MMU_GATHER_NO_GATHER
 	select HAVE_MOD_ARCH_SPECIFIC
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index b4f0e37b83eb..be7c4f699113 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -9,7 +9,6 @@ config SUPERH
 	select CLKDEV_LOOKUP
 	select DMA_DECLARE_COHERENT
 	select HAVE_IDE if HAS_IOPORT_MAP
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_OPROFILE
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_PERF_EVENTS
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index da515fdad83d..795206b7b552 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -65,7 +65,6 @@ config SPARC64
 	select HAVE_KRETPROBES
 	select HAVE_KPROBES
 	select MMU_GATHER_RCU_TABLE_FREE if SMP
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1197b5596d5a..f8bf218a169c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -190,7 +190,6 @@ config X86
 	select HAVE_KRETPROBES
 	select HAVE_KVM
 	select HAVE_LIVEPATCH			if X86_64
-	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_MIXED_BREAKPOINTS_REGS
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAVE_MOVE_PMD
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 6bc37a731d27..45abfc54da37 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -50,7 +50,7 @@ struct memblock_region {
 	phys_addr_t base;
 	phys_addr_t size;
 	enum memblock_flags flags;
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 	int nid;
 #endif
 };
@@ -215,7 +215,6 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
 	return m->flags & MEMBLOCK_NOMAP;
 }
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long  *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
@@ -234,7 +233,6 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 #define for_each_mem_pfn_range(i, nid, p_start, p_end, p_nid)		\
 	for (i = -1, __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid); \
 	     i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 void __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
@@ -310,10 +308,10 @@ void __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
 	for_each_mem_range_rev(i, &memblock.memory, &memblock.reserved,	\
 			       nid, flags, p_start, p_end, p_nid)
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 int memblock_set_node(phys_addr_t base, phys_addr_t size,
 		      struct memblock_type *type, int nid);
 
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 static inline void memblock_set_region_node(struct memblock_region *r, int nid)
 {
 	r->nid = nid;
@@ -332,7 +330,7 @@ static inline int memblock_get_region_node(const struct memblock_region *r)
 {
 	return 0;
 }
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+#endif /* CONFIG_NEED_MULTIPLE_NODES */
 
 /* Flags for memblock allocation APIs */
 #define MEMBLOCK_ALLOC_ANYWHERE	(~(phys_addr_t)0)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a404026d14d4..5903bbbdb336 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2344,9 +2344,8 @@ static inline unsigned long get_num_physpages(void)
 	return phys_pages;
 }
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 /*
- * With CONFIG_HAVE_MEMBLOCK_NODE_MAP set, an architecture may initialise its
+ * Using memblock node mappings, an architecture may initialise its
  * zones, allocate the backing mem_map and account for memory holes in a more
  * architecture independent manner. This is a substitute for creating the
  * zone_sizes[] and zholes_size[] arrays and passing them to
@@ -2367,9 +2366,6 @@ static inline unsigned long get_num_physpages(void)
  * registered physical page range.  Similarly
  * sparse_memory_present_with_active_regions() calls memory_present() for
  * each range when SPARSEMEM is enabled.
- *
- * See mm/page_alloc.c for more information on each function exposed by
- * CONFIG_HAVE_MEMBLOCK_NODE_MAP.
  */
 extern void free_area_init_nodes(unsigned long *max_zone_pfn);
 unsigned long node_map_pfn_alignment(void);
@@ -2384,13 +2380,9 @@ extern void free_bootmem_with_active_regions(int nid,
 						unsigned long max_low_pfn);
 extern void sparse_memory_present_with_active_regions(int nid);
 
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
-
-#if !defined(CONFIG_HAVE_MEMBLOCK_NODE_MAP) && \
-    !defined(CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID)
+#ifndef CONFIG_NEED_MULTIPLE_NODES
 static inline int early_pfn_to_nid(unsigned long pfn)
 {
-	BUILD_BUG_ON(IS_ENABLED(CONFIG_NUMA));
 	return 0;
 }
 #else
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7b5b6eba402f..ffc2a3d6036b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -874,7 +874,7 @@ extern int movable_zone;
 #ifdef CONFIG_HIGHMEM
 static inline int zone_movable_is_highmem(void)
 {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 	return movable_zone == ZONE_HIGHMEM;
 #else
 	return (ZONE_MOVABLE - 1) == ZONE_HIGHMEM;
diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..aaa5bdaa1c8a 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -126,9 +126,6 @@ config SPARSEMEM_VMEMMAP
 	  pfn_to_page and page_to_pfn operations.  This is the most
 	  efficient option when sufficient kernel resources are available.
 
-config HAVE_MEMBLOCK_NODE_MAP
-	bool
-
 config HAVE_MEMBLOCK_PHYS_MAP
 	bool
 
diff --git a/mm/memblock.c b/mm/memblock.c
index 43e2fd3006c1..743659d88fc4 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -620,7 +620,7 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
 		 * area, insert that portion.
 		 */
 		if (rbase > base) {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 			WARN_ON(nid != memblock_get_region_node(rgn));
 #endif
 			WARN_ON(flags != rgn->flags);
@@ -1197,7 +1197,6 @@ void __init_memblock __next_mem_range_rev(u64 *idx, int nid,
 	*idx = ULLONG_MAX;
 }
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 /*
  * Common iterator interface used to define for_each_mem_pfn_range().
  */
@@ -1247,6 +1246,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
 				      struct memblock_type *type, int nid)
 {
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 	int start_rgn, end_rgn;
 	int i, ret;
 
@@ -1258,9 +1258,10 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
 		memblock_set_region_node(&type->regions[i], nid);
 
 	memblock_merge_regions(type);
+#endif
 	return 0;
 }
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 /**
  * __next_mem_pfn_range_in_zone - iterator for for_each_*_range_in_zone()
@@ -1799,7 +1800,6 @@ bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
 	return !memblock_is_nomap(&memblock.memory.regions[i]);
 }
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
 			 unsigned long *start_pfn, unsigned long *end_pfn)
 {
@@ -1814,7 +1814,6 @@ int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
 
 	return memblock_get_region_node(&type->regions[mid]);
 }
-#endif
 
 /**
  * memblock_is_region_memory - check if a region is a subset of memory
@@ -1905,7 +1904,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 		size = rgn->size;
 		end = base + size - 1;
 		flags = rgn->flags;
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 		if (memblock_get_region_node(rgn) != MAX_NUMNODES)
 			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
 				 memblock_get_region_node(rgn));
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index fc0aad0bc1f5..e67dc501576a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1372,11 +1372,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
 
 static int __init cmdline_parse_movable_node(char *p)
 {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 	movable_node_enabled = true;
-#else
-	pr_warn("movable_node parameter depends on CONFIG_HAVE_MEMBLOCK_NODE_MAP to work properly\n");
-#endif
 	return 0;
 }
 early_param("movable_node", cmdline_parse_movable_node);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a802ee47e715..4530e9cfd9f7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -335,7 +335,6 @@ static unsigned long nr_kernel_pages __initdata;
 static unsigned long nr_all_pages __initdata;
 static unsigned long dma_reserve __initdata;
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 static unsigned long arch_zone_lowest_possible_pfn[MAX_NR_ZONES] __initdata;
 static unsigned long arch_zone_highest_possible_pfn[MAX_NR_ZONES] __initdata;
 static unsigned long required_kernelcore __initdata;
@@ -348,7 +347,6 @@ static bool mirrored_kernelcore __meminitdata;
 /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
 int movable_zone;
 EXPORT_SYMBOL(movable_zone);
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
 #if MAX_NUMNODES > 1
 unsigned int nr_node_ids __read_mostly = MAX_NUMNODES;
@@ -1499,8 +1497,7 @@ void __free_pages_core(struct page *page, unsigned int order)
 	__free_pages(page, order);
 }
 
-#if defined(CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID) || \
-	defined(CONFIG_HAVE_MEMBLOCK_NODE_MAP)
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 
 static struct mminit_pfnnid_cache early_pfnnid_cache __meminitdata;
 
@@ -1542,7 +1539,7 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 
 	return nid;
 }
-#endif
+#endif /* CONFIG_NEED_MULTIPLE_NODES */
 
 #ifdef CONFIG_NODES_SPAN_OTHER_NODES
 /* Only safe to use early in boot when initialisation is single-threaded */
@@ -5924,7 +5921,6 @@ void __ref build_all_zonelists(pg_data_t *pgdat)
 static bool __meminit
 overlap_memmap_init(unsigned long zone, unsigned long *pfn)
 {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 	static struct memblock_region *r;
 
 	if (mirrored_kernelcore && zone == ZONE_MOVABLE) {
@@ -5940,7 +5936,6 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
 			return true;
 		}
 	}
-#endif
 	return false;
 }
 
@@ -6323,8 +6318,6 @@ void __meminit init_currently_empty_zone(struct zone *zone,
 	zone->initialized = 1;
 }
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
-
 /**
  * free_bootmem_with_active_regions - Call memblock_free_early_nid for each active range
  * @nid: The node to free memory on. If MAX_NUMNODES, all nodes are freed.
@@ -6575,8 +6568,7 @@ static unsigned long __init zone_absent_pages_in_node(int nid,
 	return nr_absent;
 }
 
-#else /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
-static inline unsigned long __init zone_spanned_pages_in_node(int nid,
+static inline unsigned long __init compat_zone_spanned_pages_in_node(int nid,
 					unsigned long zone_type,
 					unsigned long node_start_pfn,
 					unsigned long node_end_pfn,
@@ -6595,7 +6587,7 @@ static inline unsigned long __init zone_spanned_pages_in_node(int nid,
 	return zones_size[zone_type];
 }
 
-static inline unsigned long __init zone_absent_pages_in_node(int nid,
+static inline unsigned long __init compat_zone_absent_pages_in_node(int nid,
 						unsigned long zone_type,
 						unsigned long node_start_pfn,
 						unsigned long node_end_pfn,
@@ -6607,13 +6599,12 @@ static inline unsigned long __init zone_absent_pages_in_node(int nid,
 	return zholes_size[zone_type];
 }
 
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
-
 static void __init calculate_node_totalpages(struct pglist_data *pgdat,
 						unsigned long node_start_pfn,
 						unsigned long node_end_pfn,
 						unsigned long *zones_size,
-						unsigned long *zholes_size)
+						unsigned long *zholes_size,
+						bool compat)
 {
 	unsigned long realtotalpages = 0, totalpages = 0;
 	enum zone_type i;
@@ -6621,17 +6612,38 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
 	for (i = 0; i < MAX_NR_ZONES; i++) {
 		struct zone *zone = pgdat->node_zones + i;
 		unsigned long zone_start_pfn, zone_end_pfn;
+		unsigned long spanned, absent;
 		unsigned long size, real_size;
 
-		size = zone_spanned_pages_in_node(pgdat->node_id, i,
-						  node_start_pfn,
-						  node_end_pfn,
-						  &zone_start_pfn,
-						  &zone_end_pfn,
-						  zones_size);
-		real_size = size - zone_absent_pages_in_node(pgdat->node_id, i,
-						  node_start_pfn, node_end_pfn,
-						  zholes_size);
+		if (compat) {
+			spanned = compat_zone_spanned_pages_in_node(
+						pgdat->node_id, i,
+						node_start_pfn,
+						node_end_pfn,
+						&zone_start_pfn,
+						&zone_end_pfn,
+						zones_size);
+			absent = compat_zone_absent_pages_in_node(
+						pgdat->node_id, i,
+						node_start_pfn,
+						node_end_pfn,
+						zholes_size);
+		} else {
+			spanned = zone_spanned_pages_in_node(pgdat->node_id, i,
+						node_start_pfn,
+						node_end_pfn,
+						&zone_start_pfn,
+						&zone_end_pfn,
+						zones_size);
+			absent = zone_absent_pages_in_node(pgdat->node_id, i,
+						node_start_pfn,
+						node_end_pfn,
+						zholes_size);
+		}
+
+		size = spanned;
+		real_size = size - absent;
+
 		if (size)
 			zone->zone_start_pfn = zone_start_pfn;
 		else
@@ -6931,10 +6943,8 @@ static void __ref alloc_node_mem_map(struct pglist_data *pgdat)
 	 */
 	if (pgdat == NODE_DATA(0)) {
 		mem_map = NODE_DATA(0)->node_mem_map;
-#if defined(CONFIG_HAVE_MEMBLOCK_NODE_MAP) || defined(CONFIG_FLATMEM)
 		if (page_to_pfn(mem_map) != pgdat->node_start_pfn)
 			mem_map -= offset;
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 	}
 #endif
 }
@@ -6951,9 +6961,10 @@ static inline void pgdat_set_deferred_range(pg_data_t *pgdat)
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
 #endif
 
-void __init free_area_init_node(int nid, unsigned long *zones_size,
-				   unsigned long node_start_pfn,
-				   unsigned long *zholes_size)
+static void __init __free_area_init_node(int nid, unsigned long *zones_size,
+					 unsigned long node_start_pfn,
+					 unsigned long *zholes_size,
+					 bool compat)
 {
 	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long start_pfn = 0;
@@ -6965,16 +6976,16 @@ void __init free_area_init_node(int nid, unsigned long *zones_size,
 	pgdat->node_id = nid;
 	pgdat->node_start_pfn = node_start_pfn;
 	pgdat->per_cpu_nodestats = NULL;
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
-	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
-	pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
-		(u64)start_pfn << PAGE_SHIFT,
-		end_pfn ? ((u64)end_pfn << PAGE_SHIFT) - 1 : 0);
-#else
-	start_pfn = node_start_pfn;
-#endif
+	if (!compat) {
+		get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
+		pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
+			(u64)start_pfn << PAGE_SHIFT,
+			end_pfn ? ((u64)end_pfn << PAGE_SHIFT) - 1 : 0);
+	} else {
+		start_pfn = node_start_pfn;
+	}
 	calculate_node_totalpages(pgdat, start_pfn, end_pfn,
-				  zones_size, zholes_size);
+				  zones_size, zholes_size, compat);
 
 	alloc_node_mem_map(pgdat);
 	pgdat_set_deferred_range(pgdat);
@@ -6982,6 +6993,14 @@ void __init free_area_init_node(int nid, unsigned long *zones_size,
 	free_area_init_core(pgdat);
 }
 
+void __init free_area_init_node(int nid, unsigned long *zones_size,
+				unsigned long node_start_pfn,
+				unsigned long *zholes_size)
+{
+	__free_area_init_node(nid, zones_size, node_start_pfn, zholes_size,
+			      true);
+}
+
 #if !defined(CONFIG_FLAT_NODE_MEM_MAP)
 /*
  * Initialize all valid struct pages in the range [spfn, epfn) and mark them
@@ -7065,8 +7084,6 @@ static inline void __init init_unavailable_mem(void)
 }
 #endif /* !CONFIG_FLAT_NODE_MEM_MAP */
 
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
-
 #if MAX_NUMNODES > 1
 /*
  * Figure out the number of possible node ids.
@@ -7495,8 +7512,8 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 	init_unavailable_mem();
 	for_each_online_node(nid) {
 		pg_data_t *pgdat = NODE_DATA(nid);
-		free_area_init_node(nid, NULL,
-				find_min_pfn_for_node(nid), NULL);
+		__free_area_init_node(nid, NULL,
+				      find_min_pfn_for_node(nid), NULL, false);
 
 		/* Any memory on that node */
 		if (pgdat->node_present_pages)
@@ -7561,8 +7578,6 @@ static int __init cmdline_parse_movablecore(char *p)
 early_param("kernelcore", cmdline_parse_kernelcore);
 early_param("movablecore", cmdline_parse_movablecore);
 
-#endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
-
 void adjust_managed_page_count(struct page *page, long count)
 {
 	atomic_long_add(count, &page_zone(page)->managed_pages);
-- 
2.26.1




* [PATCH v2 04/20] mm: free_area_init: use maximal zone PFNs rather than zone sizes
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (2 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 03/20] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 05/20] mm: use free_area_init() instead of free_area_init_nodes() Mike Rapoport
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

Currently, architectures that use free_area_init() to initialize the memory
map and the node and zone structures need to calculate zone and hole sizes
themselves. We can use free_area_init_nodes() instead and let it detect the
zone boundaries, so that the architectures only have to supply the maximal
possible PFN for each zone.
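
As a rough illustration of why per-zone sizes no longer need to be computed
by the caller, here is a minimal userspace sketch. It is not the kernel
implementation; every sketch_* name and the two-zone layout are hypothetical
and exist only for this example. It assumes each zone starts where the
previous one ended and that the result is clamped to the memory range
[mem_start, mem_end):

```c
/* Hypothetical two-zone layout, for illustration only. */
enum { SKETCH_ZONE_DMA, SKETCH_ZONE_NORMAL, SKETCH_NR_ZONES };

/* Clamp v into the closed interval [lo, hi]. */
static unsigned long sketch_clamp(unsigned long v, unsigned long lo,
				  unsigned long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Derive the number of spanned pages per zone from the upper PFN limit
 * of each zone and the PFN range [mem_start, mem_end) that is actually
 * backed by memory.
 */
static void sketch_zone_sizes(const unsigned long *max_zone_pfn,
			      unsigned long mem_start, unsigned long mem_end,
			      unsigned long *spanned)
{
	unsigned long zone_start = mem_start;
	int i;

	for (i = 0; i < SKETCH_NR_ZONES; i++) {
		/* A zone never ends before it starts. */
		unsigned long zone_end = max_zone_pfn[i] > zone_start ?
					 max_zone_pfn[i] : zone_start;
		unsigned long lo = sketch_clamp(zone_start, mem_start, mem_end);
		unsigned long hi = sketch_clamp(zone_end, mem_start, mem_end);

		spanned[i] = hi - lo;
		zone_start = zone_end;
	}
}
```

With limits of {4096, 65536} and memory covering PFNs 0..65536, the sketch
yields 4096 pages for the first zone and 61440 for the second. If the first
limit lies above the end of memory, the first zone takes all of the memory
and the second stays empty, which matches the special case alpha used to
open-code. Memory that starts at a non-zero PFN is handled by the clamping,
so the caller does not have to subtract an offset from the limit.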

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/alpha/mm/init.c    | 16 ++++++----------
 arch/c6x/mm/init.c      |  8 +++-----
 arch/h8300/mm/init.c    |  6 +++---
 arch/hexagon/mm/init.c  |  6 +++---
 arch/m68k/mm/init.c     |  6 +++---
 arch/m68k/mm/mcfmmu.c   |  9 +++------
 arch/nds32/mm/init.c    | 11 ++++-------
 arch/nios2/mm/init.c    |  8 +++-----
 arch/openrisc/mm/init.c |  9 +++------
 arch/um/kernel/mem.c    | 12 ++++--------
 include/linux/mm.h      |  2 +-
 mm/page_alloc.c         |  5 ++---
 12 files changed, 38 insertions(+), 60 deletions(-)

diff --git a/arch/alpha/mm/init.c b/arch/alpha/mm/init.c
index 12e218d3792a..667cd21393b5 100644
--- a/arch/alpha/mm/init.c
+++ b/arch/alpha/mm/init.c
@@ -243,21 +243,17 @@ callback_init(void * kernel_end)
  */
 void __init paging_init(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES] = {0, };
-	unsigned long dma_pfn, high_pfn;
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = {0, };
+	unsigned long dma_pfn;
 
 	dma_pfn = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;
-	high_pfn = max_pfn = max_low_pfn;
+	max_pfn = max_low_pfn;
 
-	if (dma_pfn >= high_pfn)
-		zones_size[ZONE_DMA] = high_pfn;
-	else {
-		zones_size[ZONE_DMA] = dma_pfn;
-		zones_size[ZONE_NORMAL] = high_pfn - dma_pfn;
-	}
+	max_zone_pfn[ZONE_DMA] = dma_pfn;
+	max_zone_pfn[ZONE_NORMAL] = max_pfn;
 
 	/* Initialize mem_map[].  */
-	free_area_init(zones_size);
+	free_area_init(max_zone_pfn);
 
 	/* Initialize the kernel's ZERO_PGE. */
 	memset((void *)ZERO_PGE, 0, PAGE_SIZE);
diff --git a/arch/c6x/mm/init.c b/arch/c6x/mm/init.c
index 9b374393a8f4..a97e51a3e26d 100644
--- a/arch/c6x/mm/init.c
+++ b/arch/c6x/mm/init.c
@@ -33,7 +33,7 @@ EXPORT_SYMBOL(empty_zero_page);
 void __init paging_init(void)
 {
 	struct pglist_data *pgdat = NODE_DATA(0);
-	unsigned long zones_size[MAX_NR_ZONES] = {0, };
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = {0, };
 
 	empty_zero_page      = (unsigned long) memblock_alloc(PAGE_SIZE,
 							      PAGE_SIZE);
@@ -49,11 +49,9 @@ void __init paging_init(void)
 	/*
 	 * Define zones
 	 */
-	zones_size[ZONE_NORMAL] = (memory_end - PAGE_OFFSET) >> PAGE_SHIFT;
-	pgdat->node_zones[ZONE_NORMAL].zone_start_pfn =
-		__pa(PAGE_OFFSET) >> PAGE_SHIFT;
+	max_zone_pfn[ZONE_NORMAL] = memory_end >> PAGE_SHIFT;
 
-	free_area_init(zones_size);
+	free_area_init(max_zone_pfn);
 }
 
 void __init mem_init(void)
diff --git a/arch/h8300/mm/init.c b/arch/h8300/mm/init.c
index 1eab16b1a0bc..27a0020e3771 100644
--- a/arch/h8300/mm/init.c
+++ b/arch/h8300/mm/init.c
@@ -83,10 +83,10 @@ void __init paging_init(void)
 		 start_mem, end_mem);
 
 	{
-		unsigned long zones_size[MAX_NR_ZONES] = {0, };
+		unsigned long max_zone_pfn[MAX_NR_ZONES] = {0, };
 
-		zones_size[ZONE_NORMAL] = (end_mem - PAGE_OFFSET) >> PAGE_SHIFT;
-		free_area_init(zones_size);
+		max_zone_pfn[ZONE_NORMAL] = end_mem >> PAGE_SHIFT;
+		free_area_init(max_zone_pfn);
 	}
 }
 
diff --git a/arch/hexagon/mm/init.c b/arch/hexagon/mm/init.c
index c961773a6fff..f2e6c868e477 100644
--- a/arch/hexagon/mm/init.c
+++ b/arch/hexagon/mm/init.c
@@ -91,7 +91,7 @@ void sync_icache_dcache(pte_t pte)
  */
 void __init paging_init(void)
 {
-	unsigned long zones_sizes[MAX_NR_ZONES] = {0, };
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = {0, };
 
 	/*
 	 *  This is not particularly well documented anywhere, but
@@ -101,9 +101,9 @@ void __init paging_init(void)
 	 *  adjust accordingly.
 	 */
 
-	zones_sizes[ZONE_NORMAL] = max_low_pfn;
+	max_zone_pfn[ZONE_NORMAL] = max_low_pfn;
 
-	free_area_init(zones_sizes);  /*  sets up the zonelists and mem_map  */
+	free_area_init(max_zone_pfn);  /*  sets up the zonelists and mem_map  */
 
 	/*
 	 * Start of high memory area.  Will probably need something more
diff --git a/arch/m68k/mm/init.c b/arch/m68k/mm/init.c
index b88d510d4fe3..6d3147662ff2 100644
--- a/arch/m68k/mm/init.c
+++ b/arch/m68k/mm/init.c
@@ -84,7 +84,7 @@ void __init paging_init(void)
 	 * page_alloc get different views of the world.
 	 */
 	unsigned long end_mem = memory_end & PAGE_MASK;
-	unsigned long zones_size[MAX_NR_ZONES] = { 0, };
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0, };
 
 	high_memory = (void *) end_mem;
 
@@ -98,8 +98,8 @@ void __init paging_init(void)
 	 */
 	set_fs (USER_DS);
 
-	zones_size[ZONE_DMA] = (end_mem - PAGE_OFFSET) >> PAGE_SHIFT;
-	free_area_init(zones_size);
+	max_zone_pfn[ZONE_DMA] = end_mem >> PAGE_SHIFT;
+	free_area_init(max_zone_pfn);
 }
 
 #endif /* CONFIG_MMU */
diff --git a/arch/m68k/mm/mcfmmu.c b/arch/m68k/mm/mcfmmu.c
index 0ea375607767..80064e6d064f 100644
--- a/arch/m68k/mm/mcfmmu.c
+++ b/arch/m68k/mm/mcfmmu.c
@@ -39,7 +39,7 @@ void __init paging_init(void)
 	pte_t *pg_table;
 	unsigned long address, size;
 	unsigned long next_pgtable, bootmem_end;
-	unsigned long zones_size[MAX_NR_ZONES];
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 	enum zone_type zone;
 	int i;
 
@@ -80,11 +80,8 @@ void __init paging_init(void)
 	}
 
 	current->mm = NULL;
-
-	for (zone = 0; zone < MAX_NR_ZONES; zone++)
-		zones_size[zone] = 0x0;
-	zones_size[ZONE_DMA] = num_pages;
-	free_area_init(zones_size);
+	max_zone_pfn[ZONE_DMA] = PFN_DOWN(_ramend);
+	free_area_init(max_zone_pfn);
 }
 
 int cf_tlb_miss(struct pt_regs *regs, int write, int dtlb, int extension_word)
diff --git a/arch/nds32/mm/init.c b/arch/nds32/mm/init.c
index 0be3833f6814..91147cca4b64 100644
--- a/arch/nds32/mm/init.c
+++ b/arch/nds32/mm/init.c
@@ -31,16 +31,13 @@ EXPORT_SYMBOL(empty_zero_page);
 
 static void __init zone_sizes_init(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES];
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 
-	/* Clear the zone sizes */
-	memset(zones_size, 0, sizeof(zones_size));
-
-	zones_size[ZONE_NORMAL] = max_low_pfn;
+	max_zone_pfn[ZONE_NORMAL] = max_low_pfn;
 #ifdef CONFIG_HIGHMEM
-	zones_size[ZONE_HIGHMEM] = max_pfn;
+	max_zone_pfn[ZONE_HIGHMEM] = max_pfn;
 #endif
-	free_area_init(zones_size);
+	free_area_init(max_zone_pfn);
 
 }
 
diff --git a/arch/nios2/mm/init.c b/arch/nios2/mm/init.c
index 2c609c2516b2..9afca77d10b1 100644
--- a/arch/nios2/mm/init.c
+++ b/arch/nios2/mm/init.c
@@ -46,17 +46,15 @@ pgd_t *pgd_current;
  */
 void __init paging_init(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES];
-
-	memset(zones_size, 0, sizeof(zones_size));
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 
 	pagetable_init();
 	pgd_current = swapper_pg_dir;
 
-	zones_size[ZONE_NORMAL] = max_mapnr;
+	max_zone_pfn[ZONE_NORMAL] = max_mapnr;
 
 	/* pass the memory from the bootmem allocator to the main allocator */
-	free_area_init(zones_size);
+	free_area_init(max_zone_pfn);
 
 	flush_dcache_range((unsigned long)empty_zero_page,
 			(unsigned long)empty_zero_page + PAGE_SIZE);
diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
index 1f87b524db78..f94fe6d3f499 100644
--- a/arch/openrisc/mm/init.c
+++ b/arch/openrisc/mm/init.c
@@ -45,17 +45,14 @@ DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
 
 static void __init zone_sizes_init(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES];
-
-	/* Clear the zone sizes */
-	memset(zones_size, 0, sizeof(zones_size));
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 
 	/*
 	 * We use only ZONE_NORMAL
 	 */
-	zones_size[ZONE_NORMAL] = max_low_pfn;
+	max_zone_pfn[ZONE_NORMAL] = max_low_pfn;
 
-	free_area_init(zones_size);
+	free_area_init(max_zone_pfn);
 }
 
 extern const char _s_kernel_ro[], _e_kernel_ro[];
diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 30885d0b94ac..401b22f14743 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -158,8 +158,8 @@ static void __init fixaddr_user_init( void)
 
 void __init paging_init(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES], vaddr;
-	int i;
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
+	unsigned long vaddr;
 
 	empty_zero_page = (unsigned long *) memblock_alloc_low(PAGE_SIZE,
 							       PAGE_SIZE);
@@ -167,12 +167,8 @@ void __init paging_init(void)
 		panic("%s: Failed to allocate %lu bytes align=%lx\n",
 		      __func__, PAGE_SIZE, PAGE_SIZE);
 
-	for (i = 0; i < ARRAY_SIZE(zones_size); i++)
-		zones_size[i] = 0;
-
-	zones_size[ZONE_NORMAL] = (end_iomem >> PAGE_SHIFT) -
-		(uml_physmem >> PAGE_SHIFT);
-	free_area_init(zones_size);
+	max_zone_pfn[ZONE_NORMAL] = end_iomem >> PAGE_SHIFT;
+	free_area_init(max_zone_pfn);
 
 	/*
 	 * Fixed mappings, only the page table structure has to be
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5903bbbdb336..d9a256a97ac5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2272,7 +2272,7 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 }
 
 extern void __init pagecache_init(void);
-extern void free_area_init(unsigned long * zones_size);
+extern void free_area_init(unsigned long * max_zone_pfn);
 extern void __init free_area_init_node(int nid, unsigned long * zones_size,
 		unsigned long zone_start_pfn, unsigned long *zholes_size);
 extern void free_initmem(void);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4530e9cfd9f7..530701b38bc7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7700,11 +7700,10 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
 	dma_reserve = new_dma_reserve;
 }
 
-void __init free_area_init(unsigned long *zones_size)
+void __init free_area_init(unsigned long *max_zone_pfn)
 {
 	init_unavailable_mem();
-	free_area_init_node(0, zones_size,
-			__pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL);
+	free_area_init_nodes(max_zone_pfn);
 }
 
 static int page_alloc_cpu_dead(unsigned int cpu)
-- 
2.26.1




* [PATCH v2 05/20] mm: use free_area_init() instead of free_area_init_nodes()
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (3 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 04/20] mm: free_area_init: use maximal zone PFNs rather than zone sizes Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-05-26 17:13   ` Catalin Marinas
  2020-04-29 12:11 ` [PATCH v2 06/20] alpha: simplify detection of memory zone boundaries Mike Rapoport
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

free_area_init() has effectively become a wrapper for
free_area_init_nodes() and there is no point in keeping both. Still, the
free_area_init() name is shorter and more general, as it does not imply
the necessity to initialize multiple nodes.

Rename free_area_init_nodes() to free_area_init(), update the callers and
drop the old version of free_area_init().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/mm/init.c             |  2 +-
 arch/ia64/mm/contig.c            |  2 +-
 arch/ia64/mm/discontig.c         |  2 +-
 arch/microblaze/mm/init.c        |  2 +-
 arch/mips/loongson64/numa.c      |  2 +-
 arch/mips/mm/init.c              |  2 +-
 arch/mips/sgi-ip27/ip27-memory.c |  2 +-
 arch/powerpc/mm/mem.c            |  2 +-
 arch/riscv/mm/init.c             |  2 +-
 arch/s390/mm/init.c              |  2 +-
 arch/sh/mm/init.c                |  2 +-
 arch/sparc/mm/init_64.c          |  2 +-
 arch/x86/mm/init.c               |  2 +-
 include/linux/mm.h               |  7 +++----
 mm/page_alloc.c                  | 10 ++--------
 15 files changed, 18 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index e42727e3568e..a650adb358ee 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -206,7 +206,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 #endif
 	max_zone_pfns[ZONE_NORMAL] = max;
 
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 }
 
 #else
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
index 5b00dc3898e1..8786fa5c7612 100644
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -210,6 +210,6 @@ paging_init (void)
 		printk("Virtual mem_map starts at 0x%p\n", mem_map);
 	}
 #endif /* !CONFIG_VIRTUAL_MEM_MAP */
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
 }
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 4f33f6e7e206..dd8284bcbf16 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -627,7 +627,7 @@ void __init paging_init(void)
 	max_zone_pfns[ZONE_DMA32] = max_dma;
 #endif
 	max_zone_pfns[ZONE_NORMAL] = max_pfn;
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 
 	zero_page_memmap_ptr = virt_to_page(ia64_imva(empty_zero_page));
 }
diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index 1ffbfa96b9b8..dcaa53d11339 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -112,7 +112,7 @@ static void __init paging_init(void)
 #endif
 
 	/* We don't have holes in memory map */
-	free_area_init_nodes(zones_size);
+	free_area_init(zones_size);
 }
 
 void __init setup_memory(void)
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 1ae072df4831..901f5be5ee76 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -247,7 +247,7 @@ void __init paging_init(void)
 	zones_size[ZONE_DMA32] = MAX_DMA32_PFN;
 #endif
 	zones_size[ZONE_NORMAL] = max_low_pfn;
-	free_area_init_nodes(zones_size);
+	free_area_init(zones_size);
 }
 
 void __init mem_init(void)
diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 79684000de0e..19719e8b41a5 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -418,7 +418,7 @@ void __init paging_init(void)
 	}
 #endif
 
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 }
 
 #ifdef CONFIG_64BIT
diff --git a/arch/mips/sgi-ip27/ip27-memory.c b/arch/mips/sgi-ip27/ip27-memory.c
index a45691e6ab90..1213215ea965 100644
--- a/arch/mips/sgi-ip27/ip27-memory.c
+++ b/arch/mips/sgi-ip27/ip27-memory.c
@@ -419,7 +419,7 @@ void __init paging_init(void)
 
 	pagetable_init();
 	zones_size[ZONE_NORMAL] = max_low_pfn;
-	free_area_init_nodes(zones_size);
+	free_area_init(zones_size);
 }
 
 void __init mem_init(void)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 041ed7cfd341..0fcea21f26b4 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -271,7 +271,7 @@ void __init paging_init(void)
 	max_zone_pfns[ZONE_HIGHMEM] = max_pfn;
 #endif
 
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 
 	mark_nonram_nosave();
 }
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index b55be44ff9bd..f2ceab77b8e6 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -39,7 +39,7 @@ static void __init zone_sizes_init(void)
 #endif
 	max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
 
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 }
 
 static void setup_zero_page(void)
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 87b2d024e75a..b11bcf4da531 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -122,7 +122,7 @@ void __init paging_init(void)
 	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
 	max_zone_pfns[ZONE_DMA] = PFN_DOWN(MAX_DMA_ADDRESS);
 	max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 }
 
 void mark_rodata_ro(void)
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 8d2a68aea1fc..628f461b8993 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -334,7 +334,7 @@ void __init paging_init(void)
 
 	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
 	max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 }
 
 unsigned int mem_init_done = 0;
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 1cf0d666dea3..79d3c5e0802e 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2488,7 +2488,7 @@ void __init paging_init(void)
 
 		max_zone_pfns[ZONE_NORMAL] = end_pfn;
 
-		free_area_init_nodes(max_zone_pfns);
+		free_area_init(max_zone_pfns);
 	}
 
 	printk("Booting Linux...\n");
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 1bba16c5742b..4016f2bf5d87 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -949,7 +949,7 @@ void __init zone_sizes_init(void)
 	max_zone_pfns[ZONE_HIGHMEM]	= max_pfn;
 #endif
 
-	free_area_init_nodes(max_zone_pfns);
+	free_area_init(max_zone_pfns);
 }
 
 __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate) = {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d9a256a97ac5..1c2ecb42e043 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2272,7 +2272,6 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 }
 
 extern void __init pagecache_init(void);
-extern void free_area_init(unsigned long * max_zone_pfn);
 extern void __init free_area_init_node(int nid, unsigned long * zones_size,
 		unsigned long zone_start_pfn, unsigned long *zholes_size);
 extern void free_initmem(void);
@@ -2353,21 +2352,21 @@ static inline unsigned long get_num_physpages(void)
  *
  * An architecture is expected to register range of page frames backed by
  * physical memory with memblock_add[_node]() before calling
- * free_area_init_nodes() passing in the PFN each zone ends at. At a basic
+ * free_area_init() passing in the PFN each zone ends at. At a basic
  * usage, an architecture is expected to do something like
  *
  * unsigned long max_zone_pfns[MAX_NR_ZONES] = {max_dma, max_normal_pfn,
  * 							 max_highmem_pfn};
  * for_each_valid_physical_page_range()
  * 	memblock_add_node(base, size, nid)
- * free_area_init_nodes(max_zone_pfns);
+ * free_area_init(max_zone_pfns);
  *
  * free_bootmem_with_active_regions() calls free_bootmem_node() for each
  * registered physical page range.  Similarly
  * sparse_memory_present_with_active_regions() calls memory_present() for
  * each range when SPARSEMEM is enabled.
  */
-extern void free_area_init_nodes(unsigned long *max_zone_pfn);
+void free_area_init(unsigned long *max_zone_pfn);
 unsigned long node_map_pfn_alignment(void);
 unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
 						unsigned long end_pfn);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 530701b38bc7..7f6a3081edb8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7428,7 +7428,7 @@ static void check_for_memory(pg_data_t *pgdat, int nid)
 }
 
 /**
- * free_area_init_nodes - Initialise all pg_data_t and zone data
+ * free_area_init - Initialise all pg_data_t and zone data
  * @max_zone_pfn: an array of max PFNs for each zone
  *
  * This will call free_area_init_node() for each active node in the system.
@@ -7440,7 +7440,7 @@ static void check_for_memory(pg_data_t *pgdat, int nid)
  * starts where the previous one ended. For example, ZONE_DMA32 starts
  * at arch_max_dma_pfn.
  */
-void __init free_area_init_nodes(unsigned long *max_zone_pfn)
+void __init free_area_init(unsigned long *max_zone_pfn)
 {
 	unsigned long start_pfn, end_pfn;
 	int i, nid;
@@ -7700,12 +7700,6 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
 	dma_reserve = new_dma_reserve;
 }
 
-void __init free_area_init(unsigned long *max_zone_pfn)
-{
-	init_unavailable_mem();
-	free_area_init_nodes(max_zone_pfn);
-}
-
 static int page_alloc_cpu_dead(unsigned int cpu)
 {
 
-- 
2.26.1




* [PATCH v2 06/20] alpha: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (4 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 05/20] mm: use free_area_init() instead of free_area_init_nodes() Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 07/20] arm: " Mike Rapoport
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/alpha/mm/numa.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)
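
Illustration only, not part of the patch: a minimal userspace sketch of the
idea in the commit message, assuming a simplified model in which each zone
starts where the previous one ended. The three-zone enum, struct
sk_zone_range and sk_zone_ranges_from_max_pfn() are hypothetical names and
do not reproduce the kernel's implementation.

```c
#include <stddef.h>

/* Hypothetical three-zone layout, for illustration only. */
enum { SK_ZONE_DMA, SK_ZONE_NORMAL, SK_ZONE_HIGHMEM, SK_NR_ZONES };

struct sk_zone_range {
	unsigned long start_pfn;	/* first PFN of the zone */
	unsigned long end_pfn;		/* one past the last PFN of the zone */
};

/*
 * Derive [start, end) for every zone from the lowest PFN in the system
 * and an array of per-zone maximal PFNs. A zone whose limit does not
 * exceed the end of the previous zone comes out empty (start == end).
 */
void sk_zone_ranges_from_max_pfn(unsigned long min_pfn,
				 const unsigned long *max_zone_pfn,
				 struct sk_zone_range *out)
{
	unsigned long start = min_pfn;
	size_t i;

	for (i = 0; i < SK_NR_ZONES; i++) {
		unsigned long end = max_zone_pfn[i] > start ?
				    max_zone_pfn[i] : start;

		out[i].start_pfn = start;
		out[i].end_pfn = end;
		start = end;
	}
}
```

In this model the caller supplies only the limits. No zone size is computed
by hand, and a zone left at 0 in the array simply ends up empty.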

diff --git a/arch/alpha/mm/numa.c b/arch/alpha/mm/numa.c
index a24cd13e71cb..5ad6087de1d6 100644
--- a/arch/alpha/mm/numa.c
+++ b/arch/alpha/mm/numa.c
@@ -202,8 +202,7 @@ setup_memory(void *kernel_end)
 
 void __init paging_init(void)
 {
-	unsigned int    nid;
-	unsigned long   zones_size[MAX_NR_ZONES] = {0, };
+	unsigned long   max_zone_pfn[MAX_NR_ZONES] = {0, };
 	unsigned long	dma_local_pfn;
 
 	/*
@@ -215,19 +214,10 @@ void __init paging_init(void)
 	 */
 	dma_local_pfn = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;
 
-	for_each_online_node(nid) {
-		unsigned long start_pfn = NODE_DATA(nid)->node_start_pfn;
-		unsigned long end_pfn = start_pfn + NODE_DATA(nid)->node_present_pages;
+	max_zone_pfn[ZONE_DMA] = dma_local_pfn;
+	max_zone_pfn[ZONE_NORMAL] = max_pfn;
 
-		if (dma_local_pfn >= end_pfn - start_pfn)
-			zones_size[ZONE_DMA] = end_pfn - start_pfn;
-		else {
-			zones_size[ZONE_DMA] = dma_local_pfn;
-			zones_size[ZONE_NORMAL] = (end_pfn - start_pfn) - dma_local_pfn;
-		}
-		node_set_state(nid, N_NORMAL_MEMORY);
-		free_area_init_node(nid, zones_size, start_pfn, NULL);
-	}
+	free_area_init(max_zone_pfn);
 
 	/* Initialize the kernel's ZERO_PGE. */
 	memset((void *)ZERO_PGE, 0, PAGE_SIZE);
-- 
2.26.1




* [PATCH v2 07/20] arm: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (5 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 06/20] alpha: simplify detection of memory zone boundaries Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 08/20] arm64: simplify detection of memory zone boundaries for UMA configs Mike Rapoport
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm/mm/init.c | 66 +++++-----------------------------------------
 1 file changed, 7 insertions(+), 59 deletions(-)
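
Illustration only, not part of the patch: a userspace sketch of the kind of
hole accounting the removed code did by hand, which can instead be derived
from the registered memory ranges. struct sk_mem_region and both helpers are
hypothetical stand-ins for memblock regions, and the sketch assumes the
regions do not overlap.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memory region, as a [start, end) PFN range. */
struct sk_mem_region {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Count the pages of the regions that fall inside [range_start, range_end). */
unsigned long sk_present_pages(const struct sk_mem_region *regs, size_t nr,
			       unsigned long range_start,
			       unsigned long range_end)
{
	unsigned long present = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned long s = regs[i].start_pfn > range_start ?
				  regs[i].start_pfn : range_start;
		unsigned long e = regs[i].end_pfn < range_end ?
				  regs[i].end_pfn : range_end;

		if (s < e)
			present += e - s;
	}
	return present;
}

/* holes = span of the range - pages actually backed by memory */
unsigned long sk_absent_pages(const struct sk_mem_region *regs, size_t nr,
			      unsigned long range_start,
			      unsigned long range_end)
{
	return (range_end - range_start) -
	       sk_present_pages(regs, nr, range_start, range_end);
}
```

Given the zone limits and the region list, the hole size of any zone follows
from these two inputs, so the architecture does not need to pass it in.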

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 054be44d1cdb..4e43455fab84 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -92,18 +92,6 @@ EXPORT_SYMBOL(arm_dma_zone_size);
  */
 phys_addr_t arm_dma_limit;
 unsigned long arm_dma_pfn_limit;
-
-static void __init arm_adjust_dma_zone(unsigned long *size, unsigned long *hole,
-	unsigned long dma_size)
-{
-	if (size[0] <= dma_size)
-		return;
-
-	size[ZONE_NORMAL] = size[0] - dma_size;
-	size[ZONE_DMA] = dma_size;
-	hole[ZONE_NORMAL] = hole[0];
-	hole[ZONE_DMA] = 0;
-}
 #endif
 
 void __init setup_dma_zone(const struct machine_desc *mdesc)
@@ -121,56 +109,16 @@ void __init setup_dma_zone(const struct machine_desc *mdesc)
 static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
 	unsigned long max_high)
 {
-	unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES];
-	struct memblock_region *reg;
-
-	/*
-	 * initialise the zones.
-	 */
-	memset(zone_size, 0, sizeof(zone_size));
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 
-	/*
-	 * The memory size has already been determined.  If we need
-	 * to do anything fancy with the allocation of this memory
-	 * to the zones, now is the time to do it.
-	 */
-	zone_size[0] = max_low - min;
-#ifdef CONFIG_HIGHMEM
-	zone_size[ZONE_HIGHMEM] = max_high - max_low;
+#ifdef CONFIG_ZONE_DMA
+	max_zone_pfn[ZONE_DMA] = min(arm_dma_pfn_limit, max_low);
 #endif
-
-	/*
-	 * Calculate the size of the holes.
-	 *  holes = node_size - sum(bank_sizes)
-	 */
-	memcpy(zhole_size, zone_size, sizeof(zhole_size));
-	for_each_memblock(memory, reg) {
-		unsigned long start = memblock_region_memory_base_pfn(reg);
-		unsigned long end = memblock_region_memory_end_pfn(reg);
-
-		if (start < max_low) {
-			unsigned long low_end = min(end, max_low);
-			zhole_size[0] -= low_end - start;
-		}
+	max_zone_pfn[ZONE_NORMAL] = max_low;
 #ifdef CONFIG_HIGHMEM
-		if (end > max_low) {
-			unsigned long high_start = max(start, max_low);
-			zhole_size[ZONE_HIGHMEM] -= end - high_start;
-		}
+	max_zone_pfn[ZONE_HIGHMEM] = max_high;
 #endif
-	}
-
-#ifdef CONFIG_ZONE_DMA
-	/*
-	 * Adjust the sizes according to any special requirements for
-	 * this machine type.
-	 */
-	if (arm_dma_zone_size)
-		arm_adjust_dma_zone(zone_size, zhole_size,
-			arm_dma_zone_size >> PAGE_SHIFT);
-#endif
-
-	free_area_init_node(0, zone_size, min, zhole_size);
+	free_area_init(max_zone_pfn);
 }
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
@@ -306,7 +254,7 @@ void __init bootmem_init(void)
 	sparse_init();
 
 	/*
-	 * Now free the memory - free_area_init_node needs
+	 * Now free the memory - free_area_init needs
 	 * the sparse mem_map arrays initialized by sparse_init()
 	 * for memmap_init_zone(), otherwise all PFNs are invalid.
 	 */
-- 
2.26.1




* [PATCH v2 08/20] arm64: simplify detection of memory zone boundaries for UMA configs
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (6 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 07/20] arm: " Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-05-26 17:15   ` Catalin Marinas
  2020-04-29 12:11 ` [PATCH v2 09/20] csky: simplify detection of memory zone boundaries Mike Rapoport
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/mm/init.c | 54 --------------------------------------------
 1 file changed, 54 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index a650adb358ee..d54ad2250dce 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -192,8 +192,6 @@ static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
 	return min(offset + (1ULL << zone_bits), memblock_end_of_DRAM());
 }
 
-#ifdef CONFIG_NUMA
-
 static void __init zone_sizes_init(unsigned long min, unsigned long max)
 {
 	unsigned long max_zone_pfns[MAX_NR_ZONES]  = {0};
@@ -209,58 +207,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 	free_area_init(max_zone_pfns);
 }
 
-#else
-
-static void __init zone_sizes_init(unsigned long min, unsigned long max)
-{
-	struct memblock_region *reg;
-	unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES];
-	unsigned long __maybe_unused max_dma, max_dma32;
-
-	memset(zone_size, 0, sizeof(zone_size));
-
-	max_dma = max_dma32 = min;
-#ifdef CONFIG_ZONE_DMA
-	max_dma = max_dma32 = PFN_DOWN(arm64_dma_phys_limit);
-	zone_size[ZONE_DMA] = max_dma - min;
-#endif
-#ifdef CONFIG_ZONE_DMA32
-	max_dma32 = PFN_DOWN(arm64_dma32_phys_limit);
-	zone_size[ZONE_DMA32] = max_dma32 - max_dma;
-#endif
-	zone_size[ZONE_NORMAL] = max - max_dma32;
-
-	memcpy(zhole_size, zone_size, sizeof(zhole_size));
-
-	for_each_memblock(memory, reg) {
-		unsigned long start = memblock_region_memory_base_pfn(reg);
-		unsigned long end = memblock_region_memory_end_pfn(reg);
-
-#ifdef CONFIG_ZONE_DMA
-		if (start >= min && start < max_dma) {
-			unsigned long dma_end = min(end, max_dma);
-			zhole_size[ZONE_DMA] -= dma_end - start;
-			start = dma_end;
-		}
-#endif
-#ifdef CONFIG_ZONE_DMA32
-		if (start >= max_dma && start < max_dma32) {
-			unsigned long dma32_end = min(end, max_dma32);
-			zhole_size[ZONE_DMA32] -= dma32_end - start;
-			start = dma32_end;
-		}
-#endif
-		if (start >= max_dma32 && start < max) {
-			unsigned long normal_end = min(end, max);
-			zhole_size[ZONE_NORMAL] -= normal_end - start;
-		}
-	}
-
-	free_area_init_node(0, zone_size, min, zhole_size);
-}
-
-#endif /* CONFIG_NUMA */
-
 int pfn_valid(unsigned long pfn)
 {
 	phys_addr_t addr = pfn << PAGE_SHIFT;
-- 
2.26.1




* [PATCH v2 09/20] csky: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (7 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 08/20] arm64: simplify detection of memory zone boundaries for UMA configs Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 10/20] m68k: mm: " Mike Rapoport
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/csky/kernel/setup.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)
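
Illustration only, not part of the patch: the hunk below reorders the
branches that pick max_low_pfn. The userspace sketch here models the old and
the new ordering side by side so that they can be compared. It assumes
sseg_size <= lowmem_size and leaves out the write_mmu_msa1() side effect.
Both function names are hypothetical.

```c
/* Old ordering: smallest case first. */
unsigned long sk_old_max_low_pfn(unsigned long min_low_pfn,
				 unsigned long max_pfn,
				 unsigned long sseg_size,
				 unsigned long lowmem_size)
{
	unsigned long size = max_pfn - min_low_pfn;

	if (size <= sseg_size)
		return max_pfn;			/* everything is lowmem */
	else if (size < lowmem_size)
		return min_low_pfn + sseg_size;
	return min_low_pfn + lowmem_size;
}

/* New ordering: largest case first, the default stays at max_pfn. */
unsigned long sk_new_max_low_pfn(unsigned long min_low_pfn,
				 unsigned long max_pfn,
				 unsigned long sseg_size,
				 unsigned long lowmem_size)
{
	unsigned long size = max_pfn - min_low_pfn;

	if (size >= lowmem_size)
		return min_low_pfn + lowmem_size;
	else if (size > sseg_size)
		return min_low_pfn + sseg_size;
	return max_pfn;
}
```

Under the stated assumption both orderings return the same limit for every
memory size, including the boundary cases size == sseg_size and
size == lowmem_size.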

diff --git a/arch/csky/kernel/setup.c b/arch/csky/kernel/setup.c
index 819a9a7bf786..0481f4e34538 100644
--- a/arch/csky/kernel/setup.c
+++ b/arch/csky/kernel/setup.c
@@ -26,7 +26,9 @@ struct screen_info screen_info = {
 
 static void __init csky_memblock_init(void)
 {
-	unsigned long zone_size[MAX_NR_ZONES];
+	unsigned long lowmem_size = PFN_DOWN(LOWMEM_LIMIT - PHYS_OFFSET_OFFSET);
+	unsigned long sseg_size = PFN_DOWN(SSEG_SIZE - PHYS_OFFSET_OFFSET);
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 	signed long size;
 
 	memblock_reserve(__pa(_stext), _end - _stext);
@@ -36,28 +38,22 @@ static void __init csky_memblock_init(void)
 
 	memblock_dump_all();
 
-	memset(zone_size, 0, sizeof(zone_size));
-
 	min_low_pfn = PFN_UP(memblock_start_of_DRAM());
 	max_low_pfn = max_pfn = PFN_DOWN(memblock_end_of_DRAM());
 
 	size = max_pfn - min_low_pfn;
 
-	if (size <= PFN_DOWN(SSEG_SIZE - PHYS_OFFSET_OFFSET))
-		zone_size[ZONE_NORMAL] = size;
-	else if (size < PFN_DOWN(LOWMEM_LIMIT - PHYS_OFFSET_OFFSET)) {
-		zone_size[ZONE_NORMAL] =
-				PFN_DOWN(SSEG_SIZE - PHYS_OFFSET_OFFSET);
-		max_low_pfn = min_low_pfn + zone_size[ZONE_NORMAL];
-	} else {
-		zone_size[ZONE_NORMAL] =
-				PFN_DOWN(LOWMEM_LIMIT - PHYS_OFFSET_OFFSET);
-		max_low_pfn = min_low_pfn + zone_size[ZONE_NORMAL];
+	if (size >= lowmem_size) {
+		max_low_pfn = min_low_pfn + lowmem_size;
 		write_mmu_msa1(read_mmu_msa0() + SSEG_SIZE);
+	} else if (size > sseg_size) {
+		max_low_pfn = min_low_pfn + sseg_size;
 	}
 
+	max_zone_pfn[ZONE_NORMAL] = max_low_pfn;
+
 #ifdef CONFIG_HIGHMEM
-	zone_size[ZONE_HIGHMEM] = max_pfn - max_low_pfn;
+	max_zone_pfn[ZONE_HIGHMEM] = max_pfn;
 
 	highstart_pfn = max_low_pfn;
 	highend_pfn   = max_pfn;
@@ -66,7 +62,7 @@ static void __init csky_memblock_init(void)
 
 	dma_contiguous_reserve(0);
 
-	free_area_init_node(0, zone_size, min_low_pfn, NULL);
+	free_area_init(max_zone_pfn);
 }
 
 void __init setup_arch(char **cmdline_p)
-- 
2.26.1




* [PATCH v2 10/20] m68k: mm: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (8 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 09/20] csky: simplify detection of memory zone boundaries Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 11/20] parisc: " Mike Rapoport
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/m68k/mm/motorola.c | 11 +++++------
 arch/m68k/mm/sun3mmu.c  | 10 +++-------
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index 84ab5963cabb..904c2a663977 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -365,7 +365,7 @@ static void __init map_node(int node)
  */
 void __init paging_init(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES] = { 0, };
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0, };
 	unsigned long min_addr, max_addr;
 	unsigned long addr;
 	int i;
@@ -448,11 +448,10 @@ void __init paging_init(void)
 #ifdef DEBUG
 	printk ("before free_area_init\n");
 #endif
-	for (i = 0; i < m68k_num_memory; i++) {
-		zones_size[ZONE_DMA] = m68k_memory[i].size >> PAGE_SHIFT;
-		free_area_init_node(i, zones_size,
-				    m68k_memory[i].addr >> PAGE_SHIFT, NULL);
+	for (i = 0; i < m68k_num_memory; i++)
 		if (node_present_pages(i))
 			node_set_state(i, N_NORMAL_MEMORY);
-	}
+
+	max_zone_pfn[ZONE_DMA] = memblock_end_of_DRAM();
+	free_area_init(max_zone_pfn);
 }
diff --git a/arch/m68k/mm/sun3mmu.c b/arch/m68k/mm/sun3mmu.c
index eca1c46bb90a..5d8d956d9329 100644
--- a/arch/m68k/mm/sun3mmu.c
+++ b/arch/m68k/mm/sun3mmu.c
@@ -42,7 +42,7 @@ void __init paging_init(void)
 	unsigned long address;
 	unsigned long next_pgtable;
 	unsigned long bootmem_end;
-	unsigned long zones_size[MAX_NR_ZONES] = { 0, };
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0, };
 	unsigned long size;
 
 	empty_zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
@@ -89,14 +89,10 @@ void __init paging_init(void)
 	current->mm = NULL;
 
 	/* memory sizing is a hack stolen from motorola.c..  hope it works for us */
-	zones_size[ZONE_DMA] = ((unsigned long)high_memory - PAGE_OFFSET) >> PAGE_SHIFT;
+	max_zone_pfn[ZONE_DMA] = ((unsigned long)high_memory) >> PAGE_SHIFT;
 
 	/* I really wish I knew why the following change made things better...  -- Sam */
-/*	free_area_init(zones_size); */
-	free_area_init_node(0, zones_size,
-			    (__pa(PAGE_OFFSET) >> PAGE_SHIFT) + 1, NULL);
+	free_area_init(max_zone_pfn);
 
 
 }
-
-
-- 
2.26.1




* [PATCH v2 11/20] parisc: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (9 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 10/20] m68k: mm: " Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 12/20] sparc32: " Mike Rapoport
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/parisc/mm/init.c | 22 +++-------------------
 1 file changed, 3 insertions(+), 19 deletions(-)
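
Illustration only, not part of the patch: a userspace transcription of the
bookkeeping done by the loop removed below (lowest start, highest end, total
size, and from those the hole size). struct sk_pmem_range, struct sk_span
and sk_span_of_ranges() are hypothetical names. The empty-list guard is an
addition of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for one physical memory range. */
struct sk_pmem_range {
	unsigned long start_pfn;
	unsigned long pages;
};

struct sk_span {
	unsigned long start_pfn;	/* lowest PFN of any range */
	unsigned long end_pfn;		/* one past the highest PFN */
	unsigned long hole_pages;	/* span minus pages present */
};

struct sk_span sk_span_of_ranges(const struct sk_pmem_range *r, size_t nr)
{
	struct sk_span s = { ~0UL, 0, 0 };
	unsigned long present = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned long end = r[i].start_pfn + r[i].pages;

		if (s.start_pfn > r[i].start_pfn)
			s.start_pfn = r[i].start_pfn;
		if (s.end_pfn < end)
			s.end_pfn = end;
		present += r[i].pages;
	}

	/* With no ranges there is no span and therefore no hole. */
	if (nr)
		s.hole_pages = (s.end_pfn - s.start_pfn) - present;
	return s;
}
```

All three results depend only on the list of ranges, which is why the
per-architecture copy of this loop can go once the ranges are registered
with the generic code.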

diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index 5224fb38d766..02d2fdb85dcc 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -675,27 +675,11 @@ static void __init gateway_init(void)
 
 static void __init parisc_bootmem_free(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES] = { 0, };
-	unsigned long holes_size[MAX_NR_ZONES] = { 0, };
-	unsigned long mem_start_pfn = ~0UL, mem_end_pfn = 0, mem_size_pfn = 0;
-	int i;
-
-	for (i = 0; i < npmem_ranges; i++) {
-		unsigned long start = pmem_ranges[i].start_pfn;
-		unsigned long size = pmem_ranges[i].pages;
-		unsigned long end = start + size;
-
-		if (mem_start_pfn > start)
-			mem_start_pfn = start;
-		if (mem_end_pfn < end)
-			mem_end_pfn = end;
-		mem_size_pfn += size;
-	}
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0, };
 
-	zones_size[0] = mem_end_pfn - mem_start_pfn;
-	holes_size[0] = zones_size[0] - mem_size_pfn;
+	max_zone_pfn[0] = memblock_end_of_DRAM();
 
-	free_area_init_node(0, zones_size, mem_start_pfn, holes_size);
+	free_area_init(max_zone_pfn);
 }
 
 void __init paging_init(void)
-- 
2.26.1




* [PATCH v2 12/20] sparc32: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (10 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 11/20] parisc: " Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 13/20] unicore32: " Mike Rapoport
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/sparc/mm/srmmu.c | 21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index b7c94de70cca..cc071dd7d8da 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -1008,24 +1008,13 @@ void __init srmmu_paging_init(void)
 	kmap_init();
 
 	{
-		unsigned long zones_size[MAX_NR_ZONES];
-		unsigned long zholes_size[MAX_NR_ZONES];
-		unsigned long npages;
-		int znum;
+		unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 
-		for (znum = 0; znum < MAX_NR_ZONES; znum++)
-			zones_size[znum] = zholes_size[znum] = 0;
+		max_zone_pfn[ZONE_DMA] = max_low_pfn;
+		max_zone_pfn[ZONE_NORMAL] = max_low_pfn;
+		max_zone_pfn[ZONE_HIGHMEM] = highend_pfn;
 
-		npages = max_low_pfn - pfn_base;
-
-		zones_size[ZONE_DMA] = npages;
-		zholes_size[ZONE_DMA] = npages - pages_avail;
-
-		npages = highend_pfn - max_low_pfn;
-		zones_size[ZONE_HIGHMEM] = npages;
-		zholes_size[ZONE_HIGHMEM] = npages - calc_highpages();
-
-		free_area_init_node(0, zones_size, pfn_base, zholes_size);
+		free_area_init(max_zone_pfn);
 	}
 }
 
-- 
2.26.1




* [PATCH v2 13/20] unicore32: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (11 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 12/20] sparc32: " Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 14/20] xtensa: " Mike Rapoport
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the maximal PFN for each of the
supported zones rather than a calculation of the actual zone sizes and the
sizes of the holes between the zones.

After the removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP, free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/unicore32/include/asm/memory.h  |  2 +-
 arch/unicore32/include/mach/memory.h |  6 ++--
 arch/unicore32/kernel/pci.c          | 14 ++-------
 arch/unicore32/mm/init.c             | 43 ++++++----------------------
 4 files changed, 15 insertions(+), 50 deletions(-)
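
Illustration only, not part of the patch: a userspace sketch of an
adjustment hook that lowers the ZONE_DMA limit in an array of per-zone
maximal PFNs. The hook takes the array by pointer so that the caller sees
the change. The sk_ names, the two-zone enum and the 4K page size
(SK_PAGE_SHIFT of 12) are assumptions made for this example.

```c
/* Hypothetical two-zone layout and page size, for illustration only. */
enum { SK_ZONE_DMA, SK_ZONE_NORMAL, SK_NR_ZONES };

#define SK_PAGE_SHIFT	12
#define SK_SZ_128M	(128UL << 20)

/* Machine-specific hook: cap ZONE_DMA at the first 128M worth of pages. */
void sk_adjust_zones(unsigned long *max_zone_pfn)
{
	max_zone_pfn[SK_ZONE_DMA] = SK_SZ_128M >> SK_PAGE_SHIFT;
}

/* Fill in the defaults, then let the hook lower the ZONE_DMA limit. */
void sk_fill_zone_limits(unsigned long max_low, unsigned long *max_zone_pfn)
{
	max_zone_pfn[SK_ZONE_DMA] = max_low;
	max_zone_pfn[SK_ZONE_NORMAL] = max_low;

	sk_adjust_zones(max_zone_pfn);
}
```

With max_low above 128M worth of pages, ZONE_DMA ends at the cap and
ZONE_NORMAL covers the rest of low memory.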

diff --git a/arch/unicore32/include/asm/memory.h b/arch/unicore32/include/asm/memory.h
index 23c93105f98f..66285178dd9b 100644
--- a/arch/unicore32/include/asm/memory.h
+++ b/arch/unicore32/include/asm/memory.h
@@ -60,7 +60,7 @@
 #ifndef __ASSEMBLY__
 
 #ifndef arch_adjust_zones
-#define arch_adjust_zones(size, holes) do { } while (0)
+#define arch_adjust_zones(max_zone_pfn) do { } while (0)
 #endif
 
 /*
diff --git a/arch/unicore32/include/mach/memory.h b/arch/unicore32/include/mach/memory.h
index 2b527cedd03d..b4e6035cb9a3 100644
--- a/arch/unicore32/include/mach/memory.h
+++ b/arch/unicore32/include/mach/memory.h
@@ -25,10 +25,10 @@
 
 #if !defined(__ASSEMBLY__) && defined(CONFIG_PCI)
 
-void puv3_pci_adjust_zones(unsigned long *size, unsigned long *holes);
+void puv3_pci_adjust_zones(unsigned long *max_zone_pfn);
 
-#define arch_adjust_zones(size, holes) \
-	puv3_pci_adjust_zones(size, holes)
+#define arch_adjust_zones(max_zone_pfn) \
+	puv3_pci_adjust_zones(max_zone_pfn)
 
 #endif
 
diff --git a/arch/unicore32/kernel/pci.c b/arch/unicore32/kernel/pci.c
index efa04a94dcdb..0d098aa05b47 100644
--- a/arch/unicore32/kernel/pci.c
+++ b/arch/unicore32/kernel/pci.c
@@ -133,21 +133,11 @@ static int pci_puv3_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
  * This is really ugly and we need a better way of specifying
  * DMA-capable regions of memory.
  */
-void __init puv3_pci_adjust_zones(unsigned long *zone_size,
-	unsigned long *zhole_size)
+void __init puv3_pci_adjust_zones(unsigned long *max_zone_pfn)
 {
 	unsigned int sz = SZ_128M >> PAGE_SHIFT;
 
-	/*
-	 * Only adjust if > 128M on current system
-	 */
-	if (zone_size[0] <= sz)
-		return;
-
-	zone_size[1] = zone_size[0] - sz;
-	zone_size[0] = sz;
-	zhole_size[1] = zhole_size[0];
-	zhole_size[0] = 0;
+	max_zone_pfn[ZONE_DMA] = sz;
 }
 
 /*
diff --git a/arch/unicore32/mm/init.c b/arch/unicore32/mm/init.c
index 6cf010fadc7a..52425d383cea 100644
--- a/arch/unicore32/mm/init.c
+++ b/arch/unicore32/mm/init.c
@@ -61,46 +61,21 @@ static void __init find_limits(unsigned long *min, unsigned long *max_low,
 	}
 }
 
-static void __init uc32_bootmem_free(unsigned long min, unsigned long max_low,
-	unsigned long max_high)
+static void __init uc32_bootmem_free(unsigned long max_low)
 {
-	unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES];
-	struct memblock_region *reg;
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 
-	/*
-	 * initialise the zones.
-	 */
-	memset(zone_size, 0, sizeof(zone_size));
-
-	/*
-	 * The memory size has already been determined.  If we need
-	 * to do anything fancy with the allocation of this memory
-	 * to the zones, now is the time to do it.
-	 */
-	zone_size[0] = max_low - min;
-
-	/*
-	 * Calculate the size of the holes.
-	 *  holes = node_size - sum(bank_sizes)
-	 */
-	memcpy(zhole_size, zone_size, sizeof(zhole_size));
-	for_each_memblock(memory, reg) {
-		unsigned long start = memblock_region_memory_base_pfn(reg);
-		unsigned long end = memblock_region_memory_end_pfn(reg);
-
-		if (start < max_low) {
-			unsigned long low_end = min(end, max_low);
-			zhole_size[0] -= low_end - start;
-		}
-	}
+	max_zone_pfn[ZONE_DMA] = max_low;
+	max_zone_pfn[ZONE_NORMAL] = max_low;
 
 	/*
 	 * Adjust the sizes according to any special requirements for
 	 * this machine type.
+	 * This might lower ZONE_DMA limit.
 	 */
-	arch_adjust_zones(zone_size, zhole_size);
+	arch_adjust_zones(max_zone_pfn);
 
-	free_area_init_node(0, zone_size, min, zhole_size);
+	free_area_init(max_zone_pfn);
 }
 
 int pfn_valid(unsigned long pfn)
@@ -176,11 +151,11 @@ void __init bootmem_init(void)
 	sparse_init();
 
 	/*
-	 * Now free the memory - free_area_init_node needs
+	 * Now free the memory - free_area_init needs
 	 * the sparse mem_map arrays initialized by sparse_init()
 	 * for memmap_init_zone(), otherwise all PFNs are invalid.
 	 */
-	uc32_bootmem_free(min, max_low, max_high);
+	uc32_bootmem_free(max_low);
 
 	high_memory = __va((max_low << PAGE_SHIFT) - 1) + 1;
 
-- 
2.26.1




* [PATCH v2 14/20] xtensa: simplify detection of memory zone boundaries
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (12 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 13/20] unicore32: " Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 15/20] mm: memmap_init: iterate over memblock regions rather that check each PFN Mike Rapoport
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init() function only requires the definition of the maximal PFN
for each of the supported zones rather than calculation of actual zone sizes
and the sizes of the holes between the zones.

After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP the free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.
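
As a rough userspace illustration of why the per-zone maximal PFNs are
sufficient, the sketch below recovers the old-style zone sizes from a
max_zone_pfn-like array and the lowest PFN. The sk_ names are hypothetical
and are not kernel APIs; holes are ignored here because the kernel derives
them from memblock.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: derive per-zone sizes from
 * per-zone maximal PFNs. Each zone starts where the previous one ends,
 * and the first zone starts at min_pfn.
 */
static void sk_sizes_from_limits(const unsigned long *max_zone_pfn,
				 unsigned long min_pfn,
				 unsigned long *zone_size,
				 size_t nr_zones)
{
	unsigned long start = min_pfn;
	size_t i;

	assert(max_zone_pfn && zone_size);

	for (i = 0; i < nr_zones; i++) {
		/* A limit below the running start yields an empty zone */
		unsigned long end = max_zone_pfn[i] > start ?
				    max_zone_pfn[i] : start;

		zone_size[i] = end - start;
		start = end;
	}
}
```

With limits { max_low_pfn, max_pfn } and min_pfn equal to ARCH_PFN_OFFSET
this yields the same two values the removed xtensa initializer computed by
hand.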

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/xtensa/mm/init.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/xtensa/mm/init.c b/arch/xtensa/mm/init.c
index 19c625e6d81f..a05b306cf371 100644
--- a/arch/xtensa/mm/init.c
+++ b/arch/xtensa/mm/init.c
@@ -70,13 +70,13 @@ void __init bootmem_init(void)
 void __init zones_init(void)
 {
 	/* All pages are DMA-able, so we put them all in the DMA zone. */
-	unsigned long zones_size[MAX_NR_ZONES] = {
-		[ZONE_NORMAL] = max_low_pfn - ARCH_PFN_OFFSET,
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = {
+		[ZONE_NORMAL] = max_low_pfn,
 #ifdef CONFIG_HIGHMEM
-		[ZONE_HIGHMEM] = max_pfn - max_low_pfn,
+		[ZONE_HIGHMEM] = max_pfn,
 #endif
 	};
-	free_area_init_node(0, zones_size, ARCH_PFN_OFFSET, NULL);
+	free_area_init(max_zone_pfn);
 }
 
 #ifdef CONFIG_HIGHMEM
-- 
2.26.1




* [PATCH v2 15/20] mm: memmap_init: iterate over memblock regions rather that check each PFN
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (13 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 14/20] xtensa: " Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES Mike Rapoport
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Baoquan He <bhe@redhat.com>

When called during boot the memmap_init_zone() function checks if each PFN
is valid and actually belongs to the node being initialized using
early_pfn_valid() and early_pfn_in_nid().

Each such check may cost up to O(log(n)) where n is the number of memory
banks, so for large amounts of memory the overall time spent in early_pfn*()
becomes substantial.

Since the information is already present in memblock, we can iterate over
memblock memory regions in memmap_init() and only call memmap_init_zone()
for PFN ranges that are known to be valid and in the appropriate node.
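
The clamping can be sketched in plain userspace C as follows. This is only
an illustration: struct sk_pfn_range and the sk_ helpers are hypothetical
stand-ins for the memblock iterator, and the running total stands in for
the memmap_init_zone() calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one memblock memory range, in PFNs */
struct sk_pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

static unsigned long sk_clamp(unsigned long v, unsigned long lo,
			      unsigned long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Count the PFNs that would be initialised for the zone range
 * [zone_start, zone_end): every memory range is clamped to the zone
 * and empty intersections are skipped, so no per-PFN validity check
 * is needed.
 */
static unsigned long sk_pfns_to_init(const struct sk_pfn_range *mem,
				     size_t nr,
				     unsigned long zone_start,
				     unsigned long zone_end)
{
	unsigned long total = 0;
	size_t i;

	assert(zone_start <= zone_end);

	for (i = 0; i < nr; i++) {
		unsigned long s = sk_clamp(mem[i].start_pfn,
					   zone_start, zone_end);
		unsigned long e = sk_clamp(mem[i].end_pfn,
					   zone_start, zone_end);

		if (e > s)
			total += e - s;	/* memmap_init_zone() would run here */
	}

	return total;
}
```

The work is proportional to the number of memory ranges plus the number of
PFNs that are actually backed by memory, rather than to every PFN spanned
by the zone.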

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 mm/page_alloc.c | 47 ++++++++++++++++-------------------------------
 1 file changed, 16 insertions(+), 31 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7f6a3081edb8..8d112defaead 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5939,23 +5939,6 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
 	return false;
 }
 
-#ifdef CONFIG_SPARSEMEM
-/* Skip PFNs that belong to non-present sections */
-static inline __meminit unsigned long next_pfn(unsigned long pfn)
-{
-	const unsigned long section_nr = pfn_to_section_nr(++pfn);
-
-	if (present_section_nr(section_nr))
-		return pfn;
-	return section_nr_to_pfn(next_present_section_nr(section_nr));
-}
-#else
-static inline __meminit unsigned long next_pfn(unsigned long pfn)
-{
-	return pfn++;
-}
-#endif
-
 /*
  * Initially all pages are reserved - free ones are freed
  * up by memblock_free_all() once the early boot process is
@@ -5990,19 +5973,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 #endif
 
 	for (pfn = start_pfn; pfn < end_pfn; ) {
-		/*
-		 * There can be holes in boot-time mem_map[]s handed to this
-		 * function.  They do not exist on hotplugged memory.
-		 */
 		if (context == MEMMAP_EARLY) {
-			if (!early_pfn_valid(pfn)) {
-				pfn = next_pfn(pfn);
-				continue;
-			}
-			if (!early_pfn_in_nid(pfn, nid)) {
-				pfn++;
-				continue;
-			}
 			if (overlap_memmap_init(zone, &pfn))
 				continue;
 			if (defer_init(nid, pfn, end_pfn))
@@ -6118,9 +6089,23 @@ static void __meminit zone_init_free_lists(struct zone *zone)
 }
 
 void __meminit __weak memmap_init(unsigned long size, int nid,
-				  unsigned long zone, unsigned long start_pfn)
+				  unsigned long zone,
+				  unsigned long range_start_pfn)
 {
-	memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL);
+	unsigned long start_pfn, end_pfn;
+	unsigned long range_end_pfn = range_start_pfn + size;
+	int i;
+
+	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
+		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
+		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
+
+		if (end_pfn > start_pfn) {
+			size = end_pfn - start_pfn;
+			memmap_init_zone(size, nid, zone, start_pfn,
+					 MEMMAP_EARLY, NULL);
+		}
+	}
 }
 
 static int zone_batchsize(struct zone *zone)
-- 
2.26.1




* [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (14 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 15/20] mm: memmap_init: iterate over memblock regions rather that check each PFN Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 14:17   ` Christoph Hellwig
  2020-04-29 16:29   ` [PATCH v2.5 " Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order Mike Rapoport
                   ` (3 subsequent siblings)
  19 siblings, 2 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

The commit f47ac088c406 ("mm: memmap_init: iterate over memblock regions
rather that check each PFN") made early_pfn_in_nid() obsolete and since
CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real
implementation of early_pfn_in_nid() it is also not needed anymore.

Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES.

Co-developed-by: Hoan Tran <Hoan@os.amperecomputing.com>
Signed-off-by: Hoan Tran <Hoan@os.amperecomputing.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/powerpc/Kconfig |  9 ---------
 arch/sparc/Kconfig   |  9 ---------
 arch/x86/Kconfig     |  9 ---------
 mm/page_alloc.c      | 20 --------------------
 4 files changed, 47 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5f86b22b7d2c..74f316deeae1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -685,15 +685,6 @@ config ARCH_MEMORY_PROBE
 	def_bool y
 	depends on MEMORY_HOTPLUG
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-	def_bool y
-	depends on NEED_MULTIPLE_NODES
-
 config STDBINUTILS
 	bool "Using standard binutils settings"
 	depends on 44x
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 795206b7b552..0e4f3891b904 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -286,15 +286,6 @@ config NODES_SHIFT
 	  Specify the maximum number of NUMA Nodes available on the target
 	  system.  Increases memory reserved to accommodate various tables.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-	def_bool y
-	depends on NEED_MULTIPLE_NODES
-
 config ARCH_SPARSEMEM_ENABLE
 	def_bool y if SPARC64
 	select SPARSEMEM_VMEMMAP_ENABLE
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f8bf218a169c..1ec2a5e2fef6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1581,15 +1581,6 @@ config X86_64_ACPI_NUMA
 	---help---
 	  Enable ACPI SRAT based node topology detection.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-	def_bool y
-	depends on X86_64_ACPI_NUMA
-
 config NUMA_EMU
 	bool "NUMA emulation"
 	depends on NUMA
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8d112defaead..d35ca0996a09 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1541,26 +1541,6 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 }
 #endif /* CONFIG_NEED_MULTIPLE_NODES */
 
-#ifdef CONFIG_NODES_SPAN_OTHER_NODES
-/* Only safe to use early in boot when initialisation is single-threaded */
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-	int nid;
-
-	nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
-	if (nid >= 0 && nid != node)
-		return false;
-	return true;
-}
-
-#else
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-	return true;
-}
-#endif
-
-
 void __init memblock_free_pages(struct page *page, unsigned long pfn,
 							unsigned int order)
 {
-- 
2.26.1




* [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (15 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-05-03 17:41   ` Guenter Roeck
  2020-04-29 12:11 ` [PATCH v2 18/20] mm: clean up free_area_init_node() and its helpers Mike Rapoport
                   ` (2 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array even
if it is sorted in descending order allows using free_area_init() on such
architectures.

Add top -> down traversal of max_zone_pfn array in free_area_init() and use
the latter in ARC node/zone initialization.
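
The traversal order can be modelled in userspace roughly as below. This is
a sketch under simplifying assumptions: only two zones exist (index 0 for
ZONE_NORMAL, index 1 for ZONE_HIGHMEM), the ZONE_MOVABLE skip is omitted,
and the sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_NR_ZONES 2	/* assumed: [0] = NORMAL, [1] = HIGHMEM */

/*
 * Fill lowest[]/highest[] with each zone's possible PFN range. When the
 * limits are sorted in descending order the zones are visited from the
 * last index to the first, so the zone with the smallest limit still
 * gets the lowest PFNs.
 */
static void sk_zone_limits(const unsigned long *max_zone_pfn,
			   unsigned long min_pfn,
			   unsigned long *lowest,
			   unsigned long *highest)
{
	unsigned long start = min_pfn;
	bool descending;
	size_t i;

	assert(max_zone_pfn && lowest && highest);

	descending = SK_NR_ZONES > 1 && max_zone_pfn[0] > max_zone_pfn[1];

	for (i = 0; i < SK_NR_ZONES; i++) {
		size_t zone = descending ? SK_NR_ZONES - i - 1 : i;
		unsigned long end = max_zone_pfn[zone] > start ?
				    max_zone_pfn[zone] : start;

		lowest[zone] = start;
		highest[zone] = end;
		start = end;
	}
}
```

For an ARC-like layout with high memory at PFNs [0, 0x20000) and a
ZONE_NORMAL limit of 0xa0000, the HIGHMEM zone is assigned first and
ZONE_NORMAL begins where it ends.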

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arc/mm/init.c | 36 +++++++-----------------------------
 mm/page_alloc.c    | 24 +++++++++++++++++++-----
 2 files changed, 26 insertions(+), 34 deletions(-)

diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
index 0920c969c466..41eb9be1653c 100644
--- a/arch/arc/mm/init.c
+++ b/arch/arc/mm/init.c
@@ -63,11 +63,13 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
 
 		low_mem_sz = size;
 		in_use = 1;
+		memblock_add_node(base, size, 0);
 	} else {
 #ifdef CONFIG_HIGHMEM
 		high_mem_start = base;
 		high_mem_sz = size;
 		in_use = 1;
+		memblock_add_node(base, size, 1);
 #endif
 	}
 
@@ -83,8 +85,7 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
  */
 void __init setup_arch_memory(void)
 {
-	unsigned long zones_size[MAX_NR_ZONES];
-	unsigned long zones_holes[MAX_NR_ZONES];
+	unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 };
 
 	init_mm.start_code = (unsigned long)_text;
 	init_mm.end_code = (unsigned long)_etext;
@@ -115,7 +116,6 @@ void __init setup_arch_memory(void)
 	 * the crash
 	 */
 
-	memblock_add_node(low_mem_start, low_mem_sz, 0);
 	memblock_reserve(CONFIG_LINUX_LINK_BASE,
 			 __pa(_end) - CONFIG_LINUX_LINK_BASE);
 
@@ -133,22 +133,7 @@ void __init setup_arch_memory(void)
 	memblock_dump_all();
 
 	/*----------------- node/zones setup --------------------------*/
-	memset(zones_size, 0, sizeof(zones_size));
-	memset(zones_holes, 0, sizeof(zones_holes));
-
-	zones_size[ZONE_NORMAL] = max_low_pfn - min_low_pfn;
-	zones_holes[ZONE_NORMAL] = 0;
-
-	/*
-	 * We can't use the helper free_area_init(zones[]) because it uses
-	 * PAGE_OFFSET to compute the @min_low_pfn which would be wrong
-	 * when our kernel doesn't start at PAGE_OFFSET, i.e.
-	 * PAGE_OFFSET != CONFIG_LINUX_RAM_BASE
-	 */
-	free_area_init_node(0,			/* node-id */
-			    zones_size,		/* num pages per zone */
-			    min_low_pfn,	/* first pfn of node */
-			    zones_holes);	/* holes */
+	max_zone_pfn[ZONE_NORMAL] = max_low_pfn;
 
 #ifdef CONFIG_HIGHMEM
 	/*
@@ -168,20 +153,13 @@ void __init setup_arch_memory(void)
 	min_high_pfn = PFN_DOWN(high_mem_start);
 	max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);
 
-	zones_size[ZONE_NORMAL] = 0;
-	zones_holes[ZONE_NORMAL] = 0;
-
-	zones_size[ZONE_HIGHMEM] = max_high_pfn - min_high_pfn;
-	zones_holes[ZONE_HIGHMEM] = 0;
-
-	free_area_init_node(1,			/* node-id */
-			    zones_size,		/* num pages per zone */
-			    min_high_pfn,	/* first pfn of node */
-			    zones_holes);	/* holes */
+	max_zone_pfn[ZONE_HIGHMEM] = max_high_pfn;
 
 	high_memory = (void *)(min_high_pfn << PAGE_SHIFT);
 	kmap_init();
 #endif
+
+	free_area_init(max_zone_pfn);
 }
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d35ca0996a09..98a47f90065a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7408,7 +7408,8 @@ static void check_for_memory(pg_data_t *pgdat, int nid)
 void __init free_area_init(unsigned long *max_zone_pfn)
 {
 	unsigned long start_pfn, end_pfn;
-	int i, nid;
+	int i, nid, zone;
+	bool descending = false;
 
 	/* Record where the zone boundaries are */
 	memset(arch_zone_lowest_possible_pfn, 0,
@@ -7418,13 +7419,26 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 
 	start_pfn = find_min_pfn_with_active_regions();
 
+	/*
+	 * Some architectures, e.g. ARC, may have ZONE_HIGHMEM below
+	 * ZONE_NORMAL. For such cases we allow max_zone_pfn sorted in the
+	 * descending order
+	 */
+	if (MAX_NR_ZONES > 1 && max_zone_pfn[0] > max_zone_pfn[1])
+		descending = true;
+
 	for (i = 0; i < MAX_NR_ZONES; i++) {
-		if (i == ZONE_MOVABLE)
+		if (descending)
+			zone = MAX_NR_ZONES - i - 1;
+		else
+			zone = i;
+
+		if (zone == ZONE_MOVABLE)
 			continue;
 
-		end_pfn = max(max_zone_pfn[i], start_pfn);
-		arch_zone_lowest_possible_pfn[i] = start_pfn;
-		arch_zone_highest_possible_pfn[i] = end_pfn;
+		end_pfn = max(max_zone_pfn[zone], start_pfn);
+		arch_zone_lowest_possible_pfn[zone] = start_pfn;
+		arch_zone_highest_possible_pfn[zone] = end_pfn;
 
 		start_pfn = end_pfn;
 	}
-- 
2.26.1




* [PATCH v2 18/20] mm: clean up free_area_init_node() and its helpers
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (16 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 19/20] mm: simplify find_min_pfn_with_active_regions() Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 20/20] docs/vm: update memory-models documentation Mike Rapoport
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

The free_area_init_node() now always uses memblock info and the zone PFN
limits so it does not need the backwards compatibility functions to
calculate the zone spanned and absent pages. The removal of the compat_
versions of zone_{absent,spanned}_pages_in_node() in turn makes the
zone_size and zhole_size parameters unused.

The node_start_pfn is determined by get_pfn_range_for_nid(), so there is no
need to pass it to free_area_init_node().

As a result, the only required parameter to free_area_init_node() is the
node ID; all the rest are removed along with the no longer used
compat_zone_{absent,spanned}_pages_in_node() helpers.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/x86/mm/numa.c |   2 +-
 include/linux/mm.h |   7 +--
 mm/page_alloc.c    | 110 +++++++++------------------------------------
 3 files changed, 24 insertions(+), 95 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index fe024b2ac796..0e1b99f491e4 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -742,7 +742,7 @@ static void __init init_memory_less_node(int nid)
 
 	/* Allocate and initialize node data. Memory-less node is now online.*/
 	alloc_node_data(nid);
-	free_area_init_node(nid, zones_size, 0, zholes_size);
+	free_area_init_node(nid);
 
 	/*
 	 * All zonelists will be built later in start_kernel() after per cpu
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1c2ecb42e043..2c0d42b11f3c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2272,8 +2272,7 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 }
 
 extern void __init pagecache_init(void);
-extern void __init free_area_init_node(int nid, unsigned long * zones_size,
-		unsigned long zone_start_pfn, unsigned long *zholes_size);
+extern void __init free_area_init_node(int nid);
 extern void free_initmem(void);
 
 /*
@@ -2346,9 +2345,7 @@ static inline unsigned long get_num_physpages(void)
 /*
  * Using memblock node mappings, an architecture may initialise its
  * zones, allocate the backing mem_map and account for memory holes in a more
- * architecture independent manner. This is a substitute for creating the
- * zone_sizes[] and zholes_size[] arrays and passing them to
- * free_area_init_node()
+ * architecture independent manner.
  *
  * An architecture is expected to register range of page frames backed by
  * physical memory with memblock_add[_node]() before calling
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 98a47f90065a..30d171451d4c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6420,8 +6420,7 @@ static unsigned long __init zone_spanned_pages_in_node(int nid,
 					unsigned long node_start_pfn,
 					unsigned long node_end_pfn,
 					unsigned long *zone_start_pfn,
-					unsigned long *zone_end_pfn,
-					unsigned long *ignored)
+					unsigned long *zone_end_pfn)
 {
 	unsigned long zone_low = arch_zone_lowest_possible_pfn[zone_type];
 	unsigned long zone_high = arch_zone_highest_possible_pfn[zone_type];
@@ -6485,8 +6484,7 @@ unsigned long __init absent_pages_in_range(unsigned long start_pfn,
 static unsigned long __init zone_absent_pages_in_node(int nid,
 					unsigned long zone_type,
 					unsigned long node_start_pfn,
-					unsigned long node_end_pfn,
-					unsigned long *ignored)
+					unsigned long node_end_pfn)
 {
 	unsigned long zone_low = arch_zone_lowest_possible_pfn[zone_type];
 	unsigned long zone_high = arch_zone_highest_possible_pfn[zone_type];
@@ -6533,43 +6531,9 @@ static unsigned long __init zone_absent_pages_in_node(int nid,
 	return nr_absent;
 }
 
-static inline unsigned long __init compat_zone_spanned_pages_in_node(int nid,
-					unsigned long zone_type,
-					unsigned long node_start_pfn,
-					unsigned long node_end_pfn,
-					unsigned long *zone_start_pfn,
-					unsigned long *zone_end_pfn,
-					unsigned long *zones_size)
-{
-	unsigned int zone;
-
-	*zone_start_pfn = node_start_pfn;
-	for (zone = 0; zone < zone_type; zone++)
-		*zone_start_pfn += zones_size[zone];
-
-	*zone_end_pfn = *zone_start_pfn + zones_size[zone_type];
-
-	return zones_size[zone_type];
-}
-
-static inline unsigned long __init compat_zone_absent_pages_in_node(int nid,
-						unsigned long zone_type,
-						unsigned long node_start_pfn,
-						unsigned long node_end_pfn,
-						unsigned long *zholes_size)
-{
-	if (!zholes_size)
-		return 0;
-
-	return zholes_size[zone_type];
-}
-
 static void __init calculate_node_totalpages(struct pglist_data *pgdat,
 						unsigned long node_start_pfn,
-						unsigned long node_end_pfn,
-						unsigned long *zones_size,
-						unsigned long *zholes_size,
-						bool compat)
+						unsigned long node_end_pfn)
 {
 	unsigned long realtotalpages = 0, totalpages = 0;
 	enum zone_type i;
@@ -6580,31 +6544,14 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
 		unsigned long spanned, absent;
 		unsigned long size, real_size;
 
-		if (compat) {
-			spanned = compat_zone_spanned_pages_in_node(
-						pgdat->node_id, i,
-						node_start_pfn,
-						node_end_pfn,
-						&zone_start_pfn,
-						&zone_end_pfn,
-						zones_size);
-			absent = compat_zone_absent_pages_in_node(
-						pgdat->node_id, i,
-						node_start_pfn,
-						node_end_pfn,
-						zholes_size);
-		} else {
-			spanned = zone_spanned_pages_in_node(pgdat->node_id, i,
-						node_start_pfn,
-						node_end_pfn,
-						&zone_start_pfn,
-						&zone_end_pfn,
-						zones_size);
-			absent = zone_absent_pages_in_node(pgdat->node_id, i,
-						node_start_pfn,
-						node_end_pfn,
-						zholes_size);
-		}
+		spanned = zone_spanned_pages_in_node(pgdat->node_id, i,
+						     node_start_pfn,
+						     node_end_pfn,
+						     &zone_start_pfn,
+						     &zone_end_pfn);
+		absent = zone_absent_pages_in_node(pgdat->node_id, i,
+						   node_start_pfn,
+						   node_end_pfn);
 
 		size = spanned;
 		real_size = size - absent;
@@ -6926,10 +6873,7 @@ static inline void pgdat_set_deferred_range(pg_data_t *pgdat)
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
 #endif
 
-static void __init __free_area_init_node(int nid, unsigned long *zones_size,
-					 unsigned long node_start_pfn,
-					 unsigned long *zholes_size,
-					 bool compat)
+void __init free_area_init_node(int nid)
 {
 	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long start_pfn = 0;
@@ -6938,19 +6882,16 @@ static void __init __free_area_init_node(int nid, unsigned long *zones_size,
 	/* pg_data_t should be reset to zero when it's allocated */
 	WARN_ON(pgdat->nr_zones || pgdat->kswapd_classzone_idx);
 
+	get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
+
 	pgdat->node_id = nid;
-	pgdat->node_start_pfn = node_start_pfn;
+	pgdat->node_start_pfn = start_pfn;
 	pgdat->per_cpu_nodestats = NULL;
-	if (!compat) {
-		get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
-		pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
-			(u64)start_pfn << PAGE_SHIFT,
-			end_pfn ? ((u64)end_pfn << PAGE_SHIFT) - 1 : 0);
-	} else {
-		start_pfn = node_start_pfn;
-	}
-	calculate_node_totalpages(pgdat, start_pfn, end_pfn,
-				  zones_size, zholes_size, compat);
+
+	pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
+		(u64)start_pfn << PAGE_SHIFT,
+		end_pfn ? ((u64)end_pfn << PAGE_SHIFT) - 1 : 0);
+	calculate_node_totalpages(pgdat, start_pfn, end_pfn);
 
 	alloc_node_mem_map(pgdat);
 	pgdat_set_deferred_range(pgdat);
@@ -6958,14 +6899,6 @@ static void __init __free_area_init_node(int nid, unsigned long *zones_size,
 	free_area_init_core(pgdat);
 }
 
-void __init free_area_init_node(int nid, unsigned long *zones_size,
-				unsigned long node_start_pfn,
-				unsigned long *zholes_size)
-{
-	__free_area_init_node(nid, zones_size, node_start_pfn, zholes_size,
-			      true);
-}
-
 #if !defined(CONFIG_FLAT_NODE_MEM_MAP)
 /*
  * Initialize all valid struct pages in the range [spfn, epfn) and mark them
@@ -7491,8 +7424,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 	init_unavailable_mem();
 	for_each_online_node(nid) {
 		pg_data_t *pgdat = NODE_DATA(nid);
-		__free_area_init_node(nid, NULL,
-				      find_min_pfn_for_node(nid), NULL, false);
+		free_area_init_node(nid);
 
 		/* Any memory on that node */
 		if (pgdat->node_present_pages)
-- 
2.26.1




* [PATCH v2 19/20] mm: simplify find_min_pfn_with_active_regions()
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (17 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 18/20] mm: clean up free_area_init_node() and its helpers Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  2020-04-29 12:11 ` [PATCH v2 20/20] docs/vm: update memory-models documentation Mike Rapoport
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel

From: Mike Rapoport <rppt@linux.ibm.com>

find_min_pfn_with_active_regions() calls find_min_pfn_for_node() with the
nid parameter set to MAX_NUMNODES. This makes find_min_pfn_for_node()
traverse all memblock memory regions, although the first PFN in the system
can be easily found with memblock_start_of_DRAM().

Use memblock_start_of_DRAM() in find_min_pfn_with_active_regions() and drop
the now unused find_min_pfn_for_node().
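
As a rough user-space illustration of why the loop is redundant, assuming
that memory regions are kept sorted by base address (as memblock does for its
memory regions). The struct, the PAGE_SHIFT value and the helper names below
are simplified stand-ins made up for this sketch, not the kernel definitions:

```c
#include <limits.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumed 4K pages, for illustration only */

/* Hypothetical stand-in for a memblock region; not the kernel struct. */
struct region {
	unsigned long long base;
	unsigned long long size;
};

/* Old approach: scan every region looking for the lowest start PFN. */
static unsigned long min_pfn_by_scan(const struct region *r, size_t n)
{
	unsigned long min_pfn = ULONG_MAX;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long pfn = (unsigned long)(r[i].base >> PAGE_SHIFT);

		if (pfn < min_pfn)
			min_pfn = pfn;
	}
	return min_pfn;
}

/* New approach: with regions sorted by base, the first one has the answer. */
static unsigned long min_pfn_first(const struct region *r)
{
	return (unsigned long)(r[0].base >> PAGE_SHIFT);
}
```

For any non-empty array sorted by base, both helpers return the same PFN, so
the scan only adds work.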

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 mm/page_alloc.c | 20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 30d171451d4c..b990e9734474 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7045,24 +7045,6 @@ unsigned long __init node_map_pfn_alignment(void)
 	return ~accl_mask + 1;
 }
 
-/* Find the lowest pfn for a node */
-static unsigned long __init find_min_pfn_for_node(int nid)
-{
-	unsigned long min_pfn = ULONG_MAX;
-	unsigned long start_pfn;
-	int i;
-
-	for_each_mem_pfn_range(i, nid, &start_pfn, NULL, NULL)
-		min_pfn = min(min_pfn, start_pfn);
-
-	if (min_pfn == ULONG_MAX) {
-		pr_warn("Could not find start_pfn for node %d\n", nid);
-		return 0;
-	}
-
-	return min_pfn;
-}
-
 /**
  * find_min_pfn_with_active_regions - Find the minimum PFN registered
  *
@@ -7071,7 +7053,7 @@ static unsigned long __init find_min_pfn_for_node(int nid)
  */
 unsigned long __init find_min_pfn_with_active_regions(void)
 {
-	return find_min_pfn_for_node(MAX_NUMNODES);
+	return PHYS_PFN(memblock_start_of_DRAM());
 }
 
 /*
-- 
2.26.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 20/20] docs/vm: update memory-models documentation
  2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
                   ` (18 preceding siblings ...)
  2020-04-29 12:11 ` [PATCH v2 19/20] mm: simplify find_min_pfn_with_active_regions() Mike Rapoport
@ 2020-04-29 12:11 ` Mike Rapoport
  19 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 12:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Baoquan He, Brian Cain, Catalin Marinas,
	David S. Miller, Geert Uytterhoeven, Greentime Hu, Greg Ungerer,
	Guan Xuetao, Guo Ren, Heiko Carstens, Helge Deller, Hoan Tran,
	James E.J. Bottomley, Jonathan Corbet, Ley Foon Tan, Mark Salter,
	Matt Turner, Max Filippov, Michael Ellerman, Michal Hocko,
	Michal Simek, Nick Hu, Paul Walmsley, Qian Cai,
	Richard Weinberger, Rich Felker, Russell King, Stafford Horne,
	Thomas Bogendoerfer, Tony Luck, Vineet Gupta, x86,
	Yoshinori Sato, linux-alpha, linux-arch, linux-arm-kernel,
	linux-c6x-dev, linux-csky, linux-doc, linux-hexagon, linux-ia64,
	linux-m68k, linux-mips, linux-mm, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, linux-snps-arc, linux-um,
	linux-xtensa, openrisc, sparclinux, uclinux-h8-devel,
	Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Update the memory-models documentation to reflect the changes to the
free_area_init() family of functions.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 Documentation/vm/memory-model.rst | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/Documentation/vm/memory-model.rst b/Documentation/vm/memory-model.rst
index 58a12376b7df..91228044ed16 100644
--- a/Documentation/vm/memory-model.rst
+++ b/Documentation/vm/memory-model.rst
@@ -46,11 +46,10 @@ maps the entire physical memory. For most architectures, the holes
 have entries in the `mem_map` array. The `struct page` objects
 corresponding to the holes are never fully initialized.
 
-To allocate the `mem_map` array, architecture specific setup code
-should call :c:func:`free_area_init_node` function or its convenience
-wrapper :c:func:`free_area_init`. Yet, the mappings array is not
-usable until the call to :c:func:`memblock_free_all` that hands all
-the memory to the page allocator.
+To allocate the `mem_map` array, architecture specific setup code should
+call :c:func:`free_area_init` function. Yet, the mappings array is not
+usable until the call to :c:func:`memblock_free_all` that hands all the
+memory to the page allocator.
 
 If an architecture enables `CONFIG_ARCH_HAS_HOLES_MEMORYMODEL` option,
 it may free parts of the `mem_map` array that do not cover the
-- 
2.26.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
  2020-04-29 12:11 ` [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES Mike Rapoport
@ 2020-04-29 14:17   ` Christoph Hellwig
  2020-04-29 14:33     ` Mike Rapoport
  2020-04-29 16:29   ` [PATCH v2.5 " Mike Rapoport
  1 sibling, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2020-04-29 14:17 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Mike Rapoport, Greg Ungerer, linux-arch, linux-s390,
	linux-c6x-dev, Baoquan He, Jonathan Corbet, linux-sh,
	Helge Deller, x86, Russell King, Ley Foon Tan, Yoshinori Sato,
	Geert Uytterhoeven, linux-arm-kernel, Mark Salter, Matt Turner,
	linux-snps-arc, uclinux-h8-devel, linux-xtensa, linux-alpha,
	linux-um, linux-m68k, Tony Luck, Qian Cai, Greentime Hu,
	Paul Walmsley, Stafford Horne, Guan Xuetao, Hoan Tran,
	Michal Simek, Thomas Bogendoerfer, Brian Cain, Nick Hu, linux-mm,
	Vineet Gupta, linux-mips, openrisc, Richard Weinberger,
	Andrew Morton, linuxppc-dev, David S. Miller

On Wed, Apr 29, 2020 at 03:11:22PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The commit f47ac088c406 ("mm: memmap_init: iterate over memblock regions
> rather that check each PFN") made early_pfn_in_nid() obsolete and since
> CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real
> implementation of early_pfn_in_nid() it is also not needed anymore.

I don't think you can quote a commit id for something that hasn't been
committed to mainline yet.  Then again I would have just merged this
patch into the one that obsoleted early_pfn_in_nid anyway.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
  2020-04-29 14:17   ` Christoph Hellwig
@ 2020-04-29 14:33     ` Mike Rapoport
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 14:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Mike Rapoport, Greg Ungerer, linux-arch, linux-s390,
	linux-c6x-dev, Baoquan He, Jonathan Corbet, linux-sh,
	Helge Deller, x86, Russell King, Ley Foon Tan, Yoshinori Sato,
	Geert Uytterhoeven, linux-arm-kernel, Mark Salter, Matt Turner,
	linux-snps-arc, uclinux-h8-devel, linux-xtensa, linux-alpha,
	linux-um, linux-m68k, Tony Luck, Qian Cai, Greentime Hu,
	Paul Walmsley, Stafford Horne, Guan Xuetao, Hoan Tran,
	Michal Simek, Thomas Bogendoerfer, Brian Cain, Nick Hu, linux-mm,
	Vineet Gupta, linux-mips, openrisc, Richard Weinberger,
	Andrew Morton, linuxppc-dev, David S. Miller

On Wed, Apr 29, 2020 at 07:17:06AM -0700, Christoph Hellwig wrote:
> On Wed, Apr 29, 2020 at 03:11:22PM +0300, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The commit f47ac088c406 ("mm: memmap_init: iterate over memblock regions
> > rather that check each PFN") made early_pfn_in_nid() obsolete and since
> > CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real
> > implementation of early_pfn_in_nid() it is also not needed anymore.
> 
> I don't think you can quote a commit id for something that hasn't been
> committed to mainline yet.

Ouch, that was one of the things I intended to fix in v2...

> Then again I would have just merged this
> patch into the one that obsoleted early_pfn_in_nid anyway.

I've kept these commits separate to preserve the authorship.
I'll update the changelog so that it won't mention the commit id.

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2.5 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
  2020-04-29 12:11 ` [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES Mike Rapoport
  2020-04-29 14:17   ` Christoph Hellwig
@ 2020-04-29 16:29   ` Mike Rapoport
  1 sibling, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-04-29 16:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Baoquan He, Brian Cain, Catalin Marinas,
	David S. Miller, Geert Uytterhoeven, Greentime Hu, Greg Ungerer,
	Guan Xuetao, Guo Ren, Heiko Carstens, Helge Deller, Hoan Tran,
	James E.J. Bottomley, Jonathan Corbet, Ley Foon Tan, Mark Salter,
	Matt Turner, Max Filippov, Michael Ellerman, Michal Hocko,
	Michal Simek, Nick Hu, Paul Walmsley, Qian Cai,
	Richard Weinberger, Rich Felker, Russell King, Stafford Horne,
	Thomas Bogendoerfer, Tony Luck, Vineet Gupta, x86,
	Yoshinori Sato, linux-alpha, linux-arch, linux-arm-kernel,
	linux-c6x-dev, linux-csky, linux-doc, linux-hexagon, linux-ia64,
	linux-m68k, linux-mips, linux-mm, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, linux-snps-arc, linux-um,
	linux-xtensa, openrisc, sparclinux, uclinux-h8-devel,
	Mike Rapoport

On Wed, Apr 29, 2020 at 03:11:22PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The commit f47ac088c406 ("mm: memmap_init: iterate over memblock regions
> rather that check each PFN") made early_pfn_in_nid() obsolete and since
> CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real
> implementation of early_pfn_in_nid() it is also not needed anymore.
> 
> Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES.
> 
> Co-developed-by: Hoan Tran <Hoan@os.amperecomputing.com>
> Signed-off-by: Hoan Tran <Hoan@os.amperecomputing.com>
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---

Here's the version with the updated changelog:

From 7415d1a9b7000c6eecd9f63770592e4d4a8d2463 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt@linux.ibm.com>
Date: Sat, 11 Apr 2020 11:26:49 +0300
Subject: [PATCH v2.5] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES

The memmap_init() function was made to iterate over memblock regions and, as
a result, the early_pfn_in_nid() function became obsolete.
Since CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real
implementation of early_pfn_in_nid(), it is also not needed anymore.

Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES.

Co-developed-by: Hoan Tran <Hoan@os.amperecomputing.com>
Signed-off-by: Hoan Tran <Hoan@os.amperecomputing.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/powerpc/Kconfig |  9 ---------
 arch/sparc/Kconfig   |  9 ---------
 arch/x86/Kconfig     |  9 ---------
 mm/page_alloc.c      | 20 --------------------
 4 files changed, 47 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5f86b22b7d2c..74f316deeae1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -685,15 +685,6 @@ config ARCH_MEMORY_PROBE
 	def_bool y
 	depends on MEMORY_HOTPLUG
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-	def_bool y
-	depends on NEED_MULTIPLE_NODES
-
 config STDBINUTILS
 	bool "Using standard binutils settings"
 	depends on 44x
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 795206b7b552..0e4f3891b904 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -286,15 +286,6 @@ config NODES_SHIFT
 	  Specify the maximum number of NUMA Nodes available on the target
 	  system.  Increases memory reserved to accommodate various tables.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-	def_bool y
-	depends on NEED_MULTIPLE_NODES
-
 config ARCH_SPARSEMEM_ENABLE
 	def_bool y if SPARC64
 	select SPARSEMEM_VMEMMAP_ENABLE
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f8bf218a169c..1ec2a5e2fef6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1581,15 +1581,6 @@ config X86_64_ACPI_NUMA
 	---help---
 	  Enable ACPI SRAT based node topology detection.
 
-# Some NUMA nodes have memory ranges that span
-# other nodes.  Even though a pfn is valid and
-# between a node's start and end pfns, it may not
-# reside on that node.  See memmap_init_zone()
-# for details.
-config NODES_SPAN_OTHER_NODES
-	def_bool y
-	depends on X86_64_ACPI_NUMA
-
 config NUMA_EMU
 	bool "NUMA emulation"
 	depends on NUMA
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8d112defaead..d35ca0996a09 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1541,26 +1541,6 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 }
 #endif /* CONFIG_NEED_MULTIPLE_NODES */
 
-#ifdef CONFIG_NODES_SPAN_OTHER_NODES
-/* Only safe to use early in boot when initialisation is single-threaded */
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-	int nid;
-
-	nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
-	if (nid >= 0 && nid != node)
-		return false;
-	return true;
-}
-
-#else
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-	return true;
-}
-#endif
-
-
 void __init memblock_free_pages(struct page *page, unsigned long pfn,
 							unsigned int order)
 {
-- 
2.26.1



^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
  2020-04-29 12:11 ` [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order Mike Rapoport
@ 2020-05-03 17:41   ` Guenter Roeck
  2020-05-03 18:43     ` Guenter Roeck
  0 siblings, 1 reply; 33+ messages in thread
From: Guenter Roeck @ 2020-05-03 17:41 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Mike Rapoport, Greg Ungerer, linux-arch, linux-s390,
	linux-c6x-dev, Baoquan He, Jonathan Corbet, linux-sh,
	Michael Ellerman, Helge Deller, x86, Russell King, Ley Foon Tan,
	Yoshinori Sato, Geert Uytterhoeven, linux-arm-kernel,
	Mark Salter, Matt Turner, linux-snps-arc, uclinux-h8-devel,
	linux-xtensa, linux-alpha, linux-um, linux-m68k, Tony Luck,
	Qian Cai, Greentime Hu, Paul Walmsley, Stafford Horne,
	Guan Xuetao, Hoan Tran, Michal Simek, Thomas Bogendoerfer,
	Brian Cain, Nick Hu, linux-mm, Vineet Gupta, linux-mips,
	openrisc, Richard Weinberger, Andrew Morton, linuxppc-dev,
	David S. Miller

Hi,

On Wed, Apr 29, 2020 at 03:11:23PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
> ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array even
> if it is sorted in descending order allows using free_area_init() on such
> architectures.
> 
> Add top -> down traversal of max_zone_pfn array in free_area_init() and use
> the latter in ARC node/zone initialization.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

This patch causes my microblazeel qemu boot test in linux-next to fail.
Reverting it fixes the problem.

qemu command line:

qemu-system-microblazeel -M petalogix-ml605 -m 256 \
	-kernel arch/microblaze/boot/linux.bin -no-reboot \
	-initrd rootfs.cpio \
	-append 'panic=-1 slub_debug=FZPUA rdinit=/sbin/init console=ttyS0,115200' \
	-monitor none -serial stdio -nographic

initrd:
	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/rootfs.cpio.gz
configuration:
	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/qemu_microblazeel_ml605_defconfig

Bisect log is below.

Guenter

---
# bad: [fb9d670f57e3f6478602328bbbf71138be06ca4f] Add linux-next specific files for 20200501
# good: [6a8b55ed4056ea5559ebe4f6a4b247f627870d4c] Linux 5.7-rc3
git bisect start 'HEAD' 'v5.7-rc3'
# good: [068b80b68a670f0b17288c8a3d1ee751f35162ab] Merge remote-tracking branch 'drm/drm-next'
git bisect good 068b80b68a670f0b17288c8a3d1ee751f35162ab
# good: [46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5] Merge remote-tracking branch 'drivers-x86/for-next'
git bisect good 46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5
# good: [f39c4ad479a2f005f972a2b941b40efa6b9c9349] Merge remote-tracking branch 'rpmsg/for-next'
git bisect good f39c4ad479a2f005f972a2b941b40efa6b9c9349
# bad: [165d3ee0162fe28efc2c8180176633e33515df15] ipc-convert-ipcs_idr-to-xarray-update
git bisect bad 165d3ee0162fe28efc2c8180176633e33515df15
# good: [001f1d211ed2ed0f005838dc4390993930bbbd69] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
git bisect good 001f1d211ed2ed0f005838dc4390993930bbbd69
# bad: [aaad7401bd32f10c1d591dd886b3a9b9595c6d77] mm/vmsan: fix some typos in comment
git bisect bad aaad7401bd32f10c1d591dd886b3a9b9595c6d77
# bad: [09f9d0ab1fbed85623b283995aa7a7d78daa1611] khugepaged: allow to collapse PTE-mapped compound pages
git bisect bad 09f9d0ab1fbed85623b283995aa7a7d78daa1611
# bad: [c942fc8a3e5088407bc32d94f554bab205175f8a] mm/vmstat.c: do not show lowmem reserve protection information of empty zone
git bisect bad c942fc8a3e5088407bc32d94f554bab205175f8a
# bad: [b29358d269ace3826d8521bea842fc2984cfc11b] mm/page_alloc.c: rename free_pages_check() to check_free_page()
git bisect bad b29358d269ace3826d8521bea842fc2984cfc11b
# bad: [be0fb591a1f1df20a00c8f023f9ca4891f177b0d] mm: simplify find_min_pfn_with_active_regions()
git bisect bad be0fb591a1f1df20a00c8f023f9ca4891f177b0d
# bad: [c17422a008d36dcf3e9f51469758c5762716cb0a] mm: rename free_area_init_node() to free_area_init_memoryless_node()
git bisect bad c17422a008d36dcf3e9f51469758c5762716cb0a
# bad: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order
git bisect bad 51a2f644fd020d5f090044825c388444d11029d5
# first bad commit: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
  2020-05-03 17:41   ` Guenter Roeck
@ 2020-05-03 18:43     ` Guenter Roeck
  2020-05-04 15:39       ` Mike Rapoport
  0 siblings, 1 reply; 33+ messages in thread
From: Guenter Roeck @ 2020-05-03 18:43 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Mike Rapoport, Greg Ungerer, linux-arch, linux-s390,
	linux-c6x-dev, Baoquan He, Jonathan Corbet, linux-sh,
	Michael Ellerman, Helge Deller, x86, Russell King, Ley Foon Tan,
	Yoshinori Sato, Geert Uytterhoeven, linux-arm-kernel,
	Mark Salter, Matt Turner, linux-snps-arc, uclinux-h8-devel,
	linux-xtensa, linux-alpha, linux-um, linux-m68k, Tony Luck,
	Qian Cai, Greentime Hu, Paul Walmsley, Stafford Horne,
	Guan Xuetao, Hoan Tran, Michal Simek, Thomas Bogendoerfer,
	Brian Cain, Nick Hu, linux-mm, Vineet Gupta, linux-mips,
	openrisc, Richard Weinberger, Andrew Morton, linuxppc-dev,
	David S. Miller

On Sun, May 03, 2020 at 10:41:38AM -0700, Guenter Roeck wrote:
> Hi,
> 
> On Wed, Apr 29, 2020 at 03:11:23PM +0300, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
> > ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array even
> > if it is sorted in descending order allows using free_area_init() on such
> > architectures.
> > 
> > Add top -> down traversal of max_zone_pfn array in free_area_init() and use
> > the latter in ARC node/zone initialization.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> 
> This patch causes my microblazeel qemu boot test in linux-next to fail.
> Reverting it fixes the problem.
> 
The same problem is seen with s390 emulations.

Guenter

> qemu command line:
> 
> qemu-system-microblazeel -M petalogix-ml605 -m 256 \
> 	-kernel arch/microblaze/boot/linux.bin -no-reboot \
> 	-initrd rootfs.cpio \
> 	-append 'panic=-1 slub_debug=FZPUA rdinit=/sbin/init console=ttyS0,115200' \
> 	-monitor none -serial stdio -nographic
> 
> initrd:
> 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/rootfs.cpio.gz
> configuration:
> 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/qemu_microblazeel_ml605_defconfig
> 
> Bisect log is below.
> 
> Guenter
> 
> ---
> # bad: [fb9d670f57e3f6478602328bbbf71138be06ca4f] Add linux-next specific files for 20200501
> # good: [6a8b55ed4056ea5559ebe4f6a4b247f627870d4c] Linux 5.7-rc3
> git bisect start 'HEAD' 'v5.7-rc3'
> # good: [068b80b68a670f0b17288c8a3d1ee751f35162ab] Merge remote-tracking branch 'drm/drm-next'
> git bisect good 068b80b68a670f0b17288c8a3d1ee751f35162ab
> # good: [46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5] Merge remote-tracking branch 'drivers-x86/for-next'
> git bisect good 46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5
> # good: [f39c4ad479a2f005f972a2b941b40efa6b9c9349] Merge remote-tracking branch 'rpmsg/for-next'
> git bisect good f39c4ad479a2f005f972a2b941b40efa6b9c9349
> # bad: [165d3ee0162fe28efc2c8180176633e33515df15] ipc-convert-ipcs_idr-to-xarray-update
> git bisect bad 165d3ee0162fe28efc2c8180176633e33515df15
> # good: [001f1d211ed2ed0f005838dc4390993930bbbd69] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
> git bisect good 001f1d211ed2ed0f005838dc4390993930bbbd69
> # bad: [aaad7401bd32f10c1d591dd886b3a9b9595c6d77] mm/vmsan: fix some typos in comment
> git bisect bad aaad7401bd32f10c1d591dd886b3a9b9595c6d77
> # bad: [09f9d0ab1fbed85623b283995aa7a7d78daa1611] khugepaged: allow to collapse PTE-mapped compound pages
> git bisect bad 09f9d0ab1fbed85623b283995aa7a7d78daa1611
> # bad: [c942fc8a3e5088407bc32d94f554bab205175f8a] mm/vmstat.c: do not show lowmem reserve protection information of empty zone
> git bisect bad c942fc8a3e5088407bc32d94f554bab205175f8a
> # bad: [b29358d269ace3826d8521bea842fc2984cfc11b] mm/page_alloc.c: rename free_pages_check() to check_free_page()
> git bisect bad b29358d269ace3826d8521bea842fc2984cfc11b
> # bad: [be0fb591a1f1df20a00c8f023f9ca4891f177b0d] mm: simplify find_min_pfn_with_active_regions()
> git bisect bad be0fb591a1f1df20a00c8f023f9ca4891f177b0d
> # bad: [c17422a008d36dcf3e9f51469758c5762716cb0a] mm: rename free_area_init_node() to free_area_init_memoryless_node()
> git bisect bad c17422a008d36dcf3e9f51469758c5762716cb0a
> # bad: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order
> git bisect bad 51a2f644fd020d5f090044825c388444d11029d5
> # first bad commit: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
  2020-05-03 18:43     ` Guenter Roeck
@ 2020-05-04 15:39       ` Mike Rapoport
  2020-05-05 13:18         ` Guenter Roeck
  0 siblings, 1 reply; 33+ messages in thread
From: Mike Rapoport @ 2020-05-04 15:39 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Mike Rapoport, Greg Ungerer, linux-arch, linux-s390,
	linux-c6x-dev, Baoquan He, Jonathan Corbet, linux-sh,
	Michael Ellerman, Helge Deller, x86, Russell King, Ley Foon Tan,
	Yoshinori Sato, Geert Uytterhoeven, linux-arm-kernel,
	Mark Salter, Matt Turner, linux-snps-arc, uclinux-h8-devel,
	linux-xtensa, linux-alpha, linux-um, linux-m68k, Tony Luck,
	Qian Cai, Greentime Hu, Paul Walmsley, Stafford Horne,
	Guan Xuetao, Hoan Tran, Michal Simek, Thomas Bogendoerfer,
	Brian Cain, Nick Hu, linux-mm, Vineet Gupta, linux-mips,
	openrisc, Richard Weinberger, Andrew Morton, linuxppc-dev,
	David S. Miller

On Sun, May 03, 2020 at 11:43:00AM -0700, Guenter Roeck wrote:
> On Sun, May 03, 2020 at 10:41:38AM -0700, Guenter Roeck wrote:
> > Hi,
> > 
> > On Wed, Apr 29, 2020 at 03:11:23PM +0300, Mike Rapoport wrote:
> > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > 
> > > Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
> > > ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array even
> > > if it is sorted in descending order allows using free_area_init() on such
> > > architectures.
> > > 
> > > Add top -> down traversal of max_zone_pfn array in free_area_init() and use
> > > the latter in ARC node/zone initialization.
> > > 
> > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > This patch causes my microblazeel qemu boot test in linux-next to fail.
> > Reverting it fixes the problem.
> > 
> The same problem is seen with s390 emulations.

Yeah, this patch breaks some others as well :(

My assumption that max_zone_pfn defines the architectural limit for the
maximal PFN that can belong to a zone was over-optimistic. Several arches
actually do that, but others do

	max_zone_pfn[ZONE_DMA] = MAX_DMA_PFN;
	max_zone_pfn[ZONE_NORMAL] = max_pfn;

where MAX_DMA_PFN is a build-time constraint and max_pfn is the run-time
limit for the current system.

So, when max_pfn is lower than MAX_DMA_PFN, free_area_init() will
consider max_zone_pfn as descending and will wrongly calculate zone
extents.
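
To make the failure concrete, here is a minimal user-space sketch of the
ordering heuristic from the v2 patch. The zone indices, the MAX_DMA_PFN value
and the helper names looks_descending() and misdetects() are assumptions made
for illustration; only the comparison itself mirrors the patch:

```c
#include <stdbool.h>

/* Hypothetical zone layout and limit; not taken from a real architecture. */
enum { ZONE_DMA, ZONE_NORMAL, MAX_NR_ZONES };
#define MAX_DMA_PFN 0x80000UL	/* build-time constant: 2G with 4K pages */

/* Mirrors the v2 check: compare only the first two array entries. */
static bool looks_descending(const unsigned long *max_zone_pfn)
{
	return MAX_NR_ZONES > 1 && max_zone_pfn[0] > max_zone_pfn[1];
}

/* Fill the array as in the snippet above and run the heuristic on it. */
static bool misdetects(unsigned long max_pfn)
{
	unsigned long max_zone_pfn[MAX_NR_ZONES];

	max_zone_pfn[ZONE_DMA] = MAX_DMA_PFN;
	max_zone_pfn[ZONE_NORMAL] = max_pfn;	/* run-time limit */

	return looks_descending(max_zone_pfn);
}
```

With 256M of RAM (max_pfn = 0x10000 with 4K pages) misdetects() returns true,
although the zones themselves are laid out in ascending order.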

That said, instead of trying to create a generic way to special-case
ARC, I suggest simply using the patch below instead.

diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
index 41eb9be1653c..386959bac3d2 100644
--- a/arch/arc/mm/init.c
+++ b/arch/arc/mm/init.c
@@ -77,6 +77,11 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
 		base, TO_MB(size), !in_use ? "Not used":"");
 }
 
+bool arch_has_descending_max_zone_pfns(void)
+{
+	return true;
+}
+
 /*
  * First memory setup routine called from setup_arch()
  * 1. setup swapper's mm @init_mm
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b990e9734474..114f0e027144 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7307,6 +7307,15 @@ static void check_for_memory(pg_data_t *pgdat, int nid)
 	}
 }
 
+/*
+ * Some architectures, e.g. ARC, may have ZONE_HIGHMEM below ZONE_NORMAL. For
+ * such cases we allow max_zone_pfn sorted in the descending order
+ */
+bool __weak arch_has_descending_max_zone_pfns(void)
+{
+	return false;
+}
+
 /**
  * free_area_init - Initialise all pg_data_t and zone data
  * @max_zone_pfn: an array of max PFNs for each zone
@@ -7324,7 +7333,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 {
 	unsigned long start_pfn, end_pfn;
 	int i, nid, zone;
-	bool descending = false;
+	bool descending;
 
 	/* Record where the zone boundaries are */
 	memset(arch_zone_lowest_possible_pfn, 0,
@@ -7333,14 +7342,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 				sizeof(arch_zone_highest_possible_pfn));
 
 	start_pfn = find_min_pfn_with_active_regions();
-
-	/*
-	 * Some architecturs, e.g. ARC may have ZONE_HIGHMEM below
-	 * ZONE_NORMAL. For such cases we allow max_zone_pfn sorted in the
-	 * descending order
-	 */
-	if (MAX_NR_ZONES > 1 && max_zone_pfn[0] > max_zone_pfn[1])
-		descending = true;
+	descending = arch_has_descending_max_zone_pfns();
 
 	for (i = 0; i < MAX_NR_ZONES; i++) {
 		if (descending)

> Guenter
> 
> > qemu command line:
> > 
> > qemu-system-microblazeel -M petalogix-ml605 -m 256 \
> > 	-kernel arch/microblaze/boot/linux.bin -no-reboot \
> > 	-initrd rootfs.cpio \
> > 	-append 'panic=-1 slub_debug=FZPUA rdinit=/sbin/init console=ttyS0,115200' \
> > 	-monitor none -serial stdio -nographic
> > 
> > initrd:
> > 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/rootfs.cpio.gz
> > configuration:
> > 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/qemu_microblazeel_ml605_defconfig
> > 
> > Bisect log is below.
> > 
> > Guenter
> > 
> > ---
> > # bad: [fb9d670f57e3f6478602328bbbf71138be06ca4f] Add linux-next specific files for 20200501
> > # good: [6a8b55ed4056ea5559ebe4f6a4b247f627870d4c] Linux 5.7-rc3
> > git bisect start 'HEAD' 'v5.7-rc3'
> > # good: [068b80b68a670f0b17288c8a3d1ee751f35162ab] Merge remote-tracking branch 'drm/drm-next'
> > git bisect good 068b80b68a670f0b17288c8a3d1ee751f35162ab
> > # good: [46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5] Merge remote-tracking branch 'drivers-x86/for-next'
> > git bisect good 46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5
> > # good: [f39c4ad479a2f005f972a2b941b40efa6b9c9349] Merge remote-tracking branch 'rpmsg/for-next'
> > git bisect good f39c4ad479a2f005f972a2b941b40efa6b9c9349
> > # bad: [165d3ee0162fe28efc2c8180176633e33515df15] ipc-convert-ipcs_idr-to-xarray-update
> > git bisect bad 165d3ee0162fe28efc2c8180176633e33515df15
> > # good: [001f1d211ed2ed0f005838dc4390993930bbbd69] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
> > git bisect good 001f1d211ed2ed0f005838dc4390993930bbbd69
> > # bad: [aaad7401bd32f10c1d591dd886b3a9b9595c6d77] mm/vmsan: fix some typos in comment
> > git bisect bad aaad7401bd32f10c1d591dd886b3a9b9595c6d77
> > # bad: [09f9d0ab1fbed85623b283995aa7a7d78daa1611] khugepaged: allow to collapse PTE-mapped compound pages
> > git bisect bad 09f9d0ab1fbed85623b283995aa7a7d78daa1611
> > # bad: [c942fc8a3e5088407bc32d94f554bab205175f8a] mm/vmstat.c: do not show lowmem reserve protection information of empty zone
> > git bisect bad c942fc8a3e5088407bc32d94f554bab205175f8a
> > # bad: [b29358d269ace3826d8521bea842fc2984cfc11b] mm/page_alloc.c: rename free_pages_check() to check_free_page()
> > git bisect bad b29358d269ace3826d8521bea842fc2984cfc11b
> > # bad: [be0fb591a1f1df20a00c8f023f9ca4891f177b0d] mm: simplify find_min_pfn_with_active_regions()
> > git bisect bad be0fb591a1f1df20a00c8f023f9ca4891f177b0d
> > # bad: [c17422a008d36dcf3e9f51469758c5762716cb0a] mm: rename free_area_init_node() to free_area_init_memoryless_node()
> > git bisect bad c17422a008d36dcf3e9f51469758c5762716cb0a
> > # bad: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order
> > git bisect bad 51a2f644fd020d5f090044825c388444d11029d5
> > # first bad commit: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order

-- 
Sincerely yours,
Mike.


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
  2020-05-04 15:39       ` Mike Rapoport
@ 2020-05-05 13:18         ` Guenter Roeck
  2020-05-05 13:45           ` Mike Rapoport
  2020-05-05 17:27           ` Vineet Gupta
  0 siblings, 2 replies; 33+ messages in thread
From: Guenter Roeck @ 2020-05-05 13:18 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Mike Rapoport, Greg Ungerer, linux-arch, linux-s390,
	linux-c6x-dev, Baoquan He, Jonathan Corbet, linux-sh,
	Michael Ellerman, Helge Deller, x86, Russell King, Ley Foon Tan,
	Yoshinori Sato, Geert Uytterhoeven, linux-arm-kernel,
	Mark Salter, Matt Turner, linux-snps-arc, uclinux-h8-devel,
	linux-xtensa, linux-alpha, linux-um, linux-m68k, Tony Luck,
	Qian Cai, Greentime Hu, Paul Walmsley, Stafford Horne,
	Guan Xuetao, Hoan Tran, Michal Simek, Thomas Bogendoerfer,
	Brian Cain, Nick Hu, linux-mm, Vineet Gupta, linux-mips,
	openrisc, Richard Weinberger, Andrew Morton, linuxppc-dev,
	David S. Miller

On 5/4/20 8:39 AM, Mike Rapoport wrote:
> On Sun, May 03, 2020 at 11:43:00AM -0700, Guenter Roeck wrote:
>> On Sun, May 03, 2020 at 10:41:38AM -0700, Guenter Roeck wrote:
>>> Hi,
>>>
>>> On Wed, Apr 29, 2020 at 03:11:23PM +0300, Mike Rapoport wrote:
>>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>>
>>>> Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
>>>> ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array
>>>> even if it is sorted in descending order allows using free_area_init()
>>>> on such architectures.
>>>>
>>>> Add top -> down traversal of max_zone_pfn array in free_area_init() and use
>>>> the latter in ARC node/zone initialization.
>>>>
>>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>> This patch causes my microblazeel qemu boot test in linux-next to fail.
>>> Reverting it fixes the problem.
>>>
>> The same problem is seen with s390 emulations.
> 
> Yeah, this patch breaks some others as well :(
> 
> My assumption that max_zone_pfn defines architectural limit for maximal
> PFN that can belong to a zone was over-optimistic. Several arches
> actually do that, but others do
> 
> 	max_zone_pfn[ZONE_DMA] = MAX_DMA_PFN;
> 	max_zone_pfn[ZONE_NORMAL] = max_pfn;
> 
> where MAX_DMA_PFN is a build-time constraint and max_pfn is the run-time
> limit for the current system.
> 
> So, when max_pfn is lower than MAX_DMA_PFN, free_area_init() will
> consider max_zone_pfn as descending and will wrongly calculate zone
> extents.
> 
> That said, instead of trying to create a generic way to special case
> ARC, I suggest to simply use the below patch instead.
> 

As a reminder, I reported the problem against s390 and microblazeel
(interestingly enough, microblaze (big endian) works), not against arc.

Guenter

> diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
> index 41eb9be1653c..386959bac3d2 100644
> --- a/arch/arc/mm/init.c
> +++ b/arch/arc/mm/init.c
> @@ -77,6 +77,11 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
>  		base, TO_MB(size), !in_use ? "Not used":"");
>  }
>  
> +bool arch_has_descending_max_zone_pfns(void)
> +{
> +	return true;
> +}
> +
>  /*
>   * First memory setup routine called from setup_arch()
>   * 1. setup swapper's mm @init_mm
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b990e9734474..114f0e027144 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7307,6 +7307,15 @@ static void check_for_memory(pg_data_t *pgdat, int nid)
>  	}
>  }
>  
> +/*
> + * Some architecturs, e.g. ARC may have ZONE_HIGHMEM below ZONE_NORMAL. For
> + * such cases we allow max_zone_pfn sorted in the descending order
> + */
> +bool __weak arch_has_descending_max_zone_pfns(void)
> +{
> +	return false;
> +}
> +
>  /**
>   * free_area_init - Initialise all pg_data_t and zone data
>   * @max_zone_pfn: an array of max PFNs for each zone
> @@ -7324,7 +7333,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
>  {
>  	unsigned long start_pfn, end_pfn;
>  	int i, nid, zone;
> -	bool descending = false;
> +	bool descending;
>  
>  	/* Record where the zone boundaries are */
>  	memset(arch_zone_lowest_possible_pfn, 0,
> @@ -7333,14 +7342,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
>  				sizeof(arch_zone_highest_possible_pfn));
>  
>  	start_pfn = find_min_pfn_with_active_regions();
> -
> -	/*
> -	 * Some architecturs, e.g. ARC may have ZONE_HIGHMEM below
> -	 * ZONE_NORMAL. For such cases we allow max_zone_pfn sorted in the
> -	 * descending order
> -	 */
> -	if (MAX_NR_ZONES > 1 && max_zone_pfn[0] > max_zone_pfn[1])
> -		descending = true;
> +	descending = arch_has_descending_max_zone_pfns();
>  
>  	for (i = 0; i < MAX_NR_ZONES; i++) {
>  		if (descending)
> 
>> Guenter
>>
>>> qemu command line:
>>>
>>> qemu-system-microblazeel -M petalogix-ml605 -m 256 \
>>> 	-kernel arch/microblaze/boot/linux.bin -no-reboot \
>>> 	-initrd rootfs.cpio \
>>> 	-append 'panic=-1 slub_debug=FZPUA rdinit=/sbin/init console=ttyS0,115200' \
>>> 	-monitor none -serial stdio -nographic
>>>
>>> initrd:
>>> 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/rootfs.cpio.gz
>>> configuration:
>>> 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/qemu_microblazeel_ml605_defconfig
>>>
>>> Bisect log is below.
>>>
>>> Guenter
>>>
>>> ---
>>> # bad: [fb9d670f57e3f6478602328bbbf71138be06ca4f] Add linux-next specific files for 20200501
>>> # good: [6a8b55ed4056ea5559ebe4f6a4b247f627870d4c] Linux 5.7-rc3
>>> git bisect start 'HEAD' 'v5.7-rc3'
>>> # good: [068b80b68a670f0b17288c8a3d1ee751f35162ab] Merge remote-tracking branch 'drm/drm-next'
>>> git bisect good 068b80b68a670f0b17288c8a3d1ee751f35162ab
>>> # good: [46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5] Merge remote-tracking branch 'drivers-x86/for-next'
>>> git bisect good 46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5
>>> # good: [f39c4ad479a2f005f972a2b941b40efa6b9c9349] Merge remote-tracking branch 'rpmsg/for-next'
>>> git bisect good f39c4ad479a2f005f972a2b941b40efa6b9c9349
>>> # bad: [165d3ee0162fe28efc2c8180176633e33515df15] ipc-convert-ipcs_idr-to-xarray-update
>>> git bisect bad 165d3ee0162fe28efc2c8180176633e33515df15
>>> # good: [001f1d211ed2ed0f005838dc4390993930bbbd69] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
>>> git bisect good 001f1d211ed2ed0f005838dc4390993930bbbd69
>>> # bad: [aaad7401bd32f10c1d591dd886b3a9b9595c6d77] mm/vmsan: fix some typos in comment
>>> git bisect bad aaad7401bd32f10c1d591dd886b3a9b9595c6d77
>>> # bad: [09f9d0ab1fbed85623b283995aa7a7d78daa1611] khugepaged: allow to collapse PTE-mapped compound pages
>>> git bisect bad 09f9d0ab1fbed85623b283995aa7a7d78daa1611
>>> # bad: [c942fc8a3e5088407bc32d94f554bab205175f8a] mm/vmstat.c: do not show lowmem reserve protection information of empty zone
>>> git bisect bad c942fc8a3e5088407bc32d94f554bab205175f8a
>>> # bad: [b29358d269ace3826d8521bea842fc2984cfc11b] mm/page_alloc.c: rename free_pages_check() to check_free_page()
>>> git bisect bad b29358d269ace3826d8521bea842fc2984cfc11b
>>> # bad: [be0fb591a1f1df20a00c8f023f9ca4891f177b0d] mm: simplify find_min_pfn_with_active_regions()
>>> git bisect bad be0fb591a1f1df20a00c8f023f9ca4891f177b0d
>>> # bad: [c17422a008d36dcf3e9f51469758c5762716cb0a] mm: rename free_area_init_node() to free_area_init_memoryless_node()
>>> git bisect bad c17422a008d36dcf3e9f51469758c5762716cb0a
>>> # bad: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order
>>> git bisect bad 51a2f644fd020d5f090044825c388444d11029d5
>>> # first bad commit: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order
> 




* Re: [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
  2020-05-05 13:18         ` Guenter Roeck
@ 2020-05-05 13:45           ` Mike Rapoport
  2020-05-05 17:27           ` Vineet Gupta
  1 sibling, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2020-05-05 13:45 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mike Rapoport, linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Greg Ungerer, linux-arch, linux-s390, linux-c6x-dev, Baoquan He,
	Jonathan Corbet, linux-sh, Michael Ellerman, Helge Deller, x86,
	Russell King, Ley Foon Tan, Yoshinori Sato, Geert Uytterhoeven,
	linux-arm-kernel, Mark Salter, Matt Turner, linux-snps-arc,
	uclinux-h8-devel, linux-xtensa, linux-alpha, linux-um,
	linux-m68k, Tony Luck, Qian Cai, Greentime Hu, Paul Walmsley,
	Stafford Horne, Guan Xuetao, Hoan Tran, Michal Simek,
	Thomas Bogendoerfer, Brian Cain, Nick Hu, linux-mm, Vineet Gupta,
	linux-mips, openrisc, Richard Weinberger, Andrew Morton,
	linuxppc-dev, David S. Miller

On Tue, May 05, 2020 at 06:18:11AM -0700, Guenter Roeck wrote:
> On 5/4/20 8:39 AM, Mike Rapoport wrote:
> > On Sun, May 03, 2020 at 11:43:00AM -0700, Guenter Roeck wrote:
> >> On Sun, May 03, 2020 at 10:41:38AM -0700, Guenter Roeck wrote:
> >>> Hi,
> >>>
> >>> On Wed, Apr 29, 2020 at 03:11:23PM +0300, Mike Rapoport wrote:
> >>>> From: Mike Rapoport <rppt@linux.ibm.com>
> >>>>
> >>>> Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
> >>>> ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array
> >>>> even if it is sorted in descending order allows using free_area_init()
> >>>> on such architectures.
> >>>>
> >>>> Add top -> down traversal of max_zone_pfn array in free_area_init() and use
> >>>> the latter in ARC node/zone initialization.
> >>>>
> >>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> >>>
> >>> This patch causes my microblazeel qemu boot test in linux-next to fail.
> >>> Reverting it fixes the problem.
> >>>
> >> The same problem is seen with s390 emulations.
> > 
> > Yeah, this patch breaks some others as well :(
> > 
> > My assumption that max_zone_pfn defines architectural limit for maximal
> > PFN that can belong to a zone was over-optimistic. Several arches
> > actually do that, but others do
> > 
> > 	max_zone_pfn[ZONE_DMA] = MAX_DMA_PFN;
> > 	max_zone_pfn[ZONE_NORMAL] = max_pfn;
> > 
> > where MAX_DMA_PFN is a build-time constraint and max_pfn is the run-time
> > limit for the current system.
> > 
> > So, when max_pfn is lower than MAX_DMA_PFN, free_area_init() will
> > consider max_zone_pfn as descending and will wrongly calculate zone
> > extents.
> > 
> > That said, instead of trying to create a generic way to special case
> > ARC, I suggest to simply use the below patch instead.
> > 
> 
> As a reminder, I reported the problem against s390 and microblazeel
> (interestingly enough, microblaze (big endian) works), not against arc.

With this fix, microblazeel and s390 both work for me, and Christian has
also reported that s390 is fixed.

microblaze (big endian) works because its defconfig does not enable HIGHMEM,
while the little endian defconfig does.

ARC is mentioned because it is the only arch that may have ZONE_HIGHMEM
below ZONE_NORMAL, and this patch was required to consolidate the
free_area_init* variants.
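
To make the failure mode concrete, here is a minimal user-space sketch (not
kernel code) of why guessing the order from the array contents misfires.
The zone indices, the MAX_DMA_PFN value and the helper names
guess_descending() and fill_max_zone_pfn() are made up for illustration;
arch_has_descending_max_zone_pfns() below is only a stand-in for the __weak
default from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zone indices, for illustration only. */
enum { ZONE_DMA, ZONE_NORMAL, NR_ZONES };

/* Assumed build-time limit: 2G worth of 4K pages. Not taken from any arch. */
#define MAX_DMA_PFN 0x80000UL

/* The removed heuristic: infer the ordering from the first two entries. */
static bool guess_descending(const unsigned long *max_zone_pfn, size_t nr_zones)
{
	return nr_zones > 1 && max_zone_pfn[0] > max_zone_pfn[1];
}

/* Stand-in for the __weak default; only an arch like ARC would override it. */
static bool arch_has_descending_max_zone_pfns(void)
{
	return false;
}

/* Fill the array the way the arches quoted above do. */
static void fill_max_zone_pfn(unsigned long *max_zone_pfn, unsigned long max_pfn)
{
	assert(max_zone_pfn != NULL);

	max_zone_pfn[ZONE_DMA] = MAX_DMA_PFN;	/* build-time constant */
	max_zone_pfn[ZONE_NORMAL] = max_pfn;	/* run-time limit */
}
```

With 256M of memory (max_pfn = 0x10000 for 4K pages) the array becomes
{ 0x80000, 0x10000 }, so guess_descending() returns true although the arch
never intended a descending layout. The explicit hook returns false no matter
how much memory is present.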

> Guenter
> 
> > diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
> > index 41eb9be1653c..386959bac3d2 100644
> > --- a/arch/arc/mm/init.c
> > +++ b/arch/arc/mm/init.c
> > @@ -77,6 +77,11 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
> >  		base, TO_MB(size), !in_use ? "Not used":"");
> >  }
> >  
> > +bool arch_has_descending_max_zone_pfns(void)
> > +{
> > +	return true;
> > +}
> > +
> >  /*
> >   * First memory setup routine called from setup_arch()
> >   * 1. setup swapper's mm @init_mm
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index b990e9734474..114f0e027144 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7307,6 +7307,15 @@ static void check_for_memory(pg_data_t *pgdat, int nid)
> >  	}
> >  }
> >  
> > +/*
> > + * Some architecturs, e.g. ARC may have ZONE_HIGHMEM below ZONE_NORMAL. For
> > + * such cases we allow max_zone_pfn sorted in the descending order
> > + */
> > +bool __weak arch_has_descending_max_zone_pfns(void)
> > +{
> > +	return false;
> > +}
> > +
> >  /**
> >   * free_area_init - Initialise all pg_data_t and zone data
> >   * @max_zone_pfn: an array of max PFNs for each zone
> > @@ -7324,7 +7333,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> >  {
> >  	unsigned long start_pfn, end_pfn;
> >  	int i, nid, zone;
> > -	bool descending = false;
> > +	bool descending;
> >  
> >  	/* Record where the zone boundaries are */
> >  	memset(arch_zone_lowest_possible_pfn, 0,
> > @@ -7333,14 +7342,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> >  				sizeof(arch_zone_highest_possible_pfn));
> >  
> >  	start_pfn = find_min_pfn_with_active_regions();
> > -
> > -	/*
> > -	 * Some architecturs, e.g. ARC may have ZONE_HIGHMEM below
> > -	 * ZONE_NORMAL. For such cases we allow max_zone_pfn sorted in the
> > -	 * descending order
> > -	 */
> > -	if (MAX_NR_ZONES > 1 && max_zone_pfn[0] > max_zone_pfn[1])
> > -		descending = true;
> > +	descending = arch_has_descending_max_zone_pfns();
> >  
> >  	for (i = 0; i < MAX_NR_ZONES; i++) {
> >  		if (descending)
> > 
> >> Guenter
> >>
> >>> qemu command line:
> >>>
> >>> qemu-system-microblazeel -M petalogix-ml605 -m 256 \
> >>> 	-kernel arch/microblaze/boot/linux.bin -no-reboot \
> >>> 	-initrd rootfs.cpio \
> >>> 	-append 'panic=-1 slub_debug=FZPUA rdinit=/sbin/init console=ttyS0,115200' \
> >>> 	-monitor none -serial stdio -nographic
> >>>
> >>> initrd:
> >>> 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/rootfs.cpio.gz
> >>> configuration:
> >>> 	https://github.com/groeck/linux-build-test/blob/master/rootfs/microblazeel/qemu_microblazeel_ml605_defconfig
> >>>
> >>> Bisect log is below.
> >>>
> >>> Guenter
> >>>
> >>> ---
> >>> # bad: [fb9d670f57e3f6478602328bbbf71138be06ca4f] Add linux-next specific files for 20200501
> >>> # good: [6a8b55ed4056ea5559ebe4f6a4b247f627870d4c] Linux 5.7-rc3
> >>> git bisect start 'HEAD' 'v5.7-rc3'
> >>> # good: [068b80b68a670f0b17288c8a3d1ee751f35162ab] Merge remote-tracking branch 'drm/drm-next'
> >>> git bisect good 068b80b68a670f0b17288c8a3d1ee751f35162ab
> >>> # good: [46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5] Merge remote-tracking branch 'drivers-x86/for-next'
> >>> git bisect good 46c70fc6a3ac35cd72ddad248dcbe4eee716d2a5
> >>> # good: [f39c4ad479a2f005f972a2b941b40efa6b9c9349] Merge remote-tracking branch 'rpmsg/for-next'
> >>> git bisect good f39c4ad479a2f005f972a2b941b40efa6b9c9349
> >>> # bad: [165d3ee0162fe28efc2c8180176633e33515df15] ipc-convert-ipcs_idr-to-xarray-update
> >>> git bisect bad 165d3ee0162fe28efc2c8180176633e33515df15
> >>> # good: [001f1d211ed2ed0f005838dc4390993930bbbd69] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
> >>> git bisect good 001f1d211ed2ed0f005838dc4390993930bbbd69
> >>> # bad: [aaad7401bd32f10c1d591dd886b3a9b9595c6d77] mm/vmsan: fix some typos in comment
> >>> git bisect bad aaad7401bd32f10c1d591dd886b3a9b9595c6d77
> >>> # bad: [09f9d0ab1fbed85623b283995aa7a7d78daa1611] khugepaged: allow to collapse PTE-mapped compound pages
> >>> git bisect bad 09f9d0ab1fbed85623b283995aa7a7d78daa1611
> >>> # bad: [c942fc8a3e5088407bc32d94f554bab205175f8a] mm/vmstat.c: do not show lowmem reserve protection information of empty zone
> >>> git bisect bad c942fc8a3e5088407bc32d94f554bab205175f8a
> >>> # bad: [b29358d269ace3826d8521bea842fc2984cfc11b] mm/page_alloc.c: rename free_pages_check() to check_free_page()
> >>> git bisect bad b29358d269ace3826d8521bea842fc2984cfc11b
> >>> # bad: [be0fb591a1f1df20a00c8f023f9ca4891f177b0d] mm: simplify find_min_pfn_with_active_regions()
> >>> git bisect bad be0fb591a1f1df20a00c8f023f9ca4891f177b0d
> >>> # bad: [c17422a008d36dcf3e9f51469758c5762716cb0a] mm: rename free_area_init_node() to free_area_init_memoryless_node()
> >>> git bisect bad c17422a008d36dcf3e9f51469758c5762716cb0a
> >>> # bad: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order
> >>> git bisect bad 51a2f644fd020d5f090044825c388444d11029d5
> >>> # first bad commit: [51a2f644fd020d5f090044825c388444d11029d5] mm: free_area_init: allow defining max_zone_pfn in descending order
> > 
> 

-- 
Sincerely yours,
Mike.



* Re: [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
  2020-05-05 13:18         ` Guenter Roeck
  2020-05-05 13:45           ` Mike Rapoport
@ 2020-05-05 17:27           ` Vineet Gupta
  1 sibling, 0 replies; 33+ messages in thread
From: Vineet Gupta @ 2020-05-05 17:27 UTC (permalink / raw)
  To: Guenter Roeck, Mike Rapoport
  Cc: linux-kernel, Rich Felker, linux-ia64, linux-doc,
	Catalin Marinas, Heiko Carstens, Michal Hocko,
	James E.J. Bottomley, Max Filippov, Guo Ren, linux-csky,
	linux-parisc, sparclinux, linux-hexagon, linux-riscv,
	Mike Rapoport, Greg Ungerer, linux-arch, linux-s390,
	linux-c6x-dev, Baoquan He, Jonathan Corbet, linux-sh,
	Michael Ellerman, Helge Deller, x86, Russell King, Ley Foon Tan,
	Yoshinori Sato, Geert Uytterhoeven, linux-arm-kernel,
	Mark Salter, Matt Turner, linux-snps-arc, uclinux-h8-devel,
	linux-xtensa, linux-alpha, linux-um, linux-m68k, Tony Luck,
	Qian Cai, Greentime Hu, Paul Walmsley, Stafford Horne,
	Guan Xuetao, Hoan Tran, Michal Simek, Thomas Bogendoerfer,
	Brian Cain, Nick Hu, linux-mm, linux-mips, openrisc,
	Richard Weinberger, Andrew Morton, linuxppc-dev, David S. Miller

On 5/5/20 6:18 AM, Guenter Roeck wrote:
> On 5/4/20 8:39 AM, Mike Rapoport wrote:
>> On Sun, May 03, 2020 at 11:43:00AM -0700, Guenter Roeck wrote:
>>> On Sun, May 03, 2020 at 10:41:38AM -0700, Guenter Roeck wrote:
>>>> Hi,
>>>>
>>>> On Wed, Apr 29, 2020 at 03:11:23PM +0300, Mike Rapoport wrote:
>>>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>>>
>>>>> Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
>>>>> ZONE_NORMAL. Allowing free_area_init() to parse the max_zone_pfn array
>>>>> even if it is sorted in descending order allows using free_area_init()
>>>>> on such architectures.
>>>>>
>>>>> Add top -> down traversal of max_zone_pfn array in free_area_init() and use
>>>>> the latter in ARC node/zone initialization.
>>>>>
>>>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>>> This patch causes my microblazeel qemu boot test in linux-next to fail.
>>>> Reverting it fixes the problem.
>>>>
>>> The same problem is seen with s390 emulations.
>> Yeah, this patch breaks some others as well :(
>>
>> My assumption that max_zone_pfn defines architectural limit for maximal
>> PFN that can belong to a zone was over-optimistic. Several arches
>> actually do that, but others do
>>
>> 	max_zone_pfn[ZONE_DMA] = MAX_DMA_PFN;
>> 	max_zone_pfn[ZONE_NORMAL] = max_pfn;
>>
>> where MAX_DMA_PFN is a build-time constraint and max_pfn is the run-time
>> limit for the current system.
>>
>> So, when max_pfn is lower than MAX_DMA_PFN, free_area_init() will
>> consider max_zone_pfn as descending and will wrongly calculate zone
>> extents.
>>
>> That said, instead of trying to create a generic way to special case
>> ARC, I suggest to simply use the below patch instead.
>>
> As a reminder, I reported the problem against s390 and microblazeel
> (interestingly enough, microblaze (big endian) works), not against arc.

Understood; my comment was meant to point at any other problems that may
show up in the future.

Thx,
-Vineet


* Re: [PATCH v2 03/20] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option
  2020-04-29 12:11 ` [PATCH v2 03/20] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option Mike Rapoport
@ 2020-05-26 17:11   ` Catalin Marinas
  0 siblings, 0 replies; 33+ messages in thread
From: Catalin Marinas @ 2020-05-26 17:11 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Andrew Morton, Baoquan He, Brian Cain,
	David S. Miller, Geert Uytterhoeven, Greentime Hu, Greg Ungerer,
	Guan Xuetao, Guo Ren, Heiko Carstens, Helge Deller, Hoan Tran,
	James E.J. Bottomley, Jonathan Corbet, Ley Foon Tan, Mark Salter,
	Matt Turner, Max Filippov, Michael Ellerman, Michal Hocko,
	Michal Simek, Nick Hu, Paul Walmsley, Qian Cai,
	Richard Weinberger, Rich Felker, Russell King, Stafford Horne,
	Thomas Bogendoerfer, Tony Luck, Vineet Gupta, x86,
	Yoshinori Sato, linux-alpha, linux-arch, linux-arm-kernel,
	linux-c6x-dev, linux-csky, linux-doc, linux-hexagon, linux-ia64,
	linux-m68k, linux-mips, linux-mm, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, linux-snps-arc, linux-um,
	linux-xtensa, openrisc, sparclinux, uclinux-h8-devel,
	Mike Rapoport

On Wed, Apr 29, 2020 at 03:11:09PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The CONFIG_HAVE_MEMBLOCK_NODE_MAP is used to differentiate initialization
> of nodes and zones structures between the systems that have region to node
> mapping in memblock and those that don't.
> 
> Currently all the NUMA architectures enable this option and for the
> non-NUMA systems we can presume that all the memory belongs to node 0 and
> therefore the compile time configuration option is not required.
> 
> The remaining few architectures that use DISCONTIGMEM without NUMA are
> easily updated to use memblock_add_node() instead of memblock_add() and
> thus have proper correspondence of memblock regions to NUMA nodes.
> 
> Still, free_area_init_node() must have a backward compatible version
> because its semantics with and without CONFIG_HAVE_MEMBLOCK_NODE_MAP is
> different. Once all the architectures will use the new semantics, the
> entire compatibility layer can be dropped.
> 
> To avoid addition of extra run time memory to store node id for
> architectures that keep memblock but have only a single node, the node id
> field of the memblock_region is guarded by CONFIG_NEED_MULTIPLE_NODES and
> the corresponding accessors presume that in those cases it is always 0.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  .../vm/numa-memblock/arch-support.txt         |  34 ------
>  arch/alpha/mm/numa.c                          |   4 +-
>  arch/arm64/Kconfig                            |   1 -

For arm64:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>



* Re: [PATCH v2 05/20] mm: use free_area_init() instead of free_area_init_nodes()
  2020-04-29 12:11 ` [PATCH v2 05/20] mm: use free_area_init() instead of free_area_init_nodes() Mike Rapoport
@ 2020-05-26 17:13   ` Catalin Marinas
  0 siblings, 0 replies; 33+ messages in thread
From: Catalin Marinas @ 2020-05-26 17:13 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Andrew Morton, Baoquan He, Brian Cain,
	David S. Miller, Geert Uytterhoeven, Greentime Hu, Greg Ungerer,
	Guan Xuetao, Guo Ren, Heiko Carstens, Helge Deller, Hoan Tran,
	James E.J. Bottomley, Jonathan Corbet, Ley Foon Tan, Mark Salter,
	Matt Turner, Max Filippov, Michael Ellerman, Michal Hocko,
	Michal Simek, Nick Hu, Paul Walmsley, Qian Cai,
	Richard Weinberger, Rich Felker, Russell King, Stafford Horne,
	Thomas Bogendoerfer, Tony Luck, Vineet Gupta, x86,
	Yoshinori Sato, linux-alpha, linux-arch, linux-arm-kernel,
	linux-c6x-dev, linux-csky, linux-doc, linux-hexagon, linux-ia64,
	linux-m68k, linux-mips, linux-mm, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, linux-snps-arc, linux-um,
	linux-xtensa, openrisc, sparclinux, uclinux-h8-devel,
	Mike Rapoport

On Wed, Apr 29, 2020 at 03:11:11PM +0300, Mike Rapoport wrote:
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index e42727e3568e..a650adb358ee 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -206,7 +206,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #endif
>  	max_zone_pfns[ZONE_NORMAL] = max;
>  
> -	free_area_init_nodes(max_zone_pfns);
> +	free_area_init(max_zone_pfns);
>  }

Acked-by: Catalin Marinas <catalin.marinas@arm.com>



* Re: [PATCH v2 08/20] arm64: simplify detection of memory zone boundaries for UMA configs
  2020-04-29 12:11 ` [PATCH v2 08/20] arm64: simplify detection of memory zone boundaries for UMA configs Mike Rapoport
@ 2020-05-26 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 33+ messages in thread
From: Catalin Marinas @ 2020-05-26 17:15 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Andrew Morton, Baoquan He, Brian Cain,
	David S. Miller, Geert Uytterhoeven, Greentime Hu, Greg Ungerer,
	Guan Xuetao, Guo Ren, Heiko Carstens, Helge Deller, Hoan Tran,
	James E.J. Bottomley, Jonathan Corbet, Ley Foon Tan, Mark Salter,
	Matt Turner, Max Filippov, Michael Ellerman, Michal Hocko,
	Michal Simek, Nick Hu, Paul Walmsley, Qian Cai,
	Richard Weinberger, Rich Felker, Russell King, Stafford Horne,
	Thomas Bogendoerfer, Tony Luck, Vineet Gupta, x86,
	Yoshinori Sato, linux-alpha, linux-arch, linux-arm-kernel,
	linux-c6x-dev, linux-csky, linux-doc, linux-hexagon, linux-ia64,
	linux-m68k, linux-mips, linux-mm, linux-parisc, linuxppc-dev,
	linux-riscv, linux-s390, linux-sh, linux-snps-arc, linux-um,
	linux-xtensa, openrisc, sparclinux, uclinux-h8-devel,
	Mike Rapoport

On Wed, Apr 29, 2020 at 03:11:14PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The free_area_init() function only requires the definition of maximal PFN
> for each of the supported zones rather than calculation of actual zone sizes
> and the sizes of the holes between the zones.
> 
> After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP the free_area_init() is
> available to all architectures.
> 
> Using this function instead of free_area_init_node() simplifies the zone
> detection.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

(BTW, none of my acks so far made it to the linux-arm-kernel list
because of the large number of people on cc)
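
As a rough illustration of the commit message quoted above, the sketch below
derives zone extents from nothing but the per-zone maximal PFNs, for the
ascending case. It is a simplified user-space model under that assumption
and not the kernel implementation; struct zone_range and
zone_ranges_from_max_pfns() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open PFN range [start, end) covered by one zone (hypothetical type). */
struct zone_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Walk the zones in ascending order. Each zone starts where the previous
 * one ended and ends at its maximal PFN, clamped so that a zone whose
 * limit lies at or below the current start comes out empty.
 */
static void zone_ranges_from_max_pfns(unsigned long min_pfn,
				      const unsigned long *max_zone_pfn,
				      struct zone_range *out, size_t nr_zones)
{
	unsigned long start = min_pfn;
	size_t i;

	assert(max_zone_pfn != NULL && out != NULL);

	for (i = 0; i < nr_zones; i++) {
		unsigned long end = max_zone_pfn[i] > start ?
				    max_zone_pfn[i] : start;

		out[i].start = start;
		out[i].end = end;
		start = end;
	}
}
```

The caller only supplies the lowest PFN with memory and one limit per zone;
no zone sizes or hole sizes have to be computed by the architecture.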



end of thread, other threads:[~2020-05-26 17:15 UTC | newest]

Thread overview: 33+ messages
2020-04-29 12:11 [PATCH v2 00/20] mm: rework free_area_init*() funcitons Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 01/20] mm: memblock: replace dereferences of memblock_region.nid with API calls Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 02/20] mm: make early_pfn_to_nid() and related defintions close to each other Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 03/20] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option Mike Rapoport
2020-05-26 17:11   ` Catalin Marinas
2020-04-29 12:11 ` [PATCH v2 04/20] mm: free_area_init: use maximal zone PFNs rather than zone sizes Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 05/20] mm: use free_area_init() instead of free_area_init_nodes() Mike Rapoport
2020-05-26 17:13   ` Catalin Marinas
2020-04-29 12:11 ` [PATCH v2 06/20] alpha: simplify detection of memory zone boundaries Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 07/20] arm: " Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 08/20] arm64: simplify detection of memory zone boundaries for UMA configs Mike Rapoport
2020-05-26 17:15   ` Catalin Marinas
2020-04-29 12:11 ` [PATCH v2 09/20] csky: simplify detection of memory zone boundaries Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 10/20] m68k: mm: " Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 11/20] parisc: " Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 12/20] sparc32: " Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 13/20] unicore32: " Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 14/20] xtensa: " Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 15/20] mm: memmap_init: iterate over memblock regions rather that check each PFN Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 16/20] mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES Mike Rapoport
2020-04-29 14:17   ` Christoph Hellwig
2020-04-29 14:33     ` Mike Rapoport
2020-04-29 16:29   ` [PATCH v2.5 " Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order Mike Rapoport
2020-05-03 17:41   ` Guenter Roeck
2020-05-03 18:43     ` Guenter Roeck
2020-05-04 15:39       ` Mike Rapoport
2020-05-05 13:18         ` Guenter Roeck
2020-05-05 13:45           ` Mike Rapoport
2020-05-05 17:27           ` Vineet Gupta
2020-04-29 12:11 ` [PATCH v2 18/20] mm: clean up free_area_init_node() and its helpers Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 19/20] mm: simplify find_min_pfn_with_active_regions() Mike Rapoport
2020-04-29 12:11 ` [PATCH v2 20/20] docs/vm: update memory-models documentation Mike Rapoport
