* [PATCH v5 0/5] optimize memblock_next_valid_pfn and early_pfn_valid on arm and arm64
@ 2018-04-02  2:30 ` Jia He
  0 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  2:30 UTC (permalink / raw)
  To: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko
  Cc: Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") tried to optimize the loop in memmap_init_zone(). But
there is still some room for improvement.

Patch 1 keeps memblock_next_valid_pfn() on arm and arm64
Patch 2 optimizes memblock_next_valid_pfn()
Patches 3~5 optimize early_pfn_valid()

I tested the pfn loop in memmap_init() the same way as before.
As for the performance improvement, with this set the time overhead of
memmap_init() is reduced from 41313 us to 24345 us on my ARMv8-A server
(QDF2400 with 96GB memory).
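
For reference, the heart of the series is the change to the pfn loop in
memmap_init_zone() (patch 1). Condensed for illustration only; see the
patch for the exact hunk:

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		...
		if (!early_pfn_valid(pfn)) {
			/*
			 * Jump to the last invalid pfn of the gap so that
			 * the loop's pfn++ lands directly on the next
			 * valid pfn (a no-op on arches without
			 * CONFIG_HAVE_ARCH_PFN_VALID).
			 */
			pfn = skip_to_last_invalid_pfn(pfn);
			continue;
		}
		...
	}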

The memblock region information from my server is attached below.
[   86.956758] Zone ranges:
[   86.959452]   DMA      [mem 0x0000000000200000-0x00000000ffffffff]
[   86.966041]   Normal   [mem 0x0000000100000000-0x00000017ffffffff]
[   86.972631] Movable zone start for each node
[   86.977179] Early memory node ranges
[   86.980985]   node   0: [mem 0x0000000000200000-0x000000000021ffff]
[   86.987666]   node   0: [mem 0x0000000000820000-0x000000000307ffff]
[   86.994348]   node   0: [mem 0x0000000003080000-0x000000000308ffff]
[   87.001029]   node   0: [mem 0x0000000003090000-0x00000000031fffff]
[   87.007710]   node   0: [mem 0x0000000003200000-0x00000000033fffff]
[   87.014392]   node   0: [mem 0x0000000003410000-0x000000000563ffff]
[   87.021073]   node   0: [mem 0x0000000005640000-0x000000000567ffff]
[   87.027754]   node   0: [mem 0x0000000005680000-0x00000000056dffff]
[   87.034435]   node   0: [mem 0x00000000056e0000-0x00000000086fffff]
[   87.041117]   node   0: [mem 0x0000000008700000-0x000000000871ffff]
[   87.047798]   node   0: [mem 0x0000000008720000-0x000000000894ffff]
[   87.054479]   node   0: [mem 0x0000000008950000-0x0000000008baffff]
[   87.061161]   node   0: [mem 0x0000000008bb0000-0x0000000008bcffff]
[   87.067842]   node   0: [mem 0x0000000008bd0000-0x0000000008c4ffff]
[   87.074524]   node   0: [mem 0x0000000008c50000-0x0000000008e2ffff]
[   87.081205]   node   0: [mem 0x0000000008e30000-0x0000000008e4ffff]
[   87.087886]   node   0: [mem 0x0000000008e50000-0x0000000008fcffff]
[   87.094568]   node   0: [mem 0x0000000008fd0000-0x000000000910ffff]
[   87.101249]   node   0: [mem 0x0000000009110000-0x00000000092effff]
[   87.107930]   node   0: [mem 0x00000000092f0000-0x000000000930ffff]
[   87.114612]   node   0: [mem 0x0000000009310000-0x000000000963ffff]
[   87.121293]   node   0: [mem 0x0000000009640000-0x000000000e61ffff]
[   87.127975]   node   0: [mem 0x000000000e620000-0x000000000e64ffff]
[   87.134657]   node   0: [mem 0x000000000e650000-0x000000000fffffff]
[   87.141338]   node   0: [mem 0x0000000010800000-0x0000000017feffff]
[   87.148019]   node   0: [mem 0x000000001c000000-0x000000001c00ffff]
[   87.154701]   node   0: [mem 0x000000001c010000-0x000000001c7fffff]
[   87.161383]   node   0: [mem 0x000000001c810000-0x000000007efbffff]
[   87.168064]   node   0: [mem 0x000000007efc0000-0x000000007efdffff]
[   87.174746]   node   0: [mem 0x000000007efe0000-0x000000007efeffff]
[   87.181427]   node   0: [mem 0x000000007eff0000-0x000000007effffff]
[   87.188108]   node   0: [mem 0x000000007f000000-0x00000017ffffffff]
[   87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]

Without this patchset:
[  117.106153] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
[  117.113677] before memmap_init
[  117.118195] after  memmap_init
>>> memmap_init takes 4518 us
[  117.121446] before memmap_init
[  117.154992] after  memmap_init
>>> memmap_init takes 33546 us
[  117.158241] before memmap_init
[  117.161490] after  memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 41313 us

With this patchset:
[   87.194791] Initmem setup node 0 [mem 0x0000000000200000-0x00000017ffffffff]
[   87.202314] before memmap_init
[   87.206164] after  memmap_init
>>> memmap_init takes 3850 us
[   87.209416] before memmap_init
[   87.226662] after  memmap_init
>>> memmap_init takes 17246 us
[   87.229911] before memmap_init
[   87.233160] after  memmap_init
>>> memmap_init takes 3249 us
>>> totally takes 24345 us

Changelog:
V5: - further refinement as suggested by Daniel Vacek; make the code
      more arm/arm64 arch-specific
V4: - refine patches as suggested by Daniel Vacek and Wei Yang
    - optimize on arm as well as arm64
V3: - fix 2 issues reported by the kbuild test robot
V2: - rebase to latest mmotm
    - keep memblock_next_valid_pfn on arm64
    - refine memblock_search_pfn_regions and pfn_valid_region

Jia He (5):
  mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  arm: arm64: page_alloc: reduce unnecessary binary search in
    memblock_next_valid_pfn()
  mm/memblock: introduce memblock_search_pfn_regions()
  arm64: introduce pfn_valid_region()
  mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()

 arch/arm/include/asm/page.h   |  6 +++-
 arch/arm/mm/init.c            | 71 ++++++++++++++++++++++++++++++++++++++++++-
 arch/arm64/include/asm/page.h |  6 +++-
 arch/arm64/mm/init.c          | 71 ++++++++++++++++++++++++++++++++++++++++++-
 include/linux/memblock.h      |  2 ++
 include/linux/mmzone.h        |  8 ++++-
 mm/memblock.c                 |  9 ++++++
 mm/page_alloc.c               |  4 ++-
 8 files changed, 171 insertions(+), 6 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  2018-04-02  2:30 ` Jia He
@ 2018-04-02  2:30   ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  2:30 UTC (permalink / raw)
  To: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko
  Cc: Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But it could
cause a panic, so Daniel Vacek later reverted it.

However, as suggested by Daniel Vacek, it is fine to use memblock to skip
gaps and find the next valid pfn when CONFIG_HAVE_ARCH_PFN_VALID is set.

On arm and arm64, memblock is used by default. But the generic version of
pfn_valid() is based on mem sections, and memblock_next_valid_pfn() does
not always return the very next valid pfn; it can skip further ahead, so
some valid pfns were treated as invalid. That is why the kernel was
eventually crashing on some !arm machines.

And as verified by Eugeniu Rosca, arm can benefit from commit
b92df1de5d28. So keep memblock_next_valid_pfn() on arm{,64} and move
the related code into the arm/arm64 arch directories.
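
As a worked example, take the first two node-0 ranges from the memory
map in the cover letter, [0x200000-0x21ffff] and [0x820000-...]:

	/*
	 * pfn 0x220 is the first pfn past the [0x200000-0x21ffff] range,
	 * so early_pfn_valid(0x220) fails.  Then:
	 *
	 *   memblock_next_valid_pfn(0x220)  == 0x820  (base of next region)
	 *   skip_to_last_invalid_pfn(0x220) == 0x81f
	 *
	 * and the pfn++ in memmap_init_zone()'s loop lands on 0x820, so
	 * the ~1500 invalid pfns in the gap are skipped in one step
	 * instead of being tested one by one.
	 */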

Suggested-by: Daniel Vacek <neelx@redhat.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
---
 arch/arm/include/asm/page.h   |  2 ++
 arch/arm/mm/init.c            | 31 ++++++++++++++++++++++++++++++-
 arch/arm64/include/asm/page.h |  2 ++
 arch/arm64/mm/init.c          | 31 ++++++++++++++++++++++++++++++-
 include/linux/mmzone.h        |  1 +
 mm/page_alloc.c               |  4 +++-
 6 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 4355f0e..489875c 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -158,6 +158,8 @@ typedef struct page *pgtable_t;
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
 extern int pfn_valid(unsigned long);
+extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
+#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
 #endif
 
 #include <asm/memory.h>
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index a1f11a7..0fb85ca 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -198,7 +198,36 @@ int pfn_valid(unsigned long pfn)
 	return memblock_is_map_memory(__pfn_to_phys(pfn));
 }
 EXPORT_SYMBOL(pfn_valid);
-#endif
+
+/* HAVE_MEMBLOCK is always enabled on arm */
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+{
+	struct memblock_type *type = &memblock.memory;
+	unsigned int right = type->cnt;
+	unsigned int mid, left = 0;
+	phys_addr_t addr = PFN_PHYS(++pfn);
+
+	do {
+		mid = (right + left) / 2;
+
+		if (addr < type->regions[mid].base)
+			right = mid;
+		else if (addr >= (type->regions[mid].base +
+				  type->regions[mid].size))
+			left = mid + 1;
+		else {
+			/* addr is within the region, so pfn is valid */
+			return pfn;
+		}
+	} while (left < right);
+
+	if (right == type->cnt)
+		return -1UL;
+	else
+		return PHYS_PFN(type->regions[right].base);
+}
+EXPORT_SYMBOL(memblock_next_valid_pfn);
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
 
 #ifndef CONFIG_SPARSEMEM
 static void __init arm_memory_present(void)
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 60d02c8..e57d3f2 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -39,6 +39,8 @@ typedef struct page *pgtable_t;
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
 extern int pfn_valid(unsigned long);
+extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
+#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
 #endif
 
 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 00e7b90..13e43ff 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -290,7 +290,36 @@ int pfn_valid(unsigned long pfn)
 	return memblock_is_map_memory(pfn << PAGE_SHIFT);
 }
 EXPORT_SYMBOL(pfn_valid);
-#endif
+
+/* HAVE_MEMBLOCK is always enabled on arm64 */
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+{
+	struct memblock_type *type = &memblock.memory;
+	unsigned int right = type->cnt;
+	unsigned int mid, left = 0;
+	phys_addr_t addr = PFN_PHYS(++pfn);
+
+	do {
+		mid = (right + left) / 2;
+
+		if (addr < type->regions[mid].base)
+			right = mid;
+		else if (addr >= (type->regions[mid].base +
+				  type->regions[mid].size))
+			left = mid + 1;
+		else {
+			/* addr is within the region, so pfn is valid */
+			return pfn;
+		}
+	} while (left < right);
+
+	if (right == type->cnt)
+		return -1UL;
+	else
+		return PHYS_PFN(type->regions[right].base);
+}
+EXPORT_SYMBOL(memblock_next_valid_pfn);
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
 
 #ifndef CONFIG_SPARSEMEM
 static void __init arm64_memory_present(void)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d797716..f9c0c46 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1245,6 +1245,7 @@ static inline int pfn_valid(unsigned long pfn)
 		return 0;
 	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
 }
+#define skip_to_last_invalid_pfn(pfn) (pfn)
 #endif
 
 static inline int pfn_present(unsigned long pfn)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c19f5ac..30f7d76 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5483,8 +5483,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		if (context != MEMMAP_EARLY)
 			goto not_early;
 
-		if (!early_pfn_valid(pfn))
+		if (!early_pfn_valid(pfn)) {
+			pfn = skip_to_last_invalid_pfn(pfn);
 			continue;
+		}
 		if (!early_pfn_in_nid(pfn, nid))
 			continue;
 		if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
  2018-04-02  2:30 ` Jia He
@ 2018-04-02  2:30   ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  2:30 UTC (permalink / raw)
  To: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko
  Cc: Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. if pfn and pfn+1 are in the same
memblock region, we can simply use pfn+1 instead of redoing the binary
search in memblock_next_valid_pfn().
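
The intent of the cached index, in rough terms (a sketch of the idea,
not a measurement):

	/*
	 * early_region_idx remembers the memblock region found by the
	 * previous lookup.  If the queried pfn still falls inside that
	 * region, the answer is returned after two comparisons; the
	 * O(log n) binary search over memblock.memory only runs again
	 * once the walk has moved on to a different region.
	 */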

Signed-off-by: Jia He <jia.he@hxt-semitech.com>
---
 arch/arm/include/asm/page.h   |  1 +
 arch/arm/mm/init.c            | 28 ++++++++++++++++++++++------
 arch/arm64/include/asm/page.h |  1 +
 arch/arm64/mm/init.c          | 28 ++++++++++++++++++++++------
 4 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index 489875c..f38909c 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -157,6 +157,7 @@ extern void copy_page(void *to, const void *from);
 typedef struct page *pgtable_t;
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
+extern int early_region_idx;
 extern int pfn_valid(unsigned long);
 extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
 #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 0fb85ca..06ed190 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -193,6 +193,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
 }
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
+int early_region_idx __meminitdata = -1;
+
 int pfn_valid(unsigned long pfn)
 {
 	return memblock_is_map_memory(__pfn_to_phys(pfn));
@@ -203,28 +205,42 @@ EXPORT_SYMBOL(pfn_valid);
 unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
 {
 	struct memblock_type *type = &memblock.memory;
+	struct memblock_region *regions = type->regions;
 	unsigned int right = type->cnt;
 	unsigned int mid, left = 0;
+	unsigned long start_pfn, end_pfn;
 	phys_addr_t addr = PFN_PHYS(++pfn);
 
+	/* fast path, return pfn+1 if next pfn is in the same region */
+	if (early_region_idx != -1) {
+		start_pfn = PFN_DOWN(regions[early_region_idx].base);
+		end_pfn = PFN_DOWN(regions[early_region_idx].base +
+				regions[early_region_idx].size);
+
+		if (pfn >= start_pfn && pfn < end_pfn)
+			return pfn;
+	}
+
+	/* slow path, do the binary searching */
 	do {
 		mid = (right + left) / 2;
 
-		if (addr < type->regions[mid].base)
+		if (addr < regions[mid].base)
 			right = mid;
-		else if (addr >= (type->regions[mid].base +
-				  type->regions[mid].size))
+		else if (addr >= (regions[mid].base + regions[mid].size))
 			left = mid + 1;
 		else {
-			/* addr is within the region, so pfn is valid */
+			early_region_idx = mid;
 			return pfn;
 		}
 	} while (left < right);
 
 	if (right == type->cnt)
 		return -1UL;
-	else
-		return PHYS_PFN(type->regions[right].base);
+
+	early_region_idx = right;
+
+	return PHYS_PFN(regions[early_region_idx].base);
 }
 EXPORT_SYMBOL(memblock_next_valid_pfn);
 #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index e57d3f2..f0d8c8e5 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -38,6 +38,7 @@ extern void clear_page(void *to);
 typedef struct page *pgtable_t;
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
+extern int early_region_idx;
 extern int pfn_valid(unsigned long);
 extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
 #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 13e43ff..342e4e2 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -285,6 +285,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 #endif /* CONFIG_NUMA */
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
+int early_region_idx __meminitdata = -1;
+
 int pfn_valid(unsigned long pfn)
 {
 	return memblock_is_map_memory(pfn << PAGE_SHIFT);
@@ -295,28 +297,42 @@ EXPORT_SYMBOL(pfn_valid);
 unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
 {
 	struct memblock_type *type = &memblock.memory;
+	struct memblock_region *regions = type->regions;
 	unsigned int right = type->cnt;
 	unsigned int mid, left = 0;
+	unsigned long start_pfn, end_pfn;
 	phys_addr_t addr = PFN_PHYS(++pfn);
 
+	/* fast path, return pfn+1 if next pfn is in the same region */
+	if (early_region_idx != -1) {
+		start_pfn = PFN_DOWN(regions[early_region_idx].base);
+		end_pfn = PFN_DOWN(regions[early_region_idx].base +
+				regions[early_region_idx].size);
+
+		if (pfn >= start_pfn && pfn < end_pfn)
+			return pfn;
+	}
+
+	/* slow path, do the binary searching */
 	do {
 		mid = (right + left) / 2;
 
-		if (addr < type->regions[mid].base)
+		if (addr < regions[mid].base)
 			right = mid;
-		else if (addr >= (type->regions[mid].base +
-				  type->regions[mid].size))
+		else if (addr >= (regions[mid].base + regions[mid].size))
 			left = mid + 1;
 		else {
-			/* addr is within the region, so pfn is valid */
+			early_region_idx = mid;
 			return pfn;
 		}
 	} while (left < right);
 
 	if (right == type->cnt)
 		return -1UL;
-	else
-		return PHYS_PFN(type->regions[right].base);
+
+	early_region_idx = right;
+
+	return PHYS_PFN(regions[early_region_idx].base);
 }
 EXPORT_SYMBOL(memblock_next_valid_pfn);
 #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 3/5] mm/memblock: introduce memblock_search_pfn_regions()
  2018-04-02  2:30 ` Jia He
@ 2018-04-02  2:30   ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  2:30 UTC (permalink / raw)
  To: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko
  Cc: Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

This API is a preparation for further optimization of early_pfn_valid().
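
For context, patch 4 in this series uses it roughly like this
(condensed from the arm64 hunk there):

	/* regions == memblock.memory.regions */
	early_region_idx = memblock_search_pfn_regions(pfn);
	if (early_region_idx == -1)
		return false;

	return !memblock_is_nomap(&regions[early_region_idx]);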

Signed-off-by: Jia He <jia.he@hxt-semitech.com>
---
 include/linux/memblock.h | 2 ++
 mm/memblock.c            | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 0257aee..a0127b3 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -203,6 +203,8 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 	     i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+int memblock_search_pfn_regions(unsigned long pfn);
+
 /**
  * for_each_free_mem_range - iterate through free memblock areas
  * @i: u64 used as loop variable
diff --git a/mm/memblock.c b/mm/memblock.c
index ba7c878..0f4004c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1617,6 +1617,15 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
 	return -1;
 }
 
+/* search memblock with the input pfn, return the region idx */
+int __init_memblock memblock_search_pfn_regions(unsigned long pfn)
+{
+	struct memblock_type *type = &memblock.memory;
+	int mid = memblock_search(type, PFN_PHYS(pfn));
+
+	return mid;
+}
+
 bool __init memblock_is_reserved(phys_addr_t addr)
 {
 	return memblock_search(&memblock.reserved, addr) != -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 4/5] arm64: introduce pfn_valid_region()
  2018-04-02  2:30 ` Jia He
@ 2018-04-02  2:30   ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  2:30 UTC (permalink / raw)
  To: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko
  Cc: Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

This is a preparation for further optimization of early_pfn_valid()
on arm and arm64.
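
Patch 5 then wires this up as the early_pfn_valid() implementation for
arches with CONFIG_HAVE_ARCH_PFN_VALID (quoted here for context):

	#ifdef CONFIG_HAVE_ARCH_PFN_VALID
	#define early_pfn_valid(pfn)	pfn_valid_region(pfn)
	#else
	#define early_pfn_valid(pfn)	pfn_valid(pfn)
	#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/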

Signed-off-by: Jia He <jia.he@hxt-semitech.com>
---
 arch/arm/include/asm/page.h   |  3 ++-
 arch/arm/mm/init.c            | 24 ++++++++++++++++++++++++
 arch/arm64/include/asm/page.h |  3 ++-
 arch/arm64/mm/init.c          | 24 ++++++++++++++++++++++++
 4 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index f38909c..3bd810e 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -158,9 +158,10 @@ typedef struct page *pgtable_t;
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
 extern int early_region_idx;
-extern int pfn_valid(unsigned long);
+extern int pfn_valid(unsigned long pfn);
 extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
 #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
+extern int pfn_valid_region(unsigned long pfn);
 #endif
 
 #include <asm/memory.h>
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 06ed190..bdcbf58 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -201,6 +201,30 @@ int pfn_valid(unsigned long pfn)
 }
 EXPORT_SYMBOL(pfn_valid);
 
+int pfn_valid_region(unsigned long pfn)
+{
+	unsigned long start_pfn, end_pfn;
+	struct memblock_type *type = &memblock.memory;
+	struct memblock_region *regions = type->regions;
+
+	if (early_region_idx != -1) {
+		start_pfn = PFN_DOWN(regions[early_region_idx].base);
+		end_pfn = PFN_DOWN(regions[early_region_idx].base +
+					regions[early_region_idx].size);
+
+		if (pfn >= start_pfn && pfn < end_pfn)
+			return !memblock_is_nomap(
+					&regions[early_region_idx]);
+	}
+
+	early_region_idx = memblock_search_pfn_regions(pfn);
+	if (early_region_idx == -1)
+		return false;
+
+	return !memblock_is_nomap(&regions[early_region_idx]);
+}
+EXPORT_SYMBOL(pfn_valid_region);
+
 /* HAVE_MEMBLOCK is always enabled on arm */
 unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
 {
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index f0d8c8e5..7087b63 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -39,9 +39,10 @@ typedef struct page *pgtable_t;
 
 #ifdef CONFIG_HAVE_ARCH_PFN_VALID
 extern int early_region_idx;
-extern int pfn_valid(unsigned long);
+extern int pfn_valid(unsigned long pfn);
 extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
 #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
+extern int pfn_valid_region(unsigned long pfn);
 #endif
 
 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 342e4e2..a1646b6 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -293,6 +293,30 @@ int pfn_valid(unsigned long pfn)
 }
 EXPORT_SYMBOL(pfn_valid);
 
+int pfn_valid_region(unsigned long pfn)
+{
+	unsigned long start_pfn, end_pfn;
+	struct memblock_type *type = &memblock.memory;
+	struct memblock_region *regions = type->regions;
+
+	if (early_region_idx != -1) {
+		start_pfn = PFN_DOWN(regions[early_region_idx].base);
+		end_pfn = PFN_DOWN(regions[early_region_idx].base +
+				regions[early_region_idx].size);
+
+		if (pfn >= start_pfn && pfn < end_pfn)
+			return !memblock_is_nomap(
+					&regions[early_region_idx]);
+	}
+
+	early_region_idx = memblock_search_pfn_regions(pfn);
+	if (early_region_idx == -1)
+		return false;
+
+	return !memblock_is_nomap(&regions[early_region_idx]);
+}
+EXPORT_SYMBOL(pfn_valid_region);
+
 /* HAVE_MEMBLOCK is always enabled on arm64 */
 unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v5 5/5] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()
  2018-04-02  2:30 ` Jia He
@ 2018-04-02  2:30   ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  2:30 UTC (permalink / raw)
  To: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko
  Cc: Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
where possible") optimized the loop in memmap_init_zone(). But there is
still some room for improvement. E.g. in early_pfn_valid(), if pfn and
pfn+1 are in the same memblock region, we can record the last returned
memblock region index and check whether pfn+1 is still in that region.

Currently it only improves the performance on arm64 and has no
impact on other arches.
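
The resulting early-boot call path on arm/arm64, summarizing the whole
series (informal sketch):

	memmap_init_zone()
	  -> early_pfn_valid(pfn)            /* now pfn_valid_region() */
	       checks the cached early_region_idx first, then falls back
	       to memblock_search_pfn_regions()
	  -> skip_to_last_invalid_pfn(pfn)   /* memblock_next_valid_pfn() - 1 */
	       uses the same cached-region fast path before the binary search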

Signed-off-by: Jia He <jia.he@hxt-semitech.com>
---
 include/linux/mmzone.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f9c0c46..079f468 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1268,9 +1268,14 @@ static inline int pfn_present(unsigned long pfn)
 })
 #else
 #define pfn_to_nid(pfn)		(0)
-#endif
+#endif /*CONFIG_NUMA*/
 
+#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+#define early_pfn_valid(pfn) pfn_valid_region(pfn)
+#else
 #define early_pfn_valid(pfn)	pfn_valid(pfn)
+#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
+
 void sparse_init(void);
 #else
 #define sparse_init()	do {} while (0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  2018-04-02  2:30   ` Jia He
@ 2018-04-02  6:55     ` Ard Biesheuvel
  -1 siblings, 0 replies; 46+ messages in thread
From: Ard Biesheuvel @ 2018-04-02  6:55 UTC (permalink / raw)
  To: Jia He
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, Grygorii Strashko,
	AKASHI Takahiro, James Morse, Steve Capper, Pavel Tatashin,
	Gioh Kim, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU, Andrey Ryabinin,
	Nikolay Borisov, Daniel Jordan, Daniel Vacek, Eugeniu Rosca,
	linux-arm-kernel, Linux Kernel Mailing List, Linux-MM, Jia He

On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But it could
> cause a panic, so Daniel Vacek later reverted it.
>
> However, as suggested by Daniel Vacek, it is fine to use memblock to skip
> gaps and find the next valid pfn when CONFIG_HAVE_ARCH_PFN_VALID is set.
>
> On arm and arm64, memblock is used by default. But the generic version of
> pfn_valid() is based on mem sections, and memblock_next_valid_pfn() does
> not always return the very next valid pfn; it can skip further ahead, so
> some valid pfns were treated as invalid. That is why the kernel was
> eventually crashing on some !arm machines.
>
> And as verified by Eugeniu Rosca, arm can benefit from commit
> b92df1de5d28. So keep memblock_next_valid_pfn() on arm{,64} and move
> the related code into the arm/arm64 arch directories.
>
> Suggested-by: Daniel Vacek <neelx@redhat.com>
> Signed-off-by: Jia He <jia.he@hxt-semitech.com>

Hello Jia,

Apologies for chiming in late.

If we are going to rearchitect this, I'd rather we change the loop in
memmap_init_zone() so that we skip to the next valid PFN directly
rather than skipping to the last invalid PFN so that the pfn++ in the
for () results in the next value. Can we replace the pfn++ there with
a function call that defaults to 'return pfn + 1', but does the skip
for architectures that implement it?
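
Something along these lines, perhaps (an illustrative sketch only; the
next_valid_pfn() name is made up):

	/* generic default for arches without a cheap gap-skipping helper */
	#ifndef next_valid_pfn
	#define next_valid_pfn(pfn)	((pfn) + 1)
	#endif

	for (pfn = start_pfn; pfn < end_pfn; pfn = next_valid_pfn(pfn)) {
		...
	}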


> ---
>  arch/arm/include/asm/page.h   |  2 ++
>  arch/arm/mm/init.c            | 31 ++++++++++++++++++++++++++++++-
>  arch/arm64/include/asm/page.h |  2 ++
>  arch/arm64/mm/init.c          | 31 ++++++++++++++++++++++++++++++-
>  include/linux/mmzone.h        |  1 +
>  mm/page_alloc.c               |  4 +++-
>  6 files changed, 68 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 4355f0e..489875c 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -158,6 +158,8 @@ typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  extern int pfn_valid(unsigned long);
> +extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
> +#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>  #endif
>
>  #include <asm/memory.h>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index a1f11a7..0fb85ca 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -198,7 +198,36 @@ int pfn_valid(unsigned long pfn)
>         return memblock_is_map_memory(__pfn_to_phys(pfn));
>  }
>  EXPORT_SYMBOL(pfn_valid);
> -#endif
> +
> +/* HAVE_MEMBLOCK is always enabled on arm */
> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
> +{
> +       struct memblock_type *type = &memblock.memory;
> +       unsigned int right = type->cnt;
> +       unsigned int mid, left = 0;
> +       phys_addr_t addr = PFN_PHYS(++pfn);
> +
> +       do {
> +               mid = (right + left) / 2;
> +
> +               if (addr < type->regions[mid].base)
> +                       right = mid;
> +               else if (addr >= (type->regions[mid].base +
> +                                 type->regions[mid].size))
> +                       left = mid + 1;
> +               else {
> +                       /* addr is within the region, so pfn is valid */
> +                       return pfn;
> +               }
> +       } while (left < right);
> +
> +       if (right == type->cnt)
> +               return -1UL;
> +       else
> +               return PHYS_PFN(type->regions[right].base);
> +}
> +EXPORT_SYMBOL(memblock_next_valid_pfn);
> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>
>  #ifndef CONFIG_SPARSEMEM
>  static void __init arm_memory_present(void)
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 60d02c8..e57d3f2 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -39,6 +39,8 @@ typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  extern int pfn_valid(unsigned long);
> +extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
> +#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>  #endif
>
>  #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 00e7b90..13e43ff 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -290,7 +290,36 @@ int pfn_valid(unsigned long pfn)
>         return memblock_is_map_memory(pfn << PAGE_SHIFT);
>  }
>  EXPORT_SYMBOL(pfn_valid);
> -#endif
> +
> +/* HAVE_MEMBLOCK is always enabled on arm64 */
> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
> +{
> +       struct memblock_type *type = &memblock.memory;
> +       unsigned int right = type->cnt;
> +       unsigned int mid, left = 0;
> +       phys_addr_t addr = PFN_PHYS(++pfn);
> +
> +       do {
> +               mid = (right + left) / 2;
> +
> +               if (addr < type->regions[mid].base)
> +                       right = mid;
> +               else if (addr >= (type->regions[mid].base +
> +                                 type->regions[mid].size))
> +                       left = mid + 1;
> +               else {
> +                       /* addr is within the region, so pfn is valid */
> +                       return pfn;
> +               }
> +       } while (left < right);
> +
> +       if (right == type->cnt)
> +               return -1UL;
> +       else
> +               return PHYS_PFN(type->regions[right].base);
> +}
> +EXPORT_SYMBOL(memblock_next_valid_pfn);
> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>
>  #ifndef CONFIG_SPARSEMEM
>  static void __init arm64_memory_present(void)
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index d797716..f9c0c46 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1245,6 +1245,7 @@ static inline int pfn_valid(unsigned long pfn)
>                 return 0;
>         return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
>  }
> +#define skip_to_last_invalid_pfn(pfn) (pfn)
>  #endif
>
>  static inline int pfn_present(unsigned long pfn)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c19f5ac..30f7d76 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5483,8 +5483,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>                 if (context != MEMMAP_EARLY)
>                         goto not_early;
>
> -               if (!early_pfn_valid(pfn))
> +               if (!early_pfn_valid(pfn)) {
> +                       pfn = skip_to_last_invalid_pfn(pfn);
>                         continue;
> +               }
>                 if (!early_pfn_in_nid(pfn, nid))
>                         continue;
>                 if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
  2018-04-02  2:30   ` Jia He
@ 2018-04-02  6:57     ` Ard Biesheuvel
  -1 siblings, 0 replies; 46+ messages in thread
From: Ard Biesheuvel @ 2018-04-02  6:57 UTC (permalink / raw)
  To: Jia He
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, Grygorii Strashko,
	AKASHI Takahiro, James Morse, Steve Capper, Pavel Tatashin,
	Gioh Kim, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU, Andrey Ryabinin,
	Nikolay Borisov, Daniel Jordan, Daniel Vacek, Eugeniu Rosca,
	linux-arm-kernel, Linux Kernel Mailing List, Linux-MM, Jia He

On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But there is
> still some room for improvement. E.g. if pfn and pfn+1 are in the same
> memblock region, we can simply pfn++ instead of doing the binary search
> in memblock_next_valid_pfn.
>
> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> ---
>  arch/arm/include/asm/page.h   |  1 +
>  arch/arm/mm/init.c            | 28 ++++++++++++++++++++++------
>  arch/arm64/include/asm/page.h |  1 +
>  arch/arm64/mm/init.c          | 28 ++++++++++++++++++++++------
>  4 files changed, 46 insertions(+), 12 deletions(-)
>

Could we put this in a shared file somewhere? This is the second patch
where you make identical changes to ARM and arm64.
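
(As a rough sketch of what I mean -- the Kconfig symbol and file names
below are made up, pick whatever fits -- both architectures could select
a common symbol,

        # arch/arm/Kconfig and arch/arm64/Kconfig
        select HAVE_MEMBLOCK_PFN_VALID

declare the helper once in a generic header,

        /* include/linux/memblock.h */
        #ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID
        unsigned long memblock_next_valid_pfn(unsigned long pfn);
        #endif

and keep a single implementation in mm/memblock.c instead of one copy in
arch/arm/mm/init.c and another in arch/arm64/mm/init.c.)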


> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 489875c..f38909c 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -157,6 +157,7 @@ extern void copy_page(void *to, const void *from);
>  typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +extern int early_region_idx;
>  extern int pfn_valid(unsigned long);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 0fb85ca..06ed190 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -193,6 +193,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>  }
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +int early_region_idx __meminitdata = -1;
> +
>  int pfn_valid(unsigned long pfn)
>  {
>         return memblock_is_map_memory(__pfn_to_phys(pfn));
> @@ -203,28 +205,42 @@ EXPORT_SYMBOL(pfn_valid);
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
>         struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
>         unsigned int right = type->cnt;
>         unsigned int mid, left = 0;
> +       unsigned long start_pfn, end_pfn;
>         phys_addr_t addr = PFN_PHYS(++pfn);
>
> +       /* fast path, return pfn+1 if next pfn is in the same region */
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                               regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return pfn;
> +       }
> +
> +       /* slow path, do the binary searching */
>         do {
>                 mid = (right + left) / 2;
>
> -               if (addr < type->regions[mid].base)
> +               if (addr < regions[mid].base)
>                         right = mid;
> -               else if (addr >= (type->regions[mid].base +
> -                                 type->regions[mid].size))
> +               else if (addr >= (regions[mid].base + regions[mid].size))
>                         left = mid + 1;
>                 else {
> -                       /* addr is within the region, so pfn is valid */
> +                       early_region_idx = mid;
>                         return pfn;
>                 }
>         } while (left < right);
>
>         if (right == type->cnt)
>                 return -1UL;
> -       else
> -               return PHYS_PFN(type->regions[right].base);
> +
> +       early_region_idx = right;
> +
> +       return PHYS_PFN(regions[early_region_idx].base);
>  }
>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index e57d3f2..f0d8c8e5 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -38,6 +38,7 @@ extern void clear_page(void *to);
>  typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +extern int early_region_idx;
>  extern int pfn_valid(unsigned long);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 13e43ff..342e4e2 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -285,6 +285,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #endif /* CONFIG_NUMA */
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +int early_region_idx __meminitdata = -1;
> +
>  int pfn_valid(unsigned long pfn)
>  {
>         return memblock_is_map_memory(pfn << PAGE_SHIFT);
> @@ -295,28 +297,42 @@ EXPORT_SYMBOL(pfn_valid);
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
>         struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
>         unsigned int right = type->cnt;
>         unsigned int mid, left = 0;
> +       unsigned long start_pfn, end_pfn;
>         phys_addr_t addr = PFN_PHYS(++pfn);
>
> +       /* fast path, return pfn+1 if next pfn is in the same region */
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                               regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return pfn;
> +       }
> +
> +       /* slow path, do the binary searching */
>         do {
>                 mid = (right + left) / 2;
>
> -               if (addr < type->regions[mid].base)
> +               if (addr < regions[mid].base)
>                         right = mid;
> -               else if (addr >= (type->regions[mid].base +
> -                                 type->regions[mid].size))
> +               else if (addr >= (regions[mid].base + regions[mid].size))
>                         left = mid + 1;
>                 else {
> -                       /* addr is within the region, so pfn is valid */
> +                       early_region_idx = mid;
>                         return pfn;
>                 }
>         } while (left < right);
>
>         if (right == type->cnt)
>                 return -1UL;
> -       else
> -               return PHYS_PFN(type->regions[right].base);
> +
> +       early_region_idx = right;
> +
> +       return PHYS_PFN(regions[early_region_idx].base);
>  }
>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 3/5] mm/memblock: introduce memblock_search_pfn_regions()
  2018-04-02  2:30   ` Jia He
@ 2018-04-02  6:57     ` Ard Biesheuvel
  -1 siblings, 0 replies; 46+ messages in thread
From: Ard Biesheuvel @ 2018-04-02  6:57 UTC (permalink / raw)
  To: Jia He
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	Linux Kernel Mailing List, Linux-MM, Jia He

On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
> This API is a preparation for further optimizing early_pfn_valid().
>

Please add more explanation here of what it is you are doing and why.


> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> ---
>  include/linux/memblock.h | 2 ++
>  mm/memblock.c            | 9 +++++++++
>  2 files changed, 11 insertions(+)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 0257aee..a0127b3 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -203,6 +203,8 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
>              i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
>  #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
> +int memblock_search_pfn_regions(unsigned long pfn);
> +
>  /**
>   * for_each_free_mem_range - iterate through free memblock areas
>   * @i: u64 used as loop variable
> diff --git a/mm/memblock.c b/mm/memblock.c
> index ba7c878..0f4004c 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1617,6 +1617,15 @@ static int __init_memblock memblock_search(struct memblock_type *type, phys_addr
>         return -1;
>  }
>
> +/* search memblock with the input pfn, return the region idx */
> +int __init_memblock memblock_search_pfn_regions(unsigned long pfn)
> +{
> +       struct memblock_type *type = &memblock.memory;
> +       int mid = memblock_search(type, PFN_PHYS(pfn));
> +
> +       return mid;
> +}
> +
>  bool __init memblock_is_reserved(phys_addr_t addr)
>  {
>         return memblock_search(&memblock.reserved, addr) != -1;
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 4/5] arm64: introduce pfn_valid_region()
  2018-04-02  2:30   ` Jia He
@ 2018-04-02  6:59     ` Ard Biesheuvel
  -1 siblings, 0 replies; 46+ messages in thread
From: Ard Biesheuvel @ 2018-04-02  6:59 UTC (permalink / raw)
  To: Jia He
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, Grygorii Strashko,
	AKASHI Takahiro, James Morse, Steve Capper, Pavel Tatashin,
	Gioh Kim, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU, Andrey Ryabinin,
	Nikolay Borisov, Daniel Jordan, Daniel Vacek, Eugeniu Rosca,
	linux-arm-kernel, Linux Kernel Mailing List, Linux-MM, Jia He

On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
> This is a preparation for further optimizing early_pfn_valid()
> on arm and arm64.
>

Same as before:
- please share the code between ARM and arm64; if necessary, you can
  invent a new HAVE_ARCH_xxx symbol that is only defined by ARM and
  arm64
- please explain what the patch does and, more importantly, why
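
(To make the first point concrete -- using such a hypothetical
HAVE_ARCH_xxx-style symbol, this is only a sketch, not a concrete
proposal -- the body below is identical for ARM and arm64, so a single
copy in common code would do:

        #ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID
        int early_region_idx __meminitdata = -1;

        int pfn_valid_region(unsigned long pfn)
        {
                unsigned long start_pfn, end_pfn;
                struct memblock_type *type = &memblock.memory;
                struct memblock_region *regions = type->regions;

                /* fast path: pfn falls into the region found last time */
                if (early_region_idx != -1) {
                        start_pfn = PFN_DOWN(regions[early_region_idx].base);
                        end_pfn = PFN_DOWN(regions[early_region_idx].base +
                                           regions[early_region_idx].size);

                        if (pfn >= start_pfn && pfn < end_pfn)
                                return !memblock_is_nomap(&regions[early_region_idx]);
                }

                /* slow path: binary search over memblock.memory */
                early_region_idx = memblock_search_pfn_regions(pfn);
                if (early_region_idx == -1)
                        return false;

                return !memblock_is_nomap(&regions[early_region_idx]);
        }
        #endif /* CONFIG_HAVE_MEMBLOCK_PFN_VALID */

and each architecture would only keep its own pfn_valid().)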

> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> ---
>  arch/arm/include/asm/page.h   |  3 ++-
>  arch/arm/mm/init.c            | 24 ++++++++++++++++++++++++
>  arch/arm64/include/asm/page.h |  3 ++-
>  arch/arm64/mm/init.c          | 24 ++++++++++++++++++++++++
>  4 files changed, 52 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index f38909c..3bd810e 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -158,9 +158,10 @@ typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  extern int early_region_idx;
> -extern int pfn_valid(unsigned long);
> +extern int pfn_valid(unsigned long pfn);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> +extern int pfn_valid_region(unsigned long pfn);
>  #endif
>
>  #include <asm/memory.h>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 06ed190..bdcbf58 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -201,6 +201,30 @@ int pfn_valid(unsigned long pfn)
>  }
>  EXPORT_SYMBOL(pfn_valid);
>
> +int pfn_valid_region(unsigned long pfn)
> +{
> +       unsigned long start_pfn, end_pfn;
> +       struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
> +
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                                       regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return !memblock_is_nomap(
> +                                       &regions[early_region_idx]);
> +       }
> +
> +       early_region_idx = memblock_search_pfn_regions(pfn);
> +       if (early_region_idx == -1)
> +               return false;
> +
> +       return !memblock_is_nomap(&regions[early_region_idx]);
> +}
> +EXPORT_SYMBOL(pfn_valid_region);
> +
>  /* HAVE_MEMBLOCK is always enabled on arm */
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index f0d8c8e5..7087b63 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -39,9 +39,10 @@ typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  extern int early_region_idx;
> -extern int pfn_valid(unsigned long);
> +extern int pfn_valid(unsigned long pfn);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> +extern int pfn_valid_region(unsigned long pfn);
>  #endif
>
>  #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 342e4e2..a1646b6 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -293,6 +293,30 @@ int pfn_valid(unsigned long pfn)
>  }
>  EXPORT_SYMBOL(pfn_valid);
>
> +int pfn_valid_region(unsigned long pfn)
> +{
> +       unsigned long start_pfn, end_pfn;
> +       struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
> +
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                               regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return !memblock_is_nomap(
> +                                       &regions[early_region_idx]);
> +       }
> +
> +       early_region_idx = memblock_search_pfn_regions(pfn);
> +       if (early_region_idx == -1)
> +               return false;
> +
> +       return !memblock_is_nomap(&regions[early_region_idx]);
> +}
> +EXPORT_SYMBOL(pfn_valid_region);
> +
>  /* HAVE_MEMBLOCK is always enabled on arm64 */
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 5/5] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()
  2018-04-02  2:30   ` Jia He
@ 2018-04-02  7:00     ` Ard Biesheuvel
  -1 siblings, 0 replies; 46+ messages in thread
From: Ard Biesheuvel @ 2018-04-02  7:00 UTC (permalink / raw)
  To: Jia He
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	Linux Kernel Mailing List, Linux-MM, Jia He

On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But there is
> still some room for improvement. E.g. in early_pfn_valid(), if pfn and
> pfn+1 are in the same memblock region, we can record the last returned
> memblock region index and check whether pfn+1 is still in the same region.
>
> Currently it only improves performance on arm64 and has no
> impact on other arches.
>

How much does it improve the performance? And in which cases?

I guess it improves boot time on systems with physical address spaces
that are sparsely populated with DRAM, but you really have to quantify
this if you want other people to care.

> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> ---
>  include/linux/mmzone.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index f9c0c46..079f468 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1268,9 +1268,14 @@ static inline int pfn_present(unsigned long pfn)
>  })
>  #else
>  #define pfn_to_nid(pfn)                (0)
> -#endif
> +#endif /*CONFIG_NUMA*/
>
> +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +#define early_pfn_valid(pfn) pfn_valid_region(pfn)
> +#else
>  #define early_pfn_valid(pfn)   pfn_valid(pfn)
> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
> +
>  void sparse_init(void);
>  #else
>  #define sparse_init()  do {} while (0)
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  2018-04-02  6:55     ` Ard Biesheuvel
  (?)
@ 2018-04-02  7:49       ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  7:49 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, Grygorii Strashko,
	AKASHI Takahiro, James Morse, Steve Capper, Pavel Tatashin,
	Gioh Kim, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU, Andrey Ryabinin,
	Nikolay Borisov, Daniel Jordan, Daniel Vacek, Eugeniu Rosca,
	linux-arm-kernel, Linux Kernel Mailing List, Linux-MM, Jia He



On 4/2/2018 2:55 PM, Ard Biesheuvel Wrote:
> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") optimized the loop in memmap_init_zone(). But it causes
>> possible panic bug. So Daniel Vacek reverted it later.
>>
>> But as suggested by Daniel Vacek, it is fine to using memblock to skip
>> gaps and finding next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>>
>> On arm and arm64, memblock is used by default. But generic version of
>> pfn_valid() is based on mem sections and memblock_next_valid_pfn() does
>> not always return the next valid one but skips more resulting in some
>> valid frames to be skipped (as if they were invalid). And that's why
>> kernel was eventually crashing on some !arm machines.
>>
>> And as verified by Eugeniu Rosca, arm can benifit from commit
>> b92df1de5d28. So remain the memblock_next_valid_pfn on arm{,64} and move
>> the related codes to arm64 arch directory.
>>
>> Suggested-by: Daniel Vacek <neelx@redhat.com>
>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> Hello Jia,
>
> Apologies for chiming in late.
no problem, thanks for your comments  ;-)
>
> If we are going to rearchitect this, I'd rather we change the loop in
> memmap_init_zone() so that we skip to the next valid PFN directly
> rather than skipping to the last invalid PFN so that the pfn++ in the
hmm... maybe this macro name is what confused you:

pfn = skip_to_last_invalid_pfn(pfn);

How about skip_to_next_valid_pfn instead?

> for () results in the next value. Can we replace the pfn++ there with
> a function call that defaults to 'return pfn + 1', but does the skip
> for architectures that implement it?
I am not sure I understand your question here.
With this patch, on !arm arches, skip_to_last_invalid_pfn(pfn) is simply
(pfn), and the pfn is then increased by the for {} loop's pfn++ when it
continues. We only *skip* to the start pfn of the next valid region when
CONFIG_HAVE_MEMBLOCK and CONFIG_HAVE_ARCH_PFN_VALID are both enabled
(arm and arm64 support both).
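
If what you mean is replacing the increment itself, the difference would
look roughly like this (next_valid_pfn() is only an illustrative name, it
is not part of this series). Current series:

        for (pfn = start_pfn; pfn < end_pfn; pfn++) {
                if (!early_pfn_valid(pfn)) {
                        /* jump to the last pfn of the hole so that the
                         * loop's pfn++ lands on the first pfn of the
                         * next valid region */
                        pfn = skip_to_last_invalid_pfn(pfn);
                        continue;
                }
                /* ... initialise struct page for this pfn ... */
        }

Your suggestion, as I read it:

        /* defaults to 'return pfn + 1', arches may skip holes here */
        for (pfn = start_pfn; pfn < end_pfn; pfn = next_valid_pfn(pfn)) {
                if (!early_pfn_valid(pfn))
                        continue;
                /* ... initialise struct page for this pfn ... */
        }

Both should visit the same valid pfns; the difference is only where the
skip happens.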


-- 
Cheers,
Jia

>
>
>> ---
>>   arch/arm/include/asm/page.h   |  2 ++
>>   arch/arm/mm/init.c            | 31 ++++++++++++++++++++++++++++++-
>>   arch/arm64/include/asm/page.h |  2 ++
>>   arch/arm64/mm/init.c          | 31 ++++++++++++++++++++++++++++++-
>>   include/linux/mmzone.h        |  1 +
>>   mm/page_alloc.c               |  4 +++-
>>   6 files changed, 68 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
>> index 4355f0e..489875c 100644
>> --- a/arch/arm/include/asm/page.h
>> +++ b/arch/arm/include/asm/page.h
>> @@ -158,6 +158,8 @@ typedef struct page *pgtable_t;
>>
>>   #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>>   extern int pfn_valid(unsigned long);
>> +extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>> +#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>>   #endif
>>
>>   #include <asm/memory.h>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index a1f11a7..0fb85ca 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -198,7 +198,36 @@ int pfn_valid(unsigned long pfn)
>>          return memblock_is_map_memory(__pfn_to_phys(pfn));
>>   }
>>   EXPORT_SYMBOL(pfn_valid);
>> -#endif
>> +
>> +/* HAVE_MEMBLOCK is always enabled on arm */
>> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>> +{
>> +       struct memblock_type *type = &memblock.memory;
>> +       unsigned int right = type->cnt;
>> +       unsigned int mid, left = 0;
>> +       phys_addr_t addr = PFN_PHYS(++pfn);
>> +
>> +       do {
>> +               mid = (right + left) / 2;
>> +
>> +               if (addr < type->regions[mid].base)
>> +                       right = mid;
>> +               else if (addr >= (type->regions[mid].base +
>> +                                 type->regions[mid].size))
>> +                       left = mid + 1;
>> +               else {
>> +                       /* addr is within the region, so pfn is valid */
>> +                       return pfn;
>> +               }
>> +       } while (left < right);
>> +
>> +       if (right == type->cnt)
>> +               return -1UL;
>> +       else
>> +               return PHYS_PFN(type->regions[right].base);
>> +}
>> +EXPORT_SYMBOL(memblock_next_valid_pfn);
>> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>>
>>   #ifndef CONFIG_SPARSEMEM
>>   static void __init arm_memory_present(void)
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 60d02c8..e57d3f2 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -39,6 +39,8 @@ typedef struct page *pgtable_t;
>>
>>   #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>>   extern int pfn_valid(unsigned long);
>> +extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>> +#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>>   #endif
>>
>>   #include <asm/memory.h>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 00e7b90..13e43ff 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -290,7 +290,36 @@ int pfn_valid(unsigned long pfn)
>>          return memblock_is_map_memory(pfn << PAGE_SHIFT);
>>   }
>>   EXPORT_SYMBOL(pfn_valid);
>> -#endif
>> +
>> +/* HAVE_MEMBLOCK is always enabled on arm64 */
>> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>> +{
>> +       struct memblock_type *type = &memblock.memory;
>> +       unsigned int right = type->cnt;
>> +       unsigned int mid, left = 0;
>> +       phys_addr_t addr = PFN_PHYS(++pfn);
>> +
>> +       do {
>> +               mid = (right + left) / 2;
>> +
>> +               if (addr < type->regions[mid].base)
>> +                       right = mid;
>> +               else if (addr >= (type->regions[mid].base +
>> +                                 type->regions[mid].size))
>> +                       left = mid + 1;
>> +               else {
>> +                       /* addr is within the region, so pfn is valid */
>> +                       return pfn;
>> +               }
>> +       } while (left < right);
>> +
>> +       if (right == type->cnt)
>> +               return -1UL;
>> +       else
>> +               return PHYS_PFN(type->regions[right].base);
>> +}
>> +EXPORT_SYMBOL(memblock_next_valid_pfn);
>> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>>
>>   #ifndef CONFIG_SPARSEMEM
>>   static void __init arm64_memory_present(void)
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index d797716..f9c0c46 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1245,6 +1245,7 @@ static inline int pfn_valid(unsigned long pfn)
>>                  return 0;
>>          return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
>>   }
>> +#define skip_to_last_invalid_pfn(pfn) (pfn)
>>   #endif
>>
>>   static inline int pfn_present(unsigned long pfn)
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index c19f5ac..30f7d76 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5483,8 +5483,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>>                  if (context != MEMMAP_EARLY)
>>                          goto not_early;
>>
>> -               if (!early_pfn_valid(pfn))
>> +               if (!early_pfn_valid(pfn)) {
>> +                       pfn = skip_to_last_invalid_pfn(pfn);
>>                          continue;
>> +               }
>>                  if (!early_pfn_in_nid(pfn, nid))
>>                          continue;
>>                  if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
@ 2018-04-02  7:49       ` Jia He
  0 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  7:49 UTC (permalink / raw)
  To: linux-arm-kernel



On 4/2/2018 2:55 PM, Ard Biesheuvel Wrote:
> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") optimized the loop in memmap_init_zone(). But it causes
>> possible panic bug. So Daniel Vacek reverted it later.
>>
>> But as suggested by Daniel Vacek, it is fine to using memblock to skip
>> gaps and finding next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>>
>> On arm and arm64, memblock is used by default. But generic version of
>> pfn_valid() is based on mem sections and memblock_next_valid_pfn() does
>> not always return the next valid one but skips more resulting in some
>> valid frames to be skipped (as if they were invalid). And that's why
>> kernel was eventually crashing on some !arm machines.
>>
>> And as verified by Eugeniu Rosca, arm can benifit from commit
>> b92df1de5d28. So remain the memblock_next_valid_pfn on arm{,64} and move
>> the related codes to arm64 arch directory.
>>
>> Suggested-by: Daniel Vacek <neelx@redhat.com>
>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> Hello Jia,
>
> Apologies for chiming in late.
no problem, thanks for your comments ;-)
>
> If we are going to rearchitect this, I'd rather we change the loop in
> memmap_init_zone() so that we skip to the next valid PFN directly
> rather than skipping to the last invalid PFN so that the pfn++ in the
hmm... Maybe this macro name is what confused you

pfn = skip_to_last_invalid_pfn(pfn);

how about skip_to_next_valid_pfn?

> for () results in the next value. Can we replace the pfn++ there with
> a function calls that defaults to 'return pfn + 1', but does the skip
> for architectures that implement it?
I am not sure I understand your question here.
With this patch, on !arm arches, skip_to_last_invalid_pfn(pfn) simply
evaluates to (pfn), which is then incremented by the for{} loop's pfn++.
We only *skip* to the start pfn of the next valid region when both
CONFIG_HAVE_MEMBLOCK and CONFIG_HAVE_ARCH_PFN_VALID are enabled
(arm/arm64 support both).


-- 
Cheers,
Jia

>
>
>> ---
>>   arch/arm/include/asm/page.h   |  2 ++
>>   arch/arm/mm/init.c            | 31 ++++++++++++++++++++++++++++++-
>>   arch/arm64/include/asm/page.h |  2 ++
>>   arch/arm64/mm/init.c          | 31 ++++++++++++++++++++++++++++++-
>>   include/linux/mmzone.h        |  1 +
>>   mm/page_alloc.c               |  4 +++-
>>   6 files changed, 68 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
>> index 4355f0e..489875c 100644
>> --- a/arch/arm/include/asm/page.h
>> +++ b/arch/arm/include/asm/page.h
>> @@ -158,6 +158,8 @@ typedef struct page *pgtable_t;
>>
>>   #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>>   extern int pfn_valid(unsigned long);
>> +extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>> +#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>>   #endif
>>
>>   #include <asm/memory.h>
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index a1f11a7..0fb85ca 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -198,7 +198,36 @@ int pfn_valid(unsigned long pfn)
>>          return memblock_is_map_memory(__pfn_to_phys(pfn));
>>   }
>>   EXPORT_SYMBOL(pfn_valid);
>> -#endif
>> +
>> +/* HAVE_MEMBLOCK is always enabled on arm */
>> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>> +{
>> +       struct memblock_type *type = &memblock.memory;
>> +       unsigned int right = type->cnt;
>> +       unsigned int mid, left = 0;
>> +       phys_addr_t addr = PFN_PHYS(++pfn);
>> +
>> +       do {
>> +               mid = (right + left) / 2;
>> +
>> +               if (addr < type->regions[mid].base)
>> +                       right = mid;
>> +               else if (addr >= (type->regions[mid].base +
>> +                                 type->regions[mid].size))
>> +                       left = mid + 1;
>> +               else {
>> +                       /* addr is within the region, so pfn is valid */
>> +                       return pfn;
>> +               }
>> +       } while (left < right);
>> +
>> +       if (right == type->cnt)
>> +               return -1UL;
>> +       else
>> +               return PHYS_PFN(type->regions[right].base);
>> +}
>> +EXPORT_SYMBOL(memblock_next_valid_pfn);
>> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>>
>>   #ifndef CONFIG_SPARSEMEM
>>   static void __init arm_memory_present(void)
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 60d02c8..e57d3f2 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -39,6 +39,8 @@ typedef struct page *pgtable_t;
>>
>>   #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>>   extern int pfn_valid(unsigned long);
>> +extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>> +#define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>>   #endif
>>
>>   #include <asm/memory.h>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 00e7b90..13e43ff 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -290,7 +290,36 @@ int pfn_valid(unsigned long pfn)
>>          return memblock_is_map_memory(pfn << PAGE_SHIFT);
>>   }
>>   EXPORT_SYMBOL(pfn_valid);
>> -#endif
>> +
>> +/* HAVE_MEMBLOCK is always enabled on arm64 */
>> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>> +{
>> +       struct memblock_type *type = &memblock.memory;
>> +       unsigned int right = type->cnt;
>> +       unsigned int mid, left = 0;
>> +       phys_addr_t addr = PFN_PHYS(++pfn);
>> +
>> +       do {
>> +               mid = (right + left) / 2;
>> +
>> +               if (addr < type->regions[mid].base)
>> +                       right = mid;
>> +               else if (addr >= (type->regions[mid].base +
>> +                                 type->regions[mid].size))
>> +                       left = mid + 1;
>> +               else {
>> +                       /* addr is within the region, so pfn is valid */
>> +                       return pfn;
>> +               }
>> +       } while (left < right);
>> +
>> +       if (right == type->cnt)
>> +               return -1UL;
>> +       else
>> +               return PHYS_PFN(type->regions[right].base);
>> +}
>> +EXPORT_SYMBOL(memblock_next_valid_pfn);
>> +#endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>>
>>   #ifndef CONFIG_SPARSEMEM
>>   static void __init arm64_memory_present(void)
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index d797716..f9c0c46 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1245,6 +1245,7 @@ static inline int pfn_valid(unsigned long pfn)
>>                  return 0;
>>          return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
>>   }
>> +#define skip_to_last_invalid_pfn(pfn) (pfn)
>>   #endif
>>
>>   static inline int pfn_present(unsigned long pfn)
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index c19f5ac..30f7d76 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5483,8 +5483,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>>                  if (context != MEMMAP_EARLY)
>>                          goto not_early;
>>
>> -               if (!early_pfn_valid(pfn))
>> +               if (!early_pfn_valid(pfn)) {
>> +                       pfn = skip_to_last_invalid_pfn(pfn);
>>                          continue;
>> +               }
>>                  if (!early_pfn_in_nid(pfn, nid))
>>                          continue;
>>                  if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  2018-04-02  2:30   ` Jia He
  (?)
@ 2018-04-02  7:50     ` kbuild test robot
  -1 siblings, 0 replies; 46+ messages in thread
From: kbuild test robot @ 2018-04-02  7:50 UTC (permalink / raw)
  To: Jia He
  Cc: kbuild-all, Russell King, Catalin Marinas, Will Deacon,
	Mark Rutland, Ard Biesheuvel, Andrew Morton, Michal Hocko,
	Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

[-- Attachment #1: Type: text/plain, Size: 2053 bytes --]

Hi Jia,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.16 next-20180329]
[cannot apply to arm64/for-next/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jia-He/optimize-memblock_next_valid_pfn-and-early_pfn_valid-on-arm-and-arm64/20180402-131223
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   mm/page_alloc.c: In function 'memmap_init_zone':
>> mm/page_alloc.c:5360:10: error: implicit declaration of function 'skip_to_last_invalid_pfn' [-Werror=implicit-function-declaration]
       pfn = skip_to_last_invalid_pfn(pfn);
             ^~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/skip_to_last_invalid_pfn +5360 mm/page_alloc.c

  5340	
  5341		if (highest_memmap_pfn < end_pfn - 1)
  5342			highest_memmap_pfn = end_pfn - 1;
  5343	
  5344		/*
  5345		 * Honor reservation requested by the driver for this ZONE_DEVICE
  5346		 * memory
  5347		 */
  5348		if (altmap && start_pfn == altmap->base_pfn)
  5349			start_pfn += altmap->reserve;
  5350	
  5351		for (pfn = start_pfn; pfn < end_pfn; pfn++) {
  5352			/*
  5353			 * There can be holes in boot-time mem_map[]s handed to this
  5354			 * function.  They do not exist on hotplugged memory.
  5355			 */
  5356			if (context != MEMMAP_EARLY)
  5357				goto not_early;
  5358	
  5359			if (!early_pfn_valid(pfn)) {
> 5360				pfn = skip_to_last_invalid_pfn(pfn);
  5361				continue;
  5362			}
  5363			if (!early_pfn_in_nid(pfn, nid))
  5364				continue;
  5365			if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
  5366				break;
  5367	
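
The error is presumably because the generic fallback
"#define skip_to_last_invalid_pfn(pfn) (pfn)" is added inside the
CONFIG_SPARSEMEM section of include/linux/mmzone.h, so a FLATMEM config
without CONFIG_HAVE_ARCH_PFN_VALID (such as i386-tinyconfig) ends up with
no definition at all. A sketch of a catch-all fallback (placement in
mmzone.h is hypothetical) could be:

#ifndef skip_to_last_invalid_pfn
#define skip_to_last_invalid_pfn(pfn)	(pfn)
#endif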

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6721 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
@ 2018-04-02  7:50     ` kbuild test robot
  0 siblings, 0 replies; 46+ messages in thread
From: kbuild test robot @ 2018-04-02  7:50 UTC (permalink / raw)
  To: Jia He
  Cc: kbuild-all, Russell King, Catalin Marinas, Will Deacon,
	Mark Rutland, Ard Biesheuvel, Andrew Morton, Michal Hocko,
	Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He

[-- Attachment #1: Type: text/plain, Size: 2053 bytes --]

Hi Jia,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.16 next-20180329]
[cannot apply to arm64/for-next/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jia-He/optimize-memblock_next_valid_pfn-and-early_pfn_valid-on-arm-and-arm64/20180402-131223
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   mm/page_alloc.c: In function 'memmap_init_zone':
>> mm/page_alloc.c:5360:10: error: implicit declaration of function 'skip_to_last_invalid_pfn' [-Werror=implicit-function-declaration]
       pfn = skip_to_last_invalid_pfn(pfn);
             ^~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/skip_to_last_invalid_pfn +5360 mm/page_alloc.c

  5340	
  5341		if (highest_memmap_pfn < end_pfn - 1)
  5342			highest_memmap_pfn = end_pfn - 1;
  5343	
  5344		/*
  5345		 * Honor reservation requested by the driver for this ZONE_DEVICE
  5346		 * memory
  5347		 */
  5348		if (altmap && start_pfn == altmap->base_pfn)
  5349			start_pfn += altmap->reserve;
  5350	
  5351		for (pfn = start_pfn; pfn < end_pfn; pfn++) {
  5352			/*
  5353			 * There can be holes in boot-time mem_map[]s handed to this
  5354			 * function.  They do not exist on hotplugged memory.
  5355			 */
  5356			if (context != MEMMAP_EARLY)
  5357				goto not_early;
  5358	
  5359			if (!early_pfn_valid(pfn)) {
> 5360				pfn = skip_to_last_invalid_pfn(pfn);
  5361				continue;
  5362			}
  5363			if (!early_pfn_in_nid(pfn, nid))
  5364				continue;
  5365			if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
  5366				break;
  5367	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6721 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
@ 2018-04-02  7:50     ` kbuild test robot
  0 siblings, 0 replies; 46+ messages in thread
From: kbuild test robot @ 2018-04-02  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jia,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.16 next-20180329]
[cannot apply to arm64/for-next/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jia-He/optimize-memblock_next_valid_pfn-and-early_pfn_valid-on-arm-and-arm64/20180402-131223
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   mm/page_alloc.c: In function 'memmap_init_zone':
>> mm/page_alloc.c:5360:10: error: implicit declaration of function 'skip_to_last_invalid_pfn' [-Werror=implicit-function-declaration]
       pfn = skip_to_last_invalid_pfn(pfn);
             ^~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/skip_to_last_invalid_pfn +5360 mm/page_alloc.c

  5340	
  5341		if (highest_memmap_pfn < end_pfn - 1)
  5342			highest_memmap_pfn = end_pfn - 1;
  5343	
  5344		/*
  5345		 * Honor reservation requested by the driver for this ZONE_DEVICE
  5346		 * memory
  5347		 */
  5348		if (altmap && start_pfn == altmap->base_pfn)
  5349			start_pfn += altmap->reserve;
  5350	
  5351		for (pfn = start_pfn; pfn < end_pfn; pfn++) {
  5352			/*
  5353			 * There can be holes in boot-time mem_map[]s handed to this
  5354			 * function.  They do not exist on hotplugged memory.
  5355			 */
  5356			if (context != MEMMAP_EARLY)
  5357				goto not_early;
  5358	
  5359			if (!early_pfn_valid(pfn)) {
> 5360				pfn = skip_to_last_invalid_pfn(pfn);
  5361				continue;
  5362			}
  5363			if (!early_pfn_in_nid(pfn, nid))
  5364				continue;
  5365			if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
  5366				break;
  5367	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: .config.gz
Type: application/gzip
Size: 6721 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180402/1a272b4d/attachment-0001.gz>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  2018-04-02  7:49       ` Jia He
@ 2018-04-02  7:53         ` Ard Biesheuvel
  -1 siblings, 0 replies; 46+ messages in thread
From: Ard Biesheuvel @ 2018-04-02  7:53 UTC (permalink / raw)
  To: Jia He
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	Linux Kernel Mailing List, Linux-MM, Jia He

On 2 April 2018 at 09:49, Jia He <hejianet@gmail.com> wrote:
>
>
> On 4/2/2018 2:55 PM, Ard Biesheuvel Wrote:
>>
>> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>>>
>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>> where possible") optimized the loop in memmap_init_zone(). But it causes
>>> possible panic bug. So Daniel Vacek reverted it later.
>>>
>>> But as suggested by Daniel Vacek, it is fine to using memblock to skip
>>> gaps and finding next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>>>
>>> On arm and arm64, memblock is used by default. But generic version of
>>> pfn_valid() is based on mem sections and memblock_next_valid_pfn() does
>>> not always return the next valid one but skips more resulting in some
>>> valid frames to be skipped (as if they were invalid). And that's why
>>> kernel was eventually crashing on some !arm machines.
>>>
>>> And as verified by Eugeniu Rosca, arm can benifit from commit
>>> b92df1de5d28. So remain the memblock_next_valid_pfn on arm{,64} and move
>>> the related codes to arm64 arch directory.
>>>
>>> Suggested-by: Daniel Vacek <neelx@redhat.com>
>>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
>>
>> Hello Jia,
>>
>> Apologies for chiming in late.
>
> no problem, thanks for your comments  ;-)
>>
>>
>> If we are going to rearchitect this, I'd rather we change the loop in
>> memmap_init_zone() so that we skip to the next valid PFN directly
>> rather than skipping to the last invalid PFN so that the pfn++ in the
>
> hmm... Maybe this macro name makes you confused
>
> pfn = skip_to_last_invalid_pfn(pfn);
>
> how about skip_to_next_valid_pfn?
>
>> for () results in the next value. Can we replace the pfn++ there with
>> a function calls that defaults to 'return pfn + 1', but does the skip
>> for architectures that implement it?
>
> I am not sure I understand your question here.
> With this patch, on !arm arches, skip_to_last_invalid_pfn is equal to (pfn),
> and will be increased
> when for{} loop continue. We only *skip* to the start pfn of next valid
> region when
> CONFIG_HAVE_MEMBLOCK and CONFIG_HAVE_ARCH_PFN_VALID(arm/arm64 supports
> both).
>

What I am saying is that the loop in memmap_init_zone

for (pfn = start_pfn; pfn < end_pfn; pfn++) { ... }

should be replaced by something like

for (pfn = start_pfn; pfn < end_pfn; pfn = next_valid_pfn(pfn))

where next_valid_pfn() is simply defined as

static ulong next_valid_pfn(ulong pfn)
{
  return pfn + 1;
}

by default, unless we do something special like you are proposing for
ARM and arm64, in which case you provide a different implementation.
That way, we no longer have to reason around the pfn++, or return an
invalid pfn just so that the ++ will produce a valid one.
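
A rough sketch of that split (assuming the default lives in a generic
header and arm/arm64 override it via asm/page.h; nothing here is from
the posted series):

/* generic default, only used when the arch does not override it */
#ifndef next_valid_pfn
static inline unsigned long next_valid_pfn(unsigned long pfn)
{
	return pfn + 1;
}
#endif

/* arm/arm64 could then simply do */
#define next_valid_pfn(pfn)	memblock_next_valid_pfn(pfn)

so the loop's pfn = next_valid_pfn(pfn) lands directly on the next valid
pfn, with no "- 1" adjustment.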

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
@ 2018-04-02  7:53         ` Ard Biesheuvel
  0 siblings, 0 replies; 46+ messages in thread
From: Ard Biesheuvel @ 2018-04-02  7:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 2 April 2018 at 09:49, Jia He <hejianet@gmail.com> wrote:
>
>
> On 4/2/2018 2:55 PM, Ard Biesheuvel Wrote:
>>
>> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>>>
>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>> where possible") optimized the loop in memmap_init_zone(). But it causes
>>> possible panic bug. So Daniel Vacek reverted it later.
>>>
>>> But as suggested by Daniel Vacek, it is fine to using memblock to skip
>>> gaps and finding next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>>>
>>> On arm and arm64, memblock is used by default. But generic version of
>>> pfn_valid() is based on mem sections and memblock_next_valid_pfn() does
>>> not always return the next valid one but skips more resulting in some
>>> valid frames to be skipped (as if they were invalid). And that's why
>>> kernel was eventually crashing on some !arm machines.
>>>
>>> And as verified by Eugeniu Rosca, arm can benifit from commit
>>> b92df1de5d28. So remain the memblock_next_valid_pfn on arm{,64} and move
>>> the related codes to arm64 arch directory.
>>>
>>> Suggested-by: Daniel Vacek <neelx@redhat.com>
>>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
>>
>> Hello Jia,
>>
>> Apologies for chiming in late.
>
> no problem, thanks for your comments  ;-)
>>
>>
>> If we are going to rearchitect this, I'd rather we change the loop in
>> memmap_init_zone() so that we skip to the next valid PFN directly
>> rather than skipping to the last invalid PFN so that the pfn++ in the
>
> hmm... Maybe this macro name makes you confused
>
> pfn = skip_to_last_invalid_pfn(pfn);
>
> how about skip_to_next_valid_pfn?
>
>> for () results in the next value. Can we replace the pfn++ there with
>> a function calls that defaults to 'return pfn + 1', but does the skip
>> for architectures that implement it?
>
> I am not sure I understand your question here.
> With this patch, on !arm arches, skip_to_last_invalid_pfn is equal to (pfn),
> and will be increased
> when for{} loop continue. We only *skip* to the start pfn of next valid
> region when
> CONFIG_HAVE_MEMBLOCK and CONFIG_HAVE_ARCH_PFN_VALID(arm/arm64 supports
> both).
>

What I am saying is that the loop in memmap_init_zone

for (pfn = start_pfn; pfn < end_pfn; pfn++) { ... }

should be replaced by something like

for (pfn = start_pfn; pfn < end_pfn; pfn = next_valid_pfn(pfn))

where next_valid_pfn() is simply defined as

static ulong next_valid_pfn(ulong pfn)
{
  return pfn + 1;
}

by default, unless we do something special like you are proposing for
ARM and arm64, in which case you provide a different implementation.
That way, we no longer have to reason around the pfn++, or return an
invalid pfn just so that the ++ will produce a valid one.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
  2018-04-02  2:30   ` Jia He
@ 2018-04-02  8:01     ` Daniel Vacek
  -1 siblings, 0 replies; 46+ messages in thread
From: Daniel Vacek @ 2018-04-02  8:01 UTC (permalink / raw)
  To: Jia He
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Ard Biesheuvel, Andrew Morton, Michal Hocko, Wei Yang, Kees Cook,
	Laura Abbott, Vladimir Murzin, Philip Derrin, Grygorii Strashko,
	AKASHI Takahiro, James Morse, Steve Capper, Pavel Tatashin,
	Gioh Kim, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU, Andrey Ryabinin,
	Nikolay Borisov, Daniel Jordan, Eugeniu Rosca, linux-arm-kernel,
	open list, linux-mm, Jia He

On Mon, Apr 2, 2018 at 4:30 AM, Jia He <hejianet@gmail.com> wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But there is
> still some room for improvement. E.g. if pfn and pfn+1 are in the same
> memblock region, we can simply pfn++ instead of doing the binary search
> in memblock_next_valid_pfn.
>
> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> ---
>  arch/arm/include/asm/page.h   |  1 +
>  arch/arm/mm/init.c            | 28 ++++++++++++++++++++++------
>  arch/arm64/include/asm/page.h |  1 +
>  arch/arm64/mm/init.c          | 28 ++++++++++++++++++++++------
>  4 files changed, 46 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 489875c..f38909c 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -157,6 +157,7 @@ extern void copy_page(void *to, const void *from);
>  typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +extern int early_region_idx;

I believe this is not needed anymore.

>  extern int pfn_valid(unsigned long);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 0fb85ca..06ed190 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -193,6 +193,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>  }
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +int early_region_idx __meminitdata = -1;

static?

> +
>  int pfn_valid(unsigned long pfn)
>  {
>         return memblock_is_map_memory(__pfn_to_phys(pfn));
> @@ -203,28 +205,42 @@ EXPORT_SYMBOL(pfn_valid);
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
>         struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
>         unsigned int right = type->cnt;
>         unsigned int mid, left = 0;
> +       unsigned long start_pfn, end_pfn;
>         phys_addr_t addr = PFN_PHYS(++pfn);
>
> +       /* fast path, return pfn+1 if next pfn is in the same region */
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                               regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return pfn;
> +       }
> +
> +       /* slow path, do the binary searching */
>         do {
>                 mid = (right + left) / 2;
>
> -               if (addr < type->regions[mid].base)
> +               if (addr < regions[mid].base)
>                         right = mid;
> -               else if (addr >= (type->regions[mid].base +
> -                                 type->regions[mid].size))
> +               else if (addr >= (regions[mid].base + regions[mid].size))
>                         left = mid + 1;
>                 else {
> -                       /* addr is within the region, so pfn is valid */
> +                       early_region_idx = mid;
>                         return pfn;
>                 }
>         } while (left < right);
>
>         if (right == type->cnt)
>                 return -1UL;
> -       else
> -               return PHYS_PFN(type->regions[right].base);
> +
> +       early_region_idx = right;
> +
> +       return PHYS_PFN(regions[early_region_idx].base);
>  }
>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index e57d3f2..f0d8c8e5 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -38,6 +38,7 @@ extern void clear_page(void *to);
>  typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +extern int early_region_idx;

ditto

>  extern int pfn_valid(unsigned long);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 13e43ff..342e4e2 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -285,6 +285,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #endif /* CONFIG_NUMA */
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +int early_region_idx __meminitdata = -1;

ditto

--nX

> +
>  int pfn_valid(unsigned long pfn)
>  {
>         return memblock_is_map_memory(pfn << PAGE_SHIFT);
> @@ -295,28 +297,42 @@ EXPORT_SYMBOL(pfn_valid);
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
>         struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
>         unsigned int right = type->cnt;
>         unsigned int mid, left = 0;
> +       unsigned long start_pfn, end_pfn;
>         phys_addr_t addr = PFN_PHYS(++pfn);
>
> +       /* fast path, return pfn+1 if next pfn is in the same region */
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                               regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return pfn;
> +       }
> +
> +       /* slow path, do the binary searching */
>         do {
>                 mid = (right + left) / 2;
>
> -               if (addr < type->regions[mid].base)
> +               if (addr < regions[mid].base)
>                         right = mid;
> -               else if (addr >= (type->regions[mid].base +
> -                                 type->regions[mid].size))
> +               else if (addr >= (regions[mid].base + regions[mid].size))
>                         left = mid + 1;
>                 else {
> -                       /* addr is within the region, so pfn is valid */
> +                       early_region_idx = mid;
>                         return pfn;
>                 }
>         } while (left < right);
>
>         if (right == type->cnt)
>                 return -1UL;
> -       else
> -               return PHYS_PFN(type->regions[right].base);
> +
> +       early_region_idx = right;
> +
> +       return PHYS_PFN(regions[early_region_idx].base);
>  }
>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
@ 2018-04-02  8:01     ` Daniel Vacek
  0 siblings, 0 replies; 46+ messages in thread
From: Daniel Vacek @ 2018-04-02  8:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Apr 2, 2018 at 4:30 AM, Jia He <hejianet@gmail.com> wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But there is
> still some room for improvement. E.g. if pfn and pfn+1 are in the same
> memblock region, we can simply pfn++ instead of doing the binary search
> in memblock_next_valid_pfn.
>
> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
> ---
>  arch/arm/include/asm/page.h   |  1 +
>  arch/arm/mm/init.c            | 28 ++++++++++++++++++++++------
>  arch/arm64/include/asm/page.h |  1 +
>  arch/arm64/mm/init.c          | 28 ++++++++++++++++++++++------
>  4 files changed, 46 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 489875c..f38909c 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -157,6 +157,7 @@ extern void copy_page(void *to, const void *from);
>  typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +extern int early_region_idx;

I believe this is not needed anymore.

>  extern int pfn_valid(unsigned long);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 0fb85ca..06ed190 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -193,6 +193,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>  }
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +int early_region_idx __meminitdata = -1;

static?

> +
>  int pfn_valid(unsigned long pfn)
>  {
>         return memblock_is_map_memory(__pfn_to_phys(pfn));
> @@ -203,28 +205,42 @@ EXPORT_SYMBOL(pfn_valid);
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
>         struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
>         unsigned int right = type->cnt;
>         unsigned int mid, left = 0;
> +       unsigned long start_pfn, end_pfn;
>         phys_addr_t addr = PFN_PHYS(++pfn);
>
> +       /* fast path, return pfn+1 if next pfn is in the same region */
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                               regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return pfn;
> +       }
> +
> +       /* slow path, do the binary searching */
>         do {
>                 mid = (right + left) / 2;
>
> -               if (addr < type->regions[mid].base)
> +               if (addr < regions[mid].base)
>                         right = mid;
> -               else if (addr >= (type->regions[mid].base +
> -                                 type->regions[mid].size))
> +               else if (addr >= (regions[mid].base + regions[mid].size))
>                         left = mid + 1;
>                 else {
> -                       /* addr is within the region, so pfn is valid */
> +                       early_region_idx = mid;
>                         return pfn;
>                 }
>         } while (left < right);
>
>         if (right == type->cnt)
>                 return -1UL;
> -       else
> -               return PHYS_PFN(type->regions[right].base);
> +
> +       early_region_idx = right;
> +
> +       return PHYS_PFN(regions[early_region_idx].base);
>  }
>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index e57d3f2..f0d8c8e5 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -38,6 +38,7 @@ extern void clear_page(void *to);
>  typedef struct page *pgtable_t;
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +extern int early_region_idx;

ditto

>  extern int pfn_valid(unsigned long);
>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 13e43ff..342e4e2 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -285,6 +285,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #endif /* CONFIG_NUMA */
>
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +int early_region_idx __meminitdata = -1;

ditto

--nX

> +
>  int pfn_valid(unsigned long pfn)
>  {
>         return memblock_is_map_memory(pfn << PAGE_SHIFT);
> @@ -295,28 +297,42 @@ EXPORT_SYMBOL(pfn_valid);
>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>  {
>         struct memblock_type *type = &memblock.memory;
> +       struct memblock_region *regions = type->regions;
>         unsigned int right = type->cnt;
>         unsigned int mid, left = 0;
> +       unsigned long start_pfn, end_pfn;
>         phys_addr_t addr = PFN_PHYS(++pfn);
>
> +       /* fast path, return pfn+1 if next pfn is in the same region */
> +       if (early_region_idx != -1) {
> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
> +                               regions[early_region_idx].size);
> +
> +               if (pfn >= start_pfn && pfn < end_pfn)
> +                       return pfn;
> +       }
> +
> +       /* slow path, do the binary searching */
>         do {
>                 mid = (right + left) / 2;
>
> -               if (addr < type->regions[mid].base)
> +               if (addr < regions[mid].base)
>                         right = mid;
> -               else if (addr >= (type->regions[mid].base +
> -                                 type->regions[mid].size))
> +               else if (addr >= (regions[mid].base + regions[mid].size))
>                         left = mid + 1;
>                 else {
> -                       /* addr is within the region, so pfn is valid */
> +                       early_region_idx = mid;
>                         return pfn;
>                 }
>         } while (left < right);
>
>         if (right == type->cnt)
>                 return -1UL;
> -       else
> -               return PHYS_PFN(type->regions[right].base);
> +
> +       early_region_idx = right;
> +
> +       return PHYS_PFN(regions[early_region_idx].base);
>  }
>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 5/5] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()
  2018-04-02  7:00     ` Ard Biesheuvel
@ 2018-04-02  8:15       ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  8:15 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	Linux Kernel Mailing List, Linux-MM, Jia He



On 4/2/2018 3:00 PM, Ard Biesheuvel Wrote:
> How much does it improve the performance? And in which cases?
>
> I guess it improves boot time on systems with physical address spaces
> that are sparsely populated with DRAM, but you really have to quantify
> this if you want other people to care.
Yes, I described the performance improvement in the cover letter (patch
0/5). I will add it to the patch description later.

-- 
Cheers,
Jia

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v5 5/5] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()
@ 2018-04-02  8:15       ` Jia He
  0 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-02  8:15 UTC (permalink / raw)
  To: linux-arm-kernel



On 4/2/2018 3:00 PM, Ard Biesheuvel Wrote:
> How much does it improve the performance? And in which cases?
>
> I guess it improves boot time on systems with physical address spaces
> that are sparsely populated with DRAM, but you really have to quantify
> this if you want other people to care.
Yes, I described the performance improvement in the cover letter (patch
0/5). I will add it to the patch description later.

-- 
Cheers,
Jia

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
  2018-04-02  6:57     ` Ard Biesheuvel
@ 2018-04-02  8:43       ` Daniel Vacek
  -1 siblings, 0 replies; 46+ messages in thread
From: Daniel Vacek @ 2018-04-02  8:43 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Jia He, Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, Grygorii Strashko,
	AKASHI Takahiro, James Morse, Steve Capper, Pavel Tatashin,
	Gioh Kim, Vlastimil Babka, Mel Gorman, Johannes Weiner,
	Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU, Andrey Ryabinin,
	Nikolay Borisov, Daniel Jordan, Eugeniu Rosca, linux-arm-kernel,
	Linux Kernel Mailing List, Linux-MM, Jia He

On Mon, Apr 2, 2018 at 8:57 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") optimized the loop in memmap_init_zone(). But there is
>> still some room for improvement. E.g. if pfn and pfn+1 are in the same
>> memblock region, we can simply pfn++ instead of doing the binary search
>> in memblock_next_valid_pfn.
>>
>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
>> ---
>>  arch/arm/include/asm/page.h   |  1 +
>>  arch/arm/mm/init.c            | 28 ++++++++++++++++++++++------
>>  arch/arm64/include/asm/page.h |  1 +
>>  arch/arm64/mm/init.c          | 28 ++++++++++++++++++++++------
>>  4 files changed, 46 insertions(+), 12 deletions(-)
>>
>
> Could we put this in a shared file somewhere? This is the second patch
> where you put make identical changes to ARM and arm64.

Ard, I was wondering if we could actually change this to something like
CONFIG_MEMBLOCK_PFN_VALID and share it, instead of it being arm
specific? Is there a reason it is only usable on arm? The rest depends
on this, hence I suggested placing it close by. But generalizing it all
would make it a lot cleaner.

arch/arm/mm/init.c:196:
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
int pfn_valid(unsigned long pfn)
{
        return memblock_is_map_memory(__pfn_to_phys(pfn));
}
EXPORT_SYMBOL(pfn_valid);
#endif

arch/arm64/mm/init.c:287:
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
int pfn_valid(unsigned long pfn)
{
        return memblock_is_map_memory(pfn << PAGE_SHIFT);
}
EXPORT_SYMBOL(pfn_valid);
#endif
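
For illustration, under a hypothetical CONFIG_MEMBLOCK_PFN_VALID the
shared copy could live in mm/memblock.c and use PFN_PHYS(), which is
equivalent to both of the expressions above:

#ifdef CONFIG_MEMBLOCK_PFN_VALID
int pfn_valid(unsigned long pfn)
{
	return memblock_is_map_memory(PFN_PHYS(pfn));
}
EXPORT_SYMBOL(pfn_valid);
#endif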

--nX

>> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
>> index 489875c..f38909c 100644
>> --- a/arch/arm/include/asm/page.h
>> +++ b/arch/arm/include/asm/page.h
>> @@ -157,6 +157,7 @@ extern void copy_page(void *to, const void *from);
>>  typedef struct page *pgtable_t;
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +extern int early_region_idx;
>>  extern int pfn_valid(unsigned long);
>>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index 0fb85ca..06ed190 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -193,6 +193,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>>  }
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +int early_region_idx __meminitdata = -1;
>> +
>>  int pfn_valid(unsigned long pfn)
>>  {
>>         return memblock_is_map_memory(__pfn_to_phys(pfn));
>> @@ -203,28 +205,42 @@ EXPORT_SYMBOL(pfn_valid);
>>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>>  {
>>         struct memblock_type *type = &memblock.memory;
>> +       struct memblock_region *regions = type->regions;
>>         unsigned int right = type->cnt;
>>         unsigned int mid, left = 0;
>> +       unsigned long start_pfn, end_pfn;
>>         phys_addr_t addr = PFN_PHYS(++pfn);
>>
>> +       /* fast path, return pfn+1 if next pfn is in the same region */
>> +       if (early_region_idx != -1) {
>> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
>> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
>> +                               regions[early_region_idx].size);
>> +
>> +               if (pfn >= start_pfn && pfn < end_pfn)
>> +                       return pfn;
>> +       }
>> +
>> +       /* slow path, do the binary searching */
>>         do {
>>                 mid = (right + left) / 2;
>>
>> -               if (addr < type->regions[mid].base)
>> +               if (addr < regions[mid].base)
>>                         right = mid;
>> -               else if (addr >= (type->regions[mid].base +
>> -                                 type->regions[mid].size))
>> +               else if (addr >= (regions[mid].base + regions[mid].size))
>>                         left = mid + 1;
>>                 else {
>> -                       /* addr is within the region, so pfn is valid */
>> +                       early_region_idx = mid;
>>                         return pfn;
>>                 }
>>         } while (left < right);
>>
>>         if (right == type->cnt)
>>                 return -1UL;
>> -       else
>> -               return PHYS_PFN(type->regions[right].base);
>> +
>> +       early_region_idx = right;
>> +
>> +       return PHYS_PFN(regions[early_region_idx].base);
>>  }
>>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index e57d3f2..f0d8c8e5 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -38,6 +38,7 @@ extern void clear_page(void *to);
>>  typedef struct page *pgtable_t;
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +extern int early_region_idx;
>>  extern int pfn_valid(unsigned long);
>>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 13e43ff..342e4e2 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -285,6 +285,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>  #endif /* CONFIG_NUMA */
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +int early_region_idx __meminitdata = -1;
>> +
>>  int pfn_valid(unsigned long pfn)
>>  {
>>         return memblock_is_map_memory(pfn << PAGE_SHIFT);
>> @@ -295,28 +297,42 @@ EXPORT_SYMBOL(pfn_valid);
>>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>>  {
>>         struct memblock_type *type = &memblock.memory;
>> +       struct memblock_region *regions = type->regions;
>>         unsigned int right = type->cnt;
>>         unsigned int mid, left = 0;
>> +       unsigned long start_pfn, end_pfn;
>>         phys_addr_t addr = PFN_PHYS(++pfn);
>>
>> +       /* fast path, return pfn+1 if next pfn is in the same region */
>> +       if (early_region_idx != -1) {
>> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
>> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
>> +                               regions[early_region_idx].size);
>> +
>> +               if (pfn >= start_pfn && pfn < end_pfn)
>> +                       return pfn;
>> +       }
>> +
>> +       /* slow path, do the binary searching */
>>         do {
>>                 mid = (right + left) / 2;
>>
>> -               if (addr < type->regions[mid].base)
>> +               if (addr < regions[mid].base)
>>                         right = mid;
>> -               else if (addr >= (type->regions[mid].base +
>> -                                 type->regions[mid].size))
>> +               else if (addr >= (regions[mid].base + regions[mid].size))
>>                         left = mid + 1;
>>                 else {
>> -                       /* addr is within the region, so pfn is valid */
>> +                       early_region_idx = mid;
>>                         return pfn;
>>                 }
>>         } while (left < right);
>>
>>         if (right == type->cnt)
>>                 return -1UL;
>> -       else
>> -               return PHYS_PFN(type->regions[right].base);
>> +
>> +       early_region_idx = right;
>> +
>> +       return PHYS_PFN(regions[early_region_idx].base);
>>  }
>>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
@ 2018-04-02  8:43       ` Daniel Vacek
  0 siblings, 0 replies; 46+ messages in thread
From: Daniel Vacek @ 2018-04-02  8:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Apr 2, 2018 at 8:57 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") optimized the loop in memmap_init_zone(). But there is
>> still some room for improvement. E.g. if pfn and pfn+1 are in the same
>> memblock region, we can simply pfn++ instead of doing the binary search
>> in memblock_next_valid_pfn.
>>
>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
>> ---
>>  arch/arm/include/asm/page.h   |  1 +
>>  arch/arm/mm/init.c            | 28 ++++++++++++++++++++++------
>>  arch/arm64/include/asm/page.h |  1 +
>>  arch/arm64/mm/init.c          | 28 ++++++++++++++++++++++------
>>  4 files changed, 46 insertions(+), 12 deletions(-)
>>
>
> Could we put this in a shared file somewhere? This is the second patch
> where you put make identical changes to ARM and arm64.

Ard, I was wondering if we could actually change this to something like
CONFIG_MEMBLOCK_PFN_VALID and share it, instead of it being arm
specific? Is there a reason it is only usable on arm? The rest depends
on this, hence I suggested placing it close by. But generalizing it all
would make it a lot cleaner.

arch/arm/mm/init.c:196:
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
int pfn_valid(unsigned long pfn)
{
        return memblock_is_map_memory(__pfn_to_phys(pfn));
}
EXPORT_SYMBOL(pfn_valid);
#endif

arch/arm64/mm/init.c:287:
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
int pfn_valid(unsigned long pfn)
{
        return memblock_is_map_memory(pfn << PAGE_SHIFT);
}
EXPORT_SYMBOL(pfn_valid);
#endif

--nX

>> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
>> index 489875c..f38909c 100644
>> --- a/arch/arm/include/asm/page.h
>> +++ b/arch/arm/include/asm/page.h
>> @@ -157,6 +157,7 @@ extern void copy_page(void *to, const void *from);
>>  typedef struct page *pgtable_t;
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +extern int early_region_idx;
>>  extern int pfn_valid(unsigned long);
>>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
>> index 0fb85ca..06ed190 100644
>> --- a/arch/arm/mm/init.c
>> +++ b/arch/arm/mm/init.c
>> @@ -193,6 +193,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
>>  }
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +int early_region_idx __meminitdata = -1;
>> +
>>  int pfn_valid(unsigned long pfn)
>>  {
>>         return memblock_is_map_memory(__pfn_to_phys(pfn));
>> @@ -203,28 +205,42 @@ EXPORT_SYMBOL(pfn_valid);
>>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>>  {
>>         struct memblock_type *type = &memblock.memory;
>> +       struct memblock_region *regions = type->regions;
>>         unsigned int right = type->cnt;
>>         unsigned int mid, left = 0;
>> +       unsigned long start_pfn, end_pfn;
>>         phys_addr_t addr = PFN_PHYS(++pfn);
>>
>> +       /* fast path, return pfn+1 if next pfn is in the same region */
>> +       if (early_region_idx != -1) {
>> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
>> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
>> +                               regions[early_region_idx].size);
>> +
>> +               if (pfn >= start_pfn && pfn < end_pfn)
>> +                       return pfn;
>> +       }
>> +
>> +       /* slow path, do the binary searching */
>>         do {
>>                 mid = (right + left) / 2;
>>
>> -               if (addr < type->regions[mid].base)
>> +               if (addr < regions[mid].base)
>>                         right = mid;
>> -               else if (addr >= (type->regions[mid].base +
>> -                                 type->regions[mid].size))
>> +               else if (addr >= (regions[mid].base + regions[mid].size))
>>                         left = mid + 1;
>>                 else {
>> -                       /* addr is within the region, so pfn is valid */
>> +                       early_region_idx = mid;
>>                         return pfn;
>>                 }
>>         } while (left < right);
>>
>>         if (right == type->cnt)
>>                 return -1UL;
>> -       else
>> -               return PHYS_PFN(type->regions[right].base);
>> +
>> +       early_region_idx = right;
>> +
>> +       return PHYS_PFN(regions[early_region_idx].base);
>>  }
>>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index e57d3f2..f0d8c8e5 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -38,6 +38,7 @@ extern void clear_page(void *to);
>>  typedef struct page *pgtable_t;
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +extern int early_region_idx;
>>  extern int pfn_valid(unsigned long);
>>  extern unsigned long memblock_next_valid_pfn(unsigned long pfn);
>>  #define skip_to_last_invalid_pfn(pfn) (memblock_next_valid_pfn(pfn) - 1)
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 13e43ff..342e4e2 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -285,6 +285,8 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>  #endif /* CONFIG_NUMA */
>>
>>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>> +int early_region_idx __meminitdata = -1;
>> +
>>  int pfn_valid(unsigned long pfn)
>>  {
>>         return memblock_is_map_memory(pfn << PAGE_SHIFT);
>> @@ -295,28 +297,42 @@ EXPORT_SYMBOL(pfn_valid);
>>  unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>>  {
>>         struct memblock_type *type = &memblock.memory;
>> +       struct memblock_region *regions = type->regions;
>>         unsigned int right = type->cnt;
>>         unsigned int mid, left = 0;
>> +       unsigned long start_pfn, end_pfn;
>>         phys_addr_t addr = PFN_PHYS(++pfn);
>>
>> +       /* fast path, return pfn+1 if next pfn is in the same region */
>> +       if (early_region_idx != -1) {
>> +               start_pfn = PFN_DOWN(regions[early_region_idx].base);
>> +               end_pfn = PFN_DOWN(regions[early_region_idx].base +
>> +                               regions[early_region_idx].size);
>> +
>> +               if (pfn >= start_pfn && pfn < end_pfn)
>> +                       return pfn;
>> +       }
>> +
>> +       /* slow path, do the binary searching */
>>         do {
>>                 mid = (right + left) / 2;
>>
>> -               if (addr < type->regions[mid].base)
>> +               if (addr < regions[mid].base)
>>                         right = mid;
>> -               else if (addr >= (type->regions[mid].base +
>> -                                 type->regions[mid].size))
>> +               else if (addr >= (regions[mid].base + regions[mid].size))
>>                         left = mid + 1;
>>                 else {
>> -                       /* addr is within the region, so pfn is valid */
>> +                       early_region_idx = mid;
>>                         return pfn;
>>                 }
>>         } while (left < right);
>>
>>         if (right == type->cnt)
>>                 return -1UL;
>> -       else
>> -               return PHYS_PFN(type->regions[right].base);
>> +
>> +       early_region_idx = right;
>> +
>> +       return PHYS_PFN(regions[early_region_idx].base);
>>  }
>>  EXPORT_SYMBOL(memblock_next_valid_pfn);
>>  #endif /*CONFIG_HAVE_ARCH_PFN_VALID*/
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
  2018-04-02  2:30   ` Jia He
  (?)
@ 2018-04-02 15:07     ` kbuild test robot
  -1 siblings, 0 replies; 46+ messages in thread
From: kbuild test robot @ 2018-04-02 15:07 UTC (permalink / raw)
  To: Jia He
  Cc: kbuild-all, Russell King, Catalin Marinas, Will Deacon,
	Mark Rutland, Ard Biesheuvel, Andrew Morton, Michal Hocko,
	Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

[-- Attachment #1: Type: text/plain, Size: 1333 bytes --]

Hi Jia,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.16 next-20180329]
[cannot apply to arm64/for-next/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jia-He/optimize-memblock_next_valid_pfn-and-early_pfn_valid-on-arm-and-arm64/20180402-131223
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm64 

All warnings (new ones prefixed by >>):

>> WARNING: vmlinux.o(.text+0x39a80): Section mismatch in reference from the function memblock_next_valid_pfn() to the variable .meminit.data:$d
   The function memblock_next_valid_pfn() references
   the variable __meminitdata $d.
   This is often because memblock_next_valid_pfn lacks a __meminitdata
   annotation or the annotation of $d is wrong.
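
A rough sketch of one way such a mismatch can be resolved, assuming the helper
is only needed while memmap_init_zone() runs at early boot (illustrative only,
not necessarily the fix adopted in this series): keep the cached index and the
function that uses it in init sections together, and drop the module export,
since init text/data are freed after boot:

    int early_region_idx __meminitdata = -1;

    /* placed in .meminit.text, so the reference to .meminit.data is legal */
    unsigned long __meminit memblock_next_valid_pfn(unsigned long pfn)
    {
            /* same fast-path / binary-search body as in the patch */
            return pfn + 1;    /* placeholder body for this sketch */
    }
    /* no EXPORT_SYMBOL(): discarded init code must not be used by modules */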

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 58864 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 4/5] arm64: introduce pfn_valid_region()
  2018-04-02  2:30   ` Jia He
  (?)
@ 2018-04-02 18:53     ` kbuild test robot
  -1 siblings, 0 replies; 46+ messages in thread
From: kbuild test robot @ 2018-04-02 18:53 UTC (permalink / raw)
  To: Jia He
  Cc: kbuild-all, Russell King, Catalin Marinas, Will Deacon,
	Mark Rutland, Ard Biesheuvel, Andrew Morton, Michal Hocko,
	Wei Yang, Kees Cook, Laura Abbott, Vladimir Murzin,
	Philip Derrin, Grygorii Strashko, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	linux-kernel, linux-mm, Jia He, Jia He

[-- Attachment #1: Type: text/plain, Size: 1312 bytes --]

Hi Jia,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.16 next-20180329]
[cannot apply to arm64/for-next/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jia-He/optimize-memblock_next_valid_pfn-and-early_pfn_valid-on-arm-and-arm64/20180402-131223
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm64 

All warnings (new ones prefixed by >>):

>> WARNING: vmlinux.o(.text+0x39c4c): Section mismatch in reference from the function pfn_valid_region() to the variable .meminit.data:$d
   The function pfn_valid_region() references
   the variable __meminitdata $d.
   This is often because pfn_valid_region lacks a __meminitdata
   annotation or the annotation of $d is wrong.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 58864 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  2018-04-02  7:53         ` Ard Biesheuvel
@ 2018-04-03  3:07           ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-03  3:07 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	Linux Kernel Mailing List, Linux-MM, Jia He



On 4/2/2018 3:53 PM, Ard Biesheuvel Wrote:
> On 2 April 2018 at 09:49, Jia He <hejianet@gmail.com> wrote:
>>
>> On 4/2/2018 2:55 PM, Ard Biesheuvel Wrote:
>>> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>>> where possible") optimized the loop in memmap_init_zone(), but it caused
>>>> a possible panic, so Daniel Vacek later reverted it.
>>>>
>>>> But as suggested by Daniel Vacek, it is fine to use memblock to skip
>>>> gaps and find the next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>>>>
>>>> On arm and arm64, memblock is used by default. But the generic version of
>>>> pfn_valid() is based on mem sections, and memblock_next_valid_pfn() does
>>>> not always return the next valid pfn but may skip further, so some valid
>>>> frames end up treated as invalid. That is why the kernel was eventually
>>>> crashing on some !arm machines.
>>>>
>>>> And as verified by Eugeniu Rosca, arm can benefit from commit
>>>> b92df1de5d28. So keep memblock_next_valid_pfn() on arm{,64} and move
>>>> the related code to the arm64 arch directory.
>>>>
>>>> Suggested-by: Daniel Vacek <neelx@redhat.com>
>>>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
>>> Hello Jia,
>>>
>>> Apologies for chiming in late.
>> no problem, thanks for your comments  ;-)
>>>
>>> If we are going to rearchitect this, I'd rather we change the loop in
>>> memmap_init_zone() so that we skip to the next valid PFN directly
>>> rather than skipping to the last invalid PFN so that the pfn++ in the
>> hmm... Maybe this macro name makes you confused
>>
>> pfn = skip_to_last_invalid_pfn(pfn);
>>
>> how about skip_to_next_valid_pfn?
>>
>>> for () results in the next value. Can we replace the pfn++ there with
>>> a function call that defaults to 'return pfn + 1', but does the skip
>>> for architectures that implement it?
>> I am not sure I understand your question here.
>> With this patch, on !arm arches, skip_to_last_invalid_pfn(pfn) is equal to
>> pfn and will be increased when the for{} loop continues. We only *skip* to
>> the start pfn of the next valid region when CONFIG_HAVE_MEMBLOCK and
>> CONFIG_HAVE_ARCH_PFN_VALID are both set (arm/arm64 support both).
>>
> What I am saying is that the loop in memmap_init_zone
>
> for (pfn = start_pfn; pfn < end_pfn; pfn++) { ... }
>
> should be replaced by something like
>
> for (pfn = start_pfn; pfn < end_pfn; pfn = next_valid_pfn(pfn))
>
> where next_valid_pfn() is simply defined as
>
> static ulong next_valid_pfn(ulong pfn)
> {
>    return pfn + 1;
> }
Hi Ard,
Do you think a macro instead of a simple function is better here?
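
(As a side note, a minimal sketch of the default-plus-override shape being
discussed, with next_valid_pfn used purely as a hypothetical name; whether it
ends up as a macro or an inline function, the effect is the same:)

    /* generic fallback: every pfn is assumed valid, just advance by one */
    #ifndef next_valid_pfn
    static inline unsigned long next_valid_pfn(unsigned long pfn)
    {
            return pfn + 1;
    }
    #endif

    /*
     * An architecture that can skip holes (e.g. via memblock) would supply
     * its own definition before this fallback is seen, for instance:
     *     #define next_valid_pfn(pfn)  memblock_next_valid_pfn(pfn)
     */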
--
Cheers,
Jia

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() on arm and arm64
  2018-04-02  7:53         ` Ard Biesheuvel
@ 2018-04-11  4:47           ` Jia He
  -1 siblings, 0 replies; 46+ messages in thread
From: Jia He @ 2018-04-11  4:47 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Russell King, Catalin Marinas, Will Deacon, Mark Rutland,
	Andrew Morton, Michal Hocko, Wei Yang, Kees Cook, Laura Abbott,
	Vladimir Murzin, Philip Derrin, AKASHI Takahiro, James Morse,
	Steve Capper, Pavel Tatashin, Gioh Kim, Vlastimil Babka,
	Mel Gorman, Johannes Weiner, Kemi Wang, Petr Tesarik,
	YASUAKI ISHIMATSU, Andrey Ryabinin, Nikolay Borisov,
	Daniel Jordan, Daniel Vacek, Eugeniu Rosca, linux-arm-kernel,
	Linux Kernel Mailing List, Linux-MM, Jia He



On 4/2/2018 3:53 PM, Ard Biesheuvel Wrote:
> On 2 April 2018 at 09:49, Jia He <hejianet@gmail.com> wrote:
>>
>> On 4/2/2018 2:55 PM, Ard Biesheuvel Wrote:
>>> On 2 April 2018 at 04:30, Jia He <hejianet@gmail.com> wrote:
>>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>>> where possible") optimized the loop in memmap_init_zone(), but it caused
>>>> a possible panic, so Daniel Vacek later reverted it.
>>>>
>>>> But as suggested by Daniel Vacek, it is fine to use memblock to skip
>>>> gaps and find the next valid frame with CONFIG_HAVE_ARCH_PFN_VALID.
>>>>
>>>> On arm and arm64, memblock is used by default. But the generic version of
>>>> pfn_valid() is based on mem sections, and memblock_next_valid_pfn() does
>>>> not always return the next valid pfn but may skip further, so some valid
>>>> frames end up treated as invalid. That is why the kernel was eventually
>>>> crashing on some !arm machines.
>>>>
>>>> And as verified by Eugeniu Rosca, arm can benefit from commit
>>>> b92df1de5d28. So keep memblock_next_valid_pfn() on arm{,64} and move
>>>> the related code to the arm64 arch directory.
>>>>
>>>> Suggested-by: Daniel Vacek <neelx@redhat.com>
>>>> Signed-off-by: Jia He <jia.he@hxt-semitech.com>
>>> Hello Jia,
>>>
>>> Apologies for chiming in late.
>> no problem, thanks for your comments  ;-)
>>>
>>> If we are going to rearchitect this, I'd rather we change the loop in
>>> memmap_init_zone() so that we skip to the next valid PFN directly
>>> rather than skipping to the last invalid PFN so that the pfn++ in the
>> hmm... Maybe this macro name makes you confused
>>
>> pfn = skip_to_last_invalid_pfn(pfn);
>>
>> how about skip_to_next_valid_pfn?
>>
>>> for () results in the next value. Can we replace the pfn++ there with
>>> a function call that defaults to 'return pfn + 1', but does the skip
>>> for architectures that implement it?
>> I am not sure I understand your question here.
>> With this patch, on !arm arches, skip_to_last_invalid_pfn(pfn) is equal to
>> pfn and will be increased when the for{} loop continues. We only *skip* to
>> the start pfn of the next valid region when CONFIG_HAVE_MEMBLOCK and
>> CONFIG_HAVE_ARCH_PFN_VALID are both set (arm/arm64 support both).
>>
> What I am saying is that the loop in memmap_init_zone
>
> for (pfn = start_pfn; pfn < end_pfn; pfn++) { ... }
>
> should be replaced by something like
>
> for (pfn = start_pfn; pfn < end_pfn; pfn = next_valid_pfn(pfn))
After further thinking, IMO, pfn = next_valid_pfn(pfn) might have an impact
on the memmap_init_zone() loop.

E.g. when context != MEMMAP_EARLY, pfn would no longer be checked by
early_pfn_valid(), which would change the memory-hotplug logic.

So I would choose the old implementation:
		if (!early_pfn_valid(pfn)) {
			pfn = next_valid_pfn(pfn) - 1;
			continue;
		}
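
(For context, an illustrative sketch of how that choice sits in the
memmap_init_zone() loop; next_valid_pfn() is still the hypothetical helper
from the discussion above, and the real loop has more checks than shown:)

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if (context == MEMMAP_EARLY && !early_pfn_valid(pfn)) {
			/* jump to one before the next valid pfn, so the
			 * loop's pfn++ lands exactly on it */
			pfn = next_valid_pfn(pfn) - 1;
			continue;
		}
		/* ... initialise the struct page for this pfn ... */
	}

This keeps the hotplug (context != MEMMAP_EARLY) path visiting every pfn
unchanged.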
Any comments? Thanks

-- 
Cheers,
Jia

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2018-04-11  4:47 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-02  2:30 [PATCH v5 0/5] optimize memblock_next_valid_pfn and early_pfn_valid on arm and arm64 Jia He
2018-04-02  2:30 ` Jia He
2018-04-02  2:30 ` [PATCH v5 1/5] mm: page_alloc: remain memblock_next_valid_pfn() " Jia He
2018-04-02  2:30   ` Jia He
2018-04-02  6:55   ` Ard Biesheuvel
2018-04-02  6:55     ` Ard Biesheuvel
2018-04-02  7:49     ` Jia He
2018-04-02  7:49       ` Jia He
2018-04-02  7:49       ` Jia He
2018-04-02  7:53       ` Ard Biesheuvel
2018-04-02  7:53         ` Ard Biesheuvel
2018-04-03  3:07         ` Jia He
2018-04-03  3:07           ` Jia He
2018-04-11  4:47         ` Jia He
2018-04-11  4:47           ` Jia He
2018-04-02  7:50   ` kbuild test robot
2018-04-02  7:50     ` kbuild test robot
2018-04-02  7:50     ` kbuild test robot
2018-04-02  2:30 ` [PATCH v5 2/5] arm: arm64: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn() Jia He
2018-04-02  2:30   ` Jia He
2018-04-02  6:57   ` Ard Biesheuvel
2018-04-02  6:57     ` Ard Biesheuvel
2018-04-02  8:43     ` Daniel Vacek
2018-04-02  8:43       ` Daniel Vacek
2018-04-02  8:01   ` Daniel Vacek
2018-04-02  8:01     ` Daniel Vacek
2018-04-02 15:07   ` kbuild test robot
2018-04-02 15:07     ` kbuild test robot
2018-04-02 15:07     ` kbuild test robot
2018-04-02  2:30 ` [PATCH v5 3/5] mm/memblock: introduce memblock_search_pfn_regions() Jia He
2018-04-02  2:30   ` Jia He
2018-04-02  6:57   ` Ard Biesheuvel
2018-04-02  6:57     ` Ard Biesheuvel
2018-04-02  2:30 ` [PATCH v5 4/5] arm64: introduce pfn_valid_region() Jia He
2018-04-02  2:30   ` Jia He
2018-04-02  6:59   ` Ard Biesheuvel
2018-04-02  6:59     ` Ard Biesheuvel
2018-04-02 18:53   ` kbuild test robot
2018-04-02 18:53     ` kbuild test robot
2018-04-02 18:53     ` kbuild test robot
2018-04-02  2:30 ` [PATCH v5 5/5] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid() Jia He
2018-04-02  2:30   ` Jia He
2018-04-02  7:00   ` Ard Biesheuvel
2018-04-02  7:00     ` Ard Biesheuvel
2018-04-02  8:15     ` Jia He
2018-04-02  8:15       ` Jia He

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.