linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()
@ 2021-04-22  6:18 Mike Rapoport
  2021-04-22  6:18 ` [PATCH v3 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22  6:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland,
	Mike Rapoport, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
pfn_valid_within() to 1. 

The idea is to mark NOMAP pages as reserved in the memory map and restore
the intended semantics of pfn_valid() to designate availability of struct
page for a pfn.

With this, the core mm will be able to cope with the fact that it cannot use
NOMAP pages, and the holes created by NOMAP ranges within MAX_ORDER blocks
will be treated correctly even without pfn_valid_within().
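
For reference, pfn_valid_within() is currently a no-op unless
CONFIG_HOLES_IN_ZONE is set. A minimal sketch of its definition in
include/linux/mmzone.h (modulo comments, which may differ between kernel
versions):

    #ifdef CONFIG_HOLES_IN_ZONE
    #define pfn_valid_within(pfn)	pfn_valid(pfn)
    #else
    #define pfn_valid_within(pfn)	(1)
    #endif

With CONFIG_HOLES_IN_ZONE removed from arm64, only the "(1)" variant
remains, which is what "hardwire pfn_valid_within() to 1" means here.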

The patches are only boot tested on qemu-system-aarch64 so I'd really
appreciate memory stress tests on real hardware.

If this actually works we'll be one step closer to dropping the custom
pfn_valid() on arm64 altogether.

v3:
* Fix minor issues found by Anshuman
* Freshen up the declaration of pfn_valid() to make it consistent with
  pfn_is_map_memory()
* Add more Acked-by and Reviewed-by tags, thanks Anshuman and David

v2: Link: https://lore.kernel.org/lkml/20210421065108.1987-1-rppt@kernel.org
* Add check for PFN overflow in pfn_is_map_memory()
* Add Acked-by and Reviewed-by tags, thanks David.

v1: Link: https://lore.kernel.org/lkml/20210420090925.7457-1-rppt@kernel.org
* Add comment about the semantics of pfn_valid() as Anshuman suggested
* Extend comments about MEMBLOCK_NOMAP, per Anshuman
* Use pfn_is_map_memory() name for the exported wrapper for
  memblock_is_map_memory(). It is still local to arch/arm64 in the end
  because of header dependency issues.

rfc: Link: https://lore.kernel.org/lkml/20210407172607.8812-1-rppt@kernel.org

Mike Rapoport (4):
  include/linux/mmzone.h: add documentation for pfn_valid()
  memblock: update initialization of reserved pages
  arm64: decouple check whether pfn is in linear map from pfn_valid()
  arm64: drop pfn_valid_within() and simplify pfn_valid()

 arch/arm64/Kconfig              |  3 ---
 arch/arm64/include/asm/memory.h |  2 +-
 arch/arm64/include/asm/page.h   |  3 ++-
 arch/arm64/kvm/mmu.c            |  2 +-
 arch/arm64/mm/init.c            | 16 ++++++++++++++--
 arch/arm64/mm/ioremap.c         |  4 ++--
 arch/arm64/mm/mmu.c             |  2 +-
 include/linux/memblock.h        |  4 +++-
 include/linux/mmzone.h          | 11 +++++++++++
 mm/memblock.c                   | 28 ++++++++++++++++++++++++++--
 10 files changed, 61 insertions(+), 14 deletions(-)

base-commit: e49d033bddf5b565044e2abe4241353959bc9120
-- 
2.28.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 1/4] include/linux/mmzone.h: add documentation for pfn_valid()
  2021-04-22  6:18 [PATCH v3 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
@ 2021-04-22  6:18 ` Mike Rapoport
  2021-04-22  6:19 ` [PATCH v3 2/4] memblock: update initialization of reserved pages Mike Rapoport
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22  6:18 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland,
	Mike Rapoport, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

Add a comment describing the semantics of pfn_valid() that clarifies that
pfn_valid() only checks for the availability of a memory map entry (i.e.
struct page) for a PFN rather than the availability of usable memory backing
that PFN.

The most "generic" version of pfn_valid() used by the configurations with
SPARSEMEM enabled resides in include/linux/mmzone.h, so this is the most
suitable place to document the semantics of pfn_valid().

Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/mmzone.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 47946cec7584..961f0eeefb62 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1410,6 +1410,17 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
 #endif
 
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
+/**
+ * pfn_valid - check if there is a valid memory map entry for a PFN
+ * @pfn: the page frame number to check
+ *
+ * Check if there is a valid memory map entry aka struct page for the @pfn.
+ * Note that availability of the memory map entry does not imply that
+ * there is actual usable memory at that @pfn. The struct page may
+ * represent a hole or an unusable page frame.
+ *
+ * Return: 1 for PFNs that have memory map entries and 0 otherwise
+ */
 static inline int pfn_valid(unsigned long pfn)
 {
 	struct mem_section *ms;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 2/4] memblock: update initialization of reserved pages
  2021-04-22  6:18 [PATCH v3 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
  2021-04-22  6:18 ` [PATCH v3 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
@ 2021-04-22  6:19 ` Mike Rapoport
  2021-04-22  6:19 ` [PATCH v3 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22  6:19 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland,
	Mike Rapoport, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

The struct pages representing a reserved memory region are initialized
using the reserve_bootmem_region() function. This function is called for
each reserved region just before the memory is freed from memblock to the
buddy page allocator.

The struct pages for MEMBLOCK_NOMAP regions are kept with the default
values set by the memory map initialization which makes it necessary to
have a special treatment for such pages in pfn_valid() and
pfn_valid_within().

Split out the initialization of the reserved pages into a function with a
meaningful name, and treat the MEMBLOCK_NOMAP regions the same way as the
reserved regions by marking the struct pages for the NOMAP regions as
PageReserved().
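
For reference (not part of this patch), reserve_bootmem_region() in
mm/page_alloc.c already does the actual marking; roughly, as a sketch of
the current implementation:

    void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
    {
            unsigned long start_pfn = PFN_DOWN(start);
            unsigned long end_pfn = PFN_UP(end);

            for (; start_pfn < end_pfn; start_pfn++) {
                    if (pfn_valid(start_pfn)) {
                            struct page *page = pfn_to_page(start_pfn);

                            init_reserved_page(start_pfn);

                            /* Avoid false-positive PageTail() */
                            INIT_LIST_HEAD(&page->lru);

                            __SetPageReserved(page);
                    }
            }
    }

so routing the NOMAP regions through it is enough to get their struct pages
marked PageReserved().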

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/memblock.h |  4 +++-
 mm/memblock.c            | 28 ++++++++++++++++++++++++++--
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5984fff3f175..1b4c97c151ae 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -30,7 +30,9 @@ extern unsigned long long max_possible_pfn;
  * @MEMBLOCK_NONE: no special request
  * @MEMBLOCK_HOTPLUG: hotpluggable region
  * @MEMBLOCK_MIRROR: mirrored region
- * @MEMBLOCK_NOMAP: don't add to kernel direct mapping
+ * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
+ * reserved in the memory map; refer to memblock_mark_nomap() description
+ * for further details
  */
 enum memblock_flags {
 	MEMBLOCK_NONE		= 0x0,	/* No special request */
diff --git a/mm/memblock.c b/mm/memblock.c
index afaefa8fc6ab..3abf2c3fea7f 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -906,6 +906,11 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
  * @base: the base phys addr of the region
  * @size: the size of the region
  *
+ * The memory regions marked with %MEMBLOCK_NOMAP will not be added to the
+ * direct mapping of the physical memory. These regions will still be
+ * covered by the memory map. The struct page representing NOMAP memory
+ * frames in the memory map will be PageReserved()
+ *
  * Return: 0 on success, -errno on failure.
  */
 int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
@@ -2002,6 +2007,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
 	return end_pfn - start_pfn;
 }
 
+static void __init memmap_init_reserved_pages(void)
+{
+	struct memblock_region *region;
+	phys_addr_t start, end;
+	u64 i;
+
+	/* initialize struct pages for the reserved regions */
+	for_each_reserved_mem_range(i, &start, &end)
+		reserve_bootmem_region(start, end);
+
+	/* and also treat struct pages for the NOMAP regions as PageReserved */
+	for_each_mem_region(region) {
+		if (memblock_is_nomap(region)) {
+			start = region->base;
+			end = start + region->size;
+			reserve_bootmem_region(start, end);
+		}
+	}
+}
+
 static unsigned long __init free_low_memory_core_early(void)
 {
 	unsigned long count = 0;
@@ -2010,8 +2035,7 @@ static unsigned long __init free_low_memory_core_early(void)
 
 	memblock_clear_hotplug(0, -1);
 
-	for_each_reserved_mem_range(i, &start, &end)
-		reserve_bootmem_region(start, end);
+	memmap_init_reserved_pages();
 
 	/*
 	 * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid()
  2021-04-22  6:18 [PATCH v3 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
  2021-04-22  6:18 ` [PATCH v3 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
  2021-04-22  6:19 ` [PATCH v3 2/4] memblock: update initialization of reserved pages Mike Rapoport
@ 2021-04-22  6:19 ` Mike Rapoport
  2021-04-22  8:57   ` David Hildenbrand
  2021-04-22  6:19 ` [PATCH v3 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
  2021-04-22  7:50 ` [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes Anshuman Khandual
  4 siblings, 1 reply; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22  6:19 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland,
	Mike Rapoport, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

The intended semantics of pfn_valid() is to verify whether there is a
struct page for the pfn in question and nothing else.

Yet, on arm64 it is used to distinguish memory areas that are mapped in the
linear map vs those that require ioremap() to access them.

Introduce a dedicated pfn_is_map_memory() wrapper for
memblock_is_map_memory() to perform such a check and use it where
appropriate.

Using a wrapper allows us to avoid cyclic include dependencies.

While here, also update the style of the pfn_valid() declaration so that the
pfn_valid() and pfn_is_map_memory() declarations are consistent.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/memory.h |  2 +-
 arch/arm64/include/asm/page.h   |  3 ++-
 arch/arm64/kvm/mmu.c            |  2 +-
 arch/arm64/mm/init.c            | 12 ++++++++++++
 arch/arm64/mm/ioremap.c         |  4 ++--
 arch/arm64/mm/mmu.c             |  2 +-
 6 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0aabc3be9a75..194f9f993d30 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
 
 #define virt_addr_valid(addr)	({					\
 	__typeof__(addr) __addr = __tag_reset(addr);			\
-	__is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));	\
+	__is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr));	\
 })
 
 void dump_mem_limit(void);
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..75ddfe671393 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -37,7 +37,8 @@ void copy_highpage(struct page *to, struct page *from);
 
 typedef struct page *pgtable_t;
 
-extern int pfn_valid(unsigned long);
+int pfn_valid(unsigned long pfn);
+int pfn_is_map_memory(unsigned long pfn);
 
 #include <asm/memory.h>
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8711894db8c2..23dd99e29b23 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 
 static bool kvm_is_device_pfn(unsigned long pfn)
 {
-	return !pfn_valid(pfn);
+	return !pfn_is_map_memory(pfn);
 }
 
 /*
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 3685e12aba9b..966a7a18d528 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -258,6 +258,18 @@ int pfn_valid(unsigned long pfn)
 }
 EXPORT_SYMBOL(pfn_valid);
 
+int pfn_is_map_memory(unsigned long pfn)
+{
+	phys_addr_t addr = PFN_PHYS(pfn);
+
+	/* avoid false positives for bogus PFNs, see comment in pfn_valid() */
+	if (PHYS_PFN(addr) != pfn)
+		return 0;
+
+	return memblock_is_map_memory(addr);
+}
+EXPORT_SYMBOL(pfn_is_map_memory);
+
 static phys_addr_t memory_limit = PHYS_ADDR_MAX;
 
 /*
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index b5e83c46b23e..b7c81dacabf0 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
 	/*
 	 * Don't allow RAM to be mapped.
 	 */
-	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
+	if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
 		return NULL;
 
 	area = get_vm_area_caller(size, VM_IOREMAP, caller);
@@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
 void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
 {
 	/* For normal memory we already have a cacheable mapping. */
-	if (pfn_valid(__phys_to_pfn(phys_addr)))
+	if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
 		return (void __iomem *)__phys_to_virt(phys_addr);
 
 	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 5d9550fdb9cf..26045e9adbd7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 			      unsigned long size, pgprot_t vma_prot)
 {
-	if (!pfn_valid(pfn))
+	if (!pfn_is_map_memory(pfn))
 		return pgprot_noncached(vma_prot);
 	else if (file->f_flags & O_SYNC)
 		return pgprot_writecombine(vma_prot);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v3 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()
  2021-04-22  6:18 [PATCH v3 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
                   ` (2 preceding siblings ...)
  2021-04-22  6:19 ` [PATCH v3 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
@ 2021-04-22  6:19 ` Mike Rapoport
  2021-04-22  7:50 ` [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes Anshuman Khandual
  4 siblings, 0 replies; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22  6:19 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland,
	Mike Rapoport, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

The arm64 version of pfn_valid() differs from the generic one for two
reasons:

* Parts of the memory map are freed during boot. This makes it necessary to
  verify that there is actual physical memory that corresponds to a pfn,
  which is done by querying memblock.

* There are NOMAP memory regions. These regions are not mapped in the
  linear map and until the previous commit the struct pages representing
  these areas had default values.

As a consequence of the absence of special treatment for NOMAP regions in
the memory map, it was necessary to use memblock_is_map_memory() in
pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
generic mm functionality would not treat a NOMAP page as a normal page.

Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
the rest of core mm will treat them as unusable memory, and thus
pfn_valid_within() is no longer required at all and can be disabled by
removing CONFIG_HOLES_IN_ZONE on arm64.

pfn_valid() can be slightly simplified by replacing
memblock_is_map_memory() with memblock_is_memory().
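
For reference, the two memblock helpers differ only in the NOMAP check; a
sketch of their current definitions in mm/memblock.c:

    bool __init_memblock memblock_is_memory(phys_addr_t addr)
    {
            return memblock_search(&memblock.memory, addr) != -1;
    }

    bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
    {
            int i = memblock_search(&memblock.memory, addr);

            if (i == -1)
                    return false;
            return !memblock_is_nomap(&memblock.memory.regions[i]);
    }

With NOMAP struct pages now marked PageReserved(), the extra
memblock_is_nomap() filtering in pfn_valid() is no longer needed.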

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
---
 arch/arm64/Kconfig   | 3 ---
 arch/arm64/mm/init.c | 4 ++--
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e4e1b6550115..58e439046d05 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool y
 	depends on NUMA
 
-config HOLES_IN_ZONE
-	def_bool y
-
 source "kernel/Kconfig.hz"
 
 config ARCH_SPARSEMEM_ENABLE
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 966a7a18d528..f431b38d0837 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn)
 
 	/*
 	 * ZONE_DEVICE memory does not have the memblock entries.
-	 * memblock_is_map_memory() check for ZONE_DEVICE based
+	 * memblock_is_memory() check for ZONE_DEVICE based
 	 * addresses will always fail. Even the normal hotplugged
 	 * memory will never have MEMBLOCK_NOMAP flag set in their
 	 * memblock entries. Skip memblock search for all non early
@@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn)
 		return pfn_section_valid(ms, pfn);
 }
 #endif
-	return memblock_is_map_memory(addr);
+	return memblock_is_memory(addr);
 }
 EXPORT_SYMBOL(pfn_valid);
 
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  6:18 [PATCH v3 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
                   ` (3 preceding siblings ...)
  2021-04-22  6:19 ` [PATCH v3 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
@ 2021-04-22  7:50 ` Anshuman Khandual
  2021-04-22  8:27   ` Mike Rapoport
                     ` (2 more replies)
  4 siblings, 3 replies; 22+ messages in thread
From: Anshuman Khandual @ 2021-04-22  7:50 UTC (permalink / raw)
  To: linux-mm
  Cc: david, rppt, akpm, Anshuman Khandual, Catalin Marinas,
	Will Deacon, linux-arm-kernel, linux-kernel

Platforms like arm and arm64 have redefined pfn_valid() because their early
memory sections might contain memmap holes after parts of the memmap are
freed during boot, and such holes should be skipped while validating a pfn
for struct page backing. This scenario, on platforms where the memmap is not
contiguous, could be captured with a new option
CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES. Then the generic pfn_valid() can be
improved to accommodate such platforms. This reduces the overall code
footprint and also improves maintainability.

free_unused_memmap() and pfn_to_online_page() have been updated to include
such cases. This also exports memblock_is_memory() for all drivers that use
pfn_valid() but lack the required visibility. After the new config is in
place, drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
This patch applies on the latest mainline kernel, on top of Mike's series
regarding the arm64 pfn_valid().

https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t

Changes in RFC V2:

- Dropped support for arm (32 bit)
- Replaced memblock_is_map_memory() check with memblock_is_memory()
- MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
- Updated pfn_to_online_page() per David
- Updated free_unused_memmap() to preserve existing semantics per Mike
- Exported memblock_is_memory() instead of memblock_is_map_memory()

Changes in RFC V1:

- https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/

 arch/arm64/Kconfig            |  2 +-
 arch/arm64/include/asm/page.h |  1 -
 arch/arm64/mm/init.c          | 41 -----------------------------------
 include/linux/mmzone.h        | 18 ++++++++++++++-
 mm/Kconfig                    |  9 ++++++++
 mm/memblock.c                 |  8 +++++--
 mm/memory_hotplug.c           |  5 +++++
 7 files changed, 38 insertions(+), 46 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b4a9b493ce72..4cdc3570ffa9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -144,7 +144,6 @@ config ARM64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
-	select HAVE_ARCH_PFN_VALID
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_STACKLEAK
@@ -167,6 +166,7 @@ config ARM64
 		if $(cc-option,-fpatchable-function-entry=2)
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
 		if DYNAMIC_FTRACE_WITH_REGS
+	select HAVE_EARLY_SECTION_MEMMAP_HOLES
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_FAST_GUP
 	select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 75ddfe671393..fcbef3eec4b2 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
 
 typedef struct page *pgtable_t;
 
-int pfn_valid(unsigned long pfn);
 int pfn_is_map_memory(unsigned long pfn);
 
 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f431b38d0837..5731a11550d8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 	free_area_init(max_zone_pfns);
 }
 
-int pfn_valid(unsigned long pfn)
-{
-	phys_addr_t addr = PFN_PHYS(pfn);
-
-	/*
-	 * Ensure the upper PAGE_SHIFT bits are clear in the
-	 * pfn. Else it might lead to false positives when
-	 * some of the upper bits are set, but the lower bits
-	 * match a valid pfn.
-	 */
-	if (PHYS_PFN(addr) != pfn)
-		return 0;
-
-#ifdef CONFIG_SPARSEMEM
-{
-	struct mem_section *ms;
-
-	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
-		return 0;
-
-	ms = __pfn_to_section(pfn);
-	if (!valid_section(ms))
-		return 0;
-
-	/*
-	 * ZONE_DEVICE memory does not have the memblock entries.
-	 * memblock_is_memory() check for ZONE_DEVICE based
-	 * addresses will always fail. Even the normal hotplugged
-	 * memory will never have MEMBLOCK_NOMAP flag set in their
-	 * memblock entries. Skip memblock search for all non early
-	 * memory sections covering all of hotplug memory including
-	 * both normal and ZONE_DEVICE based.
-	 */
-	if (!early_section(ms))
-		return pfn_section_valid(ms, pfn);
-}
-#endif
-	return memblock_is_memory(addr);
-}
-EXPORT_SYMBOL(pfn_valid);
-
 int pfn_is_map_memory(unsigned long pfn)
 {
 	phys_addr_t addr = PFN_PHYS(pfn);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 961f0eeefb62..18bf71665211 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
  *
  * Return: 1 for PFNs that have memory map entries and 0 otherwise
  */
+bool memblock_is_memory(phys_addr_t addr);
+
 static inline int pfn_valid(unsigned long pfn)
 {
+	phys_addr_t addr = PFN_PHYS(pfn);
 	struct mem_section *ms;
 
+	/*
+	 * Ensure the upper PAGE_SHIFT bits are clear in the
+	 * pfn. Else it might lead to false positives when
+	 * some of the upper bits are set, but the lower bits
+	 * match a valid pfn.
+	 */
+	if (PHYS_PFN(addr) != pfn)
+		return 0;
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 	ms = __nr_to_section(pfn_to_section_nr(pfn));
@@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
 	 * Traditionally early sections always returned pfn_valid() for
 	 * the entire section-sized span.
 	 */
-	return early_section(ms) || pfn_section_valid(ms, pfn);
+	if (early_section(ms))
+		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
+			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
+
+	return pfn_section_valid(ms, pfn);
 }
 #endif
 
diff --git a/mm/Kconfig b/mm/Kconfig
index 24c045b24b95..db7128111874 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -135,6 +135,15 @@ config HAVE_FAST_GUP
 config ARCH_KEEP_MEMBLOCK
 	bool
 
+config HAVE_EARLY_SECTION_MEMMAP_HOLES
+	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
+	def_bool n
+	help
+	  Early sections on certain platforms might have some memory ranges that
+	  are not backed with struct page mappings. When subscribed, this option
+	  enables special handling for those memory ranges in certain situations
+	  such as pfn_valid().
+
 # Keep arch NUMA mapping infrastructure post-init.
 config NUMA_KEEP_MEMINFO
 	bool
diff --git a/mm/memblock.c b/mm/memblock.c
index 3abf2c3fea7f..93f8a9c8428d 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
 {
 	return memblock_search(&memblock.memory, addr) != -1;
 }
+EXPORT_SYMBOL(memblock_is_memory);
 
 bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
 {
@@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
 	unsigned long start, end, prev_end = 0;
 	int i;
 
-	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
-	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
+	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
+		return;
+
+	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
+	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
 		return;
 
 	/*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0cdbbfbc5757..8c78b6a3d888 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
 	 * Save some code text when online_section() +
 	 * pfn_section_valid() are sufficient.
 	 */
+	if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
+		if (early_section(ms) && !memblock_is_memory(PFN_PHYS(pfn)))
+			return NULL;
+	}
+
 	if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
 		return NULL;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  7:50 ` [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes Anshuman Khandual
@ 2021-04-22  8:27   ` Mike Rapoport
  2021-04-22 11:23     ` Anshuman Khandual
  2021-04-22  9:03   ` David Hildenbrand
  2021-05-24  4:58   ` Anshuman Khandual
  2 siblings, 1 reply; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22  8:27 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On Thu, Apr 22, 2021 at 01:20:23PM +0530, Anshuman Khandual wrote:
> Platforms like arm and arm64 have redefined pfn_valid() because their early
> memory sections might have contained memmap holes after freeing parts of it
> during boot, which should be skipped while validating a pfn for struct page
> backing. This scenario on certain platforms where memmap is not continuous,
> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> Then the generic pfn_valid() can be improved to accommodate such platforms.
> This reduces overall code footprint and also improves maintainability.
> 
> free_unused_memmap() and pfn_to_online_page() have been updated to include
> such cases. This also exports memblock_is_memory() for all drivers that use
> pfn_valid() but lack required visibility. After the new config is in place,
> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> This patch applies on the latest mainline kernel after Mike's series
> regarding arm64 based pfn_valid().
> 
> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
> 
> Changes in RFC V2:
> 
> - Dropped support for arm (32 bit)
> - Replaced memblock_is_map_memory() check with memblock_is_memory()
> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> - Updated pfn_to_online_page() per David
> - Updated free_unused_memmap() to preserve existing semantics per Mike
> - Exported memblock_is_memory() instead of memblock_is_map_memory()
> 
> Changes in RFC V1:
> 
> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
> 
>  arch/arm64/Kconfig            |  2 +-
>  arch/arm64/include/asm/page.h |  1 -
>  arch/arm64/mm/init.c          | 41 -----------------------------------
>  include/linux/mmzone.h        | 18 ++++++++++++++-
>  mm/Kconfig                    |  9 ++++++++
>  mm/memblock.c                 |  8 +++++--
>  mm/memory_hotplug.c           |  5 +++++
>  7 files changed, 38 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b4a9b493ce72..4cdc3570ffa9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -144,7 +144,6 @@ config ARM64
>  	select HAVE_ARCH_KGDB
>  	select HAVE_ARCH_MMAP_RND_BITS
>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> -	select HAVE_ARCH_PFN_VALID
>  	select HAVE_ARCH_PREL32_RELOCATIONS
>  	select HAVE_ARCH_SECCOMP_FILTER
>  	select HAVE_ARCH_STACKLEAK
> @@ -167,6 +166,7 @@ config ARM64
>  		if $(cc-option,-fpatchable-function-entry=2)
>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>  		if DYNAMIC_FTRACE_WITH_REGS
> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>  	select HAVE_FAST_GUP
>  	select HAVE_FTRACE_MCOUNT_RECORD
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 75ddfe671393..fcbef3eec4b2 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>  
>  typedef struct page *pgtable_t;
>  
> -int pfn_valid(unsigned long pfn);
>  int pfn_is_map_memory(unsigned long pfn);
>  
>  #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index f431b38d0837..5731a11550d8 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  	free_area_init(max_zone_pfns);
>  }
>  
> -int pfn_valid(unsigned long pfn)
> -{
> -	phys_addr_t addr = PFN_PHYS(pfn);
> -
> -	/*
> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> -	 * pfn. Else it might lead to false positives when
> -	 * some of the upper bits are set, but the lower bits
> -	 * match a valid pfn.
> -	 */
> -	if (PHYS_PFN(addr) != pfn)
> -		return 0;
> -
> -#ifdef CONFIG_SPARSEMEM
> -{
> -	struct mem_section *ms;
> -
> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> -		return 0;
> -
> -	ms = __pfn_to_section(pfn);
> -	if (!valid_section(ms))
> -		return 0;
> -
> -	/*
> -	 * ZONE_DEVICE memory does not have the memblock entries.
> -	 * memblock_is_memory() check for ZONE_DEVICE based
> -	 * addresses will always fail. Even the normal hotplugged
> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> -	 * memblock entries. Skip memblock search for all non early
> -	 * memory sections covering all of hotplug memory including
> -	 * both normal and ZONE_DEVICE based.
> -	 */
> -	if (!early_section(ms))
> -		return pfn_section_valid(ms, pfn);
> -}
> -#endif
> -	return memblock_is_memory(addr);
> -}
> -EXPORT_SYMBOL(pfn_valid);
> -
>  int pfn_is_map_memory(unsigned long pfn)
>  {
>  	phys_addr_t addr = PFN_PHYS(pfn);
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 961f0eeefb62..18bf71665211 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>   *
>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
>   */
> +bool memblock_is_memory(phys_addr_t addr);
> +
>  static inline int pfn_valid(unsigned long pfn)
>  {
> +	phys_addr_t addr = PFN_PHYS(pfn);
>  	struct mem_section *ms;
>  
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(addr) != pfn)
> +		return 0;
> +
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>  	 * Traditionally early sections always returned pfn_valid() for
>  	 * the entire section-sized span.
>  	 */
> -	return early_section(ms) || pfn_section_valid(ms, pfn);
> +	if (early_section(ms))
> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;

Nit: we already did 

	addr = PFN_PHYS(pfn);

a few lines above :)

> +
> +	return pfn_section_valid(ms, pfn);
>  }
>  #endif
>  
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 24c045b24b95..db7128111874 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -135,6 +135,15 @@ config HAVE_FAST_GUP
>  config ARCH_KEEP_MEMBLOCK
>  	bool
>  
> +config HAVE_EARLY_SECTION_MEMMAP_HOLES
> +	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
> +	def_bool n
> +	help
> +	  Early sections on certain platforms might have some memory ranges that
> +	  are not backed with struct page mappings. When subscribed, this option
> +	  enables special handling for those memory ranges in certain situations
> +	  such as pfn_valid().
> +
>  # Keep arch NUMA mapping infrastructure post-init.
>  config NUMA_KEEP_MEMINFO
>  	bool
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 3abf2c3fea7f..93f8a9c8428d 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
>  {
>  	return memblock_search(&memblock.memory, addr) != -1;
>  }
> +EXPORT_SYMBOL(memblock_is_memory);

Please make it inside #ifdef CONFIG_ARCH_KEEP_MEMBLOCK
  
>  bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
>  {
> @@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
>  	unsigned long start, end, prev_end = 0;
>  	int i;
>  
> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
> -	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> +	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> +		return;
> +
> +	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
> +	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
>  		return;

Can you please add a comment that says that the architecture should provide
a way to detect holes in the memory map in order to be able to free parts of
it?

>  	/*
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0cdbbfbc5757..8c78b6a3d888 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
>  	 * Save some code text when online_section() +
>  	 * pfn_section_valid() are sufficient.
>  	 */
> +	if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
> +		if (early_section(ms) && !memblock_is_memory(PFN_PHYS(pfn)))
> +			return NULL;
> +	}
> +
>  	if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
>  		return NULL;
>  
> -- 
> 2.20.1
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid()
  2021-04-22  6:19 ` [PATCH v3 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
@ 2021-04-22  8:57   ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-04-22  8:57 UTC (permalink / raw)
  To: Mike Rapoport, linux-arm-kernel
  Cc: Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On 22.04.21 08:19, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The intended semantics of pfn_valid() is to verify whether there is a
> struct page for the pfn in question and nothing else.
> 
> Yet, on arm64 it is used to distinguish memory areas that are mapped in the
> linear map vs those that require ioremap() to access them.
> 
> Introduce a dedicated pfn_is_map_memory() wrapper for
> memblock_is_map_memory() to perform such check and use it where
> appropriate.
> 
> Using a wrapper allows to avoid cyclic include dependencies.
> 
> While here also update style of pfn_valid() so that both pfn_valid() and
> pfn_is_map_memory() declarations will be consistent.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>   arch/arm64/include/asm/memory.h |  2 +-
>   arch/arm64/include/asm/page.h   |  3 ++-
>   arch/arm64/kvm/mmu.c            |  2 +-
>   arch/arm64/mm/init.c            | 12 ++++++++++++
>   arch/arm64/mm/ioremap.c         |  4 ++--
>   arch/arm64/mm/mmu.c             |  2 +-
>   6 files changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0aabc3be9a75..194f9f993d30 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
>   
>   #define virt_addr_valid(addr)	({					\
>   	__typeof__(addr) __addr = __tag_reset(addr);			\
> -	__is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));	\
> +	__is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr));	\
>   })
>   
>   void dump_mem_limit(void);
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 012cffc574e8..75ddfe671393 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -37,7 +37,8 @@ void copy_highpage(struct page *to, struct page *from);
>   
>   typedef struct page *pgtable_t;
>   
> -extern int pfn_valid(unsigned long);
> +int pfn_valid(unsigned long pfn);
> +int pfn_is_map_memory(unsigned long pfn);
>   
>   #include <asm/memory.h>
>   
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 8711894db8c2..23dd99e29b23 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
>   
>   static bool kvm_is_device_pfn(unsigned long pfn)
>   {
> -	return !pfn_valid(pfn);
> +	return !pfn_is_map_memory(pfn);
>   }
>   
>   /*
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 3685e12aba9b..966a7a18d528 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -258,6 +258,18 @@ int pfn_valid(unsigned long pfn)
>   }
>   EXPORT_SYMBOL(pfn_valid);
>   
> +int pfn_is_map_memory(unsigned long pfn)
> +{
> +	phys_addr_t addr = PFN_PHYS(pfn);
> +
> +	/* avoid false positives for bogus PFNs, see comment in pfn_valid() */
> +	if (PHYS_PFN(addr) != pfn)
> +		return 0;
> +
> +	return memblock_is_map_memory(addr);
> +}
> +EXPORT_SYMBOL(pfn_is_map_memory);
> +
>   static phys_addr_t memory_limit = PHYS_ADDR_MAX;
>   
>   /*
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index b5e83c46b23e..b7c81dacabf0 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
>   	/*
>   	 * Don't allow RAM to be mapped.
>   	 */
> -	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> +	if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
>   		return NULL;
>   
>   	area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
>   void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>   {
>   	/* For normal memory we already have a cacheable mapping. */
> -	if (pfn_valid(__phys_to_pfn(phys_addr)))
> +	if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
>   		return (void __iomem *)__phys_to_virt(phys_addr);
>   
>   	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 5d9550fdb9cf..26045e9adbd7 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
>   pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>   			      unsigned long size, pgprot_t vma_prot)
>   {
> -	if (!pfn_valid(pfn))
> +	if (!pfn_is_map_memory(pfn))
>   		return pgprot_noncached(vma_prot);
>   	else if (file->f_flags & O_SYNC)
>   		return pgprot_writecombine(vma_prot);
> 

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  7:50 ` [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes Anshuman Khandual
  2021-04-22  8:27   ` Mike Rapoport
@ 2021-04-22  9:03   ` David Hildenbrand
  2021-04-22  9:42     ` Mike Rapoport
  2021-05-24  4:58   ` Anshuman Khandual
  2 siblings, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2021-04-22  9:03 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm
  Cc: rppt, akpm, Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel

On 22.04.21 09:50, Anshuman Khandual wrote:
> Platforms like arm and arm64 have redefined pfn_valid() because their early
> memory sections might have contained memmap holes after freeing parts of it
> during boot, which should be skipped while validating a pfn for struct page
> backing. This scenario on certain platforms where memmap is not continuous,
> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> Then the generic pfn_valid() can be improved to accommodate such platforms.
> This reduces overall code footprint and also improves maintainability.
> 
> free_unused_memmap() and pfn_to_online_page() have been updated to include
> such cases. This also exports memblock_is_memory() for all drivers that use
> pfn_valid() but lack required visibility. After the new config is in place,
> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> This patch applies on the latest mainline kernel after Mike's series
> regarding arm64 based pfn_valid().
> 
> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
> 
> Changes in RFC V2:
> 
> - Dropped support for arm (32 bit)
> - Replaced memblock_is_map_memory() check with memblock_is_memory()
> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> - Updated pfn_to_online_page() per David
> - Updated free_unused_memmap() to preserve existing semantics per Mike
> - Exported memblock_is_memory() instead of memblock_is_map_memory()
> 
> Changes in RFC V1:
> 
> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
> 
>   arch/arm64/Kconfig            |  2 +-
>   arch/arm64/include/asm/page.h |  1 -
>   arch/arm64/mm/init.c          | 41 -----------------------------------
>   include/linux/mmzone.h        | 18 ++++++++++++++-
>   mm/Kconfig                    |  9 ++++++++
>   mm/memblock.c                 |  8 +++++--
>   mm/memory_hotplug.c           |  5 +++++
>   7 files changed, 38 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b4a9b493ce72..4cdc3570ffa9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -144,7 +144,6 @@ config ARM64
>   	select HAVE_ARCH_KGDB
>   	select HAVE_ARCH_MMAP_RND_BITS
>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> -	select HAVE_ARCH_PFN_VALID
>   	select HAVE_ARCH_PREL32_RELOCATIONS
>   	select HAVE_ARCH_SECCOMP_FILTER
>   	select HAVE_ARCH_STACKLEAK
> @@ -167,6 +166,7 @@ config ARM64
>   		if $(cc-option,-fpatchable-function-entry=2)
>   	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>   		if DYNAMIC_FTRACE_WITH_REGS
> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>   	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>   	select HAVE_FAST_GUP
>   	select HAVE_FTRACE_MCOUNT_RECORD
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 75ddfe671393..fcbef3eec4b2 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>   
>   typedef struct page *pgtable_t;
>   
> -int pfn_valid(unsigned long pfn);
>   int pfn_is_map_memory(unsigned long pfn);
>   
>   #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index f431b38d0837..5731a11550d8 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>   	free_area_init(max_zone_pfns);
>   }
>   
> -int pfn_valid(unsigned long pfn)
> -{
> -	phys_addr_t addr = PFN_PHYS(pfn);
> -
> -	/*
> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> -	 * pfn. Else it might lead to false positives when
> -	 * some of the upper bits are set, but the lower bits
> -	 * match a valid pfn.
> -	 */
> -	if (PHYS_PFN(addr) != pfn)
> -		return 0;
> -
> -#ifdef CONFIG_SPARSEMEM
> -{
> -	struct mem_section *ms;
> -
> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> -		return 0;
> -
> -	ms = __pfn_to_section(pfn);
> -	if (!valid_section(ms))
> -		return 0;
> -
> -	/*
> -	 * ZONE_DEVICE memory does not have the memblock entries.
> -	 * memblock_is_memory() check for ZONE_DEVICE based
> -	 * addresses will always fail. Even the normal hotplugged
> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> -	 * memblock entries. Skip memblock search for all non early
> -	 * memory sections covering all of hotplug memory including
> -	 * both normal and ZONE_DEVICE based.
> -	 */
> -	if (!early_section(ms))
> -		return pfn_section_valid(ms, pfn);
> -}
> -#endif
> -	return memblock_is_memory(addr);
> -}
> -EXPORT_SYMBOL(pfn_valid);
> -
>   int pfn_is_map_memory(unsigned long pfn)
>   {
>   	phys_addr_t addr = PFN_PHYS(pfn);
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 961f0eeefb62..18bf71665211 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>    *
>    * Return: 1 for PFNs that have memory map entries and 0 otherwise
>    */
> +bool memblock_is_memory(phys_addr_t addr);
> +
>   static inline int pfn_valid(unsigned long pfn)
>   {
> +	phys_addr_t addr = PFN_PHYS(pfn);
>   	struct mem_section *ms;
>   
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(addr) != pfn)
> +		return 0;
> +
>   	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>   		return 0;
>   	ms = __nr_to_section(pfn_to_section_nr(pfn));
> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>   	 * Traditionally early sections always returned pfn_valid() for
>   	 * the entire section-sized span.
>   	 */
> -	return early_section(ms) || pfn_section_valid(ms, pfn);
> +	if (early_section(ms))
> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
> +
> +	return pfn_section_valid(ms, pfn);
>   }
>   #endif
>   
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 24c045b24b95..db7128111874 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -135,6 +135,15 @@ config HAVE_FAST_GUP
>   config ARCH_KEEP_MEMBLOCK
>   	bool
>   
> +config HAVE_EARLY_SECTION_MEMMAP_HOLES
> +	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
> +	def_bool n
> +	help
> +	  Early sections on certain platforms might have some memory ranges that
> +	  are not backed with struct page mappings. When subscribed, this option
> +	  enables special handling for those memory ranges in certain situations
> +	  such as pfn_valid().
> +
>   # Keep arch NUMA mapping infrastructure post-init.
>   config NUMA_KEEP_MEMINFO
>   	bool
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 3abf2c3fea7f..93f8a9c8428d 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
>   {
>   	return memblock_search(&memblock.memory, addr) != -1;
>   }
> +EXPORT_SYMBOL(memblock_is_memory);
>   
>   bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
>   {
> @@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
>   	unsigned long start, end, prev_end = 0;
>   	int i;
>   
> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
> -	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> +	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> +		return;
> +
> +	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
> +	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
>   		return;
>   
>   	/*
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0cdbbfbc5757..8c78b6a3d888 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
>   	 * Save some code text when online_section() +
>   	 * pfn_section_valid() are sufficient.
>   	 */
> +	if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
> +		if (early_section(ms) && !memblock_is_memory(PFN_PHYS(pfn)))
> +			return NULL;
> +	}
> +
>   	if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
>   		return NULL;
>   
> 

What about doing one step at a time and switching over to the generic
pfn_valid() only in the case of CONFIG_SPARSEMEM? (meaning: freeing the
memmap only with !CONFIG_SPARSEMEM)

IOW, avoiding having to adjust the generic pfn_valid()/pfn_to_online_page()
at all. Am I missing something or should that be possible?

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  9:03   ` David Hildenbrand
@ 2021-04-22  9:42     ` Mike Rapoport
  2021-04-22  9:48       ` David Hildenbrand
  2021-04-22  9:59       ` Anshuman Khandual
  0 siblings, 2 replies; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22  9:42 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Anshuman Khandual, linux-mm, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On Thu, Apr 22, 2021 at 11:03:50AM +0200, David Hildenbrand wrote:
> On 22.04.21 09:50, Anshuman Khandual wrote:
> > Platforms like arm and arm64 have redefined pfn_valid() because their early
> > memory sections might have contained memmap holes after freeing parts of it
> > during boot, which should be skipped while validating a pfn for struct page
> > backing. This scenario on certain platforms where memmap is not continuous,
> > could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> > Then the generic pfn_valid() can be improved to accommodate such platforms.
> > This reduces overall code footprint and also improves maintainability.
> > 
> > free_unused_memmap() and pfn_to_online_page() have been updated to include
> > such cases. This also exports memblock_is_memory() for all drivers that use
> > pfn_valid() but lack required visibility. After the new config is in place,
> > drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
> > 
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Mike Rapoport <rppt@kernel.org>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: linux-mm@kvack.org
> > Suggested-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > ---
> > This patch applies on the latest mainline kernel after Mike's series
> > regarding arm64 based pfn_valid().
> > 
> > https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
> > 
> > Changes in RFC V2:
> > 
> > - Dropped support for arm (32 bit)
> > - Replaced memblock_is_map_memory() check with memblock_is_memory()
> > - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> > - Updated pfn_to_online_page() per David
> > - Updated free_unused_memmap() to preserve existing semantics per Mike
> > - Exported memblock_is_memory() instead of memblock_is_map_memory()
> > 
> > Changes in RFC V1:
> > 
> > - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
> > 
> >   arch/arm64/Kconfig            |  2 +-
> >   arch/arm64/include/asm/page.h |  1 -
> >   arch/arm64/mm/init.c          | 41 -----------------------------------
> >   include/linux/mmzone.h        | 18 ++++++++++++++-
> >   mm/Kconfig                    |  9 ++++++++
> >   mm/memblock.c                 |  8 +++++--
> >   mm/memory_hotplug.c           |  5 +++++
> >   7 files changed, 38 insertions(+), 46 deletions(-)
> > 
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index b4a9b493ce72..4cdc3570ffa9 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -144,7 +144,6 @@ config ARM64
> >   	select HAVE_ARCH_KGDB
> >   	select HAVE_ARCH_MMAP_RND_BITS
> >   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> > -	select HAVE_ARCH_PFN_VALID
> >   	select HAVE_ARCH_PREL32_RELOCATIONS
> >   	select HAVE_ARCH_SECCOMP_FILTER
> >   	select HAVE_ARCH_STACKLEAK
> > @@ -167,6 +166,7 @@ config ARM64
> >   		if $(cc-option,-fpatchable-function-entry=2)
> >   	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
> >   		if DYNAMIC_FTRACE_WITH_REGS
> > +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
> >   	select HAVE_EFFICIENT_UNALIGNED_ACCESS
> >   	select HAVE_FAST_GUP
> >   	select HAVE_FTRACE_MCOUNT_RECORD
> > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> > index 75ddfe671393..fcbef3eec4b2 100644
> > --- a/arch/arm64/include/asm/page.h
> > +++ b/arch/arm64/include/asm/page.h
> > @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
> >   typedef struct page *pgtable_t;
> > -int pfn_valid(unsigned long pfn);
> >   int pfn_is_map_memory(unsigned long pfn);
> >   #include <asm/memory.h>
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index f431b38d0837..5731a11550d8 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
> >   	free_area_init(max_zone_pfns);
> >   }
> > -int pfn_valid(unsigned long pfn)
> > -{
> > -	phys_addr_t addr = PFN_PHYS(pfn);
> > -
> > -	/*
> > -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> > -	 * pfn. Else it might lead to false positives when
> > -	 * some of the upper bits are set, but the lower bits
> > -	 * match a valid pfn.
> > -	 */
> > -	if (PHYS_PFN(addr) != pfn)
> > -		return 0;
> > -
> > -#ifdef CONFIG_SPARSEMEM
> > -{
> > -	struct mem_section *ms;
> > -
> > -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> > -		return 0;
> > -
> > -	ms = __pfn_to_section(pfn);
> > -	if (!valid_section(ms))
> > -		return 0;
> > -
> > -	/*
> > -	 * ZONE_DEVICE memory does not have the memblock entries.
> > -	 * memblock_is_memory() check for ZONE_DEVICE based
> > -	 * addresses will always fail. Even the normal hotplugged
> > -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> > -	 * memblock entries. Skip memblock search for all non early
> > -	 * memory sections covering all of hotplug memory including
> > -	 * both normal and ZONE_DEVICE based.
> > -	 */
> > -	if (!early_section(ms))
> > -		return pfn_section_valid(ms, pfn);
> > -}
> > -#endif
> > -	return memblock_is_memory(addr);
> > -}
> > -EXPORT_SYMBOL(pfn_valid);
> > -
> >   int pfn_is_map_memory(unsigned long pfn)
> >   {
> >   	phys_addr_t addr = PFN_PHYS(pfn);
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 961f0eeefb62..18bf71665211 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> >    *
> >    * Return: 1 for PFNs that have memory map entries and 0 otherwise
> >    */
> > +bool memblock_is_memory(phys_addr_t addr);
> > +
> >   static inline int pfn_valid(unsigned long pfn)
> >   {
> > +	phys_addr_t addr = PFN_PHYS(pfn);
> >   	struct mem_section *ms;
> > +	/*
> > +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> > +	 * pfn. Else it might lead to false positives when
> > +	 * some of the upper bits are set, but the lower bits
> > +	 * match a valid pfn.
> > +	 */
> > +	if (PHYS_PFN(addr) != pfn)
> > +		return 0;
> > +
> >   	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >   		return 0;
> >   	ms = __nr_to_section(pfn_to_section_nr(pfn));
> > @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
> >   	 * Traditionally early sections always returned pfn_valid() for
> >   	 * the entire section-sized span.
> >   	 */
> > -	return early_section(ms) || pfn_section_valid(ms, pfn);
> > +	if (early_section(ms))
> > +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> > +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
> > +
> > +	return pfn_section_valid(ms, pfn);
> >   }
> >   #endif
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index 24c045b24b95..db7128111874 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -135,6 +135,15 @@ config HAVE_FAST_GUP
> >   config ARCH_KEEP_MEMBLOCK
> >   	bool
> > +config HAVE_EARLY_SECTION_MEMMAP_HOLES
> > +	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
> > +	def_bool n
> > +	help
> > +	  Early sections on certain platforms might have some memory ranges that
> > +	  are not backed with struct page mappings. When subscribed, this option
> > +	  enables special handling for those memory ranges in certain situations
> > +	  such as pfn_valid().
> > +
> >   # Keep arch NUMA mapping infrastructure post-init.
> >   config NUMA_KEEP_MEMINFO
> >   	bool
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index 3abf2c3fea7f..93f8a9c8428d 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
> >   {
> >   	return memblock_search(&memblock.memory, addr) != -1;
> >   }
> > +EXPORT_SYMBOL(memblock_is_memory);
> >   bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
> >   {
> > @@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
> >   	unsigned long start, end, prev_end = 0;
> >   	int i;
> > -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
> > -	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> > +	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> > +		return;
> > +
> > +	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
> > +	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
> >   		return;
> >   	/*
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 0cdbbfbc5757..8c78b6a3d888 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
> >   	 * Save some code text when online_section() +
> >   	 * pfn_section_valid() are sufficient.
> >   	 */
> > +	if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
> > +		if (early_section(ms) && !memblock_is_memory(PFN_PHYS(pfn)))
> > +			return NULL;
> > +	}
> > +
> >   	if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
> >   		return NULL;
> > 
> 
> What about doing one step at a time and switching only over to generic
> pfn_valid() in case of CONFIG_SPARSEMEM? (meaning: freeing the memmap only
> with !CONFIG_SPARSEMEM)

The "generic" pfn_valid() is only available with CONFIG_SPARSEMEM.
With FLATMEM it's wild west:

$ git grep -w "define pfn_valid" arch/*/include/asm/ | wc -l
22

This would actually mean that we still need arm64::pfn_valid() for the
FLATMEM case.
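
For reference, that remaining FLATMEM variant would essentially reduce to
the memblock check from the hunk removed above (a simplified sketch, not
the exact code):

int pfn_valid(unsigned long pfn)
{
	phys_addr_t addr = PFN_PHYS(pfn);

	/* reject pfns whose upper bits were truncated by PFN_PHYS() */
	if (PHYS_PFN(addr) != pfn)
		return 0;

	return memblock_is_memory(addr);
}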

> IOW, avoiding having to adjust generic pfn_valid()/pfn_to_online_page() at
> all. Am i missing something or should that be possible?

We are back again to the question "should arm64 free its memmap". 
If the answer is no, we don't need arm64::pfn_valid() for SPARSEMEM at all.
If the answer is yes, Anshuman's patch is way better than a custom
pfn_valid().

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  9:42     ` Mike Rapoport
@ 2021-04-22  9:48       ` David Hildenbrand
  2021-04-22 10:03         ` Mike Rapoport
  2021-04-22  9:59       ` Anshuman Khandual
  1 sibling, 1 reply; 22+ messages in thread
From: David Hildenbrand @ 2021-04-22  9:48 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Anshuman Khandual, linux-mm, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On 22.04.21 11:42, Mike Rapoport wrote:
> On Thu, Apr 22, 2021 at 11:03:50AM +0200, David Hildenbrand wrote:
>> On 22.04.21 09:50, Anshuman Khandual wrote:
>>> Platforms like arm and arm64 have redefined pfn_valid() because their early
>>> memory sections might have contained memmap holes after freeing parts of it
>>> during boot, which should be skipped while validating a pfn for struct page
>>> backing. This scenario on certain platforms where memmap is not continuous,
>>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
>>> Then the generic pfn_valid() can be improved to accommodate such platforms.
>>> This reduces overall code footprint and also improves maintainability.
>>>
>>> free_unused_memmap() and pfn_to_online_page() have been updated to include
>>> such cases. This also exports memblock_is_memory() for all drivers that use
>>> pfn_valid() but lack required visibility. After the new config is in place,
>>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
>>>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Mike Rapoport <rppt@kernel.org>
>>> Cc: David Hildenbrand <david@redhat.com>
>>> Cc: linux-arm-kernel@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: linux-mm@kvack.org
>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>> This patch applies on the latest mainline kernel after Mike's series
>>> regarding arm64 based pfn_valid().
>>>
>>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
>>>
>>> Changes in RFC V2:
>>>
>>> - Dropped support for arm (32 bit)
>>> - Replaced memblock_is_map_memory() check with memblock_is_memory()
>>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
>>> - Updated pfn_to_online_page() per David
>>> - Updated free_unused_memmap() to preserve existing semantics per Mike
>>> - Exported memblock_is_memory() instead of memblock_is_map_memory()
>>>
>>> Changes in RFC V1:
>>>
>>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
>>>
>>>    arch/arm64/Kconfig            |  2 +-
>>>    arch/arm64/include/asm/page.h |  1 -
>>>    arch/arm64/mm/init.c          | 41 -----------------------------------
>>>    include/linux/mmzone.h        | 18 ++++++++++++++-
>>>    mm/Kconfig                    |  9 ++++++++
>>>    mm/memblock.c                 |  8 +++++--
>>>    mm/memory_hotplug.c           |  5 +++++
>>>    7 files changed, 38 insertions(+), 46 deletions(-)
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index b4a9b493ce72..4cdc3570ffa9 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -144,7 +144,6 @@ config ARM64
>>>    	select HAVE_ARCH_KGDB
>>>    	select HAVE_ARCH_MMAP_RND_BITS
>>>    	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
>>> -	select HAVE_ARCH_PFN_VALID
>>>    	select HAVE_ARCH_PREL32_RELOCATIONS
>>>    	select HAVE_ARCH_SECCOMP_FILTER
>>>    	select HAVE_ARCH_STACKLEAK
>>> @@ -167,6 +166,7 @@ config ARM64
>>>    		if $(cc-option,-fpatchable-function-entry=2)
>>>    	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>>>    		if DYNAMIC_FTRACE_WITH_REGS
>>> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>>>    	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>>>    	select HAVE_FAST_GUP
>>>    	select HAVE_FTRACE_MCOUNT_RECORD
>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>>> index 75ddfe671393..fcbef3eec4b2 100644
>>> --- a/arch/arm64/include/asm/page.h
>>> +++ b/arch/arm64/include/asm/page.h
>>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>>>    typedef struct page *pgtable_t;
>>> -int pfn_valid(unsigned long pfn);
>>>    int pfn_is_map_memory(unsigned long pfn);
>>>    #include <asm/memory.h>
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index f431b38d0837..5731a11550d8 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>>    	free_area_init(max_zone_pfns);
>>>    }
>>> -int pfn_valid(unsigned long pfn)
>>> -{
>>> -	phys_addr_t addr = PFN_PHYS(pfn);
>>> -
>>> -	/*
>>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>> -	 * pfn. Else it might lead to false positives when
>>> -	 * some of the upper bits are set, but the lower bits
>>> -	 * match a valid pfn.
>>> -	 */
>>> -	if (PHYS_PFN(addr) != pfn)
>>> -		return 0;
>>> -
>>> -#ifdef CONFIG_SPARSEMEM
>>> -{
>>> -	struct mem_section *ms;
>>> -
>>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>> -		return 0;
>>> -
>>> -	ms = __pfn_to_section(pfn);
>>> -	if (!valid_section(ms))
>>> -		return 0;
>>> -
>>> -	/*
>>> -	 * ZONE_DEVICE memory does not have the memblock entries.
>>> -	 * memblock_is_memory() check for ZONE_DEVICE based
>>> -	 * addresses will always fail. Even the normal hotplugged
>>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
>>> -	 * memblock entries. Skip memblock search for all non early
>>> -	 * memory sections covering all of hotplug memory including
>>> -	 * both normal and ZONE_DEVICE based.
>>> -	 */
>>> -	if (!early_section(ms))
>>> -		return pfn_section_valid(ms, pfn);
>>> -}
>>> -#endif
>>> -	return memblock_is_memory(addr);
>>> -}
>>> -EXPORT_SYMBOL(pfn_valid);
>>> -
>>>    int pfn_is_map_memory(unsigned long pfn)
>>>    {
>>>    	phys_addr_t addr = PFN_PHYS(pfn);
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 961f0eeefb62..18bf71665211 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>>     *
>>>     * Return: 1 for PFNs that have memory map entries and 0 otherwise
>>>     */
>>> +bool memblock_is_memory(phys_addr_t addr);
>>> +
>>>    static inline int pfn_valid(unsigned long pfn)
>>>    {
>>> +	phys_addr_t addr = PFN_PHYS(pfn);
>>>    	struct mem_section *ms;
>>> +	/*
>>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>> +	 * pfn. Else it might lead to false positives when
>>> +	 * some of the upper bits are set, but the lower bits
>>> +	 * match a valid pfn.
>>> +	 */
>>> +	if (PHYS_PFN(addr) != pfn)
>>> +		return 0;
>>> +
>>>    	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>    		return 0;
>>>    	ms = __nr_to_section(pfn_to_section_nr(pfn));
>>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>>>    	 * Traditionally early sections always returned pfn_valid() for
>>>    	 * the entire section-sized span.
>>>    	 */
>>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
>>> +	if (early_section(ms))
>>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
>>> +
>>> +	return pfn_section_valid(ms, pfn);
>>>    }
>>>    #endif
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index 24c045b24b95..db7128111874 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -135,6 +135,15 @@ config HAVE_FAST_GUP
>>>    config ARCH_KEEP_MEMBLOCK
>>>    	bool
>>> +config HAVE_EARLY_SECTION_MEMMAP_HOLES
>>> +	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
>>> +	def_bool n
>>> +	help
>>> +	  Early sections on certain platforms might have some memory ranges that
>>> +	  are not backed with struct page mappings. When subscribed, this option
>>> +	  enables special handling for those memory ranges in certain situations
>>> +	  such as pfn_valid().
>>> +
>>>    # Keep arch NUMA mapping infrastructure post-init.
>>>    config NUMA_KEEP_MEMINFO
>>>    	bool
>>> diff --git a/mm/memblock.c b/mm/memblock.c
>>> index 3abf2c3fea7f..93f8a9c8428d 100644
>>> --- a/mm/memblock.c
>>> +++ b/mm/memblock.c
>>> @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
>>>    {
>>>    	return memblock_search(&memblock.memory, addr) != -1;
>>>    }
>>> +EXPORT_SYMBOL(memblock_is_memory);
>>>    bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
>>>    {
>>> @@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
>>>    	unsigned long start, end, prev_end = 0;
>>>    	int i;
>>> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
>>> -	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>>> +	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>>> +		return;
>>> +
>>> +	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
>>> +	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
>>>    		return;
>>>    	/*
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 0cdbbfbc5757..8c78b6a3d888 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
>>>    	 * Save some code text when online_section() +
>>>    	 * pfn_section_valid() are sufficient.
>>>    	 */
>>> +	if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
>>> +		if (early_section(ms) && !memblock_is_memory(PFN_PHYS(pfn)))
>>> +			return NULL;
>>> +	}
>>> +
>>>    	if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
>>>    		return NULL;
>>>
>>
>> What about doing one step at a time and switching only over to generic
>> pfn_valid() in case of CONFIG_SPARSEMEM? (meaning: freeing the memmap only
>> with !CONFIG_SPARSEMEM)
> 
> The "generic" pfn_valid() is only available with CONFIG_SPARSEMEM.
> With FLATMEM it's wild west:
> 
> $ git grep -w "define pfn_valid" arch/*/include/asm/ | wc -l
> 22
> 
> This would actually mean that we still need arm64::pfn_valid() for the
> FLATMEM case.

Right, which is the case on x86 etc. as well. (I was assuming that this 
patch was missing something)

> 
>> IOW, avoiding having to adjust generic pfn_valid()/pfn_to_online_page() at
>> all. Am i missing something or should that be possible?
> 
> We are back again to the question "should arm64 free its memmap".
> If the answer is no, we don't need arm64::pfn_valid() for SPARSEMEM at all.
> If the answer is yes, Anshuman's patch is way better than a custom
> pfn_valid().

Well, I propose something in between: stop freeing with SPARSEMEM, 
continue freeing with FLATMEM.
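
Just to make the idea concrete, a minimal sketch of that (illustration
only, not a real patch) would simply gate the freeing on the memory model:

static void __init free_unused_memmap(void)
{
	/* keep the memory map intact for SPARSEMEM, trim it only for FLATMEM */
	if (IS_ENABLED(CONFIG_SPARSEMEM))
		return;

	/* ... existing freeing logic, unchanged ... */
}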

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  9:42     ` Mike Rapoport
  2021-04-22  9:48       ` David Hildenbrand
@ 2021-04-22  9:59       ` Anshuman Khandual
  1 sibling, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2021-04-22  9:59 UTC (permalink / raw)
  To: Mike Rapoport, David Hildenbrand
  Cc: linux-mm, akpm, Catalin Marinas, Will Deacon, linux-arm-kernel,
	linux-kernel



On 4/22/21 3:12 PM, Mike Rapoport wrote:
> The "generic" pfn_valid() is only available with CONFIG_SPARSEMEM.
> With FLATMEM it's wild west:
> 
> $ git grep -w "define pfn_valid" arch/*/include/asm/ | wc -l
> 22
> 
> This would actually mean that we still need arm64::pfn_valid() for the
> FLATMEM case.

SPARSEMEM would be the only memory model going forward.

https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210420093559.23168-1-catalin.marinas@arm.com/ 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  9:48       ` David Hildenbrand
@ 2021-04-22 10:03         ` Mike Rapoport
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22 10:03 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Anshuman Khandual, linux-mm, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On Thu, Apr 22, 2021 at 11:48:58AM +0200, David Hildenbrand wrote:
> > 
> > > IOW, avoiding having to adjust generic pfn_valid()/pfn_to_online_page() at
> > > all. Am i missing something or should that be possible?
> > 
> > We are back again to the question "should arm64 free its memmap".
> > If the answer is no, we don't need arm64::pfn_valid() for SPARSEMEM at all.
> > If the answer is yes, Anshuman's patch is way better than a custom
> > pfn_valid().
> 
> Well, I propose something in between: stop freeing with SPARSEMEM, continue
> freeing with FLATMEM.

I'm all for it.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  8:27   ` Mike Rapoport
@ 2021-04-22 11:23     ` Anshuman Khandual
  2021-04-22 12:19       ` Mike Rapoport
  0 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2021-04-22 11:23 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On 4/22/21 1:57 PM, Mike Rapoport wrote:
> On Thu, Apr 22, 2021 at 01:20:23PM +0530, Anshuman Khandual wrote:
>> Platforms like arm and arm64 have redefined pfn_valid() because their early
>> memory sections might have contained memmap holes after freeing parts of it
>> during boot, which should be skipped while validating a pfn for struct page
>> backing. This scenario on certain platforms where memmap is not continuous,
>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
>> Then the generic pfn_valid() can be improved to accommodate such platforms.
>> This reduces overall code footprint and also improves maintainability.
>>
>> free_unused_memmap() and pfn_to_online_page() have been updated to include
>> such cases. This also exports memblock_is_memory() for all drivers that use
>> pfn_valid() but lack required visibility. After the new config is in place,
>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Mike Rapoport <rppt@kernel.org>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: linux-mm@kvack.org
>> Suggested-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>> This patch applies on the latest mainline kernel after Mike's series
>> regarding arm64 based pfn_valid().
>>
>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
>>
>> Changes in RFC V2:
>>
>> - Dropped support for arm (32 bit)
>> - Replaced memblock_is_map_memory() check with memblock_is_memory()
>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
>> - Updated pfn_to_online_page() per David
>> - Updated free_unused_memmap() to preserve existing semantics per Mike
>> - Exported memblock_is_memory() instead of memblock_is_map_memory()
>>
>> Changes in RFC V1:
>>
>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
>>
>>  arch/arm64/Kconfig            |  2 +-
>>  arch/arm64/include/asm/page.h |  1 -
>>  arch/arm64/mm/init.c          | 41 -----------------------------------
>>  include/linux/mmzone.h        | 18 ++++++++++++++-
>>  mm/Kconfig                    |  9 ++++++++
>>  mm/memblock.c                 |  8 +++++--
>>  mm/memory_hotplug.c           |  5 +++++
>>  7 files changed, 38 insertions(+), 46 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index b4a9b493ce72..4cdc3570ffa9 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -144,7 +144,6 @@ config ARM64
>>  	select HAVE_ARCH_KGDB
>>  	select HAVE_ARCH_MMAP_RND_BITS
>>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
>> -	select HAVE_ARCH_PFN_VALID
>>  	select HAVE_ARCH_PREL32_RELOCATIONS
>>  	select HAVE_ARCH_SECCOMP_FILTER
>>  	select HAVE_ARCH_STACKLEAK
>> @@ -167,6 +166,7 @@ config ARM64
>>  		if $(cc-option,-fpatchable-function-entry=2)
>>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>>  		if DYNAMIC_FTRACE_WITH_REGS
>> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>>  	select HAVE_FAST_GUP
>>  	select HAVE_FTRACE_MCOUNT_RECORD
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 75ddfe671393..fcbef3eec4b2 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>>  
>>  typedef struct page *pgtable_t;
>>  
>> -int pfn_valid(unsigned long pfn);
>>  int pfn_is_map_memory(unsigned long pfn);
>>  
>>  #include <asm/memory.h>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index f431b38d0837..5731a11550d8 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>  	free_area_init(max_zone_pfns);
>>  }
>>  
>> -int pfn_valid(unsigned long pfn)
>> -{
>> -	phys_addr_t addr = PFN_PHYS(pfn);
>> -
>> -	/*
>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
>> -	 * pfn. Else it might lead to false positives when
>> -	 * some of the upper bits are set, but the lower bits
>> -	 * match a valid pfn.
>> -	 */
>> -	if (PHYS_PFN(addr) != pfn)
>> -		return 0;
>> -
>> -#ifdef CONFIG_SPARSEMEM
>> -{
>> -	struct mem_section *ms;
>> -
>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>> -		return 0;
>> -
>> -	ms = __pfn_to_section(pfn);
>> -	if (!valid_section(ms))
>> -		return 0;
>> -
>> -	/*
>> -	 * ZONE_DEVICE memory does not have the memblock entries.
>> -	 * memblock_is_memory() check for ZONE_DEVICE based
>> -	 * addresses will always fail. Even the normal hotplugged
>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
>> -	 * memblock entries. Skip memblock search for all non early
>> -	 * memory sections covering all of hotplug memory including
>> -	 * both normal and ZONE_DEVICE based.
>> -	 */
>> -	if (!early_section(ms))
>> -		return pfn_section_valid(ms, pfn);
>> -}
>> -#endif
>> -	return memblock_is_memory(addr);
>> -}
>> -EXPORT_SYMBOL(pfn_valid);
>> -
>>  int pfn_is_map_memory(unsigned long pfn)
>>  {
>>  	phys_addr_t addr = PFN_PHYS(pfn);
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 961f0eeefb62..18bf71665211 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>   *
>>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
>>   */
>> +bool memblock_is_memory(phys_addr_t addr);
>> +
>>  static inline int pfn_valid(unsigned long pfn)
>>  {
>> +	phys_addr_t addr = PFN_PHYS(pfn);
>>  	struct mem_section *ms;
>>  
>> +	/*
>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
>> +	 * pfn. Else it might lead to false positives when
>> +	 * some of the upper bits are set, but the lower bits
>> +	 * match a valid pfn.
>> +	 */
>> +	if (PHYS_PFN(addr) != pfn)
>> +		return 0;
>> +
>>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>  		return 0;
>>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>>  	 * Traditionally early sections always returned pfn_valid() for
>>  	 * the entire section-sized span.
>>  	 */
>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
>> +	if (early_section(ms))
>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
> 
> Nit: we already did 
> 
> 	addr = PFN_PHYS(pfn);
> 
> a few lines above :)

Yeah, will use the addr directly here.
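
Something like this, presumably (untested):

	if (early_section(ms))
		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
			memblock_is_memory(addr) : 1;

	return pfn_section_valid(ms, pfn);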

> 
>> +
>> +	return pfn_section_valid(ms, pfn);
>>  }
>>  #endif
>>  
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index 24c045b24b95..db7128111874 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -135,6 +135,15 @@ config HAVE_FAST_GUP
>>  config ARCH_KEEP_MEMBLOCK
>>  	bool
>>  
>> +config HAVE_EARLY_SECTION_MEMMAP_HOLES
>> +	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
>> +	def_bool n
>> +	help
>> +	  Early sections on certain platforms might have some memory ranges that
>> +	  are not backed with struct page mappings. When subscribed, this option
>> +	  enables special handling for those memory ranges in certain situations
>> +	  such as pfn_valid().
>> +
>>  # Keep arch NUMA mapping infrastructure post-init.
>>  config NUMA_KEEP_MEMINFO
>>  	bool
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index 3abf2c3fea7f..93f8a9c8428d 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
>>  {
>>  	return memblock_search(&memblock.memory, addr) != -1;
>>  }
>> +EXPORT_SYMBOL(memblock_is_memory);
> 
> Please make it inside #ifdef CONFIG_ARCH_MEMBLOCK

CONFIG_ARCH_KEEP_MEMBLOCK? Wrap it around the EXPORT_SYMBOL() or the
entire function memblock_is_memory()?

>   
>>  bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
>>  {
>> @@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
>>  	unsigned long start, end, prev_end = 0;
>>  	int i;
>>  
>> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
>> -	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>> +	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>> +		return;
>> +
>> +	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
>> +	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
>>  		return;
> 
> Can you please add a comment that says that architecture should provide a
> way to detect holes in the memory map to be able to free its part?

I did not get that completely. If CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES is
not subscribed, then the platform must provide a method that identifies these
memmap holes and processes them accordingly in pfn_valid(), just to avoid a
NULL pointer dereference? Also, the comment should be placed after the second
return statement, i.e. where it is actually going to free the unused memmap
segments.
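
Something along these lines perhaps, just to confirm I am reading the
suggestion correctly:

	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
		return;

	/*
	 * The architecture needs a way to detect holes in the memory
	 * map (HAVE_EARLY_SECTION_MEMMAP_HOLES or a custom pfn_valid())
	 * before any part of it can be freed.
	 */
	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
		return;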

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22 11:23     ` Anshuman Khandual
@ 2021-04-22 12:19       ` Mike Rapoport
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Rapoport @ 2021-04-22 12:19 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On Thu, Apr 22, 2021 at 04:53:36PM +0530, Anshuman Khandual wrote:
> On 4/22/21 1:57 PM, Mike Rapoport wrote:

...

> >> diff --git a/mm/memblock.c b/mm/memblock.c
> >> index 3abf2c3fea7f..93f8a9c8428d 100644
> >> --- a/mm/memblock.c
> >> +++ b/mm/memblock.c
> >> @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
> >>  {
> >>  	return memblock_search(&memblock.memory, addr) != -1;
> >>  }
> >> +EXPORT_SYMBOL(memblock_is_memory);
> > 
> > Please make it inside #ifdef CONFIG_ARCH_MEMBLOCK
> CONFIG_ARCH_KEEP_MEMBLOCK ?

Yeah, _KEEP went away somehow :)

> Wrap it around the EXPORT_SYMBOL() or the entire function
> memblock_is_memory().

EXPORT_SYMBOL(). Otherwise we'll have an exported __init function.
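
I.e. roughly (sketch):

bool __init_memblock memblock_is_memory(phys_addr_t addr)
{
	return memblock_search(&memblock.memory, addr) != -1;
}
#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
EXPORT_SYMBOL(memblock_is_memory);
#endif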
 
-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-04-22  7:50 ` [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes Anshuman Khandual
  2021-04-22  8:27   ` Mike Rapoport
  2021-04-22  9:03   ` David Hildenbrand
@ 2021-05-24  4:58   ` Anshuman Khandual
  2021-05-24  6:52     ` Mike Rapoport
  2 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2021-05-24  4:58 UTC (permalink / raw)
  To: linux-mm
  Cc: david, rppt, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel



On 4/22/21 1:20 PM, Anshuman Khandual wrote:
> Platforms like arm and arm64 have redefined pfn_valid() because their early
> memory sections might have contained memmap holes after freeing parts of it
> during boot, which should be skipped while validating a pfn for struct page
> backing. This scenario on certain platforms where memmap is not continuous,
> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> Then the generic pfn_valid() can be improved to accommodate such platforms.
> This reduces overall code footprint and also improves maintainability.
> 
> free_unused_memmap() and pfn_to_online_page() have been updated to include
> such cases. This also exports memblock_is_memory() for all drivers that use
> pfn_valid() but lack required visibility. After the new config is in place,
> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> This patch applies on the latest mainline kernel after Mike's series
> regarding arm64 based pfn_valid().
> 
> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
> 
> Changes in RFC V2:
> 
> - Dropped support for arm (32 bit)
> - Replaced memblock_is_map_memory() check with memblock_is_memory()
> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> - Updated pfn_to_online_page() per David
> - Updated free_unused_memmap() to preserve existing semantics per Mike
> - Exported memblock_is_memory() instead of memblock_is_map_memory()
> 
> Changes in RFC V1:
> 
> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
> 
>  arch/arm64/Kconfig            |  2 +-
>  arch/arm64/include/asm/page.h |  1 -
>  arch/arm64/mm/init.c          | 41 -----------------------------------
>  include/linux/mmzone.h        | 18 ++++++++++++++-
>  mm/Kconfig                    |  9 ++++++++
>  mm/memblock.c                 |  8 +++++--
>  mm/memory_hotplug.c           |  5 +++++
>  7 files changed, 38 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b4a9b493ce72..4cdc3570ffa9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -144,7 +144,6 @@ config ARM64
>  	select HAVE_ARCH_KGDB
>  	select HAVE_ARCH_MMAP_RND_BITS
>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> -	select HAVE_ARCH_PFN_VALID
>  	select HAVE_ARCH_PREL32_RELOCATIONS
>  	select HAVE_ARCH_SECCOMP_FILTER
>  	select HAVE_ARCH_STACKLEAK
> @@ -167,6 +166,7 @@ config ARM64
>  		if $(cc-option,-fpatchable-function-entry=2)
>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>  		if DYNAMIC_FTRACE_WITH_REGS
> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>  	select HAVE_FAST_GUP
>  	select HAVE_FTRACE_MCOUNT_RECORD
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 75ddfe671393..fcbef3eec4b2 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>  
>  typedef struct page *pgtable_t;
>  
> -int pfn_valid(unsigned long pfn);
>  int pfn_is_map_memory(unsigned long pfn);
>  
>  #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index f431b38d0837..5731a11550d8 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  	free_area_init(max_zone_pfns);
>  }
>  
> -int pfn_valid(unsigned long pfn)
> -{
> -	phys_addr_t addr = PFN_PHYS(pfn);
> -
> -	/*
> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> -	 * pfn. Else it might lead to false positives when
> -	 * some of the upper bits are set, but the lower bits
> -	 * match a valid pfn.
> -	 */
> -	if (PHYS_PFN(addr) != pfn)
> -		return 0;
> -
> -#ifdef CONFIG_SPARSEMEM
> -{
> -	struct mem_section *ms;
> -
> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> -		return 0;
> -
> -	ms = __pfn_to_section(pfn);
> -	if (!valid_section(ms))
> -		return 0;
> -
> -	/*
> -	 * ZONE_DEVICE memory does not have the memblock entries.
> -	 * memblock_is_memory() check for ZONE_DEVICE based
> -	 * addresses will always fail. Even the normal hotplugged
> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> -	 * memblock entries. Skip memblock search for all non early
> -	 * memory sections covering all of hotplug memory including
> -	 * both normal and ZONE_DEVICE based.
> -	 */
> -	if (!early_section(ms))
> -		return pfn_section_valid(ms, pfn);
> -}
> -#endif
> -	return memblock_is_memory(addr);
> -}
> -EXPORT_SYMBOL(pfn_valid);
> -
>  int pfn_is_map_memory(unsigned long pfn)
>  {
>  	phys_addr_t addr = PFN_PHYS(pfn);
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 961f0eeefb62..18bf71665211 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>   *
>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
>   */
> +bool memblock_is_memory(phys_addr_t addr);
> +
>  static inline int pfn_valid(unsigned long pfn)
>  {
> +	phys_addr_t addr = PFN_PHYS(pfn);
>  	struct mem_section *ms;
>  
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(addr) != pfn)
> +		return 0;
> +
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>  	 * Traditionally early sections always returned pfn_valid() for
>  	 * the entire section-sized span.
>  	 */
> -	return early_section(ms) || pfn_section_valid(ms, pfn);
> +	if (early_section(ms))
> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
> +
> +	return pfn_section_valid(ms, pfn);
>  }
>  #endif

Hello David/Mike,

Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
SPARSEMEM_VMEMMAP is the only available memory model on arm64, do we still
need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal? Please do kindly
suggest. Thank you.

- Anshuman

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-05-24  4:58   ` Anshuman Khandual
@ 2021-05-24  6:52     ` Mike Rapoport
  2021-05-25  6:00       ` Anshuman Khandual
  0 siblings, 1 reply; 22+ messages in thread
From: Mike Rapoport @ 2021-05-24  6:52 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

Hello Anshuman,

On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
> 
> On 4/22/21 1:20 PM, Anshuman Khandual wrote:
> > Platforms like arm and arm64 have redefined pfn_valid() because their early
> > memory sections might have contained memmap holes after freeing parts of it
> > during boot, which should be skipped while validating a pfn for struct page
> > backing. This scenario on certain platforms where memmap is not continuous,
> > could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> > Then the generic pfn_valid() can be improved to accommodate such platforms.
> > This reduces overall code footprint and also improves maintainability.
> > 
> > free_unused_memmap() and pfn_to_online_page() have been updated to include
> > such cases. This also exports memblock_is_memory() for all drivers that use
> > pfn_valid() but lack required visibility. After the new config is in place,
> > drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
> > 
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Mike Rapoport <rppt@kernel.org>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: linux-arm-kernel@lists.infradead.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: linux-mm@kvack.org
> > Suggested-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > ---
> > This patch applies on the latest mainline kernel after Mike's series
> > regarding arm64 based pfn_valid().
> > 
> > https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
> > 
> > Changes in RFC V2:
> > 
> > - Dropped support for arm (32 bit)
> > - Replaced memblock_is_map_memory() check with memblock_is_memory()
> > - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> > - Updated pfn_to_online_page() per David
> > - Updated free_unused_memmap() to preserve existing semantics per Mike
> > - Exported memblock_is_memory() instead of memblock_is_map_memory()
> > 
> > Changes in RFC V1:
> > 
> > - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
> > 
> >  arch/arm64/Kconfig            |  2 +-
> >  arch/arm64/include/asm/page.h |  1 -
> >  arch/arm64/mm/init.c          | 41 -----------------------------------
> >  include/linux/mmzone.h        | 18 ++++++++++++++-
> >  mm/Kconfig                    |  9 ++++++++
> >  mm/memblock.c                 |  8 +++++--
> >  mm/memory_hotplug.c           |  5 +++++
> >  7 files changed, 38 insertions(+), 46 deletions(-)
> > 
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index b4a9b493ce72..4cdc3570ffa9 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -144,7 +144,6 @@ config ARM64
> >  	select HAVE_ARCH_KGDB
> >  	select HAVE_ARCH_MMAP_RND_BITS
> >  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> > -	select HAVE_ARCH_PFN_VALID
> >  	select HAVE_ARCH_PREL32_RELOCATIONS
> >  	select HAVE_ARCH_SECCOMP_FILTER
> >  	select HAVE_ARCH_STACKLEAK
> > @@ -167,6 +166,7 @@ config ARM64
> >  		if $(cc-option,-fpatchable-function-entry=2)
> >  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
> >  		if DYNAMIC_FTRACE_WITH_REGS
> > +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
> >  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
> >  	select HAVE_FAST_GUP
> >  	select HAVE_FTRACE_MCOUNT_RECORD
> > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> > index 75ddfe671393..fcbef3eec4b2 100644
> > --- a/arch/arm64/include/asm/page.h
> > +++ b/arch/arm64/include/asm/page.h
> > @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
> >  
> >  typedef struct page *pgtable_t;
> >  
> > -int pfn_valid(unsigned long pfn);
> >  int pfn_is_map_memory(unsigned long pfn);
> >  
> >  #include <asm/memory.h>
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index f431b38d0837..5731a11550d8 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
> >  	free_area_init(max_zone_pfns);
> >  }
> >  
> > -int pfn_valid(unsigned long pfn)
> > -{
> > -	phys_addr_t addr = PFN_PHYS(pfn);
> > -
> > -	/*
> > -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> > -	 * pfn. Else it might lead to false positives when
> > -	 * some of the upper bits are set, but the lower bits
> > -	 * match a valid pfn.
> > -	 */
> > -	if (PHYS_PFN(addr) != pfn)
> > -		return 0;
> > -
> > -#ifdef CONFIG_SPARSEMEM
> > -{
> > -	struct mem_section *ms;
> > -
> > -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> > -		return 0;
> > -
> > -	ms = __pfn_to_section(pfn);
> > -	if (!valid_section(ms))
> > -		return 0;
> > -
> > -	/*
> > -	 * ZONE_DEVICE memory does not have the memblock entries.
> > -	 * memblock_is_memory() check for ZONE_DEVICE based
> > -	 * addresses will always fail. Even the normal hotplugged
> > -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> > -	 * memblock entries. Skip memblock search for all non early
> > -	 * memory sections covering all of hotplug memory including
> > -	 * both normal and ZONE_DEVICE based.
> > -	 */
> > -	if (!early_section(ms))
> > -		return pfn_section_valid(ms, pfn);
> > -}
> > -#endif
> > -	return memblock_is_memory(addr);
> > -}
> > -EXPORT_SYMBOL(pfn_valid);
> > -
> >  int pfn_is_map_memory(unsigned long pfn)
> >  {
> >  	phys_addr_t addr = PFN_PHYS(pfn);
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 961f0eeefb62..18bf71665211 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> >   *
> >   * Return: 1 for PFNs that have memory map entries and 0 otherwise
> >   */
> > +bool memblock_is_memory(phys_addr_t addr);
> > +
> >  static inline int pfn_valid(unsigned long pfn)
> >  {
> > +	phys_addr_t addr = PFN_PHYS(pfn);
> >  	struct mem_section *ms;
> >  
> > +	/*
> > +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> > +	 * pfn. Else it might lead to false positives when
> > +	 * some of the upper bits are set, but the lower bits
> > +	 * match a valid pfn.
> > +	 */
> > +	if (PHYS_PFN(addr) != pfn)
> > +		return 0;
> > +
> >  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >  		return 0;
> >  	ms = __nr_to_section(pfn_to_section_nr(pfn));
> > @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
> >  	 * Traditionally early sections always returned pfn_valid() for
> >  	 * the entire section-sized span.
> >  	 */
> > -	return early_section(ms) || pfn_section_valid(ms, pfn);
> > +	if (early_section(ms))
> > +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> > +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
> > +
> > +	return pfn_section_valid(ms, pfn);
> >  }
> >  #endif
> 
> Hello David/Mike,
> 
> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
> SPARSEMEM_VMEMMAP is only available memory model on arm64, wondering if we
> still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal ? Please do kindly
> suggest. Thank you.

Even now arm64 still frees parts of the memory map and pfn_valid() should
be able to tell if a part of a section is freed or not.

For instance for the following memory configuration
    
        |<----section---->|<----hole---->|<----section---->|
        +--------+--------+--------------+--------+--------+
        | bank 0 | unused |              | bank 1 | unused |
        +--------+--------+--------------+--------+--------+

the memory map corresponding to the "unused" areas is freed, but the generic
pfn_valid() will still return 1 there.
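
So a hypothetical caller doing the usual dance would happily touch the
freed part of the memory map:

	/* pfn somewhere in the freed "unused" tail of bank 0's section */
	if (pfn_valid(pfn)) {		/* returns 1: early section */
		struct page *page = pfn_to_page(pfn);
		/* page now points into memmap that was freed at boot */
	}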

So we should either stop freeing the unused memory map on arm64, keep
arm64::pfn_valid(), or implement something along the lines of this patch.

I personally don't think that the memory savings from freeing the unused
memory map are worth the pain of maintenance and bugs happening here and there.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-05-24  6:52     ` Mike Rapoport
@ 2021-05-25  6:00       ` Anshuman Khandual
  2021-05-25  6:32         ` Mike Rapoport
  0 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2021-05-25  6:00 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel


On 5/24/21 12:22 PM, Mike Rapoport wrote:
> Hello Anshuman,
> 
> On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
>>
>> On 4/22/21 1:20 PM, Anshuman Khandual wrote:
>>> Platforms like arm and arm64 have redefined pfn_valid() because their early
>>> memory sections might have contained memmap holes after freeing parts of it
>>> during boot, which should be skipped while validating a pfn for struct page
>>> backing. This scenario on certain platforms where memmap is not continuous,
>>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
>>> Then the generic pfn_valid() can be improved to accommodate such platforms.
>>> This reduces overall code footprint and also improves maintainability.
>>>
>>> free_unused_memmap() and pfn_to_online_page() have been updated to include
>>> such cases. This also exports memblock_is_memory() for all drivers that use
>>> pfn_valid() but lack required visibility. After the new config is in place,
>>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
>>>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Mike Rapoport <rppt@kernel.org>
>>> Cc: David Hildenbrand <david@redhat.com>
>>> Cc: linux-arm-kernel@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: linux-mm@kvack.org
>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>> This patch applies on the latest mainline kernel after Mike's series
>>> regarding arm64 based pfn_valid().
>>>
>>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
>>>
>>> Changes in RFC V2:
>>>
>>> - Dropped support for arm (32 bit)
>>> - Replaced memblock_is_map_memory() check with memblock_is_memory()
>>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
>>> - Updated pfn_to_online_page() per David
>>> - Updated free_unused_memmap() to preserve existing semantics per Mike
>>> - Exported memblock_is_memory() instead of memblock_is_map_memory()
>>>
>>> Changes in RFC V1:
>>>
>>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
>>>
>>>  arch/arm64/Kconfig            |  2 +-
>>>  arch/arm64/include/asm/page.h |  1 -
>>>  arch/arm64/mm/init.c          | 41 -----------------------------------
>>>  include/linux/mmzone.h        | 18 ++++++++++++++-
>>>  mm/Kconfig                    |  9 ++++++++
>>>  mm/memblock.c                 |  8 +++++--
>>>  mm/memory_hotplug.c           |  5 +++++
>>>  7 files changed, 38 insertions(+), 46 deletions(-)
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index b4a9b493ce72..4cdc3570ffa9 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -144,7 +144,6 @@ config ARM64
>>>  	select HAVE_ARCH_KGDB
>>>  	select HAVE_ARCH_MMAP_RND_BITS
>>>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
>>> -	select HAVE_ARCH_PFN_VALID
>>>  	select HAVE_ARCH_PREL32_RELOCATIONS
>>>  	select HAVE_ARCH_SECCOMP_FILTER
>>>  	select HAVE_ARCH_STACKLEAK
>>> @@ -167,6 +166,7 @@ config ARM64
>>>  		if $(cc-option,-fpatchable-function-entry=2)
>>>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>>>  		if DYNAMIC_FTRACE_WITH_REGS
>>> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>>>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>>>  	select HAVE_FAST_GUP
>>>  	select HAVE_FTRACE_MCOUNT_RECORD
>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>>> index 75ddfe671393..fcbef3eec4b2 100644
>>> --- a/arch/arm64/include/asm/page.h
>>> +++ b/arch/arm64/include/asm/page.h
>>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>>>  
>>>  typedef struct page *pgtable_t;
>>>  
>>> -int pfn_valid(unsigned long pfn);
>>>  int pfn_is_map_memory(unsigned long pfn);
>>>  
>>>  #include <asm/memory.h>
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index f431b38d0837..5731a11550d8 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>>  	free_area_init(max_zone_pfns);
>>>  }
>>>  
>>> -int pfn_valid(unsigned long pfn)
>>> -{
>>> -	phys_addr_t addr = PFN_PHYS(pfn);
>>> -
>>> -	/*
>>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>> -	 * pfn. Else it might lead to false positives when
>>> -	 * some of the upper bits are set, but the lower bits
>>> -	 * match a valid pfn.
>>> -	 */
>>> -	if (PHYS_PFN(addr) != pfn)
>>> -		return 0;
>>> -
>>> -#ifdef CONFIG_SPARSEMEM
>>> -{
>>> -	struct mem_section *ms;
>>> -
>>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>> -		return 0;
>>> -
>>> -	ms = __pfn_to_section(pfn);
>>> -	if (!valid_section(ms))
>>> -		return 0;
>>> -
>>> -	/*
>>> -	 * ZONE_DEVICE memory does not have the memblock entries.
>>> -	 * memblock_is_memory() check for ZONE_DEVICE based
>>> -	 * addresses will always fail. Even the normal hotplugged
>>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
>>> -	 * memblock entries. Skip memblock search for all non early
>>> -	 * memory sections covering all of hotplug memory including
>>> -	 * both normal and ZONE_DEVICE based.
>>> -	 */
>>> -	if (!early_section(ms))
>>> -		return pfn_section_valid(ms, pfn);
>>> -}
>>> -#endif
>>> -	return memblock_is_memory(addr);
>>> -}
>>> -EXPORT_SYMBOL(pfn_valid);
>>> -
>>>  int pfn_is_map_memory(unsigned long pfn)
>>>  {
>>>  	phys_addr_t addr = PFN_PHYS(pfn);
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 961f0eeefb62..18bf71665211 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>>   *
>>>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
>>>   */
>>> +bool memblock_is_memory(phys_addr_t addr);
>>> +
>>>  static inline int pfn_valid(unsigned long pfn)
>>>  {
>>> +	phys_addr_t addr = PFN_PHYS(pfn);
>>>  	struct mem_section *ms;
>>>  
>>> +	/*
>>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>> +	 * pfn. Else it might lead to false positives when
>>> +	 * some of the upper bits are set, but the lower bits
>>> +	 * match a valid pfn.
>>> +	 */
>>> +	if (PHYS_PFN(addr) != pfn)
>>> +		return 0;
>>> +
>>>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>  		return 0;
>>>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
>>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>>>  	 * Traditionally early sections always returned pfn_valid() for
>>>  	 * the entire section-sized span.
>>>  	 */
>>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
>>> +	if (early_section(ms))
>>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
>>> +
>>> +	return pfn_section_valid(ms, pfn);
>>>  }
>>>  #endif
>>
>> Hello David/Mike,
>>
>> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
>> SPARSEMEM_VMEMMAP is only available memory model on arm64, wondering if we
>> still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal ? Please do kindly
>> suggest. Thank you.
> 
> Even now arm64 still frees parts of the memory map and pfn_valid() should
> be able to tell if a part of a section is freed or not.
> 
> For instance for the following memory configuration
>     
>         |<----section---->|<----hole---->|<----section---->|
>         +--------+--------+--------------+--------+--------+
>         | bank 0 | unused |              | bank 1 | unused |
>         +--------+--------+--------------+--------+--------+
> 
> the memory map corresponding to the "unused" areas is freed, but the generic
> pfn_valid() will still return 1 there.

But does not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
is enabled, which is now the only option on arm64? Then how can the memmap
still have holes (from unused areas)? Am I missing something here?
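
(The existing bail-out in free_unused_memmap(), as in the hunk quoted
earlier in the thread, being

	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
		return;

so with SPARSEMEM_VMEMMAP the memmap is never trimmed to begin with.)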

> 
> So we either should stop freeing unused memory map on arm64, or keep
> arm64::pfn_valid() or implement something along the lines of this patch.
> 
> I personally don't think that the memory savings from freeing the unused
> memory map worth the pain of maintenance and bugs happening here and there.
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-05-25  6:00       ` Anshuman Khandual
@ 2021-05-25  6:32         ` Mike Rapoport
  2021-05-25  9:52           ` Anshuman Khandual
  0 siblings, 1 reply; 22+ messages in thread
From: Mike Rapoport @ 2021-05-25  6:32 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On Tue, May 25, 2021 at 11:30:15AM +0530, Anshuman Khandual wrote:
> 
> On 5/24/21 12:22 PM, Mike Rapoport wrote:
> > Hello Anshuman,
> > 
> > On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
> >>
> >> On 4/22/21 1:20 PM, Anshuman Khandual wrote:
> >>> Platforms like arm and arm64 have redefined pfn_valid() because their early
> >>> memory sections might have contained memmap holes after freeing parts of it
> >>> during boot, which should be skipped while validating a pfn for struct page
> >>> backing. This scenario on certain platforms where memmap is not continuous,
> >>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> >>> Then the generic pfn_valid() can be improved to accommodate such platforms.
> >>> This reduces overall code footprint and also improves maintainability.
> >>>
> >>> free_unused_memmap() and pfn_to_online_page() have been updated to include
> >>> such cases. This also exports memblock_is_memory() for all drivers that use
> >>> pfn_valid() but lack required visibility. After the new config is in place,
> >>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
> >>>
> >>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>> Cc: Will Deacon <will@kernel.org>
> >>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>> Cc: Mike Rapoport <rppt@kernel.org>
> >>> Cc: David Hildenbrand <david@redhat.com>
> >>> Cc: linux-arm-kernel@lists.infradead.org
> >>> Cc: linux-kernel@vger.kernel.org
> >>> Cc: linux-mm@kvack.org
> >>> Suggested-by: David Hildenbrand <david@redhat.com>
> >>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> >>> ---
> >>> This patch applies on the latest mainline kernel after Mike's series
> >>> regarding arm64 based pfn_valid().
> >>>
> >>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
> >>>
> >>> Changes in RFC V2:
> >>>
> >>> - Dropped support for arm (32 bit)
> >>> - Replaced memblock_is_map_memory() check with memblock_is_memory()
> >>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> >>> - Updated pfn_to_online_page() per David
> >>> - Updated free_unused_memmap() to preserve existing semantics per Mike
> >>> - Exported memblock_is_memory() instead of memblock_is_map_memory()
> >>>
> >>> Changes in RFC V1:
> >>>
> >>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
> >>>
> >>>  arch/arm64/Kconfig            |  2 +-
> >>>  arch/arm64/include/asm/page.h |  1 -
> >>>  arch/arm64/mm/init.c          | 41 -----------------------------------
> >>>  include/linux/mmzone.h        | 18 ++++++++++++++-
> >>>  mm/Kconfig                    |  9 ++++++++
> >>>  mm/memblock.c                 |  8 +++++--
> >>>  mm/memory_hotplug.c           |  5 +++++
> >>>  7 files changed, 38 insertions(+), 46 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> >>> index b4a9b493ce72..4cdc3570ffa9 100644
> >>> --- a/arch/arm64/Kconfig
> >>> +++ b/arch/arm64/Kconfig
> >>> @@ -144,7 +144,6 @@ config ARM64
> >>>  	select HAVE_ARCH_KGDB
> >>>  	select HAVE_ARCH_MMAP_RND_BITS
> >>>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> >>> -	select HAVE_ARCH_PFN_VALID
> >>>  	select HAVE_ARCH_PREL32_RELOCATIONS
> >>>  	select HAVE_ARCH_SECCOMP_FILTER
> >>>  	select HAVE_ARCH_STACKLEAK
> >>> @@ -167,6 +166,7 @@ config ARM64
> >>>  		if $(cc-option,-fpatchable-function-entry=2)
> >>>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
> >>>  		if DYNAMIC_FTRACE_WITH_REGS
> >>> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
> >>>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
> >>>  	select HAVE_FAST_GUP
> >>>  	select HAVE_FTRACE_MCOUNT_RECORD
> >>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> >>> index 75ddfe671393..fcbef3eec4b2 100644
> >>> --- a/arch/arm64/include/asm/page.h
> >>> +++ b/arch/arm64/include/asm/page.h
> >>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
> >>>  
> >>>  typedef struct page *pgtable_t;
> >>>  
> >>> -int pfn_valid(unsigned long pfn);
> >>>  int pfn_is_map_memory(unsigned long pfn);
> >>>  
> >>>  #include <asm/memory.h>
> >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>> index f431b38d0837..5731a11550d8 100644
> >>> --- a/arch/arm64/mm/init.c
> >>> +++ b/arch/arm64/mm/init.c
> >>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
> >>>  	free_area_init(max_zone_pfns);
> >>>  }
> >>>  
> >>> -int pfn_valid(unsigned long pfn)
> >>> -{
> >>> -	phys_addr_t addr = PFN_PHYS(pfn);
> >>> -
> >>> -	/*
> >>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> >>> -	 * pfn. Else it might lead to false positives when
> >>> -	 * some of the upper bits are set, but the lower bits
> >>> -	 * match a valid pfn.
> >>> -	 */
> >>> -	if (PHYS_PFN(addr) != pfn)
> >>> -		return 0;
> >>> -
> >>> -#ifdef CONFIG_SPARSEMEM
> >>> -{
> >>> -	struct mem_section *ms;
> >>> -
> >>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >>> -		return 0;
> >>> -
> >>> -	ms = __pfn_to_section(pfn);
> >>> -	if (!valid_section(ms))
> >>> -		return 0;
> >>> -
> >>> -	/*
> >>> -	 * ZONE_DEVICE memory does not have the memblock entries.
> >>> -	 * memblock_is_memory() check for ZONE_DEVICE based
> >>> -	 * addresses will always fail. Even the normal hotplugged
> >>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> >>> -	 * memblock entries. Skip memblock search for all non early
> >>> -	 * memory sections covering all of hotplug memory including
> >>> -	 * both normal and ZONE_DEVICE based.
> >>> -	 */
> >>> -	if (!early_section(ms))
> >>> -		return pfn_section_valid(ms, pfn);
> >>> -}
> >>> -#endif
> >>> -	return memblock_is_memory(addr);
> >>> -}
> >>> -EXPORT_SYMBOL(pfn_valid);
> >>> -
> >>>  int pfn_is_map_memory(unsigned long pfn)
> >>>  {
> >>>  	phys_addr_t addr = PFN_PHYS(pfn);
> >>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >>> index 961f0eeefb62..18bf71665211 100644
> >>> --- a/include/linux/mmzone.h
> >>> +++ b/include/linux/mmzone.h
> >>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> >>>   *
> >>>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
> >>>   */
> >>> +bool memblock_is_memory(phys_addr_t addr);
> >>> +
> >>>  static inline int pfn_valid(unsigned long pfn)
> >>>  {
> >>> +	phys_addr_t addr = PFN_PHYS(pfn);
> >>>  	struct mem_section *ms;
> >>>  
> >>> +	/*
> >>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> >>> +	 * pfn. Else it might lead to false positives when
> >>> +	 * some of the upper bits are set, but the lower bits
> >>> +	 * match a valid pfn.
> >>> +	 */
> >>> +	if (PHYS_PFN(addr) != pfn)
> >>> +		return 0;
> >>> +
> >>>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >>>  		return 0;
> >>>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
> >>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
> >>>  	 * Traditionally early sections always returned pfn_valid() for
> >>>  	 * the entire section-sized span.
> >>>  	 */
> >>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
> >>> +	if (early_section(ms))
> >>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> >>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
> >>> +
> >>> +	return pfn_section_valid(ms, pfn);
> >>>  }
> >>>  #endif
> >>
> >> Hello David/Mike,
> >>
> >> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
> >> SPARSEMEM_VMEMMAP is only available memory model on arm64, wondering if we
> >> still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal ? Please do kindly
> >> suggest. Thank you.
> > 
> > Even now arm64 still frees parts of the memory map and pfn_valid() should
> > be able to tell if a part of a section is freed or not.
> > 
> > For instance for the following memory configuration
> >     
> >         |<----section---->|<----hole---->|<----section---->|
> >         +--------+--------+--------------+--------+--------+
> >         | bank 0 | unused |              | bank 1 | unused |
> >         +--------+--------+--------------+--------+--------+
> > 
> > the memory map corresponding to the "unused" areas is freed, but the generic
> > pfn_valid() will still return 1 there.
> 
> But is not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
> is enabled, which is the only option now on arm64. Then how can memmap have
> holes (from unused areas) anymore ? Am I missing something here.
 
Ah, you are right, I missed this detail myself :)

With CONFIG_SPARSEMEM_VMEMMAP as the only memory model for arm64, we can
simply get rid of arm64::pfn_valid() without any changes to the generic
version.
 
> > So we either should stop freeing unused memory map on arm64, or keep
> > arm64::pfn_valid() or implement something along the lines of this patch.
> > 
> > I personally don't think that the memory savings from freeing the unused
> > memory map worth the pain of maintenance and bugs happening here and there.
> > 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-05-25  6:32         ` Mike Rapoport
@ 2021-05-25  9:52           ` Anshuman Khandual
  2021-05-25 10:03             ` Mike Rapoport
  0 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2021-05-25  9:52 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel



On 5/25/21 12:02 PM, Mike Rapoport wrote:
> On Tue, May 25, 2021 at 11:30:15AM +0530, Anshuman Khandual wrote:
>>
>> On 5/24/21 12:22 PM, Mike Rapoport wrote:
>>> Hello Anshuman,
>>>
>>> On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
>>>>
>>>> On 4/22/21 1:20 PM, Anshuman Khandual wrote:
>>>>> Platforms like arm and arm64 have redefined pfn_valid() because their early
>>>>> memory sections might have contained memmap holes after freeing parts of it
>>>>> during boot, which should be skipped while validating a pfn for struct page
>>>>> backing. This scenario on certain platforms where memmap is not continuous,
>>>>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
>>>>> Then the generic pfn_valid() can be improved to accommodate such platforms.
>>>>> This reduces overall code footprint and also improves maintainability.
>>>>>
>>>>> free_unused_memmap() and pfn_to_online_page() have been updated to include
>>>>> such cases. This also exports memblock_is_memory() for all drivers that use
>>>>> pfn_valid() but lack required visibility. After the new config is in place,
>>>>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
>>>>>
>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>> Cc: Will Deacon <will@kernel.org>
>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>> Cc: Mike Rapoport <rppt@kernel.org>
>>>>> Cc: David Hildenbrand <david@redhat.com>
>>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>>> Cc: linux-kernel@vger.kernel.org
>>>>> Cc: linux-mm@kvack.org
>>>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>> ---
>>>>> This patch applies on the latest mainline kernel after Mike's series
>>>>> regarding arm64 based pfn_valid().
>>>>>
>>>>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
>>>>>
>>>>> Changes in RFC V2:
>>>>>
>>>>> - Dropped support for arm (32 bit)
>>>>> - Replaced memblock_is_map_memory() check with memblock_is_memory()
>>>>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
>>>>> - Updated pfn_to_online_page() per David
>>>>> - Updated free_unused_memmap() to preserve existing semantics per Mike
>>>>> - Exported memblock_is_memory() instead of memblock_is_map_memory()
>>>>>
>>>>> Changes in RFC V1:
>>>>>
>>>>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
>>>>>
>>>>>  arch/arm64/Kconfig            |  2 +-
>>>>>  arch/arm64/include/asm/page.h |  1 -
>>>>>  arch/arm64/mm/init.c          | 41 -----------------------------------
>>>>>  include/linux/mmzone.h        | 18 ++++++++++++++-
>>>>>  mm/Kconfig                    |  9 ++++++++
>>>>>  mm/memblock.c                 |  8 +++++--
>>>>>  mm/memory_hotplug.c           |  5 +++++
>>>>>  7 files changed, 38 insertions(+), 46 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>> index b4a9b493ce72..4cdc3570ffa9 100644
>>>>> --- a/arch/arm64/Kconfig
>>>>> +++ b/arch/arm64/Kconfig
>>>>> @@ -144,7 +144,6 @@ config ARM64
>>>>>  	select HAVE_ARCH_KGDB
>>>>>  	select HAVE_ARCH_MMAP_RND_BITS
>>>>>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
>>>>> -	select HAVE_ARCH_PFN_VALID
>>>>>  	select HAVE_ARCH_PREL32_RELOCATIONS
>>>>>  	select HAVE_ARCH_SECCOMP_FILTER
>>>>>  	select HAVE_ARCH_STACKLEAK
>>>>> @@ -167,6 +166,7 @@ config ARM64
>>>>>  		if $(cc-option,-fpatchable-function-entry=2)
>>>>>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>>>>>  		if DYNAMIC_FTRACE_WITH_REGS
>>>>> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>>>>>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>>>>>  	select HAVE_FAST_GUP
>>>>>  	select HAVE_FTRACE_MCOUNT_RECORD
>>>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>>>>> index 75ddfe671393..fcbef3eec4b2 100644
>>>>> --- a/arch/arm64/include/asm/page.h
>>>>> +++ b/arch/arm64/include/asm/page.h
>>>>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>>>>>  
>>>>>  typedef struct page *pgtable_t;
>>>>>  
>>>>> -int pfn_valid(unsigned long pfn);
>>>>>  int pfn_is_map_memory(unsigned long pfn);
>>>>>  
>>>>>  #include <asm/memory.h>
>>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>>> index f431b38d0837..5731a11550d8 100644
>>>>> --- a/arch/arm64/mm/init.c
>>>>> +++ b/arch/arm64/mm/init.c
>>>>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>>>>  	free_area_init(max_zone_pfns);
>>>>>  }
>>>>>  
>>>>> -int pfn_valid(unsigned long pfn)
>>>>> -{
>>>>> -	phys_addr_t addr = PFN_PHYS(pfn);
>>>>> -
>>>>> -	/*
>>>>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>>>> -	 * pfn. Else it might lead to false positives when
>>>>> -	 * some of the upper bits are set, but the lower bits
>>>>> -	 * match a valid pfn.
>>>>> -	 */
>>>>> -	if (PHYS_PFN(addr) != pfn)
>>>>> -		return 0;
>>>>> -
>>>>> -#ifdef CONFIG_SPARSEMEM
>>>>> -{
>>>>> -	struct mem_section *ms;
>>>>> -
>>>>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>>> -		return 0;
>>>>> -
>>>>> -	ms = __pfn_to_section(pfn);
>>>>> -	if (!valid_section(ms))
>>>>> -		return 0;
>>>>> -
>>>>> -	/*
>>>>> -	 * ZONE_DEVICE memory does not have the memblock entries.
>>>>> -	 * memblock_is_memory() check for ZONE_DEVICE based
>>>>> -	 * addresses will always fail. Even the normal hotplugged
>>>>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
>>>>> -	 * memblock entries. Skip memblock search for all non early
>>>>> -	 * memory sections covering all of hotplug memory including
>>>>> -	 * both normal and ZONE_DEVICE based.
>>>>> -	 */
>>>>> -	if (!early_section(ms))
>>>>> -		return pfn_section_valid(ms, pfn);
>>>>> -}
>>>>> -#endif
>>>>> -	return memblock_is_memory(addr);
>>>>> -}
>>>>> -EXPORT_SYMBOL(pfn_valid);
>>>>> -
>>>>>  int pfn_is_map_memory(unsigned long pfn)
>>>>>  {
>>>>>  	phys_addr_t addr = PFN_PHYS(pfn);
>>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>>> index 961f0eeefb62..18bf71665211 100644
>>>>> --- a/include/linux/mmzone.h
>>>>> +++ b/include/linux/mmzone.h
>>>>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>>>>   *
>>>>>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
>>>>>   */
>>>>> +bool memblock_is_memory(phys_addr_t addr);
>>>>> +
>>>>>  static inline int pfn_valid(unsigned long pfn)
>>>>>  {
>>>>> +	phys_addr_t addr = PFN_PHYS(pfn);
>>>>>  	struct mem_section *ms;
>>>>>  
>>>>> +	/*
>>>>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>>>> +	 * pfn. Else it might lead to false positives when
>>>>> +	 * some of the upper bits are set, but the lower bits
>>>>> +	 * match a valid pfn.
>>>>> +	 */
>>>>> +	if (PHYS_PFN(addr) != pfn)
>>>>> +		return 0;
>>>>> +
>>>>>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>>>  		return 0;
>>>>>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
>>>>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>>>>>  	 * Traditionally early sections always returned pfn_valid() for
>>>>>  	 * the entire section-sized span.
>>>>>  	 */
>>>>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
>>>>> +	if (early_section(ms))
>>>>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>>>>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
>>>>> +
>>>>> +	return pfn_section_valid(ms, pfn);
>>>>>  }
>>>>>  #endif
>>>>
>>>> Hello David/Mike,
>>>>
>>>> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
>>>> SPARSEMEM_VMEMMAP is only available memory model on arm64, wondering if we
>>>> still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal ? Please do kindly
>>>> suggest. Thank you.
>>>
>>> Even now arm64 still frees parts of the memory map and pfn_valid() should
>>> be able to tell if a part of a section is freed or not.
>>>
>>> For instance for the following memory configuration
>>>     
>>>         |<----section---->|<----hole---->|<----section---->|
>>>         +--------+--------+--------------+--------+--------+
>>>         | bank 0 | unused |              | bank 1 | unused |
>>>         +--------+--------+--------------+--------+--------+
>>>
>>> the memory map corresponding to the "unused" areas is freed, but the generic
>>> pfn_valid() will still return 1 there.
>>
>> But is not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
>> is enabled, which is the only option now on arm64. Then how can memmap have
>> holes (from unused areas) anymore ? Am I missing something here.
>  
> Ah, you are right, I missed this detail myself :)
> 
> With CONFIG_SPARSEMEM_VMEMMAP as the only memory model for arm64, we can
> simply rid of arm64::pfn_valid() without any changes to the generic
> version.

Though I have just moved the pfn bits sanity check into the generic
pfn_valid(). I hope this looks okay.

From 7a63f460bcb6ae171c2081bfad81edd9e8f3b7a0 Mon Sep 17 00:00:00 2001
From: Anshuman Khandual <anshuman.khandual@arm.com>
Date: Tue, 25 May 2021 10:27:09 +0100
Subject: [PATCH] arm64/mm: Drop HAVE_ARCH_PFN_VALID

CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64
platforms and free_unused_memmap() would just return without creating any
holes in the memmap mapping. There is no need for any special handling in
pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped. This also moves
the pfn upper bits sanity check into generic pfn_valid().

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig            |  1 -
 arch/arm64/include/asm/page.h |  1 -
 arch/arm64/mm/init.c          | 37 -----------------------------------
 include/linux/mmzone.h        |  9 +++++++++
 4 files changed, 9 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d7dc8698cf8e..7904728befcc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -154,7 +154,6 @@ config ARM64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
-	select HAVE_ARCH_PFN_VALID
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
 	select HAVE_ARCH_SECCOMP_FILTER
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 75ddfe671393..fcbef3eec4b2 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
 
 typedef struct page *pgtable_t;
 
-int pfn_valid(unsigned long pfn);
 int pfn_is_map_memory(unsigned long pfn);
 
 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 725aa84f2faa..49019ea0c8a8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -219,43 +219,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 	free_area_init(max_zone_pfns);
 }
 
-int pfn_valid(unsigned long pfn)
-{
-	phys_addr_t addr = PFN_PHYS(pfn);
-	struct mem_section *ms;
-
-	/*
-	 * Ensure the upper PAGE_SHIFT bits are clear in the
-	 * pfn. Else it might lead to false positives when
-	 * some of the upper bits are set, but the lower bits
-	 * match a valid pfn.
-	 */
-	if (PHYS_PFN(addr) != pfn)
-		return 0;
-
-	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
-		return 0;
-
-	ms = __pfn_to_section(pfn);
-	if (!valid_section(ms))
-		return 0;
-
-	/*
-	 * ZONE_DEVICE memory does not have the memblock entries.
-	 * memblock_is_map_memory() check for ZONE_DEVICE based
-	 * addresses will always fail. Even the normal hotplugged
-	 * memory will never have MEMBLOCK_NOMAP flag set in their
-	 * memblock entries. Skip memblock search for all non early
-	 * memory sections covering all of hotplug memory including
-	 * both normal and ZONE_DEVICE based.
-	 */
-	if (!early_section(ms))
-		return pfn_section_valid(ms, pfn);
-
-	return memblock_is_memory(addr);
-}
-EXPORT_SYMBOL(pfn_valid);
-
 int pfn_is_map_memory(unsigned long pfn)
 {
 	phys_addr_t addr = PFN_PHYS(pfn);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a9b263d4cf9d..d0c4fc506fa3 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1443,6 +1443,15 @@ static inline int pfn_valid(unsigned long pfn)
 {
 	struct mem_section *ms;
 
+	/*
+	 * Ensure the upper PAGE_SHIFT bits are clear in the
+	 * pfn. Else it might lead to false positives when
+	 * some of the upper bits are set, but the lower bits
+	 * match a valid pfn.
+	 */
+	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
+		return 0;
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 	ms = __nr_to_section(pfn_to_section_nr(pfn));
-- 
2.20.1
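For anyone wondering what the PHYS_PFN(PFN_PHYS(pfn)) != pfn test actually
rejects, below is a minimal userspace mock-up of the round trip. It is an
illustration only: it assumes a 64-bit build with 4K pages, and the macro
bodies simply mirror include/linux/pfn.h.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t phys_addr_t;
#define PAGE_SHIFT	12
#define PFN_PHYS(x)	((phys_addr_t)(x) << PAGE_SHIFT)
#define PHYS_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))

int main(void)
{
	unsigned long good = 0x80000UL;			/* an ordinary pfn */
	unsigned long bad  = good | (1UL << 60);	/* bogus upper bits */

	/*
	 * The shift in PFN_PHYS() drops any pfn bits at or above
	 * (64 - PAGE_SHIFT), so only pfns whose upper bits are clear
	 * survive the round trip.
	 */
	printf("good round-trips: %d\n", PHYS_PFN(PFN_PHYS(good)) == good);
	printf("bad  round-trips: %d\n", PHYS_PFN(PFN_PHYS(bad)) == bad);
	return 0;
}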

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-05-25  9:52           ` Anshuman Khandual
@ 2021-05-25 10:03             ` Mike Rapoport
  2021-05-25 10:04               ` David Hildenbrand
  0 siblings, 1 reply; 22+ messages in thread
From: Mike Rapoport @ 2021-05-25 10:03 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-mm, david, akpm, Catalin Marinas, Will Deacon,
	linux-arm-kernel, linux-kernel

On Tue, May 25, 2021 at 03:22:53PM +0530, Anshuman Khandual wrote:
> 
> 
> On 5/25/21 12:02 PM, Mike Rapoport wrote:
> > On Tue, May 25, 2021 at 11:30:15AM +0530, Anshuman Khandual wrote:
> >>
> >> On 5/24/21 12:22 PM, Mike Rapoport wrote:
> >>> Hello Anshuman,
> >>>
> >>> On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
> >>>>
> >>>> On 4/22/21 1:20 PM, Anshuman Khandual wrote:
> >>>>> Platforms like arm and arm64 have redefined pfn_valid() because their early
> >>>>> memory sections might have contained memmap holes after freeing parts of it
> >>>>> during boot, which should be skipped while validating a pfn for struct page
> >>>>> backing. This scenario on certain platforms where memmap is not continuous,
> >>>>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> >>>>> Then the generic pfn_valid() can be improved to accommodate such platforms.
> >>>>> This reduces overall code footprint and also improves maintainability.
> >>>>>
> >>>>> free_unused_memmap() and pfn_to_online_page() have been updated to include
> >>>>> such cases. This also exports memblock_is_memory() for all drivers that use
> >>>>> pfn_valid() but lack required visibility. After the new config is in place,
> >>>>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
> >>>>>
> >>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>>>> Cc: Will Deacon <will@kernel.org>
> >>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>>>> Cc: Mike Rapoport <rppt@kernel.org>
> >>>>> Cc: David Hildenbrand <david@redhat.com>
> >>>>> Cc: linux-arm-kernel@lists.infradead.org
> >>>>> Cc: linux-kernel@vger.kernel.org
> >>>>> Cc: linux-mm@kvack.org
> >>>>> Suggested-by: David Hildenbrand <david@redhat.com>
> >>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> >>>>> ---
> >>>>> This patch applies on the latest mainline kernel after Mike's series
> >>>>> regarding arm64 based pfn_valid().
> >>>>>
> >>>>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
> >>>>>
> >>>>> Changes in RFC V2:
> >>>>>
> >>>>> - Dropped support for arm (32 bit)
> >>>>> - Replaced memblock_is_map_memory() check with memblock_is_memory()
> >>>>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> >>>>> - Updated pfn_to_online_page() per David
> >>>>> - Updated free_unused_memmap() to preserve existing semantics per Mike
> >>>>> - Exported memblock_is_memory() instead of memblock_is_map_memory()
> >>>>>
> >>>>> Changes in RFC V1:
> >>>>>
> >>>>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
> >>>>>
> >>>>>  arch/arm64/Kconfig            |  2 +-
> >>>>>  arch/arm64/include/asm/page.h |  1 -
> >>>>>  arch/arm64/mm/init.c          | 41 -----------------------------------
> >>>>>  include/linux/mmzone.h        | 18 ++++++++++++++-
> >>>>>  mm/Kconfig                    |  9 ++++++++
> >>>>>  mm/memblock.c                 |  8 +++++--
> >>>>>  mm/memory_hotplug.c           |  5 +++++
> >>>>>  7 files changed, 38 insertions(+), 46 deletions(-)
> >>>>>
> >>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> >>>>> index b4a9b493ce72..4cdc3570ffa9 100644
> >>>>> --- a/arch/arm64/Kconfig
> >>>>> +++ b/arch/arm64/Kconfig
> >>>>> @@ -144,7 +144,6 @@ config ARM64
> >>>>>  	select HAVE_ARCH_KGDB
> >>>>>  	select HAVE_ARCH_MMAP_RND_BITS
> >>>>>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> >>>>> -	select HAVE_ARCH_PFN_VALID
> >>>>>  	select HAVE_ARCH_PREL32_RELOCATIONS
> >>>>>  	select HAVE_ARCH_SECCOMP_FILTER
> >>>>>  	select HAVE_ARCH_STACKLEAK
> >>>>> @@ -167,6 +166,7 @@ config ARM64
> >>>>>  		if $(cc-option,-fpatchable-function-entry=2)
> >>>>>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
> >>>>>  		if DYNAMIC_FTRACE_WITH_REGS
> >>>>> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
> >>>>>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
> >>>>>  	select HAVE_FAST_GUP
> >>>>>  	select HAVE_FTRACE_MCOUNT_RECORD
> >>>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> >>>>> index 75ddfe671393..fcbef3eec4b2 100644
> >>>>> --- a/arch/arm64/include/asm/page.h
> >>>>> +++ b/arch/arm64/include/asm/page.h
> >>>>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
> >>>>>  
> >>>>>  typedef struct page *pgtable_t;
> >>>>>  
> >>>>> -int pfn_valid(unsigned long pfn);
> >>>>>  int pfn_is_map_memory(unsigned long pfn);
> >>>>>  
> >>>>>  #include <asm/memory.h>
> >>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>>>> index f431b38d0837..5731a11550d8 100644
> >>>>> --- a/arch/arm64/mm/init.c
> >>>>> +++ b/arch/arm64/mm/init.c
> >>>>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
> >>>>>  	free_area_init(max_zone_pfns);
> >>>>>  }
> >>>>>  
> >>>>> -int pfn_valid(unsigned long pfn)
> >>>>> -{
> >>>>> -	phys_addr_t addr = PFN_PHYS(pfn);
> >>>>> -
> >>>>> -	/*
> >>>>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> >>>>> -	 * pfn. Else it might lead to false positives when
> >>>>> -	 * some of the upper bits are set, but the lower bits
> >>>>> -	 * match a valid pfn.
> >>>>> -	 */
> >>>>> -	if (PHYS_PFN(addr) != pfn)
> >>>>> -		return 0;
> >>>>> -
> >>>>> -#ifdef CONFIG_SPARSEMEM
> >>>>> -{
> >>>>> -	struct mem_section *ms;
> >>>>> -
> >>>>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >>>>> -		return 0;
> >>>>> -
> >>>>> -	ms = __pfn_to_section(pfn);
> >>>>> -	if (!valid_section(ms))
> >>>>> -		return 0;
> >>>>> -
> >>>>> -	/*
> >>>>> -	 * ZONE_DEVICE memory does not have the memblock entries.
> >>>>> -	 * memblock_is_memory() check for ZONE_DEVICE based
> >>>>> -	 * addresses will always fail. Even the normal hotplugged
> >>>>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> >>>>> -	 * memblock entries. Skip memblock search for all non early
> >>>>> -	 * memory sections covering all of hotplug memory including
> >>>>> -	 * both normal and ZONE_DEVICE based.
> >>>>> -	 */
> >>>>> -	if (!early_section(ms))
> >>>>> -		return pfn_section_valid(ms, pfn);
> >>>>> -}
> >>>>> -#endif
> >>>>> -	return memblock_is_memory(addr);
> >>>>> -}
> >>>>> -EXPORT_SYMBOL(pfn_valid);
> >>>>> -
> >>>>>  int pfn_is_map_memory(unsigned long pfn)
> >>>>>  {
> >>>>>  	phys_addr_t addr = PFN_PHYS(pfn);
> >>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >>>>> index 961f0eeefb62..18bf71665211 100644
> >>>>> --- a/include/linux/mmzone.h
> >>>>> +++ b/include/linux/mmzone.h
> >>>>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> >>>>>   *
> >>>>>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
> >>>>>   */
> >>>>> +bool memblock_is_memory(phys_addr_t addr);
> >>>>> +
> >>>>>  static inline int pfn_valid(unsigned long pfn)
> >>>>>  {
> >>>>> +	phys_addr_t addr = PFN_PHYS(pfn);
> >>>>>  	struct mem_section *ms;
> >>>>>  
> >>>>> +	/*
> >>>>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> >>>>> +	 * pfn. Else it might lead to false positives when
> >>>>> +	 * some of the upper bits are set, but the lower bits
> >>>>> +	 * match a valid pfn.
> >>>>> +	 */
> >>>>> +	if (PHYS_PFN(addr) != pfn)
> >>>>> +		return 0;
> >>>>> +
> >>>>>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >>>>>  		return 0;
> >>>>>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
> >>>>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
> >>>>>  	 * Traditionally early sections always returned pfn_valid() for
> >>>>>  	 * the entire section-sized span.
> >>>>>  	 */
> >>>>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
> >>>>> +	if (early_section(ms))
> >>>>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> >>>>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
> >>>>> +
> >>>>> +	return pfn_section_valid(ms, pfn);
> >>>>>  }
> >>>>>  #endif
> >>>>
> >>>> Hello David/Mike,
> >>>>
> >>>> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
> >>>> SPARSEMEM_VMEMMAP is only available memory model on arm64, wondering if we
> >>>> still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal ? Please do kindly
> >>>> suggest. Thank you.
> >>>
> >>> Even now arm64 still frees parts of the memory map and pfn_valid() should
> >>> be able to tell if a part of a section is freed or not.
> >>>
> >>> For instance for the following memory configuration
> >>>     
> >>>         |<----section---->|<----hole---->|<----section---->|
> >>>         +--------+--------+--------------+--------+--------+
> >>>         | bank 0 | unused |              | bank 1 | unused |
> >>>         +--------+--------+--------------+--------+--------+
> >>>
> >>> the memory map corresponding to the "unused" areas is freed, but the generic
> >>> pfn_valid() will still return 1 there.
> >>
> >> But is not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
> >> is enabled, which is the only option now on arm64. Then how can memmap have
> >> holes (from unused areas) anymore ? Am I missing something here.
> >  
> > Ah, you are right, I missed this detail myself :)
> > 
> > With CONFIG_SPARSEMEM_VMEMMAP as the only memory model for arm64, we can
> > simply rid of arm64::pfn_valid() without any changes to the generic
> > version.
> 
> Though just moved the pfn bits sanity check into generic pfn_valid().
> I hope this looks okay.
> 
> From 7a63f460bcb6ae171c2081bfad81edd9e8f3b7a0 Mon Sep 17 00:00:00 2001
> From: Anshuman Khandual <anshuman.khandual@arm.com>
> Date: Tue, 25 May 2021 10:27:09 +0100
> Subject: [PATCH] arm64/mm: Drop HAVE_ARCH_PFN_VALID
> 
> CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64
> platforms and free_unused_memmap() would just return without creating any
> holes in the memmap mapping. There is no need for any special handling in
> pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped. This also moves
> the pfn upper bits sanity check into generic pfn_valid().
> 
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  arch/arm64/Kconfig            |  1 -
>  arch/arm64/include/asm/page.h |  1 -
>  arch/arm64/mm/init.c          | 37 -----------------------------------
>  include/linux/mmzone.h        |  9 +++++++++
>  4 files changed, 9 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index d7dc8698cf8e..7904728befcc 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -154,7 +154,6 @@ config ARM64
>  	select HAVE_ARCH_KGDB
>  	select HAVE_ARCH_MMAP_RND_BITS
>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> -	select HAVE_ARCH_PFN_VALID
>  	select HAVE_ARCH_PREL32_RELOCATIONS
>  	select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
>  	select HAVE_ARCH_SECCOMP_FILTER
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 75ddfe671393..fcbef3eec4b2 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>  
>  typedef struct page *pgtable_t;
>  
> -int pfn_valid(unsigned long pfn);
>  int pfn_is_map_memory(unsigned long pfn);
>  
>  #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 725aa84f2faa..49019ea0c8a8 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -219,43 +219,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  	free_area_init(max_zone_pfns);
>  }
>  
> -int pfn_valid(unsigned long pfn)
> -{
> -	phys_addr_t addr = PFN_PHYS(pfn);
> -	struct mem_section *ms;
> -
> -	/*
> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> -	 * pfn. Else it might lead to false positives when
> -	 * some of the upper bits are set, but the lower bits
> -	 * match a valid pfn.
> -	 */
> -	if (PHYS_PFN(addr) != pfn)
> -		return 0;
> -
> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> -		return 0;
> -
> -	ms = __pfn_to_section(pfn);
> -	if (!valid_section(ms))
> -		return 0;
> -
> -	/*
> -	 * ZONE_DEVICE memory does not have the memblock entries.
> -	 * memblock_is_map_memory() check for ZONE_DEVICE based
> -	 * addresses will always fail. Even the normal hotplugged
> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> -	 * memblock entries. Skip memblock search for all non early
> -	 * memory sections covering all of hotplug memory including
> -	 * both normal and ZONE_DEVICE based.
> -	 */
> -	if (!early_section(ms))
> -		return pfn_section_valid(ms, pfn);
> -
> -	return memblock_is_memory(addr);
> -}
> -EXPORT_SYMBOL(pfn_valid);
> -
>  int pfn_is_map_memory(unsigned long pfn)
>  {
>  	phys_addr_t addr = PFN_PHYS(pfn);
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index a9b263d4cf9d..d0c4fc506fa3 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1443,6 +1443,15 @@ static inline int pfn_valid(unsigned long pfn)
>  {
>  	struct mem_section *ms;
>  
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
> +		return 0;
> +
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
> -- 
> 2.20.1

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
  2021-05-25 10:03             ` Mike Rapoport
@ 2021-05-25 10:04               ` David Hildenbrand
  0 siblings, 0 replies; 22+ messages in thread
From: David Hildenbrand @ 2021-05-25 10:04 UTC (permalink / raw)
  To: Mike Rapoport, Anshuman Khandual
  Cc: linux-mm, akpm, Catalin Marinas, Will Deacon, linux-arm-kernel,
	linux-kernel

On 25.05.21 12:03, Mike Rapoport wrote:
> On Tue, May 25, 2021 at 03:22:53PM +0530, Anshuman Khandual wrote:
>>
>>
>> On 5/25/21 12:02 PM, Mike Rapoport wrote:
>>> On Tue, May 25, 2021 at 11:30:15AM +0530, Anshuman Khandual wrote:
>>>>
>>>> On 5/24/21 12:22 PM, Mike Rapoport wrote:
>>>>> Hello Anshuman,
>>>>>
>>>>> On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
>>>>>>
>>>>>> On 4/22/21 1:20 PM, Anshuman Khandual wrote:
>>>>>>> Platforms like arm and arm64 have redefined pfn_valid() because their early
>>>>>>> memory sections might have contained memmap holes after freeing parts of it
>>>>>>> during boot, which should be skipped while validating a pfn for struct page
>>>>>>> backing. This scenario on certain platforms where memmap is not continuous,
>>>>>>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
>>>>>>> Then the generic pfn_valid() can be improved to accommodate such platforms.
>>>>>>> This reduces overall code footprint and also improves maintainability.
>>>>>>>
>>>>>>> free_unused_memmap() and pfn_to_online_page() have been updated to include
>>>>>>> such cases. This also exports memblock_is_memory() for all drivers that use
>>>>>>> pfn_valid() but lack required visibility. After the new config is in place,
>>>>>>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
>>>>>>>
>>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>>>> Cc: Will Deacon <will@kernel.org>
>>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>>>> Cc: Mike Rapoport <rppt@kernel.org>
>>>>>>> Cc: David Hildenbrand <david@redhat.com>
>>>>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>>>>> Cc: linux-kernel@vger.kernel.org
>>>>>>> Cc: linux-mm@kvack.org
>>>>>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>>>> ---
>>>>>>> This patch applies on the latest mainline kernel after Mike's series
>>>>>>> regarding arm64 based pfn_valid().
>>>>>>>
>>>>>>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
>>>>>>>
>>>>>>> Changes in RFC V2:
>>>>>>>
>>>>>>> - Dropped support for arm (32 bit)
>>>>>>> - Replaced memblock_is_map_memory() check with memblock_is_memory()
>>>>>>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
>>>>>>> - Updated pfn_to_online_page() per David
>>>>>>> - Updated free_unused_memmap() to preserve existing semantics per Mike
>>>>>>> - Exported memblock_is_memory() instead of memblock_is_map_memory()
>>>>>>>
>>>>>>> Changes in RFC V1:
>>>>>>>
>>>>>>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
>>>>>>>
>>>>>>>   arch/arm64/Kconfig            |  2 +-
>>>>>>>   arch/arm64/include/asm/page.h |  1 -
>>>>>>>   arch/arm64/mm/init.c          | 41 -----------------------------------
>>>>>>>   include/linux/mmzone.h        | 18 ++++++++++++++-
>>>>>>>   mm/Kconfig                    |  9 ++++++++
>>>>>>>   mm/memblock.c                 |  8 +++++--
>>>>>>>   mm/memory_hotplug.c           |  5 +++++
>>>>>>>   7 files changed, 38 insertions(+), 46 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>>>>> index b4a9b493ce72..4cdc3570ffa9 100644
>>>>>>> --- a/arch/arm64/Kconfig
>>>>>>> +++ b/arch/arm64/Kconfig
>>>>>>> @@ -144,7 +144,6 @@ config ARM64
>>>>>>>   	select HAVE_ARCH_KGDB
>>>>>>>   	select HAVE_ARCH_MMAP_RND_BITS
>>>>>>>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
>>>>>>> -	select HAVE_ARCH_PFN_VALID
>>>>>>>   	select HAVE_ARCH_PREL32_RELOCATIONS
>>>>>>>   	select HAVE_ARCH_SECCOMP_FILTER
>>>>>>>   	select HAVE_ARCH_STACKLEAK
>>>>>>> @@ -167,6 +166,7 @@ config ARM64
>>>>>>>   		if $(cc-option,-fpatchable-function-entry=2)
>>>>>>>   	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>>>>>>>   		if DYNAMIC_FTRACE_WITH_REGS
>>>>>>> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>>>>>>>   	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>>>>>>>   	select HAVE_FAST_GUP
>>>>>>>   	select HAVE_FTRACE_MCOUNT_RECORD
>>>>>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>>>>>>> index 75ddfe671393..fcbef3eec4b2 100644
>>>>>>> --- a/arch/arm64/include/asm/page.h
>>>>>>> +++ b/arch/arm64/include/asm/page.h
>>>>>>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>>>>>>>   
>>>>>>>   typedef struct page *pgtable_t;
>>>>>>>   
>>>>>>> -int pfn_valid(unsigned long pfn);
>>>>>>>   int pfn_is_map_memory(unsigned long pfn);
>>>>>>>   
>>>>>>>   #include <asm/memory.h>
>>>>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>>>>> index f431b38d0837..5731a11550d8 100644
>>>>>>> --- a/arch/arm64/mm/init.c
>>>>>>> +++ b/arch/arm64/mm/init.c
>>>>>>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>>>>>>   	free_area_init(max_zone_pfns);
>>>>>>>   }
>>>>>>>   
>>>>>>> -int pfn_valid(unsigned long pfn)
>>>>>>> -{
>>>>>>> -	phys_addr_t addr = PFN_PHYS(pfn);
>>>>>>> -
>>>>>>> -	/*
>>>>>>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>>>>>> -	 * pfn. Else it might lead to false positives when
>>>>>>> -	 * some of the upper bits are set, but the lower bits
>>>>>>> -	 * match a valid pfn.
>>>>>>> -	 */
>>>>>>> -	if (PHYS_PFN(addr) != pfn)
>>>>>>> -		return 0;
>>>>>>> -
>>>>>>> -#ifdef CONFIG_SPARSEMEM
>>>>>>> -{
>>>>>>> -	struct mem_section *ms;
>>>>>>> -
>>>>>>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>>>>> -		return 0;
>>>>>>> -
>>>>>>> -	ms = __pfn_to_section(pfn);
>>>>>>> -	if (!valid_section(ms))
>>>>>>> -		return 0;
>>>>>>> -
>>>>>>> -	/*
>>>>>>> -	 * ZONE_DEVICE memory does not have the memblock entries.
>>>>>>> -	 * memblock_is_memory() check for ZONE_DEVICE based
>>>>>>> -	 * addresses will always fail. Even the normal hotplugged
>>>>>>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
>>>>>>> -	 * memblock entries. Skip memblock search for all non early
>>>>>>> -	 * memory sections covering all of hotplug memory including
>>>>>>> -	 * both normal and ZONE_DEVICE based.
>>>>>>> -	 */
>>>>>>> -	if (!early_section(ms))
>>>>>>> -		return pfn_section_valid(ms, pfn);
>>>>>>> -}
>>>>>>> -#endif
>>>>>>> -	return memblock_is_memory(addr);
>>>>>>> -}
>>>>>>> -EXPORT_SYMBOL(pfn_valid);
>>>>>>> -
>>>>>>>   int pfn_is_map_memory(unsigned long pfn)
>>>>>>>   {
>>>>>>>   	phys_addr_t addr = PFN_PHYS(pfn);
>>>>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>>>>> index 961f0eeefb62..18bf71665211 100644
>>>>>>> --- a/include/linux/mmzone.h
>>>>>>> +++ b/include/linux/mmzone.h
>>>>>>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>>>>>>    *
>>>>>>>    * Return: 1 for PFNs that have memory map entries and 0 otherwise
>>>>>>>    */
>>>>>>> +bool memblock_is_memory(phys_addr_t addr);
>>>>>>> +
>>>>>>>   static inline int pfn_valid(unsigned long pfn)
>>>>>>>   {
>>>>>>> +	phys_addr_t addr = PFN_PHYS(pfn);
>>>>>>>   	struct mem_section *ms;
>>>>>>>   
>>>>>>> +	/*
>>>>>>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
>>>>>>> +	 * pfn. Else it might lead to false positives when
>>>>>>> +	 * some of the upper bits are set, but the lower bits
>>>>>>> +	 * match a valid pfn.
>>>>>>> +	 */
>>>>>>> +	if (PHYS_PFN(addr) != pfn)
>>>>>>> +		return 0;
>>>>>>> +
>>>>>>>   	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>>>>>   		return 0;
>>>>>>>   	ms = __nr_to_section(pfn_to_section_nr(pfn));
>>>>>>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>>>>>>>   	 * Traditionally early sections always returned pfn_valid() for
>>>>>>>   	 * the entire section-sized span.
>>>>>>>   	 */
>>>>>>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
>>>>>>> +	if (early_section(ms))
>>>>>>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>>>>>>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
>>>>>>> +
>>>>>>> +	return pfn_section_valid(ms, pfn);
>>>>>>>   }
>>>>>>>   #endif
>>>>>>
>>>>>> Hello David/Mike,
>>>>>>
>>>>>> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
>>>>>> SPARSEMEM_VMEMMAP is only available memory model on arm64, wondering if we
>>>>>> still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal ? Please do kindly
>>>>>> suggest. Thank you.
>>>>>
>>>>> Even now arm64 still frees parts of the memory map and pfn_valid() should
>>>>> be able to tell if a part of a section is freed or not.
>>>>>
>>>>> For instance for the following memory configuration
>>>>>      
>>>>>          |<----section---->|<----hole---->|<----section---->|
>>>>>          +--------+--------+--------------+--------+--------+
>>>>>          | bank 0 | unused |              | bank 1 | unused |
>>>>>          +--------+--------+--------------+--------+--------+
>>>>>
>>>>> the memory map corresponding to the "unused" areas is freed, but the generic
>>>>> pfn_valid() will still return 1 there.
>>>>
>>>> But is not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
>>>> is enabled, which is the only option now on arm64. Then how can memmap have
>>>> holes (from unused areas) anymore ? Am I missing something here.
>>>   
>>> Ah, you are right, I missed this detail myself :)
>>>
>>> With CONFIG_SPARSEMEM_VMEMMAP as the only memory model for arm64, we can
>>> simply rid of arm64::pfn_valid() without any changes to the generic
>>> version.
>>
>> Though just moved the pfn bits sanity check into generic pfn_valid().
>> I hope this looks okay.
>>
>>  From 7a63f460bcb6ae171c2081bfad81edd9e8f3b7a0 Mon Sep 17 00:00:00 2001
>> From: Anshuman Khandual <anshuman.khandual@arm.com>
>> Date: Tue, 25 May 2021 10:27:09 +0100
>> Subject: [PATCH] arm64/mm: Drop HAVE_ARCH_PFN_VALID
>>
>> CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64
>> platforms and free_unused_memmap() would just return without creating any
>> holes in the memmap mapping. There is no need for any special handling in
>> pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped. This also moves
>> the pfn upper bits sanity check into generic pfn_valid().
>>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> 
> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> 

Indeed, looks good

Acked-by: David Hildenbrand <david@redhat.com>

>> ---
>>   arch/arm64/Kconfig            |  1 -
>>   arch/arm64/include/asm/page.h |  1 -
>>   arch/arm64/mm/init.c          | 37 -----------------------------------
>>   include/linux/mmzone.h        |  9 +++++++++
>>   4 files changed, 9 insertions(+), 39 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index d7dc8698cf8e..7904728befcc 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -154,7 +154,6 @@ config ARM64
>>   	select HAVE_ARCH_KGDB
>>   	select HAVE_ARCH_MMAP_RND_BITS
>>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
>> -	select HAVE_ARCH_PFN_VALID
>>   	select HAVE_ARCH_PREL32_RELOCATIONS
>>   	select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
>>   	select HAVE_ARCH_SECCOMP_FILTER
>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
>> index 75ddfe671393..fcbef3eec4b2 100644
>> --- a/arch/arm64/include/asm/page.h
>> +++ b/arch/arm64/include/asm/page.h
>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>>   
>>   typedef struct page *pgtable_t;
>>   
>> -int pfn_valid(unsigned long pfn);
>>   int pfn_is_map_memory(unsigned long pfn);
>>   
>>   #include <asm/memory.h>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 725aa84f2faa..49019ea0c8a8 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -219,43 +219,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>>   	free_area_init(max_zone_pfns);
>>   }
>>   
>> -int pfn_valid(unsigned long pfn)
>> -{
>> -	phys_addr_t addr = PFN_PHYS(pfn);
>> -	struct mem_section *ms;
>> -
>> -	/*
>> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
>> -	 * pfn. Else it might lead to false positives when
>> -	 * some of the upper bits are set, but the lower bits
>> -	 * match a valid pfn.
>> -	 */
>> -	if (PHYS_PFN(addr) != pfn)
>> -		return 0;
>> -
>> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>> -		return 0;
>> -
>> -	ms = __pfn_to_section(pfn);
>> -	if (!valid_section(ms))
>> -		return 0;
>> -
>> -	/*
>> -	 * ZONE_DEVICE memory does not have the memblock entries.
>> -	 * memblock_is_map_memory() check for ZONE_DEVICE based
>> -	 * addresses will always fail. Even the normal hotplugged
>> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
>> -	 * memblock entries. Skip memblock search for all non early
>> -	 * memory sections covering all of hotplug memory including
>> -	 * both normal and ZONE_DEVICE based.
>> -	 */
>> -	if (!early_section(ms))
>> -		return pfn_section_valid(ms, pfn);
>> -
>> -	return memblock_is_memory(addr);
>> -}
>> -EXPORT_SYMBOL(pfn_valid);
>> -
>>   int pfn_is_map_memory(unsigned long pfn)
>>   {
>>   	phys_addr_t addr = PFN_PHYS(pfn);
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index a9b263d4cf9d..d0c4fc506fa3 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1443,6 +1443,15 @@ static inline int pfn_valid(unsigned long pfn)
>>   {
>>   	struct mem_section *ms;
>>   
>> +	/*
>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
>> +	 * pfn. Else it might lead to false positives when
>> +	 * some of the upper bits are set, but the lower bits
>> +	 * match a valid pfn.
>> +	 */
>> +	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
>> +		return 0;
>> +
>>   	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>   		return 0;
>>   	ms = __nr_to_section(pfn_to_section_nr(pfn));
>> -- 
>> 2.20.1
> 


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-05-25 10:04 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-22  6:18 [PATCH v3 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
2021-04-22  6:18 ` [PATCH v3 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
2021-04-22  6:19 ` [PATCH v3 2/4] memblock: update initialization of reserved pages Mike Rapoport
2021-04-22  6:19 ` [PATCH v3 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
2021-04-22  8:57   ` David Hildenbrand
2021-04-22  6:19 ` [PATCH v3 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
2021-04-22  7:50 ` [RFC V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes Anshuman Khandual
2021-04-22  8:27   ` Mike Rapoport
2021-04-22 11:23     ` Anshuman Khandual
2021-04-22 12:19       ` Mike Rapoport
2021-04-22  9:03   ` David Hildenbrand
2021-04-22  9:42     ` Mike Rapoport
2021-04-22  9:48       ` David Hildenbrand
2021-04-22 10:03         ` Mike Rapoport
2021-04-22  9:59       ` Anshuman Khandual
2021-05-24  4:58   ` Anshuman Khandual
2021-05-24  6:52     ` Mike Rapoport
2021-05-25  6:00       ` Anshuman Khandual
2021-05-25  6:32         ` Mike Rapoport
2021-05-25  9:52           ` Anshuman Khandual
2021-05-25 10:03             ` Mike Rapoport
2021-05-25 10:04               ` David Hildenbrand
