linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v1 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()
@ 2021-04-20  9:09 Mike Rapoport
  2021-04-20  9:09 ` [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20  9:09 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
	David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
pfn_valid_within() to 1. 

The idea is to mark NOMAP pages as reserved in the memory map and restore
the intended semantics of pfn_valid() to designate availability of struct
page for a pfn.

With this, core mm will be able to cope with the fact that it cannot use
NOMAP pages, and the holes created by NOMAP ranges within MAX_ORDER blocks
will be treated correctly even without pfn_valid_within().
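
For reference, pfn_valid_within() collapses to a constant once
CONFIG_HOLES_IN_ZONE is gone; roughly, paraphrased from the generic
definition in include/linux/mmzone.h:

#ifdef CONFIG_HOLES_IN_ZONE
#define pfn_valid_within(pfn)	pfn_valid(pfn)
#else
#define pfn_valid_within(pfn)	(1)
#endif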

The patches are only boot-tested on qemu-system-aarch64, so I'd really
appreciate memory stress tests on real hardware.

If this actually works, we'll be one step closer to dropping the custom
pfn_valid() on arm64 altogether.

Changes since RFC
Link: https://lore.kernel.org/lkml/20210407172607.8812-1-rppt@kernel.org

* Add comment about the semantics of pfn_valid() as Anshuman suggested
* Extend comments about MEMBLOCK_NOMAP, per Anshuman
* Use pfn_is_map_memory() name for the exported wrapper for
  memblock_is_map_memory(). It is still local to arch/arm64 in the end
  because of header dependency issues.

Mike Rapoport (4):
  include/linux/mmzone.h: add documentation for pfn_valid()
  memblock: update initialization of reserved pages
  arm64: decouple check whether pfn is in linear map from pfn_valid()
  arm64: drop pfn_valid_within() and simplify pfn_valid()

 arch/arm64/Kconfig              |  3 ---
 arch/arm64/include/asm/memory.h |  2 +-
 arch/arm64/include/asm/page.h   |  1 +
 arch/arm64/kvm/mmu.c            |  2 +-
 arch/arm64/mm/init.c            | 10 ++++++++--
 arch/arm64/mm/ioremap.c         |  4 ++--
 arch/arm64/mm/mmu.c             |  2 +-
 include/linux/memblock.h        |  4 +++-
 include/linux/mmzone.h          | 11 +++++++++++
 mm/memblock.c                   | 28 ++++++++++++++++++++++++++--
 10 files changed, 54 insertions(+), 13 deletions(-)

base-commit: e49d033bddf5b565044e2abe4241353959bc9120
-- 
2.28.0



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid()
  2021-04-20  9:09 [PATCH v1 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
@ 2021-04-20  9:09 ` Mike Rapoport
  2021-04-20  9:22   ` David Hildenbrand
  2021-04-20  9:09 ` [PATCH v1 2/4] memblock: update initialization of reserved pages Mike Rapoport
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20  9:09 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
	David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

Add comment describing the semantics of pfn_valid() that clarifies that
pfn_valid() only checks for availability of a memory map entry (i.e. struct
page) for a PFN rather than availability of usable memory backing that PFN.

The most "generic" version of pfn_valid() used by the configurations with
SPARSEMEM enabled resides in include/linux/mmzone.h so this is the most
suitable place for documentation about semantics of pfn_valid().

Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 include/linux/mmzone.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 47946cec7584..961f0eeefb62 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1410,6 +1410,17 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
 #endif
 
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
+/**
+ * pfn_valid - check if there is a valid memory map entry for a PFN
+ * @pfn: the page frame number to check
+ *
+ * Check if there is a valid memory map entry aka struct page for the @pfn.
+ * Note, that availability of the memory map entry does not imply that
+ * there is actual usable memory at that @pfn. The struct page may
+ * represent a hole or an unusable page frame.
+ *
+ * Return: 1 for PFNs that have memory map entries and 0 otherwise
+ */
 static inline int pfn_valid(unsigned long pfn)
 {
 	struct mem_section *ms;
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 2/4] memblock: update initialization of reserved pages
  2021-04-20  9:09 [PATCH v1 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
  2021-04-20  9:09 ` [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
@ 2021-04-20  9:09 ` Mike Rapoport
  2021-04-20 13:56   ` David Hildenbrand
  2021-04-20  9:09 ` [PATCH v1 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
  2021-04-20  9:09 ` [PATCH v1 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
  3 siblings, 1 reply; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20  9:09 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
	David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

The struct pages representing a reserved memory region are initialized
using the reserve_bootmem_region() function. This function is called for
each reserved region just before the memory is freed from memblock to the
buddy page allocator.
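
For context, reserve_bootmem_region() essentially walks the range and
marks every valid struct page in it as reserved; a simplified paraphrase,
not the exact upstream code:

void __init reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
{
	unsigned long pfn;

	for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
		if (!pfn_valid(pfn))
			continue;

		/* initialize the struct page and mark it reserved */
		init_reserved_page(pfn);
		INIT_LIST_HEAD(&pfn_to_page(pfn)->lru);
		__SetPageReserved(pfn_to_page(pfn));
	}
}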

The struct pages for MEMBLOCK_NOMAP regions are kept with the default
values set by the memory map initialization, which makes it necessary to
have special treatment for such pages in pfn_valid() and
pfn_valid_within().

Split out initialization of the reserved pages into a function with a
meaningful name, treat the MEMBLOCK_NOMAP regions the same way as the
reserved regions, and mark the struct pages for the NOMAP regions as
PageReserved.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 include/linux/memblock.h |  4 +++-
 mm/memblock.c            | 28 ++++++++++++++++++++++++++--
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 5984fff3f175..634c1a578db8 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -30,7 +30,9 @@ extern unsigned long long max_possible_pfn;
  * @MEMBLOCK_NONE: no special request
  * @MEMBLOCK_HOTPLUG: hotpluggable region
  * @MEMBLOCK_MIRROR: mirrored region
- * @MEMBLOCK_NOMAP: don't add to kernel direct mapping
+ * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
+ * reserved in the memory map; refer to memblock_mark_nomap() description
+ * for further details
  */
 enum memblock_flags {
 	MEMBLOCK_NONE		= 0x0,	/* No special request */
diff --git a/mm/memblock.c b/mm/memblock.c
index afaefa8fc6ab..3abf2c3fea7f 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -906,6 +906,11 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
  * @base: the base phys addr of the region
  * @size: the size of the region
  *
+ * The memory regions marked with %MEMBLOCK_NOMAP will not be added to the
+ * direct mapping of the physical memory. These regions will still be
+ * covered by the memory map. The struct page representing NOMAP memory
+ * frames in the memory map will be PageReserved()
+ *
  * Return: 0 on success, -errno on failure.
  */
 int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
@@ -2002,6 +2007,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
 	return end_pfn - start_pfn;
 }
 
+static void __init memmap_init_reserved_pages(void)
+{
+	struct memblock_region *region;
+	phys_addr_t start, end;
+	u64 i;
+
+	/* initialize struct pages for the reserved regions */
+	for_each_reserved_mem_range(i, &start, &end)
+		reserve_bootmem_region(start, end);
+
+	/* and also treat struct pages for the NOMAP regions as PageReserved */
+	for_each_mem_region(region) {
+		if (memblock_is_nomap(region)) {
+			start = region->base;
+			end = start + region->size;
+			reserve_bootmem_region(start, end);
+		}
+	}
+}
+
 static unsigned long __init free_low_memory_core_early(void)
 {
 	unsigned long count = 0;
@@ -2010,8 +2035,7 @@ static unsigned long __init free_low_memory_core_early(void)
 
 	memblock_clear_hotplug(0, -1);
 
-	for_each_reserved_mem_range(i, &start, &end)
-		reserve_bootmem_region(start, end);
+	memmap_init_reserved_pages();
 
 	/*
 	 * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid()
  2021-04-20  9:09 [PATCH v1 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
  2021-04-20  9:09 ` [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
  2021-04-20  9:09 ` [PATCH v1 2/4] memblock: update initialization of reserved pages Mike Rapoport
@ 2021-04-20  9:09 ` Mike Rapoport
  2021-04-20 15:57   ` David Hildenbrand
  2021-04-20  9:09 ` [PATCH v1 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
  3 siblings, 1 reply; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20  9:09 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
	David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

The intended semantics of pfn_valid() is to verify whether there is a
struct page for the pfn in question and nothing else.

Yet, on arm64 it is used to distinguish memory areas that are mapped in the
linear map vs those that require ioremap() to access them.

Introduce a dedicated pfn_is_map_memory() wrapper for
memblock_is_map_memory() to perform this check and use it where
appropriate.

Using a wrapper allows us to avoid cyclic include dependencies.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/memory.h | 2 +-
 arch/arm64/include/asm/page.h   | 1 +
 arch/arm64/kvm/mmu.c            | 2 +-
 arch/arm64/mm/init.c            | 6 ++++++
 arch/arm64/mm/ioremap.c         | 4 ++--
 arch/arm64/mm/mmu.c             | 2 +-
 6 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0aabc3be9a75..194f9f993d30 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
 
 #define virt_addr_valid(addr)	({					\
 	__typeof__(addr) __addr = __tag_reset(addr);			\
-	__is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));	\
+	__is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr));	\
 })
 
 void dump_mem_limit(void);
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..99a6da91f870 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from);
 typedef struct page *pgtable_t;
 
 extern int pfn_valid(unsigned long);
+extern int pfn_is_map_memory(unsigned long);
 
 #include <asm/memory.h>
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8711894db8c2..23dd99e29b23 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 
 static bool kvm_is_device_pfn(unsigned long pfn)
 {
-	return !pfn_valid(pfn);
+	return !pfn_is_map_memory(pfn);
 }
 
 /*
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 3685e12aba9b..c54e329aca15 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn)
 }
 EXPORT_SYMBOL(pfn_valid);
 
+int pfn_is_map_memory(unsigned long pfn)
+{
+	return memblock_is_map_memory(PFN_PHYS(pfn));
+}
+EXPORT_SYMBOL(pfn_is_map_memory);
+
 static phys_addr_t memory_limit = PHYS_ADDR_MAX;
 
 /*
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index b5e83c46b23e..b7c81dacabf0 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
 	/*
 	 * Don't allow RAM to be mapped.
 	 */
-	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
+	if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
 		return NULL;
 
 	area = get_vm_area_caller(size, VM_IOREMAP, caller);
@@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
 void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
 {
 	/* For normal memory we already have a cacheable mapping. */
-	if (pfn_valid(__phys_to_pfn(phys_addr)))
+	if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
 		return (void __iomem *)__phys_to_virt(phys_addr);
 
 	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 5d9550fdb9cf..26045e9adbd7 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 			      unsigned long size, pgprot_t vma_prot)
 {
-	if (!pfn_valid(pfn))
+	if (!pfn_is_map_memory(pfn))
 		return pgprot_noncached(vma_prot);
 	else if (file->f_flags & O_SYNC)
 		return pgprot_writecombine(vma_prot);
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()
  2021-04-20  9:09 [PATCH v1 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
                   ` (2 preceding siblings ...)
  2021-04-20  9:09 ` [PATCH v1 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
@ 2021-04-20  9:09 ` Mike Rapoport
  2021-04-20 16:00   ` David Hildenbrand
  3 siblings, 1 reply; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20  9:09 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
	David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

arm64's version of pfn_valid() differs from the generic one for two
reasons:

* Parts of the memory map are freed during boot. This makes it necessary to
  verify that there is actual physical memory that corresponds to a pfn
  which is done by querying memblock.

* There are NOMAP memory regions. These regions are not mapped in the
  linear map and until the previous commit the struct pages representing
  these areas had default values.

As a consequence of the absence of special treatment for NOMAP regions in
the memory map, it was necessary to use memblock_is_map_memory() in
pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
generic mm functionality would not treat a NOMAP page as a normal page.

Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
the rest of core mm will treat them as unusable memory and thus
pfn_valid_within() is no longer required at all and can be disabled by
removing CONFIG_HOLES_IN_ZONE on arm64.

pfn_valid() can be slightly simplified by replacing
memblock_is_map_memory() with memblock_is_memory().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/Kconfig   | 3 ---
 arch/arm64/mm/init.c | 4 ++--
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e4e1b6550115..58e439046d05 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool y
 	depends on NUMA
 
-config HOLES_IN_ZONE
-	def_bool y
-
 source "kernel/Kconfig.hz"
 
 config ARCH_SPARSEMEM_ENABLE
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index c54e329aca15..370f33765b64 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn)
 
 	/*
 	 * ZONE_DEVICE memory does not have the memblock entries.
-	 * memblock_is_map_memory() check for ZONE_DEVICE based
+	 * memblock_is_memory() check for ZONE_DEVICE based
 	 * addresses will always fail. Even the normal hotplugged
 	 * memory will never have MEMBLOCK_NOMAP flag set in their
 	 * memblock entries. Skip memblock search for all non early
@@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn)
 		return pfn_section_valid(ms, pfn);
 }
 #endif
-	return memblock_is_map_memory(addr);
+	return memblock_is_memory(addr);
 }
 EXPORT_SYMBOL(pfn_valid);
 
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid()
  2021-04-20  9:09 ` [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
@ 2021-04-20  9:22   ` David Hildenbrand
  2021-04-20 12:57     ` Mike Rapoport
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2021-04-20  9:22 UTC (permalink / raw)
  To: Mike Rapoport, linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier,
	Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

On 20.04.21 11:09, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> Add comment describing the semantics of pfn_valid() that clarifies that
> pfn_valid() only checks for availability of a memory map entry (i.e. struct
> page) for a PFN rather than availability of usable memory backing that PFN.
> 
> The most "generic" version of pfn_valid() used by the configurations with
> SPARSEMEM enabled resides in include/linux/mmzone.h so this is the most
> suitable place for documentation about semantics of pfn_valid().
> 
> Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>   include/linux/mmzone.h | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 47946cec7584..961f0eeefb62 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1410,6 +1410,17 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>   #endif
>   
>   #ifndef CONFIG_HAVE_ARCH_PFN_VALID
> +/**
> + * pfn_valid - check if there is a valid memory map entry for a PFN
> + * @pfn: the page frame number to check
> + *
> + * Check if there is a valid memory map entry aka struct page for the @pfn.
> + * Note, that availability of the memory map entry does not imply that
> + * there is actual usable memory at that @pfn. The struct page may
> + * represent a hole or an unusable page frame.
> + *
> + * Return: 1 for PFNs that have memory map entries and 0 otherwise
> + */
>   static inline int pfn_valid(unsigned long pfn)
>   {
>   	struct mem_section *ms;
> 

I'd rephrase all "there is a valid memory map" to "there is a memory
map" and add "pfn_valid() does not indicate whether the memory map was
actually initialized -- see pfn_to_online_page()."

pfn_valid() means that we can do a pfn_to_page() and don't get a fault 
when accessing the "struct page". It doesn't state anything about the 
content.
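
As a rough sketch of that lookup pattern (get_page_if_online() is a
made-up helper name for illustration, not an API from this series):

static struct page *get_page_if_online(unsigned long pfn)
{
	struct page *page;

	if (!pfn_valid(pfn))		/* no memmap entry at all */
		return NULL;

	/* NULL for holes, offline sections, or uninitialized memmap */
	page = pfn_to_online_page(pfn);
	if (!page)
		return NULL;

	/* from here on the struct page contents can be trusted */
	return page;
}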

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid()
  2021-04-20  9:22   ` David Hildenbrand
@ 2021-04-20 12:57     ` Mike Rapoport
  2021-04-20 12:58       ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20 12:57 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On Tue, Apr 20, 2021 at 11:22:53AM +0200, David Hildenbrand wrote:
> On 20.04.21 11:09, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > Add comment describing the semantics of pfn_valid() that clarifies that
> > pfn_valid() only checks for availability of a memory map entry (i.e. struct
> > page) for a PFN rather than availability of usable memory backing that PFN.
> > 
> > The most "generic" version of pfn_valid() used by the configurations with
> > SPARSEMEM enabled resides in include/linux/mmzone.h so this is the most
> > suitable place for documentation about semantics of pfn_valid().
> > 
> > Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >   include/linux/mmzone.h | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> > 
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index 47946cec7584..961f0eeefb62 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -1410,6 +1410,17 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> >   #endif
> >   #ifndef CONFIG_HAVE_ARCH_PFN_VALID
> > +/**
> > + * pfn_valid - check if there is a valid memory map entry for a PFN
> > + * @pfn: the page frame number to check
> > + *
> > + * Check if there is a valid memory map entry aka struct page for the @pfn.
> > + * Note, that availability of the memory map entry does not imply that
> > + * there is actual usable memory at that @pfn. The struct page may
> > + * represent a hole or an unusable page frame.
> > + *
> > + * Return: 1 for PFNs that have memory map entries and 0 otherwise
> > + */
> >   static inline int pfn_valid(unsigned long pfn)
> >   {
> >   	struct mem_section *ms;
> > 
> 
> I'd rephrase all "there is a valid memory map" to "there is a memory map"
> and add "pfn_valid() does not indicate whether the memory map was actually
> initialized -- see pfn_to_online_page()."
> 
> pfn_valid() means that we can do a pfn_to_page() and don't get a fault when
> accessing the "struct page". It doesn't state anything about the content.

Well, I mean valid in the sense you can access the struct page :)
How about:

/**
 * pfn_valid - check if there is a memory map entry for a PFN
 * @pfn: the page frame number to check
 *
 * Check if there is a memory map entry aka struct page for the @pfn and it
 * is safe to access that struct page; the struct page state may be
 * uninitialized -- see pfn_to_online_page().
 *
 * Note, that availability of the memory map entry does not imply that
 * there is actual usable memory at that @pfn. The struct page may
 * represent a hole or an unusable page frame.
 *
 * Return: 1 for PFNs that have memory map entries and 0 otherwise.
 */

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid()
  2021-04-20 12:57     ` Mike Rapoport
@ 2021-04-20 12:58       ` David Hildenbrand
  0 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2021-04-20 12:58 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On 20.04.21 14:57, Mike Rapoport wrote:
> On Tue, Apr 20, 2021 at 11:22:53AM +0200, David Hildenbrand wrote:
>> On 20.04.21 11:09, Mike Rapoport wrote:
>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>> Add comment describing the semantics of pfn_valid() that clarifies that
>>> pfn_valid() only checks for availability of a memory map entry (i.e. struct
>>> page) for a PFN rather than availability of usable memory backing that PFN.
>>>
>>> The most "generic" version of pfn_valid() used by the configurations with
>>> SPARSEMEM enabled resides in include/linux/mmzone.h so this is the most
>>> suitable place for documentation about semantics of pfn_valid().
>>>
>>> Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>> ---
>>>    include/linux/mmzone.h | 11 +++++++++++
>>>    1 file changed, 11 insertions(+)
>>>
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 47946cec7584..961f0eeefb62 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -1410,6 +1410,17 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>>    #endif
>>>    #ifndef CONFIG_HAVE_ARCH_PFN_VALID
>>> +/**
>>> + * pfn_valid - check if there is a valid memory map entry for a PFN
>>> + * @pfn: the page frame number to check
>>> + *
>>> + * Check if there is a valid memory map entry aka struct page for the @pfn.
>>> + * Note, that availability of the memory map entry does not imply that
>>> + * there is actual usable memory at that @pfn. The struct page may
>>> + * represent a hole or an unusable page frame.
>>> + *
>>> + * Return: 1 for PFNs that have memory map entries and 0 otherwise
>>> + */
>>>    static inline int pfn_valid(unsigned long pfn)
>>>    {
>>>    	struct mem_section *ms;
>>>
>>
>> I'd rephrase all "there is a valid memory map" to "there is a memory map"
>> and add "pfn_valid() does not indicate whether the memory map was actually
>> initialized -- see pfn_to_online_page()."
>>
>> pfn_valid() means that we can do a pfn_to_page() and don't get a fault when
>> accessing the "struct page". It doesn't state anything about the content.
> 
> Well, I mean valid in the sense you can access the struct page :)
> How about:
> 
> /**
>   * pfn_valid - check if there is a memory map entry for a PFN
>   * @pfn: the page frame number to check
>   *
>   * Check if there is a memory map entry aka struct page for the @pfn and it
>   * is safe to access that struct page; the struct page state may be
>   * uninitialized -- see pfn_to_online_page().
>   *
>   * Note, that availability of the memory map entry does not imply that
>   * there is actual usable memory at that @pfn. The struct page may
>   * represent a hole or an unusable page frame.
>   *
>   * Return: 1 for PFNs that have memory map entries and 0 otherwise.
>   */
> 

Sounds good to me -- thanks!

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] memblock: update initialization of reserved pages
  2021-04-20  9:09 ` [PATCH v1 2/4] memblock: update initialization of reserved pages Mike Rapoport
@ 2021-04-20 13:56   ` David Hildenbrand
  2021-04-20 15:03     ` Mike Rapoport
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2021-04-20 13:56 UTC (permalink / raw)
  To: Mike Rapoport, linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier,
	Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

On 20.04.21 11:09, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The struct pages representing a reserved memory region are initialized
> using reserve_bootmem_range() function. This function is called for each
> reserved region just before the memory is freed from memblock to the buddy
> page allocator.
> 
> The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> values set by the memory map initialization which makes it necessary to
> have a special treatment for such pages in pfn_valid() and
> pfn_valid_within().

Just a general question while thinking about it:

Would we right now initialize the memmap of these pages already via 
memmap_init_zone()->memmap_init_range()? (IOW, not marking the 
PageReserved?)

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] memblock: update initialization of reserved pages
  2021-04-20 13:56   ` David Hildenbrand
@ 2021-04-20 15:03     ` Mike Rapoport
  2021-04-20 15:18       ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20 15:03 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On Tue, Apr 20, 2021 at 03:56:28PM +0200, David Hildenbrand wrote:
> On 20.04.21 11:09, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The struct pages representing a reserved memory region are initialized
> > using reserve_bootmem_range() function. This function is called for each
> > reserved region just before the memory is freed from memblock to the buddy
> > page allocator.
> > 
> > The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> > values set by the memory map initialization which makes it necessary to
> > have a special treatment for such pages in pfn_valid() and
> > pfn_valid_within().
> 
> Just a general question while thinking about it:
> 
> Would we right now initialize the memmap of these pages already via
> memmap_init_zone()->memmap_init_range()? (IOW, not marking the
> PageReserved?)

Yep. These pages are part of memblock.memory so they are initialized in
memmap_init_zone()->memmap_init_range() to the default values.

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] memblock: update initialization of reserved pages
  2021-04-20 15:03     ` Mike Rapoport
@ 2021-04-20 15:18       ` David Hildenbrand
  2021-04-20 15:25         ` Mike Rapoport
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2021-04-20 15:18 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On 20.04.21 17:03, Mike Rapoport wrote:
> On Tue, Apr 20, 2021 at 03:56:28PM +0200, David Hildenbrand wrote:
>> On 20.04.21 11:09, Mike Rapoport wrote:
>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>> The struct pages representing a reserved memory region are initialized
>>> using reserve_bootmem_range() function. This function is called for each
>>> reserved region just before the memory is freed from memblock to the buddy
>>> page allocator.
>>>
>>> The struct pages for MEMBLOCK_NOMAP regions are kept with the default
>>> values set by the memory map initialization which makes it necessary to
>>> have a special treatment for such pages in pfn_valid() and
>>> pfn_valid_within().
>>
>> Just a general question while thinking about it:
>>
>> Would we right now initialize the memmap of these pages already via
>> memmap_init_zone()->memmap_init_range()? (IOW, not marking the
>> PageReserved?)
> 
> Yep. These pages are part of memblock.memory so they are initialized in
> memmap_init_zone()->memmap_init_range() to the default values.
> 

So instead of fully initializing them again, we mostly would only have 
to set PageReserved(). Not sure how big that memory usually is -- IOW, 
if we really care about optimizing the double-init.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/4] memblock: update initialization of reserved pages
  2021-04-20 15:18       ` David Hildenbrand
@ 2021-04-20 15:25         ` Mike Rapoport
  0 siblings, 0 replies; 16+ messages in thread
From: Mike Rapoport @ 2021-04-20 15:25 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On Tue, Apr 20, 2021 at 05:18:55PM +0200, David Hildenbrand wrote:
> On 20.04.21 17:03, Mike Rapoport wrote:
> > On Tue, Apr 20, 2021 at 03:56:28PM +0200, David Hildenbrand wrote:
> > > On 20.04.21 11:09, Mike Rapoport wrote:
> > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > 
> > > > The struct pages representing a reserved memory region are initialized
> > > > using reserve_bootmem_range() function. This function is called for each
> > > > reserved region just before the memory is freed from memblock to the buddy
> > > > page allocator.
> > > > 
> > > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> > > > values set by the memory map initialization which makes it necessary to
> > > > have a special treatment for such pages in pfn_valid() and
> > > > pfn_valid_within().
> > > 
> > > Just a general question while thinking about it:
> > > 
> > > Would we right now initialize the memmap of these pages already via
> > > memmap_init_zone()->memmap_init_range()? (IOW, not marking the
> > > PageReserved?)
> > 
> > Yep. These pages are part of memblock.memory so they are initialized in
> > memmap_init_zone()->memmap_init_range() to the default values.
> > 
> 
> So instead of fully initializing them again, we mostly would only have to
> set PageReserved(). Not sure how big that memory usually is -- IOW, if we
> really care about optimizing the double-init.

IIUC, these are small areas reserved by the firmware, e.g. ACPI tables.

@Ard, am I right?

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid()
  2021-04-20  9:09 ` [PATCH v1 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
@ 2021-04-20 15:57   ` David Hildenbrand
  2021-04-21  5:32     ` Mike Rapoport
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2021-04-20 15:57 UTC (permalink / raw)
  To: Mike Rapoport, linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier,
	Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

On 20.04.21 11:09, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The intended semantics of pfn_valid() is to verify whether there is a
> struct page for the pfn in question and nothing else.
> 
> Yet, on arm64 it is used to distinguish memory areas that are mapped in the
> linear map vs those that require ioremap() to access them.
> 
> Introduce a dedicated pfn_is_map_memory() wrapper for
> memblock_is_map_memory() to perform such check and use it where
> appropriate.
> 
> Using a wrapper allows to avoid cyclic include dependencies.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>   arch/arm64/include/asm/memory.h | 2 +-
>   arch/arm64/include/asm/page.h   | 1 +
>   arch/arm64/kvm/mmu.c            | 2 +-
>   arch/arm64/mm/init.c            | 6 ++++++
>   arch/arm64/mm/ioremap.c         | 4 ++--
>   arch/arm64/mm/mmu.c             | 2 +-
>   6 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0aabc3be9a75..194f9f993d30 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
>   
>   #define virt_addr_valid(addr)	({					\
>   	__typeof__(addr) __addr = __tag_reset(addr);			\
> -	__is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));	\
> +	__is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr));	\
>   })
>   
>   void dump_mem_limit(void);
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 012cffc574e8..99a6da91f870 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from);
>   typedef struct page *pgtable_t;
>   
>   extern int pfn_valid(unsigned long);
> +extern int pfn_is_map_memory(unsigned long);
>   
>   #include <asm/memory.h>
>   
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 8711894db8c2..23dd99e29b23 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
>   
>   static bool kvm_is_device_pfn(unsigned long pfn)
>   {
> -	return !pfn_valid(pfn);
> +	return !pfn_is_map_memory(pfn);
>   }
>   
>   /*
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 3685e12aba9b..c54e329aca15 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn)
>   }
>   EXPORT_SYMBOL(pfn_valid);
>   
> +int pfn_is_map_memory(unsigned long pfn)
> +{

I think you might have to add (see pfn_valid())

if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
	return 0;

to catch false positives.

> +	return memblock_is_map_memory(PFN_PHYS(pfn));
> +}
> +EXPORT_SYMBOL(pfn_is_map_memory);
> +
>   static phys_addr_t memory_limit = PHYS_ADDR_MAX;
>   
>   /*
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index b5e83c46b23e..b7c81dacabf0 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
>   	/*
>   	 * Don't allow RAM to be mapped.
>   	 */
> -	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> +	if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
>   		return NULL;
>   
>   	area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
>   void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>   {
>   	/* For normal memory we already have a cacheable mapping. */
> -	if (pfn_valid(__phys_to_pfn(phys_addr)))
> +	if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
>   		return (void __iomem *)__phys_to_virt(phys_addr);
>   
>   	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 5d9550fdb9cf..26045e9adbd7 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
>   pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>   			      unsigned long size, pgprot_t vma_prot)
>   {
> -	if (!pfn_valid(pfn))
> +	if (!pfn_is_map_memory(pfn))
>   		return pgprot_noncached(vma_prot);
>   	else if (file->f_flags & O_SYNC)
>   		return pgprot_writecombine(vma_prot);
> 

As discussed, in the future it would be nice if we could just rely on 
the memmap state. There are cases where pfn_is_map_memory() will now be 
slower than pfn_valid() -- e.g., we don't check for valid_section() in 
case of CONFIG_SPARSEMEM. This would apply where pfn_valid() would have 
returned "0".

As we're not creating the direct map, kern_addr_valid() shouldn't need 
love. It'd be somewhat ugly if some generic code used by arm64 relied, in 
the arm64 case, on pfn_valid() to return the expected result; I doubt it 
does.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()
  2021-04-20  9:09 ` [PATCH v1 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
@ 2021-04-20 16:00   ` David Hildenbrand
  2021-04-21  5:52     ` Mike Rapoport
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2021-04-20 16:00 UTC (permalink / raw)
  To: Mike Rapoport, linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier,
	Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

On 20.04.21 11:09, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> The arm64's version of pfn_valid() differs from the generic because of two
> reasons:
> 
> * Parts of the memory map are freed during boot. This makes it necessary to
>    verify that there is actual physical memory that corresponds to a pfn
>    which is done by querying memblock.
> 
> * There are NOMAP memory regions. These regions are not mapped in the
>    linear map and until the previous commit the struct pages representing
>    these areas had default values.
> 
> As the consequence of absence of the special treatment of NOMAP regions in
> the memory map it was necessary to use memblock_is_map_memory() in
> pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
> generic mm functionality would not treat a NOMAP page as a normal page.
> 
> Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
> the rest of core mm will treat them as unusable memory and thus
> pfn_valid_within() is no longer required at all and can be disabled by
> removing CONFIG_HOLES_IN_ZONE on arm64.
> 
> pfn_valid() can be slightly simplified by replacing
> memblock_is_map_memory() with memblock_is_memory().
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>   arch/arm64/Kconfig   | 3 ---
>   arch/arm64/mm/init.c | 4 ++--
>   2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index e4e1b6550115..58e439046d05 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
>   	def_bool y
>   	depends on NUMA
>   
> -config HOLES_IN_ZONE
> -	def_bool y
> -
>   source "kernel/Kconfig.hz"
>   
>   config ARCH_SPARSEMEM_ENABLE
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index c54e329aca15..370f33765b64 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn)
>   
>   	/*
>   	 * ZONE_DEVICE memory does not have the memblock entries.
> -	 * memblock_is_map_memory() check for ZONE_DEVICE based
> +	 * memblock_is_memory() check for ZONE_DEVICE based
>   	 * addresses will always fail. Even the normal hotplugged
>   	 * memory will never have MEMBLOCK_NOMAP flag set in their
>   	 * memblock entries. Skip memblock search for all non early
> @@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn)
>   		return pfn_section_valid(ms, pfn);
>   }
>   #endif
> -	return memblock_is_map_memory(addr);
> +	return memblock_is_memory(addr);
>   }
>   EXPORT_SYMBOL(pfn_valid);
>   
> 

What are the steps needed to get rid of custom pfn_valid() completely?

I'd assume we would have to stop freeing parts of the mem map during 
boot. How relevant is that for arm64 nowadays, especially with reduced 
section sizes?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid()
  2021-04-20 15:57   ` David Hildenbrand
@ 2021-04-21  5:32     ` Mike Rapoport
  0 siblings, 0 replies; 16+ messages in thread
From: Mike Rapoport @ 2021-04-21  5:32 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On Tue, Apr 20, 2021 at 05:57:57PM +0200, David Hildenbrand wrote:
> On 20.04.21 11:09, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The intended semantics of pfn_valid() is to verify whether there is a
> > struct page for the pfn in question and nothing else.
> > 
> > Yet, on arm64 it is used to distinguish memory areas that are mapped in the
> > linear map vs those that require ioremap() to access them.
> > 
> > Introduce a dedicated pfn_is_map_memory() wrapper for
> > memblock_is_map_memory() to perform such check and use it where
> > appropriate.
> > 
> > Using a wrapper allows to avoid cyclic include dependencies.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >   arch/arm64/include/asm/memory.h | 2 +-
> >   arch/arm64/include/asm/page.h   | 1 +
> >   arch/arm64/kvm/mmu.c            | 2 +-
> >   arch/arm64/mm/init.c            | 6 ++++++
> >   arch/arm64/mm/ioremap.c         | 4 ++--
> >   arch/arm64/mm/mmu.c             | 2 +-
> >   6 files changed, 12 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > index 0aabc3be9a75..194f9f993d30 100644
> > --- a/arch/arm64/include/asm/memory.h
> > +++ b/arch/arm64/include/asm/memory.h
> > @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
> >   #define virt_addr_valid(addr)	({					\
> >   	__typeof__(addr) __addr = __tag_reset(addr);			\
> > -	__is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));	\
> > +	__is_lm_address(__addr) && pfn_is_map_memory(virt_to_pfn(__addr));	\
> >   })
> >   void dump_mem_limit(void);
> > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> > index 012cffc574e8..99a6da91f870 100644
> > --- a/arch/arm64/include/asm/page.h
> > +++ b/arch/arm64/include/asm/page.h
> > @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from);
> >   typedef struct page *pgtable_t;
> >   extern int pfn_valid(unsigned long);
> > +extern int pfn_is_map_memory(unsigned long);
> >   #include <asm/memory.h>
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 8711894db8c2..23dd99e29b23 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
> >   static bool kvm_is_device_pfn(unsigned long pfn)
> >   {
> > -	return !pfn_valid(pfn);
> > +	return !pfn_is_map_memory(pfn);
> >   }
> >   /*
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index 3685e12aba9b..c54e329aca15 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn)
> >   }
> >   EXPORT_SYMBOL(pfn_valid);
> > +int pfn_is_map_memory(unsigned long pfn)
> > +{
> 
> I think you might have to add (see pfn_valid())
> 
> if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
> 	return 0;
> 
> to catch false positives.
 
Yeah, makes sense. 
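
Something along these lines, as a sketch of what v2 could look like (not
final code):

int pfn_is_map_memory(unsigned long pfn)
{
	phys_addr_t addr = PFN_PHYS(pfn);

	/* avoid false positives for bogus PFNs, see pfn_valid() */
	if (PHYS_PFN(addr) != pfn)
		return 0;

	return memblock_is_map_memory(addr);
}
EXPORT_SYMBOL(pfn_is_map_memory);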

> > +	return memblock_is_map_memory(PFN_PHYS(pfn));
> > +}
> > +EXPORT_SYMBOL(pfn_is_map_memory);
> > +
> >   static phys_addr_t memory_limit = PHYS_ADDR_MAX;
> >   /*
> > diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> > index b5e83c46b23e..b7c81dacabf0 100644
> > --- a/arch/arm64/mm/ioremap.c
> > +++ b/arch/arm64/mm/ioremap.c
> > @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
> >   	/*
> >   	 * Don't allow RAM to be mapped.
> >   	 */
> > -	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> > +	if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
> >   		return NULL;
> >   	area = get_vm_area_caller(size, VM_IOREMAP, caller);
> > @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
> >   void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
> >   {
> >   	/* For normal memory we already have a cacheable mapping. */
> > -	if (pfn_valid(__phys_to_pfn(phys_addr)))
> > +	if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
> >   		return (void __iomem *)__phys_to_virt(phys_addr);
> >   	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index 5d9550fdb9cf..26045e9adbd7 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
> >   pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> >   			      unsigned long size, pgprot_t vma_prot)
> >   {
> > -	if (!pfn_valid(pfn))
> > +	if (!pfn_is_map_memory(pfn))
> >   		return pgprot_noncached(vma_prot);
> >   	else if (file->f_flags & O_SYNC)
> >   		return pgprot_writecombine(vma_prot);
> > 
> 
> As discussed, in the future it would be nice if we could just rely on the
> memmap state. There are cases where pfn_is_map_memory() will now be slower
> than pfn_valid() -- e.g., we don't check for valid_section() in case of
> CONFIG_SPARSEMEM. This would apply where pfn_valid() would have returned
> "0".
>
> As we're not creating the direct map, kern_addr_valid() shouldn't need love.
> It'd be some kind of ugly if some generic code used by arm64 would be
> relying in case of arm64 on pfn_valid() to return the expected result; I
> doubt it.

No doubt there is a room for further improvement in this area.
 
> Acked-by: David Hildenbrand <david@redhat.com>

Thanks!

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid()
  2021-04-20 16:00   ` David Hildenbrand
@ 2021-04-21  5:52     ` Mike Rapoport
  0 siblings, 0 replies; 16+ messages in thread
From: Mike Rapoport @ 2021-04-21  5:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel,
	Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Will Deacon, kvmarm, linux-kernel, linux-mm

On Tue, Apr 20, 2021 at 06:00:55PM +0200, David Hildenbrand wrote:
> On 20.04.21 11:09, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > The arm64's version of pfn_valid() differs from the generic because of two
> > reasons:
> > 
> > * Parts of the memory map are freed during boot. This makes it necessary to
> >    verify that there is actual physical memory that corresponds to a pfn
> >    which is done by querying memblock.
> > 
> > * There are NOMAP memory regions. These regions are not mapped in the
> >    linear map and until the previous commit the struct pages representing
> >    these areas had default values.
> > 
> > As the consequence of absence of the special treatment of NOMAP regions in
> > the memory map it was necessary to use memblock_is_map_memory() in
> > pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that
> > generic mm functionality would not treat a NOMAP page as a normal page.
> > 
> > Since the NOMAP regions are now marked as PageReserved(), pfn walkers and
> > the rest of core mm will treat them as unusable memory and thus
> > pfn_valid_within() is no longer required at all and can be disabled by
> > removing CONFIG_HOLES_IN_ZONE on arm64.
> > 
> > pfn_valid() can be slightly simplified by replacing
> > memblock_is_map_memory() with memblock_is_memory().
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >   arch/arm64/Kconfig   | 3 ---
> >   arch/arm64/mm/init.c | 4 ++--
> >   2 files changed, 2 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index e4e1b6550115..58e439046d05 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
> >   	def_bool y
> >   	depends on NUMA
> > -config HOLES_IN_ZONE
> > -	def_bool y
> > -
> >   source "kernel/Kconfig.hz"
> >   config ARCH_SPARSEMEM_ENABLE
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index c54e329aca15..370f33765b64 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn)
> >   	/*
> >   	 * ZONE_DEVICE memory does not have the memblock entries.
> > -	 * memblock_is_map_memory() check for ZONE_DEVICE based
> > +	 * memblock_is_memory() check for ZONE_DEVICE based
> >   	 * addresses will always fail. Even the normal hotplugged
> >   	 * memory will never have MEMBLOCK_NOMAP flag set in their
> >   	 * memblock entries. Skip memblock search for all non early
> > @@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn)
> >   		return pfn_section_valid(ms, pfn);
> >   }
> >   #endif
> > -	return memblock_is_map_memory(addr);
> > +	return memblock_is_memory(addr);
> >   }
> >   EXPORT_SYMBOL(pfn_valid);
> > 
> 
> What are the steps needed to get rid of custom pfn_valid() completely?
> 
> I'd assume we would have to stop freeing parts of the mem map during boot.
> How relevant is that for arm64 nowadays, especially with reduced section
> sizes?

Yes, for arm64 to use the generic pfn_valid() it'd need to stop freeing
parts of the memory map.

Presuming struct page is 64 bytes, the memory map takes 2M per section in
the worst case (128M per section, 4k pages). 
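
(Spelled out: 128M / 4K = 32768 pages per section, and 32768 * 64 bytes of
struct page is 2M of memory map per section.)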

So for systems that have less than 128M populated in each section, freeing
the unused memory map would mean significant savings.

But nowadays, when a clock has at least 1G of RAM, I doubt this is relevant
to many systems, if at all.

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread

Thread overview: 16+ messages
2021-04-20  9:09 [PATCH v1 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
2021-04-20  9:09 ` [PATCH v1 1/4] include/linux/mmzone.h: add documentation for pfn_valid() Mike Rapoport
2021-04-20  9:22   ` David Hildenbrand
2021-04-20 12:57     ` Mike Rapoport
2021-04-20 12:58       ` David Hildenbrand
2021-04-20  9:09 ` [PATCH v1 2/4] memblock: update initialization of reserved pages Mike Rapoport
2021-04-20 13:56   ` David Hildenbrand
2021-04-20 15:03     ` Mike Rapoport
2021-04-20 15:18       ` David Hildenbrand
2021-04-20 15:25         ` Mike Rapoport
2021-04-20  9:09 ` [PATCH v1 3/4] arm64: decouple check whether pfn is in linear map from pfn_valid() Mike Rapoport
2021-04-20 15:57   ` David Hildenbrand
2021-04-21  5:32     ` Mike Rapoport
2021-04-20  9:09 ` [PATCH v1 4/4] arm64: drop pfn_valid_within() and simplify pfn_valid() Mike Rapoport
2021-04-20 16:00   ` David Hildenbrand
2021-04-21  5:52     ` Mike Rapoport
