* [RFC/RFT PATCH 0/3] arm64: drop pfn_valid_within() and simplify pfn_valid()
@ 2021-04-07 17:26 Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>
To: linux-arm-kernel
Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, David Hildenbrand,
    Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm,
    linux-kernel, linux-mm

Hi,

These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire
pfn_valid_within() to 1.

The idea is to mark NOMAP pages as reserved in the memory map and restore
the intended semantics of pfn_valid(): it designates the availability of a
struct page for a pfn. With this, the core mm will be able to cope with the
fact that it cannot use NOMAP pages, and the holes created by NOMAP ranges
within MAX_ORDER blocks will be treated correctly even without
pfn_valid_within().

The patches are only boot tested on qemu-system-aarch64, so I'd really
appreciate memory stress tests on real hardware. If this actually works,
we'll be one step closer to dropping the custom pfn_valid() on arm64
altogether.

Mike Rapoport (3):
  memblock: update initialization of reserved pages
  arm64: decouple check whether pfn is normal memory from pfn_valid()
  arm64: drop pfn_valid_within() and simplify pfn_valid()

 arch/arm64/Kconfig              |  3 ---
 arch/arm64/include/asm/memory.h |  2 +-
 arch/arm64/include/asm/page.h   |  1 +
 arch/arm64/kvm/mmu.c            |  2 +-
 arch/arm64/mm/init.c            | 10 ++++++++--
 arch/arm64/mm/ioremap.c         |  4 ++--
 arch/arm64/mm/mmu.c             |  2 +-
 mm/memblock.c                   | 23 +++++++++++++++++++++--
 8 files changed, 35 insertions(+), 12 deletions(-)

base-commit: e49d033bddf5b565044e2abe4241353959bc9120
--
2.28.0
* [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
@ 2021-04-07 17:26 Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

The struct pages representing a reserved memory region are initialized
with the reserve_bootmem_region() function, which is called for each
reserved region just before the memory is freed from memblock to the buddy
page allocator.

The struct pages for MEMBLOCK_NOMAP regions are kept with the default
values set by the memory map initialization, which makes it necessary to
treat such pages specially in pfn_valid() and pfn_valid_within().

Split out the initialization of the reserved pages to a function with a
meaningful name, treat the MEMBLOCK_NOMAP regions the same way as the
reserved regions, and mark the struct pages for the NOMAP regions as
PageReserved.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 mm/memblock.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index afaefa8fc6ab..6b7ea9d86310 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
 	return end_pfn - start_pfn;
 }
 
+static void __init memmap_init_reserved_pages(void)
+{
+	struct memblock_region *region;
+	phys_addr_t start, end;
+	u64 i;
+
+	/* initialize struct pages for the reserved regions */
+	for_each_reserved_mem_range(i, &start, &end)
+		reserve_bootmem_region(start, end);
+
+	/* and also treat struct pages for the NOMAP regions as PageReserved */
+	for_each_mem_region(region) {
+		if (memblock_is_nomap(region)) {
+			start = region->base;
+			end = start + region->size;
+			reserve_bootmem_region(start, end);
+		}
+	}
+}
+
 static unsigned long __init free_low_memory_core_early(void)
 {
 	unsigned long count = 0;
@@ -2010,8 +2030,7 @@ static unsigned long __init free_low_memory_core_early(void)
 
 	memblock_clear_hotplug(0, -1);
 
-	for_each_reserved_mem_range(i, &start, &end)
-		reserve_bootmem_region(start, end);
+	memmap_init_reserved_pages();
 
 	/*
 	 * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id
--
2.28.0
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
@ 2021-04-08  5:16 Anshuman Khandual

On 4/7/21 10:56 PM, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
>
> The struct pages for MEMBLOCK_NOMAP regions are kept with the default
> values set by the memory map initialization which makes it necessary to
> have a special treatment for such pages in pfn_valid() and
> pfn_valid_within().
>
> Split out initialization of the reserved pages to a function with a
> meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the
> reserved regions and mark struct pages for the NOMAP regions as
> PageReserved.

This would definitely need updating the comment for the MEMBLOCK_NOMAP
definition in include/linux/memblock.h just to make the semantics clear,
though arm64 is currently the only user of MEMBLOCK_NOMAP.
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages
@ 2021-04-08  5:48 Mike Rapoport

On Thu, Apr 08, 2021 at 10:46:18AM +0530, Anshuman Khandual wrote:
> On 4/7/21 10:56 PM, Mike Rapoport wrote:
> > Split out initialization of the reserved pages to a function with a
> > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as
> > the reserved regions and mark struct pages for the NOMAP regions as
> > PageReserved.
>
> This would definitely need updating the comment for the MEMBLOCK_NOMAP
> definition in include/linux/memblock.h just to make the semantics clear,

Sure

> though arm64 is currently the only user of MEMBLOCK_NOMAP.

--
Sincerely yours,
Mike.
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-07 17:26 ` Mike Rapoport (?) @ 2021-04-14 15:12 ` David Hildenbrand -1 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-14 15:12 UTC (permalink / raw) To: Mike Rapoport, linux-arm-kernel Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On 07.04.21 19:26, Mike Rapoport wrote: > From: Mike Rapoport <rppt@linux.ibm.com> > > The struct pages representing a reserved memory region are initialized > using reserve_bootmem_range() function. This function is called for each > reserved region just before the memory is freed from memblock to the buddy > page allocator. > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > values set by the memory map initialization which makes it necessary to > have a special treatment for such pages in pfn_valid() and > pfn_valid_within(). I assume these pages are never given to the buddy, because we don't have a direct mapping. So to the kernel, it's essentially just like a memory hole with benefits. I can spot that we want to export such memory like any special memory thingy/hole in /proc/iomem -- "reserved", which makes sense. I would assume that MEMBLOCK_NOMAP is a special type of *reserved* memory. IOW, that for_each_reserved_mem_range() should already succeed on these as well -- we should mark anything that is MEMBLOCK_NOMAP implicitly as reserved. Or are there valid reasons not to do so? What can anyone do with that memory? I assume they are pretty much useless for the kernel, right? Like other reserved memory ranges. > > Split out initialization of the reserved pages to a function with a > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the > reserved regions and mark struct pages for the NOMAP regions as > PageReserved. 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > --- > mm/memblock.c | 23 +++++++++++++++++++++-- > 1 file changed, 21 insertions(+), 2 deletions(-) > > diff --git a/mm/memblock.c b/mm/memblock.c > index afaefa8fc6ab..6b7ea9d86310 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start, > return end_pfn - start_pfn; > } > > +static void __init memmap_init_reserved_pages(void) > +{ > + struct memblock_region *region; > + phys_addr_t start, end; > + u64 i; > + > + /* initialize struct pages for the reserved regions */ > + for_each_reserved_mem_range(i, &start, &end) > + reserve_bootmem_region(start, end); > + > + /* and also treat struct pages for the NOMAP regions as PageReserved */ > + for_each_mem_region(region) { > + if (memblock_is_nomap(region)) { > + start = region->base; > + end = start + region->size; > + reserve_bootmem_region(start, end); > + } > + } > +} > + > static unsigned long __init free_low_memory_core_early(void) > { > unsigned long count = 0; > @@ -2010,8 +2030,7 @@ static unsigned long __init free_low_memory_core_early(void) > > memblock_clear_hotplug(0, -1); > > - for_each_reserved_mem_range(i, &start, &end) > - reserve_bootmem_region(start, end); > + memmap_init_reserved_pages(); > > /* > * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id > -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 78+ messages in thread
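[Editorial note] The control flow of the hunk quoted above can be sketched as a plain user-space simulation — this is not kernel code, and every name below (`sim_region`, `sim_reserve_bootmem_region`, the `page_reserved` array standing in for PageReserved) is invented for illustration; only the shape mirrors the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the memblock logic in the patch:
 * names echo the kernel functions but nothing here is kernel API. */
struct sim_region {
	unsigned long base, size;
	bool nomap;
};

#define SIM_NR_PAGES 16
static bool page_reserved[SIM_NR_PAGES];	/* stands in for PageReserved */

static void sim_reserve_bootmem_region(unsigned long start, unsigned long end)
{
	for (unsigned long pfn = start; pfn < end; pfn++)
		page_reserved[pfn] = true;
}

/* Mirrors memmap_init_reserved_pages(): first the explicitly reserved
 * ranges, then additionally every NOMAP region. */
static void sim_memmap_init_reserved_pages(const struct sim_region *resv,
					   size_t nresv,
					   const struct sim_region *mem,
					   size_t nmem)
{
	for (size_t i = 0; i < nresv; i++)
		sim_reserve_bootmem_region(resv[i].base,
					   resv[i].base + resv[i].size);

	for (size_t i = 0; i < nmem; i++)
		if (mem[i].nomap)
			sim_reserve_bootmem_region(mem[i].base,
						   mem[i].base + mem[i].size);
}
```

With a reserved range covering pfns [0, 2) and a NOMAP memory region covering [4, 6), the simulation ends up with pages 0, 1, 4 and 5 marked reserved — i.e. the NOMAP pages are no longer left with default struct page values, which is the point of the patch.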
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-14 15:12 ` David Hildenbrand (?) (?) @ 2021-04-14 15:27 ` Ard Biesheuvel -1 siblings, 0 replies; 78+ messages in thread From: Ard Biesheuvel @ 2021-04-14 15:27 UTC (permalink / raw) To: David Hildenbrand Cc: Mike Rapoport, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On Wed, 14 Apr 2021 at 17:14, David Hildenbrand <david@redhat.com> wrote: > > On 07.04.21 19:26, Mike Rapoport wrote: > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > The struct pages representing a reserved memory region are initialized > > using reserve_bootmem_range() function. This function is called for each > > reserved region just before the memory is freed from memblock to the buddy > > page allocator. > > > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > > values set by the memory map initialization which makes it necessary to > > have a special treatment for such pages in pfn_valid() and > > pfn_valid_within(). > > I assume these pages are never given to the buddy, because we don't have > a direct mapping. So to the kernel, it's essentially just like a memory > hole with benefits. > > I can spot that we want to export such memory like any special memory > thingy/hole in /proc/iomem -- "reserved", which makes sense. > > I would assume that MEMBLOCK_NOMAP is a special type of *reserved* > memory. IOW, that for_each_reserved_mem_range() should already succeed > on these as well -- we should mark anything that is MEMBLOCK_NOMAP > implicitly as reserved. Or are there valid reasons not to do so? What > can anyone do with that memory? > > I assume they are pretty much useless for the kernel, right? Like other > reserved memory ranges. 
> On ARM, we need to know whether any physical regions that do not contain system memory contain something with device semantics or not. One of the examples is ACPI tables: these are in reserved memory, and so they are not covered by the linear region. However, when the ACPI core ioremap()s an arbitrary memory region, we don't know whether it is mapping a memory region or a device region unless we keep track of this in some way. (Device mappings require device attributes, but firmware tables require memory attributes, as they might be accessed using misaligned reads) > > > > > Split out initialization of the reserved pages to a function with a > > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the > > reserved regions and mark struct pages for the NOMAP regions as > > PageReserved. > > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > --- > > mm/memblock.c | 23 +++++++++++++++++++++-- > > 1 file changed, 21 insertions(+), 2 deletions(-) > > > > diff --git a/mm/memblock.c b/mm/memblock.c > > index afaefa8fc6ab..6b7ea9d86310 100644 > > --- a/mm/memblock.c > > +++ b/mm/memblock.c > > @@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start, > > return end_pfn - start_pfn; > > } > > > > +static void __init memmap_init_reserved_pages(void) > > +{ > > + struct memblock_region *region; > > + phys_addr_t start, end; > > + u64 i; > > + > > + /* initialize struct pages for the reserved regions */ > > + for_each_reserved_mem_range(i, &start, &end) > > + reserve_bootmem_region(start, end); > > + > > + /* and also treat struct pages for the NOMAP regions as PageReserved */ > > + for_each_mem_region(region) { > > + if (memblock_is_nomap(region)) { > > + start = region->base; > > + end = start + region->size; > > + reserve_bootmem_region(start, end); > > + } > > + } > > +} > > + > > static unsigned long __init free_low_memory_core_early(void) > > { > > unsigned long count = 0; > > @@ -2010,8 +2030,7 @@ static unsigned 
long __init free_low_memory_core_early(void) > > > > memblock_clear_hotplug(0, -1); > > > > - for_each_reserved_mem_range(i, &start, &end) > > - reserve_bootmem_region(start, end); > > + memmap_init_reserved_pages(); > > > > /* > > * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id > > > > > -- > Thanks, > > David / dhildenb > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 78+ messages in thread
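[Editorial note] The distinction Ard draws — NOMAP *memory* (ACPI tables, possibly read with misaligned accesses) needs memory attributes, while true device regions need device attributes — can be modeled in a small user-space sketch. All names here (`sim_attr`, `sim_ioremap_attr`, the `is_nomap_memory` flag) are invented; the real arm64 decision is made inside the kernel's ioremap path, not by a helper like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: when mapping a range not covered by the linear
 * region, pick Normal (memory) attributes if the range is tracked as
 * NOMAP system memory, and Device attributes otherwise. */
enum sim_attr { SIM_ATTR_NORMAL, SIM_ATTR_DEVICE };

struct sim_range {
	unsigned long base, size;
	bool is_nomap_memory;	/* e.g. ACPI tables: memory, but NOMAP */
};

static enum sim_attr sim_ioremap_attr(const struct sim_range *r)
{
	/* Firmware tables require memory semantics (misaligned reads
	 * must work); anything else outside the linear map is treated
	 * as a device region. */
	return r->is_nomap_memory ? SIM_ATTR_NORMAL : SIM_ATTR_DEVICE;
}
```

This is why the kernel has to keep track of which non-System-RAM physical ranges are actually memory: without that bookkeeping, an ioremap() of an ACPI table and an ioremap() of an MMIO register window would be indistinguishable.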
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-14 15:27 ` Ard Biesheuvel (?) @ 2021-04-14 15:52 ` David Hildenbrand -1 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-14 15:52 UTC (permalink / raw) To: Ard Biesheuvel Cc: Mike Rapoport, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On 14.04.21 17:27, Ard Biesheuvel wrote: > On Wed, 14 Apr 2021 at 17:14, David Hildenbrand <david@redhat.com> wrote: >> >> On 07.04.21 19:26, Mike Rapoport wrote: >>> From: Mike Rapoport <rppt@linux.ibm.com> >>> >>> The struct pages representing a reserved memory region are initialized >>> using reserve_bootmem_range() function. This function is called for each >>> reserved region just before the memory is freed from memblock to the buddy >>> page allocator. >>> >>> The struct pages for MEMBLOCK_NOMAP regions are kept with the default >>> values set by the memory map initialization which makes it necessary to >>> have a special treatment for such pages in pfn_valid() and >>> pfn_valid_within(). >> >> I assume these pages are never given to the buddy, because we don't have >> a direct mapping. So to the kernel, it's essentially just like a memory >> hole with benefits. >> >> I can spot that we want to export such memory like any special memory >> thingy/hole in /proc/iomem -- "reserved", which makes sense. >> >> I would assume that MEMBLOCK_NOMAP is a special type of *reserved* >> memory. IOW, that for_each_reserved_mem_range() should already succeed >> on these as well -- we should mark anything that is MEMBLOCK_NOMAP >> implicitly as reserved. Or are there valid reasons not to do so? What >> can anyone do with that memory? >> >> I assume they are pretty much useless for the kernel, right? Like other >> reserved memory ranges. 
>> > > On ARM, we need to know whether any physical regions that do not > contain system memory contain something with device semantics or not. > One of the examples is ACPI tables: these are in reserved memory, and > so they are not covered by the linear region. However, when the ACPI > core ioremap()s an arbitrary memory region, we don't know whether it > is mapping a memory region or a device region unless we keep track of > this in some way. (Device mappings require device attributes, but > firmware tables require memory attributes, as they might be accessed > using misaligned reads) Using generically sounding NOMAP ("don't create direct mapping") to identify device regions feels like a hack. I know, it was introduced just for that purpose. Looking at memblock_mark_nomap(), we consider "device regions" 1) ACPI tables 2) VIDEO_TYPE_EFI memory 3) some device-tree regions in of/fdt.c IIUC, right now we end up creating a memmap for this NOMAP memory, but hide it away in pfn_valid(). This patch set at least fixes that. Assuming these pages are never mapped to user space via the struct page (which better be the case), we could further use a new pagetype to mark these pages in a special way, such that we can identify them directly via pfn_to_page(). Then, we could mostly avoid having to query memblock at runtime to figure out that this is special memory. This would obviously be an extension to this series. Just a thought. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 78+ messages in thread
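[Editorial note] David's suggestion above — mark NOMAP pages with a dedicated page type so the check becomes a pfn_to_page() lookup instead of a memblock search — can be sketched as follows. This is a user-space simulation under stated assumptions: `PG_sim_nomap`, `sim_page`, and `sim_pfn_is_map_memory` are all invented names, not existing kernel flags or helpers:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits in a struct-page-like object. */
enum {
	PG_sim_reserved = 1u << 0,
	PG_sim_nomap    = 1u << 1,	/* the proposed new pagetype */
};

struct sim_page {
	unsigned int flags;
};

#define SIM_NR_PAGES 8
static struct sim_page sim_memmap[SIM_NR_PAGES];

/* At init time, NOMAP ranges get both Reserved and the NOMAP marker. */
static void sim_mark_nomap(unsigned long start, unsigned long end)
{
	for (unsigned long pfn = start; pfn < end; pfn++)
		sim_memmap[pfn].flags |= PG_sim_reserved | PG_sim_nomap;
}

/* What a pfn_is_map_memory()-style check could then reduce to:
 * a direct memmap lookup, with no memblock walk at runtime. */
static bool sim_pfn_is_map_memory(unsigned long pfn)
{
	return pfn < SIM_NR_PAGES &&
	       !(sim_memmap[pfn].flags & PG_sim_nomap);
}
```

The design trade-off is the one David names: the memblock query moves from every runtime check into a one-time marking pass at boot, at the cost of consuming a page flag/pagetype encoding.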
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages @ 2021-04-14 15:52 ` David Hildenbrand 0 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-14 15:52 UTC (permalink / raw) To: Ard Biesheuvel Cc: Mike Rapoport, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On 14.04.21 17:27, Ard Biesheuvel wrote: > On Wed, 14 Apr 2021 at 17:14, David Hildenbrand <david@redhat.com> wrote: >> >> On 07.04.21 19:26, Mike Rapoport wrote: >>> From: Mike Rapoport <rppt@linux.ibm.com> >>> >>> The struct pages representing a reserved memory region are initialized >>> using reserve_bootmem_range() function. This function is called for each >>> reserved region just before the memory is freed from memblock to the buddy >>> page allocator. >>> >>> The struct pages for MEMBLOCK_NOMAP regions are kept with the default >>> values set by the memory map initialization which makes it necessary to >>> have a special treatment for such pages in pfn_valid() and >>> pfn_valid_within(). >> >> I assume these pages are never given to the buddy, because we don't have >> a direct mapping. So to the kernel, it's essentially just like a memory >> hole with benefits. >> >> I can spot that we want to export such memory like any special memory >> thingy/hole in /proc/iomem -- "reserved", which makes sense. >> >> I would assume that MEMBLOCK_NOMAP is a special type of *reserved* >> memory. IOW, that for_each_reserved_mem_range() should already succeed >> on these as well -- we should mark anything that is MEMBLOCK_NOMAP >> implicitly as reserved. Or are there valid reasons not to do so? What >> can anyone do with that memory? >> >> I assume they are pretty much useless for the kernel, right? Like other >> reserved memory ranges. 
>> > > On ARM, we need to know whether any physical regions that do not > contain system memory contain something with device semantics or not. > One of the examples is ACPI tables: these are in reserved memory, and > so they are not covered by the linear region. However, when the ACPI > core ioremap()s an arbitrary memory region, we don't know whether it > is mapping a memory region or a device region unless we keep track of > this in some way. (Device mappings require device attributes, but > firmware tables require memory attributes, as they might be accessed > using misaligned reads) Using generically sounding NOMAP ("don't create direct mapping") to identify device regions feels like a hack. I know, it was introduced just for that purpose. Looking at memblock_mark_nomap(), we consider "device regions" 1) ACPI tables 2) VIDEO_TYPE_EFI memory 3) some device-tree regions in of/fdt.c IIUC, right now we end up creating a memmap for this NOMAP memory, but hide it away in pfn_valid(). This patch set at least fixes that. Assuming these pages are never mapped to user space via the struct page (which better be the case), we could further use a new pagetype to mark these pages in a special way, such that we can identify them directly via pfn_to_page(). Then, we could mostly avoid having to query memblock at runtime to figure out that this is special memory. This would obviously be an extension to this series. Just a thought. -- Thanks, David / dhildenb _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 78+ messages in thread

* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-14 15:52 ` David Hildenbrand (?) @ 2021-04-14 20:24 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-14 20:24 UTC (permalink / raw) To: David Hildenbrand Cc: Ard Biesheuvel, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On Wed, Apr 14, 2021 at 05:52:57PM +0200, David Hildenbrand wrote: > On 14.04.21 17:27, Ard Biesheuvel wrote: > > On Wed, 14 Apr 2021 at 17:14, David Hildenbrand <david@redhat.com> wrote: > > > > > > On 07.04.21 19:26, Mike Rapoport wrote: > > > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > > > > > The struct pages representing a reserved memory region are initialized > > > > using reserve_bootmem_range() function. This function is called for each > > > > reserved region just before the memory is freed from memblock to the buddy > > > > page allocator. > > > > > > > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > > > > values set by the memory map initialization which makes it necessary to > > > > have a special treatment for such pages in pfn_valid() and > > > > pfn_valid_within(). > > > > > > I assume these pages are never given to the buddy, because we don't have > > > a direct mapping. So to the kernel, it's essentially just like a memory > > > hole with benefits. > > > > > > I can spot that we want to export such memory like any special memory > > > thingy/hole in /proc/iomem -- "reserved", which makes sense. > > > > > > I would assume that MEMBLOCK_NOMAP is a special type of *reserved* > > > memory. IOW, that for_each_reserved_mem_range() should already succeed > > > on these as well -- we should mark anything that is MEMBLOCK_NOMAP > > > implicitly as reserved. Or are there valid reasons not to do so? What > > > can anyone do with that memory? 
> > > > > > I assume they are pretty much useless for the kernel, right? Like other > > > reserved memory ranges. > > > > > > > On ARM, we need to know whether any physical regions that do not > > contain system memory contain something with device semantics or not. > > One of the examples is ACPI tables: these are in reserved memory, and > > so they are not covered by the linear region. However, when the ACPI > > core ioremap()s an arbitrary memory region, we don't know whether it > > is mapping a memory region or a device region unless we keep track of > > this in some way. (Device mappings require device attributes, but > > firmware tables require memory attributes, as they might be accessed > > using misaligned reads) > > Using generically sounding NOMAP ("don't create direct mapping") to identify > device regions feels like a hack. I know, it was introduced just for that > purpose. > > Looking at memblock_mark_nomap(), we consider "device regions" > > 1) ACPI tables > > 2) VIDEO_TYPE_EFI memory > > 3) some device-tree regions in of/fdt.c > > > IIUC, right now we end up creating a memmap for this NOMAP memory, but hide > it away in pfn_valid(). This patch set at least fixes that. Currently we have memmap entries with struct page set to defaults for the NOMAP memory. AFAIU hiding them in pfn_valid()/pfn_valid_within() was a solution to failures in pfn walkers that presumed that for a pfn_valid() there will be a struct page that really reflects the state of that page. > Assuming these pages are never mapped to user space via the struct page > (which better be the case), we could further use a new pagetype to mark > these pages in a special way, such that we can identify them directly via > pfn_to_page(). Not sure we really need a new pagetype here, PG_Reserved seems to be quite enough to say "don't touch this". I generally agree that we could make PG_Reserved a PageType and then have several sub-types for reserved memory. 
This definitely will add clarity, but I'm not sure that it justifies the amount of churn and effort required to audit uses of PageReserved(). > Then, we could mostly avoid having to query memblock at runtime to figure > out that this is special memory. This would obviously be an extension to > this series. Just a thought. Stop pushing memblock out of kernel! ;-) Now, seriously, we can minimize memblock involvement at run time, and this series is yet another step in that direction. -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-14 20:24 ` Mike Rapoport (?) @ 2021-04-15 9:30 ` David Hildenbrand -1 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-15 9:30 UTC (permalink / raw) To: Mike Rapoport Cc: Ard Biesheuvel, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List > Not sure we really need a new pagetype here, PG_Reserved seems to be quite > enough to say "don't touch this". I generally agree that we could make > PG_Reserved a PageType and then have several sub-types for reserved memory. > This definitely will add clarity but I'm not sure that it justifies > the amount of churn and effort required to audit uses of PageReserved(). > >> Then, we could mostly avoid having to query memblock at runtime to figure >> out that this is special memory. This would obviously be an extension to >> this series. Just a thought. > > Stop pushing memblock out of kernel! ;-) Can't stop. Won't stop. :D It's lovely for booting up a kernel until we have other data-structures in place ;) -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-15 9:30 ` David Hildenbrand (?) @ 2021-04-16 11:44 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-16 11:44 UTC (permalink / raw) To: David Hildenbrand Cc: Ard Biesheuvel, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On Thu, Apr 15, 2021 at 11:30:12AM +0200, David Hildenbrand wrote: > > Not sure we really need a new pagetype here, PG_Reserved seems to be quite > > enough to say "don't touch this". I generally agree that we could make > > PG_Reserved a PageType and then have several sub-types for reserved memory. > > This definitely will add clarity but I'm not sure that it justifies > > the amount of churn and effort required to audit uses of PageReserved(). > > > Then, we could mostly avoid having to query memblock at runtime to figure > > > out that this is special memory. This would obviously be an extension to > > > this series. Just a thought. > > > > Stop pushing memblock out of kernel! ;-) > > Can't stop. Won't stop. :D > > It's lovely for booting up a kernel until we have other data-structures in > place ;) A bit more seriously, we don't have any data structure that reliably represents the physical memory layout in an arch-independent fashion. memblock is probably the best starting point for eventually having one. -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-16 11:44 ` Mike Rapoport (?) @ 2021-04-16 11:54 ` David Hildenbrand -1 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-16 11:54 UTC (permalink / raw) To: Mike Rapoport Cc: Ard Biesheuvel, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On 16.04.21 13:44, Mike Rapoport wrote: > On Thu, Apr 15, 2021 at 11:30:12AM +0200, David Hildenbrand wrote: >>> Not sure we really need a new pagetype here, PG_Reserved seems to be quite >>> enough to say "don't touch this". I generally agree that we could make >>> PG_Reserved a PageType and then have several sub-types for reserved memory. >>> This definitely will add clarity but I'm not sure that it justifies >>> the amount of churn and effort required to audit uses of PageReserved(). >>>> Then, we could mostly avoid having to query memblock at runtime to figure >>>> out that this is special memory. This would obviously be an extension to >>>> this series. Just a thought. >>> >>> Stop pushing memblock out of kernel! ;-) >> >> Can't stop. Won't stop. :D >> >> It's lovely for booting up a kernel until we have other data-structures in >> place ;) > > A bit more seriously, we don't have any data structure that reliably > represents the physical memory layout in an arch-independent fashion. > memblock is probably the best starting point for eventually having one. We have the (slowish) kernel resource tree after boot and the (faster) memmap. I really don't see why we need yet another slowish variant. We might be better off to just extend and speed up the kernel resource tree. Memblock as-is is not a reasonable data structure to keep around after boot: consider, for example, how it handles boot-time allocations and reserved regions both as just "reserved". 
-- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages @ 2021-04-16 11:54 ` David Hildenbrand 0 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-16 11:54 UTC (permalink / raw) To: Mike Rapoport Cc: Ard Biesheuvel, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On 16.04.21 13:44, Mike Rapoport wrote: > On Thu, Apr 15, 2021 at 11:30:12AM +0200, David Hildenbrand wrote: >>> Not sure we really need a new pagetype here, PG_Reserved seems to be quite >>> enough to say "don't touch this". I generally agree that we could make >>> PG_Reserved a PageType and then have several sub-types for reserved memory. >>> This definitely will add clarity but I'm not sure that this justifies >>> amount of churn and effort required to audit uses of PageResrved(). >>>> Then, we could mostly avoid having to query memblock at runtime to figure >>>> out that this is special memory. This would obviously be an extension to >>>> this series. Just a thought. >>> >>> Stop pushing memblock out of kernel! ;-) >> >> Can't stop. Won't stop. :D >> >> It's lovely for booting up a kernel until we have other data-structures in >> place ;) > > A bit more seriously, we don't have any data structure that reliably > represents physical memory layout and arch-independent fashion. > memblock is probably the best starting point for eventually having one. We have the (slowish) kernel resource tree after boot and the (faster) memmap. I really don't see why we really need another slowish variant. We might be better off to just extend and speed up the kernel resource tree. Memblock as is is not a reasonable datastructure to keep around after boot: for example, how we handle boottime allocations and reserve regions both as reserved. 
-- Thanks, David / dhildenb _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages @ 2021-04-16 11:54 ` David Hildenbrand 0 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-16 11:54 UTC (permalink / raw) To: Mike Rapoport Cc: Anshuman Khandual, Catalin Marinas, Linux Kernel Mailing List, Mike Rapoport, Linux Memory Management List, kvmarm, Marc Zyngier, Will Deacon, Linux ARM On 16.04.21 13:44, Mike Rapoport wrote: > On Thu, Apr 15, 2021 at 11:30:12AM +0200, David Hildenbrand wrote: >>> Not sure we really need a new pagetype here, PG_Reserved seems to be quite >>> enough to say "don't touch this". I generally agree that we could make >>> PG_Reserved a PageType and then have several sub-types for reserved memory. >>> This definitely will add clarity but I'm not sure that this justifies >>> amount of churn and effort required to audit uses of PageResrved(). >>>> Then, we could mostly avoid having to query memblock at runtime to figure >>>> out that this is special memory. This would obviously be an extension to >>>> this series. Just a thought. >>> >>> Stop pushing memblock out of kernel! ;-) >> >> Can't stop. Won't stop. :D >> >> It's lovely for booting up a kernel until we have other data-structures in >> place ;) > > A bit more seriously, we don't have any data structure that reliably > represents physical memory layout and arch-independent fashion. > memblock is probably the best starting point for eventually having one. We have the (slowish) kernel resource tree after boot and the (faster) memmap. I really don't see why we really need another slowish variant. We might be better off to just extend and speed up the kernel resource tree. Memblock as is is not a reasonable datastructure to keep around after boot: for example, how we handle boottime allocations and reserve regions both as reserved. 
-- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-14 15:27 ` Ard Biesheuvel (?) @ 2021-04-14 20:11 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-14 20:11 UTC (permalink / raw) To: Ard Biesheuvel Cc: David Hildenbrand, Linux ARM, Anshuman Khandual, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, Linux Kernel Mailing List, Linux Memory Management List On Wed, Apr 14, 2021 at 05:27:53PM +0200, Ard Biesheuvel wrote: > On Wed, 14 Apr 2021 at 17:14, David Hildenbrand <david@redhat.com> wrote: > > > > On 07.04.21 19:26, Mike Rapoport wrote: > > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > > > The struct pages representing a reserved memory region are initialized > > > using reserve_bootmem_range() function. This function is called for each > > > reserved region just before the memory is freed from memblock to the buddy > > > page allocator. > > > > > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > > > values set by the memory map initialization which makes it necessary to > > > have a special treatment for such pages in pfn_valid() and > > > pfn_valid_within(). > > > > I assume these pages are never given to the buddy, because we don't have > > a direct mapping. So to the kernel, it's essentially just like a memory > > hole with benefits. > > > > I can spot that we want to export such memory like any special memory > > thingy/hole in /proc/iomem -- "reserved", which makes sense. > > > > I would assume that MEMBLOCK_NOMAP is a special type of *reserved* > > memory. IOW, that for_each_reserved_mem_range() should already succeed > > on these as well -- we should mark anything that is MEMBLOCK_NOMAP > > implicitly as reserved. Or are there valid reasons not to do so? What > > can anyone do with that memory? > > > > I assume they are pretty much useless for the kernel, right? Like other > > reserved memory ranges. 
> > > > On ARM, we need to know whether any physical regions that do not > contain system memory contain something with device semantics or not. > One of the examples is ACPI tables: these are in reserved memory, and > so they are not covered by the linear region. However, when the ACPI > core ioremap()s an arbitrary memory region, we don't know whether it > is mapping a memory region or a device region unless we keep track of > this in some way. (Device mappings require device attributes, but > firmware tables require memory attributes, as they might be accessed > using misaligned reads) I mostly agree, but my understanding is that regions of *physical* memory that are occupied by various pieces of EFI/ACPI information require special treatment because it was defined this way in the ACPI spec. And since ARM cannot tolerate aliased mappings with different caching modes, the whole bunch of firmware memory should be ioremap()ed to access it. > > > Split out initialization of the reserved pages to a function with a > > > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the > > > reserved regions and mark struct pages for the NOMAP regions as > > > PageReserved. 
> > > > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > > --- > > > mm/memblock.c | 23 +++++++++++++++++++++-- > > > 1 file changed, 21 insertions(+), 2 deletions(-) > > > > > > diff --git a/mm/memblock.c b/mm/memblock.c > > > index afaefa8fc6ab..6b7ea9d86310 100644 > > > --- a/mm/memblock.c > > > +++ b/mm/memblock.c > > > @@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start, > > > return end_pfn - start_pfn; > > > } > > > > > > +static void __init memmap_init_reserved_pages(void) > > > +{ > > > + struct memblock_region *region; > > > + phys_addr_t start, end; > > > + u64 i; > > > + > > > + /* initialize struct pages for the reserved regions */ > > > + for_each_reserved_mem_range(i, &start, &end) > > > + reserve_bootmem_region(start, end); > > > + > > > + /* and also treat struct pages for the NOMAP regions as PageReserved */ > > > + for_each_mem_region(region) { > > > + if (memblock_is_nomap(region)) { > > > + start = region->base; > > > + end = start + region->size; > > > + reserve_bootmem_region(start, end); > > > + } > > > + } > > > +} > > > + > > > static unsigned long __init free_low_memory_core_early(void) > > > { > > > unsigned long count = 0; > > > @@ -2010,8 +2030,7 @@ static unsigned long __init free_low_memory_core_early(void) > > > > > > memblock_clear_hotplug(0, -1); > > > > > > - for_each_reserved_mem_range(i, &start, &end) > > > - reserve_bootmem_region(start, end); > > > + memmap_init_reserved_pages(); > > > > > > /* > > > * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id > > > > > > > > > -- > > Thanks, > > > > David / dhildenb > > > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-14 15:12 ` David Hildenbrand (?) @ 2021-04-14 20:06 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-14 20:06 UTC (permalink / raw) To: David Hildenbrand Cc: linux-arm-kernel, Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On Wed, Apr 14, 2021 at 05:12:11PM +0200, David Hildenbrand wrote: > On 07.04.21 19:26, Mike Rapoport wrote: > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > The struct pages representing a reserved memory region are initialized > > using reserve_bootmem_range() function. This function is called for each > > reserved region just before the memory is freed from memblock to the buddy > > page allocator. > > > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > > values set by the memory map initialization which makes it necessary to > > have a special treatment for such pages in pfn_valid() and > > pfn_valid_within(). > > I assume these pages are never given to the buddy, because we don't have a > direct mapping. So to the kernel, it's essentially just like a memory hole > with benefits. The pages should not be accessed as normal memory so they do not have a direct (or in ARMish linear) mapping and are never given to buddy. After looking at ACPI standard I don't see a fundamental reason for this but they've already made this mess and we need to cope with it. > I can spot that we want to export such memory like any special memory > thingy/hole in /proc/iomem -- "reserved", which makes sense. It does, but let's wait with /proc/iomem changes. We don't really have a 100% consistent view of it on different architectures, so adding yet another type there does not seem, well, urgent. > I would assume that MEMBLOCK_NOMAP is a special type of *reserved* memory. 
> IOW, that for_each_reserved_mem_range() should already succeed on these as > well -- we should mark anything that is MEMBLOCK_NOMAP implicitly as > reserved. Or are there valid reasons not to do so? What can anyone do with > that memory? > > I assume they are pretty much useless for the kernel, right? Like other > reserved memory ranges. I agree that there is a lot of commonality between NOMAP and reserved. The problem is that even semantics for reserved is different between architectures. Moreover, on the same architecture there could be E820_TYPE_RESERVED and memblock.reserved with different properties. I'd really prefer moving in baby steps here because any change in the boot mm can bear several month of early hangs debugging ;-) > > Split out initialization of the reserved pages to a function with a > > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as the > > reserved regions and mark struct pages for the NOMAP regions as > > PageReserved. > > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > --- > > mm/memblock.c | 23 +++++++++++++++++++++-- > > 1 file changed, 21 insertions(+), 2 deletions(-) > > > > diff --git a/mm/memblock.c b/mm/memblock.c > > index afaefa8fc6ab..6b7ea9d86310 100644 > > --- a/mm/memblock.c > > +++ b/mm/memblock.c > > @@ -2002,6 +2002,26 @@ static unsigned long __init __free_memory_core(phys_addr_t start, > > return end_pfn - start_pfn; > > } > > +static void __init memmap_init_reserved_pages(void) > > +{ > > + struct memblock_region *region; > > + phys_addr_t start, end; > > + u64 i; > > + > > + /* initialize struct pages for the reserved regions */ > > + for_each_reserved_mem_range(i, &start, &end) > > + reserve_bootmem_region(start, end); > > + > > + /* and also treat struct pages for the NOMAP regions as PageReserved */ > > + for_each_mem_region(region) { > > + if (memblock_is_nomap(region)) { > > + start = region->base; > > + end = start + region->size; > > + reserve_bootmem_region(start, end); > > + } > > 
+ } > > +} > > + > > static unsigned long __init free_low_memory_core_early(void) > > { > > unsigned long count = 0; > > @@ -2010,8 +2030,7 @@ static unsigned long __init free_low_memory_core_early(void) > > memblock_clear_hotplug(0, -1); > > - for_each_reserved_mem_range(i, &start, &end) > > - reserve_bootmem_region(start, end); > > + memmap_init_reserved_pages(); > > /* > > * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 1/3] memblock: update initialization of reserved pages 2021-04-14 20:06 ` Mike Rapoport @ 2021-04-14 20:09 ` David Hildenbrand -1 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-14 20:09 UTC (permalink / raw) To: Mike Rapoport Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-arm-kernel, linux-kernel, linux-mm Mike Rapoport <rppt@kernel.org> wrote on Wed, 14 Apr 2021 at 22:06: > On Wed, Apr 14, 2021 at 05:12:11PM +0200, David Hildenbrand wrote: > > On 07.04.21 19:26, Mike Rapoport wrote: > > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > > > The struct pages representing a reserved memory region are initialized > > > using reserve_bootmem_range() function. This function is called for > each > > > reserved region just before the memory is freed from memblock to the > buddy > > > page allocator. > > > > > > The struct pages for MEMBLOCK_NOMAP regions are kept with the default > > > values set by the memory map initialization which makes it necessary to > > > have a special treatment for such pages in pfn_valid() and > > > pfn_valid_within(). > > > > I assume these pages are never given to the buddy, because we don't have > a > > direct mapping. So to the kernel, it's essentially just like a memory > hole > > with benefits. > > The pages should not be accessed as normal memory so they do not have a > direct (or in ARMish linear) mapping and are never given to buddy. > After looking at ACPI standard I don't see a fundamental reason for this > but they've already made this mess and we need to cope with it. > > > I can spot that we want to export such memory like any special memory > > thingy/hole in /proc/iomem -- "reserved", which makes sense. > > It does, but let's wait with /proc/iomem changes. 
We don't really have a > 100% consistent view of it on different architectures, so adding yet > another type there does not seem, well, urgent. > To clarify: this is already done on arm64. > > I would assume that MEMBLOCK_NOMAP is a special type of *reserved* > memory. > > IOW, that for_each_reserved_mem_range() should already succeed on these > as > > well -- we should mark anything that is MEMBLOCK_NOMAP implicitly as > > reserved. Or are there valid reasons not to do so? What can anyone do > with > > that memory? > > > > I assume they are pretty much useless for the kernel, right? Like other > > reserved memory ranges. > > I agree that there is a lot of commonality between NOMAP and reserved. The > problem is that even semantics for reserved is different between > architectures. Moreover, on the same architecture there could be > E820_TYPE_RESERVED and memblock.reserved with different properties. > > I'd really prefer moving in baby steps here because any change in the boot > mm can bear several month of early hangs debugging ;-) Yeah I know. We just should have the desired target state figured out :) > > > > Split out initialization of the reserved pages to a function with a > > > meaningful name and treat the MEMBLOCK_NOMAP regions the same way as > the > > > reserved regions and mark struct pages for the NOMAP regions as > > > PageReserved. 
> > > > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > > --- > > > mm/memblock.c | 23 +++++++++++++++++++++-- > > > 1 file changed, 21 insertions(+), 2 deletions(-) > > > > > > diff --git a/mm/memblock.c b/mm/memblock.c > > > index afaefa8fc6ab..6b7ea9d86310 100644 > > > --- a/mm/memblock.c > > > +++ b/mm/memblock.c > > > @@ -2002,6 +2002,26 @@ static unsigned long __init > __free_memory_core(phys_addr_t start, > > > return end_pfn - start_pfn; > > > } > > > +static void __init memmap_init_reserved_pages(void) > > > +{ > > > + struct memblock_region *region; > > > + phys_addr_t start, end; > > > + u64 i; > > > + > > > + /* initialize struct pages for the reserved regions */ > > > + for_each_reserved_mem_range(i, &start, &end) > > > + reserve_bootmem_region(start, end); > > > + > > > + /* and also treat struct pages for the NOMAP regions as > PageReserved */ > > > + for_each_mem_region(region) { > > > + if (memblock_is_nomap(region)) { > > > + start = region->base; > > > + end = start + region->size; > > > + reserve_bootmem_region(start, end); > > > + } > > > + } > > > +} > > > + > > > static unsigned long __init free_low_memory_core_early(void) > > > { > > > unsigned long count = 0; > > > @@ -2010,8 +2030,7 @@ static unsigned long __init > free_low_memory_core_early(void) > > > memblock_clear_hotplug(0, -1); > > > - for_each_reserved_mem_range(i, &start, &end) > > > - reserve_bootmem_region(start, end); > > > + memmap_init_reserved_pages(); > > > /* > > > * We need to use NUMA_NO_NODE instead of NODE_DATA(0)->node_id > > -- > Sincerely yours, > Mike. > > -- Thanks, David / dhildenb [-- Attachment #2: Type: text/html, Size: 6468 bytes --] ^ permalink raw reply [flat|nested] 78+ messages in thread
* [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid()
  2021-04-07 17:26 ` Mike Rapoport
  (?)
@ 2021-04-07 17:26   ` Mike Rapoport
  -1 siblings, 0 replies; 78+ messages in thread
From: Mike Rapoport @ 2021-04-07 17:26 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
	David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport,
	Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm

From: Mike Rapoport <rppt@linux.ibm.com>

The intended semantics of pfn_valid() is to verify whether there is a
struct page for the pfn in question and nothing else.

Yet, on arm64 it is used to distinguish memory areas that are mapped in the
linear map vs those that require ioremap() to access them.

Introduce a dedicated pfn_is_memory() to perform such check and use it
where appropriate.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/memory.h | 2 +-
 arch/arm64/include/asm/page.h   | 1 +
 arch/arm64/kvm/mmu.c            | 2 +-
 arch/arm64/mm/init.c            | 6 ++++++
 arch/arm64/mm/ioremap.c         | 4 ++--
 arch/arm64/mm/mmu.c             | 2 +-
 6 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0aabc3be9a75..7e77fdf71b9d 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
 
 #define virt_addr_valid(addr)	({					\
 	__typeof__(addr) __addr = __tag_reset(addr);			\
-	__is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));	\
+	__is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr));	\
 })
 
 void dump_mem_limit(void);
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..32b485bcc6ff 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from);
 typedef struct page *pgtable_t;
 
 extern int pfn_valid(unsigned long);
+extern int pfn_is_memory(unsigned long);
 
 #include <asm/memory.h>
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8711894db8c2..ad2ea65a3937 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 
 static bool kvm_is_device_pfn(unsigned long pfn)
 {
-	return !pfn_valid(pfn);
+	return !pfn_is_memory(pfn);
 }
 
 /*
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 3685e12aba9b..258b1905ed4a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn)
 }
 EXPORT_SYMBOL(pfn_valid);
 
+int pfn_is_memory(unsigned long pfn)
+{
+	return memblock_is_map_memory(PFN_PHYS(pfn));
+}
+EXPORT_SYMBOL(pfn_is_memory);
+
 static phys_addr_t memory_limit = PHYS_ADDR_MAX;
 
 /*
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index b5e83c46b23e..82a369b22ef5 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
 	/*
 	 * Don't allow RAM to be mapped.
 	 */
-	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
+	if (WARN_ON(pfn_is_memory(__phys_to_pfn(phys_addr))))
 		return NULL;
 
 	area = get_vm_area_caller(size, VM_IOREMAP, caller);
@@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
 void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
 {
 	/* For normal memory we already have a cacheable mapping. */
-	if (pfn_valid(__phys_to_pfn(phys_addr)))
+	if (pfn_is_memory(__phys_to_pfn(phys_addr)))
 		return (void __iomem *)__phys_to_virt(phys_addr);
 
 	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 5d9550fdb9cf..038d20fe163f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 			      unsigned long size, pgprot_t vma_prot)
 {
-	if (!pfn_valid(pfn))
+	if (!pfn_is_memory(pfn))
 		return pgprot_noncached(vma_prot);
 	else if (file->f_flags & O_SYNC)
 		return pgprot_writecombine(vma_prot);
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid()
  2021-04-07 17:26   ` Mike Rapoport
  (?)
@ 2021-04-08  5:14     ` Anshuman Khandual
  -1 siblings, 0 replies; 78+ messages in thread
From: Anshuman Khandual @ 2021-04-08  5:14 UTC (permalink / raw)
  To: Mike Rapoport, linux-arm-kernel
  Cc: Ard Biesheuvel, Catalin Marinas, David Hildenbrand, Marc Zyngier,
	Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel,
	linux-mm

On 4/7/21 10:56 PM, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
>
> The intended semantics of pfn_valid() is to verify whether there is a
> struct page for the pfn in question and nothing else.

Should there be a comment affirming this semantics interpretation, above
the generic pfn_valid() in include/linux/mmzone.h ?

>
> Yet, on arm64 it is used to distinguish memory areas that are mapped in the
> linear map vs those that require ioremap() to access them.
>
> Introduce a dedicated pfn_is_memory() to perform such check and use it
> where appropriate.
>
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  arch/arm64/include/asm/memory.h | 2 +-
>  arch/arm64/include/asm/page.h   | 1 +
>  arch/arm64/kvm/mmu.c            | 2 +-
>  arch/arm64/mm/init.c            | 6 ++++++
>  arch/arm64/mm/ioremap.c         | 4 ++--
>  arch/arm64/mm/mmu.c             | 2 +-
>  6 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0aabc3be9a75..7e77fdf71b9d 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x)
>
>  #define virt_addr_valid(addr)	({					\
>  	__typeof__(addr) __addr = __tag_reset(addr);			\
> -	__is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr));	\
> +	__is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr));	\
>  })
>
>  void dump_mem_limit(void);
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 012cffc574e8..32b485bcc6ff 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from);
>  typedef struct page *pgtable_t;
>
>  extern int pfn_valid(unsigned long);
> +extern int pfn_is_memory(unsigned long);
>
>  #include <asm/memory.h>
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 8711894db8c2..ad2ea65a3937 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
>
>  static bool kvm_is_device_pfn(unsigned long pfn)
>  {
> -	return !pfn_valid(pfn);
> +	return !pfn_is_memory(pfn);
>  }
>
>  /*
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 3685e12aba9b..258b1905ed4a 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn)
>  }
>  EXPORT_SYMBOL(pfn_valid);
>
> +int pfn_is_memory(unsigned long pfn)
> +{
> +	return memblock_is_map_memory(PFN_PHYS(pfn));
> +}
> +EXPORT_SYMBOL(pfn_is_memory);
> +

Should not this be generic though ? There is nothing platform or arm64
specific in here. Wondering as pfn_is_memory() just indicates that the pfn
is linear mapped, should not it be renamed as pfn_is_linear_memory()
instead ? Regardless, it's fine either way.

>  static phys_addr_t memory_limit = PHYS_ADDR_MAX;
>
>  /*
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index b5e83c46b23e..82a369b22ef5 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
>  	/*
>  	 * Don't allow RAM to be mapped.
>  	 */
> -	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> +	if (WARN_ON(pfn_is_memory(__phys_to_pfn(phys_addr))))
>  		return NULL;
>
>  	area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap);
>  void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>  {
>  	/* For normal memory we already have a cacheable mapping. */
> -	if (pfn_valid(__phys_to_pfn(phys_addr)))
> +	if (pfn_is_memory(__phys_to_pfn(phys_addr)))
>  		return (void __iomem *)__phys_to_virt(phys_addr);
>
>  	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 5d9550fdb9cf..038d20fe163f 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd)
>  pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  			      unsigned long size, pgprot_t vma_prot)
>  {
> -	if (!pfn_valid(pfn))
> +	if (!pfn_is_memory(pfn))
>  		return pgprot_noncached(vma_prot);
>  	else if (file->f_flags & O_SYNC)
>  		return pgprot_writecombine(vma_prot);
>

^ permalink raw reply	[flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() @ 2021-04-08 5:14 ` Anshuman Khandual 0 siblings, 0 replies; 78+ messages in thread From: Anshuman Khandual @ 2021-04-08 5:14 UTC (permalink / raw) To: Mike Rapoport, linux-arm-kernel Cc: David Hildenbrand, Catalin Marinas, linux-kernel, Mike Rapoport, linux-mm, kvmarm, Marc Zyngier, Will Deacon On 4/7/21 10:56 PM, Mike Rapoport wrote: > From: Mike Rapoport <rppt@linux.ibm.com> > > The intended semantics of pfn_valid() is to verify whether there is a > struct page for the pfn in question and nothing else. Should there be a comment affirming this semantics interpretation, above the generic pfn_valid() in include/linux/mmzone.h ? > > Yet, on arm64 it is used to distinguish memory areas that are mapped in the > linear map vs those that require ioremap() to access them. > > Introduce a dedicated pfn_is_memory() to perform such check and use it > where appropriate. > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > --- > arch/arm64/include/asm/memory.h | 2 +- > arch/arm64/include/asm/page.h | 1 + > arch/arm64/kvm/mmu.c | 2 +- > arch/arm64/mm/init.c | 6 ++++++ > arch/arm64/mm/ioremap.c | 4 ++-- > arch/arm64/mm/mmu.c | 2 +- > 6 files changed, 12 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h > index 0aabc3be9a75..7e77fdf71b9d 100644 > --- a/arch/arm64/include/asm/memory.h > +++ b/arch/arm64/include/asm/memory.h > @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) > > #define virt_addr_valid(addr) ({ \ > __typeof__(addr) __addr = __tag_reset(addr); \ > - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ > + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ > }) > > void dump_mem_limit(void); > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h > index 012cffc574e8..32b485bcc6ff 100644 > --- a/arch/arm64/include/asm/page.h > +++ 
b/arch/arm64/include/asm/page.h > @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); > typedef struct page *pgtable_t; > > extern int pfn_valid(unsigned long); > +extern int pfn_is_memory(unsigned long); > > #include <asm/memory.h> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index 8711894db8c2..ad2ea65a3937 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) > > static bool kvm_is_device_pfn(unsigned long pfn) > { > - return !pfn_valid(pfn); > + return !pfn_is_memory(pfn); > } > > /* > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 3685e12aba9b..258b1905ed4a 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn) > } > EXPORT_SYMBOL(pfn_valid); > > +int pfn_is_memory(unsigned long pfn) > +{ > + return memblock_is_map_memory(PFN_PHYS(pfn)); > +} > +EXPORT_SYMBOL(pfn_is_memory);> + Should not this be generic though ? There is nothing platform or arm64 specific in here. Wondering as pfn_is_memory() just indicates that the pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() instead ? Regardless, it's fine either way. > static phys_addr_t memory_limit = PHYS_ADDR_MAX; > > /* > diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c > index b5e83c46b23e..82a369b22ef5 100644 > --- a/arch/arm64/mm/ioremap.c > +++ b/arch/arm64/mm/ioremap.c > @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size, > /* > * Don't allow RAM to be mapped. > */ > - if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr)))) > + if (WARN_ON(pfn_is_memory(__phys_to_pfn(phys_addr)))) > return NULL; > > area = get_vm_area_caller(size, VM_IOREMAP, caller); > @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap); > void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size) > { > /* For normal memory we already have a cacheable mapping. 
*/ > - if (pfn_valid(__phys_to_pfn(phys_addr))) > + if (pfn_is_memory(__phys_to_pfn(phys_addr))) > return (void __iomem *)__phys_to_virt(phys_addr); > > return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL), > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 5d9550fdb9cf..038d20fe163f 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd) > pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, > unsigned long size, pgprot_t vma_prot) > { > - if (!pfn_valid(pfn)) > + if (!pfn_is_memory(pfn)) > return pgprot_noncached(vma_prot); > else if (file->f_flags & O_SYNC) > return pgprot_writecombine(vma_prot); > _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm ^ permalink raw reply [flat|nested] 78+ messages in thread
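The distinction being reviewed here, struct-page availability versus linear-map coverage, can be sketched with a small userspace model. Everything below (the names, the NOMAP pfn, the zone size) is invented for illustration; it is not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: every pfn below NR_PFNS has a struct page in the memory
 * map, but one of them is MEMBLOCK_NOMAP and therefore absent from
 * the linear map. All values here are made up for illustration. */
#define NR_PFNS   8UL
#define NOMAP_PFN 5UL   /* pretend firmware reserved this page as NOMAP */

/* Intended pfn_valid() semantics: only "does a struct page exist?" */
static bool model_pfn_valid(unsigned long pfn)
{
	return pfn < NR_PFNS;
}

/* The helper the patch splits out: "is this pfn in the linear map,
 * i.e. usable as normal memory without ioremap()?" */
static bool model_pfn_is_memory(unsigned long pfn)
{
	return model_pfn_valid(pfn) && pfn != NOMAP_PFN;
}
```

In this model a NOMAP pfn is valid (it has a struct page, which patch 1/3 marks reserved) yet is not normal memory; that is the case the old arm64 pfn_valid() conflated.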
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() 2021-04-08 5:14 ` Anshuman Khandual (?) @ 2021-04-08 6:00 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-08 6:00 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On Thu, Apr 08, 2021 at 10:44:58AM +0530, Anshuman Khandual wrote: > > On 4/7/21 10:56 PM, Mike Rapoport wrote: > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > The intended semantics of pfn_valid() is to verify whether there is a > > struct page for the pfn in question and nothing else. > > Should there be a comment affirming this semantics interpretation, above the > generic pfn_valid() in include/linux/mmzone.h ? Yeah, that would have been helpful :) > > > > Yet, on arm64 it is used to distinguish memory areas that are mapped in the > > linear map vs those that require ioremap() to access them. > > > > Introduce a dedicated pfn_is_memory() to perform such check and use it > > where appropriate. 
> > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > --- > > arch/arm64/include/asm/memory.h | 2 +- > > arch/arm64/include/asm/page.h | 1 + > > arch/arm64/kvm/mmu.c | 2 +- > > arch/arm64/mm/init.c | 6 ++++++ > > arch/arm64/mm/ioremap.c | 4 ++-- > > arch/arm64/mm/mmu.c | 2 +- > > 6 files changed, 12 insertions(+), 5 deletions(-) > > > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h > > index 0aabc3be9a75..7e77fdf71b9d 100644 > > --- a/arch/arm64/include/asm/memory.h > > +++ b/arch/arm64/include/asm/memory.h > > @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) > > > > #define virt_addr_valid(addr) ({ \ > > __typeof__(addr) __addr = __tag_reset(addr); \ > > - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ > > + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ > > }) > > > > void dump_mem_limit(void); > > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h > > index 012cffc574e8..32b485bcc6ff 100644 > > --- a/arch/arm64/include/asm/page.h > > +++ b/arch/arm64/include/asm/page.h > > @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); > > typedef struct page *pgtable_t; > > > > extern int pfn_valid(unsigned long); > > +extern int pfn_is_memory(unsigned long); > > > > #include <asm/memory.h> > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > index 8711894db8c2..ad2ea65a3937 100644 > > --- a/arch/arm64/kvm/mmu.c > > +++ b/arch/arm64/kvm/mmu.c > > @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) > > > > static bool kvm_is_device_pfn(unsigned long pfn) > > { > > - return !pfn_valid(pfn); > > + return !pfn_is_memory(pfn); > > } > > > > /* > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > > index 3685e12aba9b..258b1905ed4a 100644 > > --- a/arch/arm64/mm/init.c > > +++ b/arch/arm64/mm/init.c > > @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn) > > } > > EXPORT_SYMBOL(pfn_valid); > > > > +int 
pfn_is_memory(unsigned long pfn) > > +{ > > + return memblock_is_map_memory(PFN_PHYS(pfn)); > > +} > > +EXPORT_SYMBOL(pfn_is_memory);> + > > Should not this be generic though ? There is nothing platform or arm64 > specific in here. As NOMAP itself is quite ARM specific, this check is currently only relevant for arm64 and maybe arm32. But probably having an EXPORT_SYMBOL wrapper for memblock_is_map_memory(), say in memblock does make sense for all architectures that have KEEP_MEMBLOCK. > Wondering as pfn_is_memory() just indicates that the > pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() > instead ? Regardless, it's fine either way. Yeah, I agree that naming could be better here. I think that for a generic name we'd need pfn_is_directly_mapped() so that it can be used on x86 ;-) > > static phys_addr_t memory_limit = PHYS_ADDR_MAX; > > > > /* > > diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c > > index b5e83c46b23e..82a369b22ef5 100644 > > --- a/arch/arm64/mm/ioremap.c > > +++ b/arch/arm64/mm/ioremap.c > > @@ -43,7 +43,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size, > > /* > > * Don't allow RAM to be mapped. > > */ > > - if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr)))) > > + if (WARN_ON(pfn_is_memory(__phys_to_pfn(phys_addr)))) > > return NULL; > > > > area = get_vm_area_caller(size, VM_IOREMAP, caller); > > @@ -84,7 +84,7 @@ EXPORT_SYMBOL(iounmap); > > void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size) > > { > > /* For normal memory we already have a cacheable mapping. 
*/ > > - if (pfn_valid(__phys_to_pfn(phys_addr))) > > + if (pfn_is_memory(__phys_to_pfn(phys_addr))) > > return (void __iomem *)__phys_to_virt(phys_addr); > > > > return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL), > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > > index 5d9550fdb9cf..038d20fe163f 100644 > > --- a/arch/arm64/mm/mmu.c > > +++ b/arch/arm64/mm/mmu.c > > @@ -81,7 +81,7 @@ void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd) > > pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, > > unsigned long size, pgprot_t vma_prot) > > { > > - if (!pfn_valid(pfn)) > > + if (!pfn_is_memory(pfn)) > > return pgprot_noncached(vma_prot); > > else if (file->f_flags & O_SYNC) > > return pgprot_writecombine(vma_prot); > > -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
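Mike's suggestion above, an exported generic wrapper around memblock_is_map_memory() for architectures with KEEP_MEMBLOCK, can likewise be modelled in userspace. The region table, the flag value and the wrapper name below are assumptions for illustration, not the kernel implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of memblock_is_map_memory(): a fixed region table where
 * one range carries a NOMAP flag. Layout and values are invented. */
#define PAGE_SHIFT     12
#define PFN_PHYS(pfn)  ((unsigned long long)(pfn) << PAGE_SHIFT)
#define MEMBLOCK_NOMAP 0x4u

struct region { unsigned long long base, size; unsigned flags; };

static const struct region memory[] = {
	{ 0x40000000ULL, 0x10000000ULL, 0 },              /* normal RAM */
	{ 0x50000000ULL, 0x00200000ULL, MEMBLOCK_NOMAP }, /* NOMAP hole */
	{ 0x50200000ULL, 0x0fe00000ULL, 0 },              /* normal RAM */
};

static bool model_memblock_is_map_memory(unsigned long long addr)
{
	for (unsigned i = 0; i < sizeof(memory) / sizeof(memory[0]); i++) {
		const struct region *r = &memory[i];

		if (addr >= r->base && addr < r->base + r->size)
			return !(r->flags & MEMBLOCK_NOMAP);
	}
	return false;
}

/* The generic wrapper being discussed (name is a placeholder): */
static bool model_pfn_is_map_memory(unsigned long pfn)
{
	return model_memblock_is_map_memory(PFN_PHYS(pfn));
}
```

A pfn inside the NOMAP range fails the check even though it sits between two mapped RAM ranges, which is precisely the MAX_ORDER-hole situation the cover letter describes.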
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() 2021-04-08 5:14 ` Anshuman Khandual (?) @ 2021-04-14 15:58 ` David Hildenbrand -1 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-14 15:58 UTC (permalink / raw) To: Anshuman Khandual, Mike Rapoport, linux-arm-kernel Cc: Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On 08.04.21 07:14, Anshuman Khandual wrote: > > On 4/7/21 10:56 PM, Mike Rapoport wrote: >> From: Mike Rapoport <rppt@linux.ibm.com> >> >> The intended semantics of pfn_valid() is to verify whether there is a >> struct page for the pfn in question and nothing else. > > Should there be a comment affirming this semantics interpretation, above the > generic pfn_valid() in include/linux/mmzone.h ? > >> >> Yet, on arm64 it is used to distinguish memory areas that are mapped in the >> linear map vs those that require ioremap() to access them. >> >> Introduce a dedicated pfn_is_memory() to perform such check and use it >> where appropriate. 
>> >> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> >> --- >> arch/arm64/include/asm/memory.h | 2 +- >> arch/arm64/include/asm/page.h | 1 + >> arch/arm64/kvm/mmu.c | 2 +- >> arch/arm64/mm/init.c | 6 ++++++ >> arch/arm64/mm/ioremap.c | 4 ++-- >> arch/arm64/mm/mmu.c | 2 +- >> 6 files changed, 12 insertions(+), 5 deletions(-) >> >> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h >> index 0aabc3be9a75..7e77fdf71b9d 100644 >> --- a/arch/arm64/include/asm/memory.h >> +++ b/arch/arm64/include/asm/memory.h >> @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) >> >> #define virt_addr_valid(addr) ({ \ >> __typeof__(addr) __addr = __tag_reset(addr); \ >> - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ >> + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ >> }) >> >> void dump_mem_limit(void); >> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h >> index 012cffc574e8..32b485bcc6ff 100644 >> --- a/arch/arm64/include/asm/page.h >> +++ b/arch/arm64/include/asm/page.h >> @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); >> typedef struct page *pgtable_t; >> >> extern int pfn_valid(unsigned long); >> +extern int pfn_is_memory(unsigned long); >> >> #include <asm/memory.h> >> >> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c >> index 8711894db8c2..ad2ea65a3937 100644 >> --- a/arch/arm64/kvm/mmu.c >> +++ b/arch/arm64/kvm/mmu.c >> @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) >> >> static bool kvm_is_device_pfn(unsigned long pfn) >> { >> - return !pfn_valid(pfn); >> + return !pfn_is_memory(pfn); >> } >> >> /* >> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c >> index 3685e12aba9b..258b1905ed4a 100644 >> --- a/arch/arm64/mm/init.c >> +++ b/arch/arm64/mm/init.c >> @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn) >> } >> EXPORT_SYMBOL(pfn_valid); >> >> +int pfn_is_memory(unsigned long pfn) >> +{ >> + return 
memblock_is_map_memory(PFN_PHYS(pfn)); >> +} >> +EXPORT_SYMBOL(pfn_is_memory);> + > > Should not this be generic though ? There is nothing platform or arm64 > specific in here. Wondering as pfn_is_memory() just indicates that the > pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() > instead ? Regardless, it's fine either way. TBH, I dislike (generic) pfn_is_memory(). It feels like we're mixing concepts. NOMAP memory vs !NOMAP memory; even NOMAP is some kind of memory after all. pfn_is_map_memory() would be more expressive, although still sub-optimal. We'd actually want some kind of arm64-specific pfn_is_system_memory() or the inverse pfn_is_device_memory() -- to be improved. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() 2021-04-14 15:58 ` David Hildenbrand (?) @ 2021-04-14 20:29 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-14 20:29 UTC (permalink / raw) To: David Hildenbrand Cc: Anshuman Khandual, linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On Wed, Apr 14, 2021 at 05:58:26PM +0200, David Hildenbrand wrote: > On 08.04.21 07:14, Anshuman Khandual wrote: > > > > On 4/7/21 10:56 PM, Mike Rapoport wrote: > > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > > > The intended semantics of pfn_valid() is to verify whether there is a > > > struct page for the pfn in question and nothing else. > > > > Should there be a comment affirming this semantics interpretation, above the > > generic pfn_valid() in include/linux/mmzone.h ? > > > > > > > > Yet, on arm64 it is used to distinguish memory areas that are mapped in the > > > linear map vs those that require ioremap() to access them. > > > > > > Introduce a dedicated pfn_is_memory() to perform such check and use it > > > where appropriate. 
> > > > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > > --- > > > arch/arm64/include/asm/memory.h | 2 +- > > > arch/arm64/include/asm/page.h | 1 + > > > arch/arm64/kvm/mmu.c | 2 +- > > > arch/arm64/mm/init.c | 6 ++++++ > > > arch/arm64/mm/ioremap.c | 4 ++-- > > > arch/arm64/mm/mmu.c | 2 +- > > > 6 files changed, 12 insertions(+), 5 deletions(-) > > > > > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h > > > index 0aabc3be9a75..7e77fdf71b9d 100644 > > > --- a/arch/arm64/include/asm/memory.h > > > +++ b/arch/arm64/include/asm/memory.h > > > @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) > > > #define virt_addr_valid(addr) ({ \ > > > __typeof__(addr) __addr = __tag_reset(addr); \ > > > - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ > > > + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ > > > }) > > > void dump_mem_limit(void); > > > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h > > > index 012cffc574e8..32b485bcc6ff 100644 > > > --- a/arch/arm64/include/asm/page.h > > > +++ b/arch/arm64/include/asm/page.h > > > @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); > > > typedef struct page *pgtable_t; > > > extern int pfn_valid(unsigned long); > > > +extern int pfn_is_memory(unsigned long); > > > #include <asm/memory.h> > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > > index 8711894db8c2..ad2ea65a3937 100644 > > > --- a/arch/arm64/kvm/mmu.c > > > +++ b/arch/arm64/kvm/mmu.c > > > @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) > > > static bool kvm_is_device_pfn(unsigned long pfn) > > > { > > > - return !pfn_valid(pfn); > > > + return !pfn_is_memory(pfn); > > > } > > > /* > > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > > > index 3685e12aba9b..258b1905ed4a 100644 > > > --- a/arch/arm64/mm/init.c > > > +++ b/arch/arm64/mm/init.c > > > @@ -258,6 +258,12 @@ int 
pfn_valid(unsigned long pfn) > > > } > > > EXPORT_SYMBOL(pfn_valid); > > > +int pfn_is_memory(unsigned long pfn) > > > +{ > > > + return memblock_is_map_memory(PFN_PHYS(pfn)); > > > +} > > > +EXPORT_SYMBOL(pfn_is_memory);> + > > > > Should not this be generic though ? There is nothing platform or arm64 > > specific in here. Wondering as pfn_is_memory() just indicates that the > > pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() > > instead ? Regardless, it's fine either way. > > TBH, I dislike (generic) pfn_is_memory(). It feels like we're mixing > concepts. Yeah, at the moment NOMAP is very much arm specific so I'd keep it this way for now. > NOMAP memory vs !NOMAP memory; even NOMAP is some kind of memory > after all. pfn_is_map_memory() would be more expressive, although still > sub-optimal. > > We'd actually want some kind of arm64-specific pfn_is_system_memory() or the > inverse pfn_is_device_memory() -- to be improved. In my current version (to be posted soon) I've started with pfn_lineary_mapped() but then ended up with pfn_mapped() to make it "upward" compatible with architectures that use direct rather than linear map :) -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() 2021-04-14 20:29 ` Mike Rapoport (?) @ 2021-04-15 9:31 ` David Hildenbrand -1 siblings, 0 replies; 78+ messages in thread From: David Hildenbrand @ 2021-04-15 9:31 UTC (permalink / raw) To: Mike Rapoport Cc: Anshuman Khandual, linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On 14.04.21 22:29, Mike Rapoport wrote: > On Wed, Apr 14, 2021 at 05:58:26PM +0200, David Hildenbrand wrote: >> On 08.04.21 07:14, Anshuman Khandual wrote: >>> >>> On 4/7/21 10:56 PM, Mike Rapoport wrote: >>>> From: Mike Rapoport <rppt@linux.ibm.com> >>>> >>>> The intended semantics of pfn_valid() is to verify whether there is a >>>> struct page for the pfn in question and nothing else. >>> >>> Should there be a comment affirming this semantics interpretation, above the >>> generic pfn_valid() in include/linux/mmzone.h ? >>> >>>> >>>> Yet, on arm64 it is used to distinguish memory areas that are mapped in the >>>> linear map vs those that require ioremap() to access them. >>>> >>>> Introduce a dedicated pfn_is_memory() to perform such check and use it >>>> where appropriate. 
>>>> >>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> >>>> --- >>>> arch/arm64/include/asm/memory.h | 2 +- >>>> arch/arm64/include/asm/page.h | 1 + >>>> arch/arm64/kvm/mmu.c | 2 +- >>>> arch/arm64/mm/init.c | 6 ++++++ >>>> arch/arm64/mm/ioremap.c | 4 ++-- >>>> arch/arm64/mm/mmu.c | 2 +- >>>> 6 files changed, 12 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h >>>> index 0aabc3be9a75..7e77fdf71b9d 100644 >>>> --- a/arch/arm64/include/asm/memory.h >>>> +++ b/arch/arm64/include/asm/memory.h >>>> @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) >>>> #define virt_addr_valid(addr) ({ \ >>>> __typeof__(addr) __addr = __tag_reset(addr); \ >>>> - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ >>>> + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ >>>> }) >>>> void dump_mem_limit(void); >>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h >>>> index 012cffc574e8..32b485bcc6ff 100644 >>>> --- a/arch/arm64/include/asm/page.h >>>> +++ b/arch/arm64/include/asm/page.h >>>> @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); >>>> typedef struct page *pgtable_t; >>>> extern int pfn_valid(unsigned long); >>>> +extern int pfn_is_memory(unsigned long); >>>> #include <asm/memory.h> >>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c >>>> index 8711894db8c2..ad2ea65a3937 100644 >>>> --- a/arch/arm64/kvm/mmu.c >>>> +++ b/arch/arm64/kvm/mmu.c >>>> @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) >>>> static bool kvm_is_device_pfn(unsigned long pfn) >>>> { >>>> - return !pfn_valid(pfn); >>>> + return !pfn_is_memory(pfn); >>>> } >>>> /* >>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c >>>> index 3685e12aba9b..258b1905ed4a 100644 >>>> --- a/arch/arm64/mm/init.c >>>> +++ b/arch/arm64/mm/init.c >>>> @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn) >>>> } >>>> 
EXPORT_SYMBOL(pfn_valid); >>>> +int pfn_is_memory(unsigned long pfn) >>>> +{ >>>> + return memblock_is_map_memory(PFN_PHYS(pfn)); >>>> +} >>>> +EXPORT_SYMBOL(pfn_is_memory);> + >>> >>> Should not this be generic though ? There is nothing platform or arm64 >>> specific in here. Wondering as pfn_is_memory() just indicates that the >>> pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() >>> instead ? Regardless, it's fine either way. >> >> TBH, I dislike (generic) pfn_is_memory(). It feels like we're mixing >> concepts. > > Yeah, at the moment NOMAP is very much arm specific so I'd keep it this way > for now. > >> NOMAP memory vs !NOMAP memory; even NOMAP is some kind of memory >> after all. pfn_is_map_memory() would be more expressive, although still >> sub-optimal. >> >> We'd actually want some kind of arm64-specific pfn_is_system_memory() or the >> inverse pfn_is_device_memory() -- to be improved. > > In my current version (to be posted soon) I've started with > pfn_lineary_mapped() but then ended up with pfn_mapped() to make it > "upward" compatible with architectures that use direct rather than linear > map :) And even that is moot. It doesn't tell you if a PFN is *actually* mapped (hello secretmem). I'd suggest to just use memblock_is_map_memory() in arch specific code. Then it's clear what we are querying exactly and what the semantics might be. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() 2021-04-15 9:31 ` David Hildenbrand (?) @ 2021-04-16 11:40 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-16 11:40 UTC (permalink / raw) To: David Hildenbrand Cc: Anshuman Khandual, linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On Thu, Apr 15, 2021 at 11:31:26AM +0200, David Hildenbrand wrote: > On 14.04.21 22:29, Mike Rapoport wrote: > > On Wed, Apr 14, 2021 at 05:58:26PM +0200, David Hildenbrand wrote: > > > On 08.04.21 07:14, Anshuman Khandual wrote: > > > > > > > > On 4/7/21 10:56 PM, Mike Rapoport wrote: > > > > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > > > > > > > The intended semantics of pfn_valid() is to verify whether there is a > > > > > struct page for the pfn in question and nothing else. > > > > > > > > Should there be a comment affirming this semantics interpretation, above the > > > > generic pfn_valid() in include/linux/mmzone.h ? > > > > > > > > > > > > > > Yet, on arm64 it is used to distinguish memory areas that are mapped in the > > > > > linear map vs those that require ioremap() to access them. > > > > > > > > > > Introduce a dedicated pfn_is_memory() to perform such check and use it > > > > > where appropriate. 
> > > > > > > > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > > > > --- > > > > > arch/arm64/include/asm/memory.h | 2 +- > > > > > arch/arm64/include/asm/page.h | 1 + > > > > > arch/arm64/kvm/mmu.c | 2 +- > > > > > arch/arm64/mm/init.c | 6 ++++++ > > > > > arch/arm64/mm/ioremap.c | 4 ++-- > > > > > arch/arm64/mm/mmu.c | 2 +- > > > > > 6 files changed, 12 insertions(+), 5 deletions(-) > > > > > > > > > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h > > > > > index 0aabc3be9a75..7e77fdf71b9d 100644 > > > > > --- a/arch/arm64/include/asm/memory.h > > > > > +++ b/arch/arm64/include/asm/memory.h > > > > > @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) > > > > > #define virt_addr_valid(addr) ({ \ > > > > > __typeof__(addr) __addr = __tag_reset(addr); \ > > > > > - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ > > > > > + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ > > > > > }) > > > > > void dump_mem_limit(void); > > > > > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h > > > > > index 012cffc574e8..32b485bcc6ff 100644 > > > > > --- a/arch/arm64/include/asm/page.h > > > > > +++ b/arch/arm64/include/asm/page.h > > > > > @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); > > > > > typedef struct page *pgtable_t; > > > > > extern int pfn_valid(unsigned long); > > > > > +extern int pfn_is_memory(unsigned long); > > > > > #include <asm/memory.h> > > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > > > > index 8711894db8c2..ad2ea65a3937 100644 > > > > > --- a/arch/arm64/kvm/mmu.c > > > > > +++ b/arch/arm64/kvm/mmu.c > > > > > @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) > > > > > static bool kvm_is_device_pfn(unsigned long pfn) > > > > > { > > > > > - return !pfn_valid(pfn); > > > > > + return !pfn_is_memory(pfn); > > > > > } > > > > > /* > > > > > diff --git a/arch/arm64/mm/init.c 
b/arch/arm64/mm/init.c > > > > > index 3685e12aba9b..258b1905ed4a 100644 > > > > > --- a/arch/arm64/mm/init.c > > > > > +++ b/arch/arm64/mm/init.c > > > > > @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn) > > > > > } > > > > > EXPORT_SYMBOL(pfn_valid); > > > > > +int pfn_is_memory(unsigned long pfn) > > > > > +{ > > > > > + return memblock_is_map_memory(PFN_PHYS(pfn)); > > > > > +} > > > > > +EXPORT_SYMBOL(pfn_is_memory);> + > > > > > > > > Should not this be generic though ? There is nothing platform or arm64 > > > > specific in here. Wondering as pfn_is_memory() just indicates that the > > > > pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() > > > > instead ? Regardless, it's fine either way. > > > > > > TBH, I dislike (generic) pfn_is_memory(). It feels like we're mixing > > > concepts. > > > > Yeah, at the moment NOMAP is very much arm specific so I'd keep it this way > > for now. > > > > > NOMAP memory vs !NOMAP memory; even NOMAP is some kind of memory > > > after all. pfn_is_map_memory() would be more expressive, although still > > > sub-optimal. > > > > > > We'd actually want some kind of arm64-specific pfn_is_system_memory() or the > > > inverse pfn_is_device_memory() -- to be improved. > > > > In my current version (to be posted soon) I've started with > > pfn_lineary_mapped() but then ended up with pfn_mapped() to make it > > "upward" compatible with architectures that use direct rather than linear > > map :) > > And even that is moot. It doesn't tell you if a PFN is *actually* mapped > (hello secretmem). > > I'd suggest to just use memblock_is_map_memory() in arch specific code. Then > it's clear what we are querying exactly and what the semantics might be. Ok, let's export memblock_is_map_memory() for the KEEP_MEMBLOCK case. -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 2/3] arm64: decouple check whether pfn is normal memory from pfn_valid() @ 2021-04-16 11:40 ` Mike Rapoport 0 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-16 11:40 UTC (permalink / raw) To: David Hildenbrand Cc: Anshuman Khandual, linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On Thu, Apr 15, 2021 at 11:31:26AM +0200, David Hildenbrand wrote: > On 14.04.21 22:29, Mike Rapoport wrote: > > On Wed, Apr 14, 2021 at 05:58:26PM +0200, David Hildenbrand wrote: > > > On 08.04.21 07:14, Anshuman Khandual wrote: > > > > > > > > On 4/7/21 10:56 PM, Mike Rapoport wrote: > > > > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > > > > > > > The intended semantics of pfn_valid() is to verify whether there is a > > > > > struct page for the pfn in question and nothing else. > > > > > > > > Should there be a comment affirming this semantics interpretation, above the > > > > generic pfn_valid() in include/linux/mmzone.h ? > > > > > > > > > > > > > > Yet, on arm64 it is used to distinguish memory areas that are mapped in the > > > > > linear map vs those that require ioremap() to access them. > > > > > > > > > > Introduce a dedicated pfn_is_memory() to perform such check and use it > > > > > where appropriate. 
> > > > > > > > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > > > > --- > > > > > arch/arm64/include/asm/memory.h | 2 +- > > > > > arch/arm64/include/asm/page.h | 1 + > > > > > arch/arm64/kvm/mmu.c | 2 +- > > > > > arch/arm64/mm/init.c | 6 ++++++ > > > > > arch/arm64/mm/ioremap.c | 4 ++-- > > > > > arch/arm64/mm/mmu.c | 2 +- > > > > > 6 files changed, 12 insertions(+), 5 deletions(-) > > > > > > > > > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h > > > > > index 0aabc3be9a75..7e77fdf71b9d 100644 > > > > > --- a/arch/arm64/include/asm/memory.h > > > > > +++ b/arch/arm64/include/asm/memory.h > > > > > @@ -351,7 +351,7 @@ static inline void *phys_to_virt(phys_addr_t x) > > > > > #define virt_addr_valid(addr) ({ \ > > > > > __typeof__(addr) __addr = __tag_reset(addr); \ > > > > > - __is_lm_address(__addr) && pfn_valid(virt_to_pfn(__addr)); \ > > > > > + __is_lm_address(__addr) && pfn_is_memory(virt_to_pfn(__addr)); \ > > > > > }) > > > > > void dump_mem_limit(void); > > > > > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h > > > > > index 012cffc574e8..32b485bcc6ff 100644 > > > > > --- a/arch/arm64/include/asm/page.h > > > > > +++ b/arch/arm64/include/asm/page.h > > > > > @@ -38,6 +38,7 @@ void copy_highpage(struct page *to, struct page *from); > > > > > typedef struct page *pgtable_t; > > > > > extern int pfn_valid(unsigned long); > > > > > +extern int pfn_is_memory(unsigned long); > > > > > #include <asm/memory.h> > > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > > > > index 8711894db8c2..ad2ea65a3937 100644 > > > > > --- a/arch/arm64/kvm/mmu.c > > > > > +++ b/arch/arm64/kvm/mmu.c > > > > > @@ -85,7 +85,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm) > > > > > static bool kvm_is_device_pfn(unsigned long pfn) > > > > > { > > > > > - return !pfn_valid(pfn); > > > > > + return !pfn_is_memory(pfn); > > > > > } > > > > > /* > > > > > diff --git a/arch/arm64/mm/init.c 
b/arch/arm64/mm/init.c > > > > > index 3685e12aba9b..258b1905ed4a 100644 > > > > > --- a/arch/arm64/mm/init.c > > > > > +++ b/arch/arm64/mm/init.c > > > > > @@ -258,6 +258,12 @@ int pfn_valid(unsigned long pfn) > > > > > } > > > > > EXPORT_SYMBOL(pfn_valid); > > > > > +int pfn_is_memory(unsigned long pfn) > > > > > +{ > > > > > + return memblock_is_map_memory(PFN_PHYS(pfn)); > > > > > +} > > > > > +EXPORT_SYMBOL(pfn_is_memory);> + > > > > > > > > Should not this be generic though ? There is nothing platform or arm64 > > > > specific in here. Wondering as pfn_is_memory() just indicates that the > > > > pfn is linear mapped, should not it be renamed as pfn_is_linear_memory() > > > > instead ? Regardless, it's fine either way. > > > > > > TBH, I dislike (generic) pfn_is_memory(). It feels like we're mixing > > > concepts. > > > > Yeah, at the moment NOMAP is very much arm specific so I'd keep it this way > > for now. > > > > > NOMAP memory vs !NOMAP memory; even NOMAP is some kind of memory > > > after all. pfn_is_map_memory() would be more expressive, although still > > > sub-optimal. > > > > > > We'd actually want some kind of arm64-specific pfn_is_system_memory() or the > > > inverse pfn_is_device_memory() -- to be improved. > > > > In my current version (to be posted soon) I've started with > > pfn_lineary_mapped() but then ended up with pfn_mapped() to make it > > "upward" compatible with architectures that use direct rather than linear > > map :) > > And even that is moot. It doesn't tell you if a PFN is *actually* mapped > (hello secretmem). > > I'd suggest to just use memblock_is_map_memory() in arch specific code. Then > it's clear what we are querying exactly and what the semantics might be. Ok, let's export memblock_is_map_memory() for the KEEP_MEMBLOCK case. -- Sincerely yours, Mike. 
^ permalink raw reply [flat|nested] 78+ messages in thread
* [RFC/RFT PATCH 3/3] arm64: drop pfn_valid_within() and simplify pfn_valid() 2021-04-07 17:26 ` Mike Rapoport (?) @ 2021-04-07 17:26 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-07 17:26 UTC (permalink / raw) To: linux-arm-kernel Cc: Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm From: Mike Rapoport <rppt@linux.ibm.com> The arm64 version of pfn_valid() differs from the generic one for two reasons: * Parts of the memory map are freed during boot. This makes it necessary to verify that there is actual physical memory that corresponds to a pfn, which is done by querying memblock. * There are NOMAP memory regions. These regions are not mapped in the linear map and, until the previous commit, the struct pages representing these areas had default values. As a consequence of the absence of special treatment of NOMAP regions in the memory map, it was necessary to use memblock_is_map_memory() in pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that generic mm functionality would not treat a NOMAP page as a normal page. Since the NOMAP regions are now marked as PageReserved(), pfn walkers and the rest of core mm will treat them as unusable memory, and thus pfn_valid_within() is no longer required at all and can be disabled by removing CONFIG_HOLES_IN_ZONE on arm64. pfn_valid() can be slightly simplified by replacing memblock_is_map_memory() with memblock_is_memory(). 
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> --- arch/arm64/Kconfig | 3 --- arch/arm64/mm/init.c | 4 ++-- 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e4e1b6550115..58e439046d05 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK def_bool y depends on NUMA -config HOLES_IN_ZONE - def_bool y - source "kernel/Kconfig.hz" config ARCH_SPARSEMEM_ENABLE diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 258b1905ed4a..bb6dd406b1f0 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn) /* * ZONE_DEVICE memory does not have the memblock entries. - * memblock_is_map_memory() check for ZONE_DEVICE based + * memblock_is_memory() check for ZONE_DEVICE based * addresses will always fail. Even the normal hotplugged * memory will never have MEMBLOCK_NOMAP flag set in their * memblock entries. Skip memblock search for all non early @@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn) return pfn_section_valid(ms, pfn); } #endif - return memblock_is_map_memory(addr); + return memblock_is_memory(addr); } EXPORT_SYMBOL(pfn_valid); -- 2.28.0 ^ permalink raw reply related [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 3/3] arm64: drop pfn_valid_within() and simplify pfn_valid() 2021-04-07 17:26 ` Mike Rapoport (?) @ 2021-04-08 5:12 ` Anshuman Khandual -1 siblings, 0 replies; 78+ messages in thread From: Anshuman Khandual @ 2021-04-08 5:12 UTC (permalink / raw) To: Mike Rapoport, linux-arm-kernel Cc: Ard Biesheuvel, Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On 4/7/21 10:56 PM, Mike Rapoport wrote: > From: Mike Rapoport <rppt@linux.ibm.com> > > The arm64's version of pfn_valid() differs from the generic because of two > reasons: > > * Parts of the memory map are freed during boot. This makes it necessary to > verify that there is actual physical memory that corresponds to a pfn > which is done by querying memblock. > > * There are NOMAP memory regions. These regions are not mapped in the > linear map and until the previous commit the struct pages representing > these areas had default values. > > As the consequence of absence of the special treatment of NOMAP regions in > the memory map it was necessary to use memblock_is_map_memory() in > pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that > generic mm functionality would not treat a NOMAP page as a normal page. > > Since the NOMAP regions are now marked as PageReserved(), pfn walkers and > the rest of core mm will treat them as unusable memory and thus > pfn_valid_within() is no longer required at all and can be disabled by > removing CONFIG_HOLES_IN_ZONE on arm64. But what about the memory map that are freed during boot (mentioned above). Would not they still cause CONFIG_HOLES_IN_ZONE to be applicable and hence pfn_valid_within() ? > > pfn_valid() can be slightly simplified by replacing > memblock_is_map_memory() with memblock_is_memory(). 
Just to understand this better, pfn_valid() will now return true for all MEMBLOCK_NOMAP based memory but that is okay as core MM would still ignore them as unusable memory for being PageReserved(). > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > --- > arch/arm64/Kconfig | 3 --- > arch/arm64/mm/init.c | 4 ++-- > 2 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index e4e1b6550115..58e439046d05 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK > def_bool y > depends on NUMA > > -config HOLES_IN_ZONE > - def_bool y > - > source "kernel/Kconfig.hz" > > config ARCH_SPARSEMEM_ENABLE > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 258b1905ed4a..bb6dd406b1f0 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn) > > /* > * ZONE_DEVICE memory does not have the memblock entries. > - * memblock_is_map_memory() check for ZONE_DEVICE based > + * memblock_is_memory() check for ZONE_DEVICE based > * addresses will always fail. Even the normal hotplugged > * memory will never have MEMBLOCK_NOMAP flag set in their > * memblock entries. Skip memblock search for all non early > @@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn) > return pfn_section_valid(ms, pfn); > } > #endif > - return memblock_is_map_memory(addr); > + return memblock_is_memory(addr); > } > EXPORT_SYMBOL(pfn_valid); > > ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 3/3] arm64: drop pfn_valid_within() and simplify pfn_valid() 2021-04-08 5:12 ` Anshuman Khandual (?) @ 2021-04-08 6:17 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-08 6:17 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm On Thu, Apr 08, 2021 at 10:42:43AM +0530, Anshuman Khandual wrote: > > On 4/7/21 10:56 PM, Mike Rapoport wrote: > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > The arm64's version of pfn_valid() differs from the generic because of two > > reasons: > > > > * Parts of the memory map are freed during boot. This makes it necessary to > > verify that there is actual physical memory that corresponds to a pfn > > which is done by querying memblock. > > > > * There are NOMAP memory regions. These regions are not mapped in the > > linear map and until the previous commit the struct pages representing > > these areas had default values. > > > > As the consequence of absence of the special treatment of NOMAP regions in > > the memory map it was necessary to use memblock_is_map_memory() in > > pfn_valid() and to have pfn_valid_within() aliased to pfn_valid() so that > > generic mm functionality would not treat a NOMAP page as a normal page. > > > > Since the NOMAP regions are now marked as PageReserved(), pfn walkers and > > the rest of core mm will treat them as unusable memory and thus > > pfn_valid_within() is no longer required at all and can be disabled by > > removing CONFIG_HOLES_IN_ZONE on arm64. > > But what about the memory map that are freed during boot (mentioned above). > Would not they still cause CONFIG_HOLES_IN_ZONE to be applicable and hence > pfn_valid_within() ? 
The CONFIG_HOLES_IN_ZONE name is misleading: actually, pfn_valid_within() is only required for holes within MAX_ORDER_NR_PAGES blocks (see the comment near the pfn_valid_within() definition in mmzone.h). The freeing of the memory map during boot avoids breaking MAX_ORDER blocks, and the holes for which the memory map is freed are always aligned at MAX_ORDER. AFAIU, the only case when there could be a hole in a MAX_ORDER block is when EFI/ACPI reserves memory for its use and this memory becomes NOMAP in the kernel. We still create struct pages for this memory, but they never get values other than defaults, so core mm has no idea that this memory should not be touched, hence the need for pfn_valid_within() aliased to pfn_valid() on arm64. > > pfn_valid() can be slightly simplified by replacing > > memblock_is_map_memory() with memblock_is_memory(). > > Just to understand this better, pfn_valid() will now return true for all > MEMBLOCK_NOMAP based memory but that is okay as core MM would still ignore > them as unusable memory for being PageReserved(). Right, pfn_valid() will return true for all memory, including MEMBLOCK_NOMAP. Since core mm deals with PageReserved() for memory used by the firmware, e.g. on x86, I don't see why it won't work on arm64. 
> > > > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> > > --- > > arch/arm64/Kconfig | 3 --- > > arch/arm64/mm/init.c | 4 ++-- > > 2 files changed, 2 insertions(+), 5 deletions(-) > > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > > index e4e1b6550115..58e439046d05 100644 > > --- a/arch/arm64/Kconfig > > +++ b/arch/arm64/Kconfig > > @@ -1040,9 +1040,6 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK > > def_bool y > > depends on NUMA > > > > -config HOLES_IN_ZONE > > - def_bool y > > - > > source "kernel/Kconfig.hz" > > > > config ARCH_SPARSEMEM_ENABLE > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > > index 258b1905ed4a..bb6dd406b1f0 100644 > > --- a/arch/arm64/mm/init.c > > +++ b/arch/arm64/mm/init.c > > @@ -243,7 +243,7 @@ int pfn_valid(unsigned long pfn) > > > > /* > > * ZONE_DEVICE memory does not have the memblock entries. > > - * memblock_is_map_memory() check for ZONE_DEVICE based > > + * memblock_is_memory() check for ZONE_DEVICE based > > * addresses will always fail. Even the normal hotplugged > > * memory will never have MEMBLOCK_NOMAP flag set in their > > * memblock entries. Skip memblock search for all non early > > @@ -254,7 +254,7 @@ int pfn_valid(unsigned long pfn) > > return pfn_section_valid(ms, pfn); > > } > > #endif > > - return memblock_is_map_memory(addr); > > + return memblock_is_memory(addr); > > } > > EXPORT_SYMBOL(pfn_valid); > > > > -- Sincerely yours, Mike. ^ permalink raw reply [flat|nested] 78+ messages in thread
* Re: [RFC/RFT PATCH 0/3] arm64: drop pfn_valid_within() and simplify pfn_valid() 2021-04-07 17:26 ` Mike Rapoport (?) @ 2021-04-08 5:19 ` Anshuman Khandual -1 siblings, 0 replies; 78+ messages in thread From: Anshuman Khandual @ 2021-04-08 5:19 UTC (permalink / raw) To: Mike Rapoport, linux-arm-kernel Cc: Ard Biesheuvel, Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm, James Morse Adding James here. + James Morse <james.morse@arm.com> On 4/7/21 10:56 PM, Mike Rapoport wrote: > From: Mike Rapoport <rppt@linux.ibm.com> > > Hi, > > These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire > pfn_valid_within() to 1. That would be really great for arm64 platform as it will save CPU cycles on many generic MM paths, given that our pfn_valid() has been expensive. > > The idea is to mark NOMAP pages as reserved in the memory map and restore Though I am not really sure, would that possibly be problematic for UEFI/EFI use cases as it might have just treated them as normal struct pages till now. > the intended semantics of pfn_valid() to designate availability of struct > page for a pfn. Right, that would be better as the current semantics is not ideal. > > With this the core mm will be able to cope with the fact that it cannot use > NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks > will be treated correctly even without the need for pfn_valid_within. > > The patches are only boot tested on qemu-system-aarch64 so I'd really > appreciate memory stress tests on real hardware. Did some preliminary memory stress tests on a guest with portions of memory marked as MEMBLOCK_NOMAP and did not find any obvious problem. But this might require some testing on real UEFI environment with firmware using MEMBLOCK_NOMAP memory to make sure that changing these struct pages to PageReserved() is safe. 
> > If this actually works we'll be one step closer to drop custom pfn_valid() > on arm64 altogether. Right, planning to rework and respin the RFC originally sent last month. https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
* Re: [RFC/RFT PATCH 0/3] arm64: drop pfn_valid_within() and simplify pfn_valid() 2021-04-08 5:19 ` Anshuman Khandual (?) @ 2021-04-08 6:27 ` Mike Rapoport -1 siblings, 0 replies; 78+ messages in thread From: Mike Rapoport @ 2021-04-08 6:27 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-arm-kernel, Ard Biesheuvel, Catalin Marinas, David Hildenbrand, Marc Zyngier, Mark Rutland, Mike Rapoport, Will Deacon, kvmarm, linux-kernel, linux-mm, James Morse On Thu, Apr 08, 2021 at 10:49:02AM +0530, Anshuman Khandual wrote: > Adding James here. > > + James Morse <james.morse@arm.com> > > On 4/7/21 10:56 PM, Mike Rapoport wrote: > > From: Mike Rapoport <rppt@linux.ibm.com> > > > > Hi, > > > > These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially hardwire > > pfn_valid_within() to 1. > > That would be really great for arm64 platform as it will save CPU cycles on > many generic MM paths, given that our pfn_valid() has been expensive. > > > > > The idea is to mark NOMAP pages as reserved in the memory map and restore > > Though I am not really sure, would that possibly be problematic for UEFI/EFI > use cases as it might have just treated them as normal struct pages till now. I don't think there should be a problem because now the struct pages for UEFI/ACPI never got to be used by the core mm. They were (rightfully) skipped by memblock_free_all() from one side and pfn_valid() and pfn_valid_within() return false for them in various pfn walkers from the other side. > > the intended semantics of pfn_valid() to designate availability of struct > > page for a pfn. > > Right, that would be better as the current semantics is not ideal. > > > > > With this the core mm will be able to cope with the fact that it cannot use > > NOMAP pages and the holes created by NOMAP ranges within MAX_ORDER blocks > > will be treated correctly even without the need for pfn_valid_within. 
> > The patches are only boot tested on qemu-system-aarch64 so I'd really > > appreciate memory stress tests on real hardware. > > Did some preliminary memory stress tests on a guest with portions of memory > marked as MEMBLOCK_NOMAP and did not find any obvious problem. But this might > require some testing on real UEFI environment with firmware using MEMBLOCK_NOMAP > memory to make sure that changing these struct pages to PageReserved() is safe. I surely have no access to such machines :) > > If this actually works we'll be one step closer to drop custom pfn_valid() > > on arm64 altogether. > > Right, planning to rework and respin the RFC originally sent last month. > > https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/ -- Sincerely yours, Mike.