linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/2] memblock: make memblock_find_in_range method private
@ 2021-08-12  6:59 Mike Rapoport
  2021-08-12  6:59 ` [PATCH v4 1/2] x86/mm: memory_map_top_down: remove spurious reservation of upper 2M Mike Rapoport
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Mike Rapoport @ 2021-08-12  6:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Borislav Petkov, Catalin Marinas,
	Christian Borntraeger, Dave Hansen, Frank Rowand,
	Greg Kroah-Hartman, Guenter Roeck, H. Peter Anvin,
	Heiko Carstens, Ingo Molnar, Kirill A. Shutemov, Len Brown,
	Marc Zyngier, Mike Rapoport, Mike Rapoport, Palmer Dabbelt,
	Paul Walmsley, Peter Zijlstra, Rafael J. Wysocki, Rob Herring,
	Russell King, Thomas Bogendoerfer, Thomas Gleixner,
	Vasily Gorbik, Will Deacon, devicetree, kvmarm, linux-acpi,
	linux-arm-kernel, linux-kernel, linux-mips, linux-mm,
	linux-riscv, linux-s390, x86

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

This is v4 of "memblock: make memblock_find_in_range method private" patch
that essentially replaces memblock_find_in_range() + memblock_reserve()
calls with equivalent calls to memblock_phys_alloc() and prevents usage of
memblock_find_in_range() outside memblock itself.

The patch uncovered an issue with top down memory mapping on x86 and this
version has a preparation patch that addresses this issue.

Guenter, I didn't add your Tested-by because the patch that addresses the
crashes differs from the one you've tested.

v4: 
* Add patch that prevents the crashes reported by Guenter Roeck on x86/i386
  on QEMU with 256M or 512M of memory and EFI boot enabled.
* Add Acked-by and Reviewed-by, thanks everybidy!

v3: https://lore.kernel.org/lkml/20210803064218.6611-1-rppt@kernel.org
* simplify check for exact crash kerenl allocation on arm, per Rob
* make crash_max unsigned long long on arm64, per Rob

v2: https://lore.kernel.org/lkml/20210802063737.22733-1-rppt@kernel.org
* don't change error message in arm::reserve_crashkernel(), per Russell

v1: https://lore.kernel.org/lkml/20210730104039.7047-1-rppt@kernel.org

Mike Rapoport (2):
  x86/mm: memory_map_top_down: remove spurious reservation of upper 2M
  memblock: make memblock_find_in_range method private

 arch/arm/kernel/setup.c           | 20 +++++---------
 arch/arm64/kvm/hyp/reserved_mem.c |  9 +++----
 arch/arm64/mm/init.c              | 36 ++++++++-----------------
 arch/mips/kernel/setup.c          | 14 +++++-----
 arch/riscv/mm/init.c              | 44 ++++++++++---------------------
 arch/s390/kernel/setup.c          | 10 ++++---
 arch/x86/kernel/aperture_64.c     |  5 ++--
 arch/x86/mm/init.c                | 27 +++++++------------
 arch/x86/mm/numa.c                |  5 ++--
 arch/x86/mm/numa_emulation.c      |  5 ++--
 arch/x86/realmode/init.c          |  2 +-
 drivers/acpi/tables.c             |  5 ++--
 drivers/base/arch_numa.c          |  5 +---
 drivers/of/of_reserved_mem.c      | 12 ++++++---
 include/linux/memblock.h          |  2 --
 mm/memblock.c                     |  2 +-
 16 files changed, 76 insertions(+), 127 deletions(-)


base-commit: ff1176468d368232b684f75e82563369208bc371
-- 
2.28.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v4 1/2] x86/mm: memory_map_top_down: remove spurious reservation of upper 2M
  2021-08-12  6:59 [PATCH v4 0/2] memblock: make memblock_find_in_range method private Mike Rapoport
@ 2021-08-12  6:59 ` Mike Rapoport
  2021-08-12  6:59 ` [PATCH v4 2/2] memblock: make memblock_find_in_range method private Mike Rapoport
  2021-08-12 14:40 ` [PATCH v4 0/2] " Guenter Roeck
  2 siblings, 0 replies; 4+ messages in thread
From: Mike Rapoport @ 2021-08-12  6:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Borislav Petkov, Catalin Marinas,
	Christian Borntraeger, Dave Hansen, Frank Rowand,
	Greg Kroah-Hartman, Guenter Roeck, H. Peter Anvin,
	Heiko Carstens, Ingo Molnar, Kirill A. Shutemov, Len Brown,
	Marc Zyngier, Mike Rapoport, Mike Rapoport, Palmer Dabbelt,
	Paul Walmsley, Peter Zijlstra, Rafael J. Wysocki, Rob Herring,
	Russell King, Thomas Bogendoerfer, Thomas Gleixner,
	Vasily Gorbik, Will Deacon, devicetree, kvmarm, linux-acpi,
	linux-arm-kernel, linux-kernel, linux-mips, linux-mm,
	linux-riscv, linux-s390, x86

From: Mike Rapoport <rppt@linux.ibm.com>

memory_map_top_down() function skips the upper 2M in the beginning and maps
them in the end because

	"xen has big range in reserved near end of ram, skip it at first"

It appears, though, that the root cause was that there was not enough
memory in the range [min_pfn_mapped, max_pfn_mapped] to allocate page
tables from that range in alloc_low_pages() because min_pfn_mapped didn't
reflect that actual minimal pfn that was already mapped but remained close
to the end of the range being mapped by memory_map_top_down().

This happened because min_pfn_mapped is updated at every iteration of the
loop in memory_map_top_down(), but there is another loop in
init_range_memory_mapping() that maps several regions below the current
min_pfn_mapped without updating this variable.

Move the update of min_pfn_mapped to add_pfn_range_mapped() next to the
update of max_pfn_mapped so that every time a new range is mapped both
limits will be updated accordingly, and remove the spurious "reservation"
of upper 2M.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/x86/mm/init.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 75ef19aa8903..87150961fdca 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -486,6 +486,7 @@ static void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
 	nr_pfn_mapped = clean_sort_range(pfn_mapped, E820_MAX_ENTRIES);
 
 	max_pfn_mapped = max(max_pfn_mapped, end_pfn);
+	min_pfn_mapped = min(min_pfn_mapped, start_pfn);
 
 	if (start_pfn < (1UL<<(32-PAGE_SHIFT)))
 		max_low_pfn_mapped = max(max_low_pfn_mapped,
@@ -605,20 +606,14 @@ static unsigned long __init get_new_step_size(unsigned long step_size)
 static void __init memory_map_top_down(unsigned long map_start,
 				       unsigned long map_end)
 {
-	unsigned long real_end, last_start;
-	unsigned long step_size;
-	unsigned long addr;
+	unsigned long real_end = ALIGN_DOWN(map_end, PMD_SIZE);
+	unsigned long last_start = real_end;
+	/* step_size need to be small so pgt_buf from BRK could cover it */
+	unsigned long step_size = PMD_SIZE;
 	unsigned long mapped_ram_size = 0;
 
-	/* xen has big range in reserved near end of ram, skip it at first.*/
-	addr = memblock_find_in_range(map_start, map_end, PMD_SIZE, PMD_SIZE);
-	real_end = addr + PMD_SIZE;
-
-	/* step_size need to be small so pgt_buf from BRK could cover it */
-	step_size = PMD_SIZE;
 	max_pfn_mapped = 0; /* will get exact value next */
 	min_pfn_mapped = real_end >> PAGE_SHIFT;
-	last_start = real_end;
 
 	/*
 	 * We start from the top (end of memory) and go to the bottom.
@@ -638,7 +633,6 @@ static void __init memory_map_top_down(unsigned long map_start,
 		mapped_ram_size += init_range_memory_mapping(start,
 							last_start);
 		last_start = start;
-		min_pfn_mapped = last_start >> PAGE_SHIFT;
 		if (mapped_ram_size >= step_size)
 			step_size = get_new_step_size(step_size);
 	}
-- 
2.28.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v4 2/2] memblock: make memblock_find_in_range method private
  2021-08-12  6:59 [PATCH v4 0/2] memblock: make memblock_find_in_range method private Mike Rapoport
  2021-08-12  6:59 ` [PATCH v4 1/2] x86/mm: memory_map_top_down: remove spurious reservation of upper 2M Mike Rapoport
@ 2021-08-12  6:59 ` Mike Rapoport
  2021-08-12 14:40 ` [PATCH v4 0/2] " Guenter Roeck
  2 siblings, 0 replies; 4+ messages in thread
From: Mike Rapoport @ 2021-08-12  6:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Borislav Petkov, Catalin Marinas,
	Christian Borntraeger, Dave Hansen, Frank Rowand,
	Greg Kroah-Hartman, Guenter Roeck, H. Peter Anvin,
	Heiko Carstens, Ingo Molnar, Kirill A. Shutemov, Len Brown,
	Marc Zyngier, Mike Rapoport, Mike Rapoport, Palmer Dabbelt,
	Paul Walmsley, Peter Zijlstra, Rafael J. Wysocki, Rob Herring,
	Russell King, Thomas Bogendoerfer, Thomas Gleixner,
	Vasily Gorbik, Will Deacon, devicetree, kvmarm, linux-acpi,
	linux-arm-kernel, linux-kernel, linux-mips, linux-mm,
	linux-riscv, linux-s390, x86, Kirill A . Shutemov,
	Rafael J . Wysocki, Russell King, Nick Kossifidis

From: Mike Rapoport <rppt@linux.ibm.com>

There are a lot of uses of memblock_find_in_range() along with
memblock_reserve() from the times memblock allocation APIs did not exist.

memblock_find_in_range() is the very core of memblock allocations, so any
future changes to its internal behaviour would mandate updates of all the
users outside memblock.

Replace the calls to memblock_find_in_range() with an equivalent calls to
memblock_phys_alloc() and memblock_phys_alloc_range() and make
memblock_find_in_range() private method of memblock.

This simplifies the callers, ensures that (unlikely) errors in
memblock_reserve() are handled and improves maintainability of
memblock_find_in_range().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shtuemov@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm/kernel/setup.c           | 20 +++++---------
 arch/arm64/kvm/hyp/reserved_mem.c |  9 +++----
 arch/arm64/mm/init.c              | 36 ++++++++-----------------
 arch/mips/kernel/setup.c          | 14 +++++-----
 arch/riscv/mm/init.c              | 44 ++++++++++---------------------
 arch/s390/kernel/setup.c          | 10 ++++---
 arch/x86/kernel/aperture_64.c     |  5 ++--
 arch/x86/mm/init.c                | 11 ++++----
 arch/x86/mm/numa.c                |  5 ++--
 arch/x86/mm/numa_emulation.c      |  5 ++--
 arch/x86/realmode/init.c          |  2 +-
 drivers/acpi/tables.c             |  5 ++--
 drivers/base/arch_numa.c          |  5 +---
 drivers/of/of_reserved_mem.c      | 12 ++++++---
 include/linux/memblock.h          |  2 --
 mm/memblock.c                     |  2 +-
 16 files changed, 71 insertions(+), 116 deletions(-)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index f97eb2371672..284a80c0b6e1 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -1012,31 +1012,25 @@ static void __init reserve_crashkernel(void)
 		unsigned long long lowmem_max = __pa(high_memory - 1) + 1;
 		if (crash_max > lowmem_max)
 			crash_max = lowmem_max;
-		crash_base = memblock_find_in_range(CRASH_ALIGN, crash_max,
-						    crash_size, CRASH_ALIGN);
+
+		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
+						       CRASH_ALIGN, crash_max);
 		if (!crash_base) {
 			pr_err("crashkernel reservation failed - No suitable area found.\n");
 			return;
 		}
 	} else {
+		unsigned long long crash_max = crash_base + crash_size;
 		unsigned long long start;
 
-		start = memblock_find_in_range(crash_base,
-					       crash_base + crash_size,
-					       crash_size, SECTION_SIZE);
-		if (start != crash_base) {
+		start = memblock_phys_alloc_range(crash_size, SECTION_SIZE,
+						  crash_base, crash_max);
+		if (!start) {
 			pr_err("crashkernel reservation failed - memory is in use.\n");
 			return;
 		}
 	}
 
-	ret = memblock_reserve(crash_base, crash_size);
-	if (ret < 0) {
-		pr_warn("crashkernel reservation failed - memory is in use (0x%lx)\n",
-			(unsigned long)crash_base);
-		return;
-	}
-
 	pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
 		(unsigned long)(crash_size >> 20),
 		(unsigned long)(crash_base >> 20),
diff --git a/arch/arm64/kvm/hyp/reserved_mem.c b/arch/arm64/kvm/hyp/reserved_mem.c
index d654921dd09b..578670e3f608 100644
--- a/arch/arm64/kvm/hyp/reserved_mem.c
+++ b/arch/arm64/kvm/hyp/reserved_mem.c
@@ -92,12 +92,10 @@ void __init kvm_hyp_reserve(void)
 	 * this is unmapped from the host stage-2, and fallback to PAGE_SIZE.
 	 */
 	hyp_mem_size = hyp_mem_pages << PAGE_SHIFT;
-	hyp_mem_base = memblock_find_in_range(0, memblock_end_of_DRAM(),
-					      ALIGN(hyp_mem_size, PMD_SIZE),
-					      PMD_SIZE);
+	hyp_mem_base = memblock_phys_alloc(ALIGN(hyp_mem_size, PMD_SIZE),
+					   PMD_SIZE);
 	if (!hyp_mem_base)
-		hyp_mem_base = memblock_find_in_range(0, memblock_end_of_DRAM(),
-						      hyp_mem_size, PAGE_SIZE);
+		hyp_mem_base = memblock_phys_alloc(hyp_mem_size, PAGE_SIZE);
 	else
 		hyp_mem_size = ALIGN(hyp_mem_size, PMD_SIZE);
 
@@ -105,7 +103,6 @@ void __init kvm_hyp_reserve(void)
 		kvm_err("Failed to reserve hyp memory\n");
 		return;
 	}
-	memblock_reserve(hyp_mem_base, hyp_mem_size);
 
 	kvm_info("Reserved %lld MiB at 0x%llx\n", hyp_mem_size >> 20,
 		 hyp_mem_base);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8490ed2917ff..0bffd2d1854f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -74,6 +74,7 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
+	unsigned long long crash_max = arm64_dma_phys_limit;
 	int ret;
 
 	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
@@ -84,33 +85,18 @@ static void __init reserve_crashkernel(void)
 
 	crash_size = PAGE_ALIGN(crash_size);
 
-	if (crash_base == 0) {
-		/* Current arm64 boot protocol requires 2MB alignment */
-		crash_base = memblock_find_in_range(0, arm64_dma_phys_limit,
-				crash_size, SZ_2M);
-		if (crash_base == 0) {
-			pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
-				crash_size);
-			return;
-		}
-	} else {
-		/* User specifies base address explicitly. */
-		if (!memblock_is_region_memory(crash_base, crash_size)) {
-			pr_warn("cannot reserve crashkernel: region is not memory\n");
-			return;
-		}
+	/* User specifies base address explicitly. */
+	if (crash_base)
+		crash_max = crash_base + crash_size;
 
-		if (memblock_is_region_reserved(crash_base, crash_size)) {
-			pr_warn("cannot reserve crashkernel: region overlaps reserved memory\n");
-			return;
-		}
-
-		if (!IS_ALIGNED(crash_base, SZ_2M)) {
-			pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
-			return;
-		}
+	/* Current arm64 boot protocol requires 2MB alignment */
+	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
+					       crash_base, crash_max);
+	if (!crash_base) {
+		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
+			crash_size);
+		return;
 	}
-	memblock_reserve(crash_base, crash_size);
 
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 23a140327a0b..f979adfd4fc2 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -452,8 +452,9 @@ static void __init mips_parse_crashkernel(void)
 		return;
 
 	if (crash_base <= 0) {
-		crash_base = memblock_find_in_range(CRASH_ALIGN, CRASH_ADDR_MAX,
-							crash_size, CRASH_ALIGN);
+		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
+						       CRASH_ALIGN,
+						       CRASH_ADDR_MAX);
 		if (!crash_base) {
 			pr_warn("crashkernel reservation failed - No suitable area found.\n");
 			return;
@@ -461,8 +462,9 @@ static void __init mips_parse_crashkernel(void)
 	} else {
 		unsigned long long start;
 
-		start = memblock_find_in_range(crash_base, crash_base + crash_size,
-						crash_size, 1);
+		start = memblock_phys_alloc_range(crash_size, 1,
+						  crash_base,
+						  crash_base + crash_size);
 		if (start != crash_base) {
 			pr_warn("Invalid memory region reserved for crash kernel\n");
 			return;
@@ -656,10 +658,6 @@ static void __init arch_mem_init(char **cmdline_p)
 	mips_reserve_vmcore();
 
 	mips_parse_crashkernel();
-#ifdef CONFIG_KEXEC
-	if (crashk_res.start != crashk_res.end)
-		memblock_reserve(crashk_res.start, resource_size(&crashk_res));
-#endif
 	device_tree_init();
 
 	/*
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index a14bf3910eec..88649337c568 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -812,38 +812,22 @@ static void __init reserve_crashkernel(void)
 
 	crash_size = PAGE_ALIGN(crash_size);
 
-	if (crash_base == 0) {
-		/*
-		 * Current riscv boot protocol requires 2MB alignment for
-		 * RV64 and 4MB alignment for RV32 (hugepage size)
-		 */
-		crash_base = memblock_find_in_range(search_start, search_end,
-						    crash_size, PMD_SIZE);
-
-		if (crash_base == 0) {
-			pr_warn("crashkernel: couldn't allocate %lldKB\n",
-				crash_size >> 10);
-			return;
-		}
-	} else {
-		/* User specifies base address explicitly. */
-		if (!memblock_is_region_memory(crash_base, crash_size)) {
-			pr_warn("crashkernel: requested region is not memory\n");
-			return;
-		}
-
-		if (memblock_is_region_reserved(crash_base, crash_size)) {
-			pr_warn("crashkernel: requested region is reserved\n");
-			return;
-		}
-
+	if (crash_base) {
+		search_start = crash_base;
+		search_end = crash_base + crash_size;
+	}
 
-		if (!IS_ALIGNED(crash_base, PMD_SIZE)) {
-			pr_warn("crashkernel: requested region is misaligned\n");
-			return;
-		}
+	/*
+	 * Current riscv boot protocol requires 2MB alignment for
+	 * RV64 and 4MB alignment for RV32 (hugepage size)
+	 */
+	crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
+					       search_start, search_end);
+	if (crash_base == 0) {
+		pr_warn("crashkernel: couldn't allocate %lldKB\n",
+			crash_size >> 10);
+		return;
 	}
-	memblock_reserve(crash_base, crash_size);
 
 	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index ff0f9e838916..3d9efee0f43c 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -626,8 +626,9 @@ static void __init reserve_crashkernel(void)
 			return;
 		}
 		low = crash_base ?: low;
-		crash_base = memblock_find_in_range(low, high, crash_size,
-						    KEXEC_CRASH_MEM_ALIGN);
+		crash_base = memblock_phys_alloc_range(crash_size,
+						       KEXEC_CRASH_MEM_ALIGN,
+						       low, high);
 	}
 
 	if (!crash_base) {
@@ -636,14 +637,15 @@ static void __init reserve_crashkernel(void)
 		return;
 	}
 
-	if (register_memory_notifier(&kdump_mem_nb))
+	if (register_memory_notifier(&kdump_mem_nb)) {
+		memblock_free(crash_base, crash_size);
 		return;
+	}
 
 	if (!OLDMEM_BASE && MACHINE_IS_VM)
 		diag10_range(PFN_DOWN(crash_base), PFN_DOWN(crash_size));
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
-	memblock_remove(crash_base, crash_size);
 	pr_info("Reserving %lluMB of memory at %lluMB "
 		"for crashkernel (System RAM: %luMB)\n",
 		crash_size >> 20, crash_base >> 20,
diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 294ed4392a0e..10562885f5fc 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -109,14 +109,13 @@ static u32 __init allocate_aperture(void)
 	 * memory. Unfortunately we cannot move it up because that would
 	 * make the IOMMU useless.
 	 */
-	addr = memblock_find_in_range(GART_MIN_ADDR, GART_MAX_ADDR,
-				      aper_size, aper_size);
+	addr = memblock_phys_alloc_range(aper_size, aper_size,
+					 GART_MIN_ADDR, GART_MAX_ADDR);
 	if (!addr) {
 		pr_err("Cannot allocate aperture memory hole [mem %#010lx-%#010lx] (%uKB)\n",
 		       addr, addr + aper_size - 1, aper_size >> 10);
 		return 0;
 	}
-	memblock_reserve(addr, aper_size);
 	pr_info("Mapping aperture over RAM [mem %#010lx-%#010lx] (%uKB)\n",
 		addr, addr + aper_size - 1, aper_size >> 10);
 	register_nosave_region(addr >> PAGE_SHIFT,
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 87150961fdca..18ed4332d707 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -26,6 +26,7 @@
 #include <asm/pti.h>
 #include <asm/text-patching.h>
 #include <asm/memtype.h>
+#include <xen/xen.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
@@ -127,14 +128,12 @@ __ref void *alloc_low_pages(unsigned int num)
 		unsigned long ret = 0;
 
 		if (min_pfn_mapped < max_pfn_mapped) {
-			ret = memblock_find_in_range(
+			ret = memblock_phys_alloc_range(
+					PAGE_SIZE * num, PAGE_SIZE,
 					min_pfn_mapped << PAGE_SHIFT,
-					max_pfn_mapped << PAGE_SHIFT,
-					PAGE_SIZE * num , PAGE_SIZE);
+					max_pfn_mapped << PAGE_SHIFT);
 		}
-		if (ret)
-			memblock_reserve(ret, PAGE_SIZE * num);
-		else if (can_use_brk_pgt)
+		if (!ret && can_use_brk_pgt)
 			ret = __pa(extend_brk(PAGE_SIZE * num, PAGE_SIZE));
 
 		if (!ret)
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index e94da744386f..a1b5c71099e6 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -376,15 +376,14 @@ static int __init numa_alloc_distance(void)
 	cnt++;
 	size = cnt * cnt * sizeof(numa_distance[0]);
 
-	phys = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
-				      size, PAGE_SIZE);
+	phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0,
+					 PFN_PHYS(max_pfn_mapped));
 	if (!phys) {
 		pr_warn("Warning: can't allocate distance table!\n");
 		/* don't retry until explicitly reset */
 		numa_distance = (void *)1LU;
 		return -ENOMEM;
 	}
-	memblock_reserve(phys, size);
 
 	numa_distance = __va(phys);
 	numa_distance_cnt = cnt;
diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index 87d77cc52f86..737491b13728 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -447,13 +447,12 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
 	if (numa_dist_cnt) {
 		u64 phys;
 
-		phys = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
-					      phys_size, PAGE_SIZE);
+		phys = memblock_phys_alloc_range(phys_size, PAGE_SIZE, 0,
+						 PFN_PHYS(max_pfn_mapped));
 		if (!phys) {
 			pr_warn("NUMA: Warning: can't allocate copy of distance table, disabling emulation\n");
 			goto no_emu;
 		}
-		memblock_reserve(phys, phys_size);
 		phys_dist = __va(phys);
 
 		for (i = 0; i < numa_dist_cnt; i++)
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 6534c92d0f83..31b5856010cb 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -28,7 +28,7 @@ void __init reserve_real_mode(void)
 	WARN_ON(slab_is_available());
 
 	/* Has to be under 1M so we can execute real-mode AP code. */
-	mem = memblock_find_in_range(0, 1<<20, size, PAGE_SIZE);
+	mem = memblock_phys_alloc_range(size, PAGE_SIZE, 0, 1<<20);
 	if (!mem)
 		pr_info("No sub-1M memory is available for the trampoline\n");
 	else
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index a37a1532a575..f9383736fa0f 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -583,8 +583,8 @@ void __init acpi_table_upgrade(void)
 	}
 
 	acpi_tables_addr =
-		memblock_find_in_range(0, ACPI_TABLE_UPGRADE_MAX_PHYS,
-				       all_tables_size, PAGE_SIZE);
+		memblock_phys_alloc_range(all_tables_size, PAGE_SIZE,
+					  0, ACPI_TABLE_UPGRADE_MAX_PHYS);
 	if (!acpi_tables_addr) {
 		WARN_ON(1);
 		return;
@@ -599,7 +599,6 @@ void __init acpi_table_upgrade(void)
 	 * Both memblock_reserve and e820__range_add (via arch_reserve_mem_area)
 	 * works fine.
 	 */
-	memblock_reserve(acpi_tables_addr, all_tables_size);
 	arch_reserve_mem_area(acpi_tables_addr, all_tables_size);
 
 	/*
diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index 4cc4e117727d..46c503486e96 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -279,13 +279,10 @@ static int __init numa_alloc_distance(void)
 	int i, j;
 
 	size = nr_node_ids * nr_node_ids * sizeof(numa_distance[0]);
-	phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
-				      size, PAGE_SIZE);
+	phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0, PFN_PHYS(max_pfn));
 	if (WARN_ON(!phys))
 		return -ENOMEM;
 
-	memblock_reserve(phys, size);
-
 	numa_distance = __va(phys);
 	numa_distance_cnt = nr_node_ids;
 
diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index fd3964d24224..59c1390cdf42 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -33,18 +33,22 @@ static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
 	phys_addr_t *res_base)
 {
 	phys_addr_t base;
+	int err = 0;
 
 	end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
 	align = !align ? SMP_CACHE_BYTES : align;
-	base = memblock_find_in_range(start, end, size, align);
+	base = memblock_phys_alloc_range(size, align, start, end);
 	if (!base)
 		return -ENOMEM;
 
 	*res_base = base;
-	if (nomap)
-		return memblock_mark_nomap(base, size);
+	if (nomap) {
+		err = memblock_mark_nomap(base, size);
+		if (err)
+			memblock_free(base, size);
+	}
 
-	return memblock_reserve(base, size);
+	return err;
 }
 
 /*
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 4a53c3ca86bd..b066024c62e3 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -99,8 +99,6 @@ void memblock_discard(void);
 static inline void memblock_discard(void) {}
 #endif
 
-phys_addr_t memblock_find_in_range(phys_addr_t start, phys_addr_t end,
-				   phys_addr_t size, phys_addr_t align);
 void memblock_allow_resize(void);
 int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
 int memblock_add(phys_addr_t base, phys_addr_t size);
diff --git a/mm/memblock.c b/mm/memblock.c
index de7b553baa50..28a813d9e955 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -315,7 +315,7 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
  * Return:
  * Found address on success, 0 on failure.
  */
-phys_addr_t __init_memblock memblock_find_in_range(phys_addr_t start,
+static phys_addr_t __init_memblock memblock_find_in_range(phys_addr_t start,
 					phys_addr_t end, phys_addr_t size,
 					phys_addr_t align)
 {
-- 
2.28.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v4 0/2] memblock: make memblock_find_in_range method private
  2021-08-12  6:59 [PATCH v4 0/2] memblock: make memblock_find_in_range method private Mike Rapoport
  2021-08-12  6:59 ` [PATCH v4 1/2] x86/mm: memory_map_top_down: remove spurious reservation of upper 2M Mike Rapoport
  2021-08-12  6:59 ` [PATCH v4 2/2] memblock: make memblock_find_in_range method private Mike Rapoport
@ 2021-08-12 14:40 ` Guenter Roeck
  2 siblings, 0 replies; 4+ messages in thread
From: Guenter Roeck @ 2021-08-12 14:40 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Albert Ou, Andy Lutomirski, Borislav Petkov,
	Catalin Marinas, Christian Borntraeger, Dave Hansen,
	Frank Rowand, Greg Kroah-Hartman, H. Peter Anvin, Heiko Carstens,
	Ingo Molnar, Kirill A. Shutemov, Len Brown, Marc Zyngier,
	Mike Rapoport, Palmer Dabbelt, Paul Walmsley, Peter Zijlstra,
	Rafael J. Wysocki, Rob Herring, Russell King,
	Thomas Bogendoerfer, Thomas Gleixner, Vasily Gorbik, Will Deacon,
	devicetree, kvmarm, linux-acpi, linux-arm-kernel, linux-kernel,
	linux-mips, linux-mm, linux-riscv, linux-s390, x86

Mike,

On Thu, Aug 12, 2021 at 09:59:05AM +0300, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> Hi,
> 
> This is v4 of "memblock: make memblock_find_in_range method private" patch
> that essentially replaces memblock_find_in_range() + memblock_reserve()
> calls with equivalent calls to memblock_phys_alloc() and prevents usage of
> memblock_find_in_range() outside memblock itself.
> 
> The patch uncovered an issue with top down memory mapping on x86 and this
> version has a preparation patch that addresses this issue.
> 
> Guenter, I didn't add your Tested-by because the patch that addresses the
> crashes differs from the one you've tested.
> 

Unfortunately I am still seeing crashes.

1G of memory, x86_64:

[    0.000000] efi: EFI v2.70 by EDK II
[    0.000000] efi: SMBIOS=0x3fbcc000 ACPI=0x3fbfa000 ACPI 2.0=0x3fbfa014 MEMATTR=0x3f229018 
[    0.000000] SMBIOS 2.8 present.
[    0.000000] DMI: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3792.807 MHz processor
[    0.001816] last_pfn = 0x3ff50 max_arch_pfn = 0x400000000
[    0.002595] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
[    0.022989] Using GB pages for direct mapping
[    0.025601] Kernel panic - not syncing: alloc_low_pages: can not alloc memory
[    0.025910] CPU: 0 PID: 0 Comm: swapper Not tainted 5.14.0-rc5+ #1
[    0.026133] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[    0.026462] Call Trace:
[    0.026942]  ? dump_stack_lvl+0x57/0x7d
[    0.027475]  ? panic+0x10a/0x2de
[    0.027600]  ? alloc_low_pages+0x117/0x156
[    0.027704]  ? phys_pmd_init+0x234/0x342
[    0.027817]  ? phys_pud_init+0x171/0x337
[    0.027926]  ? __kernel_physical_mapping_init+0xec/0x276
[    0.028062]  ? init_memory_mapping+0x1ea/0x2ca
[    0.028199]  ? init_range_memory_mapping+0xdf/0x12e
[    0.028326]  ? init_mem_mapping+0x1e9/0x261
[    0.028432]  ? setup_arch+0x5ff/0xb6d
[    0.028535]  ? start_kernel+0x71/0x6b4
[    0.028636]  ? secondary_startup_64_no_verify+0xc2/0xcb
[    0.029479] ---[ end Kernel panic - not syncing: alloc_low_pages: can not alloc memory ]---

Complete log:
https://kerneltests.org/builders/qemu-x86_64-testing/builds/67/steps/qemubuildcommand/logs/stdio

x86, default memory size, all efi boots affected:

[    0.025676] BUG: unable to handle page fault for address: cf3c1000
[    0.025932] #PF: supervisor write access in kernel mode
[    0.026022] #PF: error_code(0x0002) - not-present page
[    0.026122] *pde = 00000000
[    0.026308] Oops: 0002 [#1] SMP
[    0.026468] CPU: 0 PID: 0 Comm: swapper Not tainted 5.14.0-rc5+ #1
[    0.026616] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[    0.026848] EIP: alloc_low_pages+0xa0/0x13f
[    0.027355] Code: 00 74 77 a3 cc ba 62 ca 8b 45 f0 8d 90 00 00 0c 00 31 c0 c1 e2 0c 85 f6 74 16 89 d7 b9 00 04 00 00 83 c3 01 81 c2 00 10 00 00 <f3> ab 39 f3 75 ea 8b 45 f0 8d 65 f4 5b 5e c1 e0 0c 5f 5d 2d 00 00
[    0.027802] EAX: 00000000 EBX: 00000001 ECX: 00000400 EDX: cf3c2000
[    0.027903] ESI: 00000001 EDI: cf3c1000 EBP: ca389e28 ESP: ca389e18
[    0.028006] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00200086
[    0.028125] CR0: 80050033 CR2: cf3c1000 CR3: 0a69f000 CR4: 00040690
[    0.028287] Call Trace:
[    0.028603]  one_page_table_init+0x15/0x6d
[    0.028751]  kernel_physical_mapping_init+0xdd/0x19b
[    0.028839]  init_memory_mapping+0x146/0x1f1
[    0.028921]  init_range_memory_mapping+0xfe/0x144
[    0.029001]  init_mem_mapping+0x145/0x185
[    0.029066]  setup_arch+0x5ff/0xa75
[    0.029128]  ? vprintk+0x4c/0x100
[    0.029187]  start_kernel+0x66/0x5ba
[    0.029246]  ? set_intr_gate+0x42/0x55
[    0.029306]  ? early_idt_handler_common+0x44/0x44
[    0.029380]  i386_start_kernel+0x43/0x45
[    0.029441]  startup_32_smp+0x161/0x164
[    0.029567] Modules linked in:
[    0.029776] CR2: 00000000cf3c1000
[    0.030406] random: get_random_bytes called from oops_exit+0x35/0x60 with crng_init=0
[    0.031121] ---[ end trace 544692cd05e387e2 ]---
[    0.031357] EIP: alloc_low_pages+0xa0/0x13f
[    0.031427] Code: 00 74 77 a3 cc ba 62 ca 8b 45 f0 8d 90 00 00 0c 00 31 c0 c1 e2 0c 85 f6 74 16 89 d7 b9 00 04 00 00 83 c3 01 81 c2 00 10 00 00 <f3> ab 39 f3 75 ea 8b 45 f0 8d 65 f4 5b 5e c1 e0 0c 5f 5d 2d 00 00
[    0.031698] EAX: 00000000 EBX: 00000001 ECX: 00000400 EDX: cf3c2000
[    0.031787] ESI: 00000001 EDI: cf3c1000 EBP: ca389e28 ESP: ca389e18
[    0.031876] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00200086
[    0.031972] CR0: 80050033 CR2: cf3c1000 CR3: 0a69f000 CR4: 00040690
[    0.032198] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.032521] ---[ end Kernel panic - not syncing: Attempted to kill the idle
task! ]--

Complete log: 
https://kerneltests.org/builders/qemu-x86-testing/builds/65/steps/qemubuildcommand/logs/stdio

Guenter

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-08-12 14:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-12  6:59 [PATCH v4 0/2] memblock: make memblock_find_in_range method private Mike Rapoport
2021-08-12  6:59 ` [PATCH v4 1/2] x86/mm: memory_map_top_down: remove spurious reservation of upper 2M Mike Rapoport
2021-08-12  6:59 ` [PATCH v4 2/2] memblock: make memblock_find_in_range method private Mike Rapoport
2021-08-12 14:40 ` [PATCH v4 0/2] " Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).