linux-kernel.vger.kernel.org archive mirror
* [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump
@ 2022-01-24  8:47 Zhen Lei
  2022-01-24  8:47 ` [PATCH v20 1/5] arm64: Use insert_resource() to simplify code Zhen Lei
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: Zhen Lei @ 2022-01-24  8:47 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

arm64 kdump currently has the following issues:
1. crashkernel=X reserves the crash kernel memory below 4G, which
fails when there is not enough low memory.
2. If the crash kernel memory is reserved above 4G, the crash dump
kernel fails to boot because no low memory is available for
allocation.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries low allocation in the DMA zone and falls back to
high allocation if that fails.

"crashkernel=X,high" can also be used to select a high region above the
DMA zone; the kernel then also tries to allocate at least 256M of low
memory in the DMA zone automatically. "crashkernel=Y,low" can be used to
specify the size of that low memory.

When the crash kernel is reserved in high memory, some low memory is
also reserved for crash dump kernel devices, so there may be two regions
reserved for the crash dump kernel.
To distinguish the low region from the high region without affecting the
use of existing kexec-tools, name the low region "Crash kernel (low)"
and pass it to the crash dump kernel by reusing the DT property
"linux,usable-memory-range". The low memory region is the last range of
"linux,usable-memory-range", which keeps compatibility with existing
user space and older kdump kernels.

In addition, kexec-tools needs to be modified:
arm64: support more than one crash kernel regions (see [1])

The documentation of the DT property 'linux,usable-memory-range' is also updated:
schemas: update 'linux,usable-memory-range' node schema (see [2])


Changes since [v19]:
1. Temporarily stop making reserve_crashkernel[_low]() generic. Many details
   need to be considered, which can take a long time. Making the functions
   generic adds no new functionality and does not improve performance; it is
   just a cleanup. Stripping it out and leaving it for later patches lets this
   series concentrate on the main functional changes.
2. Use insert_resource() instead of request_resource(). This not only simplifies
   the code, but also reduces the differences between the arm64 and x86 implementations.
3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") does for
   x86, kexec-tools can also be extended for arm64, and that change has already been applied. See:
   https://www.spinics.net/lists/kexec/msg28284.html

Thank you very much, Borislav Petkov, for so many valuable comments.



Changes since [v17]: v17 --> v19
1. Patch 0001-0004
   Introduce generic parse_crashkernel_high_low() to bring the parsing of
   "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
   then use it instead of the call to parse_crashkernel_{high|low}(). Two
   confusing parameters of parse_crashkernel_{high|low}() are deleted.

   I previously sent these four patches separately:
   [1] https://lkml.org/lkml/2021/12/25/40
2. Patch 0005-0009
   Introduce generic reserve_crashkernel_mem[_low](), the implementation of
   these two functions is based on function reserve_crashkernel[_low]() in
   arch/x86/kernel/setup.c. There is no functional change for x86.
   1) The check position of xen_pv_domain() does not change.
   2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.

   To avoid compilation problems on other architectures, patch 0004 moves
   the definition of the global variables crashk[_low]_res from kexec_core.c
   to crash_core.c and provides default definitions for all macros involved.
   A particular platform can redefine these macros to override the default
   values.
3. 0010, only one line of comment was changed.
4. 0011
   1) crashk_low_res may also be valid reserved memory and should be checked
      in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
   2) Drop memblock_mark_nomap() for crashk_low_res, because of:
      2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
   3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
      85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
5. 0012, slightly rebased, because the following patch is applied in advance. 
   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
6. 0013, no change.

Others:
1. Drop the addition of ARCH_WANT_RESERVE_CRASH_KERNEL.
2. When allocating crash low memory, the start address still starts from 0.
   low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
3. Drop the change of (1ULL << 32) to CRASH_ADDR_LOW_MAX.
4. Ensure the check position of xen_pv_domain() does not change.
5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
6. Update description.



Changes since [v16]
- Because there are no functional changes in this version, add
  "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
- Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
- Update patch 9 based on the review comments of Rob Herring
- As suggested by Catalin Marinas, merge the implementation of
  ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
  contents of X86 and ARM64 do not overlap, and reduce unnecessary
  temporary differences.

Changes since [v15]
-  Aggregate the processing of "linux,usable-memory-range" into one function.
   Only patch 9-10 have been updated.

Changes since [v14]
- Restore the requirement that the CrashKernel memory regions on X86
  only need 1 MiB alignment.
- Combine patches 5 and 6 in v14 into one. The compilation warning fixed
  by patch 6 was introduced by patch 5 in v14.
- As with crashk_res, crashk_low_res is also processed by
  crash_exclude_mem_range() in patch 7.
- Because commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
  has removed the architecture-specific code, extend the property "linux,usable-memory-range"
  in the platform-agnostic FDT core code. See patch 9.
- Discard the x86 description update in the document, because the description
  has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
- Change "arm64" to "ARM64" in Doc.


Changes since [v13]
- Rebased on top of 5.11-rc5.
- Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
Since the reserve_crashkernel[_low]() implementations are quite similar on
other architectures, add CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL to
arch/Kconfig and select it on X86 and ARM64.
- Some minor cleanup.

Changes since [v12]
- Rebased on top of 5.10-rc1.
- Keep CRASH_ALIGN as 16M suggested by Dave.
- Drop patch "kdump: add threshold for the required memory".
- Add Tested-by from John.

Changes since [v11]
- Rebased on top of 5.9-rc4.
- Make the function reserve_crashkernel() of x86 generic.
As suggested by Catalin, make the function reserve_crashkernel() of x86 generic
and have arm64 use the generic version to reimplement crashkernel=X.

Changes since [v10]
- Reimplement crashkernel=X as suggested by Catalin. Many thanks to Catalin.

Changes since [v9]
- Patch 1 add Acked-by from Dave.
- Update patch 5 according to Dave's comments.
- Update chosen schema.

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump broken with ZONE_DMA reintroduced.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
As suggested by Dave, and after some testing, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
suggested by Arnd.
- Add Tested-by from John and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified at the same time, first reserve the specified
amount of low memory for crash dump kernel devices and then reserve memory above 4G.
In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
pass to crash dump kernel by DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() I added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions. In the two-region case,
we cap the memory range [min(regs[*].start), max(regs[*].end)]
and then remove the memory range in the middle.

[1]: https://www.spinics.net/lists/kexec/msg28226.html
[2]: https://github.com/robherring/dt-schema/pull/19 
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213
[v9]: https://lkml.org/lkml/2020/6/28/73
[v10]: https://lkml.org/lkml/2020/7/2/1443
[v11]: https://lkml.org/lkml/2020/8/1/150
[v12]: https://lkml.org/lkml/2020/9/7/1037
[v13]: https://lkml.org/lkml/2020/10/31/34
[v14]: https://lkml.org/lkml/2021/1/30/53
[v15]: https://lkml.org/lkml/2021/10/19/1405
[v16]: https://lkml.org/lkml/2021/11/23/435
[v17]: https://lkml.org/lkml/2021/12/10/38
[v18]: https://lkml.org/lkml/2021/12/22/424
[v19]: https://lkml.org/lkml/2021/12/28/203


Chen Zhou (4):
  arm64: kdump: introduce some macros for crash kernel reservation
  arm64: kdump: reimplement crashkernel=X
  of: fdt: Add memory for devices by DT property
    "linux,usable-memory-range"
  kdump: update Documentation about crashkernel

Zhen Lei (1):
  arm64: Use insert_resource() to simplify code

 Documentation/admin-guide/kdump/kdump.rst     | 11 ++-
 .../admin-guide/kernel-parameters.txt         | 11 ++-
 arch/arm64/kernel/machine_kexec.c             |  9 ++-
 arch/arm64/kernel/machine_kexec_file.c        | 12 ++-
 arch/arm64/kernel/setup.c                     | 17 +---
 arch/arm64/mm/init.c                          | 80 +++++++++++++++++--
 drivers/of/fdt.c                              | 33 +++++---
 7 files changed, 134 insertions(+), 39 deletions(-)

-- 
2.25.1



* [PATCH v20 1/5] arm64: Use insert_resource() to simplify code
  2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
@ 2022-01-24  8:47 ` Zhen Lei
  2022-01-26 15:16   ` john.p.donnelly
  2022-02-08  1:43   ` Baoquan He
  2022-01-24  8:47 ` [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation Zhen Lei
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 30+ messages in thread
From: Zhen Lei @ 2022-01-24  8:47 UTC (permalink / raw)

insert_resource() traverses the subtree layer by layer from the root node
until a proper location is found. Compared with request_resource(), the
parent node does not need to be determined in advance.

In addition, move the insertion of node 'crashk_res' into function
reserve_crashkernel() to make the associated code close together.
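
The placement behavior described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel's implementation: struct toy_res and toy_insert() are hypothetical
names, the tree is assumed to keep siblings sorted and disjoint, and
locking and naming are ignored. It shows why the caller does not need to
pick the parent, and why a node inserted early (such as the kernel code
range) can still end up below a larger range inserted later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy resource node; illustrative only, not the kernel's struct resource. */
struct toy_res {
	uint64_t start, end;	/* inclusive range */
	struct toy_res *parent, *child, *sibling;
};

/*
 * Insert 'res' somewhere below 'root'. Returns 0 on success and -1 if
 * 'res' partially overlaps an existing node.
 */
static int toy_insert(struct toy_res *root, struct toy_res *res)
{
	struct toy_res *parent = root;

	assert(res->start <= res->end && res->child == NULL);

	for (;;) {
		struct toy_res *c, *container = NULL;
		struct toy_res **link, **tail = &res->child;

		/* Look for a child of 'parent' that fully contains 'res'. */
		for (c = parent->child; c; c = c->sibling) {
			if (c->end < res->start || c->start > res->end)
				continue;	/* disjoint */
			if (c->start <= res->start && c->end >= res->end) {
				container = c;
				break;
			}
			if (!(res->start <= c->start && res->end >= c->end))
				return -1;	/* partial overlap */
		}
		if (container) {
			parent = container;	/* descend one layer */
			continue;
		}

		/* Skip siblings that lie entirely below 'res'. */
		link = &parent->child;
		while (*link && (*link)->end < res->start)
			link = &(*link)->sibling;

		/* Adopt every sibling that 'res' fully covers. */
		while (*link && (*link)->start <= res->end) {
			c = *link;
			*link = c->sibling;
			c->sibling = NULL;
			c->parent = res;
			*tail = c;
			tail = &c->sibling;
		}

		/* Link 'res' into the sorted sibling list. */
		res->sibling = *link;
		*link = res;
		res->parent = parent;
		return 0;
	}
}
```

In this model, inserting a kernel code range first and a covering RAM
range second leaves the code range as a child of the RAM range, and a
crash kernel range inserted afterwards lands under the RAM range as well.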

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/kernel/setup.c | 17 +++--------------
 arch/arm64/mm/init.c      |  1 +
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index f70573928f1bff0..a81efcc359e4e78 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
 	kernel_code.end     = __pa_symbol(__init_begin - 1);
 	kernel_data.start   = __pa_symbol(_sdata);
 	kernel_data.end     = __pa_symbol(_end - 1);
+	insert_resource(&iomem_resource, &kernel_code);
+	insert_resource(&iomem_resource, &kernel_data);
 
 	num_standard_resources = memblock.memory.cnt;
 	res_size = num_standard_resources * sizeof(*standard_resources);
@@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
 			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
 		}
 
-		request_resource(&iomem_resource, res);
-
-		if (kernel_code.start >= res->start &&
-		    kernel_code.end <= res->end)
-			request_resource(res, &kernel_code);
-		if (kernel_data.start >= res->start &&
-		    kernel_data.end <= res->end)
-			request_resource(res, &kernel_data);
-#ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
-		if (crashk_res.end && crashk_res.start >= res->start &&
-		    crashk_res.end <= res->end)
-			request_resource(res, &crashk_res);
-#endif
+		insert_resource(&iomem_resource, res);
 	}
 }
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index db63cc885771a52..90f276d46b93bc6 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -109,6 +109,7 @@ static void __init reserve_crashkernel(void)
 	kmemleak_ignore_phys(crash_base);
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
+	insert_resource(&iomem_resource, &crashk_res);
 }
 #else
 static void __init reserve_crashkernel(void)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation
  2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
  2022-01-24  8:47 ` [PATCH v20 1/5] arm64: Use insert_resource() to simplify code Zhen Lei
@ 2022-01-24  8:47 ` Zhen Lei
  2022-01-26 15:17   ` john.p.donnelly
  2022-02-11 10:39   ` Baoquan He
  2022-01-24  8:47 ` [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 30+ messages in thread
From: Zhen Lei @ 2022-01-24  8:47 UTC (permalink / raw)

From: Chen Zhou <chenzhou10@huawei.com>

Introduce macro CRASH_ALIGN for alignment, macro CRASH_ADDR_LOW_MAX
for the upper bound of low crash memory, and macro CRASH_ADDR_HIGH_MAX
for the upper bound of high crash memory, and use these macros instead
of the open-coded values.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
---
 arch/arm64/mm/init.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 90f276d46b93bc6..6c653a2c7cff052 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -65,6 +65,12 @@ EXPORT_SYMBOL(memstart_addr);
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
+/* Current arm64 boot protocol requires 2MB alignment */
+#define CRASH_ALIGN		SZ_2M
+
+#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
+#define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -75,7 +81,7 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
-	unsigned long long crash_max = arm64_dma_phys_limit;
+	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
 	int ret;
 
 	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
@@ -90,8 +96,7 @@ static void __init reserve_crashkernel(void)
 	if (crash_base)
 		crash_max = crash_base + crash_size;
 
-	/* Current arm64 boot protocol requires 2MB alignment */
-	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
+	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
 					       crash_base, crash_max);
 	if (!crash_base) {
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
-- 
2.25.1



* [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
  2022-01-24  8:47 ` [PATCH v20 1/5] arm64: Use insert_resource() to simplify code Zhen Lei
  2022-01-24  8:47 ` [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation Zhen Lei
@ 2022-01-24  8:47 ` Zhen Lei
  2022-01-26 15:18   ` john.p.donnelly
                     ` (2 more replies)
  2022-01-24  8:47 ` [PATCH v20 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 30+ messages in thread
From: Zhen Lei @ 2022-01-24  8:47 UTC (permalink / raw)

From: Chen Zhou <chenzhou10@huawei.com>

arm64 kdump currently has the following issues:
1. crashkernel=X reserves the crash kernel memory below 4G, which
fails when there is not enough low memory.
2. If the crash kernel memory is reserved above 4G, the crash dump
kernel fails to boot because no low memory is available for
allocation.

To solve these issues, change the behavior of crashkernel=X and
introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
in the DMA zone and falls back to high allocation if that fails.
"crashkernel=X,high" can also be used to select a region above the DMA
zone; the kernel then also tries to allocate at least 256M in the DMA
zone automatically. "crashkernel=Y,low" can be used to specify the size
of that low memory.
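
The plain crashkernel=X policy can be sketched as a user-space model.
This is an illustration under stated assumptions rather than the code in
this patch: toy_alloc() is a hypothetical first-fit stand-in for
memblock_phys_alloc_range() (which it does not reproduce), the free
memory is a small array of ranges, the ",high"/",low" command-line
parsing and the fixed-base case are left out, and a failed low
reservation is only reported instead of freeing the high region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ALIGN	(2ULL << 20)	/* mirrors CRASH_ALIGN (SZ_2M) */
#define TOY_4G		(4ULL << 30)
#define TOY_LOW_DEFAULT	(256ULL << 20)	/* default low reservation */

struct toy_range { uint64_t base, size; };	/* one free memory range */

struct toy_result {
	int ok;
	uint64_t base, low_base, low_size;
};

/* First-fit allocation within [start, end); returns 0 on failure. */
static uint64_t toy_alloc(struct toy_range *mem, size_t n, uint64_t size,
			  uint64_t start, uint64_t end)
{
	size_t i;

	assert(size != 0);
	for (i = 0; i < n; i++) {
		uint64_t lo = mem[i].base > start ? mem[i].base : start;
		uint64_t hi = mem[i].base + mem[i].size;

		if (hi > end)
			hi = end;
		lo = (lo + TOY_ALIGN - 1) & ~(TOY_ALIGN - 1);
		if (lo >= hi || hi - lo < size)
			continue;
		/* Consume the range up to the end of the allocation. */
		mem[i].size = mem[i].base + mem[i].size - (lo + size);
		mem[i].base = lo + size;
		return lo;
	}
	return 0;
}

static struct toy_result toy_reserve(struct toy_range *mem, size_t n,
				     uint64_t size, uint64_t dma_limit)
{
	struct toy_result r = { 0, 0, 0, 0 };

	/* crashkernel=X: try below the DMA limit first. */
	r.base = toy_alloc(mem, n, size, 0, dma_limit);
	if (!r.base)	/* low allocation failed, fall back to any address */
		r.base = toy_alloc(mem, n, size, 0, UINT64_MAX);
	if (!r.base)
		return r;

	/* A region at or above 4G needs separate low memory for devices. */
	if (r.base >= TOY_4G) {
		r.low_base = toy_alloc(mem, n, TOY_LOW_DEFAULT, 0, dma_limit);
		if (!r.low_base)
			return r;	/* the patch frees the high region here */
		r.low_size = TOY_LOW_DEFAULT;
	}
	r.ok = 1;
	return r;
}
```

With 512M free at 2G, 4G free at 8G and a DMA limit of 4G, a 1G request
in this model ends up at 8G with a 256M companion region at 2G, while a
128M request stays entirely in low memory.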

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/kernel/machine_kexec.c      |  9 +++-
 arch/arm64/kernel/machine_kexec_file.c | 12 ++++-
 arch/arm64/mm/init.c                   | 68 ++++++++++++++++++++++++--
 3 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index e16b248699d5c3c..19c2d487cb08feb 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
 
 	/* in reserved memory? */
 	addr = __pfn_to_phys(pfn);
-	if ((addr < crashk_res.start) || (crashk_res.end < addr))
-		return false;
+	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
+		if (!crashk_low_res.end)
+			return false;
+
+		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
+			return false;
+	}
 
 	if (!kexec_crash_image)
 		return true;
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 59c648d51848886..889951291cc0f9c 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
 
 	/* Exclude crashkernel region */
 	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	if (crashk_low_res.end) {
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		if (ret)
+			goto out;
+	}
 
-	if (!ret)
-		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
+	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
 
+out:
 	kfree(cmem);
 	return ret;
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6c653a2c7cff052..a5d43feac0d7d96 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
 #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
 
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+	unsigned long long low_base;
+
+	/* passed with crashkernel=0,low ? */
+	if (!low_size)
+		return 0;
+
+	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
+	if (!low_base) {
+		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+		return -ENOMEM;
+	}
+
+	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
+		low_base, low_base + low_size, low_size >> 20);
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end   = low_base + low_size - 1;
+	insert_resource(&iomem_resource, &crashk_low_res);
+
+	return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
+	unsigned long long crash_low_size = SZ_256M;
 	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
 	int ret;
+	bool fixed_base;
+	char *cmdline = boot_command_line;
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
-	/* no crashkernel= or invalid value specified */
-	if (ret || !crash_size)
-		return;
+	if (ret || !crash_size) {
+		unsigned long long low_size;
 
+		/* crashkernel=X,high */
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+
+		/* crashkernel=X,low */
+		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
+		if (!ret)
+			crash_low_size = low_size;
+
+		crash_max = CRASH_ADDR_HIGH_MAX;
+	}
+
+	fixed_base = !!crash_base;
 	crash_size = PAGE_ALIGN(crash_size);
 
 	/* User specifies base address explicitly. */
 	if (crash_base)
 		crash_max = crash_base + crash_size;
 
+retry:
 	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
 					       crash_base, crash_max);
 	if (!crash_base) {
+		/*
+		 * Attempt to fully allocate low memory failed, fall back
+		 * to high memory, the minimum required low memory will be
+		 * reserved later.
+		 */
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			crash_max = CRASH_ADDR_HIGH_MAX;
+			goto retry;
+		}
+
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
 		return;
 	}
 
+	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
+		memblock_phys_free(crash_base, crash_size);
+		return;
+	}
+
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
 	 * map. Inform kmemleak so that it won't try to access it.
 	 */
 	kmemleak_ignore_phys(crash_base);
+	if (crashk_low_res.end)
+		kmemleak_ignore_phys(crashk_low_res.start);
+
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-- 
2.25.1



* [PATCH v20 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
                   ` (2 preceding siblings ...)
  2022-01-24  8:47 ` [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
@ 2022-01-24  8:47 ` Zhen Lei
  2022-01-26 15:19   ` john.p.donnelly
  2022-01-24  8:47 ` [PATCH v20 5/5] kdump: update Documentation about crashkernel Zhen Lei
  2022-02-07  4:04 ` [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Leizhen (ThunderTown)
  5 siblings, 1 reply; 30+ messages in thread
From: Zhen Lei @ 2022-01-24  8:47 UTC (permalink / raw)

From: Chen Zhou <chenzhou10@huawei.com>

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to crash dump kernel via DT property
under /chosen,
        linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

We reuse the DT property linux,usable-memory-range and make the low
memory region the second range "BASE2 SIZE2", which keeps compatibility
with existing user space and older kdump kernels.

The crash dump kernel reads this property at boot time and calls
memblock_add() to add the low memory region after
memblock_cap_memory_range() has been called.
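
The decoding of the property can be sketched in user space as follows.
This is a simplified model, not the code in drivers/of/fdt.c:
decode_usable_ranges(), next_cells() and struct mem_range are
hypothetical names, the cells are assumed to be in host byte order
already (the real property is big-endian), and the length is given in
cells rather than bytes. In the patch, the first decoded range is the one
passed to memblock_cap_memory_range() and any further range is added with
memblock_add().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_USABLE_RANGES	2

struct mem_range { uint64_t base, size; };

/* Read 'cells' 32-bit cells as one number and advance the cursor. */
static uint64_t next_cells(int cells, const uint32_t **p)
{
	uint64_t v = 0;

	assert(cells >= 1 && cells <= 2);
	while (cells--)
		v = (v << 32) | *(*p)++;
	return v;
}

/*
 * Decode up to MAX_USABLE_RANGES <base size> pairs from 'prop'.
 * Returns the number of ranges decoded, or 0 if the property is
 * missing or is not a whole number of pairs.
 */
static size_t decode_usable_ranges(const uint32_t *prop, size_t ncells,
				   int addr_cells, int size_cells,
				   struct mem_range out[MAX_USABLE_RANGES])
{
	size_t per_range = (size_t)addr_cells + (size_t)size_cells;
	const uint32_t *end;
	size_t n = 0;

	if (!prop || ncells == 0 || ncells % per_range)
		return 0;

	end = prop + ncells;
	while (n < MAX_USABLE_RANGES && prop < end) {
		out[n].base = next_cells(addr_cells, &prop);
		out[n].size = next_cells(size_cells, &prop);
		n++;
	}
	return n;
}
```

A property with a single pair decodes to one range, which matches the
older single-region layout that the low region is appended to.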

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
---
 drivers/of/fdt.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index ad85ff6474ff139..df4b9d2418a13d4 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
 
 static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
 
+/*
+ * The main usage of linux,usable-memory-range is for crash dump kernel.
+ * Originally, the number of usable-memory regions is one. Now there may
+ * be two regions, low region and high region.
+ * To make compatibility with existing user-space and older kdump, the low
+ * region is always the last range of linux,usable-memory-range if exist.
+ */
+#define MAX_USABLE_RANGES		2
+
 /**
  * early_init_dt_check_for_usable_mem_range - Decode usable memory range
  * location from flat tree
  */
 void __init early_init_dt_check_for_usable_mem_range(void)
 {
-	const __be32 *prop;
-	int len;
-	phys_addr_t cap_mem_addr;
-	phys_addr_t cap_mem_size;
+	struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
+	const __be32 *prop, *endp;
+	int len, i;
 	unsigned long node = chosen_node_offset;
 
 	if ((long)node < 0)
@@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
 	pr_debug("Looking for usable-memory-range property... ");
 
 	prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
-	if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
+	if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
 		return;
 
-	cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
-	cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
+	endp = prop + (len / sizeof(__be32));
+	for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
+		rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
+		rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
 
-	pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
-		 &cap_mem_size);
+		pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
+			 i, &rgn[i].base, &rgn[i].size);
+	}
 
-	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+	memblock_cap_memory_range(rgn[0].base, rgn[0].size);
+	for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
+		memblock_add(rgn[i].base, rgn[i].size);
 }
 
 #ifdef CONFIG_SERIAL_EARLYCON
-- 
2.25.1



* [PATCH v20 5/5] kdump: update Documentation about crashkernel
  2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
                   ` (3 preceding siblings ...)
  2022-01-24  8:47 ` [PATCH v20 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
@ 2022-01-24  8:47 ` Zhen Lei
  2022-01-26 15:19   ` john.p.donnelly
  2022-02-21  3:48   ` Baoquan He
  2022-02-07  4:04 ` [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Leizhen (ThunderTown)
  5 siblings, 2 replies; 30+ messages in thread
From: Zhen Lei @ 2022-01-24  8:47 UTC (permalink / raw)

From: Chen Zhou <chenzhou10@huawei.com>

For arm64, the behavior of crashkernel=X has changed: it now tries low
allocation in the DMA zone and falls back to high allocation if that
fails.

"crashkernel=X,high" can also be used to select a high region above the
DMA zone; the kernel then also tries to allocate at least 256M of low
memory in the DMA zone automatically. "crashkernel=Y,low" can be used to
specify the size of that low memory.

Update the documentation accordingly.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 Documentation/admin-guide/kdump/kdump.rst       | 11 +++++++++--
 Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index cb30ca3df27c9b2..d4c287044be0c70 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -361,8 +361,15 @@ Boot into System Kernel
    kernel will automatically locate the crash kernel image within the
    first 512MB of RAM if X is not given.
 
-   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
-   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
+   On arm64, use "crashkernel=X" to try low allocation in DMA zone and
+   fall back to high allocation if it fails.
+   We can also use "crashkernel=X,high" to select a high region above
+   DMA zone, which also tries to allocate at least 256M low memory in
+   DMA zone automatically.
+   "crashkernel=Y,low" can be used to allocate specified size low memory.
+   Use "crashkernel=Y@X" if you really have to reserve memory from
+   specified start address X. Note that the start address of the kernel,
+   X if explicitly specified, must be aligned to 2MiB (0x200000).
 
 Load the Dump-capture Kernel
 ============================
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f5a27f067db9ed9..65780c2ca830be0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -792,6 +792,9 @@
 			[KNL, X86-64] Select a region under 4G first, and
 			fall back to reserve region above 4G when '@offset'
 			hasn't been specified.
+			[KNL, ARM64] Try low allocation in DMA zone and fall back
+			to high allocation if it fails when '@offset' hasn't been
+			specified.
 			See Documentation/admin-guide/kdump/kdump.rst for further details.
 
 	crashkernel=range1:size1[,range2:size2,...][@offset]
@@ -808,6 +811,8 @@
 			Otherwise memory region will be allocated below 4G, if
 			available.
 			It will be ignored if crashkernel=X is specified.
+			[KNL, ARM64] range in high memory.
+			Allow kernel to allocate physical memory region from top.
 	crashkernel=size[KMG],low
 			[KNL, X86-64] range under 4G. When crashkernel=X,high
 			is passed, kernel could allocate physical memory region
@@ -816,13 +821,15 @@
 			requires at least 64M+32K low memory, also enough extra
 			low memory is needed to make sure DMA buffers for 32-bit
 			devices won't run out. Kernel would try to allocate at
-			at least 256M below 4G automatically.
+			least 256M below 4G automatically.
 			This one let user to specify own low range under 4G
 			for second kernel instead.
 			0: to disable low allocation.
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
-
+			[KNL, ARM64] range in low memory.
+			This one let user to specify a low range in DMA zone for
+			crash dump kernel.
 	cryptomgr.notests
 			[KNL] Disable crypto self-tests
 
-- 
2.25.1



* Re: [PATCH v20 1/5] arm64: Use insert_resource() to simplify code
  2022-01-24  8:47 ` [PATCH v20 1/5] arm64: Use insert_resource() to simplify code Zhen Lei
@ 2022-01-26 15:16   ` john.p.donnelly
  2022-02-08  1:43   ` Baoquan He
  1 sibling, 0 replies; 30+ messages in thread
From: john.p.donnelly @ 2022-01-26 15:16 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, Dave Kleikamp,
	John Donnelly

On 1/24/22 2:47 AM, Zhen Lei wrote:
> insert_resource() traverses the subtree layer by layer from the root node
> until a proper location is found. Compared with request_resource(), the
> parent node does not need to be determined in advance.
> 
> In addition, move the insertion of node 'crashk_res' into function
> reserve_crashkernel() to make the associated code close together.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>


Acked-by: John Donnelly <john.p.donnelly@oracle.com>

> ---
>   arch/arm64/kernel/setup.c | 17 +++--------------
>   arch/arm64/mm/init.c      |  1 +
>   2 files changed, 4 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index f70573928f1bff0..a81efcc359e4e78 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
>   	kernel_code.end     = __pa_symbol(__init_begin - 1);
>   	kernel_data.start   = __pa_symbol(_sdata);
>   	kernel_data.end     = __pa_symbol(_end - 1);
> +	insert_resource(&iomem_resource, &kernel_code);
> +	insert_resource(&iomem_resource, &kernel_data);
>   
>   	num_standard_resources = memblock.memory.cnt;
>   	res_size = num_standard_resources * sizeof(*standard_resources);
> @@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
>   			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
>   		}
>   
> -		request_resource(&iomem_resource, res);
> -
> -		if (kernel_code.start >= res->start &&
> -		    kernel_code.end <= res->end)
> -			request_resource(res, &kernel_code);
> -		if (kernel_data.start >= res->start &&
> -		    kernel_data.end <= res->end)
> -			request_resource(res, &kernel_data);
> -#ifdef CONFIG_KEXEC_CORE
> -		/* Userspace will find "Crash kernel" region in /proc/iomem. */
> -		if (crashk_res.end && crashk_res.start >= res->start &&
> -		    crashk_res.end <= res->end)
> -			request_resource(res, &crashk_res);
> -#endif
> +		insert_resource(&iomem_resource, res);
>   	}
>   }
>   
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index db63cc885771a52..90f276d46b93bc6 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -109,6 +109,7 @@ static void __init reserve_crashkernel(void)
>   	kmemleak_ignore_phys(crash_base);
>   	crashk_res.start = crash_base;
>   	crashk_res.end = crash_base + crash_size - 1;
> +	insert_resource(&iomem_resource, &crashk_res);
>   }
>   #else
>   static void __init reserve_crashkernel(void)



* Re: [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation
  2022-01-24  8:47 ` [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation Zhen Lei
@ 2022-01-26 15:17   ` john.p.donnelly
  2022-02-11 10:39   ` Baoquan He
  1 sibling, 0 replies; 30+ messages in thread
From: john.p.donnelly @ 2022-01-26 15:17 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, Dave Kleikamp

On 1/24/22 2:47 AM, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> Introduce macro CRASH_ALIGN for alignment, macro CRASH_ADDR_LOW_MAX
> for upper bound of low crash memory, macro CRASH_ADDR_HIGH_MAX for
> upper bound of high crash memory, use macros instead.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> Tested-by: John Donnelly <John.p.donnelly@oracle.com>
> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>


Acked-by: John Donnelly <john.p.donnelly@oracle.com>

> ---
>   arch/arm64/mm/init.c | 11 ++++++++---
>   1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 90f276d46b93bc6..6c653a2c7cff052 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -65,6 +65,12 @@ EXPORT_SYMBOL(memstart_addr);
>   phys_addr_t arm64_dma_phys_limit __ro_after_init;
>   
>   #ifdef CONFIG_KEXEC_CORE
> +/* Current arm64 boot protocol requires 2MB alignment */
> +#define CRASH_ALIGN		SZ_2M
> +
> +#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
> +#define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
> +
>   /*
>    * reserve_crashkernel() - reserves memory for crash kernel
>    *
> @@ -75,7 +81,7 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>   static void __init reserve_crashkernel(void)
>   {
>   	unsigned long long crash_base, crash_size;
> -	unsigned long long crash_max = arm64_dma_phys_limit;
> +	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>   	int ret;
>   
>   	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> @@ -90,8 +96,7 @@ static void __init reserve_crashkernel(void)
>   	if (crash_base)
>   		crash_max = crash_base + crash_size;
>   
> -	/* Current arm64 boot protocol requires 2MB alignment */
> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>   					       crash_base, crash_max);
>   	if (!crash_base) {
>   		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",



* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-01-24  8:47 ` [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
@ 2022-01-26 15:18   ` john.p.donnelly
  2022-02-11 10:30   ` Baoquan He
  2022-02-14  3:52   ` Baoquan He
  2 siblings, 0 replies; 30+ messages in thread
From: john.p.donnelly @ 2022-01-26 15:18 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, Dave Kleikamp

On 1/24/22 2:47 AM, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X and
> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
> in DMA zone, and fall back to high allocation if it fails.
> We can also use "crashkernel=X,high" to select a region above DMA zone,
> which also tries to allocate at least 256M in DMA zone automatically.
> "crashkernel=Y,low" can be used to allocate specified size low memory.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>


Acked-by: John Donnelly <john.p.donnelly@oracle.com>

> ---
>   arch/arm64/kernel/machine_kexec.c      |  9 +++-
>   arch/arm64/kernel/machine_kexec_file.c | 12 ++++-
>   arch/arm64/mm/init.c                   | 68 ++++++++++++++++++++++++--
>   3 files changed, 81 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index e16b248699d5c3c..19c2d487cb08feb 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
>   
>   	/* in reserved memory? */
>   	addr = __pfn_to_phys(pfn);
> -	if ((addr < crashk_res.start) || (crashk_res.end < addr))
> -		return false;
> +	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
> +		if (!crashk_low_res.end)
> +			return false;
> +
> +		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
> +			return false;
> +	}
>   
>   	if (!kexec_crash_image)
>   		return true;
> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index 59c648d51848886..889951291cc0f9c 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>   
>   	/* Exclude crashkernel region */
>   	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> +	if (ret)
> +		goto out;
> +
> +	if (crashk_low_res.end) {
> +		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
> +		if (ret)
> +			goto out;
> +	}
>   
> -	if (!ret)
> -		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
> +	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
>   
> +out:
>   	kfree(cmem);
>   	return ret;
>   }
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 6c653a2c7cff052..a5d43feac0d7d96 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>   #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>   #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>   
> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> +{
> +	unsigned long long low_base;
> +
> +	/* passed with crashkernel=0,low ? */
> +	if (!low_size)
> +		return 0;
> +
> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> +	if (!low_base) {
> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> +		return -ENOMEM;
> +	}
> +
> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
> +		low_base, low_base + low_size, low_size >> 20);
> +
> +	crashk_low_res.start = low_base;
> +	crashk_low_res.end   = low_base + low_size - 1;
> +	insert_resource(&iomem_resource, &crashk_low_res);
> +
> +	return 0;
> +}
> +
>   /*
>    * reserve_crashkernel() - reserves memory for crash kernel
>    *
> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>   static void __init reserve_crashkernel(void)
>   {
>   	unsigned long long crash_base, crash_size;
> +	unsigned long long crash_low_size = SZ_256M;
>   	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>   	int ret;
> +	bool fixed_base;
> +	char *cmdline = boot_command_line;
>   
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	/* crashkernel=X[@offset] */
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>   				&crash_size, &crash_base);
> -	/* no crashkernel= or invalid value specified */
> -	if (ret || !crash_size)
> -		return;
> +	if (ret || !crash_size) {
> +		unsigned long long low_size;
>   
> +		/* crashkernel=X,high */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/* crashkernel=X,low */
> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> +		if (!ret)
> +			crash_low_size = low_size;
> +
> +		crash_max = CRASH_ADDR_HIGH_MAX;
> +	}
> +
> +	fixed_base = !!crash_base;
>   	crash_size = PAGE_ALIGN(crash_size);
>   
>   	/* User specifies base address explicitly. */
>   	if (crash_base)
>   		crash_max = crash_base + crash_size;
>   
> +retry:
>   	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>   					       crash_base, crash_max);
>   	if (!crash_base) {
> +		/*
> +		 * Attempt to fully allocate low memory failed, fall back
> +		 * to high memory, the minimum required low memory will be
> +		 * reserved later.
> +		 */
> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +			crash_max = CRASH_ADDR_HIGH_MAX;
> +			goto retry;
> +		}
> +
>   		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>   			crash_size);
>   		return;
>   	}
>   
> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> +		memblock_phys_free(crash_base, crash_size);
> +		return;
> +	}
> +
>   	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>   		crash_base, crash_base + crash_size, crash_size >> 20);
>   
> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>   	 * map. Inform kmemleak so that it won't try to access it.
>   	 */
>   	kmemleak_ignore_phys(crash_base);
> +	if (crashk_low_res.end)
> +		kmemleak_ignore_phys(crashk_low_res.start);
> +
>   	crashk_res.start = crash_base;
>   	crashk_res.end = crash_base + crash_size - 1;
>   	insert_resource(&iomem_resource, &crashk_res);



* Re: [PATCH v20 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  2022-01-24  8:47 ` [PATCH v20 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
@ 2022-01-26 15:19   ` john.p.donnelly
  0 siblings, 0 replies; 30+ messages in thread
From: john.p.donnelly @ 2022-01-26 15:19 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, Dave Kleikamp

On 1/24/22 2:47 AM, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices and never mapped by the first kernel.
> This memory range is advertised to crash dump kernel via DT property
> under /chosen,
>          linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>
> 
> We reused the DT property linux,usable-memory-range and made the low
> memory region as the second range "BASE2 SIZE2", which keeps compatibility
> with existing user-space and older kdump kernels.
> 
> Crash dump kernel reads this property at boot time and call memblock_add()
> to add the low memory region after memblock_cap_memory_range() has been
> called.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> Reviewed-by: Rob Herring <robh@kernel.org>
> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>

> ---
>   drivers/of/fdt.c | 33 +++++++++++++++++++++++----------
>   1 file changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index ad85ff6474ff139..df4b9d2418a13d4 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
>   
>   static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
>   
> +/*
> + * The main usage of linux,usable-memory-range is for crash dump kernel.
> + * Originally, the number of usable-memory regions is one. Now there may
> + * be two regions, low region and high region.
> + * To make compatibility with existing user-space and older kdump, the low
> + * region is always the last range of linux,usable-memory-range if exist.
> + */
> +#define MAX_USABLE_RANGES		2
> +
>   /**
>    * early_init_dt_check_for_usable_mem_range - Decode usable memory range
>    * location from flat tree
>    */
>   void __init early_init_dt_check_for_usable_mem_range(void)
>   {
> -	const __be32 *prop;
> -	int len;
> -	phys_addr_t cap_mem_addr;
> -	phys_addr_t cap_mem_size;
> +	struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
> +	const __be32 *prop, *endp;
> +	int len, i;
>   	unsigned long node = chosen_node_offset;
>   
>   	if ((long)node < 0)
> @@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
>   	pr_debug("Looking for usable-memory-range property... ");
>   
>   	prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
> -	if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
> +	if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
>   		return;
>   
> -	cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
> -	cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
> +	endp = prop + (len / sizeof(__be32));
> +	for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
> +		rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
> +		rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
>   
> -	pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
> -		 &cap_mem_size);
> +		pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
> +			 i, &rgn[i].base, &rgn[i].size);
> +	}
>   
> -	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
> +	memblock_cap_memory_range(rgn[0].base, rgn[0].size);
> +	for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
> +		memblock_add(rgn[i].base, rgn[i].size);
>   }
>   
>   #ifdef CONFIG_SERIAL_EARLYCON



* Re: [PATCH v20 5/5] kdump: update Documentation about crashkernel
  2022-01-24  8:47 ` [PATCH v20 5/5] kdump: update Documentation about crashkernel Zhen Lei
@ 2022-01-26 15:19   ` john.p.donnelly
  2022-02-21  3:48   ` Baoquan He
  1 sibling, 0 replies; 30+ messages in thread
From: john.p.donnelly @ 2022-01-26 15:19 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, Dave Kleikamp

On 1/24/22 2:47 AM, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> For arm64, the behavior of crashkernel=X has been changed, which
> tries low allocation in DMA zone and fall back to high allocation
> if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> So update the Documentation.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>

> ---
>   Documentation/admin-guide/kdump/kdump.rst       | 11 +++++++++--
>   Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>   2 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> index cb30ca3df27c9b2..d4c287044be0c70 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -361,8 +361,15 @@ Boot into System Kernel
>      kernel will automatically locate the crash kernel image within the
>      first 512MB of RAM if X is not given.
>   
> -   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
> -   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
> +   On arm64, use "crashkernel=X" to try low allocation in DMA zone and
> +   fall back to high allocation if it fails.
> +   We can also use "crashkernel=X,high" to select a high region above
> +   DMA zone, which also tries to allocate at least 256M low memory in
> +   DMA zone automatically.
> +   "crashkernel=Y,low" can be used to allocate specified size low memory.
> +   Use "crashkernel=Y@X" if you really have to reserve memory from
> +   specified start address X. Note that the start address of the kernel,
> +   X if explicitly specified, must be aligned to 2MiB (0x200000).
>   
>   Load the Dump-capture Kernel
>   ============================
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index f5a27f067db9ed9..65780c2ca830be0 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -792,6 +792,9 @@
>   			[KNL, X86-64] Select a region under 4G first, and
>   			fall back to reserve region above 4G when '@offset'
>   			hasn't been specified.
> +			[KNL, ARM64] Try low allocation in DMA zone and fall back
> +			to high allocation if it fails when '@offset' hasn't been
> +			specified.
>   			See Documentation/admin-guide/kdump/kdump.rst for further details.
>   
>   	crashkernel=range1:size1[,range2:size2,...][@offset]
> @@ -808,6 +811,8 @@
>   			Otherwise memory region will be allocated below 4G, if
>   			available.
>   			It will be ignored if crashkernel=X is specified.
> +			[KNL, ARM64] range in high memory.
> +			Allow kernel to allocate physical memory region from top.
>   	crashkernel=size[KMG],low
>   			[KNL, X86-64] range under 4G. When crashkernel=X,high
>   			is passed, kernel could allocate physical memory region
> @@ -816,13 +821,15 @@
>   			requires at least 64M+32K low memory, also enough extra
>   			low memory is needed to make sure DMA buffers for 32-bit
>   			devices won't run out. Kernel would try to allocate at
> -			at least 256M below 4G automatically.
> +			least 256M below 4G automatically.
>   			This one let user to specify own low range under 4G
>   			for second kernel instead.
>   			0: to disable low allocation.
>   			It will be ignored when crashkernel=X,high is not used
>   			or memory reserved is below 4G.
> -
> +			[KNL, ARM64] range in low memory.
> +			This one let user to specify a low range in DMA zone for
> +			crash dump kernel.
>   	cryptomgr.notests
>   			[KNL] Disable crypto self-tests
>   



* Re: [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump
  2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
                   ` (4 preceding siblings ...)
  2022-01-24  8:47 ` [PATCH v20 5/5] kdump: update Documentation about crashkernel Zhen Lei
@ 2022-02-07  4:04 ` Leizhen (ThunderTown)
  2022-02-08  2:34   ` Baoquan He
  5 siblings, 1 reply; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-07  4:04 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, John Donnelly,
	Dave Kleikamp

Hi everybody,
  Can someone take a moment to review these patches? Maybe I should just try
making the code generic after all. This patch series seems to have gone back
to square one, discarding some of the valuable comments that were made along
the way. But the only benefit of making it generic is avoiding code
duplication, and a lot of adaptation is needed. I think Borislav Petkov's
suggestion is good, too.

  These patches are taking too long. Maybe no one wants to look through the
history anymore, so I have collected the most important comments on
"make generic" below:
   Mike Rapoport:
     This very reminds what x86 does. Any chance some of the code can be reused
     rather than duplicated?
     https://lkml.org/lkml/2019/4/4/1225

     I think it would be better to have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL
     in arch/Kconfig and select this by X86 and ARM64.
     https://lkml.org/lkml/2020/11/12/224

   Ingo Molnar:
     No objections for this to be merged via the ARM tree, as long as x86
     functionality is kept intact.
     https://lkml.org/lkml/2019/4/10/109

     I.e. Ack, but only if it doesn't break anything. :-)
     https://lkml.org/lkml/2019/4/12/66

   Dave Young:
     Other than the comments from James, can you move the function into
     kernel/crash_core.c, we already have some functions moved there for
     sharing.
     https://lkml.org/lkml/2019/6/12/248

   Catalin Marinas:
     Except for the threshold to keep zone ZONE_DMA memory,
     reserve_crashkernel() looks very close to the x86 version. Shall we try
     to make this generic as well?
     https://lkml.org/lkml/2020/9/2/917

   Borislav Petkov:
     Why insert_resource() is relevant only to x86?
     --> I think this means "Why does arm64 not use insert_resource()?"
     https://lkml.org/lkml/2021/12/23/480

     This is exactly why I say that making those functions generic and shared
     might not be such a good idea, after all, because then you'd have to
     sprinkle around arch-specific stuff.
     https://lkml.org/lkml/2021/12/23/480

     What I suggested and what would be real clean is if the arches would
     simply call a *single*
	parse_crashkernel()
     function and when that one returns, *all* crashkernel= options would
     have been parsed properly, low, high, middle crashkernel, whatever...
     and the caller would know what crash kernel needs to be allocated.
     https://lkml.org/lkml/2021/12/28/305

   ------
   James Morse:
     We can then describe it via a different string in /proc/iomem, something
     like "Crash kernel (low)".
     https://lkml.org/lkml/2019/6/5/670
     --> The suggestion looks out of date. See Borislav Petkov's comments:
     --> 157752d84f5d ("kexec: use Crash kernel for Crash kernel low")
     --> https://lkml.org/lkml/2021/12/23/480
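
To make Borislav Petkov's "single parse_crashkernel()" idea above a bit more
concrete, here is a rough userspace sketch. It is only an illustration under
my own assumptions: struct crashkernel_request, parse_size() and
parse_crashkernel_all() are hypothetical names, not existing kernel
interfaces, and the "crashkernel=range1:size1[,...]" syntax is not handled.
It follows the documented rules that ",high" is ignored when a plain
crashkernel=X is given and that ",low" only matters together with ",high".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_LOW_SIZE	(256ULL << 20)	/* 256M default low region */

/* Hypothetical result: everything the caller needs after one parse. */
struct crashkernel_request {
	unsigned long long size;	/* main region size */
	unsigned long long base;	/* '@offset', 0 if not given */
	unsigned long long low_size;	/* low region size, 0 disables it */
	bool high;			/* crashkernel=X,high was used */
};

/* Userspace stand-in for memparse(): a number with optional K/M/G suffix. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	if (*end == s)
		return 0;		/* no digits consumed */

	switch (**end) {
	case 'G': case 'g': v <<= 10;	/* fall through */
	case 'M': case 'm': v <<= 10;	/* fall through */
	case 'K': case 'k': v <<= 10; (*end)++; break;
	default: break;
	}
	return v;
}

/* Parse all crashkernel= options in one pass; return 0 on success. */
static int parse_crashkernel_all(const char *cmdline,
				 struct crashkernel_request *req)
{
	static const char key[] = "crashkernel=";
	unsigned long long plain = 0, plain_base = 0, high = 0, low = 0;
	bool have_low = false;
	const char *p = cmdline;

	assert(cmdline && req);
	memset(req, 0, sizeof(*req));
	req->low_size = DEFAULT_LOW_SIZE;

	while ((p = strstr(p, key)) != NULL) {
		bool token_start = (p == cmdline) || (p[-1] == ' ');
		unsigned long long size;
		char *end;

		p += sizeof(key) - 1;
		if (!token_start)
			continue;	/* matched inside another word */

		size = parse_size(p, &end);
		if (end == p)
			return -1;	/* no number after '=' */

		if (!strncmp(end, ",high", 5)) {
			high = size;
			end += 5;
		} else if (!strncmp(end, ",low", 4)) {
			low = size;
			have_low = true;
			end += 4;
		} else {
			plain = size;
			plain_base = 0;
			if (*end == '@') {
				const char *off = end + 1;

				plain_base = parse_size(off, &end);
				if (end == off)
					return -1;
			}
		}
		if (*end != '\0' && *end != ' ')
			return -1;	/* trailing garbage */
		p = end;
	}

	if (plain) {			/* plain form wins over ,high */
		req->size = plain;
		req->base = plain_base;
		return 0;
	}
	if (high) {
		req->size = high;
		req->high = true;
		if (have_low)
			req->low_size = low;
		return 0;
	}
	return -1;			/* nothing usable was specified */
}
```

With something like this, the arch code would make one call and then decide
where to allocate, instead of calling the plain, high and low parsers in turn.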


On 2022/1/24 16:47, Zhen Lei wrote:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and fall back to high
> allocation if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinct from the high region and make no effect to the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region as the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
> 
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
> 
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
> 
> 
> Changes since [v19]:
> 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
>    lot of details need to be considered, which can take a long time. Because
>    "make generic" does not add new functions and does not improve performance,
>    maybe I should say it's just a cleanup. So by stripping it out and leaving
>    it for other patches later, we can aggregate the changes to the main functions.
> 2. Use insert_resource() to replace request_resource(), this not only simplifies
>    the code, but also reduces the differences between arm64 and x86 implementations.
> 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
>    x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
>    https://www.spinics.net/lists/kexec/msg28284.html
> 
> Thank you very much, Borislav Petkov, for so many valuable comments.
> 
> 
> 
> Changes since [v17]: v17 --> v19
> 1. Patch 0001-0004
>    Introduce generic parse_crashkernel_high_low() to bring the parsing of
>    "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
>    then use it instead of the call to parse_crashkernel_{high|low}(). Two
>    confusing parameters of parse_crashkernel_{high|low}() are deleted.
> 
>    I previously sent these four patches separately:
>    [1] https://lkml.org/lkml/2021/12/25/40
> 2. Patch 0005-0009
>    Introduce generic reserve_crashkernel_mem[_low](), the implementation of
>    these two functions is based on function reserve_crashkernel[_low]() in
>    arch/x86/kernel/setup.c. There is no functional change for x86.
>    1) The check position of xen_pv_domain() does not change.
>    2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> 
>    To avoid compilation problems on other architectures: patch 0004 moves
>    the definition of global variable crashk[_low]_res from kexec_core.c to
>    crash_core.c, and provide default definitions for all macros involved, a
>    particular platform can redefine these macros to override the default
>    values.
> 3. 0010, only one line of comment was changed.
> 4. 0011
>    1) crashk_low_res may also a valid reserved memory, should be checked
>       in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
>    2) Drop memblock_mark_nomap() for crashk_low_res, because of:
>       2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
>    3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
>       85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> 5. 0012, slightly rebased, because the following patch is applied in advance. 
>    https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> 6. 0013, no change.
> 
> Others:
> 1. Discard adding ARCH_WANT_RESERVE_CRASH_KERNEL.
> 2. When allocating crash low memory, the search still starts from address 0:
>    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> 3. Discard the change of (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> 4. Ensure the check position of xen_pv_domain() does not change.
> 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> 6. Update description.
> 
> 
> 
> Changes since [v16]
> - Because there are no functional changes in this version, add
>   "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> - Update patch 9 based on the review comments of Rob Herring
> - As suggested by Catalin Marinas, merge the implementation of
>   ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
>   contents of X86 and ARM64 do not overlap, and reduce unnecessary
>   temporary differences.
> 
> Changes since [v15]
> -  Aggregate the processing of "linux,usable-memory-range" into one function.
>    Only patch 9-10 have been updated.
> 
> Changes since [v14]
> - Restore the requirement that the crash kernel memory regions on X86
>   only require 1 MiB alignment.
> - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
>   by patch 6 was introduced by patch 5 in v14.
> - As with crashk_res, crashk_low_res is also processed by
>   crash_exclude_mem_range() in patch 7.
> - Because commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
>   has removed the architecture-specific code, extend the property "linux,usable-memory-range"
>   in the platform-agnostic FDT core code. See patch 9.
> - Discard the x86 description update in the document, because the description
>   has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> - Change "arm64" to "ARM64" in Doc.
> 
> 
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since the reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, put CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select it from X86 and ARM64.
> - Some minor cleanup.
> 
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
> 
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
> 
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> 
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
> 
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
> 
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> As suggested by Dave, and after some testing, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
> 
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
> 
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, first reserve the specified
> size of low memory for crash dump kernel devices, then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
> 
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> 
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
> 
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
> 
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() I added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
> 
> [1]: https://www.spinics.net/lists/kexec/msg28226.html
> [2]: https://github.com/robherring/dt-schema/pull/19 
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
> [v11]: https://lkml.org/lkml/2020/8/1/150
> [v12]: https://lkml.org/lkml/2020/9/7/1037
> [v13]: https://lkml.org/lkml/2020/10/31/34
> [v14]: https://lkml.org/lkml/2021/1/30/53
> [v15]: https://lkml.org/lkml/2021/10/19/1405
> [v16]: https://lkml.org/lkml/2021/11/23/435
> [v17]: https://lkml.org/lkml/2021/12/10/38
> [v18]: https://lkml.org/lkml/2021/12/22/424
> [v19]: https://lkml.org/lkml/2021/12/28/203
> 
> 
> Chen Zhou (4):
>   arm64: kdump: introduce some macros for crash kernel reservation
>   arm64: kdump: reimplement crashkernel=X
>   of: fdt: Add memory for devices by DT property
>     "linux,usable-memory-range"
>   kdump: update Documentation about crashkernel
> 
> Zhen Lei (1):
>   arm64: Use insert_resource() to simplify code
> 
>  Documentation/admin-guide/kdump/kdump.rst     | 11 ++-
>  .../admin-guide/kernel-parameters.txt         | 11 ++-
>  arch/arm64/kernel/machine_kexec.c             |  9 ++-
>  arch/arm64/kernel/machine_kexec_file.c        | 12 ++-
>  arch/arm64/kernel/setup.c                     | 17 +---
>  arch/arm64/mm/init.c                          | 80 +++++++++++++++++--
>  drivers/of/fdt.c                              | 33 +++++---
>  7 files changed, 134 insertions(+), 39 deletions(-)
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 1/5] arm64: Use insert_resource() to simplify code
  2022-01-24  8:47 ` [PATCH v20 1/5] arm64: Use insert_resource() to simplify code Zhen Lei
  2022-01-26 15:16   ` john.p.donnelly
@ 2022-02-08  1:43   ` Baoquan He
  1 sibling, 0 replies; 30+ messages in thread
From: Baoquan He @ 2022-02-08  1:43 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 01/24/22 at 04:47pm, Zhen Lei wrote:
> insert_resource() traverses the subtree layer by layer from the root node
> until a proper location is found. Compared with request_resource(), the
> parent node does not need to be determined in advance.
> 
> In addition, move the insertion of node 'crashk_res' into function
> reserve_crashkernel() to make the associated code close together.

This is a nice cleanup.

Acked-by: Baoquan He <bhe@redhat.com>
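
To make the difference described in the commit message concrete, here is
a minimal user-space sketch. It is not the code in kernel/resource.c:
struct res, contains() and toy_insert() are hypothetical names invented
for this illustration. It models only two behaviours: descending from the
root so that the caller does not have to pick the parent, and adopting
already-inserted children that the new range covers (which is what allows
kernel_code to be inserted before the "System RAM" entries in this patch).
Partial overlaps, which the real insert_resource() is expected to reject
as conflicts, and sibling ordering are deliberately ignored.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct res {
	unsigned long long start, end;	/* inclusive range */
	const char *name;
	struct res *parent, *sibling, *child;
};

/* True if 'inner' lies completely inside 'outer'. */
static int contains(const struct res *outer, const struct res *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/*
 * Toy model of insert_resource(): the caller only passes the root.
 * Returns the parent that was chosen, or NULL if 'new' does not fit
 * inside 'root'. 'new' must start with NULL parent/sibling/child.
 */
static struct res *toy_insert(struct res *root, struct res *new)
{
	struct res *parent = root, *c, **link;

	if (!contains(root, new))
		return NULL;

	/* Descend while some child fully contains the new range. */
	for (c = parent->child; c; ) {
		if (contains(c, new)) {
			parent = c;
			c = parent->child;
		} else {
			c = c->sibling;
		}
	}

	/* Adopt existing children of 'parent' that the new range covers. */
	link = &parent->child;
	while ((c = *link) != NULL) {
		if (contains(new, c)) {
			*link = c->sibling;	/* unlink from old parent */
			c->parent = new;
			c->sibling = new->child;
			new->child = c;
		} else {
			link = &c->sibling;
		}
	}

	/* Link the new node under the parent that was found. */
	new->parent = parent;
	new->sibling = parent->child;
	parent->child = new;
	return parent;
}
```

With request_resource() the caller has to compare ranges and name the
parent itself, which is what the removed block in setup.c was doing. In
the model above, inserting a "Kernel code" range, then a "System RAM"
range that covers it, then a "Crash kernel" range inside the RAM always
goes through the root and ends up with both smaller ranges as children of
"System RAM".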

> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  arch/arm64/kernel/setup.c | 17 +++--------------
>  arch/arm64/mm/init.c      |  1 +
>  2 files changed, 4 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index f70573928f1bff0..a81efcc359e4e78 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
>  	kernel_code.end     = __pa_symbol(__init_begin - 1);
>  	kernel_data.start   = __pa_symbol(_sdata);
>  	kernel_data.end     = __pa_symbol(_end - 1);
> +	insert_resource(&iomem_resource, &kernel_code);
> +	insert_resource(&iomem_resource, &kernel_data);
>  
>  	num_standard_resources = memblock.memory.cnt;
>  	res_size = num_standard_resources * sizeof(*standard_resources);
> @@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
>  			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
>  		}
>  
> -		request_resource(&iomem_resource, res);
> -
> -		if (kernel_code.start >= res->start &&
> -		    kernel_code.end <= res->end)
> -			request_resource(res, &kernel_code);
> -		if (kernel_data.start >= res->start &&
> -		    kernel_data.end <= res->end)
> -			request_resource(res, &kernel_data);
> -#ifdef CONFIG_KEXEC_CORE
> -		/* Userspace will find "Crash kernel" region in /proc/iomem. */
> -		if (crashk_res.end && crashk_res.start >= res->start &&
> -		    crashk_res.end <= res->end)
> -			request_resource(res, &crashk_res);
> -#endif
> +		insert_resource(&iomem_resource, res);
>  	}
>  }
>  
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index db63cc885771a52..90f276d46b93bc6 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -109,6 +109,7 @@ static void __init reserve_crashkernel(void)
>  	kmemleak_ignore_phys(crash_base);
>  	crashk_res.start = crash_base;
>  	crashk_res.end = crash_base + crash_size - 1;
> +	insert_resource(&iomem_resource, &crashk_res);
>  }
>  #else
>  static void __init reserve_crashkernel(void)
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump
  2022-02-07  4:04 ` [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Leizhen (ThunderTown)
@ 2022-02-08  2:34   ` Baoquan He
  0 siblings, 0 replies; 30+ messages in thread
From: Baoquan He @ 2022-02-08  2:34 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp, prudo, piliu

On 02/07/22 at 12:04pm, Leizhen (ThunderTown) wrote:
> Hi everybody:
>   Can someone take a moment to review these patches? Maybe I should just try
> making it generic. This patch series seems to have gone back to square one,
> discarding some of the valuable comments that were made along the way. But the
> only benefit of making it generic is avoiding code duplication, and a lot of
> adaptation is needed. I think Borislav Petkov's suggestion is good, too.

I am checking this version. I have got an arm64 machine and will give it a shot.

As for deduplicating the copied code, that is good to have, but it matters
less than getting an important feature into a certain ARCH.

Adding Philipp to CC since he is investigating the deduplication.

> 
>   These patches are taking too long. Maybe no one wants to look through history
> anymore. So I'm putting together some of the most central observations of
> "make generic" as follows:
>    Mike Rapoport:
>      This very reminds what x86 does. Any chance some of the code can be reused
>      rather than duplicated?
>      https://lkml.org/lkml/2019/4/4/1225
> 
>      I think it would be better to have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL
>      in arch/Kconfig and select this by X86 and ARM64.
>      https://lkml.org/lkml/2020/11/12/224
> 
>    Ingo Molnar:
>      No objections for this to be merged via the ARM tree, as long as x86
>      functionality is kept intact.
>      https://lkml.org/lkml/2019/4/10/109
> 
>      I.e. Ack, but only if it doesn't break anything. :-)
>      https://lkml.org/lkml/2019/4/12/66
> 
>    Dave Young:
>      Other than the comments from James, can you move the function into
>      kernel/crash_core.c, we already have some functions moved there for
>      sharing.
>      https://lkml.org/lkml/2019/6/12/248
> 
>    Catalin Marinas:
>      Except for the threshold to keep zone ZONE_DMA memory,
>      reserve_crashkernel() looks very close to the x86 version. Shall we try
>      to make this generic as well?
>      https://lkml.org/lkml/2020/9/2/917
> 
>    Borislav Petkov:
>      Why insert_resource() is relevant only to x86?
>      --> I think this means "Why does arm64 not use insert_resource()?"
>      https://lkml.org/lkml/2021/12/23/480
> 
>      This is exactly why I say that making those functions generic and shared
>      might not be such a good idea, after all, because then you'd have to
>      sprinkle around arch-specific stuff.
>      https://lkml.org/lkml/2021/12/23/480
> 
>      What I suggested and what would be real clean is if the arches would
>      simply call a *single*
> 	parse_crashkernel()
>      function and when that one returns, *all* crashkernel= options would
>      have been parsed properly, low, high, middle crashkernel, whatever...
>      and the caller would know what crash kernel needs to be allocated.
>      https://lkml.org/lkml/2021/12/28/305
> 
>    ------
>    James Morse:
>      We can then describe it via a different string in /proc/iomem, something
>      like "Crash kernel (low)".
>      https://lkml.org/lkml/2019/6/5/670
>      --> The suggestion looks out of date. See Borislav Petkov's comments:
>      --> 157752d84f5d ("kexec: use Crash kernel for Crash kernel low")
>      --> https://lkml.org/lkml/2021/12/23/480
> 
> 
> 
> -- 
> Regards,
>   Zhen Lei
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-01-24  8:47 ` [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
  2022-01-26 15:18   ` john.p.donnelly
@ 2022-02-11 10:30   ` Baoquan He
  2022-02-11 10:41     ` Leizhen (ThunderTown)
  2022-02-14  3:52   ` Baoquan He
  2 siblings, 1 reply; 30+ messages in thread
From: Baoquan He @ 2022-02-11 10:30 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 01/24/22 at 04:47pm, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
......
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 6c653a2c7cff052..a5d43feac0d7d96 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>  
> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> +{
> +	unsigned long long low_base;
> +
> +	/* passed with crashkernel=0,low ? */
> +	if (!low_size)
> +		return 0;
> +
> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> +	if (!low_base) {
> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> +		return -ENOMEM;
> +	}
> +
> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
> +		low_base, low_base + low_size, low_size >> 20);
> +
> +	crashk_low_res.start = low_base;
> +	crashk_low_res.end   = low_base + low_size - 1;
> +	insert_resource(&iomem_resource, &crashk_low_res);
> +
> +	return 0;
> +}
> +
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  static void __init reserve_crashkernel(void)
>  {
>  	unsigned long long crash_base, crash_size;
> +	unsigned long long crash_low_size = SZ_256M;
>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>  	int ret;
> +	bool fixed_base;
> +	char *cmdline = boot_command_line;
>  
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	/* crashkernel=X[@offset] */
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>  				&crash_size, &crash_base);
> -	/* no crashkernel= or invalid value specified */
> -	if (ret || !crash_size)
> -		return;
> +	if (ret || !crash_size) {
> +		unsigned long long low_size;
>  
> +		/* crashkernel=X,high */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/* crashkernel=X,low */
> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> +		if (!ret)
> +			crash_low_size = low_size;

Here, the error case is neither checked nor handled, yet it still gets the
expected result, which is the default SZ_256M. Is this by design?
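
To spell out what the question is about, here is a small user-space
model of the quoted hunk. toy_parse_low() and pick_low_size() are
hypothetical helpers written for this mail, not kernel functions, and the
error codes the toy parser returns (-ENOENT for "option absent", -EINVAL
for "option malformed") are an assumption of the model, not a statement
about what parse_crashkernel_low() returns. The point is only that the
caller treats every non-zero return the same way and keeps SZ_256M.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SZ_256M	(256ULL << 20)

/*
 * Hypothetical parser for "crashkernel=<size>[KMG],low".
 * Returns 0 and fills *size on success, -ENOENT if no ",low" option
 * is present, -EINVAL if one is present but malformed.
 */
static int toy_parse_low(const char *cmdline, unsigned long long *size)
{
	static const char key[] = "crashkernel=";
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		unsigned long long val;
		size_t len;
		char *end;

		p += sizeof(key) - 1;
		len = strcspn(p, " ");		/* length of this option's value */
		if (len < 4 || strncmp(p + len - 4, ",low", 4) != 0)
			continue;		/* some other crashkernel= form */

		val = strtoull(p, &end, 0);
		if (end == p)
			return -EINVAL;		/* no number before ",low" */
		switch (*end) {
		case 'G': val <<= 10;		/* fall through */
		case 'M': val <<= 10;		/* fall through */
		case 'K': val <<= 10; end++;
		}
		if (end != p + len - 4)
			return -EINVAL;		/* junk between size and ",low" */
		*size = val;
		return 0;
	}
	return -ENOENT;
}

/* Mirrors the quoted hunk: any parse failure silently keeps the default. */
static unsigned long long pick_low_size(const char *cmdline)
{
	unsigned long long crash_low_size = SZ_256M;
	unsigned long long low_size;

	if (!toy_parse_low(cmdline, &low_size))
		crash_low_size = low_size;

	return crash_low_size;
}
```

In this model "crashkernel=2G,high" alone and "crashkernel=2G,high
crashkernel=abc,low" both end up with 256M, while "crashkernel=0,low"
yields 0. If a malformed ",low" value should be reported rather than
silently replaced by the default, the caller would need to tell the two
failure cases apart.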

> +
> +		crash_max = CRASH_ADDR_HIGH_MAX;
> +	}
> +
> +	fixed_base = !!crash_base;
>  	crash_size = PAGE_ALIGN(crash_size);
>  
>  	/* User specifies base address explicitly. */
>  	if (crash_base)
>  		crash_max = crash_base + crash_size;
>  
> +retry:
>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>  					       crash_base, crash_max);
>  	if (!crash_base) {
> +		/*
> +		 * Attempt to fully allocate low memory failed, fall back
> +		 * to high memory, the minimum required low memory will be
> +		 * reserved later.
> +		 */
> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +			crash_max = CRASH_ADDR_HIGH_MAX;
> +			goto retry;
> +		}
> +
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>  			crash_size);
>  		return;
>  	}
>  
> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> +		memblock_phys_free(crash_base, crash_size);
> +		return;
> +	}
> +
>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>  		crash_base, crash_base + crash_size, crash_size >> 20);
>  
> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>  	 * map. Inform kmemleak so that it won't try to access it.
>  	 */
>  	kmemleak_ignore_phys(crash_base);
> +	if (crashk_low_res.end)
> +		kmemleak_ignore_phys(crashk_low_res.start);
> +
>  	crashk_res.start = crash_base;
>  	crashk_res.end = crash_base + crash_size - 1;
>  	insert_resource(&iomem_resource, &crashk_res);
> -- 
> 2.25.1
> 
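
For reference, the control flow of the plain crashkernel=X path in the
quoted patch can be modelled in user space roughly as below. This is a
sketch under simplifying assumptions: memory is reduced to two counters
of free bytes, "high" stands in for the patch's crash_base >= SZ_4G test,
and struct toy_mem, toy_alloc(), toy_free() and toy_reserve() are
hypothetical names that do not exist in the kernel. The crashkernel=X,high
path, which starts directly with the high limit, is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory model: free bytes below and above the DMA limit. */
struct toy_mem {
	uint64_t low_free;
	uint64_t high_free;
};

enum toy_result { TOY_FAIL, TOY_LOW, TOY_HIGH };

/* Take 'size' bytes from the chosen pool; false if it does not fit. */
static bool toy_alloc(struct toy_mem *m, uint64_t size, bool high)
{
	uint64_t *pool = high ? &m->high_free : &m->low_free;

	if (size > *pool)
		return false;
	*pool -= size;
	return true;
}

/* Give 'size' bytes back to the chosen pool. */
static void toy_free(struct toy_mem *m, uint64_t size, bool high)
{
	if (high)
		m->high_free += size;
	else
		m->low_free += size;
}

/*
 * Try low memory first; without a fixed base, fall back to high memory.
 * A high reservation also needs 'low_size' bytes of low memory, and is
 * rolled back if that second allocation fails.
 */
static enum toy_result toy_reserve(struct toy_mem *m, uint64_t size,
				   uint64_t low_size, bool fixed_base)
{
	bool high = false;

retry:
	if (!toy_alloc(m, size, high)) {
		if (!fixed_base && !high) {
			high = true;
			goto retry;
		}
		return TOY_FAIL;
	}

	if (high && !toy_alloc(m, low_size, false)) {
		toy_free(m, size, true);	/* undo the high reservation */
		return TOY_FAIL;
	}

	return high ? TOY_HIGH : TOY_LOW;
}
```

With 512M free low and 8G free high, a 256M request stays low, a 1G
request falls back to high and additionally takes the low chunk, and with
only 128M free low the 1G request fails as a whole because the 256M low
chunk cannot be satisfied.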


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation
  2022-01-24  8:47 ` [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation Zhen Lei
  2022-01-26 15:17   ` john.p.donnelly
@ 2022-02-11 10:39   ` Baoquan He
  2022-02-14  6:22     ` Leizhen (ThunderTown)
  1 sibling, 1 reply; 30+ messages in thread
From: Baoquan He @ 2022-02-11 10:39 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 01/24/22 at 04:47pm, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> Introduce macro CRASH_ALIGN for alignment, macro CRASH_ADDR_LOW_MAX
> for upper bound of low crash memory, macro CRASH_ADDR_HIGH_MAX for
> upper bound of high crash memory, use macros instead.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> Tested-by: John Donnelly <John.p.donnelly@oracle.com>
> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
> ---
>  arch/arm64/mm/init.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 90f276d46b93bc6..6c653a2c7cff052 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -65,6 +65,12 @@ EXPORT_SYMBOL(memstart_addr);
>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  
>  #ifdef CONFIG_KEXEC_CORE
> +/* Current arm64 boot protocol requires 2MB alignment */
> +#define CRASH_ALIGN		SZ_2M
> +
> +#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
> +#define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE

MEMBLOCK_ALLOC_ACCESSIBLE is obviously an alloc flag for the memblock
allocator, so I don't think it's appropriate for HIGH_MAX to take its
value. You can define it as memblock.current_limit, or not define it at
all and use MEMBLOCK_ALLOC_ACCESSIBLE directly in
memblock_phys_alloc_range() with a code comment.
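
To illustrate the concern above, here is a minimal userspace sketch, not
kernel code. It assumes that MEMBLOCK_ALLOC_ACCESSIBLE is defined as 0 and
that the allocator treats this value as a request to clamp the search to
memblock.current_limit. resolve_alloc_end() and is_below_max() are
hypothetical helpers introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Assumed to mirror the kernel definition: a sentinel, not an address. */
#define MEMBLOCK_ALLOC_ACCESSIBLE ((phys_addr_t)0)

/*
 * Hypothetical model of how the allocator resolves its 'end' argument:
 * the sentinel (or anything beyond the limit) is clamped to current_limit.
 */
static phys_addr_t resolve_alloc_end(phys_addr_t end, phys_addr_t current_limit)
{
	assert(current_limit != 0);

	if (end == MEMBLOCK_ALLOC_ACCESSIBLE || end > current_limit)
		return current_limit;
	return end;
}

/* A naive bound check that treats a *_MAX macro as a real upper bound. */
static bool is_below_max(phys_addr_t addr, phys_addr_t max)
{
	return addr < max;
}
```

With the sentinel used as a *_MAX value, is_below_max() is false for every
address, which is why a macro named HIGH_MAX that expands to the flag can
mislead later readers even though the allocation call itself behaves
correctly.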


> +
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
> @@ -75,7 +81,7 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  static void __init reserve_crashkernel(void)
>  {
>  	unsigned long long crash_base, crash_size;
> -	unsigned long long crash_max = arm64_dma_phys_limit;
> +	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>  	int ret;
>  
>  	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> @@ -90,8 +96,7 @@ static void __init reserve_crashkernel(void)
>  	if (crash_base)
>  		crash_max = crash_base + crash_size;
>  
> -	/* Current arm64 boot protocol requires 2MB alignment */
> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>  					       crash_base, crash_max);
>  	if (!crash_base) {
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> -- 
> 2.25.1
> 



* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-11 10:30   ` Baoquan He
@ 2022-02-11 10:41     ` Leizhen (ThunderTown)
  2022-02-11 10:51       ` Baoquan He
  0 siblings, 1 reply; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-11 10:41 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/11 18:30, Baoquan He wrote:
> On 01/24/22 at 04:47pm, Zhen Lei wrote:
>> From: Chen Zhou <chenzhou10@huawei.com>
> ......
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>>  
>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>> +{
>> +	unsigned long long low_base;
>> +
>> +	/* passed with crashkernel=0,low ? */
>> +	if (!low_size)
>> +		return 0;
>> +
>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
>> +	if (!low_base) {
>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
>> +		low_base, low_base + low_size, low_size >> 20);
>> +
>> +	crashk_low_res.start = low_base;
>> +	crashk_low_res.end   = low_base + low_size - 1;
>> +	insert_resource(&iomem_resource, &crashk_low_res);
>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  static void __init reserve_crashkernel(void)
>>  {
>>  	unsigned long long crash_base, crash_size;
>> +	unsigned long long crash_low_size = SZ_256M;
>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>  	int ret;
>> +	bool fixed_base;
>> +	char *cmdline = boot_command_line;
>>  
>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +	/* crashkernel=X[@offset] */
>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>  				&crash_size, &crash_base);
>> -	/* no crashkernel= or invalid value specified */
>> -	if (ret || !crash_size)
>> -		return;
>> +	if (ret || !crash_size) {
>> +		unsigned long long low_size;
>>  
>> +		/* crashkernel=X,high */
>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>> +		if (ret || !crash_size)
>> +			return;
>> +
>> +		/* crashkernel=X,low */
>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>> +		if (!ret)
>> +			crash_low_size = low_size;
> 
> Here, the error case is not checked or handled, but the code still gets
> the expected result, which is the default SZ_256M. Is this by design?

Yes, we can specify only "crashkernel=X,high".

This is mentioned in Documentation/admin-guide/kernel-parameters.txt

        crashkernel=size[KMG],low
                        [KNL, X86-64] range under 4G. When crashkernel=X,high
                        is passed, kernel could allocate physical memory region
                        above 4G, that cause second kernel crash on system
                        that require some amount of low memory, e.g. swiotlb
                        requires at least 64M+32K low memory, also enough extra
                        low memory is needed to make sure DMA buffers for 32-bit
                        devices won't run out. Kernel would try to allocate at     <---------
                        least 256M below 4G automatically.                         <---------

> 
>> +
>> +		crash_max = CRASH_ADDR_HIGH_MAX;
>> +	}
>> +
>> +	fixed_base = !!crash_base;
>>  	crash_size = PAGE_ALIGN(crash_size);
>>  
>>  	/* User specifies base address explicitly. */
>>  	if (crash_base)
>>  		crash_max = crash_base + crash_size;
>>  
>> +retry:
>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>  					       crash_base, crash_max);
>>  	if (!crash_base) {
>> +		/*
>> +		 * Attempt to fully allocate low memory failed, fall back
>> +		 * to high memory, the minimum required low memory will be
>> +		 * reserved later.
>> +		 */
>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>> +			goto retry;
>> +		}
>> +
>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>  			crash_size);
>>  		return;
>>  	}
>>  
>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>> +		memblock_phys_free(crash_base, crash_size);
>> +		return;
>> +	}
>> +
>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>  
>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>>  	 * map. Inform kmemleak so that it won't try to access it.
>>  	 */
>>  	kmemleak_ignore_phys(crash_base);
>> +	if (crashk_low_res.end)
>> +		kmemleak_ignore_phys(crashk_low_res.start);
>> +
>>  	crashk_res.start = crash_base;
>>  	crashk_res.end = crash_base + crash_size - 1;
>>  	insert_resource(&iomem_resource, &crashk_res);
>> -- 
>> 2.25.1
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-11 10:41     ` Leizhen (ThunderTown)
@ 2022-02-11 10:51       ` Baoquan He
  2022-02-14  6:44         ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 30+ messages in thread
From: Baoquan He @ 2022-02-11 10:51 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/11/22 at 06:41pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/2/11 18:30, Baoquan He wrote:
> > On 01/24/22 at 04:47pm, Zhen Lei wrote:
> >> From: Chen Zhou <chenzhou10@huawei.com>
> > ......
> >> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >> index 6c653a2c7cff052..a5d43feac0d7d96 100644
> >> --- a/arch/arm64/mm/init.c
> >> +++ b/arch/arm64/mm/init.c
> >> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
> >>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
> >>  
> >> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> >> +{
> >> +	unsigned long long low_base;
> >> +
> >> +	/* passed with crashkernel=0,low ? */
> >> +	if (!low_size)
> >> +		return 0;
> >> +
> >> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> >> +	if (!low_base) {
> >> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
> >> +		low_base, low_base + low_size, low_size >> 20);
> >> +
> >> +	crashk_low_res.start = low_base;
> >> +	crashk_low_res.end   = low_base + low_size - 1;
> >> +	insert_resource(&iomem_resource, &crashk_low_res);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  /*
> >>   * reserve_crashkernel() - reserves memory for crash kernel
> >>   *
> >> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>  static void __init reserve_crashkernel(void)
> >>  {
> >>  	unsigned long long crash_base, crash_size;
> >> +	unsigned long long crash_low_size = SZ_256M;
> >>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>  	int ret;
> >> +	bool fixed_base;
> >> +	char *cmdline = boot_command_line;
> >>  
> >> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >> +	/* crashkernel=X[@offset] */
> >> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> >>  				&crash_size, &crash_base);
> >> -	/* no crashkernel= or invalid value specified */
> >> -	if (ret || !crash_size)
> >> -		return;
> >> +	if (ret || !crash_size) {
> >> +		unsigned long long low_size;
> >>  
> >> +		/* crashkernel=X,high */
> >> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> >> +		if (ret || !crash_size)
> >> +			return;
> >> +
> >> +		/* crashkernel=X,low */
> >> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> >> +		if (!ret)
> >> +			crash_low_size = low_size;
> > 
> > > Here, the error case is not checked or handled, but the code still gets
> > > the expected result, which is the default SZ_256M. Is this by design?
> 
> Yes, we can specify only "crashkernel=X,high".
> 
> This is mentioned in Documentation/admin-guide/kernel-parameters.txt
> 
>         crashkernel=size[KMG],low
>                         [KNL, X86-64] range under 4G. When crashkernel=X,high
>                         is passed, kernel could allocate physical memory region
>                         above 4G, that cause second kernel crash on system
>                         that require some amount of low memory, e.g. swiotlb
>                         requires at least 64M+32K low memory, also enough extra
>                         low memory is needed to make sure DMA buffers for 32-bit
>                         devices won't run out. Kernel would try to allocate at     <---------
>                         least 256M below 4G automatically.                         <---------

Yeah, that is expected because omitting crashkernel=,low is valid usage. The
'ret' is 0 in that case. If I give the string below, it works too:
"crashkernel=256M,high crashkernel=aaabbadfadfd,low"

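The behaviour being discussed can be modelled with a small userspace
sketch. It assumes only what the patch's "if (!ret)" check implies: any
nonzero return from parse_crashkernel_low() leaves the 256M default in
place. effective_low_size() is a hypothetical helper, not a kernel
function.

```c
#include <assert.h>
#include <errno.h>

#define SZ_256M (256ULL << 20)

/*
 * Hypothetical model of the hunk under discussion: the default is kept
 * unless parsing ",low" succeeded (ret == 0).
 */
static unsigned long long effective_low_size(int ret, unsigned long long parsed)
{
	unsigned long long crash_low_size = SZ_256M;

	/* Kernel-style return convention: 0 or a negative errno. */
	assert(ret <= 0);

	if (!ret)
		crash_low_size = parsed;
	return crash_low_size;
}
```

Under this model only a successfully parsed value, including an explicit
0, overrides the default; every failing parse ends up with 256M.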
> 
> > 
> >> +
> >> +		crash_max = CRASH_ADDR_HIGH_MAX;
> >> +	}
> >> +
> >> +	fixed_base = !!crash_base;
> >>  	crash_size = PAGE_ALIGN(crash_size);
> >>  
> >>  	/* User specifies base address explicitly. */
> >>  	if (crash_base)
> >>  		crash_max = crash_base + crash_size;
> >>  
> >> +retry:
> >>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>  					       crash_base, crash_max);
> >>  	if (!crash_base) {
> >> +		/*
> >> +		 * Attempt to fully allocate low memory failed, fall back
> >> +		 * to high memory, the minimum required low memory will be
> >> +		 * reserved later.
> >> +		 */
> >> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> >> +			crash_max = CRASH_ADDR_HIGH_MAX;
> >> +			goto retry;
> >> +		}
> >> +
> >>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> >>  			crash_size);
> >>  		return;
> >>  	}
> >>  
> >> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> >> +		memblock_phys_free(crash_base, crash_size);
> >> +		return;
> >> +	}
> >> +
> >>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
> >>  		crash_base, crash_base + crash_size, crash_size >> 20);
> >>  
> >> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
> >>  	 * map. Inform kmemleak so that it won't try to access it.
> >>  	 */
> >>  	kmemleak_ignore_phys(crash_base);
> >> +	if (crashk_low_res.end)
> >> +		kmemleak_ignore_phys(crashk_low_res.start);
> >> +
> >>  	crashk_res.start = crash_base;
> >>  	crashk_res.end = crash_base + crash_size - 1;
> >>  	insert_resource(&iomem_resource, &crashk_res);
> >> -- 
> >> 2.25.1
> >>
> > 
> > .
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 



* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-01-24  8:47 ` [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
  2022-01-26 15:18   ` john.p.donnelly
  2022-02-11 10:30   ` Baoquan He
@ 2022-02-14  3:52   ` Baoquan He
  2022-02-14  7:53     ` Leizhen (ThunderTown)
  2 siblings, 1 reply; 30+ messages in thread
From: Baoquan He @ 2022-02-14  3:52 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 01/24/22 at 04:47pm, Zhen Lei wrote:
......
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 6c653a2c7cff052..a5d43feac0d7d96 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>  
> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> +{
> +	unsigned long long low_base;
> +
> +	/* passed with crashkernel=0,low ? */
> +	if (!low_size)
> +		return 0;
> +
> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> +	if (!low_base) {
> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> +		return -ENOMEM;
> +	}
> +
> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
> +		low_base, low_base + low_size, low_size >> 20);
> +
> +	crashk_low_res.start = low_base;
> +	crashk_low_res.end   = low_base + low_size - 1;
> +	insert_resource(&iomem_resource, &crashk_low_res);
> +
> +	return 0;
> +}
> +
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel

My other concern is the crashkernel=,low handling. In this patch, the
code related to low memory is obscure. I wonder if we should make it
explicit, with slightly redundant but very clear code flows. I say this
because, while the code must be very clear to you and the reviewers, it
may be harder for later readers or anyone else interested to understand.

1) crashkernel=X,high
2) crashkernel=X,high crashkernel=Y,low
3) crashkernel=X,high crashkernel=0,low
4) crashkernel=X,high crashkernel='messy code',low
5) crashkernel=X //fall back to high memory, low memory is required then.

It could be me overthinking it. I made some changes on top of your patch
as a small tweak; I'm not sure if they are OK with you. Otherwise, this
patchset works very well for all the above test cases, and it's ripe to
be merged for wider testing.

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index a5d43feac0d7..671862c56d7d 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -94,7 +94,8 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 
 	return 0;
 }
-
+/*Words explaining why it's 256M*/
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -105,10 +106,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
-	unsigned long long crash_low_size = SZ_256M;
+	unsigned long long crash_low_size;
 	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
 	int ret;
-	bool fixed_base;
+	bool fixed_base, high;
 	char *cmdline = boot_command_line;
 
 	/* crashkernel=X[@offset] */
@@ -126,7 +127,10 @@ static void __init reserve_crashkernel(void)
 		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
 		if (!ret)
 			crash_low_size = low_size;
+		else
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
 
+		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
 	}
 
@@ -134,7 +138,7 @@ static void __init reserve_crashkernel(void)
 	crash_size = PAGE_ALIGN(crash_size);
 
 	/* User specifies base address explicitly. */
-	if (crash_base)
+	if (fixed_base)
 		crash_max = crash_base + crash_size;
 
 retry:
@@ -156,7 +160,10 @@ static void __init reserve_crashkernel(void)
 		return;
 	}
 
-	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
+	if (crash_base >= SZ_4G && !high) 
+		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+
+	if (reserve_crashkernel_low(crash_low_size)) {
 		memblock_phys_free(crash_base, crash_size);
 		return;
 	}
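
The five command-line cases listed above can be written down as a small
decision function. This is a userspace sketch of how the v20 logic appears
to choose the low-memory size, under the assumption that an unparsable
",low" value behaves like an absent one. low_to_reserve() and enum low_opt
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define SZ_256M (256ULL << 20)

/* State of the crashkernel=...,low option on the command line. */
enum low_opt { LOW_ABSENT, LOW_VALID, LOW_INVALID };

/*
 * Hypothetical model of how much low memory gets reserved.
 * high_requested: crashkernel=X,high was used (not plain crashkernel=X).
 * base_above_4g:  the main region ended up at or above 4G.
 */
static unsigned long long low_to_reserve(bool high_requested, bool base_above_4g,
					 enum low_opt opt,
					 unsigned long long low_value)
{
	unsigned long long crash_low_size = SZ_256M;

	/* Callers pass 0 when there is no successfully parsed value. */
	assert(opt == LOW_VALID || low_value == 0);

	/* ",low" is only consulted on the ",high" path. */
	if (high_requested && opt == LOW_VALID)
		crash_low_size = low_value;

	/* A region below 4G needs no separate low reservation. */
	return base_above_4g ? crash_low_size : 0;
}
```

Cases 1, 4 and 5 all yield the 256M default in this model, case 2 yields Y,
and case 3 yields 0, which is the kind of implicit behaviour the explicit
DEFAULT_CRASH_KERNEL_LOW_SIZE assignments are meant to make visible.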

>   *
> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  static void __init reserve_crashkernel(void)
>  {
>  	unsigned long long crash_base, crash_size;
> +	unsigned long long crash_low_size = SZ_256M;
>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>  	int ret;
> +	bool fixed_base;
> +	char *cmdline = boot_command_line;
>  
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	/* crashkernel=X[@offset] */
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>  				&crash_size, &crash_base);
> -	/* no crashkernel= or invalid value specified */
> -	if (ret || !crash_size)
> -		return;
> +	if (ret || !crash_size) {
> +		unsigned long long low_size;
>  
> +		/* crashkernel=X,high */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/* crashkernel=X,low */
> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> +		if (!ret)
> +			crash_low_size = low_size;
> +
> +		crash_max = CRASH_ADDR_HIGH_MAX;
> +	}
> +
> +	fixed_base = !!crash_base;
>  	crash_size = PAGE_ALIGN(crash_size);
>  
>  	/* User specifies base address explicitly. */
>  	if (crash_base)
>  		crash_max = crash_base + crash_size;
>  
> +retry:
>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>  					       crash_base, crash_max);
>  	if (!crash_base) {
> +		/*
> +		 * Attempt to fully allocate low memory failed, fall back
> +		 * to high memory, the minimum required low memory will be
> +		 * reserved later.
> +		 */
> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +			crash_max = CRASH_ADDR_HIGH_MAX;
> +			goto retry;
> +		}
> +
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>  			crash_size);
>  		return;
>  	}
>  
> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> +		memblock_phys_free(crash_base, crash_size);
> +		return;
> +	}
> +
>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>  		crash_base, crash_base + crash_size, crash_size >> 20);
>  
> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>  	 * map. Inform kmemleak so that it won't try to access it.
>  	 */
>  	kmemleak_ignore_phys(crash_base);
> +	if (crashk_low_res.end)
> +		kmemleak_ignore_phys(crashk_low_res.start);
> +
>  	crashk_res.start = crash_base;
>  	crashk_res.end = crash_base + crash_size - 1;
>  	insert_resource(&iomem_resource, &crashk_res);
> -- 
> 2.25.1
> 



* Re: [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation
  2022-02-11 10:39   ` Baoquan He
@ 2022-02-14  6:22     ` Leizhen (ThunderTown)
  2022-02-21  3:22       ` Baoquan He
  0 siblings, 1 reply; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-14  6:22 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/11 18:39, Baoquan He wrote:
> On 01/24/22 at 04:47pm, Zhen Lei wrote:
>> From: Chen Zhou <chenzhou10@huawei.com>
>>
>> Introduce macro CRASH_ALIGN for alignment, macro CRASH_ADDR_LOW_MAX
>> for upper bound of low crash memory, macro CRASH_ADDR_HIGH_MAX for
>> upper bound of high crash memory, use macros instead.
>>
>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> Tested-by: John Donnelly <John.p.donnelly@oracle.com>
>> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
>> ---
>>  arch/arm64/mm/init.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 90f276d46b93bc6..6c653a2c7cff052 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -65,6 +65,12 @@ EXPORT_SYMBOL(memstart_addr);
>>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  
>>  #ifdef CONFIG_KEXEC_CORE
>> +/* Current arm64 boot protocol requires 2MB alignment */
>> +#define CRASH_ALIGN		SZ_2M
>> +
>> +#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>> +#define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
> 
> MEMBLOCK_ALLOC_ACCESSIBLE is obviously an alloc flag for the memblock
> allocator, so I don't think it's appropriate for HIGH_MAX to take its value.

Right, thanks.

> You can define it as memblock.current_limit, or not define it at all and
> use MEMBLOCK_ALLOC_ACCESSIBLE directly in memblock_phys_alloc_range() with
> a code comment.

This patch is not required at present. These macros are added to eliminate
differences so that the code can be shared with x86.

> 
> 
>> +
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>> @@ -75,7 +81,7 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  static void __init reserve_crashkernel(void)
>>  {
>>  	unsigned long long crash_base, crash_size;
>> -	unsigned long long crash_max = arm64_dma_phys_limit;
>> +	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>  	int ret;
>>  
>>  	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> @@ -90,8 +96,7 @@ static void __init reserve_crashkernel(void)
>>  	if (crash_base)
>>  		crash_max = crash_base + crash_size;
>>  
>> -	/* Current arm64 boot protocol requires 2MB alignment */
>> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
>> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>  					       crash_base, crash_max);
>>  	if (!crash_base) {
>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>> -- 
>> 2.25.1
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-11 10:51       ` Baoquan He
@ 2022-02-14  6:44         ` Leizhen (ThunderTown)
  2022-02-14  7:09           ` Baoquan He
  0 siblings, 1 reply; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-14  6:44 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/11 18:51, Baoquan He wrote:
> On 02/11/22 at 06:41pm, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/2/11 18:30, Baoquan He wrote:
>>> On 01/24/22 at 04:47pm, Zhen Lei wrote:
>>>> From: Chen Zhou <chenzhou10@huawei.com>
>>> ......
>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
>>>> --- a/arch/arm64/mm/init.c
>>>> +++ b/arch/arm64/mm/init.c
>>>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>>>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>>>>  
>>>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>>>> +{
>>>> +	unsigned long long low_base;
>>>> +
>>>> +	/* passed with crashkernel=0,low ? */
>>>> +	if (!low_size)
>>>> +		return 0;
>>>> +
>>>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
>>>> +	if (!low_base) {
>>>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>>>> +		return -ENOMEM;
>>>> +	}
>>>> +
>>>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
>>>> +		low_base, low_base + low_size, low_size >> 20);
>>>> +
>>>> +	crashk_low_res.start = low_base;
>>>> +	crashk_low_res.end   = low_base + low_size - 1;
>>>> +	insert_resource(&iomem_resource, &crashk_low_res);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  /*
>>>>   * reserve_crashkernel() - reserves memory for crash kernel
>>>>   *
>>>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>>  static void __init reserve_crashkernel(void)
>>>>  {
>>>>  	unsigned long long crash_base, crash_size;
>>>> +	unsigned long long crash_low_size = SZ_256M;
>>>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>  	int ret;
>>>> +	bool fixed_base;
>>>> +	char *cmdline = boot_command_line;
>>>>  
>>>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>>>> +	/* crashkernel=X[@offset] */
>>>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>>>  				&crash_size, &crash_base);
>>>> -	/* no crashkernel= or invalid value specified */
>>>> -	if (ret || !crash_size)
>>>> -		return;
>>>> +	if (ret || !crash_size) {
>>>> +		unsigned long long low_size;
>>>>  
>>>> +		/* crashkernel=X,high */
>>>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>>>> +		if (ret || !crash_size)
>>>> +			return;
>>>> +
>>>> +		/* crashkernel=X,low */
>>>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>>>> +		if (!ret)
>>>> +			crash_low_size = low_size;
>>>
>>> Here, the error case is not checked or handled, but the code still gets
>>> the expected result, which is the default SZ_256M. Is this by design?
>>
>> Yes, we can specify only "crashkernel=X,high".
>>
>> This is mentioned in Documentation/admin-guide/kernel-parameters.txt
>>
>>         crashkernel=size[KMG],low
>>                         [KNL, X86-64] range under 4G. When crashkernel=X,high
>>                         is passed, kernel could allocate physical memory region
>>                         above 4G, that cause second kernel crash on system
>>                         that require some amount of low memory, e.g. swiotlb
>>                         requires at least 64M+32K low memory, also enough extra
>>                         low memory is needed to make sure DMA buffers for 32-bit
>>                         devices won't run out. Kernel would try to allocate at     <---------
>>                         least 256M below 4G automatically.                         <---------
> 
> Yeah, that is expected because omitting crashkernel=,low is valid usage. The
> 'ret' is 0 in that case. If I give the string below, it works too:
> "crashkernel=256M,high crashkernel=aaabbadfadfd,low"

Yes, so maybe we should change the error code in __parse_crashkernel()
from "-EINVAL" to "-ENOENT" when the specified option does not exist.

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 256cf6db573cd09..395f4fac1773f28 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
        *crash_base = 0;

        ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
-
        if (!ck_cmdline)
-               return -EINVAL;
+               return -ENOENT;

        ck_cmdline += strlen(name);
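
To show what distinguishing the two error codes would allow, here is a
userspace sketch. parse_low() is a simplified, hypothetical stand-in for
the kernel parser (it understands only an optional 'M' suffix), and the
policy in choose_low_size(), which uses the default only for -ENOENT and
reports -EINVAL to the caller, is an assumption; the proposal above only
changes the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_LOW_SIZE (256ULL << 20)

/*
 * Returns 0 on success, -ENOENT if no "crashkernel=...,low" option exists,
 * and -EINVAL if the option exists but its value cannot be parsed.
 */
static int parse_low(const char *cmdline, unsigned long long *size)
{
	static const char name[] = "crashkernel=";
	const char *p = cmdline;
	const char *found = NULL;
	char *end;

	assert(cmdline && size);

	/* Remember the last "crashkernel=" entry that ends in ",low". */
	while ((p = strstr(p, name)) != NULL) {
		const char *val = p + strlen(name);
		size_t len = strcspn(val, " ");

		if (len >= 4 && strncmp(val + len - 4, ",low", 4) == 0)
			found = val;
		p = val;
	}
	if (!found)
		return -ENOENT;

	*size = strtoull(found, &end, 0);
	if (end == found)
		return -EINVAL;		/* no number at all */
	if (*end == 'M') {
		*size <<= 20;
		end++;
	}
	return strncmp(end, ",low", 4) == 0 ? 0 : -EINVAL;
}

/* Returns the low size to use, or a negative errno for a malformed option. */
static long long choose_low_size(const char *cmdline)
{
	unsigned long long parsed = 0;
	int ret = parse_low(cmdline, &parsed);

	if (ret == -ENOENT)
		return DEFAULT_LOW_SIZE;	/* option absent: use the default */
	if (ret)
		return ret;			/* option present but malformed */
	return (long long)parsed;
}
```

With separate codes, a caller could keep the automatic 256M for a missing
option while rejecting, or at least warning about, a garbage value such as
"crashkernel=aaabbadfadfd,low".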


> 
>>
>>>
>>>> +
>>>> +		crash_max = CRASH_ADDR_HIGH_MAX;
>>>> +	}
>>>> +
>>>> +	fixed_base = !!crash_base;
>>>>  	crash_size = PAGE_ALIGN(crash_size);
>>>>  
>>>>  	/* User specifies base address explicitly. */
>>>>  	if (crash_base)
>>>>  		crash_max = crash_base + crash_size;
>>>>  
>>>> +retry:
>>>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>  					       crash_base, crash_max);
>>>>  	if (!crash_base) {
>>>> +		/*
>>>> +		 * Attempt to fully allocate low memory failed, fall back
>>>> +		 * to high memory, the minimum required low memory will be
>>>> +		 * reserved later.
>>>> +		 */
>>>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>>>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>>>> +			goto retry;
>>>> +		}
>>>> +
>>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>>>  			crash_size);
>>>>  		return;
>>>>  	}
>>>>  
>>>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>>>> +		memblock_phys_free(crash_base, crash_size);
>>>> +		return;
>>>> +	}
>>>> +
>>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>>>  
>>>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>>>>  	 * map. Inform kmemleak so that it won't try to access it.
>>>>  	 */
>>>>  	kmemleak_ignore_phys(crash_base);
>>>> +	if (crashk_low_res.end)
>>>> +		kmemleak_ignore_phys(crashk_low_res.start);
>>>> +
>>>>  	crashk_res.start = crash_base;
>>>>  	crashk_res.end = crash_base + crash_size - 1;
>>>>  	insert_resource(&iomem_resource, &crashk_res);
>>>> -- 
>>>> 2.25.1
>>>>
>>>
>>> .
>>>
>>
>> -- 
>> Regards,
>>   Zhen Lei
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-14  6:44         ` Leizhen (ThunderTown)
@ 2022-02-14  7:09           ` Baoquan He
  0 siblings, 0 replies; 30+ messages in thread
From: Baoquan He @ 2022-02-14  7:09 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/14/22 at 02:44pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/2/11 18:51, Baoquan He wrote:
> > On 02/11/22 at 06:41pm, Leizhen (ThunderTown) wrote:
> >>
> >>
> >> On 2022/2/11 18:30, Baoquan He wrote:
> >>> On 01/24/22 at 04:47pm, Zhen Lei wrote:
> >>>> From: Chen Zhou <chenzhou10@huawei.com>
> >>> ......
> >>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
> >>>> --- a/arch/arm64/mm/init.c
> >>>> +++ b/arch/arm64/mm/init.c
> >>>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
> >>>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
> >>>>  
> >>>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>>> +{
> >>>> +	unsigned long long low_base;
> >>>> +
> >>>> +	/* passed with crashkernel=0,low ? */
> >>>> +	if (!low_size)
> >>>> +		return 0;
> >>>> +
> >>>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> >>>> +	if (!low_base) {
> >>>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> >>>> +		return -ENOMEM;
> >>>> +	}
> >>>> +
> >>>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
> >>>> +		low_base, low_base + low_size, low_size >> 20);
> >>>> +
> >>>> +	crashk_low_res.start = low_base;
> >>>> +	crashk_low_res.end   = low_base + low_size - 1;
> >>>> +	insert_resource(&iomem_resource, &crashk_low_res);
> >>>> +
> >>>> +	return 0;
> >>>> +}
> >>>> +
> >>>>  /*
> >>>>   * reserve_crashkernel() - reserves memory for crash kernel
> >>>>   *
> >>>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>>>  static void __init reserve_crashkernel(void)
> >>>>  {
> >>>>  	unsigned long long crash_base, crash_size;
> >>>> +	unsigned long long crash_low_size = SZ_256M;
> >>>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>>>  	int ret;
> >>>> +	bool fixed_base;
> >>>> +	char *cmdline = boot_command_line;
> >>>>  
> >>>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >>>> +	/* crashkernel=X[@offset] */
> >>>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> >>>>  				&crash_size, &crash_base);
> >>>> -	/* no crashkernel= or invalid value specified */
> >>>> -	if (ret || !crash_size)
> >>>> -		return;
> >>>> +	if (ret || !crash_size) {
> >>>> +		unsigned long long low_size;
> >>>>  
> >>>> +		/* crashkernel=X,high */
> >>>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> >>>> +		if (ret || !crash_size)
> >>>> +			return;
> >>>> +
> >>>> +		/* crashkernel=X,low */
> >>>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> >>>> +		if (!ret)
> >>>> +			crash_low_size = low_size;
> >>>
> >>> Here, the error case is not checked or handled, but the result is
> >>> still the expected one, the default SZ_256M. Is this by
> >>> design?
> >>
> >> Yes, we can specify only "crashkernel=X,high".
> >>
> >> This is mentioned in Documentation/admin-guide/kernel-parameters.txt
> >>
> >>         crashkernel=size[KMG],low
> >>                         [KNL, X86-64] range under 4G. When crashkernel=X,high
> >>                         is passed, kernel could allocate physical memory region
> >>                         above 4G, that cause second kernel crash on system
> >>                         that require some amount of low memory, e.g. swiotlb
> >>                         requires at least 64M+32K low memory, also enough extra
> >>                         low memory is needed to make sure DMA buffers for 32-bit
> >>                         devices won't run out. Kernel would try to allocate at     <---------
> >>                         least 256M below 4G automatically.                         <---------
> > 
> > Yeah, that is expected because omitting crashkernel=,low is a valid usage. The
> > 'ret' is 0 in that case. If I give the string below, it works too.
> > "crashkernel=256M,high crashkernel=aaabbadfadfd,low"
> 
> Yes, so maybe we should change the error code in __parse_crashkernel()
> from "-EINVAL" to "-ENOENT" when the specified option does not exist.

Good point. I also thought of this; it could be a follow-up cleanup. The x86
code needs this too. With crashkernel='messy code',high, the reservation
fails. For consistency, crashkernel='messy code',low should fail
too.
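
To make the intended behaviour concrete, here is a minimal user-space sketch
of how a caller could treat the two error codes once an absent option reports
-ENOENT. The helper name pick_low_size() is hypothetical and not part of the
patch; SZ_256M is redefined locally so the snippet builds outside the kernel.

```c
#include <errno.h>

/* Local stand-in for the kernel's SZ_256M so this builds in user space. */
#define SZ_256M (256ULL << 20)

/*
 * Hypothetical helper, not from the patch. 'ret' and 'parsed' stand for
 * the return value and the size produced by parse_crashkernel_low(),
 * assuming an absent option reports -ENOENT and a malformed value
 * reports -EINVAL.
 *
 * Returns 0 and stores the size to reserve, or a negative errno when
 * the whole reservation should be abandoned.
 */
static int pick_low_size(int ret, unsigned long long parsed,
			 unsigned long long *low_size)
{
	if (ret == 0) {
		/* crashkernel=Y,low was given and parsed cleanly. */
		*low_size = parsed;
		return 0;
	}

	if (ret == -ENOENT) {
		/* No crashkernel=,low at all: use the default. */
		*low_size = SZ_256M;
		return 0;
	}

	/* crashkernel='messy code',low: fail, as ,high already does. */
	return ret;
}
```

With this split, "crashkernel=256M,high crashkernel=aaabbadfadfd,low" would
be rejected instead of silently getting the 256M default.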

> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 256cf6db573cd09..395f4fac1773f28 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>         *crash_base = 0;
> 
>         ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> -
>         if (!ck_cmdline)
> -               return -EINVAL;
> +               return -ENOENT;
> 
>         ck_cmdline += strlen(name);
> 
> 
> > 
> >>
> >>>
> >>>> +
> >>>> +		crash_max = CRASH_ADDR_HIGH_MAX;
> >>>> +	}
> >>>> +
> >>>> +	fixed_base = !!crash_base;
> >>>>  	crash_size = PAGE_ALIGN(crash_size);
> >>>>  
> >>>>  	/* User specifies base address explicitly. */
> >>>>  	if (crash_base)
> >>>>  		crash_max = crash_base + crash_size;
> >>>>  
> >>>> +retry:
> >>>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>>>  					       crash_base, crash_max);
> >>>>  	if (!crash_base) {
> >>>> +		/*
> >>>> +		 * Attempt to fully allocate low memory failed, fall back
> >>>> +		 * to high memory, the minimum required low memory will be
> >>>> +		 * reserved later.
> >>>> +		 */
> >>>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> >>>> +			crash_max = CRASH_ADDR_HIGH_MAX;
> >>>> +			goto retry;
> >>>> +		}
> >>>> +
> >>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> >>>>  			crash_size);
> >>>>  		return;
> >>>>  	}
> >>>>  
> >>>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> >>>> +		memblock_phys_free(crash_base, crash_size);
> >>>> +		return;
> >>>> +	}
> >>>> +
> >>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
> >>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
> >>>>  
> >>>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
> >>>>  	 * map. Inform kmemleak so that it won't try to access it.
> >>>>  	 */
> >>>>  	kmemleak_ignore_phys(crash_base);
> >>>> +	if (crashk_low_res.end)
> >>>> +		kmemleak_ignore_phys(crashk_low_res.start);
> >>>> +
> >>>>  	crashk_res.start = crash_base;
> >>>>  	crashk_res.end = crash_base + crash_size - 1;
> >>>>  	insert_resource(&iomem_resource, &crashk_res);
> >>>> -- 
> >>>> 2.25.1
> >>>>
> >>>
> >>> .
> >>>
> >>
> >> -- 
> >> Regards,
> >>   Zhen Lei
> >>
> > 
> > .
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 



* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-14  3:52   ` Baoquan He
@ 2022-02-14  7:53     ` Leizhen (ThunderTown)
  2022-02-16  2:58       ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-14  7:53 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/14 11:52, Baoquan He wrote:
> On 01/24/22 at 04:47pm, Zhen Lei wrote:
> ......
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>>  
>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>> +{
>> +	unsigned long long low_base;
>> +
>> +	/* passed with crashkernel=0,low ? */
>> +	if (!low_size)
>> +		return 0;
>> +
>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
>> +	if (!low_base) {
>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
>> +		low_base, low_base + low_size, low_size >> 20);
>> +
>> +	crashk_low_res.start = low_base;
>> +	crashk_low_res.end   = low_base + low_size - 1;
>> +	insert_resource(&iomem_resource, &crashk_low_res);
>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
> 
> My other concern is the crashkernel=,low handling. In this patch, the
> code related to low memory is obscure. I wonder if we should make it
> explicit with slightly redundant but very clear code flows. I say this
> because, while the code must be very clear to you and the reviewers, it
> may be harder for later readers or anyone interested to understand.
> 
> 1) crashkernel=X,high
> 2) crashkernel=X,high crashkernel=Y,low
> 3) crashkernel=X,high crashkernel=0,low
> 4) crashkernel=X,high crashkernel='messy code',low
> 5) crashkernel=X //fall back to high memory, low memory is required then.
> 
> It could be me thinking about it too much. I made changes to your patch
> with a tuning, not sure if it's OK to you. Otherwise, this patchset

I think it's good.

> works very well for all above test cases, it's ripe to be merged for
> wider testing.

I will test it tomorrow. I've prepared a few more test cases than yours.

1) crashkernel=4G						//high=4G, low=256M
2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
6) crashkernel=512M						//high=0G, low=512M
7) crashkernel=128M						//high=0G, low=128M
8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
9) crashkernel=4G,high						//high=4G, low=256M
a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
c) crashkernel=512M,low						//high=0G, low=0M, invalid
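
The expectations in this table can be written down as a small user-space
model. This is only a sketch under stated assumptions, not the kernel code:
struct ck_request, struct ck_plan, plan_crashkernel() and the low_free
parameter (free memory below the DMA limit) are all hypothetical names, a
fixed base is assumed to succeed whenever the size fits in low memory, and
crashkernel=X,high is assumed to always land above 4G.

```c
#include <stdbool.h>

#define MiB(x)		((unsigned long long)(x) << 20)
#define GiB(x)		((unsigned long long)(x) << 30)
#define DEFAULT_LOW	MiB(256)

/* Parsed command line; a size of 0 means the option is absent. */
struct ck_request {
	unsigned long long size;	/* crashkernel=X[@offset] */
	bool fixed_base;		/* an @offset was given */
	unsigned long long high_size;	/* crashkernel=X,high */
	bool low_given;			/* crashkernel=Y,low present */
	unsigned long long low_size;	/* Y from crashkernel=Y,low */
};

/* Outcome in the notation of the table: bytes above and below 4G. */
struct ck_plan {
	bool ok;
	unsigned long long high;
	unsigned long long low;
};

static struct ck_plan plan_crashkernel(const struct ck_request *r,
				       unsigned long long low_free)
{
	struct ck_plan fail = { false, 0, 0 };
	struct ck_plan p = { true, 0, 0 };

	if (r->size) {
		/* crashkernel=X wins; ,high and ,low are ignored. */
		if (r->size <= low_free) {
			p.low = r->size;
			return p;
		}

		/* A fixed base never falls back to high memory. */
		if (r->fixed_base || low_free < DEFAULT_LOW)
			return fail;

		/* Fall back to high memory plus the default low region. */
		p.high = r->size;
		p.low = DEFAULT_LOW;
		return p;
	}

	/* crashkernel=Y,low without ,high is invalid. */
	if (!r->high_size)
		return fail;

	p.high = r->high_size;
	p.low = r->low_given ? r->low_size : DEFAULT_LOW;
	return p.low <= low_free ? p : fail;
}
```

For example, with 2G free below the DMA limit, case 1) maps to
{ .size = GiB(4) } and yields high=4G, low=256M, while case c) maps to
{ .low_given = true, .low_size = MiB(512) } and is rejected.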


> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index a5d43feac0d7..671862c56d7d 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -94,7 +94,8 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>  
>  	return 0;
>  }
> -
> +/*Words explaining why it's 256M*/
> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
> @@ -105,10 +106,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>  static void __init reserve_crashkernel(void)
>  {
>  	unsigned long long crash_base, crash_size;
> -	unsigned long long crash_low_size = SZ_256M;
> +	unsigned long long crash_low_size;
>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>  	int ret;
> -	bool fixed_base;
> +	bool fixed_base, high;
>  	char *cmdline = boot_command_line;
>  
>  	/* crashkernel=X[@offset] */
> @@ -126,7 +127,10 @@ static void __init reserve_crashkernel(void)
>  		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>  		if (!ret)
>  			crash_low_size = low_size;
> +		else
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>  
> +		high = true;
>  		crash_max = CRASH_ADDR_HIGH_MAX;
>  	}
>  
> @@ -134,7 +138,7 @@ static void __init reserve_crashkernel(void)
>  	crash_size = PAGE_ALIGN(crash_size);
>  
>  	/* User specifies base address explicitly. */
> -	if (crash_base)
> +	if (fixed_base)
>  		crash_max = crash_base + crash_size;
>  
>  retry:
> @@ -156,7 +160,10 @@ static void __init reserve_crashkernel(void)
>  		return;
>  	}
>  
> -	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> +	if (crash_base >= SZ_4G && !high) 
> +		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +
> +	if (reserve_crashkernel_low(crash_low_size)) {
>  		memblock_phys_free(crash_base, crash_size);
>  		return;
>  	}

It feels like {} needs to be added here so that both statements are inside the
"if (crash_base >= SZ_4G)" branch. A case such as "crashkernel=128M" does not
fall back to high memory, so it does not need another low memory reservation.

> 
>>   *
>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  static void __init reserve_crashkernel(void)
>>  {
>>  	unsigned long long crash_base, crash_size;
>> +	unsigned long long crash_low_size = SZ_256M;
>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>  	int ret;
>> +	bool fixed_base;
>> +	char *cmdline = boot_command_line;
>>  
>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +	/* crashkernel=X[@offset] */
>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>  				&crash_size, &crash_base);
>> -	/* no crashkernel= or invalid value specified */
>> -	if (ret || !crash_size)
>> -		return;
>> +	if (ret || !crash_size) {
>> +		unsigned long long low_size;
>>  
>> +		/* crashkernel=X,high */
>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>> +		if (ret || !crash_size)
>> +			return;
>> +
>> +		/* crashkernel=X,low */
>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>> +		if (!ret)
>> +			crash_low_size = low_size;
>> +
>> +		crash_max = CRASH_ADDR_HIGH_MAX;
>> +	}
>> +
>> +	fixed_base = !!crash_base;
>>  	crash_size = PAGE_ALIGN(crash_size);
>>  
>>  	/* User specifies base address explicitly. */
>>  	if (crash_base)
>>  		crash_max = crash_base + crash_size;
>>  
>> +retry:
>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>  					       crash_base, crash_max);
>>  	if (!crash_base) {
>> +		/*
>> +		 * Attempt to fully allocate low memory failed, fall back
>> +		 * to high memory, the minimum required low memory will be
>> +		 * reserved later.
>> +		 */
>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>> +			goto retry;
>> +		}
>> +
>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>  			crash_size);
>>  		return;
>>  	}
>>  
>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>> +		memblock_phys_free(crash_base, crash_size);
>> +		return;
>> +	}
>> +
>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>  
>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>>  	 * map. Inform kmemleak so that it won't try to access it.
>>  	 */
>>  	kmemleak_ignore_phys(crash_base);
>> +	if (crashk_low_res.end)
>> +		kmemleak_ignore_phys(crashk_low_res.start);
>> +
>>  	crashk_res.start = crash_base;
>>  	crashk_res.end = crash_base + crash_size - 1;
>>  	insert_resource(&iomem_resource, &crashk_res);
>> -- 
>> 2.25.1
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-14  7:53     ` Leizhen (ThunderTown)
@ 2022-02-16  2:58       ` Leizhen (ThunderTown)
  2022-02-16 10:20         ` Baoquan He
  0 siblings, 1 reply; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-16  2:58 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/14 15:53, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/2/14 11:52, Baoquan He wrote:
>> On 01/24/22 at 04:47pm, Zhen Lei wrote:
>> ......
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>>>  
>>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>>> +{
>>> +	unsigned long long low_base;
>>> +
>>> +	/* passed with crashkernel=0,low ? */
>>> +	if (!low_size)
>>> +		return 0;
>>> +
>>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
>>> +	if (!low_base) {
>>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
>>> +		low_base, low_base + low_size, low_size >> 20);
>>> +
>>> +	crashk_low_res.start = low_base;
>>> +	crashk_low_res.end   = low_base + low_size - 1;
>>> +	insert_resource(&iomem_resource, &crashk_low_res);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>  /*
>>>   * reserve_crashkernel() - reserves memory for crash kernel
>>
>> My other concern is the crashkernel=,low handling. In this patch, the
>> code related to low memory is obscure. I wonder if we should make it
>> explicit with slightly redundant but very clear code flows. I say this
>> because, while the code must be very clear to you and the reviewers, it
>> may be harder for later readers or anyone interested to understand.
>>
>> 1) crashkernel=X,high
>> 2) crashkernel=X,high crashkernel=Y,low
>> 3) crashkernel=X,high crashkernel=0,low
>> 4) crashkernel=X,high crashkernel='messy code',low
>> 5) crashkernel=X //fall back to high memory, low memory is required then.
>>
>> It could be me thinking about it too much. I made changes to your patch
>> with a tuning, not sure if it's OK to you. Otherwise, this patchset
> 
> I think it's good.
> 
>> works very well for all above test cases, it's ripe to be merged for
>> wider testing.
> 
> I will test it tomorrow. I've prepared a little more use cases than yours.

With the modifications below, I have tested it and it works well. It passed
all the test cases I prepared.

> 
> 1) crashkernel=4G						//high=4G, low=256M
> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> 6) crashkernel=512M						//high=0G, low=512M
> 7) crashkernel=128M						//high=0G, low=128M
> 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> 9) crashkernel=4G,high						//high=4G, low=256M
> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> c) crashkernel=512M,low						//high=0G, low=0M, invalid
> 
> 
>>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index a5d43feac0d7..671862c56d7d 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -94,7 +94,8 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>>  
>>  	return 0;
>>  }
>> -
>> +/*Words explaining why it's 256M*/
>> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M

It's an empirical value.

94fb9334182284e8e7e4bcb9125c25dc33af19d4 x86/crash: Allocate enough low memory when crashkernel=high

    When the crash kernel is loaded above 4GiB in memory, the
    first kernel allocates only 72MiB of low-memory for the DMA
    requirements of the second kernel. On systems with many
    devices this is not enough and causes device driver
    initialization errors and failed crash dumps. Testing by
    SUSE and Redhat has shown that 256MiB is a good default
    value for now and the discussion has lead to this value as
    well. So set this default value to 256MiB to make sure there
    is enough memory available for DMA.


>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>> @@ -105,10 +106,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>>  static void __init reserve_crashkernel(void)
>>  {
>>  	unsigned long long crash_base, crash_size;
>> -	unsigned long long crash_low_size = SZ_256M;
>> +	unsigned long long crash_low_size;
>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>  	int ret;
>> -	bool fixed_base;
>> +	bool fixed_base, high;

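'high' needs an initializer here, since it is only assigned in the
crashkernel=X,high branch:

	bool fixed_base, high = false;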

>>  	char *cmdline = boot_command_line;
>>  
>>  	/* crashkernel=X[@offset] */
>> @@ -126,7 +127,10 @@ static void __init reserve_crashkernel(void)
>>  		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>>  		if (!ret)
>>  			crash_low_size = low_size;
>> +		else
>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>  
>> +		high = true;
>>  		crash_max = CRASH_ADDR_HIGH_MAX;
>>  	}
>>  
>> @@ -134,7 +138,7 @@ static void __init reserve_crashkernel(void)
>>  	crash_size = PAGE_ALIGN(crash_size);
>>  
>>  	/* User specifies base address explicitly. */
>> -	if (crash_base)
>> +	if (fixed_base)
>>  		crash_max = crash_base + crash_size;
>>  
>>  retry:
>> @@ -156,7 +160,10 @@ static void __init reserve_crashkernel(void)
>>  		return;
>>  	}
>>  
>> -	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>> +	if (crash_base >= SZ_4G && !high) 
>> +		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +
>> +	if (reserve_crashkernel_low(crash_low_size)) {
>>  		memblock_phys_free(crash_base, crash_size);
>>  		return;
>>  	}

-       if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
-               memblock_phys_free(crash_base, crash_size);
-               return;
+       if (crash_base >= SZ_4G) {
+               if (!high)
+                       crash_low_size = SZ_256M;
+
+               if (reserve_crashkernel_low(crash_low_size)) {
+                       memblock_phys_free(crash_base, crash_size);
+                       return;
+               }
        }

It looks like renaming 'high' to 'low' would be more accurate: the flag would
then indicate whether crashkernel=Y,low is specified.
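
As a rough illustration of the braces fix combined with the renamed flag,
the decision can be isolated in a user-space helper. This is a sketch only:
low_size_to_reserve() and the low_given parameter are hypothetical names,
low_given is assumed to be true only when crashkernel=Y,low was parsed
together with crashkernel=X,high, and the SZ_* constants are redefined
locally.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel constants. */
#define SZ_4G	(4ULL << 30)
#define SZ_256M	(256ULL << 20)

/*
 * Hypothetical helper, not from the patch: how much extra low memory to
 * reserve once the main region has been placed at crash_base.
 */
static unsigned long long low_size_to_reserve(unsigned long long crash_base,
					      bool low_given,
					      unsigned long long low_size)
{
	/* Main region is already below 4G: nothing more to reserve. */
	if (crash_base < SZ_4G)
		return 0;

	/* Above 4G: honour crashkernel=Y,low, else use the default. */
	return low_given ? low_size : SZ_256M;
}
```

A return value of 0 covers both "crashkernel=128M" placed in low memory and
an explicit "crashkernel=0,low".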


> 
> It feels like {} may need to be added here so that it is in branch "if (crash_base >= SZ_4G)".
> The case of "crashkernel=128M" will not fall back to high memory and does not need to reserve
> low memory again.
> 
>>
>>>   *
>>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>  static void __init reserve_crashkernel(void)
>>>  {
>>>  	unsigned long long crash_base, crash_size;
>>> +	unsigned long long crash_low_size = SZ_256M;
>>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>  	int ret;
>>> +	bool fixed_base;
>>> +	char *cmdline = boot_command_line;
>>>  
>>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>>> +	/* crashkernel=X[@offset] */
>>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>>  				&crash_size, &crash_base);
>>> -	/* no crashkernel= or invalid value specified */
>>> -	if (ret || !crash_size)
>>> -		return;
>>> +	if (ret || !crash_size) {
>>> +		unsigned long long low_size;
>>>  
>>> +		/* crashkernel=X,high */
>>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>>> +		if (ret || !crash_size)
>>> +			return;
>>> +
>>> +		/* crashkernel=X,low */
>>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>>> +		if (!ret)
>>> +			crash_low_size = low_size;
>>> +
>>> +		crash_max = CRASH_ADDR_HIGH_MAX;
>>> +	}
>>> +
>>> +	fixed_base = !!crash_base;
>>>  	crash_size = PAGE_ALIGN(crash_size);
>>>  
>>>  	/* User specifies base address explicitly. */
>>>  	if (crash_base)
>>>  		crash_max = crash_base + crash_size;
>>>  
>>> +retry:
>>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>  					       crash_base, crash_max);
>>>  	if (!crash_base) {
>>> +		/*
>>> +		 * Attempt to fully allocate low memory failed, fall back
>>> +		 * to high memory, the minimum required low memory will be
>>> +		 * reserved later.
>>> +		 */
>>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>>> +			goto retry;
>>> +		}
>>> +
>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>>  			crash_size);
>>>  		return;
>>>  	}
>>>  
>>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>>> +		memblock_phys_free(crash_base, crash_size);
>>> +		return;
>>> +	}
>>> +
>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>>  
>>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>>>  	 * map. Inform kmemleak so that it won't try to access it.
>>>  	 */
>>>  	kmemleak_ignore_phys(crash_base);
>>> +	if (crashk_low_res.end)
>>> +		kmemleak_ignore_phys(crashk_low_res.start);
>>> +
>>>  	crashk_res.start = crash_base;
>>>  	crashk_res.end = crash_base + crash_size - 1;
>>>  	insert_resource(&iomem_resource, &crashk_res);
>>> -- 
>>> 2.25.1
>>>
>>
>> .
>>
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-16  2:58       ` Leizhen (ThunderTown)
@ 2022-02-16 10:20         ` Baoquan He
  2022-02-17  1:57           ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 30+ messages in thread
From: Baoquan He @ 2022-02-16 10:20 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/16/22 at 10:58am, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/2/14 15:53, Leizhen (ThunderTown) wrote:
> > 
> > 
> > On 2022/2/14 11:52, Baoquan He wrote:
> >> On 01/24/22 at 04:47pm, Zhen Lei wrote:
> >> ......
> >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
> >>> --- a/arch/arm64/mm/init.c
> >>> +++ b/arch/arm64/mm/init.c
> >>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
> >>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
> >>>  
> >>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>> +{
> >>> +	unsigned long long low_base;
> >>> +
> >>> +	/* passed with crashkernel=0,low ? */
> >>> +	if (!low_size)
> >>> +		return 0;
> >>> +
> >>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> >>> +	if (!low_base) {
> >>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> >>> +		return -ENOMEM;
> >>> +	}
> >>> +
> >>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
> >>> +		low_base, low_base + low_size, low_size >> 20);
> >>> +
> >>> +	crashk_low_res.start = low_base;
> >>> +	crashk_low_res.end   = low_base + low_size - 1;
> >>> +	insert_resource(&iomem_resource, &crashk_low_res);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>>  /*
> >>>   * reserve_crashkernel() - reserves memory for crash kernel
> >>
> >> My other concern is the crashkernel=,low handling. In this patch, the
> >> code related to low memory is obscure. I wonder if we should make it
> >> explicit with slightly redundant but very clear code flows. I say this
> >> because, while the code must be very clear to you and the reviewers, it
> >> may be harder for later readers or anyone interested to understand.
> >>
> >> 1) crashkernel=X,high
> >> 2) crashkernel=X,high crashkernel=Y,low
> >> 3) crashkernel=X,high crashkernel=0,low
> >> 4) crashkernel=X,high crashkernel='messy code',low
> >> 5) crashkernel=X //fall back to high memory, low memory is required then.
> >>
> >> It could be me thinking about it too much. I made changes to your patch
> >> with a tuning, not sure if it's OK to you. Otherwise, this patchset
> > 
> > I think it's good.
> > 
> >> works very well for all above test cases, it's ripe to be merged for
> >> wider testing.
> > 
> > I will test it tomorrow. I've prepared a little more use cases than yours.
> 
> After the following modifications, I have tested it and it works well. Passed
> all the test cases I prepared.

That's great.

You might need to add 'crashkernel=xM crashkernel=0,low' and
'crashkernel=xM crashkernel='messy code',low' to your test cases.

> 
> > 
> > 1) crashkernel=4G						//high=4G, low=256M
> > 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> > 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> > 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> > 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> > 6) crashkernel=512M						//high=0G, low=512M
> > 7) crashkernel=128M						//high=0G, low=128M
> > 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> > 9) crashkernel=4G,high						//high=4G, low=256M
> > a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> > b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> > c) crashkernel=512M,low						//high=0G, low=0M, invalid
> > 
> > 
> >>
> >> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >> index a5d43feac0d7..671862c56d7d 100644
> >> --- a/arch/arm64/mm/init.c
> >> +++ b/arch/arm64/mm/init.c
> >> @@ -94,7 +94,8 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>  
> >>  	return 0;
> >>  }
> >> -
> >> +/*Words explaining why it's 256M*/
> >> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M
> 
> It's an empirical value.
> 
> 94fb9334182284e8e7e4bcb9125c25dc33af19d4 x86/crash: Allocate enough low memory when crashkernel=high
> 
>     When the crash kernel is loaded above 4GiB in memory, the
>     first kernel allocates only 72MiB of low-memory for the DMA
>     requirements of the second kernel. On systems with many
>     devices this is not enough and causes device driver
>     initialization errors and failed crash dumps. Testing by
>     SUSE and Redhat has shown that 256MiB is a good default
>     value for now and the discussion has lead to this value as
>     well. So set this default value to 256MiB to make sure there
>     is enough memory available for DMA.

Then some words like the ones below can be added. I am not confident they are
good enough; I hope someone else can help polish them.

/*
 * This is an empirical value in x86_64 and taken here directly. Please
 * refer to code comment in reserve_crashkernel_low() of x86_64 for more
 * details.
 */
#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M

> 
> 
> >>  /*
> >>   * reserve_crashkernel() - reserves memory for crash kernel
> >>   *
> >> @@ -105,10 +106,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>  static void __init reserve_crashkernel(void)
> >>  {
> >>  	unsigned long long crash_base, crash_size;
> >> -	unsigned long long crash_low_size = SZ_256M;
> >> +	unsigned long long crash_low_size;
> >>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>  	int ret;
> >> -	bool fixed_base;
> >> +	bool fixed_base, high;
> 
> high = false;
> 
> >>  	char *cmdline = boot_command_line;
> >>  
> >>  	/* crashkernel=X[@offset] */
> >> @@ -126,7 +127,10 @@ static void __init reserve_crashkernel(void)
> >>  		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> >>  		if (!ret)
> >>  			crash_low_size = low_size;
> >> +		else
> >> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >>  
> >> +		high = true;
> >>  		crash_max = CRASH_ADDR_HIGH_MAX;
> >>  	}
> >>  
> >> @@ -134,7 +138,7 @@ static void __init reserve_crashkernel(void)
> >>  	crash_size = PAGE_ALIGN(crash_size);
> >>  
> >>  	/* User specifies base address explicitly. */
> >> -	if (crash_base)
> >> +	if (fixed_base)
> >>  		crash_max = crash_base + crash_size;
> >>  
> >>  retry:
> >> @@ -156,7 +160,10 @@ static void __init reserve_crashkernel(void)
> >>  		return;
> >>  	}
> >>  
> >> -	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> >> +	if (crash_base >= SZ_4G && !high) 
> >> +		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >> +
> >> +	if (reserve_crashkernel_low(crash_low_size)) {
> >>  		memblock_phys_free(crash_base, crash_size);
> >>  		return;
> >>  	}
> 
> -       if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> -               memblock_phys_free(crash_base, crash_size);
> -               return;
> +       if (crash_base >= SZ_4G) {
> +               if (!high)
> +                       crash_low_size = SZ_256M;
> +
> +               if (reserve_crashkernel_low(crash_low_size)) {
> +                       memblock_phys_free(crash_base, crash_size);
> +                       return;
> +               }
>         }
> 
> Looks like changing 'high' to 'low' would be more accurate. Whether crashkernel=Y,low is specified.

What I meant is like below; we can even add code comments to make it
clearer.

static void __init reserve_crashkernel(void)
{

        /* crashkernel=X[@offset] */
        ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
                                &crash_size, &crash_base);
        if (ret || !crash_size) {
                unsigned long long low_size;

                /* crashkernel=X,high */
                ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
                if (ret || !crash_size)
                        return;

                /* crashkernel=X,low */
                ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
		//case #1, crashkernel=yM,low is specified explicitly in cmdline
                if (!ret)
                        crash_low_size = low_size;
		else //case #2, crashkernel=yM,low is not specified explicitly
                        crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;

		//high means crashkernel,high is specified explicitly
		high = true;
                crash_max = CRASH_ADDR_HIGH_MAX;
        }

        fixed_base = !!crash_base;
        crash_size = PAGE_ALIGN(crash_size);

        /* User specifies base address explicitly. */
        if (crash_base)
                crash_max = crash_base + crash_size;
retry:
        crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                               crash_base, crash_max);
        if (!crash_base) {
                /*
                 * Attempt to fully allocate low memory failed, fall back
                 * to high memory, the minimum required low memory will be
                 * reserved later.
                 */
                if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
                        crash_max = CRASH_ADDR_HIGH_MAX;
                        goto retry;
                }

                pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
                        crash_size);
                return;
        }


	//case #3: get crashkernel from high memory through fallback, let's set crashkernel,low too.
        if (crash_base >= SZ_4G && !high)
		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;	

        if (reserve_crashkernel_low(crash_low_size)) {
                memblock_phys_free(crash_base, crash_size);
                return;
        }

        pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
                crash_base, crash_base + crash_size, crash_size >> 20);

        /*
         * The crashkernel memory will be removed from the kernel linear
         * map. Inform kmemleak so that it won't try to access it.
         */
        kmemleak_ignore_phys(crash_base);
        if (crashk_low_res.end)
                kmemleak_ignore_phys(crashk_low_res.start);

        crashk_res.start = crash_base;
        crashk_res.end = crash_base + crash_size - 1;
        insert_resource(&iomem_resource, &crashk_res);
}
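
For readers following along, the three cases commented in the function
above can be modeled in user space. This is only a sketch, not the patch
itself: struct crash_request and pick_low_size() are hypothetical names,
and it assumes the braced variant discussed in this thread, where low
memory is reserved only when the main region ends up at or above 4G.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M (1ULL << 20)
#define SZ_4G (1ULL << 32)
/* Empirical default taken from x86_64, as discussed in the thread. */
#define DEFAULT_CRASH_KERNEL_LOW_SIZE (256 * SZ_1M)

/* Hypothetical summary of what was parsed from the command line. */
struct crash_request {
	bool high;         /* no plain crashkernel=X; crashkernel=X,high parsed */
	bool low_given;    /* crashkernel=Y,low parsed (only looked at if high) */
	uint64_t low_size; /* Y; meaningful only when low_given is true */
};

/*
 * Return how much low memory should be reserved, given where the main
 * crashkernel region was finally allocated. 0 means "reserve nothing".
 */
uint64_t pick_low_size(const struct crash_request *req, uint64_t crash_base)
{
	/* The main region is already low; no extra low region is needed. */
	if (crash_base < SZ_4G)
		return 0;

	/* Case #3: plain crashkernel=X fell back to high memory. */
	if (!req->high)
		return DEFAULT_CRASH_KERNEL_LOW_SIZE;

	/* Case #1: crashkernel=Y,low given explicitly (Y may be 0). */
	if (req->low_given)
		return req->low_size;

	/* Case #2: crashkernel=X,high without a valid ,low. */
	return DEFAULT_CRASH_KERNEL_LOW_SIZE;
}
```

With this model, "crashkernel=4G,high crashkernel=512M,low" yields 512M,
"crashkernel=4G,high crashkernel=0,low" yields 0, and a plain
"crashkernel=128M" that fits in low memory yields 0 as well, so
reserve_crashkernel_low() would not be asked for a second low region.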


> 
> 
> > 
> > It feels like {} may need to be added here so that it is in branch "if (crash_base >= SZ_4G)".
> > The case of "crashkernel=128M" will not fall back to high memory and does not need to reserve
> > low memory again.
> > 
> >>
> >>>   *
> >>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>>  static void __init reserve_crashkernel(void)
> >>>  {
> >>>  	unsigned long long crash_base, crash_size;
> >>> +	unsigned long long crash_low_size = SZ_256M;
> >>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>>  	int ret;
> >>> +	bool fixed_base;
> >>> +	char *cmdline = boot_command_line;
> >>>  
> >>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >>> +	/* crashkernel=X[@offset] */
> >>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> >>>  				&crash_size, &crash_base);
> >>> -	/* no crashkernel= or invalid value specified */
> >>> -	if (ret || !crash_size)
> >>> -		return;
> >>> +	if (ret || !crash_size) {
> >>> +		unsigned long long low_size;
> >>>  
> >>> +		/* crashkernel=X,high */
> >>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> >>> +		if (ret || !crash_size)
> >>> +			return;
> >>> +
> >>> +		/* crashkernel=X,low */
> >>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> >>> +		if (!ret)
> >>> +			crash_low_size = low_size;
> >>> +
> >>> +		crash_max = CRASH_ADDR_HIGH_MAX;
> >>> +	}
> >>> +
> >>> +	fixed_base = !!crash_base;
> >>>  	crash_size = PAGE_ALIGN(crash_size);
> >>>  
> >>>  	/* User specifies base address explicitly. */
> >>>  	if (crash_base)
> >>>  		crash_max = crash_base + crash_size;
> >>>  
> >>> +retry:
> >>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>>  					       crash_base, crash_max);
> >>>  	if (!crash_base) {
> >>> +		/*
> >>> +		 * Attempt to fully allocate low memory failed, fall back
> >>> +		 * to high memory, the minimum required low memory will be
> >>> +		 * reserved later.
> >>> +		 */
> >>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> >>> +			crash_max = CRASH_ADDR_HIGH_MAX;
> >>> +			goto retry;
> >>> +		}
> >>> +
> >>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> >>>  			crash_size);
> >>>  		return;
> >>>  	}
> >>>  
> >>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> >>> +		memblock_phys_free(crash_base, crash_size);
> >>> +		return;
> >>> +	}
> >>> +
> >>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
> >>>  		crash_base, crash_base + crash_size, crash_size >> 20);
> >>>  
> >>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
> >>>  	 * map. Inform kmemleak so that it won't try to access it.
> >>>  	 */
> >>>  	kmemleak_ignore_phys(crash_base);
> >>> +	if (crashk_low_res.end)
> >>> +		kmemleak_ignore_phys(crashk_low_res.start);
> >>> +
> >>>  	crashk_res.start = crash_base;
> >>>  	crashk_res.end = crash_base + crash_size - 1;
> >>>  	insert_resource(&iomem_resource, &crashk_res);
> >>> -- 
> >>> 2.25.1
> >>>
> >>
> >> .
> >>
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-16 10:20         ` Baoquan He
@ 2022-02-17  1:57           ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-17  1:57 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/16 18:20, Baoquan He wrote:
> On 02/16/22 at 10:58am, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/2/14 15:53, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2022/2/14 11:52, Baoquan He wrote:
>>>> On 01/24/22 at 04:47pm, Zhen Lei wrote:
>>>> ......
>>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
>>>>> --- a/arch/arm64/mm/init.c
>>>>> +++ b/arch/arm64/mm/init.c
>>>>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>>>>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>>>>>  
>>>>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>>>>> +{
>>>>> +	unsigned long long low_base;
>>>>> +
>>>>> +	/* passed with crashkernel=0,low ? */
>>>>> +	if (!low_size)
>>>>> +		return 0;
>>>>> +
>>>>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
>>>>> +	if (!low_base) {
>>>>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>>>>> +		return -ENOMEM;
>>>>> +	}
>>>>> +
>>>>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
>>>>> +		low_base, low_base + low_size, low_size >> 20);
>>>>> +
>>>>> +	crashk_low_res.start = low_base;
>>>>> +	crashk_low_res.end   = low_base + low_size - 1;
>>>>> +	insert_resource(&iomem_resource, &crashk_low_res);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>>  /*
>>>>>   * reserve_crashkernel() - reserves memory for crash kernel
>>>>
>>>> My another concern is the crashkernel=,low handling. In this patch, the
>>>> code related to low memory is obscure. Wondering if we should make them
>>>> explicit with a little redundant but very clear code flows. Saying this
>>>> because the code must be very clear to you and reviewers, it may be
>>>> harder for later code reader or anyone interested to understand.
>>>>
>>>> 1) crashkernel=X,high
>>>> 2) crashkernel=X,high crashkernel=Y,low
>>>> 3) crashkernel=X,high crashkernel=0,low
>>>> 4) crashkernel=X,high crashkernel='messy code',low
>>>> 5) crashkernel=X //fall back to high memory, low memory is required then.
>>>>
>>>> It could be me thinking about it too much. I made changes to your patch
>>>> with a tuning, not sure if it's OK to you. Otherwise, this patchset
>>>
>>> I think it's good.
>>>
>>>> works very well for all above test cases, it's ripe to be merged for
>>>> wider testing.
>>>
>>> I will test it tomorrow. I've prepared a little more use cases than yours.
>>
>> After the following modifications, I have tested it and it works well. Passed
>> all the test cases I prepared.
> 
> That's great.
> 
> You might need to add 'crashkernel=xM, crashkernel=0,low',
> 'crashkernel=xM, crashkernel='messy code',low' to your test cases.

Oh, right, I will add them.

> 
>>
>>>
>>> 1) crashkernel=4G						//high=4G, low=256M
>>> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
>>> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
>>> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
>>> 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
>>> 6) crashkernel=512M						//high=0G, low=512M
>>> 7) crashkernel=128M						//high=0G, low=128M
>>> 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
>>> 9) crashkernel=4G,high						//high=4G, low=256M
>>> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
>>> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
>>> c) crashkernel=512M,low						//high=0G, low=0M, invalid
>>>
>>>
>>>>
>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>> index a5d43feac0d7..671862c56d7d 100644
>>>> --- a/arch/arm64/mm/init.c
>>>> +++ b/arch/arm64/mm/init.c
>>>> @@ -94,7 +94,8 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>>>>  
>>>>  	return 0;
>>>>  }
>>>> -
>>>> +/*Words explaining why it's 256M*/
>>>> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M
>>
>> It's an empirical value.
>>
>> 94fb9334182284e8e7e4bcb9125c25dc33af19d4 x86/crash: Allocate enough low memory when crashkernel=high
>>
>>     When the crash kernel is loaded above 4GiB in memory, the
>>     first kernel allocates only 72MiB of low-memory for the DMA
>>     requirements of the second kernel. On systems with many
>>     devices this is not enough and causes device driver
>>     initialization errors and failed crash dumps. Testing by
>>     SUSE and Redhat has shown that 256MiB is a good default
>>     value for now and the discussion has lead to this value as
>>     well. So set this default value to 256MiB to make sure there
>>     is enough memory available for DMA.
> 
> Then, some words like below can be added. I am not confident it's good
> enought, hope someone else can help to polish it.
> 
> /*
>  * This is an empirical value in x86_64 and taken here directly. Please
>  * refer to code comment in reserve_crashkernel_low() of x86_64 for more
>  * details.
>  */
> #define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M

I think it's good. If no correction is made, I will use it.

"code comment" --> "the code comment"

> 
>>
>>
>>>>  /*
>>>>   * reserve_crashkernel() - reserves memory for crash kernel
>>>>   *
>>>> @@ -105,10 +106,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>>>>  static void __init reserve_crashkernel(void)
>>>>  {
>>>>  	unsigned long long crash_base, crash_size;
>>>> -	unsigned long long crash_low_size = SZ_256M;
>>>> +	unsigned long long crash_low_size;
>>>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>  	int ret;
>>>> -	bool fixed_base;
>>>> +	bool fixed_base, high;
>>
>> high = false;
>>
>>>>  	char *cmdline = boot_command_line;
>>>>  
>>>>  	/* crashkernel=X[@offset] */
>>>> @@ -126,7 +127,10 @@ static void __init reserve_crashkernel(void)
>>>>  		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>>>>  		if (!ret)
>>>>  			crash_low_size = low_size;
>>>> +		else
>>>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>>  
>>>> +		high = true;
>>>>  		crash_max = CRASH_ADDR_HIGH_MAX;
>>>>  	}
>>>>  
>>>> @@ -134,7 +138,7 @@ static void __init reserve_crashkernel(void)
>>>>  	crash_size = PAGE_ALIGN(crash_size);
>>>>  
>>>>  	/* User specifies base address explicitly. */
>>>> -	if (crash_base)
>>>> +	if (fixed_base)
>>>>  		crash_max = crash_base + crash_size;
>>>>  
>>>>  retry:
>>>> @@ -156,7 +160,10 @@ static void __init reserve_crashkernel(void)
>>>>  		return;
>>>>  	}
>>>>  
>>>> -	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>>>> +	if (crash_base >= SZ_4G && !high) 
>>>> +		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>> +
>>>> +	if (reserve_crashkernel_low(crash_low_size)) {
>>>>  		memblock_phys_free(crash_base, crash_size);
>>>>  		return;
>>>>  	}
>>
>> -       if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>> -               memblock_phys_free(crash_base, crash_size);
>> -               return;
>> +       if (crash_base >= SZ_4G) {
>> +               if (!high)
>> +                       crash_low_size = SZ_256M;
>> +
>> +               if (reserve_crashkernel_low(crash_low_size)) {
>> +                       memblock_phys_free(crash_base, crash_size);
>> +                       return;
>> +               }
>>         }
>>
>> Looks like changing 'high' to 'low' would be more accurate. Whether crashkernel=Y,low is specified.
> 
> What I menat is like below, we even can add code comment to make it more
> clearer.

OK, I got it. I'll add the necessary comments. Thanks.

> 
> static void __init reserve_crashkernel(void)
> {
> 
>         /* crashkernel=X[@offset] */
>         ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>                                 &crash_size, &crash_base);
>         if (ret || !crash_size) {
>                 unsigned long long low_size;
> 
>                 /* crashkernel=X,high */
>                 ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>                 if (ret || !crash_size)
>                         return;
> 
>                 /* crashkernel=X,low */
>                 ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> 		//case #1, crashkernel=yM,low is specified explicitly in cmdline
>                 if (!ret)
>                         crash_low_size = low_size;
> 		else //case #2, crashkernel=yM,low is not specified explicitly
>                         crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> 
> 		//high means crashkernel,high is specified explicitly
> 		high = true;
>                 crash_max = CRASH_ADDR_HIGH_MAX;
>         }
> 
>         fixed_base = !!crash_base;
>         crash_size = PAGE_ALIGN(crash_size);
> 
>         /* User specifies base address explicitly. */
>         if (crash_base)
>                 crash_max = crash_base + crash_size;
> retry:
>         crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>                                                crash_base, crash_max);
>         if (!crash_base) {
>                 /*
>                  * Attempt to fully allocate low memory failed, fall back
>                  * to high memory, the minimum required low memory will be
>                  * reserved later.
>                  */
>                 if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>                         crash_max = CRASH_ADDR_HIGH_MAX;
>                         goto retry;
>                 }
> 
>                 pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>                         crash_size);
>                 return;
>         }
> 
> 
> 	//case #3: get crashkernel from high memory through fallback, let's set crashkernel,low too.
>         if (crash_base >= SZ_4G && !high)
> 		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;	
> 
>         if (reserve_crashkernel_low(crash_low_size)) {
>                 memblock_phys_free(crash_base, crash_size);
>                 return;
>         }
> 
>         pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>                 crash_base, crash_base + crash_size, crash_size >> 20);
> 
>         /*
>          * The crashkernel memory will be removed from the kernel linear
>          * map. Inform kmemleak so that it won't try to access it.
>          */
>         kmemleak_ignore_phys(crash_base);
>         if (crashk_low_res.end)
>                 kmemleak_ignore_phys(crashk_low_res.start);
> 
>         crashk_res.start = crash_base;
>         crashk_res.end = crash_base + crash_size - 1;
>         insert_resource(&iomem_resource, &crashk_res);
> }
> 
> 
>>
>>
>>>
>>> It feels like {} may need to be added here so that it is in branch "if (crash_base >= SZ_4G)".
>>> The case of "crashkernel=128M" will not fall back to high memory and does not need to reserve
>>> low memory again.
>>>
>>>>
>>>>>   *
>>>>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>>>  static void __init reserve_crashkernel(void)
>>>>>  {
>>>>>  	unsigned long long crash_base, crash_size;
>>>>> +	unsigned long long crash_low_size = SZ_256M;
>>>>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>>  	int ret;
>>>>> +	bool fixed_base;
>>>>> +	char *cmdline = boot_command_line;
>>>>>  
>>>>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>>>>> +	/* crashkernel=X[@offset] */
>>>>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>>>>  				&crash_size, &crash_base);
>>>>> -	/* no crashkernel= or invalid value specified */
>>>>> -	if (ret || !crash_size)
>>>>> -		return;
>>>>> +	if (ret || !crash_size) {
>>>>> +		unsigned long long low_size;
>>>>>  
>>>>> +		/* crashkernel=X,high */
>>>>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>>>>> +		if (ret || !crash_size)
>>>>> +			return;
>>>>> +
>>>>> +		/* crashkernel=X,low */
>>>>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
>>>>> +		if (!ret)
>>>>> +			crash_low_size = low_size;
>>>>> +
>>>>> +		crash_max = CRASH_ADDR_HIGH_MAX;
>>>>> +	}
>>>>> +
>>>>> +	fixed_base = !!crash_base;
>>>>>  	crash_size = PAGE_ALIGN(crash_size);
>>>>>  
>>>>>  	/* User specifies base address explicitly. */
>>>>>  	if (crash_base)
>>>>>  		crash_max = crash_base + crash_size;
>>>>>  
>>>>> +retry:
>>>>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>  					       crash_base, crash_max);
>>>>>  	if (!crash_base) {
>>>>> +		/*
>>>>> +		 * Attempt to fully allocate low memory failed, fall back
>>>>> +		 * to high memory, the minimum required low memory will be
>>>>> +		 * reserved later.
>>>>> +		 */
>>>>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>>>>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>>>>> +			goto retry;
>>>>> +		}
>>>>> +
>>>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>>>>  			crash_size);
>>>>>  		return;
>>>>>  	}
>>>>>  
>>>>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
>>>>> +		memblock_phys_free(crash_base, crash_size);
>>>>> +		return;
>>>>> +	}
>>>>> +
>>>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>>>>  
>>>>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
>>>>>  	 * map. Inform kmemleak so that it won't try to access it.
>>>>>  	 */
>>>>>  	kmemleak_ignore_phys(crash_base);
>>>>> +	if (crashk_low_res.end)
>>>>> +		kmemleak_ignore_phys(crashk_low_res.start);
>>>>> +
>>>>>  	crashk_res.start = crash_base;
>>>>>  	crashk_res.end = crash_base + crash_size - 1;
>>>>>  	insert_resource(&iomem_resource, &crashk_res);
>>>>> -- 
>>>>> 2.25.1
>>>>>
>>>>
>>>> .
>>>>
>>>
>>
>> -- 
>> Regards,
>>   Zhen Lei
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation
  2022-02-14  6:22     ` Leizhen (ThunderTown)
@ 2022-02-21  3:22       ` Baoquan He
  2022-02-21  6:19         ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 30+ messages in thread
From: Baoquan He @ 2022-02-21  3:22 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/14/22 at 02:22pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/2/11 18:39, Baoquan He wrote:
> > On 01/24/22 at 04:47pm, Zhen Lei wrote:
> >> From: Chen Zhou <chenzhou10@huawei.com>
> >>
> >> Introduce macro CRASH_ALIGN for alignment, macro CRASH_ADDR_LOW_MAX
> >> for upper bound of low crash memory, macro CRASH_ADDR_HIGH_MAX for
> >> upper bound of high crash memory, use macros instead.
> >>
> >> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >> Tested-by: John Donnelly <John.p.donnelly@oracle.com>
> >> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
> >> ---
> >>  arch/arm64/mm/init.c | 11 ++++++++---
> >>  1 file changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >> index 90f276d46b93bc6..6c653a2c7cff052 100644
> >> --- a/arch/arm64/mm/init.c
> >> +++ b/arch/arm64/mm/init.c
> >> @@ -65,6 +65,12 @@ EXPORT_SYMBOL(memstart_addr);
> >>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>  
> >>  #ifdef CONFIG_KEXEC_CORE
> >> +/* Current arm64 boot protocol requires 2MB alignment */
> >> +#define CRASH_ALIGN		SZ_2M
> >> +
> >> +#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
> >> +#define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
> > 
> > MEMBLOCK_ALLOC_ACCESSIBLE is obvoiously a alloc flag for memblock
> > allocator, I don't think it's appropriate to make HIGH_MAX get its value.
> 
> Right, thanks.
> 
> > You can make it as memblock.current_limit, or do not define it, but using
> > MEMBLOCK_ALLOC_ACCESSIBLE direclty in memblock_phys_alloc_range() with
> > a code comment. 
> 
> This patch is not required at present. These macros are added to eliminate
> differences to share code with x86.

So this patch may not be needed in this series. As I understand it, it
can be added in another post when you start the cleanup and code
unification among ARCHes. At that time you can consider how to abstract
the common code to handle the differences.
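
To illustrate why a *_MAX macro should not take its value from an
allocation flag, here is a small user-space mock. It assumes, as far as I
know from the memblock headers, that MEMBLOCK_ALLOC_ACCESSIBLE is a
sentinel defined as 0 which the allocator replaces with
memblock.current_limit. All MOCK_/mock_/naive_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Mock of the sentinel: "up to whatever is currently accessible". */
#define MOCK_ALLOC_ACCESSIBLE ((phys_addr_t)0)

/*
 * What the allocator is assumed to do internally: translate the
 * sentinel into a real upper bound before searching for memory.
 */
phys_addr_t mock_resolve_end(phys_addr_t end, phys_addr_t current_limit)
{
	return end == MOCK_ALLOC_ACCESSIBLE ? current_limit : end;
}

/*
 * What a reader may be tempted to write when the sentinel hides behind
 * a macro named like an address, e.g. CRASH_ADDR_HIGH_MAX.
 */
bool naive_below_max(phys_addr_t addr, phys_addr_t max)
{
	return addr < max;
}
```

Passing the sentinel to the allocator works, but any direct comparison
such as naive_below_max(addr, MOCK_ALLOC_ACCESSIBLE) is always false,
which is why using the flag directly at the call site with a comment, or
using a real limit, reads more safely.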


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 5/5] kdump: update Documentation about crashkernel
  2022-01-24  8:47 ` [PATCH v20 5/5] kdump: update Documentation about crashkernel Zhen Lei
  2022-01-26 15:19   ` john.p.donnelly
@ 2022-02-21  3:48   ` Baoquan He
  2022-02-21  6:38     ` Leizhen (ThunderTown)
  1 sibling, 1 reply; 30+ messages in thread
From: Baoquan He @ 2022-02-21  3:48 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 01/24/22 at 04:47pm, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> For arm64, the behavior of crashkernel=X has been changed, which
> tries low allocation in DMA zone and fall back to high allocation
> if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> So update the Documentation.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  Documentation/admin-guide/kdump/kdump.rst       | 11 +++++++++--
>  Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>  2 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> index cb30ca3df27c9b2..d4c287044be0c70 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -361,8 +361,15 @@ Boot into System Kernel
>     kernel will automatically locate the crash kernel image within the
>     first 512MB of RAM if X is not given.
>  
> -   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
> -   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
> +   On arm64, use "crashkernel=X" to try low allocation in DMA zone and
> +   fall back to high allocation if it fails.
> +   We can also use "crashkernel=X,high" to select a high region above
> +   DMA zone, which also tries to allocate at least 256M low memory in
> +   DMA zone automatically.
> +   "crashkernel=Y,low" can be used to allocate specified size low memory.
> +   Use "crashkernel=Y@X" if you really have to reserve memory from
> +   specified start address X. Note that the start address of the kernel,
> +   X if explicitly specified, must be aligned to 2MiB (0x200000).

Hmm, we may not need the details related to crashkernel=,high|low in this
section. It just gives examples of the basic configuration for each ARCH.
The detailed configuration of all crashkernel settings can be found in
the "crashkernel syntax" section; I don't think arm64 is so special that
it needs a specific one.

>  
>  Load the Dump-capture Kernel
>  ============================
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index f5a27f067db9ed9..65780c2ca830be0 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -792,6 +792,9 @@
>  			[KNL, X86-64] Select a region under 4G first, and
>  			fall back to reserve region above 4G when '@offset'
>  			hasn't been specified.
> +			[KNL, ARM64] Try low allocation in DMA zone and fall back
> +			to high allocation if it fails when '@offset' hasn't been
> +			specified.
>  			See Documentation/admin-guide/kdump/kdump.rst for further details.

How about adding ARM64 as below to avoid redundant wording?
  			[KNL, X86-64, ARM64] Select a region under 4G first, and
  			fall back to reserve region above 4G when '@offset'
  			hasn't been specified.
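
The "low first, then fall back to high unless '@offset' is given" rule in
the text above can be sketched as follows. This is a user-space model
under stated assumptions, not kernel code: toy_alloc() is a hypothetical
bottom-up allocator over a single free range (memblock itself normally
allocates top-down), and reserve_with_fallback() is a hypothetical name.

```c
#include <stdint.h>

/* Toy memory map: one free range, [toy_free_start, toy_free_end). */
uint64_t toy_free_start = 5ULL << 30;  /* 5G */
uint64_t toy_free_end   = 16ULL << 30; /* 16G */

/* Return the lowest base in [start, end) that fits, or 0 on failure. */
uint64_t toy_alloc(uint64_t size, uint64_t start, uint64_t end)
{
	uint64_t lo = start > toy_free_start ? start : toy_free_start;
	uint64_t hi = end < toy_free_end ? end : toy_free_end;

	if (lo >= hi || hi - lo < size)
		return 0;
	return lo;
}

/*
 * fixed_base != 0 models crashkernel=X@offset: exactly one attempt at
 * that address, no fallback. Otherwise try below low_max first and
 * retry below high_max only if the low attempt fails.
 */
uint64_t reserve_with_fallback(uint64_t size, uint64_t fixed_base,
			       uint64_t low_max, uint64_t high_max)
{
	uint64_t base;

	if (fixed_base)
		return toy_alloc(size, fixed_base, fixed_base + size);

	base = toy_alloc(size, 0, low_max);
	if (!base)
		base = toy_alloc(size, 0, high_max);
	return base;
}
```

With only memory above 5G free, a 512M request without an offset lands in
high memory, while a request pinned at 0xe0000000 fails outright, which
matches the "cannot allocate" outcome listed for the '@offset' test case
earlier in the thread.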

>  
>  	crashkernel=range1:size1[,range2:size2,...][@offset]
> @@ -808,6 +811,8 @@
>  			Otherwise memory region will be allocated below 4G, if
>  			available.
>  			It will be ignored if crashkernel=X is specified.
> +			[KNL, ARM64] range in high memory.
> +			Allow kernel to allocate physical memory region from top.

Ditto, please don't add redundant words if the handling is similar to
x86_64.

>  	crashkernel=size[KMG],low
>  			[KNL, X86-64] range under 4G. When crashkernel=X,high
>  			is passed, kernel could allocate physical memory region
> @@ -816,13 +821,15 @@
>  			requires at least 64M+32K low memory, also enough extra
>  			low memory is needed to make sure DMA buffers for 32-bit
>  			devices won't run out. Kernel would try to allocate at
> -			at least 256M below 4G automatically.
> +			least 256M below 4G automatically.
>  			This one let user to specify own low range under 4G
>  			for second kernel instead.
>  			0: to disable low allocation.
>  			It will be ignored when crashkernel=X,high is not used
>  			or memory reserved is below 4G.
> -
> +			[KNL, ARM64] range in low memory.
> +			This one let user to specify a low range in DMA zone for
> +			crash dump kernel.

Ditto.

>  	cryptomgr.notests
>  			[KNL] Disable crypto self-tests
>  
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation
  2022-02-21  3:22       ` Baoquan He
@ 2022-02-21  6:19         ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-21  6:19 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/21 11:22, Baoquan He wrote:
> On 02/14/22 at 02:22pm, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/2/11 18:39, Baoquan He wrote:
>>> On 01/24/22 at 04:47pm, Zhen Lei wrote:
>>>> From: Chen Zhou <chenzhou10@huawei.com>
>>>>
>>>> Introduce macro CRASH_ALIGN for alignment, macro CRASH_ADDR_LOW_MAX
>>>> for upper bound of low crash memory, macro CRASH_ADDR_HIGH_MAX for
>>>> upper bound of high crash memory, use macros instead.
>>>>
>>>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>>> Tested-by: John Donnelly <John.p.donnelly@oracle.com>
>>>> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
>>>> ---
>>>>  arch/arm64/mm/init.c | 11 ++++++++---
>>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>> index 90f276d46b93bc6..6c653a2c7cff052 100644
>>>> --- a/arch/arm64/mm/init.c
>>>> +++ b/arch/arm64/mm/init.c
>>>> @@ -65,6 +65,12 @@ EXPORT_SYMBOL(memstart_addr);
>>>>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>>>  
>>>>  #ifdef CONFIG_KEXEC_CORE
>>>> +/* Current arm64 boot protocol requires 2MB alignment */
>>>> +#define CRASH_ALIGN		SZ_2M
>>>> +
>>>> +#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
>>>> +#define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
>>>
>>> MEMBLOCK_ALLOC_ACCESSIBLE is obvoiously a alloc flag for memblock
>>> allocator, I don't think it's appropriate to make HIGH_MAX get its value.
>>
>> Right, thanks.
>>
>>> You can make it as memblock.current_limit, or do not define it, but using
>>> MEMBLOCK_ALLOC_ACCESSIBLE direclty in memblock_phys_alloc_range() with
>>> a code comment. 
>>
>> This patch is not required at present. These macros are added to eliminate
>> differences to share code with x86.
> 
> So this patch may not be needed in this series. It can be added in
> another post when you start to do the clean up and code unification
> among ARCHes, as I understand it. At that time you can consider how
> to abstract the common code to handle the difference.

Yes, it should be merged with the v20 3/5.
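
For readers following the macro discussion, here is a rough user-space
sketch of what an alignment macro and an upper-bound macro are used for
when checking a requested base such as the X in "crashkernel=Y@X". It is
not code from the patch: every sketch_/SKETCH_ name is hypothetical, and
the only value taken from the thread is the 2MB alignment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the 2MB value comes from the patch. */
#define SKETCH_SZ_2M       0x200000ULL
#define SKETCH_CRASH_ALIGN SKETCH_SZ_2M

/*
 * Return true if [base, base + size) starts on a 2MB boundary and ends
 * at or below addr_max. addr_max plays the role that an upper-bound
 * macro such as CRASH_ADDR_LOW_MAX would play.
 */
bool sketch_crash_base_ok(uint64_t base, uint64_t size, uint64_t addr_max)
{
	/* The start address must be aligned to SKETCH_CRASH_ALIGN. */
	if (base & (SKETCH_CRASH_ALIGN - 1))
		return false;

	/* Reject empty or oversized requests without overflowing. */
	if (size == 0 || size > addr_max)
		return false;

	/* The end of the region must not exceed addr_max. */
	return base <= addr_max - size;
}
```

With a 4G bound, a 256M region at 0x80000000 passes, while the same
region at 0x80100000 is rejected because the base is not 2MB aligned.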

> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v20 5/5] kdump: update Documentation about crashkernel
  2022-02-21  3:48   ` Baoquan He
@ 2022-02-21  6:38     ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 30+ messages in thread
From: Leizhen (ThunderTown) @ 2022-02-21  6:38 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/2/21 11:48, Baoquan He wrote:
> On 01/24/22 at 04:47pm, Zhen Lei wrote:
>> From: Chen Zhou <chenzhou10@huawei.com>
>>
>> For arm64, the behavior of crashkernel=X has been changed, which
>> tries low allocation in DMA zone and fall back to high allocation
>> if it fails.
>>
>> We can also use "crashkernel=X,high" to select a high region above
>> DMA zone, which also tries to allocate at least 256M low memory in
>> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
>> specified size low memory.
>>
>> So update the Documentation.
>>
>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  Documentation/admin-guide/kdump/kdump.rst       | 11 +++++++++--
>>  Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>>  2 files changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
>> index cb30ca3df27c9b2..d4c287044be0c70 100644
>> --- a/Documentation/admin-guide/kdump/kdump.rst
>> +++ b/Documentation/admin-guide/kdump/kdump.rst
>> @@ -361,8 +361,15 @@ Boot into System Kernel
>>     kernel will automatically locate the crash kernel image within the
>>     first 512MB of RAM if X is not given.
>>  
>> -   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
>> -   the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
>> +   On arm64, use "crashkernel=X" to try low allocation in DMA zone and
>> +   fall back to high allocation if it fails.
>> +   We can also use "crashkernel=X,high" to select a high region above
>> +   DMA zone, which also tries to allocate at least 256M low memory in
>> +   DMA zone automatically.
>> +   "crashkernel=Y,low" can be used to allocate specified size low memory.
>> +   Use "crashkernel=Y@X" if you really have to reserve memory from
>> +   specified start address X. Note that the start address of the kernel,
>> +   X if explicitly specified, must be aligned to 2MiB (0x200000).
> 
> Hmm, we may not need the details related to crashkernel,high|low in this
> section. This just gives examples of basic configuration for each ARCH.
> The detailed configuration of all crashkernel settings can be found in
> the "crashkernel syntax" section, I don't think arm64 is so special that
> it needs a specific one.
> 
>>  
>>  Load the Dump-capture Kernel
>>  ============================
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index f5a27f067db9ed9..65780c2ca830be0 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -792,6 +792,9 @@
>>  			[KNL, X86-64] Select a region under 4G first, and
>>  			fall back to reserve region above 4G when '@offset'
>>  			hasn't been specified.
>> +			[KNL, ARM64] Try low allocation in DMA zone and fall back
>> +			to high allocation if it fails when '@offset' hasn't been
>> +			specified.
>>  			See Documentation/admin-guide/kdump/kdump.rst for further details.
> 
> How about adding ARM64 like below to avoid redundant words?
>   			[KNL, X86-64, ARM64] Select a region under 4G first, and
>   			fall back to reserve region above 4G when '@offset'
>   			hasn't been specified.

I agree very much, and I think that's the right thing to do. Thanks.

I wanted to do the same before, just wasn't sure if the format was correct. I just
looked at the other descriptions in kernel-parameters.txt and found that there are
many precedents.

"movablecore=    [KNL,X86,IA-64,PPC]"
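
As a rough illustration of the wording agreed above ("select a region
under 4G first, and fall back to reserve region above 4G"), the policy
can be sketched in user-space C as below. This is not the kernel
implementation: the toy allocator, the struct and all sketch_ names are
hypothetical, and free memory is modeled as one contiguous range per
zone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 2MB alignment, matching the arm64 boot protocol note. */
#define SKETCH_CRASH_ALIGN 0x200000ULL

/* Hypothetical result of one reservation attempt. */
struct sketch_region {
	uint64_t base;
	uint64_t size;
	bool valid;
};

/*
 * Toy allocator: return the highest aligned region of 'size' bytes that
 * fits in [start, end), or an invalid region if nothing fits.
 */
struct sketch_region sketch_alloc_range(uint64_t size, uint64_t start,
					uint64_t end)
{
	struct sketch_region r = { 0, 0, false };
	uint64_t base;

	if (size == 0 || end < start || end - start < size)
		return r;

	/* Round the candidate base down to the alignment boundary. */
	base = (end - size) & ~(SKETCH_CRASH_ALIGN - 1);
	if (base < start)
		return r;

	r.base = base;
	r.size = size;
	r.valid = true;
	return r;
}

/*
 * crashkernel=X without '@offset': try the low range first, then fall
 * back to the range between low_max and high_max.
 */
struct sketch_region sketch_reserve_crashkernel(uint64_t size,
						uint64_t low_start,
						uint64_t low_max,
						uint64_t high_max)
{
	struct sketch_region r = sketch_alloc_range(size, low_start, low_max);

	if (!r.valid)
		r = sketch_alloc_range(size, low_max, high_max);
	return r;
}
```

With 2G free below a 4G limit and memory up to 8G, a 256M request is
placed in the low range, while a 3G request only fits above 4G.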

> 
>>  
>>  	crashkernel=range1:size1[,range2:size2,...][@offset]
>> @@ -808,6 +811,8 @@
>>  			Otherwise memory region will be allocated below 4G, if
>>  			available.
>>  			It will be ignored if crashkernel=X is specified.
>> +			[KNL, ARM64] range in high memory.
>> +			Allow kernel to allocate physical memory region from top.
> 
> Ditto, please don't add redundant words if it's similar to x86_64
> handling.
> 
>>  	crashkernel=size[KMG],low
>>  			[KNL, X86-64] range under 4G. When crashkernel=X,high
>>  			is passed, kernel could allocate physical memory region
>> @@ -816,13 +821,15 @@
>>  			requires at least 64M+32K low memory, also enough extra
>>  			low memory is needed to make sure DMA buffers for 32-bit
>>  			devices won't run out. Kernel would try to allocate at
>> -			at least 256M below 4G automatically.
>> +			least 256M below 4G automatically.
>>  			This one let user to specify own low range under 4G
>>  			for second kernel instead.
>>  			0: to disable low allocation.
>>  			It will be ignored when crashkernel=X,high is not used
>>  			or memory reserved is below 4G.
>> -
>> +			[KNL, ARM64] range in low memory.
>> +			This one let user to specify a low range in DMA zone for
>> +			crash dump kernel.
> 
> Ditto.
> 
>>  	cryptomgr.notests
>>  			[KNL] Disable crypto self-tests
>>  
>> -- 
>> 2.25.1
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


Thread overview: 30+ messages
2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
2022-01-24  8:47 ` [PATCH v20 1/5] arm64: Use insert_resource() to simplify code Zhen Lei
2022-01-26 15:16   ` john.p.donnelly
2022-02-08  1:43   ` Baoquan He
2022-01-24  8:47 ` [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation Zhen Lei
2022-01-26 15:17   ` john.p.donnelly
2022-02-11 10:39   ` Baoquan He
2022-02-14  6:22     ` Leizhen (ThunderTown)
2022-02-21  3:22       ` Baoquan He
2022-02-21  6:19         ` Leizhen (ThunderTown)
2022-01-24  8:47 ` [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
2022-01-26 15:18   ` john.p.donnelly
2022-02-11 10:30   ` Baoquan He
2022-02-11 10:41     ` Leizhen (ThunderTown)
2022-02-11 10:51       ` Baoquan He
2022-02-14  6:44         ` Leizhen (ThunderTown)
2022-02-14  7:09           ` Baoquan He
2022-02-14  3:52   ` Baoquan He
2022-02-14  7:53     ` Leizhen (ThunderTown)
2022-02-16  2:58       ` Leizhen (ThunderTown)
2022-02-16 10:20         ` Baoquan He
2022-02-17  1:57           ` Leizhen (ThunderTown)
2022-01-24  8:47 ` [PATCH v20 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
2022-01-26 15:19   ` john.p.donnelly
2022-01-24  8:47 ` [PATCH v20 5/5] kdump: update Documentation about crashkernel Zhen Lei
2022-01-26 15:19   ` john.p.donnelly
2022-02-21  3:48   ` Baoquan He
2022-02-21  6:38     ` Leizhen (ThunderTown)
2022-02-07  4:04 ` [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Leizhen (ThunderTown)
2022-02-08  2:34   ` Baoquan He
