linux-kernel.vger.kernel.org archive mirror
* [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump
@ 2022-02-27  3:07 Zhen Lei
  2022-02-27  3:07 ` [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist Zhen Lei
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Zhen Lei @ 2022-02-27  3:07 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Changes since [v20]:
1. Distinguish crashkernel=Y,low being misconfigured from it not being configured at all, and handle the two cases differently.
2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
4. To improve readability, make some lightweight code adjustments to reserve_crashkernel(), including comments.
5. The value of DEFAULT_CRASH_KERNEL_LOW_SIZE now takes swiotlb into account, as on x86, so that the documentation can be shared.

Thanks to Baoquan He for his careful review.

The test cases are as follows: (Please update the kexec tool to the latest version)
1) crashkernel=4G						//high=4G, low=256M
2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
6) crashkernel=512M						//high=0G, low=512M
7) crashkernel=128M						//high=0G, low=128M
8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
9) crashkernel=4G,high						//high=4G, low=256M
a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
c) crashkernel=128M,high					//high=128M, low=256M
d) crashkernel=512M,low						//high=0G, low=0M, invalid
e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
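
The option precedence behind these cases can be modelled in user space. The
following is a minimal sketch, not the kernel code: the names ck_opt, ck_plan,
ck_resolve() and default_low_size() are hypothetical, the swiotlb size is
passed in as a plain parameter, and memblock allocation is not modelled at all
(so whether crashkernel=X ends up below or above 4G, and the fixed-base cases
5 and 8, are outside its scope).

```c
#include <stdbool.h>
#include <stdint.h>

#define MiB(x) ((uint64_t)(x) << 20)

/* How an optional command-line value looked after parsing (hypothetical). */
enum opt_state { OPT_ABSENT, OPT_VALID, OPT_INVALID };

struct ck_opt {
	enum opt_state state;
	uint64_t size;
};

struct ck_plan {
	bool ok;		/* false: nothing is reserved */
	bool explicit_high;	/* true: crashkernel=X,high drove the request */
	uint64_t main_size;	/* size of the main reservation */
	uint64_t low_size;	/* low size to use if the main region is above 4G */
};

/* Mirrors max(swiotlb size + 8M, 256M) from the series. */
static uint64_t default_low_size(uint64_t swiotlb_size)
{
	uint64_t want = swiotlb_size + MiB(8);

	return want > MiB(256) ? want : MiB(256);
}

static struct ck_plan ck_resolve(struct ck_opt plain, struct ck_opt high,
				 struct ck_opt low, uint64_t swiotlb_size)
{
	struct ck_plan plan = { 0 };

	/* A usable crashkernel=X wins; ,high and ,low are then ignored. */
	if (plain.state == OPT_VALID && plain.size) {
		plan.ok = true;
		plan.main_size = plain.size;
		plan.low_size = default_low_size(swiotlb_size);
		return plan;
	}

	/* Without crashkernel=X, a usable crashkernel=X,high is required. */
	if (high.state != OPT_VALID || !high.size)
		return plan;

	/* ,low given but unparsable: reserve nothing at all. */
	if (low.state == OPT_INVALID)
		return plan;

	plan.ok = true;
	plan.explicit_high = true;
	plan.main_size = high.size;
	/* ,low absent: default size; ,low present: use it, even if it is 0. */
	plan.low_size = (low.state == OPT_ABSENT) ?
			default_low_size(swiotlb_size) : low.size;
	return plan;
}
```

With a 64M swiotlb assumed, case 9 resolves to a 4G main region with a 256M
low size, case e keeps a low size of 0, and cases d and f resolve to nothing
being reserved.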


Changes since [v19]:
1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
   lot of details that need to be considered, which can take a long time. Because
   "make generic" adds no new functionality and does not improve performance,
   it is really just a cleanup. Stripping it out and leaving it for later
   patches lets this series focus on the main functional changes.
2. Use insert_resource() instead of request_resource(). This not only simplifies
   the code, but also reduces the differences between the arm64 and x86 implementations.
3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") did for
   x86, we can also extend kexec-tools for arm64; that change has already been applied. See:
   https://www.spinics.net/lists/kexec/msg28284.html
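
To illustrate item 2: with an insert-style call the caller only names the
root, and the tree walk finds the parent. The following is a simplified
user-space sketch of that idea, not the kernel implementation; toy_res,
toy_contains() and toy_insert() are hypothetical names, and the conflict
handling is reduced to "partial overlap fails".

```c
#include <stddef.h>

/* Toy resource node; 'end' is inclusive, as in /proc/iomem. */
struct toy_res {
	unsigned long long start, end;
	const char *name;
	struct toy_res *parent, *child, *sibling;
};

static int toy_contains(const struct toy_res *outer, const struct toy_res *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/*
 * Insert 'new' somewhere below 'root': descend while a child fully contains
 * it, then link it in address order. Existing children that lie entirely
 * inside 'new' become its children. Returns 0 on success, -1 on conflict.
 */
static int toy_insert(struct toy_res *root, struct toy_res *new)
{
	struct toy_res *parent = root, *c, **link, *last = NULL;

	if (!toy_contains(root, new))
		return -1;
descend:
	for (c = parent->child; c; c = c->sibling) {
		if (toy_contains(c, new)) {
			parent = c;
			goto descend;
		}
	}

	/* Skip the children that end before 'new' starts. */
	link = &parent->child;
	while (*link && (*link)->end < new->start)
		link = &(*link)->sibling;

	/* Every child that overlaps 'new' must lie entirely inside it. */
	for (c = *link; c && c->start <= new->end; c = c->sibling) {
		if (!toy_contains(new, c))
			return -1;
		last = c;
	}

	new->parent = parent;
	new->child = NULL;
	if (last) {
		/* Adopt the run of children [*link .. last]. */
		new->child = *link;
		new->sibling = last->sibling;
		last->sibling = NULL;
		for (c = new->child; c; c = c->sibling)
			c->parent = new;
	} else {
		new->sibling = *link;
	}
	*link = new;
	return 0;
}
```

In this model, inserting a kernel code range first and the enclosing RAM range
afterwards still ends with the code range as a child of the RAM range, and a
crash kernel range inserted later lands under the same RAM range without the
caller having to look up the parent.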

Thank you very much, Borislav Petkov, for so many valuable comments.

Changes since [v17]: v17 --> v19
1. Patch 0001-0004
   Introduce generic parse_crashkernel_high_low() to bring the parsing of
   "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
   then use it instead of the call to parse_crashkernel_{high|low}(). Two
   confusing parameters of parse_crashkernel_{high|low}() are deleted.

   I previously sent these four patches separately:
   [1] https://lkml.org/lkml/2021/12/25/40
2. Patch 0005-0009
   Introduce generic reserve_crashkernel_mem[_low](), the implementation of
   these two functions is based on function reserve_crashkernel[_low]() in
   arch/x86/kernel/setup.c. There is no functional change for x86.
   1) The check position of xen_pv_domain() does not change.
   2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.

   To avoid compilation problems on other architectures: patch 0004 moves
   the definition of global variable crashk[_low]_res from kexec_core.c to
   crash_core.c, and provide default definitions for all macros involved, a
   particular platform can redefine these macros to override the default
   values.
3. 0010, only one line of comment was changed.
4. 0011
   1) crashk_low_res may also be a valid reserved memory region, so it should be
      checked in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
   2) Drop memblock_mark_nomap() for crashk_low_res, because of:
      2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
   3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
      85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
5. 0012, slightly rebased, because the following patch is applied in advance. 
   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
6. 0013, no change.

Others:
1. Drop the addition of ARCH_WANT_RESERVE_CRASH_KERNEL.
2. When allocating crash low memory, the start address still starts from 0.
   low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
3. Drop the change from (1ULL << 32) to CRASH_ADDR_LOW_MAX.
4. Ensure that the check position of xen_pv_domain() does not change.
5. Except for patches 0010 and 0012, all "Tested-by", "Reviewed-by" and "Acked-by" tags are removed.
6. Update description.



Changes since [v16]
- Because there are no functional changes in this version, add
  "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patches 1-9
- Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
- Update patch 9 based on the review comments of Rob Herring
- As suggested by Catalin Marinas, merge the implementation of
  ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
  contents of X86 and ARM64 do not overlap, and reduce unnecessary
  temporary differences.

Changes since [v15]
-  Aggregate the processing of "linux,usable-memory-range" into one function.
   Only patches 9-10 have been updated.

Changes since [v14]
- Restore the requirement that the CrashKernel memory regions on X86
  only require 1 MiB alignment.
- Combine patches 5 and 6 in v14 into one. The compilation warning fixed
  by patch 6 was introduced by patch 5 in v14.
- As with crashk_res, crashk_low_res is also processed by
  crash_exclude_mem_range() in patch 7.
- Because commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
  removed the architecture-specific code, extend the property "linux,usable-memory-range"
  in the platform-agnostic FDT core code. See patch 9.
- Discard the x86 description update in the document, because the description
  has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
- Change "arm64" to "ARM64" in Doc.


Changes since [v13]
- Rebased on top of 5.11-rc5.
- Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
Since the reserve_crashkernel[_low]() implementations are quite similar
across architectures, add CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL to
arch/Kconfig and select it from X86 and ARM64.
- Some minor cleanup.

Changes since [v12]
- Rebased on top of 5.10-rc1.
- Keep CRASH_ALIGN as 16M, as suggested by Dave.
- Drop patch "kdump: add threshold for the required memory".
- Add Tested-by from John.

Changes since [v11]
- Rebased on top of 5.9-rc4.
- Make the function reserve_crashkernel() of x86 generic.
As suggested by Catalin, make the x86 reserve_crashkernel() generic and have
arm64 use the generic version to reimplement crashkernel=X.

Changes since [v10]
- Reimplement crashkernel=X as suggested by Catalin. Many thanks to Catalin.

Changes since [v9]
- Patch 1 add Acked-by from Dave.
- Update patch 5 according to Dave's comments.
- Update chosen schema.

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
As suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump breakage caused by the reintroduction of ZONE_DMA.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
As suggested by Dave, and after some testing, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
as suggested by Arnd.
- Add Tested-by from John and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified as well, first reserve low memory of the
specified size for crash dump kernel devices, and then reserve memory above 4G.
In addition, rename crashk_low_res to "Crash kernel (low)" for arm64, and
pass it to the crash dump kernel via DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() that I added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions. In the two-region case, we cap
the memory range to [min(regs[*].start), max(regs[*].end)] and then remove
the memory range in the middle.
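
The capping rule described for v2 can be written down as plain arithmetic. The
following is a minimal sketch of that rule only (later versions of the series
handle the second region differently); mem_range, cap_result and
cap_two_ranges() are hypothetical names, and 'end' is treated as exclusive
here.

```c
#include <stdint.h>

struct mem_range {
	uint64_t start;
	uint64_t end;	/* exclusive */
};

struct cap_result {
	struct mem_range cap;	/* [min(start), max(end)) over both regions */
	struct mem_range hole;	/* gap between the regions, to be removed */
	int has_hole;		/* 0 if the regions touch or overlap */
};

static struct cap_result cap_two_ranges(struct mem_range a, struct mem_range b)
{
	struct cap_result r = { 0 };
	struct mem_range lo = (a.start <= b.start) ? a : b;
	struct mem_range hi = (a.start <= b.start) ? b : a;

	/* Cap memory to the span covering both regions. */
	r.cap.start = lo.start;
	r.cap.end = (hi.end > lo.end) ? hi.end : lo.end;

	/* Whatever lies strictly between the two regions is not usable. */
	if (hi.start > lo.end) {
		r.hole.start = lo.end;
		r.hole.end = hi.start;
		r.has_hole = 1;
	}
	return r;
}
```

For a 256M low region at 0xc0000000 and a 4G high region at 0x880000000, the
cap spans 0xc0000000 to 0x980000000 and the hole to remove runs from
0xd0000000 to 0x880000000.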

v1:
The following issues exist in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is not enough low memory.
2. If crashkernel is reserved above 4G, the crash dump
kernel will fail to boot because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries a low allocation in the DMA zone and falls back to a
high allocation if that fails.

We can also use "crashkernel=X,high" to select a high region above the
DMA zone, which also tries to allocate at least 256M of low memory in the
DMA zone automatically. "crashkernel=Y,low" can be used to allocate low
memory of a specified size.

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices. So there may be two regions reserved for
crash dump kernel.
To distinguish it from the high region without affecting the use of
existing kexec-tools, rename the low region to "Crash kernel (low)"
and pass it by reusing DT property "linux,usable-memory-range". The low
memory region is made the last range of "linux,usable-memory-range" to
keep compatibility with existing user-space and older kdump kernels.
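
The layout of the property can be illustrated with a small user-space decoder.
This is a sketch under stated assumptions rather than the kernel's FDT code:
usable_range, next_cells() and decode_usable_ranges() are hypothetical names,
the property is taken as a raw byte buffer of big-endian 32-bit cells, and the
length check here is done in bytes against one full <base size> entry.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_USABLE_RANGES 2

struct usable_range {
	uint64_t base;
	uint64_t size;
};

/* Read 'cells' big-endian 32-bit cells as one value and advance the cursor. */
static uint64_t next_cells(const uint8_t **p, int cells)
{
	uint64_t v = 0;

	while (cells--) {
		v = (v << 32) |
		    ((uint64_t)(*p)[0] << 24) | ((uint64_t)(*p)[1] << 16) |
		    ((uint64_t)(*p)[2] << 8) | (uint64_t)(*p)[3];
		*p += 4;
	}
	return v;
}

/*
 * Decode up to MAX_USABLE_RANGES <base size> entries. Entry 0 is the main
 * (possibly high) region; entry 1, if present, is the low region.
 * Returns the number of entries decoded, or -1 on a malformed property.
 */
static int decode_usable_ranges(const uint8_t *prop, size_t len,
				int addr_cells, int size_cells,
				struct usable_range out[MAX_USABLE_RANGES])
{
	size_t entry;
	const uint8_t *end;
	int n = 0;

	if (!prop || !len || addr_cells <= 0 || size_cells <= 0)
		return -1;

	entry = (size_t)(addr_cells + size_cells) * 4;
	if (len % entry)
		return -1;

	end = prop + len;
	while (n < MAX_USABLE_RANGES && prop < end) {
		out[n].base = next_cells(&prop, addr_cells);
		out[n].size = next_cells(&prop, size_cells);
		n++;
	}
	return n;
}
```

A consumer that only understands one range reads entry 0 and stops, which is
why placing the low region last keeps older kdump kernels and user-space
working.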

Besides, we need to modify kexec-tools:
arm64: support more than one crash kernel regions(see [1])

Another update is to the documentation of DT property 'linux,usable-memory-range':
schemas: update 'linux,usable-memory-range' node schema(see [2])


[1]: https://www.spinics.net/lists/kexec/msg28226.html
[2]: https://github.com/robherring/dt-schema/pull/19 
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213
[v9]: https://lkml.org/lkml/2020/6/28/73
[v10]: https://lkml.org/lkml/2020/7/2/1443
[v11]: https://lkml.org/lkml/2020/8/1/150
[v12]: https://lkml.org/lkml/2020/9/7/1037
[v13]: https://lkml.org/lkml/2020/10/31/34
[v14]: https://lkml.org/lkml/2021/1/30/53
[v15]: https://lkml.org/lkml/2021/10/19/1405
[v16]: https://lkml.org/lkml/2021/11/23/435
[v17]: https://lkml.org/lkml/2021/12/10/38
[v18]: https://lkml.org/lkml/2021/12/22/424
[v19]: https://lkml.org/lkml/2021/12/28/203
[v20]: https://lkml.org/lkml/2022/1/24/167

Chen Zhou (2):
  arm64: kdump: reimplement crashkernel=X
  of: fdt: Add memory for devices by DT property
    "linux,usable-memory-range"

Zhen Lei (3):
  kdump: return -ENOENT if required cmdline option does not exist
  arm64: Use insert_resource() to simplify code
  docs: kdump: Update the crashkernel description for arm64

 .../admin-guide/kernel-parameters.txt         |   8 +-
 arch/arm64/kernel/machine_kexec.c             |   9 +-
 arch/arm64/kernel/machine_kexec_file.c        |  12 +-
 arch/arm64/kernel/setup.c                     |  17 +--
 arch/arm64/mm/init.c                          | 107 ++++++++++++++++--
 drivers/of/fdt.c                              |  33 ++++--
 kernel/crash_core.c                           |   3 +-
 7 files changed, 147 insertions(+), 42 deletions(-)

-- 
2.25.1



* [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist
  2022-02-27  3:07 [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
@ 2022-02-27  3:07 ` Zhen Lei
  2022-03-15 11:57   ` Baoquan He
  2022-03-16  5:39   ` Baoquan He
  2022-02-27  3:07 ` [PATCH v21 2/5] arm64: Use insert_resource() to simplify code Zhen Lei
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 27+ messages in thread
From: Zhen Lei @ 2022-02-27  3:07 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

crashkernel=Y,low is an optional command-line option. When it is not
specified, the kernel will try to allocate the minimum required memory below
4G automatically. Give this case a unique error code to distinguish it from
other error scenarios.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/crash_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 256cf6db573cd09..4d57c03714f4e13 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
 	*crash_base = 0;
 
 	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
-
 	if (!ck_cmdline)
-		return -EINVAL;
+		return -ENOENT;
 
 	ck_cmdline += strlen(name);
 
-- 
2.25.1



* [PATCH v21 2/5] arm64: Use insert_resource() to simplify code
  2022-02-27  3:07 [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
  2022-02-27  3:07 ` [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist Zhen Lei
@ 2022-02-27  3:07 ` Zhen Lei
  2022-02-27  3:07 ` [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Zhen Lei @ 2022-02-27  3:07 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

insert_resource() traverses the subtree layer by layer from the root node
until a proper location is found. Compared with request_resource(), the
parent node does not need to be determined in advance.

In addition, move the insertion of node 'crashk_res' into function
reserve_crashkernel() to make the associated code close together.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: John Donnelly  <john.p.donnelly@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 arch/arm64/kernel/setup.c | 17 +++--------------
 arch/arm64/mm/init.c      |  1 +
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index f70573928f1bff0..a81efcc359e4e78 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
 	kernel_code.end     = __pa_symbol(__init_begin - 1);
 	kernel_data.start   = __pa_symbol(_sdata);
 	kernel_data.end     = __pa_symbol(_end - 1);
+	insert_resource(&iomem_resource, &kernel_code);
+	insert_resource(&iomem_resource, &kernel_data);
 
 	num_standard_resources = memblock.memory.cnt;
 	res_size = num_standard_resources * sizeof(*standard_resources);
@@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
 			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
 		}
 
-		request_resource(&iomem_resource, res);
-
-		if (kernel_code.start >= res->start &&
-		    kernel_code.end <= res->end)
-			request_resource(res, &kernel_code);
-		if (kernel_data.start >= res->start &&
-		    kernel_data.end <= res->end)
-			request_resource(res, &kernel_data);
-#ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
-		if (crashk_res.end && crashk_res.start >= res->start &&
-		    crashk_res.end <= res->end)
-			request_resource(res, &crashk_res);
-#endif
+		insert_resource(&iomem_resource, res);
 	}
 }
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index db63cc885771a52..90f276d46b93bc6 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -109,6 +109,7 @@ static void __init reserve_crashkernel(void)
 	kmemleak_ignore_phys(crash_base);
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
+	insert_resource(&iomem_resource, &crashk_res);
 }
 #else
 static void __init reserve_crashkernel(void)
-- 
2.25.1



* [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-27  3:07 [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
  2022-02-27  3:07 ` [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist Zhen Lei
  2022-02-27  3:07 ` [PATCH v21 2/5] arm64: Use insert_resource() to simplify code Zhen Lei
@ 2022-02-27  3:07 ` Zhen Lei
  2022-03-16 12:11   ` Baoquan He
                     ` (2 more replies)
  2022-02-27  3:07 ` [PATCH v21 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 27+ messages in thread
From: Zhen Lei @ 2022-02-27  3:07 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

From: Chen Zhou <chenzhou10@huawei.com>

The following issues exist in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is not enough low memory.
2. If crashkernel is reserved above 4G, the crash dump
kernel will fail to boot because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X and
introduce crashkernel=X,[high,low]. crashkernel=X tries a low allocation
in the DMA zone and falls back to a high allocation if that fails.
We can also use "crashkernel=X,high" to select a region above the DMA zone,
which also tries to allocate at least 256M in the DMA zone automatically.
"crashkernel=Y,low" can be used to allocate low memory of a specified size.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/kernel/machine_kexec.c      |   9 ++-
 arch/arm64/kernel/machine_kexec_file.c |  12 ++-
 arch/arm64/mm/init.c                   | 106 +++++++++++++++++++++++--
 3 files changed, 115 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index e16b248699d5c3c..19c2d487cb08feb 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
 
 	/* in reserved memory? */
 	addr = __pfn_to_phys(pfn);
-	if ((addr < crashk_res.start) || (crashk_res.end < addr))
-		return false;
+	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
+		if (!crashk_low_res.end)
+			return false;
+
+		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
+			return false;
+	}
 
 	if (!kexec_crash_image)
 		return true;
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 59c648d51848886..889951291cc0f9c 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
 
 	/* Exclude crashkernel region */
 	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	if (crashk_low_res.end) {
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		if (ret)
+			goto out;
+	}
 
-	if (!ret)
-		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
+	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
 
+out:
 	kfree(cmem);
 	return ret;
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 90f276d46b93bc6..30ae6638ff54c47 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -65,6 +65,44 @@ EXPORT_SYMBOL(memstart_addr);
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
+/* Current arm64 boot protocol requires 2MB alignment */
+#define CRASH_ALIGN			SZ_2M
+
+#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
+#define CRASH_ADDR_HIGH_MAX		memblock.current_limit
+
+/*
+ * This is an empirical value in x86_64 and taken here directly. Please
+ * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
+ * details.
+ */
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE	\
+	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
+
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+	unsigned long long low_base;
+
+	/* passed with crashkernel=0,low ? */
+	if (!low_size)
+		return 0;
+
+	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
+	if (!low_base) {
+		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+		return -ENOMEM;
+	}
+
+	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
+		low_base, low_base + low_size, low_size >> 20);
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end   = low_base + low_size - 1;
+	insert_resource(&iomem_resource, &crashk_low_res);
+
+	return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -75,30 +113,79 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
-	unsigned long long crash_max = arm64_dma_phys_limit;
+	unsigned long long crash_low_size;
+	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
 	int ret;
+	bool fixed_base, high = false;
+	char *cmdline = boot_command_line;
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
-	/* no crashkernel= or invalid value specified */
-	if (ret || !crash_size)
-		return;
+	if (ret || !crash_size) {
+		/* crashkernel=X,high */
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+
+		/* crashkernel=Y,low */
+		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
+		if (ret == -ENOENT)
+			/*
+			 * crashkernel=Y,low is not specified explicitly, use
+			 * default size automatically.
+			 */
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+		else if (ret)
+			/* crashkernel=Y,low is specified but Y is invalid */
+			return;
+
+		/* Mark crashkernel=X,high is specified */
+		high = true;
+		crash_max = CRASH_ADDR_HIGH_MAX;
+	}
 
+	fixed_base = !!crash_base;
 	crash_size = PAGE_ALIGN(crash_size);
 
 	/* User specifies base address explicitly. */
-	if (crash_base)
+	if (fixed_base)
 		crash_max = crash_base + crash_size;
 
-	/* Current arm64 boot protocol requires 2MB alignment */
-	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
+retry:
+	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
 					       crash_base, crash_max);
 	if (!crash_base) {
+		/*
+		 * Attempt to fully allocate low memory failed, fall back
+		 * to high memory, the minimum required low memory will be
+		 * reserved later.
+		 */
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			crash_max = CRASH_ADDR_HIGH_MAX;
+			goto retry;
+		}
+
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
 		return;
 	}
 
+	if (crash_base >= SZ_4G) {
+		/*
+		 * For case crashkernel=X, low memory is not enough and fall
+		 * back to reserve specified size of memory above 4G, try to
+		 * allocate minimum required memory below 4G again.
+		 */
+		if (!high)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+
+		if (reserve_crashkernel_low(crash_low_size)) {
+			memblock_phys_free(crash_base, crash_size);
+			return;
+		}
+	}
+
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
 	 * map. Inform kmemleak so that it won't try to access it.
 	 */
 	kmemleak_ignore_phys(crash_base);
+	if (crashk_low_res.end)
+		kmemleak_ignore_phys(crashk_low_res.start);
+
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-- 
2.25.1



* [PATCH v21 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  2022-02-27  3:07 [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
                   ` (2 preceding siblings ...)
  2022-02-27  3:07 ` [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
@ 2022-02-27  3:07 ` Zhen Lei
  2022-02-27  3:07 ` [PATCH v21 5/5] docs: kdump: Update the crashkernel description for arm64 Zhen Lei
  2022-04-08  9:32 ` [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Baoquan He
  5 siblings, 0 replies; 27+ messages in thread
From: Zhen Lei @ 2022-02-27  3:07 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

From: Chen Zhou <chenzhou10@huawei.com>

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to crash dump kernel via DT property
under /chosen,
        linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

We reuse the DT property linux,usable-memory-range and make the low
memory region the second range "BASE2 SIZE2", which keeps compatibility
with existing user-space and older kdump kernels.

The crash dump kernel reads this property at boot time and calls
memblock_add() to add the low memory region after
memblock_cap_memory_range() has been called.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
---
 drivers/of/fdt.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index ec315b060cd50d2..2f248d0acc04830 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
 
 static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
 
+/*
+ * The main usage of linux,usable-memory-range is for crash dump kernel.
+ * Originally, the number of usable-memory regions is one. Now there may
+ * be two regions, low region and high region.
+ * To make compatibility with existing user-space and older kdump, the low
+ * region is always the last range of linux,usable-memory-range if exist.
+ */
+#define MAX_USABLE_RANGES		2
+
 /**
  * early_init_dt_check_for_usable_mem_range - Decode usable memory range
  * location from flat tree
  */
 void __init early_init_dt_check_for_usable_mem_range(void)
 {
-	const __be32 *prop;
-	int len;
-	phys_addr_t cap_mem_addr;
-	phys_addr_t cap_mem_size;
+	struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
+	const __be32 *prop, *endp;
+	int len, i;
 	unsigned long node = chosen_node_offset;
 
 	if ((long)node < 0)
@@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
 	pr_debug("Looking for usable-memory-range property... ");
 
 	prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
-	if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
+	if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
 		return;
 
-	cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
-	cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
+	endp = prop + (len / sizeof(__be32));
+	for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
+		rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
+		rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
 
-	pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
-		 &cap_mem_size);
+		pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
+			 i, &rgn[i].base, &rgn[i].size);
+	}
 
-	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+	memblock_cap_memory_range(rgn[0].base, rgn[0].size);
+	for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
+		memblock_add(rgn[i].base, rgn[i].size);
 }
 
 #ifdef CONFIG_SERIAL_EARLYCON
-- 
2.25.1



* [PATCH v21 5/5] docs: kdump: Update the crashkernel description for arm64
  2022-02-27  3:07 [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
                   ` (3 preceding siblings ...)
  2022-02-27  3:07 ` [PATCH v21 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
@ 2022-02-27  3:07 ` Zhen Lei
  2022-03-15 11:59   ` Baoquan He
  2022-04-08  9:32 ` [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Baoquan He
  5 siblings, 1 reply; 27+ messages in thread
From: Zhen Lei @ 2022-02-27  3:07 UTC
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Now arm64 has added support for "crashkernel=X,high" and
"crashkernel=Y,low", and implements "crashkernel=X[@offset]" in the
same way as x86. So update the Documentation.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f5a27f067db9ed9..63098786c93828c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -789,7 +789,7 @@
 			memory region [offset, offset + size] for that kernel
 			image. If '@offset' is omitted, then a suitable offset
 			is selected automatically.
-			[KNL, X86-64] Select a region under 4G first, and
+			[KNL, X86-64, ARM64] Select a region under 4G first, and
 			fall back to reserve region above 4G when '@offset'
 			hasn't been specified.
 			See Documentation/admin-guide/kdump/kdump.rst for further details.
@@ -802,20 +802,20 @@
 			Documentation/admin-guide/kdump/kdump.rst for an example.
 
 	crashkernel=size[KMG],high
-			[KNL, X86-64] range could be above 4G. Allow kernel
+			[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
 			to allocate physical memory region from top, so could
 			be above 4G if system have more than 4G ram installed.
 			Otherwise memory region will be allocated below 4G, if
 			available.
 			It will be ignored if crashkernel=X is specified.
 	crashkernel=size[KMG],low
-			[KNL, X86-64] range under 4G. When crashkernel=X,high
+			[KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high
 			is passed, kernel could allocate physical memory region
 			above 4G, that cause second kernel crash on system
 			that require some amount of low memory, e.g. swiotlb
 			requires at least 64M+32K low memory, also enough extra
 			low memory is needed to make sure DMA buffers for 32-bit
-			devices won't run out. Kernel would try to allocate at
+			devices won't run out. Kernel would try to allocate
 			at least 256M below 4G automatically.
 			This one let user to specify own low range under 4G
 			for second kernel instead.
-- 
2.25.1



* Re: [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist
  2022-02-27  3:07 ` [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist Zhen Lei
@ 2022-03-15 11:57   ` Baoquan He
  2022-03-15 12:21     ` Baoquan He
  2022-03-16  5:39   ` Baoquan He
  1 sibling, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-03-15 11:57 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/27/22 at 11:07am, Zhen Lei wrote:
> The crashkernel=Y,low is an optional command-line option. When it doesn't
> exist, kernel will try to allocate minimum required memory below 4G
> automatically. Give it a unique error code to distinguish it from other
> error scenarios.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  kernel/crash_core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 256cf6db573cd09..4d57c03714f4e13 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>  	*crash_base = 0;
>  
>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> -
>  	if (!ck_cmdline)
> -		return -EINVAL;
> +		return -ENOENT;

Firstly, I am not sure whether '-ENOENT' is the right value to return. Judging
by the code comment for ENOENT, it's meant for a file or directory?
#define ENOENT           2      /* No such file or directory */

Secondly, we once discussed the two cases:
 - no crashkernel=,low is provided;
 - a garbled value is provided, e.g. crashkernel=aaaaaabbbb,low

The 2nd one is not handled in this patchset. How about moving that
handling into another round of patches, so that this patchset purely
adds crashkernel=,high?

>  
>  	ck_cmdline += strlen(name);
>  
> -- 
> 2.25.1
> 



* Re: [PATCH v21 5/5] docs: kdump: Update the crashkernel description for arm64
  2022-02-27  3:07 ` [PATCH v21 5/5] docs: kdump: Update the crashkernel description for arm64 Zhen Lei
@ 2022-03-15 11:59   ` Baoquan He
  0 siblings, 0 replies; 27+ messages in thread
From: Baoquan He @ 2022-03-15 11:59 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/27/22 at 11:07am, Zhen Lei wrote:
> Now arm64 has added support for "crashkernel=X,high" and
> "crashkernel=Y,low", and implements "crashkernel=X[@offset]" in the
> same way as x86. So update the Documentation.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

Looks good to me, thx.

Acked-by: Baoquan He <bhe@redhat.com>

> ---
>  Documentation/admin-guide/kernel-parameters.txt | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index f5a27f067db9ed9..63098786c93828c 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -789,7 +789,7 @@
>  			memory region [offset, offset + size] for that kernel
>  			image. If '@offset' is omitted, then a suitable offset
>  			is selected automatically.
> -			[KNL, X86-64] Select a region under 4G first, and
> +			[KNL, X86-64, ARM64] Select a region under 4G first, and
>  			fall back to reserve region above 4G when '@offset'
>  			hasn't been specified.
>  			See Documentation/admin-guide/kdump/kdump.rst for further details.
> @@ -802,20 +802,20 @@
>  			Documentation/admin-guide/kdump/kdump.rst for an example.
>  
>  	crashkernel=size[KMG],high
> -			[KNL, X86-64] range could be above 4G. Allow kernel
> +			[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
>  			to allocate physical memory region from top, so could
>  			be above 4G if system have more than 4G ram installed.
>  			Otherwise memory region will be allocated below 4G, if
>  			available.
>  			It will be ignored if crashkernel=X is specified.
>  	crashkernel=size[KMG],low
> -			[KNL, X86-64] range under 4G. When crashkernel=X,high
> +			[KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high
>  			is passed, kernel could allocate physical memory region
>  			above 4G, that cause second kernel crash on system
>  			that require some amount of low memory, e.g. swiotlb
>  			requires at least 64M+32K low memory, also enough extra
>  			low memory is needed to make sure DMA buffers for 32-bit
> -			devices won't run out. Kernel would try to allocate at
> +			devices won't run out. Kernel would try to allocate
>  			at least 256M below 4G automatically.
>  			This one let user to specify own low range under 4G
>  			for second kernel instead.
> -- 
> 2.25.1
> 



* Re: [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist
  2022-03-15 11:57   ` Baoquan He
@ 2022-03-15 12:21     ` Baoquan He
  2022-03-15 13:32       ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-03-15 12:21 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 03/15/22 at 07:57pm, Baoquan He wrote:
> On 02/27/22 at 11:07am, Zhen Lei wrote:
> > The crashkernel=Y,low is an optional command-line option. When it doesn't
> > exist, kernel will try to allocate minimum required memory below 4G
> > automatically. Give it a unique error code to distinguish it from other
> > error scenarios.
> > 
> > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> > ---
> >  kernel/crash_core.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > index 256cf6db573cd09..4d57c03714f4e13 100644
> > --- a/kernel/crash_core.c
> > +++ b/kernel/crash_core.c
> > @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
> >  	*crash_base = 0;
> >  
> >  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> > -
> >  	if (!ck_cmdline)
> > -		return -EINVAL;
> > +		return -ENOENT;
> 
> Firstly, I am not sure if '-ENOENT' is a right value to return. From the
> code comment of ENOENT, it's used for file or dir?
> #define ENOENT           2      /* No such file or directory */
> 
> Secondly, we ever discussed the case including
>  - no crashkernel=,low is provided;
>  - a garbled value is provided, e.g. crashkernel=aaaaaabbbb,low

Having checked the 3rd patch, I see this is handled. I take back my words
below and will continue reviewing.

> 
> The 2nd one is not handled in this patchset. How about taking the
> handling into another round of patches. This patchset just adds
> crashkernel=,high purely.
> 
> >  
> >  	ck_cmdline += strlen(name);
> >  
> > -- 
> > 2.25.1
> > 
> 



* Re: [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist
  2022-03-15 12:21     ` Baoquan He
@ 2022-03-15 13:32       ` Leizhen (ThunderTown)
  2022-03-16  5:17         ` Baoquan He
  0 siblings, 1 reply; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-03-15 13:32 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/3/15 20:21, Baoquan He wrote:
> On 03/15/22 at 07:57pm, Baoquan He wrote:
>> On 02/27/22 at 11:07am, Zhen Lei wrote:
>>> The crashkernel=Y,low is an optional command-line option. When it doesn't
>>> exist, kernel will try to allocate minimum required memory below 4G
>>> automatically. Give it a unique error code to distinguish it from other
>>> error scenarios.
>>>
>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>> ---
>>>  kernel/crash_core.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>>> index 256cf6db573cd09..4d57c03714f4e13 100644
>>> --- a/kernel/crash_core.c
>>> +++ b/kernel/crash_core.c
>>> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>>>  	*crash_base = 0;
>>>  
>>>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
>>> -
>>>  	if (!ck_cmdline)
>>> -		return -EINVAL;
>>> +		return -ENOENT;
>>
>> Firstly, I am not sure if '-ENOENT' is a right value to return. From the
>> code comment of ENOENT, it's used for file or dir?
>> #define ENOENT           2      /* No such file or directory */

This error code is not returned to user space, so there is no problem.
Many places in the kernel use it this way. For example:

int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
{
	if (!cpu_stop_queue_work(cpu, &work))
		return -ENOENT;

>>
>> Secondly, we ever discussed the case including
>>  - no crashkernel=,low is provided;
>>  - a garbled value is provided, e.g. crashkernel=aaaaaabbbb,low
> 
> Having checked the 3rd patch, I see this is handled. I take back my words
> below and will continue reviewing.

Yes.

> 
>>
>> The 2nd one is not handled in this patchset. How about taking the
>> handling into another round of patches. This patchset just adds
>> crashkernel=,high purely.
>>
>>>  
>>>  	ck_cmdline += strlen(name);
>>>  
>>> -- 
>>> 2.25.1
>>>
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist
  2022-03-15 13:32       ` Leizhen (ThunderTown)
@ 2022-03-16  5:17         ` Baoquan He
  0 siblings, 0 replies; 27+ messages in thread
From: Baoquan He @ 2022-03-16  5:17 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 03/15/22 at 09:32pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/3/15 20:21, Baoquan He wrote:
> > On 03/15/22 at 07:57pm, Baoquan He wrote:
> >> On 02/27/22 at 11:07am, Zhen Lei wrote:
> >>> The crashkernel=Y,low is an optional command-line option. When it doesn't
> >>> exist, kernel will try to allocate minimum required memory below 4G
> >>> automatically. Give it a unique error code to distinguish it from other
> >>> error scenarios.
> >>>
> >>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >>> ---
> >>>  kernel/crash_core.c | 3 +--
> >>>  1 file changed, 1 insertion(+), 2 deletions(-)
> >>>
> >>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> >>> index 256cf6db573cd09..4d57c03714f4e13 100644
> >>> --- a/kernel/crash_core.c
> >>> +++ b/kernel/crash_core.c
> >>> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
> >>>  	*crash_base = 0;
> >>>  
> >>>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> >>> -
> >>>  	if (!ck_cmdline)
> >>> -		return -EINVAL;
> >>> +		return -ENOENT;
> >>
> >> Firstly, I am not sure if '-ENOENT' is a right value to return. From the
> >> code comment of ENOENT, it's used for file or dir?
> >> #define ENOENT           2      /* No such file or directory */
> 
> This error code does not return to user mode, so there is no problem.
> There are a lot of places in the kernel that are used this way. For example:
> 
> int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
> {
> 	if (!cpu_stop_queue_work(cpu, &work))
> 		return -ENOENT;

OK, that's fine with me. Thanks for the investigation.



* Re: [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist
  2022-02-27  3:07 ` [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist Zhen Lei
  2022-03-15 11:57   ` Baoquan He
@ 2022-03-16  5:39   ` Baoquan He
  2022-03-16  6:15     ` Leizhen (ThunderTown)
  1 sibling, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-03-16  5:39 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/27/22 at 11:07am, Zhen Lei wrote:
> The crashkernel=Y,low is an optional command-line option. When it doesn't
> exist, kernel will try to allocate minimum required memory below 4G
> automatically. Give it a unique error code to distinguish it from other
> error scenarios.

This log is a little confusing. __parse_crashkernel() has three callers. 
 - parse_crashkernel()
 - parse_crashkernel_high()
 - parse_crashkernel_low()

How about rewording the commit log as below:

==================
According to the current crashkernel=Y,low support in other ARCHes, it's
an optional command-line option. When it doesn't exist, the kernel will try
to allocate the minimum required memory below 4G automatically.

However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
can't distinguish a nonexistent option from an invalid option.

Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
case. With this change, crashkernel,low memory will take the default
value if crashkernel=,low is not specified, while crashkernel reservation
will fail and bail out if an invalid option is specified.
==================
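
The distinction described in the log above can be sketched in user space
roughly as follows. This is only an illustrative model, not the kernel
code: parse_low_mb(), pick_low_mb() and DEFAULT_LOW_MB are hypothetical
names, sizes are accepted only in the form <digits>M, and 256 is assumed
as a stand-in for the default low size.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: stand-in for the default low size, in MiB. */
#define DEFAULT_LOW_MB 256UL

/*
 * Hypothetical model of the lookup: find the last "crashkernel=<n>M,low"
 * token. Returns 0 on success, -ENOENT if no such option exists, and
 * -EINVAL if the option exists but its value is malformed.
 */
static int parse_low_mb(const char *cmdline, unsigned long *size_mb)
{
	static const char name[] = "crashkernel=";
	static const char suffix[] = ",low";
	const size_t name_len = sizeof(name) - 1;
	const size_t suffix_len = sizeof(suffix) - 1;
	const char *p, *val = NULL;
	size_t val_len = 0;
	unsigned long mb;
	char *end;

	assert(cmdline && size_mb);

	/* Walk every "crashkernel=" token and remember the last ",low" one. */
	for (p = strstr(cmdline, name); p; p = strstr(p + 1, name)) {
		size_t len = strcspn(p, " ");

		if (len >= name_len + suffix_len &&
		    !strncmp(p + len - suffix_len, suffix, suffix_len)) {
			val = p + name_len;
			val_len = len - name_len - suffix_len;
		}
	}

	if (!val)
		return -ENOENT;		/* option not given at all */

	if (!isdigit((unsigned char)*val))
		return -EINVAL;		/* e.g. crashkernel=aaaaaabbbb,low */

	mb = strtoul(val, &end, 10);
	if (*end != 'M' || (size_t)(end + 1 - val) != val_len)
		return -EINVAL;		/* trailing junk or missing unit */

	*size_mb = mb;
	return 0;
}

/* Caller: only the "option absent" case falls back to the default. */
static int pick_low_mb(const char *cmdline, unsigned long *low_mb)
{
	int ret = parse_low_mb(cmdline, low_mb);

	if (ret == -ENOENT) {
		*low_mb = DEFAULT_LOW_MB;
		return 0;
	}
	return ret;
}
```

With a single '-EINVAL' for both cases, pick_low_mb() could not tell
whether to apply the default or to bail out.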

> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  kernel/crash_core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 256cf6db573cd09..4d57c03714f4e13 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>  	*crash_base = 0;
>  
>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> -
>  	if (!ck_cmdline)
> -		return -EINVAL;
> +		return -ENOENT;
>  
>  	ck_cmdline += strlen(name);
>  
> -- 
> 2.25.1
> 



* Re: [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist
  2022-03-16  5:39   ` Baoquan He
@ 2022-03-16  6:15     ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-03-16  6:15 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/3/16 13:39, Baoquan He wrote:
> On 02/27/22 at 11:07am, Zhen Lei wrote:
>> The crashkernel=Y,low is an optional command-line option. When it doesn't
>> exist, kernel will try to allocate minimum required memory below 4G
>> automatically. Give it a unique error code to distinguish it from other
>> error scenarios.
> 
> This log is a little confusing. __parse_crashkernel() has three callers. 
>  - parse_crashkernel()
>  - parse_crashkernel_high()
>  - parse_crashkernel_low()
> 
> How about tuning the git log as below:

Sure. Your description is much clearer than mine.

> 
> ==================
> According to the current crashkernel=Y,low support in other ARCHes, it's
> an optional command-line option. When it doesn't exist, kernel will try
> to allocate minimum required memory below 4G automatically. 
> 
> However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
> can't distinguish the nonexistent option from invalid option. 
> 
> Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
> case. With this change, crashkernel,low memory will take the default
> value if crashkernel=,low is not specified; while crashkernel reservation
> will fail and bail out if an invalid option is specified.
> ==================
> 
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  kernel/crash_core.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>>
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 256cf6db573cd09..4d57c03714f4e13 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>>  	*crash_base = 0;
>>  
>>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
>> -
>>  	if (!ck_cmdline)
>> -		return -EINVAL;
>> +		return -ENOENT;
>>  
>>  	ck_cmdline += strlen(name);
>>  
>> -- 
>> 2.25.1
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-27  3:07 ` [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
@ 2022-03-16 12:11   ` Baoquan He
  2022-03-16 13:11     ` Leizhen (ThunderTown)
  2022-03-17  2:38   ` Baoquan He
  2022-03-21 13:29   ` John Donnelly
  2 siblings, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-03-16 12:11 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/27/22 at 11:07am, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> There are the following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel memory below 4G, which
> will fail when there is not enough low memory.
> 2. If the crashkernel is reserved above 4G, the crash dump
> kernel will fail to boot because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X and
> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
> in the DMA zone, and falls back to high allocation if it fails.
> We can also use "crashkernel=X,high" to select a region above the DMA zone,
> which also tries to allocate at least 256M in the DMA zone automatically.
> "crashkernel=Y,low" can be used to allocate low memory of a specified size.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  arch/arm64/kernel/machine_kexec.c      |   9 ++-
>  arch/arm64/kernel/machine_kexec_file.c |  12 ++-
>  arch/arm64/mm/init.c                   | 106 +++++++++++++++++++++++--
>  3 files changed, 115 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index e16b248699d5c3c..19c2d487cb08feb 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
>  
>  	/* in reserved memory? */
>  	addr = __pfn_to_phys(pfn);
> -	if ((addr < crashk_res.start) || (crashk_res.end < addr))
> -		return false;
> +	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
> +		if (!crashk_low_res.end)
> +			return false;
> +
> +		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
> +			return false;
> +	}
>  
>  	if (!kexec_crash_image)
>  		return true;
> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index 59c648d51848886..889951291cc0f9c 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>  
>  	/* Exclude crashkernel region */
>  	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> +	if (ret)
> +		goto out;
> +
> +	if (crashk_low_res.end) {
> +		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
> +		if (ret)
> +			goto out;
> +	}
>  
> -	if (!ret)
> -		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
> +	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
>  
> +out:
>  	kfree(cmem);
>  	return ret;
>  }
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 90f276d46b93bc6..30ae6638ff54c47 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -65,6 +65,44 @@ EXPORT_SYMBOL(memstart_addr);
>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  
>  #ifdef CONFIG_KEXEC_CORE
> +/* Current arm64 boot protocol requires 2MB alignment */
> +#define CRASH_ALIGN			SZ_2M
> +
> +#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
> +#define CRASH_ADDR_HIGH_MAX		memblock.current_limit
> +
> +/*
> + * This is an empirical value in x86_64 and taken here directly. Please
> + * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
> + * details.
> + */
> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE	\
> +	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
> +
> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> +{
> +	unsigned long long low_base;
> +
> +	/* passed with crashkernel=0,low ? */
> +	if (!low_size)
> +		return 0;
> +
> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> +	if (!low_base) {
> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> +		return -ENOMEM;
> +	}
> +
> +	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
> +		low_base, low_base + low_size, low_size >> 20);
> +
> +	crashk_low_res.start = low_base;
> +	crashk_low_res.end   = low_base + low_size - 1;
> +	insert_resource(&iomem_resource, &crashk_low_res);
> +
> +	return 0;
> +}
> +
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
> @@ -75,30 +113,79 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  static void __init reserve_crashkernel(void)
>  {
>  	unsigned long long crash_base, crash_size;
> -	unsigned long long crash_max = arm64_dma_phys_limit;
> +	unsigned long long crash_low_size;
> +	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>  	int ret;

Even though reverse xmas tree style is not enforced, this 'int ret;' is
really annoying to look at. Maybe move it down two lines.

> +	bool fixed_base, high = false;
> +	char *cmdline = boot_command_line;
>  
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	/* crashkernel=X[@offset] */
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>  				&crash_size, &crash_base);
> -	/* no crashkernel= or invalid value specified */
> -	if (ret || !crash_size)
> -		return;
> +	if (ret || !crash_size) {
> +		/* crashkernel=X,high */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/* crashkernel=Y,low */
> +		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
> +		if (ret == -ENOENT)
> +			/*
> +			 * crashkernel=Y,low is not specified explicitly, use
> +			 * default size automatically.
> +			 */
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +		else if (ret)
> +			/* crashkernel=Y,low is specified but Y is invalid */
> +			return;
> +
> +		/* Mark crashkernel=X,high is specified */
> +		high = true;
> +		crash_max = CRASH_ADDR_HIGH_MAX;
> +	}
>  
> +	fixed_base = !!crash_base;
>  	crash_size = PAGE_ALIGN(crash_size);
>  
>  	/* User specifies base address explicitly. */
This is over-commenting; I can't see why it's needed.
> -	if (crash_base)
> +	if (fixed_base)
>  		crash_max = crash_base + crash_size;

Hi leizhen,

I made some changes to reserve_crashkernel(), since commenting inline may
be slow. Please check them and consider whether they can be taken.

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 30ae6638ff54..f96351da1e3e 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -109,38 +109,43 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
  * This function reserves memory area given in "crashkernel=" kernel command
  * line parameter. The memory reserved is used by dump capture kernel when
  * primary kernel is crashing.
+ *
+ * NOTE: Reservation of crashkernel,low is special because it is not
+ * independent: it relies on the existence of crashkernel,high.
+ * Hence there are different cases for crashkernel,low reservation:
+ * 1) crashkernel=Y,low is specified explicitly, crashkernel,low takes Y;
+ * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
+ *    take the default crashkernel,low value;
+ * 3) crashkernel=X is specified, but the reservation falls back to a
+ *    region in high memory, take the default crashkernel,low value;
+ * 4) crashkernel='invalid value',low is specified, fail the whole
+ *    crashkernel reservation and bail out.
  */
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
 	unsigned long long crash_low_size;
 	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
-	int ret;
 	bool fixed_base, high = false;
 	char *cmdline = boot_command_line;
+	int ret;
 
 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
 	if (ret || !crash_size) {
-		/* crashkernel=X,high */
 		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
 		if (ret || !crash_size)
 			return;
 
-		/* crashkernel=Y,low */
 		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
 		if (ret == -ENOENT)
-			/*
-			 * crashkernel=Y,low is not specified explicitly, use
-			 * default size automatically.
-			 */
+			/* case #2 of crashkernel,low reservation */
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
 		else if (ret)
-			/* crashkernel=Y,low is specified but Y is invalid */
+			/* case #4 of crashkernel,low reservation */
 			return;
 
-		/* Mark crashkernel=X,high is specified */
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
 	}
@@ -148,7 +153,6 @@ static void __init reserve_crashkernel(void)
 	fixed_base = !!crash_base;
 	crash_size = PAGE_ALIGN(crash_size);
 
-	/* User specifies base address explicitly. */
 	if (fixed_base)
 		crash_max = crash_base + crash_size;
 
@@ -172,11 +176,7 @@ static void __init reserve_crashkernel(void)
 	}
 
 	if (crash_base >= SZ_4G) {
-		/*
-		 * For case crashkernel=X, low memory is not enough and fall
-		 * back to reserve specified size of memory above 4G, try to
-		 * allocate minimum required memory below 4G again.
-		 */
+		/* case #3 of crashkernel,low reservation */
 		if (!high)
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
 

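The four cases in the comment above can be restated as a small decision
function. This is a sketch for discussion only, not part of the patch:
crash_low_mb() and DEFAULT_LOW_MB are hypothetical names, sizes are in MiB,
and 256 is assumed as a fixed stand-in for DEFAULT_CRASH_KERNEL_LOW_SIZE
(the real macro also takes the swiotlb size into account).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumption: fixed stand-in for DEFAULT_CRASH_KERNEL_LOW_SIZE, in MiB. */
#define DEFAULT_LOW_MB 256L

/*
 * high:     crashkernel=X,high was parsed successfully
 * low_ret:  result of parsing crashkernel=Y,low (0, -ENOENT or -EINVAL);
 *           only consulted when 'high' is true, as in the patch
 * low_mb:   Y, valid only when low_ret == 0
 * above_4g: the main region ended up at or above 4G
 *
 * Returns the low size to reserve in MiB, 0 if no low reservation is
 * needed, or a negative errno if the whole reservation must be abandoned.
 */
static long crash_low_mb(bool high, int low_ret, long low_mb, bool above_4g)
{
	long size = 0;

	assert(low_ret <= 0 && low_mb >= 0);

	if (high) {
		if (low_ret == -ENOENT)
			size = DEFAULT_LOW_MB;	/* case 2 */
		else if (low_ret)
			return low_ret;		/* case 4 */
		else
			size = low_mb;		/* case 1 */
	}

	if (!above_4g)
		return 0;	/* main region is already below 4G */

	if (!high)
		size = DEFAULT_LOW_MB;	/* case 3: crashkernel=X fell back */

	return size;
}
```
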
>  
> -	/* Current arm64 boot protocol requires 2MB alignment */
> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
> +retry:
> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>  					       crash_base, crash_max);
>  	if (!crash_base) {
> +		/*
> +		 * Attempt to fully allocate low memory failed, fall back
> +		 * to high memory, the minimum required low memory will be
> +		 * reserved later.
> +		 */
> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +			crash_max = CRASH_ADDR_HIGH_MAX;
> +			goto retry;
> +		}
> +
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>  			crash_size);
>  		return;
>  	}
>  
> +	if (crash_base >= SZ_4G) {
> +		/*
> +		 * For case crashkernel=X, low memory is not enough and fall
> +		 * back to reserve specified size of memory above 4G, try to
> +		 * allocate minimum required memory below 4G again.
> +		 */
> +		if (!high)
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +
> +		if (reserve_crashkernel_low(crash_low_size)) {
> +			memblock_phys_free(crash_base, crash_size);
> +			return;
> +		}
> +	}
> +
>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>  		crash_base, crash_base + crash_size, crash_size >> 20);
>  
> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
>  	 * map. Inform kmemleak so that it won't try to access it.
>  	 */
>  	kmemleak_ignore_phys(crash_base);
> +	if (crashk_low_res.end)
> +		kmemleak_ignore_phys(crashk_low_res.start);
> +
>  	crashk_res.start = crash_base;
>  	crashk_res.end = crash_base + crash_size - 1;
>  	insert_resource(&iomem_resource, &crashk_res);
> -- 
> 2.25.1
> 



* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-16 12:11   ` Baoquan He
@ 2022-03-16 13:11     ` Leizhen (ThunderTown)
  2022-03-17  2:36       ` Baoquan He
  0 siblings, 1 reply; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-03-16 13:11 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/3/16 20:11, Baoquan He wrote:
> On 02/27/22 at 11:07am, Zhen Lei wrote:
>> From: Chen Zhou <chenzhou10@huawei.com>
>>
>> There are following issues in arm64 kdump:
>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>> will fail when there is no enough low memory.
>> 2. If reserving crashkernel above 4G, in this case, crash dump
>> kernel will boot failure because there is no low memory available
>> for allocation.
>>
>> To solve these issues, change the behavior of crashkernel=X and
>> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
>> in DMA zone, and fall back to high allocation if it fails.
>> We can also use "crashkernel=X,high" to select a region above DMA zone,
>> which also tries to allocate at least 256M in DMA zone automatically.
>> "crashkernel=Y,low" can be used to allocate specified size low memory.
>>
>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  arch/arm64/kernel/machine_kexec.c      |   9 ++-
>>  arch/arm64/kernel/machine_kexec_file.c |  12 ++-
>>  arch/arm64/mm/init.c                   | 106 +++++++++++++++++++++++--
>>  3 files changed, 115 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index e16b248699d5c3c..19c2d487cb08feb 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
>>  
>>  	/* in reserved memory? */
>>  	addr = __pfn_to_phys(pfn);
>> -	if ((addr < crashk_res.start) || (crashk_res.end < addr))
>> -		return false;
>> +	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
>> +		if (!crashk_low_res.end)
>> +			return false;
>> +
>> +		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
>> +			return false;
>> +	}
>>  
>>  	if (!kexec_crash_image)
>>  		return true;
>> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
>> index 59c648d51848886..889951291cc0f9c 100644
>> --- a/arch/arm64/kernel/machine_kexec_file.c
>> +++ b/arch/arm64/kernel/machine_kexec_file.c
>> @@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>>  
>>  	/* Exclude crashkernel region */
>>  	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
>> +	if (ret)
>> +		goto out;
>> +
>> +	if (crashk_low_res.end) {
>> +		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
>> +		if (ret)
>> +			goto out;
>> +	}
>>  
>> -	if (!ret)
>> -		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
>> +	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
>>  
>> +out:
>>  	kfree(cmem);
>>  	return ret;
>>  }
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 90f276d46b93bc6..30ae6638ff54c47 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -65,6 +65,44 @@ EXPORT_SYMBOL(memstart_addr);
>>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  
>>  #ifdef CONFIG_KEXEC_CORE
>> +/* Current arm64 boot protocol requires 2MB alignment */
>> +#define CRASH_ALIGN			SZ_2M
>> +
>> +#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
>> +#define CRASH_ADDR_HIGH_MAX		memblock.current_limit
>> +
>> +/*
>> + * This is an empirical value in x86_64 and taken here directly. Please
>> + * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
>> + * details.
>> + */
>> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE	\
>> +	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
>> +
>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>> +{
>> +	unsigned long long low_base;
>> +
>> +	/* passed with crashkernel=0,low ? */
>> +	if (!low_size)
>> +		return 0;
>> +
>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
>> +	if (!low_base) {
>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
>> +		low_base, low_base + low_size, low_size >> 20);
>> +
>> +	crashk_low_res.start = low_base;
>> +	crashk_low_res.end   = low_base + low_size - 1;
>> +	insert_resource(&iomem_resource, &crashk_low_res);
>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>> @@ -75,30 +113,79 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>  static void __init reserve_crashkernel(void)
>>  {
>>  	unsigned long long crash_base, crash_size;
>> -	unsigned long long crash_max = arm64_dma_phys_limit;
>> +	unsigned long long crash_low_size;
>> +	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>  	int ret;
> 
> Even though reverse xmas tree style is not enforced, this 'int ret;' is
> really annoying to look at. Maybe move it down two lines.
> 
>> +	bool fixed_base, high = false;
>> +	char *cmdline = boot_command_line;
>>  
>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +	/* crashkernel=X[@offset] */
>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>  				&crash_size, &crash_base);
>> -	/* no crashkernel= or invalid value specified */
>> -	if (ret || !crash_size)
>> -		return;
>> +	if (ret || !crash_size) {
>> +		/* crashkernel=X,high */
>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>> +		if (ret || !crash_size)
>> +			return;
>> +
>> +		/* crashkernel=Y,low */
>> +		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
>> +		if (ret == -ENOENT)
>> +			/*
>> +			 * crashkernel=Y,low is not specified explicitly, use
>> +			 * default size automatically.
>> +			 */
>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +		else if (ret)
>> +			/* crashkernel=Y,low is specified but Y is invalid */
>> +			return;
>> +
>> +		/* Mark crashkernel=X,high is specified */
>> +		high = true;
>> +		crash_max = CRASH_ADDR_HIGH_MAX;
>> +	}
>>  
>> +	fixed_base = !!crash_base;
>>  	crash_size = PAGE_ALIGN(crash_size);
>>  
>>  	/* User specifies base address explicitly. */
> This is over commenting, can't see why it's needed.
>> -	if (crash_base)
>> +	if (fixed_base)
>>  		crash_max = crash_base + crash_size;
> 
> Hi leizhen,
> 
> I made change on reserve_crashkenrel(), inline comment may be slow.
> Please check and consider if they can be taken.

That's great. Thank you very much.

> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 30ae6638ff54..f96351da1e3e 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -109,38 +109,43 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>   * This function reserves memory area given in "crashkernel=" kernel command
>   * line parameter. The memory reserved is used by dump capture kernel when
>   * primary kernel is crashing.
> + *
> + * NOTE: Reservation of crashkernel,low is special since its existence
> + * is not independent, need rely on the existence of crashkernel,high.
> + * Hence there are different cases for crashkernel,low reservation:
> + * 1) crashkernel=Y,low is specified explicitly, crashkernel,low takes Y;
> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> + *    take the default crashkernel,low value;
> + * 3) crashkernel=X is specified, while fallback to get a memory region
> + *    in high memory, take the default crashkernel,low value;
> + * 4) crashkernel='invalid value',low is specified, failed the whole
> + *    crashkernel reservation and bail out.
>   */
>  static void __init reserve_crashkernel(void)
>  {
>  	unsigned long long crash_base, crash_size;
>  	unsigned long long crash_low_size;
>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> -	int ret;
>  	bool fixed_base, high = false;
>  	char *cmdline = boot_command_line;
> +	int ret;
>  
>  	/* crashkernel=X[@offset] */
>  	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>  				&crash_size, &crash_base);
>  	if (ret || !crash_size) {
> -		/* crashkernel=X,high */
>  		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>  		if (ret || !crash_size)
>  			return;
>  
> -		/* crashkernel=Y,low */
>  		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
>  		if (ret == -ENOENT)
> -			/*
> -			 * crashkernel=Y,low is not specified explicitly, use
> -			 * default size automatically.
> -			 */
> +			/* case #2 of crashkernel,low reservation */
>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>  		else if (ret)
> -			/* crashkernel=Y,low is specified but Y is invalid */
> +			/* case #4 of crashkernel,low reservation */
>  			return;
>  
> -		/* Mark crashkernel=X,high is specified */
>  		high = true;
>  		crash_max = CRASH_ADDR_HIGH_MAX;
>  	}
> @@ -148,7 +153,6 @@ static void __init reserve_crashkernel(void)
>  	fixed_base = !!crash_base;
>  	crash_size = PAGE_ALIGN(crash_size);
>  
> -	/* User specifies base address explicitly. */
>  	if (fixed_base)
>  		crash_max = crash_base + crash_size;
>  
> @@ -172,11 +176,7 @@ static void __init reserve_crashkernel(void)
>  	}
>  
>  	if (crash_base >= SZ_4G) {
> -		/*
> -		 * For case crashkernel=X, low memory is not enough and fall
> -		 * back to reserve specified size of memory above 4G, try to
> -		 * allocate minimum required memory below 4G again.
> -		 */
> +		/* case #3 of crashkernel,low reservation */
>  		if (!high)
>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>  
> 
>>  
>> -	/* Current arm64 boot protocol requires 2MB alignment */
>> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
>> +retry:
>> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>  					       crash_base, crash_max);
>>  	if (!crash_base) {
>> +		/*
>> +		 * Attempt to fully allocate low memory failed, fall back
>> +		 * to high memory, the minimum required low memory will be
>> +		 * reserved later.
>> +		 */
>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>> +			goto retry;
>> +		}
>> +
>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>  			crash_size);
>>  		return;
>>  	}
>>  
>> +	if (crash_base >= SZ_4G) {
>> +		/*
>> +		 * For case crashkernel=X, low memory is not enough and fall
>> +		 * back to reserve specified size of memory above 4G, try to
>> +		 * allocate minimum required memory below 4G again.
>> +		 */
>> +		if (!high)
>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +
>> +		if (reserve_crashkernel_low(crash_low_size)) {
>> +			memblock_phys_free(crash_base, crash_size);
>> +			return;
>> +		}
>> +	}
>> +
>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>  
>> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
>>  	 * map. Inform kmemleak so that it won't try to access it.
>>  	 */
>>  	kmemleak_ignore_phys(crash_base);
>> +	if (crashk_low_res.end)
>> +		kmemleak_ignore_phys(crashk_low_res.start);
>> +
>>  	crashk_res.start = crash_base;
>>  	crashk_res.end = crash_base + crash_size - 1;
>>  	insert_resource(&iomem_resource, &crashk_res);
>> -- 
>> 2.25.1
>>
> 
> .
> 
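
For anyone following the retry path in the quoted hunk, the fallback order can be modelled in plain userspace C. This is only a sketch under simplifying assumptions: struct model_mem, model_reserve() and MODEL_LOW_MAX are hypothetical names, free memory is reduced to two byte counts, and none of it is kernel code or uses memblock.

```c
#include <stdbool.h>

/* Hypothetical stand-in for CRASH_ADDR_LOW_MAX; assumed to be 4G here. */
#define MODEL_LOW_MAX (1ULL << 32)

/* Toy view of free memory: bytes free below and above MODEL_LOW_MAX. */
struct model_mem {
	unsigned long long low_free;
	unsigned long long high_free;
};

/*
 * Mirror the order used in the quoted patch: try low memory first and,
 * only when no fixed base was requested, retry in high memory.
 * Returns true on success and reports in *in_high where the region landed.
 */
static bool model_reserve(const struct model_mem *m, unsigned long long size,
			  bool fixed_base, bool *in_high)
{
	if (size <= m->low_free) {
		*in_high = false;
		return true;
	}

	/* Corresponds to the "goto retry" with the raised upper limit. */
	if (!fixed_base && size <= m->high_free) {
		*in_high = true;
		return true;
	}

	return false;
}
```

With 128M free low and 8G free high, a 64M request stays low, a 512M request falls back to high, and a 512M request with a fixed base fails outright.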

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-16 13:11     ` Leizhen (ThunderTown)
@ 2022-03-17  2:36       ` Baoquan He
  2022-03-17  3:19         ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-03-17  2:36 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 03/16/22 at 09:11pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/3/16 20:11, Baoquan He wrote:
> > On 02/27/22 at 11:07am, Zhen Lei wrote:
...... 

> > Hi leizhen,
> > 
> > I made change on reserve_crashkenrel(), inline comment may be slow.
> > Please check and consider if they can be taken.
> 
> That's great. Thank you very much.
> 
> > 
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index 30ae6638ff54..f96351da1e3e 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -109,38 +109,43 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> >   * This function reserves memory area given in "crashkernel=" kernel command
> >   * line parameter. The memory reserved is used by dump capture kernel when
> >   * primary kernel is crashing.
> > + *
> > + * NOTE: Reservation of crashkernel,low is special since its existence
> > + * is not independent, need rely on the existence of crashkernel,high.
> > + * Hence there are different cases for crashkernel,low reservation:

Consider updating the 3rd line as below:

 * NOTE: Reservation of crashkernel,low is special since its existence
 * is not independent, need rely on the existence of crashkernel,high.
 * Here, four cases of crashkernel,low reservation are summarized: 

> > + * 1) crashkernel=Y,low is specified explicitly, crashkernel,low takes Y;
> > + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> > + *    take the default crashkernel,low value;
> > + * 3) crashkernel=X is specified, while fallback to get a memory region
> > + *    in high memory, take the default crashkernel,low value;
> > + * 4) crashkernel='invalid value',low is specified, failed the whole
> > + *    crashkernel reservation and bail out.
> >   */
> >  static void __init reserve_crashkernel(void)
> >  {
> >  	unsigned long long crash_base, crash_size;
> >  	unsigned long long crash_low_size;
> >  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > -	int ret;
> >  	bool fixed_base, high = false;
> >  	char *cmdline = boot_command_line;
> > +	int ret;
> >  
> >  	/* crashkernel=X[@offset] */
> >  	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> >  				&crash_size, &crash_base);
> >  	if (ret || !crash_size) {
> > -		/* crashkernel=X,high */
> >  		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> >  		if (ret || !crash_size)
> >  			return;
> >  
> > -		/* crashkernel=Y,low */
> >  		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
> >  		if (ret == -ENOENT)
> > -			/*
> > -			 * crashkernel=Y,low is not specified explicitly, use
> > -			 * default size automatically.
> > -			 */
> > +			/* case #2 of crashkernel,low reservation */
> >  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >  		else if (ret)
> > -			/* crashkernel=Y,low is specified but Y is invalid */
> > +			/* case #4 of crashkernel,low reservation */
> >  			return;
> >  
> > -		/* Mark crashkernel=X,high is specified */
> >  		high = true;
> >  		crash_max = CRASH_ADDR_HIGH_MAX;
> >  	}
> > @@ -148,7 +153,6 @@ static void __init reserve_crashkernel(void)
> >  	fixed_base = !!crash_base;
> >  	crash_size = PAGE_ALIGN(crash_size);
> >  
> > -	/* User specifies base address explicitly. */
> >  	if (fixed_base)
> >  		crash_max = crash_base + crash_size;
> >  
> > @@ -172,11 +176,7 @@ static void __init reserve_crashkernel(void)
> >  	}
> >  
> >  	if (crash_base >= SZ_4G) {
> > -		/*
> > -		 * For case crashkernel=X, low memory is not enough and fall
> > -		 * back to reserve specified size of memory above 4G, try to
> > -		 * allocate minimum required memory below 4G again.
> > -		 */
> > +		/* case #3 of crashkernel,low reservation */
> >  		if (!high)
> >  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >  
> > 
> >>  
> >> -	/* Current arm64 boot protocol requires 2MB alignment */
> >> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
> >> +retry:
> >> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>  					       crash_base, crash_max);
> >>  	if (!crash_base) {
> >> +		/*
> >> +		 * Attempt to fully allocate low memory failed, fall back
> >> +		 * to high memory, the minimum required low memory will be
> >> +		 * reserved later.
> >> +		 */
> >> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> >> +			crash_max = CRASH_ADDR_HIGH_MAX;
> >> +			goto retry;
> >> +		}
> >> +
> >>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> >>  			crash_size);
> >>  		return;
> >>  	}
> >>  
> >> +	if (crash_base >= SZ_4G) {
> >> +		/*
> >> +		 * For case crashkernel=X, low memory is not enough and fall
> >> +		 * back to reserve specified size of memory above 4G, try to
> >> +		 * allocate minimum required memory below 4G again.
> >> +		 */
> >> +		if (!high)
> >> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >> +
> >> +		if (reserve_crashkernel_low(crash_low_size)) {
> >> +			memblock_phys_free(crash_base, crash_size);
> >> +			return;
> >> +		}
> >> +	}
> >> +
> >>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
> >>  		crash_base, crash_base + crash_size, crash_size >> 20);
> >>  
> >> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
> >>  	 * map. Inform kmemleak so that it won't try to access it.
> >>  	 */
> >>  	kmemleak_ignore_phys(crash_base);
> >> +	if (crashk_low_res.end)
> >> +		kmemleak_ignore_phys(crashk_low_res.start);
> >> +
> >>  	crashk_res.start = crash_base;
> >>  	crashk_res.end = crash_base + crash_size - 1;
> >>  	insert_resource(&iomem_resource, &crashk_res);
> >> -- 
> >> 2.25.1
> >>
> > 
> > .
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 
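
The four cases listed in the proposed comment can also be written down as a small decision function. This is a hedged userspace sketch, not the patch itself: model_low_size(), MODEL_DEFAULT_LOW_SIZE and MODEL_SZ_4G are hypothetical names, and the default is fixed at 256M here for simplicity, whereas the patch also takes the swiotlb size into account.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-ins for DEFAULT_CRASH_KERNEL_LOW_SIZE and SZ_4G. */
#define MODEL_DEFAULT_LOW_SIZE (256ULL << 20)
#define MODEL_SZ_4G            (1ULL << 32)

/*
 * high:    crashkernel=X,high was parsed successfully
 * low_ret: result of parsing crashkernel=Y,low (0, -ENOENT or another error);
 *          ignored when high is false, as in the quoted code
 * low_arg: Y, meaningful only when low_ret == 0
 * base:    address at which the main region ended up
 *
 * Returns 0 and stores the low-memory size to reserve in *out (0 means no
 * separate low region), or a negative errno when the whole reservation is
 * abandoned.
 */
static int model_low_size(bool high, int low_ret, unsigned long long low_arg,
			  unsigned long long base, unsigned long long *out)
{
	unsigned long long low = 0;

	assert(out != NULL);

	if (high) {
		if (low_ret == -ENOENT)
			low = MODEL_DEFAULT_LOW_SIZE;	/* case 2 */
		else if (low_ret)
			return low_ret;			/* case 4 */
		else
			low = low_arg;			/* case 1 */
	}

	/* A region that already sits below 4G needs no extra low memory. */
	if (base < MODEL_SZ_4G) {
		*out = 0;
		return 0;
	}

	if (!high)
		low = MODEL_DEFAULT_LOW_SIZE;		/* case 3 */

	*out = low;
	return 0;
}
```

Keeping the "below 4G" check separate from the parsing step follows the structure of the quoted function, where the low reservation is attempted only after the main region has been placed.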


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-27  3:07 ` [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
  2022-03-16 12:11   ` Baoquan He
@ 2022-03-17  2:38   ` Baoquan He
  2022-03-17  3:23     ` Leizhen (ThunderTown)
  2022-03-21 13:29   ` John Donnelly
  2 siblings, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-03-17  2:38 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 02/27/22 at 11:07am, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
              ~~ change it to "get boot failure" or "fail to boot"
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X and
> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
> in DMA zone, and fall back to high allocation if it fails.
> We can also use "crashkernel=X,high" to select a region above DMA zone,
> which also tries to allocate at least 256M in DMA zone automatically.
> "crashkernel=Y,low" can be used to allocate specified size low memory.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
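
The "at least 256M" above corresponds to the DEFAULT_CRASH_KERNEL_LOW_SIZE macro in the patch, which takes the larger of "swiotlb size plus 8M" and 256M. A minimal userspace sketch of that arithmetic follows; model_default_low_size() is a hypothetical helper, and the swiotlb size is passed in as a parameter rather than read from swiotlb_size_or_default().

```c
/*
 * Userspace model of:
 *   max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
 * swiotlb_bytes is supplied by the caller; it is an input assumption here.
 */
static unsigned long long model_default_low_size(unsigned long long swiotlb_bytes)
{
	unsigned long long want = swiotlb_bytes + (8ULL << 20);
	unsigned long long floor = 256ULL << 20;

	return want > floor ? want : floor;
}
```

So a 64M swiotlb buffer still yields the 256M floor, while a 512M buffer would raise the default to 520M.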


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-17  2:36       ` Baoquan He
@ 2022-03-17  3:19         ` Leizhen (ThunderTown)
  2022-03-17  3:47           ` Baoquan He
  0 siblings, 1 reply; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-03-17  3:19 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/3/17 10:36, Baoquan He wrote:
> On 03/16/22 at 09:11pm, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/3/16 20:11, Baoquan He wrote:
>>> On 02/27/22 at 11:07am, Zhen Lei wrote:
> ...... 
> 
>>> Hi leizhen,
>>>
>>> I made change on reserve_crashkenrel(), inline comment may be slow.
>>> Please check and consider if they can be taken.
>>
>> That's great. Thank you very much.
>>
>>>
>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>> index 30ae6638ff54..f96351da1e3e 100644
>>> --- a/arch/arm64/mm/init.c
>>> +++ b/arch/arm64/mm/init.c
>>> @@ -109,38 +109,43 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>>>   * This function reserves memory area given in "crashkernel=" kernel command
>>>   * line parameter. The memory reserved is used by dump capture kernel when
>>>   * primary kernel is crashing.
>>> + *
>>> + * NOTE: Reservation of crashkernel,low is special since its existence
>>> + * is not independent, need rely on the existence of crashkernel,high.
>>> + * Hence there are different cases for crashkernel,low reservation:
> 
> Consider updating the 3rd line as below:
> 
>  * NOTE: Reservation of crashkernel,low is special since its existence
>  * is not independent, need rely on the existence of crashkernel,high.
>  * Here, four cases of crashkernel,low reservation are summarized: 

OK. How about changing "crashkernel,low" to "crashkernel low memory"?
"crashkernel=Y,low", "crashkernel=,low" and "crashkernel,low" are very similar
and may confuse the reader.

> 
>>> + * 1) crashkernel=Y,low is specified explicitly, crashkernel,low takes Y;
>>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>>> + *    take the default crashkernel,low value;
>>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>>> + *    in high memory, take the default crashkernel,low value;
>>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>>> + *    crashkernel reservation and bail out.
>>>   */
>>>  static void __init reserve_crashkernel(void)
>>>  {
>>>  	unsigned long long crash_base, crash_size;
>>>  	unsigned long long crash_low_size;
>>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>> -	int ret;
>>>  	bool fixed_base, high = false;
>>>  	char *cmdline = boot_command_line;
>>> +	int ret;
>>>  
>>>  	/* crashkernel=X[@offset] */
>>>  	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>>  				&crash_size, &crash_base);
>>>  	if (ret || !crash_size) {
>>> -		/* crashkernel=X,high */
>>>  		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>>>  		if (ret || !crash_size)
>>>  			return;
>>>  
>>> -		/* crashkernel=Y,low */
>>>  		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
>>>  		if (ret == -ENOENT)
>>> -			/*
>>> -			 * crashkernel=Y,low is not specified explicitly, use
>>> -			 * default size automatically.
>>> -			 */
>>> +			/* case #2 of crashkernel,low reservation */
>>>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>  		else if (ret)
>>> -			/* crashkernel=Y,low is specified but Y is invalid */
>>> +			/* case #4 of crashkernel,low reservation */
>>>  			return;
>>>  
>>> -		/* Mark crashkernel=X,high is specified */
>>>  		high = true;
>>>  		crash_max = CRASH_ADDR_HIGH_MAX;
>>>  	}
>>> @@ -148,7 +153,6 @@ static void __init reserve_crashkernel(void)
>>>  	fixed_base = !!crash_base;
>>>  	crash_size = PAGE_ALIGN(crash_size);
>>>  
>>> -	/* User specifies base address explicitly. */
>>>  	if (fixed_base)
>>>  		crash_max = crash_base + crash_size;
>>>  
>>> @@ -172,11 +176,7 @@ static void __init reserve_crashkernel(void)
>>>  	}
>>>  
>>>  	if (crash_base >= SZ_4G) {
>>> -		/*
>>> -		 * For case crashkernel=X, low memory is not enough and fall
>>> -		 * back to reserve specified size of memory above 4G, try to
>>> -		 * allocate minimum required memory below 4G again.
>>> -		 */
>>> +		/* case #3 of crashkernel,low reservation */
>>>  		if (!high)
>>>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>  
>>>
>>>>  
>>>> -	/* Current arm64 boot protocol requires 2MB alignment */
>>>> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
>>>> +retry:
>>>> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>  					       crash_base, crash_max);
>>>>  	if (!crash_base) {
>>>> +		/*
>>>> +		 * Attempt to fully allocate low memory failed, fall back
>>>> +		 * to high memory, the minimum required low memory will be
>>>> +		 * reserved later.
>>>> +		 */
>>>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>>>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>>>> +			goto retry;
>>>> +		}
>>>> +
>>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>>>  			crash_size);
>>>>  		return;
>>>>  	}
>>>>  
>>>> +	if (crash_base >= SZ_4G) {
>>>> +		/*
>>>> +		 * For case crashkernel=X, low memory is not enough and fall
>>>> +		 * back to reserve specified size of memory above 4G, try to
>>>> +		 * allocate minimum required memory below 4G again.
>>>> +		 */
>>>> +		if (!high)
>>>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>> +
>>>> +		if (reserve_crashkernel_low(crash_low_size)) {
>>>> +			memblock_phys_free(crash_base, crash_size);
>>>> +			return;
>>>> +		}
>>>> +	}
>>>> +
>>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>>>  
>>>> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
>>>>  	 * map. Inform kmemleak so that it won't try to access it.
>>>>  	 */
>>>>  	kmemleak_ignore_phys(crash_base);
>>>> +	if (crashk_low_res.end)
>>>> +		kmemleak_ignore_phys(crashk_low_res.start);
>>>> +
>>>>  	crashk_res.start = crash_base;
>>>>  	crashk_res.end = crash_base + crash_size - 1;
>>>>  	insert_resource(&iomem_resource, &crashk_res);
>>>> -- 
>>>> 2.25.1
>>>>
>>>
>>> .
>>>
>>
>> -- 
>> Regards,
>>   Zhen Lei
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-17  2:38   ` Baoquan He
@ 2022-03-17  3:23     ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-03-17  3:23 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/3/17 10:38, Baoquan He wrote:
> On 02/27/22 at 11:07am, Zhen Lei wrote:
>> From: Chen Zhou <chenzhou10@huawei.com>
>>
>> There are following issues in arm64 kdump:
>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>> will fail when there is no enough low memory.
>> 2. If reserving crashkernel above 4G, in this case, crash dump
>> kernel will boot failure because there is no low memory available
>               ~~ change it to "get boot failure" or "fail to boot"

OK. I'm going to use "fail to boot".

>> for allocation.
>>
>> To solve these issues, change the behavior of crashkernel=X and
>> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
>> in DMA zone, and fall back to high allocation if it fails.
>> We can also use "crashkernel=X,high" to select a region above DMA zone,
>> which also tries to allocate at least 256M in DMA zone automatically.
>> "crashkernel=Y,low" can be used to allocate specified size low memory.
>>
>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> 
> .
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-17  3:19         ` Leizhen (ThunderTown)
@ 2022-03-17  3:47           ` Baoquan He
  2022-03-17  7:30             ` Leizhen (ThunderTown)
  0 siblings, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-03-17  3:47 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 03/17/22 at 11:19am, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/3/17 10:36, Baoquan He wrote:
> > On 03/16/22 at 09:11pm, Leizhen (ThunderTown) wrote:
> >>
> >>
> >> On 2022/3/16 20:11, Baoquan He wrote:
> >>> On 02/27/22 at 11:07am, Zhen Lei wrote:
> > ...... 
> > 
> >>> Hi leizhen,
> >>>
> >>> I made change on reserve_crashkenrel(), inline comment may be slow.
> >>> Please check and consider if they can be taken.
> >>
> >> That's great. Thank you very much.
> >>
> >>>
> >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>> index 30ae6638ff54..f96351da1e3e 100644
> >>> --- a/arch/arm64/mm/init.c
> >>> +++ b/arch/arm64/mm/init.c
> >>> @@ -109,38 +109,43 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>>   * This function reserves memory area given in "crashkernel=" kernel command
> >>>   * line parameter. The memory reserved is used by dump capture kernel when
> >>>   * primary kernel is crashing.
> >>> + *
> >>> + * NOTE: Reservation of crashkernel,low is special since its existence
> >>> + * is not independent, need rely on the existence of crashkernel,high.
> >>> + * Hence there are different cases for crashkernel,low reservation:
> > 
> > Consider updating the 3rd line as below:
> > 
> >  * NOTE: Reservation of crashkernel,low is special since its existence
> >  * is not independent, need rely on the existence of crashkernel,high.
> >  * Here, four cases of crashkernel,low reservation are summarized: 
> 
> OK. How about changing "crashkernel,low" to "crashkernel low memory"?
> "crashkernel=Y,low", "crashkernel=,low" and "crashkernel,low" are very similar
> and may confuse the reader.

Fine by me. 'crashkernel low memory' is more formal; it just makes the
sentence a little longer. Please take whichever you think fits better.

> 
> > 
> >>> + * 1) crashkernel=Y,low is specified explicitly, crashkernel,low takes Y;
> >>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> >>> + *    take the default crashkernel,low value;
> >>> + * 3) crashkernel=X is specified, while fallback to get a memory region
> >>> + *    in high memory, take the default crashkernel,low value;
> >>> + * 4) crashkernel='invalid value',low is specified, failed the whole
> >>> + *    crashkernel reservation and bail out.
> >>>   */
> >>>  static void __init reserve_crashkernel(void)
> >>>  {
> >>>  	unsigned long long crash_base, crash_size;
> >>>  	unsigned long long crash_low_size;
> >>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>> -	int ret;
> >>>  	bool fixed_base, high = false;
> >>>  	char *cmdline = boot_command_line;
> >>> +	int ret;
> >>>  
> >>>  	/* crashkernel=X[@offset] */
> >>>  	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> >>>  				&crash_size, &crash_base);
> >>>  	if (ret || !crash_size) {
> >>> -		/* crashkernel=X,high */
> >>>  		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> >>>  		if (ret || !crash_size)
> >>>  			return;
> >>>  
> >>> -		/* crashkernel=Y,low */
> >>>  		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
> >>>  		if (ret == -ENOENT)
> >>> -			/*
> >>> -			 * crashkernel=Y,low is not specified explicitly, use
> >>> -			 * default size automatically.
> >>> -			 */
> >>> +			/* case #2 of crashkernel,low reservation */
> >>>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >>>  		else if (ret)
> >>> -			/* crashkernel=Y,low is specified but Y is invalid */
> >>> +			/* case #4 of crashkernel,low reservation */
> >>>  			return;
> >>>  
> >>> -		/* Mark crashkernel=X,high is specified */
> >>>  		high = true;
> >>>  		crash_max = CRASH_ADDR_HIGH_MAX;
> >>>  	}
> >>> @@ -148,7 +153,6 @@ static void __init reserve_crashkernel(void)
> >>>  	fixed_base = !!crash_base;
> >>>  	crash_size = PAGE_ALIGN(crash_size);
> >>>  
> >>> -	/* User specifies base address explicitly. */
> >>>  	if (fixed_base)
> >>>  		crash_max = crash_base + crash_size;
> >>>  
> >>> @@ -172,11 +176,7 @@ static void __init reserve_crashkernel(void)
> >>>  	}
> >>>  
> >>>  	if (crash_base >= SZ_4G) {
> >>> -		/*
> >>> -		 * For case crashkernel=X, low memory is not enough and fall
> >>> -		 * back to reserve specified size of memory above 4G, try to
> >>> -		 * allocate minimum required memory below 4G again.
> >>> -		 */
> >>> +		/* case #3 of crashkernel,low reservation */
> >>>  		if (!high)
> >>>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >>>  
> >>>
> >>>>  
> >>>> -	/* Current arm64 boot protocol requires 2MB alignment */
> >>>> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
> >>>> +retry:
> >>>> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>>>  					       crash_base, crash_max);
> >>>>  	if (!crash_base) {
> >>>> +		/*
> >>>> +		 * Attempt to fully allocate low memory failed, fall back
> >>>> +		 * to high memory, the minimum required low memory will be
> >>>> +		 * reserved later.
> >>>> +		 */
> >>>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> >>>> +			crash_max = CRASH_ADDR_HIGH_MAX;
> >>>> +			goto retry;
> >>>> +		}
> >>>> +
> >>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> >>>>  			crash_size);
> >>>>  		return;
> >>>>  	}
> >>>>  
> >>>> +	if (crash_base >= SZ_4G) {
> >>>> +		/*
> >>>> +		 * For case crashkernel=X, low memory is not enough and fall
> >>>> +		 * back to reserve specified size of memory above 4G, try to
> >>>> +		 * allocate minimum required memory below 4G again.
> >>>> +		 */
> >>>> +		if (!high)
> >>>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >>>> +
> >>>> +		if (reserve_crashkernel_low(crash_low_size)) {
> >>>> +			memblock_phys_free(crash_base, crash_size);
> >>>> +			return;
> >>>> +		}
> >>>> +	}
> >>>> +
> >>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
> >>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
> >>>>  
> >>>> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
> >>>>  	 * map. Inform kmemleak so that it won't try to access it.
> >>>>  	 */
> >>>>  	kmemleak_ignore_phys(crash_base);
> >>>> +	if (crashk_low_res.end)
> >>>> +		kmemleak_ignore_phys(crashk_low_res.start);
> >>>> +
> >>>>  	crashk_res.start = crash_base;
> >>>>  	crashk_res.end = crash_base + crash_size - 1;
> >>>>  	insert_resource(&iomem_resource, &crashk_res);
> >>>> -- 
> >>>> 2.25.1
> >>>>
> >>>
> >>> .
> >>>
> >>
> >> -- 
> >> Regards,
> >>   Zhen Lei
> >>
> > 
> > .
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 



* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-17  3:47           ` Baoquan He
@ 2022-03-17  7:30             ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-03-17  7:30 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/3/17 11:47, Baoquan He wrote:
> On 03/17/22 at 11:19am, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/3/17 10:36, Baoquan He wrote:
>>> On 03/16/22 at 09:11pm, Leizhen (ThunderTown) wrote:
>>>>
>>>>
>>>> On 2022/3/16 20:11, Baoquan He wrote:
>>>>> On 02/27/22 at 11:07am, Zhen Lei wrote:
>>> ...... 
>>>
>>>>> Hi leizhen,
>>>>>
>>>>> I made change on reserve_crashkenrel(), inline comment may be slow.
>>>>> Please check and consider if they can be taken.
>>>>
>>>> That's great. Thank you very much.
>>>>
>>>>>
>>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>>> index 30ae6638ff54..f96351da1e3e 100644
>>>>> --- a/arch/arm64/mm/init.c
>>>>> +++ b/arch/arm64/mm/init.c
>>>>> @@ -109,38 +109,43 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>>>>>   * This function reserves memory area given in "crashkernel=" kernel command
>>>>>   * line parameter. The memory reserved is used by dump capture kernel when
>>>>>   * primary kernel is crashing.
>>>>> + *
>>>>> + * NOTE: Reservation of crashkernel,low is special since its existence
>>>>> + * is not independent, need rely on the existence of crashkernel,high.
>>>>> + * Hence there are different cases for crashkernel,low reservation:
>>>
>>> Considering to update the 3rd line as below:
>>>
>>>  * NOTE: Reservation of crashkernel,low is special since its existence
>>>  * is not independent, need rely on the existence of crashkernel,high.
>>>  * Here, four cases of crashkernel,low reservation are summarized: 
>>
>> OK. How about change "crashkernel,low" to "crashkernel low memory"?
>> "crashkernel=Y,low", "crashkernel=,low" and "crashkernel,low" are very similar,
>> may dazzle the reader.
> 
> Fine by me. 'crashkernel low memory' is formal, just make sentence a
> little longer. Please take what you think fitter.

OK, I will send v22 after v5.18-rc1.

> 
>>
>>>
>>>>> + * 1) crashkernel=Y,low is specified explicitly, crashkernel,low takes Y;
>>>>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>>>>> + *    take the default crashkernel,low value;
>>>>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>>>>> + *    in high memory, take the default crashkernel,low value;
>>>>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>>>>> + *    crashkernel reservation and bail out.
>>>>>   */
>>>>>  static void __init reserve_crashkernel(void)
>>>>>  {
>>>>>  	unsigned long long crash_base, crash_size;
>>>>>  	unsigned long long crash_low_size;
>>>>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>> -	int ret;
>>>>>  	bool fixed_base, high = false;
>>>>>  	char *cmdline = boot_command_line;
>>>>> +	int ret;
>>>>>  
>>>>>  	/* crashkernel=X[@offset] */
>>>>>  	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>>>>  				&crash_size, &crash_base);
>>>>>  	if (ret || !crash_size) {
>>>>> -		/* crashkernel=X,high */
>>>>>  		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>>>>>  		if (ret || !crash_size)
>>>>>  			return;
>>>>>  
>>>>> -		/* crashkernel=Y,low */
>>>>>  		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
>>>>>  		if (ret == -ENOENT)
>>>>> -			/*
>>>>> -			 * crashkernel=Y,low is not specified explicitly, use
>>>>> -			 * default size automatically.
>>>>> -			 */
>>>>> +			/* case #2 of crashkernel,low reservation */
>>>>>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>>>  		else if (ret)
>>>>> -			/* crashkernel=Y,low is specified but Y is invalid */
>>>>> +			/* case #4 of crashkernel,low reservation */
>>>>>  			return;
>>>>>  
>>>>> -		/* Mark crashkernel=X,high is specified */
>>>>>  		high = true;
>>>>>  		crash_max = CRASH_ADDR_HIGH_MAX;
>>>>>  	}
>>>>> @@ -148,7 +153,6 @@ static void __init reserve_crashkernel(void)
>>>>>  	fixed_base = !!crash_base;
>>>>>  	crash_size = PAGE_ALIGN(crash_size);
>>>>>  
>>>>> -	/* User specifies base address explicitly. */
>>>>>  	if (fixed_base)
>>>>>  		crash_max = crash_base + crash_size;
>>>>>  
>>>>> @@ -172,11 +176,7 @@ static void __init reserve_crashkernel(void)
>>>>>  	}
>>>>>  
>>>>>  	if (crash_base >= SZ_4G) {
>>>>> -		/*
>>>>> -		 * For case crashkernel=X, low memory is not enough and fall
>>>>> -		 * back to reserve specified size of memory above 4G, try to
>>>>> -		 * allocate minimum required memory below 4G again.
>>>>> -		 */
>>>>> +		/* case #3 of crashkernel,low reservation */
>>>>>  		if (!high)
>>>>>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>>>  
>>>>>
>>>>>>  
>>>>>> -	/* Current arm64 boot protocol requires 2MB alignment */
>>>>>> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
>>>>>> +retry:
>>>>>> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>  					       crash_base, crash_max);
>>>>>>  	if (!crash_base) {
>>>>>> +		/*
>>>>>> +		 * Attempt to fully allocate low memory failed, fall back
>>>>>> +		 * to high memory, the minimum required low memory will be
>>>>>> +		 * reserved later.
>>>>>> +		 */
>>>>>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>>>>>> +			crash_max = CRASH_ADDR_HIGH_MAX;
>>>>>> +			goto retry;
>>>>>> +		}
>>>>>> +
>>>>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>>>>>  			crash_size);
>>>>>>  		return;
>>>>>>  	}
>>>>>>  
>>>>>> +	if (crash_base >= SZ_4G) {
>>>>>> +		/*
>>>>>> +		 * For case crashkernel=X, low memory is not enough and fall
>>>>>> +		 * back to reserve specified size of memory above 4G, try to
>>>>>> +		 * allocate minimum required memory below 4G again.
>>>>>> +		 */
>>>>>> +		if (!high)
>>>>>> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>>>> +
>>>>>> +		if (reserve_crashkernel_low(crash_low_size)) {
>>>>>> +			memblock_phys_free(crash_base, crash_size);
>>>>>> +			return;
>>>>>> +		}
>>>>>> +	}
>>>>>> +
>>>>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>>>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>>>>>  
>>>>>> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
>>>>>>  	 * map. Inform kmemleak so that it won't try to access it.
>>>>>>  	 */
>>>>>>  	kmemleak_ignore_phys(crash_base);
>>>>>> +	if (crashk_low_res.end)
>>>>>> +		kmemleak_ignore_phys(crashk_low_res.start);
>>>>>> +
>>>>>>  	crashk_res.start = crash_base;
>>>>>>  	crashk_res.end = crash_base + crash_size - 1;
>>>>>>  	insert_resource(&iomem_resource, &crashk_res);
>>>>>> -- 
>>>>>> 2.25.1
>>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>> -- 
>>>> Regards,
>>>>   Zhen Lei
>>>>
>>>
>>> .
>>>
>>
>> -- 
>> Regards,
>>   Zhen Lei
>>
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-02-27  3:07 ` [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
  2022-03-16 12:11   ` Baoquan He
  2022-03-17  2:38   ` Baoquan He
@ 2022-03-21 13:29   ` John Donnelly
  2022-03-21 14:09     ` Dave Kleikamp
  2022-03-22  1:58     ` Leizhen (ThunderTown)
  2 siblings, 2 replies; 27+ messages in thread
From: John Donnelly @ 2022-03-21 13:29 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, John Donnelly
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, Dave Kleikamp

On 2/26/22 9:07 PM, Zhen Lei wrote:
> From: Chen Zhou <chenzhou10@huawei.com>
> 
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.

                         " Not enough "
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.

  We can't have a "boot failure". If the requested reservation
  cannot be met, the kdump configuration is not set up.
> 
> To solve these issues, change the behavior of crashkernel=X and
> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
> in DMA zone, and fall back to high allocation if it fails.
> We can also use "crashkernel=X,high" to select a region above DMA zone,
> which also tries to allocate at least 256M in DMA zone automatically.
> "crashkernel=Y,low" can be used to allocate specified size low memory.

Is there going to be documentation on which values certain Arm platforms
should use for this?

> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>   arch/arm64/kernel/machine_kexec.c      |   9 ++-
>   arch/arm64/kernel/machine_kexec_file.c |  12 ++-
>   arch/arm64/mm/init.c                   | 106 +++++++++++++++++++++++--
>   3 files changed, 115 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index e16b248699d5c3c..19c2d487cb08feb 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
>   
>   	/* in reserved memory? */
>   	addr = __pfn_to_phys(pfn);
> -	if ((addr < crashk_res.start) || (crashk_res.end < addr))
> -		return false;
> +	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
> +		if (!crashk_low_res.end)
> +			return false;
> +
> +		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
> +			return false;
> +	}
>   
>   	if (!kexec_crash_image)
>   		return true;
> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index 59c648d51848886..889951291cc0f9c 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>   
>   	/* Exclude crashkernel region */
>   	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
> +	if (ret)
> +		goto out;
> +
> +	if (crashk_low_res.end) {
> +		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
> +		if (ret)
> +			goto out;
> +	}
>   
> -	if (!ret)
> -		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
> +	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
>   
> +out:
>   	kfree(cmem);
>   	return ret;
>   }
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 90f276d46b93bc6..30ae6638ff54c47 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -65,6 +65,44 @@ EXPORT_SYMBOL(memstart_addr);
>   phys_addr_t arm64_dma_phys_limit __ro_after_init;
>   
>   #ifdef CONFIG_KEXEC_CORE
> +/* Current arm64 boot protocol requires 2MB alignment */
> +#define CRASH_ALIGN			SZ_2M
> +
> +#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
> +#define CRASH_ADDR_HIGH_MAX		memblock.current_limit
> +
> +/*
> + * This is an empirical value in x86_64 and taken here directly. Please
> + * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
> + * details.
> + */
> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE	\
> +	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
> +
> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> +{
> +	unsigned long long low_base;
> +
> +	/* passed with crashkernel=0,low ? */
> +	if (!low_size)
> +		return 0;
> +
> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> +	if (!low_base) {
> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> +		return -ENOMEM;
> +	}
> +
> +	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
> +		low_base, low_base + low_size, low_size >> 20);
> +
> +	crashk_low_res.start = low_base;
> +	crashk_low_res.end   = low_base + low_size - 1;
> +	insert_resource(&iomem_resource, &crashk_low_res);
> +
> +	return 0;
> +}
> +
>   /*
>    * reserve_crashkernel() - reserves memory for crash kernel
>    *
> @@ -75,30 +113,79 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>   static void __init reserve_crashkernel(void)
>   {
>   	unsigned long long crash_base, crash_size;
> -	unsigned long long crash_max = arm64_dma_phys_limit;
> +	unsigned long long crash_low_size;
> +	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>   	int ret;
> +	bool fixed_base, high = false;
> +	char *cmdline = boot_command_line;
>   
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	/* crashkernel=X[@offset] */
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>   				&crash_size, &crash_base);
> -	/* no crashkernel= or invalid value specified */
> -	if (ret || !crash_size)
> -		return;
> +	if (ret || !crash_size) {
> +		/* crashkernel=X,high */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/* crashkernel=Y,low */
> +		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
> +		if (ret == -ENOENT)
> +			/*
> +			 * crashkernel=Y,low is not specified explicitly, use
> +			 * default size automatically.
> +			 */
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +		else if (ret)
> +			/* crashkernel=Y,low is specified but Y is invalid */
> +			return;
> +
> +		/* Mark crashkernel=X,high is specified */
> +		high = true;
> +		crash_max = CRASH_ADDR_HIGH_MAX;
> +	}
>   
> +	fixed_base = !!crash_base;
>   	crash_size = PAGE_ALIGN(crash_size);
>   
>   	/* User specifies base address explicitly. */
> -	if (crash_base)
> +	if (fixed_base)
>   		crash_max = crash_base + crash_size;
>   
> -	/* Current arm64 boot protocol requires 2MB alignment */
> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
> +retry:
> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>   					       crash_base, crash_max);
>   	if (!crash_base) {
> +		/*
> +		 * Attempt to fully allocate low memory failed, fall back
> +		 * to high memory, the minimum required low memory will be
> +		 * reserved later.
> +		 */
> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +			crash_max = CRASH_ADDR_HIGH_MAX;
> +			goto retry;
> +		}
> +
>   		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>   			crash_size);
>   		return;
>   	}
>   
> +	if (crash_base >= SZ_4G) {
> +		/*
> +		 * For case crashkernel=X, low memory is not enough and fall
> +		 * back to reserve specified size of memory above 4G, try to
> +		 * allocate minimum required memory below 4G again.
> +		 */
> +		if (!high)
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +
> +		if (reserve_crashkernel_low(crash_low_size)) {
> +			memblock_phys_free(crash_base, crash_size);
> +			return;
> +		}
> +	}
> +
>   	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>   		crash_base, crash_base + crash_size, crash_size >> 20);
>   
> @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
>   	 * map. Inform kmemleak so that it won't try to access it.
>   	 */
>   	kmemleak_ignore_phys(crash_base);
> +	if (crashk_low_res.end)
> +		kmemleak_ignore_phys(crashk_low_res.start);
> +
>   	crashk_res.start = crash_base;
>   	crashk_res.end = crash_base + crash_size - 1;
>   	insert_resource(&iomem_resource, &crashk_res);



* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-21 13:29   ` John Donnelly
@ 2022-03-21 14:09     ` Dave Kleikamp
  2022-03-22  1:58     ` Leizhen (ThunderTown)
  1 sibling, 0 replies; 27+ messages in thread
From: Dave Kleikamp @ 2022-03-21 14:09 UTC (permalink / raw)
  To: John Donnelly, Zhen Lei, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H . Peter Anvin, linux-kernel, Dave Young,
	Baoquan He, Vivek Goyal, Eric Biederman, kexec, Catalin Marinas,
	Will Deacon, linux-arm-kernel, Rob Herring, Frank Rowand,
	devicetree, Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou

On 3/21/22 8:29AM, John Donnelly wrote:
> On 2/26/22 9:07 PM, Zhen Lei wrote:
>> From: Chen Zhou <chenzhou10@huawei.com>
>>
>> There are following issues in arm64 kdump:
>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>> will fail when there is no enough low memory.
> 
>                          " Not enough "
>> 2. If reserving crashkernel above 4G, in this case, crash dump
>> kernel will boot failure because there is no low memory available
>> for allocation.
> 
>   We can't have a "boot failure". If the requested reservation
>   can not be met,  the kdump  configuration is not setup.

I think you misread this. Without these patches, if only high memory is 
reserved for the crash kernel, then the crash kernel will fail to boot.
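
To spell out the patched behaviour as I read it: once the main region lands
at or above 4G, a companion region below 4G is reserved as well, and if that
fails the high region is freed again. A rough userspace model follows. It is
not kernel code, and `settle()`, `struct crash_result` and the boolean input
are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define SZ_2M	(1ULL << 21)
#define SZ_4G	(1ULL << 32)

/* Outcome of the modelled reservation. */
struct crash_result {
	bool reserved;		/* main crash kernel region kept */
	bool low_reserved;	/* extra region below 4G also kept */
};

/*
 * Hypothetical model of the tail of reserve_crashkernel() in the patch.
 * crash_base:   where the main region landed (0 means allocation failed).
 * low_size:     size wanted for the low region (0 for crashkernel=0,low).
 * low_alloc_ok: whether the allocation below 4G would succeed.
 */
static struct crash_result settle(unsigned long long crash_base,
				  unsigned long long low_size,
				  bool low_alloc_ok)
{
	struct crash_result r = { false, false };

	/* The patch allocates with CRASH_ALIGN (2M) alignment. */
	assert(crash_base % SZ_2M == 0);

	if (!crash_base)
		return r;		/* nothing was allocated */

	if (crash_base >= SZ_4G && low_size) {
		if (!low_alloc_ok)
			return r;	/* high region is freed, give up */
		r.low_reserved = true;
	}

	r.reserved = true;
	return r;
}
```

For example, `settle(1ULL << 32, 256ULL << 20, false)` ends with
`reserved == false`, which corresponds to the memblock_phys_free() path in
the patch: no high-only reservation is left behind.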

>>
>> To solve these issues, change the behavior of crashkernel=X and
>> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
>> in DMA zone, and fall back to high allocation if it fails.
>> We can also use "crashkernel=X,high" to select a region above DMA zone,
>> which also tries to allocate at least 256M in DMA zone automatically.
>> "crashkernel=Y,low" can be used to allocate specified size low memory.
> 
> Is there going to be documentation on what values certain Arm platforms 
> are going to use this on ?
> 
>>
>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>


* Re: [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X
  2022-03-21 13:29   ` John Donnelly
  2022-03-21 14:09     ` Dave Kleikamp
@ 2022-03-22  1:58     ` Leizhen (ThunderTown)
  1 sibling, 0 replies; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-03-22  1:58 UTC (permalink / raw)
  To: John Donnelly, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, Dave Kleikamp



On 2022/3/21 21:29, John Donnelly wrote:
> On 2/26/22 9:07 PM, Zhen Lei wrote:
>> From: Chen Zhou <chenzhou10@huawei.com>
>>
>> There are following issues in arm64 kdump:
>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>> will fail when there is no enough low memory.
> 
>                         " Not enough "

OK, thanks

>> 2. If reserving crashkernel above 4G, in this case, crash dump
>> kernel will boot failure because there is no low memory available
>> for allocation.
> 
>  We can't have a "boot failure". If the requested reservation
>  can not be met,  the kdump  configuration is not setup.
>>
>> To solve these issues, change the behavior of crashkernel=X and
>> introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
>> in DMA zone, and fall back to high allocation if it fails.
>> We can also use "crashkernel=X,high" to select a region above DMA zone,
>> which also tries to allocate at least 256M in DMA zone automatically.
>> "crashkernel=Y,low" can be used to allocate specified size low memory.
> 
> Is there going to be documentation on what values certain Arm platforms are going to use this on ?

There is no exact formula.
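
What can be shown is how the default low size in this patch is derived:
DEFAULT_CRASH_KERNEL_LOW_SIZE is max(swiotlb size + 8M, 256M). Below is a
minimal userspace sketch of that arithmetic. The helper name
`default_crash_low_size()` is hypothetical, and its argument stands in for
swiotlb_size_or_default().

```c
#include <assert.h>

#define SZ_1M	(1ULL << 20)

/*
 * Hypothetical helper mirroring DEFAULT_CRASH_KERNEL_LOW_SIZE:
 * max(swiotlb size + 8M, 256M).
 */
static unsigned long long default_crash_low_size(unsigned long long swiotlb_bytes)
{
	unsigned long long want = swiotlb_bytes + 8 * SZ_1M;
	unsigned long long min_low = 256 * SZ_1M;

	assert(want >= swiotlb_bytes);	/* guard against wraparound */
	return want > min_low ? want : min_low;
}
```

Assuming a 64M swiotlb, 64M + 8M is below the 256M floor, so the default
stays at 256M. A 512M swiotlb would raise it to 520M.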

> 
>>
>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>   arch/arm64/kernel/machine_kexec.c      |   9 ++-
>>   arch/arm64/kernel/machine_kexec_file.c |  12 ++-
>>   arch/arm64/mm/init.c                   | 106 +++++++++++++++++++++++--
>>   3 files changed, 115 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index e16b248699d5c3c..19c2d487cb08feb 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
>>         /* in reserved memory? */
>>       addr = __pfn_to_phys(pfn);
>> -    if ((addr < crashk_res.start) || (crashk_res.end < addr))
>> -        return false;
>> +    if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
>> +        if (!crashk_low_res.end)
>> +            return false;
>> +
>> +        if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
>> +            return false;
>> +    }
>>         if (!kexec_crash_image)
>>           return true;
>> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
>> index 59c648d51848886..889951291cc0f9c 100644
>> --- a/arch/arm64/kernel/machine_kexec_file.c
>> +++ b/arch/arm64/kernel/machine_kexec_file.c
>> @@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>>         /* Exclude crashkernel region */
>>       ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
>> +    if (ret)
>> +        goto out;
>> +
>> +    if (crashk_low_res.end) {
>> +        ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
>> +        if (ret)
>> +            goto out;
>> +    }
>>   -    if (!ret)
>> -        ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
>> +    ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
>>   +out:
>>       kfree(cmem);
>>       return ret;
>>   }
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 90f276d46b93bc6..30ae6638ff54c47 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -65,6 +65,44 @@ EXPORT_SYMBOL(memstart_addr);
>>   phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>     #ifdef CONFIG_KEXEC_CORE
>> +/* Current arm64 boot protocol requires 2MB alignment */
>> +#define CRASH_ALIGN            SZ_2M
>> +
>> +#define CRASH_ADDR_LOW_MAX        arm64_dma_phys_limit
>> +#define CRASH_ADDR_HIGH_MAX        memblock.current_limit
>> +
>> +/*
>> + * This is an empirical value in x86_64 and taken here directly. Please
>> + * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
>> + * details.
>> + */
>> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE    \
>> +    max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
>> +
>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>> +{
>> +    unsigned long long low_base;
>> +
>> +    /* passed with crashkernel=0,low ? */
>> +    if (!low_size)
>> +        return 0;
>> +
>> +    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
>> +    if (!low_base) {
>> +        pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>> +        return -ENOMEM;
>> +    }
>> +
>> +    pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
>> +        low_base, low_base + low_size, low_size >> 20);
>> +
>> +    crashk_low_res.start = low_base;
>> +    crashk_low_res.end   = low_base + low_size - 1;
>> +    insert_resource(&iomem_resource, &crashk_low_res);
>> +
>> +    return 0;
>> +}
>> +
>>   /*
>>    * reserve_crashkernel() - reserves memory for crash kernel
>>    *
>> @@ -75,30 +113,79 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
>>   static void __init reserve_crashkernel(void)
>>   {
>>       unsigned long long crash_base, crash_size;
>> -    unsigned long long crash_max = arm64_dma_phys_limit;
>> +    unsigned long long crash_low_size;
>> +    unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>       int ret;
>> +    bool fixed_base, high = false;
>> +    char *cmdline = boot_command_line;
>>   -    ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +    /* crashkernel=X[@offset] */
>> +    ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>>                   &crash_size, &crash_base);
>> -    /* no crashkernel= or invalid value specified */
>> -    if (ret || !crash_size)
>> -        return;
>> +    if (ret || !crash_size) {
>> +        /* crashkernel=X,high */
>> +        ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
>> +        if (ret || !crash_size)
>> +            return;
>> +
>> +        /* crashkernel=Y,low */
>> +        ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
>> +        if (ret == -ENOENT)
>> +            /*
>> +             * crashkernel=Y,low is not specified explicitly, use
>> +             * default size automatically.
>> +             */
>> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +        else if (ret)
>> +            /* crashkernel=Y,low is specified but Y is invalid */
>> +            return;
>> +
>> +        /* Mark crashkernel=X,high is specified */
>> +        high = true;
>> +        crash_max = CRASH_ADDR_HIGH_MAX;
>> +    }
>>   +    fixed_base = !!crash_base;
>>       crash_size = PAGE_ALIGN(crash_size);
>>         /* User specifies base address explicitly. */
>> -    if (crash_base)
>> +    if (fixed_base)
>>           crash_max = crash_base + crash_size;
>>   -    /* Current arm64 boot protocol requires 2MB alignment */
>> -    crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
>> +retry:
>> +    crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>                              crash_base, crash_max);
>>       if (!crash_base) {
>> +        /*
>> +         * Attempt to fully allocate low memory failed, fall back
>> +         * to high memory, the minimum required low memory will be
>> +         * reserved later.
>> +         */
>> +        if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
>> +            crash_max = CRASH_ADDR_HIGH_MAX;
>> +            goto retry;
>> +        }
>> +
>>           pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>               crash_size);
>>           return;
>>       }
>>   +    if (crash_base >= SZ_4G) {
>> +        /*
>> +         * For case crashkernel=X, low memory is not enough and fall
>> +         * back to reserve specified size of memory above 4G, try to
>> +         * allocate minimum required memory below 4G again.
>> +         */
>> +        if (!high)
>> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +
>> +        if (reserve_crashkernel_low(crash_low_size)) {
>> +            memblock_phys_free(crash_base, crash_size);
>> +            return;
>> +        }
>> +    }
>> +
>>       pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>           crash_base, crash_base + crash_size, crash_size >> 20);
>>   @@ -107,6 +194,9 @@ static void __init reserve_crashkernel(void)
>>        * map. Inform kmemleak so that it won't try to access it.
>>        */
>>       kmemleak_ignore_phys(crash_base);
>> +    if (crashk_low_res.end)
>> +        kmemleak_ignore_phys(crashk_low_res.start);
>> +
>>       crashk_res.start = crash_base;
>>       crashk_res.end = crash_base + crash_size - 1;
>>       insert_resource(&iomem_resource, &crashk_res);
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump
  2022-02-27  3:07 [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
                   ` (4 preceding siblings ...)
  2022-02-27  3:07 ` [PATCH v21 5/5] docs: kdump: Update the crashkernel description for arm64 Zhen Lei
@ 2022-04-08  9:32 ` Baoquan He
  2022-04-08  9:47   ` Leizhen (ThunderTown)
  5 siblings, 1 reply; 27+ messages in thread
From: Baoquan He @ 2022-04-08  9:32 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

Hi, Lei

On 02/27/22 at 11:07am, Zhen Lei wrote:
> Changes since [v20]:
> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> 4. To improve readability, some lightweight code adjustments have been made to reserve_crashkernel(), including comments.
> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.

5.18-rc1 is already out. Do you plan to post a new version for
review?

Thanks
Baoquan



* Re: [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump
  2022-04-08  9:32 ` [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Baoquan He
@ 2022-04-08  9:47   ` Leizhen (ThunderTown)
  2022-04-11  2:56     ` Baoquan He
  0 siblings, 1 reply; 27+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-08  9:47 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/8 17:32, Baoquan He wrote:
> Hi, Lei
> 
> On 02/27/22 at 11:07am, Zhen Lei wrote:
>> Changes since [v20]:
>> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
>> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
>> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
>> 4. To improve readability, some lightweight code adjustments have been made to reserve_crashkernel(), including comments.
>> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> 
> 5.18-rc1 is already out. Do you plan to post a new version for
> review?

Yes. v5.18-rc1 added a new patch,
commit 031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones"),
which allows block mapping again, so my patches need to be modified. The new version should be posted next week.

> 
> Thanks
> Baoquan
> 
> .
> 

-- 
Regards,
  Zhen Lei


* Re: [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump
  2022-04-08  9:47   ` Leizhen (ThunderTown)
@ 2022-04-11  2:56     ` Baoquan He
  0 siblings, 0 replies; 27+ messages in thread
From: Baoquan He @ 2022-04-11  2:56 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 04/08/22 at 05:47pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/8 17:32, Baoquan He wrote:
> > Hi, Lei
> > 
> > On 02/27/22 at 11:07am, Zhen Lei wrote:
> >> Changes since [v20]:
> >> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> >> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> >> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> >> 4. To improve readability, some lightweight code adjustments have been made to reserve_crashkernel(), including comments.
> >> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> > 
> > 5.18-rc1 is already out. Do you plan to post a new version for
> > review?
> 
> Yes. v5.18-rc1 added a new patch,
> commit 031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones"),
> which allows block mapping again, so my patches need to be modified. The new version should be posted next week.

Sounds great, thanks. Just a reminder, please take your time.



Thread overview: 27+ messages
2022-02-27  3:07 [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
2022-02-27  3:07 ` [PATCH v21 1/5] kdump: return -ENOENT if required cmdline option does not exist Zhen Lei
2022-03-15 11:57   ` Baoquan He
2022-03-15 12:21     ` Baoquan He
2022-03-15 13:32       ` Leizhen (ThunderTown)
2022-03-16  5:17         ` Baoquan He
2022-03-16  5:39   ` Baoquan He
2022-03-16  6:15     ` Leizhen (ThunderTown)
2022-02-27  3:07 ` [PATCH v21 2/5] arm64: Use insert_resource() to simplify code Zhen Lei
2022-02-27  3:07 ` [PATCH v21 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
2022-03-16 12:11   ` Baoquan He
2022-03-16 13:11     ` Leizhen (ThunderTown)
2022-03-17  2:36       ` Baoquan He
2022-03-17  3:19         ` Leizhen (ThunderTown)
2022-03-17  3:47           ` Baoquan He
2022-03-17  7:30             ` Leizhen (ThunderTown)
2022-03-17  2:38   ` Baoquan He
2022-03-17  3:23     ` Leizhen (ThunderTown)
2022-03-21 13:29   ` John Donnelly
2022-03-21 14:09     ` Dave Kleikamp
2022-03-22  1:58     ` Leizhen (ThunderTown)
2022-02-27  3:07 ` [PATCH v21 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
2022-02-27  3:07 ` [PATCH v21 5/5] docs: kdump: Update the crashkernel description for arm64 Zhen Lei
2022-03-15 11:59   ` Baoquan He
2022-04-08  9:32 ` [PATCH v21 0/5] support reserving crashkernel above 4G on arm64 kdump Baoquan He
2022-04-08  9:47   ` Leizhen (ThunderTown)
2022-04-11  2:56     ` Baoquan He
