All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-14 11:57 ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Changes since [v21]:
1. Update the commit message of  patch 1 and 5.
2. Add some comments for reserve_crashkernel() in patch 5.

Thanks to Baoquan He and John Donnelly for their review comments.

Because v5.18-rc1 has added a new patch
commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
There are many new scenarios:
1) The mappings may be block or page-level. 
2) The call to reserve_crashkernel() may or may not be deferred.
3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
   upper limit of physical memory, because SMMU can do the mapping.

The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
The code of patch 5 only makes the following changes:
-	if (crash_base >= SZ_4G)
+	/*
+	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
+	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
+	 * the 'crash_base' of high memory can not exceed it. To follow the
+	 * description of "crashkernel=X,high" option, add below 'high'
+	 * condition to make sure the crash low memory will be reserved.
+	 */
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
"(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".

Patch 3-4 to allow block mappings for memory above 4G.
Patch 6-7 to support only crash high memory or fixed memory range specified by
crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
These four patches are for performance optimization purposes. For details about the
technical feasibility analysis, please see the commit messages.

Now the implementation of arm64 is very different from that of x86. It's no longer
suitable for both of them to share code.



Changes since [v20]:
1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.

Thanks to Baoquan He for his careful review.

The test cases are as follows: (Please update the kexec tool to the latest version)
1) crashkernel=4G						//high=4G, low=256M
2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
6) crashkernel=512M						//high=0G, low=512M
7) crashkernel=128M						//high=0G, low=128M
8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
9) crashkernel=4G,high						//high=4G, low=256M
a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
c) crashkernel=128M,high					//high=128M, low=256M
d) crashkernel=512M,low						//high=0G, low=0M, invalid
e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid


Changes since [v19]:
1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
   lot of details need to be considered, which can take a long time. Because
   "make generic" does not add new functions and does not improve performance,
   maybe I should say it's just a cleanup. So by stripping it out and leaving
   it for other patches later, we can aggregate the changes to the main functions.
2. Use insert_resource() to replace request_resource(), this not only simplifies
   the code, but also reduces the differences between arm64 and x86 implementations.
3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
   x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
   https://www.spinics.net/lists/kexec/msg28284.html

Thank you very much, Borislav Petkov, for so many valuable comments.

Changes since [v17]: v17 --> v19
1. Patch 0001-0004
   Introduce generic parse_crashkernel_high_low() to bring the parsing of
   "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
   then use it instead of the call to parse_crashkernel_{high|low}(). Two
   confusing parameters of parse_crashkernel_{high|low}() are deleted.

   I previously sent these four patches separately:
   [1] https://lkml.org/lkml/2021/12/25/40
2. Patch 0005-0009
   Introduce generic reserve_crashkernel_mem[_low](), the implementation of
   these two functions is based on function reserve_crashkernel[_low]() in
   arch/x86/kernel/setup.c. There is no functional change for x86.
   1) The check position of xen_pv_domain() does not change.
   2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.

   To avoid compilation problems on other architectures: patch 0004 moves
   the definition of global variable crashk[_low]_res from kexec_core.c to
   crash_core.c, and provide default definitions for all macros involved, a
   particular platform can redefine these macros to override the default
   values.
3. 0010, only one line of comment was changed.
4. 0011
   1) crashk_low_res may also a valid reserved memory, should be checked
      in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
   2) Drop memblock_mark_nomap() for crashk_low_res, because of:
      2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
   3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
      85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
5. 0012, slightly rebased, because the following patch is applied in advance. 
   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
6. 0013, no change.

Others:
1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
2. When allocating crash low memory, the start address still starts from 0.
   low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
4. Ensure the check position of xen_pv_domain() have no change.
5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
6. Update description.



Changes since [v16]
- Because no functional changes in this version, so add
  "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
- Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
- Update patch 9 based on the review comments of Rob Herring
- As Catalin Marinas's suggestion, merge the implementation of
  ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
  contents of X86 and ARM64 do not overlap, and reduce unnecessary
  temporary differences.

Changes since [v15]
-  Aggregate the processing of "linux,usable-memory-range" into one function.
   Only patch 9-10 have been updated.

Changes since [v14]
- Recovering the requirement that the CrashKernel memory regions on X86
  only requires 1 MiB alignment.
- Combine patches 5 and 6 in v14 into one. The compilation warning fixed
  by patch 6 was introduced by patch 5 in v14.
- As with crashk_res, crashk_low_res is also processed by
  crash_exclude_mem_range() in patch 7.
- Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
  has removed the architecture-specific code, extend the property "linux,usable-memory-range"
  in the platform-agnostic FDT core code. See patch 9.
- Discard the x86 description update in the document, because the description
  has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
- Change "arm64" to "ARM64" in Doc.


Changes since [v13]
- Rebased on top of 5.11-rc5.
- Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
Since reserve_crashkernel[_low]() implementations are quite similar on
other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
arch/Kconfig and select this by X86 and ARM64.
- Some minor cleanup.

Changes since [v12]
- Rebased on top of 5.10-rc1.
- Keep CRASH_ALIGN as 16M suggested by Dave.
- Drop patch "kdump: add threshold for the required memory".
- Add Tested-by from John.

Changes since [v11]
- Rebased on top of 5.9-rc4.
- Make the function reserve_crashkernel() of x86 generic.
Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
and arm64 use the generic version to reimplement crashkernel=X.

Changes since [v10]
- Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.

Changes since [v9]
- Patch 1 add Acked-by from Dave.
- Update patch 5 according to Dave's comments.
- Update chosen schema.

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump broken with ZONE_DMA reintroduced.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
suggested by Arnd.
- Add Tested-by from Jhon and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified simultaneously, reserve spcified size low
memory for crash kdump kernel devices firstly and then reserve memory above 4G.
In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
pass to crash dump kernel by DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() i added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions, for two crash kernel regions
case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
and then remove the memory range in the middle.

v1:
There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is no enough low memory.
2. If reserving crashkernel above 4G, in this case, crash dump
kernel will boot failure because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries low allocation in DMA zone and fall back to high
allocation if it fails.

We can also use "crashkernel=X,high" to select a high region above
DMA zone, which also tries to allocate at least 256M low memory in
DMA zone automatically and "crashkernel=Y,low" can be used to allocate
specified size low memory.

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices. So there may be two regions reserved for
crash dump kernel.
In order to distinct from the high region and make no effect to the use
of existing kexec-tools, rename the low region as "Crash kernel (low)",
and pass the low region by reusing DT property
"linux,usable-memory-range". We made the low memory region as the last
range of "linux,usable-memory-range" to keep compatibility with existing
user-space and older kdump kernels.

Besides, we need to modify kexec-tools:
arm64: support more than one crash kernel regions(see [1])

Another update is document about DT property 'linux,usable-memory-range':
schemas: update 'linux,usable-memory-range' node schema(see [2])


[1]: https://www.spinics.net/lists/kexec/msg28226.html
[2]: https://github.com/robherring/dt-schema/pull/19 
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213
[v9]: https://lkml.org/lkml/2020/6/28/73
[v10]: https://lkml.org/lkml/2020/7/2/1443
[v11]: https://lkml.org/lkml/2020/8/1/150
[v12]: https://lkml.org/lkml/2020/9/7/1037
[v13]: https://lkml.org/lkml/2020/10/31/34
[v14]: https://lkml.org/lkml/2021/1/30/53
[v15]: https://lkml.org/lkml/2021/10/19/1405
[v16]: https://lkml.org/lkml/2021/11/23/435
[v17]: https://lkml.org/lkml/2021/12/10/38
[v18]: https://lkml.org/lkml/2021/12/22/424
[v19]: https://lkml.org/lkml/2021/12/28/203
[v20]: https://lkml.org/lkml/2022/1/24/167
[v21]: https://lkml.org/lkml/2022/2/26/350

Chen Zhou (2):
  arm64: kdump: Reimplement crashkernel=X
  of: fdt: Add memory for devices by DT property
    "linux,usable-memory-range"

Zhen Lei (7):
  kdump: return -ENOENT if required cmdline option does not exist
  arm64: Use insert_resource() to simplify code
  arm64: kdump: Remove some redundant checks in map_mem()
  arm64: kdump: Don't force page-level mappings for memory above 4G
  arm64: kdump: Use page-level mapping for the high memory of
    crashkernel
  arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
  docs: kdump: Update the crashkernel description for arm64

 .../admin-guide/kernel-parameters.txt         |   8 +-
 arch/arm64/include/asm/kexec.h                |   2 +
 arch/arm64/kernel/machine_kexec.c             |   9 +-
 arch/arm64/kernel/machine_kexec_file.c        |  12 +-
 arch/arm64/kernel/setup.c                     |  17 +-
 arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
 arch/arm64/mm/mmu.c                           |  68 +++---
 drivers/of/fdt.c                              |  33 ++-
 kernel/crash_core.c                           |   3 +-
 9 files changed, 278 insertions(+), 78 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-14 11:57 ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Changes since [v21]:
1. Update the commit message of  patch 1 and 5.
2. Add some comments for reserve_crashkernel() in patch 5.

Thanks to Baoquan He and John Donnelly for their review comments.

Because v5.18-rc1 has added a new patch
commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
There are many new scenarios:
1) The mappings may be block or page-level. 
2) The call to reserve_crashkernel() may or may not be deferred.
3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
   upper limit of physical memory, because SMMU can do the mapping.

The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
The code of patch 5 only makes the following changes:
-	if (crash_base >= SZ_4G)
+	/*
+	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
+	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
+	 * the 'crash_base' of high memory can not exceed it. To follow the
+	 * description of "crashkernel=X,high" option, add below 'high'
+	 * condition to make sure the crash low memory will be reserved.
+	 */
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
"(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".

Patch 3-4 to allow block mappings for memory above 4G.
Patch 6-7 to support only crash high memory or fixed memory range specified by
crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
These four patches are for performance optimization purposes. For details about the
technical feasibility analysis, please see the commit messages.

Now the implementation of arm64 is very different from that of x86. It's no longer
suitable for both of them to share code.



Changes since [v20]:
1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.

Thanks to Baoquan He for his careful review.

The test cases are as follows: (Please update the kexec tool to the latest version)
1) crashkernel=4G						//high=4G, low=256M
2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
6) crashkernel=512M						//high=0G, low=512M
7) crashkernel=128M						//high=0G, low=128M
8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
9) crashkernel=4G,high						//high=4G, low=256M
a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
c) crashkernel=128M,high					//high=128M, low=256M
d) crashkernel=512M,low						//high=0G, low=0M, invalid
e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid


Changes since [v19]:
1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
   lot of details need to be considered, which can take a long time. Because
   "make generic" does not add new functions and does not improve performance,
   maybe I should say it's just a cleanup. So by stripping it out and leaving
   it for other patches later, we can aggregate the changes to the main functions.
2. Use insert_resource() to replace request_resource(), this not only simplifies
   the code, but also reduces the differences between arm64 and x86 implementations.
3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
   x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
   https://www.spinics.net/lists/kexec/msg28284.html

Thank you very much, Borislav Petkov, for so many valuable comments.

Changes since [v17]: v17 --> v19
1. Patch 0001-0004
   Introduce generic parse_crashkernel_high_low() to bring the parsing of
   "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
   then use it instead of the call to parse_crashkernel_{high|low}(). Two
   confusing parameters of parse_crashkernel_{high|low}() are deleted.

   I previously sent these four patches separately:
   [1] https://lkml.org/lkml/2021/12/25/40
2. Patch 0005-0009
   Introduce generic reserve_crashkernel_mem[_low](), the implementation of
   these two functions is based on function reserve_crashkernel[_low]() in
   arch/x86/kernel/setup.c. There is no functional change for x86.
   1) The check position of xen_pv_domain() does not change.
   2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.

   To avoid compilation problems on other architectures: patch 0004 moves
   the definition of global variable crashk[_low]_res from kexec_core.c to
   crash_core.c, and provide default definitions for all macros involved, a
   particular platform can redefine these macros to override the default
   values.
3. 0010, only one line of comment was changed.
4. 0011
   1) crashk_low_res may also a valid reserved memory, should be checked
      in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
   2) Drop memblock_mark_nomap() for crashk_low_res, because of:
      2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
   3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
      85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
5. 0012, slightly rebased, because the following patch is applied in advance. 
   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
6. 0013, no change.

Others:
1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
2. When allocating crash low memory, the start address still starts from 0.
   low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
4. Ensure the check position of xen_pv_domain() have no change.
5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
6. Update description.



Changes since [v16]
- Because no functional changes in this version, so add
  "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
- Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
- Update patch 9 based on the review comments of Rob Herring
- As Catalin Marinas's suggestion, merge the implementation of
  ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
  contents of X86 and ARM64 do not overlap, and reduce unnecessary
  temporary differences.

Changes since [v15]
-  Aggregate the processing of "linux,usable-memory-range" into one function.
   Only patch 9-10 have been updated.

Changes since [v14]
- Recovering the requirement that the CrashKernel memory regions on X86
  only requires 1 MiB alignment.
- Combine patches 5 and 6 in v14 into one. The compilation warning fixed
  by patch 6 was introduced by patch 5 in v14.
- As with crashk_res, crashk_low_res is also processed by
  crash_exclude_mem_range() in patch 7.
- Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
  has removed the architecture-specific code, extend the property "linux,usable-memory-range"
  in the platform-agnostic FDT core code. See patch 9.
- Discard the x86 description update in the document, because the description
  has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
- Change "arm64" to "ARM64" in Doc.


Changes since [v13]
- Rebased on top of 5.11-rc5.
- Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
Since reserve_crashkernel[_low]() implementations are quite similar on
other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
arch/Kconfig and select this by X86 and ARM64.
- Some minor cleanup.

Changes since [v12]
- Rebased on top of 5.10-rc1.
- Keep CRASH_ALIGN as 16M suggested by Dave.
- Drop patch "kdump: add threshold for the required memory".
- Add Tested-by from John.

Changes since [v11]
- Rebased on top of 5.9-rc4.
- Make the function reserve_crashkernel() of x86 generic.
Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
and arm64 use the generic version to reimplement crashkernel=X.

Changes since [v10]
- Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.

Changes since [v9]
- Patch 1 add Acked-by from Dave.
- Update patch 5 according to Dave's comments.
- Update chosen schema.

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump broken with ZONE_DMA reintroduced.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
suggested by Arnd.
- Add Tested-by from Jhon and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified simultaneously, reserve spcified size low
memory for crash kdump kernel devices firstly and then reserve memory above 4G.
In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
pass to crash dump kernel by DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() i added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions, for two crash kernel regions
case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
and then remove the memory range in the middle.

v1:
There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is no enough low memory.
2. If reserving crashkernel above 4G, in this case, crash dump
kernel will boot failure because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries low allocation in DMA zone and fall back to high
allocation if it fails.

We can also use "crashkernel=X,high" to select a high region above
DMA zone, which also tries to allocate at least 256M low memory in
DMA zone automatically and "crashkernel=Y,low" can be used to allocate
specified size low memory.

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices. So there may be two regions reserved for
crash dump kernel.
In order to distinct from the high region and make no effect to the use
of existing kexec-tools, rename the low region as "Crash kernel (low)",
and pass the low region by reusing DT property
"linux,usable-memory-range". We made the low memory region as the last
range of "linux,usable-memory-range" to keep compatibility with existing
user-space and older kdump kernels.

Besides, we need to modify kexec-tools:
arm64: support more than one crash kernel regions(see [1])

Another update is document about DT property 'linux,usable-memory-range':
schemas: update 'linux,usable-memory-range' node schema(see [2])


[1]: https://www.spinics.net/lists/kexec/msg28226.html
[2]: https://github.com/robherring/dt-schema/pull/19 
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213
[v9]: https://lkml.org/lkml/2020/6/28/73
[v10]: https://lkml.org/lkml/2020/7/2/1443
[v11]: https://lkml.org/lkml/2020/8/1/150
[v12]: https://lkml.org/lkml/2020/9/7/1037
[v13]: https://lkml.org/lkml/2020/10/31/34
[v14]: https://lkml.org/lkml/2021/1/30/53
[v15]: https://lkml.org/lkml/2021/10/19/1405
[v16]: https://lkml.org/lkml/2021/11/23/435
[v17]: https://lkml.org/lkml/2021/12/10/38
[v18]: https://lkml.org/lkml/2021/12/22/424
[v19]: https://lkml.org/lkml/2021/12/28/203
[v20]: https://lkml.org/lkml/2022/1/24/167
[v21]: https://lkml.org/lkml/2022/2/26/350

Chen Zhou (2):
  arm64: kdump: Reimplement crashkernel=X
  of: fdt: Add memory for devices by DT property
    "linux,usable-memory-range"

Zhen Lei (7):
  kdump: return -ENOENT if required cmdline option does not exist
  arm64: Use insert_resource() to simplify code
  arm64: kdump: Remove some redundant checks in map_mem()
  arm64: kdump: Don't force page-level mappings for memory above 4G
  arm64: kdump: Use page-level mapping for the high memory of
    crashkernel
  arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
  docs: kdump: Update the crashkernel description for arm64

 .../admin-guide/kernel-parameters.txt         |   8 +-
 arch/arm64/include/asm/kexec.h                |   2 +
 arch/arm64/kernel/machine_kexec.c             |   9 +-
 arch/arm64/kernel/machine_kexec_file.c        |  12 +-
 arch/arm64/kernel/setup.c                     |  17 +-
 arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
 arch/arm64/mm/mmu.c                           |  68 +++---
 drivers/of/fdt.c                              |  33 ++-
 kernel/crash_core.c                           |   3 +-
 9 files changed, 278 insertions(+), 78 deletions(-)

-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-14 11:57 ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

Changes since [v21]:
1. Update the commit message of  patch 1 and 5.
2. Add some comments for reserve_crashkernel() in patch 5.

Thanks to Baoquan He and John Donnelly for their review comments.

Because v5.18-rc1 has added a new patch
commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
There are many new scenarios:
1) The mappings may be block or page-level. 
2) The call to reserve_crashkernel() may or may not be deferred.
3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
   upper limit of physical memory, because SMMU can do the mapping.

The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
The code of patch 5 only makes the following changes:
-	if (crash_base >= SZ_4G)
+	/*
+	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
+	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
+	 * the 'crash_base' of high memory can not exceed it. To follow the
+	 * description of "crashkernel=X,high" option, add below 'high'
+	 * condition to make sure the crash low memory will be reserved.
+	 */
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
"(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".

Patch 3-4 to allow block mappings for memory above 4G.
Patch 6-7 to support only crash high memory or fixed memory range specified by
crashkernel=X at offset use page-level mapping, to allow other areas use block mapping.
These four patches are for performance optimization purposes. For details about the
technical feasibility analysis, please see the commit messages.

Now the implementation of arm64 is very different from that of x86. It's no longer
suitable for both of them to share code.



Changes since [v20]:
1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.

Thanks to Baoquan He for his careful review.

The test cases are as follows: (Please update the kexec tool to the latest version)
1) crashkernel=4G						//high=4G, low=256M
2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
5) crashkernel=4G at 0xe0000000					//high=0G, low=0M, cannot allocate, failed
6) crashkernel=512M						//high=0G, low=512M
7) crashkernel=128M						//high=0G, low=128M
8) crashkernel=512M at 0xde000000		//512M at 3552M		//high=0G, low=512M
9) crashkernel=4G,high						//high=4G, low=256M
a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
c) crashkernel=128M,high					//high=128M, low=256M
d) crashkernel=512M,low						//high=0G, low=0M, invalid
e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid


Changes since [v19]:
1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
   lot of details need to be considered, which can take a long time. Because
   "make generic" does not add new functions and does not improve performance,
   maybe I should say it's just a cleanup. So by stripping it out and leaving
   it for other patches later, we can aggregate the changes to the main functions.
2. Use insert_resource() to replace request_resource(), this not only simplifies
   the code, but also reduces the differences between arm64 and x86 implementations.
3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
   x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
   https://www.spinics.net/lists/kexec/msg28284.html

Thank you very much, Borislav Petkov, for so many valuable comments.

Changes since [v17]: v17 --> v19
1. Patch 0001-0004
   Introduce generic parse_crashkernel_high_low() to bring the parsing of
   "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
   then use it instead of the call to parse_crashkernel_{high|low}(). Two
   confusing parameters of parse_crashkernel_{high|low}() are deleted.

   I previously sent these four patches separately:
   [1] https://lkml.org/lkml/2021/12/25/40
2. Patch 0005-0009
   Introduce generic reserve_crashkernel_mem[_low](), the implementation of
   these two functions is based on function reserve_crashkernel[_low]() in
   arch/x86/kernel/setup.c. There is no functional change for x86.
   1) The check position of xen_pv_domain() does not change.
   2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.

   To avoid compilation problems on other architectures: patch 0004 moves
   the definition of global variable crashk[_low]_res from kexec_core.c to
   crash_core.c, and provide default definitions for all macros involved, a
   particular platform can redefine these macros to override the default
   values.
3. 0010, only one line of comment was changed.
4. 0011
   1) crashk_low_res may also a valid reserved memory, should be checked
      in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
   2) Drop memblock_mark_nomap() for crashk_low_res, because of:
      2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
   3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
      85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
5. 0012, slightly rebased, because the following patch is applied in advance. 
   https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
6. 0013, no change.

Others:
1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
2. When allocating crash low memory, the start address still starts from 0.
   low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
4. Ensure the check position of xen_pv_domain() have no change.
5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
6. Update description.



Changes since [v16]
- Because no functional changes in this version, so add
  "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
- Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
- Update patch 9 based on the review comments of Rob Herring
- As Catalin Marinas's suggestion, merge the implementation of
  ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
  contents of X86 and ARM64 do not overlap, and reduce unnecessary
  temporary differences.

Changes since [v15]
-  Aggregate the processing of "linux,usable-memory-range" into one function.
   Only patch 9-10 have been updated.

Changes since [v14]
- Recovering the requirement that the CrashKernel memory regions on X86
  only requires 1 MiB alignment.
- Combine patches 5 and 6 in v14 into one. The compilation warning fixed
  by patch 6 was introduced by patch 5 in v14.
- As with crashk_res, crashk_low_res is also processed by
  crash_exclude_mem_range() in patch 7.
- Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
  has removed the architecture-specific code, extend the property "linux,usable-memory-range"
  in the platform-agnostic FDT core code. See patch 9.
- Discard the x86 description update in the document, because the description
  has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
- Change "arm64" to "ARM64" in Doc.


Changes since [v13]
- Rebased on top of 5.11-rc5.
- Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
Since reserve_crashkernel[_low]() implementations are quite similar on
other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
arch/Kconfig and select this by X86 and ARM64.
- Some minor cleanup.

Changes since [v12]
- Rebased on top of 5.10-rc1.
- Keep CRASH_ALIGN as 16M suggested by Dave.
- Drop patch "kdump: add threshold for the required memory".
- Add Tested-by from John.

Changes since [v11]
- Rebased on top of 5.9-rc4.
- Make the function reserve_crashkernel() of x86 generic.
Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
and arm64 use the generic version to reimplement crashkernel=X.

Changes since [v10]
- Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.

Changes since [v9]
- Patch 1 add Acked-by from Dave.
- Update patch 5 according to Dave's comments.
- Update chosen schema.

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump broken with ZONE_DMA reintroduced.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
suggested by Arnd.
- Add Tested-by from Jhon and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified simultaneously, reserve spcified size low
memory for crash kdump kernel devices firstly and then reserve memory above 4G.
In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
pass to crash dump kernel by DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() i added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions, for two crash kernel regions
case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
and then remove the memory range in the middle.

v1:
There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is no enough low memory.
2. If reserving crashkernel above 4G, in this case, crash dump
kernel will boot failure because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries low allocation in DMA zone and fall back to high
allocation if it fails.

We can also use "crashkernel=X,high" to select a high region above
DMA zone, which also tries to allocate at least 256M low memory in
DMA zone automatically and "crashkernel=Y,low" can be used to allocate
specified size low memory.

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices. So there may be two regions reserved for
crash dump kernel.
In order to distinct from the high region and make no effect to the use
of existing kexec-tools, rename the low region as "Crash kernel (low)",
and pass the low region by reusing DT property
"linux,usable-memory-range". We made the low memory region as the last
range of "linux,usable-memory-range" to keep compatibility with existing
user-space and older kdump kernels.

Besides, we need to modify kexec-tools:
arm64: support more than one crash kernel regions(see [1])

Another update is document about DT property 'linux,usable-memory-range':
schemas: update 'linux,usable-memory-range' node schema(see [2])


[1]: https://www.spinics.net/lists/kexec/msg28226.html
[2]: https://github.com/robherring/dt-schema/pull/19 
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213
[v9]: https://lkml.org/lkml/2020/6/28/73
[v10]: https://lkml.org/lkml/2020/7/2/1443
[v11]: https://lkml.org/lkml/2020/8/1/150
[v12]: https://lkml.org/lkml/2020/9/7/1037
[v13]: https://lkml.org/lkml/2020/10/31/34
[v14]: https://lkml.org/lkml/2021/1/30/53
[v15]: https://lkml.org/lkml/2021/10/19/1405
[v16]: https://lkml.org/lkml/2021/11/23/435
[v17]: https://lkml.org/lkml/2021/12/10/38
[v18]: https://lkml.org/lkml/2021/12/22/424
[v19]: https://lkml.org/lkml/2021/12/28/203
[v20]: https://lkml.org/lkml/2022/1/24/167
[v21]: https://lkml.org/lkml/2022/2/26/350

Chen Zhou (2):
  arm64: kdump: Reimplement crashkernel=X
  of: fdt: Add memory for devices by DT property
    "linux,usable-memory-range"

Zhen Lei (7):
  kdump: return -ENOENT if required cmdline option does not exist
  arm64: Use insert_resource() to simplify code
  arm64: kdump: Remove some redundant checks in map_mem()
  arm64: kdump: Don't force page-level mappings for memory above 4G
  arm64: kdump: Use page-level mapping for the high memory of
    crashkernel
  arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
  docs: kdump: Update the crashkernel description for arm64

 .../admin-guide/kernel-parameters.txt         |   8 +-
 arch/arm64/include/asm/kexec.h                |   2 +
 arch/arm64/kernel/machine_kexec.c             |   9 +-
 arch/arm64/kernel/machine_kexec_file.c        |  12 +-
 arch/arm64/kernel/setup.c                     |  17 +-
 arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
 arch/arm64/mm/mmu.c                           |  68 +++---
 drivers/of/fdt.c                              |  33 ++-
 kernel/crash_core.c                           |   3 +-
 9 files changed, 278 insertions(+), 78 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 1/9] kdump: return -ENOENT if required cmdline option does not exist
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

According to the current crashkernel=Y,low support in other ARCHes, it's
an optional command-line option. When it doesn't exist, kernel will try
to allocate minimum required memory below 4G automatically.

However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
can't distinguish the nonexistent option from invalid option.

Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
case. With this change, crashkernel,low memory will take the default
value if crashkernel=,low is not specified; while crashkernel reservation
will fail and bail out if an invalid option is specified.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/crash_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 256cf6db573cd09..4d57c03714f4e13 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
 	*crash_base = 0;
 
 	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
-
 	if (!ck_cmdline)
-		return -EINVAL;
+		return -ENOENT;
 
 	ck_cmdline += strlen(name);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 1/9] kdump: return -ENOENT if required cmdline option does not exist
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

According to the current crashkernel=Y,low support in other ARCHes, it's
an optional command-line option. When it doesn't exist, kernel will try
to allocate minimum required memory below 4G automatically.

However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
can't distinguish the nonexistent option from invalid option.

Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
case. With this change, crashkernel,low memory will take the default
value if crashkernel=,low is not specified; while crashkernel reservation
will fail and bail out if an invalid option is specified.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/crash_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 256cf6db573cd09..4d57c03714f4e13 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
 	*crash_base = 0;
 
 	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
-
 	if (!ck_cmdline)
-		return -EINVAL;
+		return -ENOENT;
 
 	ck_cmdline += strlen(name);
 
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 1/9] kdump: return -ENOENT if required cmdline option does not exist
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

According to the current crashkernel=Y,low support in other ARCHes, it's
an optional command-line option. When it doesn't exist, kernel will try
to allocate minimum required memory below 4G automatically.

However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
can't distinguish the nonexistent option from invalid option.

Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
case. With this change, crashkernel,low memory will take the default
value if crashkernel=,low is not specified; while crashkernel reservation
will fail and bail out if an invalid option is specified.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 kernel/crash_core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 256cf6db573cd09..4d57c03714f4e13 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
 	*crash_base = 0;
 
 	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
-
 	if (!ck_cmdline)
-		return -EINVAL;
+		return -ENOENT;
 
 	ck_cmdline += strlen(name);
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 2/9] arm64: Use insert_resource() to simplify code
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

insert_resource() traverses the subtree layer by layer from the root node
until a proper location is found. Compared with request_resource(), the
parent node does not need to be determined in advance.

In addition, move the insertion of node 'crashk_res' into function
reserve_crashkernel() to make the associated code close together.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: John Donnelly  <john.p.donnelly@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 arch/arm64/kernel/setup.c | 17 +++--------------
 arch/arm64/mm/init.c      |  1 +
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 3505789cf4bd92a..fea3223704b6339 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
 	kernel_code.end     = __pa_symbol(__init_begin - 1);
 	kernel_data.start   = __pa_symbol(_sdata);
 	kernel_data.end     = __pa_symbol(_end - 1);
+	insert_resource(&iomem_resource, &kernel_code);
+	insert_resource(&iomem_resource, &kernel_data);
 
 	num_standard_resources = memblock.memory.cnt;
 	res_size = num_standard_resources * sizeof(*standard_resources);
@@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
 			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
 		}
 
-		request_resource(&iomem_resource, res);
-
-		if (kernel_code.start >= res->start &&
-		    kernel_code.end <= res->end)
-			request_resource(res, &kernel_code);
-		if (kernel_data.start >= res->start &&
-		    kernel_data.end <= res->end)
-			request_resource(res, &kernel_data);
-#ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
-		if (crashk_res.end && crashk_res.start >= res->start &&
-		    crashk_res.end <= res->end)
-			request_resource(res, &crashk_res);
-#endif
+		insert_resource(&iomem_resource, res);
 	}
 }
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8ac25f19084e898..f670bca160992b9 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -137,6 +137,7 @@ static void __init reserve_crashkernel(void)
 	kmemleak_ignore_phys(crash_base);
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
+	insert_resource(&iomem_resource, &crashk_res);
 }
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 2/9] arm64: Use insert_resource() to simplify code
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

insert_resource() traverses the subtree layer by layer from the root node
until a proper location is found. Compared with request_resource(), the
parent node does not need to be determined in advance.

In addition, move the insertion of node 'crashk_res' into function
reserve_crashkernel() to make the associated code close together.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: John Donnelly  <john.p.donnelly@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 arch/arm64/kernel/setup.c | 17 +++--------------
 arch/arm64/mm/init.c      |  1 +
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 3505789cf4bd92a..fea3223704b6339 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
 	kernel_code.end     = __pa_symbol(__init_begin - 1);
 	kernel_data.start   = __pa_symbol(_sdata);
 	kernel_data.end     = __pa_symbol(_end - 1);
+	insert_resource(&iomem_resource, &kernel_code);
+	insert_resource(&iomem_resource, &kernel_data);
 
 	num_standard_resources = memblock.memory.cnt;
 	res_size = num_standard_resources * sizeof(*standard_resources);
@@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
 			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
 		}
 
-		request_resource(&iomem_resource, res);
-
-		if (kernel_code.start >= res->start &&
-		    kernel_code.end <= res->end)
-			request_resource(res, &kernel_code);
-		if (kernel_data.start >= res->start &&
-		    kernel_data.end <= res->end)
-			request_resource(res, &kernel_data);
-#ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
-		if (crashk_res.end && crashk_res.start >= res->start &&
-		    crashk_res.end <= res->end)
-			request_resource(res, &crashk_res);
-#endif
+		insert_resource(&iomem_resource, res);
 	}
 }
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8ac25f19084e898..f670bca160992b9 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -137,6 +137,7 @@ static void __init reserve_crashkernel(void)
 	kmemleak_ignore_phys(crash_base);
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
+	insert_resource(&iomem_resource, &crashk_res);
 }
 
 /*
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 2/9] arm64: Use insert_resource() to simplify code
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

insert_resource() traverses the subtree layer by layer from the root node
until a proper location is found. Compared with request_resource(), the
parent node does not need to be determined in advance.

In addition, move the insertion of node 'crashk_res' into function
reserve_crashkernel() to make the associated code close together.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: John Donnelly  <john.p.donnelly@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 arch/arm64/kernel/setup.c | 17 +++--------------
 arch/arm64/mm/init.c      |  1 +
 2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 3505789cf4bd92a..fea3223704b6339 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
 	kernel_code.end     = __pa_symbol(__init_begin - 1);
 	kernel_data.start   = __pa_symbol(_sdata);
 	kernel_data.end     = __pa_symbol(_end - 1);
+	insert_resource(&iomem_resource, &kernel_code);
+	insert_resource(&iomem_resource, &kernel_data);
 
 	num_standard_resources = memblock.memory.cnt;
 	res_size = num_standard_resources * sizeof(*standard_resources);
@@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
 			res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
 		}
 
-		request_resource(&iomem_resource, res);
-
-		if (kernel_code.start >= res->start &&
-		    kernel_code.end <= res->end)
-			request_resource(res, &kernel_code);
-		if (kernel_data.start >= res->start &&
-		    kernel_data.end <= res->end)
-			request_resource(res, &kernel_data);
-#ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
-		if (crashk_res.end && crashk_res.start >= res->start &&
-		    crashk_res.end <= res->end)
-			request_resource(res, &crashk_res);
-#endif
+		insert_resource(&iomem_resource, res);
 	}
 }
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 8ac25f19084e898..f670bca160992b9 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -137,6 +137,7 @@ static void __init reserve_crashkernel(void)
 	kmemleak_ignore_phys(crash_base);
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
+	insert_resource(&iomem_resource, &crashk_res);
 }
 
 /*
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 3/9] arm64: kdump: Remove some redundant checks in map_mem()
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

If the value of crashk_res.end is non-zero in this function, it indicates
that function reserve_crashkernel() has been called under condition
"if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))" in
zone_sizes_init() before. And obviously the command line option
"crashkernel=" must also exist, so crash_mem_map is also true.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/mmu.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 626ec32873c6c36..7666b4955e45cb3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -528,14 +528,12 @@ static void __init map_mem(pgd_t *pgdp)
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
 
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map) {
-		if (IS_ENABLED(CONFIG_ZONE_DMA) ||
-		    IS_ENABLED(CONFIG_ZONE_DMA32))
-			flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
-		else if (crashk_res.end)
-			memblock_mark_nomap(crashk_res.start,
-			    resource_size(&crashk_res));
-	}
+	if (crash_mem_map &&
+	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
+		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+	else if (crashk_res.end)
+		memblock_mark_nomap(crashk_res.start,
+				    resource_size(&crashk_res));
 #endif
 
 	/* map all the memory banks */
@@ -571,16 +569,13 @@ static void __init map_mem(pgd_t *pgdp)
 	 * through /sys/kernel/kexec_crash_size interface.
 	 */
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map &&
-	    !IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
-		if (crashk_res.end) {
-			__map_memblock(pgdp, crashk_res.start,
-				       crashk_res.end + 1,
-				       PAGE_KERNEL,
-				       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
-			memblock_clear_nomap(crashk_res.start,
-					     resource_size(&crashk_res));
-		}
+	if (crashk_res.end) {
+		__map_memblock(pgdp, crashk_res.start,
+			       crashk_res.end + 1,
+			       PAGE_KERNEL,
+			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
+		memblock_clear_nomap(crashk_res.start,
+				     resource_size(&crashk_res));
 	}
 #endif
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 3/9] arm64: kdump: Remove some redundant checks in map_mem()
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

If the value of crashk_res.end is non-zero in this function, it indicates
that function reserve_crashkernel() has been called under condition
"if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))" in
zone_sizes_init() before. And obviously the command line option
"crashkernel=" must also exist, so crash_mem_map is also true.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/mmu.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 626ec32873c6c36..7666b4955e45cb3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -528,14 +528,12 @@ static void __init map_mem(pgd_t *pgdp)
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
 
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map) {
-		if (IS_ENABLED(CONFIG_ZONE_DMA) ||
-		    IS_ENABLED(CONFIG_ZONE_DMA32))
-			flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
-		else if (crashk_res.end)
-			memblock_mark_nomap(crashk_res.start,
-			    resource_size(&crashk_res));
-	}
+	if (crash_mem_map &&
+	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
+		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+	else if (crashk_res.end)
+		memblock_mark_nomap(crashk_res.start,
+				    resource_size(&crashk_res));
 #endif
 
 	/* map all the memory banks */
@@ -571,16 +569,13 @@ static void __init map_mem(pgd_t *pgdp)
 	 * through /sys/kernel/kexec_crash_size interface.
 	 */
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map &&
-	    !IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
-		if (crashk_res.end) {
-			__map_memblock(pgdp, crashk_res.start,
-				       crashk_res.end + 1,
-				       PAGE_KERNEL,
-				       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
-			memblock_clear_nomap(crashk_res.start,
-					     resource_size(&crashk_res));
-		}
+	if (crashk_res.end) {
+		__map_memblock(pgdp, crashk_res.start,
+			       crashk_res.end + 1,
+			       PAGE_KERNEL,
+			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
+		memblock_clear_nomap(crashk_res.start,
+				     resource_size(&crashk_res));
 	}
 #endif
 }
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 3/9] arm64: kdump: Remove some redundant checks in map_mem()
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

If the value of crashk_res.end is non-zero in this function, it indicates
that function reserve_crashkernel() has been called under condition
"if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))" in
zone_sizes_init() before. And obviously the command line option
"crashkernel=" must also exist, so crash_mem_map is also true.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/mmu.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 626ec32873c6c36..7666b4955e45cb3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -528,14 +528,12 @@ static void __init map_mem(pgd_t *pgdp)
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
 
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map) {
-		if (IS_ENABLED(CONFIG_ZONE_DMA) ||
-		    IS_ENABLED(CONFIG_ZONE_DMA32))
-			flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
-		else if (crashk_res.end)
-			memblock_mark_nomap(crashk_res.start,
-			    resource_size(&crashk_res));
-	}
+	if (crash_mem_map &&
+	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
+		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+	else if (crashk_res.end)
+		memblock_mark_nomap(crashk_res.start,
+				    resource_size(&crashk_res));
 #endif
 
 	/* map all the memory banks */
@@ -571,16 +569,13 @@ static void __init map_mem(pgd_t *pgdp)
 	 * through /sys/kernel/kexec_crash_size interface.
 	 */
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map &&
-	    !IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
-		if (crashk_res.end) {
-			__map_memblock(pgdp, crashk_res.start,
-				       crashk_res.end + 1,
-				       PAGE_KERNEL,
-				       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
-			memblock_clear_nomap(crashk_res.start,
-					     resource_size(&crashk_res));
-		}
+	if (crashk_res.end) {
+		__map_memblock(pgdp, crashk_res.start,
+			       crashk_res.end + 1,
+			       PAGE_KERNEL,
+			       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
+		memblock_clear_nomap(crashk_res.start,
+				     resource_size(&crashk_res));
 	}
 #endif
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

If the crashkernel reservation is deferred, such boundaries are not known
when the linear mapping is created. But its upper limit is fixed, cannot
above 4G. Therefore, unless otherwise required, block mapping should be
used for memory above 4G to improve performance.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/mmu.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7666b4955e45cb3..8c6666cbc7f2216 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -504,7 +504,7 @@ static void __init map_mem(pgd_t *pgdp)
 	phys_addr_t kernel_start = __pa_symbol(_stext);
 	phys_addr_t kernel_end = __pa_symbol(__init_begin);
 	phys_addr_t start, end;
-	int flags = NO_EXEC_MAPPINGS;
+	int flags = NO_EXEC_MAPPINGS, eflags = 0;
 	u64 i;
 
 	/*
@@ -530,7 +530,7 @@ static void __init map_mem(pgd_t *pgdp)
 #ifdef CONFIG_KEXEC_CORE
 	if (crash_mem_map &&
 	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
-		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 	else if (crashk_res.end)
 		memblock_mark_nomap(crashk_res.start,
 				    resource_size(&crashk_res));
@@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
 	for_each_mem_range(i, &start, &end) {
 		if (start >= end)
 			break;
+
+#ifdef CONFIG_KEXEC_CORE
+		if (eflags && (end >= SZ_4G)) {
+			/*
+			 * The memory block cross the 4G boundary.
+			 * Forcibly use page-level mappings for memory under 4G.
+			 */
+			if (start < SZ_4G) {
+				__map_memblock(pgdp, start, SZ_4G - 1,
+					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
+				start  = SZ_4G;
+			}
+
+			/* Page-level mappings is not mandatory for memory above 4G */
+			eflags = 0;
+		}
+#endif
+
 		/*
 		 * The linear map must allow allocation tags reading/writing
 		 * if MTE is present. Otherwise, it has the same attributes as
 		 * PAGE_KERNEL.
 		 */
 		__map_memblock(pgdp, start, end, pgprot_tagged(PAGE_KERNEL),
-			       flags);
+			       flags | eflags);
 	}
 
 	/*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

If the crashkernel reservation is deferred, such boundaries are not known
when the linear mapping is created. But its upper limit is fixed, cannot
above 4G. Therefore, unless otherwise required, block mapping should be
used for memory above 4G to improve performance.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/mmu.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7666b4955e45cb3..8c6666cbc7f2216 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -504,7 +504,7 @@ static void __init map_mem(pgd_t *pgdp)
 	phys_addr_t kernel_start = __pa_symbol(_stext);
 	phys_addr_t kernel_end = __pa_symbol(__init_begin);
 	phys_addr_t start, end;
-	int flags = NO_EXEC_MAPPINGS;
+	int flags = NO_EXEC_MAPPINGS, eflags = 0;
 	u64 i;
 
 	/*
@@ -530,7 +530,7 @@ static void __init map_mem(pgd_t *pgdp)
 #ifdef CONFIG_KEXEC_CORE
 	if (crash_mem_map &&
 	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
-		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 	else if (crashk_res.end)
 		memblock_mark_nomap(crashk_res.start,
 				    resource_size(&crashk_res));
@@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
 	for_each_mem_range(i, &start, &end) {
 		if (start >= end)
 			break;
+
+#ifdef CONFIG_KEXEC_CORE
+		if (eflags && (end >= SZ_4G)) {
+			/*
+			 * The memory block cross the 4G boundary.
+			 * Forcibly use page-level mappings for memory under 4G.
+			 */
+			if (start < SZ_4G) {
+				__map_memblock(pgdp, start, SZ_4G - 1,
+					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
+				start  = SZ_4G;
+			}
+
+			/* Page-level mappings is not mandatory for memory above 4G */
+			eflags = 0;
+		}
+#endif
+
 		/*
 		 * The linear map must allow allocation tags reading/writing
 		 * if MTE is present. Otherwise, it has the same attributes as
 		 * PAGE_KERNEL.
 		 */
 		__map_memblock(pgdp, start, end, pgprot_tagged(PAGE_KERNEL),
-			       flags);
+			       flags | eflags);
 	}
 
 	/*
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

If the crashkernel reservation is deferred, such boundaries are not known
when the linear mapping is created. But its upper limit is fixed, cannot
above 4G. Therefore, unless otherwise required, block mapping should be
used for memory above 4G to improve performance.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/mmu.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7666b4955e45cb3..8c6666cbc7f2216 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -504,7 +504,7 @@ static void __init map_mem(pgd_t *pgdp)
 	phys_addr_t kernel_start = __pa_symbol(_stext);
 	phys_addr_t kernel_end = __pa_symbol(__init_begin);
 	phys_addr_t start, end;
-	int flags = NO_EXEC_MAPPINGS;
+	int flags = NO_EXEC_MAPPINGS, eflags = 0;
 	u64 i;
 
 	/*
@@ -530,7 +530,7 @@ static void __init map_mem(pgd_t *pgdp)
 #ifdef CONFIG_KEXEC_CORE
 	if (crash_mem_map &&
 	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
-		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
+		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 	else if (crashk_res.end)
 		memblock_mark_nomap(crashk_res.start,
 				    resource_size(&crashk_res));
@@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
 	for_each_mem_range(i, &start, &end) {
 		if (start >= end)
 			break;
+
+#ifdef CONFIG_KEXEC_CORE
+		if (eflags && (end >= SZ_4G)) {
+			/*
+			 * The memory block cross the 4G boundary.
+			 * Forcibly use page-level mappings for memory under 4G.
+			 */
+			if (start < SZ_4G) {
+				__map_memblock(pgdp, start, SZ_4G - 1,
+					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
+				start  = SZ_4G;
+			}
+
+			/* Page-level mappings is not mandatory for memory above 4G */
+			eflags = 0;
+		}
+#endif
+
 		/*
 		 * The linear map must allow allocation tags reading/writing
 		 * if MTE is present. Otherwise, it has the same attributes as
 		 * PAGE_KERNEL.
 		 */
 		__map_memblock(pgdp, start, end, pgprot_tagged(PAGE_KERNEL),
-			       flags);
+			       flags | eflags);
 	}
 
 	/*
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

From: Chen Zhou <chenzhou10@huawei.com>

There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is not enough low memory.
2. If reserving crashkernel above 4G, in this case, crash dump
kernel will fail to boot because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X and
introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
in DMA zone, and fall back to high allocation if it fails.
We can also use "crashkernel=X,high" to select a region above DMA zone,
which also tries to allocate at least 256M in DMA zone automatically.
"crashkernel=Y,low" can be used to allocate specified size low memory.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/kernel/machine_kexec.c      |   9 +-
 arch/arm64/kernel/machine_kexec_file.c |  12 ++-
 arch/arm64/mm/init.c                   | 116 +++++++++++++++++++++++--
 3 files changed, 124 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index e16b248699d5c3c..19c2d487cb08feb 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
 
 	/* in reserved memory? */
 	addr = __pfn_to_phys(pfn);
-	if ((addr < crashk_res.start) || (crashk_res.end < addr))
-		return false;
+	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
+		if (!crashk_low_res.end)
+			return false;
+
+		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
+			return false;
+	}
 
 	if (!kexec_crash_image)
 		return true;
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 59c648d51848886..889951291cc0f9c 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
 
 	/* Exclude crashkernel region */
 	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	if (crashk_low_res.end) {
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		if (ret)
+			goto out;
+	}
 
-	if (!ret)
-		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
+	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
 
+out:
 	kfree(cmem);
 	return ret;
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f670bca160992b9..99d5539c13de3b1 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,43 +90,138 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+/* Current arm64 boot protocol requires 2MB alignment */
+#define CRASH_ALIGN			SZ_2M
+
+#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
+#define CRASH_ADDR_HIGH_MAX		memblock.current_limit
+
+/*
+ * This is an empirical value in x86_64 and taken here directly. Please
+ * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
+ * details.
+ */
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE	\
+	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
+
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+	unsigned long long low_base;
+
+	/* passed with crashkernel=0,low ? */
+	if (!low_size)
+		return 0;
+
+	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
+	if (!low_base) {
+		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+		return -ENOMEM;
+	}
+
+	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
+		low_base, low_base + low_size, low_size >> 20);
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end   = low_base + low_size - 1;
+	insert_resource(&iomem_resource, &crashk_low_res);
+
+	return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
  * This function reserves memory area given in "crashkernel=" kernel command
  * line parameter. The memory reserved is used by dump capture kernel when
  * primary kernel is crashing.
+ *
+ * NOTE: Reservation of crashkernel,low is special since its existence
+ * is not independent, need rely on the existence of crashkernel,high.
+ * Here, four cases of crashkernel low memory reservation are summarized:
+ * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
+ *    memory takes Y;
+ * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
+ *    take the default crashkernel low memory size;
+ * 3) crashkernel=X is specified, while fallback to get a memory region
+ *    in high memory, take the default crashkernel low memory size;
+ * 4) crashkernel='invalid value',low is specified, failed the whole
+ *    crashkernel reservation and bail out.
  */
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
-	unsigned long long crash_max = arm64_dma_phys_limit;
+	unsigned long long crash_low_size;
+	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+	bool fixed_base, high = false;
+	char *cmdline = boot_command_line;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
-	/* no crashkernel= or invalid value specified */
-	if (ret || !crash_size)
-		return;
+	if (ret || !crash_size) {
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+
+		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
+		if (ret == -ENOENT)
+			/* case #2 of crashkernel,low reservation */
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+		else if (ret)
+			/* case #4 of crashkernel,low reservation */
+			return;
+
+		high = true;
+		crash_max = CRASH_ADDR_HIGH_MAX;
+	}
 
+	fixed_base = !!crash_base;
 	crash_size = PAGE_ALIGN(crash_size);
 
-	/* User specifies base address explicitly. */
-	if (crash_base)
+	if (fixed_base)
 		crash_max = crash_base + crash_size;
 
-	/* Current arm64 boot protocol requires 2MB alignment */
-	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
+retry:
+	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
 					       crash_base, crash_max);
 	if (!crash_base) {
+		/*
+		 * Attempt to fully allocate low memory failed, fall back
+		 * to high memory, the minimum required low memory will be
+		 * reserved later.
+		 */
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			crash_max = CRASH_ADDR_HIGH_MAX;
+			goto retry;
+		}
+
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
 		return;
 	}
 
+	/*
+	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
+	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
+	 * the 'crash_base' of high memory can not exceed it. To follow the
+	 * description of "crashkernel=X,high" option, add below 'high'
+	 * condition to make sure the crash low memory will be reserved.
+	 */
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+		/* case #3 of crashkernel,low reservation */
+		if (!high)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+
+		if (reserve_crashkernel_low(crash_low_size)) {
+			memblock_phys_free(crash_base, crash_size);
+			return;
+		}
+	}
+
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -135,6 +230,9 @@ static void __init reserve_crashkernel(void)
 	 * map. Inform kmemleak so that it won't try to access it.
 	 */
 	kmemleak_ignore_phys(crash_base);
+	if (crashk_low_res.end)
+		kmemleak_ignore_phys(crashk_low_res.start);
+
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

From: Chen Zhou <chenzhou10@huawei.com>

There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is not enough low memory.
2. If reserving crashkernel above 4G, in this case, crash dump
kernel will fail to boot because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X and
introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
in DMA zone, and fall back to high allocation if it fails.
We can also use "crashkernel=X,high" to select a region above DMA zone,
which also tries to allocate at least 256M in DMA zone automatically.
"crashkernel=Y,low" can be used to allocate specified size low memory.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/kernel/machine_kexec.c      |   9 +-
 arch/arm64/kernel/machine_kexec_file.c |  12 ++-
 arch/arm64/mm/init.c                   | 116 +++++++++++++++++++++++--
 3 files changed, 124 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index e16b248699d5c3c..19c2d487cb08feb 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
 
 	/* in reserved memory? */
 	addr = __pfn_to_phys(pfn);
-	if ((addr < crashk_res.start) || (crashk_res.end < addr))
-		return false;
+	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
+		if (!crashk_low_res.end)
+			return false;
+
+		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
+			return false;
+	}
 
 	if (!kexec_crash_image)
 		return true;
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 59c648d51848886..889951291cc0f9c 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
 
 	/* Exclude crashkernel region */
 	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	if (crashk_low_res.end) {
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		if (ret)
+			goto out;
+	}
 
-	if (!ret)
-		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
+	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
 
+out:
 	kfree(cmem);
 	return ret;
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f670bca160992b9..99d5539c13de3b1 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,43 +90,138 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+/* Current arm64 boot protocol requires 2MB alignment */
+#define CRASH_ALIGN			SZ_2M
+
+#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
+#define CRASH_ADDR_HIGH_MAX		memblock.current_limit
+
+/*
+ * This is an empirical value in x86_64 and taken here directly. Please
+ * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
+ * details.
+ */
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE	\
+	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
+
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+	unsigned long long low_base;
+
+	/* passed with crashkernel=0,low ? */
+	if (!low_size)
+		return 0;
+
+	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
+	if (!low_base) {
+		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+		return -ENOMEM;
+	}
+
+	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
+		low_base, low_base + low_size, low_size >> 20);
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end   = low_base + low_size - 1;
+	insert_resource(&iomem_resource, &crashk_low_res);
+
+	return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
  * This function reserves memory area given in "crashkernel=" kernel command
  * line parameter. The memory reserved is used by dump capture kernel when
  * primary kernel is crashing.
+ *
+ * NOTE: Reservation of crashkernel,low is special since its existence
+ * is not independent, need rely on the existence of crashkernel,high.
+ * Here, four cases of crashkernel low memory reservation are summarized:
+ * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
+ *    memory takes Y;
+ * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
+ *    take the default crashkernel low memory size;
+ * 3) crashkernel=X is specified, while fallback to get a memory region
+ *    in high memory, take the default crashkernel low memory size;
+ * 4) crashkernel='invalid value',low is specified, failed the whole
+ *    crashkernel reservation and bail out.
  */
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
-	unsigned long long crash_max = arm64_dma_phys_limit;
+	unsigned long long crash_low_size;
+	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+	bool fixed_base, high = false;
+	char *cmdline = boot_command_line;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
-	/* no crashkernel= or invalid value specified */
-	if (ret || !crash_size)
-		return;
+	if (ret || !crash_size) {
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+
+		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
+		if (ret == -ENOENT)
+			/* case #2 of crashkernel,low reservation */
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+		else if (ret)
+			/* case #4 of crashkernel,low reservation */
+			return;
+
+		high = true;
+		crash_max = CRASH_ADDR_HIGH_MAX;
+	}
 
+	fixed_base = !!crash_base;
 	crash_size = PAGE_ALIGN(crash_size);
 
-	/* User specifies base address explicitly. */
-	if (crash_base)
+	if (fixed_base)
 		crash_max = crash_base + crash_size;
 
-	/* Current arm64 boot protocol requires 2MB alignment */
-	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
+retry:
+	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
 					       crash_base, crash_max);
 	if (!crash_base) {
+		/*
+		 * Attempt to fully allocate low memory failed, fall back
+		 * to high memory, the minimum required low memory will be
+		 * reserved later.
+		 */
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			crash_max = CRASH_ADDR_HIGH_MAX;
+			goto retry;
+		}
+
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
 		return;
 	}
 
+	/*
+	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
+	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
+	 * the 'crash_base' of high memory can not exceed it. To follow the
+	 * description of "crashkernel=X,high" option, add below 'high'
+	 * condition to make sure the crash low memory will be reserved.
+	 */
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+		/* case #3 of crashkernel,low reservation */
+		if (!high)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+
+		if (reserve_crashkernel_low(crash_low_size)) {
+			memblock_phys_free(crash_base, crash_size);
+			return;
+		}
+	}
+
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -135,6 +230,9 @@ static void __init reserve_crashkernel(void)
 	 * map. Inform kmemleak so that it won't try to access it.
 	 */
 	kmemleak_ignore_phys(crash_base);
+	if (crashk_low_res.end)
+		kmemleak_ignore_phys(crashk_low_res.start);
+
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

From: Chen Zhou <chenzhou10@huawei.com>

There are following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which
will fail when there is not enough low memory.
2. If reserving crashkernel above 4G, in this case, crash dump
kernel will fail to boot because there is no low memory available
for allocation.

To solve these issues, change the behavior of crashkernel=X and
introduce crashkernel=X,[high,low]. crashkernel=X tries low allocation
in DMA zone, and fall back to high allocation if it fails.
We can also use "crashkernel=X,high" to select a region above DMA zone,
which also tries to allocate at least 256M in DMA zone automatically.
"crashkernel=Y,low" can be used to allocate specified size low memory.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/kernel/machine_kexec.c      |   9 +-
 arch/arm64/kernel/machine_kexec_file.c |  12 ++-
 arch/arm64/mm/init.c                   | 116 +++++++++++++++++++++++--
 3 files changed, 124 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index e16b248699d5c3c..19c2d487cb08feb 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
 
 	/* in reserved memory? */
 	addr = __pfn_to_phys(pfn);
-	if ((addr < crashk_res.start) || (crashk_res.end < addr))
-		return false;
+	if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
+		if (!crashk_low_res.end)
+			return false;
+
+		if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
+			return false;
+	}
 
 	if (!kexec_crash_image)
 		return true;
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 59c648d51848886..889951291cc0f9c 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
 
 	/* Exclude crashkernel region */
 	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	if (crashk_low_res.end) {
+		ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
+		if (ret)
+			goto out;
+	}
 
-	if (!ret)
-		ret =  crash_prepare_elf64_headers(cmem, true, addr, sz);
+	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
 
+out:
 	kfree(cmem);
 	return ret;
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f670bca160992b9..99d5539c13de3b1 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,43 +90,138 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+/* Current arm64 boot protocol requires 2MB alignment */
+#define CRASH_ALIGN			SZ_2M
+
+#define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
+#define CRASH_ADDR_HIGH_MAX		memblock.current_limit
+
+/*
+ * This is an empirical value in x86_64 and taken here directly. Please
+ * refer to the code comment in reserve_crashkernel_low() of x86_64 for more
+ * details.
+ */
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE	\
+	max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20)
+
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+	unsigned long long low_base;
+
+	/* passed with crashkernel=0,low ? */
+	if (!low_size)
+		return 0;
+
+	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
+	if (!low_base) {
+		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+		return -ENOMEM;
+	}
+
+	pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
+		low_base, low_base + low_size, low_size >> 20);
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end   = low_base + low_size - 1;
+	insert_resource(&iomem_resource, &crashk_low_res);
+
+	return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
  * This function reserves memory area given in "crashkernel=" kernel command
  * line parameter. The memory reserved is used by dump capture kernel when
  * primary kernel is crashing.
+ *
+ * NOTE: Reservation of crashkernel,low is special since its existence
+ * is not independent, need rely on the existence of crashkernel,high.
+ * Here, four cases of crashkernel low memory reservation are summarized:
+ * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
+ *    memory takes Y;
+ * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
+ *    take the default crashkernel low memory size;
+ * 3) crashkernel=X is specified, while fallback to get a memory region
+ *    in high memory, take the default crashkernel low memory size;
+ * 4) crashkernel='invalid value',low is specified, failed the whole
+ *    crashkernel reservation and bail out.
  */
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
-	unsigned long long crash_max = arm64_dma_phys_limit;
+	unsigned long long crash_low_size;
+	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+	bool fixed_base, high = false;
+	char *cmdline = boot_command_line;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
-	/* no crashkernel= or invalid value specified */
-	if (ret || !crash_size)
-		return;
+	if (ret || !crash_size) {
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+
+		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
+		if (ret == -ENOENT)
+			/* case #2 of crashkernel,low reservation */
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+		else if (ret)
+			/* case #4 of crashkernel,low reservation */
+			return;
+
+		high = true;
+		crash_max = CRASH_ADDR_HIGH_MAX;
+	}
 
+	fixed_base = !!crash_base;
 	crash_size = PAGE_ALIGN(crash_size);
 
-	/* User specifies base address explicitly. */
-	if (crash_base)
+	if (fixed_base)
 		crash_max = crash_base + crash_size;
 
-	/* Current arm64 boot protocol requires 2MB alignment */
-	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
+retry:
+	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
 					       crash_base, crash_max);
 	if (!crash_base) {
+		/*
+		 * Attempt to fully allocate low memory failed, fall back
+		 * to high memory, the minimum required low memory will be
+		 * reserved later.
+		 */
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			crash_max = CRASH_ADDR_HIGH_MAX;
+			goto retry;
+		}
+
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
 		return;
 	}
 
+	/*
+	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
+	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
+	 * the 'crash_base' of high memory can not exceed it. To follow the
+	 * description of "crashkernel=X,high" option, add below 'high'
+	 * condition to make sure the crash low memory will be reserved.
+	 */
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+		/* case #3 of crashkernel,low reservation */
+		if (!high)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+
+		if (reserve_crashkernel_low(crash_low_size)) {
+			memblock_phys_free(crash_base, crash_size);
+			return;
+		}
+	}
+
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -135,6 +230,9 @@ static void __init reserve_crashkernel(void)
 	 * map. Inform kmemleak so that it won't try to access it.
 	 */
 	kmemleak_ignore_phys(crash_base);
+	if (crashk_low_res.end)
+		kmemleak_ignore_phys(crashk_low_res.start);
+
 	crashk_res.start = crash_base;
 	crashk_res.end = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 6/9] arm64: kdump: Use page-level mapping for the high memory of crashkernel
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

If the crashkernel has both high memory above 4G and low memory under 4G,
kexec always loads the content such as Imge and dtb to the high memory
instead of the low memory. This means that only high memory requires write
protection based on page-level mapping. The allocation of high memory does
not depend on the DMA boundary. So we can reserve the high memory first
even if the crashkernel reservation is deferred.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/init.c | 84 ++++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/mmu.c  |  3 +-
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 99d5539c13de3b1..b1b40b900fae170 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,6 +90,9 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+static bool crash_high_mem_reserved __initdata;
+static struct resource crashk_res_high;
+
 /* Current arm64 boot protocol requires 2MB alignment */
 #define CRASH_ALIGN			SZ_2M
 
@@ -128,6 +131,66 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 	return 0;
 }
 
+static void __init reserve_crashkernel_high(void)
+{
+	unsigned long long crash_base, crash_size;
+	char *cmdline = boot_command_line;
+	int ret;
+
+	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
+		return;
+
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
+				&crash_size, &crash_base);
+	if (ret || !crash_size) {
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+	}
+
+	crash_size = PAGE_ALIGN(crash_size);
+
+	/*
+	 * For the case crashkernel=X, may fall back to reserve memory above
+	 * 4G, make reservations here in advance. It will be released later if
+	 * the region is successfully reserved under 4G.
+	 */
+	if (!crash_base) {
+		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
+						       crash_base, CRASH_ADDR_HIGH_MAX);
+		if (!crash_base)
+			return;
+
+		crash_high_mem_reserved = true;
+	}
+
+	/* Mark the memory range that requires page-level mappings */
+	crashk_res.start = crash_base;
+	crashk_res.end   = crash_base + crash_size - 1;
+}
+
+static void __init hand_over_reserved_high_mem(void)
+{
+	crashk_res_high.start = crashk_res.start;
+	crashk_res_high.end   = crashk_res.end;
+
+	crashk_res.start = 0;
+	crashk_res.end   = 0;
+}
+
+static void __init take_reserved_high_mem(unsigned long long *crash_base,
+					  unsigned long long *crash_size)
+{
+	*crash_base = crashk_res_high.start;
+	*crash_size = resource_size(&crashk_res_high);
+}
+
+static void __init free_reserved_high_mem(void)
+{
+	memblock_phys_free(crashk_res_high.start, resource_size(&crashk_res_high));
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -159,6 +222,8 @@ static void __init reserve_crashkernel(void)
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
+	hand_over_reserved_high_mem();
+
 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
@@ -177,6 +242,11 @@ static void __init reserve_crashkernel(void)
 
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
+
+		if (crash_high_mem_reserved) {
+			take_reserved_high_mem(&crash_base, &crash_size);
+			goto reserve_low;
+		}
 	}
 
 	fixed_base = !!crash_base;
@@ -195,6 +265,11 @@ static void __init reserve_crashkernel(void)
 		 * reserved later.
 		 */
 		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			if (crash_high_mem_reserved) {
+				take_reserved_high_mem(&crash_base, &crash_size);
+				goto reserve_low;
+			}
+
 			crash_max = CRASH_ADDR_HIGH_MAX;
 			goto retry;
 		}
@@ -212,6 +287,7 @@ static void __init reserve_crashkernel(void)
 	 * condition to make sure the crash low memory will be reserved.
 	 */
 	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+reserve_low:
 		/* case #3 of crashkernel,low reservation */
 		if (!high)
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
@@ -220,6 +296,12 @@ static void __init reserve_crashkernel(void)
 			memblock_phys_free(crash_base, crash_size);
 			return;
 		}
+	} else if (crash_high_mem_reserved) {
+		/*
+		 * The crash memory is successfully allocated under 4G, and the
+		 * previously reserved high memory is no longer required.
+		 */
+		free_reserved_high_mem();
 	}
 
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
@@ -437,6 +519,8 @@ void __init arm64_memblock_init(void)
 
 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
 		reserve_crashkernel();
+	else
+		reserve_crashkernel_high();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8c6666cbc7f2216..f84eca55b103d0c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -531,7 +531,8 @@ static void __init map_mem(pgd_t *pgdp)
 	if (crash_mem_map &&
 	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
 		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
-	else if (crashk_res.end)
+
+	if (crashk_res.end)
 		memblock_mark_nomap(crashk_res.start,
 				    resource_size(&crashk_res));
 #endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 6/9] arm64: kdump: Use page-level mapping for the high memory of crashkernel
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

If the crashkernel has both high memory above 4G and low memory under 4G,
kexec always loads the content such as Imge and dtb to the high memory
instead of the low memory. This means that only high memory requires write
protection based on page-level mapping. The allocation of high memory does
not depend on the DMA boundary. So we can reserve the high memory first
even if the crashkernel reservation is deferred.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/init.c | 84 ++++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/mmu.c  |  3 +-
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 99d5539c13de3b1..b1b40b900fae170 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,6 +90,9 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+static bool crash_high_mem_reserved __initdata;
+static struct resource crashk_res_high;
+
 /* Current arm64 boot protocol requires 2MB alignment */
 #define CRASH_ALIGN			SZ_2M
 
@@ -128,6 +131,66 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 	return 0;
 }
 
+static void __init reserve_crashkernel_high(void)
+{
+	unsigned long long crash_base, crash_size;
+	char *cmdline = boot_command_line;
+	int ret;
+
+	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
+		return;
+
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
+				&crash_size, &crash_base);
+	if (ret || !crash_size) {
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+	}
+
+	crash_size = PAGE_ALIGN(crash_size);
+
+	/*
+	 * For the case crashkernel=X, may fall back to reserve memory above
+	 * 4G, make reservations here in advance. It will be released later if
+	 * the region is successfully reserved under 4G.
+	 */
+	if (!crash_base) {
+		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
+						       crash_base, CRASH_ADDR_HIGH_MAX);
+		if (!crash_base)
+			return;
+
+		crash_high_mem_reserved = true;
+	}
+
+	/* Mark the memory range that requires page-level mappings */
+	crashk_res.start = crash_base;
+	crashk_res.end   = crash_base + crash_size - 1;
+}
+
+static void __init hand_over_reserved_high_mem(void)
+{
+	crashk_res_high.start = crashk_res.start;
+	crashk_res_high.end   = crashk_res.end;
+
+	crashk_res.start = 0;
+	crashk_res.end   = 0;
+}
+
+static void __init take_reserved_high_mem(unsigned long long *crash_base,
+					  unsigned long long *crash_size)
+{
+	*crash_base = crashk_res_high.start;
+	*crash_size = resource_size(&crashk_res_high);
+}
+
+static void __init free_reserved_high_mem(void)
+{
+	memblock_phys_free(crashk_res_high.start, resource_size(&crashk_res_high));
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -159,6 +222,8 @@ static void __init reserve_crashkernel(void)
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
+	hand_over_reserved_high_mem();
+
 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
@@ -177,6 +242,11 @@ static void __init reserve_crashkernel(void)
 
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
+
+		if (crash_high_mem_reserved) {
+			take_reserved_high_mem(&crash_base, &crash_size);
+			goto reserve_low;
+		}
 	}
 
 	fixed_base = !!crash_base;
@@ -195,6 +265,11 @@ static void __init reserve_crashkernel(void)
 		 * reserved later.
 		 */
 		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			if (crash_high_mem_reserved) {
+				take_reserved_high_mem(&crash_base, &crash_size);
+				goto reserve_low;
+			}
+
 			crash_max = CRASH_ADDR_HIGH_MAX;
 			goto retry;
 		}
@@ -212,6 +287,7 @@ static void __init reserve_crashkernel(void)
 	 * condition to make sure the crash low memory will be reserved.
 	 */
 	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+reserve_low:
 		/* case #3 of crashkernel,low reservation */
 		if (!high)
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
@@ -220,6 +296,12 @@ static void __init reserve_crashkernel(void)
 			memblock_phys_free(crash_base, crash_size);
 			return;
 		}
+	} else if (crash_high_mem_reserved) {
+		/*
+		 * The crash memory is successfully allocated under 4G, and the
+		 * previously reserved high memory is no longer required.
+		 */
+		free_reserved_high_mem();
 	}
 
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
@@ -437,6 +519,8 @@ void __init arm64_memblock_init(void)
 
 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
 		reserve_crashkernel();
+	else
+		reserve_crashkernel_high();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8c6666cbc7f2216..f84eca55b103d0c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -531,7 +531,8 @@ static void __init map_mem(pgd_t *pgdp)
 	if (crash_mem_map &&
 	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
 		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
-	else if (crashk_res.end)
+
+	if (crashk_res.end)
 		memblock_mark_nomap(crashk_res.start,
 				    resource_size(&crashk_res));
 #endif
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 6/9] arm64: kdump: Use page-level mapping for the high memory of crashkernel
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

If the crashkernel has both high memory above 4G and low memory under 4G,
kexec always loads the content such as Imge and dtb to the high memory
instead of the low memory. This means that only high memory requires write
protection based on page-level mapping. The allocation of high memory does
not depend on the DMA boundary. So we can reserve the high memory first
even if the crashkernel reservation is deferred.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/mm/init.c | 84 ++++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/mmu.c  |  3 +-
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 99d5539c13de3b1..b1b40b900fae170 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,6 +90,9 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+static bool crash_high_mem_reserved __initdata;
+static struct resource crashk_res_high;
+
 /* Current arm64 boot protocol requires 2MB alignment */
 #define CRASH_ALIGN			SZ_2M
 
@@ -128,6 +131,66 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 	return 0;
 }
 
+static void __init reserve_crashkernel_high(void)
+{
+	unsigned long long crash_base, crash_size;
+	char *cmdline = boot_command_line;
+	int ret;
+
+	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
+		return;
+
+	/* crashkernel=X[@offset] */
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
+				&crash_size, &crash_base);
+	if (ret || !crash_size) {
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+	}
+
+	crash_size = PAGE_ALIGN(crash_size);
+
+	/*
+	 * For the case crashkernel=X, may fall back to reserve memory above
+	 * 4G, make reservations here in advance. It will be released later if
+	 * the region is successfully reserved under 4G.
+	 */
+	if (!crash_base) {
+		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
+						       crash_base, CRASH_ADDR_HIGH_MAX);
+		if (!crash_base)
+			return;
+
+		crash_high_mem_reserved = true;
+	}
+
+	/* Mark the memory range that requires page-level mappings */
+	crashk_res.start = crash_base;
+	crashk_res.end   = crash_base + crash_size - 1;
+}
+
+static void __init hand_over_reserved_high_mem(void)
+{
+	crashk_res_high.start = crashk_res.start;
+	crashk_res_high.end   = crashk_res.end;
+
+	crashk_res.start = 0;
+	crashk_res.end   = 0;
+}
+
+static void __init take_reserved_high_mem(unsigned long long *crash_base,
+					  unsigned long long *crash_size)
+{
+	*crash_base = crashk_res_high.start;
+	*crash_size = resource_size(&crashk_res_high);
+}
+
+static void __init free_reserved_high_mem(void)
+{
+	memblock_phys_free(crashk_res_high.start, resource_size(&crashk_res_high));
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -159,6 +222,8 @@ static void __init reserve_crashkernel(void)
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
+	hand_over_reserved_high_mem();
+
 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
@@ -177,6 +242,11 @@ static void __init reserve_crashkernel(void)
 
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
+
+		if (crash_high_mem_reserved) {
+			take_reserved_high_mem(&crash_base, &crash_size);
+			goto reserve_low;
+		}
 	}
 
 	fixed_base = !!crash_base;
@@ -195,6 +265,11 @@ static void __init reserve_crashkernel(void)
 		 * reserved later.
 		 */
 		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+			if (crash_high_mem_reserved) {
+				take_reserved_high_mem(&crash_base, &crash_size);
+				goto reserve_low;
+			}
+
 			crash_max = CRASH_ADDR_HIGH_MAX;
 			goto retry;
 		}
@@ -212,6 +287,7 @@ static void __init reserve_crashkernel(void)
 	 * condition to make sure the crash low memory will be reserved.
 	 */
 	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+reserve_low:
 		/* case #3 of crashkernel,low reservation */
 		if (!high)
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
@@ -220,6 +296,12 @@ static void __init reserve_crashkernel(void)
 			memblock_phys_free(crash_base, crash_size);
 			return;
 		}
+	} else if (crash_high_mem_reserved) {
+		/*
+		 * The crash memory is successfully allocated under 4G, and the
+		 * previously reserved high memory is no longer required.
+		 */
+		free_reserved_high_mem();
 	}
 
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
@@ -437,6 +519,8 @@ void __init arm64_memblock_init(void)
 
 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
 		reserve_crashkernel();
+	else
+		reserve_crashkernel_high();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8c6666cbc7f2216..f84eca55b103d0c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -531,7 +531,8 @@ static void __init map_mem(pgd_t *pgdp)
 	if (crash_mem_map &&
 	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
 		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
-	else if (crashk_res.end)
+
+	if (crashk_res.end)
 		memblock_mark_nomap(crashk_res.start,
 				    resource_size(&crashk_res));
 #endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 7/9] arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

For the case crashkernel=X@offset and crashkernel=X,high, we've explicitly
used 'crashk_res' to mark the scope of the page-level mapping required, so
NO_BLOCK_MAPPINGS should not be required for other areas. Otherwise,
system performance will be affected. In fact, only the case crashkernel=X
requires page-level mapping for all low memory under 4G because it
attempts high memory after it fails to request low memory first, and we
cannot predict its final location.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/include/asm/kexec.h |  2 ++
 arch/arm64/mm/init.c           |  3 +++
 arch/arm64/mm/mmu.c            | 18 +-----------------
 3 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 9839bfc163d7147..8caf64065383844 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -80,6 +80,8 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
 	}
 }
 
+extern bool crash_low_mem_page_map;
+
 #if defined(CONFIG_KEXEC_CORE) && defined(CONFIG_HIBERNATION)
 extern bool crash_is_nosave(unsigned long pfn);
 extern void crash_prepare_suspend(void);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b1b40b900fae170..d9676e30f9b657a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,6 +90,7 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+bool crash_low_mem_page_map __initdata;
 static bool crash_high_mem_reserved __initdata;
 static struct resource crashk_res_high;
 
@@ -147,6 +148,8 @@ static void __init reserve_crashkernel_high(void)
 		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
 		if (ret || !crash_size)
 			return;
+	} else if (!crash_base) {
+		crash_low_mem_page_map = true;
 	}
 
 	crash_size = PAGE_ALIGN(crash_size);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index f84eca55b103d0c..56a973cb4c9cae6 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -483,21 +483,6 @@ void __init mark_linear_text_alias_ro(void)
 			    PAGE_KERNEL_RO);
 }
 
-static bool crash_mem_map __initdata;
-
-static int __init enable_crash_mem_map(char *arg)
-{
-	/*
-	 * Proper parameter parsing is done by reserve_crashkernel(). We only
-	 * need to know if the linear map has to avoid block mappings so that
-	 * the crashkernel reservations can be unmapped later.
-	 */
-	crash_mem_map = true;
-
-	return 0;
-}
-early_param("crashkernel", enable_crash_mem_map);
-
 static void __init map_mem(pgd_t *pgdp)
 {
 	static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
@@ -528,8 +513,7 @@ static void __init map_mem(pgd_t *pgdp)
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
 
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map &&
-	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
+	if (crash_low_mem_page_map)
 		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
 	if (crashk_res.end)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 7/9] arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

For the case crashkernel=X@offset and crashkernel=X,high, we've explicitly
used 'crashk_res' to mark the scope of the page-level mapping required, so
NO_BLOCK_MAPPINGS should not be required for other areas. Otherwise,
system performance will be affected. In fact, only the case crashkernel=X
requires page-level mapping for all low memory under 4G because it
attempts high memory after it fails to request low memory first, and we
cannot predict its final location.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/include/asm/kexec.h |  2 ++
 arch/arm64/mm/init.c           |  3 +++
 arch/arm64/mm/mmu.c            | 18 +-----------------
 3 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 9839bfc163d7147..8caf64065383844 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -80,6 +80,8 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
 	}
 }
 
+extern bool crash_low_mem_page_map;
+
 #if defined(CONFIG_KEXEC_CORE) && defined(CONFIG_HIBERNATION)
 extern bool crash_is_nosave(unsigned long pfn);
 extern void crash_prepare_suspend(void);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b1b40b900fae170..d9676e30f9b657a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,6 +90,7 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+bool crash_low_mem_page_map __initdata;
 static bool crash_high_mem_reserved __initdata;
 static struct resource crashk_res_high;
 
@@ -147,6 +148,8 @@ static void __init reserve_crashkernel_high(void)
 		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
 		if (ret || !crash_size)
 			return;
+	} else if (!crash_base) {
+		crash_low_mem_page_map = true;
 	}
 
 	crash_size = PAGE_ALIGN(crash_size);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index f84eca55b103d0c..56a973cb4c9cae6 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -483,21 +483,6 @@ void __init mark_linear_text_alias_ro(void)
 			    PAGE_KERNEL_RO);
 }
 
-static bool crash_mem_map __initdata;
-
-static int __init enable_crash_mem_map(char *arg)
-{
-	/*
-	 * Proper parameter parsing is done by reserve_crashkernel(). We only
-	 * need to know if the linear map has to avoid block mappings so that
-	 * the crashkernel reservations can be unmapped later.
-	 */
-	crash_mem_map = true;
-
-	return 0;
-}
-early_param("crashkernel", enable_crash_mem_map);
-
 static void __init map_mem(pgd_t *pgdp)
 {
 	static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
@@ -528,8 +513,7 @@ static void __init map_mem(pgd_t *pgdp)
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
 
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map &&
-	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
+	if (crash_low_mem_page_map)
 		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
 	if (crashk_res.end)
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 7/9] arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

For the case crashkernel=X at offset and crashkernel=X,high, we've explicitly
used 'crashk_res' to mark the scope of the page-level mapping required, so
NO_BLOCK_MAPPINGS should not be required for other areas. Otherwise,
system performance will be affected. In fact, only the case crashkernel=X
requires page-level mapping for all low memory under 4G because it
attempts high memory after it fails to request low memory first, and we
cannot predict its final location.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 arch/arm64/include/asm/kexec.h |  2 ++
 arch/arm64/mm/init.c           |  3 +++
 arch/arm64/mm/mmu.c            | 18 +-----------------
 3 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 9839bfc163d7147..8caf64065383844 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -80,6 +80,8 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
 	}
 }
 
+extern bool crash_low_mem_page_map;
+
 #if defined(CONFIG_KEXEC_CORE) && defined(CONFIG_HIBERNATION)
 extern bool crash_is_nosave(unsigned long pfn);
 extern void crash_prepare_suspend(void);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b1b40b900fae170..d9676e30f9b657a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -90,6 +90,7 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
 #endif
 
+bool crash_low_mem_page_map __initdata;
 static bool crash_high_mem_reserved __initdata;
 static struct resource crashk_res_high;
 
@@ -147,6 +148,8 @@ static void __init reserve_crashkernel_high(void)
 		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
 		if (ret || !crash_size)
 			return;
+	} else if (!crash_base) {
+		crash_low_mem_page_map = true;
 	}
 
 	crash_size = PAGE_ALIGN(crash_size);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index f84eca55b103d0c..56a973cb4c9cae6 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -483,21 +483,6 @@ void __init mark_linear_text_alias_ro(void)
 			    PAGE_KERNEL_RO);
 }
 
-static bool crash_mem_map __initdata;
-
-static int __init enable_crash_mem_map(char *arg)
-{
-	/*
-	 * Proper parameter parsing is done by reserve_crashkernel(). We only
-	 * need to know if the linear map has to avoid block mappings so that
-	 * the crashkernel reservations can be unmapped later.
-	 */
-	crash_mem_map = true;
-
-	return 0;
-}
-early_param("crashkernel", enable_crash_mem_map);
-
 static void __init map_mem(pgd_t *pgdp)
 {
 	static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
@@ -528,8 +513,7 @@ static void __init map_mem(pgd_t *pgdp)
 	memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
 
 #ifdef CONFIG_KEXEC_CORE
-	if (crash_mem_map &&
-	    (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)))
+	if (crash_low_mem_page_map)
 		eflags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
 	if (crashk_res.end)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 8/9] of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

From: Chen Zhou <chenzhou10@huawei.com>

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to crash dump kernel via DT property
under /chosen,
        linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

We reused the DT property linux,usable-memory-range and made the low
memory region as the second range "BASE2 SIZE2", which keeps compatibility
with existing user-space and older kdump kernels.

Crash dump kernel reads this property at boot time and call memblock_add()
to add the low memory region after memblock_cap_memory_range() has been
called.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
---
 drivers/of/fdt.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index ec315b060cd50d2..2f248d0acc04830 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
 
 static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
 
+/*
+ * The main usage of linux,usable-memory-range is for crash dump kernel.
+ * Originally, the number of usable-memory regions is one. Now there may
+ * be two regions, low region and high region.
+ * To make compatibility with existing user-space and older kdump, the low
+ * region is always the last range of linux,usable-memory-range if exist.
+ */
+#define MAX_USABLE_RANGES		2
+
 /**
  * early_init_dt_check_for_usable_mem_range - Decode usable memory range
  * location from flat tree
  */
 void __init early_init_dt_check_for_usable_mem_range(void)
 {
-	const __be32 *prop;
-	int len;
-	phys_addr_t cap_mem_addr;
-	phys_addr_t cap_mem_size;
+	struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
+	const __be32 *prop, *endp;
+	int len, i;
 	unsigned long node = chosen_node_offset;
 
 	if ((long)node < 0)
@@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
 	pr_debug("Looking for usable-memory-range property... ");
 
 	prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
-	if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
+	if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
 		return;
 
-	cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
-	cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
+	endp = prop + (len / sizeof(__be32));
+	for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
+		rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
+		rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
 
-	pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
-		 &cap_mem_size);
+		pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
+			 i, &rgn[i].base, &rgn[i].size);
+	}
 
-	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+	memblock_cap_memory_range(rgn[0].base, rgn[0].size);
+	for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
+		memblock_add(rgn[i].base, rgn[i].size);
 }
 
 #ifdef CONFIG_SERIAL_EARLYCON
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 8/9] of: fdt: Add memory for devices by DT property "linux, usable-memory-range"
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

From: Chen Zhou <chenzhou10@huawei.com>

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to crash dump kernel via DT property
under /chosen,
        linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

We reused the DT property linux,usable-memory-range and made the low
memory region as the second range "BASE2 SIZE2", which keeps compatibility
with existing user-space and older kdump kernels.

Crash dump kernel reads this property at boot time and call memblock_add()
to add the low memory region after memblock_cap_memory_range() has been
called.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
---
 drivers/of/fdt.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index ec315b060cd50d2..2f248d0acc04830 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
 
 static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
 
+/*
+ * The main usage of linux,usable-memory-range is for crash dump kernel.
+ * Originally, the number of usable-memory regions is one. Now there may
+ * be two regions, low region and high region.
+ * To make compatibility with existing user-space and older kdump, the low
+ * region is always the last range of linux,usable-memory-range if exist.
+ */
+#define MAX_USABLE_RANGES		2
+
 /**
  * early_init_dt_check_for_usable_mem_range - Decode usable memory range
  * location from flat tree
  */
 void __init early_init_dt_check_for_usable_mem_range(void)
 {
-	const __be32 *prop;
-	int len;
-	phys_addr_t cap_mem_addr;
-	phys_addr_t cap_mem_size;
+	struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
+	const __be32 *prop, *endp;
+	int len, i;
 	unsigned long node = chosen_node_offset;
 
 	if ((long)node < 0)
@@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
 	pr_debug("Looking for usable-memory-range property... ");
 
 	prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
-	if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
+	if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
 		return;
 
-	cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
-	cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
+	endp = prop + (len / sizeof(__be32));
+	for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
+		rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
+		rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
 
-	pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
-		 &cap_mem_size);
+		pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
+			 i, &rgn[i].base, &rgn[i].size);
+	}
 
-	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+	memblock_cap_memory_range(rgn[0].base, rgn[0].size);
+	for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
+		memblock_add(rgn[i].base, rgn[i].size);
 }
 
 #ifdef CONFIG_SERIAL_EARLYCON
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 8/9] of: fdt: Add memory for devices by DT property "linux, usable-memory-range"
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

From: Chen Zhou <chenzhou10@huawei.com>

When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to crash dump kernel via DT property
under /chosen,
        linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

We reused the DT property linux,usable-memory-range and made the low
memory region as the second range "BASE2 SIZE2", which keeps compatibility
with existing user-space and older kdump kernels.

Crash dump kernel reads this property at boot time and call memblock_add()
to add the low memory region after memblock_cap_memory_range() has been
called.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
---
 drivers/of/fdt.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index ec315b060cd50d2..2f248d0acc04830 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
 
 static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
 
+/*
+ * The main usage of linux,usable-memory-range is for crash dump kernel.
+ * Originally, the number of usable-memory regions is one. Now there may
+ * be two regions, low region and high region.
+ * To make compatibility with existing user-space and older kdump, the low
+ * region is always the last range of linux,usable-memory-range if exist.
+ */
+#define MAX_USABLE_RANGES		2
+
 /**
  * early_init_dt_check_for_usable_mem_range - Decode usable memory range
  * location from flat tree
  */
 void __init early_init_dt_check_for_usable_mem_range(void)
 {
-	const __be32 *prop;
-	int len;
-	phys_addr_t cap_mem_addr;
-	phys_addr_t cap_mem_size;
+	struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
+	const __be32 *prop, *endp;
+	int len, i;
 	unsigned long node = chosen_node_offset;
 
 	if ((long)node < 0)
@@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
 	pr_debug("Looking for usable-memory-range property... ");
 
 	prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
-	if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
+	if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
 		return;
 
-	cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
-	cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
+	endp = prop + (len / sizeof(__be32));
+	for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
+		rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
+		rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
 
-	pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
-		 &cap_mem_size);
+		pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
+			 i, &rgn[i].base, &rgn[i].size);
+	}
 
-	memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
+	memblock_cap_memory_range(rgn[0].base, rgn[0].size);
+	for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
+		memblock_add(rgn[i].base, rgn[i].size);
 }
 
 #ifdef CONFIG_SERIAL_EARLYCON
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 9/9] docs: kdump: Update the crashkernel description for arm64
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-14 11:57   ` Zhen Lei
  -1 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Now arm64 has added support for "crashkernel=X,high" and
"crashkernel=Y,low", and implements "crashkernel=X[@offset]" in the
same way as x86. So update the Documentation.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3f1cc5e317ed4a5..ae0aa63ffe82f59 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -795,7 +795,7 @@
 			memory region [offset, offset + size] for that kernel
 			image. If '@offset' is omitted, then a suitable offset
 			is selected automatically.
-			[KNL, X86-64] Select a region under 4G first, and
+			[KNL, X86-64, ARM64] Select a region under 4G first, and
 			fall back to reserve region above 4G when '@offset'
 			hasn't been specified.
 			See Documentation/admin-guide/kdump/kdump.rst for further details.
@@ -808,20 +808,20 @@
 			Documentation/admin-guide/kdump/kdump.rst for an example.
 
 	crashkernel=size[KMG],high
-			[KNL, X86-64] range could be above 4G. Allow kernel
+			[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
 			to allocate physical memory region from top, so could
 			be above 4G if system have more than 4G ram installed.
 			Otherwise memory region will be allocated below 4G, if
 			available.
 			It will be ignored if crashkernel=X is specified.
 	crashkernel=size[KMG],low
-			[KNL, X86-64] range under 4G. When crashkernel=X,high
+			[KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high
 			is passed, kernel could allocate physical memory region
 			above 4G, that cause second kernel crash on system
 			that require some amount of low memory, e.g. swiotlb
 			requires at least 64M+32K low memory, also enough extra
 			low memory is needed to make sure DMA buffers for 32-bit
-			devices won't run out. Kernel would try to allocate at
+			devices won't run out. Kernel would try to allocate
 			at least 256M below 4G automatically.
 			This one let user to specify own low range under 4G
 			for second kernel instead.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 9/9] docs: kdump: Update the crashkernel description for arm64
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Zhen Lei, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Now arm64 has added support for "crashkernel=X,high" and
"crashkernel=Y,low", and implements "crashkernel=X[@offset]" in the
same way as x86. So update the Documentation.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3f1cc5e317ed4a5..ae0aa63ffe82f59 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -795,7 +795,7 @@
 			memory region [offset, offset + size] for that kernel
 			image. If '@offset' is omitted, then a suitable offset
 			is selected automatically.
-			[KNL, X86-64] Select a region under 4G first, and
+			[KNL, X86-64, ARM64] Select a region under 4G first, and
 			fall back to reserve region above 4G when '@offset'
 			hasn't been specified.
 			See Documentation/admin-guide/kdump/kdump.rst for further details.
@@ -808,20 +808,20 @@
 			Documentation/admin-guide/kdump/kdump.rst for an example.
 
 	crashkernel=size[KMG],high
-			[KNL, X86-64] range could be above 4G. Allow kernel
+			[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
 			to allocate physical memory region from top, so could
 			be above 4G if system have more than 4G ram installed.
 			Otherwise memory region will be allocated below 4G, if
 			available.
 			It will be ignored if crashkernel=X is specified.
 	crashkernel=size[KMG],low
-			[KNL, X86-64] range under 4G. When crashkernel=X,high
+			[KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high
 			is passed, kernel could allocate physical memory region
 			above 4G, that cause second kernel crash on system
 			that require some amount of low memory, e.g. swiotlb
 			requires at least 64M+32K low memory, also enough extra
 			low memory is needed to make sure DMA buffers for 32-bit
-			devices won't run out. Kernel would try to allocate at
+			devices won't run out. Kernel would try to allocate
 			at least 256M below 4G automatically.
 			This one let user to specify own low range under 4G
 			for second kernel instead.
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 9/9] docs: kdump: Update the crashkernel description for arm64
@ 2022-04-14 11:57   ` Zhen Lei
  0 siblings, 0 replies; 102+ messages in thread
From: Zhen Lei @ 2022-04-14 11:57 UTC (permalink / raw)
  To: kexec

Now arm64 has added support for "crashkernel=X,high" and
"crashkernel=Y,low", and implements "crashkernel=X[@offset]" in the
same way as x86. So update the Documentation.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3f1cc5e317ed4a5..ae0aa63ffe82f59 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -795,7 +795,7 @@
 			memory region [offset, offset + size] for that kernel
 			image. If '@offset' is omitted, then a suitable offset
 			is selected automatically.
-			[KNL, X86-64] Select a region under 4G first, and
+			[KNL, X86-64, ARM64] Select a region under 4G first, and
 			fall back to reserve region above 4G when '@offset'
 			hasn't been specified.
 			See Documentation/admin-guide/kdump/kdump.rst for further details.
@@ -808,20 +808,20 @@
 			Documentation/admin-guide/kdump/kdump.rst for an example.
 
 	crashkernel=size[KMG],high
-			[KNL, X86-64] range could be above 4G. Allow kernel
+			[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
 			to allocate physical memory region from top, so could
 			be above 4G if system have more than 4G ram installed.
 			Otherwise memory region will be allocated below 4G, if
 			available.
 			It will be ignored if crashkernel=X is specified.
 	crashkernel=size[KMG],low
-			[KNL, X86-64] range under 4G. When crashkernel=X,high
+			[KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high
 			is passed, kernel could allocate physical memory region
 			above 4G, that cause second kernel crash on system
 			that require some amount of low memory, e.g. swiotlb
 			requires at least 64M+32K low memory, also enough extra
 			low memory is needed to make sure DMA buffers for 32-bit
-			devices won't run out. Kernel would try to allocate at
+			devices won't run out. Kernel would try to allocate
 			at least 256M below 4G automatically.
 			This one let user to specify own low range under 4G
 			for second kernel instead.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-19 17:02   ` Dave Kleikamp
  -1 siblings, 0 replies; 102+ messages in thread
From: Dave Kleikamp @ 2022-04-19 17:02 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, John Donnelly

For the series:

Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>

Thanks for keeping this patch set alive.

On 4/14/22 6:57AM, Zhen Lei wrote:
> Changes since [v21]:
> 1. Update the commit message of  patch 1 and 5.
> 2. Add some comments for reserve_crashkernel() in patch 5.
> 
> Thanks to Baoquan He and John Donnelly for their review comments.
> 
> Because v5.18-rc1 has added a new patch
> commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> There are many new scenarios:
> 1) The mappings may be block or page-level.
> 2) The call to reserve_crashkernel() may or may not be deferred.
> 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
>     upper limit of physical memory, because SMMU can do the mapping.
> 
> The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> The code of patch 5 only makes the following changes:
> -	if (crash_base >= SZ_4G)
> +	/*
> +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> +	 * the 'crash_base' of high memory can not exceed it. To follow the
> +	 * description of "crashkernel=X,high" option, add below 'high'
> +	 * condition to make sure the crash low memory will be reserved.
> +	 */
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> 
> Patch 3-4 to allow block mappings for memory above 4G.
> Patch 6-7 to support only crash high memory or fixed memory range specified by
> crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
> These four patches are for performance optimization purposes. For details about the
> technical feasibility analysis, please see the commit messages.
> 
> Now the implementation of arm64 is very different from that of x86. It's no longer
> suitable for both of them to share code.
> 
> 
> 
> Changes since [v20]:
> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> 
> Thanks to Baoquan He for his careful review.
> 
> The test cases are as follows: (Please update the kexec tool to the latest version)
> 1) crashkernel=4G						//high=4G, low=256M
> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> 6) crashkernel=512M						//high=0G, low=512M
> 7) crashkernel=128M						//high=0G, low=128M
> 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> 9) crashkernel=4G,high						//high=4G, low=256M
> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> c) crashkernel=128M,high					//high=128M, low=256M
> d) crashkernel=512M,low						//high=0G, low=0M, invalid
> e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> 
> 
> Changes since [v19]:
> 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
>     lot of details need to be considered, which can take a long time. Because
>     "make generic" does not add new functions and does not improve performance,
>     maybe I should say it's just a cleanup. So by stripping it out and leaving
>     it for other patches later, we can aggregate the changes to the main functions.
> 2. Use insert_resource() to replace request_resource(), this not only simplifies
>     the code, but also reduces the differences between arm64 and x86 implementations.
> 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
>     x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
>     https://www.spinics.net/lists/kexec/msg28284.html
> 
> Thank you very much, Borislav Petkov, for so many valuable comments.
> 
> Changes since [v17]: v17 --> v19
> 1. Patch 0001-0004
>     Introduce generic parse_crashkernel_high_low() to bring the parsing of
>     "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
>     then use it instead of the call to parse_crashkernel_{high|low}(). Two
>     confusing parameters of parse_crashkernel_{high|low}() are deleted.
> 
>     I previously sent these four patches separately:
>     [1] https://lkml.org/lkml/2021/12/25/40
> 2. Patch 0005-0009
>     Introduce generic reserve_crashkernel_mem[_low](), the implementation of
>     these two functions is based on function reserve_crashkernel[_low]() in
>     arch/x86/kernel/setup.c. There is no functional change for x86.
>     1) The check position of xen_pv_domain() does not change.
>     2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> 
>     To avoid compilation problems on other architectures: patch 0004 moves
>     the definition of global variable crashk[_low]_res from kexec_core.c to
>     crash_core.c, and provide default definitions for all macros involved, a
>     particular platform can redefine these macros to override the default
>     values.
> 3. 0010, only one line of comment was changed.
> 4. 0011
>     1) crashk_low_res may also a valid reserved memory, should be checked
>        in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
>     2) Drop memblock_mark_nomap() for crashk_low_res, because of:
>        2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
>     3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
>        85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> 5. 0012, slightly rebased, because the following patch is applied in advance.
>     https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> 6. 0013, no change.
> 
> Others:
> 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> 2. When allocating crash low memory, the start address still starts from 0.
>     low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> 4. Ensure the check position of xen_pv_domain() have no change.
> 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> 6. Update description.
> 
> 
> 
> Changes since [v16]
> - Because no functional changes in this version, so add
>    "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> - Update patch 9 based on the review comments of Rob Herring
> - As Catalin Marinas's suggestion, merge the implementation of
>    ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
>    contents of X86 and ARM64 do not overlap, and reduce unnecessary
>    temporary differences.
> 
> Changes since [v15]
> -  Aggregate the processing of "linux,usable-memory-range" into one function.
>     Only patch 9-10 have been updated.
> 
> Changes since [v14]
> - Recovering the requirement that the CrashKernel memory regions on X86
>    only requires 1 MiB alignment.
> - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
>    by patch 6 was introduced by patch 5 in v14.
> - As with crashk_res, crashk_low_res is also processed by
>    crash_exclude_mem_range() in patch 7.
> - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
>    has removed the architecture-specific code, extend the property "linux,usable-memory-range"
>    in the platform-agnostic FDT core code. See patch 9.
> - Discard the x86 description update in the document, because the description
>    has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> - Change "arm64" to "ARM64" in Doc.
> 
> 
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select this by X86 and ARM64.
> - Some minor cleanup.
> 
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
> 
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
> 
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> 
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
> 
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
> 
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
> 
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
> 
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
> 
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> 
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
> 
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
> 
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
> 
> v1:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and fall back to high
> allocation if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinct from the high region and make no effect to the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region as the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
> 
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
> 
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
> 
> 
> [1]: https://www.spinics.net/lists/kexec/msg28226.html
> [2]: https://github.com/robherring/dt-schema/pull/19
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
> [v11]: https://lkml.org/lkml/2020/8/1/150
> [v12]: https://lkml.org/lkml/2020/9/7/1037
> [v13]: https://lkml.org/lkml/2020/10/31/34
> [v14]: https://lkml.org/lkml/2021/1/30/53
> [v15]: https://lkml.org/lkml/2021/10/19/1405
> [v16]: https://lkml.org/lkml/2021/11/23/435
> [v17]: https://lkml.org/lkml/2021/12/10/38
> [v18]: https://lkml.org/lkml/2021/12/22/424
> [v19]: https://lkml.org/lkml/2021/12/28/203
> [v20]: https://lkml.org/lkml/2022/1/24/167
> [v21]: https://lkml.org/lkml/2022/2/26/350
> 
> Chen Zhou (2):
>    arm64: kdump: Reimplement crashkernel=X
>    of: fdt: Add memory for devices by DT property
>      "linux,usable-memory-range"
> 
> Zhen Lei (7):
>    kdump: return -ENOENT if required cmdline option does not exist
>    arm64: Use insert_resource() to simplify code
>    arm64: kdump: Remove some redundant checks in map_mem()
>    arm64: kdump: Don't force page-level mappings for memory above 4G
>    arm64: kdump: Use page-level mapping for the high memory of
>      crashkernel
>    arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
>    docs: kdump: Update the crashkernel description for arm64
> 
>   .../admin-guide/kernel-parameters.txt         |   8 +-
>   arch/arm64/include/asm/kexec.h                |   2 +
>   arch/arm64/kernel/machine_kexec.c             |   9 +-
>   arch/arm64/kernel/machine_kexec_file.c        |  12 +-
>   arch/arm64/kernel/setup.c                     |  17 +-
>   arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
>   arch/arm64/mm/mmu.c                           |  68 +++---
>   drivers/of/fdt.c                              |  33 ++-
>   kernel/crash_core.c                           |   3 +-
>   9 files changed, 278 insertions(+), 78 deletions(-)
> 

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-19 17:02   ` Dave Kleikamp
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Kleikamp @ 2022-04-19 17:02 UTC (permalink / raw)
  To: Zhen Lei, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, John Donnelly

For the series:

Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>

Thanks for keeping this patch set alive.

On 4/14/22 6:57AM, Zhen Lei wrote:
> Changes since [v21]:
> 1. Update the commit message of  patch 1 and 5.
> 2. Add some comments for reserve_crashkernel() in patch 5.
> 
> Thanks to Baoquan He and John Donnelly for their review comments.
> 
> Because v5.18-rc1 has added a new patch
> commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> There are many new scenarios:
> 1) The mappings may be block or page-level.
> 2) The call to reserve_crashkernel() may or may not be deferred.
> 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
>     upper limit of physical memory, because SMMU can do the mapping.
> 
> The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> The code of patch 5 only makes the following changes:
> -	if (crash_base >= SZ_4G)
> +	/*
> +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> +	 * the 'crash_base' of high memory can not exceed it. To follow the
> +	 * description of "crashkernel=X,high" option, add below 'high'
> +	 * condition to make sure the crash low memory will be reserved.
> +	 */
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> 
> Patch 3-4 to allow block mappings for memory above 4G.
> Patch 6-7 to support only crash high memory or fixed memory range specified by
> crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
> These four patches are for performance optimization purposes. For details about the
> technical feasibility analysis, please see the commit messages.
> 
> Now the implementation of arm64 is very different from that of x86. It's no longer
> suitable for both of them to share code.
> 
> 
> 
> Changes since [v20]:
> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> 
> Thanks to Baoquan He for his careful review.
> 
> The test cases are as follows: (Please update the kexec tool to the latest version)
> 1) crashkernel=4G						//high=4G, low=256M
> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> 6) crashkernel=512M						//high=0G, low=512M
> 7) crashkernel=128M						//high=0G, low=128M
> 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> 9) crashkernel=4G,high						//high=4G, low=256M
> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> c) crashkernel=128M,high					//high=128M, low=256M
> d) crashkernel=512M,low						//high=0G, low=0M, invalid
> e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> 
> 
> Changes since [v19]:
> 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
>     lot of details need to be considered, which can take a long time. Because
>     "make generic" does not add new functions and does not improve performance,
>     maybe I should say it's just a cleanup. So by stripping it out and leaving
>     it for other patches later, we can aggregate the changes to the main functions.
> 2. Use insert_resource() to replace request_resource(), this not only simplifies
>     the code, but also reduces the differences between arm64 and x86 implementations.
> 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
>     x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
>     https://www.spinics.net/lists/kexec/msg28284.html
> 
> Thank you very much, Borislav Petkov, for so many valuable comments.
> 
> Changes since [v17]: v17 --> v19
> 1. Patch 0001-0004
>     Introduce generic parse_crashkernel_high_low() to bring the parsing of
>     "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
>     then use it instead of the call to parse_crashkernel_{high|low}(). Two
>     confusing parameters of parse_crashkernel_{high|low}() are deleted.
> 
>     I previously sent these four patches separately:
>     [1] https://lkml.org/lkml/2021/12/25/40
> 2. Patch 0005-0009
>     Introduce generic reserve_crashkernel_mem[_low](), the implementation of
>     these two functions is based on function reserve_crashkernel[_low]() in
>     arch/x86/kernel/setup.c. There is no functional change for x86.
>     1) The check position of xen_pv_domain() does not change.
>     2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> 
>     To avoid compilation problems on other architectures: patch 0004 moves
>     the definition of global variable crashk[_low]_res from kexec_core.c to
>     crash_core.c, and provide default definitions for all macros involved, a
>     particular platform can redefine these macros to override the default
>     values.
> 3. 0010, only one line of comment was changed.
> 4. 0011
>     1) crashk_low_res may also a valid reserved memory, should be checked
>        in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
>     2) Drop memblock_mark_nomap() for crashk_low_res, because of:
>        2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
>     3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
>        85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> 5. 0012, slightly rebased, because the following patch is applied in advance.
>     https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> 6. 0013, no change.
> 
> Others:
> 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> 2. When allocating crash low memory, the start address still starts from 0.
>     low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> 4. Ensure the check position of xen_pv_domain() have no change.
> 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> 6. Update description.
> 
> 
> 
> Changes since [v16]
> - Because no functional changes in this version, so add
>    "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> - Update patch 9 based on the review comments of Rob Herring
> - As Catalin Marinas's suggestion, merge the implementation of
>    ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
>    contents of X86 and ARM64 do not overlap, and reduce unnecessary
>    temporary differences.
> 
> Changes since [v15]
> -  Aggregate the processing of "linux,usable-memory-range" into one function.
>     Only patch 9-10 have been updated.
> 
> Changes since [v14]
> - Recovering the requirement that the CrashKernel memory regions on X86
>    only requires 1 MiB alignment.
> - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
>    by patch 6 was introduced by patch 5 in v14.
> - As with crashk_res, crashk_low_res is also processed by
>    crash_exclude_mem_range() in patch 7.
> - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
>    has removed the architecture-specific code, extend the property "linux,usable-memory-range"
>    in the platform-agnostic FDT core code. See patch 9.
> - Discard the x86 description update in the document, because the description
>    has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> - Change "arm64" to "ARM64" in Doc.
> 
> 
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select this by X86 and ARM64.
> - Some minor cleanup.
> 
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
> 
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
> 
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> 
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
> 
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
> 
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
> 
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
> 
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
> 
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> 
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
> 
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
> 
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
> 
> v1:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and fall back to high
> allocation if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinct from the high region and make no effect to the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region as the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
> 
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
> 
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
> 
> 
> [1]: https://www.spinics.net/lists/kexec/msg28226.html
> [2]: https://github.com/robherring/dt-schema/pull/19
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
> [v11]: https://lkml.org/lkml/2020/8/1/150
> [v12]: https://lkml.org/lkml/2020/9/7/1037
> [v13]: https://lkml.org/lkml/2020/10/31/34
> [v14]: https://lkml.org/lkml/2021/1/30/53
> [v15]: https://lkml.org/lkml/2021/10/19/1405
> [v16]: https://lkml.org/lkml/2021/11/23/435
> [v17]: https://lkml.org/lkml/2021/12/10/38
> [v18]: https://lkml.org/lkml/2021/12/22/424
> [v19]: https://lkml.org/lkml/2021/12/28/203
> [v20]: https://lkml.org/lkml/2022/1/24/167
> [v21]: https://lkml.org/lkml/2022/2/26/350
> 
> Chen Zhou (2):
>    arm64: kdump: Reimplement crashkernel=X
>    of: fdt: Add memory for devices by DT property
>      "linux,usable-memory-range"
> 
> Zhen Lei (7):
>    kdump: return -ENOENT if required cmdline option does not exist
>    arm64: Use insert_resource() to simplify code
>    arm64: kdump: Remove some redundant checks in map_mem()
>    arm64: kdump: Don't force page-level mappings for memory above 4G
>    arm64: kdump: Use page-level mapping for the high memory of
>      crashkernel
>    arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
>    docs: kdump: Update the crashkernel description for arm64
> 
>   .../admin-guide/kernel-parameters.txt         |   8 +-
>   arch/arm64/include/asm/kexec.h                |   2 +
>   arch/arm64/kernel/machine_kexec.c             |   9 +-
>   arch/arm64/kernel/machine_kexec_file.c        |  12 +-
>   arch/arm64/kernel/setup.c                     |  17 +-
>   arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
>   arch/arm64/mm/mmu.c                           |  68 +++---
>   drivers/of/fdt.c                              |  33 ++-
>   kernel/crash_core.c                           |   3 +-
>   9 files changed, 278 insertions(+), 78 deletions(-)
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-19 17:02   ` Dave Kleikamp
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Kleikamp @ 2022-04-19 17:02 UTC (permalink / raw)
  To: kexec

For the series:

Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>

Thanks for keeping this patch set alive.

On 4/14/22 6:57AM, Zhen Lei wrote:
> Changes since [v21]:
> 1. Update the commit message of  patch 1 and 5.
> 2. Add some comments for reserve_crashkernel() in patch 5.
> 
> Thanks to Baoquan He and John Donnelly for their review comments.
> 
> Because v5.18-rc1 has added a new patch
> commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> There are many new scenarios:
> 1) The mappings may be block or page-level.
> 2) The call to reserve_crashkernel() may or may not be deferred.
> 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
>     upper limit of physical memory, because SMMU can do the mapping.
> 
> The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> The code of patch 5 only makes the following changes:
> -	if (crash_base >= SZ_4G)
> +	/*
> +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> +	 * the 'crash_base' of high memory can not exceed it. To follow the
> +	 * description of "crashkernel=X,high" option, add below 'high'
> +	 * condition to make sure the crash low memory will be reserved.
> +	 */
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> 
> Patch 3-4 to allow block mappings for memory above 4G.
> Patch 6-7 to support only crash high memory or fixed memory range specified by
> crashkernel=X at offset use page-level mapping, to allow other areas use block mapping.
> These four patches are for performance optimization purposes. For details about the
> technical feasibility analysis, please see the commit messages.
> 
> Now the implementation of arm64 is very different from that of x86. It's no longer
> suitable for both of them to share code.
> 
> 
> 
> Changes since [v20]:
> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> 
> Thanks to Baoquan He for his careful review.
> 
> The test cases are as follows: (Please update the kexec tool to the latest version)
> 1) crashkernel=4G						//high=4G, low=256M
> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> 5) crashkernel=4G at 0xe0000000					//high=0G, low=0M, cannot allocate, failed
> 6) crashkernel=512M						//high=0G, low=512M
> 7) crashkernel=128M						//high=0G, low=128M
> 8) crashkernel=512M at 0xde000000		//512M at 3552M		//high=0G, low=512M
> 9) crashkernel=4G,high						//high=4G, low=256M
> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> c) crashkernel=128M,high					//high=128M, low=256M
> d) crashkernel=512M,low						//high=0G, low=0M, invalid
> e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> 
> 
> Changes since [v19]:
> 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
>     lot of details need to be considered, which can take a long time. Because
>     "make generic" does not add new functions and does not improve performance,
>     maybe I should say it's just a cleanup. So by stripping it out and leaving
>     it for other patches later, we can aggregate the changes to the main functions.
> 2. Use insert_resource() to replace request_resource(), this not only simplifies
>     the code, but also reduces the differences between arm64 and x86 implementations.
> 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
>     x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
>     https://www.spinics.net/lists/kexec/msg28284.html
> 
> Thank you very much, Borislav Petkov, for so many valuable comments.
> 
> Changes since [v17]: v17 --> v19
> 1. Patch 0001-0004
>     Introduce generic parse_crashkernel_high_low() to bring the parsing of
>     "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
>     then use it instead of the call to parse_crashkernel_{high|low}(). Two
>     confusing parameters of parse_crashkernel_{high|low}() are deleted.
> 
>     I previously sent these four patches separately:
>     [1] https://lkml.org/lkml/2021/12/25/40
> 2. Patch 0005-0009
>     Introduce generic reserve_crashkernel_mem[_low](), the implementation of
>     these two functions is based on function reserve_crashkernel[_low]() in
>     arch/x86/kernel/setup.c. There is no functional change for x86.
>     1) The check position of xen_pv_domain() does not change.
>     2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> 
>     To avoid compilation problems on other architectures: patch 0004 moves
>     the definition of global variable crashk[_low]_res from kexec_core.c to
>     crash_core.c, and provide default definitions for all macros involved, a
>     particular platform can redefine these macros to override the default
>     values.
> 3. 0010, only one line of comment was changed.
> 4. 0011
>     1) crashk_low_res may also a valid reserved memory, should be checked
>        in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
>     2) Drop memblock_mark_nomap() for crashk_low_res, because of:
>        2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
>     3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
>        85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> 5. 0012, slightly rebased, because the following patch is applied in advance.
>     https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> 6. 0013, no change.
> 
> Others:
> 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> 2. When allocating crash low memory, the start address still starts from 0.
>     low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> 4. Ensure the check position of xen_pv_domain() have no change.
> 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> 6. Update description.
> 
> 
> 
> Changes since [v16]
> - Because no functional changes in this version, so add
>    "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> - Update patch 9 based on the review comments of Rob Herring
> - As Catalin Marinas's suggestion, merge the implementation of
>    ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
>    contents of X86 and ARM64 do not overlap, and reduce unnecessary
>    temporary differences.
> 
> Changes since [v15]
> -  Aggregate the processing of "linux,usable-memory-range" into one function.
>     Only patch 9-10 have been updated.
> 
> Changes since [v14]
> - Recovering the requirement that the CrashKernel memory regions on X86
>    only requires 1 MiB alignment.
> - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
>    by patch 6 was introduced by patch 5 in v14.
> - As with crashk_res, crashk_low_res is also processed by
>    crash_exclude_mem_range() in patch 7.
> - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
>    has removed the architecture-specific code, extend the property "linux,usable-memory-range"
>    in the platform-agnostic FDT core code. See patch 9.
> - Discard the x86 description update in the document, because the description
>    has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> - Change "arm64" to "ARM64" in Doc.
> 
> 
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select this by X86 and ARM64.
> - Some minor cleanup.
> 
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
> 
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
> 
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> 
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
> 
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
> 
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
> 
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
> 
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
> 
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> 
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
> 
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
> 
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
> 
> v1:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and fall back to high
> allocation if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinct from the high region and make no effect to the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region as the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
> 
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
> 
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
> 
> 
> [1]: https://www.spinics.net/lists/kexec/msg28226.html
> [2]: https://github.com/robherring/dt-schema/pull/19
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
> [v11]: https://lkml.org/lkml/2020/8/1/150
> [v12]: https://lkml.org/lkml/2020/9/7/1037
> [v13]: https://lkml.org/lkml/2020/10/31/34
> [v14]: https://lkml.org/lkml/2021/1/30/53
> [v15]: https://lkml.org/lkml/2021/10/19/1405
> [v16]: https://lkml.org/lkml/2021/11/23/435
> [v17]: https://lkml.org/lkml/2021/12/10/38
> [v18]: https://lkml.org/lkml/2021/12/22/424
> [v19]: https://lkml.org/lkml/2021/12/28/203
> [v20]: https://lkml.org/lkml/2022/1/24/167
> [v21]: https://lkml.org/lkml/2022/2/26/350
> 
> Chen Zhou (2):
>    arm64: kdump: Reimplement crashkernel=X
>    of: fdt: Add memory for devices by DT property
>      "linux,usable-memory-range"
> 
> Zhen Lei (7):
>    kdump: return -ENOENT if required cmdline option does not exist
>    arm64: Use insert_resource() to simplify code
>    arm64: kdump: Remove some redundant checks in map_mem()
>    arm64: kdump: Don't force page-level mappings for memory above 4G
>    arm64: kdump: Use page-level mapping for the high memory of
>      crashkernel
>    arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
>    docs: kdump: Update the crashkernel description for arm64
> 
>   .../admin-guide/kernel-parameters.txt         |   8 +-
>   arch/arm64/include/asm/kexec.h                |   2 +
>   arch/arm64/kernel/machine_kexec.c             |   9 +-
>   arch/arm64/kernel/machine_kexec_file.c        |  12 +-
>   arch/arm64/kernel/setup.c                     |  17 +-
>   arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
>   arch/arm64/mm/mmu.c                           |  68 +++---
>   drivers/of/fdt.c                              |  33 ++-
>   kernel/crash_core.c                           |   3 +-
>   9 files changed, 278 insertions(+), 78 deletions(-)
> 


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
  2022-04-14 11:57 ` Zhen Lei
  (?)
@ 2022-04-25  2:19   ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-25  2:19 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, John Donnelly,
	Dave Kleikamp


Hi, Catalin Marinas and Will Deacon:
  Do you have time to review patches 3-4 and 6-7? These patches are a little more
relevant to the arm64 architecture.

Hi, Baoquan He:
  Can you review patches 1 and 5 again?


On 2022/4/14 19:57, Zhen Lei wrote:
> Changes since [v21]:
> 1. Update the commit message of  patch 1 and 5.
> 2. Add some comments for reserve_crashkernel() in patch 5.
> 
> Thanks to Baoquan He and John Donnelly for their review comments.
> 
> Because v5.18-rc1 has added a new patch
> commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> There are many new scenarios:
> 1) The mappings may be block or page-level. 
> 2) The call to reserve_crashkernel() may or may not be deferred.
> 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
>    upper limit of physical memory, because SMMU can do the mapping.
> 
> The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> The code of patch 5 only makes the following changes:
> -	if (crash_base >= SZ_4G)
> +	/*
> +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> +	 * the 'crash_base' of high memory can not exceed it. To follow the
> +	 * description of "crashkernel=X,high" option, add below 'high'
> +	 * condition to make sure the crash low memory will be reserved.
> +	 */
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> 
> Patch 3-4 to allow block mappings for memory above 4G.
> Patch 6-7 to support only crash high memory or fixed memory range specified by
> crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
> These four patches are for performance optimization purposes. For details about the
> technical feasibility analysis, please see the commit messages.
> 
> Now the implementation of arm64 is very different from that of x86. It's no longer
> suitable for both of them to share code.
> 
> 
> 
> Changes since [v20]:
> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> 
> Thanks to Baoquan He for his careful review.
> 
> The test cases are as follows: (Please update the kexec tool to the latest version)
> 1) crashkernel=4G						//high=4G, low=256M
> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> 6) crashkernel=512M						//high=0G, low=512M
> 7) crashkernel=128M						//high=0G, low=128M
> 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> 9) crashkernel=4G,high						//high=4G, low=256M
> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> c) crashkernel=128M,high					//high=128M, low=256M
> d) crashkernel=512M,low						//high=0G, low=0M, invalid
> e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> 
> 
> Changes since [v19]:
> 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
>    lot of details need to be considered, which can take a long time. Because
>    "make generic" does not add new functions and does not improve performance,
>    maybe I should say it's just a cleanup. So by stripping it out and leaving
>    it for other patches later, we can aggregate the changes to the main functions.
> 2. Use insert_resource() to replace request_resource(), this not only simplifies
>    the code, but also reduces the differences between arm64 and x86 implementations.
> 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
>    x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
>    https://www.spinics.net/lists/kexec/msg28284.html
> 
> Thank you very much, Borislav Petkov, for so many valuable comments.
> 
> Changes since [v17]: v17 --> v19
> 1. Patch 0001-0004
>    Introduce generic parse_crashkernel_high_low() to bring the parsing of
>    "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
>    then use it instead of the call to parse_crashkernel_{high|low}(). Two
>    confusing parameters of parse_crashkernel_{high|low}() are deleted.
> 
>    I previously sent these four patches separately:
>    [1] https://lkml.org/lkml/2021/12/25/40
> 2. Patch 0005-0009
>    Introduce generic reserve_crashkernel_mem[_low](), the implementation of
>    these two functions is based on function reserve_crashkernel[_low]() in
>    arch/x86/kernel/setup.c. There is no functional change for x86.
>    1) The check position of xen_pv_domain() does not change.
>    2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> 
>    To avoid compilation problems on other architectures: patch 0004 moves
>    the definition of global variable crashk[_low]_res from kexec_core.c to
>    crash_core.c, and provide default definitions for all macros involved, a
>    particular platform can redefine these macros to override the default
>    values.
> 3. 0010, only one line of comment was changed.
> 4. 0011
>    1) crashk_low_res may also a valid reserved memory, should be checked
>       in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
>    2) Drop memblock_mark_nomap() for crashk_low_res, because of:
>       2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
>    3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
>       85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> 5. 0012, slightly rebased, because the following patch is applied in advance. 
>    https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> 6. 0013, no change.
> 
> Others:
> 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> 2. When allocating crash low memory, the start address still starts from 0.
>    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> 4. Ensure the check position of xen_pv_domain() have no change.
> 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> 6. Update description.
> 
> 
> 
> Changes since [v16]
> - Because no functional changes in this version, so add
>   "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> - Update patch 9 based on the review comments of Rob Herring
> - As Catalin Marinas's suggestion, merge the implementation of
>   ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
>   contents of X86 and ARM64 do not overlap, and reduce unnecessary
>   temporary differences.
> 
> Changes since [v15]
> -  Aggregate the processing of "linux,usable-memory-range" into one function.
>    Only patch 9-10 have been updated.
> 
> Changes since [v14]
> - Recovering the requirement that the CrashKernel memory regions on X86
>   only requires 1 MiB alignment.
> - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
>   by patch 6 was introduced by patch 5 in v14.
> - As with crashk_res, crashk_low_res is also processed by
>   crash_exclude_mem_range() in patch 7.
> - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
>   has removed the architecture-specific code, extend the property "linux,usable-memory-range"
>   in the platform-agnostic FDT core code. See patch 9.
> - Discard the x86 description update in the document, because the description
>   has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> - Change "arm64" to "ARM64" in Doc.
> 
> 
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select this by X86 and ARM64.
> - Some minor cleanup.
> 
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
> 
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
> 
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> 
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
> 
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
> 
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
> 
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
> 
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
> 
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> 
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
> 
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
> 
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
> 
> v1:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and fall back to high
> allocation if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinct from the high region and make no effect to the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region as the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
> 
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
> 
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
> 
> 
> [1]: https://www.spinics.net/lists/kexec/msg28226.html
> [2]: https://github.com/robherring/dt-schema/pull/19 
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
> [v11]: https://lkml.org/lkml/2020/8/1/150
> [v12]: https://lkml.org/lkml/2020/9/7/1037
> [v13]: https://lkml.org/lkml/2020/10/31/34
> [v14]: https://lkml.org/lkml/2021/1/30/53
> [v15]: https://lkml.org/lkml/2021/10/19/1405
> [v16]: https://lkml.org/lkml/2021/11/23/435
> [v17]: https://lkml.org/lkml/2021/12/10/38
> [v18]: https://lkml.org/lkml/2021/12/22/424
> [v19]: https://lkml.org/lkml/2021/12/28/203
> [v20]: https://lkml.org/lkml/2022/1/24/167
> [v21]: https://lkml.org/lkml/2022/2/26/350
> 
> Chen Zhou (2):
>   arm64: kdump: Reimplement crashkernel=X
>   of: fdt: Add memory for devices by DT property
>     "linux,usable-memory-range"
> 
> Zhen Lei (7):
>   kdump: return -ENOENT if required cmdline option does not exist
>   arm64: Use insert_resource() to simplify code
>   arm64: kdump: Remove some redundant checks in map_mem()
>   arm64: kdump: Don't force page-level mappings for memory above 4G
>   arm64: kdump: Use page-level mapping for the high memory of
>     crashkernel
>   arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
>   docs: kdump: Update the crashkernel description for arm64
> 
>  .../admin-guide/kernel-parameters.txt         |   8 +-
>  arch/arm64/include/asm/kexec.h                |   2 +
>  arch/arm64/kernel/machine_kexec.c             |   9 +-
>  arch/arm64/kernel/machine_kexec_file.c        |  12 +-
>  arch/arm64/kernel/setup.c                     |  17 +-
>  arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
>  arch/arm64/mm/mmu.c                           |  68 +++---
>  drivers/of/fdt.c                              |  33 ++-
>  kernel/crash_core.c                           |   3 +-
>  9 files changed, 278 insertions(+), 78 deletions(-)
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-25  2:19   ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-25  2:19 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc
  Cc: Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou, John Donnelly,
	Dave Kleikamp


Hi, Catalin Marinas and Will Deacon:
  Do you have time to review patches 3-4 and 6-7? These patches are a little more
relevant to the arm64 architecture.

Hi, Baoquan He:
  Can you review patches 1 and 5 again?


On 2022/4/14 19:57, Zhen Lei wrote:
> Changes since [v21]:
> 1. Update the commit message of  patch 1 and 5.
> 2. Add some comments for reserve_crashkernel() in patch 5.
> 
> Thanks to Baoquan He and John Donnelly for their review comments.
> 
> Because v5.18-rc1 has added a new patch
> commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> There are many new scenarios:
> 1) The mappings may be block or page-level. 
> 2) The call to reserve_crashkernel() may or may not be deferred.
> 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
>    upper limit of physical memory, because SMMU can do the mapping.
> 
> The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> The code of patch 5 only makes the following changes:
> -	if (crash_base >= SZ_4G)
> +	/*
> +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> +	 * the 'crash_base' of high memory can not exceed it. To follow the
> +	 * description of "crashkernel=X,high" option, add below 'high'
> +	 * condition to make sure the crash low memory will be reserved.
> +	 */
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> 
> Patch 3-4 to allow block mappings for memory above 4G.
> Patch 6-7 to support only crash high memory or fixed memory range specified by
> crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
> These four patches are for performance optimization purposes. For details about the
> technical feasibility analysis, please see the commit messages.
> 
> Now the implementation of arm64 is very different from that of x86. It's no longer
> suitable for both of them to share code.
> 
> 
> 
> Changes since [v20]:
> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> 
> Thanks to Baoquan He for his careful review.
> 
> The test cases are as follows: (Please update the kexec tool to the latest version)
> 1) crashkernel=4G						//high=4G, low=256M
> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> 6) crashkernel=512M						//high=0G, low=512M
> 7) crashkernel=128M						//high=0G, low=128M
> 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> 9) crashkernel=4G,high						//high=4G, low=256M
> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> c) crashkernel=128M,high					//high=128M, low=256M
> d) crashkernel=512M,low						//high=0G, low=0M, invalid
> e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> 
> 
> Changes since [v19]:
> 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
>    lot of details need to be considered, which can take a long time. Because
>    "make generic" does not add new functions and does not improve performance,
>    maybe I should say it's just a cleanup. So by stripping it out and leaving
>    it for other patches later, we can aggregate the changes to the main functions.
> 2. Use insert_resource() to replace request_resource(), this not only simplifies
>    the code, but also reduces the differences between arm64 and x86 implementations.
> 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
>    x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
>    https://www.spinics.net/lists/kexec/msg28284.html
> 
> Thank you very much, Borislav Petkov, for so many valuable comments.
> 
> Changes since [v17]: v17 --> v19
> 1. Patch 0001-0004
>    Introduce generic parse_crashkernel_high_low() to bring the parsing of
>    "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
>    then use it instead of the call to parse_crashkernel_{high|low}(). Two
>    confusing parameters of parse_crashkernel_{high|low}() are deleted.
> 
>    I previously sent these four patches separately:
>    [1] https://lkml.org/lkml/2021/12/25/40
> 2. Patch 0005-0009
>    Introduce generic reserve_crashkernel_mem[_low](), the implementation of
>    these two functions is based on function reserve_crashkernel[_low]() in
>    arch/x86/kernel/setup.c. There is no functional change for x86.
>    1) The check position of xen_pv_domain() does not change.
>    2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> 
>    To avoid compilation problems on other architectures: patch 0004 moves
>    the definition of global variable crashk[_low]_res from kexec_core.c to
>    crash_core.c, and provide default definitions for all macros involved, a
>    particular platform can redefine these macros to override the default
>    values.
> 3. 0010, only one line of comment was changed.
> 4. 0011
>    1) crashk_low_res may also a valid reserved memory, should be checked
>       in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
>    2) Drop memblock_mark_nomap() for crashk_low_res, because of:
>       2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
>    3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
>       85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> 5. 0012, slightly rebased, because the following patch is applied in advance. 
>    https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> 6. 0013, no change.
> 
> Others:
> 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> 2. When allocating crash low memory, the start address still starts from 0.
>    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> 4. Ensure the check position of xen_pv_domain() have no change.
> 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> 6. Update description.
> 
> 
> 
> Changes since [v16]
> - Because no functional changes in this version, so add
>   "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> - Update patch 9 based on the review comments of Rob Herring
> - As Catalin Marinas's suggestion, merge the implementation of
>   ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
>   contents of X86 and ARM64 do not overlap, and reduce unnecessary
>   temporary differences.
> 
> Changes since [v15]
> -  Aggregate the processing of "linux,usable-memory-range" into one function.
>    Only patch 9-10 have been updated.
> 
> Changes since [v14]
> - Recovering the requirement that the CrashKernel memory regions on X86
>   only requires 1 MiB alignment.
> - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
>   by patch 6 was introduced by patch 5 in v14.
> - As with crashk_res, crashk_low_res is also processed by
>   crash_exclude_mem_range() in patch 7.
> - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
>   has removed the architecture-specific code, extend the property "linux,usable-memory-range"
>   in the platform-agnostic FDT core code. See patch 9.
> - Discard the x86 description update in the document, because the description
>   has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> - Change "arm64" to "ARM64" in Doc.
> 
> 
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select this by X86 and ARM64.
> - Some minor cleanup.
> 
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
> 
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
> 
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> 
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
> 
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
> 
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
> 
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
> 
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
> 
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> 
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
> 
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
> 
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
> 
> v1:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and fall back to high
> allocation if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinct from the high region and make no effect to the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region as the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
> 
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
> 
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
> 
> 
> [1]: https://www.spinics.net/lists/kexec/msg28226.html
> [2]: https://github.com/robherring/dt-schema/pull/19 
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
> [v11]: https://lkml.org/lkml/2020/8/1/150
> [v12]: https://lkml.org/lkml/2020/9/7/1037
> [v13]: https://lkml.org/lkml/2020/10/31/34
> [v14]: https://lkml.org/lkml/2021/1/30/53
> [v15]: https://lkml.org/lkml/2021/10/19/1405
> [v16]: https://lkml.org/lkml/2021/11/23/435
> [v17]: https://lkml.org/lkml/2021/12/10/38
> [v18]: https://lkml.org/lkml/2021/12/22/424
> [v19]: https://lkml.org/lkml/2021/12/28/203
> [v20]: https://lkml.org/lkml/2022/1/24/167
> [v21]: https://lkml.org/lkml/2022/2/26/350
> 
> Chen Zhou (2):
>   arm64: kdump: Reimplement crashkernel=X
>   of: fdt: Add memory for devices by DT property
>     "linux,usable-memory-range"
> 
> Zhen Lei (7):
>   kdump: return -ENOENT if required cmdline option does not exist
>   arm64: Use insert_resource() to simplify code
>   arm64: kdump: Remove some redundant checks in map_mem()
>   arm64: kdump: Don't force page-level mappings for memory above 4G
>   arm64: kdump: Use page-level mapping for the high memory of
>     crashkernel
>   arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
>   docs: kdump: Update the crashkernel description for arm64
> 
>  .../admin-guide/kernel-parameters.txt         |   8 +-
>  arch/arm64/include/asm/kexec.h                |   2 +
>  arch/arm64/kernel/machine_kexec.c             |   9 +-
>  arch/arm64/kernel/machine_kexec_file.c        |  12 +-
>  arch/arm64/kernel/setup.c                     |  17 +-
>  arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
>  arch/arm64/mm/mmu.c                           |  68 +++---
>  drivers/of/fdt.c                              |  33 ++-
>  kernel/crash_core.c                           |   3 +-
>  9 files changed, 278 insertions(+), 78 deletions(-)
> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-25  2:19   ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-25  2:19 UTC (permalink / raw)
  To: kexec


Hi, Catalin Marinas and Will Deacon:
  Do you have time to review patches 3-4 and 6-7? These patches are a little more
relevant to the arm64 architecture.

Hi, Baoquan He:
  Can you review patches 1 and 5 again?


On 2022/4/14 19:57, Zhen Lei wrote:
> Changes since [v21]:
> 1. Update the commit message of  patch 1 and 5.
> 2. Add some comments for reserve_crashkernel() in patch 5.
> 
> Thanks to Baoquan He and John Donnelly for their review comments.
> 
> Because v5.18-rc1 has added a new patch
> commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> There are many new scenarios:
> 1) The mappings may be block or page-level. 
> 2) The call to reserve_crashkernel() may or may not be deferred.
> 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
>    upper limit of physical memory, because SMMU can do the mapping.
> 
> The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> The code of patch 5 only makes the following changes:
> -	if (crash_base >= SZ_4G)
> +	/*
> +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> +	 * the 'crash_base' of high memory can not exceed it. To follow the
> +	 * description of "crashkernel=X,high" option, add below 'high'
> +	 * condition to make sure the crash low memory will be reserved.
> +	 */
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> 
> Patch 3-4 to allow block mappings for memory above 4G.
> Patch 6-7 to support only crash high memory or fixed memory range specified by
> crashkernel=X at offset use page-level mapping, to allow other areas use block mapping.
> These four patches are for performance optimization purposes. For details about the
> technical feasibility analysis, please see the commit messages.
> 
> Now the implementation of arm64 is very different from that of x86. It's no longer
> suitable for both of them to share code.
> 
> 
> 
> Changes since [v20]:
> 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> 
> Thanks to Baoquan He for his careful review.
> 
> The test cases are as follows: (Please update the kexec tool to the latest version)
> 1) crashkernel=4G						//high=4G, low=256M
> 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> 5) crashkernel=4G at 0xe0000000					//high=0G, low=0M, cannot allocate, failed
> 6) crashkernel=512M						//high=0G, low=512M
> 7) crashkernel=128M						//high=0G, low=128M
> 8) crashkernel=512M at 0xde000000		//512M at 3552M		//high=0G, low=512M
> 9) crashkernel=4G,high						//high=4G, low=256M
> a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> c) crashkernel=128M,high					//high=128M, low=256M
> d) crashkernel=512M,low						//high=0G, low=0M, invalid
> e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> 
> 
> Changes since [v19]:
> 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
>    lot of details need to be considered, which can take a long time. Because
>    "make generic" does not add new functions and does not improve performance,
>    maybe I should say it's just a cleanup. So by stripping it out and leaving
>    it for other patches later, we can aggregate the changes to the main functions.
> 2. Use insert_resource() to replace request_resource(), this not only simplifies
>    the code, but also reduces the differences between arm64 and x86 implementations.
> 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
>    x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
>    https://www.spinics.net/lists/kexec/msg28284.html
> 
> Thank you very much, Borislav Petkov, for so many valuable comments.
> 
> Changes since [v17]: v17 --> v19
> 1. Patch 0001-0004
>    Introduce generic parse_crashkernel_high_low() to bring the parsing of
>    "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
>    then use it instead of the call to parse_crashkernel_{high|low}(). Two
>    confusing parameters of parse_crashkernel_{high|low}() are deleted.
> 
>    I previously sent these four patches separately:
>    [1] https://lkml.org/lkml/2021/12/25/40
> 2. Patch 0005-0009
>    Introduce generic reserve_crashkernel_mem[_low](), the implementation of
>    these two functions is based on function reserve_crashkernel[_low]() in
>    arch/x86/kernel/setup.c. There is no functional change for x86.
>    1) The check position of xen_pv_domain() does not change.
>    2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> 
>    To avoid compilation problems on other architectures: patch 0004 moves
>    the definition of global variable crashk[_low]_res from kexec_core.c to
>    crash_core.c, and provide default definitions for all macros involved, a
>    particular platform can redefine these macros to override the default
>    values.
> 3. 0010, only one line of comment was changed.
> 4. 0011
>    1) crashk_low_res may also a valid reserved memory, should be checked
>       in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
>    2) Drop memblock_mark_nomap() for crashk_low_res, because of:
>       2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
>    3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
>       85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> 5. 0012, slightly rebased, because the following patch is applied in advance. 
>    https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> 6. 0013, no change.
> 
> Others:
> 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> 2. When allocating crash low memory, the start address still starts from 0.
>    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> 4. Ensure the check position of xen_pv_domain() have no change.
> 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> 6. Update description.
> 
> 
> 
> Changes since [v16]
> - Because no functional changes in this version, so add
>   "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> - Update patch 9 based on the review comments of Rob Herring
> - As Catalin Marinas's suggestion, merge the implementation of
>   ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
>   contents of X86 and ARM64 do not overlap, and reduce unnecessary
>   temporary differences.
> 
> Changes since [v15]
> -  Aggregate the processing of "linux,usable-memory-range" into one function.
>    Only patch 9-10 have been updated.
> 
> Changes since [v14]
> - Recovering the requirement that the CrashKernel memory regions on X86
>   only requires 1 MiB alignment.
> - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
>   by patch 6 was introduced by patch 5 in v14.
> - As with crashk_res, crashk_low_res is also processed by
>   crash_exclude_mem_range() in patch 7.
> - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
>   has removed the architecture-specific code, extend the property "linux,usable-memory-range"
>   in the platform-agnostic FDT core code. See patch 9.
> - Discard the x86 description update in the document, because the description
>   has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> - Change "arm64" to "ARM64" in Doc.
> 
> 
> Changes since [v13]
> - Rebased on top of 5.11-rc5.
> - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> Since reserve_crashkernel[_low]() implementations are quite similar on
> other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> arch/Kconfig and select this by X86 and ARM64.
> - Some minor cleanup.
> 
> Changes since [v12]
> - Rebased on top of 5.10-rc1.
> - Keep CRASH_ALIGN as 16M suggested by Dave.
> - Drop patch "kdump: add threshold for the required memory".
> - Add Tested-by from John.
> 
> Changes since [v11]
> - Rebased on top of 5.9-rc4.
> - Make the function reserve_crashkernel() of x86 generic.
> Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> and arm64 use the generic version to reimplement crashkernel=X.
> 
> Changes since [v10]
> - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> 
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
> 
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
> 
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
> 
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
> 
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
> 
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> 
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
> 
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
> 
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
> 
> v1:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in DMA zone and fall back to high
> allocation if it fails.
> 
> We can also use "crashkernel=X,high" to select a high region above
> DMA zone, which also tries to allocate at least 256M low memory in
> DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> specified size low memory.
> 
> When reserving crashkernel in high memory, some low memory is reserved
> for crash dump kernel devices. So there may be two regions reserved for
> crash dump kernel.
> In order to distinct from the high region and make no effect to the use
> of existing kexec-tools, rename the low region as "Crash kernel (low)",
> and pass the low region by reusing DT property
> "linux,usable-memory-range". We made the low memory region as the last
> range of "linux,usable-memory-range" to keep compatibility with existing
> user-space and older kdump kernels.
> 
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
> 
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
> 
> 
> [1]: https://www.spinics.net/lists/kexec/msg28226.html
> [2]: https://github.com/robherring/dt-schema/pull/19 
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
> [v10]: https://lkml.org/lkml/2020/7/2/1443
> [v11]: https://lkml.org/lkml/2020/8/1/150
> [v12]: https://lkml.org/lkml/2020/9/7/1037
> [v13]: https://lkml.org/lkml/2020/10/31/34
> [v14]: https://lkml.org/lkml/2021/1/30/53
> [v15]: https://lkml.org/lkml/2021/10/19/1405
> [v16]: https://lkml.org/lkml/2021/11/23/435
> [v17]: https://lkml.org/lkml/2021/12/10/38
> [v18]: https://lkml.org/lkml/2021/12/22/424
> [v19]: https://lkml.org/lkml/2021/12/28/203
> [v20]: https://lkml.org/lkml/2022/1/24/167
> [v21]: https://lkml.org/lkml/2022/2/26/350
> 
> Chen Zhou (2):
>   arm64: kdump: Reimplement crashkernel=X
>   of: fdt: Add memory for devices by DT property
>     "linux,usable-memory-range"
> 
> Zhen Lei (7):
>   kdump: return -ENOENT if required cmdline option does not exist
>   arm64: Use insert_resource() to simplify code
>   arm64: kdump: Remove some redundant checks in map_mem()
>   arm64: kdump: Don't force page-level mappings for memory above 4G
>   arm64: kdump: Use page-level mapping for the high memory of
>     crashkernel
>   arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
>   docs: kdump: Update the crashkernel description for arm64
> 
>  .../admin-guide/kernel-parameters.txt         |   8 +-
>  arch/arm64/include/asm/kexec.h                |   2 +
>  arch/arm64/kernel/machine_kexec.c             |   9 +-
>  arch/arm64/kernel/machine_kexec_file.c        |  12 +-
>  arch/arm64/kernel/setup.c                     |  17 +-
>  arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
>  arch/arm64/mm/mmu.c                           |  68 +++---
>  drivers/of/fdt.c                              |  33 ++-
>  kernel/crash_core.c                           |   3 +-
>  9 files changed, 278 insertions(+), 78 deletions(-)
> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
  2022-04-25  2:19   ` Leizhen (ThunderTown)
  (?)
@ 2022-04-25  2:45     ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-25  2:45 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 04/25/22 at 10:19am, Leizhen (ThunderTown) wrote:
> 
> Hi, Catalin Marinas and Will Deacon:
>   Do you have time to review patches 3-4 and 6-7? These patches are a little more
> relevant to the arm64 architecture.
> 
> Hi, Baoquan He:
>   Can you review patches 1 and 5 again?

Yes, and I will test and review the whole series if no one check the
arm64 related changes.

> 
> 
> On 2022/4/14 19:57, Zhen Lei wrote:
> > Changes since [v21]:
> > 1. Update the commit message of  patch 1 and 5.
> > 2. Add some comments for reserve_crashkernel() in patch 5.
> > 
> > Thanks to Baoquan He and John Donnelly for their review comments.
> > 
> > Because v5.18-rc1 has added a new patch
> > commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> > There are many new scenarios:
> > 1) The mappings may be block or page-level. 
> > 2) The call to reserve_crashkernel() may or may not be deferred.
> > 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
> >    upper limit of physical memory, because SMMU can do the mapping.
> > 
> > The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> > The code of patch 5 only makes the following changes:
> > -	if (crash_base >= SZ_4G)
> > +	/*
> > +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> > +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> > +	 * the 'crash_base' of high memory can not exceed it. To follow the
> > +	 * description of "crashkernel=X,high" option, add below 'high'
> > +	 * condition to make sure the crash low memory will be reserved.
> > +	 */
> > +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> > Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> > 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> > "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> > 
> > Patch 3-4 to allow block mappings for memory above 4G.
> > Patch 6-7 to support only crash high memory or fixed memory range specified by
> > crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
> > These four patches are for performance optimization purposes. For details about the
> > technical feasibility analysis, please see the commit messages.
> > 
> > Now the implementation of arm64 is very different from that of x86. It's no longer
> > suitable for both of them to share code.
> > 
> > 
> > 
> > Changes since [v20]:
> > 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> > 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> > 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> > 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> > 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> > 
> > Thanks to Baoquan He for his careful review.
> > 
> > The test cases are as follows: (Please update the kexec tool to the latest version)
> > 1) crashkernel=4G						//high=4G, low=256M
> > 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> > 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> > 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> > 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> > 6) crashkernel=512M						//high=0G, low=512M
> > 7) crashkernel=128M						//high=0G, low=128M
> > 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> > 9) crashkernel=4G,high						//high=4G, low=256M
> > a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> > b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> > c) crashkernel=128M,high					//high=128M, low=256M
> > d) crashkernel=512M,low						//high=0G, low=0M, invalid
> > e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> > f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> > 
> > 
> > Changes since [v19]:
> > 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
> >    lot of details need to be considered, which can take a long time. Because
> >    "make generic" does not add new functions and does not improve performance,
> >    maybe I should say it's just a cleanup. So by stripping it out and leaving
> >    it for other patches later, we can aggregate the changes to the main functions.
> > 2. Use insert_resource() to replace request_resource(), this not only simplifies
> >    the code, but also reduces the differences between arm64 and x86 implementations.
> > 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
> >    x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
> >    https://www.spinics.net/lists/kexec/msg28284.html
> > 
> > Thank you very much, Borislav Petkov, for so many valuable comments.
> > 
> > Changes since [v17]: v17 --> v19
> > 1. Patch 0001-0004
> >    Introduce generic parse_crashkernel_high_low() to bring the parsing of
> >    "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
> >    then use it instead of the call to parse_crashkernel_{high|low}(). Two
> >    confusing parameters of parse_crashkernel_{high|low}() are deleted.
> > 
> >    I previously sent these four patches separately:
> >    [1] https://lkml.org/lkml/2021/12/25/40
> > 2. Patch 0005-0009
> >    Introduce generic reserve_crashkernel_mem[_low](), the implementation of
> >    these two functions is based on function reserve_crashkernel[_low]() in
> >    arch/x86/kernel/setup.c. There is no functional change for x86.
> >    1) The check position of xen_pv_domain() does not change.
> >    2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> > 
> >    To avoid compilation problems on other architectures: patch 0004 moves
> >    the definition of global variable crashk[_low]_res from kexec_core.c to
> >    crash_core.c, and provide default definitions for all macros involved, a
> >    particular platform can redefine these macros to override the default
> >    values.
> > 3. 0010, only one line of comment was changed.
> > 4. 0011
> >    1) crashk_low_res may also a valid reserved memory, should be checked
> >       in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
> >    2) Drop memblock_mark_nomap() for crashk_low_res, because of:
> >       2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
> >    3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
> >       85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> > 5. 0012, slightly rebased, because the following patch is applied in advance. 
> >    https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> > 6. 0013, no change.
> > 
> > Others:
> > 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> > 2. When allocating crash low memory, the start address still starts from 0.
> >    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> > 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> > 4. Ensure the check position of xen_pv_domain() have no change.
> > 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> > 6. Update description.
> > 
> > 
> > 
> > Changes since [v16]
> > - Because no functional changes in this version, so add
> >   "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> > - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> > - Update patch 9 based on the review comments of Rob Herring
> > - As Catalin Marinas's suggestion, merge the implementation of
> >   ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
> >   contents of X86 and ARM64 do not overlap, and reduce unnecessary
> >   temporary differences.
> > 
> > Changes since [v15]
> > -  Aggregate the processing of "linux,usable-memory-range" into one function.
> >    Only patch 9-10 have been updated.
> > 
> > Changes since [v14]
> > - Recovering the requirement that the CrashKernel memory regions on X86
> >   only requires 1 MiB alignment.
> > - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
> >   by patch 6 was introduced by patch 5 in v14.
> > - As with crashk_res, crashk_low_res is also processed by
> >   crash_exclude_mem_range() in patch 7.
> > - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
> >   has removed the architecture-specific code, extend the property "linux,usable-memory-range"
> >   in the platform-agnostic FDT core code. See patch 9.
> > - Discard the x86 description update in the document, because the description
> >   has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> > - Change "arm64" to "ARM64" in Doc.
> > 
> > 
> > Changes since [v13]
> > - Rebased on top of 5.11-rc5.
> > - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> > Since reserve_crashkernel[_low]() implementations are quite similar on
> > other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> > arch/Kconfig and select this by X86 and ARM64.
> > - Some minor cleanup.
> > 
> > Changes since [v12]
> > - Rebased on top of 5.10-rc1.
> > - Keep CRASH_ALIGN as 16M suggested by Dave.
> > - Drop patch "kdump: add threshold for the required memory".
> > - Add Tested-by from John.
> > 
> > Changes since [v11]
> > - Rebased on top of 5.9-rc4.
> > - Make the function reserve_crashkernel() of x86 generic.
> > Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> > and arm64 use the generic version to reimplement crashkernel=X.
> > 
> > Changes since [v10]
> > - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> > 
> > Changes since [v9]
> > - Patch 1 add Acked-by from Dave.
> > - Update patch 5 according to Dave's comments.
> > - Update chosen schema.
> > 
> > Changes since [v8]
> > - Reuse DT property "linux,usable-memory-range".
> > Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> > memory region.
> > - Fix kdump broken with ZONE_DMA reintroduced.
> > - Update chosen schema.
> > 
> > Changes since [v7]
> > - Move x86 CRASH_ALIGN to 2M
> > Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> > - Update Documentation/devicetree/bindings/chosen.txt.
> > Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> > suggested by Arnd.
> > - Add Tested-by from Jhon and pk.
> > 
> > Changes since [v6]
> > - Fix build errors reported by kbuild test robot.
> > 
> > Changes since [v5]
> > - Move reserve_crashkernel_low() into kernel/crash_core.c.
> > - Delete crashkernel=X,high.
> > - Modify crashkernel=X,low.
> > If crashkernel=X,low is specified simultaneously, reserve spcified size low
> > memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> > In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> > pass to crash dump kernel by DT property "linux,low-memory-range".
> > - Update Documentation/admin-guide/kdump/kdump.rst.
> > 
> > Changes since [v4]
> > - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> > 
> > Changes since [v3]
> > - Add memblock_cap_memory_ranges back for multiple ranges.
> > - Fix some compiling warnings.
> > 
> > Changes since [v2]
> > - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> > two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> > patch.
> > 
> > Changes since [v1]:
> > - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> > - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> > in fdt_enforce_memory_region().
> > There are at most two crash kernel regions, for two crash kernel regions
> > case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> > and then remove the memory range in the middle.
> > 
> > v1:
> > There are following issues in arm64 kdump:
> > 1. We use crashkernel=X to reserve crashkernel below 4G, which
> > will fail when there is no enough low memory.
> > 2. If reserving crashkernel above 4G, in this case, crash dump
> > kernel will boot failure because there is no low memory available
> > for allocation.
> > 
> > To solve these issues, change the behavior of crashkernel=X.
> > crashkernel=X tries low allocation in DMA zone and fall back to high
> > allocation if it fails.
> > 
> > We can also use "crashkernel=X,high" to select a high region above
> > DMA zone, which also tries to allocate at least 256M low memory in
> > DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> > specified size low memory.
> > 
> > When reserving crashkernel in high memory, some low memory is reserved
> > for crash dump kernel devices. So there may be two regions reserved for
> > crash dump kernel.
> > In order to distinct from the high region and make no effect to the use
> > of existing kexec-tools, rename the low region as "Crash kernel (low)",
> > and pass the low region by reusing DT property
> > "linux,usable-memory-range". We made the low memory region as the last
> > range of "linux,usable-memory-range" to keep compatibility with existing
> > user-space and older kdump kernels.
> > 
> > Besides, we need to modify kexec-tools:
> > arm64: support more than one crash kernel regions(see [1])
> > 
> > Another update is document about DT property 'linux,usable-memory-range':
> > schemas: update 'linux,usable-memory-range' node schema(see [2])
> > 
> > 
> > [1]: https://www.spinics.net/lists/kexec/msg28226.html
> > [2]: https://github.com/robherring/dt-schema/pull/19 
> > [v1]: https://lkml.org/lkml/2019/4/2/1174
> > [v2]: https://lkml.org/lkml/2019/4/9/86
> > [v3]: https://lkml.org/lkml/2019/4/9/306
> > [v4]: https://lkml.org/lkml/2019/4/15/273
> > [v5]: https://lkml.org/lkml/2019/5/6/1360
> > [v6]: https://lkml.org/lkml/2019/8/30/142
> > [v7]: https://lkml.org/lkml/2019/12/23/411
> > [v8]: https://lkml.org/lkml/2020/5/21/213
> > [v9]: https://lkml.org/lkml/2020/6/28/73
> > [v10]: https://lkml.org/lkml/2020/7/2/1443
> > [v11]: https://lkml.org/lkml/2020/8/1/150
> > [v12]: https://lkml.org/lkml/2020/9/7/1037
> > [v13]: https://lkml.org/lkml/2020/10/31/34
> > [v14]: https://lkml.org/lkml/2021/1/30/53
> > [v15]: https://lkml.org/lkml/2021/10/19/1405
> > [v16]: https://lkml.org/lkml/2021/11/23/435
> > [v17]: https://lkml.org/lkml/2021/12/10/38
> > [v18]: https://lkml.org/lkml/2021/12/22/424
> > [v19]: https://lkml.org/lkml/2021/12/28/203
> > [v20]: https://lkml.org/lkml/2022/1/24/167
> > [v21]: https://lkml.org/lkml/2022/2/26/350
> > 
> > Chen Zhou (2):
> >   arm64: kdump: Reimplement crashkernel=X
> >   of: fdt: Add memory for devices by DT property
> >     "linux,usable-memory-range"
> > 
> > Zhen Lei (7):
> >   kdump: return -ENOENT if required cmdline option does not exist
> >   arm64: Use insert_resource() to simplify code
> >   arm64: kdump: Remove some redundant checks in map_mem()
> >   arm64: kdump: Don't force page-level mappings for memory above 4G
> >   arm64: kdump: Use page-level mapping for the high memory of
> >     crashkernel
> >   arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
> >   docs: kdump: Update the crashkernel description for arm64
> > 
> >  .../admin-guide/kernel-parameters.txt         |   8 +-
> >  arch/arm64/include/asm/kexec.h                |   2 +
> >  arch/arm64/kernel/machine_kexec.c             |   9 +-
> >  arch/arm64/kernel/machine_kexec_file.c        |  12 +-
> >  arch/arm64/kernel/setup.c                     |  17 +-
> >  arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
> >  arch/arm64/mm/mmu.c                           |  68 +++---
> >  drivers/of/fdt.c                              |  33 ++-
> >  kernel/crash_core.c                           |   3 +-
> >  9 files changed, 278 insertions(+), 78 deletions(-)
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-25  2:45     ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-25  2:45 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 04/25/22 at 10:19am, Leizhen (ThunderTown) wrote:
> 
> Hi, Catalin Marinas and Will Deacon:
>   Do you have time to review patches 3-4 and 6-7? These patches are a little more
> relevant to the arm64 architecture.
> 
> Hi, Baoquan He:
>   Can you review patches 1 and 5 again?

Yes, and I will test and review the whole series if no one check the
arm64 related changes.

> 
> 
> On 2022/4/14 19:57, Zhen Lei wrote:
> > Changes since [v21]:
> > 1. Update the commit message of  patch 1 and 5.
> > 2. Add some comments for reserve_crashkernel() in patch 5.
> > 
> > Thanks to Baoquan He and John Donnelly for their review comments.
> > 
> > Because v5.18-rc1 has added a new patch
> > commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> > There are many new scenarios:
> > 1) The mappings may be block or page-level. 
> > 2) The call to reserve_crashkernel() may or may not be deferred.
> > 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
> >    upper limit of physical memory, because SMMU can do the mapping.
> > 
> > The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> > The code of patch 5 only makes the following changes:
> > -	if (crash_base >= SZ_4G)
> > +	/*
> > +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> > +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> > +	 * the 'crash_base' of high memory can not exceed it. To follow the
> > +	 * description of "crashkernel=X,high" option, add below 'high'
> > +	 * condition to make sure the crash low memory will be reserved.
> > +	 */
> > +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> > Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> > 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> > "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> > 
> > Patch 3-4 to allow block mappings for memory above 4G.
> > Patch 6-7 to support only crash high memory or fixed memory range specified by
> > crashkernel=X@offset use page-level mapping, to allow other areas use block mapping.
> > These four patches are for performance optimization purposes. For details about the
> > technical feasibility analysis, please see the commit messages.
> > 
> > Now the implementation of arm64 is very different from that of x86. It's no longer
> > suitable for both of them to share code.
> > 
> > 
> > 
> > Changes since [v20]:
> > 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> > 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> > 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> > 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> > 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> > 
> > Thanks to Baoquan He for his careful review.
> > 
> > The test cases are as follows: (Please update the kexec tool to the latest version)
> > 1) crashkernel=4G						//high=4G, low=256M
> > 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> > 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> > 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> > 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> > 6) crashkernel=512M						//high=0G, low=512M
> > 7) crashkernel=128M						//high=0G, low=128M
> > 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> > 9) crashkernel=4G,high						//high=4G, low=256M
> > a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> > b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> > c) crashkernel=128M,high					//high=128M, low=256M
> > d) crashkernel=512M,low						//high=0G, low=0M, invalid
> > e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> > f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> > 
> > 
> > Changes since [v19]:
> > 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
> >    lot of details need to be considered, which can take a long time. Because
> >    "make generic" does not add new functions and does not improve performance,
> >    maybe I should say it's just a cleanup. So by stripping it out and leaving
> >    it for other patches later, we can aggregate the changes to the main functions.
> > 2. Use insert_resource() to replace request_resource(), this not only simplifies
> >    the code, but also reduces the differences between arm64 and x86 implementations.
> > 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
> >    x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
> >    https://www.spinics.net/lists/kexec/msg28284.html
> > 
> > Thank you very much, Borislav Petkov, for so many valuable comments.
> > 
> > Changes since [v17]: v17 --> v19
> > 1. Patch 0001-0004
> >    Introduce generic parse_crashkernel_high_low() to bring the parsing of
> >    "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
> >    then use it instead of the call to parse_crashkernel_{high|low}(). Two
> >    confusing parameters of parse_crashkernel_{high|low}() are deleted.
> > 
> >    I previously sent these four patches separately:
> >    [1] https://lkml.org/lkml/2021/12/25/40
> > 2. Patch 0005-0009
> >    Introduce generic reserve_crashkernel_mem[_low](), the implementation of
> >    these two functions is based on function reserve_crashkernel[_low]() in
> >    arch/x86/kernel/setup.c. There is no functional change for x86.
> >    1) The check position of xen_pv_domain() does not change.
> >    2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> > 
> >    To avoid compilation problems on other architectures: patch 0004 moves
> >    the definition of global variable crashk[_low]_res from kexec_core.c to
> >    crash_core.c, and provide default definitions for all macros involved, a
> >    particular platform can redefine these macros to override the default
> >    values.
> > 3. 0010, only one line of comment was changed.
> > 4. 0011
> >    1) crashk_low_res may also a valid reserved memory, should be checked
> >       in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
> >    2) Drop memblock_mark_nomap() for crashk_low_res, because of:
> >       2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
> >    3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
> >       85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> > 5. 0012, slightly rebased, because the following patch is applied in advance. 
> >    https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> > 6. 0013, no change.
> > 
> > Others:
> > 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> > 2. When allocating crash low memory, the start address still starts from 0.
> >    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> > 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> > 4. Ensure the check position of xen_pv_domain() have no change.
> > 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> > 6. Update description.
> > 
> > 
> > 
> > Changes since [v16]
> > - Because no functional changes in this version, so add
> >   "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> > - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> > - Update patch 9 based on the review comments of Rob Herring
> > - As Catalin Marinas's suggestion, merge the implementation of
> >   ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
> >   contents of X86 and ARM64 do not overlap, and reduce unnecessary
> >   temporary differences.
> > 
> > Changes since [v15]
> > -  Aggregate the processing of "linux,usable-memory-range" into one function.
> >    Only patch 9-10 have been updated.
> > 
> > Changes since [v14]
> > - Recovering the requirement that the CrashKernel memory regions on X86
> >   only requires 1 MiB alignment.
> > - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
> >   by patch 6 was introduced by patch 5 in v14.
> > - As with crashk_res, crashk_low_res is also processed by
> >   crash_exclude_mem_range() in patch 7.
> > - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
> >   has removed the architecture-specific code, extend the property "linux,usable-memory-range"
> >   in the platform-agnostic FDT core code. See patch 9.
> > - Discard the x86 description update in the document, because the description
> >   has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> > - Change "arm64" to "ARM64" in Doc.
> > 
> > 
> > Changes since [v13]
> > - Rebased on top of 5.11-rc5.
> > - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> > Since reserve_crashkernel[_low]() implementations are quite similar on
> > other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> > arch/Kconfig and select this by X86 and ARM64.
> > - Some minor cleanup.
> > 
> > Changes since [v12]
> > - Rebased on top of 5.10-rc1.
> > - Keep CRASH_ALIGN as 16M suggested by Dave.
> > - Drop patch "kdump: add threshold for the required memory".
> > - Add Tested-by from John.
> > 
> > Changes since [v11]
> > - Rebased on top of 5.9-rc4.
> > - Make the function reserve_crashkernel() of x86 generic.
> > Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> > and arm64 use the generic version to reimplement crashkernel=X.
> > 
> > Changes since [v10]
> > - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> > 
> > Changes since [v9]
> > - Patch 1 add Acked-by from Dave.
> > - Update patch 5 according to Dave's comments.
> > - Update chosen schema.
> > 
> > Changes since [v8]
> > - Reuse DT property "linux,usable-memory-range".
> > Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> > memory region.
> > - Fix kdump broken with ZONE_DMA reintroduced.
> > - Update chosen schema.
> > 
> > Changes since [v7]
> > - Move x86 CRASH_ALIGN to 2M
> > Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> > - Update Documentation/devicetree/bindings/chosen.txt.
> > Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> > suggested by Arnd.
> > - Add Tested-by from Jhon and pk.
> > 
> > Changes since [v6]
> > - Fix build errors reported by kbuild test robot.
> > 
> > Changes since [v5]
> > - Move reserve_crashkernel_low() into kernel/crash_core.c.
> > - Delete crashkernel=X,high.
> > - Modify crashkernel=X,low.
> > If crashkernel=X,low is specified simultaneously, reserve spcified size low
> > memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> > In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> > pass to crash dump kernel by DT property "linux,low-memory-range".
> > - Update Documentation/admin-guide/kdump/kdump.rst.
> > 
> > Changes since [v4]
> > - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> > 
> > Changes since [v3]
> > - Add memblock_cap_memory_ranges back for multiple ranges.
> > - Fix some compiling warnings.
> > 
> > Changes since [v2]
> > - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> > two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> > patch.
> > 
> > Changes since [v1]:
> > - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> > - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> > in fdt_enforce_memory_region().
> > There are at most two crash kernel regions, for two crash kernel regions
> > case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> > and then remove the memory range in the middle.
> > 
> > v1:
> > There are following issues in arm64 kdump:
> > 1. We use crashkernel=X to reserve crashkernel below 4G, which
> > will fail when there is no enough low memory.
> > 2. If reserving crashkernel above 4G, in this case, crash dump
> > kernel will boot failure because there is no low memory available
> > for allocation.
> > 
> > To solve these issues, change the behavior of crashkernel=X.
> > crashkernel=X tries low allocation in DMA zone and fall back to high
> > allocation if it fails.
> > 
> > We can also use "crashkernel=X,high" to select a high region above
> > DMA zone, which also tries to allocate at least 256M low memory in
> > DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> > specified size low memory.
> > 
> > When reserving crashkernel in high memory, some low memory is reserved
> > for crash dump kernel devices. So there may be two regions reserved for
> > crash dump kernel.
> > In order to distinct from the high region and make no effect to the use
> > of existing kexec-tools, rename the low region as "Crash kernel (low)",
> > and pass the low region by reusing DT property
> > "linux,usable-memory-range". We made the low memory region as the last
> > range of "linux,usable-memory-range" to keep compatibility with existing
> > user-space and older kdump kernels.
> > 
> > Besides, we need to modify kexec-tools:
> > arm64: support more than one crash kernel regions(see [1])
> > 
> > Another update is document about DT property 'linux,usable-memory-range':
> > schemas: update 'linux,usable-memory-range' node schema(see [2])
> > 
> > 
> > [1]: https://www.spinics.net/lists/kexec/msg28226.html
> > [2]: https://github.com/robherring/dt-schema/pull/19 
> > [v1]: https://lkml.org/lkml/2019/4/2/1174
> > [v2]: https://lkml.org/lkml/2019/4/9/86
> > [v3]: https://lkml.org/lkml/2019/4/9/306
> > [v4]: https://lkml.org/lkml/2019/4/15/273
> > [v5]: https://lkml.org/lkml/2019/5/6/1360
> > [v6]: https://lkml.org/lkml/2019/8/30/142
> > [v7]: https://lkml.org/lkml/2019/12/23/411
> > [v8]: https://lkml.org/lkml/2020/5/21/213
> > [v9]: https://lkml.org/lkml/2020/6/28/73
> > [v10]: https://lkml.org/lkml/2020/7/2/1443
> > [v11]: https://lkml.org/lkml/2020/8/1/150
> > [v12]: https://lkml.org/lkml/2020/9/7/1037
> > [v13]: https://lkml.org/lkml/2020/10/31/34
> > [v14]: https://lkml.org/lkml/2021/1/30/53
> > [v15]: https://lkml.org/lkml/2021/10/19/1405
> > [v16]: https://lkml.org/lkml/2021/11/23/435
> > [v17]: https://lkml.org/lkml/2021/12/10/38
> > [v18]: https://lkml.org/lkml/2021/12/22/424
> > [v19]: https://lkml.org/lkml/2021/12/28/203
> > [v20]: https://lkml.org/lkml/2022/1/24/167
> > [v21]: https://lkml.org/lkml/2022/2/26/350
> > 
> > Chen Zhou (2):
> >   arm64: kdump: Reimplement crashkernel=X
> >   of: fdt: Add memory for devices by DT property
> >     "linux,usable-memory-range"
> > 
> > Zhen Lei (7):
> >   kdump: return -ENOENT if required cmdline option does not exist
> >   arm64: Use insert_resource() to simplify code
> >   arm64: kdump: Remove some redundant checks in map_mem()
> >   arm64: kdump: Don't force page-level mappings for memory above 4G
> >   arm64: kdump: Use page-level mapping for the high memory of
> >     crashkernel
> >   arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
> >   docs: kdump: Update the crashkernel description for arm64
> > 
> >  .../admin-guide/kernel-parameters.txt         |   8 +-
> >  arch/arm64/include/asm/kexec.h                |   2 +
> >  arch/arm64/kernel/machine_kexec.c             |   9 +-
> >  arch/arm64/kernel/machine_kexec_file.c        |  12 +-
> >  arch/arm64/kernel/setup.c                     |  17 +-
> >  arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
> >  arch/arm64/mm/mmu.c                           |  68 +++---
> >  drivers/of/fdt.c                              |  33 ++-
> >  kernel/crash_core.c                           |   3 +-
> >  9 files changed, 278 insertions(+), 78 deletions(-)
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-25  2:45     ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-25  2:45 UTC (permalink / raw)
  To: kexec

On 04/25/22 at 10:19am, Leizhen (ThunderTown) wrote:
> 
> Hi, Catalin Marinas and Will Deacon:
>   Do you have time to review patches 3-4 and 6-7? These patches are a little more
> relevant to the arm64 architecture.
> 
> Hi, Baoquan He:
>   Can you review patches 1 and 5 again?

Yes, and I will test and review the whole series if no one check the
arm64 related changes.

> 
> 
> On 2022/4/14 19:57, Zhen Lei wrote:
> > Changes since [v21]:
> > 1. Update the commit message of  patch 1 and 5.
> > 2. Add some comments for reserve_crashkernel() in patch 5.
> > 
> > Thanks to Baoquan He and John Donnelly for their review comments.
> > 
> > Because v5.18-rc1 has added a new patch
> > commit  031495635b46 ("arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones")
> > There are many new scenarios:
> > 1) The mappings may be block or page-level. 
> > 2) The call to reserve_crashkernel() may or may not be deferred.
> > 3) The the upper limit of DMA address may be 4G, or less than 4G. Or the
> >    upper limit of physical memory, because SMMU can do the mapping.
> > 
> > The code of patch 1-2, 8-9 keep no change, because the above-mentioned issues are not involved.
> > The code of patch 5 only makes the following changes:
> > -	if (crash_base >= SZ_4G)
> > +	/*
> > +	 * When both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the
> > +	 * CRASH_ADDR_LOW_MAX equals the upper limit of physical memory, so
> > +	 * the 'crash_base' of high memory can not exceed it. To follow the
> > +	 * description of "crashkernel=X,high" option, add below 'high'
> > +	 * condition to make sure the crash low memory will be reserved.
> > +	 */
> > +	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
> > Change SZ_4G to CRASH_ADDR_LOW_MAX, because arm64_dma_phys_limit may be less than
> > 4G or greater than 4G. The check 'high' is used for "crashkernel=X,high" and
> > "(crash_base >= CRASH_ADDR_LOW_MAX)" is used for "crashkernel=X[@offset]".
> > 
> > Patch 3-4 to allow block mappings for memory above 4G.
> > Patch 6-7 to support only crash high memory or fixed memory range specified by
> > crashkernel=X at offset use page-level mapping, to allow other areas use block mapping.
> > These four patches are for performance optimization purposes. For details about the
> > technical feasibility analysis, please see the commit messages.
> > 
> > Now the implementation of arm64 is very different from that of x86. It's no longer
> > suitable for both of them to share code.
> > 
> > 
> > 
> > Changes since [v20]:
> > 1. Check whether crashkernel=Y,low is incorrectly configured or not configured. Do different processing.
> > 2. Share the existing description of x86. The configuration of arm64 is the same as that of x86.
> > 3. Define the value of macro CRASH_ADDR_HIGH_MAX as memblock.current_limit, instead of MEMBLOCK_ALLOC_ACCESSIBLE.
> > 4. To improve readability, some lightweight code adjustments have been made to reserve_craskernel(), including comments.
> > 5. The defined value of DEFAULT_CRASH_KERNEL_LOW_SIZE reconsiders swiotlb, just like x86, to share documents.
> > 
> > Thanks to Baoquan He for his careful review.
> > 
> > The test cases are as follows: (Please update the kexec tool to the latest version)
> > 1) crashkernel=4G						//high=4G, low=256M
> > 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> > 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> > 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> > 5) crashkernel=4G at 0xe0000000					//high=0G, low=0M, cannot allocate, failed
> > 6) crashkernel=512M						//high=0G, low=512M
> > 7) crashkernel=128M						//high=0G, low=128M
> > 8) crashkernel=512M at 0xde000000		//512M at 3552M		//high=0G, low=512M
> > 9) crashkernel=4G,high						//high=4G, low=256M
> > a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> > b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> > c) crashkernel=128M,high					//high=128M, low=256M
> > d) crashkernel=512M,low						//high=0G, low=0M, invalid
> > e) crashkernel=512M,high crashkernel=0,low			//high=512M, low=0M
> > f) crashkernel=4G,high crashkernel=ab,low			//high=0G, low=0M, invalid
> > 
> > 
> > Changes since [v19]:
> > 1. Temporarily stop making reserve_crashkernel[_low]() generic. There are a
> >    lot of details need to be considered, which can take a long time. Because
> >    "make generic" does not add new functions and does not improve performance,
> >    maybe I should say it's just a cleanup. So by stripping it out and leaving
> >    it for other patches later, we can aggregate the changes to the main functions.
> > 2. Use insert_resource() to replace request_resource(), this not only simplifies
> >    the code, but also reduces the differences between arm64 and x86 implementations.
> > 3. As commit 157752d84f5d ("kexec: use Crash kernel for Crash kernel low") do for
> >    x86, we can also extend kexec-tools for arm64, and it's currently applied. See:
> >    https://www.spinics.net/lists/kexec/msg28284.html
> > 
> > Thank you very much, Borislav Petkov, for so many valuable comments.
> > 
> > Changes since [v17]: v17 --> v19
> > 1. Patch 0001-0004
> >    Introduce generic parse_crashkernel_high_low() to bring the parsing of
> >    "crashkernel=X,high" and the parsing of "crashkernel=X,low" together,
> >    then use it instead of the call to parse_crashkernel_{high|low}(). Two
> >    confusing parameters of parse_crashkernel_{high|low}() are deleted.
> > 
> >    I previously sent these four patches separately:
> >    [1] https://lkml.org/lkml/2021/12/25/40
> > 2. Patch 0005-0009
> >    Introduce generic reserve_crashkernel_mem[_low](), the implementation of
> >    these two functions is based on function reserve_crashkernel[_low]() in
> >    arch/x86/kernel/setup.c. There is no functional change for x86.
> >    1) The check position of xen_pv_domain() does not change.
> >    2) Still 1M alignment for crash kernel fixed region, when 'base' is specified.
> > 
> >    To avoid compilation problems on other architectures: patch 0004 moves
> >    the definition of global variable crashk[_low]_res from kexec_core.c to
> >    crash_core.c, and provide default definitions for all macros involved, a
> >    particular platform can redefine these macros to override the default
> >    values.
> > 3. 0010, only one line of comment was changed.
> > 4. 0011
> >    1) crashk_low_res may also a valid reserved memory, should be checked
> >       in crash_is_nosave(), see arch/arm64/kernel/machine_kexec.
> >    2) Drop memblock_mark_nomap() for crashk_low_res, because of:
> >       2687275a5843 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
> >    3) Also call kmemleak_ignore_phys() for crashk_low_res, because of:
> >       85f58eb18898 arm64: kdump: Skip kmemleak scan reserved memory for kdump
> > 5. 0012, slightly rebased, because the following patch is applied in advance. 
> >    https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/commit/?h=dt/linus&id=8347b41748c3019157312fbe7f8a6792ae396eb7
> > 6. 0013, no change.
> > 
> > Others:
> > 1. Discard add ARCH_WANT_RESERVE_CRASH_KERNEL
> > 2. When allocating crash low memory, the start address still starts from 0.
> >    low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> > 3. Discard change (1ULL << 32) to CRASH_ADDR_LOW_MAX.
> > 4. Ensure the check position of xen_pv_domain() have no change.
> > 5. Except patch 0010 and 0012, all "Tested-by", "Reviewed-by", "Acked-by" are removed.
> > 6. Update description.
> > 
> > 
> > 
> > Changes since [v16]
> > - Because no functional changes in this version, so add
> >   "Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>" for patch 1-9
> > - Add "Reviewed-by: Rob Herring <robh@kernel.org>" for patch 8
> > - Update patch 9 based on the review comments of Rob Herring
> > - As Catalin Marinas's suggestion, merge the implementation of
> >   ARCH_WANT_RESERVE_CRASH_KERNEL into patch 5. Ensure that the
> >   contents of X86 and ARM64 do not overlap, and reduce unnecessary
> >   temporary differences.
> > 
> > Changes since [v15]
> > -  Aggregate the processing of "linux,usable-memory-range" into one function.
> >    Only patch 9-10 have been updated.
> > 
> > Changes since [v14]
> > - Recovering the requirement that the CrashKernel memory regions on X86
> >   only requires 1 MiB alignment.
> > - Combine patches 5 and 6 in v14 into one. The compilation warning fixed
> >   by patch 6 was introduced by patch 5 in v14.
> > - As with crashk_res, crashk_low_res is also processed by
> >   crash_exclude_mem_range() in patch 7.
> > - Due to commit b261dba2fdb2 ("arm64: kdump: Remove custom linux,usable-memory-range handling")
> >   has removed the architecture-specific code, extend the property "linux,usable-memory-range"
> >   in the platform-agnostic FDT core code. See patch 9.
> > - Discard the x86 description update in the document, because the description
> >   has been updated by commit b1f4c363666c ("Documentation: kdump: update kdump guide").
> > - Change "arm64" to "ARM64" in Doc.
> > 
> > 
> > Changes since [v13]
> > - Rebased on top of 5.11-rc5.
> > - Introduce config CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL.
> > Since reserve_crashkernel[_low]() implementations are quite similar on
> > other architectures, so have CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL in
> > arch/Kconfig and select this by X86 and ARM64.
> > - Some minor cleanup.
> > 
> > Changes since [v12]
> > - Rebased on top of 5.10-rc1.
> > - Keep CRASH_ALIGN as 16M suggested by Dave.
> > - Drop patch "kdump: add threshold for the required memory".
> > - Add Tested-by from John.
> > 
> > Changes since [v11]
> > - Rebased on top of 5.9-rc4.
> > - Make the function reserve_crashkernel() of x86 generic.
> > Suggested by Catalin, make the function reserve_crashkernel() of x86 generic
> > and arm64 use the generic version to reimplement crashkernel=X.
> > 
> > Changes since [v10]
> > - Reimplement crashkernel=X suggested by Catalin, Many thanks to Catalin.
> > 
> > Changes since [v9]
> > - Patch 1 add Acked-by from Dave.
> > - Update patch 5 according to Dave's comments.
> > - Update chosen schema.
> > 
> > Changes since [v8]
> > - Reuse DT property "linux,usable-memory-range".
> > Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> > memory region.
> > - Fix kdump broken with ZONE_DMA reintroduced.
> > - Update chosen schema.
> > 
> > Changes since [v7]
> > - Move x86 CRASH_ALIGN to 2M
> > Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> > - Update Documentation/devicetree/bindings/chosen.txt.
> > Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> > suggested by Arnd.
> > - Add Tested-by from Jhon and pk.
> > 
> > Changes since [v6]
> > - Fix build errors reported by kbuild test robot.
> > 
> > Changes since [v5]
> > - Move reserve_crashkernel_low() into kernel/crash_core.c.
> > - Delete crashkernel=X,high.
> > - Modify crashkernel=X,low.
> > If crashkernel=X,low is specified simultaneously, reserve spcified size low
> > memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> > In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> > pass to crash dump kernel by DT property "linux,low-memory-range".
> > - Update Documentation/admin-guide/kdump/kdump.rst.
> > 
> > Changes since [v4]
> > - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> > 
> > Changes since [v3]
> > - Add memblock_cap_memory_ranges back for multiple ranges.
> > - Fix some compiling warnings.
> > 
> > Changes since [v2]
> > - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> > two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> > patch.
> > 
> > Changes since [v1]:
> > - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> > - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> > in fdt_enforce_memory_region().
> > There are at most two crash kernel regions, for two crash kernel regions
> > case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> > and then remove the memory range in the middle.
> > 
> > v1:
> > There are following issues in arm64 kdump:
> > 1. We use crashkernel=X to reserve crashkernel below 4G, which
> > will fail when there is no enough low memory.
> > 2. If reserving crashkernel above 4G, in this case, crash dump
> > kernel will boot failure because there is no low memory available
> > for allocation.
> > 
> > To solve these issues, change the behavior of crashkernel=X.
> > crashkernel=X tries low allocation in DMA zone and fall back to high
> > allocation if it fails.
> > 
> > We can also use "crashkernel=X,high" to select a high region above
> > DMA zone, which also tries to allocate at least 256M low memory in
> > DMA zone automatically and "crashkernel=Y,low" can be used to allocate
> > specified size low memory.
> > 
> > When reserving crashkernel in high memory, some low memory is reserved
> > for crash dump kernel devices. So there may be two regions reserved for
> > crash dump kernel.
> > In order to distinct from the high region and make no effect to the use
> > of existing kexec-tools, rename the low region as "Crash kernel (low)",
> > and pass the low region by reusing DT property
> > "linux,usable-memory-range". We made the low memory region as the last
> > range of "linux,usable-memory-range" to keep compatibility with existing
> > user-space and older kdump kernels.
> > 
> > Besides, we need to modify kexec-tools:
> > arm64: support more than one crash kernel regions(see [1])
> > 
> > Another update is document about DT property 'linux,usable-memory-range':
> > schemas: update 'linux,usable-memory-range' node schema(see [2])
> > 
> > 
> > [1]: https://www.spinics.net/lists/kexec/msg28226.html
> > [2]: https://github.com/robherring/dt-schema/pull/19 
> > [v1]: https://lkml.org/lkml/2019/4/2/1174
> > [v2]: https://lkml.org/lkml/2019/4/9/86
> > [v3]: https://lkml.org/lkml/2019/4/9/306
> > [v4]: https://lkml.org/lkml/2019/4/15/273
> > [v5]: https://lkml.org/lkml/2019/5/6/1360
> > [v6]: https://lkml.org/lkml/2019/8/30/142
> > [v7]: https://lkml.org/lkml/2019/12/23/411
> > [v8]: https://lkml.org/lkml/2020/5/21/213
> > [v9]: https://lkml.org/lkml/2020/6/28/73
> > [v10]: https://lkml.org/lkml/2020/7/2/1443
> > [v11]: https://lkml.org/lkml/2020/8/1/150
> > [v12]: https://lkml.org/lkml/2020/9/7/1037
> > [v13]: https://lkml.org/lkml/2020/10/31/34
> > [v14]: https://lkml.org/lkml/2021/1/30/53
> > [v15]: https://lkml.org/lkml/2021/10/19/1405
> > [v16]: https://lkml.org/lkml/2021/11/23/435
> > [v17]: https://lkml.org/lkml/2021/12/10/38
> > [v18]: https://lkml.org/lkml/2021/12/22/424
> > [v19]: https://lkml.org/lkml/2021/12/28/203
> > [v20]: https://lkml.org/lkml/2022/1/24/167
> > [v21]: https://lkml.org/lkml/2022/2/26/350
> > 
> > Chen Zhou (2):
> >   arm64: kdump: Reimplement crashkernel=X
> >   of: fdt: Add memory for devices by DT property
> >     "linux,usable-memory-range"
> > 
> > Zhen Lei (7):
> >   kdump: return -ENOENT if required cmdline option does not exist
> >   arm64: Use insert_resource() to simplify code
> >   arm64: kdump: Remove some redundant checks in map_mem()
> >   arm64: kdump: Don't force page-level mappings for memory above 4G
> >   arm64: kdump: Use page-level mapping for the high memory of
> >     crashkernel
> >   arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G
> >   docs: kdump: Update the crashkernel description for arm64
> > 
> >  .../admin-guide/kernel-parameters.txt         |   8 +-
> >  arch/arm64/include/asm/kexec.h                |   2 +
> >  arch/arm64/kernel/machine_kexec.c             |   9 +-
> >  arch/arm64/kernel/machine_kexec_file.c        |  12 +-
> >  arch/arm64/kernel/setup.c                     |  17 +-
> >  arch/arm64/mm/init.c                          | 204 +++++++++++++++++-
> >  arch/arm64/mm/mmu.c                           |  68 +++---
> >  drivers/of/fdt.c                              |  33 ++-
> >  kernel/crash_core.c                           |   3 +-
> >  9 files changed, 278 insertions(+), 78 deletions(-)
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 1/9] kdump: return -ENOENT if required cmdline option does not exist
  2022-04-14 11:57   ` Zhen Lei
  (?)
@ 2022-04-25  3:49     ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-25  3:49 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 04/14/22 at 07:57pm, Zhen Lei wrote:
> According to the current crashkernel=Y,low support in other ARCHes, it's
> an optional command-line option. When it doesn't exist, kernel will try
> to allocate minimum required memory below 4G automatically.
> 
> However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
> can't distinguish the nonexistent option from invalid option.
> 
> Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
> case. With this change, crashkernel,low memory will take the default
> value if crashkernel=,low is not specified; while crashkernel reservation
> will fail and bail out if an invalid option is specified.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

Acked-by: Baoquan He <bhe@redhat.com>

> ---
>  kernel/crash_core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 256cf6db573cd09..4d57c03714f4e13 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>  	*crash_base = 0;
>  
>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> -
>  	if (!ck_cmdline)
> -		return -EINVAL;
> +		return -ENOENT;
>  
>  	ck_cmdline += strlen(name);
>  
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 1/9] kdump: return -ENOENT if required cmdline option does not exist
@ 2022-04-25  3:49     ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-25  3:49 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On 04/14/22 at 07:57pm, Zhen Lei wrote:
> According to the current crashkernel=Y,low support in other ARCHes, it's
> an optional command-line option. When it doesn't exist, kernel will try
> to allocate minimum required memory below 4G automatically.
> 
> However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
> can't distinguish the nonexistent option from invalid option.
> 
> Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
> case. With this change, crashkernel,low memory will take the default
> value if crashkernel=,low is not specified; while crashkernel reservation
> will fail and bail out if an invalid option is specified.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

Acked-by: Baoquan He <bhe@redhat.com>

> ---
>  kernel/crash_core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 256cf6db573cd09..4d57c03714f4e13 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>  	*crash_base = 0;
>  
>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> -
>  	if (!ck_cmdline)
> -		return -EINVAL;
> +		return -ENOENT;
>  
>  	ck_cmdline += strlen(name);
>  
> -- 
> 2.25.1
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 1/9] kdump: return -ENOENT if required cmdline option does not exist
@ 2022-04-25  3:49     ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-25  3:49 UTC (permalink / raw)
  To: kexec

On 04/14/22 at 07:57pm, Zhen Lei wrote:
> According to the current crashkernel=Y,low support in other ARCHes, it's
> an optional command-line option. When it doesn't exist, kernel will try
> to allocate minimum required memory below 4G automatically.
> 
> However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
> can't distinguish the nonexistent option from invalid option.
> 
> Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
> case. With this change, crashkernel,low memory will take the default
> value if crashkernel=,low is not specified; while crashkernel reservation
> will fail and bail out if an invalid option is specified.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

Acked-by: Baoquan He <bhe@redhat.com>

> ---
>  kernel/crash_core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 256cf6db573cd09..4d57c03714f4e13 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
>  	*crash_base = 0;
>  
>  	ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
> -
>  	if (!ck_cmdline)
> -		return -EINVAL;
> +		return -ENOENT;
>  
>  	ck_cmdline += strlen(name);
>  
> -- 
> 2.25.1
> 



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
  2022-04-25  2:45     ` Baoquan He
  (?)
@ 2022-04-25  6:29       ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-25  6:29 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/25 10:45, Baoquan He wrote:
> On 04/25/22 at 10:19am, Leizhen (ThunderTown) wrote:
>>
>> Hi, Catalin Marinas and Will Deacon:
>>   Do you have time to review patches 3-4 and 6-7? These patches are a little more
>> relevant to the arm64 architecture.
>>
>> Hi, Baoquan He:
>>   Can you review patches 1 and 5 again?
> 
> Yes, and I will test and review the whole series if no one check the
> arm64 related changes.

Okay, thank you so much.

> 
>>
>>

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-25  6:29       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-25  6:29 UTC (permalink / raw)
  To: Baoquan He
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Catalin Marinas, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/25 10:45, Baoquan He wrote:
> On 04/25/22 at 10:19am, Leizhen (ThunderTown) wrote:
>>
>> Hi, Catalin Marinas and Will Deacon:
>>   Do you have time to review patches 3-4 and 6-7? These patches are a little more
>> relevant to the arm64 architecture.
>>
>> Hi, Baoquan He:
>>   Can you review patches 1 and 5 again?
> 
> Yes, and I will test and review the whole series if no one check the
> arm64 related changes.

Okay, thank you so much.

> 
>>
>>

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump
@ 2022-04-25  6:29       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-25  6:29 UTC (permalink / raw)
  To: kexec



On 2022/4/25 10:45, Baoquan He wrote:
> On 04/25/22 at 10:19am, Leizhen (ThunderTown) wrote:
>>
>> Hi, Catalin Marinas and Will Deacon:
>>   Do you have time to review patches 3-4 and 6-7? These patches are a little more
>> relevant to the arm64 architecture.
>>
>> Hi, Baoquan He:
>>   Can you review patches 1 and 5 again?
> 
> Yes, and I will test and review the whole series if no one check the
> arm64 related changes.

Okay, thank you so much.

> 
>>
>>

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
  2022-04-14 11:57   ` Zhen Lei
  (?)
@ 2022-04-26 14:26     ` Catalin Marinas
  -1 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-26 14:26 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Thu, Apr 14, 2022 at 07:57:15PM +0800, Zhen Lei wrote:
> @@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
>  	for_each_mem_range(i, &start, &end) {
>  		if (start >= end)
>  			break;
> +
> +#ifdef CONFIG_KEXEC_CORE
> +		if (eflags && (end >= SZ_4G)) {
> +			/*
> +			 * The memory block cross the 4G boundary.
> +			 * Forcibly use page-level mappings for memory under 4G.
> +			 */
> +			if (start < SZ_4G) {
> +				__map_memblock(pgdp, start, SZ_4G - 1,
> +					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
> +				start  = SZ_4G;
> +			}
> +
> +			/* Page-level mappings is not mandatory for memory above 4G */
> +			eflags = 0;
> +		}
> +#endif

That's a bit tricky if a SoC has all RAM above 4G. IIRC AMD Seattle had
this layout. See max_zone_phys() for how we deal with this, basically
extending ZONE_DMA to the whole range if RAM starts above 4GB. In that
case, crashkernel reservation would fall in the range above 4GB.

BTW, we changed the max_zone_phys() logic with commit 791ab8b2e3db
("arm64: Ignore any DMA offsets in the max_zone_phys() calculation").

-- 
Catalin

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
@ 2022-04-26 14:26     ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-26 14:26 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Thu, Apr 14, 2022 at 07:57:15PM +0800, Zhen Lei wrote:
> @@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
>  	for_each_mem_range(i, &start, &end) {
>  		if (start >= end)
>  			break;
> +
> +#ifdef CONFIG_KEXEC_CORE
> +		if (eflags && (end >= SZ_4G)) {
> +			/*
> +			 * The memory block cross the 4G boundary.
> +			 * Forcibly use page-level mappings for memory under 4G.
> +			 */
> +			if (start < SZ_4G) {
> +				__map_memblock(pgdp, start, SZ_4G - 1,
> +					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
> +				start  = SZ_4G;
> +			}
> +
> +			/* Page-level mappings is not mandatory for memory above 4G */
> +			eflags = 0;
> +		}
> +#endif

That's a bit tricky if a SoC has all RAM above 4G. IIRC AMD Seattle had
this layout. See max_zone_phys() for how we deal with this, basically
extending ZONE_DMA to the whole range if RAM starts above 4GB. In that
case, crashkernel reservation would fall in the range above 4GB.

BTW, we changed the max_zone_phys() logic with commit 791ab8b2e3db
("arm64: Ignore any DMA offsets in the max_zone_phys() calculation").

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
@ 2022-04-26 14:26     ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-26 14:26 UTC (permalink / raw)
  To: kexec

On Thu, Apr 14, 2022 at 07:57:15PM +0800, Zhen Lei wrote:
> @@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
>  	for_each_mem_range(i, &start, &end) {
>  		if (start >= end)
>  			break;
> +
> +#ifdef CONFIG_KEXEC_CORE
> +		if (eflags && (end >= SZ_4G)) {
> +			/*
> +			 * The memory block cross the 4G boundary.
> +			 * Forcibly use page-level mappings for memory under 4G.
> +			 */
> +			if (start < SZ_4G) {
> +				__map_memblock(pgdp, start, SZ_4G - 1,
> +					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
> +				start  = SZ_4G;
> +			}
> +
> +			/* Page-level mappings is not mandatory for memory above 4G */
> +			eflags = 0;
> +		}
> +#endif

That's a bit tricky if a SoC has all RAM above 4G. IIRC AMD Seattle had
this layout. See max_zone_phys() for how we deal with this, basically
extending ZONE_DMA to the whole range if RAM starts above 4GB. In that
case, crashkernel reservation would fall in the range above 4GB.

BTW, we changed the max_zone_phys() logic with commit 791ab8b2e3db
("arm64: Ignore any DMA offsets in the max_zone_phys() calculation").

-- 
Catalin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-14 11:57   ` Zhen Lei
  (?)
@ 2022-04-26 18:02     ` Catalin Marinas
  -1 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-26 18:02 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
>   * This function reserves memory area given in "crashkernel=" kernel command
>   * line parameter. The memory reserved is used by dump capture kernel when
>   * primary kernel is crashing.
> + *
> + * NOTE: Reservation of crashkernel,low is special since its existence
> + * is not independent, need rely on the existence of crashkernel,high.
> + * Here, four cases of crashkernel low memory reservation are summarized:
> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
> + *    memory takes Y;
> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> + *    take the default crashkernel low memory size;
> + * 3) crashkernel=X is specified, while fallback to get a memory region
> + *    in high memory, take the default crashkernel low memory size;
> + * 4) crashkernel='invalid value',low is specified, failed the whole
> + *    crashkernel reservation and bail out.

Following the x86 behaviour made sense when we were tried to get that
code generic. Now that we moved the logic under arch/arm64, we can
diverge a bit. I lost track of the original (v1/v2) proposal but I
wonder whether we still need the fallback to high for crashkernel=Y.
Maybe simpler, no fallbacks:

	crashkernel=Y - keep the current behaviour, ignore high,low
	crashkernel=Y,high - allocate above ZONE_DMA
	crashkernel=Y,low - allocate within ZONE_DMA

From your proposal, the difference is that the Y,high option won't
have any default ZONE_DMA fallback, one would have to explicitly pass
the Y,low option if needed.

Just a thought, maybe it makes the code simpler. But I'm open to
discussion if there are good arguments for the proposed (x86-like)
behaviour. One argument could be for crashkernel=Y to fall back to high
if distros don't want to bother with high/low settings.

Another thing I may have asked in the past, what happens if we run a new
kernel with these patches with old kexec user tools. I suspect the
crashkernel=Y with the fallback to high will confuse the tools.

BTW, please separate the NO_BLOCK_MAPPINGS optimisations from the
crashkernel above 4G. Let's get the crashkernel reservations sorted
first, it's been around for too long.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-26 18:02     ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-26 18:02 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
>   * This function reserves memory area given in "crashkernel=" kernel command
>   * line parameter. The memory reserved is used by dump capture kernel when
>   * primary kernel is crashing.
> + *
> + * NOTE: Reservation of crashkernel,low is special since its existence
> + * is not independent, need rely on the existence of crashkernel,high.
> + * Here, four cases of crashkernel low memory reservation are summarized:
> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
> + *    memory takes Y;
> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> + *    take the default crashkernel low memory size;
> + * 3) crashkernel=X is specified, while fallback to get a memory region
> + *    in high memory, take the default crashkernel low memory size;
> + * 4) crashkernel='invalid value',low is specified, failed the whole
> + *    crashkernel reservation and bail out.

Following the x86 behaviour made sense when we were tried to get that
code generic. Now that we moved the logic under arch/arm64, we can
diverge a bit. I lost track of the original (v1/v2) proposal but I
wonder whether we still need the fallback to high for crashkernel=Y.
Maybe simpler, no fallbacks:

	crashkernel=Y - keep the current behaviour, ignore high,low
	crashkernel=Y,high - allocate above ZONE_DMA
	crashkernel=Y,low - allocate within ZONE_DMA

From your proposal, the difference is that the Y,high option won't
have any default ZONE_DMA fallback, one would have to explicitly pass
the Y,low option if needed.

Just a thought, maybe it makes the code simpler. But I'm open to
discussion if there are good arguments for the proposed (x86-like)
behaviour. One argument could be for crashkernel=Y to fall back to high
if distros don't want to bother with high/low settings.

Another thing I may have asked in the past, what happens if we run a new
kernel with these patches with old kexec user tools. I suspect the
crashkernel=Y with the fallback to high will confuse the tools.

BTW, please separate the NO_BLOCK_MAPPINGS optimisations from the
crashkernel above 4G. Let's get the crashkernel reservations sorted
first, it's been around for too long.

Thanks.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-26 18:02     ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-26 18:02 UTC (permalink / raw)
  To: kexec

On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
>   * This function reserves memory area given in "crashkernel=" kernel command
>   * line parameter. The memory reserved is used by dump capture kernel when
>   * primary kernel is crashing.
> + *
> + * NOTE: Reservation of crashkernel,low is special since its existence
> + * is not independent, need rely on the existence of crashkernel,high.
> + * Here, four cases of crashkernel low memory reservation are summarized:
> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
> + *    memory takes Y;
> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> + *    take the default crashkernel low memory size;
> + * 3) crashkernel=X is specified, while fallback to get a memory region
> + *    in high memory, take the default crashkernel low memory size;
> + * 4) crashkernel='invalid value',low is specified, failed the whole
> + *    crashkernel reservation and bail out.

Following the x86 behaviour made sense when we were tried to get that
code generic. Now that we moved the logic under arch/arm64, we can
diverge a bit. I lost track of the original (v1/v2) proposal but I
wonder whether we still need the fallback to high for crashkernel=Y.
Maybe simpler, no fallbacks:

	crashkernel=Y - keep the current behaviour, ignore high,low
	crashkernel=Y,high - allocate above ZONE_DMA
	crashkernel=Y,low - allocate within ZONE_DMA

From your proposal, the difference is that the Y,high option won't
have any default ZONE_DMA fallback, one would have to explicitly pass
the Y,low option if needed.

Just a thought, maybe it makes the code simpler. But I'm open to
discussion if there are good arguments for the proposed (x86-like)
behaviour. One argument could be for crashkernel=Y to fall back to high
if distros don't want to bother with high/low settings.

Another thing I may have asked in the past, what happens if we run a new
kernel with these patches with old kexec user tools. I suspect the
crashkernel=Y with the fallback to high will confuse the tools.

BTW, please separate the NO_BLOCK_MAPPINGS optimisations from the
crashkernel above 4G. Let's get the crashkernel reservations sorted
first, it's been around for too long.

Thanks.

-- 
Catalin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-26 18:02     ` Catalin Marinas
  (?)
@ 2022-04-27  6:54       ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-27  6:54 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/27 2:02, Catalin Marinas wrote:
> On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>>   * This function reserves memory area given in "crashkernel=" kernel command
>>   * line parameter. The memory reserved is used by dump capture kernel when
>>   * primary kernel is crashing.
>> + *
>> + * NOTE: Reservation of crashkernel,low is special since its existence
>> + * is not independent, need rely on the existence of crashkernel,high.
>> + * Here, four cases of crashkernel low memory reservation are summarized:
>> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
>> + *    memory takes Y;
>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>> + *    take the default crashkernel low memory size;
>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>> + *    in high memory, take the default crashkernel low memory size;
>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>> + *    crashkernel reservation and bail out.
> 
> Following the x86 behaviour made sense when we were tried to get that
> code generic. Now that we moved the logic under arch/arm64, we can
> diverge a bit. I lost track of the original (v1/v2) proposal but I
> wonder whether we still need the fallback to high for crashkernel=Y.

I don't think anyone has raised this demand yet! If it weren't for the
fact that crashkernel=X appeared earlier, it would probably have been
enough for a combination of crashkernel=X,high and crashkernel=Y,low.

In fact, I also tend not to support "fallback to high for crashkernel=Y".
I took over this from Chen Zhou. In the absence of any objection, I had
to inherit. Now that you've brought it up, I'm happy to delete it.
Supporting this feature complicates the code logic a lot. The point is,
it's not fully backwards compatible yet. For example, someone may want
crashkernel=3G to report failure, but the the new support make it work.

> Maybe simpler, no fallbacks:
> 
> 	crashkernel=Y - keep the current behaviour, ignore high,low
> 	crashkernel=Y,high - allocate above ZONE_DMA
> 	crashkernel=Y,low - allocate within ZONE_DMA
> 
>>From your proposal, the difference is that the Y,high option won't
> have any default ZONE_DMA fallback, one would have to explicitly pass
> the Y,low option if needed.

I agree with you. Now we don't need code generic, so there is no need to
carry the historical burden of other ARCHs. arm64 does not need to delve
into that empirical value(the default size of crash low memory).

> 
> Just a thought, maybe it makes the code simpler. But I'm open to
> discussion if there are good arguments for the proposed (x86-like)
> behaviour. One argument could be for crashkernel=Y to fall back to high
> if distros don't want to bother with high/low settings.

I think distros should take precedence over "crashkernel=Y,high". After all,
ZONE_DMA memory is more valuable than high memory.


> 
> Another thing I may have asked in the past, what happens if we run a new
> kernel with these patches with old kexec user tools. I suspect the
> crashkernel=Y with the fallback to high will confuse the tools.

If crashkernel=Y can reserve the memory in Zone_DMA successfully, the old
kexec works well. But if crashkernel=Y fallback to high memory, the second
kernel will boot failed, because the old kexec can only use dtb to pass the
high memory range to the second kernel. In comparison, if no fallback, we can
see crash memory reservation failure in the first kernel, so we have a chance
to adjust Y.

Currently, the new kexec tool will pick the last memory range(sorted by address
in ascending order) to store Image,dtb,initrd.


> 
> BTW, please separate the NO_BLOCK_MAPPINGS optimisations from the
> crashkernel above 4G. Let's get the crashkernel reservations sorted
> first, it's been around for too long.

OK, thank you. That's a good suggestion.

> 
> Thanks.
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27  6:54       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-27  6:54 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/27 2:02, Catalin Marinas wrote:
> On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>>   * This function reserves memory area given in "crashkernel=" kernel command
>>   * line parameter. The memory reserved is used by dump capture kernel when
>>   * primary kernel is crashing.
>> + *
>> + * NOTE: Reservation of crashkernel,low is special since its existence
>> + * is not independent, need rely on the existence of crashkernel,high.
>> + * Here, four cases of crashkernel low memory reservation are summarized:
>> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
>> + *    memory takes Y;
>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>> + *    take the default crashkernel low memory size;
>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>> + *    in high memory, take the default crashkernel low memory size;
>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>> + *    crashkernel reservation and bail out.
> 
> Following the x86 behaviour made sense when we were tried to get that
> code generic. Now that we moved the logic under arch/arm64, we can
> diverge a bit. I lost track of the original (v1/v2) proposal but I
> wonder whether we still need the fallback to high for crashkernel=Y.

I don't think anyone has raised this demand yet! If it weren't for the
fact that crashkernel=X appeared earlier, it would probably have been
enough for a combination of crashkernel=X,high and crashkernel=Y,low.

In fact, I also tend not to support "fallback to high for crashkernel=Y".
I took over this from Chen Zhou. In the absence of any objection, I had
to inherit. Now that you've brought it up, I'm happy to delete it.
Supporting this feature complicates the code logic a lot. The point is,
it's not fully backwards compatible yet. For example, someone may want
crashkernel=3G to report failure, but the the new support make it work.

> Maybe simpler, no fallbacks:
> 
> 	crashkernel=Y - keep the current behaviour, ignore high,low
> 	crashkernel=Y,high - allocate above ZONE_DMA
> 	crashkernel=Y,low - allocate within ZONE_DMA
> 
>>From your proposal, the difference is that the Y,high option won't
> have any default ZONE_DMA fallback, one would have to explicitly pass
> the Y,low option if needed.

I agree with you. Now we don't need code generic, so there is no need to
carry the historical burden of other ARCHs. arm64 does not need to delve
into that empirical value(the default size of crash low memory).

> 
> Just a thought, maybe it makes the code simpler. But I'm open to
> discussion if there are good arguments for the proposed (x86-like)
> behaviour. One argument could be for crashkernel=Y to fall back to high
> if distros don't want to bother with high/low settings.

I think distros should take precedence over "crashkernel=Y,high". After all,
ZONE_DMA memory is more valuable than high memory.


> 
> Another thing I may have asked in the past, what happens if we run a new
> kernel with these patches with old kexec user tools. I suspect the
> crashkernel=Y with the fallback to high will confuse the tools.

If crashkernel=Y can reserve the memory in Zone_DMA successfully, the old
kexec works well. But if crashkernel=Y fallback to high memory, the second
kernel will boot failed, because the old kexec can only use dtb to pass the
high memory range to the second kernel. In comparison, if no fallback, we can
see crash memory reservation failure in the first kernel, so we have a chance
to adjust Y.

Currently, the new kexec tool will pick the last memory range(sorted by address
in ascending order) to store Image,dtb,initrd.


> 
> BTW, please separate the NO_BLOCK_MAPPINGS optimisations from the
> crashkernel above 4G. Let's get the crashkernel reservations sorted
> first, it's been around for too long.

OK, thank you. That's a good suggestion.

> 
> Thanks.
> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27  6:54       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-27  6:54 UTC (permalink / raw)
  To: kexec



On 2022/4/27 2:02, Catalin Marinas wrote:
> On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>>   * This function reserves memory area given in "crashkernel=" kernel command
>>   * line parameter. The memory reserved is used by dump capture kernel when
>>   * primary kernel is crashing.
>> + *
>> + * NOTE: Reservation of crashkernel,low is special since its existence
>> + * is not independent, need rely on the existence of crashkernel,high.
>> + * Here, four cases of crashkernel low memory reservation are summarized:
>> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
>> + *    memory takes Y;
>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>> + *    take the default crashkernel low memory size;
>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>> + *    in high memory, take the default crashkernel low memory size;
>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>> + *    crashkernel reservation and bail out.
> 
> Following the x86 behaviour made sense when we were tried to get that
> code generic. Now that we moved the logic under arch/arm64, we can
> diverge a bit. I lost track of the original (v1/v2) proposal but I
> wonder whether we still need the fallback to high for crashkernel=Y.

I don't think anyone has raised this demand yet! If it weren't for the
fact that crashkernel=X appeared earlier, it would probably have been
enough for a combination of crashkernel=X,high and crashkernel=Y,low.

In fact, I also tend not to support "fallback to high for crashkernel=Y".
I took over this from Chen Zhou. In the absence of any objection, I had
to inherit. Now that you've brought it up, I'm happy to delete it.
Supporting this feature complicates the code logic a lot. The point is,
it's not fully backwards compatible yet. For example, someone may want
crashkernel=3G to report failure, but the the new support make it work.

> Maybe simpler, no fallbacks:
> 
> 	crashkernel=Y - keep the current behaviour, ignore high,low
> 	crashkernel=Y,high - allocate above ZONE_DMA
> 	crashkernel=Y,low - allocate within ZONE_DMA
> 
>>From your proposal, the difference is that the Y,high option won't
> have any default ZONE_DMA fallback, one would have to explicitly pass
> the Y,low option if needed.

I agree with you. Now we don't need code generic, so there is no need to
carry the historical burden of other ARCHs. arm64 does not need to delve
into that empirical value(the default size of crash low memory).

> 
> Just a thought, maybe it makes the code simpler. But I'm open to
> discussion if there are good arguments for the proposed (x86-like)
> behaviour. One argument could be for crashkernel=Y to fall back to high
> if distros don't want to bother with high/low settings.

I think distros should take precedence over "crashkernel=Y,high". After all,
ZONE_DMA memory is more valuable than high memory.


> 
> Another thing I may have asked in the past, what happens if we run a new
> kernel with these patches with old kexec user tools. I suspect the
> crashkernel=Y with the fallback to high will confuse the tools.

If crashkernel=Y can reserve the memory in Zone_DMA successfully, the old
kexec works well. But if crashkernel=Y fallback to high memory, the second
kernel will boot failed, because the old kexec can only use dtb to pass the
high memory range to the second kernel. In comparison, if no fallback, we can
see crash memory reservation failure in the first kernel, so we have a chance
to adjust Y.

Currently, the new kexec tool will pick the last memory range(sorted by address
in ascending order) to store Image,dtb,initrd.


> 
> BTW, please separate the NO_BLOCK_MAPPINGS optimisations from the
> crashkernel above 4G. Let's get the crashkernel reservations sorted
> first, it's been around for too long.

OK, thank you. That's a good suggestion.

> 
> Thanks.
> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
  2022-04-26 14:26     ` Catalin Marinas
  (?)
@ 2022-04-27  7:12       ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-27  7:12 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/26 22:26, Catalin Marinas wrote:
> On Thu, Apr 14, 2022 at 07:57:15PM +0800, Zhen Lei wrote:
>> @@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
>>  	for_each_mem_range(i, &start, &end) {
>>  		if (start >= end)
>>  			break;
>> +
>> +#ifdef CONFIG_KEXEC_CORE
>> +		if (eflags && (end >= SZ_4G)) {
>> +			/*
>> +			 * The memory block cross the 4G boundary.
>> +			 * Forcibly use page-level mappings for memory under 4G.
>> +			 */
>> +			if (start < SZ_4G) {
>> +				__map_memblock(pgdp, start, SZ_4G - 1,
>> +					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
>> +				start  = SZ_4G;
>> +			}
>> +
>> +			/* Page-level mappings is not mandatory for memory above 4G */
>> +			eflags = 0;
>> +		}
>> +#endif
> 
> That's a bit tricky if a SoC has all RAM above 4G. IIRC AMD Seattle had
> this layout. See max_zone_phys() for how we deal with this, basically
> extending ZONE_DMA to the whole range if RAM starts above 4GB. In that
> case, crashkernel reservation would fall in the range above 4GB.
> 
> BTW, we changed the max_zone_phys() logic with commit 791ab8b2e3db
> ("arm64: Ignore any DMA offsets in the max_zone_phys() calculation").

Okay, thanks for your correction. I'll dig into it after I've done the original requirement.

> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
@ 2022-04-27  7:12       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-27  7:12 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/26 22:26, Catalin Marinas wrote:
> On Thu, Apr 14, 2022 at 07:57:15PM +0800, Zhen Lei wrote:
>> @@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
>>  	for_each_mem_range(i, &start, &end) {
>>  		if (start >= end)
>>  			break;
>> +
>> +#ifdef CONFIG_KEXEC_CORE
>> +		if (eflags && (end >= SZ_4G)) {
>> +			/*
>> +			 * The memory block cross the 4G boundary.
>> +			 * Forcibly use page-level mappings for memory under 4G.
>> +			 */
>> +			if (start < SZ_4G) {
>> +				__map_memblock(pgdp, start, SZ_4G - 1,
>> +					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
>> +				start  = SZ_4G;
>> +			}
>> +
>> +			/* Page-level mappings is not mandatory for memory above 4G */
>> +			eflags = 0;
>> +		}
>> +#endif
> 
> That's a bit tricky if a SoC has all RAM above 4G. IIRC AMD Seattle had
> this layout. See max_zone_phys() for how we deal with this, basically
> extending ZONE_DMA to the whole range if RAM starts above 4GB. In that
> case, crashkernel reservation would fall in the range above 4GB.
> 
> BTW, we changed the max_zone_phys() logic with commit 791ab8b2e3db
> ("arm64: Ignore any DMA offsets in the max_zone_phys() calculation").

Okay, thanks for your correction. I'll dig into it after I've done the original requirement.

> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G
@ 2022-04-27  7:12       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-27  7:12 UTC (permalink / raw)
  To: kexec



On 2022/4/26 22:26, Catalin Marinas wrote:
> On Thu, Apr 14, 2022 at 07:57:15PM +0800, Zhen Lei wrote:
>> @@ -540,13 +540,31 @@ static void __init map_mem(pgd_t *pgdp)
>>  	for_each_mem_range(i, &start, &end) {
>>  		if (start >= end)
>>  			break;
>> +
>> +#ifdef CONFIG_KEXEC_CORE
>> +		if (eflags && (end >= SZ_4G)) {
>> +			/*
>> +			 * The memory block cross the 4G boundary.
>> +			 * Forcibly use page-level mappings for memory under 4G.
>> +			 */
>> +			if (start < SZ_4G) {
>> +				__map_memblock(pgdp, start, SZ_4G - 1,
>> +					       pgprot_tagged(PAGE_KERNEL), flags | eflags);
>> +				start  = SZ_4G;
>> +			}
>> +
>> +			/* Page-level mappings is not mandatory for memory above 4G */
>> +			eflags = 0;
>> +		}
>> +#endif
> 
> That's a bit tricky if a SoC has all RAM above 4G. IIRC AMD Seattle had
> this layout. See max_zone_phys() for how we deal with this, basically
> extending ZONE_DMA to the whole range if RAM starts above 4GB. In that
> case, crashkernel reservation would fall in the range above 4GB.
> 
> BTW, we changed the max_zone_phys() logic with commit 791ab8b2e3db
> ("arm64: Ignore any DMA offsets in the max_zone_phys() calculation").

Okay, thanks for your correction. I'll dig into it after I've done the original requirement.

> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-27  6:54       ` Leizhen (ThunderTown)
  (?)
@ 2022-04-27 12:32         ` Catalin Marinas
  -1 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-27 12:32 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Wed, Apr 27, 2022 at 02:54:52PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/27 2:02, Catalin Marinas wrote:
> > On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
> >>  /*
> >>   * reserve_crashkernel() - reserves memory for crash kernel
> >>   *
> >>   * This function reserves memory area given in "crashkernel=" kernel command
> >>   * line parameter. The memory reserved is used by dump capture kernel when
> >>   * primary kernel is crashing.
> >> + *
> >> + * NOTE: Reservation of crashkernel,low is special since its existence
> >> + * is not independent, need rely on the existence of crashkernel,high.
> >> + * Here, four cases of crashkernel low memory reservation are summarized:
> >> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
> >> + *    memory takes Y;
> >> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> >> + *    take the default crashkernel low memory size;
> >> + * 3) crashkernel=X is specified, while fallback to get a memory region
> >> + *    in high memory, take the default crashkernel low memory size;
> >> + * 4) crashkernel='invalid value',low is specified, failed the whole
> >> + *    crashkernel reservation and bail out.
> > 
> > Following the x86 behaviour made sense when we were tried to get that
> > code generic. Now that we moved the logic under arch/arm64, we can
> > diverge a bit. I lost track of the original (v1/v2) proposal but I
> > wonder whether we still need the fallback to high for crashkernel=Y.
> 
> I don't think anyone has raised this demand yet! If it weren't for the
> fact that crashkernel=X appeared earlier, it would probably have been
> enough for a combination of crashkernel=X,high and crashkernel=Y,low.
> 
> In fact, I also tend not to support "fallback to high for crashkernel=Y".
> I took over this from Chen Zhou. In the absence of any objection, I had
> to inherit. Now that you've brought it up, I'm happy to delete it.
> Supporting this feature complicates the code logic a lot. The point is,
> it's not fully backwards compatible yet. For example, someone may want
> crashkernel=3G to report failure, but the the new support make it work.

BTW, prior to v20, this patch had this line:

	crashk_low_res.name = "Crash kernel (low)";

I can't find it anymore. Do the kexec tools need to distinguish between
low and high or they can cope with multiple "Crash kernel" entries?

> > Maybe simpler, no fallbacks:
> > 
> > 	crashkernel=Y - keep the current behaviour, ignore high,low
> > 	crashkernel=Y,high - allocate above ZONE_DMA
> > 	crashkernel=Y,low - allocate within ZONE_DMA
> > 
> > From your proposal, the difference is that the Y,high option won't
> > have any default ZONE_DMA fallback, one would have to explicitly pass
> > the Y,low option if needed.
> 
> I agree with you. Now we don't need code generic, so there is no need to
> carry the historical burden of other ARCHs. arm64 does not need to delve
> into that empirical value(the default size of crash low memory).
> 
> > Just a thought, maybe it makes the code simpler. But I'm open to
> > discussion if there are good arguments for the proposed (x86-like)
> > behaviour. One argument could be for crashkernel=Y to fall back to high
> > if distros don't want to bother with high/low settings.
> 
> I think distros should take precedence over "crashkernel=Y,high". After all,
> ZONE_DMA memory is more valuable than high memory.

My point is whether an admin configuring the kernel command line needs
to know the layout of ZONE_DMA etc. to figure out how much to pass in
high and low. The fallbacks in this case have some value but they also
complicate the code logic. The 4GB limit does not always make sense
either for some platforms (RPi4 has a ZONE_DMA limit of 1GB).

I think one could always pass a default command line like:

	crashkernel=1G,high crashkernel=128M,low

without much knowledge of the SoC memory layout.

Another option is to only introduce crashkernel=Y,low and, when that is
passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
'high' option at all:

	crashkernel=1G				- all within ZONE_DMA
	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
						  1G above ZONE_DMA

If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
the 'low' option.

-- 
Catalin

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27 12:32         ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-27 12:32 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Wed, Apr 27, 2022 at 02:54:52PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/27 2:02, Catalin Marinas wrote:
> > On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
> >>  /*
> >>   * reserve_crashkernel() - reserves memory for crash kernel
> >>   *
> >>   * This function reserves memory area given in "crashkernel=" kernel command
> >>   * line parameter. The memory reserved is used by dump capture kernel when
> >>   * primary kernel is crashing.
> >> + *
> >> + * NOTE: Reservation of crashkernel,low is special since its existence
> >> + * is not independent, need rely on the existence of crashkernel,high.
> >> + * Here, four cases of crashkernel low memory reservation are summarized:
> >> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
> >> + *    memory takes Y;
> >> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> >> + *    take the default crashkernel low memory size;
> >> + * 3) crashkernel=X is specified, while fallback to get a memory region
> >> + *    in high memory, take the default crashkernel low memory size;
> >> + * 4) crashkernel='invalid value',low is specified, failed the whole
> >> + *    crashkernel reservation and bail out.
> > 
> > Following the x86 behaviour made sense when we were tried to get that
> > code generic. Now that we moved the logic under arch/arm64, we can
> > diverge a bit. I lost track of the original (v1/v2) proposal but I
> > wonder whether we still need the fallback to high for crashkernel=Y.
> 
> I don't think anyone has raised this demand yet! If it weren't for the
> fact that crashkernel=X appeared earlier, it would probably have been
> enough for a combination of crashkernel=X,high and crashkernel=Y,low.
> 
> In fact, I also tend not to support "fallback to high for crashkernel=Y".
> I took over this from Chen Zhou. In the absence of any objection, I had
> to inherit. Now that you've brought it up, I'm happy to delete it.
> Supporting this feature complicates the code logic a lot. The point is,
> it's not fully backwards compatible yet. For example, someone may want
> crashkernel=3G to report failure, but the the new support make it work.

BTW, prior to v20, this patch had this line:

	crashk_low_res.name = "Crash kernel (low)";

I can't find it anymore. Do the kexec tools need to distinguish between
low and high or they can cope with multiple "Crash kernel" entries?

> > Maybe simpler, no fallbacks:
> > 
> > 	crashkernel=Y - keep the current behaviour, ignore high,low
> > 	crashkernel=Y,high - allocate above ZONE_DMA
> > 	crashkernel=Y,low - allocate within ZONE_DMA
> > 
> > From your proposal, the difference is that the Y,high option won't
> > have any default ZONE_DMA fallback, one would have to explicitly pass
> > the Y,low option if needed.
> 
> I agree with you. Now we don't need code generic, so there is no need to
> carry the historical burden of other ARCHs. arm64 does not need to delve
> into that empirical value(the default size of crash low memory).
> 
> > Just a thought, maybe it makes the code simpler. But I'm open to
> > discussion if there are good arguments for the proposed (x86-like)
> > behaviour. One argument could be for crashkernel=Y to fall back to high
> > if distros don't want to bother with high/low settings.
> 
> I think distros should take precedence over "crashkernel=Y,high". After all,
> ZONE_DMA memory is more valuable than high memory.

My point is whether an admin configuring the kernel command line needs
to know the layout of ZONE_DMA etc. to figure out how much to pass in
high and low. The fallbacks in this case have some value but they also
complicate the code logic. The 4GB limit does not always make sense
either for some platforms (RPi4 has a ZONE_DMA limit of 1GB).

I think one could always pass a default command line like:

	crashkernel=1G,high crashkernel=128M,low

without much knowledge of the SoC memory layout.

Another option is to only introduce crashkernel=Y,low and, when that is
passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
'high' option at all:

	crashkernel=1G				- all within ZONE_DMA
	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
						  1G above ZONE_DMA

If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
the 'low' option.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27 12:32         ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-27 12:32 UTC (permalink / raw)
  To: kexec

On Wed, Apr 27, 2022 at 02:54:52PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/27 2:02, Catalin Marinas wrote:
> > On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
> >>  /*
> >>   * reserve_crashkernel() - reserves memory for crash kernel
> >>   *
> >>   * This function reserves memory area given in "crashkernel=" kernel command
> >>   * line parameter. The memory reserved is used by dump capture kernel when
> >>   * primary kernel is crashing.
> >> + *
> >> + * NOTE: Reservation of crashkernel,low is special since its existence
> >> + * is not independent, need rely on the existence of crashkernel,high.
> >> + * Here, four cases of crashkernel low memory reservation are summarized:
> >> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
> >> + *    memory takes Y;
> >> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
> >> + *    take the default crashkernel low memory size;
> >> + * 3) crashkernel=X is specified, while fallback to get a memory region
> >> + *    in high memory, take the default crashkernel low memory size;
> >> + * 4) crashkernel='invalid value',low is specified, failed the whole
> >> + *    crashkernel reservation and bail out.
> > 
> > Following the x86 behaviour made sense when we were tried to get that
> > code generic. Now that we moved the logic under arch/arm64, we can
> > diverge a bit. I lost track of the original (v1/v2) proposal but I
> > wonder whether we still need the fallback to high for crashkernel=Y.
> 
> I don't think anyone has raised this demand yet! If it weren't for the
> fact that crashkernel=X appeared earlier, it would probably have been
> enough for a combination of crashkernel=X,high and crashkernel=Y,low.
> 
> In fact, I also tend not to support "fallback to high for crashkernel=Y".
> I took over this from Chen Zhou. In the absence of any objection, I had
> to inherit. Now that you've brought it up, I'm happy to delete it.
> Supporting this feature complicates the code logic a lot. The point is,
> it's not fully backwards compatible yet. For example, someone may want
> crashkernel=3G to report failure, but the the new support make it work.

BTW, prior to v20, this patch had this line:

	crashk_low_res.name = "Crash kernel (low)";

I can't find it anymore. Do the kexec tools need to distinguish between
low and high or they can cope with multiple "Crash kernel" entries?

> > Maybe simpler, no fallbacks:
> > 
> > 	crashkernel=Y - keep the current behaviour, ignore high,low
> > 	crashkernel=Y,high - allocate above ZONE_DMA
> > 	crashkernel=Y,low - allocate within ZONE_DMA
> > 
> > From your proposal, the difference is that the Y,high option won't
> > have any default ZONE_DMA fallback, one would have to explicitly pass
> > the Y,low option if needed.
> 
> I agree with you. Now we don't need code generic, so there is no need to
> carry the historical burden of other ARCHs. arm64 does not need to delve
> into that empirical value(the default size of crash low memory).
> 
> > Just a thought, maybe it makes the code simpler. But I'm open to
> > discussion if there are good arguments for the proposed (x86-like)
> > behaviour. One argument could be for crashkernel=Y to fall back to high
> > if distros don't want to bother with high/low settings.
> 
> I think distros should take precedence over "crashkernel=Y,high". After all,
> ZONE_DMA memory is more valuable than high memory.

My point is whether an admin configuring the kernel command line needs
to know the layout of ZONE_DMA etc. to figure out how much to pass in
high and low. The fallbacks in this case have some value but they also
complicate the code logic. The 4GB limit does not always make sense
either for some platforms (RPi4 has a ZONE_DMA limit of 1GB).

I think one could always pass a default command line like:

	crashkernel=1G,high crashkernel=128M,low

without much knowledge of the SoC memory layout.

Another option is to only introduce crashkernel=Y,low and, when that is
passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
'high' option at all:

	crashkernel=1G				- all within ZONE_DMA
	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
						  1G above ZONE_DMA

If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
the 'low' option.

-- 
Catalin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-27 12:32         ` Catalin Marinas
  (?)
@ 2022-04-27 13:49           ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-27 13:49 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/27 20:32, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 02:54:52PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/27 2:02, Catalin Marinas wrote:
>>> On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>>>>  /*
>>>>   * reserve_crashkernel() - reserves memory for crash kernel
>>>>   *
>>>>   * This function reserves memory area given in "crashkernel=" kernel command
>>>>   * line parameter. The memory reserved is used by dump capture kernel when
>>>>   * primary kernel is crashing.
>>>> + *
>>>> + * NOTE: Reservation of crashkernel,low is special since its existence
>>>> + * is not independent, need rely on the existence of crashkernel,high.
>>>> + * Here, four cases of crashkernel low memory reservation are summarized:
>>>> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
>>>> + *    memory takes Y;
>>>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>>>> + *    take the default crashkernel low memory size;
>>>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>>>> + *    in high memory, take the default crashkernel low memory size;
>>>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>>>> + *    crashkernel reservation and bail out.
>>>
>>> Following the x86 behaviour made sense when we were tried to get that
>>> code generic. Now that we moved the logic under arch/arm64, we can
>>> diverge a bit. I lost track of the original (v1/v2) proposal but I
>>> wonder whether we still need the fallback to high for crashkernel=Y.
>>
>> I don't think anyone has raised this demand yet! If it weren't for the
>> fact that crashkernel=X appeared earlier, it would probably have been
>> enough for a combination of crashkernel=X,high and crashkernel=Y,low.
>>
>> In fact, I also tend not to support "fallback to high for crashkernel=Y".
>> I took over this from Chen Zhou. In the absence of any objection, I had
>> to inherit. Now that you've brought it up, I'm happy to delete it.
>> Supporting this feature complicates the code logic a lot. The point is,
>> it's not fully backwards compatible yet. For example, someone may want
>> crashkernel=3G to report failure, but the the new support make it work.
> 
> BTW, prior to v20, this patch had this line:
> 
> 	crashk_low_res.name = "Crash kernel (low)";
> 
> I can't find it anymore. Do the kexec tools need to distinguish between
> low and high or they can cope with multiple "Crash kernel" entries?

Yes, I've updated the kexec tool patch based on Borislav Petkov's comments
to keep it the same as x86. And this patch has been merged into kexec v2.0.24.
b5a34a2 arm64: support more than one crash kernel regions

The kexec tool will first sorted all "Crash kernel" memory range in ascending
order, then choose the last one (the memory range with the highest address) to
store the second kernel's Image,dtb,initrd.

> 
>>> Maybe simpler, no fallbacks:
>>>
>>> 	crashkernel=Y - keep the current behaviour, ignore high,low
>>> 	crashkernel=Y,high - allocate above ZONE_DMA
>>> 	crashkernel=Y,low - allocate within ZONE_DMA
>>>
>>> From your proposal, the difference is that the Y,high option won't
>>> have any default ZONE_DMA fallback, one would have to explicitly pass
>>> the Y,low option if needed.
>>
>> I agree with you. Now we don't need code generic, so there is no need to
>> carry the historical burden of other ARCHs. arm64 does not need to delve
>> into that empirical value(the default size of crash low memory).
>>
>>> Just a thought, maybe it makes the code simpler. But I'm open to
>>> discussion if there are good arguments for the proposed (x86-like)
>>> behaviour. One argument could be for crashkernel=Y to fall back to high
>>> if distros don't want to bother with high/low settings.
>>
>> I think distros should take precedence over "crashkernel=Y,high". After all,
>> ZONE_DMA memory is more valuable than high memory.
> 
> My point is whether an admin configuring the kernel command line needs
> to know the layout of ZONE_DMA etc. to figure out how much to pass in

No need.

> high and low. The fallbacks in this case have some value but they also
> complicate the code logic. The 4GB limit does not always make sense
> either for some platforms (RPi4 has a ZONE_DMA limit of 1GB).
> 
> I think one could always pass a default command line like:
> 
> 	crashkernel=1G,high crashkernel=128M,low
> 
> without much knowledge of the SoC memory layout.

Yes, that's what the end result is. The user specify crashkernel=128M,low
and the implementation ensure the 128M low memory is allocated from DMA zone.
We use arm64_dma_phys_limit as the upper limit for crash low memory.

+#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
+       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                               crash_base, crash_max);

> 
> Another option is to only introduce crashkernel=Y,low and, when that is
> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> 'high' option at all:
> 
> 	crashkernel=1G				- all within ZONE_DMA
> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> 						  1G above ZONE_DMA
> 
> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> the 'low' option.

I think although the code is hard to make generic, the interface is better to
be relatively uniform. A user might have to maintain both x86 and arm64, and
so on. It's not a good thing that the difference is too big.

A well had already been dug by the forefathers, and it seemed unnecessary to
dig another well beside it if there was no particular advantage.

> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27 13:49           ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-27 13:49 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/27 20:32, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 02:54:52PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/27 2:02, Catalin Marinas wrote:
>>> On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>>>>  /*
>>>>   * reserve_crashkernel() - reserves memory for crash kernel
>>>>   *
>>>>   * This function reserves memory area given in "crashkernel=" kernel command
>>>>   * line parameter. The memory reserved is used by dump capture kernel when
>>>>   * primary kernel is crashing.
>>>> + *
>>>> + * NOTE: Reservation of crashkernel,low is special since its existence
>>>> + * is not independent, need rely on the existence of crashkernel,high.
>>>> + * Here, four cases of crashkernel low memory reservation are summarized:
>>>> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
>>>> + *    memory takes Y;
>>>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>>>> + *    take the default crashkernel low memory size;
>>>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>>>> + *    in high memory, take the default crashkernel low memory size;
>>>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>>>> + *    crashkernel reservation and bail out.
>>>
>>> Following the x86 behaviour made sense when we were tried to get that
>>> code generic. Now that we moved the logic under arch/arm64, we can
>>> diverge a bit. I lost track of the original (v1/v2) proposal but I
>>> wonder whether we still need the fallback to high for crashkernel=Y.
>>
>> I don't think anyone has raised this demand yet! If it weren't for the
>> fact that crashkernel=X appeared earlier, it would probably have been
>> enough for a combination of crashkernel=X,high and crashkernel=Y,low.
>>
>> In fact, I also tend not to support "fallback to high for crashkernel=Y".
>> I took over this from Chen Zhou. In the absence of any objection, I had
>> to inherit. Now that you've brought it up, I'm happy to delete it.
>> Supporting this feature complicates the code logic a lot. The point is,
>> it's not fully backwards compatible yet. For example, someone may want
>> crashkernel=3G to report failure, but the the new support make it work.
> 
> BTW, prior to v20, this patch had this line:
> 
> 	crashk_low_res.name = "Crash kernel (low)";
> 
> I can't find it anymore. Do the kexec tools need to distinguish between
> low and high or they can cope with multiple "Crash kernel" entries?

Yes, I've updated the kexec tool patch based on Borislav Petkov's comments
to keep it the same as x86. And this patch has been merged into kexec v2.0.24.
b5a34a2 arm64: support more than one crash kernel regions

The kexec tool will first sorted all "Crash kernel" memory range in ascending
order, then choose the last one (the memory range with the highest address) to
store the second kernel's Image,dtb,initrd.

> 
>>> Maybe simpler, no fallbacks:
>>>
>>> 	crashkernel=Y - keep the current behaviour, ignore high,low
>>> 	crashkernel=Y,high - allocate above ZONE_DMA
>>> 	crashkernel=Y,low - allocate within ZONE_DMA
>>>
>>> From your proposal, the difference is that the Y,high option won't
>>> have any default ZONE_DMA fallback, one would have to explicitly pass
>>> the Y,low option if needed.
>>
>> I agree with you. Now we don't need code generic, so there is no need to
>> carry the historical burden of other ARCHs. arm64 does not need to delve
>> into that empirical value(the default size of crash low memory).
>>
>>> Just a thought, maybe it makes the code simpler. But I'm open to
>>> discussion if there are good arguments for the proposed (x86-like)
>>> behaviour. One argument could be for crashkernel=Y to fall back to high
>>> if distros don't want to bother with high/low settings.
>>
>> I think distros should take precedence over "crashkernel=Y,high". After all,
>> ZONE_DMA memory is more valuable than high memory.
> 
> My point is whether an admin configuring the kernel command line needs
> to know the layout of ZONE_DMA etc. to figure out how much to pass in

No need.

> high and low. The fallbacks in this case have some value but they also
> complicate the code logic. The 4GB limit does not always make sense
> either for some platforms (RPi4 has a ZONE_DMA limit of 1GB).
> 
> I think one could always pass a default command line like:
> 
> 	crashkernel=1G,high crashkernel=128M,low
> 
> without much knowledge of the SoC memory layout.

Yes, that's what the end result is. The user specify crashkernel=128M,low
and the implementation ensure the 128M low memory is allocated from DMA zone.
We use arm64_dma_phys_limit as the upper limit for crash low memory.

+#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
+       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                               crash_base, crash_max);

> 
> Another option is to only introduce crashkernel=Y,low and, when that is
> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> 'high' option at all:
> 
> 	crashkernel=1G				- all within ZONE_DMA
> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> 						  1G above ZONE_DMA
> 
> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> the 'low' option.

I think although the code is hard to make generic, the interface is better to
be relatively uniform. A user might have to maintain both x86 and arm64, and
so on. It's not a good thing that the difference is too big.

A well had already been dug by the forefathers, and it seemed unnecessary to
dig another well beside it if there was no particular advantage.

> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27 13:49           ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-27 13:49 UTC (permalink / raw)
  To: kexec



On 2022/4/27 20:32, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 02:54:52PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/27 2:02, Catalin Marinas wrote:
>>> On Thu, Apr 14, 2022 at 07:57:16PM +0800, Zhen Lei wrote:
>>>>  /*
>>>>   * reserve_crashkernel() - reserves memory for crash kernel
>>>>   *
>>>>   * This function reserves memory area given in "crashkernel=" kernel command
>>>>   * line parameter. The memory reserved is used by dump capture kernel when
>>>>   * primary kernel is crashing.
>>>> + *
>>>> + * NOTE: Reservation of crashkernel,low is special since its existence
>>>> + * is not independent, need rely on the existence of crashkernel,high.
>>>> + * Here, four cases of crashkernel low memory reservation are summarized:
>>>> + * 1) crashkernel=Y,low is specified explicitly, the size of crashkernel low
>>>> + *    memory takes Y;
>>>> + * 2) crashkernel=,low is not given, while crashkernel=,high is specified,
>>>> + *    take the default crashkernel low memory size;
>>>> + * 3) crashkernel=X is specified, while fallback to get a memory region
>>>> + *    in high memory, take the default crashkernel low memory size;
>>>> + * 4) crashkernel='invalid value',low is specified, failed the whole
>>>> + *    crashkernel reservation and bail out.
>>>
>>> Following the x86 behaviour made sense when we were tried to get that
>>> code generic. Now that we moved the logic under arch/arm64, we can
>>> diverge a bit. I lost track of the original (v1/v2) proposal but I
>>> wonder whether we still need the fallback to high for crashkernel=Y.
>>
>> I don't think anyone has raised this demand yet! If it weren't for the
>> fact that crashkernel=X appeared earlier, it would probably have been
>> enough for a combination of crashkernel=X,high and crashkernel=Y,low.
>>
>> In fact, I also tend not to support "fallback to high for crashkernel=Y".
>> I took over this from Chen Zhou. In the absence of any objection, I had
>> to inherit. Now that you've brought it up, I'm happy to delete it.
>> Supporting this feature complicates the code logic a lot. The point is,
>> it's not fully backwards compatible yet. For example, someone may want
>> crashkernel=3G to report failure, but the the new support make it work.
> 
> BTW, prior to v20, this patch had this line:
> 
> 	crashk_low_res.name = "Crash kernel (low)";
> 
> I can't find it anymore. Do the kexec tools need to distinguish between
> low and high or they can cope with multiple "Crash kernel" entries?

Yes, I've updated the kexec tool patch based on Borislav Petkov's comments
to keep it the same as x86. And this patch has been merged into kexec v2.0.24.
b5a34a2 arm64: support more than one crash kernel regions

The kexec tool will first sorted all "Crash kernel" memory range in ascending
order, then choose the last one (the memory range with the highest address) to
store the second kernel's Image,dtb,initrd.

> 
>>> Maybe simpler, no fallbacks:
>>>
>>> 	crashkernel=Y - keep the current behaviour, ignore high,low
>>> 	crashkernel=Y,high - allocate above ZONE_DMA
>>> 	crashkernel=Y,low - allocate within ZONE_DMA
>>>
>>> From your proposal, the difference is that the Y,high option won't
>>> have any default ZONE_DMA fallback, one would have to explicitly pass
>>> the Y,low option if needed.
>>
>> I agree with you. Now we don't need code generic, so there is no need to
>> carry the historical burden of other ARCHs. arm64 does not need to delve
>> into that empirical value(the default size of crash low memory).
>>
>>> Just a thought, maybe it makes the code simpler. But I'm open to
>>> discussion if there are good arguments for the proposed (x86-like)
>>> behaviour. One argument could be for crashkernel=Y to fall back to high
>>> if distros don't want to bother with high/low settings.
>>
>> I think distros should take precedence over "crashkernel=Y,high". After all,
>> ZONE_DMA memory is more valuable than high memory.
> 
> My point is whether an admin configuring the kernel command line needs
> to know the layout of ZONE_DMA etc. to figure out how much to pass in

No need.

> high and low. The fallbacks in this case have some value but they also
> complicate the code logic. The 4GB limit does not always make sense
> either for some platforms (RPi4 has a ZONE_DMA limit of 1GB).
> 
> I think one could always pass a default command line like:
> 
> 	crashkernel=1G,high crashkernel=128M,low
> 
> without much knowledge of the SoC memory layout.

Yes, that's what the end result is. The user specify crashkernel=128M,low
and the implementation ensure the 128M low memory is allocated from DMA zone.
We use arm64_dma_phys_limit as the upper limit for crash low memory.

+#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
+       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                               crash_base, crash_max);

> 
> Another option is to only introduce crashkernel=Y,low and, when that is
> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> 'high' option at all:
> 
> 	crashkernel=1G				- all within ZONE_DMA
> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> 						  1G above ZONE_DMA
> 
> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> the 'low' option.

I think although the code is hard to make generic, the interface is better to
be relatively uniform. A user might have to maintain both x86 and arm64, and
so on. It's not a good thing that the difference is too big.

A well had already been dug by the forefathers, and it seemed unnecessary to
dig another well beside it if there was no particular advantage.

> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-27 13:49           ` Leizhen (ThunderTown)
  (?)
@ 2022-04-27 16:04             ` Catalin Marinas
  -1 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-27 16:04 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/27 20:32, Catalin Marinas wrote:
> > I think one could always pass a default command line like:
> > 
> > 	crashkernel=1G,high crashkernel=128M,low
> > 
> > without much knowledge of the SoC memory layout.
> 
> Yes, that's what the end result is. The user specify crashkernel=128M,low
> and the implementation ensure the 128M low memory is allocated from DMA zone.
> We use arm64_dma_phys_limit as the upper limit for crash low memory.
> 
> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>                                                crash_base, crash_max);
> 
> > Another option is to only introduce crashkernel=Y,low and, when that is
> > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > 'high' option at all:
> > 
> > 	crashkernel=1G				- all within ZONE_DMA
> > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > 						  1G above ZONE_DMA
> > 
> > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > the 'low' option.
> 
> I think although the code is hard to make generic, the interface is better to
> be relatively uniform. A user might have to maintain both x86 and arm64, and
> so on. It's not a good thing that the difference is too big.

There will be some difference as the 4G limit doesn't always hold for
arm64 (though it's true in most cases). Anyway, we can probably simplify
things a bit while following the documented behaviour:

	crashkernel=Y		- current behaviour within ZONE_DMA
	crashkernel=Y,high	- allocate from above ZONE_DMA
	crashkernel=Y,low	- allocate within ZONE_DMA

There is no fallback from crashkernel=Y.

The question is whether we still want a default low allocation if
crashkernel=Y,low is missing but 'high' is present. If we add this, I
think we'd be consistent with kernel-parameters.txt for the 'low'
description. A default 'low' is probably not that bad but I'm tempted to
always mandate both 'high' and 'low'.

-- 
Catalin

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27 16:04             ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-27 16:04 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp

On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/27 20:32, Catalin Marinas wrote:
> > I think one could always pass a default command line like:
> > 
> > 	crashkernel=1G,high crashkernel=128M,low
> > 
> > without much knowledge of the SoC memory layout.
> 
> Yes, that's what the end result is. The user specify crashkernel=128M,low
> and the implementation ensure the 128M low memory is allocated from DMA zone.
> We use arm64_dma_phys_limit as the upper limit for crash low memory.
> 
> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>                                                crash_base, crash_max);
> 
> > Another option is to only introduce crashkernel=Y,low and, when that is
> > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > 'high' option at all:
> > 
> > 	crashkernel=1G				- all within ZONE_DMA
> > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > 						  1G above ZONE_DMA
> > 
> > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > the 'low' option.
> 
> I think although the code is hard to make generic, the interface is better to
> be relatively uniform. A user might have to maintain both x86 and arm64, and
> so on. It's not a good thing that the difference is too big.

There will be some difference as the 4G limit doesn't always hold for
arm64 (though it's true in most cases). Anyway, we can probably simplify
things a bit while following the documented behaviour:

	crashkernel=Y		- current behaviour within ZONE_DMA
	crashkernel=Y,high	- allocate from above ZONE_DMA
	crashkernel=Y,low	- allocate within ZONE_DMA

There is no fallback from crashkernel=Y.

The question is whether we still want a default low allocation if
crashkernel=Y,low is missing but 'high' is present. If we add this, I
think we'd be consistent with kernel-parameters.txt for the 'low'
description. A default 'low' is probably not that bad but I'm tempted to
always mandate both 'high' and 'low'.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-27 16:04             ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-04-27 16:04 UTC (permalink / raw)
  To: kexec

On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/27 20:32, Catalin Marinas wrote:
> > I think one could always pass a default command line like:
> > 
> > 	crashkernel=1G,high crashkernel=128M,low
> > 
> > without much knowledge of the SoC memory layout.
> 
> Yes, that's what the end result is. The user specify crashkernel=128M,low
> and the implementation ensure the 128M low memory is allocated from DMA zone.
> We use arm64_dma_phys_limit as the upper limit for crash low memory.
> 
> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>                                                crash_base, crash_max);
> 
> > Another option is to only introduce crashkernel=Y,low and, when that is
> > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > 'high' option at all:
> > 
> > 	crashkernel=1G				- all within ZONE_DMA
> > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > 						  1G above ZONE_DMA
> > 
> > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > the 'low' option.
> 
> I think although the code is hard to make generic, the interface is better to
> be relatively uniform. A user might have to maintain both x86 and arm64, and
> so on. It's not a good thing that the difference is too big.

There will be some difference as the 4G limit doesn't always hold for
arm64 (though it's true in most cases). Anyway, we can probably simplify
things a bit while following the documented behaviour:

	crashkernel=Y		- current behaviour within ZONE_DMA
	crashkernel=Y,high	- allocate from above ZONE_DMA
	crashkernel=Y,low	- allocate within ZONE_DMA

There is no fallback from crashkernel=Y.

The question is whether we still want a default low allocation if
crashkernel=Y,low is missing but 'high' is present. If we add this, I
think we'd be consistent with kernel-parameters.txt for the 'low'
description. A default 'low' is probably not that bad but I'm tempted to
always mandate both 'high' and 'low'.

-- 
Catalin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-27 16:04             ` Catalin Marinas
  (?)
@ 2022-04-28  2:22               ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-28  2:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/28 0:04, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>> I think one could always pass a default command line like:
>>>
>>> 	crashkernel=1G,high crashkernel=128M,low
>>>
>>> without much knowledge of the SoC memory layout.
>>
>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>
>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>                                                crash_base, crash_max);
>>
>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>> 'high' option at all:
>>>
>>> 	crashkernel=1G				- all within ZONE_DMA
>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>> 						  1G above ZONE_DMA
>>>
>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>> the 'low' option.
>>
>> I think although the code is hard to make generic, the interface is better to
>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>> so on. It's not a good thing that the difference is too big.
> 
> There will be some difference as the 4G limit doesn't always hold for
> arm64 (though it's true in most cases). Anyway, we can probably simplify
> things a bit while following the documented behaviour:
> 
> 	crashkernel=Y		- current behaviour within ZONE_DMA
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> There is no fallback from crashkernel=Y.

Yes, I followed your guidelines yesterday to modify the code. Now the code flow
is much clearer.

> 
> The question is whether we still want a default low allocation if
> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> think we'd be consistent with kernel-parameters.txt for the 'low'
> description. A default 'low' is probably not that bad but I'm tempted to
> always mandate both 'high' and 'low'.

Yes, I agree with you. Because the situation is complicated, the default value
is hard to be accurate. It's better to let the user configure it according to
the actual situation, they're also programmers.

Whether mandate both 'high' and 'low', or allow only 'high' like x86(but the default
value becomes zero). I prefer the latter. The size of 'low' maybe zero, for example,
SMMU is enabled on the second kernel. If only high memory is required, only that
high memory needs to be configured, seems more reasonable.

> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  2:22               ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-28  2:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Baoquan He,
	Vivek Goyal, Eric Biederman, kexec, Will Deacon,
	linux-arm-kernel, Rob Herring, Frank Rowand, devicetree,
	Jonathan Corbet, linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang,
	Chen Zhou, John Donnelly, Dave Kleikamp



On 2022/4/28 0:04, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>> I think one could always pass a default command line like:
>>>
>>> 	crashkernel=1G,high crashkernel=128M,low
>>>
>>> without much knowledge of the SoC memory layout.
>>
>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>
>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>                                                crash_base, crash_max);
>>
>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>> 'high' option at all:
>>>
>>> 	crashkernel=1G				- all within ZONE_DMA
>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>> 						  1G above ZONE_DMA
>>>
>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>> the 'low' option.
>>
>> I think although the code is hard to make generic, the interface is better to
>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>> so on. It's not a good thing that the difference is too big.
> 
> There will be some difference as the 4G limit doesn't always hold for
> arm64 (though it's true in most cases). Anyway, we can probably simplify
> things a bit while following the documented behaviour:
> 
> 	crashkernel=Y		- current behaviour within ZONE_DMA
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> There is no fallback from crashkernel=Y.

Yes, I followed your guidelines yesterday to modify the code. Now the code flow
is much clearer.

> 
> The question is whether we still want a default low allocation if
> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> think we'd be consistent with kernel-parameters.txt for the 'low'
> description. A default 'low' is probably not that bad but I'm tempted to
> always mandate both 'high' and 'low'.

Yes, I agree with you. Because the situation is complicated, the default value
is hard to be accurate. It's better to let the user configure it according to
the actual situation, they're also programmers.

Whether mandate both 'high' and 'low', or allow only 'high' like x86(but the default
value becomes zero). I prefer the latter. The size of 'low' maybe zero, for example,
SMMU is enabled on the second kernel. If only high memory is required, only that
high memory needs to be configured, seems more reasonable.

> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  2:22               ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-28  2:22 UTC (permalink / raw)
  To: kexec



On 2022/4/28 0:04, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>> I think one could always pass a default command line like:
>>>
>>> 	crashkernel=1G,high crashkernel=128M,low
>>>
>>> without much knowledge of the SoC memory layout.
>>
>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>
>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>                                                crash_base, crash_max);
>>
>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>> 'high' option at all:
>>>
>>> 	crashkernel=1G				- all within ZONE_DMA
>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>> 						  1G above ZONE_DMA
>>>
>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>> the 'low' option.
>>
>> I think although the code is hard to make generic, the interface is better to
>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>> so on. It's not a good thing that the difference is too big.
> 
> There will be some difference as the 4G limit doesn't always hold for
> arm64 (though it's true in most cases). Anyway, we can probably simplify
> things a bit while following the documented behaviour:
> 
> 	crashkernel=Y		- current behaviour within ZONE_DMA
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> There is no fallback from crashkernel=Y.

Yes, I followed your guidelines yesterday to modify the code. Now the code flow
is much clearer.

> 
> The question is whether we still want a default low allocation if
> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> think we'd be consistent with kernel-parameters.txt for the 'low'
> description. A default 'low' is probably not that bad but I'm tempted to
> always mandate both 'high' and 'low'.

Yes, I agree with you. Because the situation is complicated, the default value
is hard to be accurate. It's better to let the user configure it according to
the actual situation, they're also programmers.

Whether mandate both 'high' and 'low', or allow only 'high' like x86(but the default
value becomes zero). I prefer the latter. The size of 'low' maybe zero, for example,
SMMU is enabled on the second kernel. If only high memory is required, only that
high memory needs to be configured, seems more reasonable.

> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-27 16:04             ` Catalin Marinas
  (?)
@ 2022-04-28  3:40               ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-28  3:40 UTC (permalink / raw)
  To: Catalin Marinas, Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Hi Catalin, Zhen Lei,

On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> > On 2022/4/27 20:32, Catalin Marinas wrote:
> > > I think one could always pass a default command line like:
> > > 
> > > 	crashkernel=1G,high crashkernel=128M,low
> > > 
> > > without much knowledge of the SoC memory layout.
> > 
> > Yes, that's what the end result is. The user specify crashkernel=128M,low
> > and the implementation ensure the 128M low memory is allocated from DMA zone.
> > We use arm64_dma_phys_limit as the upper limit for crash low memory.
> > 
> > +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> > +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >                                                crash_base, crash_max);
> > 
> > > Another option is to only introduce crashkernel=Y,low and, when that is
> > > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > > 'high' option at all:
> > > 
> > > 	crashkernel=1G				- all within ZONE_DMA
> > > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > > 						  1G above ZONE_DMA
> > > 
> > > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > > the 'low' option.
> > 
> > I think although the code is hard to make generic, the interface is better to
> > be relatively uniform. A user might have to maintain both x86 and arm64, and
> > so on. It's not a good thing that the difference is too big.
> 
> There will be some difference as the 4G limit doesn't always hold for
> arm64 (though it's true in most cases). Anyway, we can probably simplify
> things a bit while following the documented behaviour:
> 
> 	crashkernel=Y		- current behaviour within ZONE_DMA
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> There is no fallback from crashkernel=Y.
> 
> The question is whether we still want a default low allocation if
> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> think we'd be consistent with kernel-parameters.txt for the 'low'
> description. A default 'low' is probably not that bad but I'm tempted to
> always mandate both 'high' and 'low'.

Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
about this version. And I have the same concerns about them which comes
from below points:
1) we may need to take best effort to keep ,high, ,low behaviour
consistent on all ARCHes. Otherwise user/admin may be confused when they
deploy/configure kdump on different machines of different ARCHes in the
same LAB. I think we should try to avoid the confusion.
2) Fallback behaviour is important to our distros. The reason is we will
provide default value with crashkernel=xxxM along kernel of distros. In
this case, we hope the reservation will succeed by all means. The ,high
and ,low is an option if customer likes to take with expertise.

After going through arm64 memory init code, I got below summary about
arm64_dma_phys_limit which is the first zone's upper limit. I think we
can make use of it to facilitate to simplify code.
================================================================================
                        DMA                      DMA32                    NORMAL
1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
2)Normal machine        0~4G                     0                        (above 4G)
3)Special machine       (above 4G)~MAX
4)No DMA|DMA32                                                            (above 4G)~MAX

-------------------------------------------
                      arm64_dma_phys_limit
1)Raspberry Pi4         1G                     
2)Normal machine        4G                     
3)Special machine       MAX
4)No DMA|DMA32          MAX

Note: 3)Special machine means the machine's starting physical address is above 4G.
WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
IOMMU hardware supporting.
===================================================================================

I made a draft patch based on this patchset, please feel free to check and
see if it's OK, or anything missing or wrongly understood. I removed
reserve_crashkernel_high() and only keep reserve_crashkernel() and
reserve_crashkernel_low() as the v21 did.

Thanks
Baoquan


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  3:40               ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-28  3:40 UTC (permalink / raw)
  To: Catalin Marinas, Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

Hi Catalin, Zhen Lei,

On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> > On 2022/4/27 20:32, Catalin Marinas wrote:
> > > I think one could always pass a default command line like:
> > > 
> > > 	crashkernel=1G,high crashkernel=128M,low
> > > 
> > > without much knowledge of the SoC memory layout.
> > 
> > Yes, that's what the end result is. The user specify crashkernel=128M,low
> > and the implementation ensure the 128M low memory is allocated from DMA zone.
> > We use arm64_dma_phys_limit as the upper limit for crash low memory.
> > 
> > +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> > +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >                                                crash_base, crash_max);
> > 
> > > Another option is to only introduce crashkernel=Y,low and, when that is
> > > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > > 'high' option at all:
> > > 
> > > 	crashkernel=1G				- all within ZONE_DMA
> > > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > > 						  1G above ZONE_DMA
> > > 
> > > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > > the 'low' option.
> > 
> > I think although the code is hard to make generic, the interface is better to
> > be relatively uniform. A user might have to maintain both x86 and arm64, and
> > so on. It's not a good thing that the difference is too big.
> 
> There will be some difference as the 4G limit doesn't always hold for
> arm64 (though it's true in most cases). Anyway, we can probably simplify
> things a bit while following the documented behaviour:
> 
> 	crashkernel=Y		- current behaviour within ZONE_DMA
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> There is no fallback from crashkernel=Y.
> 
> The question is whether we still want a default low allocation if
> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> think we'd be consistent with kernel-parameters.txt for the 'low'
> description. A default 'low' is probably not that bad but I'm tempted to
> always mandate both 'high' and 'low'.

Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
about this version. And I have the same concerns about them which comes
from below points:
1) we may need to take best effort to keep ,high, ,low behaviour
consistent on all ARCHes. Otherwise user/admin may be confused when they
deploy/configure kdump on different machines of different ARCHes in the
same LAB. I think we should try to avoid the confusion.
2) Fallback behaviour is important to our distros. The reason is we will
provide default value with crashkernel=xxxM along kernel of distros. In
this case, we hope the reservation will succeed by all means. The ,high
and ,low is an option if customer likes to take with expertise.

After going through arm64 memory init code, I got below summary about
arm64_dma_phys_limit which is the first zone's upper limit. I think we
can make use of it to facilitate to simplify code.
================================================================================
                        DMA                      DMA32                    NORMAL
1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
2)Normal machine        0~4G                     0                        (above 4G)
3)Special machine       (above 4G)~MAX
4)No DMA|DMA32                                                            (above 4G)~MAX

-------------------------------------------
                      arm64_dma_phys_limit
1)Raspberry Pi4         1G                     
2)Normal machine        4G                     
3)Special machine       MAX
4)No DMA|DMA32          MAX

Note: 3)Special machine means the machine's starting physical address is above 4G.
WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
IOMMU hardware supporting.
===================================================================================

I made a draft patch based on this patchset, please feel free to check and
see if it's OK, or anything missing or wrongly understood. I removed
reserve_crashkernel_high() and only keep reserve_crashkernel() and
reserve_crashkernel_low() as the v21 did.

Thanks
Baoquan


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  3:40               ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-28  3:40 UTC (permalink / raw)
  To: kexec

Hi Catalin, Zhen Lei,

On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> > On 2022/4/27 20:32, Catalin Marinas wrote:
> > > I think one could always pass a default command line like:
> > > 
> > > 	crashkernel=1G,high crashkernel=128M,low
> > > 
> > > without much knowledge of the SoC memory layout.
> > 
> > Yes, that's what the end result is. The user specify crashkernel=128M,low
> > and the implementation ensure the 128M low memory is allocated from DMA zone.
> > We use arm64_dma_phys_limit as the upper limit for crash low memory.
> > 
> > +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> > +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >                                                crash_base, crash_max);
> > 
> > > Another option is to only introduce crashkernel=Y,low and, when that is
> > > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > > 'high' option at all:
> > > 
> > > 	crashkernel=1G				- all within ZONE_DMA
> > > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > > 						  1G above ZONE_DMA
> > > 
> > > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > > the 'low' option.
> > 
> > I think although the code is hard to make generic, the interface is better to
> > be relatively uniform. A user might have to maintain both x86 and arm64, and
> > so on. It's not a good thing that the difference is too big.
> 
> There will be some difference as the 4G limit doesn't always hold for
> arm64 (though it's true in most cases). Anyway, we can probably simplify
> things a bit while following the documented behaviour:
> 
> 	crashkernel=Y		- current behaviour within ZONE_DMA
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> There is no fallback from crashkernel=Y.
> 
> The question is whether we still want a default low allocation if
> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> think we'd be consistent with kernel-parameters.txt for the 'low'
> description. A default 'low' is probably not that bad but I'm tempted to
> always mandate both 'high' and 'low'.

Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
about this version. And I have the same concerns about them which comes
from below points:
1) we may need to take best effort to keep ,high, ,low behaviour
consistent on all ARCHes. Otherwise user/admin may be confused when they
deploy/configure kdump on different machines of different ARCHes in the
same LAB. I think we should try to avoid the confusion.
2) Fallback behaviour is important to our distros. The reason is we will
provide default value with crashkernel=xxxM along kernel of distros. In
this case, we hope the reservation will succeed by all means. The ,high
and ,low is an option if customer likes to take with expertise.

After going through arm64 memory init code, I got below summary about
arm64_dma_phys_limit which is the first zone's upper limit. I think we
can make use of it to facilitate to simplify code.
================================================================================
                        DMA                      DMA32                    NORMAL
1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
2)Normal machine        0~4G                     0                        (above 4G)
3)Special machine       (above 4G)~MAX
4)No DMA|DMA32                                                            (above 4G)~MAX

-------------------------------------------
                      arm64_dma_phys_limit
1)Raspberry Pi4         1G                     
2)Normal machine        4G                     
3)Special machine       MAX
4)No DMA|DMA32          MAX

Note: 3)Special machine means the machine's starting physical address is above 4G.
WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
IOMMU hardware supporting.
===================================================================================

I made a draft patch based on this patchset, please feel free to check and
see if it's OK, or anything missing or wrongly understood. I removed
reserve_crashkernel_high() and only keep reserve_crashkernel() and
reserve_crashkernel_low() as the v21 did.

Thanks
Baoquan



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-28  3:40               ` Baoquan He
  (?)
@ 2022-04-28  3:52                 ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-28  3:52 UTC (permalink / raw)
  To: Catalin Marinas, Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

[-- Attachment #1: Type: text/plain, Size: 4860 bytes --]

On 04/28/22 at 11:40am, Baoquan He wrote:
> Hi Catalin, Zhen Lei,
> 
> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> > On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> > > On 2022/4/27 20:32, Catalin Marinas wrote:
> > > > I think one could always pass a default command line like:
> > > > 
> > > > 	crashkernel=1G,high crashkernel=128M,low
> > > > 
> > > > without much knowledge of the SoC memory layout.
> > > 
> > > Yes, that's what the end result is. The user specify crashkernel=128M,low
> > > and the implementation ensure the 128M low memory is allocated from DMA zone.
> > > We use arm64_dma_phys_limit as the upper limit for crash low memory.
> > > 
> > > +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> > > +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > > +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> > >                                                crash_base, crash_max);
> > > 
> > > > Another option is to only introduce crashkernel=Y,low and, when that is
> > > > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > > > 'high' option at all:
> > > > 
> > > > 	crashkernel=1G				- all within ZONE_DMA
> > > > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > > > 						  1G above ZONE_DMA
> > > > 
> > > > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > > > the 'low' option.
> > > 
> > > I think although the code is hard to make generic, the interface is better to
> > > be relatively uniform. A user might have to maintain both x86 and arm64, and
> > > so on. It's not a good thing that the difference is too big.
> > 
> > There will be some difference as the 4G limit doesn't always hold for
> > arm64 (though it's true in most cases). Anyway, we can probably simplify
> > things a bit while following the documented behaviour:
> > 
> > 	crashkernel=Y		- current behaviour within ZONE_DMA
> > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > 	crashkernel=Y,low	- allocate within ZONE_DMA
> > 
> > There is no fallback from crashkernel=Y.
> > 
> > The question is whether we still want a default low allocation if
> > crashkernel=Y,low is missing but 'high' is present. If we add this, I
> > think we'd be consistent with kernel-parameters.txt for the 'low'
> > description. A default 'low' is probably not that bad but I'm tempted to
> > always mandate both 'high' and 'low'.
> 
> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> about this version. And I have the same concerns about them which comes
> from below points:
> 1) we may need to take best effort to keep ,high, ,low behaviour
> consistent on all ARCHes. Otherwise user/admin may be confused when they
> deploy/configure kdump on different machines of different ARCHes in the
> same LAB. I think we should try to avoid the confusion.
> 2) Fallback behaviour is important to our distros. The reason is we will
> provide default value with crashkernel=xxxM along kernel of distros. In
> this case, we hope the reservation will succeed by all means. The ,high
> and ,low is an option if customer likes to take with expertise.
> 
> After going through arm64 memory init code, I got below summary about
> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> can make use of it to facilitate to simplify code.
> ================================================================================
>                         DMA                      DMA32                    NORMAL
> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> 2)Normal machine        0~4G                     0                        (above 4G)
> 3)Special machine       (above 4G)~MAX
> 4)No DMA|DMA32                                                            (above 4G)~MAX
> 
> -------------------------------------------
>                       arm64_dma_phys_limit
> 1)Raspberry Pi4         1G                     
> 2)Normal machine        4G                     
> 3)Special machine       MAX
> 4)No DMA|DMA32          MAX
> 
> Note: 3)Special machine means the machine's starting physical address is above 4G.
> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
> IOMMU hardware supporting.
> ===================================================================================
> 
> I made a draft patch based on this patchset, please feel free to check and
> see if it's OK, or anything missing or wrongly understood. I removed
> reserve_crashkernel_high() and only keep reserve_crashkernel() and
> reserve_crashkernel_low() as the v21 did.

Sorry, forgot attaching the draft patch.

By the way, we can also have a simple version with basic ,high, ,low
support, no fallback. We can add fallback and other optimization later.
This can be plan B.



[-- Attachment #2: diff.patch --]
[-- Type: text/plain, Size: 4676 bytes --]

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 4a8200f29b35..aa1fbea47c46 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -84,11 +84,7 @@ EXPORT_SYMBOL(memstart_addr);
  * Note: Page-granularity mappings are necessary for crash kernel memory
  * range for shrinking its size via /sys/kernel/kexec_crash_size interface.
  */
-#if IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)
 phys_addr_t __ro_after_init arm64_dma_phys_limit;
-#else
-phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
-#endif
 
 bool crash_low_mem_page_map __initdata;
 static bool crash_high_mem_reserved __initdata;
@@ -132,63 +128,6 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 	return 0;
 }
 
-static void __init reserve_crashkernel_high(void)
-{
-	unsigned long long crash_base, crash_size;
-	char *cmdline = boot_command_line;
-	int ret;
-
-	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
-		return;
-
-	/* crashkernel=X[@offset] */
-	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
-				&crash_size, &crash_base);
-	if (ret || !crash_size) {
-		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
-		if (ret || !crash_size)
-			return;
-	} else if (!crash_base) {
-		crash_low_mem_page_map = true;
-	}
-
-	crash_size = PAGE_ALIGN(crash_size);
-
-	/*
-	 * For the case crashkernel=X, may fall back to reserve memory above
-	 * 4G, make reservations here in advance. It will be released later if
-	 * the region is successfully reserved under 4G.
-	 */
-	if (!crash_base) {
-		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
-						       crash_base, CRASH_ADDR_HIGH_MAX);
-		if (!crash_base)
-			return;
-
-		crash_high_mem_reserved = true;
-	}
-
-	/* Mark the memory range that requires page-level mappings */
-	crashk_res.start = crash_base;
-	crashk_res.end   = crash_base + crash_size - 1;
-}
-
-static void __init hand_over_reserved_high_mem(void)
-{
-	crashk_res_high.start = crashk_res.start;
-	crashk_res_high.end   = crashk_res.end;
-
-	crashk_res.start = 0;
-	crashk_res.end   = 0;
-}
-
-static void __init take_reserved_high_mem(unsigned long long *crash_base,
-					  unsigned long long *crash_size)
-{
-	*crash_base = crashk_res_high.start;
-	*crash_size = resource_size(&crashk_res_high);
-}
-
 static void __init free_reserved_high_mem(void)
 {
 	memblock_phys_free(crashk_res_high.start, resource_size(&crashk_res_high));
@@ -225,7 +164,8 @@ static void __init reserve_crashkernel(void)
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
-	hand_over_reserved_high_mem();
+	if (crashk_res.end)
+		return;
 
 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
@@ -245,11 +185,6 @@ static void __init reserve_crashkernel(void)
 
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
-
-		if (crash_high_mem_reserved) {
-			take_reserved_high_mem(&crash_base, &crash_size);
-			goto reserve_low;
-		}
 	}
 
 	fixed_base = !!crash_base;
@@ -267,12 +202,8 @@ static void __init reserve_crashkernel(void)
 		 * to high memory, the minimum required low memory will be
 		 * reserved later.
 		 */
-		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
-			if (crash_high_mem_reserved) {
-				take_reserved_high_mem(&crash_base, &crash_size);
-				goto reserve_low;
-			}
-
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX
+		    && crash_max <CRASH_ADDR_HIGH_MAX)) {
 			crash_max = CRASH_ADDR_HIGH_MAX;
 			goto retry;
 		}
@@ -289,7 +220,7 @@ static void __init reserve_crashkernel(void)
 	 * description of "crashkernel=X,high" option, add below 'high'
 	 * condition to make sure the crash low memory will be reserved.
 	 */
-	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+	if (crash_base >= CRASH_ADDR_LOW_MAX ) {
 reserve_low:
 		/* case #3 of crashkernel,low reservation */
 		if (!high)
@@ -299,14 +230,7 @@ static void __init reserve_crashkernel(void)
 			memblock_phys_free(crash_base, crash_size);
 			return;
 		}
-	} else if (crash_high_mem_reserved) {
-		/*
-		 * The crash memory is successfully allocated under 4G, and the
-		 * previously reserved high memory is no longer required.
-		 */
-		free_reserved_high_mem();
 	}
-
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -520,10 +444,10 @@ void __init arm64_memblock_init(void)
 
 	early_init_fdt_scan_reserved_mem();
 
+	arm64_dma_phys_limit = memblock_end_of_DRAM();
+
 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
 		reserve_crashkernel();
-	else
-		reserve_crashkernel_high();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  3:52                 ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-28  3:52 UTC (permalink / raw)
  To: Catalin Marinas, Leizhen (ThunderTown)
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

[-- Attachment #1: Type: text/plain, Size: 4860 bytes --]

On 04/28/22 at 11:40am, Baoquan He wrote:
> Hi Catalin, Zhen Lei,
> 
> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> > On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> > > On 2022/4/27 20:32, Catalin Marinas wrote:
> > > > I think one could always pass a default command line like:
> > > > 
> > > > 	crashkernel=1G,high crashkernel=128M,low
> > > > 
> > > > without much knowledge of the SoC memory layout.
> > > 
> > > Yes, that's what the end result is. The user specify crashkernel=128M,low
> > > and the implementation ensure the 128M low memory is allocated from DMA zone.
> > > We use arm64_dma_phys_limit as the upper limit for crash low memory.
> > > 
> > > +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> > > +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > > +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> > >                                                crash_base, crash_max);
> > > 
> > > > Another option is to only introduce crashkernel=Y,low and, when that is
> > > > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > > > 'high' option at all:
> > > > 
> > > > 	crashkernel=1G				- all within ZONE_DMA
> > > > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > > > 						  1G above ZONE_DMA
> > > > 
> > > > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > > > the 'low' option.
> > > 
> > > I think although the code is hard to make generic, the interface is better to
> > > be relatively uniform. A user might have to maintain both x86 and arm64, and
> > > so on. It's not a good thing that the difference is too big.
> > 
> > There will be some difference as the 4G limit doesn't always hold for
> > arm64 (though it's true in most cases). Anyway, we can probably simplify
> > things a bit while following the documented behaviour:
> > 
> > 	crashkernel=Y		- current behaviour within ZONE_DMA
> > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > 	crashkernel=Y,low	- allocate within ZONE_DMA
> > 
> > There is no fallback from crashkernel=Y.
> > 
> > The question is whether we still want a default low allocation if
> > crashkernel=Y,low is missing but 'high' is present. If we add this, I
> > think we'd be consistent with kernel-parameters.txt for the 'low'
> > description. A default 'low' is probably not that bad but I'm tempted to
> > always mandate both 'high' and 'low'.
> 
> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> about this version. And I have the same concerns about them which comes
> from below points:
> 1) we may need to take best effort to keep ,high, ,low behaviour
> consistent on all ARCHes. Otherwise user/admin may be confused when they
> deploy/configure kdump on different machines of different ARCHes in the
> same LAB. I think we should try to avoid the confusion.
> 2) Fallback behaviour is important to our distros. The reason is we will
> provide default value with crashkernel=xxxM along kernel of distros. In
> this case, we hope the reservation will succeed by all means. The ,high
> and ,low is an option if customer likes to take with expertise.
> 
> After going through arm64 memory init code, I got below summary about
> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> can make use of it to facilitate to simplify code.
> ================================================================================
>                         DMA                      DMA32                    NORMAL
> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> 2)Normal machine        0~4G                     0                        (above 4G)
> 3)Special machine       (above 4G)~MAX
> 4)No DMA|DMA32                                                            (above 4G)~MAX
> 
> -------------------------------------------
>                       arm64_dma_phys_limit
> 1)Raspberry Pi4         1G                     
> 2)Normal machine        4G                     
> 3)Special machine       MAX
> 4)No DMA|DMA32          MAX
> 
> Note: 3)Special machine means the machine's starting physical address is above 4G.
> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
> IOMMU hardware supporting.
> ===================================================================================
> 
> I made a draft patch based on this patchset, please feel free to check and
> see if it's OK, or anything missing or wrongly understood. I removed
> reserve_crashkernel_high() and only keep reserve_crashkernel() and
> reserve_crashkernel_low() as the v21 did.

Sorry, forgot attaching the draft patch.

By the way, we can also have a simple version with basic ,high, ,low
support, no fallback. We can add fallback and other optimization later.
This can be plan B.



[-- Attachment #2: diff.patch --]
[-- Type: text/plain, Size: 4676 bytes --]

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 4a8200f29b35..aa1fbea47c46 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -84,11 +84,7 @@ EXPORT_SYMBOL(memstart_addr);
  * Note: Page-granularity mappings are necessary for crash kernel memory
  * range for shrinking its size via /sys/kernel/kexec_crash_size interface.
  */
-#if IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)
 phys_addr_t __ro_after_init arm64_dma_phys_limit;
-#else
-phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
-#endif
 
 bool crash_low_mem_page_map __initdata;
 static bool crash_high_mem_reserved __initdata;
@@ -132,63 +128,6 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 	return 0;
 }
 
-static void __init reserve_crashkernel_high(void)
-{
-	unsigned long long crash_base, crash_size;
-	char *cmdline = boot_command_line;
-	int ret;
-
-	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
-		return;
-
-	/* crashkernel=X[@offset] */
-	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
-				&crash_size, &crash_base);
-	if (ret || !crash_size) {
-		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
-		if (ret || !crash_size)
-			return;
-	} else if (!crash_base) {
-		crash_low_mem_page_map = true;
-	}
-
-	crash_size = PAGE_ALIGN(crash_size);
-
-	/*
-	 * For the case crashkernel=X, may fall back to reserve memory above
-	 * 4G, make reservations here in advance. It will be released later if
-	 * the region is successfully reserved under 4G.
-	 */
-	if (!crash_base) {
-		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
-						       crash_base, CRASH_ADDR_HIGH_MAX);
-		if (!crash_base)
-			return;
-
-		crash_high_mem_reserved = true;
-	}
-
-	/* Mark the memory range that requires page-level mappings */
-	crashk_res.start = crash_base;
-	crashk_res.end   = crash_base + crash_size - 1;
-}
-
-static void __init hand_over_reserved_high_mem(void)
-{
-	crashk_res_high.start = crashk_res.start;
-	crashk_res_high.end   = crashk_res.end;
-
-	crashk_res.start = 0;
-	crashk_res.end   = 0;
-}
-
-static void __init take_reserved_high_mem(unsigned long long *crash_base,
-					  unsigned long long *crash_size)
-{
-	*crash_base = crashk_res_high.start;
-	*crash_size = resource_size(&crashk_res_high);
-}
-
 static void __init free_reserved_high_mem(void)
 {
 	memblock_phys_free(crashk_res_high.start, resource_size(&crashk_res_high));
@@ -225,7 +164,8 @@ static void __init reserve_crashkernel(void)
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
-	hand_over_reserved_high_mem();
+	if (crashk_res.end)
+		return;
 
 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
@@ -245,11 +185,6 @@ static void __init reserve_crashkernel(void)
 
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
-
-		if (crash_high_mem_reserved) {
-			take_reserved_high_mem(&crash_base, &crash_size);
-			goto reserve_low;
-		}
 	}
 
 	fixed_base = !!crash_base;
@@ -267,12 +202,8 @@ static void __init reserve_crashkernel(void)
 		 * to high memory, the minimum required low memory will be
 		 * reserved later.
 		 */
-		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
-			if (crash_high_mem_reserved) {
-				take_reserved_high_mem(&crash_base, &crash_size);
-				goto reserve_low;
-			}
-
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX
+		    && crash_max <CRASH_ADDR_HIGH_MAX)) {
 			crash_max = CRASH_ADDR_HIGH_MAX;
 			goto retry;
 		}
@@ -289,7 +220,7 @@ static void __init reserve_crashkernel(void)
 	 * description of "crashkernel=X,high" option, add below 'high'
 	 * condition to make sure the crash low memory will be reserved.
 	 */
-	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+	if (crash_base >= CRASH_ADDR_LOW_MAX ) {
 reserve_low:
 		/* case #3 of crashkernel,low reservation */
 		if (!high)
@@ -299,14 +230,7 @@ static void __init reserve_crashkernel(void)
 			memblock_phys_free(crash_base, crash_size);
 			return;
 		}
-	} else if (crash_high_mem_reserved) {
-		/*
-		 * The crash memory is successfully allocated under 4G, and the
-		 * previously reserved high memory is no longer required.
-		 */
-		free_reserved_high_mem();
 	}
-
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -520,10 +444,10 @@ void __init arm64_memblock_init(void)
 
 	early_init_fdt_scan_reserved_mem();
 
+	arm64_dma_phys_limit = memblock_end_of_DRAM();
+
 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
 		reserve_crashkernel();
-	else
-		reserve_crashkernel_high();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }

[-- Attachment #3: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  3:52                 ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-28  3:52 UTC (permalink / raw)
  To: kexec

On 04/28/22 at 11:40am, Baoquan He wrote:
> Hi Catalin, Zhen Lei,
> 
> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> > On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> > > On 2022/4/27 20:32, Catalin Marinas wrote:
> > > > I think one could always pass a default command line like:
> > > > 
> > > > 	crashkernel=1G,high crashkernel=128M,low
> > > > 
> > > > without much knowledge of the SoC memory layout.
> > > 
> > > Yes, that's what the end result is. The user specify crashkernel=128M,low
> > > and the implementation ensure the 128M low memory is allocated from DMA zone.
> > > We use arm64_dma_phys_limit as the upper limit for crash low memory.
> > > 
> > > +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> > > +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > > +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> > >                                                crash_base, crash_max);
> > > 
> > > > Another option is to only introduce crashkernel=Y,low and, when that is
> > > > passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> > > > 'high' option at all:
> > > > 
> > > > 	crashkernel=1G				- all within ZONE_DMA
> > > > 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> > > > 						  1G above ZONE_DMA
> > > > 
> > > > If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> > > > the 'low' option.
> > > 
> > > I think although the code is hard to make generic, the interface is better to
> > > be relatively uniform. A user might have to maintain both x86 and arm64, and
> > > so on. It's not a good thing that the difference is too big.
> > 
> > There will be some difference as the 4G limit doesn't always hold for
> > arm64 (though it's true in most cases). Anyway, we can probably simplify
> > things a bit while following the documented behaviour:
> > 
> > 	crashkernel=Y		- current behaviour within ZONE_DMA
> > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > 	crashkernel=Y,low	- allocate within ZONE_DMA
> > 
> > There is no fallback from crashkernel=Y.
> > 
> > The question is whether we still want a default low allocation if
> > crashkernel=Y,low is missing but 'high' is present. If we add this, I
> > think we'd be consistent with kernel-parameters.txt for the 'low'
> > description. A default 'low' is probably not that bad but I'm tempted to
> > always mandate both 'high' and 'low'.
> 
> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> about this version. And I have the same concerns about them which comes
> from below points:
> 1) we may need to take best effort to keep ,high, ,low behaviour
> consistent on all ARCHes. Otherwise user/admin may be confused when they
> deploy/configure kdump on different machines of different ARCHes in the
> same LAB. I think we should try to avoid the confusion.
> 2) Fallback behaviour is important to our distros. The reason is we will
> provide default value with crashkernel=xxxM along kernel of distros. In
> this case, we hope the reservation will succeed by all means. The ,high
> and ,low is an option if customer likes to take with expertise.
> 
> After going through arm64 memory init code, I got below summary about
> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> can make use of it to facilitate to simplify code.
> ================================================================================
>                         DMA                      DMA32                    NORMAL
> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> 2)Normal machine        0~4G                     0                        (above 4G)
> 3)Special machine       (above 4G)~MAX
> 4)No DMA|DMA32                                                            (above 4G)~MAX
> 
> -------------------------------------------
>                       arm64_dma_phys_limit
> 1)Raspberry Pi4         1G                     
> 2)Normal machine        4G                     
> 3)Special machine       MAX
> 4)No DMA|DMA32          MAX
> 
> Note: 3)Special machine means the machine's starting physical address is above 4G.
> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
> IOMMU hardware supporting.
> ===================================================================================
> 
> I made a draft patch based on this patchset, please feel free to check and
> see if it's OK, or anything missing or wrongly understood. I removed
> reserve_crashkernel_high() and only keep reserve_crashkernel() and
> reserve_crashkernel_low() as the v21 did.

Sorry, forgot attaching the draft patch.

By the way, we can also have a simple version with basic ,high, ,low
support, no fallback. We can add fallback and other optimization later.
This can be plan B.


-------------- next part --------------
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 4a8200f29b35..aa1fbea47c46 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -84,11 +84,7 @@ EXPORT_SYMBOL(memstart_addr);
  * Note: Page-granularity mappings are necessary for crash kernel memory
  * range for shrinking its size via /sys/kernel/kexec_crash_size interface.
  */
-#if IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32)
 phys_addr_t __ro_after_init arm64_dma_phys_limit;
-#else
-phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
-#endif
 
 bool crash_low_mem_page_map __initdata;
 static bool crash_high_mem_reserved __initdata;
@@ -132,63 +128,6 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 	return 0;
 }
 
-static void __init reserve_crashkernel_high(void)
-{
-	unsigned long long crash_base, crash_size;
-	char *cmdline = boot_command_line;
-	int ret;
-
-	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
-		return;
-
-	/* crashkernel=X[@offset] */
-	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
-				&crash_size, &crash_base);
-	if (ret || !crash_size) {
-		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
-		if (ret || !crash_size)
-			return;
-	} else if (!crash_base) {
-		crash_low_mem_page_map = true;
-	}
-
-	crash_size = PAGE_ALIGN(crash_size);
-
-	/*
-	 * For the case crashkernel=X, may fall back to reserve memory above
-	 * 4G, make reservations here in advance. It will be released later if
-	 * the region is successfully reserved under 4G.
-	 */
-	if (!crash_base) {
-		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
-						       crash_base, CRASH_ADDR_HIGH_MAX);
-		if (!crash_base)
-			return;
-
-		crash_high_mem_reserved = true;
-	}
-
-	/* Mark the memory range that requires page-level mappings */
-	crashk_res.start = crash_base;
-	crashk_res.end   = crash_base + crash_size - 1;
-}
-
-static void __init hand_over_reserved_high_mem(void)
-{
-	crashk_res_high.start = crashk_res.start;
-	crashk_res_high.end   = crashk_res.end;
-
-	crashk_res.start = 0;
-	crashk_res.end   = 0;
-}
-
-static void __init take_reserved_high_mem(unsigned long long *crash_base,
-					  unsigned long long *crash_size)
-{
-	*crash_base = crashk_res_high.start;
-	*crash_size = resource_size(&crashk_res_high);
-}
-
 static void __init free_reserved_high_mem(void)
 {
 	memblock_phys_free(crashk_res_high.start, resource_size(&crashk_res_high));
@@ -225,7 +164,8 @@ static void __init reserve_crashkernel(void)
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
 
-	hand_over_reserved_high_mem();
+	if (crashk_res.end)
+		return;
 
 	/* crashkernel=X[@offset] */
 	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
@@ -245,11 +185,6 @@ static void __init reserve_crashkernel(void)
 
 		high = true;
 		crash_max = CRASH_ADDR_HIGH_MAX;
-
-		if (crash_high_mem_reserved) {
-			take_reserved_high_mem(&crash_base, &crash_size);
-			goto reserve_low;
-		}
 	}
 
 	fixed_base = !!crash_base;
@@ -267,12 +202,8 @@ static void __init reserve_crashkernel(void)
 		 * to high memory, the minimum required low memory will be
 		 * reserved later.
 		 */
-		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
-			if (crash_high_mem_reserved) {
-				take_reserved_high_mem(&crash_base, &crash_size);
-				goto reserve_low;
-			}
-
+		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX
+		    && crash_max <CRASH_ADDR_HIGH_MAX)) {
 			crash_max = CRASH_ADDR_HIGH_MAX;
 			goto retry;
 		}
@@ -289,7 +220,7 @@ static void __init reserve_crashkernel(void)
 	 * description of "crashkernel=X,high" option, add below 'high'
 	 * condition to make sure the crash low memory will be reserved.
 	 */
-	if ((crash_base >= CRASH_ADDR_LOW_MAX) || high) {
+	if (crash_base >= CRASH_ADDR_LOW_MAX ) {
 reserve_low:
 		/* case #3 of crashkernel,low reservation */
 		if (!high)
@@ -299,14 +230,7 @@ static void __init reserve_crashkernel(void)
 			memblock_phys_free(crash_base, crash_size);
 			return;
 		}
-	} else if (crash_high_mem_reserved) {
-		/*
-		 * The crash memory is successfully allocated under 4G, and the
-		 * previously reserved high memory is no longer required.
-		 */
-		free_reserved_high_mem();
 	}
-
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
 
@@ -520,10 +444,10 @@ void __init arm64_memblock_init(void)
 
 	early_init_fdt_scan_reserved_mem();
 
+	arm64_dma_phys_limit = memblock_end_of_DRAM();
+
 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
 		reserve_crashkernel();
-	else
-		reserve_crashkernel_high();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 }

^ permalink raw reply related	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-28  3:52                 ` Baoquan He
  (?)
@ 2022-04-28  9:33                   ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-28  9:33 UTC (permalink / raw)
  To: Baoquan He, Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/4/28 11:52, Baoquan He wrote:
> On 04/28/22 at 11:40am, Baoquan He wrote:
>> Hi Catalin, Zhen Lei,
>>
>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>> I think one could always pass a default command line like:
>>>>>
>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>
>>>>> without much knowledge of the SoC memory layout.
>>>>
>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>
>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>                                                crash_base, crash_max);
>>>>
>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>> 'high' option at all:
>>>>>
>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>> 						  1G above ZONE_DMA
>>>>>
>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>> the 'low' option.
>>>>
>>>> I think although the code is hard to make generic, the interface is better to
>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>> so on. It's not a good thing that the difference is too big.
>>>
>>> There will be some difference as the 4G limit doesn't always hold for
>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>> things a bit while following the documented behaviour:
>>>
>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>
>>> There is no fallback from crashkernel=Y.
>>>
>>> The question is whether we still want a default low allocation if
>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>> description. A default 'low' is probably not that bad but I'm tempted to
>>> always mandate both 'high' and 'low'.
>>
>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>> about this version. And I have the same concerns about them which comes
>> from below points:
>> 1) we may need to take best effort to keep ,high, ,low behaviour
>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>> deploy/configure kdump on different machines of different ARCHes in the
>> same LAB. I think we should try to avoid the confusion.

Yes, but for someone who is configuring crashkernel= for the first time, he
needs to read doc to understand how to configure it. The doc can show the
recommended default value of 'low' size.

After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.


+                * -swiotlb size: user-specified with swiotlb= or default.
-               low_size = swiotlb_size_or_default() + (8UL<<20);
+               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);

That means all ARCHs can explicit configure crashkernel=256M,low, instead of
omitting it. This may be another way to avoid confusion. It's not hard for
programmer-turned-user/admin. However, this requires us to forgo backward
compatibility with the default size of 'low'.


>> 2) Fallback behaviour is important to our distros. The reason is we will
>> provide default value with crashkernel=xxxM along kernel of distros. In
>> this case, we hope the reservation will succeed by all means. The ,high
>> and ,low is an option if customer likes to take with expertise.

OK, I got it.

>>
>> After going through arm64 memory init code, I got below summary about
>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>> can make use of it to facilitate to simplify code.
>> ================================================================================
>>                         DMA                      DMA32                    NORMAL
>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>> 2)Normal machine        0~4G                     0                        (above 4G)
>> 3)Special machine       (above 4G)~MAX
>> 4)No DMA|DMA32                                                            (above 4G)~MAX

arm64_memblock_init()
	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
paging_init()                                       |
	map_mem()                                   |
unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
bootmem_init();                                     |      |
	zone_sizes_init()                           |      |
		of_dma_get_max_cpu_address          |  ----|
		//Update arm64_dma_phys_limit       |  ----|
	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
request_standard_resources()

>>
>> -------------------------------------------
>>                       arm64_dma_phys_limit
>> 1)Raspberry Pi4         1G                     
>> 2)Normal machine        4G                     
>> 3)Special machine       MAX
>> 4)No DMA|DMA32          MAX
>>
>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>> IOMMU hardware supporting.
>> ===================================================================================
>>
>> I made a draft patch based on this patchset, please feel free to check and
>> see if it's OK, or anything missing or wrongly understood. I removed
>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>> reserve_crashkernel_low() as the v21 did.
> 
> Sorry, forgot attaching the draft patch.
> 
> By the way, we can also have a simple version with basic ,high, ,low
> support, no fallback. We can add fallback and other optimization later.
> This can be plan B.

Yes, That's what Catalin suggested also.

Hi, Baoquan He:
  Without optimization, the whole Patch 3-4 and 6-7 can be dropped.

Process after abstraction:
	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
		reserve_crashkernel()
		//block mapping
	} else {
		//page mapping
		reserve_crashkernel()
	}

------------ Simplified real-world process ---------
arm64_memblock_init()
	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
		reserve_crashkernel()
paging_init()
	map_mem()
		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
			//block mapping
		else
			//page mapping
unflatten_device_tree or ACPI
bootmem_init();
	zone_sizes_init()
		of_dma_get_max_cpu_address
		//Update arm64_dma_phys_limit
	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
		reserve_crashkernel()


> 
> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  9:33                   ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-28  9:33 UTC (permalink / raw)
  To: Baoquan He, Catalin Marinas
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/4/28 11:52, Baoquan He wrote:
> On 04/28/22 at 11:40am, Baoquan He wrote:
>> Hi Catalin, Zhen Lei,
>>
>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>> I think one could always pass a default command line like:
>>>>>
>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>
>>>>> without much knowledge of the SoC memory layout.
>>>>
>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>
>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>                                                crash_base, crash_max);
>>>>
>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>> 'high' option at all:
>>>>>
>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>> 						  1G above ZONE_DMA
>>>>>
>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>> the 'low' option.
>>>>
>>>> I think although the code is hard to make generic, the interface is better to
>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>> so on. It's not a good thing that the difference is too big.
>>>
>>> There will be some difference as the 4G limit doesn't always hold for
>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>> things a bit while following the documented behaviour:
>>>
>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>
>>> There is no fallback from crashkernel=Y.
>>>
>>> The question is whether we still want a default low allocation if
>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>> description. A default 'low' is probably not that bad but I'm tempted to
>>> always mandate both 'high' and 'low'.
>>
>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>> about this version. And I have the same concerns about them which comes
>> from below points:
>> 1) we may need to take best effort to keep ,high, ,low behaviour
>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>> deploy/configure kdump on different machines of different ARCHes in the
>> same LAB. I think we should try to avoid the confusion.

Yes, but for someone who is configuring crashkernel= for the first time, he
needs to read doc to understand how to configure it. The doc can show the
recommended default value of 'low' size.

After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.


+                * -swiotlb size: user-specified with swiotlb= or default.
-               low_size = swiotlb_size_or_default() + (8UL<<20);
+               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);

That means all ARCHs can explicit configure crashkernel=256M,low, instead of
omitting it. This may be another way to avoid confusion. It's not hard for
programmer-turned-user/admin. However, this requires us to forgo backward
compatibility with the default size of 'low'.


>> 2) Fallback behaviour is important to our distros. The reason is we will
>> provide default value with crashkernel=xxxM along kernel of distros. In
>> this case, we hope the reservation will succeed by all means. The ,high
>> and ,low is an option if customer likes to take with expertise.

OK, I got it.

>>
>> After going through arm64 memory init code, I got below summary about
>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>> can make use of it to facilitate to simplify code.
>> ================================================================================
>>                         DMA                      DMA32                    NORMAL
>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>> 2)Normal machine        0~4G                     0                        (above 4G)
>> 3)Special machine       (above 4G)~MAX
>> 4)No DMA|DMA32                                                            (above 4G)~MAX

arm64_memblock_init()
	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
paging_init()                                       |
	map_mem()                                   |
unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
bootmem_init();                                     |      |
	zone_sizes_init()                           |      |
		of_dma_get_max_cpu_address          |  ----|
		//Update arm64_dma_phys_limit       |  ----|
	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
request_standard_resources()

>>
>> -------------------------------------------
>>                       arm64_dma_phys_limit
>> 1)Raspberry Pi4         1G                     
>> 2)Normal machine        4G                     
>> 3)Special machine       MAX
>> 4)No DMA|DMA32          MAX
>>
>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>> IOMMU hardware supporting.
>> ===================================================================================
>>
>> I made a draft patch based on this patchset, please feel free to check and
>> see if it's OK, or anything missing or wrongly understood. I removed
>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>> reserve_crashkernel_low() as the v21 did.
> 
> Sorry, forgot attaching the draft patch.
> 
> By the way, we can also have a simple version with basic ,high, ,low
> support, no fallback. We can add fallback and other optimization later.
> This can be plan B.

Yes, That's what Catalin suggested also.

Hi, Baoquan He:
  Without optimization, the whole Patch 3-4 and 6-7 can be dropped.

Process after abstraction:
	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
		reserve_crashkernel()
		//block mapping
	} else {
		//page mapping
		reserve_crashkernel()
	}

------------ Simplified real-world process ---------
arm64_memblock_init()
	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
		reserve_crashkernel()
paging_init()
	map_mem()
		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
			//block mapping
		else
			//page mapping
unflatten_device_tree or ACPI
bootmem_init();
	zone_sizes_init()
		of_dma_get_max_cpu_address
		//Update arm64_dma_phys_limit
	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
		reserve_crashkernel()


> 
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-28  9:33                   ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-28  9:33 UTC (permalink / raw)
  To: kexec



On 2022/4/28 11:52, Baoquan He wrote:
> On 04/28/22 at 11:40am, Baoquan He wrote:
>> Hi Catalin, Zhen Lei,
>>
>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>> I think one could always pass a default command line like:
>>>>>
>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>
>>>>> without much knowledge of the SoC memory layout.
>>>>
>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>
>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>                                                crash_base, crash_max);
>>>>
>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>> 'high' option at all:
>>>>>
>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>> 						  1G above ZONE_DMA
>>>>>
>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>> the 'low' option.
>>>>
>>>> I think although the code is hard to make generic, the interface is better to
>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>> so on. It's not a good thing that the difference is too big.
>>>
>>> There will be some difference as the 4G limit doesn't always hold for
>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>> things a bit while following the documented behaviour:
>>>
>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>
>>> There is no fallback from crashkernel=Y.
>>>
>>> The question is whether we still want a default low allocation if
>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>> description. A default 'low' is probably not that bad but I'm tempted to
>>> always mandate both 'high' and 'low'.
>>
>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>> about this version. And I have the same concerns about them which comes
>> from below points:
>> 1) we may need to take best effort to keep ,high, ,low behaviour
>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>> deploy/configure kdump on different machines of different ARCHes in the
>> same LAB. I think we should try to avoid the confusion.

Yes, but for someone who is configuring crashkernel= for the first time, he
needs to read doc to understand how to configure it. The doc can show the
recommended default value of 'low' size.

After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.


+                * -swiotlb size: user-specified with swiotlb= or default.
-               low_size = swiotlb_size_or_default() + (8UL<<20);
+               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);

That means all ARCHs can explicit configure crashkernel=256M,low, instead of
omitting it. This may be another way to avoid confusion. It's not hard for
programmer-turned-user/admin. However, this requires us to forgo backward
compatibility with the default size of 'low'.


>> 2) Fallback behaviour is important to our distros. The reason is we will
>> provide default value with crashkernel=xxxM along kernel of distros. In
>> this case, we hope the reservation will succeed by all means. The ,high
>> and ,low is an option if customer likes to take with expertise.

OK, I got it.

>>
>> After going through arm64 memory init code, I got below summary about
>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>> can make use of it to facilitate to simplify code.
>> ================================================================================
>>                         DMA                      DMA32                    NORMAL
>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>> 2)Normal machine        0~4G                     0                        (above 4G)
>> 3)Special machine       (above 4G)~MAX
>> 4)No DMA|DMA32                                                            (above 4G)~MAX

arm64_memblock_init()
	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
paging_init()                                       |
	map_mem()                                   |
unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
bootmem_init();                                     |      |
	zone_sizes_init()                           |      |
		of_dma_get_max_cpu_address          |  ----|
		//Update arm64_dma_phys_limit       |  ----|
	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
request_standard_resources()

>>
>> -------------------------------------------
>>                       arm64_dma_phys_limit
>> 1)Raspberry Pi4         1G                     
>> 2)Normal machine        4G                     
>> 3)Special machine       MAX
>> 4)No DMA|DMA32          MAX
>>
>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>> IOMMU hardware supporting.
>> ===================================================================================
>>
>> I made a draft patch based on this patchset, please feel free to check and
>> see if it's OK, or anything missing or wrongly understood. I removed
>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>> reserve_crashkernel_low() as the v21 did.
> 
> Sorry, forgot attaching the draft patch.
> 
> By the way, we can also have a simple version with basic ,high, ,low
> support, no fallback. We can add fallback and other optimization later.
> This can be plan B.

Yes, That's what Catalin suggested also.

Hi, Baoquan He:
  Without optimization, the whole Patch 3-4 and 6-7 can be dropped.

Process after abstraction:
	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
		reserve_crashkernel()
		//block mapping
	} else {
		//page mapping
		reserve_crashkernel()
	}

------------ Simplified real-world process ---------
arm64_memblock_init()
	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
		reserve_crashkernel()
paging_init()
	map_mem()
		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
			//block mapping
		else
			//page mapping
unflatten_device_tree or ACPI
bootmem_init();
	zone_sizes_init()
		of_dma_get_max_cpu_address
		//Update arm64_dma_phys_limit
	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
		reserve_crashkernel()


> 
> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-28  9:33                   ` Leizhen (ThunderTown)
  (?)
@ 2022-04-29  3:24                     ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-29  3:24 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/28 11:52, Baoquan He wrote:
> > On 04/28/22 at 11:40am, Baoquan He wrote:
> >> Hi Catalin, Zhen Lei,
> >>
> >> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> >>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> >>>> On 2022/4/27 20:32, Catalin Marinas wrote:
> >>>>> I think one could always pass a default command line like:
> >>>>>
> >>>>> 	crashkernel=1G,high crashkernel=128M,low
> >>>>>
> >>>>> without much knowledge of the SoC memory layout.
> >>>>
> >>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
> >>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
> >>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
> >>>>
> >>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> >>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>>>                                                crash_base, crash_max);
> >>>>
> >>>>> Another option is to only introduce crashkernel=Y,low and, when that is
> >>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> >>>>> 'high' option at all:
> >>>>>
> >>>>> 	crashkernel=1G				- all within ZONE_DMA
> >>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> >>>>> 						  1G above ZONE_DMA
> >>>>>
> >>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> >>>>> the 'low' option.
> >>>>
> >>>> I think although the code is hard to make generic, the interface is better to
> >>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
> >>>> so on. It's not a good thing that the difference is too big.
> >>>
> >>> There will be some difference as the 4G limit doesn't always hold for
> >>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> >>> things a bit while following the documented behaviour:
> >>>
> >>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> >>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> >>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> >>>
> >>> There is no fallback from crashkernel=Y.
> >>>
> >>> The question is whether we still want a default low allocation if
> >>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> >>> think we'd be consistent with kernel-parameters.txt for the 'low'
> >>> description. A default 'low' is probably not that bad but I'm tempted to
> >>> always mandate both 'high' and 'low'.
> >>
> >> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> >> about this version. And I have the same concerns about them which comes
> >> from below points:
> >> 1) we may need to take best effort to keep ,high, ,low behaviour
> >> consistent on all ARCHes. Otherwise user/admin may be confused when they
> >> deploy/configure kdump on different machines of different ARCHes in the
> >> same LAB. I think we should try to avoid the confusion.
> 
> Yes, but for someone who is configuring crashkernel= for the first time, he
> needs to read doc to understand how to configure it. The doc can show the
> recommended default value of 'low' size.
> 
> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
> 
> 
> +                * -swiotlb size: user-specified with swiotlb= or default.
> -               low_size = swiotlb_size_or_default() + (8UL<<20);
> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
> 
> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
> omitting it. This may be another way to avoid confusion. It's not hard for
> programmer-turned-user/admin. However, this requires us to forgo backward
> compatibility with the default size of 'low'.

We can make ,high and ,low simpler at first as they are alternative. If
possible, we can also simplify the ,high ,low implementation on x86_64
if it truly brings better archievement on arm64.

> 
> 
> >> 2) Fallback behaviour is important to our distros. The reason is we will
> >> provide default value with crashkernel=xxxM along kernel of distros. In
> >> this case, we hope the reservation will succeed by all means. The ,high
> >> and ,low is an option if customer likes to take with expertise.
> 
> OK, I got it.
> 
> >>
> >> After going through arm64 memory init code, I got below summary about
> >> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> >> can make use of it to facilitate to simplify code.
> >> ================================================================================
> >>                         DMA                      DMA32                    NORMAL
> >> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> >> 2)Normal machine        0~4G                     0                        (above 4G)
> >> 3)Special machine       (above 4G)~MAX
> >> 4)No DMA|DMA32                                                            (above 4G)~MAX
> 
> arm64_memblock_init()
> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
We don't need different code for this place of reservation as you are
doing in this patchset, since arm64_dma_phys_limit is initialized as 
below. In fact, in arm64_memblock_init(), we have made memblock ready,
we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
memblock_start_of_DRAM() is bigger than 4G, we possibly can call
reserve_crashkernel() here too.

phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;

> paging_init()                                       |
> 	map_mem()                                   |
> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
> bootmem_init();                                     |      |
> 	zone_sizes_init()                           |      |
> 		of_dma_get_max_cpu_address          |  ----|
> 		//Update arm64_dma_phys_limit       |  ----|
> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
> request_standard_resources()

Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
about BLOCKING may not be right since you assume the 4G boundary, while
forgetting Raspberry Pi4 on which 1G is the boundary of low memory and
high memory. So separating out BLOCKING optimization can let us focus on
the crashkernel,high support.

> 
> >>
> >> -------------------------------------------
> >>                       arm64_dma_phys_limit
> >> 1)Raspberry Pi4         1G                     
> >> 2)Normal machine        4G                     
> >> 3)Special machine       MAX
> >> 4)No DMA|DMA32          MAX
> >>
> >> Note: 3)Special machine means the machine's starting physical address is above 4G.
> >> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
> >> IOMMU hardware supporting.
> >> ===================================================================================
> >>
> >> I made a draft patch based on this patchset, please feel free to check and
> >> see if it's OK, or anything missing or wrongly understood. I removed
> >> reserve_crashkernel_high() and only keep reserve_crashkernel() and
> >> reserve_crashkernel_low() as the v21 did.
> > 
> > Sorry, forgot attaching the draft patch.
> > 
> > By the way, we can also have a simple version with basic ,high, ,low
> > support, no fallback. We can add fallback and other optimization later.
> > This can be plan B.
> 
> Yes, That's what Catalin suggested also.
> 
> Hi, Baoquan He:
>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
> 
> Process after abstraction:
> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
> 		reserve_crashkernel()
> 		//block mapping
> 	} else {
> 		//page mapping
> 		reserve_crashkernel()
> 	}
> 
> ------------ Simplified real-world process ---------
Yeah, this looks clearer. I would like to see a version with them.

> arm64_memblock_init()
        Before reserve_crashkernel(), we can update arm64_dma_phys_limit
as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
memblock_start_of_DRAM() is bigger than 4G.
        Then we go with:
        if (!arm64_dma_phys_limit)
		reserve_crashkernel();

Just personal opinion, please check if it's appropriate to handle case
3) which has physical starting memory above 4G here. 

> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
> 		reserve_crashkernel()
           
> paging_init()
> 	map_mem()
> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
> 			//block mapping
> 		else
> 			//page mapping
> unflatten_device_tree or ACPI
> bootmem_init();
> 	zone_sizes_init()
> 		of_dma_get_max_cpu_address
> 		//Update arm64_dma_phys_limit
> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
> 		reserve_crashkernel()

The rest sounds good with optimization code split out.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-29  3:24                     ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-29  3:24 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/28 11:52, Baoquan He wrote:
> > On 04/28/22 at 11:40am, Baoquan He wrote:
> >> Hi Catalin, Zhen Lei,
> >>
> >> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> >>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> >>>> On 2022/4/27 20:32, Catalin Marinas wrote:
> >>>>> I think one could always pass a default command line like:
> >>>>>
> >>>>> 	crashkernel=1G,high crashkernel=128M,low
> >>>>>
> >>>>> without much knowledge of the SoC memory layout.
> >>>>
> >>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
> >>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
> >>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
> >>>>
> >>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> >>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>>>                                                crash_base, crash_max);
> >>>>
> >>>>> Another option is to only introduce crashkernel=Y,low and, when that is
> >>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> >>>>> 'high' option at all:
> >>>>>
> >>>>> 	crashkernel=1G				- all within ZONE_DMA
> >>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> >>>>> 						  1G above ZONE_DMA
> >>>>>
> >>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> >>>>> the 'low' option.
> >>>>
> >>>> I think although the code is hard to make generic, the interface is better to
> >>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
> >>>> so on. It's not a good thing that the difference is too big.
> >>>
> >>> There will be some difference as the 4G limit doesn't always hold for
> >>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> >>> things a bit while following the documented behaviour:
> >>>
> >>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> >>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> >>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> >>>
> >>> There is no fallback from crashkernel=Y.
> >>>
> >>> The question is whether we still want a default low allocation if
> >>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> >>> think we'd be consistent with kernel-parameters.txt for the 'low'
> >>> description. A default 'low' is probably not that bad but I'm tempted to
> >>> always mandate both 'high' and 'low'.
> >>
> >> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> >> about this version. And I have the same concerns about them which comes
> >> from below points:
> >> 1) we may need to take best effort to keep ,high, ,low behaviour
> >> consistent on all ARCHes. Otherwise user/admin may be confused when they
> >> deploy/configure kdump on different machines of different ARCHes in the
> >> same LAB. I think we should try to avoid the confusion.
> 
> Yes, but for someone who is configuring crashkernel= for the first time, he
> needs to read doc to understand how to configure it. The doc can show the
> recommended default value of 'low' size.
> 
> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
> 
> 
> +                * -swiotlb size: user-specified with swiotlb= or default.
> -               low_size = swiotlb_size_or_default() + (8UL<<20);
> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
> 
> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
> omitting it. This may be another way to avoid confusion. It's not hard for
> programmer-turned-user/admin. However, this requires us to forgo backward
> compatibility with the default size of 'low'.

We can make ,high and ,low simpler at first as they are alternative. If
possible, we can also simplify the ,high ,low implementation on x86_64
if it truly brings better archievement on arm64.

> 
> 
> >> 2) Fallback behaviour is important to our distros. The reason is we will
> >> provide default value with crashkernel=xxxM along kernel of distros. In
> >> this case, we hope the reservation will succeed by all means. The ,high
> >> and ,low is an option if customer likes to take with expertise.
> 
> OK, I got it.
> 
> >>
> >> After going through arm64 memory init code, I got below summary about
> >> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> >> can make use of it to facilitate to simplify code.
> >> ================================================================================
> >>                         DMA                      DMA32                    NORMAL
> >> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> >> 2)Normal machine        0~4G                     0                        (above 4G)
> >> 3)Special machine       (above 4G)~MAX
> >> 4)No DMA|DMA32                                                            (above 4G)~MAX
> 
> arm64_memblock_init()
> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
We don't need different code for this place of reservation as you are
doing in this patchset, since arm64_dma_phys_limit is initialized as 
below. In fact, in arm64_memblock_init(), we have made memblock ready,
we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
memblock_start_of_DRAM() is bigger than 4G, we possibly can call
reserve_crashkernel() here too.

phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;

> paging_init()                                       |
> 	map_mem()                                   |
> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
> bootmem_init();                                     |      |
> 	zone_sizes_init()                           |      |
> 		of_dma_get_max_cpu_address          |  ----|
> 		//Update arm64_dma_phys_limit       |  ----|
> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
> request_standard_resources()

Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
about BLOCKING may not be right since you assume the 4G boundary, while
forgetting Raspberry Pi4 on which 1G is the boundary of low memory and
high memory. So separating out BLOCKING optimization can let us focus on
the crashkernel,high support.

> 
> >>
> >> -------------------------------------------
> >>                       arm64_dma_phys_limit
> >> 1)Raspberry Pi4         1G                     
> >> 2)Normal machine        4G                     
> >> 3)Special machine       MAX
> >> 4)No DMA|DMA32          MAX
> >>
> >> Note: 3)Special machine means the machine's starting physical address is above 4G.
> >> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
> >> IOMMU hardware supporting.
> >> ===================================================================================
> >>
> >> I made a draft patch based on this patchset, please feel free to check and
> >> see if it's OK, or anything missing or wrongly understood. I removed
> >> reserve_crashkernel_high() and only keep reserve_crashkernel() and
> >> reserve_crashkernel_low() as the v21 did.
> > 
> > Sorry, forgot attaching the draft patch.
> > 
> > By the way, we can also have a simple version with basic ,high, ,low
> > support, no fallback. We can add fallback and other optimization later.
> > This can be plan B.
> 
> Yes, That's what Catalin suggested also.
> 
> Hi, Baoquan He:
>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
> 
> Process after abstraction:
> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
> 		reserve_crashkernel()
> 		//block mapping
> 	} else {
> 		//page mapping
> 		reserve_crashkernel()
> 	}
> 
> ------------ Simplified real-world process ---------
Yeah, this looks clearer. I would like to see a version with them.

> arm64_memblock_init()
        Before reserve_crashkernel(), we can update arm64_dma_phys_limit
as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
memblock_start_of_DRAM() is bigger than 4G.
        Then we go with:
        if (!arm64_dma_phys_limit)
		reserve_crashkernel();

Just personal opinion, please check if it's appropriate to handle case
3) which has physical starting memory above 4G here. 

> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
> 		reserve_crashkernel()
           
> paging_init()
> 	map_mem()
> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
> 			//block mapping
> 		else
> 			//page mapping
> unflatten_device_tree or ACPI
> bootmem_init();
> 	zone_sizes_init()
> 		of_dma_get_max_cpu_address
> 		//Update arm64_dma_phys_limit
> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
> 		reserve_crashkernel()

The rest sounds good with optimization code split out.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-29  3:24                     ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-04-29  3:24 UTC (permalink / raw)
  To: kexec

On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/28 11:52, Baoquan He wrote:
> > On 04/28/22 at 11:40am, Baoquan He wrote:
> >> Hi Catalin, Zhen Lei,
> >>
> >> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> >>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
> >>>> On 2022/4/27 20:32, Catalin Marinas wrote:
> >>>>> I think one could always pass a default command line like:
> >>>>>
> >>>>> 	crashkernel=1G,high crashkernel=128M,low
> >>>>>
> >>>>> without much knowledge of the SoC memory layout.
> >>>>
> >>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
> >>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
> >>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
> >>>>
> >>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
> >>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>>>                                                crash_base, crash_max);
> >>>>
> >>>>> Another option is to only introduce crashkernel=Y,low and, when that is
> >>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
> >>>>> 'high' option at all:
> >>>>>
> >>>>> 	crashkernel=1G				- all within ZONE_DMA
> >>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
> >>>>> 						  1G above ZONE_DMA
> >>>>>
> >>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
> >>>>> the 'low' option.
> >>>>
> >>>> I think although the code is hard to make generic, the interface is better to
> >>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
> >>>> so on. It's not a good thing that the difference is too big.
> >>>
> >>> There will be some difference as the 4G limit doesn't always hold for
> >>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> >>> things a bit while following the documented behaviour:
> >>>
> >>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> >>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> >>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> >>>
> >>> There is no fallback from crashkernel=Y.
> >>>
> >>> The question is whether we still want a default low allocation if
> >>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
> >>> think we'd be consistent with kernel-parameters.txt for the 'low'
> >>> description. A default 'low' is probably not that bad but I'm tempted to
> >>> always mandate both 'high' and 'low'.
> >>
> >> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> >> about this version. And I have the same concerns about them which comes
> >> from below points:
> >> 1) we may need to take best effort to keep ,high, ,low behaviour
> >> consistent on all ARCHes. Otherwise user/admin may be confused when they
> >> deploy/configure kdump on different machines of different ARCHes in the
> >> same LAB. I think we should try to avoid the confusion.
> 
> Yes, but for someone who is configuring crashkernel= for the first time, he
> needs to read doc to understand how to configure it. The doc can show the
> recommended default value of 'low' size.
> 
> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
> 
> 
> +                * -swiotlb size: user-specified with swiotlb= or default.
> -               low_size = swiotlb_size_or_default() + (8UL<<20);
> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
> 
> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
> omitting it. This may be another way to avoid confusion. It's not hard for
> programmer-turned-user/admin. However, this requires us to forgo backward
> compatibility with the default size of 'low'.

We can make ,high and ,low simpler at first as they are alternative. If
possible, we can also simplify the ,high ,low implementation on x86_64
if it truly brings better archievement on arm64.

> 
> 
> >> 2) Fallback behaviour is important to our distros. The reason is we will
> >> provide default value with crashkernel=xxxM along kernel of distros. In
> >> this case, we hope the reservation will succeed by all means. The ,high
> >> and ,low is an option if customer likes to take with expertise.
> 
> OK, I got it.
> 
> >>
> >> After going through arm64 memory init code, I got below summary about
> >> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> >> can make use of it to facilitate to simplify code.
> >> ================================================================================
> >>                         DMA                      DMA32                    NORMAL
> >> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> >> 2)Normal machine        0~4G                     0                        (above 4G)
> >> 3)Special machine       (above 4G)~MAX
> >> 4)No DMA|DMA32                                                            (above 4G)~MAX
> 
> arm64_memblock_init()
> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
We don't need different code for this place of reservation as you are
doing in this patchset, since arm64_dma_phys_limit is initialized as 
below. In fact, in arm64_memblock_init(), we have made memblock ready,
we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
memblock_start_of_DRAM() is bigger than 4G, we possibly can call
reserve_crashkernel() here too.

phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;

> paging_init()                                       |
> 	map_mem()                                   |
> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
> bootmem_init();                                     |      |
> 	zone_sizes_init()                           |      |
> 		of_dma_get_max_cpu_address          |  ----|
> 		//Update arm64_dma_phys_limit       |  ----|
> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
> request_standard_resources()

Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
about BLOCKING may not be right since you assume the 4G boundary, while
forgetting Raspberry Pi4 on which 1G is the boundary of low memory and
high memory. So separating out BLOCKING optimization can let us focus on
the crashkernel,high support.

> 
> >>
> >> -------------------------------------------
> >>                       arm64_dma_phys_limit
> >> 1)Raspberry Pi4         1G                     
> >> 2)Normal machine        4G                     
> >> 3)Special machine       MAX
> >> 4)No DMA|DMA32          MAX
> >>
> >> Note: 3)Special machine means the machine's starting physical address is above 4G.
> >> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
> >> IOMMU hardware supporting.
> >> ===================================================================================
> >>
> >> I made a draft patch based on this patchset, please feel free to check and
> >> see if it's OK, or anything missing or wrongly understood. I removed
> >> reserve_crashkernel_high() and only keep reserve_crashkernel() and
> >> reserve_crashkernel_low() as the v21 did.
> > 
> > Sorry, forgot attaching the draft patch.
> > 
> > By the way, we can also have a simple version with basic ,high, ,low
> > support, no fallback. We can add fallback and other optimization later.
> > This can be plan B.
> 
> Yes, That's what Catalin suggested also.
> 
> Hi, Baoquan He:
>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
> 
> Process after abstraction:
> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
> 		reserve_crashkernel()
> 		//block mapping
> 	} else {
> 		//page mapping
> 		reserve_crashkernel()
> 	}
> 
> ------------ Simplified real-world process ---------
Yeah, this looks clearer. I would like to see a version with them.

> arm64_memblock_init()
        Before reserve_crashkernel(), we can update arm64_dma_phys_limit
as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
memblock_start_of_DRAM() is bigger than 4G.
        Then we go with:
        if (!arm64_dma_phys_limit)
		reserve_crashkernel();

Just personal opinion, please check if it's appropriate to handle case
3) which has physical starting memory above 4G here. 

> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
> 		reserve_crashkernel()
           
> paging_init()
> 	map_mem()
> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
> 			//block mapping
> 		else
> 			//page mapping
> unflatten_device_tree or ACPI
> bootmem_init();
> 	zone_sizes_init()
> 		of_dma_get_max_cpu_address
> 		//Update arm64_dma_phys_limit
> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
> 		reserve_crashkernel()

The rest sounds good with optimization code split out.



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-29  3:24                     ` Baoquan He
  (?)
@ 2022-04-29  8:02                       ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-29  8:02 UTC (permalink / raw)
  To: Baoquan He
  Cc: Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/4/29 11:24, Baoquan He wrote:
> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/4/28 11:52, Baoquan He wrote:
>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>> Hi Catalin, Zhen Lei,
>>>>
>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>>>> I think one could always pass a default command line like:
>>>>>>>
>>>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>>>
>>>>>>> without much knowledge of the SoC memory layout.
>>>>>>
>>>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>>>
>>>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>                                                crash_base, crash_max);
>>>>>>
>>>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>>>> 'high' option at all:
>>>>>>>
>>>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>>>> 						  1G above ZONE_DMA
>>>>>>>
>>>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>>>> the 'low' option.
>>>>>>
>>>>>> I think although the code is hard to make generic, the interface is better to
>>>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>>>> so on. It's not a good thing that the difference is too big.
>>>>>
>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>> things a bit while following the documented behaviour:
>>>>>
>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>>>
>>>>> There is no fallback from crashkernel=Y.
>>>>>
>>>>> The question is whether we still want a default low allocation if
>>>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>>>> description. A default 'low' is probably not that bad but I'm tempted to
>>>>> always mandate both 'high' and 'low'.
>>>>
>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>> about this version. And I have the same concerns about them which comes
>>>> from below points:
>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>> same LAB. I think we should try to avoid the confusion.
>>
>> Yes, but for someone who is configuring crashkernel= for the first time, he
>> needs to read doc to understand how to configure it. The doc can show the
>> recommended default value of 'low' size.
>>
>> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
>> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
>> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
>>
>>
>> +                * -swiotlb size: user-specified with swiotlb= or default.
>> -               low_size = swiotlb_size_or_default() + (8UL<<20);
>> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
>>
>> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
>> omitting it. This may be another way to avoid confusion. It's not hard for
>> programmer-turned-user/admin. However, this requires us to forgo backward
>> compatibility with the default size of 'low'.
> 
> We can make ,high and ,low simpler at first as they are alternative. If
> possible, we can also simplify the ,high ,low implementation on x86_64
> if it truly brings better archievement on arm64.

OK, I plan to remove optimization, fallback and default low size, to follow the
suggestion of Catalin first. But there's one minor point of contention.

1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
   or
      Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.

I prefer 2), how about you?

> 
>>
>>
>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>> and ,low is an option if customer likes to take with expertise.
>>
>> OK, I got it.
>>
>>>>
>>>> After going through arm64 memory init code, I got below summary about
>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>> can make use of it to facilitate to simplify code.
>>>> ================================================================================
>>>>                         DMA                      DMA32                    NORMAL
>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>> 3)Special machine       (above 4G)~MAX
>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>
>> arm64_memblock_init()
>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> We don't need different code for this place of reservation as you are
> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> reserve_crashkernel() here too.

Yes. Maybe all the devices in this environment are 64-bit. One way I know of allowing
32-bit devices to access high memory without SMMU is: Set a fixed value for the upper
32 bits. In this case, the DMA zone should be [phys_start, phys_start + 4G).

> 
> phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
> 
>> paging_init()                                       |
>> 	map_mem()                                   |
>> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
>> bootmem_init();                                     |      |
>> 	zone_sizes_init()                           |      |
>> 		of_dma_get_max_cpu_address          |  ----|
>> 		//Update arm64_dma_phys_limit       |  ----|
>> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
>> request_standard_resources()
> 
> Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
> as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
> arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
> about BLOCKING may not be right since you assume the 4G boundary, while
> forgetting Raspberry Pi4 on which 1G is the boundary of low memory and

No, no, my optimization for Raspberry Pi4 is fine. I do page mapping for memory
under 4G and block mapping for memory above 4G. But still try to reserve crash
low memory from DMA zone. So when 1G is the boundary, it's just not fully optimized,
the memory 1-4G are mapped as page.

> high memory. So separating out BLOCKING optimization can let us focus on
> the crashkernel,high support.
> 
>>
>>>>
>>>> -------------------------------------------
>>>>                       arm64_dma_phys_limit
>>>> 1)Raspberry Pi4         1G                     
>>>> 2)Normal machine        4G                     
>>>> 3)Special machine       MAX
>>>> 4)No DMA|DMA32          MAX
>>>>
>>>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>>>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>>>> IOMMU hardware supporting.
>>>> ===================================================================================
>>>>
>>>> I made a draft patch based on this patchset, please feel free to check and
>>>> see if it's OK, or anything missing or wrongly understood. I removed
>>>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>>>> reserve_crashkernel_low() as the v21 did.
>>>
>>> Sorry, forgot attaching the draft patch.
>>>
>>> By the way, we can also have a simple version with basic ,high, ,low
>>> support, no fallback. We can add fallback and other optimization later.
>>> This can be plan B.
>>
>> Yes, That's what Catalin suggested also.
>>
>> Hi, Baoquan He:
>>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
>>
>> Process after abstraction:
>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
>> 		reserve_crashkernel()
>> 		//block mapping
>> 	} else {
>> 		//page mapping
>> 		reserve_crashkernel()
>> 	}
>>
>> ------------ Simplified real-world process ---------
> Yeah, this looks clearer. I would like to see a version with them.
> 
>> arm64_memblock_init()
>         Before reserve_crashkernel(), we can update arm64_dma_phys_limit
> as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
> memblock_start_of_DRAM() is bigger than 4G.
>         Then we go with:
>         if (!arm64_dma_phys_limit)
> 		reserve_crashkernel();
> 
> Just personal opinion, please check if it's appropriate to handle case
> 3) which has physical starting memory above 4G here. 

OK, I'll write it down and consider it in the future optimization.

> 
>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>> 		reserve_crashkernel()
>            
>> paging_init()
>> 	map_mem()
>> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>> 			//block mapping
>> 		else
>> 			//page mapping
>> unflatten_device_tree or ACPI
>> bootmem_init();
>> 	zone_sizes_init()
>> 		of_dma_get_max_cpu_address
>> 		//Update arm64_dma_phys_limit
>> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
>> 		reserve_crashkernel()
> 
> The rest sounds good with optimization code split out.
> 
> .
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-29  8:02                       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-29  8:02 UTC (permalink / raw)
  To: Baoquan He
  Cc: Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/4/29 11:24, Baoquan He wrote:
> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/4/28 11:52, Baoquan He wrote:
>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>> Hi Catalin, Zhen Lei,
>>>>
>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>>>> I think one could always pass a default command line like:
>>>>>>>
>>>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>>>
>>>>>>> without much knowledge of the SoC memory layout.
>>>>>>
>>>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>>>
>>>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>                                                crash_base, crash_max);
>>>>>>
>>>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>>>> 'high' option at all:
>>>>>>>
>>>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>>>> 						  1G above ZONE_DMA
>>>>>>>
>>>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>>>> the 'low' option.
>>>>>>
>>>>>> I think although the code is hard to make generic, the interface is better to
>>>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>>>> so on. It's not a good thing that the difference is too big.
>>>>>
>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>> things a bit while following the documented behaviour:
>>>>>
>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>>>
>>>>> There is no fallback from crashkernel=Y.
>>>>>
>>>>> The question is whether we still want a default low allocation if
>>>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>>>> description. A default 'low' is probably not that bad but I'm tempted to
>>>>> always mandate both 'high' and 'low'.
>>>>
>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>> about this version. And I have the same concerns about them which comes
>>>> from below points:
>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>> same LAB. I think we should try to avoid the confusion.
>>
>> Yes, but for someone who is configuring crashkernel= for the first time, he
>> needs to read doc to understand how to configure it. The doc can show the
>> recommended default value of 'low' size.
>>
>> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
>> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
>> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
>>
>>
>> +                * -swiotlb size: user-specified with swiotlb= or default.
>> -               low_size = swiotlb_size_or_default() + (8UL<<20);
>> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
>>
>> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
>> omitting it. This may be another way to avoid confusion. It's not hard for
>> programmer-turned-user/admin. However, this requires us to forgo backward
>> compatibility with the default size of 'low'.
> 
> We can make ,high and ,low simpler at first as they are alternative. If
> possible, we can also simplify the ,high ,low implementation on x86_64
> if it truly brings better archievement on arm64.

OK, I plan to remove optimization, fallback and default low size, to follow the
suggestion of Catalin first. But there's one minor point of contention.

1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
   or
      Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.

I prefer 2), how about you?

> 
>>
>>
>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>> and ,low is an option if customer likes to take with expertise.
>>
>> OK, I got it.
>>
>>>>
>>>> After going through arm64 memory init code, I got below summary about
>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>> can make use of it to facilitate to simplify code.
>>>> ================================================================================
>>>>                         DMA                      DMA32                    NORMAL
>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>> 3)Special machine       (above 4G)~MAX
>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>
>> arm64_memblock_init()
>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> We don't need different code for this place of reservation as you are
> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> reserve_crashkernel() here too.

Yes. Maybe all the devices in this environment are 64-bit. One way I know of allowing
32-bit devices to access high memory without SMMU is: Set a fixed value for the upper
32 bits. In this case, the DMA zone should be [phys_start, phys_start + 4G).

> 
> phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
> 
>> paging_init()                                       |
>> 	map_mem()                                   |
>> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
>> bootmem_init();                                     |      |
>> 	zone_sizes_init()                           |      |
>> 		of_dma_get_max_cpu_address          |  ----|
>> 		//Update arm64_dma_phys_limit       |  ----|
>> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
>> request_standard_resources()
> 
> Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
> as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
> arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
> about BLOCKING may not be right since you assume the 4G boundary, while
> forgetting Raspberry Pi4 on which 1G is the boundary of low memory and

No, no, my optimization for Raspberry Pi4 is fine. I do page mapping for memory
under 4G and block mapping for memory above 4G. But still try to reserve crash
low memory from DMA zone. So when 1G is the boundary, it's just not fully optimized,
the memory 1-4G are mapped as page.

> high memory. So separating out BLOCKING optimization can let us focus on
> the crashkernel,high support.
> 
>>
>>>>
>>>> -------------------------------------------
>>>>                       arm64_dma_phys_limit
>>>> 1)Raspberry Pi4         1G                     
>>>> 2)Normal machine        4G                     
>>>> 3)Special machine       MAX
>>>> 4)No DMA|DMA32          MAX
>>>>
>>>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>>>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>>>> IOMMU hardware supporting.
>>>> ===================================================================================
>>>>
>>>> I made a draft patch based on this patchset, please feel free to check and
>>>> see if it's OK, or anything missing or wrongly understood. I removed
>>>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>>>> reserve_crashkernel_low() as the v21 did.
>>>
>>> Sorry, forgot attaching the draft patch.
>>>
>>> By the way, we can also have a simple version with basic ,high, ,low
>>> support, no fallback. We can add fallback and other optimization later.
>>> This can be plan B.
>>
>> Yes, That's what Catalin suggested also.
>>
>> Hi, Baoquan He:
>>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
>>
>> Process after abstraction:
>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
>> 		reserve_crashkernel()
>> 		//block mapping
>> 	} else {
>> 		//page mapping
>> 		reserve_crashkernel()
>> 	}
>>
>> ------------ Simplified real-world process ---------
> Yeah, this looks clearer. I would like to see a version with them.
> 
>> arm64_memblock_init()
>         Before reserve_crashkernel(), we can update arm64_dma_phys_limit
> as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
> memblock_start_of_DRAM() is bigger than 4G.
>         Then we go with:
>         if (!arm64_dma_phys_limit)
> 		reserve_crashkernel();
> 
> Just personal opinion, please check if it's appropriate to handle case
> 3) which has physical starting memory above 4G here. 

OK, I'll write it down and consider it in the future optimization.

> 
>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>> 		reserve_crashkernel()
>            
>> paging_init()
>> 	map_mem()
>> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>> 			//block mapping
>> 		else
>> 			//page mapping
>> unflatten_device_tree or ACPI
>> bootmem_init();
>> 	zone_sizes_init()
>> 		of_dma_get_max_cpu_address
>> 		//Update arm64_dma_phys_limit
>> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
>> 		reserve_crashkernel()
> 
> The rest sounds good with optimization code split out.
> 
> .
> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-29  8:02                       ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-29  8:02 UTC (permalink / raw)
  To: kexec



On 2022/4/29 11:24, Baoquan He wrote:
> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/4/28 11:52, Baoquan He wrote:
>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>> Hi Catalin, Zhen Lei,
>>>>
>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>>>> I think one could always pass a default command line like:
>>>>>>>
>>>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>>>
>>>>>>> without much knowledge of the SoC memory layout.
>>>>>>
>>>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>>>
>>>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>                                                crash_base, crash_max);
>>>>>>
>>>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>>>> 'high' option at all:
>>>>>>>
>>>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>>>> 						  1G above ZONE_DMA
>>>>>>>
>>>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>>>> the 'low' option.
>>>>>>
>>>>>> I think although the code is hard to make generic, the interface is better to
>>>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>>>> so on. It's not a good thing that the difference is too big.
>>>>>
>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>> things a bit while following the documented behaviour:
>>>>>
>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>>>
>>>>> There is no fallback from crashkernel=Y.
>>>>>
>>>>> The question is whether we still want a default low allocation if
>>>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>>>> description. A default 'low' is probably not that bad but I'm tempted to
>>>>> always mandate both 'high' and 'low'.
>>>>
>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>> about this version. And I have the same concerns about them which comes
>>>> from below points:
>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>> same LAB. I think we should try to avoid the confusion.
>>
>> Yes, but for someone who is configuring crashkernel= for the first time, he
>> needs to read doc to understand how to configure it. The doc can show the
>> recommended default value of 'low' size.
>>
>> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
>> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
>> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
>>
>>
>> +                * -swiotlb size: user-specified with swiotlb= or default.
>> -               low_size = swiotlb_size_or_default() + (8UL<<20);
>> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
>>
>> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
>> omitting it. This may be another way to avoid confusion. It's not hard for
>> programmer-turned-user/admin. However, this requires us to forgo backward
>> compatibility with the default size of 'low'.
> 
> We can make ,high and ,low simpler at first as they are alternative. If
> possible, we can also simplify the ,high ,low implementation on x86_64
> if it truly brings better archievement on arm64.

OK, I plan to remove optimization, fallback and default low size, to follow the
suggestion of Catalin first. But there's one minor point of contention.

1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
   or
      Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.

I prefer 2), how about you?

> 
>>
>>
>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>> and ,low is an option if customer likes to take with expertise.
>>
>> OK, I got it.
>>
>>>>
>>>> After going through arm64 memory init code, I got below summary about
>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>> can make use of it to facilitate to simplify code.
>>>> ================================================================================
>>>>                         DMA                      DMA32                    NORMAL
>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>> 3)Special machine       (above 4G)~MAX
>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>
>> arm64_memblock_init()
>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> We don't need different code for this place of reservation as you are
> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> reserve_crashkernel() here too.

Yes. Maybe all the devices in this environment are 64-bit. One way I know of allowing
32-bit devices to access high memory without SMMU is: Set a fixed value for the upper
32 bits. In this case, the DMA zone should be [phys_start, phys_start + 4G).

> 
> phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
> 
>> paging_init()                                       |
>> 	map_mem()                                   |
>> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
>> bootmem_init();                                     |      |
>> 	zone_sizes_init()                           |      |
>> 		of_dma_get_max_cpu_address          |  ----|
>> 		//Update arm64_dma_phys_limit       |  ----|
>> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
>> request_standard_resources()
> 
> Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
> as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
> arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
> about BLOCKING may not be right since you assume the 4G boundary, while
> forgetting Raspberry Pi4 on which 1G is the boundary of low memory and

No, no, my optimization for Raspberry Pi4 is fine. I do page mapping for memory
under 4G and block mapping for memory above 4G. But still try to reserve crash
low memory from DMA zone. So when 1G is the boundary, it's just not fully optimized,
the memory 1-4G are mapped as page.

> high memory. So separating out BLOCKING optimization can let us focus on
> the crashkernel,high support.
> 
>>
>>>>
>>>> -------------------------------------------
>>>>                       arm64_dma_phys_limit
>>>> 1)Raspberry Pi4         1G                     
>>>> 2)Normal machine        4G                     
>>>> 3)Special machine       MAX
>>>> 4)No DMA|DMA32          MAX
>>>>
>>>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>>>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>>>> IOMMU hardware supporting.
>>>> ===================================================================================
>>>>
>>>> I made a draft patch based on this patchset, please feel free to check and
>>>> see if it's OK, or anything missing or wrongly understood. I removed
>>>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>>>> reserve_crashkernel_low() as the v21 did.
>>>
>>> Sorry, forgot attaching the draft patch.
>>>
>>> By the way, we can also have a simple version with basic ,high, ,low
>>> support, no fallback. We can add fallback and other optimization later.
>>> This can be plan B.
>>
>> Yes, That's what Catalin suggested also.
>>
>> Hi, Baoquan He:
>>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
>>
>> Process after abstraction:
>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
>> 		reserve_crashkernel()
>> 		//block mapping
>> 	} else {
>> 		//page mapping
>> 		reserve_crashkernel()
>> 	}
>>
>> ------------ Simplified real-world process ---------
> Yeah, this looks clearer. I would like to see a version with them.
> 
>> arm64_memblock_init()
>         Before reserve_crashkernel(), we can update arm64_dma_phys_limit
> as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
> memblock_start_of_DRAM() is bigger than 4G.
>         Then we go with:
>         if (!arm64_dma_phys_limit)
> 		reserve_crashkernel();
> 
> Just personal opinion, please check if it's appropriate to handle case
> 3) which has physical starting memory above 4G here. 

OK, I'll write it down and consider it in the future optimization.

> 
>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>> 		reserve_crashkernel()
>            
>> paging_init()
>> 	map_mem()
>> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>> 			//block mapping
>> 		else
>> 			//page mapping
>> unflatten_device_tree or ACPI
>> bootmem_init();
>> 	zone_sizes_init()
>> 		of_dma_get_max_cpu_address
>> 		//Update arm64_dma_phys_limit
>> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
>> 		reserve_crashkernel()
> 
> The rest sounds good with optimization code split out.
> 
> .
> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-29  8:02                       ` Leizhen (ThunderTown)
  (?)
@ 2022-04-29  8:25                         ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-29  8:25 UTC (permalink / raw)
  To: Baoquan He
  Cc: Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/29 11:24, Baoquan He wrote:
>> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2022/4/28 11:52, Baoquan He wrote:
>>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>>> Hi Catalin, Zhen Lei,
>>>>>
>>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>>>>> I think one could always pass a default command line like:
>>>>>>>>
>>>>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>>>>
>>>>>>>> without much knowledge of the SoC memory layout.
>>>>>>>
>>>>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>>>>
>>>>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>>                                                crash_base, crash_max);
>>>>>>>
>>>>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>>>>> 'high' option at all:
>>>>>>>>
>>>>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>>>>> 						  1G above ZONE_DMA
>>>>>>>>
>>>>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>>>>> the 'low' option.
>>>>>>>
>>>>>>> I think although the code is hard to make generic, the interface is better to
>>>>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>>>>> so on. It's not a good thing that the difference is too big.
>>>>>>
>>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>>> things a bit while following the documented behaviour:
>>>>>>
>>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>>>>
>>>>>> There is no fallback from crashkernel=Y.
>>>>>>
>>>>>> The question is whether we still want a default low allocation if
>>>>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>>>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>>>>> description. A default 'low' is probably not that bad but I'm tempted to
>>>>>> always mandate both 'high' and 'low'.
>>>>>
>>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>>> about this version. And I have the same concerns about them which comes
>>>>> from below points:
>>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>>> same LAB. I think we should try to avoid the confusion.
>>>
>>> Yes, but for someone who is configuring crashkernel= for the first time, he
>>> needs to read doc to understand how to configure it. The doc can show the
>>> recommended default value of 'low' size.
>>>
>>> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
>>> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
>>> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
>>>
>>>
>>> +                * -swiotlb size: user-specified with swiotlb= or default.
>>> -               low_size = swiotlb_size_or_default() + (8UL<<20);
>>> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
>>>
>>> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
>>> omitting it. This may be another way to avoid confusion. It's not hard for
>>> programmer-turned-user/admin. However, this requires us to forgo backward
>>> compatibility with the default size of 'low'.
>>
>> We can make ,high and ,low simpler at first as they are alternative. If
>> possible, we can also simplify the ,high ,low implementation on x86_64
>> if it truly brings better archievement on arm64.
> 
> OK, I plan to remove optimization, fallback and default low size, to follow the
> suggestion of Catalin first. But there's one minor point of contention.
> 
> 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
>    or
>       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> 
> I prefer 2), how about you?
> 
>>
>>>
>>>
>>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>>> and ,low is an option if customer likes to take with expertise.
>>>
>>> OK, I got it.
>>>
>>>>>
>>>>> After going through arm64 memory init code, I got below summary about
>>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>>> can make use of it to facilitate to simplify code.
>>>>> ================================================================================
>>>>>                         DMA                      DMA32                    NORMAL
>>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>>> 3)Special machine       (above 4G)~MAX
>>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>>
>>> arm64_memblock_init()
>>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
>> We don't need different code for this place of reservation as you are
>> doing in this patchset, since arm64_dma_phys_limit is initialized as 
>> below. In fact, in arm64_memblock_init(), we have made memblock ready,
>> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
>> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
>> reserve_crashkernel() here too.
> 
> Yes. Maybe all the devices in this environment are 64-bit. One way I know of allowing
> 32-bit devices to access high memory without SMMU is: Set a fixed value for the upper
> 32 bits. In this case, the DMA zone should be [phys_start, phys_start + 4G).

I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA offsets in the max_zone_phys() calculation")

    Currently, the kernel assumes that if RAM starts above 32-bit (or
    zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
    such constrained devices have a hardwired DMA offset. In practice, we
    haven't noticed any such hardware so let's assume that we can expand
    ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
    ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
    zone_bits.

So your suggestion is feasible.

> 
>>
>> phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
>>
>>> paging_init()                                       |
>>> 	map_mem()                                   |
>>> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
>>> bootmem_init();                                     |      |
>>> 	zone_sizes_init()                           |      |
>>> 		of_dma_get_max_cpu_address          |  ----|
>>> 		//Update arm64_dma_phys_limit       |  ----|
>>> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
>>> request_standard_resources()
>>
>> Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
>> as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
>> arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
>> about BLOCKING may not be right since you assume the 4G boundary, while
>> forgetting Raspberry Pi4 on which 1G is the boundary of low memory and
> 
> No, no, my optimization for Raspberry Pi4 is fine. I do page mapping for memory
> under 4G and block mapping for memory above 4G. But still try to reserve crash
> low memory from DMA zone. So when 1G is the boundary, it's just not fully optimized,
> the memory 1-4G are mapped as page.
> 
>> high memory. So separating out BLOCKING optimization can let us focus on
>> the crashkernel,high support.
>>
>>>
>>>>>
>>>>> -------------------------------------------
>>>>>                       arm64_dma_phys_limit
>>>>> 1)Raspberry Pi4         1G                     
>>>>> 2)Normal machine        4G                     
>>>>> 3)Special machine       MAX
>>>>> 4)No DMA|DMA32          MAX
>>>>>
>>>>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>>>>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>>>>> IOMMU hardware supporting.
>>>>> ===================================================================================
>>>>>
>>>>> I made a draft patch based on this patchset, please feel free to check and
>>>>> see if it's OK, or anything missing or wrongly understood. I removed
>>>>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>>>>> reserve_crashkernel_low() as the v21 did.
>>>>
>>>> Sorry, forgot attaching the draft patch.
>>>>
>>>> By the way, we can also have a simple version with basic ,high, ,low
>>>> support, no fallback. We can add fallback and other optimization later.
>>>> This can be plan B.
>>>
>>> Yes, That's what Catalin suggested also.
>>>
>>> Hi, Baoquan He:
>>>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
>>>
>>> Process after abstraction:
>>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
>>> 		reserve_crashkernel()
>>> 		//block mapping
>>> 	} else {
>>> 		//page mapping
>>> 		reserve_crashkernel()
>>> 	}
>>>
>>> ------------ Simplified real-world process ---------
>> Yeah, this looks clearer. I would like to see a version with them.
>>
>>> arm64_memblock_init()
>>         Before reserve_crashkernel(), we can update arm64_dma_phys_limit
>> as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
>> memblock_start_of_DRAM() is bigger than 4G.
>>         Then we go with:
>>         if (!arm64_dma_phys_limit)
>> 		reserve_crashkernel();
>>
>> Just personal opinion, please check if it's appropriate to handle case
>> 3) which has physical starting memory above 4G here. 
> 
> OK, I'll write it down and consider it in the future optimization.
> 
>>
>>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 		reserve_crashkernel()
>>            
>>> paging_init()
>>> 	map_mem()
>>> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 			//block mapping
>>> 		else
>>> 			//page mapping
>>> unflatten_device_tree or ACPI
>>> bootmem_init();
>>> 	zone_sizes_init()
>>> 		of_dma_get_max_cpu_address
>>> 		//Update arm64_dma_phys_limit
>>> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 		reserve_crashkernel()
>>
>> The rest sounds good with optimization code split out.
>>
>> .
>>
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-29  8:25                         ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-04-29  8:25 UTC (permalink / raw)
  To: Baoquan He
  Cc: Catalin Marinas, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/29 11:24, Baoquan He wrote:
>> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2022/4/28 11:52, Baoquan He wrote:
>>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>>> Hi Catalin, Zhen Lei,
>>>>>
>>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>>>>> I think one could always pass a default command line like:
>>>>>>>>
>>>>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>>>>
>>>>>>>> without much knowledge of the SoC memory layout.
>>>>>>>
>>>>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>>>>
>>>>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>>                                                crash_base, crash_max);
>>>>>>>
>>>>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>>>>> 'high' option at all:
>>>>>>>>
>>>>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>>>>> 						  1G above ZONE_DMA
>>>>>>>>
>>>>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>>>>> the 'low' option.
>>>>>>>
>>>>>>> I think although the code is hard to make generic, the interface is better to
>>>>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>>>>> so on. It's not a good thing that the difference is too big.
>>>>>>
>>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>>> things a bit while following the documented behaviour:
>>>>>>
>>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>>>>
>>>>>> There is no fallback from crashkernel=Y.
>>>>>>
>>>>>> The question is whether we still want a default low allocation if
>>>>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>>>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>>>>> description. A default 'low' is probably not that bad but I'm tempted to
>>>>>> always mandate both 'high' and 'low'.
>>>>>
>>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>>> about this version. And I have the same concerns about them which comes
>>>>> from below points:
>>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>>> same LAB. I think we should try to avoid the confusion.
>>>
>>> Yes, but for someone who is configuring crashkernel= for the first time, he
>>> needs to read doc to understand how to configure it. The doc can show the
>>> recommended default value of 'low' size.
>>>
>>> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
>>> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
>>> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
>>>
>>>
>>> +                * -swiotlb size: user-specified with swiotlb= or default.
>>> -               low_size = swiotlb_size_or_default() + (8UL<<20);
>>> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
>>>
>>> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
>>> omitting it. This may be another way to avoid confusion. It's not hard for
>>> programmer-turned-user/admin. However, this requires us to forgo backward
>>> compatibility with the default size of 'low'.
>>
>> We can make ,high and ,low simpler at first as they are alternative. If
>> possible, we can also simplify the ,high ,low implementation on x86_64
>> if it truly brings better archievement on arm64.
> 
> OK, I plan to remove optimization, fallback and default low size, to follow the
> suggestion of Catalin first. But there's one minor point of contention.
> 
> 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
>    or
>       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> 
> I prefer 2), how about you?
> 
>>
>>>
>>>
>>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>>> and ,low is an option if customer likes to take with expertise.
>>>
>>> OK, I got it.
>>>
>>>>>
>>>>> After going through arm64 memory init code, I got below summary about
>>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>>> can make use of it to facilitate to simplify code.
>>>>> ================================================================================
>>>>>                         DMA                      DMA32                    NORMAL
>>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>>> 3)Special machine       (above 4G)~MAX
>>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>>
>>> arm64_memblock_init()
>>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
>> We don't need different code for this place of reservation as you are
>> doing in this patchset, since arm64_dma_phys_limit is initialized as 
>> below. In fact, in arm64_memblock_init(), we have made memblock ready,
>> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
>> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
>> reserve_crashkernel() here too.
> 
> Yes. Maybe all the devices in this environment are 64-bit. One way I know of allowing
> 32-bit devices to access high memory without SMMU is: Set a fixed value for the upper
> 32 bits. In this case, the DMA zone should be [phys_start, phys_start + 4G).

I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA offsets in the max_zone_phys() calculation")

    Currently, the kernel assumes that if RAM starts above 32-bit (or
    zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
    such constrained devices have a hardwired DMA offset. In practice, we
    haven't noticed any such hardware so let's assume that we can expand
    ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
    ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
    zone_bits.

So your suggestion is feasible.

> 
>>
>> phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
>>
>>> paging_init()                                       |
>>> 	map_mem()                                   |
>>> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
>>> bootmem_init();                                     |      |
>>> 	zone_sizes_init()                           |      |
>>> 		of_dma_get_max_cpu_address          |  ----|
>>> 		//Update arm64_dma_phys_limit       |  ----|
>>> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
>>> request_standard_resources()
>>
>> Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
>> as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
>> arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
>> about BLOCKING may not be right since you assume the 4G boundary, while
>> forgetting Raspberry Pi4 on which 1G is the boundary of low memory and
> 
> No, no, my optimization for Raspberry Pi4 is fine. I do page mapping for memory
> under 4G and block mapping for memory above 4G. But still try to reserve crash
> low memory from DMA zone. So when 1G is the boundary, it's just not fully optimized,
> the memory 1-4G are mapped as page.
> 
>> high memory. So separating out BLOCKING optimization can let us focus on
>> the crashkernel,high support.
>>
>>>
>>>>>
>>>>> -------------------------------------------
>>>>>                       arm64_dma_phys_limit
>>>>> 1)Raspberry Pi4         1G                     
>>>>> 2)Normal machine        4G                     
>>>>> 3)Special machine       MAX
>>>>> 4)No DMA|DMA32          MAX
>>>>>
>>>>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>>>>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>>>>> IOMMU hardware supporting.
>>>>> ===================================================================================
>>>>>
>>>>> I made a draft patch based on this patchset, please feel free to check and
>>>>> see if it's OK, or anything missing or wrongly understood. I removed
>>>>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>>>>> reserve_crashkernel_low() as the v21 did.
>>>>
>>>> Sorry, forgot attaching the draft patch.
>>>>
>>>> By the way, we can also have a simple version with basic ,high, ,low
>>>> support, no fallback. We can add fallback and other optimization later.
>>>> This can be plan B.
>>>
>>> Yes, That's what Catalin suggested also.
>>>
>>> Hi, Baoquan He:
>>>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
>>>
>>> Process after abstraction:
>>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
>>> 		reserve_crashkernel()
>>> 		//block mapping
>>> 	} else {
>>> 		//page mapping
>>> 		reserve_crashkernel()
>>> 	}
>>>
>>> ------------ Simplified real-world process ---------
>> Yeah, this looks clearer. I would like to see a version with them.
>>
>>> arm64_memblock_init()
>>         Before reserve_crashkernel(), we can update arm64_dma_phys_limit
>> as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
>> memblock_start_of_DRAM() is bigger than 4G.
>>         Then we go with:
>>         if (!arm64_dma_phys_limit)
>> 		reserve_crashkernel();
>>
>> Just personal opinion, please check if it's appropriate to handle case
>> 3) which has physical starting memory above 4G here. 
> 
> OK, I'll write it down and consider it in the future optimization.
> 
>>
>>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 		reserve_crashkernel()
>>            
>>> paging_init()
>>> 	map_mem()
>>> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 			//block mapping
>>> 		else
>>> 			//page mapping
>>> unflatten_device_tree or ACPI
>>> bootmem_init();
>>> 	zone_sizes_init()
>>> 		of_dma_get_max_cpu_address
>>> 		//Update arm64_dma_phys_limit
>>> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 		reserve_crashkernel()
>>
>> The rest sounds good with optimization code split out.
>>
>> .
>>
> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-04-29  8:25                         ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-04-29  8:25 UTC (permalink / raw)
  To: kexec



On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/4/29 11:24, Baoquan He wrote:
>> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2022/4/28 11:52, Baoquan He wrote:
>>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>>> Hi Catalin, Zhen Lei,
>>>>>
>>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>>> On Wed, Apr 27, 2022 at 09:49:20PM +0800, Leizhen (ThunderTown) wrote:
>>>>>>> On 2022/4/27 20:32, Catalin Marinas wrote:
>>>>>>>> I think one could always pass a default command line like:
>>>>>>>>
>>>>>>>> 	crashkernel=1G,high crashkernel=128M,low
>>>>>>>>
>>>>>>>> without much knowledge of the SoC memory layout.
>>>>>>>
>>>>>>> Yes, that's what the end result is. The user specify crashkernel=128M,low
>>>>>>> and the implementation ensure the 128M low memory is allocated from DMA zone.
>>>>>>> We use arm64_dma_phys_limit as the upper limit for crash low memory.
>>>>>>>
>>>>>>> +#define CRASH_ADDR_LOW_MAX             arm64_dma_phys_limit
>>>>>>> +       unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
>>>>>>> +       crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>>                                                crash_base, crash_max);
>>>>>>>
>>>>>>>> Another option is to only introduce crashkernel=Y,low and, when that is
>>>>>>>> passed, crashkernel=Y can go above arm64_dma_phys_limit. We won't need a
>>>>>>>> 'high' option at all:
>>>>>>>>
>>>>>>>> 	crashkernel=1G				- all within ZONE_DMA
>>>>>>>> 	crashkernel=1G crashkernel=128M,low	- 128M in ZONE_DMA
>>>>>>>> 						  1G above ZONE_DMA
>>>>>>>>
>>>>>>>> If ZONE_DMA is not present or it extends to the whole RAM, we can ignore
>>>>>>>> the 'low' option.
>>>>>>>
>>>>>>> I think although the code is hard to make generic, the interface is better to
>>>>>>> be relatively uniform. A user might have to maintain both x86 and arm64, and
>>>>>>> so on. It's not a good thing that the difference is too big.
>>>>>>
>>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>>> things a bit while following the documented behaviour:
>>>>>>
>>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
>>>>>>
>>>>>> There is no fallback from crashkernel=Y.
>>>>>>
>>>>>> The question is whether we still want a default low allocation if
>>>>>> crashkernel=Y,low is missing but 'high' is present. If we add this, I
>>>>>> think we'd be consistent with kernel-parameters.txt for the 'low'
>>>>>> description. A default 'low' is probably not that bad but I'm tempted to
>>>>>> always mandate both 'high' and 'low'.
>>>>>
>>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>>> about this version. And I have the same concerns about them which comes
>>>>> from below points:
>>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>>> same LAB. I think we should try to avoid the confusion.
>>>
>>> Yes, but for someone who is configuring crashkernel= for the first time, he
>>> needs to read doc to understand how to configure it. The doc can show the
>>> recommended default value of 'low' size.
>>>
>>> After commit 94fb93341822 ("x86/crash: Allocate enough low memory when crashkernel=high"),
>>> the default 'low' size doesn't make much sense anymore. The default size of swiotlb_size()
>>> is 64M, far less than 256M. And if user specify "swiotlb=", he can also adjust crashkernel=Y,low.
>>>
>>>
>>> +                * -swiotlb size: user-specified with swiotlb= or default.
>>> -               low_size = swiotlb_size_or_default() + (8UL<<20);
>>> +               low_size = max(swiotlb_size_or_default() + (8UL<<20), 256UL<<20);
>>>
>>> That means all ARCHs can explicit configure crashkernel=256M,low, instead of
>>> omitting it. This may be another way to avoid confusion. It's not hard for
>>> programmer-turned-user/admin. However, this requires us to forgo backward
>>> compatibility with the default size of 'low'.
>>
>> We can make ,high and ,low simpler at first as they are alternative. If
>> possible, we can also simplify the ,high ,low implementation on x86_64
>> if it truly brings better archievement on arm64.
> 
> OK, I plan to remove optimization, fallback and default low size, to follow the
> suggestion of Catalin first. But there's one minor point of contention.
> 
> 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
>    or
>       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> 
> I prefer 2), how about you?
> 
>>
>>>
>>>
>>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>>> and ,low is an option if customer likes to take with expertise.
>>>
>>> OK, I got it.
>>>
>>>>>
>>>>> After going through arm64 memory init code, I got below summary about
>>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>>> can make use of it to facilitate to simplify code.
>>>>> ================================================================================
>>>>>                         DMA                      DMA32                    NORMAL
>>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>>> 3)Special machine       (above 4G)~MAX
>>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>>
>>> arm64_memblock_init()
>>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
>> We don't need different code for this place of reservation as you are
>> doing in this patchset, since arm64_dma_phys_limit is initialized as 
>> below. In fact, in arm64_memblock_init(), we have made memblock ready,
>> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
>> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
>> reserve_crashkernel() here too.
> 
> Yes. Maybe all the devices in this environment are 64-bit. One way I know of allowing
> 32-bit devices to access high memory without SMMU is: Set a fixed value for the upper
> 32 bits. In this case, the DMA zone should be [phys_start, phys_start + 4G).

I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA offsets in the max_zone_phys() calculation")

    Currently, the kernel assumes that if RAM starts above 32-bit (or
    zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
    such constrained devices have a hardwired DMA offset. In practice, we
    haven't noticed any such hardware so let's assume that we can expand
    ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
    ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
    zone_bits.

So your suggestion is feasible.

> 
>>
>> phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
>>
>>> paging_init()                                       |
>>> 	map_mem()                                   |
>>> unflatten_device_tree or ACPI                       |  ----  //Raspberry Pi4 get dma zone base on dtb or ACPI
>>> bootmem_init();                                     |      |
>>> 	zone_sizes_init()                           |      |
>>> 		of_dma_get_max_cpu_address          |  ----|
>>> 		//Update arm64_dma_phys_limit       |  ----|
>>> 	reserve_crashkernel()        <--------------  //Because we need arm64_dma_phys_limit to be updated above
>>> request_standard_resources()
>>
>> Yeah, because arm64_dma_phys_limit is decided late in the 1) and 2) case
>> as I summarized, we need defer reserve_crashkernel() to bootmem_init(). But 
>> arm64_dma_phys_limit could be 1G or 4G, that's why your optimization
>> about BLOCKING may not be right since you assume the 4G boundary, while
>> forgetting Raspberry Pi4 on which 1G is the boundary of low memory and
> 
> No, no, my optimization for Raspberry Pi4 is fine. I do page mapping for memory
> under 4G and block mapping for memory above 4G. But still try to reserve crash
> low memory from DMA zone. So when 1G is the boundary, it's just not fully optimized,
> the memory 1-4G are mapped as page.
> 
>> high memory. So separating out BLOCKING optimization can let us focus on
>> the crashkernel,high support.
>>
>>>
>>>>>
>>>>> -------------------------------------------
>>>>>                       arm64_dma_phys_limit
>>>>> 1)Raspberry Pi4         1G                     
>>>>> 2)Normal machine        4G                     
>>>>> 3)Special machine       MAX
>>>>> 4)No DMA|DMA32          MAX
>>>>>
>>>>> Note: 3)Special machine means the machine's starting physical address is above 4G.
>>>>> WHile 4)No DMA|DMA32 means kernel w/o CONFIG_ZONE_DMA|DMA32, and has
>>>>> IOMMU hardware supporting.
>>>>> ===================================================================================
>>>>>
>>>>> I made a draft patch based on this patchset, please feel free to check and
>>>>> see if it's OK, or anything missing or wrongly understood. I removed
>>>>> reserve_crashkernel_high() and only keep reserve_crashkernel() and
>>>>> reserve_crashkernel_low() as the v21 did.
>>>>
>>>> Sorry, forgot attaching the draft patch.
>>>>
>>>> By the way, we can also have a simple version with basic ,high, ,low
>>>> support, no fallback. We can add fallback and other optimization later.
>>>> This can be plan B.
>>>
>>> Yes, That's what Catalin suggested also.
>>>
>>> Hi, Baoquan He:
>>>   Without optimization, the whole Patch 3-4 and 6-7 can be dropped.
>>>
>>> Process after abstraction:
>>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) {
>>> 		reserve_crashkernel()
>>> 		//block mapping
>>> 	} else {
>>> 		//page mapping
>>> 		reserve_crashkernel()
>>> 	}
>>>
>>> ------------ Simplified real-world process ---------
>> Yeah, this looks clearer. I would like to see a version with them.
>>
>>> arm64_memblock_init()
>>         Before reserve_crashkernel(), we can update arm64_dma_phys_limit
>> as memblock_end_of_DRAM() if CONFIG_ZONE_DMA|DMA32 is not enabled or
>> memblock_start_of_DRAM() is bigger than 4G.
>>         Then we go with:
>>         if (!arm64_dma_phys_limit)
>> 		reserve_crashkernel();
>>
>> Just personal opinion, please check if it's appropriate to handle case
>> 3) which has physical starting memory above 4G here. 
> 
> OK, I'll write it down and consider it in the future optimization.
> 
>>
>>> 	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 		reserve_crashkernel()
>>            
>>> paging_init()
>>> 	map_mem()
>>> 		if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 			//block mapping
>>> 		else
>>> 			//page mapping
>>> unflatten_device_tree or ACPI
>>> bootmem_init();
>>> 	zone_sizes_init()
>>> 		of_dma_get_max_cpu_address
>>> 		//Update arm64_dma_phys_limit
>>> 	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
>>> 		reserve_crashkernel()
>>
>> The rest sounds good with optimization code split out.
>>
>> .
>>
> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-04-29  8:25                         ` Leizhen (ThunderTown)
  (?)
@ 2022-05-03 22:00                           ` Catalin Marinas
  -1 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-05-03 22:00 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> > On 2022/4/29 11:24, Baoquan He wrote:
> >> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> >>> On 2022/4/28 11:52, Baoquan He wrote:
> >>>> On 04/28/22 at 11:40am, Baoquan He wrote:
> >>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> >>>>>> There will be some difference as the 4G limit doesn't always hold for
> >>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> >>>>>> things a bit while following the documented behaviour:
> >>>>>>
> >>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> >>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> >>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
[...]
> >>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> >>>>> about this version. And I have the same concerns about them which comes
> >>>>> from below points:
> >>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
> >>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
> >>>>> deploy/configure kdump on different machines of different ARCHes in the
> >>>>> same LAB. I think we should try to avoid the confusion.

I guess by all arches you mean just x86 here. Since the code is not
generic, all arches do their own stuff.

> > OK, I plan to remove optimization, fallback and default low size, to follow the
> > suggestion of Catalin first. But there's one minor point of contention.
> > 
> > 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> > 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
> >    or
> >       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> > 
> > I prefer 2), how about you?

(2) works for me as well. We keep these simple as "expert" options and
allow crashkernel= to fall back to 'high' if not sufficient memory in
ZONE_DMA. That would be a slight change from the current behaviour but,
as Zhen Lei said, with the old tools it's just moving the error around,
the crashkernel wouldn't be available in either case.

> >>>>> 2) Fallback behaviour is important to our distros. The reason is we will
> >>>>> provide default value with crashkernel=xxxM along kernel of distros. In
> >>>>> this case, we hope the reservation will succeed by all means. The ,high
> >>>>> and ,low is an option if customer likes to take with expertise.

OK, that's good feedback.

So, to recap, IIUC you are fine with:

	crashkernel=Y		- allocate within ZONE_DMA with fallback
				  above with a default in ZONE_DMA (like
				  x86, 256M or swiotlb size)
	crashkernel=Y,high	- allocate from above ZONE_DMA
	crashkernel=Y,low	- allocate within ZONE_DMA

'crashkernel' overrides the high and low while the latter two can be
passed independently.

> >>>>> After going through arm64 memory init code, I got below summary about
> >>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> >>>>> can make use of it to facilitate to simplify code.
> >>>>> ================================================================================
> >>>>>                         DMA                      DMA32                    NORMAL
> >>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> >>>>> 2)Normal machine        0~4G                     0                        (above 4G)
> >>>>> 3)Special machine       (above 4G)~MAX
> >>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
> >>>
> >>> arm64_memblock_init()
> >>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> >> We don't need different code for this place of reservation as you are
> >> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> >> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> >> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> >> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> >> reserve_crashkernel() here too.
> > 
> > Yes. Maybe all the devices in this environment are 64-bit. One way I
> > know of allowing 32-bit devices to access high memory without SMMU
> > is: Set a fixed value for the upper 32 bits. In this case, the DMA
> > zone should be [phys_start, phys_start + 4G).

We decided that this case doesn't really exists for arm64 platforms (no
need for special ZONE_DMA).

> I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
> offsets in the max_zone_phys() calculation")
> 
>     Currently, the kernel assumes that if RAM starts above 32-bit (or
>     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
>     such constrained devices have a hardwired DMA offset. In practice, we
>     haven't noticed any such hardware so let's assume that we can expand
>     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
>     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
>     zone_bits.

I think the above log is slightly confusing. If the DRAM starts above
4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
case we get incorrect firmware tables, otherwise we could have extended
ZONE_DMA to end of DRAM.

Zhen Lei, if we agreed on the crashkernel behaviour, could you please
post a series that does the above parsing allocation? Ignore the
optimisations, we can look at them afterwards.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-03 22:00                           ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-05-03 22:00 UTC (permalink / raw)
  To: Leizhen (ThunderTown)
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> > On 2022/4/29 11:24, Baoquan He wrote:
> >> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> >>> On 2022/4/28 11:52, Baoquan He wrote:
> >>>> On 04/28/22 at 11:40am, Baoquan He wrote:
> >>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> >>>>>> There will be some difference as the 4G limit doesn't always hold for
> >>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> >>>>>> things a bit while following the documented behaviour:
> >>>>>>
> >>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> >>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> >>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
[...]
> >>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> >>>>> about this version. And I have the same concerns about them which comes
> >>>>> from below points:
> >>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
> >>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
> >>>>> deploy/configure kdump on different machines of different ARCHes in the
> >>>>> same LAB. I think we should try to avoid the confusion.

I guess by all arches you mean just x86 here. Since the code is not
generic, all arches do their own stuff.

> > OK, I plan to remove optimization, fallback and default low size, to follow the
> > suggestion of Catalin first. But there's one minor point of contention.
> > 
> > 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> > 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
> >    or
> >       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> > 
> > I prefer 2), how about you?

(2) works for me as well. We keep these simple as "expert" options and
allow crashkernel= to fall back to 'high' if not sufficient memory in
ZONE_DMA. That would be a slight change from the current behaviour but,
as Zhen Lei said, with the old tools it's just moving the error around,
the crashkernel wouldn't be available in either case.

> >>>>> 2) Fallback behaviour is important to our distros. The reason is we will
> >>>>> provide default value with crashkernel=xxxM along kernel of distros. In
> >>>>> this case, we hope the reservation will succeed by all means. The ,high
> >>>>> and ,low is an option if customer likes to take with expertise.

OK, that's good feedback.

So, to recap, IIUC you are fine with:

	crashkernel=Y		- allocate within ZONE_DMA with fallback
				  above with a default in ZONE_DMA (like
				  x86, 256M or swiotlb size)
	crashkernel=Y,high	- allocate from above ZONE_DMA
	crashkernel=Y,low	- allocate within ZONE_DMA

'crashkernel' overrides the high and low while the latter two can be
passed independently.

> >>>>> After going through arm64 memory init code, I got below summary about
> >>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> >>>>> can make use of it to facilitate to simplify code.
> >>>>> ================================================================================
> >>>>>                         DMA                      DMA32                    NORMAL
> >>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> >>>>> 2)Normal machine        0~4G                     0                        (above 4G)
> >>>>> 3)Special machine       (above 4G)~MAX
> >>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
> >>>
> >>> arm64_memblock_init()
> >>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> >> We don't need different code for this place of reservation as you are
> >> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> >> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> >> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> >> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> >> reserve_crashkernel() here too.
> > 
> > Yes. Maybe all the devices in this environment are 64-bit. One way I
> > know of allowing 32-bit devices to access high memory without SMMU
> > is: Set a fixed value for the upper 32 bits. In this case, the DMA
> > zone should be [phys_start, phys_start + 4G).

We decided that this case doesn't really exists for arm64 platforms (no
need for special ZONE_DMA).

> I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
> offsets in the max_zone_phys() calculation")
> 
>     Currently, the kernel assumes that if RAM starts above 32-bit (or
>     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
>     such constrained devices have a hardwired DMA offset. In practice, we
>     haven't noticed any such hardware so let's assume that we can expand
>     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
>     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
>     zone_bits.

I think the above log is slightly confusing. If the DRAM starts above
4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
case we get incorrect firmware tables, otherwise we could have extended
ZONE_DMA to end of DRAM.

Zhen Lei, if we agreed on the crashkernel behaviour, could you please
post a series that does the above parsing allocation? Ignore the
optimisations, we can look at them afterwards.

Thanks.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-03 22:00                           ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-05-03 22:00 UTC (permalink / raw)
  To: kexec

On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> > On 2022/4/29 11:24, Baoquan He wrote:
> >> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> >>> On 2022/4/28 11:52, Baoquan He wrote:
> >>>> On 04/28/22 at 11:40am, Baoquan He wrote:
> >>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> >>>>>> There will be some difference as the 4G limit doesn't always hold for
> >>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> >>>>>> things a bit while following the documented behaviour:
> >>>>>>
> >>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> >>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> >>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
[...]
> >>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> >>>>> about this version. And I have the same concerns about them which comes
> >>>>> from below points:
> >>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
> >>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
> >>>>> deploy/configure kdump on different machines of different ARCHes in the
> >>>>> same LAB. I think we should try to avoid the confusion.

I guess by all arches you mean just x86 here. Since the code is not
generic, all arches do their own stuff.

> > OK, I plan to remove optimization, fallback and default low size, to follow the
> > suggestion of Catalin first. But there's one minor point of contention.
> > 
> > 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> > 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
> >    or
> >       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> > 
> > I prefer 2), how about you?

(2) works for me as well. We keep these simple as "expert" options and
allow crashkernel= to fall back to 'high' if not sufficient memory in
ZONE_DMA. That would be a slight change from the current behaviour but,
as Zhen Lei said, with the old tools it's just moving the error around,
the crashkernel wouldn't be available in either case.

> >>>>> 2) Fallback behaviour is important to our distros. The reason is we will
> >>>>> provide default value with crashkernel=xxxM along kernel of distros. In
> >>>>> this case, we hope the reservation will succeed by all means. The ,high
> >>>>> and ,low is an option if customer likes to take with expertise.

OK, that's good feedback.

So, to recap, IIUC you are fine with:

	crashkernel=Y		- allocate within ZONE_DMA with fallback
				  above with a default in ZONE_DMA (like
				  x86, 256M or swiotlb size)
	crashkernel=Y,high	- allocate from above ZONE_DMA
	crashkernel=Y,low	- allocate within ZONE_DMA

'crashkernel' overrides the high and low while the latter two can be
passed independently.

> >>>>> After going through arm64 memory init code, I got below summary about
> >>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> >>>>> can make use of it to facilitate to simplify code.
> >>>>> ================================================================================
> >>>>>                         DMA                      DMA32                    NORMAL
> >>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> >>>>> 2)Normal machine        0~4G                     0                        (above 4G)
> >>>>> 3)Special machine       (above 4G)~MAX
> >>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
> >>>
> >>> arm64_memblock_init()
> >>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> >> We don't need different code for this place of reservation as you are
> >> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> >> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> >> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> >> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> >> reserve_crashkernel() here too.
> > 
> > Yes. Maybe all the devices in this environment are 64-bit. One way I
> > know of allowing 32-bit devices to access high memory without SMMU
> > is: Set a fixed value for the upper 32 bits. In this case, the DMA
> > zone should be [phys_start, phys_start + 4G).

We decided that this case doesn't really exists for arm64 platforms (no
need for special ZONE_DMA).

> I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
> offsets in the max_zone_phys() calculation")
> 
>     Currently, the kernel assumes that if RAM starts above 32-bit (or
>     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
>     such constrained devices have a hardwired DMA offset. In practice, we
>     haven't noticed any such hardware so let's assume that we can expand
>     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
>     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
>     zone_bits.

I think the above log is slightly confusing. If the DRAM starts above
4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
case we get incorrect firmware tables, otherwise we could have extended
ZONE_DMA to end of DRAM.

Zhen Lei, if we agreed on the crashkernel behaviour, could you please
post a series that does the above parsing allocation? Ignore the
optimisations, we can look at them afterwards.

Thanks.

-- 
Catalin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-05-03 22:00                           ` Catalin Marinas
  (?)
@ 2022-05-05  2:13                             ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-05-05  2:13 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/5/4 6:00, Catalin Marinas wrote:
> On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
>>> On 2022/4/29 11:24, Baoquan He wrote:
>>>> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>>>> On 2022/4/28 11:52, Baoquan He wrote:
>>>>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>>>>> things a bit while following the documented behaviour:
>>>>>>>>
>>>>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> [...]
>>>>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>>>>> about this version. And I have the same concerns about them which comes
>>>>>>> from below points:
>>>>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>>>>> same LAB. I think we should try to avoid the confusion.
> 
> I guess by all arches you mean just x86 here. Since the code is not
> generic, all arches do their own stuff.
> 
>>> OK, I plan to remove optimization, fallback and default low size, to follow the
>>> suggestion of Catalin first. But there's one minor point of contention.
>>>
>>> 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
>>> 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
>>>    or
>>>       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
>>>
>>> I prefer 2), how about you?
> 
> (2) works for me as well. We keep these simple as "expert" options and

Okay, so I'll follow 2) to update the code.

> allow crashkernel= to fall back to 'high' if not sufficient memory in
> ZONE_DMA. That would be a slight change from the current behaviour but,
> as Zhen Lei said, with the old tools it's just moving the error around,
> the crashkernel wouldn't be available in either case.
> 
>>>>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>>>>> and ,low is an option if customer likes to take with expertise.
> 
> OK, that's good feedback.
> 
> So, to recap, IIUC you are fine with:
> 
> 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> 				  above with a default in ZONE_DMA (like
> 				  x86, 256M or swiotlb size)
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> 'crashkernel' overrides the high and low while the latter two can be
> passed independently.
> 
>>>>>>> After going through arm64 memory init code, I got below summary about
>>>>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>>>>> can make use of it to facilitate to simplify code.
>>>>>>> ================================================================================
>>>>>>>                         DMA                      DMA32                    NORMAL
>>>>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>>>>> 3)Special machine       (above 4G)~MAX
>>>>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>>>>
>>>>> arm64_memblock_init()
>>>>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
>>>> We don't need different code for this place of reservation as you are
>>>> doing in this patchset, since arm64_dma_phys_limit is initialized as 
>>>> below. In fact, in arm64_memblock_init(), we have made memblock ready,
>>>> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
>>>> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
>>>> reserve_crashkernel() here too.
>>>
>>> Yes. Maybe all the devices in this environment are 64-bit. One way I
>>> know of allowing 32-bit devices to access high memory without SMMU
>>> is: Set a fixed value for the upper 32 bits. In this case, the DMA
>>> zone should be [phys_start, phys_start + 4G).
> 
> We decided that this case doesn't really exists for arm64 platforms (no
> need for special ZONE_DMA).
> 
>> I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
>> offsets in the max_zone_phys() calculation")
>>
>>     Currently, the kernel assumes that if RAM starts above 32-bit (or
>>     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
>>     such constrained devices have a hardwired DMA offset. In practice, we
>>     haven't noticed any such hardware so let's assume that we can expand
>>     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
>>     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
>>     zone_bits.
> 
> I think the above log is slightly confusing. If the DRAM starts above
> 4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
> above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
> ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
> case we get incorrect firmware tables, otherwise we could have extended
> ZONE_DMA to end of DRAM.
> 
> Zhen Lei, if we agreed on the crashkernel behaviour, could you please
> post a series that does the above parsing allocation? Ignore the
> optimisations, we can look at them afterwards.

OK, I've changed the code before the festival, and I'll test it today.

> 
> Thanks.
> 

-- 
Regards,
  Zhen Lei

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-05  2:13                             ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen (ThunderTown) @ 2022-05-05  2:13 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp



On 2022/5/4 6:00, Catalin Marinas wrote:
> On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
>>> On 2022/4/29 11:24, Baoquan He wrote:
>>>> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>>>> On 2022/4/28 11:52, Baoquan He wrote:
>>>>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>>>>> things a bit while following the documented behaviour:
>>>>>>>>
>>>>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> [...]
>>>>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>>>>> about this version. And I have the same concerns about them which comes
>>>>>>> from below points:
>>>>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>>>>> same LAB. I think we should try to avoid the confusion.
> 
> I guess by all arches you mean just x86 here. Since the code is not
> generic, all arches do their own stuff.
> 
>>> OK, I plan to remove optimization, fallback and default low size, to follow the
>>> suggestion of Catalin first. But there's one minor point of contention.
>>>
>>> 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
>>> 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
>>>    or
>>>       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
>>>
>>> I prefer 2), how about you?
> 
> (2) works for me as well. We keep these simple as "expert" options and

Okay, so I'll follow 2) to update the code.

> allow crashkernel= to fall back to 'high' if not sufficient memory in
> ZONE_DMA. That would be a slight change from the current behaviour but,
> as Zhen Lei said, with the old tools it's just moving the error around,
> the crashkernel wouldn't be available in either case.
> 
>>>>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>>>>> and ,low is an option if customer likes to take with expertise.
> 
> OK, that's good feedback.
> 
> So, to recap, IIUC you are fine with:
> 
> 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> 				  above with a default in ZONE_DMA (like
> 				  x86, 256M or swiotlb size)
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> 'crashkernel' overrides the high and low while the latter two can be
> passed independently.
> 
>>>>>>> After going through arm64 memory init code, I got below summary about
>>>>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>>>>> can make use of it to facilitate to simplify code.
>>>>>>> ================================================================================
>>>>>>>                         DMA                      DMA32                    NORMAL
>>>>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>>>>> 3)Special machine       (above 4G)~MAX
>>>>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>>>>
>>>>> arm64_memblock_init()
>>>>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
>>>> We don't need different code for this place of reservation as you are
>>>> doing in this patchset, since arm64_dma_phys_limit is initialized as 
>>>> below. In fact, in arm64_memblock_init(), we have made memblock ready,
>>>> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
>>>> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
>>>> reserve_crashkernel() here too.
>>>
>>> Yes. Maybe all the devices in this environment are 64-bit. One way I
>>> know of allowing 32-bit devices to access high memory without SMMU
>>> is: Set a fixed value for the upper 32 bits. In this case, the DMA
>>> zone should be [phys_start, phys_start + 4G).
> 
> We decided that this case doesn't really exists for arm64 platforms (no
> need for special ZONE_DMA).
> 
>> I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
>> offsets in the max_zone_phys() calculation")
>>
>>     Currently, the kernel assumes that if RAM starts above 32-bit (or
>>     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
>>     such constrained devices have a hardwired DMA offset. In practice, we
>>     haven't noticed any such hardware so let's assume that we can expand
>>     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
>>     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
>>     zone_bits.
> 
> I think the above log is slightly confusing. If the DRAM starts above
> 4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
> above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
> ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
> case we get incorrect firmware tables, otherwise we could have extended
> ZONE_DMA to end of DRAM.
> 
> Zhen Lei, if we agreed on the crashkernel behaviour, could you please
> post a series that does the above parsing allocation? Ignore the
> optimisations, we can look at them afterwards.

OK, I've changed the code before the festival, and I'll test it today.

> 
> Thanks.
> 

-- 
Regards,
  Zhen Lei

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-05  2:13                             ` Leizhen (ThunderTown)
  0 siblings, 0 replies; 102+ messages in thread
From: Leizhen @ 2022-05-05  2:13 UTC (permalink / raw)
  To: kexec



On 2022/5/4 6:00, Catalin Marinas wrote:
> On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
>> On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
>>> On 2022/4/29 11:24, Baoquan He wrote:
>>>> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
>>>>> On 2022/4/28 11:52, Baoquan He wrote:
>>>>>> On 04/28/22 at 11:40am, Baoquan He wrote:
>>>>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
>>>>>>>> There will be some difference as the 4G limit doesn't always hold for
>>>>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
>>>>>>>> things a bit while following the documented behaviour:
>>>>>>>>
>>>>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
>>>>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
>>>>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> [...]
>>>>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
>>>>>>> about this version. And I have the same concerns about them which comes
>>>>>>> from below points:
>>>>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
>>>>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
>>>>>>> deploy/configure kdump on different machines of different ARCHes in the
>>>>>>> same LAB. I think we should try to avoid the confusion.
> 
> I guess by all arches you mean just x86 here. Since the code is not
> generic, all arches do their own stuff.
> 
>>> OK, I plan to remove optimization, fallback and default low size, to follow the
>>> suggestion of Catalin first. But there's one minor point of contention.
>>>
>>> 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
>>> 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
>>>    or
>>>       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
>>>
>>> I prefer 2), how about you?
> 
> (2) works for me as well. We keep these simple as "expert" options and

Okay, so I'll follow 2) to update the code.

> allow crashkernel= to fall back to 'high' if not sufficient memory in
> ZONE_DMA. That would be a slight change from the current behaviour but,
> as Zhen Lei said, with the old tools it's just moving the error around,
> the crashkernel wouldn't be available in either case.
> 
>>>>>>> 2) Fallback behaviour is important to our distros. The reason is we will
>>>>>>> provide default value with crashkernel=xxxM along kernel of distros. In
>>>>>>> this case, we hope the reservation will succeed by all means. The ,high
>>>>>>> and ,low is an option if customer likes to take with expertise.
> 
> OK, that's good feedback.
> 
> So, to recap, IIUC you are fine with:
> 
> 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> 				  above with a default in ZONE_DMA (like
> 				  x86, 256M or swiotlb size)
> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
> 'crashkernel' overrides the high and low while the latter two can be
> passed independently.
> 
>>>>>>> After going through arm64 memory init code, I got below summary about
>>>>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
>>>>>>> can make use of it to facilitate to simplify code.
>>>>>>> ================================================================================
>>>>>>>                         DMA                      DMA32                    NORMAL
>>>>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
>>>>>>> 2)Normal machine        0~4G                     0                        (above 4G)
>>>>>>> 3)Special machine       (above 4G)~MAX
>>>>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
>>>>>
>>>>> arm64_memblock_init()
>>>>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
>>>> We don't need different code for this place of reservation as you are
>>>> doing in this patchset, since arm64_dma_phys_limit is initialized as 
>>>> below. In fact, in arm64_memblock_init(), we have made memblock ready,
>>>> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
>>>> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
>>>> reserve_crashkernel() here too.
>>>
>>> Yes. Maybe all the devices in this environment are 64-bit. One way I
>>> know of allowing 32-bit devices to access high memory without SMMU
>>> is: Set a fixed value for the upper 32 bits. In this case, the DMA
>>> zone should be [phys_start, phys_start + 4G).
> 
> We decided that this case doesn't really exists for arm64 platforms (no
> need for special ZONE_DMA).
> 
>> I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
>> offsets in the max_zone_phys() calculation")
>>
>>     Currently, the kernel assumes that if RAM starts above 32-bit (or
>>     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
>>     such constrained devices have a hardwired DMA offset. In practice, we
>>     haven't noticed any such hardware so let's assume that we can expand
>>     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
>>     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
>>     zone_bits.
> 
> I think the above log is slightly confusing. If the DRAM starts above
> 4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
> above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
> ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
> case we get incorrect firmware tables, otherwise we could have extended
> ZONE_DMA to end of DRAM.
> 
> Zhen Lei, if we agreed on the crashkernel behaviour, could you please
> post a series that does the above parsing allocation? Ignore the
> optimisations, we can look at them afterwards.

OK, I've changed the code before the festival, and I'll test it today.

> 
> Thanks.
> 

-- 
Regards,
  Zhen Lei


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-05-03 22:00                           ` Catalin Marinas
  (?)
@ 2022-05-05  3:00                             ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-05-05  3:00 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Leizhen (ThunderTown),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
> > On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> > > On 2022/4/29 11:24, Baoquan He wrote:
> > >> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> > >>> On 2022/4/28 11:52, Baoquan He wrote:
> > >>>> On 04/28/22 at 11:40am, Baoquan He wrote:
> > >>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> > >>>>>> There will be some difference as the 4G limit doesn't always hold for
> > >>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> > >>>>>> things a bit while following the documented behaviour:
> > >>>>>>
> > >>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> > >>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > >>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> [...]
> > >>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> > >>>>> about this version. And I have the same concerns about them which comes
> > >>>>> from below points:
> > >>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
> > >>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
> > >>>>> deploy/configure kdump on different machines of different ARCHes in the
> > >>>>> same LAB. I think we should try to avoid the confusion.
> 
> I guess by all arches you mean just x86 here. Since the code is not
> generic, all arches do their own stuff.

Right. Since currently only x86 has crashkernel,high|low support. From
the distros and customer's point of view, we would like to see the same
feature has the same or similar behaviour. This will ease operation and
maintaining. E.g on the cloud platform, the base of it could be any
ARCH, x86, arm64. The inconsistent behaviour could cause confusion.
Certainly, the underlying implementation may be different.

Surely, if arm64 has its own manner because of reasons, we can
add document to note that.

> 
> > > OK, I plan to remove optimization, fallback and default low size, to follow the
> > > suggestion of Catalin first. But there's one minor point of contention.
> > > 
> > > 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> > > 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
> > >    or
> > >       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> > > 
> > > I prefer 2), how about you?
> 
> (2) works for me as well. We keep these simple as "expert" options and
> allow crashkernel= to fall back to 'high' if not sufficient memory in
> ZONE_DMA. That would be a slight change from the current behaviour but,
> as Zhen Lei said, with the old tools it's just moving the error around,
> the crashkernel wouldn't be available in either case.
> 
> > >>>>> 2) Fallback behaviour is important to our distros. The reason is we will
> > >>>>> provide default value with crashkernel=xxxM along kernel of distros. In
> > >>>>> this case, we hope the reservation will succeed by all means. The ,high
> > >>>>> and ,low is an option if customer likes to take with expertise.
> 
> OK, that's good feedback.
> 
> So, to recap, IIUC you are fine with:
> 
> 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> 				  above with a default in ZONE_DMA (like
> 				  x86, 256M or swiotlb size)
                                  
        Ack to this one. 


> 	crashkernel=Y,high	- allocate from above ZONE_DMA
                                  
        Not exactly. If there's only ZONE_DMA, crashkernel,high will
        be reserved in ZONE_DMA, and crashkernel,low will be ignored.
        Other than this, ack.

> 	crashkernel=Y,low	- allocate within ZONE_DMA

        Ack to this one.
> 
> 'crashkernel' overrides the high and low while the latter two can be
> passed independently.

        crashkernel=,high can be passed independently, then a crashkernel=,low
        is needed implicitly. If people don't want crashkernel=,low
        explicitly, crashkernel=0,low need be specified.

        An independent crashkernel=,low makes no sense. Crashkernel=,low
        should be paird with crashkernel=,high.
        
        My personal opinion according to the existed senmantics on x86.
        Otherwise, the guidance of crashkernel= |,high|,low reservation
        will be complicated to write.

> 
> > >>>>> After going through arm64 memory init code, I got below summary about
> > >>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> > >>>>> can make use of it to facilitate to simplify code.
> > >>>>> ================================================================================
> > >>>>>                         DMA                      DMA32                    NORMAL
> > >>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> > >>>>> 2)Normal machine        0~4G                     0                        (above 4G)
> > >>>>> 3)Special machine       (above 4G)~MAX
> > >>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
> > >>>
> > >>> arm64_memblock_init()
> > >>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> > >> We don't need different code for this place of reservation as you are
> > >> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> > >> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> > >> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> > >> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> > >> reserve_crashkernel() here too.
> > > 
> > > Yes. Maybe all the devices in this environment are 64-bit. One way I
> > > know of allowing 32-bit devices to access high memory without SMMU
> > > is: Set a fixed value for the upper 32 bits. In this case, the DMA
> > > zone should be [phys_start, phys_start + 4G).
> 
> We decided that this case doesn't really exists for arm64 platforms (no
> need for special ZONE_DMA).
> 
> > I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
> > offsets in the max_zone_phys() calculation")
> > 
> >     Currently, the kernel assumes that if RAM starts above 32-bit (or
> >     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
> >     such constrained devices have a hardwired DMA offset. In practice, we
> >     haven't noticed any such hardware so let's assume that we can expand
> >     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
> >     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
> >     zone_bits.
> 
> I think the above log is slightly confusing. If the DRAM starts above
> 4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
> above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
> ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
> case we get incorrect firmware tables, otherwise we could have extended
> ZONE_DMA to end of DRAM.
> 
> Zhen Lei, if we agreed on the crashkernel behaviour, could you please
> post a series that does the above parsing allocation? Ignore the
> optimisations, we can look at them afterwards.
> 
> Thanks.
> 
> -- 
> Catalin
> 


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-05  3:00                             ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-05-05  3:00 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Leizhen (ThunderTown),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
> > On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> > > On 2022/4/29 11:24, Baoquan He wrote:
> > >> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> > >>> On 2022/4/28 11:52, Baoquan He wrote:
> > >>>> On 04/28/22 at 11:40am, Baoquan He wrote:
> > >>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> > >>>>>> There will be some difference as the 4G limit doesn't always hold for
> > >>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> > >>>>>> things a bit while following the documented behaviour:
> > >>>>>>
> > >>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> > >>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > >>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> [...]
> > >>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> > >>>>> about this version. And I have the same concerns about them which comes
> > >>>>> from below points:
> > >>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
> > >>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
> > >>>>> deploy/configure kdump on different machines of different ARCHes in the
> > >>>>> same LAB. I think we should try to avoid the confusion.
> 
> I guess by all arches you mean just x86 here. Since the code is not
> generic, all arches do their own stuff.

Right. Since currently only x86 has crashkernel,high|low support. From
the distros and customer's point of view, we would like to see the same
feature has the same or similar behaviour. This will ease operation and
maintaining. E.g on the cloud platform, the base of it could be any
ARCH, x86, arm64. The inconsistent behaviour could cause confusion.
Certainly, the underlying implementation may be different.

Surely, if arm64 has its own manner because of reasons, we can
add document to note that.

> 
> > > OK, I plan to remove optimization, fallback and default low size, to follow the
> > > suggestion of Catalin first. But there's one minor point of contention.
> > > 
> > > 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> > > 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
> > >    or
> > >       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> > > 
> > > I prefer 2), how about you?
> 
> (2) works for me as well. We keep these simple as "expert" options and
> allow crashkernel= to fall back to 'high' if not sufficient memory in
> ZONE_DMA. That would be a slight change from the current behaviour but,
> as Zhen Lei said, with the old tools it's just moving the error around,
> the crashkernel wouldn't be available in either case.
> 
> > >>>>> 2) Fallback behaviour is important to our distros. The reason is we will
> > >>>>> provide default value with crashkernel=xxxM along kernel of distros. In
> > >>>>> this case, we hope the reservation will succeed by all means. The ,high
> > >>>>> and ,low is an option if customer likes to take with expertise.
> 
> OK, that's good feedback.
> 
> So, to recap, IIUC you are fine with:
> 
> 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> 				  above with a default in ZONE_DMA (like
> 				  x86, 256M or swiotlb size)
                                  
        Ack to this one. 


> 	crashkernel=Y,high	- allocate from above ZONE_DMA
                                  
        Not exactly. If there's only ZONE_DMA, crashkernel,high will
        be reserved in ZONE_DMA, and crashkernel,low will be ignored.
        Other than this, ack.

> 	crashkernel=Y,low	- allocate within ZONE_DMA

        Ack to this one.
> 
> 'crashkernel' overrides the high and low while the latter two can be
> passed independently.

        crashkernel=,high can be passed independently, then a crashkernel=,low
        is needed implicitly. If people don't want crashkernel=,low
        explicitly, crashkernel=0,low need be specified.

        An independent crashkernel=,low makes no sense. Crashkernel=,low
        should be paird with crashkernel=,high.
        
        My personal opinion according to the existed senmantics on x86.
        Otherwise, the guidance of crashkernel= |,high|,low reservation
        will be complicated to write.

> 
> > >>>>> After going through arm64 memory init code, I got below summary about
> > >>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> > >>>>> can make use of it to facilitate to simplify code.
> > >>>>> ================================================================================
> > >>>>>                         DMA                      DMA32                    NORMAL
> > >>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> > >>>>> 2)Normal machine        0~4G                     0                        (above 4G)
> > >>>>> 3)Special machine       (above 4G)~MAX
> > >>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
> > >>>
> > >>> arm64_memblock_init()
> > >>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> > >> We don't need different code for this place of reservation as you are
> > >> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> > >> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> > >> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> > >> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> > >> reserve_crashkernel() here too.
> > > 
> > > Yes. Maybe all the devices in this environment are 64-bit. One way I
> > > know of allowing 32-bit devices to access high memory without SMMU
> > > is: Set a fixed value for the upper 32 bits. In this case, the DMA
> > > zone should be [phys_start, phys_start + 4G).
> 
> We decided that this case doesn't really exists for arm64 platforms (no
> need for special ZONE_DMA).
> 
> > I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
> > offsets in the max_zone_phys() calculation")
> > 
> >     Currently, the kernel assumes that if RAM starts above 32-bit (or
> >     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
> >     such constrained devices have a hardwired DMA offset. In practice, we
> >     haven't noticed any such hardware so let's assume that we can expand
> >     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
> >     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
> >     zone_bits.
> 
> I think the above log is slightly confusing. If the DRAM starts above
> 4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
> above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
> ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
> case we get incorrect firmware tables, otherwise we could have extended
> ZONE_DMA to end of DRAM.
> 
> Zhen Lei, if we agreed on the crashkernel behaviour, could you please
> post a series that does the above parsing allocation? Ignore the
> optimisations, we can look at them afterwards.
> 
> Thanks.
> 
> -- 
> Catalin
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-05  3:00                             ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-05-05  3:00 UTC (permalink / raw)
  To: kexec

On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> On Fri, Apr 29, 2022 at 04:25:37PM +0800, Leizhen (ThunderTown) wrote:
> > On 2022/4/29 16:02, Leizhen (ThunderTown) wrote:
> > > On 2022/4/29 11:24, Baoquan He wrote:
> > >> On 04/28/22 at 05:33pm, Leizhen (ThunderTown) wrote:
> > >>> On 2022/4/28 11:52, Baoquan He wrote:
> > >>>> On 04/28/22 at 11:40am, Baoquan He wrote:
> > >>>>> On 04/27/22 at 05:04pm, Catalin Marinas wrote:
> > >>>>>> There will be some difference as the 4G limit doesn't always hold for
> > >>>>>> arm64 (though it's true in most cases). Anyway, we can probably simplify
> > >>>>>> things a bit while following the documented behaviour:
> > >>>>>>
> > >>>>>> 	crashkernel=Y		- current behaviour within ZONE_DMA
> > >>>>>> 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > >>>>>> 	crashkernel=Y,low	- allocate within ZONE_DMA
> [...]
> > >>>>> Sorry to interrupt. Seems the ,high ,low and fallback are main concerns
> > >>>>> about this version. And I have the same concerns about them which comes
> > >>>>> from below points:
> > >>>>> 1) we may need to take best effort to keep ,high, ,low behaviour
> > >>>>> consistent on all ARCHes. Otherwise user/admin may be confused when they
> > >>>>> deploy/configure kdump on different machines of different ARCHes in the
> > >>>>> same LAB. I think we should try to avoid the confusion.
> 
> I guess by all arches you mean just x86 here. Since the code is not
> generic, all arches do their own stuff.

Right. Since currently only x86 has crashkernel,high|low support. From
the distros and customer's point of view, we would like to see the same
feature has the same or similar behaviour. This will ease operation and
maintaining. E.g on the cloud platform, the base of it could be any
ARCH, x86, arm64. The inconsistent behaviour could cause confusion.
Certainly, the underlying implementation may be different.

Surely, if arm64 has its own manner because of reasons, we can
add document to note that.

> 
> > > OK, I plan to remove optimization, fallback and default low size, to follow the
> > > suggestion of Catalin first. But there's one minor point of contention.
> > > 
> > > 1)    Both "crashkernel=X,high" and "crashkernel=X,low" must be present.
> > > 2)    Both "crashkernel=X,high" and "crashkernel=X,low" are present.
> > >    or
> > >       Allow "crashkernel=X,high" to be present alone. Unlike x86, the default low size is zero.
> > > 
> > > I prefer 2), how about you?
> 
> (2) works for me as well. We keep these simple as "expert" options and
> allow crashkernel= to fall back to 'high' if not sufficient memory in
> ZONE_DMA. That would be a slight change from the current behaviour but,
> as Zhen Lei said, with the old tools it's just moving the error around,
> the crashkernel wouldn't be available in either case.
> 
> > >>>>> 2) Fallback behaviour is important to our distros. The reason is we will
> > >>>>> provide default value with crashkernel=xxxM along kernel of distros. In
> > >>>>> this case, we hope the reservation will succeed by all means. The ,high
> > >>>>> and ,low is an option if customer likes to take with expertise.
> 
> OK, that's good feedback.
> 
> So, to recap, IIUC you are fine with:
> 
> 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> 				  above with a default in ZONE_DMA (like
> 				  x86, 256M or swiotlb size)
                                  
        Ack to this one. 


> 	crashkernel=Y,high	- allocate from above ZONE_DMA
                                  
        Not exactly. If there's only ZONE_DMA, crashkernel,high will
        be reserved in ZONE_DMA, and crashkernel,low will be ignored.
        Other than this, ack.

> 	crashkernel=Y,low	- allocate within ZONE_DMA

        Ack to this one.
> 
> 'crashkernel' overrides the high and low while the latter two can be
> passed independently.

        crashkernel=,high can be passed independently, then a crashkernel=,low
        is needed implicitly. If people don't want crashkernel=,low
        explicitly, crashkernel=0,low need be specified.

        An independent crashkernel=,low makes no sense. Crashkernel=,low
        should be paird with crashkernel=,high.
        
        My personal opinion according to the existed senmantics on x86.
        Otherwise, the guidance of crashkernel= |,high|,low reservation
        will be complicated to write.

> 
> > >>>>> After going through arm64 memory init code, I got below summary about
> > >>>>> arm64_dma_phys_limit which is the first zone's upper limit. I think we
> > >>>>> can make use of it to facilitate to simplify code.
> > >>>>> ================================================================================
> > >>>>>                         DMA                      DMA32                    NORMAL
> > >>>>> 1)Raspberry Pi4         0~1G                     3G~4G                    (above 4G)
> > >>>>> 2)Normal machine        0~4G                     0                        (above 4G)
> > >>>>> 3)Special machine       (above 4G)~MAX
> > >>>>> 4)No DMA|DMA32                                                            (above 4G)~MAX
> > >>>
> > >>> arm64_memblock_init()
> > >>> 	reserve_crashkernel()        ---------------   0a30c53573b0 ("arm64: mm: Move reserve_crashkernel() into mem_init()")
> > >> We don't need different code for this place of reservation as you are
> > >> doing in this patchset, since arm64_dma_phys_limit is initialized as 
> > >> below. In fact, in arm64_memblock_init(), we have made memblock ready,
> > >> we can initialize arm64_dma_phys_limit as memblock_end_of_DRAM(). And if
> > >> memblock_start_of_DRAM() is bigger than 4G, we possibly can call
> > >> reserve_crashkernel() here too.
> > > 
> > > Yes. Maybe all the devices in this environment are 64-bit. One way I
> > > know of allowing 32-bit devices to access high memory without SMMU
> > > is: Set a fixed value for the upper 32 bits. In this case, the DMA
> > > zone should be [phys_start, phys_start + 4G).
> 
> We decided that this case doesn't really exists for arm64 platforms (no
> need for special ZONE_DMA).
> 
> > I just read the message of commit 791ab8b2e3 ("arm64: Ignore any DMA
> > offsets in the max_zone_phys() calculation")
> > 
> >     Currently, the kernel assumes that if RAM starts above 32-bit (or
> >     zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
> >     such constrained devices have a hardwired DMA offset. In practice, we
> >     haven't noticed any such hardware so let's assume that we can expand
> >     ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
> >     ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
> >     zone_bits.
> 
> I think the above log is slightly confusing. If the DRAM starts above
> 4G, ZONE_DMA goes to the end of DRAM. If the DRAM starts below 4G but
> above the zone_bits for ZONE_DMA as specified in DT/ACPI, it pushes
> ZONE_DMA to 4G. I don't remember why we did this last part, maybe in
> case we get incorrect firmware tables, otherwise we could have extended
> ZONE_DMA to end of DRAM.
> 
> Zhen Lei, if we agreed on the crashkernel behaviour, could you please
> post a series that does the above parsing allocation? Ignore the
> optimisations, we can look at them afterwards.
> 
> Thanks.
> 
> -- 
> Catalin
> 



^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-05-05  3:00                             ` Baoquan He
  (?)
@ 2022-05-05 14:20                               ` Catalin Marinas
  -1 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-05-05 14:20 UTC (permalink / raw)
  To: Baoquan He
  Cc: Leizhen (ThunderTown),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On Thu, May 05, 2022 at 11:00:19AM +0800, Baoquan He wrote:
> On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> > So, to recap, IIUC you are fine with:
> > 
> > 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> > 				  above with a default in ZONE_DMA (like
> > 				  x86, 256M or swiotlb size)
> 
>         Ack to this one.
> 
> 
> > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 
>         Not exactly. If there's only ZONE_DMA, crashkernel,high will
>         be reserved in ZONE_DMA, and crashkernel,low will be ignored.
>         Other than this, ack.

Yes, that's fine.

> > 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
>         Ack to this one.
> > 
> > 'crashkernel' overrides the high and low while the latter two can be
> > passed independently.
> 
>         crashkernel=,high can be passed independently, then a crashkernel=,low
>         is needed implicitly. If people don't want crashkernel=,low
>         explicitly, crashkernel=0,low need be specified.

I find this complicating the interface. I don't know the background to
the x86 implementation but we diverge already on arm64 since we talk
about ZONE_DMA rather than 4G limit (though for most platforms these
would be the same).

I guess we could restate the difference between crashkernel= and
crashkernel=,high as the hint to go for allocation above ZONE_DMA first.

>         An independent crashkernel=,low makes no sense. Crashkernel=,low
>         should be paird with crashkernel=,high.

You could argue that crashkernel=,low gives the current crashkernel=
behaviour, i.e. either all within ZONE_DMA or fail to allocate. So it
may have some value on its own.

>         My personal opinion according to the existed senmantics on x86.
>         Otherwise, the guidance of crashkernel= |,high|,low reservation
>         will be complicated to write.

It's more that I find the current semantics unnecessarily confusing. But
even reading the x86_64 text it's not that clear. For example the
default low allocation for crashkernel= and crashkernel=,high is only
mentioned in the crashkernel=,low description.

-- 
Catalin

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-05 14:20                               ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-05-05 14:20 UTC (permalink / raw)
  To: Baoquan He
  Cc: Leizhen (ThunderTown),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On Thu, May 05, 2022 at 11:00:19AM +0800, Baoquan He wrote:
> On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> > So, to recap, IIUC you are fine with:
> > 
> > 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> > 				  above with a default in ZONE_DMA (like
> > 				  x86, 256M or swiotlb size)
> 
>         Ack to this one.
> 
> 
> > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 
>         Not exactly. If there's only ZONE_DMA, crashkernel,high will
>         be reserved in ZONE_DMA, and crashkernel,low will be ignored.
>         Other than this, ack.

Yes, that's fine.

> > 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
>         Ack to this one.
> > 
> > 'crashkernel' overrides the high and low while the latter two can be
> > passed independently.
> 
>         crashkernel=,high can be passed independently, then a crashkernel=,low
>         is needed implicitly. If people don't want crashkernel=,low
>         explicitly, crashkernel=0,low need be specified.

I find this complicating the interface. I don't know the background to
the x86 implementation but we diverge already on arm64 since we talk
about ZONE_DMA rather than 4G limit (though for most platforms these
would be the same).

I guess we could restate the difference between crashkernel= and
crashkernel=,high as the hint to go for allocation above ZONE_DMA first.

>         An independent crashkernel=,low makes no sense. Crashkernel=,low
>         should be paird with crashkernel=,high.

You could argue that crashkernel=,low gives the current crashkernel=
behaviour, i.e. either all within ZONE_DMA or fail to allocate. So it
may have some value on its own.

>         My personal opinion according to the existed senmantics on x86.
>         Otherwise, the guidance of crashkernel= |,high|,low reservation
>         will be complicated to write.

It's more that I find the current semantics unnecessarily confusing. But
even reading the x86_64 text it's not that clear. For example the
default low allocation for crashkernel= and crashkernel=,high is only
mentioned in the crashkernel=,low description.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-05 14:20                               ` Catalin Marinas
  0 siblings, 0 replies; 102+ messages in thread
From: Catalin Marinas @ 2022-05-05 14:20 UTC (permalink / raw)
  To: kexec

On Thu, May 05, 2022 at 11:00:19AM +0800, Baoquan He wrote:
> On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> > So, to recap, IIUC you are fine with:
> > 
> > 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> > 				  above with a default in ZONE_DMA (like
> > 				  x86, 256M or swiotlb size)
> 
>         Ack to this one.
> 
> 
> > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> 
>         Not exactly. If there's only ZONE_DMA, crashkernel,high will
>         be reserved in ZONE_DMA, and crashkernel,low will be ignored.
>         Other than this, ack.

Yes, that's fine.

> > 	crashkernel=Y,low	- allocate within ZONE_DMA
> 
>         Ack to this one.
> > 
> > 'crashkernel' overrides the high and low while the latter two can be
> > passed independently.
> 
>         crashkernel=,high can be passed independently, then a crashkernel=,low
>         is needed implicitly. If people don't want crashkernel=,low
>         explicitly, crashkernel=0,low need be specified.

I find this complicating the interface. I don't know the background to
the x86 implementation but we diverge already on arm64 since we talk
about ZONE_DMA rather than 4G limit (though for most platforms these
would be the same).

I guess we could restate the difference between crashkernel= and
crashkernel=,high as the hint to go for allocation above ZONE_DMA first.

>         An independent crashkernel=,low makes no sense. Crashkernel=,low
>         should be paird with crashkernel=,high.

You could argue that crashkernel=,low gives the current crashkernel=
behaviour, i.e. either all within ZONE_DMA or fail to allocate. So it
may have some value on its own.

>         My personal opinion according to the existed senmantics on x86.
>         Otherwise, the guidance of crashkernel= |,high|,low reservation
>         will be complicated to write.

It's more that I find the current semantics unnecessarily confusing. But
even reading the x86_64 text it's not that clear. For example the
default low allocation for crashkernel= and crashkernel=,high is only
mentioned in the crashkernel=,low description.

-- 
Catalin


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
  2022-05-05 14:20                               ` Catalin Marinas
  (?)
@ 2022-05-06 11:39                                 ` Baoquan He
  -1 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-05-06 11:39 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Leizhen (ThunderTown),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On 05/05/22 at 03:20pm, Catalin Marinas wrote:
> On Thu, May 05, 2022 at 11:00:19AM +0800, Baoquan He wrote:
> > On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> > > So, to recap, IIUC you are fine with:
> > > 
> > > 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> > > 				  above with a default in ZONE_DMA (like
> > > 				  x86, 256M or swiotlb size)
> > 
> >         Ack to this one.
> > 
> > 
> > > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > 
> >         Not exactly. If there's only ZONE_DMA, crashkernel,high will
> >         be reserved in ZONE_DMA, and crashkernel,low will be ignored.
> >         Other than this, ack.
> 
> Yes, that's fine.
> 
> > > 	crashkernel=Y,low	- allocate within ZONE_DMA
> > 
> >         Ack to this one.
> > > 
> > > 'crashkernel' overrides the high and low while the latter two can be
> > > passed independently.
> > 
> >         crashkernel=,high can be passed independently, then a crashkernel=,low
> >         is needed implicitly. If people don't want crashkernel=,low
> >         explicitly, crashkernel=0,low need be specified.
> 
> I find this complicating the interface. I don't know the background to
> the x86 implementation but we diverge already on arm64 since we talk
> about ZONE_DMA rather than 4G limit (though for most platforms these
> would be the same).
> 
> I guess we could restate the difference between crashkernel= and
> crashkernel=,high as the hint to go for allocation above ZONE_DMA first.

Yes, rethinking about this, we can make a straightforward and simpler
crashkernel=,high|,low on arm64, namely asking for user to clearly
specify them.

During maintenance of crashkernel= parameter in our distros, we found
crashkernel=xM is used mostly since most of systems can be satisfied
with 256M or a little more for kdump. While on some big end servers,
1G or more crashkernel memory is needed. In this case, crashkernel=,high
is taken. We don't want to reserve so much low memory during system
running while just waiting in case rare crash happened. crashkernel=,high
is rarely used, so making it simple and not so flexible is not so bad.
We can improve it later with justification.

> 
> >         An independent crashkernel=,low makes no sense. Crashkernel=,low
> >         should be paird with crashkernel=,high.
> 
> You could argue that crashkernel=,low gives the current crashkernel=
> behaviour, i.e. either all within ZONE_DMA or fail to allocate. So it
> may have some value on its own.

Yes, crashkernel=,low has the same behaviour as the current crashkernel=
if we decide not to add fallback mechanism to it. The purpose of
crahskernel=,low is to assist crashkernel=,high to get kdump kernel
boot up with satisfing DMA allocation. While allowing independent
crashkernel=,low will add it another mission, limiting crashkernel only
reserved in low memory. Up to now, we don't see the need for that.

> 
> >         My personal opinion according to the existed senmantics on x86.
> >         Otherwise, the guidance of crashkernel= |,high|,low reservation
> >         will be complicated to write.
> 
> It's more that I find the current semantics unnecessarily confusing. But
> even reading the x86_64 text it's not that clear. For example the
> default low allocation for crashkernel= and crashkernel=,high is only
> mentioned in the crashkernel=,low description.

Yeah, we can improve those document if insufficiency is found.

By the way, with my observation, crashkernel= with fallback meet
99% of our needs. If people really need more than 512M memory or more,
then please consider crashkernel=,high. Basically on servers, low memory
is limited, while high memory is very big.

So I agree with you that we can make it step by step, firstly adding
basic crashkernel=,high and ,low support. We can add those complicated
cases later. 


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-06 11:39                                 ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-05-06 11:39 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Leizhen (ThunderTown),
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin, linux-kernel, Dave Young, Vivek Goyal,
	Eric Biederman, kexec, Will Deacon, linux-arm-kernel,
	Rob Herring, Frank Rowand, devicetree, Jonathan Corbet,
	linux-doc, Randy Dunlap, Feng Zhou, Kefeng Wang, Chen Zhou,
	John Donnelly, Dave Kleikamp

On 05/05/22 at 03:20pm, Catalin Marinas wrote:
> On Thu, May 05, 2022 at 11:00:19AM +0800, Baoquan He wrote:
> > On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> > > So, to recap, IIUC you are fine with:
> > > 
> > > 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> > > 				  above with a default in ZONE_DMA (like
> > > 				  x86, 256M or swiotlb size)
> > 
> >         Ack to this one.
> > 
> > 
> > > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > 
> >         Not exactly. If there's only ZONE_DMA, crashkernel,high will
> >         be reserved in ZONE_DMA, and crashkernel,low will be ignored.
> >         Other than this, ack.
> 
> Yes, that's fine.
> 
> > > 	crashkernel=Y,low	- allocate within ZONE_DMA
> > 
> >         Ack to this one.
> > > 
> > > 'crashkernel' overrides the high and low while the latter two can be
> > > passed independently.
> > 
> >         crashkernel=,high can be passed independently, then a crashkernel=,low
> >         is needed implicitly. If people don't want crashkernel=,low
> >         explicitly, crashkernel=0,low need be specified.
> 
> I find this complicating the interface. I don't know the background to
> the x86 implementation but we diverge already on arm64 since we talk
> about ZONE_DMA rather than 4G limit (though for most platforms these
> would be the same).
> 
> I guess we could restate the difference between crashkernel= and
> crashkernel=,high as the hint to go for allocation above ZONE_DMA first.

Yes, rethinking about this, we can make a straightforward and simpler
crashkernel=,high|,low on arm64, namely asking for user to clearly
specify them.

During maintenance of crashkernel= parameter in our distros, we found
crashkernel=xM is used mostly since most of systems can be satisfied
with 256M or a little more for kdump. While on some big end servers,
1G or more crashkernel memory is needed. In this case, crashkernel=,high
is taken. We don't want to reserve so much low memory during system
running while just waiting in case rare crash happened. crashkernel=,high
is rarely used, so making it simple and not so flexible is not so bad.
We can improve it later with justification.

> 
> >         An independent crashkernel=,low makes no sense. Crashkernel=,low
> >         should be paird with crashkernel=,high.
> 
> You could argue that crashkernel=,low gives the current crashkernel=
> behaviour, i.e. either all within ZONE_DMA or fail to allocate. So it
> may have some value on its own.

Yes, crashkernel=,low has the same behaviour as the current crashkernel=
if we decide not to add fallback mechanism to it. The purpose of
crahskernel=,low is to assist crashkernel=,high to get kdump kernel
boot up with satisfing DMA allocation. While allowing independent
crashkernel=,low will add it another mission, limiting crashkernel only
reserved in low memory. Up to now, we don't see the need for that.

> 
> >         My personal opinion according to the existed senmantics on x86.
> >         Otherwise, the guidance of crashkernel= |,high|,low reservation
> >         will be complicated to write.
> 
> It's more that I find the current semantics unnecessarily confusing. But
> even reading the x86_64 text it's not that clear. For example the
> default low allocation for crashkernel= and crashkernel=,high is only
> mentioned in the crashkernel=,low description.

Yeah, we can improve those document if insufficiency is found.

By the way, with my observation, crashkernel= with fallback meet
99% of our needs. If people really need more than 512M memory or more,
then please consider crashkernel=,high. Basically on servers, low memory
is limited, while high memory is very big.

So I agree with you that we can make it step by step, firstly adding
basic crashkernel=,high and ,low support. We can add those complicated
cases later. 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X
@ 2022-05-06 11:39                                 ` Baoquan He
  0 siblings, 0 replies; 102+ messages in thread
From: Baoquan He @ 2022-05-06 11:39 UTC (permalink / raw)
  To: kexec

On 05/05/22 at 03:20pm, Catalin Marinas wrote:
> On Thu, May 05, 2022 at 11:00:19AM +0800, Baoquan He wrote:
> > On 05/03/22 at 11:00pm, Catalin Marinas wrote:
> > > So, to recap, IIUC you are fine with:
> > > 
> > > 	crashkernel=Y		- allocate within ZONE_DMA with fallback
> > > 				  above with a default in ZONE_DMA (like
> > > 				  x86, 256M or swiotlb size)
> > 
> >         Ack to this one.
> > 
> > 
> > > 	crashkernel=Y,high	- allocate from above ZONE_DMA
> > 
> >         Not exactly. If there's only ZONE_DMA, crashkernel,high will
> >         be reserved in ZONE_DMA, and crashkernel,low will be ignored.
> >         Other than this, ack.
> 
> Yes, that's fine.
> 
> > > 	crashkernel=Y,low	- allocate within ZONE_DMA
> > 
> >         Ack to this one.
> > > 
> > > 'crashkernel' overrides the high and low while the latter two can be
> > > passed independently.
> > 
> >         crashkernel=,high can be passed independently, then a crashkernel=,low
> >         is needed implicitly. If people don't want crashkernel=,low
> >         explicitly, crashkernel=0,low need be specified.
> 
> I find this complicating the interface. I don't know the background to
> the x86 implementation but we diverge already on arm64 since we talk
> about ZONE_DMA rather than 4G limit (though for most platforms these
> would be the same).
> 
> I guess we could restate the difference between crashkernel= and
> crashkernel=,high as the hint to go for allocation above ZONE_DMA first.

Yes, rethinking about this, we can make a straightforward and simpler
crashkernel=,high|,low on arm64, namely asking for user to clearly
specify them.

During maintenance of crashkernel= parameter in our distros, we found
crashkernel=xM is used mostly since most of systems can be satisfied
with 256M or a little more for kdump. While on some big end servers,
1G or more crashkernel memory is needed. In this case, crashkernel=,high
is taken. We don't want to reserve so much low memory during system
running while just waiting in case rare crash happened. crashkernel=,high
is rarely used, so making it simple and not so flexible is not so bad.
We can improve it later with justification.

> 
> >         An independent crashkernel=,low makes no sense. Crashkernel=,low
> >         should be paird with crashkernel=,high.
> 
> You could argue that crashkernel=,low gives the current crashkernel=
> behaviour, i.e. either all within ZONE_DMA or fail to allocate. So it
> may have some value on its own.

Yes, crashkernel=,low has the same behaviour as the current crashkernel=
if we decide not to add fallback mechanism to it. The purpose of
crahskernel=,low is to assist crashkernel=,high to get kdump kernel
boot up with satisfing DMA allocation. While allowing independent
crashkernel=,low will add it another mission, limiting crashkernel only
reserved in low memory. Up to now, we don't see the need for that.

> 
> >         My personal opinion according to the existed senmantics on x86.
> >         Otherwise, the guidance of crashkernel= |,high|,low reservation
> >         will be complicated to write.
> 
> It's more that I find the current semantics unnecessarily confusing. But
> even reading the x86_64 text it's not that clear. For example the
> default low allocation for crashkernel= and crashkernel=,high is only
> mentioned in the crashkernel=,low description.

Yeah, we can improve those document if insufficiency is found.

By the way, with my observation, crashkernel= with fallback meet
99% of our needs. If people really need more than 512M memory or more,
then please consider crashkernel=,high. Basically on servers, low memory
is limited, while high memory is very big.

So I agree with you that we can make it step by step, firstly adding
basic crashkernel=,high and ,low support. We can add those complicated
cases later. 



^ permalink raw reply	[flat|nested] 102+ messages in thread

end of thread, other threads:[~2022-05-06 11:40 UTC | newest]

Thread overview: 102+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-14 11:57 [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
2022-04-14 11:57 ` Zhen Lei
2022-04-14 11:57 ` Zhen Lei
2022-04-14 11:57 ` [PATCH v22 1/9] kdump: return -ENOENT if required cmdline option does not exist Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-25  3:49   ` Baoquan He
2022-04-25  3:49     ` Baoquan He
2022-04-25  3:49     ` Baoquan He
2022-04-14 11:57 ` [PATCH v22 2/9] arm64: Use insert_resource() to simplify code Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57 ` [PATCH v22 3/9] arm64: kdump: Remove some redundant checks in map_mem() Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57 ` [PATCH v22 4/9] arm64: kdump: Don't force page-level mappings for memory above 4G Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-26 14:26   ` Catalin Marinas
2022-04-26 14:26     ` Catalin Marinas
2022-04-26 14:26     ` Catalin Marinas
2022-04-27  7:12     ` Leizhen (ThunderTown)
2022-04-27  7:12       ` Leizhen
2022-04-27  7:12       ` Leizhen (ThunderTown)
2022-04-14 11:57 ` [PATCH v22 5/9] arm64: kdump: Reimplement crashkernel=X Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-26 18:02   ` Catalin Marinas
2022-04-26 18:02     ` Catalin Marinas
2022-04-26 18:02     ` Catalin Marinas
2022-04-27  6:54     ` Leizhen (ThunderTown)
2022-04-27  6:54       ` Leizhen
2022-04-27  6:54       ` Leizhen (ThunderTown)
2022-04-27 12:32       ` Catalin Marinas
2022-04-27 12:32         ` Catalin Marinas
2022-04-27 12:32         ` Catalin Marinas
2022-04-27 13:49         ` Leizhen (ThunderTown)
2022-04-27 13:49           ` Leizhen
2022-04-27 13:49           ` Leizhen (ThunderTown)
2022-04-27 16:04           ` Catalin Marinas
2022-04-27 16:04             ` Catalin Marinas
2022-04-27 16:04             ` Catalin Marinas
2022-04-28  2:22             ` Leizhen (ThunderTown)
2022-04-28  2:22               ` Leizhen
2022-04-28  2:22               ` Leizhen (ThunderTown)
2022-04-28  3:40             ` Baoquan He
2022-04-28  3:40               ` Baoquan He
2022-04-28  3:40               ` Baoquan He
2022-04-28  3:52               ` Baoquan He
2022-04-28  3:52                 ` Baoquan He
2022-04-28  3:52                 ` Baoquan He
2022-04-28  9:33                 ` Leizhen (ThunderTown)
2022-04-28  9:33                   ` Leizhen
2022-04-28  9:33                   ` Leizhen (ThunderTown)
2022-04-29  3:24                   ` Baoquan He
2022-04-29  3:24                     ` Baoquan He
2022-04-29  3:24                     ` Baoquan He
2022-04-29  8:02                     ` Leizhen (ThunderTown)
2022-04-29  8:02                       ` Leizhen
2022-04-29  8:02                       ` Leizhen (ThunderTown)
2022-04-29  8:25                       ` Leizhen (ThunderTown)
2022-04-29  8:25                         ` Leizhen
2022-04-29  8:25                         ` Leizhen (ThunderTown)
2022-05-03 22:00                         ` Catalin Marinas
2022-05-03 22:00                           ` Catalin Marinas
2022-05-03 22:00                           ` Catalin Marinas
2022-05-05  2:13                           ` Leizhen (ThunderTown)
2022-05-05  2:13                             ` Leizhen
2022-05-05  2:13                             ` Leizhen (ThunderTown)
2022-05-05  3:00                           ` Baoquan He
2022-05-05  3:00                             ` Baoquan He
2022-05-05  3:00                             ` Baoquan He
2022-05-05 14:20                             ` Catalin Marinas
2022-05-05 14:20                               ` Catalin Marinas
2022-05-05 14:20                               ` Catalin Marinas
2022-05-06 11:39                               ` Baoquan He
2022-05-06 11:39                                 ` Baoquan He
2022-05-06 11:39                                 ` Baoquan He
2022-04-14 11:57 ` [PATCH v22 6/9] arm64: kdump: Use page-level mapping for the high memory of crashkernel Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57 ` [PATCH v22 7/9] arm64: kdump: Try not to use NO_BLOCK_MAPPINGS for memory under 4G Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57 ` [PATCH v22 8/9] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
2022-04-14 11:57   ` [PATCH v22 8/9] of: fdt: Add memory for devices by DT property "linux, usable-memory-range" Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57 ` [PATCH v22 9/9] docs: kdump: Update the crashkernel description for arm64 Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-14 11:57   ` Zhen Lei
2022-04-19 17:02 ` [PATCH v22 0/9] support reserving crashkernel above 4G on arm64 kdump Dave Kleikamp
2022-04-19 17:02   ` Dave Kleikamp
2022-04-19 17:02   ` Dave Kleikamp
2022-04-25  2:19 ` Leizhen (ThunderTown)
2022-04-25  2:19   ` Leizhen
2022-04-25  2:19   ` Leizhen (ThunderTown)
2022-04-25  2:45   ` Baoquan He
2022-04-25  2:45     ` Baoquan He
2022-04-25  2:45     ` Baoquan He
2022-04-25  6:29     ` Leizhen (ThunderTown)
2022-04-25  6:29       ` Leizhen
2022-04-25  6:29       ` Leizhen (ThunderTown)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.