linux-doc.vger.kernel.org archive mirror
* [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump
@ 2020-07-03  3:58 Chen Zhou
  2020-07-03  3:58 ` [PATCH v10 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c Chen Zhou
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Chen Zhou @ 2020-07-03  3:58 UTC (permalink / raw)
  To: tglx, mingo, dyoung, bhe, catalin.marinas, will, james.morse,
	robh+dt, arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne,
	corbet, bhsharma, horms
  Cc: guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc, chenzhou10

This patch series enables reserving crashkernel memory above 4G on arm64.

arm64 kdump currently has the following issues:
1. crashkernel=X reserves the crashkernel region below 4G, which fails
when there is not enough low memory.
2. crashkernel=Y@X can be used to reserve the crashkernel region above 4G,
but if swiotlb or DMA buffers are required, the crash dump kernel fails
to boot because no low memory is available for allocation.
3. Commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") broke
arm64 kdump. If the memory reserved for the crash dump kernel falls in
ZONE_DMA32, devices in the crash dump kernel that need ZONE_DMA fail to
allocate memory.

To solve these issues, introduce crashkernel=X,low to reserve low memory
of the specified size.
crashkernel=X tries to reserve memory for the crash dump kernel below
4G. If crashkernel=Y,low is specified at the same time, first reserve low
memory of the specified size for the crash dump kernel's devices, and then
reserve memory above 4G.
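
For illustration only, the policy above can be sketched in user-space C.
This is not the kernel code: plan_crash_reservation(), struct crash_plan and
NO_LIMIT are hypothetical names, and the fixed 4G bound is a simplification
(the arm64 code in patch 2 uses arm64_dma32_phys_limit as the default bound).

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4G		(1ULL << 32)
#define NO_LIMIT	UINT64_MAX	/* hypothetical: "no upper bound" */

/* Hypothetical description of what gets reserved, not a kernel type. */
struct crash_plan {
	bool reserve_low;	/* reserve a dedicated region below 4G first */
	uint64_t low_size;	/* size of that low region, 0 if none */
	uint64_t main_limit;	/* upper bound for the main region search */
};

/*
 * Without crashkernel=Y,low (or with a zero size) the main region must
 * stay below 4G. With it, the low region is reserved first and the
 * search for the main region is no longer capped at 4G.
 */
static struct crash_plan plan_crash_reservation(bool low_given,
						uint64_t low_size)
{
	struct crash_plan plan = { false, 0, SZ_4G };

	if (low_given && low_size > 0) {
		plan.reserve_low = true;
		plan.low_size = low_size;
		plan.main_limit = NO_LIMIT;
	}
	return plan;
}
```

With "crashkernel=2G crashkernel=256M,low", this sketch would ask for a
256M low region and an uncapped search for the 2G main region.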

When the crashkernel region is reserved above 4G and crashkernel=X,low is
also specified, the kernel reserves low memory of the specified size for the
crash dump kernel's devices. So there may be two crash kernel regions: one
below 4G and one above 4G.
To distinguish the low region from the high one without affecting existing
kexec-tools, rename the low region to "Crash kernel (low)" and pass it to
the crash dump kernel by reusing the DT property "linux,usable-memory-range".
The low memory region is the last range of "linux,usable-memory-range", which
keeps compatibility with existing user space and older kdump kernels.

In addition, kexec-tools needs to be modified:
arm64: support more than one crash kernel regions (see [1])

The documentation of the DT property 'linux,usable-memory-range' is also updated:
schemas: update 'linux,usable-memory-range' node schema (see [2])

The changes in each version are listed below; the previous postings and
discussions can be retrieved from the [vN] links at the end.

Changes since [v9]
- Add Acked-by from Dave to patch 1.
- Update patch 5 according to Dave's comments.
- Update chosen schema.

Changes since [v8]
- Reuse DT property "linux,usable-memory-range".
Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
memory region.
- Fix kdump broken with ZONE_DMA reintroduced.
- Update chosen schema.

Changes since [v7]
- Move x86 CRASH_ALIGN to 2M
As suggested by Dave and after some testing, move x86 CRASH_ALIGN to 2M.
- Update Documentation/devicetree/bindings/chosen.txt.
Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
as suggested by Arnd.
- Add Tested-by from John and pk.

Changes since [v6]
- Fix build errors reported by kbuild test robot.

Changes since [v5]
- Move reserve_crashkernel_low() into kernel/crash_core.c.
- Delete crashkernel=X,high.
- Modify crashkernel=X,low.
If crashkernel=X,low is specified at the same time, first reserve low memory of
the specified size for the crash dump kernel's devices and then reserve memory
above 4G.
In addition, rename crashk_low_res to "Crash kernel (low)" for arm64, and then
pass it to the crash dump kernel via the DT property "linux,low-memory-range".
- Update Documentation/admin-guide/kdump/kdump.rst.

Changes since [v4]
- Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.

Changes since [v3]
- Add memblock_cap_memory_ranges back for multiple ranges.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" into
two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove the memblock_cap_memory_ranges() I added in v1 and implement that
in fdt_enforce_memory_region().
There are at most two crash kernel regions. In the two-region case, we cap
the memory range to [min(regs[*].start), max(regs[*].end)] and then remove
the memory range in the middle.

[1]: http://lists.infradead.org/pipermail/kexec/2020-June/020737.html
[2]: https://github.com/robherring/dt-schema/pull/19 
[v1]: https://lkml.org/lkml/2019/4/2/1174
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/9/306
[v4]: https://lkml.org/lkml/2019/4/15/273
[v5]: https://lkml.org/lkml/2019/5/6/1360
[v6]: https://lkml.org/lkml/2019/8/30/142
[v7]: https://lkml.org/lkml/2019/12/23/411
[v8]: https://lkml.org/lkml/2020/5/21/213
[v9]: https://lkml.org/lkml/2020/6/28/73

Chen Zhou (5):
  x86: kdump: move reserve_crashkernel_low() into crash_core.c
  arm64: kdump: reserve crashkenel above 4G for crash dump kernel
  arm64: kdump: add memory for devices by DT property
    linux,usable-memory-range
  arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  kdump: update Documentation about crashkernel on arm64

 Documentation/admin-guide/kdump/kdump.rst     | 14 ++-
 .../admin-guide/kernel-parameters.txt         | 17 +++-
 arch/arm64/kernel/setup.c                     |  8 +-
 arch/arm64/mm/init.c                          | 74 ++++++++++++---
 arch/x86/kernel/setup.c                       | 66 ++------------
 include/linux/crash_core.h                    |  3 +
 include/linux/kexec.h                         |  2 -
 kernel/crash_core.c                           | 90 +++++++++++++++++++
 kernel/kexec_core.c                           | 17 ----
 9 files changed, 197 insertions(+), 94 deletions(-)

-- 
2.20.1



* [PATCH v10 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c
  2020-07-03  3:58 [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
@ 2020-07-03  3:58 ` Chen Zhou
  2020-07-03  3:58 ` [PATCH v10 2/5] arm64: kdump: reserve crashkenel above 4G for crash dump kernel Chen Zhou
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Chen Zhou @ 2020-07-03  3:58 UTC (permalink / raw)
  To: tglx, mingo, dyoung, bhe, catalin.marinas, will, james.morse,
	robh+dt, arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne,
	corbet, bhsharma, horms
  Cc: guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc, chenzhou10

In preparation for supporting reserve_crashkernel_low() on arm64 as
x86_64 does, move reserve_crashkernel_low() into kernel/crash_core.c.

Also, as suggested by Dave, change the x86_64 CRASH_ALIGN to 2M.
CONFIG_PHYSICAL_ALIGN can be set anywhere from 2M to 16M, so use 2M,
the same value as arm64.

Note that on arm64, low memory is reserved if and only if crashkernel=X,low
is specified. Unlike x86_64, arm64 does not reserve low memory automatically.
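
As a rough user-space sketch of how the low size is chosen in the moved
function: crash_low_size() and the MiB() macro are hypothetical names, and
the swiotlb_size argument stands in for swiotlb_size_or_default(). The
sizes follow the diff below; the memblock search and reservation are left
out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB(x)	((uint64_t)(x) << 20)

/*
 * Hypothetical helper, not a kernel function. low_given says whether
 * crashkernel=Y,low was parsed successfully and requested is its size.
 * Returns 0 and stores the size to reserve (0 means "reserve nothing"),
 * or -EINVAL when arm64 gets no ",low" parameter.
 */
static int crash_low_size(bool is_arm64, bool low_given, uint64_t requested,
			  uint64_t swiotlb_size, uint64_t *size)
{
	if (low_given) {
		*size = requested;	/* crashkernel=0,low disables it */
		return 0;
	}
	if (is_arm64)
		return -EINVAL;		/* no automatic low reservation */

	/* x86_64 default: swiotlb plus 8M of slack, at least 256M */
	*size = swiotlb_size + MiB(8);
	if (*size < MiB(256))
		*size = MiB(256);
	return 0;
}
```

For example, an x86_64 system with a 64M swiotlb and no ",low" parameter
ends up with the 256M minimum, while arm64 in the same situation reserves
no low memory at all.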

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Tested-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Acked-by: Dave Young <dyoung@redhat.com>
---
 arch/x86/kernel/setup.c    | 66 ++++-------------------------
 include/linux/crash_core.h |  3 ++
 include/linux/kexec.h      |  2 -
 kernel/crash_core.c        | 85 ++++++++++++++++++++++++++++++++++++++
 kernel/kexec_core.c        | 17 --------
 5 files changed, 96 insertions(+), 77 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a3767e74c758..33db99ae3035 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -401,8 +401,8 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 
 #ifdef CONFIG_KEXEC_CORE
 
-/* 16M alignment for crash kernel regions */
-#define CRASH_ALIGN		SZ_16M
+/* 2M alignment for crash kernel regions */
+#define CRASH_ALIGN		SZ_2M
 
 /*
  * Keep the crash kernel below this limit.
@@ -425,59 +425,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_ADDR_HIGH_MAX	SZ_64T
 #endif
 
-static int __init reserve_crashkernel_low(void)
-{
-#ifdef CONFIG_X86_64
-	unsigned long long base, low_base = 0, low_size = 0;
-	unsigned long total_low_mem;
-	int ret;
-
-	total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
-
-	/* crashkernel=Y,low */
-	ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
-	if (ret) {
-		/*
-		 * two parts from kernel/dma/swiotlb.c:
-		 * -swiotlb size: user-specified with swiotlb= or default.
-		 *
-		 * -swiotlb overflow buffer: now hardcoded to 32k. We round it
-		 * to 8M for other buffers that may need to stay low too. Also
-		 * make sure we allocate enough extra low memory so that we
-		 * don't run out of DMA buffers for 32-bit devices.
-		 */
-		low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
-	} else {
-		/* passed with crashkernel=0,low ? */
-		if (!low_size)
-			return 0;
-	}
-
-	low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
-	if (!low_base) {
-		pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
-		       (unsigned long)(low_size >> 20));
-		return -ENOMEM;
-	}
-
-	ret = memblock_reserve(low_base, low_size);
-	if (ret) {
-		pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
-		return ret;
-	}
-
-	pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
-		(unsigned long)(low_size >> 20),
-		(unsigned long)(low_base >> 20),
-		(unsigned long)(total_low_mem >> 20));
-
-	crashk_low_res.start = low_base;
-	crashk_low_res.end   = low_base + low_size - 1;
-	insert_resource(&iomem_resource, &crashk_low_res);
-#endif
-	return 0;
-}
-
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_size, crash_base, total_mem;
@@ -541,9 +488,12 @@ static void __init reserve_crashkernel(void)
 		return;
 	}
 
-	if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
-		memblock_free(crash_base, crash_size);
-		return;
+	if (crash_base >= (1ULL << 32)) {
+		if (reserve_crashkernel_low()) {
+			memblock_free(crash_base, crash_size);
+			return;
+		}
+		insert_resource(&iomem_resource, &crashk_low_res);
 	}
 
 	pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 525510a9f965..4df8c0bff03e 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -63,6 +63,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
 extern unsigned char *vmcoreinfo_data;
 extern size_t vmcoreinfo_size;
 extern u32 *vmcoreinfo_note;
+extern struct resource crashk_res;
+extern struct resource crashk_low_res;
 
 Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
 			  void *data, size_t data_len);
@@ -74,5 +76,6 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
 int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
+int __init reserve_crashkernel_low(void);
 
 #endif /* LINUX_CRASH_CORE_H */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index ea67910ae6b7..a460afdbab0f 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -330,8 +330,6 @@ extern int kexec_load_disabled;
 
 /* Location of a reserved region to hold the crash kernel.
  */
-extern struct resource crashk_res;
-extern struct resource crashk_low_res;
 extern note_buf_t __percpu *crash_notes;
 
 /* flag to track if kexec reboot is in progress */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..a7580d291c37 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -7,6 +7,8 @@
 #include <linux/crash_core.h>
 #include <linux/utsname.h>
 #include <linux/vmalloc.h>
+#include <linux/memblock.h>
+#include <linux/swiotlb.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -19,6 +21,22 @@ u32 *vmcoreinfo_note;
 /* trusted vmcoreinfo, e.g. we can make a copy in the crash memory */
 static unsigned char *vmcoreinfo_data_safecopy;
 
+/* Location of the reserved area for the crash kernel */
+struct resource crashk_res = {
+	.name  = "Crash kernel",
+	.start = 0,
+	.end   = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+	.desc  = IORES_DESC_CRASH_KERNEL
+};
+struct resource crashk_low_res = {
+	.name  = "Crash kernel",
+	.start = 0,
+	.end   = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+	.desc  = IORES_DESC_CRASH_KERNEL
+};
+
 /*
  * parsing the "crashkernel" commandline
  *
@@ -292,6 +310,73 @@ int __init parse_crashkernel_low(char *cmdline,
 				"crashkernel=", suffix_tbl[SUFFIX_LOW]);
 }
 
+#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
+#define CRASH_ALIGN		SZ_2M
+#endif
+
+int __init reserve_crashkernel_low(void)
+{
+#if defined(CONFIG_X86_64) || defined(CONFIG_ARM64)
+	unsigned long long base, low_base = 0, low_size = 0;
+	unsigned long total_low_mem;
+	int ret;
+
+	total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
+
+	/* crashkernel=Y,low */
+	ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size,
+			&base);
+	if (ret) {
+#ifdef CONFIG_X86_64
+		/*
+		 * two parts from kernel/dma/swiotlb.c:
+		 * -swiotlb size: user-specified with swiotlb= or default.
+		 *
+		 * -swiotlb overflow buffer: now hardcoded to 32k. We round it
+		 * to 8M for other buffers that may need to stay low too. Also
+		 * make sure we allocate enough extra low memory so that we
+		 * don't run out of DMA buffers for 32-bit devices.
+		 */
+		low_size = max(swiotlb_size_or_default() + (8UL << 20),
+				256UL << 20);
+#else
+		/*
+		 * On arm64, reserve low memory if and only if crashkernel=X,low
+		 * is specified.
+		 */
+		return -EINVAL;
+#endif
+	} else {
+		/* passed with crashkernel=0,low ? */
+		if (!low_size)
+			return 0;
+	}
+
+	low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+	if (!low_base) {
+		pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
+		       (unsigned long)(low_size >> 20));
+		return -ENOMEM;
+	}
+
+	ret = memblock_reserve(low_base, low_size);
+	if (ret) {
+		pr_err("%s: Error reserving crashkernel low memblock.\n",
+				__func__);
+		return ret;
+	}
+
+	pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
+		(unsigned long)(low_size >> 20),
+		(unsigned long)(low_base >> 20),
+		(unsigned long)(total_low_mem >> 20));
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end   = low_base + low_size - 1;
+#endif
+	return 0;
+}
+
 Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
 			  void *data, size_t data_len)
 {
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index c19c0dad1ebe..db66bbabfff3 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -53,23 +53,6 @@ note_buf_t __percpu *crash_notes;
 /* Flag to indicate we are going to kexec a new kernel */
 bool kexec_in_progress = false;
 
-
-/* Location of the reserved area for the crash kernel */
-struct resource crashk_res = {
-	.name  = "Crash kernel",
-	.start = 0,
-	.end   = 0,
-	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
-	.desc  = IORES_DESC_CRASH_KERNEL
-};
-struct resource crashk_low_res = {
-	.name  = "Crash kernel",
-	.start = 0,
-	.end   = 0,
-	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
-	.desc  = IORES_DESC_CRASH_KERNEL
-};
-
 int kexec_should_crash(struct task_struct *p)
 {
 	/*
-- 
2.20.1



* [PATCH v10 2/5] arm64: kdump: reserve crashkenel above 4G for crash dump kernel
  2020-07-03  3:58 [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
  2020-07-03  3:58 ` [PATCH v10 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c Chen Zhou
@ 2020-07-03  3:58 ` Chen Zhou
  2020-07-03  3:58 ` [PATCH v10 3/5] arm64: kdump: add memory for devices by DT property linux,usable-memory-range Chen Zhou
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Chen Zhou @ 2020-07-03  3:58 UTC (permalink / raw)
  To: tglx, mingo, dyoung, bhe, catalin.marinas, will, james.morse,
	robh+dt, arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne,
	corbet, bhsharma, horms
  Cc: guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc, chenzhou10

crashkernel=X tries to reserve memory for the crash dump kernel below
4G. If crashkernel=Y,low is specified at the same time, first reserve low
memory of the specified size for the crash dump kernel's devices, and then
reserve memory above 4G.

As suggested by James, only crashkernel=X,low is introduced on arm64.
Reserving the low region first and then the region above 4G, as described
above, keeps the implementation much simpler.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Tested-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
---
 arch/arm64/kernel/setup.c |  8 +++++++-
 arch/arm64/mm/init.c      | 31 +++++++++++++++++++++++++++++--
 2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 93b3844cf442..4dc51a2ac012 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -238,7 +238,13 @@ static void __init request_standard_resources(void)
 		    kernel_data.end <= res->end)
 			request_resource(res, &kernel_data);
 #ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
+		/*
+		 * Userspace will find "Crash kernel" region in /proc/iomem.
+		 * Note: the low region is renamed as Crash kernel (low).
+		 */
+		if (crashk_low_res.end && crashk_low_res.start >= res->start &&
+				crashk_low_res.end <= res->end)
+			request_resource(res, &crashk_low_res);
 		if (crashk_res.end && crashk_res.start >= res->start &&
 		    crashk_res.end <= res->end)
 			request_resource(res, &crashk_res);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1e93cfc7c47a..ce7ced85f5fb 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -81,6 +81,7 @@ static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size;
 	int ret;
+	phys_addr_t crash_max = arm64_dma32_phys_limit;
 
 	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
@@ -88,12 +89,38 @@ static void __init reserve_crashkernel(void)
 	if (ret || !crash_size)
 		return;
 
+	ret = reserve_crashkernel_low();
+	if (!ret && crashk_low_res.end) {
+		/*
+		 * If crashkernel=X,low specified, there may be two regions,
+		 * we need to make some changes as follows:
+		 *
+		 * 1. rename the low region as "Crash kernel (low)"
+		 * To distinguish it from the high region without affecting
+		 * existing kexec-tools, rename the low region to
+		 * "Crash kernel (low)".
+		 *
+		 * 2. change the upper bound for crash memory
+		 * Set MEMBLOCK_ALLOC_ACCESSIBLE upper bound for crash memory.
+		 *
+		 * 3. mark the low region as "nomap"
+		 * The low region is intended to be used for crash dump kernel
+		 * devices, just mark the low region as "nomap" simply.
+		 */
+		const char *rename = "Crash kernel (low)";
+
+		crashk_low_res.name = rename;
+		crash_max = MEMBLOCK_ALLOC_ACCESSIBLE;
+		memblock_mark_nomap(crashk_low_res.start,
+				    resource_size(&crashk_low_res));
+	}
+
 	crash_size = PAGE_ALIGN(crash_size);
 
 	if (crash_base == 0) {
 		/* Current arm64 boot protocol requires 2MB alignment */
-		crash_base = memblock_find_in_range(0, arm64_dma32_phys_limit,
-				crash_size, SZ_2M);
+		crash_base = memblock_find_in_range(0, crash_max, crash_size,
+				SZ_2M);
 		if (crash_base == 0) {
 			pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 				crash_size);
-- 
2.20.1



* [PATCH v10 3/5] arm64: kdump: add memory for devices by DT property linux,usable-memory-range
  2020-07-03  3:58 [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
  2020-07-03  3:58 ` [PATCH v10 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c Chen Zhou
  2020-07-03  3:58 ` [PATCH v10 2/5] arm64: kdump: reserve crashkenel above 4G for crash dump kernel Chen Zhou
@ 2020-07-03  3:58 ` Chen Zhou
  2020-07-03  3:58 ` [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced Chen Zhou
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Chen Zhou @ 2020-07-03  3:58 UTC (permalink / raw)
  To: tglx, mingo, dyoung, bhe, catalin.marinas, will, james.morse,
	robh+dt, arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne,
	corbet, bhsharma, horms
  Cc: guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc, chenzhou10

To reserve the crashkernel region above 4G, use the parameters
"crashkernel=X crashkernel=Y,low". In this case, low memory of the
specified size is reserved for the crash dump kernel's devices and is never
mapped by the first kernel. This memory range is advertised to the crash
dump kernel via a DT property under /chosen:
	linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>

The DT property linux,usable-memory-range is reused, with the low memory
region as the second range "BASE2 SIZE2", which keeps compatibility with
existing user space and older kdump kernels.

The crash dump kernel reads this property at boot time and calls
memblock_add() to add the low memory region after
memblock_cap_memory_range() has been called.
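
A minimal user-space sketch of reading up to two ranges from such a
property might look as follows. It assumes the property value is a byte
buffer of big-endian 32-bit cells; parse_usable_ranges(), next_cells() and
struct mem_range are hypothetical stand-ins for the kernel's
early_init_dt_scan_usablemem(), dt_mem_next_cell() and struct
memblock_region.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_USABLE_RANGES	2

/* Hypothetical stand-in for struct memblock_region. */
struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Read `cells` big-endian 32-bit cells from *p and advance the cursor. */
static uint64_t next_cells(int cells, const uint8_t **p)
{
	uint64_t v = 0;

	while (cells-- > 0) {
		uint32_t cell = ((uint32_t)(*p)[0] << 24) |
				((uint32_t)(*p)[1] << 16) |
				((uint32_t)(*p)[2] << 8) |
				(uint32_t)(*p)[3];

		v = (v << 32) | cell;
		*p += 4;
	}
	return v;
}

/*
 * Parse at most MAX_USABLE_RANGES <base size> pairs from a property of
 * `len` bytes. Returns the number of ranges stored in out[]; a trailing
 * partial entry is ignored.
 */
static int parse_usable_ranges(const uint8_t *prop, size_t len,
			       int addr_cells, int size_cells,
			       struct mem_range out[MAX_USABLE_RANGES])
{
	size_t entry = (size_t)(addr_cells + size_cells) * 4;
	const uint8_t *end = prop + len;
	int nr = 0;

	if (entry == 0)
		return 0;

	while (nr < MAX_USABLE_RANGES && (size_t)(end - prop) >= entry) {
		out[nr].base = next_cells(addr_cells, &prop);
		out[nr].size = next_cells(size_cells, &prop);
		nr++;
	}
	return nr;
}
```

An older kdump kernel that only reads the first <base size> pair still
sees a valid single range, which is why the low region goes last.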

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Tested-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
---
 arch/arm64/mm/init.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index ce7ced85f5fb..f5b31e8f1f34 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -69,6 +69,15 @@ EXPORT_SYMBOL(vmemmap);
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static phys_addr_t arm64_dma32_phys_limit __ro_after_init;
 
+/*
+ * The main usage of linux,usable-memory-range is for crash dump kernel.
+ * Originally, there was only one usable-memory region. Now the crash dump
+ * kernel supports at most two regions, a low region and a high region.
+ * To stay compatible with existing user-space and older kdump kernels, the
+ * low region, if it exists, is always the last range of the property.
+ */
+#define MAX_USABLE_RANGES	2
+
 #ifdef CONFIG_KEXEC_CORE
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
@@ -272,9 +281,9 @@ early_param("mem", early_mem);
 static int __init early_init_dt_scan_usablemem(unsigned long node,
 		const char *uname, int depth, void *data)
 {
-	struct memblock_region *usablemem = data;
-	const __be32 *reg;
-	int len;
+	struct memblock_region *usable_rgns = data;
+	const __be32 *reg, *endp;
+	int len, nr = 0;
 
 	if (depth != 1 || strcmp(uname, "chosen") != 0)
 		return 0;
@@ -283,22 +292,36 @@ static int __init early_init_dt_scan_usablemem(unsigned long node,
 	if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
 		return 1;
 
-	usablemem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
-	usablemem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
+	endp = reg + (len / sizeof(__be32));
+	while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
+		usable_rgns[nr].base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+		usable_rgns[nr].size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+		if (++nr >= MAX_USABLE_RANGES)
+			break;
+	}
 
 	return 1;
 }
 
 static void __init fdt_enforce_memory_region(void)
 {
-	struct memblock_region reg = {
-		.size = 0,
+	struct memblock_region usable_rgns[MAX_USABLE_RANGES] = {
+		{ .size = 0 },
+		{ .size = 0 }
 	};
 
-	of_scan_flat_dt(early_init_dt_scan_usablemem, &reg);
+	of_scan_flat_dt(early_init_dt_scan_usablemem, &usable_rgns);
 
-	if (reg.size)
-		memblock_cap_memory_range(reg.base, reg.size);
+	/*
+	 * The first usable-memory range is the only region of the crash
+	 * dump kernel, or its high region when there are two regions;
+	 * the second range, if it exists, is dedicated to the low region.
+	 */
+	if (usable_rgns[0].size)
+		memblock_cap_memory_range(usable_rgns[0].base, usable_rgns[0].size);
+	if (usable_rgns[1].size)
+		memblock_add(usable_rgns[1].base, usable_rgns[1].size);
 }
 
 void __init arm64_memblock_init(void)
-- 
2.20.1



* [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  2020-07-03  3:58 [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
                   ` (2 preceding siblings ...)
  2020-07-03  3:58 ` [PATCH v10 3/5] arm64: kdump: add memory for devices by DT property linux,usable-memory-range Chen Zhou
@ 2020-07-03  3:58 ` Chen Zhou
  2020-07-27 17:30   ` Catalin Marinas
  2020-07-03  3:58 ` [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64 Chen Zhou
  2020-07-03  7:26 ` [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Bhupesh Sharma
  5 siblings, 1 reply; 18+ messages in thread
From: Chen Zhou @ 2020-07-03  3:58 UTC (permalink / raw)
  To: tglx, mingo, dyoung, bhe, catalin.marinas, will, james.morse,
	robh+dt, arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne,
	corbet, bhsharma, horms
  Cc: guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc, chenzhou10

Commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32")
broke arm64 kdump. If the memory reserved for the crash dump kernel
falls in ZONE_DMA32, devices in the crash dump kernel that need
ZONE_DMA fail to allocate memory.

This patch addresses the issue on top of "reserving crashkernel
above 4G". Low memory was originally reserved below 4G; now the limit
in reserve_crashkernel_low() only needs to be lowered to
arm64_dma_phys_limit when ZONE_DMA is enabled. That is, if devices in
the crash dump kernel need ZONE_DMA, the parameters
"crashkernel=X crashkernel=Y,low" are a good choice.

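The limit selection described above can be sketched in user-space C as
follows. crash_low_limit() is a hypothetical helper, and its dma_phys_limit
argument stands in for arm64_dma_phys_limit; the real change is the small
hunk in the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the limit selection in this patch:
 * the low region must end below 4G by default, or below the ZONE_DMA
 * limit on arm64 when ZONE_DMA is enabled.
 */
static uint64_t crash_low_limit(bool is_arm64, bool zone_dma_enabled,
				uint64_t dma_phys_limit)
{
	uint64_t crash_max = 1ULL << 32;	/* default: below 4G */

	if (is_arm64 && zone_dma_enabled)
		crash_max = dma_phys_limit;
	return crash_max;
}
```

The result would be passed as the upper bound of the memblock search, so
that a low region found under it is usable by ZONE_DMA devices in the
crash dump kernel.
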
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
---
 kernel/crash_core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index a7580d291c37..e8ecbbc761a3 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -320,6 +320,7 @@ int __init reserve_crashkernel_low(void)
 	unsigned long long base, low_base = 0, low_size = 0;
 	unsigned long total_low_mem;
 	int ret;
+	phys_addr_t crash_max = 1ULL << 32;
 
 	total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
 
@@ -352,7 +353,11 @@ int __init reserve_crashkernel_low(void)
 			return 0;
 	}
 
-	low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+#ifdef CONFIG_ARM64
+	if (IS_ENABLED(CONFIG_ZONE_DMA))
+		crash_max = arm64_dma_phys_limit;
+#endif
+	low_base = memblock_find_in_range(0, crash_max, low_size, CRASH_ALIGN);
 	if (!low_base) {
 		pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
 		       (unsigned long)(low_size >> 20));
-- 
2.20.1



* [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64
  2020-07-03  3:58 [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
                   ` (3 preceding siblings ...)
  2020-07-03  3:58 ` [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced Chen Zhou
@ 2020-07-03  3:58 ` Chen Zhou
  2020-07-03  4:46   ` Dave Young
  2020-07-03  7:26 ` [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Bhupesh Sharma
  5 siblings, 1 reply; 18+ messages in thread
From: Chen Zhou @ 2020-07-03  3:58 UTC (permalink / raw)
  To: tglx, mingo, dyoung, bhe, catalin.marinas, will, james.morse,
	robh+dt, arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne,
	corbet, bhsharma, horms
  Cc: guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc, chenzhou10

Now that crashkernel=X,[low] is supported on arm64, update the documentation.
The parameters "crashkernel=X crashkernel=Y,low" can be used to reserve
memory above 4G.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Tested-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
---
 Documentation/admin-guide/kdump/kdump.rst       | 14 ++++++++++++--
 Documentation/admin-guide/kernel-parameters.txt | 17 +++++++++++++++--
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 2da65fef2a1c..e80fc9e28a9a 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -299,7 +299,15 @@ Boot into System Kernel
    "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
    starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
 
-   On x86 and x86_64, use "crashkernel=64M@16M".
+   On x86 use "crashkernel=64M@16M".
+
+   On x86_64, use "crashkernel=Y" to select a region under 4G first, and
+   fall back to reserving a region above 4G.
+   "crashkernel=X,high" can also be used to select a region above 4G; it
+   also tries to allocate at least 256M below 4G automatically, and
+   "crashkernel=Y,low" can be used to allocate low memory of a specified size.
+   Use "crashkernel=Y@X" if memory really has to be reserved at the specified
+   start address X.
 
    On ppc64, use "crashkernel=128M@32M".
 
@@ -316,8 +324,10 @@ Boot into System Kernel
    kernel will automatically locate the crash kernel image within the
    first 512MB of RAM if X is not given.
 
-   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
+   On arm64, use "crashkernel=Y[@X]". Note that the start address of
    the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
+   If crashkernel=Z,low is specified at the same time, low memory of the
+   specified size is reserved first, and then memory above 4G is reserved.
 
 Load the Dump-capture Kernel
 ============================
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb95fad81c79..58a731eed011 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -722,6 +722,9 @@
 			[KNL, x86_64] select a region under 4G first, and
 			fall back to reserve region above 4G when '@offset'
 			hasn't been specified.
+			[KNL, arm64] If crashkernel=X,low is specified, reserve
+			low memory of the specified size first, and then reserve
+			memory above 4G.
 			See Documentation/admin-guide/kdump/kdump.rst for further details.
 
 	crashkernel=range1:size1[,range2:size2,...][@offset]
@@ -746,13 +749,23 @@
 			requires at least 64M+32K low memory, also enough extra
 			low memory is needed to make sure DMA buffers for 32-bit
 			devices won't run out. Kernel would try to allocate at
-			at least 256M below 4G automatically.
+			least 256M below 4G automatically.
 			This one let user to specify own low range under 4G
 			for second kernel instead.
 			0: to disable low allocation.
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
-
+			[KNL, arm64] range under 4G.
+			This lets the user specify their own low range under 4G
+			for the crash dump kernel.
+			Unlike x86_64, the kernel reserves a physical memory
+			region of the specified size only when this parameter is
+			given, instead of trying to reserve at least 256M below 4G
+			automatically.
+			Use this parameter along with crashkernel=X to reserve
+			the crashkernel region above 4G. It is also a good choice
+			when devices in the crash dump kernel need to use
+			ZONE_DMA.
 	cryptomgr.notests
 			[KNL] Disable crypto self-tests
 
-- 
2.20.1
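
The parameter forms documented in the patch above can be illustrated with
a minimal Python sketch that splits a command line into its crashkernel=
parts and checks the 2MiB start-address alignment stated for arm64. This
is only an illustration under stated assumptions: parse_size(),
parse_crashkernel() and start_is_aligned() are hypothetical helper names,
not kernel or kexec-tools interfaces, only the K/M/G suffixes are handled,
and the range-based syntax (range1:size1,...) is deliberately rejected.

```python
import re

SUFFIXES = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
ALIGN_2M = 0x200000  # arm64 start-address alignment quoted in kdump.rst


def parse_size(text):
    """Convert '512M', '1G' or '4096' into bytes (hypothetical helper)."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if m is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(m.group(1)) * SUFFIXES[m.group(2)]


def parse_crashkernel(cmdline):
    """Collect the simple crashkernel= forms from a kernel command line."""
    result = {"size": None, "start": None, "low": None, "high": None}
    for token in cmdline.split():
        if not token.startswith("crashkernel="):
            continue
        value = token[len("crashkernel="):]
        if value.endswith(",low"):
            result["low"] = parse_size(value[:-len(",low")])
        elif value.endswith(",high"):
            result["high"] = parse_size(value[:-len(",high")])
        elif "@" in value:
            size, start = value.split("@", 1)
            result["size"] = parse_size(size)
            result["start"] = parse_size(start)
        else:
            result["size"] = parse_size(value)
    return result


def start_is_aligned(start, align=ALIGN_2M):
    """True if an explicit start address satisfies the alignment rule."""
    return start % align == 0


# The combination suggested in this series for reserving above 4G.
cfg = parse_crashkernel("ro crashkernel=2G crashkernel=256M,low")
print(cfg)
```

With "crashkernel=64M@16M" the sketch yields a start of 0x01000000, which
passes the alignment check.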



* Re: [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64
  2020-07-03  3:58 ` [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64 Chen Zhou
@ 2020-07-03  4:46   ` Dave Young
  2020-07-03  4:50     ` Dave Young
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Young @ 2020-07-03  4:46 UTC (permalink / raw)
  To: Chen Zhou
  Cc: tglx, mingo, bhe, catalin.marinas, will, james.morse, robh+dt,
	arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

Hi,

Thanks for the update, but still some nitpicks :(

I'm sorry I did not catch them previously, but it is probably not worth
reposting the whole series if no other changes are needed.
On 07/03/20 at 11:58am, Chen Zhou wrote:
> Now we support crashkernel=X,[low] on arm64, update the Documentation.
> We could use parameters "crashkernel=X crashkernel=Y,low" to reserve
> memory above 4G.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> Tested-by: John Donnelly <John.p.donnelly@oracle.com>
> Tested-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
> ---
>  Documentation/admin-guide/kdump/kdump.rst       | 14 ++++++++++++--
>  Documentation/admin-guide/kernel-parameters.txt | 17 +++++++++++++++--
>  2 files changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> index 2da65fef2a1c..e80fc9e28a9a 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -299,7 +299,15 @@ Boot into System Kernel
>     "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
>     starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
>  
> -   On x86 and x86_64, use "crashkernel=64M@16M".
> +   On x86 use "crashkernel=64M@16M".
> +
> +   On x86_64, use "crashkernel=Y" to select a region under 4G first, and
> +   fall back to reserve region above 4G.
> +   We can also use "crashkernel=X,high" to select a region above 4G, which
> +   also tries to allocate at least 256M below 4G automatically and
> +   "crashkernel=Y,low" can be used to allocate specified size low memory.
> +   Use "crashkernel=Y@X" if we really have to reserve memory from specified

s/we/you

> +   start address X.
>  
>     On ppc64, use "crashkernel=128M@32M".
>  
> @@ -316,8 +324,10 @@ Boot into System Kernel
>     kernel will automatically locate the crash kernel image within the
>     first 512MB of RAM if X is not given.
>  
> -   On arm64, use "crashkernel=Y[@X]".  Note that the start address of
> +   On arm64, use "crashkernel=Y[@X]". Note that the start address of
>     the kernel, X if explicitly specified, must be aligned to 2MiB (0x200000).
> +   If crashkernel=Z,low is specified simultaneously, reserve spcified size

s/spcified/specified

> +   low memory firstly and then reserve memory above 4G.
>  
>  Load the Dump-capture Kernel
>  ============================
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fb95fad81c79..58a731eed011 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -722,6 +722,9 @@
>  			[KNL, x86_64] select a region under 4G first, and
>  			fall back to reserve region above 4G when '@offset'
>  			hasn't been specified.
> +			[KNL, arm64] If crashkernel=X,low is specified, reserve
> +			spcified size low memory firstly, and then reserve memory

s/spcified/specified

> +			above 4G.
>  			See Documentation/admin-guide/kdump/kdump.rst for further details.
>  
>  	crashkernel=range1:size1[,range2:size2,...][@offset]
> @@ -746,13 +749,23 @@
>  			requires at least 64M+32K low memory, also enough extra
>  			low memory is needed to make sure DMA buffers for 32-bit
>  			devices won't run out. Kernel would try to allocate at
> -			at least 256M below 4G automatically.
> +			least 256M below 4G automatically.
>  			This one let user to specify own low range under 4G
>  			for second kernel instead.
>  			0: to disable low allocation.
>  			It will be ignored when crashkernel=X,high is not used
>  			or memory reserved is below 4G.
> -
> +			[KNL, arm64] range under 4G.
> +			This one let user to specify own low range under 4G

s/own low/a low

> +			for crash dump kernel instead.
> +			Be different from x86_64, kernel reserves specified size
> +			physical memory region only when this parameter is specified
> +			instead of trying to reserve at least 256M below 4G
> +			automatically.
> +			Use this parameter along with crashkernel=X when we want
> +			to reserve crashkernel above 4G. If there are devices
> +			need to use ZONE_DMA in crash dump kernel, it is also
> +			a good choice.
>  	cryptomgr.notests
>  			[KNL] Disable crypto self-tests
>  
> -- 
> 2.20.1
> 



* Re: [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64
  2020-07-03  4:46   ` Dave Young
@ 2020-07-03  4:50     ` Dave Young
  2020-07-03  9:11       ` Dave Young
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Young @ 2020-07-03  4:50 UTC (permalink / raw)
  To: Chen Zhou
  Cc: tglx, mingo, bhe, catalin.marinas, will, james.morse, robh+dt,
	arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

On 07/03/20 at 12:46pm, Dave Young wrote:
> Hi,
> 
> Thanks for the update, but still some nitpicks :(
> 
> I'm sorry I did not catch them previously, but it is probably not worth
> reposting the whole series if no other changes are needed.

Feel free to add my acks for the common kdump part:

Acked-by: Dave Young <dyoung@redhat.com>

Thanks
Dave



* Re: [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump
  2020-07-03  3:58 [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
                   ` (4 preceding siblings ...)
  2020-07-03  3:58 ` [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64 Chen Zhou
@ 2020-07-03  7:26 ` Bhupesh Sharma
  2020-07-03  8:38   ` chenzhou
  5 siblings, 1 reply; 18+ messages in thread
From: Bhupesh Sharma @ 2020-07-03  7:26 UTC (permalink / raw)
  To: Chen Zhou
  Cc: Thomas Gleixner, Ingo Molnar, RuiRui Yang, Baoquan He,
	Catalin Marinas, Will Deacon, James Morse, Rob Herring,
	Arnd Bergmann, John Donnelly, Prabhakar Kushwaha, nsaenzjulienne,
	Jonathan Corbet, Simon Horman, guohanjun, xiexiuqi, huawei.libin,
	Linux Kernel Mailing List, linux-arm-kernel, kexec mailing list,
	Linux Doc Mailing List

Hi Chen,

On Fri, Jul 3, 2020 at 9:24 AM Chen Zhou <chenzhou10@huawei.com> wrote:
>
> This patch series enable reserving crashkernel above 4G in arm64.
>
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
> when there is no enough low memory.
> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
> in this case, if swiotlb or DMA buffers are required, crash dump kernel
> will boot failure because there is no low memory available for allocation.
> 3. commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") broken
> the arm64 kdump. If the memory reserved for crash dump kernel falled in
> ZONE_DMA32, the devices in crash dump kernel need to use ZONE_DMA will alloc
> fail.
>
> To solve these issues, introduce crashkernel=X,low to reserve specified
> size low memory.
> Crashkernel=X tries to reserve memory for the crash dump kernel under
> 4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
> size low memory for crash kdump kernel devices firstly and then reserve
> memory above 4G.
>
> When crashkernel is reserved above 4G in memory and crashkernel=X,low
> is specified simultaneously, kernel should reserve specified size low memory
> for crash dump kernel devices. So there may be two crash kernel regions, one
> is below 4G, the other is above 4G.
> In order to distinct from the high region and make no effect to the use of
> kexec-tools, rename the low region as "Crash kernel (low)", and pass the
> low region by reusing DT property "linux,usable-memory-range". We made the low
> memory region as the last range of "linux,usable-memory-range" to keep
> compatibility with existing user-space and older kdump kernels.
>
> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
>
> Another update is document about DT property 'linux,usable-memory-range':
> schemas: update 'linux,usable-memory-range' node schema(see [2])
>
> The previous changes and discussions can be retrieved from:
>
> Changes since [v9]
> - Patch 1 add Acked-by from Dave.
> - Update patch 5 according to Dave's comments.
> - Update chosen schema.
>
> Changes since [v8]
> - Reuse DT property "linux,usable-memory-range".
> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
> memory region.
> - Fix kdump broken with ZONE_DMA reintroduced.
> - Update chosen schema.
>
> Changes since [v7]
> - Move x86 CRASH_ALIGN to 2M
> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> - Update Documentation/devicetree/bindings/chosen.txt.
> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
> suggested by Arnd.
> - Add Tested-by from Jhon and pk.
>
> Changes since [v6]
> - Fix build errors reported by kbuild test robot.
>
> Changes since [v5]
> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> - Delete crashkernel=X,high.
> - Modify crashkernel=X,low.
> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> pass to crash dump kernel by DT property "linux,low-memory-range".
> - Update Documentation/admin-guide/kdump/kdump.rst.
>
> Changes since [v4]
> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
>
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
>
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
>
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
>
> [1]: http://lists.infradead.org/pipermail/kexec/2020-June/020737.html
> [2]: https://github.com/robherring/dt-schema/pull/19
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
> [v5]: https://lkml.org/lkml/2019/5/6/1360
> [v6]: https://lkml.org/lkml/2019/8/30/142
> [v7]: https://lkml.org/lkml/2019/12/23/411
> [v8]: https://lkml.org/lkml/2020/5/21/213
> [v9]: https://lkml.org/lkml/2020/6/28/73
>
> Chen Zhou (5):
>   x86: kdump: move reserve_crashkernel_low() into crash_core.c
>   arm64: kdump: reserve crashkenel above 4G for crash dump kernel
>   arm64: kdump: add memory for devices by DT property
>     linux,usable-memory-range
>   arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
>   kdump: update Documentation about crashkernel on arm64
>
>  Documentation/admin-guide/kdump/kdump.rst     | 14 ++-
>  .../admin-guide/kernel-parameters.txt         | 17 +++-
>  arch/arm64/kernel/setup.c                     |  8 +-
>  arch/arm64/mm/init.c                          | 74 ++++++++++++---
>  arch/x86/kernel/setup.c                       | 66 ++------------
>  include/linux/crash_core.h                    |  3 +
>  include/linux/kexec.h                         |  2 -
>  kernel/crash_core.c                           | 90 +++++++++++++++++++
>  kernel/kexec_core.c                           | 17 ----
>  9 files changed, 197 insertions(+), 94 deletions(-)
>
> --
> 2.20.1

Thanks for the v10.

1. This series still seems to be broken on arm64 boards such as Ampere
and ThunderX2 (Marvell) because of the ZONE_DMA32-related OOM seen while
booting the kdump kernel.
Here are details about my environment:

- Latest upstream Linus master branch (5.8.0-rc3) + your v10 patches.
- Latest upstream kexec-tools + your v4 patch.

# dmesg | grep -i crash
[    0.000000] crashkernel reserved: 0x00000000ca000000 - 0x00000000ea000000 (512 MB)
[    0.000000] Kernel command line: BOOT_IMAGE=(hd13,gpt2)/vmlinuz-5.8.0-rc3+ root=/dev/mapper/rhel_hpe--apache--cn99xx--09-root ro rd.lvm.lv=rhel_hpe-apache-cn99xx-09/root rd.lvm.lv=rhel_hpe-apache-cn99xx-09/swap crashkernel=512M
[   58.917523]     crashkernel=512M
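
As a cross-check of the reservation reported above, the following Python
sketch parses the "crashkernel reserved" line and classifies the region.
It is a sketch under one loudly stated assumption: that ZONE_DMA covers
only the first 1 GiB of physical address space on kernels carrying commit
1a8e1cef7603, which is expressed by the hypothetical constant
ASSUMED_ZONE_DMA_END. Adjust it if that does not hold on a given platform.
parse_reserved() is likewise a hypothetical helper.

```python
import re

FOUR_GIB = 1 << 32
ASSUMED_ZONE_DMA_END = 1 << 30  # assumption: ZONE_DMA is the first 1 GiB

LINE = "crashkernel reserved: 0x00000000ca000000 - 0x00000000ea000000 (512 MB)"


def parse_reserved(line):
    """Return (start, end) of the reserved region as integers."""
    m = re.search(r"crashkernel reserved: (0x[0-9a-f]+) - (0x[0-9a-f]+)", line)
    if m is None:
        raise ValueError("no crashkernel reservation found")
    return int(m.group(1), 16), int(m.group(2), 16)


start, end = parse_reserved(LINE)
size_mib = (end - start) >> 20                  # region size in MiB
below_4g = end <= FOUR_GIB                      # entirely under 4G?
in_assumed_zone_dma = end <= ASSUMED_ZONE_DMA_END

print(size_mib, below_4g, in_assumed_zone_dma)  # prints: 512 True False
```

Under that assumption the 512 MB region sits below 4G but outside
ZONE_DMA, which is consistent with the ZONE_DMA32-related failure
described in this report.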

2. Here is the OOM crash seen while booting the kdump kernel:

[    0.244724] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
[    0.251859] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000188
[    0.260737] Mem abort info:
[    0.263553]   ESR = 0x96000006
[    0.266632]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.271994]   SET = 0, FnV = 0
[    0.275074]   EA = 0, S1PTW = 0
[    0.278239] Data abort info:
[    0.281141]   ISV = 0, ISS = 0x00000006
[    0.285010]   CM = 0, WnR = 0
[    0.288001] [0000000000000188] user address but active_mm is swapper
[    0.294420] Internal error: Oops: 96000006 [#1] SMP
[    0.299344] Modules linked in:
[    0.302424] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3+ #8
[    0.308753] Hardware name: HPE Apollo 70             /C01_APACHE_MB        , BIOS L50_5.13_1.11 06/18/2019
[    0.318599] pstate: 00400009 (nzcv daif +PAN -UAO BTYPE=--)
[    0.324228] pc : mem_cgroup_get_nr_swap_pages+0x2c/0x60
[    0.329506] lr : shrink_lruvec+0x404/0x4f8
[    0.333638] sp : fffffe0012b8f840
[    0.336979] x29: fffffe0012b8f840 x28: fffffe00116b3000
[    0.342343] x27: fffffe0012b8fb00 x26: 0000000000000020
[    0.347707] x25: 0000000000000000 x24: fffffc0069fffe28
[    0.353070] x23: 0000000000000000 x22: 0000000000000000
[    0.358433] x21: 000000000000003c x20: fffffe0012b8fa98
[    0.363796] x19: 0000000000000000 x18: 0000000000000010
[    0.369159] x17: 00000000bd8afee8 x16: 000000001260aa76
[    0.374523] x15: ffffffffffffffff x14: fffffe00116b3988
[    0.379886] x13: fffffe0092b8faa7 x12: fffffe0012b8faaf
[    0.385248] x11: fffffe00116f1000 x10: fffffe0012b8fa30
[    0.390612] x9 : fffffe0010244ebc x8 : 0000000000000000
[    0.395975] x7 : 0000000000000020 x6 : 00000000ffff8ae3
[    0.401338] x5 : 0000000000000000 x4 : fffffc004da89000
[    0.406701] x3 : 0000000000000000 x2 : 0000000000000000
[    0.412064] x1 : fffffe00116bf000 x0 : 0000000000000000
[    0.417427] Call trace:
[    0.419891]  mem_cgroup_get_nr_swap_pages+0x2c/0x60
[    0.424815]  shrink_node+0x1a8/0x688
[    0.428420]  do_try_to_free_pages+0xe8/0x448
[    0.432729]  try_to_free_pages+0x110/0x230
[    0.436863]  __alloc_pages_slowpath.constprop.106+0x2b8/0xb48
[    0.442666]  __alloc_pages_nodemask+0x2ac/0x2f8
[    0.447239]  alloc_page_interleave+0x20/0x90
[    0.451548]  alloc_pages_current+0xdc/0xf8
[    0.455681]  atomic_pool_expand+0x60/0x210
[    0.459817]  __dma_atomic_pool_init+0x50/0xa4
[    0.464214]  dma_atomic_pool_init+0xac/0x158
[    0.468522]  do_one_initcall+0x50/0x218
[    0.472393]  kernel_init_freeable+0x22c/0x2d0
[    0.476792]  kernel_init+0x18/0x110
[    0.480310]  ret_from_fork+0x10/0x18
[    0.483918] Code: 350001e3 d503201f f9450024 1400000a (f940c401)
[    0.490074] ---[ end trace e5a9147af159e580 ]---
[    0.494734] Kernel panic - not syncing: Fatal exception
[    0.500010] Rebooting in 10 seconds..
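
The "Mem abort info" fields in the oops above follow from the raw ESR
value. The Python sketch below decodes it, using the ESR_ELx bit layout
for data aborts as described in the Arm architecture reference manual (EC
in bits [31:26], IL in bit 25, ISS in bits [24:0], WnR in ISS bit 6, DFSC
in ISS bits [5:0]). decode_esr() is a hypothetical helper written for
illustration, not a kernel function.

```python
ESR = 0x96000006  # value reported in the oops


def decode_esr(esr):
    """Split an ESR_ELx value into the fields printed by the kernel."""
    iss = esr & 0x1FFFFFF            # instruction specific syndrome
    return {
        "ec": (esr >> 26) & 0x3F,    # exception class
        "il": (esr >> 25) & 0x1,     # 1 means a 32-bit instruction
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,     # 0 means the access was a read
        "dfsc": iss & 0x3F,          # data fault status code
    }


fields = decode_esr(ESR)
print(f"EC={fields['ec']:#x} IL={fields['il']} ISS={fields['iss']:#x} "
      f"WnR={fields['wnr']} DFSC={fields['dfsc']:#x}")
# prints: EC=0x25 IL=1 ISS=0x6 WnR=0 DFSC=0x6
```

EC 0x25 is a data abort taken at the current exception level, and DFSC
0x6 encodes a level 2 translation fault on a read, which matches the NULL
pointer dereference at offset 0x188 reported in the log.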

3. Did you test your patch with a simple crashkernel=512M command line
(without using the crashkernel hi/lo or crashkernel=X@Y format)?

Anyway, since this implementation still needs rework, we can go ahead in
the meantime with the arrangement of limiting the crashkernel allocation
to the ZONE_DMA range (as I suggested in another patch series
<http://lists.infradead.org/pipermail/kexec/2020-July/020777.html>).
That ensures the upstream kernel can still support kdump on arm64 boards
where it was working before the ZONE_DMA32 changes were introduced for
arm64.

Please let me know your views,

Thanks,
Bhupesh



* Re: [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump
  2020-07-03  7:26 ` [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Bhupesh Sharma
@ 2020-07-03  8:38   ` chenzhou
  2020-07-27 12:38     ` John Donnelly
  0 siblings, 1 reply; 18+ messages in thread
From: chenzhou @ 2020-07-03  8:38 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: Thomas Gleixner, Ingo Molnar, RuiRui Yang, Baoquan He,
	Catalin Marinas, Will Deacon, James Morse, Rob Herring,
	Arnd Bergmann, John Donnelly, Prabhakar Kushwaha, nsaenzjulienne,
	Jonathan Corbet, Simon Horman, guohanjun, xiexiuqi, huawei.libin,
	Linux Kernel Mailing List, linux-arm-kernel, kexec mailing list,
	Linux Doc Mailing List

Hi Bhupesh,


On 2020/7/3 15:26, Bhupesh Sharma wrote:
> Hi Chen,
>
> On Fri, Jul 3, 2020 at 9:24 AM Chen Zhou <chenzhou10@huawei.com> wrote:
>> This patch series enable reserving crashkernel above 4G in arm64.
>>
>> There are following issues in arm64 kdump:
>> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
>> when there is no enough low memory.
>> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
>> in this case, if swiotlb or DMA buffers are required, crash dump kernel
>> will boot failure because there is no low memory available for allocation.
>> 3. commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") broken
>> the arm64 kdump. If the memory reserved for crash dump kernel falled in
>> ZONE_DMA32, the devices in crash dump kernel need to use ZONE_DMA will alloc
>> fail.
>>
>> To solve these issues, introduce crashkernel=X,low to reserve specified
>> size low memory.
>> Crashkernel=X tries to reserve memory for the crash dump kernel under
>> 4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
>> size low memory for crash kdump kernel devices firstly and then reserve
>> memory above 4G.
>>
>> When crashkernel is reserved above 4G in memory and crashkernel=X,low
>> is specified simultaneously, kernel should reserve specified size low memory
>> for crash dump kernel devices. So there may be two crash kernel regions, one
>> is below 4G, the other is above 4G.
>> In order to distinct from the high region and make no effect to the use of
>> kexec-tools, rename the low region as "Crash kernel (low)", and pass the
>> low region by reusing DT property "linux,usable-memory-range". We made the low
>> memory region as the last range of "linux,usable-memory-range" to keep
>> compatibility with existing user-space and older kdump kernels.
>>
>> Besides, we need to modify kexec-tools:
>> arm64: support more than one crash kernel regions(see [1])
>>
>> Another update is document about DT property 'linux,usable-memory-range':
>> schemas: update 'linux,usable-memory-range' node schema(see [2])
>>
>> The previous changes and discussions can be retrieved from:
>>
>> Changes since [v9]
>> - Patch 1 add Acked-by from Dave.
>> - Update patch 5 according to Dave's comments.
>> - Update chosen schema.
>>
>> Changes since [v8]
>> - Reuse DT property "linux,usable-memory-range".
>> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
>> memory region.
>> - Fix kdump broken with ZONE_DMA reintroduced.
>> - Update chosen schema.
>>
>> Changes since [v7]
>> - Move x86 CRASH_ALIGN to 2M
>> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
>> - Update Documentation/devicetree/bindings/chosen.txt.
>> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
>> suggested by Arnd.
>> - Add Tested-by from Jhon and pk.
>>
>> Changes since [v6]
>> - Fix build errors reported by kbuild test robot.
>>
>> Changes since [v5]
>> - Move reserve_crashkernel_low() into kernel/crash_core.c.
>> - Delete crashkernel=X,high.
>> - Modify crashkernel=X,low.
>> If crashkernel=X,low is specified simultaneously, reserve spcified size low
>> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
>> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
>> pass to crash dump kernel by DT property "linux,low-memory-range".
>> - Update Documentation/admin-guide/kdump/kdump.rst.
>>
>> Changes since [v4]
>> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
>>
>> Changes since [v3]
>> - Add memblock_cap_memory_ranges back for multiple ranges.
>> - Fix some compiling warnings.
>>
>> Changes since [v2]
>> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
>> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
>> patch.
>>
>> Changes since [v1]:
>> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
>> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
>> in fdt_enforce_memory_region().
>> There are at most two crash kernel regions, for two crash kernel regions
>> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
>> and then remove the memory range in the middle.
>>
>> [1]: http://lists.infradead.org/pipermail/kexec/2020-June/020737.html
>> [2]: https://github.com/robherring/dt-schema/pull/19
>> [v1]: https://lkml.org/lkml/2019/4/2/1174
>> [v2]: https://lkml.org/lkml/2019/4/9/86
>> [v3]: https://lkml.org/lkml/2019/4/9/306
>> [v4]: https://lkml.org/lkml/2019/4/15/273
>> [v5]: https://lkml.org/lkml/2019/5/6/1360
>> [v6]: https://lkml.org/lkml/2019/8/30/142
>> [v7]: https://lkml.org/lkml/2019/12/23/411
>> [v8]: https://lkml.org/lkml/2020/5/21/213
>> [v9]: https://lkml.org/lkml/2020/6/28/73
>>
>> Chen Zhou (5):
>>   x86: kdump: move reserve_crashkernel_low() into crash_core.c
>>   arm64: kdump: reserve crashkenel above 4G for crash dump kernel
>>   arm64: kdump: add memory for devices by DT property
>>     linux,usable-memory-range
>>   arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
>>   kdump: update Documentation about crashkernel on arm64
>>
>>  Documentation/admin-guide/kdump/kdump.rst     | 14 ++-
>>  .../admin-guide/kernel-parameters.txt         | 17 +++-
>>  arch/arm64/kernel/setup.c                     |  8 +-
>>  arch/arm64/mm/init.c                          | 74 ++++++++++++---
>>  arch/x86/kernel/setup.c                       | 66 ++------------
>>  include/linux/crash_core.h                    |  3 +
>>  include/linux/kexec.h                         |  2 -
>>  kernel/crash_core.c                           | 90 +++++++++++++++++++
>>  kernel/kexec_core.c                           | 17 ----
>>  9 files changed, 197 insertions(+), 94 deletions(-)
>>
>> --
>> 2.20.1
> Thanks for the v10.
>
> 1. This series still seems to be broken on arm64 boards such as Ampere
> and ThunderX2 (Marvell) because of the ZONE_DMA32-related OOM seen while
> booting the kdump kernel.
> Here are details about my environment:
>
> - Latest upstream Linus master branch (5.8.0-rc3) + your v10 patches.
> - Latest upstream kexec-tools + your v4 patch.
>
> # dmesg | grep -i crash
> [    0.000000] crashkernel reserved: 0x00000000ca000000 - 0x00000000ea000000 (512 MB)
> [    0.000000] Kernel command line: BOOT_IMAGE=(hd13,gpt2)/vmlinuz-5.8.0-rc3+ root=/dev/mapper/rhel_hpe--apache--cn99xx--09-root ro rd.lvm.lv=rhel_hpe-apache-cn99xx-09/root rd.lvm.lv=rhel_hpe-apache-cn99xx-09/swap crashkernel=512M
> [   58.917523]     crashkernel=512M
>
> 2. Here is the OOM crash seen while booting the kdump kernel:
>
> [    0.244724] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
> [    0.251859] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000188
> [    0.260737] Mem abort info:
> [    0.263553]   ESR = 0x96000006
> [    0.266632]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    0.271994]   SET = 0, FnV = 0
> [    0.275074]   EA = 0, S1PTW = 0
> [    0.278239] Data abort info:
> [    0.281141]   ISV = 0, ISS = 0x00000006
> [    0.285010]   CM = 0, WnR = 0
> [    0.288001] [0000000000000188] user address but active_mm is swapper
> [    0.294420] Internal error: Oops: 96000006 [#1] SMP
> [    0.299344] Modules linked in:
> [    0.302424] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3+ #8
> [    0.308753] Hardware name: HPE Apollo 70             /C01_APACHE_MB        , BIOS L50_5.13_1.11 06/18/2019
> [    0.318599] pstate: 00400009 (nzcv daif +PAN -UAO BTYPE=--)
> [    0.324228] pc : mem_cgroup_get_nr_swap_pages+0x2c/0x60
> [    0.329506] lr : shrink_lruvec+0x404/0x4f8
> [    0.333638] sp : fffffe0012b8f840
> [    0.336979] x29: fffffe0012b8f840 x28: fffffe00116b3000
> [    0.342343] x27: fffffe0012b8fb00 x26: 0000000000000020
> [    0.347707] x25: 0000000000000000 x24: fffffc0069fffe28
> [    0.353070] x23: 0000000000000000 x22: 0000000000000000
> [    0.358433] x21: 000000000000003c x20: fffffe0012b8fa98
> [    0.363796] x19: 0000000000000000 x18: 0000000000000010
> [    0.369159] x17: 00000000bd8afee8 x16: 000000001260aa76
> [    0.374523] x15: ffffffffffffffff x14: fffffe00116b3988
> [    0.379886] x13: fffffe0092b8faa7 x12: fffffe0012b8faaf
> [    0.385248] x11: fffffe00116f1000 x10: fffffe0012b8fa30
> [    0.390612] x9 : fffffe0010244ebc x8 : 0000000000000000
> [    0.395975] x7 : 0000000000000020 x6 : 00000000ffff8ae3
> [    0.401338] x5 : 0000000000000000 x4 : fffffc004da89000
> [    0.406701] x3 : 0000000000000000 x2 : 0000000000000000
> [    0.412064] x1 : fffffe00116bf000 x0 : 0000000000000000
> [    0.417427] Call trace:
> [    0.419891]  mem_cgroup_get_nr_swap_pages+0x2c/0x60
> [    0.424815]  shrink_node+0x1a8/0x688
> [    0.428420]  do_try_to_free_pages+0xe8/0x448
> [    0.432729]  try_to_free_pages+0x110/0x230
> [    0.436863]  __alloc_pages_slowpath.constprop.106+0x2b8/0xb48
> [    0.442666]  __alloc_pages_nodemask+0x2ac/0x2f8
> [    0.447239]  alloc_page_interleave+0x20/0x90
> [    0.451548]  alloc_pages_current+0xdc/0xf8
> [    0.455681]  atomic_pool_expand+0x60/0x210
> [    0.459817]  __dma_atomic_pool_init+0x50/0xa4
> [    0.464214]  dma_atomic_pool_init+0xac/0x158
> [    0.468522]  do_one_initcall+0x50/0x218
> [    0.472393]  kernel_init_freeable+0x22c/0x2d0
> [    0.476792]  kernel_init+0x18/0x110
> [    0.480310]  ret_from_fork+0x10/0x18
> [    0.483918] Code: 350001e3 d503201f f9450024 1400000a (f940c401)
> [    0.490074] ---[ end trace e5a9147af159e580 ]---
> [    0.494734] Kernel panic - not syncing: Fatal exception
> [    0.500010] Rebooting in 10 seconds..
>
> 3. Did you test your patch with a simple crashkernel=512M command line
> (without using the crashkernel hi/lo or crashkernel=X@Y format)?
>
> Anyway, since this implementation still needs rework, we can go ahead in
> the meantime with the arrangement of limiting the crashkernel allocation
> to the ZONE_DMA range (as I suggested in another patch series
> <http://lists.infradead.org/pipermail/kexec/2020-July/020777.html>).
> That ensures the upstream kernel can still support kdump on arm64 boards
> where it was working before the ZONE_DMA32 changes were introduced for
> arm64.
>
> Please let me know your views,
Thanks for testing and for sharing your views. I have no questions about
points 1 and 2 that you mentioned.

I clarify the issue in my patch 4 and suggest using parameters like
"crashkernel=X crashkernel=Y,low" if CONFIG_ZONE_DMA is enabled.
I also document this in patch 5.

I chose to address the issue on top of the "reserving crashkernel above 4G"
work, because then we only need to adjust the low memory limit instead of
limiting the whole crashkernel to ZONE_DMA.
Details: https://lkml.org/lkml/2020/7/3/64

But you are right, arm64 kdump has been broken for a long time, including
the issue you addressed in
"Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)".

I agree that we should make it work as soon as possible.

Ping James, Will,
any other comments about this patch series?

Thanks,
Chen Zhou
>
> Thanks,
> Bhupesh
>
>
> .
>




* Re: [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64
  2020-07-03  4:50     ` Dave Young
@ 2020-07-03  9:11       ` Dave Young
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Young @ 2020-07-03  9:11 UTC (permalink / raw)
  To: Chen Zhou
  Cc: tglx, mingo, bhe, catalin.marinas, will, james.morse, robh+dt,
	arnd, John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

On 07/03/20 at 12:50pm, Dave Young wrote:
> On 07/03/20 at 12:46pm, Dave Young wrote:
> > Hi,
> > 
> > Thanks for the update, but still some nitpicks :(
> > 
> > I'm sorry I did not catch them previously, but it is probably not worth
> > reposting the whole series if no other changes are needed.
> 
> Feel free to add my acks for the common kdump part:

Forgot to add "With those typos fixed":)

> 
> Acked-by: Dave Young <dyoung@redhat.com>
> 
> Thanks
> Dave



* Re: [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump
  2020-07-03  8:38   ` chenzhou
@ 2020-07-27 12:38     ` John Donnelly
  0 siblings, 0 replies; 18+ messages in thread
From: John Donnelly @ 2020-07-27 12:38 UTC (permalink / raw)
  To: chenzhou, Bhupesh Sharma
  Cc: Thomas Gleixner, Ingo Molnar, RuiRui Yang, Baoquan He,
	Catalin Marinas, Will Deacon, James Morse, Rob Herring,
	Arnd Bergmann, Prabhakar Kushwaha, nsaenzjulienne,
	Jonathan Corbet, Simon Horman, guohanjun, xiexiuqi, huawei.libin,
	Linux Kernel Mailing List, linux-arm-kernel, kexec mailing list,
	Linux Doc Mailing List


On 7/3/20 3:38 AM, chenzhou wrote:
> Hi Bhupesh,
>
>
> On 2020/7/3 15:26, Bhupesh Sharma wrote:
>> Hi Chen,
>>
>> On Fri, Jul 3, 2020 at 9:24 AM Chen Zhou <chenzhou10@huawei.com> wrote:
>>> This patch series enable reserving crashkernel above 4G in arm64.
>>>
>>> There are following issues in arm64 kdump:
>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
>>> when there is no enough low memory.
>>> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
>>> in this case, if swiotlb or DMA buffers are required, crash dump kernel
>>> will boot failure because there is no low memory available for allocation.
>>> 3. commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") broken
>>> the arm64 kdump. If the memory reserved for crash dump kernel falled in
>>> ZONE_DMA32, the devices in crash dump kernel need to use ZONE_DMA will alloc
>>> fail.
>>>
>>> To solve these issues, introduce crashkernel=X,low to reserve specified
>>> size low memory.
>>> Crashkernel=X tries to reserve memory for the crash dump kernel under
>>> 4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
>>> size low memory for crash kdump kernel devices firstly and then reserve
>>> memory above 4G.
>>>
>>> When crashkernel is reserved above 4G in memory and crashkernel=X,low
>>> is specified simultaneously, kernel should reserve specified size low memory
>>> for crash dump kernel devices. So there may be two crash kernel regions, one
>>> is below 4G, the other is above 4G.
>>> In order to distinct from the high region and make no effect to the use of
>>> kexec-tools, rename the low region as "Crash kernel (low)", and pass the
>>> low region by reusing DT property "linux,usable-memory-range". We made the low
>>> memory region as the last range of "linux,usable-memory-range" to keep
>>> compatibility with existing user-space and older kdump kernels.
>>>
>>> Besides, we need to modify kexec-tools:
>>> arm64: support more than one crash kernel regions(see [1])
>>>
>>> Another update is document about DT property 'linux,usable-memory-range':
>>> schemas: update 'linux,usable-memory-range' node schema(see [2])
>>>
>>> The previous changes and discussions can be retrieved from:
>>>
>>> Changes since [v9]
>>> - Patch 1 add Acked-by from Dave.
>>> - Update patch 5 according to Dave's comments.
>>> - Update chosen schema.
>>>
>>> Changes since [v8]
>>> - Reuse DT property "linux,usable-memory-range".
>>> Suggested by Rob, reuse DT property "linux,usable-memory-range" to pass the low
>>> memory region.
>>> - Fix kdump broken with ZONE_DMA reintroduced.
>>> - Update chosen schema.
>>>
>>> Changes since [v7]
>>> - Move x86 CRASH_ALIGN to 2M
>>> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
>>> - Update Documentation/devicetree/bindings/chosen.txt.
>>> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt
>>> suggested by Arnd.
>>> - Add Tested-by from Jhon and pk.
>>>
>>> Changes since [v6]
>>> - Fix build errors reported by kbuild test robot.
>>>
>>> Changes since [v5]
>>> - Move reserve_crashkernel_low() into kernel/crash_core.c.
>>> - Delete crashkernel=X,high.
>>> - Modify crashkernel=X,low.
>>> If crashkernel=X,low is specified simultaneously, reserve spcified size low
>>> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
>>> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
>>> pass to crash dump kernel by DT property "linux,low-memory-range".
>>> - Update Documentation/admin-guide/kdump/kdump.rst.
>>>
>>> Changes since [v4]
>>> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
>>>
>>> Changes since [v3]
>>> - Add memblock_cap_memory_ranges back for multiple ranges.
>>> - Fix some compiling warnings.
>>>
>>> Changes since [v2]
>>> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
>>> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
>>> patch.
>>>
>>> Changes since [v1]:
>>> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
>>> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
>>> in fdt_enforce_memory_region().
>>> There are at most two crash kernel regions, for two crash kernel regions
>>> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
>>> and then remove the memory range in the middle.
>>>
>>> [1]: https://urldefense.com/v3/__http://lists.infradead.org/pipermail/kexec/2020-June/020737.html__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su4V80IBu$
>>> [2]: https://urldefense.com/v3/__https://github.com/robherring/dt-schema/pull/19__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su3Exu3Pr$
>>> [v1]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/2/1174__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su_RTeG6n$
>>> [v2]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/86__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su3HI0hvE$
>>> [v3]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/306__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su-DmOkg5$
>>> [v4]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/15/273__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-SuykJijY2$
>>> [v5]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/5/6/1360__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su2YHe5UX$
>>> [v6]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/8/30/142__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su9HL5p7k$
>>> [v7]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/12/23/411__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su_mHOJs0$
>>> [v8]: https://urldefense.com/v3/__https://lkml.org/lkml/2020/5/21/213__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su7UYMTZJ$
>>> [v9]: https://urldefense.com/v3/__https://lkml.org/lkml/2020/6/28/73__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Suxcd0E6t$
>>>
>>> Chen Zhou (5):
>>>    x86: kdump: move reserve_crashkernel_low() into crash_core.c
>>>    arm64: kdump: reserve crashkenel above 4G for crash dump kernel
>>>    arm64: kdump: add memory for devices by DT property
>>>      linux,usable-memory-range
>>>    arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
>>>    kdump: update Documentation about crashkernel on arm64
>>>
>>>   Documentation/admin-guide/kdump/kdump.rst     | 14 ++-
>>>   .../admin-guide/kernel-parameters.txt         | 17 +++-
>>>   arch/arm64/kernel/setup.c                     |  8 +-
>>>   arch/arm64/mm/init.c                          | 74 ++++++++++++---
>>>   arch/x86/kernel/setup.c                       | 66 ++------------
>>>   include/linux/crash_core.h                    |  3 +
>>>   include/linux/kexec.h                         |  2 -
>>>   kernel/crash_core.c                           | 90 +++++++++++++++++++
>>>   kernel/kexec_core.c                           | 17 ----
>>>   9 files changed, 197 insertions(+), 94 deletions(-)
>>>
>>> --
>>> 2.20.1
>> Thanks for the v10.
>>
>> 1. Seems this series is still broken on arm64 boards like ampere and
>> ThunderX2 (marvell) because of the ZONE_DMA32 related OOM seen while
>> booting kdump kernel.
>> Here are details about my environment:
>>
>> - Latest upstream Linus master branch (5.8.0-rc3) + your v10 patches.
>> - Latest upstream kexec-tools + your v4 patch.
>>
>> # dmesg | grep -i crash
>> [    0.000000] crashkernel reserved: 0x00000000ca000000 -
>> 0x00000000ea000000 (512 MB)
>> [    0.000000] Kernel command line:
>> BOOT_IMAGE=(hd13,gpt2)/vmlinuz-5.8.0-rc3+
>> root=/dev/mapper/rhel_hpe--apache--cn99xx--09-root ro
>> rd.lvm.lv=rhel_hpe-apache-cn99xx-09/root
>> rd.lvm.lv=rhel_hpe-apache-cn99xx-09/swap crashkernel=512M
>> [   58.917523]     crashkernel=512M
>>
>> 2. Here is the OOM crash seen while booting the kdump kernel:
>>
>> [    0.244724] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
>> [    0.251859] Unable to handle kernel NULL pointer dereference at
>> virtual address 0000000000000188
>> [    0.260737] Mem abort info:
>> [    0.263553]   ESR = 0x96000006
>> [    0.266632]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [    0.271994]   SET = 0, FnV = 0
>> [    0.275074]   EA = 0, S1PTW = 0
>> [    0.278239] Data abort info:
>> [    0.281141]   ISV = 0, ISS = 0x00000006
>> [    0.285010]   CM = 0, WnR = 0
>> [    0.288001] [0000000000000188] user address but active_mm is swapper
>> [    0.294420] Internal error: Oops: 96000006 [#1] SMP
>> [    0.299344] Modules linked in:
>> [    0.302424] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3+ #8
>> [    0.308753] Hardware name: HPE Apollo 70             /C01_APACHE_MB
>>          , BIOS L50_5.13_1.11 06/18/2019
>> [    0.318599] pstate: 00400009 (nzcv daif +PAN -UAO BTYPE=--)
>> [    0.324228] pc : mem_cgroup_get_nr_swap_pages+0x2c/0x60
>> [    0.329506] lr : shrink_lruvec+0x404/0x4f8
>> [    0.333638] sp : fffffe0012b8f840
>> [    0.336979] x29: fffffe0012b8f840 x28: fffffe00116b3000
>> [    0.342343] x27: fffffe0012b8fb00 x26: 0000000000000020
>> [    0.347707] x25: 0000000000000000 x24: fffffc0069fffe28
>> [    0.353070] x23: 0000000000000000 x22: 0000000000000000
>> [    0.358433] x21: 000000000000003c x20: fffffe0012b8fa98
>> [    0.363796] x19: 0000000000000000 x18: 0000000000000010
>> [    0.369159] x17: 00000000bd8afee8 x16: 000000001260aa76
>> [    0.374523] x15: ffffffffffffffff x14: fffffe00116b3988
>> [    0.379886] x13: fffffe0092b8faa7 x12: fffffe0012b8faaf
>> [    0.385248] x11: fffffe00116f1000 x10: fffffe0012b8fa30
>> [    0.390612] x9 : fffffe0010244ebc x8 : 0000000000000000
>> [    0.395975] x7 : 0000000000000020 x6 : 00000000ffff8ae3
>> [    0.401338] x5 : 0000000000000000 x4 : fffffc004da89000
>> [    0.406701] x3 : 0000000000000000 x2 : 0000000000000000
>> [    0.412064] x1 : fffffe00116bf000 x0 : 0000000000000000
>> [    0.417427] Call trace:
>> [    0.419891]  mem_cgroup_get_nr_swap_pages+0x2c/0x60
>> [    0.424815]  shrink_node+0x1a8/0x688
>> [    0.428420]  do_try_to_free_pages+0xe8/0x448
>> [    0.432729]  try_to_free_pages+0x110/0x230
>> [    0.436863]  __alloc_pages_slowpath.constprop.106+0x2b8/0xb48
>> [    0.442666]  __alloc_pages_nodemask+0x2ac/0x2f8
>> [    0.447239]  alloc_page_interleave+0x20/0x90
>> [    0.451548]  alloc_pages_current+0xdc/0xf8
>> [    0.455681]  atomic_pool_expand+0x60/0x210
>> [    0.459817]  __dma_atomic_pool_init+0x50/0xa4
>> [    0.464214]  dma_atomic_pool_init+0xac/0x158
>> [    0.468522]  do_one_initcall+0x50/0x218
>> [    0.472393]  kernel_init_freeable+0x22c/0x2d0
>> [    0.476792]  kernel_init+0x18/0x110
>> [    0.480310]  ret_from_fork+0x10/0x18
>> [    0.483918] Code: 350001e3 d503201f f9450024 1400000a (f940c401)
>> [    0.490074] ---[ end trace e5a9147af159e580 ]---
>> [    0.494734] Kernel panic - not syncing: Fatal exception
>> [    0.500010] Rebooting in 10 seconds..
>>
>> 3. Did you test your patch with a simple crashkernel=512M command line
>> (without using the crashkernel hi/lo or crashkernel=X@Y format)?
>>
>> Anyway, since this implementation still needs rework, we can go ahead
>> with the arrangement of limiting the crashkernel allocation in
>> ZONE_DMA range (as I suggested in another patch series
>> <https://urldefense.com/v3/__http://lists.infradead.org/pipermail/kexec/2020-July/020777.html__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su56QERe_$ >) in
>> the meanwhile. to ensure the upstream kernel can still support kdump
>> on arm64 boards where it was working before the ZONE_DMA32 changes
>> were introduced for arm64.
>>
>> Please let me know your views,
> Thanks for your test and sharing your views. I have no questions about the 1 and 2 you mentioned.
>
> I charity the issue in my patch 4 and suggest to use the parameter like
> "crashkernel=X crashkernel=Y,low" if CONFIG_ZONE_DMA is enabled.
> I also document this in doc in patch 5.
>
> I choose to address the issue based on the  "reserving crashkernel above 4G",
> because we just need to adjust the low memory limit instead of limiting the
> whole crahshkernel to ZONE_DMA.
> details: https://urldefense.com/v3/__https://lkml.org/lkml/2020/7/3/64__;!!GqivPVa7Brio!LQeROomdhNOjTVFcQP6pLxDm9nhbEsY3vqZMI7NHeDU_VnCaN7iw2DJ84x-Su1vtGdek$
>
> But you are right, arm64 kdump is broken for long time, including the issue you addressed
> "Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)".
>
> I agree with you to make it work as soon as possible.
>
> Ping James, Will,
> any other comments about this patch series?
>
> Thanks,
> Chen Zhou
>

Hi James and Will,

This patch set has been in review for over a year, since May of 2019.
What is holding up getting it accepted?




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  2020-07-03  3:58 ` [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced Chen Zhou
@ 2020-07-27 17:30   ` Catalin Marinas
  2020-07-29  3:52     ` chenzhou
  0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2020-07-27 17:30 UTC (permalink / raw)
  To: Chen Zhou
  Cc: tglx, mingo, dyoung, bhe, will, james.morse, robh+dt, arnd,
	John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

On Fri, Jul 03, 2020 at 11:58:15AM +0800, Chen Zhou wrote:
> commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32")
> broken the arm64 kdump. If the memory reserved for crash dump kernel
> falled in ZONE_DMA32, the devices in crash dump kernel need to use
> ZONE_DMA will alloc fail.
> 
> This patch addressed the above issue based on "reserving crashkernel
> above 4G". Originally, we reserve low memory below 4G, and now just need
> to adjust memory limit to arm64_dma_phys_limit in reserve_crashkernel_low
> if ZONE_DMA is enabled. That is, if there are devices need to use ZONE_DMA
> in crash dump kernel, it is a good choice to use parameters
> "crashkernel=X crashkernel=Y,low".
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> ---
>  kernel/crash_core.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index a7580d291c37..e8ecbbc761a3 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -320,6 +320,7 @@ int __init reserve_crashkernel_low(void)
>  	unsigned long long base, low_base = 0, low_size = 0;
>  	unsigned long total_low_mem;
>  	int ret;
> +	phys_addr_t crash_max = 1ULL << 32;
>  
>  	total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
>  
> @@ -352,7 +353,11 @@ int __init reserve_crashkernel_low(void)
>  			return 0;
>  	}
>  
> -	low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
> +#ifdef CONFIG_ARM64
> +	if (IS_ENABLED(CONFIG_ZONE_DMA))
> +		crash_max = arm64_dma_phys_limit;
> +#endif
> +	low_base = memblock_find_in_range(0, crash_max, low_size, CRASH_ALIGN);
>  	if (!low_base) {
>  		pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
>  		       (unsigned long)(low_size >> 20));

Given the number of #ifdefs we end up with in this function, I think
it's better to simply copy the code to arch/arm64 and tailor it
accordingly.

Anyway, there are two series solving slightly different issues with
kdump reservations:

1. This series which relaxes the crashkernel= allocation to go anywhere
   in the accessible space while having a dedicated crashkernel=X,low
   option for ZONE_DMA.

2. Bhupesh's series [1] forcing crashkernel=X allocations only from
   ZONE_DMA.

For RPi4 support, we limited ZONE_DMA allocations to the 1st GB.
Existing crashkernel= uses may no longer work, depending on where the
allocation falls. Option (2) above is a quick fix assuming that the
crashkernel reservation is small enough. What's a typical crashkernel
option here? That series is probably more prone to reservation failures.

Option (1), i.e. this series, doesn't solve the problem raised by
Bhupesh unless one uses the crashkernel=X,low argument. It can actually
make it worse even for ZONE_DMA32 since the allocation can go above 4G
(assuming that we change the ZONE_DMA configuration to only limit it to
1GB on RPi4).

I'm more inclined to keep the crashkernel= behaviour to ZONE_DMA
allocations. If this is too small for typical kdump, we can look into
expanding ZONE_DMA to 4G on non-RPi4 hardware (we had patches on the
list). In addition, if Chen thinks allocations above 4G are still needed
or if RPi4 needs a sufficiently large crashkernel=, I'd rather have a
",high" option to explicitly require such access.

[1] http://lists.infradead.org/pipermail/kexec/2020-July/020777.html

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  2020-07-27 17:30   ` Catalin Marinas
@ 2020-07-29  3:52     ` chenzhou
  2020-07-29 11:58       ` Catalin Marinas
  0 siblings, 1 reply; 18+ messages in thread
From: chenzhou @ 2020-07-29  3:52 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: tglx, mingo, dyoung, bhe, will, james.morse, robh+dt, arnd,
	John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

Hi Catalin,


On 2020/7/28 1:30, Catalin Marinas wrote:
> On Fri, Jul 03, 2020 at 11:58:15AM +0800, Chen Zhou wrote:
>> commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32")
>> broken the arm64 kdump. If the memory reserved for crash dump kernel
>> falled in ZONE_DMA32, the devices in crash dump kernel need to use
>> ZONE_DMA will alloc fail.
>>
>> This patch addressed the above issue based on "reserving crashkernel
>> above 4G". Originally, we reserve low memory below 4G, and now just need
>> to adjust memory limit to arm64_dma_phys_limit in reserve_crashkernel_low
>> if ZONE_DMA is enabled. That is, if there are devices need to use ZONE_DMA
>> in crash dump kernel, it is a good choice to use parameters
>> "crashkernel=X crashkernel=Y,low".
>>
>> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
>> ---
>>  kernel/crash_core.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index a7580d291c37..e8ecbbc761a3 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -320,6 +320,7 @@ int __init reserve_crashkernel_low(void)
>>  	unsigned long long base, low_base = 0, low_size = 0;
>>  	unsigned long total_low_mem;
>>  	int ret;
>> +	phys_addr_t crash_max = 1ULL << 32;
>>  
>>  	total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
>>  
>> @@ -352,7 +353,11 @@ int __init reserve_crashkernel_low(void)
>>  			return 0;
>>  	}
>>  
>> -	low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
>> +#ifdef CONFIG_ARM64
>> +	if (IS_ENABLED(CONFIG_ZONE_DMA))
>> +		crash_max = arm64_dma_phys_limit;
>> +#endif
>> +	low_base = memblock_find_in_range(0, crash_max, low_size, CRASH_ALIGN);
>>  	if (!low_base) {
>>  		pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
>>  		       (unsigned long)(low_size >> 20));
> Given the number of #ifdefs we end up with in this function, I think
> it's better to simply copy to the code to arch/arm64 and tailor it
> accordingly.
>
> Anyway, there are two series solving slightly different issues with
> kdump reservations:
>
> 1. This series which relaxes the crashkernel= allocation to go anywhere
>    in the accessible space while having a dedicated crashkernel=X,low
>    option for ZONE_DMA.
>
> 2. Bhupesh's series [1] forcing crashkernel=X allocations only from
>    ZONE_DMA.
>
> For RPi4 support, we limited ZONE_DMA allocations to the 1st GB.
> Existing crashkernel= uses may no longer work, depending on where the
> allocation falls. Option (2) above is a quick fix assuming that the
> crashkernel reservation is small enough. What's a typical crashkernel
> option here? That series is probably more prone to reservation failures.
>
> Option (1), i.e. this series, doesn't solve the problem raised by
> Bhupesh unless one uses the crashkernel=X,low argument. It can actually
> make it worse even for ZONE_DMA32 since the allocation can go above 4G
> (assuming that we change the ZONE_DMA configuration to only limit it to
> 1GB on RPi4).
>
> I'm more inclined to keep the crashkernel= behaviour to ZONE_DMA
> allocations. If this is too small for typical kdump, we can look into
> expanding ZONE_DMA to 4G on non-RPi4 hardware (we had patches on the
> list). In addition, if Chen thinks allocations above 4G are still needed
> or if RPi4 needs a sufficiently large crashkernel=, I'd rather have a
> ",high" option to explicitly require such access.
Thanks for your reply and the thorough explanation.

On our ARM servers we need to reserve a large chunk for kdump (512M or 1G),
and there is not enough low memory. So we proposed this patch series,
"support reserving crashkernel above 4G on arm64 kdump", in April 2019.

In earlier versions I introduced the parameters "crashkernel=X,[high,low]", as x86_64 does.
As suggested by James, to simplify, we call reserve_crashkernel_low() at the beginning of
reserve_crashkernel() and then relax arm64_dma32_phys_limit if reserve_crashkernel_low()
allocated something.
That is, the parameter "crashkernel=X,low" alone is enough, and I deleted "crashkernel=X,high".

After ZONE_DMA was introduced in December 2019, the issue you describe above appeared.
In fact, we did not have an RPi4 machine. Originally, I suggested fixing this on top of this
patch series by using the dedicated option.

Given your clarification, there are other solutions for typical kdump. In that case,
"keep the crashkernel= behaviour to ZONE_DMA allocations" looks much better.

How about this:
1. For the ZONE_DMA issue, use Bhupesh's solution and keep the crashkernel= behaviour to ZONE_DMA allocations.
2. For this patch series, restrict reserve_crashkernel_low() to ZONE_DMA allocations.

Thanks,
Chen Zhou
> [1] http://lists.infradead.org/pipermail/kexec/2020-July/020777.html
>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  2020-07-29  3:52     ` chenzhou
@ 2020-07-29 11:58       ` Catalin Marinas
  2020-07-29 14:14         ` chenzhou
  0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2020-07-29 11:58 UTC (permalink / raw)
  To: chenzhou
  Cc: tglx, mingo, dyoung, bhe, will, james.morse, robh+dt, arnd,
	John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

Hi Chen,

On Wed, Jul 29, 2020 at 11:52:39AM +0800, chenzhou wrote:
> On 2020/7/28 1:30, Catalin Marinas wrote:
> > Anyway, there are two series solving slightly different issues with
> > kdump reservations:
> >
> > 1. This series which relaxes the crashkernel= allocation to go anywhere
> >    in the accessible space while having a dedicated crashkernel=X,low
> >    option for ZONE_DMA.
> >
> > 2. Bhupesh's series [1] forcing crashkernel=X allocations only from
> >    ZONE_DMA.
> >
> > For RPi4 support, we limited ZONE_DMA allocations to the 1st GB.
> > Existing crashkernel= uses may no longer work, depending on where the
> > allocation falls. Option (2) above is a quick fix assuming that the
> > crashkernel reservation is small enough. What's a typical crashkernel
> > option here? That series is probably more prone to reservation failures.
> >
> > Option (1), i.e. this series, doesn't solve the problem raised by
> > Bhupesh unless one uses the crashkernel=X,low argument. It can actually
> > make it worse even for ZONE_DMA32 since the allocation can go above 4G
> > (assuming that we change the ZONE_DMA configuration to only limit it to
> > 1GB on RPi4).
> >
> > I'm more inclined to keep the crashkernel= behaviour to ZONE_DMA
> > allocations. If this is too small for typical kdump, we can look into
> > expanding ZONE_DMA to 4G on non-RPi4 hardware (we had patches on the
> > list). In addition, if Chen thinks allocations above 4G are still needed
> > or if RPi4 needs a sufficiently large crashkernel=, I'd rather have a
> > ",high" option to explicitly require such access.
> 
> Thanks for your reply and exhaustive explanation.
> 
> In our ARM servers, we need to to reserve a large chunk for kdump(512M
> or 1G), there is no enough low memory. So we proposed this patch
> series "support reserving crashkernel above 4G on arm64 kdump" In
> April 2019.

Trying to go through the discussions last year, hopefully things get
clearer.

So prior to the ZONE_DMA change, you still couldn't reserve 1G in the
first 4GB? It shouldn't be sparsely populated during early boot.

> I introduce parameters "crashkernel=X,[high,low]" as x86_64 does in earlier versions.
> Suggested by James, to simplify, we call reserve_crashkernel_low() at the beginning of
> reserve_crashkernel() and then relax the arm64_dma32_phys_limit if reserve_crashkernel_low()
> allocated something.
> That is, just the parameter "crashkernel=X,low" is ok and i deleted "crashkernel=X,high".

The problem I see is that with your patches we diverge from x86
behaviour (and the arm64 behaviour prior to the ZONE_DMA reduction) as
we now require that crashkernel=X,low is always passed if you want
something in ZONE_DMA (and you do want, otherwise the crashdump kernel
fails to boot).

My main requirement is that crashkernel=X, without any suffix, still
works which I don't think is guaranteed with your patches (well,
ignoring RPi4 ZONE_DMA). Bhupesh's series is a quick fix but doesn't
solve your large allocation requirements (that may have worked prior to
the ZONE_DMA change).

> After the ZONE_DMA introduced in December 2019, the issue occurred as
> you said above. In fact, we didn't have RPi4 machine.

You don't even need to have a RPi4 machine, ZONE_DMA has been set to 1GB
unconditionally. And while we could move it back to 4GB on non-RPi4
hardware, I'd rather have a solution that fixes kdump for RPi4 as well.

> Originally, i suggested to fix this based on this patch series and
> used the dedicated option.
> 
> According to your clarify, for typical kdump, there are other
> solutions. In this case, "keep the crashkernel= behaviour to ZONE_DMA
> allocations" looks much better.
> 
> How about like this:
> 1. For ZONE_DMA issue, use Bhupesh's solution, keep the crashkernel=
>    behaviour to ZONE_DMA allocations.
> 2. For this patch series, make the reserve_crashkernel_low() to
>    ZONE_DMA allocations.

So you mean rebasing your series on top of Bhupesh's? I guess you can
combine the two, I really don't care which way as long as we fix both
issues and agree on the crashkernel= semantics. I think with some tweaks
we can go with your series alone.

IIUC from the x86 code (especially the part you #ifdef'ed out for
arm64), if ",low" is not passed (so just standard crashkernel=X), it
still allocates sufficient low memory for the swiotlb in ZONE_DMA. The
rest can go in a high region. Why can't we do something similar on
arm64? Of course, you can keep the ",low" argument for explicit
allocation but I don't want to mandate it.

So with an implicit ZONE_DMA allocation similar to the x86 one, we
probably don't need Bhupesh's series at all. In addition, we can limit
crashkernel= to the first 4G with a fall-back to high like x86 (not sure
if memblock_find_in_range() is guaranteed to search in ascending order).
I don't think we need an explicit ",high" annotation.

So with the above, just a crashkernel=1G gives you at least 256MB in
ZONE_DMA followed by the rest anywhere, with a preference for
ZONE_DMA32. This way we can also keep the reserve_crashkernel_low()
mostly intact from x86 (less #ifdef's).
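
A small user-space sketch of one reading of this policy follows (not
kernel code). take_block(), plan_crashkernel(), struct range and
struct crash_plan are hypothetical names, the 2M alignment and the
bottom-up first-fit search are assumptions, and the ZONE_DMA limit and
the low size are simply passed in by the caller. Whether the low part
should count against X is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_1M	(1ULL << 20)
#define SZ_4G	(1ULL << 32)

struct range { uint64_t start, end; };	/* free memory, [start, end) */
struct crash_plan { uint64_t low_base, base; };

/*
 * First-fit search for an aligned block inside the window [lo, hi).
 * On success the block is carved out of the free range and its base is
 * returned; 0 means failure, as in the "if (!low_base)" check of the
 * quoted patch. Memory in front of the block in that range is dropped,
 * which is good enough for a sketch.
 */
uint64_t take_block(struct range *free, size_t n, uint64_t lo, uint64_t hi,
		    uint64_t size, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t start = free[i].start > lo ? free[i].start : lo;
		uint64_t end = free[i].end < hi ? free[i].end : hi;
		uint64_t base = (start + align - 1) & ~(align - 1);

		if (base < end && end - base >= size) {
			free[i].start = base + size;
			return base;
		}
	}
	return 0;
}

/* Returns 0 on success, -1 if either region cannot be placed. */
int plan_crashkernel(struct range *free, size_t n, uint64_t size,
		     uint64_t low_size, uint64_t dma_limit,
		     struct crash_plan *plan)
{
	const uint64_t align = 2 * SZ_1M;	/* assumed alignment */

	/* 1. Implicit low region for swiotlb/DMA buffers, inside ZONE_DMA. */
	plan->low_base = take_block(free, n, 0, dma_limit, low_size, align);
	if (!plan->low_base)
		return -1;

	/* 2. Main region: prefer the first 4G ... */
	plan->base = take_block(free, n, 0, SZ_4G, size, align);
	/* ... and fall back to memory above 4G. */
	if (!plan->base)
		plan->base = take_block(free, n, SZ_4G, UINT64_MAX, size, align);

	return plan->base ? 0 : -1;
}
```

With free memory at 512M-1G, 2G-2.5G and 4G-8G, a ZONE_DMA limit of 1G
and a 256M low size, a 1G request ends up with the low region at
0x20000000 and the main region at 0x100000000.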

Do I miss anything?

-- 
Catalin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  2020-07-29 11:58       ` Catalin Marinas
@ 2020-07-29 14:14         ` chenzhou
  2020-07-29 15:20           ` Catalin Marinas
  0 siblings, 1 reply; 18+ messages in thread
From: chenzhou @ 2020-07-29 14:14 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: tglx, mingo, dyoung, bhe, will, james.morse, robh+dt, arnd,
	John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

Hi Catalin,

On 2020/7/29 19:58, Catalin Marinas wrote:
> Hi Chen,
>
> On Wed, Jul 29, 2020 at 11:52:39AM +0800, chenzhou wrote:
>> On 2020/7/28 1:30, Catalin Marinas wrote:
>>> Anyway, there are two series solving slightly different issues with
>>> kdump reservations:
>>>
>>> 1. This series which relaxes the crashkernel= allocation to go anywhere
>>>    in the accessible space while having a dedicated crashkernel=X,low
>>>    option for ZONE_DMA.
>>>
>>> 2. Bhupesh's series [1] forcing crashkernel=X allocations only from
>>>    ZONE_DMA.
>>>
>>> For RPi4 support, we limited ZONE_DMA allocations to the 1st GB.
>>> Existing crashkernel= uses may no longer work, depending on where the
>>> allocation falls. Option (2) above is a quick fix assuming that the
>>> crashkernel reservation is small enough. What's a typical crashkernel
>>> option here? That series is probably more prone to reservation failures.
>>>
>>> Option (1), i.e. this series, doesn't solve the problem raised by
>>> Bhupesh unless one uses the crashkernel=X,low argument. It can actually
>>> make it worse even for ZONE_DMA32 since the allocation can go above 4G
>>> (assuming that we change the ZONE_DMA configuration to only limit it to
>>> 1GB on RPi4).
>>>
>>> I'm more inclined to keep the crashkernel= behaviour to ZONE_DMA
>>> allocations. If this is too small for typical kdump, we can look into
>>> expanding ZONE_DMA to 4G on non-RPi4 hardware (we had patches on the
>>> list). In addition, if Chen thinks allocations above 4G are still needed
>>> or if RPi4 needs a sufficiently large crashkernel=, I'd rather have a
>>> ",high" option to explicitly require such access.
>> Thanks for your reply and exhaustive explanation.
>>
>> In our ARM servers, we need to to reserve a large chunk for kdump(512M
>> or 1G), there is no enough low memory. So we proposed this patch
>> series "support reserving crashkernel above 4G on arm64 kdump" In
>> April 2019.
> Trying to go through the discussions last year, hopefully things get
> clearer.
>
> So prior to the ZONE_DMA change, you still couldn't reserve 1G in the
> first 4GB? It shouldn't be sparsely populated during early boot.
Yes, even prior to the ZONE_DMA change we couldn't reserve 1G/512M in the first 4GB.
The memory reported by the BIOS may be split up by "reserved" entries,
like this:
...
2f126000-2fbfffff : reserved
2fc00000-396affff : System RAM
  30de8000-30de9fff : reserved
  30dec000-30decfff : reserved
  30df2000-30df2fff : reserved
  30e20000-30e4ffff : reserved
  39620000-3968ffff : reserved
396b0000-3974ffff : reserved
39750000-397affff : System RAM
397b0000-398fffff : reserved
39900000-3990ffff : System RAM
  39900000-3990ffff : reserved
...
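
As a rough illustration (a user-space sketch, not kernel code), the free
pieces of the excerpt above can be checked for an aligned block. The
names struct range, excerpt_free and find_fit() are made up for this
example, the 2M alignment is an assumption, and the entries elided by
"..." are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range { uint64_t start, end; };	/* [start, end), end exclusive */

/* "System RAM" from the excerpt with the nested "reserved" holes removed. */
const struct range excerpt_free[] = {
	{ 0x2fc00000, 0x30de8000 },
	{ 0x30dea000, 0x30dec000 },
	{ 0x30ded000, 0x30df2000 },
	{ 0x30df3000, 0x30e20000 },
	{ 0x30e50000, 0x39620000 },
	{ 0x39690000, 0x396b0000 },
	{ 0x39750000, 0x397b0000 },
	/* 39900000-3990ffff is fully covered by a reserved entry */
};

/*
 * Return the base of the first aligned block of "size" bytes that fits
 * into one of the free ranges, or 0 if no range is large enough.
 */
uint64_t find_fit(const struct range *r, size_t n, uint64_t size,
		  uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t base = (r[i].start + align - 1) & ~(align - 1);

		if (base < r[i].end && r[i].end - base >= size)
			return base;
	}
	return 0;
}
```

For these ranges a 512M request with 2M alignment returns 0, because the
largest free piece in the excerpt is roughly 136M; a 128M request would
still fit at 0x31000000.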
>
>> I introduce parameters "crashkernel=X,[high,low]" as x86_64 does in earlier versions.
>> Suggested by James, to simplify, we call reserve_crashkernel_low() at the beginning of
>> reserve_crashkernel() and then relax the arm64_dma32_phys_limit if reserve_crashkernel_low()
>> allocated something.
>> That is, just the parameter "crashkernel=X,low" is ok and i deleted "crashkernel=X,high".
> The problem I see is that with your patches we diverge from x86
> behaviour (and the arm64 behaviour prior to the ZONE_DMA reduction) as
> we now require that crashkernel=X,low is always passed if you want
> something in ZONE_DMA (and you do want, otherwise the crashdump kernel
> fails to boot).
>
> My main requirement is that crashkernel=X, without any suffix, still
> works which I don't think is guaranteed with your patches (well,
> ignoring RPi4 ZONE_DMA). Bhupesh's series is a quick fix but doesn't
> solve your large allocation requirements (that may have worked prior to
> the ZONE_DMA change).
The main purpose of this series is to meet the large allocation requirement.
Before ZONE_DMA was introduced, both the original crashkernel=X and the large allocation
with my patches worked well.

With ZONE_DMA, the crash dump kernel may fail to boot with either the original
crashkernel=X or the large allocation with my patches. Both need to take ZONE_DMA into account.

>
>> After the ZONE_DMA introduced in December 2019, the issue occurred as
>> you said above. In fact, we didn't have RPi4 machine.
> You don't even need to have a RPi4 machine, ZONE_DMA has been set to 1GB
> unconditionally. And while we could move it back to 4GB on non-RPi4
> hardware, I'd rather have a solution that fixes kdump for RPi4 as well.
>
>> Originally, i suggested to fix this based on this patch series and
>> used the dedicated option.
>>
>> According to your clarify, for typical kdump, there are other
>> solutions. In this case, "keep the crashkernel= behaviour to ZONE_DMA
>> allocations" looks much better.
>>
>> How about like this:
>> 1. For ZONE_DMA issue, use Bhupesh's solution, keep the crashkernel=
>>    behaviour to ZONE_DMA allocations.
>> 2. For this patch series, make the reserve_crashkernel_low() to
>>    ZONE_DMA allocations.
> So you mean rebasing your series on top of Bhupesh's? I guess you can
> combine the two, I really don't care which way as long as we fix both
> issues and agree on the crashkernel= semantics. I think with some tweaks
> we can go with your series alone.
>
> IIUC from the x86 code (especially the part you #ifdef'ed out for
> arm64), if ",low" is not passed (so just standard crashkernel=X), it
> still allocates sufficient low memory for the swiotlb in ZONE_DMA. The
> rest can go in a high region. Why can't we do something similar on
> arm64? Of course, you can keep the ",low" argument for explicit
> allocation but I don't want to mandate it.
It is a good idea to combine the two.

For the crashkernel=X parameter, we do the following:
1. allocate some low memory in ZONE_DMA (or ZONE_DMA32 if CONFIG_ZONE_DMA=n)
2. allocate memory of size X in a high region

The ",low" argument can be used to specify the size of the low memory.

Do I understand correctly?
>
> So with an implicit ZONE_DMA allocation similar to the x86 one, we
> probably don't need Bhupesh's series at all. In addition, we can limit
> crashkernel= to the first 4G with a fall-back to high like x86 (not sure
> if memblock_find_in_range() is guaranteed to search in ascending order).
> I don't think we need an explicit ",high" annotation.
>
> So with the above, just a crashkernel=1G gives you at least 256MB in
> ZONE_DMA followed by the rest anywhere, with a preference for
> ZONE_DMA32. This way we can also keep the reserve_crashkernel_low()
> mostly intact from x86 (less #ifdef's).
>
> Do I miss anything?
Yes. We can let crashkernel=X try to reserve low memory and fall back to high memory
if it fails to find a low range.

As for reserve_crashkernel_low(), if we put it in arch/arm64, it will share some common
code with x86_64. Any suggestions about this?

Thanks,
Chen Zhou
>




* Re: [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  2020-07-29 14:14         ` chenzhou
@ 2020-07-29 15:20           ` Catalin Marinas
  2020-07-30  8:22             ` chenzhou
  0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2020-07-29 15:20 UTC (permalink / raw)
  To: chenzhou
  Cc: tglx, mingo, dyoung, bhe, will, james.morse, robh+dt, arnd,
	John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

On Wed, Jul 29, 2020 at 10:14:32PM +0800, chenzhou wrote:
> On 2020/7/29 19:58, Catalin Marinas wrote:
> > On Wed, Jul 29, 2020 at 11:52:39AM +0800, chenzhou wrote:
> >> How about like this:
> >> 1. For ZONE_DMA issue, use Bhupesh's solution, keep the crashkernel=
> >>    behaviour to ZONE_DMA allocations.
> >> 2. For this patch series, make the reserve_crashkernel_low() to
> >>    ZONE_DMA allocations.
> > 
> > So you mean rebasing your series on top of Bhupesh's? I guess you can
> > combine the two, I really don't care which way as long as we fix both
> > issues and agree on the crashkernel= semantics. I think with some tweaks
> > we can go with your series alone.
> >
> > IIUC from the x86 code (especially the part you #ifdef'ed out for
> > arm64), if ",low" is not passed (so just standard crashkernel=X), it
> > still allocates sufficient low memory for the swiotlb in ZONE_DMA. The
> > rest can go in a high region. Why can't we do something similar on
> > arm64? Of course, you can keep the ",low" argument for explicit
> > allocation but I don't want to mandate it.
> 
> It is a good idea to combine the two.
> 
> For parameter crashkernel=X, we do like this:
> 1. allocate some low memory in ZONE_DMA(or ZONE_DMA32 if CONFIG_ZONE_DMA=n)
> 2. allocate X size memory in a high region
> 
> ",low" argument can be used to specify the low memory.
> 
> Do i understand correctly?

Yes, although we could follow the x86 approach:

1. Try low (ZONE_DMA for arm64) allocation, fall back to high allocation
   if it fails.

2. If crash_base is outside ZONE_DMA, call reserve_crashkernel_low()
   which either honours the ,low option or allocates some small amount
   in ZONE_DMA.

If at some point we have platforms failing step 2, we'll look at
changing ZONE_DMA to the full 4GB on non-RPi4 platforms.

It looks to me like x86 ignores the ,low option if the first step
managed to get some low memory. Shall we do the same on arm64?
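
As a rough illustration of the two steps above (not the actual kernel
code), the decision could be sketched like this. The names are invented
for the example, "low_free" stands in for whatever the real allocator can
find in ZONE_DMA, the 256MB default is an assumption borrowed from the
figure mentioned elsewhere in this thread, and the sketch assumes the
answer to the last question is "yes" (",low" is ignored when step 1
succeeds).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: implicit low reservation of 256MB. */
#define DEFAULT_LOW_SIZE	(256ULL << 20)

/* Hypothetical result type describing what would be reserved. */
struct crash_plan {
	bool main_in_low;	 /* main region ended up in ZONE_DMA */
	uint64_t main_size;	 /* size of the main crashkernel region */
	uint64_t extra_low_size; /* additional ZONE_DMA region, 0 if none */
};

/*
 * @size:         X from crashkernel=X
 * @low_free:     largest contiguous free block in ZONE_DMA (simplified)
 * @has_low_opt:  crashkernel=Y,low was passed
 * @low_opt_size: Y from crashkernel=Y,low
 */
static struct crash_plan plan_crashkernel(uint64_t size, uint64_t low_free,
					  bool has_low_opt,
					  uint64_t low_opt_size)
{
	struct crash_plan p = { false, size, 0 };

	/* Step 1: try the low allocation first. */
	if (size <= low_free) {
		p.main_in_low = true;
		return p;	/* ",low" is ignored here, as on x86 */
	}

	/* Step 1 failed: the main region falls back to high memory. */
	/* Step 2: honour ",low" or reserve a small default amount. */
	p.extra_low_size = has_low_opt ? low_opt_size : DEFAULT_LOW_SIZE;
	return p;
}
```

With 1GB free in ZONE_DMA, crashkernel=512M stays low with no extra
region, while crashkernel=2G goes high and gets an extra low region of
either the ",low" size or the assumed default.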

> > So with an implicit ZONE_DMA allocation similar to the x86 one, we
> > probably don't need Bhupesh's series at all. In addition, we can limit
> > crashkernel= to the first 4G with a fall-back to high like x86 (not sure
> > if memblock_find_in_range() is guaranteed to search in ascending order).
> > I don't think we need an explicit ",high" annotation.
> >
> > So with the above, just a crashkernel=1G gives you at least 256MB in
> > ZONE_DMA followed by the rest anywhere, with a preference for
> > ZONE_DMA32. This way we can also keep the reserve_crashkernel_low()
> > mostly intact from x86 (less #ifdef's).
> 
> Yes. We can let crashkernel=X  try to reserve low memory and fall back to use high memory
> if failing to find a low range.

The only question is whether we need to preserve some more ZONE_DMA on
the current system. If for example we pass a crashkernel=512M and some
cma=, we may end up with very little free memory in ZONE_DMA. That's
mostly an issue for RPi4 since other platforms would work with
ZONE_DMA32. We could add a threshold and go for high allocation directly
if the required size is too large.

> About the function reserve_crashkernel_low(), if we put it in arch/arm64, there is some common
> code with x86_64. Some suggestions about this?

If we can use this function almost intact, just move it to a common
place. But if it gets sprinkled with #ifdef CONFIG_ARM64, I'd rather
duplicate it. I'd still prefer to move it to a common place if possible.

You can go a step further and also move the x86 reserve_crashkernel() to
common code. I don't think there is a significant difference between arm64
and x86 here. You'd have to define arch-specific
CRASH_ADDR_LOW_MAX etc.

Also patches moving code should not have any functional change. The
CRASH_ALIGN change from 16M to 2M on x86 should be a separate patch as
it needs to be acked by the x86 maintainers (IIRC, Ingo only acked the
function move if there was no functional change; CRASH_ALIGN is used for
the start address, not just alignment, on x86).

-- 
Catalin


* Re: [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced
  2020-07-29 15:20           ` Catalin Marinas
@ 2020-07-30  8:22             ` chenzhou
  0 siblings, 0 replies; 18+ messages in thread
From: chenzhou @ 2020-07-30  8:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: tglx, mingo, dyoung, bhe, will, james.morse, robh+dt, arnd,
	John.P.donnelly, prabhakar.pkin, nsaenzjulienne, corbet,
	bhsharma, horms, guohanjun, xiexiuqi, huawei.libin, linux-kernel,
	linux-arm-kernel, kexec, linux-doc

Hi Catalin,


On 2020/7/29 23:20, Catalin Marinas wrote:
> On Wed, Jul 29, 2020 at 10:14:32PM +0800, chenzhou wrote:
>> On 2020/7/29 19:58, Catalin Marinas wrote:
>>> On Wed, Jul 29, 2020 at 11:52:39AM +0800, chenzhou wrote:
>>>> How about like this:
>>>> 1. For ZONE_DMA issue, use Bhupesh's solution, keep the crashkernel=
>>>>    behaviour to ZONE_DMA allocations.
>>>> 2. For this patch series, make the reserve_crashkernel_low() to
>>>>    ZONE_DMA allocations.
>>> So you mean rebasing your series on top of Bhupesh's? I guess you can
>>> combine the two, I really don't care which way as long as we fix both
>>> issues and agree on the crashkernel= semantics. I think with some tweaks
>>> we can go with your series alone.
>>>
>>> IIUC from the x86 code (especially the part you #ifdef'ed out for
>>> arm64), if ",low" is not passed (so just standard crashkernel=X), it
>>> still allocates sufficient low memory for the swiotlb in ZONE_DMA. The
>>> rest can go in a high region. Why can't we do something similar on
>>> arm64? Of course, you can keep the ",low" argument for explicit
>>> allocation but I don't want to mandate it.
>> It is a good idea to combine the two.
>>
>> For parameter crashkernel=X, we do like this:
>> 1. allocate some low memory in ZONE_DMA(or ZONE_DMA32 if CONFIG_ZONE_DMA=n)
>> 2. allocate X size memory in a high region
>>
>> ",low" argument can be used to specify the low memory.
>>
>> Do i understand correctly?
> Yes, although we could follow the x86 approach:
>
> 1. Try low (ZONE_DMA for arm64) allocation, fallback to high allocation
>    if it fails.
>
> 2. If crash_base is outside ZONE_DMA, call reserve_crashkernel_low()
>    which either honours the ,low option or allocates some small amount
>    in ZONE_DMA.
>
> If at some point we have platforms failing step 2, we'll look at
> changing ZONE_DMA to the full 4GB on non-RPi4 platforms.
>
> It looks to me like x86 ignores the ,low option if the first step
> managed to get some low memory. Shall we do the same on arm64?
Yes, we could do it this way.
>
>>> So with an implicit ZONE_DMA allocation similar to the x86 one, we
>>> probably don't need Bhupesh's series at all. In addition, we can limit
>>> crashkernel= to the first 4G with a fall-back to high like x86 (not sure
>>> if memblock_find_in_range() is guaranteed to search in ascending order).
>>> I don't think we need an explicit ",high" annotation.
>>>
>>> So with the above, just a crashkernel=1G gives you at least 256MB in
>>> ZONE_DMA followed by the rest anywhere, with a preference for
>>> ZONE_DMA32. This way we can also keep the reserve_crashkernel_low()
>>> mostly intact from x86 (less #ifdef's).
>> Yes. We can let crashkernel=X  try to reserve low memory and fall back to use high memory
>> if failing to find a low range.
> The only question is whether we need to preserve some more ZONE_DMA on
> the current system. If for example we pass a crashkernel=512M and some
> cma=, we may end up with very little free memory in ZONE_DMA. That's
> mostly an issue for RPi4 since other platforms would work with
> ZONE_DMA32. We could add a threshold and go for high allocation directly
> if the required size is too large.
OK. I will consider the threshold in the next version and set it to 1/2 or 1/3 of ZONE_DMA.
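
A minimal sketch of such a check, only to illustrate the idea: the
function name and the "divisor" parameter are hypothetical, and the real
patch may compute the limit differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a crashkernel request of @crash_size bytes should skip
 * the ZONE_DMA attempt and go straight to a high allocation.
 *
 * @zone_dma_size: size of ZONE_DMA in bytes (1GB in the case discussed)
 * @divisor:       2 or 3 for the 1/2 or 1/3 threshold; 0 disables the check
 */
static bool crash_size_exceeds_threshold(uint64_t crash_size,
					 uint64_t zone_dma_size,
					 unsigned int divisor)
{
	if (divisor == 0)
		return false;
	return crash_size > zone_dma_size / divisor;
}
```

With a 1GB ZONE_DMA, crashkernel=512M is still tried low under the 1/2
threshold but goes directly to high memory under the 1/3 threshold.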
>
>> About the function reserve_crashkernel_low(), if we put it in arch/arm64, there is some common
>> code with x86_64. Some suggestions about this?
> If we can use this function almost intact, just move it in a common
> place. But if it gets sprinkled with #ifdef CONFIG_ARM64, I'd rather
> duplicate it. I'd still prefer to move it to a common place if possible.
>
> You can go a step further and also move the x86 reserve_crashkernel() to
> common code. I don't think there a significant difference between arm64
> and x86 here. You'd have to define arch-specific specific
> CRASH_ADDR_LOW_MAX etc.
I will take these into account and send the next version soon.
>
> Also patches moving code should not have any functional change. The
> CRASH_ALIGN change from 16M to 2M on x86 should be a separate patch as
> it needs to be acked by the x86 maintainers (IIRC, Ingo only acked the
> function move if there was no functional change; CRASH_ALIGN is used for
> the start address, not just alignment, on x86).
>
Thanks,
Chen Zhou



end of thread, other threads:[~2020-07-30  8:22 UTC | newest]

Thread overview: 18+ messages
2020-07-03  3:58 [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
2020-07-03  3:58 ` [PATCH v10 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c Chen Zhou
2020-07-03  3:58 ` [PATCH v10 2/5] arm64: kdump: reserve crashkenel above 4G for crash dump kernel Chen Zhou
2020-07-03  3:58 ` [PATCH v10 3/5] arm64: kdump: add memory for devices by DT property linux,usable-memory-range Chen Zhou
2020-07-03  3:58 ` [PATCH v10 4/5] arm64: kdump: fix kdump broken with ZONE_DMA reintroduced Chen Zhou
2020-07-27 17:30   ` Catalin Marinas
2020-07-29  3:52     ` chenzhou
2020-07-29 11:58       ` Catalin Marinas
2020-07-29 14:14         ` chenzhou
2020-07-29 15:20           ` Catalin Marinas
2020-07-30  8:22             ` chenzhou
2020-07-03  3:58 ` [PATCH v10 5/5] kdump: update Documentation about crashkernel on arm64 Chen Zhou
2020-07-03  4:46   ` Dave Young
2020-07-03  4:50     ` Dave Young
2020-07-03  9:11       ` Dave Young
2020-07-03  7:26 ` [PATCH v10 0/5] support reserving crashkernel above 4G on arm64 kdump Bhupesh Sharma
2020-07-03  8:38   ` chenzhou
2020-07-27 12:38     ` John Donnelly
