linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v2 0/5] arm64: allow to reserve memory for normal kexec kernel
@ 2019-07-09 18:20 Pavel Tatashin
  2019-07-09 18:20 ` [v2 1/5] kexec: quiet down kexec reboot Pavel Tatashin
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-09 18:20 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

Changelog
v1 - v2
	- No changes to patches, addressed suggestion from James Morse
	  to add "arm64" tag to cover letter.
	- Improved cover letter information based on discussion.

Currently, it is only allowed to reserve memory for crash kernel, because
it is a requirement in order to be able to boot into crash kernel without
touching memory of crashed kernel is to have memory reserved.

The second benefit for having memory reserved for kexec kernel is
that it does not require a relocation after segments are loaded into
memory.

If kexec functionality is used for a fast system update, with a minimal
downtime, the relocation of kernel + initramfs might take a significant
portion of reboot.

In fact, on the machine that we are using, that has ARM64 processor
it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
time:

kernel shutdown	0.03s
relocation	0.35s
kernel startup	0.29s

Image: 13M and initramfs is 24M. If initramfs increases, the relocation
time increases proportionally.

While, it is possible to add 'kexeckernel=' parameters support to other
architectures by modifying reserve_crashkernel(), in this series this is
done for arm64 only.

The reason it is so slow on arm64 to relocate kernel is because the code
that does relocation does this with MMU disabled, and thus D-Cache and
I-Cache must also be disabled.

Alternative solution is more complicated: Setup a temporary page table
for relocation_routine and also for code from cpu_soft_restart. Perform
relocation with MMU enabled, do cpu_soft_restart where MMU and caching
are disabled, jump to purgatory. A similar approach was suggested for
purgatory and was rejected due to making purgatory too complicated.
On, the other hand hibernate does something similar already, but there
MMU never needs to be disabled, and also by the time machine_kexec()
is called, allocator is not available, as we can't fail to do reboot,
so page table must be pre-allocated during kernel load time.

Note: the above time is relocation time only. Purgatory usually also
computes checksum, but that is skipped, because --no-check is used when
kernel image is loaded via kexec.

Pavel Tatashin (5):
  kexec: quiet down kexec reboot
  kexec: add resource for normal kexec region
  kexec: export common crashkernel/kexeckernel parser
  kexec: use reserved memory for normal kexec reboot
  arm64, kexec: reserve kexeckernel region

 .../admin-guide/kernel-parameters.txt         |  7 ++
 arch/arm64/kernel/setup.c                     |  5 ++
 arch/arm64/mm/init.c                          | 83 ++++++++++++-------
 include/linux/crash_core.h                    |  6 ++
 include/linux/ioport.h                        |  1 +
 include/linux/kexec.h                         |  6 +-
 kernel/crash_core.c                           | 27 +++---
 kernel/kexec_core.c                           | 50 +++++++----
 8 files changed, 127 insertions(+), 58 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [v2 1/5] kexec: quiet down kexec reboot
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
@ 2019-07-09 18:20 ` Pavel Tatashin
  2019-07-09 18:20 ` [v2 2/5] kexec: add resource for normal kexec region Pavel Tatashin
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-09 18:20 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

Here is a regular kexec command sequence and output:
=====
$ kexec --reuse-cmdline -i --load Image
$ kexec -e
[  161.342002] kexec_core: Starting new kernel

Welcome to Buildroot
buildroot login:
=====

Even when "quiet" kernel parameter is specified, "kexec_core: Starting
new kernel" is printed.

This message has  KERN_EMERG level, but there is no emergency, it is a
normal kexec operation, so quiet it down to appropriate KERN_NOTICE.

Machines that have slow console baud rate benefit from less output.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
---
 kernel/kexec_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index d5870723b8ad..2c5b72863b7b 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1169,7 +1169,7 @@ int kernel_kexec(void)
 		 * CPU hotplug again; so re-enable it here.
 		 */
 		cpu_hotplug_enable();
-		pr_emerg("Starting new kernel\n");
+		pr_notice("Starting new kernel\n");
 		machine_shutdown();
 	}
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [v2 2/5] kexec: add resource for normal kexec region
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
  2019-07-09 18:20 ` [v2 1/5] kexec: quiet down kexec reboot Pavel Tatashin
@ 2019-07-09 18:20 ` Pavel Tatashin
  2019-07-09 18:20 ` [v2 3/5] kexec: export common crashkernel/kexeckernel parser Pavel Tatashin
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-09 18:20 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

crashk_res resource is used to reserve memory for crash kernel. There is
also, however, a benefit to reserve memory for normal kernel to speed up
reboot performance. This is because during regular kexec reboot, kernel
performs relocations to the final destination of the loaded segments, and
the relocation might take a long time especially if initramfs is big.

Therefore, similarly to crashk_res, add kexeck_res that will be used to
reserve memory for normal kexec kernel.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 include/linux/ioport.h | 1 +
 include/linux/kexec.h  | 6 ++++--
 kernel/kexec_core.c    | 9 +++++++++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index da0ebaec25f0..3b18a3c112f3 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -133,6 +133,7 @@ enum {
 	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
 	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
 	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
+	IORES_DESC_KEXEC_KERNEL			= 8,
 };
 
 /* helpers to define resources */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index b9b1bc5f9669..4c1121b385fb 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -303,12 +303,14 @@ extern int kexec_load_disabled;
 #define KEXEC_FILE_FLAGS	(KEXEC_FILE_UNLOAD | KEXEC_FILE_ON_CRASH | \
 				 KEXEC_FILE_NO_INITRAMFS)
 
-/* Location of a reserved region to hold the crash kernel.
- */
+/* Location of a reserved region to hold the crash kernel. */
 extern struct resource crashk_res;
 extern struct resource crashk_low_res;
 extern note_buf_t __percpu *crash_notes;
 
+/* Location of a reserved region to hold normal kexec kernel. */
+extern struct resource kexeck_res;
+
 /* flag to track if kexec reboot is in progress */
 extern bool kexec_in_progress;
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 2c5b72863b7b..932feadbeb3a 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -70,6 +70,15 @@ struct resource crashk_low_res = {
 	.desc  = IORES_DESC_CRASH_KERNEL
 };
 
+/* Location of the reserved area for the normal kexec kernel */
+struct resource kexeck_res = {
+	.name  = "Kexec kernel",
+	.start = 0,
+	.end   = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+	.desc  = IORES_DESC_KEXEC_KERNEL
+};
+
 int kexec_should_crash(struct task_struct *p)
 {
 	/*
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [v2 3/5] kexec: export common crashkernel/kexeckernel parser
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
  2019-07-09 18:20 ` [v2 1/5] kexec: quiet down kexec reboot Pavel Tatashin
  2019-07-09 18:20 ` [v2 2/5] kexec: add resource for normal kexec region Pavel Tatashin
@ 2019-07-09 18:20 ` Pavel Tatashin
  2019-07-09 18:20 ` [v2 4/5] kexec: use reserved memory for normal kexec reboot Pavel Tatashin
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-09 18:20 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

To reserve memory for normal kexec reboot, the new parameter:
kexeckernel=size[KMG][@offset[KMG]] is used. Its syntax is the
same as craskernel=, therefore they can use the same function to
parse parameter settings.

Rename: __parse_crashkernel() to parse_crash_or_kexec_kernel(), and
make it public.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 .../admin-guide/kernel-parameters.txt         |  7 +++++
 include/linux/crash_core.h                    |  6 +++++
 kernel/crash_core.c                           | 27 ++++++++++---------
 3 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5c7a0f5b0a2f..0f5ce665c7f5 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -739,6 +739,13 @@
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
 
+	kexeckernel=size[KMG][@offset[KMG]]
+			[KNL] Using kexec, Linux can reboot to a new kernel.
+			This parameter reserves the physical memory region
+			[offset, offset + size] for that kernel. If '@offset' is
+			omitted, then a suitable offset is selected
+			automatically.
+
 	cryptomgr.notests
 			[KNL] Disable crypto self-tests
 
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 525510a9f965..e90789ff0bec 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -74,5 +74,11 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
 int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
+int parse_crash_or_kexec_kernel(char *cmdline,
+				unsigned long long system_ram,
+				unsigned long long *crash_size,
+				unsigned long long *crash_base,
+				const char *name,
+				const char *suffix);
 
 #endif /* LINUX_CRASH_CORE_H */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..11e0f9837a32 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -224,12 +224,12 @@ static __init char *get_last_crashkernel(char *cmdline,
 	return ck_cmdline;
 }
 
-static int __init __parse_crashkernel(char *cmdline,
-			     unsigned long long system_ram,
-			     unsigned long long *crash_size,
-			     unsigned long long *crash_base,
-			     const char *name,
-			     const char *suffix)
+int __init parse_crash_or_kexec_kernel(char *cmdline,
+				       unsigned long long system_ram,
+				       unsigned long long *crash_size,
+				       unsigned long long *crash_base,
+				       const char *name,
+				       const char *suffix)
 {
 	char	*first_colon, *first_space;
 	char	*ck_cmdline;
@@ -270,8 +270,9 @@ int __init parse_crashkernel(char *cmdline,
 			     unsigned long long *crash_size,
 			     unsigned long long *crash_base)
 {
-	return __parse_crashkernel(cmdline, system_ram, crash_size, crash_base,
-					"crashkernel=", NULL);
+	return parse_crash_or_kexec_kernel(cmdline, system_ram, crash_size,
+					   crash_base, "crashkernel=",
+					   NULL);
 }
 
 int __init parse_crashkernel_high(char *cmdline,
@@ -279,8 +280,9 @@ int __init parse_crashkernel_high(char *cmdline,
 			     unsigned long long *crash_size,
 			     unsigned long long *crash_base)
 {
-	return __parse_crashkernel(cmdline, system_ram, crash_size, crash_base,
-				"crashkernel=", suffix_tbl[SUFFIX_HIGH]);
+	return parse_crash_or_kexec_kernel(cmdline, system_ram, crash_size,
+					   crash_base, "crashkernel=",
+					   suffix_tbl[SUFFIX_HIGH]);
 }
 
 int __init parse_crashkernel_low(char *cmdline,
@@ -288,8 +290,9 @@ int __init parse_crashkernel_low(char *cmdline,
 			     unsigned long long *crash_size,
 			     unsigned long long *crash_base)
 {
-	return __parse_crashkernel(cmdline, system_ram, crash_size, crash_base,
-				"crashkernel=", suffix_tbl[SUFFIX_LOW]);
+	return parse_crash_or_kexec_kernel(cmdline, system_ram, crash_size,
+					   crash_base, "crashkernel=",
+					   suffix_tbl[SUFFIX_LOW]);
 }
 
 Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [v2 4/5] kexec: use reserved memory for normal kexec reboot
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (2 preceding siblings ...)
  2019-07-09 18:20 ` [v2 3/5] kexec: export common crashkernel/kexeckernel parser Pavel Tatashin
@ 2019-07-09 18:20 ` Pavel Tatashin
  2019-07-09 18:20 ` [v2 5/5] arm64, kexec: reserve kexeckernel region Pavel Tatashin
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-09 18:20 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

If memory was reserved for the given segment use it directly instead of
allocating on per-page bases. This will avoid relocating this segment to
final destination when machine is rebooted.

This is done on a per segment bases because user might decide to always
load kernel segments at the given address (i.e. non-relocatable kernel),
but load initramfs at reserved address, and thus save reboot time on
copying initramfs if it is large, and reduces reboot performance.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 kernel/kexec_core.c | 39 ++++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 932feadbeb3a..2a8d8746e0a1 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -154,6 +154,18 @@ static struct page *kimage_alloc_page(struct kimage *image,
 				       gfp_t gfp_mask,
 				       unsigned long dest);
 
+/* Check whether this segment is fully within the resource */
+static bool segment_is_reserved(struct kexec_segment *seg, struct resource *res)
+{
+	unsigned long mstart = seg->mem;
+	unsigned long mend = mstart + seg->memsz - 1;
+
+	if (mstart < phys_to_boot_phys(res->start) ||
+	    mend > phys_to_boot_phys(res->end))
+		return false;
+	return true;
+}
+
 int sanity_check_segment_list(struct kimage *image)
 {
 	int i;
@@ -246,13 +258,9 @@ int sanity_check_segment_list(struct kimage *image)
 
 	if (image->type == KEXEC_TYPE_CRASH) {
 		for (i = 0; i < nr_segments; i++) {
-			unsigned long mstart, mend;
-
-			mstart = image->segment[i].mem;
-			mend = mstart + image->segment[i].memsz - 1;
 			/* Ensure we are within the crash kernel limits */
-			if ((mstart < phys_to_boot_phys(crashk_res.start)) ||
-			    (mend > phys_to_boot_phys(crashk_res.end)))
+			if (!segment_is_reserved(&image->segment[i],
+						 &crashk_res))
 				return -EADDRNOTAVAIL;
 		}
 	}
@@ -848,12 +856,13 @@ static int kimage_load_normal_segment(struct kimage *image,
 	return result;
 }
 
-static int kimage_load_crash_segment(struct kimage *image,
-					struct kexec_segment *segment)
+static int kimage_load_crash_or_reserved_segment(struct kimage *image,
+						 struct kexec_segment *segment)
 {
-	/* For crash dumps kernels we simply copy the data from
-	 * user space to it's destination.
-	 * We do things a page at a time for the sake of kmap.
+	/*
+	 * For crash dumps and kexec-reserved kernels we simply copy the data
+	 * from user space to it's destination. We do things a page at a time
+	 * for the sake of kmap.
 	 */
 	unsigned long maddr;
 	size_t ubytes, mbytes;
@@ -923,10 +932,14 @@ int kimage_load_segment(struct kimage *image,
 
 	switch (image->type) {
 	case KEXEC_TYPE_DEFAULT:
-		result = kimage_load_normal_segment(image, segment);
+		if (segment_is_reserved(segment, &kexeck_res))
+			result = kimage_load_crash_or_reserved_segment(image,
+								       segment);
+		else
+			result = kimage_load_normal_segment(image, segment);
 		break;
 	case KEXEC_TYPE_CRASH:
-		result = kimage_load_crash_segment(image, segment);
+		result = kimage_load_crash_or_reserved_segment(image, segment);
 		break;
 	}
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [v2 5/5] arm64, kexec: reserve kexeckernel region
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (3 preceding siblings ...)
  2019-07-09 18:20 ` [v2 4/5] kexec: use reserved memory for normal kexec reboot Pavel Tatashin
@ 2019-07-09 18:20 ` Pavel Tatashin
  2019-07-10  6:59 ` [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Dave Young
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-09 18:20 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

kexeckernel= is used to reserve memory for normal kexec kernel for
faster reboot.

Rename reserve_crashkernel() to reserve_crash_or_kexec_kernel(), and
generalize it by adding an argument that specifies what is reserved:
"crashkernel=" for crash kernel region
"kexeckernel=" for normal kexec region

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 .../admin-guide/kernel-parameters.txt         | 10 +--
 arch/arm64/kernel/setup.c                     |  5 ++
 arch/arm64/mm/init.c                          | 83 ++++++++++++-------
 3 files changed, 63 insertions(+), 35 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0f5ce665c7f5..a18222c1fbee 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -740,11 +740,11 @@
 			or memory reserved is below 4G.
 
 	kexeckernel=size[KMG][@offset[KMG]]
-			[KNL] Using kexec, Linux can reboot to a new kernel.
-			This parameter reserves the physical memory region
-			[offset, offset + size] for that kernel. If '@offset' is
-			omitted, then a suitable offset is selected
-			automatically.
+			[KNL, ARM64] Using kexec, Linux can reboot to a new
+			kernel. This parameter reserves the physical memory
+			region [offset, offset + size] for that kernel. If
+			'@offset' is omitted, then a suitable offset is
+			selected automatically.
 
 	cryptomgr.notests
 			[KNL] Disable crypto self-tests
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 7e541f947b4c..9f308fa103c5 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -235,6 +235,11 @@ static void __init request_standard_resources(void)
 		if (crashk_res.end && crashk_res.start >= res->start &&
 		    crashk_res.end <= res->end)
 			request_resource(res, &crashk_res);
+
+		/* Userspace will find "Kexec kernel" region in /proc/iomem. */
+		if (kexeck_res.end && kexeck_res.start >= res->start &&
+		    kexeck_res.end <= res->end)
+			request_resource(res, &kexeck_res);
 #endif
 	}
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f3c795278def..dfef39f72faf 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -54,61 +54,83 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
 /*
- * reserve_crashkernel() - reserves memory for crash kernel
+ * reserve_crash_or_kexec_kernel() - reserves memory for crash kernel or
+ * for normal kexec kernel.
  *
- * This function reserves memory area given in "crashkernel=" kernel command
- * line parameter. The memory reserved is used by dump capture kernel when
- * primary kernel is crashing.
+ * This function reserves memory area given in "crashkernel=" or "kexeckenel="
+ * kernel command line parameter. The memory reserved is used by dump capture
+ * kernel when primary kernel is crashing, or to load new kexec kernel for
+ * faster reboot without relocation.
  */
-static void __init reserve_crashkernel(void)
+static void __init reserve_crash_or_kexec_kernel(char *cmd)
 {
-	unsigned long long crash_base, crash_size;
+	unsigned long long base, size;
+	struct resource *res;
+	char s[16];
 	int ret;
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
-				&crash_size, &crash_base);
-	/* no crashkernel= or invalid value specified */
-	if (ret || !crash_size)
+	/* cmd must be either: "crashkernel=" or "kexeckernel=" */
+	if (!strcmp(cmd, "crashkernel=")) {
+		res = &crashk_res;
+	} else if (!strcmp(cmd, "kexeckernel=")) {
+		res = &kexeck_res;
+	} else {
+		pr_err("%s: invalid cmd %s\n", __func__, cmd);
+		return;
+	}
+
+	/* remove trailing '=' for a nicer printfs */
+	strcpy(s, cmd);
+	s[strlen(s) - 1] = '\0';
+
+	ret = parse_crash_or_kexec_kernel(boot_command_line,
+					  memblock_phys_mem_size(),
+					  &size, &base, cmd, NULL);
+	/* no specified command or invalid value specified */
+	if (ret || !size)
 		return;
 
-	crash_size = PAGE_ALIGN(crash_size);
+	size = PAGE_ALIGN(size);
 
-	if (crash_base == 0) {
+	if (base == 0) {
 		/* Current arm64 boot protocol requires 2MB alignment */
-		crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
-				crash_size, SZ_2M);
-		if (crash_base == 0) {
-			pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
-				crash_size);
+		base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
+					      size, SZ_2M);
+		if (base == 0) {
+			pr_warn("cannot allocate %s (size:0x%llx)\n",
+				s, size);
 			return;
 		}
 	} else {
 		/* User specifies base address explicitly. */
-		if (!memblock_is_region_memory(crash_base, crash_size)) {
-			pr_warn("cannot reserve crashkernel: region is not memory\n");
+		if (!memblock_is_region_memory(base, size)) {
+			pr_warn("cannot reserve %s: region is not memory\n",
+				s);
 			return;
 		}
 
-		if (memblock_is_region_reserved(crash_base, crash_size)) {
-			pr_warn("cannot reserve crashkernel: region overlaps reserved memory\n");
+		if (memblock_is_region_reserved(base, size)) {
+			pr_warn("cannot reserve %s: region overlaps reserved memory\n",
+				s);
 			return;
 		}
 
-		if (!IS_ALIGNED(crash_base, SZ_2M)) {
-			pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
+		if (!IS_ALIGNED(base, SZ_2M)) {
+			pr_warn("cannot reserve %s: base address is not 2MB aligned\n",
+				s);
 			return;
 		}
 	}
-	memblock_reserve(crash_base, crash_size);
+	memblock_reserve(base, size);
 
-	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
-		crash_base, crash_base + crash_size, crash_size >> 20);
+	pr_info("%s reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+		s, base, base + size, size >> 20);
 
-	crashk_res.start = crash_base;
-	crashk_res.end = crash_base + crash_size - 1;
+	res->start = base;
+	res->end = base + size - 1;
 }
 #else
-static void __init reserve_crashkernel(void)
+static void __init reserve_crash_or_kexec_kernel(char *cmd)
 {
 }
 #endif /* CONFIG_KEXEC_CORE */
@@ -411,7 +433,8 @@ void __init arm64_memblock_init(void)
 	else
 		arm64_dma_phys_limit = PHYS_MASK + 1;
 
-	reserve_crashkernel();
+	reserve_crash_or_kexec_kernel("crashkernel=");
+	reserve_crash_or_kexec_kernel("kexeckernel=");
 
 	reserve_elfcorehdr();
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [v2 0/5] arm64: allow to reserve memory for normal kexec kernel
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (4 preceding siblings ...)
  2019-07-09 18:20 ` [v2 5/5] arm64, kexec: reserve kexeckernel region Pavel Tatashin
@ 2019-07-10  6:59 ` Dave Young
  2019-07-10 15:53   ` Pavel Tatashin
  2019-07-10  7:32 ` Bhupesh Sharma
  2019-07-10 15:28 ` Matthias Brugger
  7 siblings, 1 reply; 12+ messages in thread
From: Dave Young @ 2019-07-10  6:59 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: jmorris, sashal, ebiederm, kexec, linux-kernel, corbet,
	catalin.marinas, will, linux-doc, linux-arm-kernel

On 07/09/19 at 02:20pm, Pavel Tatashin wrote:
> Changelog
> v1 - v2
> 	- No changes to patches, addressed suggestion from James Morse
> 	  to add "arm64" tag to cover letter.
> 	- Improved cover letter information based on discussion.
> 
> Currently, it is only allowed to reserve memory for crash kernel, because
> it is a requirement in order to be able to boot into crash kernel without
> touching memory of crashed kernel is to have memory reserved.
> 
> The second benefit for having memory reserved for kexec kernel is
> that it does not require a relocation after segments are loaded into
> memory.
> 
> If kexec functionality is used for a fast system update, with a minimal
> downtime, the relocation of kernel + initramfs might take a significant
> portion of reboot.
> 
> In fact, on the machine that we are using, that has ARM64 processor
> it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> time:
> 
> kernel shutdown	0.03s
> relocation	0.35s
> kernel startup	0.29s
> 
> Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> time increases proportionally.
> 
> While, it is possible to add 'kexeckernel=' parameters support to other
> architectures by modifying reserve_crashkernel(), in this series this is
> done for arm64 only.
> 
> The reason it is so slow on arm64 to relocate kernel is because the code
> that does relocation does this with MMU disabled, and thus D-Cache and
> I-Cache must also be disabled.
> 
> Alternative solution is more complicated: Setup a temporary page table
> for relocation_routine and also for code from cpu_soft_restart. Perform
> relocation with MMU enabled, do cpu_soft_restart where MMU and caching
> are disabled, jump to purgatory. A similar approach was suggested for
> purgatory and was rejected due to making purgatory too complicated.

The crashkernel reservation for kdump is a must, there are already a lot
of different problems need to consider, for example the low and high
memory issues, and a lot of other things.  I'm not convinced to enable
this for kexec reboot.

This really looks to workaround the arm64 issue and move the
complication to kernel.

> On, the other hand hibernate does something similar already, but there
> MMU never needs to be disabled, and also by the time machine_kexec()
> is called, allocator is not available, as we can't fail to do reboot,
> so page table must be pre-allocated during kernel load time.
> 
> Note: the above time is relocation time only. Purgatory usually also
> computes checksum, but that is skipped, because --no-check is used when
> kernel image is loaded via kexec.
> 
> Pavel Tatashin (5):
>   kexec: quiet down kexec reboot
>   kexec: add resource for normal kexec region
>   kexec: export common crashkernel/kexeckernel parser
>   kexec: use reserved memory for normal kexec reboot
>   arm64, kexec: reserve kexeckernel region
> 
>  .../admin-guide/kernel-parameters.txt         |  7 ++
>  arch/arm64/kernel/setup.c                     |  5 ++
>  arch/arm64/mm/init.c                          | 83 ++++++++++++-------
>  include/linux/crash_core.h                    |  6 ++
>  include/linux/ioport.h                        |  1 +
>  include/linux/kexec.h                         |  6 +-
>  kernel/crash_core.c                           | 27 +++---
>  kernel/kexec_core.c                           | 50 +++++++----
>  8 files changed, 127 insertions(+), 58 deletions(-)
> 
> -- 
> 2.22.0
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [v2 0/5] arm64: allow to reserve memory for normal kexec kernel
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (5 preceding siblings ...)
  2019-07-10  6:59 ` [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Dave Young
@ 2019-07-10  7:32 ` Bhupesh Sharma
  2019-07-10 15:54   ` Pavel Tatashin
  2019-07-10 15:28 ` Matthias Brugger
  7 siblings, 1 reply; 12+ messages in thread
From: Bhupesh Sharma @ 2019-07-10  7:32 UTC (permalink / raw)
  To: Pavel Tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

Hi Pavel,

On 07/09/2019 11:50 PM, Pavel Tatashin wrote:
> Changelog
> v1 - v2
> 	- No changes to patches, addressed suggestion from James Morse
> 	  to add "arm64" tag to cover letter.

Minor nit. Please also add PATCH to the subject line. Something like 
[PATCH v2]

Also will suggest to wait for atleast a couple of days before sending a 
new version of the patchset so as to give sufficient time for reviews to 
happen.

> 	- Improved cover letter information based on discussion.

> Currently, it is only allowed to reserve memory for crash kernel, because
> it is a requirement in order to be able to boot into crash kernel without
> touching memory of crashed kernel is to have memory reserved.

> The second benefit for having memory reserved for kexec kernel is
> that it does not require a relocation after segments are loaded into
> memory.

> If kexec functionality is used for a fast system update, with a minimal
> downtime, the relocation of kernel + initramfs might take a significant
> portion of reboot.
> 
> In fact, on the machine that we are using, that has ARM64 processor
> it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> time:
> 
> kernel shutdown	0.03s
> relocation	0.35s
> kernel startup	0.29s
> 
> Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> time increases proportionally.
> 
> While, it is possible to add 'kexeckernel=' parameters support to other
> architectures by modifying reserve_crashkernel(), in this series this is
> done for arm64 only.

Note that we normally have two dimensions to this (and similar) 
problem(s) - time we spend in relocating the kernel + initramfs v/s the 
memory space we reserve while enabling kexeckernel (in this case) in the 
primary kernel.

Just to give you an example, I have to shrink even the crashkernel 
reservation size in the primary kernel on arm64 systems running fedora 
which have very small memory footprint. I have a amazon ec2 (aarch64) 
for example which runs with 256M memory space and even enabling 
crashkernel on the same was quite a challenge :)

In such a case we need to do a comparison between the space we reserve 
v/s the time we spend while relocating while doing a kexec load.

Note that we recently had issues with OOM in crashkernel boot, because 
of which we had to introduce kernel command-line parameter to allow a 
user to disable device dump to reduce memory usage, see the following 
commit:

a3a3031b384f ("vmcore: Add a kernel parameter novmcoredd")

More on the same below ...

> The reason it is so slow on arm64 to relocate kernel is because the code
> that does relocation does this with MMU disabled, and thus D-Cache and
> I-Cache must also be disabled.
> 
> Alternative solution is more complicated: Setup a temporary page table
> for relocation_routine and also for code from cpu_soft_restart. Perform
> relocation with MMU enabled, do cpu_soft_restart where MMU and caching
> are disabled, jump to purgatory. A similar approach was suggested for
> purgatory and was rejected due to making purgatory too complicated.
> On, the other hand hibernate does something similar already, but there
> MMU never needs to be disabled, and also by the time machine_kexec()
> is called, allocator is not available, as we can't fail to do reboot,
> so page table must be pre-allocated during kernel load time.

... may be its time to explore this path now with a fresh mind. I know 
Pratyush tried a bit on this and now I am experimenting on the same on 
several aarch64 systems, mainly because we are really short on memory 
resources on several aarch64 systems (used in embedded/cloud domain) and 
frequently run into OOM issues even in the primary kernel.

Some more comments below:

1. I recommend protecting this code under a CONFIG (CONFIG_FAST_KEXEC ?) 
option and make it dependent on ARM64 being enabled (via CONFIG_ARM64 
option) to avoid causing issues on other archs like s390, powerpc, 
x86_64 (which probably don't need these changes).

Also better to make the CONFIG option disabled by default, so that we 
can avoid OOM issues in primary kernel on arm64 systems with smaller 
memory footprints. A user can enabled it, if he needs fast kexec load 
experience..

2. Also, I don't see timing results for kexec_file_load() in this cover 
letter. Can you add some results for the same here, or are they on 
similar lines?

I will give this a go on some aarch64 systems at my end and come back 
with more on the kernel + initramfs relocation time v/s memory space 
taken up results.

Thanks,
Bhupesh

> Note: the above time is relocation time only. Purgatory usually also
> computes checksum, but that is skipped, because --no-check is used when
> kernel image is loaded via kexec.
> 
> Pavel Tatashin (5):
>    kexec: quiet down kexec reboot
>    kexec: add resource for normal kexec region
>    kexec: export common crashkernel/kexeckernel parser
>    kexec: use reserved memory for normal kexec reboot
>    arm64, kexec: reserve kexeckernel region
> 
>   .../admin-guide/kernel-parameters.txt         |  7 ++
>   arch/arm64/kernel/setup.c                     |  5 ++
>   arch/arm64/mm/init.c                          | 83 ++++++++++++-------
>   include/linux/crash_core.h                    |  6 ++
>   include/linux/ioport.h                        |  1 +
>   include/linux/kexec.h                         |  6 +-
>   kernel/crash_core.c                           | 27 +++---
>   kernel/kexec_core.c                           | 50 +++++++----
>   8 files changed, 127 insertions(+), 58 deletions(-)
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [v2 0/5] arm64: allow to reserve memory for normal kexec kernel
  2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (6 preceding siblings ...)
  2019-07-10  7:32 ` Bhupesh Sharma
@ 2019-07-10 15:28 ` Matthias Brugger
  2019-07-10 15:58   ` Pavel Tatashin
  7 siblings, 1 reply; 12+ messages in thread
From: Matthias Brugger @ 2019-07-10 15:28 UTC (permalink / raw)
  To: Pavel Tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel



On 09/07/2019 20:20, Pavel Tatashin wrote:
> Changelog
> v1 - v2
> 	- No changes to patches, addressed suggestion from James Morse
> 	  to add "arm64" tag to cover letter.
> 	- Improved cover letter information based on discussion.
> 
> Currently, it is only allowed to reserve memory for crash kernel, because
> it is a requirement in order to be able to boot into crash kernel without
> touching memory of crashed kernel is to have memory reserved.
> 
> The second benefit for having memory reserved for kexec kernel is
> that it does not require a relocation after segments are loaded into
> memory.
> 
> If kexec functionality is used for a fast system update, with a minimal
> downtime, the relocation of kernel + initramfs might take a significant
> portion of reboot.
> 
> In fact, on the machine that we are using, that has ARM64 processor
> it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> time:
> 
> kernel shutdown	0.03s
> relocation	0.35s
> kernel startup	0.29s
> 
> Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> time increases proportionally.
> 
> While, it is possible to add 'kexeckernel=' parameters support to other
> architectures by modifying reserve_crashkernel(), in this series this is
> done for arm64 only.
> 

I wonder if we couldn't use the crashkernel reserved memory area for that and
just add logic to kexec-tools to pass to the kernel a flag (a new magic reboot
number?) to use the crashkernel memory for that?
The kernel would then unload the crash/capture system in the reserved memory
area and reuse the latter for kexec.
This would also enable the feature for all architectures.

Regards,
Matthias

> The reason it is so slow on arm64 to relocate kernel is because the code
> that does relocation does this with MMU disabled, and thus D-Cache and
> I-Cache must also be disabled.
> 
> Alternative solution is more complicated: Setup a temporary page table
> for relocation_routine and also for code from cpu_soft_restart. Perform
> relocation with MMU enabled, do cpu_soft_restart where MMU and caching
> are disabled, jump to purgatory. A similar approach was suggested for
> purgatory and was rejected due to making purgatory too complicated.
> On, the other hand hibernate does something similar already, but there
> MMU never needs to be disabled, and also by the time machine_kexec()
> is called, allocator is not available, as we can't fail to do reboot,
> so page table must be pre-allocated during kernel load time.
> 
> Note: the above time is relocation time only. Purgatory usually also
> computes checksum, but that is skipped, because --no-check is used when
> kernel image is loaded via kexec.
> 
> Pavel Tatashin (5):
>   kexec: quiet down kexec reboot
>   kexec: add resource for normal kexec region
>   kexec: export common crashkernel/kexeckernel parser
>   kexec: use reserved memory for normal kexec reboot
>   arm64, kexec: reserve kexeckernel region
> 
>  .../admin-guide/kernel-parameters.txt         |  7 ++
>  arch/arm64/kernel/setup.c                     |  5 ++
>  arch/arm64/mm/init.c                          | 83 ++++++++++++-------
>  include/linux/crash_core.h                    |  6 ++
>  include/linux/ioport.h                        |  1 +
>  include/linux/kexec.h                         |  6 +-
>  kernel/crash_core.c                           | 27 +++---
>  kernel/kexec_core.c                           | 50 +++++++----
>  8 files changed, 127 insertions(+), 58 deletions(-)
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [v2 0/5] arm64: allow to reserve memory for normal kexec kernel
  2019-07-10  6:59 ` [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Dave Young
@ 2019-07-10 15:53   ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-10 15:53 UTC (permalink / raw)
  To: Dave Young
  Cc: James Morris, Sasha Levin, Eric W. Biederman, kexec mailing list,
	LKML, Jonathan Corbet, Catalin Marinas, will,
	Linux Doc Mailing List, Linux ARM

> The crashkernel reservation for kdump is a must, there are already a lot
> of different problems need to consider, for example the low and high
> memory issues, and a lot of other things.  I'm not convinced to enable
> this for kexec reboot.
>
> This really looks to workaround the arm64 issue and move the
> complication to kernel.

I will be working on MMU arm64 kernel relocation solution.

Pasha

>
> > On, the other hand hibernate does something similar already, but there
> > MMU never needs to be disabled, and also by the time machine_kexec()
> > is called, allocator is not available, as we can't fail to do reboot,
> > so page table must be pre-allocated during kernel load time.
> >
> > Note: the above time is relocation time only. Purgatory usually also
> > computes checksum, but that is skipped, because --no-check is used when
> > kernel image is loaded via kexec.
> >
> > Pavel Tatashin (5):
> >   kexec: quiet down kexec reboot
> >   kexec: add resource for normal kexec region
> >   kexec: export common crashkernel/kexeckernel parser
> >   kexec: use reserved memory for normal kexec reboot
> >   arm64, kexec: reserve kexeckernel region
> >
> >  .../admin-guide/kernel-parameters.txt         |  7 ++
> >  arch/arm64/kernel/setup.c                     |  5 ++
> >  arch/arm64/mm/init.c                          | 83 ++++++++++++-------
> >  include/linux/crash_core.h                    |  6 ++
> >  include/linux/ioport.h                        |  1 +
> >  include/linux/kexec.h                         |  6 +-
> >  kernel/crash_core.c                           | 27 +++---
> >  kernel/kexec_core.c                           | 50 +++++++----
> >  8 files changed, 127 insertions(+), 58 deletions(-)
> >
> > --
> > 2.22.0
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
>
> Thanks
> Dave

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [v2 0/5] arm64: allow to reserve memory for normal kexec kernel
  2019-07-10  7:32 ` Bhupesh Sharma
@ 2019-07-10 15:54   ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-10 15:54 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: James Morris, Sasha Levin, Eric W. Biederman, kexec mailing list,
	LKML, Jonathan Corbet, Catalin Marinas, will,
	Linux Doc Mailing List, Linux ARM

On Wed, Jul 10, 2019 at 3:32 AM Bhupesh Sharma <bhsharma@redhat.com> wrote:
>
> Hi Pavel,
>
> On 07/09/2019 11:50 PM, Pavel Tatashin wrote:
> > Changelog
> > v1 - v2
> >       - No changes to patches, addressed suggestion from James Morse
> >         to add "arm64" tag to cover letter.
>
> Minor nit. Please also add PATCH to the subject line. Something like
> [PATCH v2]

OK

>
> Also will suggest to wait for atleast a couple of days before sending a
> new version of the patchset so as to give sufficient time for reviews to
> happen.

OK

>
> >       - Improved cover letter information based on discussion.
>
> > Currently, it is only allowed to reserve memory for crash kernel, because
> > it is a requirement in order to be able to boot into crash kernel without
> > touching memory of crashed kernel is to have memory reserved.
>
> > The second benefit for having memory reserved for kexec kernel is
> > that it does not require a relocation after segments are loaded into
> > memory.
>
> > If kexec functionality is used for a fast system update, with a minimal
> > downtime, the relocation of kernel + initramfs might take a significant
> > portion of reboot.
> >
> > In fact, on the machine that we are using, that has ARM64 processor
> > it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> > time:
> >
> > kernel shutdown       0.03s
> > relocation    0.35s
> > kernel startup        0.29s
> >
> > Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> > time increases proportionally.
> >
> > While, it is possible to add 'kexeckernel=' parameters support to other
> > architectures by modifying reserve_crashkernel(), in this series this is
> > done for arm64 only.
>
> Note that we normally have two dimensions to this (and similar)
> problem(s) - time we spend in relocating the kernel + initramfs v/s the
> memory space we reserve while enabling kexeckernel (in this case) in the
> primary kernel.

Yes, for our specific case (Microsoft), it is more important to faster
reboot and have 64M permanently reserved. However, after thinking
about this, I decided to go ahead, and implement MMU enabled kernel
relocation for ARM64.

>
> Just to give you an example, I have to shrink even the crashkernel
> reservation size in the primary kernel on arm64 systems running fedora
> which have very small memory footprint. I have a amazon ec2 (aarch64)
> for example which runs with 256M memory space and even enabling
> crashkernel on the same was quite a challenge :)
>
> In such a case we need to do a comparison between the space we reserve
> v/s the time we spend while relocating while doing a kexec load.
>
> Note that we recently had issues with OOM in crashkernel boot, because
> of which we had to introduce kernel command-line parameter to allow a
> user to disable device dump to reduce memory usage, see the following
> commit:
>
> a3a3031b384f ("vmcore: Add a kernel parameter novmcoredd")
>
> More on the same below ...
>
> > The reason it is so slow on arm64 to relocate kernel is because the code
> > that does relocation does this with MMU disabled, and thus D-Cache and
> > I-Cache must also be disabled.
> >
> > Alternative solution is more complicated: Setup a temporary page table
> > for relocation_routine and also for code from cpu_soft_restart. Perform
> > relocation with MMU enabled, do cpu_soft_restart where MMU and caching
> > are disabled, jump to purgatory. A similar approach was suggested for
> > purgatory and was rejected due to making purgatory too complicated.
> > On, the other hand hibernate does something similar already, but there
> > MMU never needs to be disabled, and also by the time machine_kexec()
> > is called, allocator is not available, as we can't fail to do reboot,
> > so page table must be pre-allocated during kernel load time.
>
> ... may be its time to explore this path now with a fresh mind. I know
> Pratyush tried a bit on this and now I am experimenting on the same on
> several aarch64 systems, mainly because we are really short on memory
> resources on several aarch64 systems (used in embedded/cloud domain) and
> frequently run into OOM issues even in the primary kernel.
>
> Some more comments below:
>
> 1. I recommend protecting this code under a CONFIG (CONFIG_FAST_KEXEC ?)
> option and make it dependent on ARM64 being enabled (via CONFIG_ARM64
> option) to avoid causing issues on other archs like s390, powerpc,
> x86_64 (which probably don't need these changes).
>
> Also better to make the CONFIG option disabled by default, so that we
> can avoid OOM issues in primary kernel on arm64 systems with smaller
> memory footprints. A user can enabled it, if he needs fast kexec load
> experience..
>
> 2. Also, I don't see timing results for kexec_file_load() in this cover
> letter. Can you add some results for the same here, or are they on
> similar lines?
>
> I will give this a go on some aarch64 systems at my end and come back
> with more on the kernel + initramfs relocation time v/s memory space
> taken up results.
>
> Thanks,
> Bhupesh
>
> > Note: the above time is relocation time only. Purgatory usually also
> > computes checksum, but that is skipped, because --no-check is used when
> > kernel image is loaded via kexec.
> >
> > Pavel Tatashin (5):
> >    kexec: quiet down kexec reboot
> >    kexec: add resource for normal kexec region
> >    kexec: export common crashkernel/kexeckernel parser
> >    kexec: use reserved memory for normal kexec reboot
> >    arm64, kexec: reserve kexeckernel region
> >
> >   .../admin-guide/kernel-parameters.txt         |  7 ++
> >   arch/arm64/kernel/setup.c                     |  5 ++
> >   arch/arm64/mm/init.c                          | 83 ++++++++++++-------
> >   include/linux/crash_core.h                    |  6 ++
> >   include/linux/ioport.h                        |  1 +
> >   include/linux/kexec.h                         |  6 +-
> >   kernel/crash_core.c                           | 27 +++---
> >   kernel/kexec_core.c                           | 50 +++++++----
> >   8 files changed, 127 insertions(+), 58 deletions(-)
> >
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [v2 0/5] arm64: allow to reserve memory for normal kexec kernel
  2019-07-10 15:28 ` Matthias Brugger
@ 2019-07-10 15:58   ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-10 15:58 UTC (permalink / raw)
  To: Matthias Brugger
  Cc: James Morris, Sasha Levin, Eric W. Biederman, kexec mailing list,
	LKML, Jonathan Corbet, Catalin Marinas, will,
	Linux Doc Mailing List, Linux ARM

On Wed, Jul 10, 2019 at 11:28 AM Matthias Brugger
<matthias.bgg@gmail.com> wrote:
>
>
>
> On 09/07/2019 20:20, Pavel Tatashin wrote:
> > Changelog
> > v1 - v2
> >       - No changes to patches, addressed suggestion from James Morse
> >         to add "arm64" tag to cover letter.
> >       - Improved cover letter information based on discussion.
> >
> > Currently, it is only allowed to reserve memory for crash kernel, because
> > it is a requirement in order to be able to boot into crash kernel without
> > touching memory of crashed kernel is to have memory reserved.
> >
> > The second benefit for having memory reserved for kexec kernel is
> > that it does not require a relocation after segments are loaded into
> > memory.
> >
> > If kexec functionality is used for a fast system update, with a minimal
> > downtime, the relocation of kernel + initramfs might take a significant
> > portion of reboot.
> >
> > In fact, on the machine that we are using, that has ARM64 processor
> > it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> > time:
> >
> > kernel shutdown       0.03s
> > relocation    0.35s
> > kernel startup        0.29s
> >
> > Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> > time increases proportionally.
> >
> > While, it is possible to add 'kexeckernel=' parameters support to other
> > architectures by modifying reserve_crashkernel(), in this series this is
> > done for arm64 only.
> >
>
> I wonder if we couldn't use the crashkernel reserved memory area for that and
> just add logic to kexec-tools to pass to the kernel a flag (a new magic reboot
> number?) to use the crashkernel memory for that?
> The kernel would then unload the crash/capture system in the reserved memory
> area and reuse the latter for kexec.
> This would also enable the feature for all architectures.

I decided to take another route: enable MMU during kernel relocation
on ARM64. This will eliminate the problem that I am experiencing with
slow relocation.

Pasha

>
> Regards,
> Matthias
>
> > The reason it is so slow on arm64 to relocate kernel is because the code
> > that does relocation does this with MMU disabled, and thus D-Cache and
> > I-Cache must also be disabled.
> >
> > Alternative solution is more complicated: Setup a temporary page table
> > for relocation_routine and also for code from cpu_soft_restart. Perform
> > relocation with MMU enabled, do cpu_soft_restart where MMU and caching
> > are disabled, jump to purgatory. A similar approach was suggested for
> > purgatory and was rejected due to making purgatory too complicated.
> > On, the other hand hibernate does something similar already, but there
> > MMU never needs to be disabled, and also by the time machine_kexec()
> > is called, allocator is not available, as we can't fail to do reboot,
> > so page table must be pre-allocated during kernel load time.
> >
> > Note: the above time is relocation time only. Purgatory usually also
> > computes checksum, but that is skipped, because --no-check is used when
> > kernel image is loaded via kexec.
> >
> > Pavel Tatashin (5):
> >   kexec: quiet down kexec reboot
> >   kexec: add resource for normal kexec region
> >   kexec: export common crashkernel/kexeckernel parser
> >   kexec: use reserved memory for normal kexec reboot
> >   arm64, kexec: reserve kexeckernel region
> >
> >  .../admin-guide/kernel-parameters.txt         |  7 ++
> >  arch/arm64/kernel/setup.c                     |  5 ++
> >  arch/arm64/mm/init.c                          | 83 ++++++++++++-------
> >  include/linux/crash_core.h                    |  6 ++
> >  include/linux/ioport.h                        |  1 +
> >  include/linux/kexec.h                         |  6 +-
> >  kernel/crash_core.c                           | 27 +++---
> >  kernel/kexec_core.c                           | 50 +++++++----
> >  8 files changed, 127 insertions(+), 58 deletions(-)
> >

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-07-10 15:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-09 18:20 [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Pavel Tatashin
2019-07-09 18:20 ` [v2 1/5] kexec: quiet down kexec reboot Pavel Tatashin
2019-07-09 18:20 ` [v2 2/5] kexec: add resource for normal kexec region Pavel Tatashin
2019-07-09 18:20 ` [v2 3/5] kexec: export common crashkernel/kexeckernel parser Pavel Tatashin
2019-07-09 18:20 ` [v2 4/5] kexec: use reserved memory for normal kexec reboot Pavel Tatashin
2019-07-09 18:20 ` [v2 5/5] arm64, kexec: reserve kexeckernel region Pavel Tatashin
2019-07-10  6:59 ` [v2 0/5] arm64: allow to reserve memory for normal kexec kernel Dave Young
2019-07-10 15:53   ` Pavel Tatashin
2019-07-10  7:32 ` Bhupesh Sharma
2019-07-10 15:54   ` Pavel Tatashin
2019-07-10 15:28 ` Matthias Brugger
2019-07-10 15:58   ` Pavel Tatashin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).