linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v1 0/5] allow to reserve memory for normal kexec kernel
@ 2019-07-08 21:15 Pavel Tatashin
  2019-07-08 21:15 ` [v1 1/5] kexec: quiet down kexec reboot Pavel Tatashin
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-08 21:15 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

Currently, it is only allowed to reserve memory for crash kernel, because
it is a requirement in order to be able to boot into crash kernel without
touching memory of crashed kernel is to have memory reserved.

The second benefit for having memory reserved for kexec kernel is
that it does not require a relocation after segments are loaded into
memory.

If kexec functionality is used for a fast system update, with a minimal
downtime, the relocation of kernel + initramfs might take a significant
portion of reboot.

In fact, on the machine that we are using, that has ARM64 processor
it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
time:

kernel shutdown	0.03s
relocation	0.35s
kernel startup	0.29s

Image: 13M and initramfs is 24M. If initramfs increases, the relocation
time increases proportionally.

While, it is possible to add 'kexeckernel=' parameters support to other
architectures by modifying reserve_crashkernel(), in this series this is
done for arm64 only.

Pavel Tatashin (5):
  kexec: quiet down kexec reboot
  kexec: add resource for normal kexec region
  kexec: export common crashkernel/kexeckernel parser
  kexec: use reserved memory for normal kexec reboot
  arm64, kexec: reserve kexeckernel region

 .../admin-guide/kernel-parameters.txt         |  7 ++
 arch/arm64/kernel/setup.c                     |  5 ++
 arch/arm64/mm/init.c                          | 83 ++++++++++++-------
 include/linux/crash_core.h                    |  6 ++
 include/linux/ioport.h                        |  1 +
 include/linux/kexec.h                         |  6 +-
 kernel/crash_core.c                           | 27 +++---
 kernel/kexec_core.c                           | 50 +++++++----
 8 files changed, 127 insertions(+), 58 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [v1 1/5] kexec: quiet down kexec reboot
  2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
@ 2019-07-08 21:15 ` Pavel Tatashin
  2019-07-08 21:15 ` [v1 2/5] kexec: add resource for normal kexec region Pavel Tatashin
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-08 21:15 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

Here is a regular kexec command sequence and output:
=====
$ kexec --reuse-cmdline -i --load Image
$ kexec -e
[  161.342002] kexec_core: Starting new kernel

Welcome to Buildroot
buildroot login:
=====

Even when "quiet" kernel parameter is specified, "kexec_core: Starting
new kernel" is printed.

This message has  KERN_EMERG level, but there is no emergency, it is a
normal kexec operation, so quiet it down to appropriate KERN_NOTICE.

Machines that have slow console baud rate benefit from less output.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
---
 kernel/kexec_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index d5870723b8ad..2c5b72863b7b 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1169,7 +1169,7 @@ int kernel_kexec(void)
 		 * CPU hotplug again; so re-enable it here.
 		 */
 		cpu_hotplug_enable();
-		pr_emerg("Starting new kernel\n");
+		pr_notice("Starting new kernel\n");
 		machine_shutdown();
 	}
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [v1 2/5] kexec: add resource for normal kexec region
  2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
  2019-07-08 21:15 ` [v1 1/5] kexec: quiet down kexec reboot Pavel Tatashin
@ 2019-07-08 21:15 ` Pavel Tatashin
  2019-07-08 21:15 ` [v1 3/5] kexec: export common crashkernel/kexeckernel parser Pavel Tatashin
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-08 21:15 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

crashk_res resource is used to reserve memory for crash kernel. There is
also, however, a benefit to reserve memory for normal kernel to speed up
reboot performance. This is because during regular kexec reboot, kernel
performs relocations to the final destination of the loaded segments, and
the relocation might take a long time especially if initramfs is big.

Therefore, similarly to crashk_res, add kexeck_res that will be used to
reserve memory for normal kexec kernel.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 include/linux/ioport.h | 1 +
 include/linux/kexec.h  | 6 ++++--
 kernel/kexec_core.c    | 9 +++++++++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index da0ebaec25f0..3b18a3c112f3 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -133,6 +133,7 @@ enum {
 	IORES_DESC_PERSISTENT_MEMORY_LEGACY	= 5,
 	IORES_DESC_DEVICE_PRIVATE_MEMORY	= 6,
 	IORES_DESC_DEVICE_PUBLIC_MEMORY		= 7,
+	IORES_DESC_KEXEC_KERNEL			= 8,
 };
 
 /* helpers to define resources */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index b9b1bc5f9669..4c1121b385fb 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -303,12 +303,14 @@ extern int kexec_load_disabled;
 #define KEXEC_FILE_FLAGS	(KEXEC_FILE_UNLOAD | KEXEC_FILE_ON_CRASH | \
 				 KEXEC_FILE_NO_INITRAMFS)
 
-/* Location of a reserved region to hold the crash kernel.
- */
+/* Location of a reserved region to hold the crash kernel. */
 extern struct resource crashk_res;
 extern struct resource crashk_low_res;
 extern note_buf_t __percpu *crash_notes;
 
+/* Location of a reserved region to hold normal kexec kernel. */
+extern struct resource kexeck_res;
+
 /* flag to track if kexec reboot is in progress */
 extern bool kexec_in_progress;
 
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 2c5b72863b7b..932feadbeb3a 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -70,6 +70,15 @@ struct resource crashk_low_res = {
 	.desc  = IORES_DESC_CRASH_KERNEL
 };
 
+/* Location of the reserved area for the normal kexec kernel */
+struct resource kexeck_res = {
+	.name  = "Kexec kernel",
+	.start = 0,
+	.end   = 0,
+	.flags = IORESOURCE_BUSY | IORESOURCE_SYSTEM_RAM,
+	.desc  = IORES_DESC_KEXEC_KERNEL
+};
+
 int kexec_should_crash(struct task_struct *p)
 {
 	/*
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [v1 3/5] kexec: export common crashkernel/kexeckernel parser
  2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
  2019-07-08 21:15 ` [v1 1/5] kexec: quiet down kexec reboot Pavel Tatashin
  2019-07-08 21:15 ` [v1 2/5] kexec: add resource for normal kexec region Pavel Tatashin
@ 2019-07-08 21:15 ` Pavel Tatashin
  2019-07-08 21:15 ` [v1 4/5] kexec: use reserved memory for normal kexec reboot Pavel Tatashin
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-08 21:15 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

To reserve memory for normal kexec reboot, the new parameter:
kexeckernel=size[KMG][@offset[KMG]] is used. Its syntax is the
same as craskernel=, therefore they can use the same function to
parse parameter settings.

Rename: __parse_crashkernel() to parse_crash_or_kexec_kernel(), and
make it public.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 .../admin-guide/kernel-parameters.txt         |  7 +++++
 include/linux/crash_core.h                    |  6 +++++
 kernel/crash_core.c                           | 27 ++++++++++---------
 3 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5c7a0f5b0a2f..0f5ce665c7f5 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -739,6 +739,13 @@
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
 
+	kexeckernel=size[KMG][@offset[KMG]]
+			[KNL] Using kexec, Linux can reboot to a new kernel.
+			This parameter reserves the physical memory region
+			[offset, offset + size] for that kernel. If '@offset' is
+			omitted, then a suitable offset is selected
+			automatically.
+
 	cryptomgr.notests
 			[KNL] Disable crypto self-tests
 
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 525510a9f965..e90789ff0bec 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -74,5 +74,11 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
 int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
+int parse_crash_or_kexec_kernel(char *cmdline,
+				unsigned long long system_ram,
+				unsigned long long *crash_size,
+				unsigned long long *crash_base,
+				const char *name,
+				const char *suffix);
 
 #endif /* LINUX_CRASH_CORE_H */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..11e0f9837a32 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -224,12 +224,12 @@ static __init char *get_last_crashkernel(char *cmdline,
 	return ck_cmdline;
 }
 
-static int __init __parse_crashkernel(char *cmdline,
-			     unsigned long long system_ram,
-			     unsigned long long *crash_size,
-			     unsigned long long *crash_base,
-			     const char *name,
-			     const char *suffix)
+int __init parse_crash_or_kexec_kernel(char *cmdline,
+				       unsigned long long system_ram,
+				       unsigned long long *crash_size,
+				       unsigned long long *crash_base,
+				       const char *name,
+				       const char *suffix)
 {
 	char	*first_colon, *first_space;
 	char	*ck_cmdline;
@@ -270,8 +270,9 @@ int __init parse_crashkernel(char *cmdline,
 			     unsigned long long *crash_size,
 			     unsigned long long *crash_base)
 {
-	return __parse_crashkernel(cmdline, system_ram, crash_size, crash_base,
-					"crashkernel=", NULL);
+	return parse_crash_or_kexec_kernel(cmdline, system_ram, crash_size,
+					   crash_base, "crashkernel=",
+					   NULL);
 }
 
 int __init parse_crashkernel_high(char *cmdline,
@@ -279,8 +280,9 @@ int __init parse_crashkernel_high(char *cmdline,
 			     unsigned long long *crash_size,
 			     unsigned long long *crash_base)
 {
-	return __parse_crashkernel(cmdline, system_ram, crash_size, crash_base,
-				"crashkernel=", suffix_tbl[SUFFIX_HIGH]);
+	return parse_crash_or_kexec_kernel(cmdline, system_ram, crash_size,
+					   crash_base, "crashkernel=",
+					   suffix_tbl[SUFFIX_HIGH]);
 }
 
 int __init parse_crashkernel_low(char *cmdline,
@@ -288,8 +290,9 @@ int __init parse_crashkernel_low(char *cmdline,
 			     unsigned long long *crash_size,
 			     unsigned long long *crash_base)
 {
-	return __parse_crashkernel(cmdline, system_ram, crash_size, crash_base,
-				"crashkernel=", suffix_tbl[SUFFIX_LOW]);
+	return parse_crash_or_kexec_kernel(cmdline, system_ram, crash_size,
+					   crash_base, "crashkernel=",
+					   suffix_tbl[SUFFIX_LOW]);
 }
 
 Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [v1 4/5] kexec: use reserved memory for normal kexec reboot
  2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (2 preceding siblings ...)
  2019-07-08 21:15 ` [v1 3/5] kexec: export common crashkernel/kexeckernel parser Pavel Tatashin
@ 2019-07-08 21:15 ` Pavel Tatashin
  2019-07-08 21:15 ` [v1 5/5] arm64, kexec: reserve kexeckernel region Pavel Tatashin
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-08 21:15 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

If memory was reserved for the given segment use it directly instead of
allocating on per-page bases. This will avoid relocating this segment to
final destination when machine is rebooted.

This is done on a per segment bases because user might decide to always
load kernel segments at the given address (i.e. non-relocatable kernel),
but load initramfs at reserved address, and thus save reboot time on
copying initramfs if it is large, and reduces reboot performance.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 kernel/kexec_core.c | 39 ++++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 932feadbeb3a..2a8d8746e0a1 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -154,6 +154,18 @@ static struct page *kimage_alloc_page(struct kimage *image,
 				       gfp_t gfp_mask,
 				       unsigned long dest);
 
+/* Check whether this segment is fully within the resource */
+static bool segment_is_reserved(struct kexec_segment *seg, struct resource *res)
+{
+	unsigned long mstart = seg->mem;
+	unsigned long mend = mstart + seg->memsz - 1;
+
+	if (mstart < phys_to_boot_phys(res->start) ||
+	    mend > phys_to_boot_phys(res->end))
+		return false;
+	return true;
+}
+
 int sanity_check_segment_list(struct kimage *image)
 {
 	int i;
@@ -246,13 +258,9 @@ int sanity_check_segment_list(struct kimage *image)
 
 	if (image->type == KEXEC_TYPE_CRASH) {
 		for (i = 0; i < nr_segments; i++) {
-			unsigned long mstart, mend;
-
-			mstart = image->segment[i].mem;
-			mend = mstart + image->segment[i].memsz - 1;
 			/* Ensure we are within the crash kernel limits */
-			if ((mstart < phys_to_boot_phys(crashk_res.start)) ||
-			    (mend > phys_to_boot_phys(crashk_res.end)))
+			if (!segment_is_reserved(&image->segment[i],
+						 &crashk_res))
 				return -EADDRNOTAVAIL;
 		}
 	}
@@ -848,12 +856,13 @@ static int kimage_load_normal_segment(struct kimage *image,
 	return result;
 }
 
-static int kimage_load_crash_segment(struct kimage *image,
-					struct kexec_segment *segment)
+static int kimage_load_crash_or_reserved_segment(struct kimage *image,
+						 struct kexec_segment *segment)
 {
-	/* For crash dumps kernels we simply copy the data from
-	 * user space to it's destination.
-	 * We do things a page at a time for the sake of kmap.
+	/*
+	 * For crash dumps and kexec-reserved kernels we simply copy the data
+	 * from user space to it's destination. We do things a page at a time
+	 * for the sake of kmap.
 	 */
 	unsigned long maddr;
 	size_t ubytes, mbytes;
@@ -923,10 +932,14 @@ int kimage_load_segment(struct kimage *image,
 
 	switch (image->type) {
 	case KEXEC_TYPE_DEFAULT:
-		result = kimage_load_normal_segment(image, segment);
+		if (segment_is_reserved(segment, &kexeck_res))
+			result = kimage_load_crash_or_reserved_segment(image,
+								       segment);
+		else
+			result = kimage_load_normal_segment(image, segment);
 		break;
 	case KEXEC_TYPE_CRASH:
-		result = kimage_load_crash_segment(image, segment);
+		result = kimage_load_crash_or_reserved_segment(image, segment);
 		break;
 	}
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [v1 5/5] arm64, kexec: reserve kexeckernel region
  2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (3 preceding siblings ...)
  2019-07-08 21:15 ` [v1 4/5] kexec: use reserved memory for normal kexec reboot Pavel Tatashin
@ 2019-07-08 21:15 ` Pavel Tatashin
  2019-07-08 23:53 ` [v1 0/5] allow to reserve memory for normal kexec kernel Eric W. Biederman
  2019-07-09 10:36 ` Bhupesh Sharma
  6 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-08 21:15 UTC (permalink / raw)
  To: pasha.tatashin, jmorris, sashal, ebiederm, kexec, linux-kernel,
	corbet, catalin.marinas, will, linux-doc, linux-arm-kernel

kexeckernel= is used to reserve memory for normal kexec kernel for
faster reboot.

Rename reserve_crashkernel() to reserve_crash_or_kexec_kernel(), and
generalize it by adding an argument that specifies what is reserved:
"crashkernel=" for crash kernel region
"kexeckernel=" for normal kexec region

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 .../admin-guide/kernel-parameters.txt         | 10 +--
 arch/arm64/kernel/setup.c                     |  5 ++
 arch/arm64/mm/init.c                          | 83 ++++++++++++-------
 3 files changed, 63 insertions(+), 35 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0f5ce665c7f5..a18222c1fbee 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -740,11 +740,11 @@
 			or memory reserved is below 4G.
 
 	kexeckernel=size[KMG][@offset[KMG]]
-			[KNL] Using kexec, Linux can reboot to a new kernel.
-			This parameter reserves the physical memory region
-			[offset, offset + size] for that kernel. If '@offset' is
-			omitted, then a suitable offset is selected
-			automatically.
+			[KNL, ARM64] Using kexec, Linux can reboot to a new
+			kernel. This parameter reserves the physical memory
+			region [offset, offset + size] for that kernel. If
+			'@offset' is omitted, then a suitable offset is
+			selected automatically.
 
 	cryptomgr.notests
 			[KNL] Disable crypto self-tests
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 7e541f947b4c..9f308fa103c5 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -235,6 +235,11 @@ static void __init request_standard_resources(void)
 		if (crashk_res.end && crashk_res.start >= res->start &&
 		    crashk_res.end <= res->end)
 			request_resource(res, &crashk_res);
+
+		/* Userspace will find "Kexec kernel" region in /proc/iomem. */
+		if (kexeck_res.end && kexeck_res.start >= res->start &&
+		    kexeck_res.end <= res->end)
+			request_resource(res, &kexeck_res);
 #endif
 	}
 }
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f3c795278def..dfef39f72faf 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -54,61 +54,83 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
 /*
- * reserve_crashkernel() - reserves memory for crash kernel
+ * reserve_crash_or_kexec_kernel() - reserves memory for crash kernel or
+ * for normal kexec kernel.
  *
- * This function reserves memory area given in "crashkernel=" kernel command
- * line parameter. The memory reserved is used by dump capture kernel when
- * primary kernel is crashing.
+ * This function reserves memory area given in "crashkernel=" or "kexeckenel="
+ * kernel command line parameter. The memory reserved is used by dump capture
+ * kernel when primary kernel is crashing, or to load new kexec kernel for
+ * faster reboot without relocation.
  */
-static void __init reserve_crashkernel(void)
+static void __init reserve_crash_or_kexec_kernel(char *cmd)
 {
-	unsigned long long crash_base, crash_size;
+	unsigned long long base, size;
+	struct resource *res;
+	char s[16];
 	int ret;
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
-				&crash_size, &crash_base);
-	/* no crashkernel= or invalid value specified */
-	if (ret || !crash_size)
+	/* cmd must be either: "crashkernel=" or "kexeckernel=" */
+	if (!strcmp(cmd, "crashkernel=")) {
+		res = &crashk_res;
+	} else if (!strcmp(cmd, "kexeckernel=")) {
+		res = &kexeck_res;
+	} else {
+		pr_err("%s: invalid cmd %s\n", __func__, cmd);
+		return;
+	}
+
+	/* remove trailing '=' for a nicer printfs */
+	strcpy(s, cmd);
+	s[strlen(s) - 1] = '\0';
+
+	ret = parse_crash_or_kexec_kernel(boot_command_line,
+					  memblock_phys_mem_size(),
+					  &size, &base, cmd, NULL);
+	/* no specified command or invalid value specified */
+	if (ret || !size)
 		return;
 
-	crash_size = PAGE_ALIGN(crash_size);
+	size = PAGE_ALIGN(size);
 
-	if (crash_base == 0) {
+	if (base == 0) {
 		/* Current arm64 boot protocol requires 2MB alignment */
-		crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
-				crash_size, SZ_2M);
-		if (crash_base == 0) {
-			pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
-				crash_size);
+		base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
+					      size, SZ_2M);
+		if (base == 0) {
+			pr_warn("cannot allocate %s (size:0x%llx)\n",
+				s, size);
 			return;
 		}
 	} else {
 		/* User specifies base address explicitly. */
-		if (!memblock_is_region_memory(crash_base, crash_size)) {
-			pr_warn("cannot reserve crashkernel: region is not memory\n");
+		if (!memblock_is_region_memory(base, size)) {
+			pr_warn("cannot reserve %s: region is not memory\n",
+				s);
 			return;
 		}
 
-		if (memblock_is_region_reserved(crash_base, crash_size)) {
-			pr_warn("cannot reserve crashkernel: region overlaps reserved memory\n");
+		if (memblock_is_region_reserved(base, size)) {
+			pr_warn("cannot reserve %s: region overlaps reserved memory\n",
+				s);
 			return;
 		}
 
-		if (!IS_ALIGNED(crash_base, SZ_2M)) {
-			pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
+		if (!IS_ALIGNED(base, SZ_2M)) {
+			pr_warn("cannot reserve %s: base address is not 2MB aligned\n",
+				s);
 			return;
 		}
 	}
-	memblock_reserve(crash_base, crash_size);
+	memblock_reserve(base, size);
 
-	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
-		crash_base, crash_base + crash_size, crash_size >> 20);
+	pr_info("%s reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+		s, base, base + size, size >> 20);
 
-	crashk_res.start = crash_base;
-	crashk_res.end = crash_base + crash_size - 1;
+	res->start = base;
+	res->end = base + size - 1;
 }
 #else
-static void __init reserve_crashkernel(void)
+static void __init reserve_crash_or_kexec_kernel(char *cmd)
 {
 }
 #endif /* CONFIG_KEXEC_CORE */
@@ -411,7 +433,8 @@ void __init arm64_memblock_init(void)
 	else
 		arm64_dma_phys_limit = PHYS_MASK + 1;
 
-	reserve_crashkernel();
+	reserve_crash_or_kexec_kernel("crashkernel=");
+	reserve_crash_or_kexec_kernel("kexeckernel=");
 
 	reserve_elfcorehdr();
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (4 preceding siblings ...)
  2019-07-08 21:15 ` [v1 5/5] arm64, kexec: reserve kexeckernel region Pavel Tatashin
@ 2019-07-08 23:53 ` Eric W. Biederman
  2019-07-09  0:09   ` Pavel Tatashin
  2019-07-09 10:36 ` Bhupesh Sharma
  6 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2019-07-08 23:53 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: jmorris, sashal, kexec, linux-kernel, corbet, catalin.marinas,
	will, linux-doc, linux-arm-kernel

Pavel Tatashin <pasha.tatashin@soleen.com> writes:

> Currently, it is only allowed to reserve memory for crash kernel, because
> it is a requirement in order to be able to boot into crash kernel without
> touching memory of crashed kernel is to have memory reserved.
>
> The second benefit for having memory reserved for kexec kernel is
> that it does not require a relocation after segments are loaded into
> memory.
>
> If kexec functionality is used for a fast system update, with a minimal
> downtime, the relocation of kernel + initramfs might take a significant
> portion of reboot.
>
> In fact, on the machine that we are using, that has ARM64 processor
> it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> time:
>
> kernel shutdown	0.03s
> relocation	0.35s
> kernel startup	0.29s
>
> Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> time increases proportionally.

Something is very very wrong there.

Last I measured memory bandwidth seriously I could touch a Gigabyte per
second easily, and that was nearly 20 years ago.  Did you manage to
disable caching or have some particularly slow code that does the
reolocations.

There is a serious cost to reserving memory in that it is simply not
available at other times.  For kexec on panic there is no other reliable
way to get memory that won't be DMA'd to.

We have options in this case and I would strongly encourage you to track
down why that copy in relocation is so very slow.  I suspect a 4KiB page
size is large enough that it can swamp pointer following costs.

My back of the napkin math says even 20 years ago your copying costs
should be only 0.037s.  The only machine I have ever tested on where
the copy costs were noticable was my old 386.

Maybe I am out to lunch here but a claim that your memory only runs
at 100MiB/s (the speed of my spinning rust hard drive) is rather
incredible.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-08 23:53 ` [v1 0/5] allow to reserve memory for normal kexec kernel Eric W. Biederman
@ 2019-07-09  0:09   ` Pavel Tatashin
  2019-07-09 10:18     ` James Morse
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-09  0:09 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: James Morris, Sasha Levin, kexec, LKML, corbet, Catalin Marinas,
	will, linux-doc, Linux ARM

> Something is very very wrong there.
>
> Last I measured memory bandwidth seriously I could touch a Gigabyte per
> second easily, and that was nearly 20 years ago.  Did you manage to
> disable caching or have some particularly slow code that does the
> reolocations.
>
> There is a serious cost to reserving memory in that it is simply not
> available at other times.  For kexec on panic there is no other reliable
> way to get memory that won't be DMA'd to.

Hi Eric,

Thank you for your comments.

Indeed, but sometimes fast reboot is more important than the cost of
reserving 32M-64M of memory.

>
> We have options in this case and I would strongly encourage you to track
> down why that copy in relocation is so very slow.  I suspect a 4KiB page
> size is large enough that it can swamp pointer following costs.
>
> My back of the napkin math says even 20 years ago your copying costs
> should be only 0.037s.  The only machine I have ever tested on where
> the copy costs were noticable was my old 386.
>
> Maybe I am out to lunch here but a claim that your memory only runs
> at 100MiB/s (the speed of my spinning rust hard drive) is rather
> incredible.

I agree,  my measurement on this machine was 2,857MB/s. Perhaps when
MMU is disabled ARM64 also has caching disabled? The function that
loops through array of pages and relocates them to final destination
is this:

https://soleen.com/source/xref/linux/arch/arm64/kernel/relocate_kernel.S?r=d2912cb1#29

A comment before calling it:

205   /*
206   * cpu_soft_restart will shutdown the MMU, disable data caches, then
207   * transfer control to the reboot_code_buffer which contains a copy of
208   * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
209   * uses physical addressing to relocate the new image to its final
210   * position and transfers control to the image entry point when the
211   * relocation is complete.
212   * In kexec case, kimage->start points to purgatory assuming that
213   * kernel entry and dtb address are embedded in purgatory by
214   * userspace (kexec-tools).
215   * In kexec_file case, the kernel starts directly without purgatory.
216   */
https://soleen.com/source/xref/linux/arch/arm64/kernel/machine_kexec.c?r=d2912cb1#206

So, as I understand at least data caches are disabled, and MMU is
disabled, perhaps this is why this function is so incredibly slow?

Perhaps, there is a better way to fix this problem by keeping caches
enabled while still relocating? Any suggestions from Aarch64
developers?

Pasha

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-09  0:09   ` Pavel Tatashin
@ 2019-07-09 10:18     ` James Morse
  0 siblings, 0 replies; 17+ messages in thread
From: James Morse @ 2019-07-09 10:18 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Eric W. Biederman, James Morris, Sasha Levin, kexec, LKML,
	corbet, Catalin Marinas, will, linux-doc, Linux ARM

Hi Pavel, Eric,

(Subject-Nit: 'arm64:' is needed to match the style for arm64's arch code. Without it the
maintainer is likely to skip the patches as being for core code.)

On 09/07/2019 01:09, Pavel Tatashin wrote:
>> Something is very very wrong there.
>>
>> Last I measured memory bandwidth seriously I could touch a Gigabyte per
>> second easily, and that was nearly 20 years ago.  Did you manage to
>> disable caching or have some particularly slow code that does the
>> reolocations.
>>
>> There is a serious cost to reserving memory in that it is simply not
>> available at other times.  For kexec on panic there is no other reliable
>> way to get memory that won't be DMA'd to.

> Indeed, but sometimes fast reboot is more important than the cost of
> reserving 32M-64M of memory.

>> We have options in this case and I would strongly encourage you to track
>> down why that copy in relocation is so very slow.  I suspect a 4KiB page
>> size is large enough that it can swamp pointer following costs.
>>
>> My back of the napkin math says even 20 years ago your copying costs
>> should be only 0.037s.  The only machine I have ever tested on where
>> the copy costs were noticable was my old 386.

>> Maybe I am out to lunch here but a claim that your memory only runs
>> at 100MiB/s (the speed of my spinning rust hard drive) is rather
>> incredible.

> I agree,  my measurement on this machine was 2,857MB/s. Perhaps when
> MMU is disabled ARM64 also has caching disabled? The function that
> loops through array of pages and relocates them to final destination
> is this:

> A comment before calling it:
> 
> 205   /*
> 206   * cpu_soft_restart will shutdown the MMU, disable data caches, then
> 207   * transfer control to the reboot_code_buffer which contains a copy of
> 208   * the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
> 209   * uses physical addressing to relocate the new image to its final
> 210   * position and transfers control to the image entry point when the
> 211   * relocation is complete.
> 212   * In kexec case, kimage->start points to purgatory assuming that
> 213   * kernel entry and dtb address are embedded in purgatory by
> 214   * userspace (kexec-tools).
> 215   * In kexec_file case, the kernel starts directly without purgatory.
> 216   */

> So, as I understand at least data caches are disabled, and MMU is
> disabled, perhaps this is why this function is so incredibly slow?

Yup, spot on.

Kexec typically wants to place the new kernel over the top of the old one, so its
guaranteed to overwrite the live swapper_pg_dir.
There is also nothing to prevent the other parts of the page-tables being overwritten as
we relocate the kernel. The way the the kexec series chose to make this safe was the
simplest: turn the MMU off. We need to enter purgatory with the MMU off anyway.

(Its worth checking your kexec-tools purgatory isn't spending a decade generating a SHA256
of the kernel while the MMU is off. This is pointless as we don't suspect the previous
kernel of corrupting memory, and we can't debug/report the problem if we detect a
different SHA256. Newer kexec-tools have some commandline option to turn this thing off.)


> Perhaps, there is a better way to fix this problem by keeping caches
> enabled while still relocating? Any suggestions from Aarch64
> developers?

Turning the MMU off is the simplest. The alternative is a lot more complicated:

(To get the benefit of the caches, we need the MMU enabled to tell the hardware what the
cache-ability attributes of each page of memory are.)

We'd need to copy the page tables to build a new set out of memory we know won't get
overwritten. Switching to this 'safe set' is tricky, as it also maps the code we're
executing. To do that we'd need to use TTBR0 to hold another 'safe mapping' of the code
we're running, while we change our view of the linear-map.

Hibernate does exactly this, so its possible to re-use some of that logic. From memory, I
think the reason that didn't get done is kexec doesn't provide an allocator, and needs the
MMU off at some point anyway.


Thanks,

James

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
                   ` (5 preceding siblings ...)
  2019-07-08 23:53 ` [v1 0/5] allow to reserve memory for normal kexec kernel Eric W. Biederman
@ 2019-07-09 10:36 ` Bhupesh Sharma
  2019-07-09 10:55   ` Pavel Tatashin
  6 siblings, 1 reply; 17+ messages in thread
From: Bhupesh Sharma @ 2019-07-09 10:36 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: James Morris, sashal, Eric Biederman, kexec mailing list,
	Linux Kernel Mailing List, Jonathan Corbet, Catalin Marinas,
	will, Linux Doc Mailing List, linux-arm-kernel

Hi Pavel,

On Tue, Jul 9, 2019 at 2:46 AM Pavel Tatashin <pasha.tatashin@soleen.com> wrote:
>
> Currently, it is only allowed to reserve memory for crash kernel, because
> it is a requirement in order to be able to boot into crash kernel without
> touching memory of crashed kernel is to have memory reserved.
>
> The second benefit for having memory reserved for kexec kernel is
> that it does not require a relocation after segments are loaded into
> memory.
>
> If kexec functionality is used for a fast system update, with a minimal
> downtime, the relocation of kernel + initramfs might take a significant
> portion of reboot.
>
> In fact, on the machine that we are using, that has ARM64 processor
> it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> time:
>
> kernel shutdown 0.03s
> relocation      0.35s
> kernel startup  0.29s
>
> Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> time increases proportionally.
>
> While, it is possible to add 'kexeckernel=' parameters support to other
> architectures by modifying reserve_crashkernel(), in this series this is
> done for arm64 only.
>
> Pavel Tatashin (5):
>   kexec: quiet down kexec reboot
>   kexec: add resource for normal kexec region
>   kexec: export common crashkernel/kexeckernel parser
>   kexec: use reserved memory for normal kexec reboot
>   arm64, kexec: reserve kexeckernel region
>
>  .../admin-guide/kernel-parameters.txt         |  7 ++
>  arch/arm64/kernel/setup.c                     |  5 ++
>  arch/arm64/mm/init.c                          | 83 ++++++++++++-------
>  include/linux/crash_core.h                    |  6 ++
>  include/linux/ioport.h                        |  1 +
>  include/linux/kexec.h                         |  6 +-
>  kernel/crash_core.c                           | 27 +++---
>  kernel/kexec_core.c                           | 50 +++++++----
>  8 files changed, 127 insertions(+), 58 deletions(-)
>
> --
> 2.22.0

This seems like an issue with time spent while doing sha256
verification while in purgatory.

Can you please try the following two patches which enable D-cache in
purgatory before SHA verification and disable it before switching to
kernel:

http://lists.infradead.org/pipermail/kexec/2017-May/018839.html
http://lists.infradead.org/pipermail/kexec/2017-May/018840.html

Note that these were not accepted upstream but are included in several
distros in some form or the other :)

Thanks,
Bhupesh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-09 10:36 ` Bhupesh Sharma
@ 2019-07-09 10:55   ` Pavel Tatashin
  2019-07-09 11:59     ` James Morse
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-09 10:55 UTC (permalink / raw)
  To: Bhupesh Sharma
  Cc: James Morris, Sasha Levin, Eric Biederman, kexec mailing list,
	Linux Kernel Mailing List, Jonathan Corbet, Catalin Marinas,
	will, Linux Doc Mailing List, linux-arm-kernel

On Tue, Jul 9, 2019 at 6:36 AM Bhupesh Sharma <bhsharma@redhat.com> wrote:
>
> Hi Pavel,
>
> On Tue, Jul 9, 2019 at 2:46 AM Pavel Tatashin <pasha.tatashin@soleen.com> wrote:
> >
> > Currently, it is only allowed to reserve memory for crash kernel, because
> > it is a requirement in order to be able to boot into crash kernel without
> > touching memory of crashed kernel is to have memory reserved.
> >
> > The second benefit for having memory reserved for kexec kernel is
> > that it does not require a relocation after segments are loaded into
> > memory.
> >
> > If kexec functionality is used for a fast system update, with a minimal
> > downtime, the relocation of kernel + initramfs might take a significant
> > portion of reboot.
> >
> > In fact, on the machine that we are using, that has ARM64 processor
> > it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
> > time:
> >
> > kernel shutdown 0.03s
> > relocation      0.35s
> > kernel startup  0.29s
> >
> > Image: 13M and initramfs is 24M. If initramfs increases, the relocation
> > time increases proportionally.
> >
> > While, it is possible to add 'kexeckernel=' parameters support to other
> > architectures by modifying reserve_crashkernel(), in this series this is
> > done for arm64 only.
> >
> > Pavel Tatashin (5):
> >   kexec: quiet down kexec reboot
> >   kexec: add resource for normal kexec region
> >   kexec: export common crashkernel/kexeckernel parser
> >   kexec: use reserved memory for normal kexec reboot
> >   arm64, kexec: reserve kexeckernel region
> >
> >  .../admin-guide/kernel-parameters.txt         |  7 ++
> >  arch/arm64/kernel/setup.c                     |  5 ++
> >  arch/arm64/mm/init.c                          | 83 ++++++++++++-------
> >  include/linux/crash_core.h                    |  6 ++
> >  include/linux/ioport.h                        |  1 +
> >  include/linux/kexec.h                         |  6 +-
> >  kernel/crash_core.c                           | 27 +++---
> >  kernel/kexec_core.c                           | 50 +++++++----
> >  8 files changed, 127 insertions(+), 58 deletions(-)
> >
> > --
> > 2.22.0
>
> This seems like an issue with time spent while doing sha256
> verification while in purgatory.
>
> Can you please try the following two patches which enable D-cache in
> purgatory before SHA verification and disable it before switching to
> kernel:
>
> http://lists.infradead.org/pipermail/kexec/2017-May/018839.html
> http://lists.infradead.org/pipermail/kexec/2017-May/018840.html

Hi Bhupesh,

The verification was taking 2.31s. This is why it is disabled via
kexec's '-i' flag. Therefore 0.35s is only the relocation part where
time is spent, and with my patches the time is completely gone.
Actually, I am glad you showed these patches to me because I might
pull them and enable verification for our needs.

>
> Note that these were not accepted upstream but are included in several
> distros in some form or the other :)

Enabling MMU and D-Cache for relocation  would essentially require the
same changes in kernel. Could you please share exactly why these were
not accepted upstream into kexec-tools?

Thank you,
Pasha

>
> Thanks,
> Bhupesh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-09 10:55   ` Pavel Tatashin
@ 2019-07-09 11:59     ` James Morse
  2019-07-09 13:07       ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: James Morse @ 2019-07-09 11:59 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Bhupesh Sharma, James Morris, Sasha Levin, Eric Biederman,
	kexec mailing list, Linux Kernel Mailing List, Jonathan Corbet,
	Catalin Marinas, will, Linux Doc Mailing List, linux-arm-kernel

Hi Pavel,

On 09/07/2019 11:55, Pavel Tatashin wrote:
> On Tue, Jul 9, 2019 at 6:36 AM Bhupesh Sharma <bhsharma@redhat.com> wrote:
>> On Tue, Jul 9, 2019 at 2:46 AM Pavel Tatashin <pasha.tatashin@soleen.com> wrote:
>>> Currently, it is only allowed to reserve memory for crash kernel, because
>>> it is a requirement in order to be able to boot into crash kernel without
>>> touching memory of crashed kernel is to have memory reserved.
>>>
>>> The second benefit for having memory reserved for kexec kernel is
>>> that it does not require a relocation after segments are loaded into
>>> memory.
>>>
>>> If kexec functionality is used for a fast system update, with a minimal
>>> downtime, the relocation of kernel + initramfs might take a significant
>>> portion of reboot.
>>>
>>> In fact, on the machine that we are using, that has ARM64 processor
>>> it takes 0.35s to relocate during kexec, thus taking 52% of kernel reboot
>>> time:
>>>
>>> kernel shutdown 0.03s
>>> relocation      0.35s
>>> kernel startup  0.29s
>>>
>>> Image: 13M and initramfs is 24M. If initramfs increases, the relocation
>>> time increases proportionally.
>>>
>>> While, it is possible to add 'kexeckernel=' parameters support to other
>>> architectures by modifying reserve_crashkernel(), in this series this is
>>> done for arm64 only.

>>
>> This seems like an issue with time spent while doing sha256
>> verification while in purgatory.
>>
>> Can you please try the following two patches which enable D-cache in
>> purgatory before SHA verification and disable it before switching to
>> kernel:
>>
>> http://lists.infradead.org/pipermail/kexec/2017-May/018839.html
>> http://lists.infradead.org/pipermail/kexec/2017-May/018840.html
> 
> Hi Bhupesh,
> 
> The verification was taking 2.31s. This is why it is disabled via
> kexec's '-i' flag. Therefore 0.35s is only the relocation part where
> time is spent, and with my patches the time is completely gone.
> Actually, I am glad you showed these patches to me because I might
> pull them and enable verification for our needs.
> 
>>
>> Note that these were not accepted upstream but are included in several
>> distros in some form or the other :)
> 
> Enabling MMU and D-Cache for relocation  would essentially require the
> same changes in kernel. Could you please share exactly why these were
> not accepted upstream into kexec-tools?

Because '--no-checks' is a much simpler alternative.

More of the discussion:
https://lore.kernel.org/linux-arm-kernel/5599813d-f83c-d154-287a-c131c48292ca@arm.com/

While you can make purgatory a fully-fledged operating system, it doesn't really need to
do anything on arm64. Errata-workarounds alone are a reason not do start down this path.


Thanks,

James

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-09 11:59     ` James Morse
@ 2019-07-09 13:07       ` Pavel Tatashin
  2019-07-10 15:19         ` James Morse
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-09 13:07 UTC (permalink / raw)
  To: James Morse
  Cc: Bhupesh Sharma, James Morris, Sasha Levin, Eric Biederman,
	kexec mailing list, Linux Kernel Mailing List, Jonathan Corbet,
	Catalin Marinas, will, Linux Doc Mailing List, linux-arm-kernel

> > Enabling MMU and D-Cache for relocation  would essentially require the
> > same changes in kernel. Could you please share exactly why these were
> > not accepted upstream into kexec-tools?
>
> Because '--no-checks' is a much simpler alternative.
>
> More of the discussion:
> https://lore.kernel.org/linux-arm-kernel/5599813d-f83c-d154-287a-c131c48292ca@arm.com/
>
> While you can make purgatory a fully-fledged operating system, it doesn't really need to
> do anything on arm64. Errata-workarounds alone are a reason not do start down this path.

Thank you James. I will summaries the information gathered from the
yesterday's/today's discussion and add it to the cover letter together
with ARM64 tag. I think, the patch series makes sense for ARM64 only,
unless there are other platforms that disable caching/MMU during
relocation.

Thank you,
Pasha


>
>
> Thanks,
>
> James

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-09 13:07       ` Pavel Tatashin
@ 2019-07-10 15:19         ` James Morse
  2019-07-10 15:56           ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: James Morse @ 2019-07-10 15:19 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Bhupesh Sharma, James Morris, Sasha Levin, Eric Biederman,
	kexec mailing list, Linux Kernel Mailing List, Jonathan Corbet,
	Catalin Marinas, will, Linux Doc Mailing List, linux-arm-kernel

Hi Pasha,

On 09/07/2019 14:07, Pavel Tatashin wrote:
>>> Enabling MMU and D-Cache for relocation  would essentially require the
>>> same changes in kernel. Could you please share exactly why these were
>>> not accepted upstream into kexec-tools?
>>
>> Because '--no-checks' is a much simpler alternative.
>>
>> More of the discussion:
>> https://lore.kernel.org/linux-arm-kernel/5599813d-f83c-d154-287a-c131c48292ca@arm.com/
>>
>> While you can make purgatory a fully-fledged operating system, it doesn't really need to
>> do anything on arm64. Errata-workarounds alone are a reason not do start down this path.
> 
> Thank you James. I will summaries the information gathered from the
> yesterday's/today's discussion and add it to the cover letter together
> with ARM64 tag. I think, the patch series makes sense for ARM64 only,
> unless there are other platforms that disable caching/MMU during
> relocation.

I'd prefer not to reserve additional memory for regular kexec just to avoid the relocation.
If the kernel's relocation work is so painful we can investigate doing it while the MMU is
enabled. If you can compare regular-kexec with kexec_file_load() you eliminate the
purgatory part of the work.


Thanks,

James

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-10 15:19         ` James Morse
@ 2019-07-10 15:56           ` Pavel Tatashin
  2019-07-11  8:12             ` Vladimir Murzin
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-10 15:56 UTC (permalink / raw)
  To: James Morse
  Cc: Bhupesh Sharma, James Morris, Sasha Levin, Eric Biederman,
	kexec mailing list, Linux Kernel Mailing List, Jonathan Corbet,
	Catalin Marinas, will, Linux Doc Mailing List, linux-arm-kernel

On Wed, Jul 10, 2019 at 11:19 AM James Morse <james.morse@arm.com> wrote:
>
> Hi Pasha,
>
> On 09/07/2019 14:07, Pavel Tatashin wrote:
> >>> Enabling MMU and D-Cache for relocation  would essentially require the
> >>> same changes in kernel. Could you please share exactly why these were
> >>> not accepted upstream into kexec-tools?
> >>
> >> Because '--no-checks' is a much simpler alternative.
> >>
> >> More of the discussion:
> >> https://lore.kernel.org/linux-arm-kernel/5599813d-f83c-d154-287a-c131c48292ca@arm.com/
> >>
> >> While you can make purgatory a fully-fledged operating system, it doesn't really need to
> >> do anything on arm64. Errata-workarounds alone are a reason not do start down this path.
> >
> > Thank you James. I will summaries the information gathered from the
> > yesterday's/today's discussion and add it to the cover letter together
> > with ARM64 tag. I think, the patch series makes sense for ARM64 only,
> > unless there are other platforms that disable caching/MMU during
> > relocation.
>
> I'd prefer not to reserve additional memory for regular kexec just to avoid the relocation.
> If the kernel's relocation work is so painful we can investigate doing it while the MMU is
> enabled. If you can compare regular-kexec with kexec_file_load() you eliminate the
> purgatory part of the work.

Relocation time is exactly the same for regular-kexec and
kexec_file_load(). So, the relocation is indeed painful for our case.
I am working on adding MMU enabled kernel relocation.

Pasha

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-10 15:56           ` Pavel Tatashin
@ 2019-07-11  8:12             ` Vladimir Murzin
  2019-07-11 12:26               ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Murzin @ 2019-07-11  8:12 UTC (permalink / raw)
  To: Pavel Tatashin, James Morse
  Cc: Sasha Levin, Jonathan Corbet, Catalin Marinas, Bhupesh Sharma,
	Linux Doc Mailing List, kexec mailing list,
	Linux Kernel Mailing List, James Morris, Eric Biederman, will,
	linux-arm-kernel

Hi,

On 7/10/19 4:56 PM, Pavel Tatashin wrote:
> On Wed, Jul 10, 2019 at 11:19 AM James Morse <james.morse@arm.com> wrote:
>>
>> Hi Pasha,
>>
>> On 09/07/2019 14:07, Pavel Tatashin wrote:
>>>>> Enabling MMU and D-Cache for relocation  would essentially require the
>>>>> same changes in kernel. Could you please share exactly why these were
>>>>> not accepted upstream into kexec-tools?
>>>>
>>>> Because '--no-checks' is a much simpler alternative.
>>>>
>>>> More of the discussion:
>>>> https://lore.kernel.org/linux-arm-kernel/5599813d-f83c-d154-287a-c131c48292ca@arm.com/
>>>>
>>>> While you can make purgatory a fully-fledged operating system, it doesn't really need to
>>>> do anything on arm64. Errata-workarounds alone are a reason not do start down this path.
>>>
>>> Thank you James. I will summaries the information gathered from the
>>> yesterday's/today's discussion and add it to the cover letter together
>>> with ARM64 tag. I think, the patch series makes sense for ARM64 only,
>>> unless there are other platforms that disable caching/MMU during
>>> relocation.
>>
>> I'd prefer not to reserve additional memory for regular kexec just to avoid the relocation.
>> If the kernel's relocation work is so painful we can investigate doing it while the MMU is
>> enabled. If you can compare regular-kexec with kexec_file_load() you eliminate the
>> purgatory part of the work.
> 
> Relocation time is exactly the same for regular-kexec and
> kexec_file_load(). So, the relocation is indeed painful for our case.
> I am working on adding MMU enabled kernel relocation.

Out of curiosity, does enabling only I-cache make a difference? IIRC, it doesn't
require setting MMU, in contrast to D-cache.

Cheers
Vladimir

> 
> Pasha
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [v1 0/5] allow to reserve memory for normal kexec kernel
  2019-07-11  8:12             ` Vladimir Murzin
@ 2019-07-11 12:26               ` Pavel Tatashin
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2019-07-11 12:26 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: James Morse, Sasha Levin, Jonathan Corbet, Catalin Marinas,
	Bhupesh Sharma, Linux Doc Mailing List, kexec mailing list,
	Linux Kernel Mailing List, James Morris, Eric Biederman, will,
	linux-arm-kernel

On Thu, Jul 11, 2019 at 4:12 AM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>
> Hi,
>
> On 7/10/19 4:56 PM, Pavel Tatashin wrote:
> > On Wed, Jul 10, 2019 at 11:19 AM James Morse <james.morse@arm.com> wrote:
> >>
> >> Hi Pasha,
> >>
> >> On 09/07/2019 14:07, Pavel Tatashin wrote:
> >>>>> Enabling MMU and D-Cache for relocation  would essentially require the
> >>>>> same changes in kernel. Could you please share exactly why these were
> >>>>> not accepted upstream into kexec-tools?
> >>>>
> >>>> Because '--no-checks' is a much simpler alternative.
> >>>>
> >>>> More of the discussion:
> >>>> https://lore.kernel.org/linux-arm-kernel/5599813d-f83c-d154-287a-c131c48292ca@arm.com/
> >>>>
> >>>> While you can make purgatory a fully-fledged operating system, it doesn't really need to
> >>>> do anything on arm64. Errata-workarounds alone are a reason not do start down this path.
> >>>
> >>> Thank you James. I will summaries the information gathered from the
> >>> yesterday's/today's discussion and add it to the cover letter together
> >>> with ARM64 tag. I think, the patch series makes sense for ARM64 only,
> >>> unless there are other platforms that disable caching/MMU during
> >>> relocation.
> >>
> >> I'd prefer not to reserve additional memory for regular kexec just to avoid the relocation.
> >> If the kernel's relocation work is so painful we can investigate doing it while the MMU is
> >> enabled. If you can compare regular-kexec with kexec_file_load() you eliminate the
> >> purgatory part of the work.
> >
> > Relocation time is exactly the same for regular-kexec and
> > kexec_file_load(). So, the relocation is indeed painful for our case.
> > I am working on adding MMU enabled kernel relocation.
>
> Out of curiosity, does enabling only I-cache make a difference? IIRC, it doesn't
> require setting MMU, in contrast to D-cache.

Resend:

Thank you for suggestion. I have actually experimented with enabling
caches without MMU. Did not see a difference.

Thank you,
Pasha

>
> Cheers
> Vladimir
>
> >
> > Pasha
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-07-11 12:26 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-08 21:15 [v1 0/5] allow to reserve memory for normal kexec kernel Pavel Tatashin
2019-07-08 21:15 ` [v1 1/5] kexec: quiet down kexec reboot Pavel Tatashin
2019-07-08 21:15 ` [v1 2/5] kexec: add resource for normal kexec region Pavel Tatashin
2019-07-08 21:15 ` [v1 3/5] kexec: export common crashkernel/kexeckernel parser Pavel Tatashin
2019-07-08 21:15 ` [v1 4/5] kexec: use reserved memory for normal kexec reboot Pavel Tatashin
2019-07-08 21:15 ` [v1 5/5] arm64, kexec: reserve kexeckernel region Pavel Tatashin
2019-07-08 23:53 ` [v1 0/5] allow to reserve memory for normal kexec kernel Eric W. Biederman
2019-07-09  0:09   ` Pavel Tatashin
2019-07-09 10:18     ` James Morse
2019-07-09 10:36 ` Bhupesh Sharma
2019-07-09 10:55   ` Pavel Tatashin
2019-07-09 11:59     ` James Morse
2019-07-09 13:07       ` Pavel Tatashin
2019-07-10 15:19         ` James Morse
2019-07-10 15:56           ` Pavel Tatashin
2019-07-11  8:12             ` Vladimir Murzin
2019-07-11 12:26               ` Pavel Tatashin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).