linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel
@ 2023-01-15 15:01 Sourabh Jain
  2023-01-15 15:01 ` [PATCH v7 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:01 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

This patch series implements the crash hotplug handler on PowerPC introduced
by https://lkml.org/lkml/2022/12/9/520 patch series.

The Problem:
============
Post hotplug/DLPAR events the capture kernel holds stale information about the
system. Dump collection with stale capture kernel might end up in dump capture
failure or an inaccurate dump collection.

Existing solution:
==================
The existing solution to keep the capture kernel up-to-date by monitoring
hotplug event via udev rule and trigger a full capture kernel reload for
every hotplug event.

Shortcomings:
------------------------------------------------
- Leaves a window where kernel crash might not lead to a successful dump
  collection.
- Reloading all kexec components for each hotplug is inefficient.
- udev rules are prone to races if hotplug events are frequent.

More about issues with an existing solution is posted here:
 - https://lkml.org/lkml/2020/12/14/532
 - https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-February/240254.html

Proposed Solution:
==================
Instead of reloading all kexec segments on hotplug event, this patch series
focuses on updating only the relevant kexec segment. Once the kexec segments
are loaded in the kernel reserved area then an arch-specific hotplug handler
will update the relevant kexec segment based on hotplug event type.

---
Changelog:

v6 -> v7
  - added a new config to configure this feature
  - pass hotplug action type to arch specific handler

v5 -> v6
  - Added crash memory hotplug support

v4 -> v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
    instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
    under CONFIG_HOTPLUG_CPU config.

v3 -> v4:
  - Update the logic to find the additional space needed for hotadd CPUs post
    kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash hotplug
    support for kexec_file_load" patch to know more about the change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
    support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v2 -> v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of https://lkml.org/lkml/2022/3/3/674 [v5]
  - Fixed warning reported by checpatch script

v1 -> v2:
  - Use generic hotplug handler introduced by https://lkml.org/lkml/2022/2/9/1406, a
    significant change from v1.
---

Sourabh Jain (8):
  powerpc/kexec: turn some static helper functions public
  powerpc/crash hp: introduce a new config option CRASH_HOTPLUG
  powerpc/crash: update kimage_arch struct
  crash: add phdr for possible CPUs in elfcorehdr
  crash: pass hotplug action type to arch crash hotplug handler
  powerpc/crash: add crash CPU hotplug support
  crash: forward memory_notify args to arch crash hotplug handler
  powerpc/kexec: add crash memory hotplug support

 arch/powerpc/Kconfig                    |  12 +
 arch/powerpc/include/asm/kexec.h        |  17 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c            | 290 ++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c             |  73 +++++-
 arch/powerpc/kexec/file_load_64.c       | 179 ++-------------
 arch/powerpc/kexec/ranges.c             |  60 +++++
 arch/x86/include/asm/kexec.h            |   3 +-
 arch/x86/kernel/crash.c                 |   5 +-
 include/linux/kexec.h                   |   6 +-
 kernel/crash_core.c                     |  23 +-
 11 files changed, 491 insertions(+), 178 deletions(-)

-- 
2.39.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v7 1/8] powerpc/kexec: turn some static helper functions public
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
@ 2023-01-15 15:01 ` Sourabh Jain
  2023-01-15 15:02 ` [PATCH v7 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG Sourabh Jain
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:01 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Move update_cpus_node and get_crash_memory_ranges functions from
kexec/file_load.c to kexec/core_64.c to make these functions usable
by other kexec compoenets.

Later in the series, both functions are used for in-kernel updates to
kexec segments in the event of CPU/Memory hotplug for both kexec_load
and kexec_file_load system call.

No functional change intended.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h  |   6 ++
 arch/powerpc/kexec/core_64.c      | 166 ++++++++++++++++++++++++++++++
 arch/powerpc/kexec/file_load_64.c | 162 -----------------------------
 3 files changed, 172 insertions(+), 162 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index a1ddba01e7d13..8090ad7d97d9d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -99,6 +99,12 @@ void relocate_new_kernel(unsigned long indirection_page, unsigned long reboot_co
 
 void kexec_copy_flush(struct kimage *image);
 
+#ifdef CONFIG_PPC64
+struct crash_mem;
+int update_cpus_node(void *fdt);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#endif
+
 #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
 void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
 #define crash_free_reserved_phys_range crash_free_reserved_phys_range
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index a79e28c91e2be..0b292f93a74cc 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,8 @@
 #include <linux/cpu.h>
 #include <linux/hardirq.h>
 #include <linux/of.h>
+#include <linux/libfdt.h>
+#include <linux/memblock.h>
 
 #include <asm/page.h>
 #include <asm/current.h>
@@ -30,6 +32,8 @@
 #include <asm/hw_breakpoint.h>
 #include <asm/svm.h>
 #include <asm/ultravisor.h>
+#include <asm/kexec_ranges.h>
+#include <asm/crashdump-ppc64.h>
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -377,6 +381,168 @@ void default_machine_kexec(struct kimage *image)
 	/* NOTREACHED */
 }
 
+/**
+ * get_crash_memory_ranges - Get crash memory ranges. This list includes
+ *                           first/crashing kernel's memory regions that
+ *                           would be exported via an elfcore.
+ * @mem_ranges:              Range list to add the memory ranges to.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int get_crash_memory_ranges(struct crash_mem **mem_ranges)
+{
+	phys_addr_t base, end;
+	struct crash_mem *tmem;
+	u64 i;
+	int ret;
+
+	for_each_mem_range(i, &base, &end) {
+		u64 size = end - base;
+
+		/* Skip backup memory region, which needs a separate entry */
+		if (base == BACKUP_SRC_START) {
+			if (size > BACKUP_SRC_SIZE) {
+				base = BACKUP_SRC_END + 1;
+				size -= BACKUP_SRC_SIZE;
+			} else
+				continue;
+		}
+
+		ret = add_mem_range(mem_ranges, base, size);
+		if (ret)
+			goto out;
+
+		/* Try merging adjacent ranges before reallocation attempt */
+		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
+			sort_memory_ranges(*mem_ranges, true);
+	}
+
+	/* Reallocate memory ranges if there is no space to split ranges */
+	tmem = *mem_ranges;
+	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+		tmem = realloc_mem_ranges(mem_ranges);
+		if (!tmem)
+			goto out;
+	}
+
+	/* Exclude crashkernel region */
+	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	/*
+	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
+	 *        regions are exported to save their context at the time of
+	 *        crash, they should actually be backed up just like the
+	 *        first 64K bytes of memory.
+	 */
+	ret = add_rtas_mem_range(mem_ranges);
+	if (ret)
+		goto out;
+
+	ret = add_opal_mem_range(mem_ranges);
+	if (ret)
+		goto out;
+
+	/* create a separate program header for the backup region */
+	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
+	if (ret)
+		goto out;
+
+	sort_memory_ranges(*mem_ranges, false);
+out:
+	if (ret)
+		pr_err("Failed to setup crash memory ranges\n");
+	return ret;
+}
+
+/**
+ * add_node_props - Reads node properties from device node structure and add
+ *                  them to fdt.
+ * @fdt:            Flattened device tree of the kernel
+ * @node_offset:    offset of the node to add a property at
+ * @dn:             device node pointer
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
+{
+	int ret = 0;
+	struct property *pp;
+
+	if (!dn)
+		return -EINVAL;
+
+	for_each_property_of_node(dn, pp) {
+		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
+		if (ret < 0) {
+			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
+			return ret;
+		}
+	}
+	return ret;
+}
+
+/**
+ * update_cpus_node - Update cpus node of flattened device tree using of_root
+ *                    device node.
+ * @fdt:              Flattened device tree of the kernel.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int update_cpus_node(void *fdt)
+{
+	struct device_node *cpus_node, *dn;
+	int cpus_offset, cpus_subnode_offset, ret = 0;
+
+	cpus_offset = fdt_path_offset(fdt, "/cpus");
+	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
+		pr_err("Malformed device tree: error reading /cpus node: %s\n",
+		       fdt_strerror(cpus_offset));
+		return cpus_offset;
+	}
+
+	if (cpus_offset > 0) {
+		ret = fdt_del_node(fdt, cpus_offset);
+		if (ret < 0) {
+			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
+			return -EINVAL;
+		}
+	}
+
+	/* Add cpus node to fdt */
+	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
+	if (cpus_offset < 0) {
+		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
+		return -EINVAL;
+	}
+
+	/* Add cpus node properties */
+	cpus_node = of_find_node_by_path("/cpus");
+	ret = add_node_props(fdt, cpus_offset, cpus_node);
+	of_node_put(cpus_node);
+	if (ret < 0)
+		return ret;
+
+	/* Loop through all subnodes of cpus and add them to fdt */
+	for_each_node_by_type(dn, "cpu") {
+		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
+		if (cpus_subnode_offset < 0) {
+			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
+			       fdt_strerror(cpus_subnode_offset));
+			ret = cpus_subnode_offset;
+			goto out;
+		}
+
+		ret = add_node_props(fdt, cpus_subnode_offset, dn);
+		if (ret < 0)
+			goto out;
+	}
+out:
+	of_node_put(dn);
+	return ret;
+}
+
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index e51d8059535b2..9bc70b4d8eafc 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -131,81 +131,6 @@ static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
 	return ret;
 }
 
-/**
- * get_crash_memory_ranges - Get crash memory ranges. This list includes
- *                           first/crashing kernel's memory regions that
- *                           would be exported via an elfcore.
- * @mem_ranges:              Range list to add the memory ranges to.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int get_crash_memory_ranges(struct crash_mem **mem_ranges)
-{
-	phys_addr_t base, end;
-	struct crash_mem *tmem;
-	u64 i;
-	int ret;
-
-	for_each_mem_range(i, &base, &end) {
-		u64 size = end - base;
-
-		/* Skip backup memory region, which needs a separate entry */
-		if (base == BACKUP_SRC_START) {
-			if (size > BACKUP_SRC_SIZE) {
-				base = BACKUP_SRC_END + 1;
-				size -= BACKUP_SRC_SIZE;
-			} else
-				continue;
-		}
-
-		ret = add_mem_range(mem_ranges, base, size);
-		if (ret)
-			goto out;
-
-		/* Try merging adjacent ranges before reallocation attempt */
-		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
-			sort_memory_ranges(*mem_ranges, true);
-	}
-
-	/* Reallocate memory ranges if there is no space to split ranges */
-	tmem = *mem_ranges;
-	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
-		tmem = realloc_mem_ranges(mem_ranges);
-		if (!tmem)
-			goto out;
-	}
-
-	/* Exclude crashkernel region */
-	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
-	if (ret)
-		goto out;
-
-	/*
-	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
-	 *        regions are exported to save their context at the time of
-	 *        crash, they should actually be backed up just like the
-	 *        first 64K bytes of memory.
-	 */
-	ret = add_rtas_mem_range(mem_ranges);
-	if (ret)
-		goto out;
-
-	ret = add_opal_mem_range(mem_ranges);
-	if (ret)
-		goto out;
-
-	/* create a separate program header for the backup region */
-	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
-	if (ret)
-		goto out;
-
-	sort_memory_ranges(*mem_ranges, false);
-out:
-	if (ret)
-		pr_err("Failed to setup crash memory ranges\n");
-	return ret;
-}
-
 /**
  * get_reserved_memory_ranges - Get reserve memory ranges. This list includes
  *                              memory regions that should be added to the
@@ -1009,93 +934,6 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
 	return extra_size;
 }
 
-/**
- * add_node_props - Reads node properties from device node structure and add
- *                  them to fdt.
- * @fdt:            Flattened device tree of the kernel
- * @node_offset:    offset of the node to add a property at
- * @dn:             device node pointer
- *
- * Returns 0 on success, negative errno on error.
- */
-static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
-{
-	int ret = 0;
-	struct property *pp;
-
-	if (!dn)
-		return -EINVAL;
-
-	for_each_property_of_node(dn, pp) {
-		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
-		if (ret < 0) {
-			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
-			return ret;
-		}
-	}
-	return ret;
-}
-
-/**
- * update_cpus_node - Update cpus node of flattened device tree using of_root
- *                    device node.
- * @fdt:              Flattened device tree of the kernel.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int update_cpus_node(void *fdt)
-{
-	struct device_node *cpus_node, *dn;
-	int cpus_offset, cpus_subnode_offset, ret = 0;
-
-	cpus_offset = fdt_path_offset(fdt, "/cpus");
-	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
-		pr_err("Malformed device tree: error reading /cpus node: %s\n",
-		       fdt_strerror(cpus_offset));
-		return cpus_offset;
-	}
-
-	if (cpus_offset > 0) {
-		ret = fdt_del_node(fdt, cpus_offset);
-		if (ret < 0) {
-			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
-			return -EINVAL;
-		}
-	}
-
-	/* Add cpus node to fdt */
-	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
-	if (cpus_offset < 0) {
-		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
-		return -EINVAL;
-	}
-
-	/* Add cpus node properties */
-	cpus_node = of_find_node_by_path("/cpus");
-	ret = add_node_props(fdt, cpus_offset, cpus_node);
-	of_node_put(cpus_node);
-	if (ret < 0)
-		return ret;
-
-	/* Loop through all subnodes of cpus and add them to fdt */
-	for_each_node_by_type(dn, "cpu") {
-		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
-		if (cpus_subnode_offset < 0) {
-			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
-			       fdt_strerror(cpus_subnode_offset));
-			ret = cpus_subnode_offset;
-			goto out;
-		}
-
-		ret = add_node_props(fdt, cpus_subnode_offset, dn);
-		if (ret < 0)
-			goto out;
-	}
-out:
-	of_node_put(dn);
-	return ret;
-}
-
 static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
 			 const char *propname)
 {
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v7 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
  2023-01-15 15:01 ` [PATCH v7 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
@ 2023-01-15 15:02 ` Sourabh Jain
  2023-01-15 15:02 ` [PATCH v7 3/8] powerpc/crash: update kimage_arch struct Sourabh Jain
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:02 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

The config option CRASH_HOTPLUG enables in kernel update to kexec
segments due to CPU/Memory hotplug events. By default it is disabled.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/Kconfig | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b8c4ac56bddc5..57d18cc5d27c5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -668,6 +668,18 @@ config CRASH_DUMP
 	  The same kernel binary can be used as production kernel and dump
 	  capture kernel.
 
+config CRASH_HOTPLUG
+	bool "Update crash capture system on CPU/Memory hotplug event"
+	default n
+	depends on CRASH_DUMP && (HOTPLUG_CPU || MEMORY_HOTPLUG)
+	help
+	  In kernel update to relevant kexec segments due to change
+	  in system configuration, rather reloading all the kexec
+	  segments again from userspace by monitoring CPU/Memory
+	  hotplug event in userspace using udev.
+
+	  If unsure, say Y.
+
 config FA_DUMP
 	bool "Firmware-assisted dump"
 	depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v7 3/8] powerpc/crash: update kimage_arch struct
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
  2023-01-15 15:01 ` [PATCH v7 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
  2023-01-15 15:02 ` [PATCH v7 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG Sourabh Jain
@ 2023-01-15 15:02 ` Sourabh Jain
  2023-01-19 18:57   ` Laurent Dufour
  2023-01-15 15:02 ` [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:02 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Add a new member "fdt_index" to kimage_arch struct to hold the index of
the FDT (Flattened Device Tree) segment in the kexec segment array.

Having direct access to FDT segment will help arch crash hotplug handler
to avoid looping kexec segment array to identify the FDT segment index
for every FDT update on hotplug events.

The fdt_index is initialized during the kexec load for both kexec_load and
kexec_file_load system call.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h  |  7 +++++++
 arch/powerpc/kexec/core_64.c      | 27 +++++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c       |  6 ++++++
 arch/powerpc/kexec/file_load_64.c |  5 +++++
 4 files changed, 45 insertions(+)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 8090ad7d97d9d..5a322c1737661 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -103,6 +103,10 @@ void kexec_copy_flush(struct kimage *image);
 struct crash_mem;
 int update_cpus_node(void *fdt);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#if defined(CONFIG_CRASH_HOTPLUG)
+int machine_kexec_post_load(struct kimage *image);
+#define machine_kexec_post_load machine_kexec_post_load
+#endif
 #endif
 
 #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
@@ -118,6 +122,9 @@ extern const struct kexec_file_ops kexec_elf64_ops;
 struct kimage_arch {
 	struct crash_mem *exclude_ranges;
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+	int fdt_index;
+#endif
 	unsigned long backup_start;
 	void *backup_buf;
 	void *fdt;
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 0b292f93a74cc..3d4fe1aa6f761 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -77,6 +77,33 @@ int machine_kexec_prepare(struct kimage *image)
 	return 0;
 }
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+int machine_kexec_post_load(struct kimage *kimage)
+{
+	int i;
+	void *ptr;
+	unsigned long mem;
+
+	/* Mark fdt_index invalid */
+	kimage->arch.fdt_index = -1;
+
+	if (kimage->type != KEXEC_TYPE_CRASH)
+		return 0;
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		mem = kimage->segment[i].mem;
+		ptr = __va(mem);
+
+		if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
+			kimage->arch.fdt_index = i;
+			break;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /* Called during kexec sequence with MMU off */
 static notrace void copy_segments(unsigned long ind)
 {
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index eeb258002d1e0..2a17f171661f1 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -123,6 +123,12 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	kbuf.buf_align = PAGE_SIZE;
 	kbuf.top_down = true;
 	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
+
+#if defined(CONFIG_CRASH_HOTPLUG)
+	image->arch.fdt_index = image->nr_segments;
+#endif
+	kbuf.memsz = fdt_totalsize(fdt);
+
 	ret = kexec_add_buffer(&kbuf);
 	if (ret)
 		goto out_free_fdt;
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 9bc70b4d8eafc..725f74d1b928c 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -1153,6 +1153,11 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
 		return ret;
 	}
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+	/* Mark fdt_index invalid */
+	image->arch.fdt_index = -1;
+#endif
+
 	return kexec_image_probe_default(image, buf, buf_len);
 }
 
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (2 preceding siblings ...)
  2023-01-15 15:02 ` [PATCH v7 3/8] powerpc/crash: update kimage_arch struct Sourabh Jain
@ 2023-01-15 15:02 ` Sourabh Jain
  2023-01-19 18:29   ` Laurent Dufour
  2023-01-15 15:02 ` [PATCH v7 5/8] crash: pass hotplug action type to arch crash hotplug handler Sourabh Jain
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:02 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

On architectures like PowerPC the crash notes are available for all
possible CPUs. So let's populate the elfcorehdr for all possible
CPUs having crash notes to avoid updating elfcorehdr during in-kernel
crash update on CPU hotplug events.

The similar technique is used in kexec-tool for kexec_load case.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 kernel/crash_core.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 910d377ea317e..19f987b3851e8 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
 	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
 	ehdr->e_phentsize = sizeof(Elf64_Phdr);
 
-	/* Prepare one phdr of type PT_NOTE for each present CPU */
-	for_each_present_cpu(cpu) {
+	/* Prepare one phdr of type PT_NOTE for possible CPU with crash note. */
+	for_each_possible_cpu(cpu) {
 #ifdef CONFIG_CRASH_HOTPLUG
 		if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
 			/* Skip the soon-to-be offlined cpu */
@@ -373,8 +373,11 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
 				continue;
 		}
 #endif
-		phdr->p_type = PT_NOTE;
 		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
+		if (!notes_addr)
+			continue;
+
+		phdr->p_type = PT_NOTE;
 		phdr->p_offset = phdr->p_paddr = notes_addr;
 		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
 		(ehdr->e_phnum)++;
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v7 5/8] crash: pass hotplug action type to arch crash hotplug handler
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (3 preceding siblings ...)
  2023-01-15 15:02 ` [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
@ 2023-01-15 15:02 ` Sourabh Jain
  2023-01-15 15:02 ` [PATCH v7 6/8] powerpc/crash: add crash CPU hotplug support Sourabh Jain
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:02 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

On PowerPC, the crash update action for CPU and Memory hotplug events
is not the same. Since arch_crash_handle_hotplug_event is a common
crash hotplug handler for both CPU and Memory hot un/plug events,
it is hard to differentiate which hotplug event triggered the arch
crash hotplug handler and decide the crash update action accordingly.

Pass the hotplug action type to arch specific crash hotplug handler
so that corresponding crash update action can be taken based on the
hotplug action type.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/x86/include/asm/kexec.h | 2 +-
 arch/x86/kernel/crash.c      | 3 ++-
 include/linux/kexec.h        | 2 +-
 kernel/crash_core.c          | 2 +-
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 1bc852ce347d4..ec33a592a2ddd 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -213,7 +213,7 @@ extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 5186df48ce6c6..0026c5ad828d7 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -465,12 +465,13 @@ int crash_load_segments(struct kimage *image)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: the active struct kimage
+ * @hp_action: the hot un/plug action being handled
  *
  * To accurately reflect hot un/plug changes, the new elfcorehdr
  * is prepared in a kernel buffer, and then it is written on top
  * of the existing/old elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
 {
 	void *elfbuf = NULL, *old_elfcorehdr;
 	unsigned long mem, memsz;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0b1ad1ac06e37..ea8ff85b0b436 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -509,7 +509,7 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action) { }
 #endif
 
 #ifndef crash_hotplug_cpu_support
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 19f987b3851e8..bdf8c363a90aa 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -707,7 +707,7 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 					cpu : KEXEC_CRASH_HP_INVALID_CPU;
 
 			/* Now invoke arch-specific update handler */
-			arch_crash_handle_hotplug_event(image);
+			arch_crash_handle_hotplug_event(image, hp_action);
 
 			/* No longer handling a hotplug event */
 			image->hotplug_event = false;
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v7 6/8] powerpc/crash: add crash CPU hotplug support
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (4 preceding siblings ...)
  2023-01-15 15:02 ` [PATCH v7 5/8] crash: pass hotplug action type to arch crash hotplug handler Sourabh Jain
@ 2023-01-15 15:02 ` Sourabh Jain
  2023-01-15 15:02 ` [PATCH v7 7/8] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
  2023-01-15 15:02 ` [PATCH v7 8/8] powerpc/kexec: add crash memory hotplug support Sourabh Jain
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:02 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Introduce powerpc crash hotplug handler to update the necessary kexec
segments on CPU/Memory hotplug events. A common crash hotplug handler is
triggered from generic infrastructure for both CPU/Memory hot un/plugged
events but in this patch, only CPU hot un/plugged events are introduced.
The memory hotplug support is added in upcoming patches.

The elfcorehdr segment is used to exchange the CPU and other dump related
information between the kernels. Ideally, the elfcorehdr segment needs to
be recreated on CPU hotplug events to reflect the changes. But on PowerPC
elfcorehdr is built with all possible CPUs instead of just online CPUs
hence no elfcorehdr segment update/recreation is needed.

In addition to elfcorehdr there is one more kexec segment that holds CPU
data FDT (Flattened Device Tree). To boot the PowerPC kernel the crashing
CPU has to be part of the FDT segment. If the kdump kernel doesn't find
the crashing CPU in the FDT segment, it fails to boot.

The only action needed on PowerPC to handle the crash CPU hotplug event
is to add hot added CPUs in the FDT segment to avoid kdump kernel boot
failure in case the system crashes on hot added CPU.

So for the CPU hot add events, the FDT segment is updated with hot added
CPU and Since there is no need to remove the hot unplugged CPUs from the
FDT segment hence no action taken on CPU hot remove event in PowerPC arch
crash hotplug handler.

To accommodate a growing number of CPUs, FDT is built with additional
buffer space to ensure that it can hold all possible CPUs.

The changes done here will also work for kexec_load system call given
that the kexec tool builds the FDT segment with additional space to
accommodate all possible CPUs as it is done for kexec_file_load system
call in the kernel.

Since memory crash hotplug support is not there yet the crash hotplug
handler simply warns the user and returns.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h |  4 ++
 arch/powerpc/kexec/core_64.c     | 43 ++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c      | 69 +++++++++++++++++++++++++++++++-
 3 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 5a322c1737661..e7cd4fd2becf5 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -106,6 +106,10 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges);
 #if defined(CONFIG_CRASH_HOTPLUG)
 int machine_kexec_post_load(struct kimage *image);
 #define machine_kexec_post_load machine_kexec_post_load
+
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
+#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
+
 #endif
 #endif
 
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 3d4fe1aa6f761..7748b633c20fa 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -570,6 +570,49 @@ int update_cpus_node(void *fdt)
 	return ret;
 }
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+/**
+ * arch_crash_hotplug_handler() - Handle hotplug kexec segements changes FDT, elfcorehdr
+ * @image: the active struct kimage
+ * @hp_action: the hot un/plug action being handled
+ *
+ * To accurately reflect CPU hot un/plug changes, the FDT must be updated with the
+ * new list of CPUs.
+ */
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
+{
+	void *fdt;
+
+	/* No action needed for CPU hot-unplug */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
+		return;
+
+	/* crash update on memory hotplug is not support yet */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
+		pr_info_once("crash hp: crash update is not supported with memory hotplug\n");
+		return;
+	}
+
+	/* Must have valid FDT index */
+	if (image->arch.fdt_index < 0) {
+		pr_err("crash hp: unable to locate FDT segment");
+		return;
+	}
+
+	fdt = __va((void *)image->segment[image->arch.fdt_index].mem);
+
+	/* Temporarily invalidate the crash image while it is replaced */
+	xchg(&kexec_crash_image, NULL);
+
+	/* update FDT to refelect changes to CPU resrouces */
+	if (update_cpus_node(fdt))
+		pr_err("crash hp: failed to update crash FDT");
+
+	/* The crash image is now valid once again */
+	xchg(&kexec_crash_image, image);
+}
+#endif
+
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 2a17f171661f1..c9dfd6d7660ed 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -24,6 +24,64 @@
 #include <linux/slab.h>
 #include <linux/types.h>
 
+#include <asm/kvm_book3s_asm.h>
+
+#if defined(CONFIG_CRASH_HOTPLUG)
+/**
+ * get_cpu_node_sz() - Calculate the space needed to store a CPU device
+ * type node in FDT. The calculation is done based on the existing CPU node
+ * in unflatten device tree. Loop through all the properties of the very
+ * first CPU type device node found in unflatten device tree and returns
+ * the sum of the property length and property string size of all properties
+ * of a CPU node.
+ */
+static int get_cpu_node_sz(void)
+{
+	struct device_node *dn;
+	struct property *pp;
+	int cpu_node_size = 0;
+
+	dn = of_find_node_by_type(NULL, "cpu");
+
+	if (!dn) {
+		pr_warn("Unable to locate cpu device_type node.\n");
+		return 0;
+	}
+
+	/* Every node in FDT starts with FDT_BEGIN_NODE and ends with
+	 * FDT_END_NODE that takes one byte each.
+	 */
+	cpu_node_size = 2;
+
+	for_each_property_of_node(dn, pp) {
+		/**
+		 * For each property add two bytes extra. One for string null
+		 * character for property name and other for FDT property start
+		 * tag FDT_PROP.
+		 */
+		cpu_node_size += pp->length + strlen(pp->name) + 2;
+	}
+	return cpu_node_size;
+}
+
+/*
+ * get_crash_fdt_mem_sz() - calcuate mem size for crash kernel FDT
+ * @fdt: pointer to crash kernel FDT
+ *
+ * Calculate the buffer space needed to accommodate more CPU nodes in
+ * crash FDT post capture kernel load due to CPU hotplug events.
+ */
+static unsigned int get_crash_fdt_mem_sz(void *fdt)
+{
+	int fdt_cpu_nodes_sz, offline_cpu_cnt;
+
+	offline_cpu_cnt = (num_possible_cpus() - num_present_cpus()) / MAX_SMT_THREADS;
+	fdt_cpu_nodes_sz = get_cpu_node_sz() * offline_cpu_cnt;
+
+	return fdt_totalsize(fdt) + fdt_cpu_nodes_sz;
+}
+#endif
+
 static void *elf64_load(struct kimage *image, char *kernel_buf,
 			unsigned long kernel_len, char *initrd,
 			unsigned long initrd_len, char *cmdline,
@@ -119,15 +177,22 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	fdt_pack(fdt);
 
 	kbuf.buffer = fdt;
-	kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt);
+	kbuf.bufsz = fdt_totalsize(fdt);
 	kbuf.buf_align = PAGE_SIZE;
 	kbuf.top_down = true;
 	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
 
 #if defined(CONFIG_CRASH_HOTPLUG)
 	image->arch.fdt_index = image->nr_segments;
+	if (image->type == KEXEC_TYPE_CRASH) {
+		kbuf.memsz = get_crash_fdt_mem_sz(fdt);
+		fdt_set_totalsize(fdt, kbuf.memsz);
+		image->arch.fdt_index = image->nr_segments;
+	} else
 #endif
-	kbuf.memsz = fdt_totalsize(fdt);
+	{
+		kbuf.memsz = fdt_totalsize(fdt);
+	}
 
 	ret = kexec_add_buffer(&kbuf);
 	if (ret)
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v7 7/8] crash: forward memory_notify args to arch crash hotplug handler
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (5 preceding siblings ...)
  2023-01-15 15:02 ` [PATCH v7 6/8] powerpc/crash: add crash CPU hotplug support Sourabh Jain
@ 2023-01-15 15:02 ` Sourabh Jain
  2023-01-15 15:02 ` [PATCH v7 8/8] powerpc/kexec: add crash memory hotplug support Sourabh Jain
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:02 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

The way memory hot remove is handled on PowerPC, it is hard to update
the elfcorehdr without memory_notify args.

On PowePC memblock data structure is used to prepare elfcorehdr for kdump.
Since the notifier used for memory hotplug crash handler get initiated
before the memblock data structure update happens (as depicted below),
the newly prepared elfcorehdr still holds the old memory regions. So if
the system crash with obsolete elfcorehdr, makedumpfile failed to collect
vmcore.

Sequence of actions done on PowerPC to serve the memory hot remove:

 Initiate memory hot remove
          |
          v
 offline pages
          |
          v
 initiate memory notify call chain
 for MEM_OFFLINE event.
 (same is used for crash update)
          |
          v
 prepare new elfcorehdr for kdump using
 memblock data structure
          |
          v
 update memblock data structure

How passing memory_notify to arch crash hotplug handler will help?

memory_notify holds the start PFN and page count, with that base address
and size of hot unplugged memory can calculated and same can be use to
avoid hot unplugged memory region to get added in the elfcorehdr.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h |  2 +-
 arch/powerpc/kexec/core_64.c     |  3 ++-
 arch/x86/include/asm/kexec.h     |  3 ++-
 arch/x86/kernel/crash.c          |  4 +++-
 include/linux/kexec.h            |  6 +++++-
 kernel/crash_core.c              | 14 +++++++-------
 6 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index e7cd4fd2becf5..743056f0bedf2 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -107,7 +107,7 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges);
 int machine_kexec_post_load(struct kimage *image);
 #define machine_kexec_post_load machine_kexec_post_load
 
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #endif
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 7748b633c20fa..1f807b29db93f 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -575,11 +575,12 @@ int update_cpus_node(void *fdt)
  * arch_crash_hotplug_handler() - Handle hotplug kexec segements changes FDT, elfcorehdr
  * @image: the active struct kimage
  * @hp_action: the hot un/plug action being handled
+ * @arg: struct memory_notify data handler
  *
  * To accurately reflect CPU hot un/plug changes, the FDT must be updated with the
  * new list of CPUs.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action, void *arg)
 {
 	void *fdt;
 
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index ec33a592a2ddd..171edb8167a12 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -213,7 +213,8 @@ extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
+void arch_crash_handle_hotplug_event(struct kimage *image,
+				     unsigned int hp_action, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 0026c5ad828d7..6bdfefebdd27c 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -466,12 +466,14 @@ int crash_load_segments(struct kimage *image)
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: the active struct kimage
  * @hp_action: the hot un/plug action being handled
+ * @arg: struct memory_notify data handler
  *
  * To accurately reflect hot un/plug changes, the new elfcorehdr
  * is prepared in a kernel buffer, and then it is written on top
  * of the existing/old elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
+void arch_crash_handle_hotplug_event(struct kimage *image,
+				     unsigned int hp_action, void *arg)
 {
 	void *elfbuf = NULL, *old_elfcorehdr;
 	unsigned long mem, memsz;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index ea8ff85b0b436..f640d3c30f41c 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -509,7 +509,11 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image,
+						   unsigned int hp_action,
+						   void *arg)
+{
+}
 #endif
 
 #ifndef crash_hotplug_cpu_support
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index bdf8c363a90aa..735f2022593fa 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -645,7 +645,7 @@ subsys_initcall(crash_save_vmcoreinfo_init);
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
+static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
 {
 	/* Obtain lock while changing crash information */
 	if (kexec_trylock()) {
@@ -707,7 +707,7 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 					cpu : KEXEC_CRASH_HP_INVALID_CPU;
 
 			/* Now invoke arch-specific update handler */
-			arch_crash_handle_hotplug_event(image, hp_action);
+			arch_crash_handle_hotplug_event(image, hp_action, arg);
 
 			/* No longer handling a hotplug event */
 			image->hotplug_event = false;
@@ -722,17 +722,17 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 	}
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *arg)
 {
 	switch (val) {
 	case MEM_ONLINE:
 		handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-			KEXEC_CRASH_HP_INVALID_CPU);
+			KEXEC_CRASH_HP_INVALID_CPU, arg);
 		break;
 
 	case MEM_OFFLINE:
 		handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY,
-			KEXEC_CRASH_HP_INVALID_CPU);
+			KEXEC_CRASH_HP_INVALID_CPU, arg);
 		break;
 	}
 	return NOTIFY_OK;
@@ -745,13 +745,13 @@ static struct notifier_block crash_memhp_nb = {
 
 static int crash_cpuhp_online(unsigned int cpu)
 {
-	handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu);
+	handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu, NULL);
 	return 0;
 }
 
 static int crash_cpuhp_offline(unsigned int cpu)
 {
-	handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu);
+	handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu, NULL);
 	return 0;
 }
 
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v7 8/8] powerpc/kexec: add crash memory hotplug support
  2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (6 preceding siblings ...)
  2023-01-15 15:02 ` [PATCH v7 7/8] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
@ 2023-01-15 15:02 ` Sourabh Jain
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-15 15:02 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Extend PowerPC arch crash hotplug handler to support memory hotplug
events. Since elfcorehdr is used to exchange the memory info between the
kernels hence it needs to be recreated to reflect the changes due to
memory hotplug events.

The way memory hotplug events are handled on PowerPC and the notifier call
chain used in generic code to trigger the arch crash handler, the process
to recreate the elfcorehdr is different for memory add and remove events.

In the hot remove case, the memory region is first marked offline then the
notifier call chain is triggered (same is used to initiate arch crash
hotplug handler) and at last the memblock structure is updated. Whereas
in the hot add case, memblock structure is updated before the notifier
call chain is triggered.

On PowerPC, memblock structure is used to identify the memory ranges for
elfcorehdr. In case of memory hot remove the memblock structure is updated
after the arch crash hotplug handler is triggered, hence an additional
step is taken to ensure that memory ranges used to build elfcorehdr do
not include hot removed memory. Whereas no such extra steps are needed
for the hot add case because memblock structure is updated before the
arch crash hotplug handler is triggered.

The hot removed memory is explicitly removed from the memory ranges list
before creating elfcorehdr for the hot remove case.

To accommodate a growing number of memory regions, elfcorehdr is built
with additional buffer space to hold max memory regions.

The changes done here will also work for the kexec_load system call given
that the kexec tool builds the elfcoredhr with additional space to
accommodate future memory regions as it is done for kexec_file_load
system call.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec_ranges.h |  1 +
 arch/powerpc/kexec/core_64.c            | 59 ++++++++++++++++++++++--
 arch/powerpc/kexec/file_load_64.c       | 12 ++++-
 arch/powerpc/kexec/ranges.c             | 60 +++++++++++++++++++++++++
 4 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec_ranges.h b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e870..802abf580cf0f 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int add_tce_mem_ranges(struct crash_mem **mem_ranges);
 int add_initrd_mem_range(struct crash_mem **mem_ranges);
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 1f807b29db93f..0516fb5dd543f 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/libfdt.h>
 #include <linux/memblock.h>
+#include <linux/memory.h>
 
 #include <asm/page.h>
 #include <asm/current.h>
@@ -571,6 +572,60 @@ int update_cpus_node(void *fdt)
 }
 
 #if defined(CONFIG_CRASH_HOTPLUG)
+int update_crash_elfcorehdr(struct kimage *image, unsigned int hp_action, void *arg)
+{
+	int ret;
+	struct crash_mem *cmem = NULL;
+	struct kexec_segment *ksegment;
+	unsigned long elfsz;
+	void *elfbuf = NULL;
+	void *mem;
+	unsigned long memsz;
+	char *ptr;
+	struct memory_notify *mn = (struct memory_notify *) arg;
+	unsigned long base_addr;
+	unsigned long size;
+
+	ksegment = &image->segment[image->elfcorehdr_index];
+	mem = (void *) ksegment->mem;
+	memsz = ksegment->memsz;
+
+	ret = get_crash_memory_ranges(&cmem);
+	if (ret) {
+		pr_err("crash hp: failed to get crash mem range\n");
+		return -1;
+	}
+
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY) {
+		base_addr = PFN_PHYS(mn->start_pfn);
+		size = mn->nr_pages * PAGE_SIZE;
+		ret = remove_mem_range(&cmem, base_addr, size);
+		if (ret)
+			return -1;
+	}
+
+	ret = crash_prepare_elf64_headers(image, cmem, false, &elfbuf, &elfsz);
+	if (ret) {
+		pr_err("crash hp: failed to prepare elf header\n");
+		return -1;
+	}
+
+	if (elfsz > memsz) {
+		pr_err("crash hp: updated crash elfcorehdr elfsz %lu > memsz %lu", elfsz, memsz);
+		return -1;
+	}
+
+	ptr = __va(mem);
+	if (ptr) {
+		xchg(&kexec_crash_image, NULL);
+		memcpy((void *)ptr, elfbuf, elfsz);
+		xchg(&kexec_crash_image, image);
+	}
+
+	vfree(elfbuf);
+	return 0;
+}
+
 /**
  * arch_crash_hotplug_handler() - Handle hotplug kexec segements changes FDT, elfcorehdr
  * @image: the active struct kimage
@@ -588,9 +643,8 @@ void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_actio
 	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
 		return;
 
-	/* crash update on memory hotplug is not support yet */
 	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
-		pr_info_once("crash hp: crash update is not supported with memory hotplug\n");
+		update_crash_elfcorehdr(image, hp_action, arg);
 		return;
 	}
 
@@ -605,7 +659,6 @@ void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_actio
 	/* Temporarily invalidate the crash image while it is replaced */
 	xchg(&kexec_crash_image, NULL);
 
-	/* update FDT to refelect changes to CPU resrouces */
 	if (update_cpus_node(fdt))
 		pr_err("crash hp: failed to update crash FDT");
 
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 725f74d1b928c..5ec5d0b1ec904 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -21,6 +21,8 @@
 #include <linux/memblock.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/elf.h>
+
 #include <asm/setup.h>
 #include <asm/drmem.h>
 #include <asm/firmware.h>
@@ -735,7 +737,13 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
 
 	kbuf->buffer = headers;
 	kbuf->mem = KEXEC_BUF_MEM_UNKNOWN;
-	kbuf->bufsz = kbuf->memsz = headers_sz;
+	kbuf->bufsz = headers_sz;
+/* Additional buffer space to accommodate future memory ranges */
+#if defined(CONFIG_MEMORY_HOTPLUG)
+	kbuf->memsz = PN_XNUM * sizeof(Elf64_Phdr);
+#else
+	kbuf->memsz = headers_sz;
+#endif
 	kbuf->top_down = false;
 
 	ret = kexec_add_buffer(kbuf);
@@ -745,7 +753,7 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
 	}
 
 	image->elf_load_addr = kbuf->mem;
-	image->elf_headers_sz = headers_sz;
+	image->elf_headers_sz = kbuf->memsz;
 	image->elf_headers = headers;
 out:
 	kfree(cmem);
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index 5fc53a5fcfdf6..2bb90874df781 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -234,6 +234,66 @@ int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
 	return __add_mem_range(mem_ranges, base, size);
 }
 
+/**
+ * remove_mem_range - Removes the given memory range from the range list.
+ * @mem_ranges:    Range list to remove the memory range to.
+ * @base:          Base address of the range to remove.
+ * @size:          Size of the memory range to remove.
+ *
+ * (Re)allocates memory, if needed.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
+{
+	int ret = 0;
+	unsigned int i;
+	struct crash_mem *mem_rngs = *mem_ranges;
+	u64 mstart, mend;
+	u64 range_start, range_end;
+
+	if (!size)
+		return 0;
+
+	range_start = base;
+	range_end = base + size - 1;
+
+	for (i = 0; i < mem_rngs->nr_ranges; i++) {
+		mstart = mem_rngs->ranges[i].start;
+		mend = mem_rngs->ranges[i].end;
+
+		if (!(range_start >= mstart && range_end <= mend))
+			continue;
+
+		if (range_start == mstart) {
+			if (range_end == mend) {
+				for (; i < mem_rngs->nr_ranges - 1; i++) {
+					mem_rngs->ranges[i].start = mem_rngs->ranges[i+1].start;
+					mem_rngs->ranges[i].end = mem_rngs->ranges[i+1].end;
+				}
+				mem_rngs->nr_ranges--;
+				goto out;
+			}
+			mem_rngs->ranges[i].start = range_end + 1;
+			goto out;
+		} else if (range_end == mend)  {
+			mem_rngs->ranges[i].end = range_start - 1;
+			goto out;
+		} else {
+			size = mem_rngs->ranges[i].end - range_end;
+			mem_rngs->ranges[i].end = range_start - 1;
+			if (add_mem_range(mem_ranges, range_end + 1, size))
+				goto error;
+			goto out;
+		}
+	}
+error:
+	return -1;
+
+out:
+	return ret;
+}
+
 /**
  * add_tce_mem_ranges - Adds tce-table range to the given memory ranges list.
  * @mem_ranges:         Range list to add the memory range(s) to.
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-01-15 15:02 ` [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
@ 2023-01-19 18:29   ` Laurent Dufour
  2023-01-20 11:39     ` Laurent Dufour
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Dufour @ 2023-01-19 18:29 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev, mpe
  Cc: mahesh, eric.devolder, kexec, hbathini, bhe

On 15/01/2023 16:02:02, Sourabh Jain wrote:
> On architectures like PowerPC the crash notes are available for all
> possible CPUs. So let's populate the elfcorehdr for all possible
> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
> crash update on CPU hotplug events.
> 
> The similar technique is used in kexec-tool for kexec_load case.
> 
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>  kernel/crash_core.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)

This patch is not applying on ppc/next (53ab112a9508).

As far as I could see, crash_prepare_elf64_headers() is defined in the file
kernel/kexec_file.c and that's not recent, see babac4a84a88 (kexec_file,
x86: move re-factored code to generic side, 2018-04-13)

Am I missing something?

> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 910d377ea317e..19f987b3851e8 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>  	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>  	ehdr->e_phentsize = sizeof(Elf64_Phdr);
>  
> -	/* Prepare one phdr of type PT_NOTE for each present CPU */
> -	for_each_present_cpu(cpu) {
> +	/* Prepare one phdr of type PT_NOTE for possible CPU with crash note. */
> +	for_each_possible_cpu(cpu) {
>  #ifdef CONFIG_CRASH_HOTPLUG
>  		if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
>  			/* Skip the soon-to-be offlined cpu */
> @@ -373,8 +373,11 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>  				continue;
>  		}
>  #endif
> -		phdr->p_type = PT_NOTE;
>  		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
> +		if (!notes_addr)
> +			continue;
> +
> +		phdr->p_type = PT_NOTE;
>  		phdr->p_offset = phdr->p_paddr = notes_addr;
>  		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
>  		(ehdr->e_phnum)++;


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v7 3/8] powerpc/crash: update kimage_arch struct
  2023-01-15 15:02 ` [PATCH v7 3/8] powerpc/crash: update kimage_arch struct Sourabh Jain
@ 2023-01-19 18:57   ` Laurent Dufour
  2023-01-23  5:23     ` Sourabh Jain
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Dufour @ 2023-01-19 18:57 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev, mpe
  Cc: mahesh, eric.devolder, kexec, hbathini, bhe

On 15/01/2023 16:02:01, Sourabh Jain wrote:
> Add a new member "fdt_index" to kimage_arch struct to hold the index of
> the FDT (Flattened Device Tree) segment in the kexec segment array.
> 
> Having direct access to FDT segment will help arch crash hotplug handler
> to avoid looping kexec segment array to identify the FDT segment index
> for every FDT update on hotplug events.
> 
> The fdt_index is initialized during the kexec load for both kexec_load and
> kexec_file_load system call.
> 
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/kexec.h  |  7 +++++++
>  arch/powerpc/kexec/core_64.c      | 27 +++++++++++++++++++++++++++
>  arch/powerpc/kexec/elf_64.c       |  6 ++++++
>  arch/powerpc/kexec/file_load_64.c |  5 +++++
>  4 files changed, 45 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
> index 8090ad7d97d9d..5a322c1737661 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -103,6 +103,10 @@ void kexec_copy_flush(struct kimage *image);
>  struct crash_mem;
>  int update_cpus_node(void *fdt);
>  int get_crash_memory_ranges(struct crash_mem **mem_ranges);
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +int machine_kexec_post_load(struct kimage *image);
> +#define machine_kexec_post_load machine_kexec_post_load
> +#endif
>  #endif
>  
>  #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
> @@ -118,6 +122,9 @@ extern const struct kexec_file_ops kexec_elf64_ops;
>  struct kimage_arch {
>  	struct crash_mem *exclude_ranges;
>  
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +	int fdt_index;
> +#endif
>  	unsigned long backup_start;
>  	void *backup_buf;
>  	void *fdt;
> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
> index 0b292f93a74cc..3d4fe1aa6f761 100644
> --- a/arch/powerpc/kexec/core_64.c
> +++ b/arch/powerpc/kexec/core_64.c
> @@ -77,6 +77,33 @@ int machine_kexec_prepare(struct kimage *image)
>  	return 0;
>  }
>  
> +#if defined(CONFIG_CRASH_HOTPLUG)

I think you should add a small function header describing that this
function is recording the index of the FDT segment for later use.

> +int machine_kexec_post_load(struct kimage *kimage)
> +{
> +	int i;
> +	void *ptr;
> +	unsigned long mem;
> +
> +	/* Mark fdt_index invalid */
> +	kimage->arch.fdt_index = -1;

Is that really needed?
This is already done in arch_kexec_kernel_image_probe() called before this
function, isn't it?

> +
> +	if (kimage->type != KEXEC_TYPE_CRASH)
> +		return 0;
> +
> +	for (i = 0; i < kimage->nr_segments; i++) {
> +		mem = kimage->segment[i].mem;
> +		ptr = __va(mem);
> +
> +		if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
> +			kimage->arch.fdt_index = i;
> +			break;
> +		}
> +	}
> +
> +	return 0;
> +}
> +#endif
> +
>  /* Called during kexec sequence with MMU off */
>  static notrace void copy_segments(unsigned long ind)
>  {
> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
> index eeb258002d1e0..2a17f171661f1 100644
> --- a/arch/powerpc/kexec/elf_64.c
> +++ b/arch/powerpc/kexec/elf_64.c
> @@ -123,6 +123,12 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
>  	kbuf.buf_align = PAGE_SIZE;
>  	kbuf.top_down = true;
>  	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
> +
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +	image->arch.fdt_index = image->nr_segments;

I'm sorry, I'm not familliar with that code, could you explain why
fdt_index has to be assigned here, and to that value?

> +#endif
> +	kbuf.memsz = fdt_totalsize(fdt);
> +
>  	ret = kexec_add_buffer(&kbuf);
>  	if (ret)
>  		goto out_free_fdt;
> diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
> index 9bc70b4d8eafc..725f74d1b928c 100644
> --- a/arch/powerpc/kexec/file_load_64.c
> +++ b/arch/powerpc/kexec/file_load_64.c
> @@ -1153,6 +1153,11 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
>  		return ret;
>  	}
>  
> +#if defined(CONFIG_CRASH_HOTPLUG)
> +	/* Mark fdt_index invalid */
> +	image->arch.fdt_index = -1;
> +#endif
> +
>  	return kexec_image_probe_default(image, buf, buf_len);
>  }
>  


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-01-19 18:29   ` Laurent Dufour
@ 2023-01-20 11:39     ` Laurent Dufour
  2023-01-23  5:27       ` Sourabh Jain
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Dufour @ 2023-01-20 11:39 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev, mpe
  Cc: mahesh, eric.devolder, kexec, bhe, hbathini

On 19/01/2023 19:29:52, Laurent Dufour wrote:
> On 15/01/2023 16:02:02, Sourabh Jain wrote:
>> On architectures like PowerPC the crash notes are available for all
>> possible CPUs. So let's populate the elfcorehdr for all possible
>> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
>> crash update on CPU hotplug events.
>>
>> The similar technique is used in kexec-tool for kexec_load case.
>>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>  kernel/crash_core.c | 9 ++++++---
>>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> This patch is not applying on ppc/next (53ab112a9508).
> 
> As far as I could see, crash_prepare_elf64_headers() is defined in the file
> kernel/kexec_file.c and that's not recent, see babac4a84a88 (kexec_file,
> x86: move re-factored code to generic side, 2018-04-13)
> 
> Am I missing something?

My mistake, sounds that your series is based on top of the Eric's one (not yet upstream):

https://lore.kernel.org/lkml/20230118213544.2128-1-eric.devolder@oracle.com/

> 
>>
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 910d377ea317e..19f987b3851e8 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>>  	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>>  	ehdr->e_phentsize = sizeof(Elf64_Phdr);
>>  
>> -	/* Prepare one phdr of type PT_NOTE for each present CPU */
>> -	for_each_present_cpu(cpu) {
>> +	/* Prepare one phdr of type PT_NOTE for possible CPU with crash note. */
>> +	for_each_possible_cpu(cpu) {
>>  #ifdef CONFIG_CRASH_HOTPLUG
>>  		if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
>>  			/* Skip the soon-to-be offlined cpu */
>> @@ -373,8 +373,11 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>>  				continue;
>>  		}
>>  #endif
>> -		phdr->p_type = PT_NOTE;
>>  		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
>> +		if (!notes_addr)
>> +			continue;
>> +
>> +		phdr->p_type = PT_NOTE;
>>  		phdr->p_offset = phdr->p_paddr = notes_addr;
>>  		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
>>  		(ehdr->e_phnum)++;
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v7 3/8] powerpc/crash: update kimage_arch struct
  2023-01-19 18:57   ` Laurent Dufour
@ 2023-01-23  5:23     ` Sourabh Jain
  0 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-23  5:23 UTC (permalink / raw)
  To: Laurent Dufour, linuxppc-dev, mpe
  Cc: mahesh, eric.devolder, kexec, bhe, hbathini

Hello Laurent,

On 20/01/23 00:27, Laurent Dufour wrote:
> On 15/01/2023 16:02:01, Sourabh Jain wrote:
>> Add a new member "fdt_index" to kimage_arch struct to hold the index of
>> the FDT (Flattened Device Tree) segment in the kexec segment array.
>>
>> Having direct access to FDT segment will help arch crash hotplug handler
>> to avoid looping kexec segment array to identify the FDT segment index
>> for every FDT update on hotplug events.
>>
>> The fdt_index is initialized during the kexec load for both kexec_load and
>> kexec_file_load system call.
>>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   arch/powerpc/include/asm/kexec.h  |  7 +++++++
>>   arch/powerpc/kexec/core_64.c      | 27 +++++++++++++++++++++++++++
>>   arch/powerpc/kexec/elf_64.c       |  6 ++++++
>>   arch/powerpc/kexec/file_load_64.c |  5 +++++
>>   4 files changed, 45 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
>> index 8090ad7d97d9d..5a322c1737661 100644
>> --- a/arch/powerpc/include/asm/kexec.h
>> +++ b/arch/powerpc/include/asm/kexec.h
>> @@ -103,6 +103,10 @@ void kexec_copy_flush(struct kimage *image);
>>   struct crash_mem;
>>   int update_cpus_node(void *fdt);
>>   int get_crash_memory_ranges(struct crash_mem **mem_ranges);
>> +#if defined(CONFIG_CRASH_HOTPLUG)
>> +int machine_kexec_post_load(struct kimage *image);
>> +#define machine_kexec_post_load machine_kexec_post_load
>> +#endif
>>   #endif
>>   
>>   #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
>> @@ -118,6 +122,9 @@ extern const struct kexec_file_ops kexec_elf64_ops;
>>   struct kimage_arch {
>>   	struct crash_mem *exclude_ranges;
>>   
>> +#if defined(CONFIG_CRASH_HOTPLUG)
>> +	int fdt_index;
>> +#endif
>>   	unsigned long backup_start;
>>   	void *backup_buf;
>>   	void *fdt;
>> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
>> index 0b292f93a74cc..3d4fe1aa6f761 100644
>> --- a/arch/powerpc/kexec/core_64.c
>> +++ b/arch/powerpc/kexec/core_64.c
>> @@ -77,6 +77,33 @@ int machine_kexec_prepare(struct kimage *image)
>>   	return 0;
>>   }
>>   
>> +#if defined(CONFIG_CRASH_HOTPLUG)
> I think you should add a small function header describing that this
> function is recording the index of the FDT segment for later use.

Yes good to have one,  I will add it in the next version.

>
>> +int machine_kexec_post_load(struct kimage *kimage)
>> +{
>> +	int i;
>> +	void *ptr;
>> +	unsigned long mem;
>> +
>> +	/* Mark fdt_index invalid */
>> +	kimage->arch.fdt_index = -1;
> Is that really needed?
> This is already done in arch_kexec_kernel_image_probe() called before this
> function, isn't it?

Oh I didn't realize machine_kexec_post load is called for both
kexec_load and kexec_file_load system call.

The intention was to initialize fdt_index for both system calls but since
machine_kexec_post_load is called for both system calls,
initializing fdt_index in arch_kernel_image_probe function is redundant.

Thanks for point it out. I will fix this in the next version by only 
initializing the
fdtindex in machine_kexec_post_load. With this fdt_index will be 
initialized
for both syscalls.

>> +
>> +	if (kimage->type != KEXEC_TYPE_CRASH)
>> +		return 0;
>> +
>> +	for (i = 0; i < kimage->nr_segments; i++) {
>> +		mem = kimage->segment[i].mem;
>> +		ptr = __va(mem);
>> +
>> +		if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
>> +			kimage->arch.fdt_index = i;
>> +			break;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +#endif
>> +
>>   /* Called during kexec sequence with MMU off */
>>   static notrace void copy_segments(unsigned long ind)
>>   {
>> diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
>> index eeb258002d1e0..2a17f171661f1 100644
>> --- a/arch/powerpc/kexec/elf_64.c
>> +++ b/arch/powerpc/kexec/elf_64.c
>> @@ -123,6 +123,12 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
>>   	kbuf.buf_align = PAGE_SIZE;
>>   	kbuf.top_down = true;
>>   	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>> +
>> +#if defined(CONFIG_CRASH_HOTPLUG)
>> +	image->arch.fdt_index = image->nr_segments;
> I'm sorry, I'm not familliar with that code, could you explain why
> fdt_index has to be assigned here, and to that value?

Again I didn't realize machine_kexec_post_load is also called for 
kexec_file_load
system call too, which makes this assignment redundant.

Now why this value?

The image->nr-segments holds the count of total number of kexec segments and
when a new segment/buffer is added ( by kexec_add_buffer()) it is 
incremented by 1. With
this the index of newly added segment in the kexec segment array will be 
image->nr_segments - 1.

So instead of adding fdt segment first and then initializing fdt_index 
as image->nr_segments - 1,
the fdt_index is initialized with nr_segments before adding the fdt 
kexec segment/buffer to the
segment array.

Hope I answered you query.

Thanks for the review.

- Sourabh


>
>> +#endif
>> +	kbuf.memsz = fdt_totalsize(fdt);
>> +
>>   	ret = kexec_add_buffer(&kbuf);
>>   	if (ret)
>>   		goto out_free_fdt;
>> diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
>> index 9bc70b4d8eafc..725f74d1b928c 100644
>> --- a/arch/powerpc/kexec/file_load_64.c
>> +++ b/arch/powerpc/kexec/file_load_64.c
>> @@ -1153,6 +1153,11 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
>>   		return ret;
>>   	}
>>   
>> +#if defined(CONFIG_CRASH_HOTPLUG)
>> +	/* Mark fdt_index invalid */
>> +	image->arch.fdt_index = -1;
>> +#endif
>> +
>>   	return kexec_image_probe_default(image, buf, buf_len);
>>   }
>>   

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-01-20 11:39     ` Laurent Dufour
@ 2023-01-23  5:27       ` Sourabh Jain
  0 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-01-23  5:27 UTC (permalink / raw)
  To: linuxppc-dev


On 20/01/23 17:09, Laurent Dufour wrote:
> On 19/01/2023 19:29:52, Laurent Dufour wrote:
>> On 15/01/2023 16:02:02, Sourabh Jain wrote:
>>> On architectures like PowerPC the crash notes are available for all
>>> possible CPUs. So let's populate the elfcorehdr for all possible
>>> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
>>> crash update on CPU hotplug events.
>>>
>>> The similar technique is used in kexec-tool for kexec_load case.
>>>
>>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>> ---
>>>   kernel/crash_core.c | 9 ++++++---
>>>   1 file changed, 6 insertions(+), 3 deletions(-)
>> This patch is not applying on ppc/next (53ab112a9508).
>>
>> As far as I could see, crash_prepare_elf64_headers() is defined in the file
>> kernel/kexec_file.c and that's not recent, see babac4a84a88 (kexec_file,
>> x86: move re-factored code to generic side, 2018-04-13)
>>
>> Am I missing something?
> My mistake, sounds that your series is based on top of the Eric's one (not yet upstream):
>
> https://lore.kernel.org/lkml/20230118213544.2128-1-eric.devolder@oracle.com/

Yes this patch series is dependent on the above patch series. In the 
next version I will
share a git repo link that will have this patch series applied along 
with dependent
patch series for easy try.

- Sourabh


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-01-23  5:28 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-15 15:01 [PATCH v7 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
2023-01-15 15:01 ` [PATCH v7 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
2023-01-15 15:02 ` [PATCH v7 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG Sourabh Jain
2023-01-15 15:02 ` [PATCH v7 3/8] powerpc/crash: update kimage_arch struct Sourabh Jain
2023-01-19 18:57   ` Laurent Dufour
2023-01-23  5:23     ` Sourabh Jain
2023-01-15 15:02 ` [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
2023-01-19 18:29   ` Laurent Dufour
2023-01-20 11:39     ` Laurent Dufour
2023-01-23  5:27       ` Sourabh Jain
2023-01-15 15:02 ` [PATCH v7 5/8] crash: pass hotplug action type to arch crash hotplug handler Sourabh Jain
2023-01-15 15:02 ` [PATCH v7 6/8] powerpc/crash: add crash CPU hotplug support Sourabh Jain
2023-01-15 15:02 ` [PATCH v7 7/8] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
2023-01-15 15:02 ` [PATCH v7 8/8] powerpc/kexec: add crash memory hotplug support Sourabh Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).