linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel
@ 2023-02-01  6:38 Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

The Problem:
============
Post hotplug/DLPAR events the capture kernel holds stale information about the
system. Dump collection with stale capture kernel might end up in dump capture
failure or an inaccurate dump collection.

Existing solution:
==================
The existing solution to keep the capture kernel up-to-date by monitoring
hotplug event via udev rule and trigger a full capture kernel reload for
every hotplug event.

Shortcomings:
------------------------------------------------
- Leaves a window where kernel crash might not lead to a successful dump
  collection.
- Reloading all kexec components for each hotplug is inefficient.
- udev rules are prone to races if hotplug events are frequent.

More about issues with an existing solution is posted here:
 - https://lkml.org/lkml/2020/12/14/532
 - https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-February/240254.html

Proposed Solution:
==================
Instead of reloading all kexec segments on hotplug event, this patch series
focuses on updating only the relevant kexec segment. Once the kexec segments
are loaded in the kernel reserved area then an arch-specific hotplug handler
will update the relevant kexec segment based on hotplug event type.

Series Dependecies
==================
This patch series implements the crash hotplug handler on PowerPC. The generic
for crash hotplug update is introduced by https://lkml.org/lkml/2023/1/18/1420 patch series.

Git tree for testing:
=====================
The below git tree has this patch series applied on top of dependent patch
series.
https://github.com/sourabhjains/linux/commits/in-kernel-crash-update

Note: only kexec_file_load syscall will work. For kexec_load mirnor
changes are required in kexec tool.

To realise the feature the kdump udev rules must be disabled for CPU/Memory
hotplug events. Comment out the below line in kdump udev rule file:

  RHEL: /usr/lib/udev/rules.d/98-kexec.rules

  	#SUBSYSTEM=="cpu", ACTION=="online", GOTO="kdump_reload_cpu"
	#SUBSYSTEM=="memory", ACTION=="online", GOTO="kdump_reload_mem"
	#SUBSYSTEM=="memory", ACTION=="offline", GOTO="kdump_reload_mem"

  SLES: /usr/lib/kdump/70-kdump.rules

	#SUBSYSTEM=="memory", ACTION=="add|remove", GOTO="kdump_try_restart"
	#SUBSYSTEM=="cpu", ACTION=="online", GOTO="kdump_try_restart"
---
Changelog:

v7 -> v8:
  - Restrict fdt_index initialization to machine_kexec_post_load
    it work for both kexec_load and kexec_file_load.[3/8] Laurent Dufour

  - Updated the logic to find the number of offline core. [6/8]

  - Changed the logic to find the elfcore program header to accommodate
    future memory ranges due memory hotplug events. [8/8]

v6 -> v7
  - added a new config to configure this feature
  - pass hotplug action type to arch specific handler

v5 -> v6
  - Added crash memory hotplug support

v4 -> v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
    instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
    under CONFIG_HOTPLUG_CPU config.

v3 -> v4:
  - Update the logic to find the additional space needed for hotadd CPUs post
    kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash hotplug
    support for kexec_file_load" patch to know more about the change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
    support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v2 -> v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of https://lkml.org/lkml/2022/3/3/674 [v5]
  - Fixed warning reported by checpatch script

v1 -> v2:
  - Use generic hotplug handler introduced by https://lkml.org/lkml/2022/2/9/1406, a
    significant change from v1.

Sourabh Jain (8):
  powerpc/kexec: turn some static helper functions public
  powerpc/crash hp: introduce a new config option CRASH_HOTPLUG
  powerpc/crash: update kimage_arch struct
  crash: add phdr for possible CPUs in elfcorehdr
  crash: pass hotplug action type to arch crash hotplug handler
  powerpc/crash: add crash CPU hotplug support
  crash: forward memory_notify args to arch crash hotplug handler
  powerpc/kexec: add crash memory hotplug support

 arch/powerpc/Kconfig                    |  12 +
 arch/powerpc/include/asm/kexec.h        |  18 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c            | 335 ++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c             |  19 +-
 arch/powerpc/kexec/file_load_64.c       | 237 +++--------------
 arch/powerpc/kexec/ranges.c             |  60 +++++
 arch/x86/include/asm/kexec.h            |   3 +-
 arch/x86/kernel/crash.c                 |   5 +-
 include/linux/kexec.h                   |   6 +-
 kernel/crash_core.c                     |  23 +-
 11 files changed, 502 insertions(+), 217 deletions(-)

-- 
2.39.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v8 1/8] powerpc/kexec: turn some static helper functions public
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG Sourabh Jain
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Move update_cpus_node and get_crash_memory_ranges functions from
kexec/file_load.c to kexec/core_64.c to make these functions usable
by other kexec compoenets.

Later in the series, both functions are used for in-kernel updates to
kexec segments in the event of CPU/Memory hotplug for both kexec_load
and kexec_file_load system call.

No functional change intended.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h  |   6 ++
 arch/powerpc/kexec/core_64.c      | 166 ++++++++++++++++++++++++++++++
 arch/powerpc/kexec/file_load_64.c | 162 -----------------------------
 3 files changed, 172 insertions(+), 162 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index a1ddba01e7d13..8090ad7d97d9d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -99,6 +99,12 @@ void relocate_new_kernel(unsigned long indirection_page, unsigned long reboot_co
 
 void kexec_copy_flush(struct kimage *image);
 
+#ifdef CONFIG_PPC64
+struct crash_mem;
+int update_cpus_node(void *fdt);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#endif
+
 #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
 void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
 #define crash_free_reserved_phys_range crash_free_reserved_phys_range
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index a79e28c91e2be..0b292f93a74cc 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,8 @@
 #include <linux/cpu.h>
 #include <linux/hardirq.h>
 #include <linux/of.h>
+#include <linux/libfdt.h>
+#include <linux/memblock.h>
 
 #include <asm/page.h>
 #include <asm/current.h>
@@ -30,6 +32,8 @@
 #include <asm/hw_breakpoint.h>
 #include <asm/svm.h>
 #include <asm/ultravisor.h>
+#include <asm/kexec_ranges.h>
+#include <asm/crashdump-ppc64.h>
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -377,6 +381,168 @@ void default_machine_kexec(struct kimage *image)
 	/* NOTREACHED */
 }
 
+/**
+ * get_crash_memory_ranges - Get crash memory ranges. This list includes
+ *                           first/crashing kernel's memory regions that
+ *                           would be exported via an elfcore.
+ * @mem_ranges:              Range list to add the memory ranges to.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int get_crash_memory_ranges(struct crash_mem **mem_ranges)
+{
+	phys_addr_t base, end;
+	struct crash_mem *tmem;
+	u64 i;
+	int ret;
+
+	for_each_mem_range(i, &base, &end) {
+		u64 size = end - base;
+
+		/* Skip backup memory region, which needs a separate entry */
+		if (base == BACKUP_SRC_START) {
+			if (size > BACKUP_SRC_SIZE) {
+				base = BACKUP_SRC_END + 1;
+				size -= BACKUP_SRC_SIZE;
+			} else
+				continue;
+		}
+
+		ret = add_mem_range(mem_ranges, base, size);
+		if (ret)
+			goto out;
+
+		/* Try merging adjacent ranges before reallocation attempt */
+		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
+			sort_memory_ranges(*mem_ranges, true);
+	}
+
+	/* Reallocate memory ranges if there is no space to split ranges */
+	tmem = *mem_ranges;
+	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+		tmem = realloc_mem_ranges(mem_ranges);
+		if (!tmem)
+			goto out;
+	}
+
+	/* Exclude crashkernel region */
+	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	/*
+	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
+	 *        regions are exported to save their context at the time of
+	 *        crash, they should actually be backed up just like the
+	 *        first 64K bytes of memory.
+	 */
+	ret = add_rtas_mem_range(mem_ranges);
+	if (ret)
+		goto out;
+
+	ret = add_opal_mem_range(mem_ranges);
+	if (ret)
+		goto out;
+
+	/* create a separate program header for the backup region */
+	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
+	if (ret)
+		goto out;
+
+	sort_memory_ranges(*mem_ranges, false);
+out:
+	if (ret)
+		pr_err("Failed to setup crash memory ranges\n");
+	return ret;
+}
+
+/**
+ * add_node_props - Reads node properties from device node structure and add
+ *                  them to fdt.
+ * @fdt:            Flattened device tree of the kernel
+ * @node_offset:    offset of the node to add a property at
+ * @dn:             device node pointer
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
+{
+	int ret = 0;
+	struct property *pp;
+
+	if (!dn)
+		return -EINVAL;
+
+	for_each_property_of_node(dn, pp) {
+		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
+		if (ret < 0) {
+			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
+			return ret;
+		}
+	}
+	return ret;
+}
+
+/**
+ * update_cpus_node - Update cpus node of flattened device tree using of_root
+ *                    device node.
+ * @fdt:              Flattened device tree of the kernel.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int update_cpus_node(void *fdt)
+{
+	struct device_node *cpus_node, *dn;
+	int cpus_offset, cpus_subnode_offset, ret = 0;
+
+	cpus_offset = fdt_path_offset(fdt, "/cpus");
+	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
+		pr_err("Malformed device tree: error reading /cpus node: %s\n",
+		       fdt_strerror(cpus_offset));
+		return cpus_offset;
+	}
+
+	if (cpus_offset > 0) {
+		ret = fdt_del_node(fdt, cpus_offset);
+		if (ret < 0) {
+			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
+			return -EINVAL;
+		}
+	}
+
+	/* Add cpus node to fdt */
+	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
+	if (cpus_offset < 0) {
+		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
+		return -EINVAL;
+	}
+
+	/* Add cpus node properties */
+	cpus_node = of_find_node_by_path("/cpus");
+	ret = add_node_props(fdt, cpus_offset, cpus_node);
+	of_node_put(cpus_node);
+	if (ret < 0)
+		return ret;
+
+	/* Loop through all subnodes of cpus and add them to fdt */
+	for_each_node_by_type(dn, "cpu") {
+		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
+		if (cpus_subnode_offset < 0) {
+			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
+			       fdt_strerror(cpus_subnode_offset));
+			ret = cpus_subnode_offset;
+			goto out;
+		}
+
+		ret = add_node_props(fdt, cpus_subnode_offset, dn);
+		if (ret < 0)
+			goto out;
+	}
+out:
+	of_node_put(dn);
+	return ret;
+}
+
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index e51d8059535b2..9bc70b4d8eafc 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -131,81 +131,6 @@ static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
 	return ret;
 }
 
-/**
- * get_crash_memory_ranges - Get crash memory ranges. This list includes
- *                           first/crashing kernel's memory regions that
- *                           would be exported via an elfcore.
- * @mem_ranges:              Range list to add the memory ranges to.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int get_crash_memory_ranges(struct crash_mem **mem_ranges)
-{
-	phys_addr_t base, end;
-	struct crash_mem *tmem;
-	u64 i;
-	int ret;
-
-	for_each_mem_range(i, &base, &end) {
-		u64 size = end - base;
-
-		/* Skip backup memory region, which needs a separate entry */
-		if (base == BACKUP_SRC_START) {
-			if (size > BACKUP_SRC_SIZE) {
-				base = BACKUP_SRC_END + 1;
-				size -= BACKUP_SRC_SIZE;
-			} else
-				continue;
-		}
-
-		ret = add_mem_range(mem_ranges, base, size);
-		if (ret)
-			goto out;
-
-		/* Try merging adjacent ranges before reallocation attempt */
-		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
-			sort_memory_ranges(*mem_ranges, true);
-	}
-
-	/* Reallocate memory ranges if there is no space to split ranges */
-	tmem = *mem_ranges;
-	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
-		tmem = realloc_mem_ranges(mem_ranges);
-		if (!tmem)
-			goto out;
-	}
-
-	/* Exclude crashkernel region */
-	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
-	if (ret)
-		goto out;
-
-	/*
-	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
-	 *        regions are exported to save their context at the time of
-	 *        crash, they should actually be backed up just like the
-	 *        first 64K bytes of memory.
-	 */
-	ret = add_rtas_mem_range(mem_ranges);
-	if (ret)
-		goto out;
-
-	ret = add_opal_mem_range(mem_ranges);
-	if (ret)
-		goto out;
-
-	/* create a separate program header for the backup region */
-	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
-	if (ret)
-		goto out;
-
-	sort_memory_ranges(*mem_ranges, false);
-out:
-	if (ret)
-		pr_err("Failed to setup crash memory ranges\n");
-	return ret;
-}
-
 /**
  * get_reserved_memory_ranges - Get reserve memory ranges. This list includes
  *                              memory regions that should be added to the
@@ -1009,93 +934,6 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
 	return extra_size;
 }
 
-/**
- * add_node_props - Reads node properties from device node structure and add
- *                  them to fdt.
- * @fdt:            Flattened device tree of the kernel
- * @node_offset:    offset of the node to add a property at
- * @dn:             device node pointer
- *
- * Returns 0 on success, negative errno on error.
- */
-static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
-{
-	int ret = 0;
-	struct property *pp;
-
-	if (!dn)
-		return -EINVAL;
-
-	for_each_property_of_node(dn, pp) {
-		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
-		if (ret < 0) {
-			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
-			return ret;
-		}
-	}
-	return ret;
-}
-
-/**
- * update_cpus_node - Update cpus node of flattened device tree using of_root
- *                    device node.
- * @fdt:              Flattened device tree of the kernel.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int update_cpus_node(void *fdt)
-{
-	struct device_node *cpus_node, *dn;
-	int cpus_offset, cpus_subnode_offset, ret = 0;
-
-	cpus_offset = fdt_path_offset(fdt, "/cpus");
-	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
-		pr_err("Malformed device tree: error reading /cpus node: %s\n",
-		       fdt_strerror(cpus_offset));
-		return cpus_offset;
-	}
-
-	if (cpus_offset > 0) {
-		ret = fdt_del_node(fdt, cpus_offset);
-		if (ret < 0) {
-			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
-			return -EINVAL;
-		}
-	}
-
-	/* Add cpus node to fdt */
-	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
-	if (cpus_offset < 0) {
-		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
-		return -EINVAL;
-	}
-
-	/* Add cpus node properties */
-	cpus_node = of_find_node_by_path("/cpus");
-	ret = add_node_props(fdt, cpus_offset, cpus_node);
-	of_node_put(cpus_node);
-	if (ret < 0)
-		return ret;
-
-	/* Loop through all subnodes of cpus and add them to fdt */
-	for_each_node_by_type(dn, "cpu") {
-		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
-		if (cpus_subnode_offset < 0) {
-			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
-			       fdt_strerror(cpus_subnode_offset));
-			ret = cpus_subnode_offset;
-			goto out;
-		}
-
-		ret = add_node_props(fdt, cpus_subnode_offset, dn);
-		if (ret < 0)
-			goto out;
-	}
-out:
-	of_node_put(dn);
-	return ret;
-}
-
 static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
 			 const char *propname)
 {
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v8 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 3/8] powerpc/crash: update kimage_arch struct Sourabh Jain
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

The config option CRASH_HOTPLUG enables in kernel update to kexec
segments due to CPU/Memory hotplug events. By default it is disabled.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/Kconfig | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b8c4ac56bddc5..57d18cc5d27c5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -668,6 +668,18 @@ config CRASH_DUMP
 	  The same kernel binary can be used as production kernel and dump
 	  capture kernel.
 
+config CRASH_HOTPLUG
+	bool "Update crash capture system on CPU/Memory hotplug event"
+	default n
+	depends on CRASH_DUMP && (HOTPLUG_CPU || MEMORY_HOTPLUG)
+	help
+	  In kernel update to relevant kexec segments due to change
+	  in system configuration, rather reloading all the kexec
+	  segments again from userspace by monitoring CPU/Memory
+	  hotplug event in userspace using udev.
+
+	  If unsure, say Y.
+
 config FA_DUMP
 	bool "Firmware-assisted dump"
 	depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v8 3/8] powerpc/crash: update kimage_arch struct
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Add a new member "fdt_index" to kimage_arch struct to hold the index of
the FDT (Flattened Device Tree) segment in the kexec segment array.

Having direct access to FDT segment will help arch crash hotplug handler
to avoid looping kexec segment array to identify the FDT segment index
for every FDT update on hotplug events.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h |  7 +++++++
 arch/powerpc/kexec/core_64.c     | 33 ++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 8090ad7d97d9d..5a322c1737661 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -103,6 +103,10 @@ void kexec_copy_flush(struct kimage *image);
 struct crash_mem;
 int update_cpus_node(void *fdt);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#if defined(CONFIG_CRASH_HOTPLUG)
+int machine_kexec_post_load(struct kimage *image);
+#define machine_kexec_post_load machine_kexec_post_load
+#endif
 #endif
 
 #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
@@ -118,6 +122,9 @@ extern const struct kexec_file_ops kexec_elf64_ops;
 struct kimage_arch {
 	struct crash_mem *exclude_ranges;
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+	int fdt_index;
+#endif
 	unsigned long backup_start;
 	void *backup_buf;
 	void *fdt;
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 0b292f93a74cc..a9918084b1eba 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -77,6 +77,39 @@ int machine_kexec_prepare(struct kimage *image)
 	return 0;
 }
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+/*
+ * Find the index of the FDT segment in kexec segment array and assign
+ * it to kimage's member fdt_index to have direct access to FDT
+ * segment later on.
+ */
+int machine_kexec_post_load(struct kimage *kimage)
+{
+	int i;
+	void *ptr;
+	unsigned long mem;
+
+	/* Mark fdt_index invalid */
+	kimage->arch.fdt_index = -1;
+
+	/* fdt_index remains invalid for if it not crash kernel load */
+	if (kimage->type != KEXEC_TYPE_CRASH)
+		return 0;
+
+	for (i = 0; i < kimage->nr_segments; i++) {
+		mem = kimage->segment[i].mem;
+		ptr = __va(mem);
+
+		if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
+			kimage->arch.fdt_index = i;
+			break;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /* Called during kexec sequence with MMU off */
 static notrace void copy_segments(unsigned long ind)
 {
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (2 preceding siblings ...)
  2023-02-01  6:38 ` [PATCH v8 3/8] powerpc/crash: update kimage_arch struct Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  2023-02-02 15:37   ` Eric DeVolder
  2023-02-01  6:38 ` [PATCH v8 5/8] crash: pass hotplug action type to arch crash hotplug handler Sourabh Jain
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

On architectures like PowerPC the crash notes are available for all
possible CPUs. So let's populate the elfcorehdr for all possible
CPUs having crash notes to avoid updating elfcorehdr during in-kernel
crash update on CPU hotplug events.

The similar technique is used in kexec-tool for kexec_load case.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 kernel/crash_core.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 37c594858fd51..898d8d2fe2e2e 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
 	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
 	ehdr->e_phentsize = sizeof(Elf64_Phdr);
 
-	/* Prepare one phdr of type PT_NOTE for each present CPU */
-	for_each_present_cpu(cpu) {
+	/* Prepare one phdr of type PT_NOTE for possible CPU with crash note. */
+	for_each_possible_cpu(cpu) {
 #ifdef CONFIG_CRASH_HOTPLUG
 		if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
 			/* Skip the soon-to-be offlined cpu */
@@ -373,8 +373,11 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
 				continue;
 		}
 #endif
-		phdr->p_type = PT_NOTE;
 		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
+		if (!notes_addr)
+			continue;
+
+		phdr->p_type = PT_NOTE;
 		phdr->p_offset = phdr->p_paddr = notes_addr;
 		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
 		(ehdr->e_phnum)++;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v8 5/8] crash: pass hotplug action type to arch crash hotplug handler
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (3 preceding siblings ...)
  2023-02-01  6:38 ` [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 6/8] powerpc/crash: add crash CPU hotplug support Sourabh Jain
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

On PowerPC, the crash update action for CPU and Memory hotplug events
is not the same. Since arch_crash_handle_hotplug_event is a common
crash hotplug handler for both CPU and Memory hot un/plug events,
it is hard to differentiate which hotplug event triggered the arch
crash hotplug handler and decide the crash update action accordingly.

Pass the hotplug action type to arch specific crash hotplug handler
so that corresponding crash update action can be taken based on the
hotplug action type.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/x86/include/asm/kexec.h | 2 +-
 arch/x86/kernel/crash.c      | 3 ++-
 include/linux/kexec.h        | 2 +-
 kernel/crash_core.c          | 2 +-
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 1bc852ce347d4..ec33a592a2ddd 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -213,7 +213,7 @@ extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index ba11c7e70edb1..c0b5afad37565 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -463,12 +463,13 @@ int crash_load_segments(struct kimage *image)
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: the active struct kimage
+ * @hp_action: the hot un/plug action being handled
  *
  * To accurately reflect hot un/plug changes, the new elfcorehdr
  * is prepared in a kernel buffer, and then it is written on top
  * of the existing/old elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
 {
 	void *elfbuf = NULL, *old_elfcorehdr;
 	unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 881e8a43db89e..5461a4ad50e2a 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -509,7 +509,7 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action) { }
 #endif
 
 #else /* !CONFIG_KEXEC_CORE */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 898d8d2fe2e2e..3aede7187c8b4 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -789,7 +789,7 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 					cpu : KEXEC_CRASH_HP_INVALID_CPU;
 
 			/* Now invoke arch-specific update handler */
-			arch_crash_handle_hotplug_event(image);
+			arch_crash_handle_hotplug_event(image, hp_action);
 
 			/* No longer handling a hotplug event */
 			image->hotplug_event = false;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v8 6/8] powerpc/crash: add crash CPU hotplug support
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (4 preceding siblings ...)
  2023-02-01  6:38 ` [PATCH v8 5/8] crash: pass hotplug action type to arch crash hotplug handler Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 7/8] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 8/8] powerpc/kexec: add crash memory hotplug support Sourabh Jain
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Introduce powerpc crash hotplug handler to update the necessary kexec
segments on CPU/Memory hotplug events. A common crash hotplug handler is
triggered from generic infrastructure for both CPU/Memory hot un/plugged
events but in this patch, only CPU hot un/plugged events are introduced.
The memory hotplug support is added in upcoming patches.

The elfcorehdr segment is used to exchange the CPU and other dump related
information between the kernels. Ideally, the elfcorehdr segment needs to
be recreated on CPU hotplug events to reflect the changes. But on PowerPC
elfcorehdr is built with all possible CPUs instead of just online CPUs
hence no elfcorehdr segment update/recreation is needed.

In addition to elfcorehdr there is one more kexec segment that holds CPU
data FDT (Flattened Device Tree). To boot the PowerPC kernel the crashing
CPU has to be part of the FDT segment. If the kdump kernel doesn't find
the crashing CPU in the FDT segment, it fails to boot.

The only action needed on PowerPC to handle the crash CPU hotplug event
is to add hot added CPUs in the FDT segment to avoid kdump kernel boot
failure in case the system crashes on hot added CPU.

So for the CPU hot add events, the FDT segment is updated with hot added
CPU and Since there is no need to remove the hot unplugged CPUs from the
FDT segment hence no action taken on CPU hot remove event in PowerPC arch
crash hotplug handler.

To accommodate a growing number of CPUs, FDT is built with additional
buffer space to ensure that it can hold all possible CPUs.

The changes done here will also work for kexec_load system call given
that the kexec tool builds the FDT segment with additional space to
accommodate all possible CPUs as it is done for kexec_file_load system
call in the kernel.

Since memory crash hotplug support is not there yet the crash hotplug
handler simply warns the user and returns.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h  |  5 ++
 arch/powerpc/kexec/core_64.c      | 82 +++++++++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c       | 19 ++++++-
 arch/powerpc/kexec/file_load_64.c | 39 ---------------
 4 files changed, 105 insertions(+), 40 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 5a322c1737661..c2b8debc11a61 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -101,11 +101,16 @@ void kexec_copy_flush(struct kimage *image);
 
 #ifdef CONFIG_PPC64
 struct crash_mem;
+unsigned int cpu_node_size(void);
 int update_cpus_node(void *fdt);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
 #if defined(CONFIG_CRASH_HOTPLUG)
 int machine_kexec_post_load(struct kimage *image);
 #define machine_kexec_post_load machine_kexec_post_load
+
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
+#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
+
 #endif
 #endif
 
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index a9918084b1eba..5fdc9fe4e7fe0 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -516,6 +516,45 @@ static int add_node_props(void *fdt, int node_offset, const struct device_node *
 	return ret;
 }
 
+/**
+ * cpu_node_size - Compute the size of a CPU node in the FDT.
+ *                 This should be done only once and the value is stored in
+ *                 a static variable.
+ * Returns the max size of a CPU node in the FDT.
+ */
+unsigned int cpu_node_size(void)
+{
+	static unsigned int size;
+	struct device_node *dn;
+	struct property *pp;
+
+	/*
+	 * Don't compute it twice, we are assuming that the per CPU node size
+	 * doesn't change during the system's life.
+	 */
+	if (size)
+		return size;
+
+	dn = of_find_node_by_type(NULL, "cpu");
+	if (WARN_ON_ONCE(!dn)) {
+		// Unlikely to happen
+		return 0;
+	}
+
+	/*
+	 * We compute the sub node size for a CPU node, assuming it
+	 * will be the same for all.
+	 */
+	size += strlen(dn->name) + 5;
+	for_each_property_of_node(dn, pp) {
+		size += strlen(pp->name);
+		size += pp->length;
+	}
+
+	of_node_put(dn);
+	return size;
+}
+
 /**
  * update_cpus_node - Update cpus node of flattened device tree using of_root
  *                    device node.
@@ -576,6 +615,49 @@ int update_cpus_node(void *fdt)
 	return ret;
 }
 
+#if defined(CONFIG_CRASH_HOTPLUG)
+/**
+ * arch_crash_hotplug_handler() - Handle hotplug kexec segements changes FDT, elfcorehdr
+ * @image: the active struct kimage
+ * @hp_action: the hot un/plug action being handled
+ *
+ * To accurately reflect CPU hot un/plug changes, the FDT must be updated with the
+ * new list of CPUs.
+ */
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
+{
+	void *fdt;
+
+	/* No action needed for CPU hot-unplug */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
+		return;
+
+	/* crash update on memory hotplug is not support yet */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
+		pr_info_once("crash hp: crash update is not supported with memory hotplug\n");
+		return;
+	}
+
+	/* Must have valid FDT index */
+	if (image->arch.fdt_index < 0) {
+		pr_err("crash hp: unable to locate FDT segment");
+		return;
+	}
+
+	fdt = __va((void *)image->segment[image->arch.fdt_index].mem);
+
+	/* Temporarily invalidate the crash image while it is replaced */
+	xchg(&kexec_crash_image, NULL);
+
+	/* update FDT to refelect changes to CPU resrouces */
+	if (update_cpus_node(fdt))
+		pr_err("crash hp: failed to update crash FDT");
+
+	/* The crash image is now valid once again */
+	xchg(&kexec_crash_image, image);
+}
+#endif
+
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index eeb258002d1e0..ace48e517427e 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -42,6 +42,9 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	struct kexec_buf pbuf = { .image = image, .buf_min = 0,
 				  .buf_max = ppc64_rma_size, .top_down = true,
 				  .mem = KEXEC_BUF_MEM_UNKNOWN };
+#if defined(CONFIG_CRASH_HOTPLUG)
+	unsigned int offline_core_count;
+#endif
 
 	ret = kexec_build_elf_info(kernel_buf, kernel_len, &ehdr, &elf_info);
 	if (ret)
@@ -119,10 +122,24 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	fdt_pack(fdt);
 
 	kbuf.buffer = fdt;
-	kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt);
+	kbuf.bufsz = fdt_totalsize(fdt);
 	kbuf.buf_align = PAGE_SIZE;
 	kbuf.top_down = true;
 	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
+#if defined(CONFIG_CRASH_HOTPLUG)
+	if (image->type == KEXEC_TYPE_CRASH) {
+		/* Calculate the buffer space needed to accommodate more CPU nodes in
+		 * crash FDT post capture kernel load due to CPU hotplug events.
+		 */
+		offline_core_count = (num_possible_cpus() - num_present_cpus()) / threads_per_core;
+		kbuf.memsz = fdt_totalsize(fdt) + offline_core_count * cpu_node_size();
+		fdt_set_totalsize(fdt, kbuf.memsz);
+	} else
+#endif
+	{
+		kbuf.memsz = fdt_totalsize(fdt);
+	}
+
 	ret = kexec_add_buffer(&kbuf);
 	if (ret)
 		goto out_free_fdt;
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 9bc70b4d8eafc..ceac592be72b9 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -854,45 +854,6 @@ int setup_purgatory_ppc64(struct kimage *image, const void *slave_code,
 	return ret;
 }
 
-/**
- * get_cpu_node_size - Compute the size of a CPU node in the FDT.
- *                     This should be done only once and the value is stored in
- *                     a static variable.
- * Returns the max size of a CPU node in the FDT.
- */
-static unsigned int cpu_node_size(void)
-{
-	static unsigned int size;
-	struct device_node *dn;
-	struct property *pp;
-
-	/*
-	 * Don't compute it twice, we are assuming that the per CPU node size
-	 * doesn't change during the system's life.
-	 */
-	if (size)
-		return size;
-
-	dn = of_find_node_by_type(NULL, "cpu");
-	if (WARN_ON_ONCE(!dn)) {
-		// Unlikely to happen
-		return 0;
-	}
-
-	/*
-	 * We compute the sub node size for a CPU node, assuming it
-	 * will be the same for all.
-	 */
-	size += strlen(dn->name) + 5;
-	for_each_property_of_node(dn, pp) {
-		size += strlen(pp->name);
-		size += pp->length;
-	}
-
-	of_node_put(dn);
-	return size;
-}
-
 /**
  * kexec_extra_fdt_size_ppc64 - Return the estimated additional size needed to
  *                              setup FDT for kexec/kdump kernel.
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v8 7/8] crash: forward memory_notify args to arch crash hotplug handler
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (5 preceding siblings ...)
  2023-02-01  6:38 ` [PATCH v8 6/8] powerpc/crash: add crash CPU hotplug support Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  2023-02-01  6:38 ` [PATCH v8 8/8] powerpc/kexec: add crash memory hotplug support Sourabh Jain
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

The way memory hot remove is handled on PowerPC, it is hard to update
the elfcorehdr without memory_notify args.

On PowePC memblock data structure is used to prepare elfcorehdr for kdump.
Since the notifier used for memory hotplug crash handler get initiated
before the memblock data structure update happens (as depicted below),
the newly prepared elfcorehdr still holds the old memory regions. So if
the system crash with obsolete elfcorehdr, makedumpfile failed to collect
vmcore.

Sequence of actions done on PowerPC to serve the memory hot remove:

 Initiate memory hot remove
          |
          v
 offline pages
          |
          v
 initiate memory notify call chain
 for MEM_OFFLINE event.
 (same is used for crash update)
          |
          v
 prepare new elfcorehdr for kdump using
 memblock data structure
          |
          v
 update memblock data structure

How passing memory_notify to arch crash hotplug handler will help?

memory_notify holds the start PFN and page count, with that base address
and size of hot unplugged memory can calculated and same can be use to
avoid hot unplugged memory region to get added in the elfcorehdr.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h |  2 +-
 arch/powerpc/kexec/core_64.c     |  3 ++-
 arch/x86/include/asm/kexec.h     |  3 ++-
 arch/x86/kernel/crash.c          |  4 +++-
 include/linux/kexec.h            |  6 +++++-
 kernel/crash_core.c              | 14 +++++++-------
 6 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index c2b8debc11a61..f49eec264e545 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -108,7 +108,7 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges);
 int machine_kexec_post_load(struct kimage *image);
 #define machine_kexec_post_load machine_kexec_post_load
 
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #endif
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 5fdc9fe4e7fe0..27a360120bc55 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -620,11 +620,12 @@ int update_cpus_node(void *fdt)
  * arch_crash_hotplug_handler() - Handle hotplug kexec segements changes FDT, elfcorehdr
  * @image: the active struct kimage
  * @hp_action: the hot un/plug action being handled
+ * @arg: struct memory_notify data handler
  *
  * To accurately reflect CPU hot un/plug changes, the FDT must be updated with the
  * new list of CPUs.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
+void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action, void *arg)
 {
 	void *fdt;
 
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index ec33a592a2ddd..171edb8167a12 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -213,7 +213,8 @@ extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action);
+void arch_crash_handle_hotplug_event(struct kimage *image,
+				     unsigned int hp_action, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index c0b5afad37565..bbe8e3cd0c61d 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -464,12 +464,14 @@ int crash_load_segments(struct kimage *image)
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: the active struct kimage
  * @hp_action: the hot un/plug action being handled
+ * @arg: struct memory_notify data handler
  *
  * To accurately reflect hot un/plug changes, the new elfcorehdr
  * is prepared in a kernel buffer, and then it is written on top
  * of the existing/old elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action)
+void arch_crash_handle_hotplug_event(struct kimage *image,
+				     unsigned int hp_action, void *arg)
 {
 	void *elfbuf = NULL, *old_elfcorehdr;
 	unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 5461a4ad50e2a..fe33f6aeed843 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -509,7 +509,11 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_action) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image,
+						   unsigned int hp_action,
+						   void *arg)
+{
+}
 #endif
 
 #else /* !CONFIG_KEXEC_CORE */
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 3aede7187c8b4..f19978dcb1e65 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -727,7 +727,7 @@ subsys_initcall(crash_save_vmcoreinfo_init);
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
+static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
 {
 	/* Obtain lock while changing crash information */
 	if (kexec_trylock()) {
@@ -789,7 +789,7 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 					cpu : KEXEC_CRASH_HP_INVALID_CPU;
 
 			/* Now invoke arch-specific update handler */
-			arch_crash_handle_hotplug_event(image, hp_action);
+			arch_crash_handle_hotplug_event(image, hp_action, arg);
 
 			/* No longer handling a hotplug event */
 			image->hotplug_event = false;
@@ -804,17 +804,17 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 	}
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *arg)
 {
 	switch (val) {
 	case MEM_ONLINE:
 		handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-			KEXEC_CRASH_HP_INVALID_CPU);
+			KEXEC_CRASH_HP_INVALID_CPU, arg);
 		break;
 
 	case MEM_OFFLINE:
 		handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY,
-			KEXEC_CRASH_HP_INVALID_CPU);
+			KEXEC_CRASH_HP_INVALID_CPU, arg);
 		break;
 	}
 	return NOTIFY_OK;
@@ -827,13 +827,13 @@ static struct notifier_block crash_memhp_nb = {
 
 static int crash_cpuhp_online(unsigned int cpu)
 {
-	handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu);
+	handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu, NULL);
 	return 0;
 }
 
 static int crash_cpuhp_offline(unsigned int cpu)
 {
-	handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu);
+	handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu, NULL);
 	return 0;
 }
 
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v8 8/8] powerpc/kexec: add crash memory hotplug support
  2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
                   ` (6 preceding siblings ...)
  2023-02-01  6:38 ` [PATCH v8 7/8] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
@ 2023-02-01  6:38 ` Sourabh Jain
  7 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-01  6:38 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: eric.devolder, bhe, mahesh, kexec, ldufour, hbathini

Extend PowerPC arch crash hotplug handler to support memory hotplug
events. Since elfcorehdr is used to exchange the memory info between the
kernels hence it needs to be recreated to reflect the changes due to
memory hotplug events.

The way memory hotplug events are handled on PowerPC and the notifier call
chain used in generic code to trigger the arch crash handler, the process
to recreate the elfcorehdr is different for memory add and remove events.

In the hot remove case, the memory region is first marked offline then the
notifier call chain is triggered (same is used to initiate arch crash
hotplug handler) and at last the memblock structure is updated. Whereas
in the hot add case, memblock structure is updated before the notifier
call chain is triggered.

On PowerPC, memblock structure is used to identify the memory ranges for
elfcorehdr. In case of memory hot remove the memblock structure is updated
after the arch crash hotplug handler is triggered, hence an additional
step is taken to ensure that memory ranges used to build elfcorehdr do
not include hot removed memory. Whereas no such extra steps are needed
for the hot add case because memblock structure is updated before the
arch crash hotplug handler is triggered.

The hot removed memory is explicitly removed from the memory ranges list
before creating elfcorehdr for the hot remove case.

To accommodate a growing number of memory regions, elfcorehdr is built
with additional buffer space to hold max memory regions.

The changes done here will also work for the kexec_load system call given
that the kexec tool builds the elfcoredhr with additional space to
accommodate future memory regions as it is done for kexec_file_load
system call.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec_ranges.h |  1 +
 arch/powerpc/kexec/core_64.c            | 59 ++++++++++++++++++++++--
 arch/powerpc/kexec/file_load_64.c       | 36 ++++++++++++++-
 arch/powerpc/kexec/ranges.c             | 60 +++++++++++++++++++++++++
 4 files changed, 151 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec_ranges.h b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e870..802abf580cf0f 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int add_tce_mem_ranges(struct crash_mem **mem_ranges);
 int add_initrd_mem_range(struct crash_mem **mem_ranges);
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 27a360120bc55..d877b8c2e7427 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/libfdt.h>
 #include <linux/memblock.h>
+#include <linux/memory.h>
 
 #include <asm/page.h>
 #include <asm/current.h>
@@ -616,6 +617,60 @@ int update_cpus_node(void *fdt)
 }
 
 #if defined(CONFIG_CRASH_HOTPLUG)
+int update_crash_elfcorehdr(struct kimage *image, unsigned int hp_action, void *arg)
+{
+	int ret;
+	struct crash_mem *cmem = NULL;
+	struct kexec_segment *ksegment;
+	unsigned long elfsz;
+	void *elfbuf = NULL;
+	void *mem;
+	unsigned long memsz;
+	char *ptr;
+	struct memory_notify *mn = (struct memory_notify *) arg;
+	unsigned long base_addr;
+	unsigned long size;
+
+	ksegment = &image->segment[image->elfcorehdr_index];
+	mem = (void *) ksegment->mem;
+	memsz = ksegment->memsz;
+
+	ret = get_crash_memory_ranges(&cmem);
+	if (ret) {
+		pr_err("crash hp: failed to get crash mem range\n");
+		return -1;
+	}
+
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY) {
+		base_addr = PFN_PHYS(mn->start_pfn);
+		size = mn->nr_pages * PAGE_SIZE;
+		ret = remove_mem_range(&cmem, base_addr, size);
+		if (ret)
+			return -1;
+	}
+
+	ret = crash_prepare_elf64_headers(image, cmem, false, &elfbuf, &elfsz);
+	if (ret) {
+		pr_err("crash hp: failed to prepare elf header\n");
+		return -1;
+	}
+
+	if (elfsz > memsz) {
+		pr_err("crash hp: updated crash elfcorehdr elfsz %lu > memsz %lu", elfsz, memsz);
+		return -1;
+	}
+
+	ptr = __va(mem);
+	if (ptr) {
+		xchg(&kexec_crash_image, NULL);
+		memcpy((void *)ptr, elfbuf, elfsz);
+		xchg(&kexec_crash_image, image);
+	}
+
+	vfree(elfbuf);
+	return 0;
+}
+
 /**
  * arch_crash_hotplug_handler() - Handle hotplug kexec segements changes FDT, elfcorehdr
  * @image: the active struct kimage
@@ -633,9 +688,8 @@ void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_actio
 	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
 		return;
 
-	/* crash update on memory hotplug is not support yet */
 	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
-		pr_info_once("crash hp: crash update is not supported with memory hotplug\n");
+		update_crash_elfcorehdr(image, hp_action, arg);
 		return;
 	}
 
@@ -650,7 +704,6 @@ void arch_crash_handle_hotplug_event(struct kimage *image, unsigned int hp_actio
 	/* Temporarily invalidate the crash image while it is replaced */
 	xchg(&kexec_crash_image, NULL);
 
-	/* update FDT to refelect changes to CPU resrouces */
 	if (update_cpus_node(fdt))
 		pr_err("crash hp: failed to update crash FDT");
 
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index ceac592be72b9..4b96c6017cf1b 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -21,6 +21,8 @@
 #include <linux/memblock.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/elf.h>
+
 #include <asm/setup.h>
 #include <asm/drmem.h>
 #include <asm/firmware.h>
@@ -704,6 +706,30 @@ static void update_backup_region_phdr(struct kimage *image, Elf64_Ehdr *ehdr)
 	}
 }
 
+/* get_max_phdr - Find the total number of Phdr needed to represent the
+ *		  max memory in the kdump elfcorehdr.
+ *
+ * @cmem: crash memory ranges in the system.
+ */
+static int get_max_phdr(struct crash_mem *cmem)
+{
+	int max_lmb;
+
+	/* In the worst case, a Phdr is needed for every other LMB to be represented
+	 * as an individual crash range.
+	 */
+	max_lmb = memory_hotplug_max() / 2 * drmem_lmb_size();
+
+	/* Do not cross the Phdr max limit of the elf header.
+	 * Avoid counting Phdr for crash ranges (cmem->nr_ranges) which
+	 * are already part of elfcorehdr.
+	 */
+	if (max_lmb > PN_XNUM)
+		return PN_XNUM - cmem->nr_ranges;
+
+	return max_lmb - cmem->nr_ranges;
+}
+
 /**
  * load_elfcorehdr_segment - Setup crash memory ranges and initialize elfcorehdr
  *                           segment needed to load kdump kernel.
@@ -735,7 +761,13 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
 
 	kbuf->buffer = headers;
 	kbuf->mem = KEXEC_BUF_MEM_UNKNOWN;
-	kbuf->bufsz = kbuf->memsz = headers_sz;
+	kbuf->bufsz = headers_sz;
+/* Additional buffer space to accommodate future memory ranges */
+#if defined(CONFIG_MEMORY_HOTPLUG)
+	kbuf->memsz = headers_sz + get_max_phdr(cmem) * sizeof(Elf64_Phdr);
+#else
+	kbuf->memsz = headers_sz;
+#endif
 	kbuf->top_down = false;
 
 	ret = kexec_add_buffer(kbuf);
@@ -745,7 +777,7 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
 	}
 
 	image->elf_load_addr = kbuf->mem;
-	image->elf_headers_sz = headers_sz;
+	image->elf_headers_sz = kbuf->memsz;
 	image->elf_headers = headers;
 out:
 	kfree(cmem);
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index 5fc53a5fcfdf6..2bb90874df781 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -234,6 +234,66 @@ int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
 	return __add_mem_range(mem_ranges, base, size);
 }
 
+/**
+ * remove_mem_range - Removes the given memory range from the range list.
+ * @mem_ranges:    Range list to remove the memory range to.
+ * @base:          Base address of the range to remove.
+ * @size:          Size of the memory range to remove.
+ *
+ * (Re)allocates memory, if needed.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
+{
+	int ret = 0;
+	unsigned int i;
+	struct crash_mem *mem_rngs = *mem_ranges;
+	u64 mstart, mend;
+	u64 range_start, range_end;
+
+	if (!size)
+		return 0;
+
+	range_start = base;
+	range_end = base + size - 1;
+
+	for (i = 0; i < mem_rngs->nr_ranges; i++) {
+		mstart = mem_rngs->ranges[i].start;
+		mend = mem_rngs->ranges[i].end;
+
+		if (!(range_start >= mstart && range_end <= mend))
+			continue;
+
+		if (range_start == mstart) {
+			if (range_end == mend) {
+				for (; i < mem_rngs->nr_ranges - 1; i++) {
+					mem_rngs->ranges[i].start = mem_rngs->ranges[i+1].start;
+					mem_rngs->ranges[i].end = mem_rngs->ranges[i+1].end;
+				}
+				mem_rngs->nr_ranges--;
+				goto out;
+			}
+			mem_rngs->ranges[i].start = range_end + 1;
+			goto out;
+		} else if (range_end == mend)  {
+			mem_rngs->ranges[i].end = range_start - 1;
+			goto out;
+		} else {
+			size = mem_rngs->ranges[i].end - range_end;
+			mem_rngs->ranges[i].end = range_start - 1;
+			if (add_mem_range(mem_ranges, range_end + 1, size))
+				goto error;
+			goto out;
+		}
+	}
+error:
+	return -1;
+
+out:
+	return ret;
+}
+
 /**
  * add_tce_mem_ranges - Adds tce-table range to the given memory ranges list.
  * @mem_ranges:         Range list to add the memory range(s) to.
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-02-01  6:38 ` [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
@ 2023-02-02 15:37   ` Eric DeVolder
  2023-02-02 21:01     ` Eric DeVolder
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Eric DeVolder @ 2023-02-02 15:37 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev, mpe; +Cc: mahesh, ldufour, kexec, bhe, hbathini



On 2/1/23 00:38, Sourabh Jain wrote:
> On architectures like PowerPC the crash notes are available for all
> possible CPUs. So let's populate the elfcorehdr for all possible
> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
> crash update on CPU hotplug events.
> 
> The similar technique is used in kexec-tool for kexec_load case.
> 
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>   kernel/crash_core.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 37c594858fd51..898d8d2fe2e2e 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>   	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>   	ehdr->e_phentsize = sizeof(Elf64_Phdr);
>   
> -	/* Prepare one phdr of type PT_NOTE for each present CPU */
> -	for_each_present_cpu(cpu) {
> +	/* Prepare one phdr of type PT_NOTE for possible CPU with crash note. */
> +	for_each_possible_cpu(cpu) {

Sourabh,
Thomas Gleixner is suggesting moving away from for_each_present_cpu() to for_each_online_cpu(). 
Using for_each_online_cpu() is going to the minimum number of needed, whereas your approach of 
for_each_possible_cpu() would be to the maximum number needed.

What would be the ramifications to ppc for moving towards for_each_online_cpu()?

In my next patch series, I have finally figured out how to use cpuhp framework to where it is 
possible to use for_each_online_cpu() here, but that is at odds with your changes here.

Thanks,
eric



>   #ifdef CONFIG_CRASH_HOTPLUG
>   		if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
>   			/* Skip the soon-to-be offlined cpu */
> @@ -373,8 +373,11 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>   				continue;
>   		}
>   #endif
> -		phdr->p_type = PT_NOTE;
>   		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
> +		if (!notes_addr)
> +			continue;
> +
> +		phdr->p_type = PT_NOTE;
>   		phdr->p_offset = phdr->p_paddr = notes_addr;
>   		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
>   		(ehdr->e_phnum)++;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-02-02 15:37   ` Eric DeVolder
@ 2023-02-02 21:01     ` Eric DeVolder
  2023-02-06  6:36       ` Sourabh Jain
  2023-02-06  5:52     ` Sourabh Jain
  2023-02-06  5:56     ` Sourabh Jain
  2 siblings, 1 reply; 14+ messages in thread
From: Eric DeVolder @ 2023-02-02 21:01 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev, mpe; +Cc: mahesh, ldufour, kexec, bhe, hbathini



On 2/2/23 09:37, Eric DeVolder wrote:
> 
> 
> On 2/1/23 00:38, Sourabh Jain wrote:
>> On architectures like PowerPC the crash notes are available for all
>> possible CPUs. So let's populate the elfcorehdr for all possible
>> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
>> crash update on CPU hotplug events.
>>
>> The similar technique is used in kexec-tool for kexec_load case.
>>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   kernel/crash_core.c | 9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 37c594858fd51..898d8d2fe2e2e 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>>       ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>>       ehdr->e_phentsize = sizeof(Elf64_Phdr);
>> -    /* Prepare one phdr of type PT_NOTE for each present CPU */
>> -    for_each_present_cpu(cpu) {
>> +    /* Prepare one phdr of type PT_NOTE for possible CPU with crash note. */
>> +    for_each_possible_cpu(cpu) {
> 
> Sourabh,
> Thomas Gleixner is suggesting moving away from for_each_present_cpu() to for_each_online_cpu(). 
> Using for_each_online_cpu() is going to the minimum number of needed, whereas your approach of 
> for_each_possible_cpu() would be to the maximum number needed.
> 
> What would be the ramifications to ppc for moving towards for_each_online_cpu()?
> 
> In my next patch series, I have finally figured out how to use cpuhp framework to where it is 
> possible to use for_each_online_cpu() here, but that is at odds with your changes here.
> 
> Thanks,
> eric
> 
> 
Without knowing the ramifications of changing to for_each_online_cpu(), I currently am
using the following:

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index e1a3430f06f4..a019b691d974 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -366,6 +366,9 @@ int crash_prepare_elf64_headers(struct crash_mem *mem, int n

     /* Prepare one phdr of type PT_NOTE for each present CPU */
     for_each_present_cpu(cpu) {
+       if (IS_ENABLED(CONFIG_CRASH_HOTPLUG)) {
+           if (!cpu_online(cpu)) continue;
+       }
         phdr->p_type = PT_NOTE;
         notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
         phdr->p_offset = phdr->p_paddr = notes_addr;

Thomas points out that the above can be simply the for_each_online_cpu(), but again
I'm not sure how that impacts ppc, which appears to layout all possible cpus rather
than just online ones. How are present but not online cpus handled by crash analysis
utility?
eric

> 
>>   #ifdef CONFIG_CRASH_HOTPLUG
>>           if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
>>               /* Skip the soon-to-be offlined cpu */
>> @@ -373,8 +373,11 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
>>                   continue;
>>           }
>>   #endif
>> -        phdr->p_type = PT_NOTE;
>>           notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
>> +        if (!notes_addr)
>> +            continue;
>> +
>> +        phdr->p_type = PT_NOTE;
>>           phdr->p_offset = phdr->p_paddr = notes_addr;
>>           phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
>>           (ehdr->e_phnum)++;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-02-02 15:37   ` Eric DeVolder
  2023-02-02 21:01     ` Eric DeVolder
@ 2023-02-06  5:52     ` Sourabh Jain
  2023-02-06  5:56     ` Sourabh Jain
  2 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-06  5:52 UTC (permalink / raw)
  To: Eric DeVolder, linuxppc-dev, mpe; +Cc: mahesh, ldufour, kexec, bhe, hbathini


On 02/02/23 21:07, Eric DeVolder wrote:
>
>
> On 2/1/23 00:38, Sourabh Jain wrote:
>> On architectures like PowerPC the crash notes are available for all
>> possible CPUs. So let's populate the elfcorehdr for all possible
>> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
>> crash update on CPU hotplug events.
>>
>> The similar technique is used in kexec-tool for kexec_load case.
>>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   kernel/crash_core.c | 9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 37c594858fd51..898d8d2fe2e2e 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage 
>> *image, struct crash_mem *mem,
>>       ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>>       ehdr->e_phentsize = sizeof(Elf64_Phdr);
>>   -    /* Prepare one phdr of type PT_NOTE for each present CPU */
>> -    for_each_present_cpu(cpu) {
>> +    /* Prepare one phdr of type PT_NOTE for possible CPU with crash 
>> note. */
>> +    for_each_possible_cpu(cpu) {
>
> Sourabh,
> Thomas Gleixner is suggesting moving away from for_each_present_cpu() 
> to for_each_online_cpu(). Using for_each_online_cpu() is going to the 
> minimum number of needed, whereas your approach of 
> for_each_possible_cpu() would be to the maximum number needed.
>
> What would be the ramifications to ppc for moving towards 
> for_each_online_cpu()?

I was in the impression that if CPU notes are missing for offline CPUs 
in the elfcorehdr then makedumpfile will mess up the
CPU IDs.

But somehow replacing for_each_present_cpu with for_each_online_cpu 
worked on PowerPC, even after disabling a couple of CPUs.

So things are fine if we pack PT_LOAD for online CPUs instead of present 
CPUs but,
I need to investigate how makedumpfile is able to map PT_LOAD to online 
CPUs.

- Sourabh Jain


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-02-02 15:37   ` Eric DeVolder
  2023-02-02 21:01     ` Eric DeVolder
  2023-02-06  5:52     ` Sourabh Jain
@ 2023-02-06  5:56     ` Sourabh Jain
  2 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-06  5:56 UTC (permalink / raw)
  To: Eric DeVolder, linuxppc-dev, mpe; +Cc: mahesh, ldufour, kexec, bhe, hbathini


On 02/02/23 21:07, Eric DeVolder wrote:
>
>
> On 2/1/23 00:38, Sourabh Jain wrote:
>> On architectures like PowerPC the crash notes are available for all
>> possible CPUs. So let's populate the elfcorehdr for all possible
>> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
>> crash update on CPU hotplug events.
>>
>> The similar technique is used in kexec-tool for kexec_load case.
>>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   kernel/crash_core.c | 9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 37c594858fd51..898d8d2fe2e2e 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage 
>> *image, struct crash_mem *mem,
>>       ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>>       ehdr->e_phentsize = sizeof(Elf64_Phdr);
>>   -    /* Prepare one phdr of type PT_NOTE for each present CPU */
>> -    for_each_present_cpu(cpu) {
>> +    /* Prepare one phdr of type PT_NOTE for possible CPU with crash 
>> note. */
>> +    for_each_possible_cpu(cpu) {
>
> Sourabh,
> Thomas Gleixner is suggesting moving away from for_each_present_cpu() 
> to for_each_online_cpu(). Using for_each_online_cpu() is going to the 
> minimum number of needed, whereas your approach of 
> for_each_possible_cpu() would be to the maximum number needed.
>
> What would be the ramifications to ppc for moving towards 
> for_each_online_cpu()?
>
> In my next patch series, I have finally figured out how to use cpuhp 
> framework to where it is possible to use for_each_online_cpu() here, 
> but that is at odds with your changes here.
I was in the impression that if CPU notes are missing for offline CPUs 
in the elfcorehdr then makedumpfile will mess up the
CPU IDs.

But somehow replacing for_each_present_cpu with for_each_online_cpu 
worked on PowerPC, even after disabling a couple of CPUs.

So things are fine if we pack PT_LOAD for online CPUs instead of present 
CPUs but,
I need to investigate how makedumpfile is able to map PT_LOAD to online 
CPUs.

- Sourabh Jain

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr
  2023-02-02 21:01     ` Eric DeVolder
@ 2023-02-06  6:36       ` Sourabh Jain
  0 siblings, 0 replies; 14+ messages in thread
From: Sourabh Jain @ 2023-02-06  6:36 UTC (permalink / raw)
  To: Eric DeVolder, linuxppc-dev, mpe; +Cc: mahesh, ldufour, kexec, bhe, hbathini


On 03/02/23 02:31, Eric DeVolder wrote:
>
>
> On 2/2/23 09:37, Eric DeVolder wrote:
>>
>>
>> On 2/1/23 00:38, Sourabh Jain wrote:
>>> On architectures like PowerPC the crash notes are available for all
>>> possible CPUs. So let's populate the elfcorehdr for all possible
>>> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
>>> crash update on CPU hotplug events.
>>>
>>> The similar technique is used in kexec-tool for kexec_load case.
>>>
>>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>> ---
>>>   kernel/crash_core.c | 9 ++++++---
>>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>>> index 37c594858fd51..898d8d2fe2e2e 100644
>>> --- a/kernel/crash_core.c
>>> +++ b/kernel/crash_core.c
>>> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage 
>>> *image, struct crash_mem *mem,
>>>       ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>>>       ehdr->e_phentsize = sizeof(Elf64_Phdr);
>>> -    /* Prepare one phdr of type PT_NOTE for each present CPU */
>>> -    for_each_present_cpu(cpu) {
>>> +    /* Prepare one phdr of type PT_NOTE for possible CPU with crash 
>>> note. */
>>> +    for_each_possible_cpu(cpu) {
>>
>> Sourabh,
>> Thomas Gleixner is suggesting moving away from for_each_present_cpu() 
>> to for_each_online_cpu(). Using for_each_online_cpu() is going to the 
>> minimum number of needed, whereas your approach of 
>> for_each_possible_cpu() would be to the maximum number needed.
>>
>> What would be the ramifications to ppc for moving towards 
>> for_each_online_cpu()?
>>
>> In my next patch series, I have finally figured out how to use cpuhp 
>> framework to where it is possible to use for_each_online_cpu() here, 
>> but that is at odds with your changes here.
>>
>> Thanks,
>> eric
>>
>>
> Without knowing the ramifications of changing to 
> for_each_online_cpu(), I currently am
> using the following:
>
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index e1a3430f06f4..a019b691d974 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -366,6 +366,9 @@ int crash_prepare_elf64_headers(struct crash_mem 
> *mem, int n
>
>     /* Prepare one phdr of type PT_NOTE for each present CPU */
>     for_each_present_cpu(cpu) {
> +       if (IS_ENABLED(CONFIG_CRASH_HOTPLUG)) {
> +           if (!cpu_online(cpu)) continue;
> +       }

How about let the arch decide the list of CPUs they want to pack in? 
Because on
PowerPC the crash notes are created for possible CPUs and we can utilize 
this
to avoid re-generating elfcorehdr for every hotplug operation.


> phdr->p_type = PT_NOTE;
>         notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
>         phdr->p_offset = phdr->p_paddr = notes_addr;
>
> Thomas points out that the above can be simply the 
> for_each_online_cpu(), but again
> I'm not sure how that impacts ppc,
>
> which appears to layout all possible cpus rather
> than just online ones. How are present but not online cpus handled by 
> crash analysis
> utility?

As per my testing all worked fine if we replace for_each_present_cpu 
with for_each_online_cpu
but again I don't know the reason why it worked. I will investigate it 
and let you know.

How packing PT_LOAD for present CPU is impacting x86? Because on PowerPC
when system is on crash path it only populates the crash notes for 
online CPUs, and skip
all other CPUs.

- Sourabh Jain

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-02-06  6:37 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-01  6:38 [PATCH v8 0/8] In kernel handling of CPU hotplug events for crash kernel Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 1/8] powerpc/kexec: turn some static helper functions public Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 2/8] powerpc/crash hp: introduce a new config option CRASH_HOTPLUG Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 3/8] powerpc/crash: update kimage_arch struct Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 4/8] crash: add phdr for possible CPUs in elfcorehdr Sourabh Jain
2023-02-02 15:37   ` Eric DeVolder
2023-02-02 21:01     ` Eric DeVolder
2023-02-06  6:36       ` Sourabh Jain
2023-02-06  5:52     ` Sourabh Jain
2023-02-06  5:56     ` Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 5/8] crash: pass hotplug action type to arch crash hotplug handler Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 6/8] powerpc/crash: add crash CPU hotplug support Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 7/8] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
2023-02-01  6:38 ` [PATCH v8 8/8] powerpc/kexec: add crash memory hotplug support Sourabh Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).