linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v11 0/4] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel
@ 2023-06-19  2:49 Sourabh Jain
  2023-06-19  2:49 ` [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public Sourabh Jain
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Sourabh Jain @ 2023-06-19  2:49 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: mahesh, ldufour, eric.devolder, kexec, hbathini

The Problem:
============
Post CPU/Memory hot plug/unplug and online/offline events occur, the
kdump kernel often retains outdated system information. This presents
a significant challenge when attempting to perform a dump collection
using an outdated or stale kdump kernel. In such situations, there
are two potential outcomes that pose risks: either the dump collection
fails to capture the required data entirely, leading to a failed dump,
or the collected dump data is inaccurate, thereby compromising its
reliability for analysis and troubleshooting purposes

Existing solution:
==================
The existing solution to keep the kdump kernel up-to-date involves
monitoring CPU/Memory hotplug/online/offline events via a udev rule.
This approach triggers a full kdump kernel reload for each hotplug event,
ensuring that the kdump kernel is always synchronized with the latest
system resource changes.

Shortcomings of existing solution:
==================================
- Leaves a window where kernel crash might not lead to a successful dump
  collection.
- Reloading all kexec segments for each hotplug is inefficient.
- udev rules are prone to races if hotplug events are frequent.

Further information regarding the problems associated with a current
solution can be found here.
 - https://lore.kernel.org/lkml/b04ed259-dc5f-7f30-6661-c26f92d9096a@oracle.com/
 - https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-February/240254.html

Proposed Solution:
==================
To address the limitations of the current approach, a proposed solution
focuses on implementing a more targeted update strategy. Instead of
performing a full reload of all kexec segments for every CPU/Memory hot
plug/unplug and online/offline events, the proposed solution aims to update
only the relevant kexec segment. After loading the kexec segments into the
reserved area, a newly introduced hotplug handler will be responsible for
updating the specific kexec segment based on the type of hotplug event.
This selective update approach enhances overall efficiency by minimizing
unnecessary overhead and significantly reduces the chances of a kernel
crash leading to a failed or inaccurate dump collection.

Series Dependencies:
====================
The implementation of the crash hotplug handler on PowerPC is included in
this patch series. The introduction of the generic crash hotplug handler
is done through the patch series available at
https://lore.kernel.org/all/20230612210712.683175-1-eric.devolder@oracle.com/

Git tree for testing:
=====================
The following Git tree incorporates this patch series applied on top of
the dependent patch series.
https://github.com/sourabhjains/linux/tree/e23-s11-with-kexec-config

In order to enable this feature, it is necessary to disable the udev rule
responsible for reloading the kdump service. To do this, you can make the
following additions to the file "/usr/lib/udev/rules.d/98-kexec.rules" on RHEL:

Add the following two lines at top:

   SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
   SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

The changes mentioned above ensure that the kdump reload process is skipped
for CPU/Memory hot plug/unplug events when the path
"/sys/devices/system/[cpu|memory]/crash_hotplug" exists.

Note: only kexec_file_load syscall will work. For kexec_load minor changes are
required in kexec tool.

---
Changelog:

v11:
  - Rebase to v6.4-rc6
  - The patch that introduced CONFIG_CRASH_HOTPLUG for PowerPC has been removed.
    The config is now part of common configuration:
    https://lore.kernel.org/all/87ilbpflsk.fsf@mail.lhotse/

v10:
  - Drop the patch that adds fdt_index attribute to struct kimage_arch
    Find the fdt segment index when needed.
  - Added more details into commits messages.
  - Rebased onto 6.3.0-rc5

v9:
  - Removed patch to prepare elfcorehdr crash notes for possible CPUs.
    The patch is moved to generic patch series that introduces generic
    infrastructure for in kernel crash update.
  - Removed patch to pass the hotplug action type to the arch crash
    hotplug handler function. The generic patch series has introduced
    the hotplug action type in kimage struct.
  - Add detail commit message for better understanding.

v8:
  - Restrict fdt_index initialization to machine_kexec_post_load
    it work for both kexec_load and kexec_file_load.[3/8] Laurent Dufour

  - Updated the logic to find the number of offline core. [6/8]

  - Changed the logic to find the elfcore program header to accommodate
    future memory ranges due memory hotplug events. [8/8]

v7
  - added a new config to configure this feature
  - pass hotplug action type to arch specific handler

v6
  - Added crash memory hotplug support

v5:
  - Replace COFNIG_CRASH_HOTPLUG with CONFIG_HOTPLUG_CPU.
  - Move fdt segment identification for kexec_load case to load path
    instead of crash hotplug handler
  - Keep new attribute defined under kimage_arch to track FDT segment
    under CONFIG_HOTPLUG_CPU config.

v4:
  - Update the logic to find the additional space needed for hotadd CPUs post
    kexec load. Refer "[RFC v4 PATCH 4/5] powerpc/crash hp: add crash hotplug
    support for kexec_file_load" patch to know more about the change.
  - Fix a couple of typo.
  - Replace pr_err to pr_info_once to warn user about memory hotplug
    support.
  - In crash hotplug handle exit the for loop if FDT segment is found.

v3
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patche on top of
    https://lore.kernel.org/lkml/20220303162725.49640-1-eric.devolder@oracle.com/
  - Fixed warning reported by checpatch script

v2:
  - Use generic hotplug handler introduced by
    https://lore.kernel.org/lkml/20220209195706.51522-1-eric.devolder@oracle.com/
    a significant change from v1.

Sourabh Jain (4):
  powerpc/kexec: turn some static helper functions public
  powerpc/crash: add crash CPU hotplug support
  crash: forward memory_notify args to arch crash hotplug handler
  powerpc/crash: add crash memory hotplug support

 arch/powerpc/Kconfig                    |   3 +
 arch/powerpc/include/asm/kexec.h        |  22 ++
 arch/powerpc/include/asm/kexec_ranges.h |   1 +
 arch/powerpc/kexec/core_64.c            | 301 ++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c             |  12 +-
 arch/powerpc/kexec/file_load_64.c       | 212 ++++-------------
 arch/powerpc/kexec/ranges.c             |  85 +++++++
 arch/x86/include/asm/kexec.h            |   2 +-
 arch/x86/kernel/crash.c                 |   5 +-
 include/linux/kexec.h                   |   2 +-
 kernel/crash_core.c                     |  14 +-
 11 files changed, 483 insertions(+), 176 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public
  2023-06-19  2:49 [PATCH v11 0/4] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel Sourabh Jain
@ 2023-06-19  2:49 ` Sourabh Jain
  2023-10-13 13:29   ` Christophe Leroy
  2023-06-19  2:49 ` [PATCH v11 2/4] powerpc/crash: add crash CPU hotplug support Sourabh Jain
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Sourabh Jain @ 2023-06-19  2:49 UTC (permalink / raw)
  To: linuxppc-dev, mpe
  Cc: eric.devolder, mahesh, kexec, Laurent Dufour, ldufour, hbathini

Move update_cpus_node and get_crash_memory_ranges functions from
kexec/file_load_64.c to kexec/core_64.c to make these functions usable
by other kexec components.

Later in the series, these functions are utilized to do in-kernel update
to kexec segments on CPU/Memory hot plug/unplug or online/offline events
for both kexec_load and kexec_file_load syscalls.

No functional change intended.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com>
---
 arch/powerpc/include/asm/kexec.h  |   6 ++
 arch/powerpc/kexec/core_64.c      | 166 ++++++++++++++++++++++++++++++
 arch/powerpc/kexec/file_load_64.c | 162 -----------------------------
 3 files changed, 172 insertions(+), 162 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index a1ddba01e7d1..8090ad7d97d9 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -99,6 +99,12 @@ void relocate_new_kernel(unsigned long indirection_page, unsigned long reboot_co
 
 void kexec_copy_flush(struct kimage *image);
 
+#ifdef CONFIG_PPC64
+struct crash_mem;
+int update_cpus_node(void *fdt);
+int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+#endif
+
 #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
 void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
 #define crash_free_reserved_phys_range crash_free_reserved_phys_range
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index a79e28c91e2b..0b292f93a74c 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -17,6 +17,8 @@
 #include <linux/cpu.h>
 #include <linux/hardirq.h>
 #include <linux/of.h>
+#include <linux/libfdt.h>
+#include <linux/memblock.h>
 
 #include <asm/page.h>
 #include <asm/current.h>
@@ -30,6 +32,8 @@
 #include <asm/hw_breakpoint.h>
 #include <asm/svm.h>
 #include <asm/ultravisor.h>
+#include <asm/kexec_ranges.h>
+#include <asm/crashdump-ppc64.h>
 
 int machine_kexec_prepare(struct kimage *image)
 {
@@ -377,6 +381,168 @@ void default_machine_kexec(struct kimage *image)
 	/* NOTREACHED */
 }
 
+/**
+ * get_crash_memory_ranges - Get crash memory ranges. This list includes
+ *                           first/crashing kernel's memory regions that
+ *                           would be exported via an elfcore.
+ * @mem_ranges:              Range list to add the memory ranges to.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int get_crash_memory_ranges(struct crash_mem **mem_ranges)
+{
+	phys_addr_t base, end;
+	struct crash_mem *tmem;
+	u64 i;
+	int ret;
+
+	for_each_mem_range(i, &base, &end) {
+		u64 size = end - base;
+
+		/* Skip backup memory region, which needs a separate entry */
+		if (base == BACKUP_SRC_START) {
+			if (size > BACKUP_SRC_SIZE) {
+				base = BACKUP_SRC_END + 1;
+				size -= BACKUP_SRC_SIZE;
+			} else
+				continue;
+		}
+
+		ret = add_mem_range(mem_ranges, base, size);
+		if (ret)
+			goto out;
+
+		/* Try merging adjacent ranges before reallocation attempt */
+		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
+			sort_memory_ranges(*mem_ranges, true);
+	}
+
+	/* Reallocate memory ranges if there is no space to split ranges */
+	tmem = *mem_ranges;
+	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+		tmem = realloc_mem_ranges(mem_ranges);
+		if (!tmem)
+			goto out;
+	}
+
+	/* Exclude crashkernel region */
+	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+	if (ret)
+		goto out;
+
+	/*
+	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
+	 *        regions are exported to save their context at the time of
+	 *        crash, they should actually be backed up just like the
+	 *        first 64K bytes of memory.
+	 */
+	ret = add_rtas_mem_range(mem_ranges);
+	if (ret)
+		goto out;
+
+	ret = add_opal_mem_range(mem_ranges);
+	if (ret)
+		goto out;
+
+	/* create a separate program header for the backup region */
+	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
+	if (ret)
+		goto out;
+
+	sort_memory_ranges(*mem_ranges, false);
+out:
+	if (ret)
+		pr_err("Failed to setup crash memory ranges\n");
+	return ret;
+}
+
+/**
+ * add_node_props - Reads node properties from device node structure and add
+ *                  them to fdt.
+ * @fdt:            Flattened device tree of the kernel
+ * @node_offset:    offset of the node to add a property at
+ * @dn:             device node pointer
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
+{
+	int ret = 0;
+	struct property *pp;
+
+	if (!dn)
+		return -EINVAL;
+
+	for_each_property_of_node(dn, pp) {
+		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
+		if (ret < 0) {
+			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
+			return ret;
+		}
+	}
+	return ret;
+}
+
+/**
+ * update_cpus_node - Update cpus node of flattened device tree using of_root
+ *                    device node.
+ * @fdt:              Flattened device tree of the kernel.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int update_cpus_node(void *fdt)
+{
+	struct device_node *cpus_node, *dn;
+	int cpus_offset, cpus_subnode_offset, ret = 0;
+
+	cpus_offset = fdt_path_offset(fdt, "/cpus");
+	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
+		pr_err("Malformed device tree: error reading /cpus node: %s\n",
+		       fdt_strerror(cpus_offset));
+		return cpus_offset;
+	}
+
+	if (cpus_offset > 0) {
+		ret = fdt_del_node(fdt, cpus_offset);
+		if (ret < 0) {
+			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
+			return -EINVAL;
+		}
+	}
+
+	/* Add cpus node to fdt */
+	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
+	if (cpus_offset < 0) {
+		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
+		return -EINVAL;
+	}
+
+	/* Add cpus node properties */
+	cpus_node = of_find_node_by_path("/cpus");
+	ret = add_node_props(fdt, cpus_offset, cpus_node);
+	of_node_put(cpus_node);
+	if (ret < 0)
+		return ret;
+
+	/* Loop through all subnodes of cpus and add them to fdt */
+	for_each_node_by_type(dn, "cpu") {
+		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
+		if (cpus_subnode_offset < 0) {
+			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
+			       fdt_strerror(cpus_subnode_offset));
+			ret = cpus_subnode_offset;
+			goto out;
+		}
+
+		ret = add_node_props(fdt, cpus_subnode_offset, dn);
+		if (ret < 0)
+			goto out;
+	}
+out:
+	of_node_put(dn);
+	return ret;
+}
+
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 110d28bede2a..5b0b3f61e0e7 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -133,81 +133,6 @@ static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
 	return ret;
 }
 
-/**
- * get_crash_memory_ranges - Get crash memory ranges. This list includes
- *                           first/crashing kernel's memory regions that
- *                           would be exported via an elfcore.
- * @mem_ranges:              Range list to add the memory ranges to.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int get_crash_memory_ranges(struct crash_mem **mem_ranges)
-{
-	phys_addr_t base, end;
-	struct crash_mem *tmem;
-	u64 i;
-	int ret;
-
-	for_each_mem_range(i, &base, &end) {
-		u64 size = end - base;
-
-		/* Skip backup memory region, which needs a separate entry */
-		if (base == BACKUP_SRC_START) {
-			if (size > BACKUP_SRC_SIZE) {
-				base = BACKUP_SRC_END + 1;
-				size -= BACKUP_SRC_SIZE;
-			} else
-				continue;
-		}
-
-		ret = add_mem_range(mem_ranges, base, size);
-		if (ret)
-			goto out;
-
-		/* Try merging adjacent ranges before reallocation attempt */
-		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
-			sort_memory_ranges(*mem_ranges, true);
-	}
-
-	/* Reallocate memory ranges if there is no space to split ranges */
-	tmem = *mem_ranges;
-	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
-		tmem = realloc_mem_ranges(mem_ranges);
-		if (!tmem)
-			goto out;
-	}
-
-	/* Exclude crashkernel region */
-	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
-	if (ret)
-		goto out;
-
-	/*
-	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
-	 *        regions are exported to save their context at the time of
-	 *        crash, they should actually be backed up just like the
-	 *        first 64K bytes of memory.
-	 */
-	ret = add_rtas_mem_range(mem_ranges);
-	if (ret)
-		goto out;
-
-	ret = add_opal_mem_range(mem_ranges);
-	if (ret)
-		goto out;
-
-	/* create a separate program header for the backup region */
-	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
-	if (ret)
-		goto out;
-
-	sort_memory_ranges(*mem_ranges, false);
-out:
-	if (ret)
-		pr_err("Failed to setup crash memory ranges\n");
-	return ret;
-}
-
 /**
  * get_reserved_memory_ranges - Get reserve memory ranges. This list includes
  *                              memory regions that should be added to the
@@ -1018,93 +943,6 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
 	return extra_size;
 }
 
-/**
- * add_node_props - Reads node properties from device node structure and add
- *                  them to fdt.
- * @fdt:            Flattened device tree of the kernel
- * @node_offset:    offset of the node to add a property at
- * @dn:             device node pointer
- *
- * Returns 0 on success, negative errno on error.
- */
-static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
-{
-	int ret = 0;
-	struct property *pp;
-
-	if (!dn)
-		return -EINVAL;
-
-	for_each_property_of_node(dn, pp) {
-		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
-		if (ret < 0) {
-			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
-			return ret;
-		}
-	}
-	return ret;
-}
-
-/**
- * update_cpus_node - Update cpus node of flattened device tree using of_root
- *                    device node.
- * @fdt:              Flattened device tree of the kernel.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int update_cpus_node(void *fdt)
-{
-	struct device_node *cpus_node, *dn;
-	int cpus_offset, cpus_subnode_offset, ret = 0;
-
-	cpus_offset = fdt_path_offset(fdt, "/cpus");
-	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
-		pr_err("Malformed device tree: error reading /cpus node: %s\n",
-		       fdt_strerror(cpus_offset));
-		return cpus_offset;
-	}
-
-	if (cpus_offset > 0) {
-		ret = fdt_del_node(fdt, cpus_offset);
-		if (ret < 0) {
-			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
-			return -EINVAL;
-		}
-	}
-
-	/* Add cpus node to fdt */
-	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
-	if (cpus_offset < 0) {
-		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
-		return -EINVAL;
-	}
-
-	/* Add cpus node properties */
-	cpus_node = of_find_node_by_path("/cpus");
-	ret = add_node_props(fdt, cpus_offset, cpus_node);
-	of_node_put(cpus_node);
-	if (ret < 0)
-		return ret;
-
-	/* Loop through all subnodes of cpus and add them to fdt */
-	for_each_node_by_type(dn, "cpu") {
-		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
-		if (cpus_subnode_offset < 0) {
-			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
-			       fdt_strerror(cpus_subnode_offset));
-			ret = cpus_subnode_offset;
-			goto out;
-		}
-
-		ret = add_node_props(fdt, cpus_subnode_offset, dn);
-		if (ret < 0)
-			goto out;
-	}
-out:
-	of_node_put(dn);
-	return ret;
-}
-
 static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
 			 const char *propname)
 {
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v11 2/4] powerpc/crash: add crash CPU hotplug support
  2023-06-19  2:49 [PATCH v11 0/4] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel Sourabh Jain
  2023-06-19  2:49 ` [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public Sourabh Jain
@ 2023-06-19  2:49 ` Sourabh Jain
  2023-06-19  2:49 ` [PATCH v11 3/4] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
  2023-06-19  2:49 ` [PATCH v11 4/4] powerpc/crash: add crash memory hotplug support Sourabh Jain
  3 siblings, 0 replies; 7+ messages in thread
From: Sourabh Jain @ 2023-06-19  2:49 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: mahesh, ldufour, eric.devolder, kexec, hbathini

Introduce powerpc crash hotplug handler to update the necessary kexec
segments in the kernel on CPU/Memory hotplug events. Currently, these
updates are done by monitoring CPU/Memory hotplug events in userspace.

In the generic infrastructure, a shared crash hotplug handler is
triggered for both CPU and Memory hotplug events. However, in this
particular patch, only CPU hotplug events are handled for crash updates.
Support for crash updates during memory hotplug events will be
introduced in subsequent patches.

The elfcorehdr segment facilitates the exchange of CPU and other
relevant dump information between kernels. Ideally, this segment should
be regenerated during CPU hotplug events to reflect any changes.
However, on powerpc systems, the elfcorehdr is initially constructed
with all possible CPUs, rendering it unnecessary to update or recreate
it when CPU hotplug events occur.

Additionally, on powerpc, there is another segment called the FDT
(Flattened Device Tree) that holds CPU data. During the boot of the
kdump kernel, it is crucial for the crashing CPU to be present in the
FDT. Failure to have the crashing CPU in the FDT would result in a
boot failure of the kdump kernel.

Therefore, the only action required on powerpc to handle a crash CPU
hotplug event is to add the hot-added CPUs to the kdump FDT segment,
ensuring that the kdump kernel boots successfully. However, there is
no need to remove the hot-unplugged CPUs from the FDT segment, so no
action is taken for a CPU hot removal event.

In order to support an increasing number of CPUs, the FDT is constructed
with extra buffer space to ensure it can accommodate possible number of
CPUs nodes. Also do not pack the FDT and shrik its size once prepared.

The changes introduced here will also work for the kexec_load system
call, considering that the kexec tool constructs the FDT segment with
extra space to accommodate possible number CPUs nodes.

Since memory crash hotplug support is not there yet the crash hotplug
the handler simply warns the user and returns.

The changes realted to in-kernel update to kdump kernel kexec segments
on CPU/Memory hotplug and online/offline events are kept under
CRASH_HOTPLUG config and it is enabled by default on PowerPC platform.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/Kconfig              |  3 ++
 arch/powerpc/include/asm/kexec.h  | 10 +++++
 arch/powerpc/kexec/core_64.c      | 61 +++++++++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c       | 12 +++++-
 arch/powerpc/kexec/file_load_64.c | 14 +++++++
 5 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 36f2fe0cc8a5..87f1b569c682 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -665,6 +665,9 @@ config RELOCATABLE_TEST
 config ARCH_HAS_CRASH_DUMP
 	def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)
 
+config ARCH_HAS_CRASH_HOTPLUG
+	def_bool y
+
 config ARCH_SUPPORTS_CRASH_DUMP
 	def_bool y
 	depends on CRASH_DUMP
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 8090ad7d97d9..154759bf0759 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -103,6 +103,16 @@ void kexec_copy_flush(struct kimage *image);
 struct crash_mem;
 int update_cpus_node(void *fdt);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
+
+#ifdef CONFIG_CRASH_HOTPLUG
+void arch_crash_handle_hotplug_event(struct kimage *image);
+#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
+
+#ifdef CONFIG_HOTPLUG_CPU
+static inline int crash_hotplug_cpu_support(void) { return 1; }
+#define crash_hotplug_cpu_support crash_hotplug_cpu_support
+#endif
+#endif
 #endif
 
 #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 0b292f93a74c..cb38da1c6dbe 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -543,6 +543,67 @@ int update_cpus_node(void *fdt)
 	return ret;
 }
 
+#ifdef CONFIG_CRASH_HOTPLUG
+#undef pr_fmt
+#define pr_fmt(fmt) "crash hp: " fmt
+
+/**
+ * arch_crash_handle_hotplug_event - Handle crash CPU/Memory hotplug events to update the
+ *				     necessary kexec segments based on the hotplug event.
+ * @image: the active struct kimage
+ *
+ * Update FDT segment to include newly added CPU. No action for CPU remove case.
+ */
+void arch_crash_handle_hotplug_event(struct kimage *image)
+{
+	void *fdt, *ptr;
+	unsigned long mem;
+	int i, fdt_index = -1;
+	unsigned int hp_action = image->hp_action;
+
+	/*
+	 * Since the hot-unplugged CPU is already part of crash FDT,
+	 * no action is needed for CPU remove case.
+	 */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
+		return;
+
+	/* crash update on memory hotplug events is not supported yet */
+	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
+		pr_info_once("Crash update is not supported for memory hotplug\n");
+		return;
+	}
+
+	/* Find the FDT segment index in kexec segment array. */
+	for (i = 0; i < image->nr_segments; i++) {
+		mem = image->segment[i].mem;
+		ptr = __va(mem);
+
+		if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
+			fdt_index = i;
+			break;
+		}
+	}
+
+	if (fdt_index < 0) {
+		pr_err("Unable to locate FDT segment.\n");
+		return;
+	}
+
+	fdt = __va((void *)image->segment[fdt_index].mem);
+
+	/* Temporarily invalidate the crash image while it is replaced */
+	xchg(&kexec_crash_image, NULL);
+
+	/* update FDT to refelect changes in CPU resrouces */
+	if (update_cpus_node(fdt))
+		pr_err("Failed to update crash FDT");
+
+	/* The crash image is now valid once again */
+	xchg(&kexec_crash_image, image);
+}
+#endif
+
 #ifdef CONFIG_PPC_64S_HASH_MMU
 /* Values we need to export to the second kernel via the device tree. */
 static unsigned long htab_base;
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index eeb258002d1e..4983d67efbf7 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -30,6 +30,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 			unsigned long cmdline_len)
 {
 	int ret;
+	bool do_pack_fdt = true;
 	unsigned long kernel_load_addr;
 	unsigned long initrd_load_addr = 0, fdt_load_addr;
 	void *fdt;
@@ -116,7 +117,16 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	if (ret)
 		goto out_free_fdt;
 
-	fdt_pack(fdt);
+#ifdef CONFIG_CRASH_HOTPLG
+	/*
+	 * Do not pack FDT, additional space is reserved to accommodate
+	 * possible CPU nodes which are not yet present in the system.
+	 */
+	if (image->type == KEXEC_TYPE_CRASH)
+		do_pack_fdt = false;
+#endif
+	if (do_pack_fdt)
+		fdt_pack(fdt);
 
 	kbuf.buffer = fdt;
 	kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt);
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 5b0b3f61e0e7..1757e7ba379a 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -908,6 +908,9 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
 	unsigned int cpu_nodes, extra_size = 0;
 	struct device_node *dn;
 	u64 usm_entries;
+#ifdef CONFIG_CRASH_HOTPLUG
+	unsigned int possible_cpu_nodes;
+#endif
 
 	// Budget some space for the password blob. There's already extra space
 	// for the key name
@@ -940,6 +943,17 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
 	if (cpu_nodes > boot_cpu_node_count)
 		extra_size += (cpu_nodes - boot_cpu_node_count) * cpu_node_size();
 
+#ifdef CONFIG_CRASH_HOTPLUG
+	/*
+	 * Make sure enough space is reserved to accommodate possible CPU nodes
+	 * in the crash FDT. This allows packing possible CPU nodes which are
+	 * not yet present in the system without regenerating the entire FDT.
+	 */
+	possible_cpu_nodes = num_possible_cpus() / threads_per_core;
+	if (image->type == KEXEC_TYPE_CRASH && possible_cpu_nodes > cpu_nodes)
+		extra_size += (possible_cpu_nodes - cpu_nodes) * cpu_node_size();
+#endif
+
 	return extra_size;
 }
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v11 3/4] crash: forward memory_notify args to arch crash hotplug handler
  2023-06-19  2:49 [PATCH v11 0/4] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel Sourabh Jain
  2023-06-19  2:49 ` [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public Sourabh Jain
  2023-06-19  2:49 ` [PATCH v11 2/4] powerpc/crash: add crash CPU hotplug support Sourabh Jain
@ 2023-06-19  2:49 ` Sourabh Jain
  2023-06-19  2:49 ` [PATCH v11 4/4] powerpc/crash: add crash memory hotplug support Sourabh Jain
  3 siblings, 0 replies; 7+ messages in thread
From: Sourabh Jain @ 2023-06-19  2:49 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: mahesh, ldufour, eric.devolder, kexec, hbathini

On PowerPC, memblock regions is used to prepare the elfcorehdr. This
elfcorehdr describes the memory regions of the running kernel to the
kdump kernel. However, a challenge arises as the notifier for the memory
hotplug crash handler is triggered before the memblock region update
takes place. Consequently, the newly prepared elfcorehdr still retains
the outdated memory regions. If the elfcorehdr is generated with these
outdated memblock regions, it will contain inaccurate information about
the memory regions. This can result in failures or incomplete dump
collection when attempting to collect a dump using the outdated
elfcorehdr.

This challenge is specific to removing an LMB (Local Memory Block). It
does not apply when memory is added. During memory addition, the memory
regions are updated first, and then the memory notify function calls the
arch crash hotplug handler to update the elfcorehdr.

The flow diagram below depicts the series of action taken during memory
hot removal.

 Initiate memory hot remove
          |
          v
 offline pages
          |
          v
 initiate memory notify call
 chain for MEM_OFFLINE event
          |
          v
 Prepare new elfcorehdr and replace
 it with old one.
	  |
	  V
 update memblock regions

The arch crash hotplug handler function signature is updated to pass
additional argument as the struct memory_notify object to architectures.

The memory_notify object contains the starting PFN (Page Frame Number)
and the number of pages in the hot removed memory. By utilizing this
information, the base address and size of the hot removed memory is
calculated and used to avoid adding the hot removed memory region to
the elfcorehdr.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h |  2 +-
 arch/powerpc/kexec/core_64.c     |  3 ++-
 arch/x86/include/asm/kexec.h     |  2 +-
 arch/x86/kernel/crash.c          |  5 +++--
 include/linux/kexec.h            |  2 +-
 kernel/crash_core.c              | 14 +++++++-------
 6 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 154759bf0759..d3ff481aa9f8 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -105,7 +105,7 @@ int update_cpus_node(void *fdt);
 int get_crash_memory_ranges(struct crash_mem **mem_ranges);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index cb38da1c6dbe..4d1c53cc9a90 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -551,10 +551,11 @@ int update_cpus_node(void *fdt)
  * arch_crash_handle_hotplug_event - Handle crash CPU/Memory hotplug events to update the
  *				     necessary kexec segments based on the hotplug event.
  * @image: the active struct kimage
+ * @arg: struct memory_notify handler for memory hotplug case and NULL for CPU hotplug case.
  *
  * Update FDT segment to include newly added CPU. No action for CPU remove case.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
 	void *fdt, *ptr;
 	unsigned long mem;
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 3be6a98751f0..ca0bc9b12504 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -210,7 +210,7 @@ extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
 extern void kdump_nmi_shootdown_cpus(void);
 
 #ifdef CONFIG_CRASH_HOTPLUG
-void arch_crash_handle_hotplug_event(struct kimage *image);
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 18d2a18d1073..d80a63665436 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -456,11 +456,12 @@ unsigned int arch_crash_get_elfcorehdr_size(void)
 
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
- * @image: a pointer to kexec_crash_image
+ * @image: the active struct kimage
+ * @arg: struct memory_notify handler for memory hotplug case and NULL for CPU hotplug case.
  *
  * Prepare the new elfcorehdr and replace the existing elfcorehdr.
  */
-void arch_crash_handle_hotplug_event(struct kimage *image)
+void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
 	void *elfbuf = NULL, *old_elfcorehdr;
 	unsigned long nr_mem_ranges;
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index bb0e614f2a05..b3fcee14fea4 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -502,7 +502,7 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) {
 #endif
 
 #ifndef arch_crash_handle_hotplug_event
-static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
+static inline void arch_crash_handle_hotplug_event(struct kimage *image, void *arg) { }
 #endif
 
 int crash_check_update_elfcorehdr(void);
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 7378b501fada..23a8ab9db5bb 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -745,7 +745,7 @@ int crash_check_update_elfcorehdr(void)
  * list of segments it checks (since the elfcorehdr changes and thus
  * would require an update to purgatory itself to update the digest).
  */
-static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
+static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
 {
 	struct kimage *image;
 
@@ -805,7 +805,7 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 	image->hp_action = hp_action;
 
 	/* Now invoke arch-specific update handler */
-	arch_crash_handle_hotplug_event(image);
+	arch_crash_handle_hotplug_event(image, arg);
 
 	/* No longer handling a hotplug event */
 	image->hp_action = KEXEC_CRASH_HP_NONE;
@@ -820,17 +820,17 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 	kexec_unlock();
 }
 
-static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *arg)
 {
 	switch (val) {
 	case MEM_ONLINE:
 		crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY,
-			KEXEC_CRASH_HP_INVALID_CPU);
+			KEXEC_CRASH_HP_INVALID_CPU, arg);
 		break;
 
 	case MEM_OFFLINE:
 		crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY,
-			KEXEC_CRASH_HP_INVALID_CPU);
+			KEXEC_CRASH_HP_INVALID_CPU, arg);
 		break;
 	}
 	return NOTIFY_OK;
@@ -843,13 +843,13 @@ static struct notifier_block crash_memhp_nb = {
 
 static int crash_cpuhp_online(unsigned int cpu)
 {
-	crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu);
+	crash_handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu, NULL);
 	return 0;
 }
 
 static int crash_cpuhp_offline(unsigned int cpu)
 {
-	crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu);
+	crash_handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu, NULL);
 	return 0;
 }
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v11 4/4] powerpc/crash: add crash memory hotplug support
  2023-06-19  2:49 [PATCH v11 0/4] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel Sourabh Jain
                   ` (2 preceding siblings ...)
  2023-06-19  2:49 ` [PATCH v11 3/4] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
@ 2023-06-19  2:49 ` Sourabh Jain
  3 siblings, 0 replies; 7+ messages in thread
From: Sourabh Jain @ 2023-06-19  2:49 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: mahesh, ldufour, eric.devolder, kexec, hbathini

Extend PowerPC arch crash hotplug handler to support memory hotplug
events. Since elfcorehdr is used to exchange the memory info between the
kernels hence it needs to be recreated to reflect the changes due to
memory hotplug events.

The way memory hotplug events are handled on PowerPC and the notifier
call chain used in generic code to trigger the arch crash handler, the
process to recreate the elfcorehdr is different for memory add and
remove case.

For memory remove case the memory change notifier call chain is
triggered first and then memblock regions is updated. Whereas for the
memory hot add case, memblock regions are updated before invoking the
memory change notifier call chain.

On PowerPC, memblock regions list is used to prepare the elfcorehdr. In
case of memory hot remove the memblock regions are updated after the
arch crash hotplug handler is triggered, hence an additional step is
taken to ensure that memory ranges used to prepare elfcorehdr do not
include hot removed memory.

When memory is hot removed it possible that memory regions count may
increase. So to accommodate a growing number of memory regions, the
elfcorehdr kexec segment is built with additional buffer space.

The changes done here will also work for the kexec_load system call given
that the kexec tool builds the elfcoredhr with additional space to
accommodate future memory regions as it is done for kexec_file_load
system call in the kernel.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h        |  6 ++
 arch/powerpc/include/asm/kexec_ranges.h |  1 +
 arch/powerpc/kexec/core_64.c            | 77 +++++++++++++++++++++-
 arch/powerpc/kexec/file_load_64.c       | 36 ++++++++++-
 arch/powerpc/kexec/ranges.c             | 85 +++++++++++++++++++++++++
 5 files changed, 201 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index d3ff481aa9f8..10017880571c 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -112,6 +112,12 @@ void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
 static inline int crash_hotplug_cpu_support(void) { return 1; }
 #define crash_hotplug_cpu_support crash_hotplug_cpu_support
 #endif
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+static inline int crash_hotplug_memory_support(void) { return 1; }
+#define crash_hotplug_memory_support crash_hotplug_memory_support
+#endif
+
 #endif
 #endif
 
diff --git a/arch/powerpc/include/asm/kexec_ranges.h b/arch/powerpc/include/asm/kexec_ranges.h
index f83866a19e87..802abf580cf0 100644
--- a/arch/powerpc/include/asm/kexec_ranges.h
+++ b/arch/powerpc/include/asm/kexec_ranges.h
@@ -7,6 +7,7 @@
 void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
 struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
 int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
 int add_tce_mem_ranges(struct crash_mem **mem_ranges);
 int add_initrd_mem_range(struct crash_mem **mem_ranges);
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 4d1c53cc9a90..e5038f2769bb 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/libfdt.h>
 #include <linux/memblock.h>
+#include <linux/memory.h>
 
 #include <asm/page.h>
 #include <asm/current.h>
@@ -547,6 +548,76 @@ int update_cpus_node(void *fdt)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
+/**
+ * update_crash_elfcorehdr() - Recreate the elfcorehdr and replace it with old
+ *			       elfcorehdr in the kexec segment array.
+ * @image: the active struct kimage
+ * @mn: struct memory_notify data handler
+ */
+static void update_crash_elfcorehdr(struct kimage *image, struct memory_notify *mn)
+{
+	int ret;
+	struct crash_mem *cmem = NULL;
+	struct kexec_segment *ksegment;
+	void *ptr, *mem, *elfbuf = NULL;
+	unsigned long elfsz, memsz, base_addr, size;
+
+	ksegment = &image->segment[image->elfcorehdr_index];
+	mem = (void *) ksegment->mem;
+	memsz = ksegment->memsz;
+
+	ret = get_crash_memory_ranges(&cmem);
+	if (ret) {
+		pr_err("Failed to get crash mem range\n");
+		return;
+	}
+
+	/*
+	 * The hot unplugged memory is part of crash memory ranges,
+	 * remove it here.
+	 */
+	if (image->hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY) {
+		base_addr = PFN_PHYS(mn->start_pfn);
+		size = mn->nr_pages * PAGE_SIZE;
+		ret = remove_mem_range(&cmem, base_addr, size);
+		if (ret) {
+			pr_err("Failed to remove hot-unplugged from crash memory ranges.\n");
+			return;
+		}
+	}
+
+	ret = crash_prepare_elf64_headers(cmem, false, &elfbuf, &elfsz);
+	if (ret) {
+		pr_err("Failed to prepare elf header\n");
+		return;
+	}
+
+	/*
+	 * It is unlikely that kernel hit this because elfcorehdr kexec
+	 * segment (memsz) is built with addition space to accommodate growing
+	 * number of crash memory ranges while loading the kdump kernel. It is
+	 * Just to avoid any unforeseen case.
+	 */
+	if (elfsz > memsz) {
+		pr_err("Updated crash elfcorehdr elfsz %lu > memsz %lu", elfsz, memsz);
+		goto out;
+	}
+
+	ptr = __va(mem);
+	if (ptr) {
+		/* Temporarily invalidate the crash image while it is replaced */
+		xchg(&kexec_crash_image, NULL);
+
+		/* Replace the old elfcorehdr with newly prepared elfcorehdr */
+		memcpy((void *)ptr, elfbuf, elfsz);
+
+		/* The crash image is now valid once again */
+		xchg(&kexec_crash_image, image);
+	}
+out:
+	vfree(elfbuf);
+}
+
 /**
  * arch_crash_handle_hotplug_event - Handle crash CPU/Memory hotplug events to update the
  *				     necessary kexec segments based on the hotplug event.
@@ -554,12 +625,14 @@ int update_cpus_node(void *fdt)
  * @arg: struct memory_notify handler for memory hotplug case and NULL for CPU hotplug case.
  *
  * Update FDT segment to include newly added CPU. No action for CPU remove case.
+ * Recreate the elfcorehdr for Memory add/remove case and replace it with old one.
  */
 void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 {
 	void *fdt, *ptr;
 	unsigned long mem;
 	int i, fdt_index = -1;
+	struct memory_notify *mn;
 	unsigned int hp_action = image->hp_action;
 
 	/*
@@ -569,9 +642,9 @@ void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
 	if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
 		return;
 
-	/* crash update on memory hotplug events is not supported yet */
 	if (hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY || hp_action == KEXEC_CRASH_HP_ADD_MEMORY) {
-		pr_info_once("Crash update is not supported for memory hotplug\n");
+		mn = (struct memory_notify *)arg;
+		update_crash_elfcorehdr(image, mn);
 		return;
 	}
 
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index 1757e7ba379a..8853b7308ae9 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -21,6 +21,8 @@
 #include <linux/memblock.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/elf.h>
+
 #include <asm/setup.h>
 #include <asm/drmem.h>
 #include <asm/firmware.h>
@@ -707,6 +709,30 @@ static void update_backup_region_phdr(struct kimage *image, Elf64_Ehdr *ehdr)
 	}
 }
 
+/* get_max_phdr - Find the total number of Phdr needed to represent the
+ *		  max memory in the kdump elfcorehdr.
+ *
+ * @cmem: crash memory ranges in the system.
+ */
+static int get_max_phdr(struct crash_mem *cmem)
+{
+	int max_lmb;
+
+	/* In the worst case, a Phdr is needed for every other LMB to be represented
+	 * as an individual crash range.
+	 */
+	max_lmb = memory_hotplug_max() / 2 * drmem_lmb_size();
+
+	/* Do not cross the Phdr max limit of the elf header.
+	 * Avoid counting Phdr for crash ranges (cmem->nr_ranges) which
+	 * are already part of elfcorehdr.
+	 */
+	if (max_lmb > PN_XNUM)
+		return PN_XNUM - cmem->nr_ranges;
+
+	return max_lmb - cmem->nr_ranges;
+}
+
 /**
  * load_elfcorehdr_segment - Setup crash memory ranges and initialize elfcorehdr
  *                           segment needed to load kdump kernel.
@@ -738,7 +764,13 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
 
 	kbuf->buffer = headers;
 	kbuf->mem = KEXEC_BUF_MEM_UNKNOWN;
-	kbuf->bufsz = kbuf->memsz = headers_sz;
+	kbuf->bufsz = headers_sz;
+/* Additional buffer space to accommodate future memory ranges */
+#ifdef CONFIG_MEMORY_HOTPLUG
+	kbuf->memsz = headers_sz + get_max_phdr(cmem) * sizeof(Elf64_Phdr);
+#else
+	kbuf->memsz = headers_sz;
+#endif
 	kbuf->top_down = false;
 
 	ret = kexec_add_buffer(kbuf);
@@ -748,7 +780,7 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
 	}
 
 	image->elf_load_addr = kbuf->mem;
-	image->elf_headers_sz = headers_sz;
+	image->elf_headers_sz = kbuf->memsz;
 	image->elf_headers = headers;
 out:
 	kfree(cmem);
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index 5fc53a5fcfdf..d8007363cdc1 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -234,6 +234,91 @@ int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
 	return __add_mem_range(mem_ranges, base, size);
 }
 
+/**
+ * remove_mem_range - Removes the given memory range from the range list.
+ * @mem_ranges:    Range list to remove the memory range to.
+ * @base:          Base address of the range to remove.
+ * @size:          Size of the memory range to remove.
+ *
+ * (Re)allocates memory, if needed.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
+{
+	u64 end;
+	int ret = 0;
+	unsigned int i;
+	u64 mstart, mend;
+	struct crash_mem *mem_rngs = *mem_ranges;
+
+	if (!size)
+		return 0;
+
+	/*
+	 * Memory range are stored as start and end address, use
+	 * the same format to do remove operation.
+	 */
+	end = base + size - 1;
+
+	for (i = 0; i < mem_rngs->nr_ranges; i++) {
+		mstart = mem_rngs->ranges[i].start;
+		mend = mem_rngs->ranges[i].end;
+
+		/*
+		 * Memory range to remove is not part of this range entry
+		 * in the memory range list
+		 */
+		if (!(base >= mstart && end <= mend))
+			continue;
+
+		/*
+		 * Memory range to remove is equivalent to this entry in the
+		 * memory range list. Remove the range entry from the list.
+		 */
+		if (base == mstart && end == mend) {
+			for (; i < mem_rngs->nr_ranges - 1; i++) {
+				mem_rngs->ranges[i].start = mem_rngs->ranges[i+1].start;
+				mem_rngs->ranges[i].end = mem_rngs->ranges[i+1].end;
+			}
+			mem_rngs->nr_ranges--;
+			goto out;
+		}
+		/*
+		 * Start address of the memory range to remove and the
+		 * current memory range entry in the list is same. Just
+		 * move the start address of the current memory range
+		 * entry in the list to end + 1.
+		 */
+		else if (base == mstart) {
+			mem_rngs->ranges[i].start = end + 1;
+			goto out;
+		}
+		/*
+		 * End address of the memory range to remove and the
+		 * current memory range entry in the list is same.
+		 * Just move the end address of the current memory
+		 * range entry in the list to base - 1.
+		 */
+		else if (end == mend)  {
+			mem_rngs->ranges[i].end = base - 1;
+			goto out;
+		}
+		/*
+		 * Memory range to remove is not at the edge of current
+		 * memory range entry. Split the current memory entry into
+		 * two half.
+		 */
+		else {
+			mem_rngs->ranges[i].end = base - 1;
+			size = mem_rngs->ranges[i].end - end;
+			ret = add_mem_range(mem_ranges, end + 1, size);
+		}
+	}
+out:
+	return ret;
+}
+
 /**
  * add_tce_mem_ranges - Adds tce-table range to the given memory ranges list.
  * @mem_ranges:         Range list to add the memory range(s) to.
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public
  2023-06-19  2:49 ` [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public Sourabh Jain
@ 2023-10-13 13:29   ` Christophe Leroy
  2023-10-13 16:46     ` Sourabh Jain
  0 siblings, 1 reply; 7+ messages in thread
From: Christophe Leroy @ 2023-10-13 13:29 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev, mpe
  Cc: ldufour, mahesh, kexec, Laurent Dufour, eric.devolder, hbathini



Le 19/06/2023 à 04:49, Sourabh Jain a écrit :
> Move update_cpus_node and get_crash_memory_ranges functions from
> kexec/file_load_64.c to kexec/core_64.c to make these functions usable
> by other kexec components.
> 
> Later in the series, these functions are utilized to do in-kernel update
> to kexec segments on CPU/Memory hot plug/unplug or online/offline events
> for both kexec_load and kexec_file_load syscalls.
> 
> No functional change intended.
> 
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com>

This patch doesn't apply, if still applicable can you rebase and resubmit ?

Thanks
Christophe

> ---
>   arch/powerpc/include/asm/kexec.h  |   6 ++
>   arch/powerpc/kexec/core_64.c      | 166 ++++++++++++++++++++++++++++++
>   arch/powerpc/kexec/file_load_64.c | 162 -----------------------------
>   3 files changed, 172 insertions(+), 162 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
> index a1ddba01e7d1..8090ad7d97d9 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -99,6 +99,12 @@ void relocate_new_kernel(unsigned long indirection_page, unsigned long reboot_co
>   
>   void kexec_copy_flush(struct kimage *image);
>   
> +#ifdef CONFIG_PPC64
> +struct crash_mem;
> +int update_cpus_node(void *fdt);
> +int get_crash_memory_ranges(struct crash_mem **mem_ranges);
> +#endif
> +
>   #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
>   void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
>   #define crash_free_reserved_phys_range crash_free_reserved_phys_range
> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
> index a79e28c91e2b..0b292f93a74c 100644
> --- a/arch/powerpc/kexec/core_64.c
> +++ b/arch/powerpc/kexec/core_64.c
> @@ -17,6 +17,8 @@
>   #include <linux/cpu.h>
>   #include <linux/hardirq.h>
>   #include <linux/of.h>
> +#include <linux/libfdt.h>
> +#include <linux/memblock.h>
>   
>   #include <asm/page.h>
>   #include <asm/current.h>
> @@ -30,6 +32,8 @@
>   #include <asm/hw_breakpoint.h>
>   #include <asm/svm.h>
>   #include <asm/ultravisor.h>
> +#include <asm/kexec_ranges.h>
> +#include <asm/crashdump-ppc64.h>
>   
>   int machine_kexec_prepare(struct kimage *image)
>   {
> @@ -377,6 +381,168 @@ void default_machine_kexec(struct kimage *image)
>   	/* NOTREACHED */
>   }
>   
> +/**
> + * get_crash_memory_ranges - Get crash memory ranges. This list includes
> + *                           first/crashing kernel's memory regions that
> + *                           would be exported via an elfcore.
> + * @mem_ranges:              Range list to add the memory ranges to.
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +int get_crash_memory_ranges(struct crash_mem **mem_ranges)
> +{
> +	phys_addr_t base, end;
> +	struct crash_mem *tmem;
> +	u64 i;
> +	int ret;
> +
> +	for_each_mem_range(i, &base, &end) {
> +		u64 size = end - base;
> +
> +		/* Skip backup memory region, which needs a separate entry */
> +		if (base == BACKUP_SRC_START) {
> +			if (size > BACKUP_SRC_SIZE) {
> +				base = BACKUP_SRC_END + 1;
> +				size -= BACKUP_SRC_SIZE;
> +			} else
> +				continue;
> +		}
> +
> +		ret = add_mem_range(mem_ranges, base, size);
> +		if (ret)
> +			goto out;
> +
> +		/* Try merging adjacent ranges before reallocation attempt */
> +		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
> +			sort_memory_ranges(*mem_ranges, true);
> +	}
> +
> +	/* Reallocate memory ranges if there is no space to split ranges */
> +	tmem = *mem_ranges;
> +	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
> +		tmem = realloc_mem_ranges(mem_ranges);
> +		if (!tmem)
> +			goto out;
> +	}
> +
> +	/* Exclude crashkernel region */
> +	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
> +	if (ret)
> +		goto out;
> +
> +	/*
> +	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
> +	 *        regions are exported to save their context at the time of
> +	 *        crash, they should actually be backed up just like the
> +	 *        first 64K bytes of memory.
> +	 */
> +	ret = add_rtas_mem_range(mem_ranges);
> +	if (ret)
> +		goto out;
> +
> +	ret = add_opal_mem_range(mem_ranges);
> +	if (ret)
> +		goto out;
> +
> +	/* create a separate program header for the backup region */
> +	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
> +	if (ret)
> +		goto out;
> +
> +	sort_memory_ranges(*mem_ranges, false);
> +out:
> +	if (ret)
> +		pr_err("Failed to setup crash memory ranges\n");
> +	return ret;
> +}
> +
> +/**
> + * add_node_props - Reads node properties from device node structure and add
> + *                  them to fdt.
> + * @fdt:            Flattened device tree of the kernel
> + * @node_offset:    offset of the node to add a property at
> + * @dn:             device node pointer
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
> +{
> +	int ret = 0;
> +	struct property *pp;
> +
> +	if (!dn)
> +		return -EINVAL;
> +
> +	for_each_property_of_node(dn, pp) {
> +		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
> +		if (ret < 0) {
> +			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
> +			return ret;
> +		}
> +	}
> +	return ret;
> +}
> +
> +/**
> + * update_cpus_node - Update cpus node of flattened device tree using of_root
> + *                    device node.
> + * @fdt:              Flattened device tree of the kernel.
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +int update_cpus_node(void *fdt)
> +{
> +	struct device_node *cpus_node, *dn;
> +	int cpus_offset, cpus_subnode_offset, ret = 0;
> +
> +	cpus_offset = fdt_path_offset(fdt, "/cpus");
> +	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
> +		pr_err("Malformed device tree: error reading /cpus node: %s\n",
> +		       fdt_strerror(cpus_offset));
> +		return cpus_offset;
> +	}
> +
> +	if (cpus_offset > 0) {
> +		ret = fdt_del_node(fdt, cpus_offset);
> +		if (ret < 0) {
> +			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
> +			return -EINVAL;
> +		}
> +	}
> +
> +	/* Add cpus node to fdt */
> +	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
> +	if (cpus_offset < 0) {
> +		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
> +		return -EINVAL;
> +	}
> +
> +	/* Add cpus node properties */
> +	cpus_node = of_find_node_by_path("/cpus");
> +	ret = add_node_props(fdt, cpus_offset, cpus_node);
> +	of_node_put(cpus_node);
> +	if (ret < 0)
> +		return ret;
> +
> +	/* Loop through all subnodes of cpus and add them to fdt */
> +	for_each_node_by_type(dn, "cpu") {
> +		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
> +		if (cpus_subnode_offset < 0) {
> +			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
> +			       fdt_strerror(cpus_subnode_offset));
> +			ret = cpus_subnode_offset;
> +			goto out;
> +		}
> +
> +		ret = add_node_props(fdt, cpus_subnode_offset, dn);
> +		if (ret < 0)
> +			goto out;
> +	}
> +out:
> +	of_node_put(dn);
> +	return ret;
> +}
> +
>   #ifdef CONFIG_PPC_64S_HASH_MMU
>   /* Values we need to export to the second kernel via the device tree. */
>   static unsigned long htab_base;
> diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
> index 110d28bede2a..5b0b3f61e0e7 100644
> --- a/arch/powerpc/kexec/file_load_64.c
> +++ b/arch/powerpc/kexec/file_load_64.c
> @@ -133,81 +133,6 @@ static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>   	return ret;
>   }
>   
> -/**
> - * get_crash_memory_ranges - Get crash memory ranges. This list includes
> - *                           first/crashing kernel's memory regions that
> - *                           would be exported via an elfcore.
> - * @mem_ranges:              Range list to add the memory ranges to.
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -static int get_crash_memory_ranges(struct crash_mem **mem_ranges)
> -{
> -	phys_addr_t base, end;
> -	struct crash_mem *tmem;
> -	u64 i;
> -	int ret;
> -
> -	for_each_mem_range(i, &base, &end) {
> -		u64 size = end - base;
> -
> -		/* Skip backup memory region, which needs a separate entry */
> -		if (base == BACKUP_SRC_START) {
> -			if (size > BACKUP_SRC_SIZE) {
> -				base = BACKUP_SRC_END + 1;
> -				size -= BACKUP_SRC_SIZE;
> -			} else
> -				continue;
> -		}
> -
> -		ret = add_mem_range(mem_ranges, base, size);
> -		if (ret)
> -			goto out;
> -
> -		/* Try merging adjacent ranges before reallocation attempt */
> -		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
> -			sort_memory_ranges(*mem_ranges, true);
> -	}
> -
> -	/* Reallocate memory ranges if there is no space to split ranges */
> -	tmem = *mem_ranges;
> -	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
> -		tmem = realloc_mem_ranges(mem_ranges);
> -		if (!tmem)
> -			goto out;
> -	}
> -
> -	/* Exclude crashkernel region */
> -	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
> -	if (ret)
> -		goto out;
> -
> -	/*
> -	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
> -	 *        regions are exported to save their context at the time of
> -	 *        crash, they should actually be backed up just like the
> -	 *        first 64K bytes of memory.
> -	 */
> -	ret = add_rtas_mem_range(mem_ranges);
> -	if (ret)
> -		goto out;
> -
> -	ret = add_opal_mem_range(mem_ranges);
> -	if (ret)
> -		goto out;
> -
> -	/* create a separate program header for the backup region */
> -	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
> -	if (ret)
> -		goto out;
> -
> -	sort_memory_ranges(*mem_ranges, false);
> -out:
> -	if (ret)
> -		pr_err("Failed to setup crash memory ranges\n");
> -	return ret;
> -}
> -
>   /**
>    * get_reserved_memory_ranges - Get reserve memory ranges. This list includes
>    *                              memory regions that should be added to the
> @@ -1018,93 +943,6 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
>   	return extra_size;
>   }
>   
> -/**
> - * add_node_props - Reads node properties from device node structure and add
> - *                  them to fdt.
> - * @fdt:            Flattened device tree of the kernel
> - * @node_offset:    offset of the node to add a property at
> - * @dn:             device node pointer
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
> -{
> -	int ret = 0;
> -	struct property *pp;
> -
> -	if (!dn)
> -		return -EINVAL;
> -
> -	for_each_property_of_node(dn, pp) {
> -		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
> -		if (ret < 0) {
> -			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
> -			return ret;
> -		}
> -	}
> -	return ret;
> -}
> -
> -/**
> - * update_cpus_node - Update cpus node of flattened device tree using of_root
> - *                    device node.
> - * @fdt:              Flattened device tree of the kernel.
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -static int update_cpus_node(void *fdt)
> -{
> -	struct device_node *cpus_node, *dn;
> -	int cpus_offset, cpus_subnode_offset, ret = 0;
> -
> -	cpus_offset = fdt_path_offset(fdt, "/cpus");
> -	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
> -		pr_err("Malformed device tree: error reading /cpus node: %s\n",
> -		       fdt_strerror(cpus_offset));
> -		return cpus_offset;
> -	}
> -
> -	if (cpus_offset > 0) {
> -		ret = fdt_del_node(fdt, cpus_offset);
> -		if (ret < 0) {
> -			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
> -			return -EINVAL;
> -		}
> -	}
> -
> -	/* Add cpus node to fdt */
> -	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
> -	if (cpus_offset < 0) {
> -		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
> -		return -EINVAL;
> -	}
> -
> -	/* Add cpus node properties */
> -	cpus_node = of_find_node_by_path("/cpus");
> -	ret = add_node_props(fdt, cpus_offset, cpus_node);
> -	of_node_put(cpus_node);
> -	if (ret < 0)
> -		return ret;
> -
> -	/* Loop through all subnodes of cpus and add them to fdt */
> -	for_each_node_by_type(dn, "cpu") {
> -		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
> -		if (cpus_subnode_offset < 0) {
> -			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
> -			       fdt_strerror(cpus_subnode_offset));
> -			ret = cpus_subnode_offset;
> -			goto out;
> -		}
> -
> -		ret = add_node_props(fdt, cpus_subnode_offset, dn);
> -		if (ret < 0)
> -			goto out;
> -	}
> -out:
> -	of_node_put(dn);
> -	return ret;
> -}
> -
>   static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
>   			 const char *propname)
>   {

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public
  2023-10-13 13:29   ` Christophe Leroy
@ 2023-10-13 16:46     ` Sourabh Jain
  0 siblings, 0 replies; 7+ messages in thread
From: Sourabh Jain @ 2023-10-13 16:46 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev, mpe
  Cc: ldufour, mahesh, kexec, Laurent Dufour, eric.devolder, hbathini

Hello Christophe,

On 13/10/23 18:59, Christophe Leroy wrote:
>
> Le 19/06/2023 à 04:49, Sourabh Jain a écrit :
>> Move update_cpus_node and get_crash_memory_ranges functions from
>> kexec/file_load_64.c to kexec/core_64.c to make these functions usable
>> by other kexec components.
>>
>> Later in the series, these functions are utilized to do in-kernel update
>> to kexec segments on CPU/Memory hot plug/unplug or online/offline events
>> for both kexec_load and kexec_file_load syscalls.
>>
>> No functional change intended.
>>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com>
> This patch doesn't apply, if still applicable can you rebase and resubmit ?

I will be sending next version soon.

Thanks
Sourabh
>
>> ---
>>    arch/powerpc/include/asm/kexec.h  |   6 ++
>>    arch/powerpc/kexec/core_64.c      | 166 ++++++++++++++++++++++++++++++
>>    arch/powerpc/kexec/file_load_64.c | 162 -----------------------------
>>    3 files changed, 172 insertions(+), 162 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
>> index a1ddba01e7d1..8090ad7d97d9 100644
>> --- a/arch/powerpc/include/asm/kexec.h
>> +++ b/arch/powerpc/include/asm/kexec.h
>> @@ -99,6 +99,12 @@ void relocate_new_kernel(unsigned long indirection_page, unsigned long reboot_co
>>    
>>    void kexec_copy_flush(struct kimage *image);
>>    
>> +#ifdef CONFIG_PPC64
>> +struct crash_mem;
>> +int update_cpus_node(void *fdt);
>> +int get_crash_memory_ranges(struct crash_mem **mem_ranges);
>> +#endif
>> +
>>    #if defined(CONFIG_CRASH_DUMP) && defined(CONFIG_PPC_RTAS)
>>    void crash_free_reserved_phys_range(unsigned long begin, unsigned long end);
>>    #define crash_free_reserved_phys_range crash_free_reserved_phys_range
>> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
>> index a79e28c91e2b..0b292f93a74c 100644
>> --- a/arch/powerpc/kexec/core_64.c
>> +++ b/arch/powerpc/kexec/core_64.c
>> @@ -17,6 +17,8 @@
>>    #include <linux/cpu.h>
>>    #include <linux/hardirq.h>
>>    #include <linux/of.h>
>> +#include <linux/libfdt.h>
>> +#include <linux/memblock.h>
>>    
>>    #include <asm/page.h>
>>    #include <asm/current.h>
>> @@ -30,6 +32,8 @@
>>    #include <asm/hw_breakpoint.h>
>>    #include <asm/svm.h>
>>    #include <asm/ultravisor.h>
>> +#include <asm/kexec_ranges.h>
>> +#include <asm/crashdump-ppc64.h>
>>    
>>    int machine_kexec_prepare(struct kimage *image)
>>    {
>> @@ -377,6 +381,168 @@ void default_machine_kexec(struct kimage *image)
>>    	/* NOTREACHED */
>>    }
>>    
>> +/**
>> + * get_crash_memory_ranges - Get crash memory ranges. This list includes
>> + *                           first/crashing kernel's memory regions that
>> + *                           would be exported via an elfcore.
>> + * @mem_ranges:              Range list to add the memory ranges to.
>> + *
>> + * Returns 0 on success, negative errno on error.
>> + */
>> +int get_crash_memory_ranges(struct crash_mem **mem_ranges)
>> +{
>> +	phys_addr_t base, end;
>> +	struct crash_mem *tmem;
>> +	u64 i;
>> +	int ret;
>> +
>> +	for_each_mem_range(i, &base, &end) {
>> +		u64 size = end - base;
>> +
>> +		/* Skip backup memory region, which needs a separate entry */
>> +		if (base == BACKUP_SRC_START) {
>> +			if (size > BACKUP_SRC_SIZE) {
>> +				base = BACKUP_SRC_END + 1;
>> +				size -= BACKUP_SRC_SIZE;
>> +			} else
>> +				continue;
>> +		}
>> +
>> +		ret = add_mem_range(mem_ranges, base, size);
>> +		if (ret)
>> +			goto out;
>> +
>> +		/* Try merging adjacent ranges before reallocation attempt */
>> +		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
>> +			sort_memory_ranges(*mem_ranges, true);
>> +	}
>> +
>> +	/* Reallocate memory ranges if there is no space to split ranges */
>> +	tmem = *mem_ranges;
>> +	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>> +		tmem = realloc_mem_ranges(mem_ranges);
>> +		if (!tmem)
>> +			goto out;
>> +	}
>> +
>> +	/* Exclude crashkernel region */
>> +	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
>> +	if (ret)
>> +		goto out;
>> +
>> +	/*
>> +	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
>> +	 *        regions are exported to save their context at the time of
>> +	 *        crash, they should actually be backed up just like the
>> +	 *        first 64K bytes of memory.
>> +	 */
>> +	ret = add_rtas_mem_range(mem_ranges);
>> +	if (ret)
>> +		goto out;
>> +
>> +	ret = add_opal_mem_range(mem_ranges);
>> +	if (ret)
>> +		goto out;
>> +
>> +	/* create a separate program header for the backup region */
>> +	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
>> +	if (ret)
>> +		goto out;
>> +
>> +	sort_memory_ranges(*mem_ranges, false);
>> +out:
>> +	if (ret)
>> +		pr_err("Failed to setup crash memory ranges\n");
>> +	return ret;
>> +}
>> +
>> +/**
>> + * add_node_props - Reads node properties from device node structure and add
>> + *                  them to fdt.
>> + * @fdt:            Flattened device tree of the kernel
>> + * @node_offset:    offset of the node to add a property at
>> + * @dn:             device node pointer
>> + *
>> + * Returns 0 on success, negative errno on error.
>> + */
>> +static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
>> +{
>> +	int ret = 0;
>> +	struct property *pp;
>> +
>> +	if (!dn)
>> +		return -EINVAL;
>> +
>> +	for_each_property_of_node(dn, pp) {
>> +		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
>> +		if (ret < 0) {
>> +			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
>> +			return ret;
>> +		}
>> +	}
>> +	return ret;
>> +}
>> +
>> +/**
>> + * update_cpus_node - Update cpus node of flattened device tree using of_root
>> + *                    device node.
>> + * @fdt:              Flattened device tree of the kernel.
>> + *
>> + * Returns 0 on success, negative errno on error.
>> + */
>> +int update_cpus_node(void *fdt)
>> +{
>> +	struct device_node *cpus_node, *dn;
>> +	int cpus_offset, cpus_subnode_offset, ret = 0;
>> +
>> +	cpus_offset = fdt_path_offset(fdt, "/cpus");
>> +	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
>> +		pr_err("Malformed device tree: error reading /cpus node: %s\n",
>> +		       fdt_strerror(cpus_offset));
>> +		return cpus_offset;
>> +	}
>> +
>> +	if (cpus_offset > 0) {
>> +		ret = fdt_del_node(fdt, cpus_offset);
>> +		if (ret < 0) {
>> +			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
>> +			return -EINVAL;
>> +		}
>> +	}
>> +
>> +	/* Add cpus node to fdt */
>> +	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
>> +	if (cpus_offset < 0) {
>> +		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* Add cpus node properties */
>> +	cpus_node = of_find_node_by_path("/cpus");
>> +	ret = add_node_props(fdt, cpus_offset, cpus_node);
>> +	of_node_put(cpus_node);
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	/* Loop through all subnodes of cpus and add them to fdt */
>> +	for_each_node_by_type(dn, "cpu") {
>> +		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
>> +		if (cpus_subnode_offset < 0) {
>> +			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
>> +			       fdt_strerror(cpus_subnode_offset));
>> +			ret = cpus_subnode_offset;
>> +			goto out;
>> +		}
>> +
>> +		ret = add_node_props(fdt, cpus_subnode_offset, dn);
>> +		if (ret < 0)
>> +			goto out;
>> +	}
>> +out:
>> +	of_node_put(dn);
>> +	return ret;
>> +}
>> +
>>    #ifdef CONFIG_PPC_64S_HASH_MMU
>>    /* Values we need to export to the second kernel via the device tree. */
>>    static unsigned long htab_base;
>> diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
>> index 110d28bede2a..5b0b3f61e0e7 100644
>> --- a/arch/powerpc/kexec/file_load_64.c
>> +++ b/arch/powerpc/kexec/file_load_64.c
>> @@ -133,81 +133,6 @@ static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>>    	return ret;
>>    }
>>    
>> -/**
>> - * get_crash_memory_ranges - Get crash memory ranges. This list includes
>> - *                           first/crashing kernel's memory regions that
>> - *                           would be exported via an elfcore.
>> - * @mem_ranges:              Range list to add the memory ranges to.
>> - *
>> - * Returns 0 on success, negative errno on error.
>> - */
>> -static int get_crash_memory_ranges(struct crash_mem **mem_ranges)
>> -{
>> -	phys_addr_t base, end;
>> -	struct crash_mem *tmem;
>> -	u64 i;
>> -	int ret;
>> -
>> -	for_each_mem_range(i, &base, &end) {
>> -		u64 size = end - base;
>> -
>> -		/* Skip backup memory region, which needs a separate entry */
>> -		if (base == BACKUP_SRC_START) {
>> -			if (size > BACKUP_SRC_SIZE) {
>> -				base = BACKUP_SRC_END + 1;
>> -				size -= BACKUP_SRC_SIZE;
>> -			} else
>> -				continue;
>> -		}
>> -
>> -		ret = add_mem_range(mem_ranges, base, size);
>> -		if (ret)
>> -			goto out;
>> -
>> -		/* Try merging adjacent ranges before reallocation attempt */
>> -		if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
>> -			sort_memory_ranges(*mem_ranges, true);
>> -	}
>> -
>> -	/* Reallocate memory ranges if there is no space to split ranges */
>> -	tmem = *mem_ranges;
>> -	if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>> -		tmem = realloc_mem_ranges(mem_ranges);
>> -		if (!tmem)
>> -			goto out;
>> -	}
>> -
>> -	/* Exclude crashkernel region */
>> -	ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
>> -	if (ret)
>> -		goto out;
>> -
>> -	/*
>> -	 * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
>> -	 *        regions are exported to save their context at the time of
>> -	 *        crash, they should actually be backed up just like the
>> -	 *        first 64K bytes of memory.
>> -	 */
>> -	ret = add_rtas_mem_range(mem_ranges);
>> -	if (ret)
>> -		goto out;
>> -
>> -	ret = add_opal_mem_range(mem_ranges);
>> -	if (ret)
>> -		goto out;
>> -
>> -	/* create a separate program header for the backup region */
>> -	ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
>> -	if (ret)
>> -		goto out;
>> -
>> -	sort_memory_ranges(*mem_ranges, false);
>> -out:
>> -	if (ret)
>> -		pr_err("Failed to setup crash memory ranges\n");
>> -	return ret;
>> -}
>> -
>>    /**
>>     * get_reserved_memory_ranges - Get reserve memory ranges. This list includes
>>     *                              memory regions that should be added to the
>> @@ -1018,93 +943,6 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
>>    	return extra_size;
>>    }
>>    
>> -/**
>> - * add_node_props - Reads node properties from device node structure and add
>> - *                  them to fdt.
>> - * @fdt:            Flattened device tree of the kernel
>> - * @node_offset:    offset of the node to add a property at
>> - * @dn:             device node pointer
>> - *
>> - * Returns 0 on success, negative errno on error.
>> - */
>> -static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
>> -{
>> -	int ret = 0;
>> -	struct property *pp;
>> -
>> -	if (!dn)
>> -		return -EINVAL;
>> -
>> -	for_each_property_of_node(dn, pp) {
>> -		ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
>> -		if (ret < 0) {
>> -			pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
>> -			return ret;
>> -		}
>> -	}
>> -	return ret;
>> -}
>> -
>> -/**
>> - * update_cpus_node - Update cpus node of flattened device tree using of_root
>> - *                    device node.
>> - * @fdt:              Flattened device tree of the kernel.
>> - *
>> - * Returns 0 on success, negative errno on error.
>> - */
>> -static int update_cpus_node(void *fdt)
>> -{
>> -	struct device_node *cpus_node, *dn;
>> -	int cpus_offset, cpus_subnode_offset, ret = 0;
>> -
>> -	cpus_offset = fdt_path_offset(fdt, "/cpus");
>> -	if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
>> -		pr_err("Malformed device tree: error reading /cpus node: %s\n",
>> -		       fdt_strerror(cpus_offset));
>> -		return cpus_offset;
>> -	}
>> -
>> -	if (cpus_offset > 0) {
>> -		ret = fdt_del_node(fdt, cpus_offset);
>> -		if (ret < 0) {
>> -			pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
>> -			return -EINVAL;
>> -		}
>> -	}
>> -
>> -	/* Add cpus node to fdt */
>> -	cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
>> -	if (cpus_offset < 0) {
>> -		pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
>> -		return -EINVAL;
>> -	}
>> -
>> -	/* Add cpus node properties */
>> -	cpus_node = of_find_node_by_path("/cpus");
>> -	ret = add_node_props(fdt, cpus_offset, cpus_node);
>> -	of_node_put(cpus_node);
>> -	if (ret < 0)
>> -		return ret;
>> -
>> -	/* Loop through all subnodes of cpus and add them to fdt */
>> -	for_each_node_by_type(dn, "cpu") {
>> -		cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
>> -		if (cpus_subnode_offset < 0) {
>> -			pr_err("Unable to add %s subnode: %s\n", dn->full_name,
>> -			       fdt_strerror(cpus_subnode_offset));
>> -			ret = cpus_subnode_offset;
>> -			goto out;
>> -		}
>> -
>> -		ret = add_node_props(fdt, cpus_subnode_offset, dn);
>> -		if (ret < 0)
>> -			goto out;
>> -	}
>> -out:
>> -	of_node_put(dn);
>> -	return ret;
>> -}
>> -
>>    static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
>>    			 const char *propname)
>>    {


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-10-13 16:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-19  2:49 [PATCH v11 0/4] PowerPC: In-kernel handling of CPU/Memory hotplug/online/offline events for kdump kernel Sourabh Jain
2023-06-19  2:49 ` [PATCH v11 1/4] powerpc/kexec: turn some static helper functions public Sourabh Jain
2023-10-13 13:29   ` Christophe Leroy
2023-10-13 16:46     ` Sourabh Jain
2023-06-19  2:49 ` [PATCH v11 2/4] powerpc/crash: add crash CPU hotplug support Sourabh Jain
2023-06-19  2:49 ` [PATCH v11 3/4] crash: forward memory_notify args to arch crash hotplug handler Sourabh Jain
2023-06-19  2:49 ` [PATCH v11 4/4] powerpc/crash: add crash memory hotplug support Sourabh Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).