linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug
@ 2022-07-21 18:17 Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h Eric DeVolder
                   ` (8 more replies)
  0 siblings, 9 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

When the kdump service is loaded, if a CPU or memory is hot
un/plugged, the crash elfcorehdr, which describes the CPUs
and memory in the system, must also be updated, else the resulting
vmcore is inaccurate (eg. missing either CPU context or memory
regions).

The current solution utilizes udev to initiate an unload-then-reload
of the kdump image (e. kernel, initrd, boot_params, puratory and
elfcorehdr) by the userspace kexec utility. In previous posts I have
outlined the significant performance problems related to offloading
this activity to userspace.

This patchset introduces a generic crash hot un/plug handler that
registers with the CPU and memory notifiers. Upon CPU or memory
changes, this generic handler is invoked and performs important
housekeeping, for example obtaining the appropriate lock, and then
invokes an architecture specific handler to do the appropriate
updates.

In the case of x86_64, the arch specific handler generates a new
elfcorehdr, and overwrites the old one in memory. No involvement
with userspace needed.

To realize the benefits/test this patchset, one must make a couple
of minor changes to userspace:

 - Prevent udev from updating kdump crash kernel on hot un/plug changes.
   Add the following as the first lines to the udev rule file
   /usr/lib/udev/rules.d/98-kexec.rules:

   # The kernel handles updates to crash elfcorehdr for cpu and memory changes
   SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
   SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

   These lines will cause cpu and memory hot un/plug events to be
   skipped within this rule file, if the kernel has these changes
   enabled.

 - Change to the kexec_file_load for loading the kdump kernel:
   Eg. on RHEL: in /usr/bin/kdumpctl, change to:
    standard_kexec_args="-p -d -s"
   which adds the -s to select kexec_file_load syscall.

This patchset supports kexec_load with a modified kexec userspace
utility, and a working changeset to the kexec userspace utility
is provided here (and to use, the above change to standard_kexec_args
would be, for example, to append --hotplug instead of -s).

  diff --git a/kexec/arch/i386/crashdump-x86.c b/kexec/arch/i386/crashdump-x86.c
  index 9826f6d..4ed395a 100644
  --- a/kexec/arch/i386/crashdump-x86.c
  +++ b/kexec/arch/i386/crashdump-x86.c
  @@ -48,6 +48,7 @@
   #include <x86/x86-linux.h>
   
   extern struct arch_options_t arch_options;
  +extern int do_hotplug;
   
   static int get_kernel_page_offset(struct kexec_info *UNUSED(info),
   				  struct crash_elf_info *elf_info)
  @@ -975,6 +976,14 @@ int load_crashdump_segments(struct kexec_info *info, char* mod_cmdline,
   	} else {
   		memsz = bufsz;
   	}
  +
  +	/* If hotplug support enabled, use larger size to accomodate changes */
  +	if (do_hotplug) {
  +		long int nr_cpus = get_nr_cpus();
  +		memsz = (nr_cpus + CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr);
  +	}
  +
  +    info->elfcorehdr =
   	elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
   							max_addr, -1);
   	dbgprintf("Created elf header segment at 0x%lx\n", elfcorehdr);
  diff --git a/kexec/crashdump-elf.c b/kexec/crashdump-elf.c
  index b8bb686..5e29f7a 100644
  --- a/kexec/crashdump-elf.c
  +++ b/kexec/crashdump-elf.c
  @@ -43,11 +43,7 @@ int FUNC(struct kexec_info *info,
   	int (*get_note_info)(int cpu, uint64_t *addr, uint64_t *len);
   	long int count_cpu;
   
  -	if (xen_present())
  -		nr_cpus = xen_get_nr_phys_cpus();
  -	else
  -		nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
  -
  +	nr_cpus = get_nr_cpus();
   	if (nr_cpus < 0) {
   		return -1;
   	}
  diff --git a/kexec/crashdump.h b/kexec/crashdump.h
  index 18bd691..28d3278 100644
  --- a/kexec/crashdump.h
  +++ b/kexec/crashdump.h
  @@ -57,7 +57,6 @@ unsigned long phys_to_virt(struct crash_elf_info *elf_info,
   			   unsigned long long paddr);
   
   unsigned long xen_architecture(struct crash_elf_info *elf_info);
  -int xen_get_nr_phys_cpus(void);
   int xen_get_note(int cpu, uint64_t *addr, uint64_t *len);
   int xen_get_crashkernel_region(uint64_t *start, uint64_t *end);
   
  diff --git a/kexec/kexec-xen.h b/kexec/kexec-xen.h
  index 70fb576..f54a2dd 100644
  --- a/kexec/kexec-xen.h
  +++ b/kexec/kexec-xen.h
  @@ -83,5 +83,6 @@ extern int __xc_interface_close(xc_interface *xch);
   #endif
   
   int xen_get_kexec_range(int range, uint64_t *start, uint64_t *end);
  +int xen_get_nr_phys_cpus(void);
   
   #endif /* KEXEC_XEN_H */
  diff --git a/kexec/kexec.c b/kexec/kexec.c
  index 829a6ea..3668b73 100644
  --- a/kexec/kexec.c
  +++ b/kexec/kexec.c
  @@ -58,6 +58,7 @@
   
   unsigned long long mem_min = 0;
   unsigned long long mem_max = ULONG_MAX;
  +int do_hotplug = 0;
   static unsigned long kexec_flags = 0;
   /* Flags for kexec file (fd) based syscall */
   static unsigned long kexec_file_flags = 0;
  @@ -489,6 +490,17 @@ static int add_backup_segments(struct kexec_info *info,
   	return 0;
   }
   
  +long int get_nr_cpus(void)
  +{
  +    long int nr_cpus;
  +
  +	if (xen_present())
  +		nr_cpus = xen_get_nr_phys_cpus();
  +	else
  +		nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
  +    return nr_cpus;
  +}
  +
   static char *slurp_fd(int fd, const char *filename, off_t size, off_t *nread)
   {
   	char *buf;
  @@ -672,6 +684,14 @@ static void update_purgatory(struct kexec_info *info)
   		if (info->segment[i].mem == (void *)info->rhdr.rel_addr) {
   			continue;
   		}
  +
  +		/* Don't include elfcorehdr in the checksum, if hotplug
  +		 * support enabled.
  +		 */
  +		if (do_hotplug && (info->segment[i].mem == (void *)info->elfcorehdr)) {
  +			continue;
  +		}
  +
   		sha256_update(&ctx, info->segment[i].buf,
   			      info->segment[i].bufsz);
   		nullsz = info->segment[i].memsz - info->segment[i].bufsz;
  @@ -1565,6 +1585,9 @@ int main(int argc, char *argv[])
   		case OPT_PRINT_CKR_SIZE:
   			print_crashkernel_region_size();
   			return 0;
  +		case OPT_HOTPLUG:
  +			do_hotplug = 1;
  +			break;
   		default:
   			break;
   		}
  diff --git a/kexec/kexec.h b/kexec/kexec.h
  index 0f97a97..b0428cc 100644
  --- a/kexec/kexec.h
  +++ b/kexec/kexec.h
  @@ -169,6 +169,7 @@ struct kexec_info {
   	int command_line_len;
   
   	int skip_checks;
  +	unsigned long elfcorehdr;
   };
   
   struct arch_map_entry {
  @@ -231,7 +232,8 @@ extern int file_types;
   #define OPT_PRINT_CKR_SIZE	262
   #define OPT_LOAD_LIVE_UPDATE	263
   #define OPT_EXEC_LIVE_UPDATE	264
  -#define OPT_MAX			265
  +#define OPT_HOTPLUG		265
  +#define OPT_MAX		266
   #define KEXEC_OPTIONS \
   	{ "help",		0, 0, OPT_HELP }, \
   	{ "version",		0, 0, OPT_VERSION }, \
  @@ -258,6 +260,7 @@ extern int file_types;
   	{ "debug",		0, 0, OPT_DEBUG }, \
   	{ "status",		0, 0, OPT_STATUS }, \
   	{ "print-ckr-size",     0, 0, OPT_PRINT_CKR_SIZE }, \
  +	{ "hotplug",		0, 0, OPT_HOTPLUG }, \
   
   #define KEXEC_OPT_STR "h?vdfixyluet:pscaS"
   
  @@ -290,6 +293,8 @@ extern unsigned long add_buffer_phys_virt(struct kexec_info *info,
   	int buf_end, int phys);
   extern void arch_reuse_initrd(void);
   
  +extern long int get_nr_cpus(void);
  +
   extern int ifdown(void);
   
   extern char purgatory[];

Regards,
eric
---
v10: 21jul2022
 - Rebased to 5.19.0-rc7
 - Per Sourabh, corrected build issue with arch_un/map_crash_pages()
   for architectures not supporting this feature.
 - Per David Hildebrand, removed the WARN_ONCE() altogether.
 - Per David Hansen, converted to use of kmap_local_page().
 - Per Baoquan He, replaced use of __weak with the kexec technique.

v9: 13jun2022
 https://lkml.org/lkml/2022/6/13/3382
 - Rebased to 5.18.0
 - Per Sourabh, moved crash_prepare_elf64_headers() into common
   crash_core.c to avoid compile issues with kexec_load only path.
 - Per David Hildebrand, replaced mutex_trylock() with mutex_lock().
 - Changed the __weak arch_crash_handle_hotplug_event() to utilize
   WARN_ONCE() instead of WARN(). Fix some formatting issues.
 - Per Sourabh, introduced sysfs attribute crash_hotplug for memory
   and CPUs; for use by userspace (udev) to determine if the kernel
   performs crash hot un/plug support.
 - Per Sourabh, moved the code detecting the elfcorehdr segment from
   arch/x86 into crash_core:handle_hotplug_event() so both kexec_load
   and kexec_file_load can benefit.
 - Updated userspace kexec-tools kexec utility to reflect change to
   using CRASH_MAX_MEMORY_RANGES and get_nr_cpus().
 - Updated the new proposed udev rules to reflect using the sysfs
   attributes crash_hotplug.

v8: 5may2022
 https://lkml.org/lkml/2022/5/5/1133
 - Per Borislav Petkov, eliminated CONFIG_CRASH_HOTPLUG in favor
   of CONFIG_HOTPLUG_CPU || CONFIG_MEMORY_HOTPLUG, ie a new define
   is not needed. Also use of IS_ENABLED() rather than #ifdef's.
   Renamed crash_hotplug_handler() to handle_hotplug_event().
   And other corrections.
 - Per Baoquan, minimized the parameters to the arch_crash_
   handle_hotplug_event() to hp_action and cpu.
 - Introduce KEXEC_CRASH_HP_INVALID_CPU definition, per Baoquan.
 - Per Sourabh Jain, renamed and repurposed CRASH_HOTPLUG_ELFCOREHDR_SZ
   to CONFIG_CRASH_MAX_MEMORY_RANGES, mirroring kexec-tools change
   by David Hildebrand. Folded this patch into the x86
   kexec_file_load support patch.

v7: 13apr2022
 https://lkml.org/lkml/2022/4/13/850
 - Resolved parameter usage to crash_hotplug_handler(), per Baoquan.

v6: 1apr2022
 https://lkml.org/lkml/2022/4/1/1203
 - Reword commit messages and some comment cleanup per Baoquan.
 - Changed elf_index to elfcorehdr_index for clarity.
 - Minor code changes per Baoquan.

v5: 3mar2022
 https://lkml.org/lkml/2022/3/3/674
 - Reworded description of CRASH_HOTPLUG_ELFCOREHDR_SZ, per
   David Hildenbrand.
 - Refactored slightly a few patches per Baoquan recommendation.

v4: 9feb2022
 https://lkml.org/lkml/2022/2/9/1406
 - Refactored patches per Baoquan suggestsions.
 - A few corrections, per Baoquan.

v3: 10jan2022
 https://lkml.org/lkml/2022/1/10/1212
 - Rebasing per Baoquan He request.
 - Changed memory notifier per David Hildenbrand.
 - Providing example kexec userspace change in cover letter.

RFC v2: 7dec2021
 https://lkml.org/lkml/2021/12/7/1088
 - Acting upon Baoquan He suggestion of removing elfcorehdr from
   the purgatory list of segments, removed purgatory code from
   patchset, and it is signficiantly simpler now.

RFC v1: 18nov2021
 https://lkml.org/lkml/2021/11/18/845
 - working patchset demonstrating kernel handling of hotplug
   updates to x86 elfcorehdr for kexec_file_load

RFC: 14dec2020
 https://lkml.org/lkml/2020/12/14/532
 - proposed concept of allowing kernel to handle hotplug update
   of elfcorehdr
---

Eric DeVolder (8):
  crash: introduce arch/*/asm/crash.h
  crash: move crash_prepare_elf64_headers
  crash: prototype change for crash_prepare_elf64_headers
  crash: add generic infrastructure for crash hotplug support
  kexec: exclude elfcorehdr from the segment digest
  kexec: exclude hot remove cpu from elfcorehdr notes
  crash: memory and cpu hotplug sysfs attributes
  x86/crash: Add x86 crash hotplug support

 .../admin-guide/mm/memory-hotplug.rst         |   8 +
 Documentation/core-api/cpu_hotplug.rst        |  18 ++
 arch/arm/include/asm/crash.h                  |   5 +
 arch/arm64/include/asm/crash.h                |   5 +
 arch/arm64/kernel/machine_kexec_file.c        |   6 +-
 arch/ia64/include/asm/crash.h                 |   5 +
 arch/m68k/include/asm/crash.h                 |   5 +
 arch/mips/include/asm/crash.h                 |   5 +
 arch/parisc/include/asm/crash.h               |   5 +
 arch/powerpc/include/asm/crash.h              |   5 +
 arch/powerpc/kexec/file_load_64.c             |   2 +-
 arch/riscv/include/asm/crash.h                |   5 +
 arch/s390/include/asm/crash.h                 |   5 +
 arch/sh/include/asm/crash.h                   |   5 +
 arch/x86/Kconfig                              |  11 +
 arch/x86/include/asm/crash.h                  |  20 ++
 arch/x86/kernel/crash.c                       | 117 ++++++++-
 drivers/base/cpu.c                            |  14 ++
 drivers/base/memory.c                         |  13 +
 include/linux/crash_core.h                    |  32 +++
 include/linux/kexec.h                         |  14 +-
 kernel/crash_core.c                           | 233 ++++++++++++++++++
 kernel/kexec_file.c                           | 105 +-------
 23 files changed, 537 insertions(+), 106 deletions(-)
 create mode 100644 arch/arm/include/asm/crash.h
 create mode 100644 arch/arm64/include/asm/crash.h
 create mode 100644 arch/ia64/include/asm/crash.h
 create mode 100644 arch/m68k/include/asm/crash.h
 create mode 100644 arch/mips/include/asm/crash.h
 create mode 100644 arch/parisc/include/asm/crash.h
 create mode 100644 arch/powerpc/include/asm/crash.h
 create mode 100644 arch/riscv/include/asm/crash.h
 create mode 100644 arch/s390/include/asm/crash.h
 create mode 100644 arch/sh/include/asm/crash.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-08-08  3:25   ` Baoquan He
  2022-07-21 18:17 ` [PATCH v10 2/8] crash: move crash_prepare_elf64_headers Eric DeVolder
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

The use of __weak is being eliminated within kexec sources.
The technique uses macros mapped onto inline functions in
order to replace __weak.

This patchset was using __weak and so in order to replace
__weak, this patch introduces arch/*/asm/crash.h, patterned
after how kexec is moving away from __weak and to the macro
definitions.

No functionality changed, yet.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 arch/arm/include/asm/crash.h     | 5 +++++
 arch/arm64/include/asm/crash.h   | 5 +++++
 arch/ia64/include/asm/crash.h    | 5 +++++
 arch/m68k/include/asm/crash.h    | 5 +++++
 arch/mips/include/asm/crash.h    | 5 +++++
 arch/parisc/include/asm/crash.h  | 5 +++++
 arch/powerpc/include/asm/crash.h | 5 +++++
 arch/riscv/include/asm/crash.h   | 5 +++++
 arch/s390/include/asm/crash.h    | 5 +++++
 arch/sh/include/asm/crash.h      | 5 +++++
 include/linux/crash_core.h       | 2 ++
 11 files changed, 52 insertions(+)
 create mode 100644 arch/arm/include/asm/crash.h
 create mode 100644 arch/arm64/include/asm/crash.h
 create mode 100644 arch/ia64/include/asm/crash.h
 create mode 100644 arch/m68k/include/asm/crash.h
 create mode 100644 arch/mips/include/asm/crash.h
 create mode 100644 arch/parisc/include/asm/crash.h
 create mode 100644 arch/powerpc/include/asm/crash.h
 create mode 100644 arch/riscv/include/asm/crash.h
 create mode 100644 arch/s390/include/asm/crash.h
 create mode 100644 arch/sh/include/asm/crash.h

diff --git a/arch/arm/include/asm/crash.h b/arch/arm/include/asm/crash.h
new file mode 100644
index 000000000000..385646957d60
--- /dev/null
+++ b/arch/arm/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM_CRASH_H
+#define _ASM_ARM_CRASH_H
+
+#endif /* _ASM_ARM_CRASH_H */
diff --git a/arch/arm64/include/asm/crash.h b/arch/arm64/include/asm/crash.h
new file mode 100644
index 000000000000..ec8870c1ea49
--- /dev/null
+++ b/arch/arm64/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM64_CRASH_H
+#define _ASM_ARM64_CRASH_H
+
+#endif /* _ASM_ARM64_CRASH_H */
diff --git a/arch/ia64/include/asm/crash.h b/arch/ia64/include/asm/crash.h
new file mode 100644
index 000000000000..02a457cccda3
--- /dev/null
+++ b/arch/ia64/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_IA64_CRASH_H
+#define _ASM_IA64_CRASH_H
+
+#endif /* _ASM_IA64_CRASH_H */
diff --git a/arch/m68k/include/asm/crash.h b/arch/m68k/include/asm/crash.h
new file mode 100644
index 000000000000..ba6e412a1267
--- /dev/null
+++ b/arch/m68k/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_M68K_CRASH_H
+#define _ASM_M68K_CRASH_H
+
+#endif /* _ASM_M68K_CRASH_H */
diff --git a/arch/mips/include/asm/crash.h b/arch/mips/include/asm/crash.h
new file mode 100644
index 000000000000..35872522c574
--- /dev/null
+++ b/arch/mips/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_MIPS_CRASH_H
+#define _ASM_MIPS_CRASH_H
+
+#endif /* _ASM_MIPS_CRASH_H */
diff --git a/arch/parisc/include/asm/crash.h b/arch/parisc/include/asm/crash.h
new file mode 100644
index 000000000000..96833b727179
--- /dev/null
+++ b/arch/parisc/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_PARISC_CRASH_H
+#define _ASM_PARISC_CRASH_H
+
+#endif /* _ASM_PARISC_CRASH_H */
diff --git a/arch/powerpc/include/asm/crash.h b/arch/powerpc/include/asm/crash.h
new file mode 100644
index 000000000000..40ce71e56ac1
--- /dev/null
+++ b/arch/powerpc/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_CRASH_H
+#define _ASM_POWERPC_CRASH_H
+
+#endif /* _ASM_POWERPC_CRASH_H */
diff --git a/arch/riscv/include/asm/crash.h b/arch/riscv/include/asm/crash.h
new file mode 100644
index 000000000000..24f3aea99707
--- /dev/null
+++ b/arch/riscv/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_CRASH_H
+#define _ASM_RISCV_CRASH_H
+
+#endif /* _ASM_RISCV_CRASH_H */
diff --git a/arch/s390/include/asm/crash.h b/arch/s390/include/asm/crash.h
new file mode 100644
index 000000000000..0db16ad4c75f
--- /dev/null
+++ b/arch/s390/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_S390_CRASH_H
+#define _ASM_S390_CRASH_H
+
+#endif /* _ASM_S390_CRASH_H */
diff --git a/arch/sh/include/asm/crash.h b/arch/sh/include/asm/crash.h
new file mode 100644
index 000000000000..f54e12f88cae
--- /dev/null
+++ b/arch/sh/include/asm/crash.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_SH_CRASH_H
+#define _ASM_SH_CRASH_H
+
+#endif /* _ASM_SH_CRASH_H */
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index de62a722431e..cb0f1916fbf5 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -6,6 +6,8 @@
 #include <linux/elfcore.h>
 #include <linux/elf.h>
 
+#include <asm/crash.h>
+
 #define CRASH_CORE_NOTE_NAME	   "CORE"
 #define CRASH_CORE_NOTE_HEAD_BYTES ALIGN(sizeof(struct elf_note), 4)
 #define CRASH_CORE_NOTE_NAME_BYTES ALIGN(sizeof(CRASH_CORE_NOTE_NAME), 4)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 2/8] crash: move crash_prepare_elf64_headers
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 3/8] crash: prototype change for crash_prepare_elf64_headers Eric DeVolder
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

At the outcome of this patch set, the crash_prepare_elf64_headers()
is utilized on both the kexec_file_load and kexec_load paths. As
such, need to move this function out of kexec_file.c and into a
common location crash_core.c.

No functionality change.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 kernel/crash_core.c | 100 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/kexec_file.c |  99 -------------------------------------------
 2 files changed, 100 insertions(+), 99 deletions(-)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 71122e01623c..cf273fd7c18b 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -9,6 +9,7 @@
 #include <linux/init.h>
 #include <linux/utsname.h>
 #include <linux/vmalloc.h>
+#include <linux/kexec.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -302,6 +303,105 @@ static int __init parse_crashkernel_dummy(char *arg)
 }
 early_param("crashkernel", parse_crashkernel_dummy);
 
+int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map,
+			  void **addr, unsigned long *sz)
+{
+	Elf64_Ehdr *ehdr;
+	Elf64_Phdr *phdr;
+	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
+	unsigned char *buf;
+	unsigned int cpu, i;
+	unsigned long long notes_addr;
+	unsigned long mstart, mend;
+
+	/* extra phdr for vmcoreinfo ELF note */
+	nr_phdr = nr_cpus + 1;
+	nr_phdr += mem->nr_ranges;
+
+	/*
+	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
+	 * area (for example, ffffffff80000000 - ffffffffa0000000 on x86_64).
+	 * I think this is required by tools like gdb. So same physical
+	 * memory will be mapped in two ELF headers. One will contain kernel
+	 * text virtual addresses and other will have __va(physical) addresses.
+	 */
+
+	nr_phdr++;
+	elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr);
+	elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN);
+
+	buf = vzalloc(elf_sz);
+	if (!buf)
+		return -ENOMEM;
+
+	ehdr = (Elf64_Ehdr *)buf;
+	phdr = (Elf64_Phdr *)(ehdr + 1);
+	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
+	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
+	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
+	ehdr->e_ident[EI_VERSION] = EV_CURRENT;
+	ehdr->e_ident[EI_OSABI] = ELF_OSABI;
+	memset(ehdr->e_ident + EI_PAD, 0, EI_NIDENT - EI_PAD);
+	ehdr->e_type = ET_CORE;
+	ehdr->e_machine = ELF_ARCH;
+	ehdr->e_version = EV_CURRENT;
+	ehdr->e_phoff = sizeof(Elf64_Ehdr);
+	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
+	ehdr->e_phentsize = sizeof(Elf64_Phdr);
+
+	/* Prepare one phdr of type PT_NOTE for each present CPU */
+	for_each_present_cpu(cpu) {
+		phdr->p_type = PT_NOTE;
+		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
+		phdr->p_offset = phdr->p_paddr = notes_addr;
+		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
+		(ehdr->e_phnum)++;
+		phdr++;
+	}
+
+	/* Prepare one PT_NOTE header for vmcoreinfo */
+	phdr->p_type = PT_NOTE;
+	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
+	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
+	(ehdr->e_phnum)++;
+	phdr++;
+
+	/* Prepare PT_LOAD type program header for kernel text region */
+	if (need_kernel_map) {
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_vaddr = (unsigned long) _text;
+		phdr->p_filesz = phdr->p_memsz = _end - _text;
+		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
+		ehdr->e_phnum++;
+		phdr++;
+	}
+
+	/* Go through all the ranges in mem->ranges[] and prepare phdr */
+	for (i = 0; i < mem->nr_ranges; i++) {
+		mstart = mem->ranges[i].start;
+		mend = mem->ranges[i].end;
+
+		phdr->p_type = PT_LOAD;
+		phdr->p_flags = PF_R|PF_W|PF_X;
+		phdr->p_offset  = mstart;
+
+		phdr->p_paddr = mstart;
+		phdr->p_vaddr = (unsigned long) __va(mstart);
+		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
+		phdr->p_align = 0;
+		ehdr->e_phnum++;
+		pr_debug("Crash PT_LOAD ELF header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
+			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
+			ehdr->e_phnum, phdr->p_offset);
+		phdr++;
+	}
+
+	*addr = buf;
+	*sz = elf_sz;
+	return 0;
+}
+
 Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
 			  void *data, size_t data_len)
 {
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index f9261c07b048..fdad056d5938 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -1234,102 +1234,3 @@ int crash_exclude_mem_range(struct crash_mem *mem,
 	mem->nr_ranges++;
 	return 0;
 }
-
-int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map,
-			  void **addr, unsigned long *sz)
-{
-	Elf64_Ehdr *ehdr;
-	Elf64_Phdr *phdr;
-	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
-	unsigned char *buf;
-	unsigned int cpu, i;
-	unsigned long long notes_addr;
-	unsigned long mstart, mend;
-
-	/* extra phdr for vmcoreinfo ELF note */
-	nr_phdr = nr_cpus + 1;
-	nr_phdr += mem->nr_ranges;
-
-	/*
-	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
-	 * area (for example, ffffffff80000000 - ffffffffa0000000 on x86_64).
-	 * I think this is required by tools like gdb. So same physical
-	 * memory will be mapped in two ELF headers. One will contain kernel
-	 * text virtual addresses and other will have __va(physical) addresses.
-	 */
-
-	nr_phdr++;
-	elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr);
-	elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN);
-
-	buf = vzalloc(elf_sz);
-	if (!buf)
-		return -ENOMEM;
-
-	ehdr = (Elf64_Ehdr *)buf;
-	phdr = (Elf64_Phdr *)(ehdr + 1);
-	memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
-	ehdr->e_ident[EI_CLASS] = ELFCLASS64;
-	ehdr->e_ident[EI_DATA] = ELFDATA2LSB;
-	ehdr->e_ident[EI_VERSION] = EV_CURRENT;
-	ehdr->e_ident[EI_OSABI] = ELF_OSABI;
-	memset(ehdr->e_ident + EI_PAD, 0, EI_NIDENT - EI_PAD);
-	ehdr->e_type = ET_CORE;
-	ehdr->e_machine = ELF_ARCH;
-	ehdr->e_version = EV_CURRENT;
-	ehdr->e_phoff = sizeof(Elf64_Ehdr);
-	ehdr->e_ehsize = sizeof(Elf64_Ehdr);
-	ehdr->e_phentsize = sizeof(Elf64_Phdr);
-
-	/* Prepare one phdr of type PT_NOTE for each present CPU */
-	for_each_present_cpu(cpu) {
-		phdr->p_type = PT_NOTE;
-		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
-		phdr->p_offset = phdr->p_paddr = notes_addr;
-		phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
-		(ehdr->e_phnum)++;
-		phdr++;
-	}
-
-	/* Prepare one PT_NOTE header for vmcoreinfo */
-	phdr->p_type = PT_NOTE;
-	phdr->p_offset = phdr->p_paddr = paddr_vmcoreinfo_note();
-	phdr->p_filesz = phdr->p_memsz = VMCOREINFO_NOTE_SIZE;
-	(ehdr->e_phnum)++;
-	phdr++;
-
-	/* Prepare PT_LOAD type program header for kernel text region */
-	if (need_kernel_map) {
-		phdr->p_type = PT_LOAD;
-		phdr->p_flags = PF_R|PF_W|PF_X;
-		phdr->p_vaddr = (unsigned long) _text;
-		phdr->p_filesz = phdr->p_memsz = _end - _text;
-		phdr->p_offset = phdr->p_paddr = __pa_symbol(_text);
-		ehdr->e_phnum++;
-		phdr++;
-	}
-
-	/* Go through all the ranges in mem->ranges[] and prepare phdr */
-	for (i = 0; i < mem->nr_ranges; i++) {
-		mstart = mem->ranges[i].start;
-		mend = mem->ranges[i].end;
-
-		phdr->p_type = PT_LOAD;
-		phdr->p_flags = PF_R|PF_W|PF_X;
-		phdr->p_offset  = mstart;
-
-		phdr->p_paddr = mstart;
-		phdr->p_vaddr = (unsigned long) __va(mstart);
-		phdr->p_filesz = phdr->p_memsz = mend - mstart + 1;
-		phdr->p_align = 0;
-		ehdr->e_phnum++;
-		pr_debug("Crash PT_LOAD ELF header. phdr=%p vaddr=0x%llx, paddr=0x%llx, sz=0x%llx e_phnum=%d p_offset=0x%llx\n",
-			phdr, phdr->p_vaddr, phdr->p_paddr, phdr->p_filesz,
-			ehdr->e_phnum, phdr->p_offset);
-		phdr++;
-	}
-
-	*addr = buf;
-	*sz = elf_sz;
-	return 0;
-}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 3/8] crash: prototype change for crash_prepare_elf64_headers
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 2/8] crash: move crash_prepare_elf64_headers Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 4/8] crash: add generic infrastructure for crash hotplug support Eric DeVolder
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

From within crash_prepare_elf64_headers() there is a need to
reference the struct kimage hotplug members. As such, this
change passes the struct kimage as a parameter to the
crash_prepare_elf64_headers(). The hotplug members are added
in "crash: add generic infrastructure for crash hotplug support".

This is preparation for later patch, no functionality change.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
---
 arch/arm64/kernel/machine_kexec_file.c | 6 +++---
 arch/powerpc/kexec/file_load_64.c      | 2 +-
 arch/x86/kernel/crash.c                | 2 +-
 include/linux/kexec.h                  | 7 +++++--
 kernel/crash_core.c                    | 4 ++--
 5 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 889951291cc0..7f5f8243928e 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -39,7 +39,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
 	return kexec_image_post_load_cleanup_default(image);
 }
 
-static int prepare_elf_headers(void **addr, unsigned long *sz)
+static int prepare_elf_headers(struct kimage *image, void **addr, unsigned long *sz)
 {
 	struct crash_mem *cmem;
 	unsigned int nr_ranges;
@@ -64,7 +64,7 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
 	}
 
 	/* Exclude crashkernel region */
-	ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+	ret = crash_exclude_mem_range(image, cmem, crashk_res.start, crashk_res.end);
 	if (ret)
 		goto out;
 
@@ -74,7 +74,7 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
 			goto out;
 	}
 
-	ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
+	ret = crash_prepare_elf64_headers(image, cmem, true, addr, sz);
 
 out:
 	kfree(cmem);
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index b4981b651d9a..07da6bf1cf24 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -797,7 +797,7 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
 		goto out;
 
 	/* Setup elfcorehdr segment */
-	ret = crash_prepare_elf64_headers(cmem, false, &headers, &headers_sz);
+	ret = crash_prepare_elf64_headers(image, cmem, false, &headers, &headers_sz);
 	if (ret) {
 		pr_err("Failed to prepare elf headers for the core\n");
 		goto out;
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 9730c88530fc..9ceb93c176a6 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -265,7 +265,7 @@ static int prepare_elf_headers(struct kimage *image, void **addr,
 		goto out;
 
 	/* By default prepare 64bit headers */
-	ret =  crash_prepare_elf64_headers(cmem, IS_ENABLED(CONFIG_X86_64), addr, sz);
+	ret =  crash_prepare_elf64_headers(image, cmem, IS_ENABLED(CONFIG_X86_64), addr, sz);
 
 out:
 	vfree(cmem);
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 475683cd67f1..5f4969cf3f4e 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -219,8 +219,11 @@ struct crash_mem {
 extern int crash_exclude_mem_range(struct crash_mem *mem,
 				   unsigned long long mstart,
 				   unsigned long long mend);
-extern int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map,
-				       void **addr, unsigned long *sz);
+extern int crash_prepare_elf64_headers(struct kimage *image,
+				   struct crash_mem *mem,
+				   int need_kernel_map,
+				   void **addr,
+				   unsigned long *sz);
 
 #ifndef arch_kexec_apply_relocations_add
 /*
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index cf273fd7c18b..212d4dad0ec7 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -303,8 +303,8 @@ static int __init parse_crashkernel_dummy(char *arg)
 }
 early_param("crashkernel", parse_crashkernel_dummy);
 
-int crash_prepare_elf64_headers(struct crash_mem *mem, int need_kernel_map,
-			  void **addr, unsigned long *sz)
+int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
+			  int need_kernel_map, void **addr, unsigned long *sz)
 {
 	Elf64_Ehdr *ehdr;
 	Elf64_Phdr *phdr;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 4/8] crash: add generic infrastructure for crash hotplug support
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
                   ` (2 preceding siblings ...)
  2022-07-21 18:17 ` [PATCH v10 3/8] crash: prototype change for crash_prepare_elf64_headers Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-08-08  9:30   ` Baoquan He
  2022-07-21 18:17 ` [PATCH v10 5/8] kexec: exclude elfcorehdr from the segment digest Eric DeVolder
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

CPU and memory change notifications are received in order to
regenerate the elfcorehdr.

To support cpu hotplug, a callback is registered to capture the
CPUHP_AP_ONLINE_DYN online and offline events via
cpuhp_setup_state_nocalls().

To support memory hotplug, a notifier is registered to capture the
MEM_ONLINE and MEM_OFFLINE events via register_memory_notifier().

The cpu callback and memory notifiers call handle_hotplug_event()
which performs needed tasks and then dispatches the event to the
architecture specific arch_crash_handle_hotplug_event(). During the
process, the kexec_mutex is held.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 include/linux/crash_core.h |  24 ++++++++
 include/linux/kexec.h      |   7 +++
 kernel/crash_core.c        | 118 +++++++++++++++++++++++++++++++++++++
 3 files changed, 149 insertions(+)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index cb0f1916fbf5..c9705b6872e7 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -86,4 +86,28 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
 int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base);
 
+#define KEXEC_CRASH_HP_REMOVE_CPU		0
+#define KEXEC_CRASH_HP_ADD_CPU			1
+#define KEXEC_CRASH_HP_REMOVE_MEMORY	2
+#define KEXEC_CRASH_HP_ADD_MEMORY		3
+#define KEXEC_CRASH_HP_INVALID_CPU		-1U
+
+struct kimage;
+#ifndef arch_map_crash_pages
+static inline void *arch_map_crash_pages(unsigned long paddr,
+		unsigned long size)
+{
+	return NULL;
+}
+#endif
+#ifndef arch_unmap_crash_pages
+static inline void arch_unmap_crash_pages(void **ptr) { }
+#endif
+#ifndef arch_crash_handle_hotplug_event
+static inline void arch_crash_handle_hotplug_event(struct kimage *image,
+		unsigned int hp_action, unsigned int cpu)
+{
+}
+#endif
+
 #endif /* LINUX_CRASH_CORE_H */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 5f4969cf3f4e..7694aa77b92b 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -340,6 +340,13 @@ struct kimage {
 	struct purgatory_info purgatory_info;
 #endif
 
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+	bool hotplug_event;
+	unsigned int offlinecpu;
+	bool elfcorehdr_index_valid;
+	int elfcorehdr_index;
+#endif
+
 #ifdef CONFIG_IMA_KEXEC
 	/* Virtual address of IMA measurement buffer for kexec syscall */
 	void *ima_buffer;
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 212d4dad0ec7..154ef532d45a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -10,12 +10,16 @@
 #include <linux/utsname.h>
 #include <linux/vmalloc.h>
 #include <linux/kexec.h>
+#include <linux/memory.h>
+#include <linux/cpuhotplug.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
 
 #include <crypto/sha1.h>
 
+#include "kexec_internal.h"
+
 /* vmcoreinfo stuff */
 unsigned char *vmcoreinfo_data;
 size_t vmcoreinfo_size;
@@ -587,3 +591,117 @@ static int __init crash_save_vmcoreinfo_init(void)
 }
 
 subsys_initcall(crash_save_vmcoreinfo_init);
+
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
+{
+	/* Obtain lock while changing crash information */
+	mutex_lock(&kexec_mutex);
+
+	/* Check kdump is loaded */
+	if (kexec_crash_image) {
+		struct kimage *image = kexec_crash_image;
+
+		pr_debug("crash hp: hp_action %u, cpu %u", hp_action, cpu);
+
+		/*
+		 * When the struct kimage is alloced, it is wiped to zero, so
+		 * the elfcorehdr_index_valid defaults to false. Find the
+		 * segment containing the elfcorehdr, if not already found.
+		 * This works for both the kexec_load and kexec_file_load paths.
+		 */
+		if (!image->elfcorehdr_index_valid) {
+			unsigned char *ptr;
+			unsigned long mem, memsz;
+			unsigned int n;
+
+			for (n = 0; n < image->nr_segments; n++) {
+				mem = image->segment[n].mem;
+				memsz = image->segment[n].memsz;
+				ptr = arch_map_crash_pages(mem, memsz);
+				if (ptr) {
+					/* The segment containing elfcorehdr */
+					if (memcmp(ptr, ELFMAG, SELFMAG) == 0) {
+						image->elfcorehdr_index = (int)n;
+						image->elfcorehdr_index_valid = true;
+					}
+				}
+				arch_unmap_crash_pages((void **)&ptr);
+			}
+		}
+
+		if (!image->elfcorehdr_index_valid) {
+			pr_err("crash hp: unable to locate elfcorehdr segment");
+			goto out;
+		}
+
+		/* Needed in order for the segments to be updated */
+		arch_kexec_unprotect_crashkres();
+
+		/* Flag to differentiate between normal load and hotplug */
+		image->hotplug_event = true;
+
+		/* Now invoke arch-specific update handler */
+		arch_crash_handle_hotplug_event(image, hp_action, cpu);
+
+		/* No longer handling a hotplug event */
+		image->hotplug_event = false;
+
+		/* Change back to read-only */
+		arch_kexec_protect_crashkres();
+	}
+
+out:
+	/* Release lock now that update complete */
+	mutex_unlock(&kexec_mutex);
+}
+
+static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)
+{
+	switch (val) {
+	case MEM_ONLINE:
+		handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY, 0);
+		break;
+
+	case MEM_OFFLINE:
+		handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY, 0);
+		break;
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block crash_memhp_nb = {
+	.notifier_call = crash_memhp_notifier,
+	.priority = 0
+};
+
+static int crash_cpuhp_online(unsigned int cpu)
+{
+	handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu);
+	return 0;
+}
+
+static int crash_cpuhp_offline(unsigned int cpu)
+{
+	handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu);
+	return 0;
+}
+
+static int __init crash_hotplug_init(void)
+{
+	int result = 0;
+
+	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
+		register_memory_notifier(&crash_memhp_nb);
+
+	if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
+		result = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+						   "crash/cpuhp",
+						   crash_cpuhp_online,
+						   crash_cpuhp_offline);
+
+	return result;
+}
+
+subsys_initcall(crash_hotplug_init);
+#endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 5/8] kexec: exclude elfcorehdr from the segment digest
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
                   ` (3 preceding siblings ...)
  2022-07-21 18:17 ` [PATCH v10 4/8] crash: add generic infrastructure for crash hotplug support Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 6/8] kexec: exclude hot remove cpu from elfcorehdr notes Eric DeVolder
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

When a crash kernel is loaded via the kexec_file_load syscall, the
kernel places the various segments (ie crash kernel, crash initrd,
boot_params, elfcorehdr, purgatory, etc) in memory. For those
architectures that utilize purgatory, a hash digest of the segments
is calculated for integrity checking. This digest is embedded into
the purgatory image prior to placing purgatory in memory.

Since hotplug events cause changes to the elfcorehdr, purgatory
integrity checking fails (at crash time, and no kdump created).
As a result, this change explicitly excludes the elfcorehdr segment
from the list of segments used to create the digest. By doing so,
this permits changes to the elfcorehdr in response to hotplug events,
without having to also reload purgatory due to the change to the
digest.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 kernel/kexec_file.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index fdad056d5938..73d81120c3eb 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -740,6 +740,12 @@ static int kexec_calculate_store_digests(struct kimage *image)
 	for (j = i = 0; i < image->nr_segments; i++) {
 		struct kexec_segment *ksegment;
 
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+		/* This segment excluded to allow future changes via hotplug */
+		if (image->elfcorehdr_index_valid && (j == image->elfcorehdr_index))
+			continue;
+#endif
+
 		ksegment = &image->segment[i];
 		/*
 		 * Skip purgatory as it will be modified once we put digest
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 6/8] kexec: exclude hot remove cpu from elfcorehdr notes
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
                   ` (4 preceding siblings ...)
  2022-07-21 18:17 ` [PATCH v10 5/8] kexec: exclude elfcorehdr from the segment digest Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-07-21 18:17 ` [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes Eric DeVolder
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

Due to use of CPUHP_AP_ONLINE_DYN, upon CPU unplug, the CPU is
still in the for_each_present_cpu() list when within the
handle_hotplug_event(). Thus the CPU must be explicitly excluded
when building the new list of CPUs.

This change identifies in handle_hotplug_event() the CPU to be
excluded, and the check for excluding the CPU in
crash_prepare_elf64_headers().

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
 kernel/crash_core.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 154ef532d45a..77f5f3591760 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -355,6 +355,11 @@ int crash_prepare_elf64_headers(struct kimage *image, struct crash_mem *mem,
 
 	/* Prepare one phdr of type PT_NOTE for each present CPU */
 	for_each_present_cpu(cpu) {
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+		/* Skip the soon-to-be offlined cpu */
+		if (image->hotplug_event && (cpu == image->offlinecpu))
+			continue;
+#endif
 		phdr->p_type = PT_NOTE;
 		notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
 		phdr->p_offset = phdr->p_paddr = notes_addr;
@@ -641,6 +646,16 @@ static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 		/* Flag to differentiate between normal load and hotplug */
 		image->hotplug_event = true;
 
+		/*
+		 * Due to use of CPUHP_AP_ONLINE_DYN, upon unplug and during
+		 * this callback, the CPU is still in the for_each_present_cpu()
+		 * list. Must explicitly look to exclude this CPU when building
+		 * new list.
+		 */
+		image->offlinecpu =
+			(hp_action == KEXEC_CRASH_HP_REMOVE_CPU) ?
+				cpu : KEXEC_CRASH_HP_INVALID_CPU;
+
 		/* Now invoke arch-specific update handler */
 		arch_crash_handle_hotplug_event(image, hp_action, cpu);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
                   ` (5 preceding siblings ...)
  2022-07-21 18:17 ` [PATCH v10 6/8] kexec: exclude hot remove cpu from elfcorehdr notes Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-08-08 10:41   ` Baoquan He
  2022-07-21 18:17 ` [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support Eric DeVolder
  2022-08-04 14:42 ` [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
  8 siblings, 1 reply; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

This introduces the crash_hotplug attribute for memory and CPUs
for use by userspace.  This change directly facilitates the udev
rule for managing userspace re-loading of the crash kernel upon
hot un/plug changes.

For memory, this changeset introduces the crash_hotplug attribute
to the /sys/devices/system/memory directory. For example:

 # udevadm info --attribute-walk /sys/devices/system/memory/memory81
  looking at device '/devices/system/memory/memory81':
    KERNEL=="memory81"
    SUBSYSTEM=="memory"
    DRIVER==""
    ATTR{online}=="1"
    ATTR{phys_device}=="0"
    ATTR{phys_index}=="00000051"
    ATTR{removable}=="1"
    ATTR{state}=="online"
    ATTR{valid_zones}=="Movable"

  looking at parent device '/devices/system/memory':
    KERNELS=="memory"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{auto_online_blocks}=="offline"
    ATTRS{block_size_bytes}=="8000000"
    ATTRS{crash_hotplug}=="1"

For CPUs, this changeset introduces the crash_hotplug attribute
to the /sys/devices/system/cpu directory. For example:

 # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
  looking at device '/devices/system/cpu/cpu0':
    KERNEL=="cpu0"
    SUBSYSTEM=="cpu"
    DRIVER=="processor"
    ATTR{crash_notes}=="277c38600"
    ATTR{crash_notes_size}=="368"
    ATTR{online}=="1"

  looking at parent device '/devices/system/cpu':
    KERNELS=="cpu"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{crash_hotplug}=="1"
    ATTRS{isolated}==""
    ATTRS{kernel_max}=="8191"
    ATTRS{nohz_full}=="  (null)"
    ATTRS{offline}=="4-7"
    ATTRS{online}=="0-3"
    ATTRS{possible}=="0-7"
    ATTRS{present}=="0-3"

With these sysfs attributes in place, it is possible to efficiently
instruct the udev rule to skip crash kernel reloading.

For example, the following is the proposed udev rule change for RHEL
system 98-kexec.rules (as the first lines of the rule file):

 # The kernel handles updates to crash elfcorehdr for cpu and memory changes
 SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
 SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

When examined in the context of 98-kexec.rules, the above change
tests if crash_hotplug is set, and if so, it skips the userspace
initiated unload-then-reload of the crash kernel.

Cpu and memory checks are separated in accordance with
CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options.
If an architecture supports, for example, memory hotplug but not
CPU hotplug, then the /sys/devices/system/memory/crash_hotplug
attribute file is present, but the /sys/devices/system/cpu/crash_hotplug
attribute file will NOT be present. Thus the udev rule will skip
userspace processing of memory hot un/plug events, but the udev
rule will fail for CPU events, thus allowing userspace to process
cpu hot un/plug events (ie the unload-then-reload of the kdump
capture kernel).

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 .../admin-guide/mm/memory-hotplug.rst          |  8 ++++++++
 Documentation/core-api/cpu_hotplug.rst         | 18 ++++++++++++++++++
 drivers/base/cpu.c                             | 14 ++++++++++++++
 drivers/base/memory.c                          | 13 +++++++++++++
 include/linux/crash_core.h                     |  6 ++++++
 5 files changed, 59 insertions(+)

diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 0f56ecd8ac05..494d7a63c543 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -293,6 +293,14 @@ The following files are currently defined:
 		       Availability depends on the CONFIG_ARCH_MEMORY_PROBE
 		       kernel configuration option.
 ``uevent``	       read-write: generic udev file for device subsystems.
+``crash_hotplug``      read-only: when changes to the system memory map
+		       occur due to hot un/plug of memory, this file contains
+		       '1' if the kernel updates the kdump capture kernel memory
+		       map itself (via elfcorehdr), or '0' if userspace must update
+		       the kdump capture kernel memory map.
+
+		       Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
+		       configuration option.
 ====================== =========================================================
 
 .. note::
diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
index c6f4ba2fb32d..13e33d098645 100644
--- a/Documentation/core-api/cpu_hotplug.rst
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -750,6 +750,24 @@ will receive all events. A script like::
 
 can process the event further.
 
+When changes to the CPUs in the system occur, the sysfs file
+/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
+updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
+or '0' if userspace must update the kdump capture kernel list of CPUs.
+
+The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
+option.
+
+To skip userspace processing of CPU hot un/plug events for kdump
+(ie the unload-then-reload to obtain a current list of CPUs), this sysfs
+file can be used in a udev rule as follows:
+
+ SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
+
+For a cpu hot un/plug event, if the architecture supports kernel updates
+of the elfcorehdr (which contains the list of CPUs), then the rule skips
+the unload-then-reload of the kdump capture kernel.
+
 Kernel Inline Documentations Reference
 ======================================
 
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 4c98849577d4..bd470236d9a2 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -293,6 +293,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev,
 static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL);
 #endif
 
+#ifdef CONFIG_HOTPLUG_CPU
+#include <linux/crash_core.h>
+static ssize_t crash_hotplug_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	return sprintf(buf, "%d\n", crash_hotplug_cpu_support());
+}
+static DEVICE_ATTR_ADMIN_RO(crash_hotplug);
+#endif
+
 static void cpu_device_release(struct device *dev)
 {
 	/*
@@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = {
 #ifdef CONFIG_NO_HZ_FULL
 	&dev_attr_nohz_full.attr,
 #endif
+#ifdef CONFIG_HOTPLUG_CPU
+	&dev_attr_crash_hotplug.attr,
+#endif
 #ifdef CONFIG_GENERIC_CPU_AUTOPROBE
 	&dev_attr_modalias.attr,
 #endif
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index bc60c9cd3230..63c1754a52b6 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -483,6 +483,16 @@ static ssize_t auto_online_blocks_store(struct device *dev,
 
 static DEVICE_ATTR_RW(auto_online_blocks);
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+#include <linux/crash_core.h>
+static ssize_t crash_hotplug_show(struct device *dev,
+				       struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%d\n", crash_hotplug_memory_support());
+}
+static DEVICE_ATTR_RO(crash_hotplug);
+#endif
+
 /*
  * Some architectures will have custom drivers to do this, and
  * will not need to do it from userspace.  The fake hot-add code
@@ -887,6 +897,9 @@ static struct attribute *memory_root_attrs[] = {
 
 	&dev_attr_block_size_bytes.attr,
 	&dev_attr_auto_online_blocks.attr,
+#ifdef CONFIG_MEMORY_HOTPLUG
+	&dev_attr_crash_hotplug.attr,
+#endif
 	NULL
 };
 
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index c9705b6872e7..3964e9924ea5 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -109,5 +109,11 @@ static inline void arch_crash_handle_hotplug_event(struct kimage *image,
 {
 }
 #endif
+#ifndef crash_hotplug_cpu_support
+static inline int crash_hotplug_cpu_support(void) { return 0; }
+#endif
+#ifndef crash_hotplug_memory_support
+static inline int crash_hotplug_memory_support(void) { return 0; }
+#endif
 
 #endif /* LINUX_CRASH_CORE_H */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
                   ` (6 preceding siblings ...)
  2022-07-21 18:17 ` [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes Eric DeVolder
@ 2022-07-21 18:17 ` Eric DeVolder
  2022-08-13  0:34   ` Baoquan He
  2022-08-04 14:42 ` [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
  8 siblings, 1 reply; 25+ messages in thread
From: Eric DeVolder @ 2022-07-21 18:17 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky,
	eric.devolder

For x86_64, when CPU or memory is hot un/plugged, the crash
elfcorehdr, which describes the CPUs and memory in the system,
must also be updated.

When loading the crash kernel via kexec_load or kexec_file_load,
the elfcorehdr is identified at run time in
crash_core:handle_hotplug_event().

To update the elfcorehdr for x86_64, a new elfcorehdr must be
generated from the available CPUs and memory. The new elfcorehdr
is prepared into a buffer, and then installed over the top of
the existing elfcorehdr.

In the patch 'kexec: exclude elfcorehdr from the segment digest'
the need to update purgatory due to the change in elfcorehdr was
eliminated.  As a result, no changes to purgatory or boot_params
(as the elfcorehdr= kernel command line parameter pointer
remains unchanged and correct) are needed, just elfcorehdr.

To accommodate a growing number of resources via hotplug, the
elfcorehdr segment must be sufficiently large enough to accommodate
changes, see the CRASH_MAX_MEMORY_RANGES configure item.

With this change, crash hotplug for kexec_file_load syscall
is supported. The kexec_load is also supported, but also
requires a corresponding change to userspace kexec-tools.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
---
 arch/x86/Kconfig             |  11 ++++
 arch/x86/include/asm/crash.h |  20 ++++++
 arch/x86/kernel/crash.c      | 115 +++++++++++++++++++++++++++++++++++
 3 files changed, 146 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e58798f636d4..bb59596c8bea 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2065,6 +2065,17 @@ config CRASH_DUMP
 	  (CONFIG_RELOCATABLE=y).
 	  For more details see Documentation/admin-guide/kdump/kdump.rst
 
+config CRASH_MAX_MEMORY_RANGES
+	depends on CRASH_DUMP && KEXEC_FILE && (HOTPLUG_CPU || MEMORY_HOTPLUG)
+	int
+	default 32768
+	help
+	  For the kexec_file_load path, specify the maximum number of
+	  memory regions, eg. as represented by the 'System RAM' entries
+	  in /proc/iomem, that the elfcorehdr buffer/segment can accommodate.
+	  This value is combined with NR_CPUS and multiplied by Elf64_Phdr
+	  size to determine the final buffer size.
+
 config KEXEC_JUMP
 	bool "kexec jump"
 	depends on KEXEC && HIBERNATION
diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
index 8b6bd63530dc..96051d8e4b45 100644
--- a/arch/x86/include/asm/crash.h
+++ b/arch/x86/include/asm/crash.h
@@ -9,4 +9,24 @@ int crash_setup_memmap_entries(struct kimage *image,
 		struct boot_params *params);
 void crash_smp_send_stop(void);
 
+void *arch_map_crash_pages(unsigned long paddr, unsigned long size);
+#define arch_map_crash_pages arch_map_crash_pages
+
+void arch_unmap_crash_pages(void **ptr);
+#define arch_unmap_crash_pages arch_unmap_crash_pages
+
+void arch_crash_handle_hotplug_event(struct kimage *image,
+		unsigned int hp_action, unsigned int cpu);
+#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
+
+#ifdef CONFIG_HOTPLUG_CPU
+static inline int crash_hotplug_cpu_support(void) { return 1; }
+#define crash_hotplug_cpu_support crash_hotplug_cpu_support
+#endif
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+static inline int crash_hotplug_memory_support(void) { return 1; }
+#define crash_hotplug_memory_support crash_hotplug_memory_support
+#endif
+
 #endif /* _ASM_X86_CRASH_H */
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 9ceb93c176a6..55dda4fcde6e 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -25,6 +25,7 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include <linux/memblock.h>
+#include <linux/highmem.h>
 
 #include <asm/processor.h>
 #include <asm/hardirq.h>
@@ -397,7 +398,17 @@ int crash_load_segments(struct kimage *image)
 	image->elf_headers = kbuf.buffer;
 	image->elf_headers_sz = kbuf.bufsz;
 
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+	/* Ensure elfcorehdr segment large enough for hotplug changes */
+	kbuf.memsz = (CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr);
+	/* For marking as usable to crash kernel */
+	image->elf_headers_sz = kbuf.memsz;
+	/* Record the index of the elfcorehdr segment */
+	image->elfcorehdr_index = image->nr_segments;
+	image->elfcorehdr_index_valid = true;
+#else
 	kbuf.memsz = kbuf.bufsz;
+#endif
 	kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
 	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
 	ret = kexec_add_buffer(&kbuf);
@@ -412,3 +423,107 @@ int crash_load_segments(struct kimage *image)
 	return ret;
 }
 #endif /* CONFIG_KEXEC_FILE */
+
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
+void *arch_map_crash_pages(unsigned long paddr, unsigned long size)
+{
+	/*
+	 * NOTE: The addresses and sizes passed to this routine have
+	 * already been fully aligned on page boundaries. There is no
+	 * need for massaging the address or size.
+	 */
+	void *ptr = NULL;
+
+	/* NOTE: requires arch_kexec_[un]protect_crashkres() for write access */
+	if (size > 0) {
+		struct page *page = pfn_to_page(paddr >> PAGE_SHIFT);
+
+		ptr = kmap_local_page(page);
+	}
+
+	return ptr;
+}
+
+void arch_unmap_crash_pages(void **ptr)
+{
+	if (ptr) {
+		if (*ptr)
+			kunmap_local(*ptr);
+		*ptr = NULL;
+	}
+}
+
+/**
+ * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
+ * @image: the active struct kimage
+ * @hp_action: the hot un/plug action being handled
+ * @cpu: when KEXEC_CRASH_HP_ADD/REMOVE_CPU, the cpu affected
+ *
+ * To accurately reflect hot un/plug changes, the elfcorehdr (which
+ * is passed to the crash kernel via the elfcorehdr= parameter)
+ * must be updated with the new list of CPUs and memories. The new
+ * elfcorehdr is prepared in a kernel buffer, and then it is
+ * written on top of the existing/old elfcorehdr.
+ *
+ * For hotplug changes to elfcorehdr to work, two conditions are
+ * needed:
+ * First, the segment containing the elfcorehdr must be large enough
+ * to permit a growing number of resources. See the
+ * CONFIG_CRASH_MAX_MEMORY_RANGES description.
+ * Second, purgatory must explicitly exclude the elfcorehdr from the
+ * list of segments it checks (since the elfcorehdr changes and thus
+ * would require an update to purgatory itself to update the digest).
+ *
+ */
+void arch_crash_handle_hotplug_event(struct kimage *image,
+	unsigned int hp_action, unsigned int cpu)
+{
+	struct kexec_segment *ksegment;
+	unsigned char *ptr = NULL;
+	unsigned long elfsz = 0;
+	void *elfbuf = NULL;
+	unsigned long mem, memsz;
+
+	/*
+	 * Elfcorehdr_index_valid checked in crash_core:handle_hotplug_event()
+	 */
+	ksegment = &image->segment[image->elfcorehdr_index];
+	mem = ksegment->mem;
+	memsz = ksegment->memsz;
+
+	/*
+	 * Create the new elfcorehdr reflecting the changes to CPU and/or
+	 * memory resources.
+	 */
+	if (prepare_elf_headers(image, &elfbuf, &elfsz)) {
+		pr_err("crash hp: unable to prepare elfcore headers");
+		goto out;
+	}
+	if (elfsz > memsz) {
+		pr_err("crash hp: update elfcorehdr elfsz %lu > memsz %lu",
+			elfsz, memsz);
+		goto out;
+	}
+
+	/*
+	 * At this point, we are all but assured of success.
+	 * Copy new elfcorehdr into destination.
+	 */
+	ptr = arch_map_crash_pages(mem, memsz);
+	if (ptr) {
+		/*
+		 * Temporarily invalidate the crash image while the
+		 * elfcorehdr is updated.
+		 */
+		xchg(&kexec_crash_image, NULL);
+		memcpy_flushcache((void *)ptr, elfbuf, elfsz);
+		xchg(&kexec_crash_image, image);
+	}
+	arch_unmap_crash_pages((void **)&ptr);
+	pr_debug("crash hp: re-loaded elfcorehdr at 0x%lx\n", mem);
+
+out:
+	if (elfbuf)
+		vfree(elfbuf);
+}
+#endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug
  2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
                   ` (7 preceding siblings ...)
  2022-07-21 18:17 ` [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support Eric DeVolder
@ 2022-08-04 14:42 ` Eric DeVolder
  8 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-08-04 14:42 UTC (permalink / raw)
  To: linux-kernel, x86, kexec, ebiederm, dyoung, bhe, vgoyal
  Cc: tglx, mingo, bp, dave.hansen, hpa, nramas, thomas.lendacky, robh,
	efault, rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

Nudge...
eric

On 7/21/22 13:17, Eric DeVolder wrote:
> When the kdump service is loaded, if a CPU or memory is hot
> un/plugged, the crash elfcorehdr, which describes the CPUs
> and memory in the system, must also be updated, else the resulting
> vmcore is inaccurate (eg. missing either CPU context or memory
> regions).
> 
> The current solution utilizes udev to initiate an unload-then-reload
> of the kdump image (e. kernel, initrd, boot_params, puratory and
> elfcorehdr) by the userspace kexec utility. In previous posts I have
> outlined the significant performance problems related to offloading
> this activity to userspace.
> 
> This patchset introduces a generic crash hot un/plug handler that
> registers with the CPU and memory notifiers. Upon CPU or memory
> changes, this generic handler is invoked and performs important
> housekeeping, for example obtaining the appropriate lock, and then
> invokes an architecture specific handler to do the appropriate
> updates.
> 
> In the case of x86_64, the arch specific handler generates a new
> elfcorehdr, and overwrites the old one in memory. No involvement
> with userspace needed.
> 
> To realize the benefits/test this patchset, one must make a couple
> of minor changes to userspace:
> 
>   - Prevent udev from updating kdump crash kernel on hot un/plug changes.
>     Add the following as the first lines to the udev rule file
>     /usr/lib/udev/rules.d/98-kexec.rules:
> 
>     # The kernel handles updates to crash elfcorehdr for cpu and memory changes
>     SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>     SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
> 
>     These lines will cause cpu and memory hot un/plug events to be
>     skipped within this rule file, if the kernel has these changes
>     enabled.
> 
>   - Change to the kexec_file_load for loading the kdump kernel:
>     Eg. on RHEL: in /usr/bin/kdumpctl, change to:
>      standard_kexec_args="-p -d -s"
>     which adds the -s to select kexec_file_load syscall.
> 
> This patchset supports kexec_load with a modified kexec userspace
> utility, and a working changeset to the kexec userspace utility
> is provided here (and to use, the above change to standard_kexec_args
> would be, for example, to append --hotplug instead of -s).
> 
>    diff --git a/kexec/arch/i386/crashdump-x86.c b/kexec/arch/i386/crashdump-x86.c
>    index 9826f6d..4ed395a 100644
>    --- a/kexec/arch/i386/crashdump-x86.c
>    +++ b/kexec/arch/i386/crashdump-x86.c
>    @@ -48,6 +48,7 @@
>     #include <x86/x86-linux.h>
>     
>     extern struct arch_options_t arch_options;
>    +extern int do_hotplug;
>     
>     static int get_kernel_page_offset(struct kexec_info *UNUSED(info),
>     				  struct crash_elf_info *elf_info)
>    @@ -975,6 +976,14 @@ int load_crashdump_segments(struct kexec_info *info, char* mod_cmdline,
>     	} else {
>     		memsz = bufsz;
>     	}
>    +
>    +	/* If hotplug support enabled, use larger size to accomodate changes */
>    +	if (do_hotplug) {
>    +		long int nr_cpus = get_nr_cpus();
>    +		memsz = (nr_cpus + CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr);
>    +	}
>    +
>    +    info->elfcorehdr =
>     	elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
>     							max_addr, -1);
>     	dbgprintf("Created elf header segment at 0x%lx\n", elfcorehdr);
>    diff --git a/kexec/crashdump-elf.c b/kexec/crashdump-elf.c
>    index b8bb686..5e29f7a 100644
>    --- a/kexec/crashdump-elf.c
>    +++ b/kexec/crashdump-elf.c
>    @@ -43,11 +43,7 @@ int FUNC(struct kexec_info *info,
>     	int (*get_note_info)(int cpu, uint64_t *addr, uint64_t *len);
>     	long int count_cpu;
>     
>    -	if (xen_present())
>    -		nr_cpus = xen_get_nr_phys_cpus();
>    -	else
>    -		nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
>    -
>    +	nr_cpus = get_nr_cpus();
>     	if (nr_cpus < 0) {
>     		return -1;
>     	}
>    diff --git a/kexec/crashdump.h b/kexec/crashdump.h
>    index 18bd691..28d3278 100644
>    --- a/kexec/crashdump.h
>    +++ b/kexec/crashdump.h
>    @@ -57,7 +57,6 @@ unsigned long phys_to_virt(struct crash_elf_info *elf_info,
>     			   unsigned long long paddr);
>     
>     unsigned long xen_architecture(struct crash_elf_info *elf_info);
>    -int xen_get_nr_phys_cpus(void);
>     int xen_get_note(int cpu, uint64_t *addr, uint64_t *len);
>     int xen_get_crashkernel_region(uint64_t *start, uint64_t *end);
>     
>    diff --git a/kexec/kexec-xen.h b/kexec/kexec-xen.h
>    index 70fb576..f54a2dd 100644
>    --- a/kexec/kexec-xen.h
>    +++ b/kexec/kexec-xen.h
>    @@ -83,5 +83,6 @@ extern int __xc_interface_close(xc_interface *xch);
>     #endif
>     
>     int xen_get_kexec_range(int range, uint64_t *start, uint64_t *end);
>    +int xen_get_nr_phys_cpus(void);
>     
>     #endif /* KEXEC_XEN_H */
>    diff --git a/kexec/kexec.c b/kexec/kexec.c
>    index 829a6ea..3668b73 100644
>    --- a/kexec/kexec.c
>    +++ b/kexec/kexec.c
>    @@ -58,6 +58,7 @@
>     
>     unsigned long long mem_min = 0;
>     unsigned long long mem_max = ULONG_MAX;
>    +int do_hotplug = 0;
>     static unsigned long kexec_flags = 0;
>     /* Flags for kexec file (fd) based syscall */
>     static unsigned long kexec_file_flags = 0;
>    @@ -489,6 +490,17 @@ static int add_backup_segments(struct kexec_info *info,
>     	return 0;
>     }
>     
>    +long int get_nr_cpus(void)
>    +{
>    +    long int nr_cpus;
>    +
>    +	if (xen_present())
>    +		nr_cpus = xen_get_nr_phys_cpus();
>    +	else
>    +		nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
>    +    return nr_cpus;
>    +}
>    +
>     static char *slurp_fd(int fd, const char *filename, off_t size, off_t *nread)
>     {
>     	char *buf;
>    @@ -672,6 +684,14 @@ static void update_purgatory(struct kexec_info *info)
>     		if (info->segment[i].mem == (void *)info->rhdr.rel_addr) {
>     			continue;
>     		}
>    +
>    +		/* Don't include elfcorehdr in the checksum, if hotplug
>    +		 * support enabled.
>    +		 */
>    +		if (do_hotplug && (info->segment[i].mem == (void *)info->elfcorehdr)) {
>    +			continue;
>    +		}
>    +
>     		sha256_update(&ctx, info->segment[i].buf,
>     			      info->segment[i].bufsz);
>     		nullsz = info->segment[i].memsz - info->segment[i].bufsz;
>    @@ -1565,6 +1585,9 @@ int main(int argc, char *argv[])
>     		case OPT_PRINT_CKR_SIZE:
>     			print_crashkernel_region_size();
>     			return 0;
>    +		case OPT_HOTPLUG:
>    +			do_hotplug = 1;
>    +			break;
>     		default:
>     			break;
>     		}
>    diff --git a/kexec/kexec.h b/kexec/kexec.h
>    index 0f97a97..b0428cc 100644
>    --- a/kexec/kexec.h
>    +++ b/kexec/kexec.h
>    @@ -169,6 +169,7 @@ struct kexec_info {
>     	int command_line_len;
>     
>     	int skip_checks;
>    +	unsigned long elfcorehdr;
>     };
>     
>     struct arch_map_entry {
>    @@ -231,7 +232,8 @@ extern int file_types;
>     #define OPT_PRINT_CKR_SIZE	262
>     #define OPT_LOAD_LIVE_UPDATE	263
>     #define OPT_EXEC_LIVE_UPDATE	264
>    -#define OPT_MAX			265
>    +#define OPT_HOTPLUG		265
>    +#define OPT_MAX		266
>     #define KEXEC_OPTIONS \
>     	{ "help",		0, 0, OPT_HELP }, \
>     	{ "version",		0, 0, OPT_VERSION }, \
>    @@ -258,6 +260,7 @@ extern int file_types;
>     	{ "debug",		0, 0, OPT_DEBUG }, \
>     	{ "status",		0, 0, OPT_STATUS }, \
>     	{ "print-ckr-size",     0, 0, OPT_PRINT_CKR_SIZE }, \
>    +	{ "hotplug",		0, 0, OPT_HOTPLUG }, \
>     
>     #define KEXEC_OPT_STR "h?vdfixyluet:pscaS"
>     
>    @@ -290,6 +293,8 @@ extern unsigned long add_buffer_phys_virt(struct kexec_info *info,
>     	int buf_end, int phys);
>     extern void arch_reuse_initrd(void);
>     
>    +extern long int get_nr_cpus(void);
>    +
>     extern int ifdown(void);
>     
>     extern char purgatory[];
> 
> Regards,
> eric
> ---
> v10: 21jul2022
>   - Rebased to 5.19.0-rc7
>   - Per Sourabh, corrected build issue with arch_un/map_crash_pages()
>     for architectures not supporting this feature.
>   - Per David Hildebrand, removed the WARN_ONCE() altogether.
>   - Per David Hansen, converted to use of kmap_local_page().
>   - Per Baoquan He, replaced use of __weak with the kexec technique.
> 
> v9: 13jun2022
>   https://lkml.org/lkml/2022/6/13/3382
>   - Rebased to 5.18.0
>   - Per Sourabh, moved crash_prepare_elf64_headers() into common
>     crash_core.c to avoid compile issues with kexec_load only path.
>   - Per David Hildebrand, replaced mutex_trylock() with mutex_lock().
>   - Changed the __weak arch_crash_handle_hotplug_event() to utilize
>     WARN_ONCE() instead of WARN(). Fix some formatting issues.
>   - Per Sourabh, introduced sysfs attribute crash_hotplug for memory
>     and CPUs; for use by userspace (udev) to determine if the kernel
>     performs crash hot un/plug support.
>   - Per Sourabh, moved the code detecting the elfcorehdr segment from
>     arch/x86 into crash_core:handle_hotplug_event() so both kexec_load
>     and kexec_file_load can benefit.
>   - Updated userspace kexec-tools kexec utility to reflect change to
>     using CRASH_MAX_MEMORY_RANGES and get_nr_cpus().
>   - Updated the new proposed udev rules to reflect using the sysfs
>     attributes crash_hotplug.
> 
> v8: 5may2022
>   https://lkml.org/lkml/2022/5/5/1133
>   - Per Borislav Petkov, eliminated CONFIG_CRASH_HOTPLUG in favor
>     of CONFIG_HOTPLUG_CPU || CONFIG_MEMORY_HOTPLUG, ie a new define
>     is not needed. Also use of IS_ENABLED() rather than #ifdef's.
>     Renamed crash_hotplug_handler() to handle_hotplug_event().
>     And other corrections.
>   - Per Baoquan, minimized the parameters to the arch_crash_
>     handle_hotplug_event() to hp_action and cpu.
>   - Introduce KEXEC_CRASH_HP_INVALID_CPU definition, per Baoquan.
>   - Per Sourabh Jain, renamed and repurposed CRASH_HOTPLUG_ELFCOREHDR_SZ
>     to CONFIG_CRASH_MAX_MEMORY_RANGES, mirroring kexec-tools change
>     by David Hildebrand. Folded this patch into the x86
>     kexec_file_load support patch.
> 
> v7: 13apr2022
>   https://lkml.org/lkml/2022/4/13/850
>   - Resolved parameter usage to crash_hotplug_handler(), per Baoquan.
> 
> v6: 1apr2022
>   https://lkml.org/lkml/2022/4/1/1203
>   - Reword commit messages and some comment cleanup per Baoquan.
>   - Changed elf_index to elfcorehdr_index for clarity.
>   - Minor code changes per Baoquan.
> 
> v5: 3mar2022
>   https://lkml.org/lkml/2022/3/3/674
>   - Reworded description of CRASH_HOTPLUG_ELFCOREHDR_SZ, per
>     David Hildenbrand.
>   - Refactored slightly a few patches per Baoquan recommendation.
> 
> v4: 9feb2022
>   https://lkml.org/lkml/2022/2/9/1406
>   - Refactored patches per Baoquan suggestsions.
>   - A few corrections, per Baoquan.
> 
> v3: 10jan2022
>   https://lkml.org/lkml/2022/1/10/1212
>   - Rebasing per Baoquan He request.
>   - Changed memory notifier per David Hildenbrand.
>   - Providing example kexec userspace change in cover letter.
> 
> RFC v2: 7dec2021
>   https://lkml.org/lkml/2021/12/7/1088
>   - Acting upon Baoquan He suggestion of removing elfcorehdr from
>     the purgatory list of segments, removed purgatory code from
>     patchset, and it is signficiantly simpler now.
> 
> RFC v1: 18nov2021
>   https://lkml.org/lkml/2021/11/18/845
>   - working patchset demonstrating kernel handling of hotplug
>     updates to x86 elfcorehdr for kexec_file_load
> 
> RFC: 14dec2020
>   https://lkml.org/lkml/2020/12/14/532
>   - proposed concept of allowing kernel to handle hotplug update
>     of elfcorehdr
> ---
> 
> Eric DeVolder (8):
>    crash: introduce arch/*/asm/crash.h
>    crash: move crash_prepare_elf64_headers
>    crash: prototype change for crash_prepare_elf64_headers
>    crash: add generic infrastructure for crash hotplug support
>    kexec: exclude elfcorehdr from the segment digest
>    kexec: exclude hot remove cpu from elfcorehdr notes
>    crash: memory and cpu hotplug sysfs attributes
>    x86/crash: Add x86 crash hotplug support
> 
>   .../admin-guide/mm/memory-hotplug.rst         |   8 +
>   Documentation/core-api/cpu_hotplug.rst        |  18 ++
>   arch/arm/include/asm/crash.h                  |   5 +
>   arch/arm64/include/asm/crash.h                |   5 +
>   arch/arm64/kernel/machine_kexec_file.c        |   6 +-
>   arch/ia64/include/asm/crash.h                 |   5 +
>   arch/m68k/include/asm/crash.h                 |   5 +
>   arch/mips/include/asm/crash.h                 |   5 +
>   arch/parisc/include/asm/crash.h               |   5 +
>   arch/powerpc/include/asm/crash.h              |   5 +
>   arch/powerpc/kexec/file_load_64.c             |   2 +-
>   arch/riscv/include/asm/crash.h                |   5 +
>   arch/s390/include/asm/crash.h                 |   5 +
>   arch/sh/include/asm/crash.h                   |   5 +
>   arch/x86/Kconfig                              |  11 +
>   arch/x86/include/asm/crash.h                  |  20 ++
>   arch/x86/kernel/crash.c                       | 117 ++++++++-
>   drivers/base/cpu.c                            |  14 ++
>   drivers/base/memory.c                         |  13 +
>   include/linux/crash_core.h                    |  32 +++
>   include/linux/kexec.h                         |  14 +-
>   kernel/crash_core.c                           | 233 ++++++++++++++++++
>   kernel/kexec_file.c                           | 105 +-------
>   23 files changed, 537 insertions(+), 106 deletions(-)
>   create mode 100644 arch/arm/include/asm/crash.h
>   create mode 100644 arch/arm64/include/asm/crash.h
>   create mode 100644 arch/ia64/include/asm/crash.h
>   create mode 100644 arch/m68k/include/asm/crash.h
>   create mode 100644 arch/mips/include/asm/crash.h
>   create mode 100644 arch/parisc/include/asm/crash.h
>   create mode 100644 arch/powerpc/include/asm/crash.h
>   create mode 100644 arch/riscv/include/asm/crash.h
>   create mode 100644 arch/s390/include/asm/crash.h
>   create mode 100644 arch/sh/include/asm/crash.h
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-07-21 18:17 ` [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h Eric DeVolder
@ 2022-08-08  3:25   ` Baoquan He
  2022-08-08 15:18     ` Eric DeVolder
  0 siblings, 1 reply; 25+ messages in thread
From: Baoquan He @ 2022-08-08  3:25 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

Hi Eric,

On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> The use of __weak is being eliminated within kexec sources.
> The technique uses macros mapped onto inline functions in
> order to replace __weak.
> 
> This patchset was using __weak and so in order to replace
> __weak, this patch introduces arch/*/asm/crash.h, patterned
> after how kexec is moving away from __weak and to the macro
> definitions.

Are you going to replace __weak in kexec of arll ARCHes? I don't see
your point why all these empty header files are introduced. Wondering
what's impacted if not adding these empty files?

> 
> No functionality changed, yet.
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  arch/arm/include/asm/crash.h     | 5 +++++
>  arch/arm64/include/asm/crash.h   | 5 +++++
>  arch/ia64/include/asm/crash.h    | 5 +++++
>  arch/m68k/include/asm/crash.h    | 5 +++++
>  arch/mips/include/asm/crash.h    | 5 +++++
>  arch/parisc/include/asm/crash.h  | 5 +++++
>  arch/powerpc/include/asm/crash.h | 5 +++++
>  arch/riscv/include/asm/crash.h   | 5 +++++
>  arch/s390/include/asm/crash.h    | 5 +++++
>  arch/sh/include/asm/crash.h      | 5 +++++
>  include/linux/crash_core.h       | 2 ++
>  11 files changed, 52 insertions(+)
>  create mode 100644 arch/arm/include/asm/crash.h
>  create mode 100644 arch/arm64/include/asm/crash.h
>  create mode 100644 arch/ia64/include/asm/crash.h
>  create mode 100644 arch/m68k/include/asm/crash.h
>  create mode 100644 arch/mips/include/asm/crash.h
>  create mode 100644 arch/parisc/include/asm/crash.h
>  create mode 100644 arch/powerpc/include/asm/crash.h
>  create mode 100644 arch/riscv/include/asm/crash.h
>  create mode 100644 arch/s390/include/asm/crash.h
>  create mode 100644 arch/sh/include/asm/crash.h
> 
> diff --git a/arch/arm/include/asm/crash.h b/arch/arm/include/asm/crash.h
> new file mode 100644
> index 000000000000..385646957d60
> --- /dev/null
> +++ b/arch/arm/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_ARM_CRASH_H
> +#define _ASM_ARM_CRASH_H
> +
> +#endif /* _ASM_ARM_CRASH_H */
> diff --git a/arch/arm64/include/asm/crash.h b/arch/arm64/include/asm/crash.h
> new file mode 100644
> index 000000000000..ec8870c1ea49
> --- /dev/null
> +++ b/arch/arm64/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_ARM64_CRASH_H
> +#define _ASM_ARM64_CRASH_H
> +
> +#endif /* _ASM_ARM64_CRASH_H */
> diff --git a/arch/ia64/include/asm/crash.h b/arch/ia64/include/asm/crash.h
> new file mode 100644
> index 000000000000..02a457cccda3
> --- /dev/null
> +++ b/arch/ia64/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_IA64_CRASH_H
> +#define _ASM_IA64_CRASH_H
> +
> +#endif /* _ASM_IA64_CRASH_H */
> diff --git a/arch/m68k/include/asm/crash.h b/arch/m68k/include/asm/crash.h
> new file mode 100644
> index 000000000000..ba6e412a1267
> --- /dev/null
> +++ b/arch/m68k/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_M68K_CRASH_H
> +#define _ASM_M68K_CRASH_H
> +
> +#endif /* _ASM_M68K_CRASH_H */
> diff --git a/arch/mips/include/asm/crash.h b/arch/mips/include/asm/crash.h
> new file mode 100644
> index 000000000000..35872522c574
> --- /dev/null
> +++ b/arch/mips/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_MIPS_CRASH_H
> +#define _ASM_MIPS_CRASH_H
> +
> +#endif /* _ASM_MIPS_CRASH_H */
> diff --git a/arch/parisc/include/asm/crash.h b/arch/parisc/include/asm/crash.h
> new file mode 100644
> index 000000000000..96833b727179
> --- /dev/null
> +++ b/arch/parisc/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_PARISC_CRASH_H
> +#define _ASM_PARISC_CRASH_H
> +
> +#endif /* _ASM_PARISC_CRASH_H */
> diff --git a/arch/powerpc/include/asm/crash.h b/arch/powerpc/include/asm/crash.h
> new file mode 100644
> index 000000000000..40ce71e56ac1
> --- /dev/null
> +++ b/arch/powerpc/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_POWERPC_CRASH_H
> +#define _ASM_POWERPC_CRASH_H
> +
> +#endif /* _ASM_POWERPC_CRASH_H */
> diff --git a/arch/riscv/include/asm/crash.h b/arch/riscv/include/asm/crash.h
> new file mode 100644
> index 000000000000..24f3aea99707
> --- /dev/null
> +++ b/arch/riscv/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_RISCV_CRASH_H
> +#define _ASM_RISCV_CRASH_H
> +
> +#endif /* _ASM_RISCV_CRASH_H */
> diff --git a/arch/s390/include/asm/crash.h b/arch/s390/include/asm/crash.h
> new file mode 100644
> index 000000000000..0db16ad4c75f
> --- /dev/null
> +++ b/arch/s390/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_S390_CRASH_H
> +#define _ASM_S390_CRASH_H
> +
> +#endif /* _ASM_S390_CRASH_H */
> diff --git a/arch/sh/include/asm/crash.h b/arch/sh/include/asm/crash.h
> new file mode 100644
> index 000000000000..f54e12f88cae
> --- /dev/null
> +++ b/arch/sh/include/asm/crash.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_SH_CRASH_H
> +#define _ASM_SH_CRASH_H
> +
> +#endif /* _ASM_SH_CRASH_H */
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index de62a722431e..cb0f1916fbf5 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -6,6 +6,8 @@
>  #include <linux/elfcore.h>
>  #include <linux/elf.h>
>  
> +#include <asm/crash.h>
> +
>  #define CRASH_CORE_NOTE_NAME	   "CORE"
>  #define CRASH_CORE_NOTE_HEAD_BYTES ALIGN(sizeof(struct elf_note), 4)
>  #define CRASH_CORE_NOTE_NAME_BYTES ALIGN(sizeof(CRASH_CORE_NOTE_NAME), 4)
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 4/8] crash: add generic infrastructure for crash hotplug support
  2022-07-21 18:17 ` [PATCH v10 4/8] crash: add generic infrastructure for crash hotplug support Eric DeVolder
@ 2022-08-08  9:30   ` Baoquan He
  2022-08-08 15:20     ` Eric DeVolder
  0 siblings, 1 reply; 25+ messages in thread
From: Baoquan He @ 2022-08-08  9:30 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> CPU and memory change notifications are received in order to
> regenerate the elfcorehdr.
> 
> To support cpu hotplug, a callback is registered to capture the
> CPUHP_AP_ONLINE_DYN online and offline events via
> cpuhp_setup_state_nocalls().
> 
> To support memory hotplug, a notifier is registered to capture the
> MEM_ONLINE and MEM_OFFLINE events via register_memory_notifier().
> 
> The cpu callback and memory notifiers call handle_hotplug_event()
> which performs needed tasks and then dispatches the event to the
> architecture specific arch_crash_handle_hotplug_event(). During the
> process, the kexec_mutex is held.
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> ---
>  include/linux/crash_core.h |  24 ++++++++
>  include/linux/kexec.h      |   7 +++
>  kernel/crash_core.c        | 118 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 149 insertions(+)
> 
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index cb0f1916fbf5..c9705b6872e7 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -86,4 +86,28 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
>  int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
>  		unsigned long long *crash_size, unsigned long long *crash_base);
>  
> +#define KEXEC_CRASH_HP_REMOVE_CPU		0
> +#define KEXEC_CRASH_HP_ADD_CPU			1
> +#define KEXEC_CRASH_HP_REMOVE_MEMORY	2
> +#define KEXEC_CRASH_HP_ADD_MEMORY		3
> +#define KEXEC_CRASH_HP_INVALID_CPU		-1U
> +
> +struct kimage;
> +#ifndef arch_map_crash_pages
> +static inline void *arch_map_crash_pages(unsigned long paddr,
> +		unsigned long size)
> +{
> +	return NULL;
> +}
> +#endif
> +#ifndef arch_unmap_crash_pages
> +static inline void arch_unmap_crash_pages(void **ptr) { }
> +#endif
> +#ifndef arch_crash_handle_hotplug_event
> +static inline void arch_crash_handle_hotplug_event(struct kimage *image,
> +		unsigned int hp_action, unsigned int cpu)
> +{
> +}
> +#endif
> +
>  #endif /* LINUX_CRASH_CORE_H */
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 5f4969cf3f4e..7694aa77b92b 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -340,6 +340,13 @@ struct kimage {
>  	struct purgatory_info purgatory_info;
>  #endif
>  
> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
> +	bool hotplug_event;
> +	unsigned int offlinecpu;
> +	bool elfcorehdr_index_valid;
> +	int elfcorehdr_index;
> +#endif
> +
>  #ifdef CONFIG_IMA_KEXEC
>  	/* Virtual address of IMA measurement buffer for kexec syscall */
>  	void *ima_buffer;
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 212d4dad0ec7..154ef532d45a 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -10,12 +10,16 @@
>  #include <linux/utsname.h>
>  #include <linux/vmalloc.h>
>  #include <linux/kexec.h>
> +#include <linux/memory.h>
> +#include <linux/cpuhotplug.h>
>  
>  #include <asm/page.h>
>  #include <asm/sections.h>
>  
>  #include <crypto/sha1.h>
>  
> +#include "kexec_internal.h"
> +
>  /* vmcoreinfo stuff */
>  unsigned char *vmcoreinfo_data;
>  size_t vmcoreinfo_size;
> @@ -587,3 +591,117 @@ static int __init crash_save_vmcoreinfo_init(void)
>  }
>  
>  subsys_initcall(crash_save_vmcoreinfo_init);
> +
> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
> +static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
> +{
> +	/* Obtain lock while changing crash information */
> +	mutex_lock(&kexec_mutex);
> +
> +	/* Check kdump is loaded */
> +	if (kexec_crash_image) {
> +		struct kimage *image = kexec_crash_image;
> +
> +		pr_debug("crash hp: hp_action %u, cpu %u", hp_action, cpu);

We don't need to check if it's cpu hotplug action here? Then for mem
hotplug actions, we also get:
                "crash hp: hp_action 2, cpu 0"
That cpu 0 will be confusing. And you forgot the line break.

Can we do like this:
		pr_debug("crash hp: hp_action %u ", hp_action);
		if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU ||
		    hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
			pr_debug("cpu %u \n", cpu);

Other than this, the patch looks good to me.

> +
> +		/*
> +		 * When the struct kimage is alloced, it is wiped to zero, so
> +		 * the elfcorehdr_index_valid defaults to false. Find the
> +		 * segment containing the elfcorehdr, if not already found.
> +		 * This works for both the kexec_load and kexec_file_load paths.
> +		 */
> +		if (!image->elfcorehdr_index_valid) {
> +			unsigned char *ptr;
> +			unsigned long mem, memsz;
> +			unsigned int n;
> +
> +			for (n = 0; n < image->nr_segments; n++) {
> +				mem = image->segment[n].mem;
> +				memsz = image->segment[n].memsz;
> +				ptr = arch_map_crash_pages(mem, memsz);
> +				if (ptr) {
> +					/* The segment containing elfcorehdr */
> +					if (memcmp(ptr, ELFMAG, SELFMAG) == 0) {
> +						image->elfcorehdr_index = (int)n;
> +						image->elfcorehdr_index_valid = true;
> +					}
> +				}
> +				arch_unmap_crash_pages((void **)&ptr);
> +			}
> +		}
> +
> +		if (!image->elfcorehdr_index_valid) {
> +			pr_err("crash hp: unable to locate elfcorehdr segment");
> +			goto out;
> +		}
> +
> +		/* Needed in order for the segments to be updated */
> +		arch_kexec_unprotect_crashkres();
> +
> +		/* Flag to differentiate between normal load and hotplug */
> +		image->hotplug_event = true;
> +
> +		/* Now invoke arch-specific update handler */
> +		arch_crash_handle_hotplug_event(image, hp_action, cpu);
> +
> +		/* No longer handling a hotplug event */
> +		image->hotplug_event = false;
> +
> +		/* Change back to read-only */
> +		arch_kexec_protect_crashkres();
> +	}
> +
> +out:
> +	/* Release lock now that update complete */
> +	mutex_unlock(&kexec_mutex);
> +}
> +
> +static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)
> +{
> +	switch (val) {
> +	case MEM_ONLINE:
> +		handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY, 0);
> +		break;
> +
> +	case MEM_OFFLINE:
> +		handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY, 0);
> +		break;
> +	}
> +	return NOTIFY_OK;
> +}
> +
> +static struct notifier_block crash_memhp_nb = {
> +	.notifier_call = crash_memhp_notifier,
> +	.priority = 0
> +};
> +
> +static int crash_cpuhp_online(unsigned int cpu)
> +{
> +	handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu);
> +	return 0;
> +}
> +
> +static int crash_cpuhp_offline(unsigned int cpu)
> +{
> +	handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu);
> +	return 0;
> +}
> +
> +static int __init crash_hotplug_init(void)
> +{
> +	int result = 0;
> +
> +	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
> +		register_memory_notifier(&crash_memhp_nb);
> +
> +	if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
> +		result = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> +						   "crash/cpuhp",
> +						   crash_cpuhp_online,
> +						   crash_cpuhp_offline);
> +
> +	return result;
> +}
> +
> +subsys_initcall(crash_hotplug_init);
> +#endif
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes
  2022-07-21 18:17 ` [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes Eric DeVolder
@ 2022-08-08 10:41   ` Baoquan He
  2022-08-16 15:24     ` Eric DeVolder
  0 siblings, 1 reply; 25+ messages in thread
From: Baoquan He @ 2022-08-08 10:41 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> This introduces the crash_hotplug attribute for memory and CPUs
> for use by userspace.  This change directly facilitates the udev
> rule for managing userspace re-loading of the crash kernel upon
> hot un/plug changes.
> 
> For memory, this changeset introduces the crash_hotplug attribute
> to the /sys/devices/system/memory directory. For example:
> 
>  # udevadm info --attribute-walk /sys/devices/system/memory/memory81
>   looking at device '/devices/system/memory/memory81':
>     KERNEL=="memory81"
>     SUBSYSTEM=="memory"
>     DRIVER==""
>     ATTR{online}=="1"
>     ATTR{phys_device}=="0"
>     ATTR{phys_index}=="00000051"
>     ATTR{removable}=="1"
>     ATTR{state}=="online"
>     ATTR{valid_zones}=="Movable"
> 
>   looking at parent device '/devices/system/memory':
>     KERNELS=="memory"
>     SUBSYSTEMS==""
>     DRIVERS==""
>     ATTRS{auto_online_blocks}=="offline"
>     ATTRS{block_size_bytes}=="8000000"
>     ATTRS{crash_hotplug}=="1"
> 
> For CPUs, this changeset introduces the crash_hotplug attribute
> to the /sys/devices/system/cpu directory. For example:
> 
>  # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
>   looking at device '/devices/system/cpu/cpu0':
>     KERNEL=="cpu0"
>     SUBSYSTEM=="cpu"
>     DRIVER=="processor"
>     ATTR{crash_notes}=="277c38600"
>     ATTR{crash_notes_size}=="368"
>     ATTR{online}=="1"
> 
>   looking at parent device '/devices/system/cpu':
>     KERNELS=="cpu"
>     SUBSYSTEMS==""
>     DRIVERS==""
>     ATTRS{crash_hotplug}=="1"
>     ATTRS{isolated}==""
>     ATTRS{kernel_max}=="8191"
>     ATTRS{nohz_full}=="  (null)"
>     ATTRS{offline}=="4-7"
>     ATTRS{online}=="0-3"
>     ATTRS{possible}=="0-7"
>     ATTRS{present}=="0-3"
> 
> With these sysfs attributes in place, it is possible to efficiently
> instruct the udev rule to skip crash kernel reloading.
> 
> For example, the following is the proposed udev rule change for RHEL
> system 98-kexec.rules (as the first lines of the rule file):
> 
>  # The kernel handles updates to crash elfcorehdr for cpu and memory changes
>  SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>  SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
> 
> When examined in the context of 98-kexec.rules, the above change
> tests if crash_hotplug is set, and if so, it skips the userspace
> initiated unload-then-reload of the crash kernel.
> 
> Cpu and memory checks are separated in accordance with
> CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options.
> If an architecture supports, for example, memory hotplug but not
> CPU hotplug, then the /sys/devices/system/memory/crash_hotplug
> attribute file is present, but the /sys/devices/system/cpu/crash_hotplug
> attribute file will NOT be present. Thus the udev rule will skip
> userspace processing of memory hot un/plug events, but the udev
> rule will fail for CPU events, thus allowing userspace to process
> cpu hot un/plug events (ie the unload-then-reload of the kdump
> capture kernel).
> 
> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>

LGTM,

Acked-by: Baoquan He <bhe@redhat.com>

> ---
>  .../admin-guide/mm/memory-hotplug.rst          |  8 ++++++++
>  Documentation/core-api/cpu_hotplug.rst         | 18 ++++++++++++++++++
>  drivers/base/cpu.c                             | 14 ++++++++++++++
>  drivers/base/memory.c                          | 13 +++++++++++++
>  include/linux/crash_core.h                     |  6 ++++++
>  5 files changed, 59 insertions(+)
> 
> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
> index 0f56ecd8ac05..494d7a63c543 100644
> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
> @@ -293,6 +293,14 @@ The following files are currently defined:
>  		       Availability depends on the CONFIG_ARCH_MEMORY_PROBE
>  		       kernel configuration option.
>  ``uevent``	       read-write: generic udev file for device subsystems.
> +``crash_hotplug``      read-only: when changes to the system memory map
> +		       occur due to hot un/plug of memory, this file contains
> +		       '1' if the kernel updates the kdump capture kernel memory
> +		       map itself (via elfcorehdr), or '0' if userspace must update
> +		       the kdump capture kernel memory map.
> +
> +		       Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
> +		       configuration option.
>  ====================== =========================================================
>  
>  .. note::
> diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
> index c6f4ba2fb32d..13e33d098645 100644
> --- a/Documentation/core-api/cpu_hotplug.rst
> +++ b/Documentation/core-api/cpu_hotplug.rst
> @@ -750,6 +750,24 @@ will receive all events. A script like::
>  
>  can process the event further.
>  
> +When changes to the CPUs in the system occur, the sysfs file
> +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
> +updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
> +or '0' if userspace must update the kdump capture kernel list of CPUs.
> +
> +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
> +option.
> +
> +To skip userspace processing of CPU hot un/plug events for kdump
> +(ie the unload-then-reload to obtain a current list of CPUs), this sysfs
> +file can be used in a udev rule as follows:
> +
> + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
> +
> +For a cpu hot un/plug event, if the architecture supports kernel updates
> +of the elfcorehdr (which contains the list of CPUs), then the rule skips
> +the unload-then-reload of the kdump capture kernel.
> +
>  Kernel Inline Documentations Reference
>  ======================================
>  
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 4c98849577d4..bd470236d9a2 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -293,6 +293,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev,
>  static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL);
>  #endif
>  
> +#ifdef CONFIG_HOTPLUG_CPU
> +#include <linux/crash_core.h>
> +static ssize_t crash_hotplug_show(struct device *dev,
> +				     struct device_attribute *attr,
> +				     char *buf)
> +{
> +	return sprintf(buf, "%d\n", crash_hotplug_cpu_support());
> +}
> +static DEVICE_ATTR_ADMIN_RO(crash_hotplug);
> +#endif
> +
>  static void cpu_device_release(struct device *dev)
>  {
>  	/*
> @@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = {
>  #ifdef CONFIG_NO_HZ_FULL
>  	&dev_attr_nohz_full.attr,
>  #endif
> +#ifdef CONFIG_HOTPLUG_CPU
> +	&dev_attr_crash_hotplug.attr,
> +#endif
>  #ifdef CONFIG_GENERIC_CPU_AUTOPROBE
>  	&dev_attr_modalias.attr,
>  #endif
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index bc60c9cd3230..63c1754a52b6 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -483,6 +483,16 @@ static ssize_t auto_online_blocks_store(struct device *dev,
>  
>  static DEVICE_ATTR_RW(auto_online_blocks);
>  
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +#include <linux/crash_core.h>
> +static ssize_t crash_hotplug_show(struct device *dev,
> +				       struct device_attribute *attr, char *buf)
> +{
> +	return sprintf(buf, "%d\n", crash_hotplug_memory_support());
> +}
> +static DEVICE_ATTR_RO(crash_hotplug);
> +#endif
> +
>  /*
>   * Some architectures will have custom drivers to do this, and
>   * will not need to do it from userspace.  The fake hot-add code
> @@ -887,6 +897,9 @@ static struct attribute *memory_root_attrs[] = {
>  
>  	&dev_attr_block_size_bytes.attr,
>  	&dev_attr_auto_online_blocks.attr,
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +	&dev_attr_crash_hotplug.attr,
> +#endif
>  	NULL
>  };
>  
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index c9705b6872e7..3964e9924ea5 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -109,5 +109,11 @@ static inline void arch_crash_handle_hotplug_event(struct kimage *image,
>  {
>  }
>  #endif
> +#ifndef crash_hotplug_cpu_support
> +static inline int crash_hotplug_cpu_support(void) { return 0; }
> +#endif
> +#ifndef crash_hotplug_memory_support
> +static inline int crash_hotplug_memory_support(void) { return 0; }
> +#endif
>  
>  #endif /* LINUX_CRASH_CORE_H */
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-08-08  3:25   ` Baoquan He
@ 2022-08-08 15:18     ` Eric DeVolder
  2022-08-08 15:44       ` Rob Herring
  2022-08-12  9:46       ` Baoquan He
  0 siblings, 2 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-08-08 15:18 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky



On 8/7/22 22:25, Baoquan He wrote:
> Hi Eric,
> 
> On 07/21/22 at 02:17pm, Eric DeVolder wrote:
>> The use of __weak is being eliminated within kexec sources.
>> The technique uses macros mapped onto inline functions in
>> order to replace __weak.
>>
>> This patchset was using __weak and so in order to replace
>> __weak, this patch introduces arch/*/asm/crash.h, patterned
>> after how kexec is moving away from __weak and to the macro
>> definitions.
> 
> Are you going to replace __weak in kexec of arll ARCHes? I don't see
> your point why all these empty header files are introduced. Wondering
> what's impacted if not adding these empty files?

Hi Baoquan,
In this patchset, to file include/linux/crash_core.h I added the line #include <asm/crash.h>.
I patterned this after how include/linux/kexec.h does #include <asm/kexec.h>.

For kexec, the items that were __weak are refactored into corresponding asm/kexec.h.

I followed suit for crash __weak items. File crash_core.h now #include's asm/crash.h and
so that file needs to be present for every arch, else build failures ensue. It turns out
x86_64 already had this file.

At this time, I was not planning on converting the other arch's __weak to asm/crash.h, but at least 
with these empty files, the infrastructure is in place for when that does occur.

Let me know if you think I need to do something different/more here.
Thanks!
eric

> 
>>
>> No functionality changed, yet.
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   arch/arm/include/asm/crash.h     | 5 +++++
>>   arch/arm64/include/asm/crash.h   | 5 +++++
>>   arch/ia64/include/asm/crash.h    | 5 +++++
>>   arch/m68k/include/asm/crash.h    | 5 +++++
>>   arch/mips/include/asm/crash.h    | 5 +++++
>>   arch/parisc/include/asm/crash.h  | 5 +++++
>>   arch/powerpc/include/asm/crash.h | 5 +++++
>>   arch/riscv/include/asm/crash.h   | 5 +++++
>>   arch/s390/include/asm/crash.h    | 5 +++++
>>   arch/sh/include/asm/crash.h      | 5 +++++
>>   include/linux/crash_core.h       | 2 ++
>>   11 files changed, 52 insertions(+)
>>   create mode 100644 arch/arm/include/asm/crash.h
>>   create mode 100644 arch/arm64/include/asm/crash.h
>>   create mode 100644 arch/ia64/include/asm/crash.h
>>   create mode 100644 arch/m68k/include/asm/crash.h
>>   create mode 100644 arch/mips/include/asm/crash.h
>>   create mode 100644 arch/parisc/include/asm/crash.h
>>   create mode 100644 arch/powerpc/include/asm/crash.h
>>   create mode 100644 arch/riscv/include/asm/crash.h
>>   create mode 100644 arch/s390/include/asm/crash.h
>>   create mode 100644 arch/sh/include/asm/crash.h
>>
>> diff --git a/arch/arm/include/asm/crash.h b/arch/arm/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..385646957d60
>> --- /dev/null
>> +++ b/arch/arm/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_ARM_CRASH_H
>> +#define _ASM_ARM_CRASH_H
>> +
>> +#endif /* _ASM_ARM_CRASH_H */
>> diff --git a/arch/arm64/include/asm/crash.h b/arch/arm64/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..ec8870c1ea49
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_ARM64_CRASH_H
>> +#define _ASM_ARM64_CRASH_H
>> +
>> +#endif /* _ASM_ARM64_CRASH_H */
>> diff --git a/arch/ia64/include/asm/crash.h b/arch/ia64/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..02a457cccda3
>> --- /dev/null
>> +++ b/arch/ia64/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_IA64_CRASH_H
>> +#define _ASM_IA64_CRASH_H
>> +
>> +#endif /* _ASM_IA64_CRASH_H */
>> diff --git a/arch/m68k/include/asm/crash.h b/arch/m68k/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..ba6e412a1267
>> --- /dev/null
>> +++ b/arch/m68k/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_M68K_CRASH_H
>> +#define _ASM_M68K_CRASH_H
>> +
>> +#endif /* _ASM_M68K_CRASH_H */
>> diff --git a/arch/mips/include/asm/crash.h b/arch/mips/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..35872522c574
>> --- /dev/null
>> +++ b/arch/mips/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_MIPS_CRASH_H
>> +#define _ASM_MIPS_CRASH_H
>> +
>> +#endif /* _ASM_MIPS_CRASH_H */
>> diff --git a/arch/parisc/include/asm/crash.h b/arch/parisc/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..96833b727179
>> --- /dev/null
>> +++ b/arch/parisc/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_PARISC_CRASH_H
>> +#define _ASM_PARISC_CRASH_H
>> +
>> +#endif /* _ASM_PARISC_CRASH_H */
>> diff --git a/arch/powerpc/include/asm/crash.h b/arch/powerpc/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..40ce71e56ac1
>> --- /dev/null
>> +++ b/arch/powerpc/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_POWERPC_CRASH_H
>> +#define _ASM_POWERPC_CRASH_H
>> +
>> +#endif /* _ASM_POWERPC_CRASH_H */
>> diff --git a/arch/riscv/include/asm/crash.h b/arch/riscv/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..24f3aea99707
>> --- /dev/null
>> +++ b/arch/riscv/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_RISCV_CRASH_H
>> +#define _ASM_RISCV_CRASH_H
>> +
>> +#endif /* _ASM_RISCV_CRASH_H */
>> diff --git a/arch/s390/include/asm/crash.h b/arch/s390/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..0db16ad4c75f
>> --- /dev/null
>> +++ b/arch/s390/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_S390_CRASH_H
>> +#define _ASM_S390_CRASH_H
>> +
>> +#endif /* _ASM_S390_CRASH_H */
>> diff --git a/arch/sh/include/asm/crash.h b/arch/sh/include/asm/crash.h
>> new file mode 100644
>> index 000000000000..f54e12f88cae
>> --- /dev/null
>> +++ b/arch/sh/include/asm/crash.h
>> @@ -0,0 +1,5 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_SH_CRASH_H
>> +#define _ASM_SH_CRASH_H
>> +
>> +#endif /* _ASM_SH_CRASH_H */
>> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
>> index de62a722431e..cb0f1916fbf5 100644
>> --- a/include/linux/crash_core.h
>> +++ b/include/linux/crash_core.h
>> @@ -6,6 +6,8 @@
>>   #include <linux/elfcore.h>
>>   #include <linux/elf.h>
>>   
>> +#include <asm/crash.h>
>> +
>>   #define CRASH_CORE_NOTE_NAME	   "CORE"
>>   #define CRASH_CORE_NOTE_HEAD_BYTES ALIGN(sizeof(struct elf_note), 4)
>>   #define CRASH_CORE_NOTE_NAME_BYTES ALIGN(sizeof(CRASH_CORE_NOTE_NAME), 4)
>> -- 
>> 2.31.1
>>
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 4/8] crash: add generic infrastructure for crash hotplug support
  2022-08-08  9:30   ` Baoquan He
@ 2022-08-08 15:20     ` Eric DeVolder
  0 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-08-08 15:20 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky



On 8/8/22 04:30, Baoquan He wrote:
> On 07/21/22 at 02:17pm, Eric DeVolder wrote:
>> CPU and memory change notifications are received in order to
>> regenerate the elfcorehdr.
>>
>> To support cpu hotplug, a callback is registered to capture the
>> CPUHP_AP_ONLINE_DYN online and offline events via
>> cpuhp_setup_state_nocalls().
>>
>> To support memory hotplug, a notifier is registered to capture the
>> MEM_ONLINE and MEM_OFFLINE events via register_memory_notifier().
>>
>> The cpu callback and memory notifiers call handle_hotplug_event()
>> which performs needed tasks and then dispatches the event to the
>> architecture specific arch_crash_handle_hotplug_event(). During the
>> process, the kexec_mutex is held.
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
>> ---
>>   include/linux/crash_core.h |  24 ++++++++
>>   include/linux/kexec.h      |   7 +++
>>   kernel/crash_core.c        | 118 +++++++++++++++++++++++++++++++++++++
>>   3 files changed, 149 insertions(+)
>>
>> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
>> index cb0f1916fbf5..c9705b6872e7 100644
>> --- a/include/linux/crash_core.h
>> +++ b/include/linux/crash_core.h
>> @@ -86,4 +86,28 @@ int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
>>   int parse_crashkernel_low(char *cmdline, unsigned long long system_ram,
>>   		unsigned long long *crash_size, unsigned long long *crash_base);
>>   
>> +#define KEXEC_CRASH_HP_REMOVE_CPU		0
>> +#define KEXEC_CRASH_HP_ADD_CPU			1
>> +#define KEXEC_CRASH_HP_REMOVE_MEMORY	2
>> +#define KEXEC_CRASH_HP_ADD_MEMORY		3
>> +#define KEXEC_CRASH_HP_INVALID_CPU		-1U
>> +
>> +struct kimage;
>> +#ifndef arch_map_crash_pages
>> +static inline void *arch_map_crash_pages(unsigned long paddr,
>> +		unsigned long size)
>> +{
>> +	return NULL;
>> +}
>> +#endif
>> +#ifndef arch_unmap_crash_pages
>> +static inline void arch_unmap_crash_pages(void **ptr) { }
>> +#endif
>> +#ifndef arch_crash_handle_hotplug_event
>> +static inline void arch_crash_handle_hotplug_event(struct kimage *image,
>> +		unsigned int hp_action, unsigned int cpu)
>> +{
>> +}
>> +#endif
>> +
>>   #endif /* LINUX_CRASH_CORE_H */
>> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
>> index 5f4969cf3f4e..7694aa77b92b 100644
>> --- a/include/linux/kexec.h
>> +++ b/include/linux/kexec.h
>> @@ -340,6 +340,13 @@ struct kimage {
>>   	struct purgatory_info purgatory_info;
>>   #endif
>>   
>> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>> +	bool hotplug_event;
>> +	unsigned int offlinecpu;
>> +	bool elfcorehdr_index_valid;
>> +	int elfcorehdr_index;
>> +#endif
>> +
>>   #ifdef CONFIG_IMA_KEXEC
>>   	/* Virtual address of IMA measurement buffer for kexec syscall */
>>   	void *ima_buffer;
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 212d4dad0ec7..154ef532d45a 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -10,12 +10,16 @@
>>   #include <linux/utsname.h>
>>   #include <linux/vmalloc.h>
>>   #include <linux/kexec.h>
>> +#include <linux/memory.h>
>> +#include <linux/cpuhotplug.h>
>>   
>>   #include <asm/page.h>
>>   #include <asm/sections.h>
>>   
>>   #include <crypto/sha1.h>
>>   
>> +#include "kexec_internal.h"
>> +
>>   /* vmcoreinfo stuff */
>>   unsigned char *vmcoreinfo_data;
>>   size_t vmcoreinfo_size;
>> @@ -587,3 +591,117 @@ static int __init crash_save_vmcoreinfo_init(void)
>>   }
>>   
>>   subsys_initcall(crash_save_vmcoreinfo_init);
>> +
>> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>> +static void handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
>> +{
>> +	/* Obtain lock while changing crash information */
>> +	mutex_lock(&kexec_mutex);
>> +
>> +	/* Check kdump is loaded */
>> +	if (kexec_crash_image) {
>> +		struct kimage *image = kexec_crash_image;
>> +
>> +		pr_debug("crash hp: hp_action %u, cpu %u", hp_action, cpu);
> 
> We don't need to check if it's cpu hotplug action here? Then for mem
> hotplug actions, we also get:
>                  "crash hp: hp_action 2, cpu 0"
> That cpu 0 will be confusing. And you forgot the line break.
> 
> Can we do like this:
> 		pr_debug("crash hp: hp_action %u ", hp_action);
> 		if (hp_action == KEXEC_CRASH_HP_REMOVE_CPU ||
> 		    hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
> 			pr_debug("cpu %u \n", cpu);
> 
> Other than this, the patch looks good to me.

OK, I'll make the changes you suggest, thanks!
eric

> 
>> +
>> +		/*
>> +		 * When the struct kimage is alloced, it is wiped to zero, so
>> +		 * the elfcorehdr_index_valid defaults to false. Find the
>> +		 * segment containing the elfcorehdr, if not already found.
>> +		 * This works for both the kexec_load and kexec_file_load paths.
>> +		 */
>> +		if (!image->elfcorehdr_index_valid) {
>> +			unsigned char *ptr;
>> +			unsigned long mem, memsz;
>> +			unsigned int n;
>> +
>> +			for (n = 0; n < image->nr_segments; n++) {
>> +				mem = image->segment[n].mem;
>> +				memsz = image->segment[n].memsz;
>> +				ptr = arch_map_crash_pages(mem, memsz);
>> +				if (ptr) {
>> +					/* The segment containing elfcorehdr */
>> +					if (memcmp(ptr, ELFMAG, SELFMAG) == 0) {
>> +						image->elfcorehdr_index = (int)n;
>> +						image->elfcorehdr_index_valid = true;
>> +					}
>> +				}
>> +				arch_unmap_crash_pages((void **)&ptr);
>> +			}
>> +		}
>> +
>> +		if (!image->elfcorehdr_index_valid) {
>> +			pr_err("crash hp: unable to locate elfcorehdr segment");
>> +			goto out;
>> +		}
>> +
>> +		/* Needed in order for the segments to be updated */
>> +		arch_kexec_unprotect_crashkres();
>> +
>> +		/* Flag to differentiate between normal load and hotplug */
>> +		image->hotplug_event = true;
>> +
>> +		/* Now invoke arch-specific update handler */
>> +		arch_crash_handle_hotplug_event(image, hp_action, cpu);
>> +
>> +		/* No longer handling a hotplug event */
>> +		image->hotplug_event = false;
>> +
>> +		/* Change back to read-only */
>> +		arch_kexec_protect_crashkres();
>> +	}
>> +
>> +out:
>> +	/* Release lock now that update complete */
>> +	mutex_unlock(&kexec_mutex);
>> +}
>> +
>> +static int crash_memhp_notifier(struct notifier_block *nb, unsigned long val, void *v)
>> +{
>> +	switch (val) {
>> +	case MEM_ONLINE:
>> +		handle_hotplug_event(KEXEC_CRASH_HP_ADD_MEMORY, 0);
>> +		break;
>> +
>> +	case MEM_OFFLINE:
>> +		handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_MEMORY, 0);
>> +		break;
>> +	}
>> +	return NOTIFY_OK;
>> +}
>> +
>> +static struct notifier_block crash_memhp_nb = {
>> +	.notifier_call = crash_memhp_notifier,
>> +	.priority = 0
>> +};
>> +
>> +static int crash_cpuhp_online(unsigned int cpu)
>> +{
>> +	handle_hotplug_event(KEXEC_CRASH_HP_ADD_CPU, cpu);
>> +	return 0;
>> +}
>> +
>> +static int crash_cpuhp_offline(unsigned int cpu)
>> +{
>> +	handle_hotplug_event(KEXEC_CRASH_HP_REMOVE_CPU, cpu);
>> +	return 0;
>> +}
>> +
>> +static int __init crash_hotplug_init(void)
>> +{
>> +	int result = 0;
>> +
>> +	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
>> +		register_memory_notifier(&crash_memhp_nb);
>> +
>> +	if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
>> +		result = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
>> +						   "crash/cpuhp",
>> +						   crash_cpuhp_online,
>> +						   crash_cpuhp_offline);
>> +
>> +	return result;
>> +}
>> +
>> +subsys_initcall(crash_hotplug_init);
>> +#endif
>> -- 
>> 2.31.1
>>
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-08-08 15:18     ` Eric DeVolder
@ 2022-08-08 15:44       ` Rob Herring
  2022-08-12  9:46       ` Baoquan He
  1 sibling, 0 replies; 25+ messages in thread
From: Rob Herring @ 2022-08-08 15:44 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: Baoquan He, linux-kernel, X86 ML, kexec, Eric W. Biederman,
	Dave Young, Vivek Goyal, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Lakshmi Ramasubramanian, Thomas Lendacky, efault, Mike Rapoport,
	David Hildenbrand, sourabhjain, Konrad Rzeszutek Wilk,
	Boris Ostrovsky

On Mon, Aug 8, 2022 at 9:19 AM Eric DeVolder <eric.devolder@oracle.com> wrote:
>
>
>
> On 8/7/22 22:25, Baoquan He wrote:
> > Hi Eric,
> >
> > On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> >> The use of __weak is being eliminated within kexec sources.
> >> The technique uses macros mapped onto inline functions in
> >> order to replace __weak.
> >>
> >> This patchset was using __weak and so in order to replace
> >> __weak, this patch introduces arch/*/asm/crash.h, patterned
> >> after how kexec is moving away from __weak and to the macro
> >> definitions.
> >
> > Are you going to replace __weak in kexec of arll ARCHes? I don't see
> > your point why all these empty header files are introduced. Wondering
> > what's impacted if not adding these empty files?
>
> Hi Baoquan,
> In this patchset, to file include/linux/crash_core.h I added the line #include <asm/crash.h>.
> I patterned this after how include/linux/kexec.h does #include <asm/kexec.h>.
>
> For kexec, the items that were __weak are refactored into corresponding asm/kexec.h.
>
> I followed suit for crash __weak items. File crash_core.h now #include's asm/crash.h and
> so that file needs to be present for every arch, else build failures ensue. It turns out
> x86_64 already had this file.
>
> At this time, I was not planning on converting the other arch's __weak to asm/crash.h, but at least
> with these empty files, the infrastructure is in place for when that does occur.

asm-generic is the right location for default asm headers. I'm not
really sure you even need them as you mainly seem to be using the
header to enable a feature. That's normally done by the arch selecting
a kconfig option. But as Borislav previously pointed out, you don't
need a new option here if the arch must provide support when kexec and
hotplug are enabled/supported.

Rob

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-08-08 15:18     ` Eric DeVolder
  2022-08-08 15:44       ` Rob Herring
@ 2022-08-12  9:46       ` Baoquan He
  2022-08-12 21:23         ` Eric DeVolder
  1 sibling, 1 reply; 25+ messages in thread
From: Baoquan He @ 2022-08-12  9:46 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

On 08/08/22 at 10:18am, Eric DeVolder wrote:
> 
> 
> On 8/7/22 22:25, Baoquan He wrote:
> > Hi Eric,
> > 
> > On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> > > The use of __weak is being eliminated within kexec sources.
> > > The technique uses macros mapped onto inline functions in
> > > order to replace __weak.
> > > 
> > > This patchset was using __weak and so in order to replace
> > > __weak, this patch introduces arch/*/asm/crash.h, patterned
> > > after how kexec is moving away from __weak and to the macro
> > > definitions.
> > 
> > Are you going to replace __weak in kexec of arll ARCHes? I don't see
> > your point why all these empty header files are introduced. Wondering
> > what's impacted if not adding these empty files?
> 
> Hi Baoquan,
> In this patchset, to file include/linux/crash_core.h I added the line #include <asm/crash.h>.
> I patterned this after how include/linux/kexec.h does #include <asm/kexec.h>.

I am sorry, Eric, it looks not so good. I understand you want to pattern
asm/kexe.h, but we need consider reality. Introducing a dozen of empty
header file and not being able to tell when they will be filled doesn't
make sense.

Includig <asm/crash.h> where needed is much simpler. I doubt if your way
can pass other reviewers' line. Can you reconsider?

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 77f5f3591760..b0577bdcc491 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -15,6 +15,7 @@
 
 #include <asm/page.h>
 #include <asm/sections.h>
+#include <asm/crash.h>
 
 #include <crypto/sha1.h>


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-08-12  9:46       ` Baoquan He
@ 2022-08-12 21:23         ` Eric DeVolder
  2022-08-13  0:24           ` Baoquan He
  0 siblings, 1 reply; 25+ messages in thread
From: Eric DeVolder @ 2022-08-12 21:23 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky



On 8/12/22 04:46, Baoquan He wrote:
> On 08/08/22 at 10:18am, Eric DeVolder wrote:
>>
>>
>> On 8/7/22 22:25, Baoquan He wrote:
>>> Hi Eric,
>>>
>>> On 07/21/22 at 02:17pm, Eric DeVolder wrote:
>>>> The use of __weak is being eliminated within kexec sources.
>>>> The technique uses macros mapped onto inline functions in
>>>> order to replace __weak.
>>>>
>>>> This patchset was using __weak and so in order to replace
>>>> __weak, this patch introduces arch/*/asm/crash.h, patterned
>>>> after how kexec is moving away from __weak and to the macro
>>>> definitions.
>>>
>>> Are you going to replace __weak in kexec of arll ARCHes? I don't see
>>> your point why all these empty header files are introduced. Wondering
>>> what's impacted if not adding these empty files?
>>
>> Hi Baoquan,
>> In this patchset, to file include/linux/crash_core.h I added the line #include <asm/crash.h>.
>> I patterned this after how include/linux/kexec.h does #include <asm/kexec.h>.
> 
> I am sorry, Eric, it looks not so good. I understand you want to pattern
> asm/kexe.h, but we need consider reality. Introducing a dozen of empty
> header file and not being able to tell when they will be filled doesn't
> make sense.
> 
> Includig <asm/crash.h> where needed is much simpler. I doubt if your way
> can pass other reviewers' line. Can you reconsider?

If I include <asm/crash.h> where needed, which is kernel/crash_core.c, then the other archs will 
fail build if that file doesn't exist. A couple of options, which do you think is better to pursue?

- use asm/kexec.h instead of asm/crash.h; it appears all the architectures already have this file in 
place

- go ahead and put the appropriate crash macros/inline functions into each arch asm/crash.h so that 
the files are not just empty, and leave the use of asm/crash.h

Or perhaps you see a better alternative?

Thanks!
eric


> 
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 77f5f3591760..b0577bdcc491 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -15,6 +15,7 @@
>   
>   #include <asm/page.h>
>   #include <asm/sections.h>
> +#include <asm/crash.h>
>   
>   #include <crypto/sha1.h>
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-08-12 21:23         ` Eric DeVolder
@ 2022-08-13  0:24           ` Baoquan He
  2022-08-16 15:23             ` Eric DeVolder
  0 siblings, 1 reply; 25+ messages in thread
From: Baoquan He @ 2022-08-13  0:24 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

On 08/12/22 at 04:23pm, Eric DeVolder wrote:
> 
> 
> On 8/12/22 04:46, Baoquan He wrote:
> > On 08/08/22 at 10:18am, Eric DeVolder wrote:
> > > 
> > > 
> > > On 8/7/22 22:25, Baoquan He wrote:
> > > > Hi Eric,
> > > > 
> > > > On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> > > > > The use of __weak is being eliminated within kexec sources.
> > > > > The technique uses macros mapped onto inline functions in
> > > > > order to replace __weak.
> > > > > 
> > > > > This patchset was using __weak and so in order to replace
> > > > > __weak, this patch introduces arch/*/asm/crash.h, patterned
> > > > > after how kexec is moving away from __weak and to the macro
> > > > > definitions.
> > > > 
> > > > Are you going to replace __weak in kexec of arll ARCHes? I don't see
> > > > your point why all these empty header files are introduced. Wondering
> > > > what's impacted if not adding these empty files?
> > > 
> > > Hi Baoquan,
> > > In this patchset, to file include/linux/crash_core.h I added the line #include <asm/crash.h>.
> > > I patterned this after how include/linux/kexec.h does #include <asm/kexec.h>.
> > 
> > I am sorry, Eric, it looks not so good. I understand you want to pattern
> > asm/kexe.h, but we need consider reality. Introducing a dozen of empty
> > header file and not being able to tell when they will be filled doesn't
> > make sense.
> > 
> > Includig <asm/crash.h> where needed is much simpler. I doubt if your way
> > can pass other reviewers' line. Can you reconsider?
> 
> If I include <asm/crash.h> where needed, which is kernel/crash_core.c, then
> the other archs will fail build if that file doesn't exist. A couple of
> options, which do you think is better to pursue?
> 
> - use asm/kexec.h instead of asm/crash.h; it appears all the architectures
> already have this file in place
> 
> - go ahead and put the appropriate crash macros/inline functions into each
> arch asm/crash.h so that the files are not just empty, and leave the use of
> asm/crash.h

I think we can do this in two steps.

Firstly, make do with asm/kexec.h since all other ARCHes put crash related stuff
into asm/kexec.h, except of x86. 

Secondly, clean up to put those crash marco/inline functions into
asm/crash.h.

The 2nd step can be done in a independent patchset. What do you think?


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support
  2022-07-21 18:17 ` [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support Eric DeVolder
@ 2022-08-13  0:34   ` Baoquan He
  2022-08-16 15:23     ` Eric DeVolder
  0 siblings, 1 reply; 25+ messages in thread
From: Baoquan He @ 2022-08-13  0:34 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

On 07/21/22 at 02:17pm, Eric DeVolder wrote:
...snip.... 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index e58798f636d4..bb59596c8bea 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2065,6 +2065,17 @@ config CRASH_DUMP
>  	  (CONFIG_RELOCATABLE=y).
>  	  For more details see Documentation/admin-guide/kdump/kdump.rst
>  
> +config CRASH_MAX_MEMORY_RANGES
> +	depends on CRASH_DUMP && KEXEC_FILE && (HOTPLUG_CPU || MEMORY_HOTPLUG)
> +	int
> +	default 32768

Do we need to enforce the value with page align and minimal size? I
checked crash_load_segments() in arch/x86/kernel/crash.c, it does the
page size aligning in kexec_add_buffer(). And in
load_crashdump_segments() of
kexec-tools/kexec/arch/i386/crashdump-x86.c, it creates elfcorehdr at
below code, the align is 1024, and in generic add_buffer()
implementation, it enforces the memsz page aligned, and changes the
passed align as page alignment.


	elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
                                                        max_addr, -1);

Maybe we should at least mention this in the help text to notice people.

> +	help
> +	  For the kexec_file_load path, specify the maximum number of
> +	  memory regions, eg. as represented by the 'System RAM' entries
> +	  in /proc/iomem, that the elfcorehdr buffer/segment can accommodate.
> +	  This value is combined with NR_CPUS and multiplied by Elf64_Phdr
> +	  size to determine the final buffer size.
> +
>  config KEXEC_JUMP
>  	bool "kexec jump"
>  	depends on KEXEC && HIBERNATION
> diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
> index 8b6bd63530dc..96051d8e4b45 100644
> --- a/arch/x86/include/asm/crash.h
> +++ b/arch/x86/include/asm/crash.h
> @@ -9,4 +9,24 @@ int crash_setup_memmap_entries(struct kimage *image,
>  		struct boot_params *params);
>  void crash_smp_send_stop(void);
>  
> +void *arch_map_crash_pages(unsigned long paddr, unsigned long size);
> +#define arch_map_crash_pages arch_map_crash_pages
> +
> +void arch_unmap_crash_pages(void **ptr);
> +#define arch_unmap_crash_pages arch_unmap_crash_pages
> +
> +void arch_crash_handle_hotplug_event(struct kimage *image,
> +		unsigned int hp_action, unsigned int cpu);
> +#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> +static inline int crash_hotplug_cpu_support(void) { return 1; }
> +#define crash_hotplug_cpu_support crash_hotplug_cpu_support
> +#endif
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +static inline int crash_hotplug_memory_support(void) { return 1; }
> +#define crash_hotplug_memory_support crash_hotplug_memory_support
> +#endif
> +
>  #endif /* _ASM_X86_CRASH_H */
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 9ceb93c176a6..55dda4fcde6e 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -25,6 +25,7 @@
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
>  #include <linux/memblock.h>
> +#include <linux/highmem.h>
>  
>  #include <asm/processor.h>
>  #include <asm/hardirq.h>
> @@ -397,7 +398,17 @@ int crash_load_segments(struct kimage *image)
>  	image->elf_headers = kbuf.buffer;
>  	image->elf_headers_sz = kbuf.bufsz;
>  
> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
> +	/* Ensure elfcorehdr segment large enough for hotplug changes */
> +	kbuf.memsz = (CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr);

Do we need to break the line to 80 chars?

> +	/* For marking as usable to crash kernel */
> +	image->elf_headers_sz = kbuf.memsz;

Do we need this code comment?

> +	/* Record the index of the elfcorehdr segment */
> +	image->elfcorehdr_index = image->nr_segments;

And this place?

> +	image->elfcorehdr_index_valid = true;
> +#else
>  	kbuf.memsz = kbuf.bufsz;
> +#endif
>  	kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
>  	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>  	ret = kexec_add_buffer(&kbuf);
> @@ -412,3 +423,107 @@ int crash_load_segments(struct kimage *image)
>  	return ret;
>  }
>  #endif /* CONFIG_KEXEC_FILE */
> +
> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
> +void *arch_map_crash_pages(unsigned long paddr, unsigned long size)
> +{
> +	/*
> +	 * NOTE: The addresses and sizes passed to this routine have
> +	 * already been fully aligned on page boundaries. There is no
> +	 * need for massaging the address or size.
> +	 */

Can we move the code comment above function interface?

> +	void *ptr = NULL;
> +
> +	/* NOTE: requires arch_kexec_[un]protect_crashkres() for write access */

Do we need this code comment? On ARCH where proctionion is made, we
surely need to the protect/unprotect.

> +	if (size > 0) {
> +		struct page *page = pfn_to_page(paddr >> PAGE_SHIFT);
> +
> +		ptr = kmap_local_page(page);
> +	}
> +
> +	return ptr;
> +}
> +
> +void arch_unmap_crash_pages(void **ptr)
> +{
> +	if (ptr) {
> +		if (*ptr)
> +			kunmap_local(*ptr);
> +		*ptr = NULL;
> +	}
> +}
> +
> +/**
> + * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
> + * @image: the active struct kimage
> + * @hp_action: the hot un/plug action being handled
> + * @cpu: when KEXEC_CRASH_HP_ADD/REMOVE_CPU, the cpu affected
> + *
> + * To accurately reflect hot un/plug changes, the elfcorehdr (which
> + * is passed to the crash kernel via the elfcorehdr= parameter)
> + * must be updated with the new list of CPUs and memories. The new
> + * elfcorehdr is prepared in a kernel buffer, and then it is
> + * written on top of the existing/old elfcorehdr.
> + *
> + * For hotplug changes to elfcorehdr to work, two conditions are
> + * needed:
> + * First, the segment containing the elfcorehdr must be large enough
> + * to permit a growing number of resources. See the
> + * CONFIG_CRASH_MAX_MEMORY_RANGES description.
> + * Second, purgatory must explicitly exclude the elfcorehdr from the
> + * list of segments it checks (since the elfcorehdr changes and thus
> + * would require an update to purgatory itself to update the digest).

Isn't this generic concept to crash hotplug? Should we move it out to
some generic place?

> + *
> + */
> +void arch_crash_handle_hotplug_event(struct kimage *image,
> +	unsigned int hp_action, unsigned int cpu)

The passed in 'cpu' is not used at all, what is it added for? I didn't
see explanation about it.

> +{
> +	struct kexec_segment *ksegment;
> +	unsigned char *ptr = NULL;
> +	unsigned long elfsz = 0;
> +	void *elfbuf = NULL;
> +	unsigned long mem, memsz;
> +
> +	/*
> +	 * Elfcorehdr_index_valid checked in crash_core:handle_hotplug_event()
> +	 */
> +	ksegment = &image->segment[image->elfcorehdr_index];
> +	mem = ksegment->mem;
> +	memsz = ksegment->memsz;
> +
> +	/*
> +	 * Create the new elfcorehdr reflecting the changes to CPU and/or
> +	 * memory resources.
> +	 */
> +	if (prepare_elf_headers(image, &elfbuf, &elfsz)) {
> +		pr_err("crash hp: unable to prepare elfcore headers");
> +		goto out;
> +	}
> +	if (elfsz > memsz) {
> +		pr_err("crash hp: update elfcorehdr elfsz %lu > memsz %lu",
> +			elfsz, memsz);
> +		goto out;
> +	}
> +
> +	/*
> +	 * At this point, we are all but assured of success.
> +	 * Copy new elfcorehdr into destination.
> +	 */
> +	ptr = arch_map_crash_pages(mem, memsz);
> +	if (ptr) {
> +		/*
> +		 * Temporarily invalidate the crash image while the
> +		 * elfcorehdr is updated.
> +		 */
> +		xchg(&kexec_crash_image, NULL);
> +		memcpy_flushcache((void *)ptr, elfbuf, elfsz);
> +		xchg(&kexec_crash_image, image);
> +	}
> +	arch_unmap_crash_pages((void **)&ptr);
> +	pr_debug("crash hp: re-loaded elfcorehdr at 0x%lx\n", mem);
> +
> +out:
> +	if (elfbuf)
> +		vfree(elfbuf);
> +}
> +#endif
> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support
  2022-08-13  0:34   ` Baoquan He
@ 2022-08-16 15:23     ` Eric DeVolder
  2022-08-25 19:42       ` Eric DeVolder
  2022-08-26  4:35       ` Baoquan He
  0 siblings, 2 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-08-16 15:23 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky



On 8/12/22 19:34, Baoquan He wrote:
> On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> ...snip....
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index e58798f636d4..bb59596c8bea 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -2065,6 +2065,17 @@ config CRASH_DUMP
>>   	  (CONFIG_RELOCATABLE=y).
>>   	  For more details see Documentation/admin-guide/kdump/kdump.rst
>>   
>> +config CRASH_MAX_MEMORY_RANGES
>> +	depends on CRASH_DUMP && KEXEC_FILE && (HOTPLUG_CPU || MEMORY_HOTPLUG)
>> +	int
>> +	default 32768
> 
> Do we need to enforce the value with page align and minimal size? I

Are you asking about the value CRASH_MAX_MEMORY_RANGES? This value represents
the maximum number of memory ranges, and there Elf64_Phdrs, that we need to
allow for elfcorehdr memory. So I'm not sure what the concern for alignment
is. I suppose we could also institute a minimum size for this value, say 1024.

> checked crash_load_segments() in arch/x86/kernel/crash.c, it does the
> page size aligning in kexec_add_buffer(). And in
> load_crashdump_segments() of
> kexec-tools/kexec/arch/i386/crashdump-x86.c, it creates elfcorehdr at
> below code, the align is 1024, and in generic add_buffer()
> implementation, it enforces the memsz page aligned, and changes the
> passed align as page alignment.
> 
> 
> 	elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
>                                                          max_addr, -1);
> 
> Maybe we should at least mention this in the help text to notice people.

Unfortunately I do not yet understand the concern being raised.

> 
>> +	help
>> +	  For the kexec_file_load path, specify the maximum number of
>> +	  memory regions, eg. as represented by the 'System RAM' entries
>> +	  in /proc/iomem, that the elfcorehdr buffer/segment can accommodate.
>> +	  This value is combined with NR_CPUS and multiplied by Elf64_Phdr
>> +	  size to determine the final buffer size.
>> +
>>   config KEXEC_JUMP
>>   	bool "kexec jump"
>>   	depends on KEXEC && HIBERNATION
>> diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
>> index 8b6bd63530dc..96051d8e4b45 100644
>> --- a/arch/x86/include/asm/crash.h
>> +++ b/arch/x86/include/asm/crash.h
>> @@ -9,4 +9,24 @@ int crash_setup_memmap_entries(struct kimage *image,
>>   		struct boot_params *params);
>>   void crash_smp_send_stop(void);
>>   
>> +void *arch_map_crash_pages(unsigned long paddr, unsigned long size);
>> +#define arch_map_crash_pages arch_map_crash_pages
>> +
>> +void arch_unmap_crash_pages(void **ptr);
>> +#define arch_unmap_crash_pages arch_unmap_crash_pages
>> +
>> +void arch_crash_handle_hotplug_event(struct kimage *image,
>> +		unsigned int hp_action, unsigned int cpu);
>> +#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
>> +
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +static inline int crash_hotplug_cpu_support(void) { return 1; }
>> +#define crash_hotplug_cpu_support crash_hotplug_cpu_support
>> +#endif
>> +
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +static inline int crash_hotplug_memory_support(void) { return 1; }
>> +#define crash_hotplug_memory_support crash_hotplug_memory_support
>> +#endif
>> +
>>   #endif /* _ASM_X86_CRASH_H */
>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>> index 9ceb93c176a6..55dda4fcde6e 100644
>> --- a/arch/x86/kernel/crash.c
>> +++ b/arch/x86/kernel/crash.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/slab.h>
>>   #include <linux/vmalloc.h>
>>   #include <linux/memblock.h>
>> +#include <linux/highmem.h>
>>   
>>   #include <asm/processor.h>
>>   #include <asm/hardirq.h>
>> @@ -397,7 +398,17 @@ int crash_load_segments(struct kimage *image)
>>   	image->elf_headers = kbuf.buffer;
>>   	image->elf_headers_sz = kbuf.bufsz;
>>   
>> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>> +	/* Ensure elfcorehdr segment large enough for hotplug changes */
>> +	kbuf.memsz = (CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr);
> 
> Do we need to break the line to 80 chars?

Sure, I will do so.

> 
>> +	/* For marking as usable to crash kernel */
>> +	image->elf_headers_sz = kbuf.memsz;
> 
> Do we need this code comment?

Well, it did take me a while to figure this particular item out in order for all
this code to work right (else the crash kernel would fail at boot time). So I
think it best to keep this comment.

> 
>> +	/* Record the index of the elfcorehdr segment */
>> +	image->elfcorehdr_index = image->nr_segments;
> 
> And this place?

Not necessarily needed, but I've found it useful.

> 
>> +	image->elfcorehdr_index_valid = true;
>> +#else
>>   	kbuf.memsz = kbuf.bufsz;
>> +#endif
>>   	kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
>>   	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>>   	ret = kexec_add_buffer(&kbuf);
>> @@ -412,3 +423,107 @@ int crash_load_segments(struct kimage *image)
>>   	return ret;
>>   }
>>   #endif /* CONFIG_KEXEC_FILE */
>> +
>> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>> +void *arch_map_crash_pages(unsigned long paddr, unsigned long size)
>> +{
>> +	/*
>> +	 * NOTE: The addresses and sizes passed to this routine have
>> +	 * already been fully aligned on page boundaries. There is no
>> +	 * need for massaging the address or size.
>> +	 */
> 
> Can we move the code comment above function interface?

Yes

> 
>> +	void *ptr = NULL;
>> +
>> +	/* NOTE: requires arch_kexec_[un]protect_crashkres() for write access */
> 
> Do we need this code comment? On ARCH where proctionion is made, we
> surely need to the protect/unprotect.

I will remove this; I've mentioned this in handle_hotplug_event() where these
protect/unprotect functions are called.

> 
>> +	if (size > 0) {
>> +		struct page *page = pfn_to_page(paddr >> PAGE_SHIFT);
>> +
>> +		ptr = kmap_local_page(page);
>> +	}
>> +
>> +	return ptr;
>> +}
>> +
>> +void arch_unmap_crash_pages(void **ptr)
>> +{
>> +	if (ptr) {
>> +		if (*ptr)
>> +			kunmap_local(*ptr);
>> +		*ptr = NULL;
>> +	}
>> +}
>> +
>> +/**
>> + * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
>> + * @image: the active struct kimage
>> + * @hp_action: the hot un/plug action being handled
>> + * @cpu: when KEXEC_CRASH_HP_ADD/REMOVE_CPU, the cpu affected
>> + *
>> + * To accurately reflect hot un/plug changes, the elfcorehdr (which
>> + * is passed to the crash kernel via the elfcorehdr= parameter)
>> + * must be updated with the new list of CPUs and memories. The new
>> + * elfcorehdr is prepared in a kernel buffer, and then it is
>> + * written on top of the existing/old elfcorehdr.
>> + *
>> + * For hotplug changes to elfcorehdr to work, two conditions are
>> + * needed:
>> + * First, the segment containing the elfcorehdr must be large enough
>> + * to permit a growing number of resources. See the
>> + * CONFIG_CRASH_MAX_MEMORY_RANGES description.
>> + * Second, purgatory must explicitly exclude the elfcorehdr from the
>> + * list of segments it checks (since the elfcorehdr changes and thus
>> + * would require an update to purgatory itself to update the digest).
> 
> Isn't this generic concept to crash hotplug? Should we move it out to
> some generic place?

Yes, so I will relocate this.

> 
>> + *
>> + */
>> +void arch_crash_handle_hotplug_event(struct kimage *image,
>> +	unsigned int hp_action, unsigned int cpu)
> 
> The passed in 'cpu' is not used at all, what is it added for? I didn't
> see explanation about it.

Well its not used for x86, but as I recall, Sourabh Jain needed it for the PowerPC handler.

> 
>> +{
>> +	struct kexec_segment *ksegment;
>> +	unsigned char *ptr = NULL;
>> +	unsigned long elfsz = 0;
>> +	void *elfbuf = NULL;
>> +	unsigned long mem, memsz;
>> +
>> +	/*
>> +	 * Elfcorehdr_index_valid checked in crash_core:handle_hotplug_event()
>> +	 */
>> +	ksegment = &image->segment[image->elfcorehdr_index];
>> +	mem = ksegment->mem;
>> +	memsz = ksegment->memsz;
>> +
>> +	/*
>> +	 * Create the new elfcorehdr reflecting the changes to CPU and/or
>> +	 * memory resources.
>> +	 */
>> +	if (prepare_elf_headers(image, &elfbuf, &elfsz)) {
>> +		pr_err("crash hp: unable to prepare elfcore headers");
>> +		goto out;
>> +	}
>> +	if (elfsz > memsz) {
>> +		pr_err("crash hp: update elfcorehdr elfsz %lu > memsz %lu",
>> +			elfsz, memsz);
>> +		goto out;
>> +	}
>> +
>> +	/*
>> +	 * At this point, we are all but assured of success.
>> +	 * Copy new elfcorehdr into destination.
>> +	 */
>> +	ptr = arch_map_crash_pages(mem, memsz);
>> +	if (ptr) {
>> +		/*
>> +		 * Temporarily invalidate the crash image while the
>> +		 * elfcorehdr is updated.
>> +		 */
>> +		xchg(&kexec_crash_image, NULL);
>> +		memcpy_flushcache((void *)ptr, elfbuf, elfsz);
>> +		xchg(&kexec_crash_image, image);
>> +	}
>> +	arch_unmap_crash_pages((void **)&ptr);
>> +	pr_debug("crash hp: re-loaded elfcorehdr at 0x%lx\n", mem);
>> +
>> +out:
>> +	if (elfbuf)
>> +		vfree(elfbuf);
>> +}
>> +#endif
>> -- 
>> 2.31.1
>>
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h
  2022-08-13  0:24           ` Baoquan He
@ 2022-08-16 15:23             ` Eric DeVolder
  0 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-08-16 15:23 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky



On 8/12/22 19:24, Baoquan He wrote:
> On 08/12/22 at 04:23pm, Eric DeVolder wrote:
>>
>>
>> On 8/12/22 04:46, Baoquan He wrote:
>>> On 08/08/22 at 10:18am, Eric DeVolder wrote:
>>>>
>>>>
>>>> On 8/7/22 22:25, Baoquan He wrote:
>>>>> Hi Eric,
>>>>>
>>>>> On 07/21/22 at 02:17pm, Eric DeVolder wrote:
>>>>>> The use of __weak is being eliminated within kexec sources.
>>>>>> The technique uses macros mapped onto inline functions in
>>>>>> order to replace __weak.
>>>>>>
>>>>>> This patchset was using __weak and so in order to replace
>>>>>> __weak, this patch introduces arch/*/asm/crash.h, patterned
>>>>>> after how kexec is moving away from __weak and to the macro
>>>>>> definitions.
>>>>>
>>>>> Are you going to replace __weak in kexec of arll ARCHes? I don't see
>>>>> your point why all these empty header files are introduced. Wondering
>>>>> what's impacted if not adding these empty files?
>>>>
>>>> Hi Baoquan,
>>>> In this patchset, to file include/linux/crash_core.h I added the line #include <asm/crash.h>.
>>>> I patterned this after how include/linux/kexec.h does #include <asm/kexec.h>.
>>>
>>> I am sorry, Eric, it looks not so good. I understand you want to pattern
>>> asm/kexe.h, but we need consider reality. Introducing a dozen of empty
>>> header file and not being able to tell when they will be filled doesn't
>>> make sense.
>>>
>>> Includig <asm/crash.h> where needed is much simpler. I doubt if your way
>>> can pass other reviewers' line. Can you reconsider?
>>
>> If I include <asm/crash.h> where needed, which is kernel/crash_core.c, then
>> the other archs will fail build if that file doesn't exist. A couple of
>> options, which do you think is better to pursue?
>>
>> - use asm/kexec.h instead of asm/crash.h; it appears all the architectures
>> already have this file in place
>>
>> - go ahead and put the appropriate crash macros/inline functions into each
>> arch asm/crash.h so that the files are not just empty, and leave the use of
>> asm/crash.h
> 
> I think we can do this in two steps.
> 
> Firstly, make do with asm/kexec.h since all other ARCHes put crash related stuff
> into asm/kexec.h, except of x86.

Baoquan,
I've made the changes to utilizes asm/kexec.h. For x86, I did have to put what I
previously had in asm/crash.h into asm/kexec.h.

> 
> Secondly, clean up to put those crash marco/inline functions into
> asm/crash.h.

Yes, in looking at kexec.h, there is alot of crash items in there that would make for a good cleanup 
pass...

> 
> The 2nd step can be done in a independent patchset. What do you think?
> 

Yes!
eric

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes
  2022-08-08 10:41   ` Baoquan He
@ 2022-08-16 15:24     ` Eric DeVolder
  0 siblings, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-08-16 15:24 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky



On 8/8/22 05:41, Baoquan He wrote:
> On 07/21/22 at 02:17pm, Eric DeVolder wrote:
>> This introduces the crash_hotplug attribute for memory and CPUs
>> for use by userspace.  This change directly facilitates the udev
>> rule for managing userspace re-loading of the crash kernel upon
>> hot un/plug changes.
>>
>> For memory, this changeset introduces the crash_hotplug attribute
>> to the /sys/devices/system/memory directory. For example:
>>
>>   # udevadm info --attribute-walk /sys/devices/system/memory/memory81
>>    looking at device '/devices/system/memory/memory81':
>>      KERNEL=="memory81"
>>      SUBSYSTEM=="memory"
>>      DRIVER==""
>>      ATTR{online}=="1"
>>      ATTR{phys_device}=="0"
>>      ATTR{phys_index}=="00000051"
>>      ATTR{removable}=="1"
>>      ATTR{state}=="online"
>>      ATTR{valid_zones}=="Movable"
>>
>>    looking at parent device '/devices/system/memory':
>>      KERNELS=="memory"
>>      SUBSYSTEMS==""
>>      DRIVERS==""
>>      ATTRS{auto_online_blocks}=="offline"
>>      ATTRS{block_size_bytes}=="8000000"
>>      ATTRS{crash_hotplug}=="1"
>>
>> For CPUs, this changeset introduces the crash_hotplug attribute
>> to the /sys/devices/system/cpu directory. For example:
>>
>>   # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
>>    looking at device '/devices/system/cpu/cpu0':
>>      KERNEL=="cpu0"
>>      SUBSYSTEM=="cpu"
>>      DRIVER=="processor"
>>      ATTR{crash_notes}=="277c38600"
>>      ATTR{crash_notes_size}=="368"
>>      ATTR{online}=="1"
>>
>>    looking at parent device '/devices/system/cpu':
>>      KERNELS=="cpu"
>>      SUBSYSTEMS==""
>>      DRIVERS==""
>>      ATTRS{crash_hotplug}=="1"
>>      ATTRS{isolated}==""
>>      ATTRS{kernel_max}=="8191"
>>      ATTRS{nohz_full}=="  (null)"
>>      ATTRS{offline}=="4-7"
>>      ATTRS{online}=="0-3"
>>      ATTRS{possible}=="0-7"
>>      ATTRS{present}=="0-3"
>>
>> With these sysfs attributes in place, it is possible to efficiently
>> instruct the udev rule to skip crash kernel reloading.
>>
>> For example, the following is the proposed udev rule change for RHEL
>> system 98-kexec.rules (as the first lines of the rule file):
>>
>>   # The kernel handles updates to crash elfcorehdr for cpu and memory changes
>>   SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>>   SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>>
>> When examined in the context of 98-kexec.rules, the above change
>> tests if crash_hotplug is set, and if so, it skips the userspace
>> initiated unload-then-reload of the crash kernel.
>>
>> Cpu and memory checks are separated in accordance with
>> CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options.
>> If an architecture supports, for example, memory hotplug but not
>> CPU hotplug, then the /sys/devices/system/memory/crash_hotplug
>> attribute file is present, but the /sys/devices/system/cpu/crash_hotplug
>> attribute file will NOT be present. Thus the udev rule will skip
>> userspace processing of memory hot un/plug events, but the udev
>> rule will fail for CPU events, thus allowing userspace to process
>> cpu hot un/plug events (ie the unload-then-reload of the kdump
>> capture kernel).
>>
>> Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
> 
> LGTM,
> 
> Acked-by: Baoquan He <bhe@redhat.com>

Awesome, thank you!
eric

> 
>> ---
>>   .../admin-guide/mm/memory-hotplug.rst          |  8 ++++++++
>>   Documentation/core-api/cpu_hotplug.rst         | 18 ++++++++++++++++++
>>   drivers/base/cpu.c                             | 14 ++++++++++++++
>>   drivers/base/memory.c                          | 13 +++++++++++++
>>   include/linux/crash_core.h                     |  6 ++++++
>>   5 files changed, 59 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
>> index 0f56ecd8ac05..494d7a63c543 100644
>> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
>> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
>> @@ -293,6 +293,14 @@ The following files are currently defined:
>>   		       Availability depends on the CONFIG_ARCH_MEMORY_PROBE
>>   		       kernel configuration option.
>>   ``uevent``	       read-write: generic udev file for device subsystems.
>> +``crash_hotplug``      read-only: when changes to the system memory map
>> +		       occur due to hot un/plug of memory, this file contains
>> +		       '1' if the kernel updates the kdump capture kernel memory
>> +		       map itself (via elfcorehdr), or '0' if userspace must update
>> +		       the kdump capture kernel memory map.
>> +
>> +		       Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
>> +		       configuration option.
>>   ====================== =========================================================
>>   
>>   .. note::
>> diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
>> index c6f4ba2fb32d..13e33d098645 100644
>> --- a/Documentation/core-api/cpu_hotplug.rst
>> +++ b/Documentation/core-api/cpu_hotplug.rst
>> @@ -750,6 +750,24 @@ will receive all events. A script like::
>>   
>>   can process the event further.
>>   
>> +When changes to the CPUs in the system occur, the sysfs file
>> +/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
>> +updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
>> +or '0' if userspace must update the kdump capture kernel list of CPUs.
>> +
>> +The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
>> +option.
>> +
>> +To skip userspace processing of CPU hot un/plug events for kdump
>> +(ie the unload-then-reload to obtain a current list of CPUs), this sysfs
>> +file can be used in a udev rule as follows:
>> +
>> + SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>> +
>> +For a cpu hot un/plug event, if the architecture supports kernel updates
>> +of the elfcorehdr (which contains the list of CPUs), then the rule skips
>> +the unload-then-reload of the kdump capture kernel.
>> +
>>   Kernel Inline Documentations Reference
>>   ======================================
>>   
>> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
>> index 4c98849577d4..bd470236d9a2 100644
>> --- a/drivers/base/cpu.c
>> +++ b/drivers/base/cpu.c
>> @@ -293,6 +293,17 @@ static ssize_t print_cpus_nohz_full(struct device *dev,
>>   static DEVICE_ATTR(nohz_full, 0444, print_cpus_nohz_full, NULL);
>>   #endif
>>   
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +#include <linux/crash_core.h>
>> +static ssize_t crash_hotplug_show(struct device *dev,
>> +				     struct device_attribute *attr,
>> +				     char *buf)
>> +{
>> +	return sprintf(buf, "%d\n", crash_hotplug_cpu_support());
>> +}
>> +static DEVICE_ATTR_ADMIN_RO(crash_hotplug);
>> +#endif
>> +
>>   static void cpu_device_release(struct device *dev)
>>   {
>>   	/*
>> @@ -469,6 +480,9 @@ static struct attribute *cpu_root_attrs[] = {
>>   #ifdef CONFIG_NO_HZ_FULL
>>   	&dev_attr_nohz_full.attr,
>>   #endif
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +	&dev_attr_crash_hotplug.attr,
>> +#endif
>>   #ifdef CONFIG_GENERIC_CPU_AUTOPROBE
>>   	&dev_attr_modalias.attr,
>>   #endif
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index bc60c9cd3230..63c1754a52b6 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -483,6 +483,16 @@ static ssize_t auto_online_blocks_store(struct device *dev,
>>   
>>   static DEVICE_ATTR_RW(auto_online_blocks);
>>   
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +#include <linux/crash_core.h>
>> +static ssize_t crash_hotplug_show(struct device *dev,
>> +				       struct device_attribute *attr, char *buf)
>> +{
>> +	return sprintf(buf, "%d\n", crash_hotplug_memory_support());
>> +}
>> +static DEVICE_ATTR_RO(crash_hotplug);
>> +#endif
>> +
>>   /*
>>    * Some architectures will have custom drivers to do this, and
>>    * will not need to do it from userspace.  The fake hot-add code
>> @@ -887,6 +897,9 @@ static struct attribute *memory_root_attrs[] = {
>>   
>>   	&dev_attr_block_size_bytes.attr,
>>   	&dev_attr_auto_online_blocks.attr,
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +	&dev_attr_crash_hotplug.attr,
>> +#endif
>>   	NULL
>>   };
>>   
>> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
>> index c9705b6872e7..3964e9924ea5 100644
>> --- a/include/linux/crash_core.h
>> +++ b/include/linux/crash_core.h
>> @@ -109,5 +109,11 @@ static inline void arch_crash_handle_hotplug_event(struct kimage *image,
>>   {
>>   }
>>   #endif
>> +#ifndef crash_hotplug_cpu_support
>> +static inline int crash_hotplug_cpu_support(void) { return 0; }
>> +#endif
>> +#ifndef crash_hotplug_memory_support
>> +static inline int crash_hotplug_memory_support(void) { return 0; }
>> +#endif
>>   
>>   #endif /* LINUX_CRASH_CORE_H */
>> -- 
>> 2.31.1
>>
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support
  2022-08-16 15:23     ` Eric DeVolder
@ 2022-08-25 19:42       ` Eric DeVolder
  2022-08-26  4:35       ` Baoquan He
  1 sibling, 0 replies; 25+ messages in thread
From: Eric DeVolder @ 2022-08-25 19:42 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

Hi Baoquan,
I've v11 ready to go, but I did raise some questions below which would be good to resolve before 
posting.
Thanks!
eric

On 8/16/22 10:23, Eric DeVolder wrote:
> 
> 
> On 8/12/22 19:34, Baoquan He wrote:
>> On 07/21/22 at 02:17pm, Eric DeVolder wrote:
>> ...snip....
>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>> index e58798f636d4..bb59596c8bea 100644
>>> --- a/arch/x86/Kconfig
>>> +++ b/arch/x86/Kconfig
>>> @@ -2065,6 +2065,17 @@ config CRASH_DUMP
>>>         (CONFIG_RELOCATABLE=y).
>>>         For more details see Documentation/admin-guide/kdump/kdump.rst
>>> +config CRASH_MAX_MEMORY_RANGES
>>> +    depends on CRASH_DUMP && KEXEC_FILE && (HOTPLUG_CPU || MEMORY_HOTPLUG)
>>> +    int
>>> +    default 32768
>>
>> Do we need to enforce the value with page align and minimal size? I
> 
> Are you asking about the value CRASH_MAX_MEMORY_RANGES? This value represents
> the maximum number of memory ranges, and there Elf64_Phdrs, that we need to
> allow for elfcorehdr memory. So I'm not sure what the concern for alignment
> is. I suppose we could also institute a minimum size for this value, say 1024.
> 
>> checked crash_load_segments() in arch/x86/kernel/crash.c, it does the
>> page size aligning in kexec_add_buffer(). And in
>> load_crashdump_segments() of
>> kexec-tools/kexec/arch/i386/crashdump-x86.c, it creates elfcorehdr at
>> below code, the align is 1024, and in generic add_buffer()
>> implementation, it enforces the memsz page aligned, and changes the
>> passed align as page alignment.
>>
>>
>>     elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
>>                                                          max_addr, -1);
>>
>> Maybe we should at least mention this in the help text to notice people.
> 
> Unfortunately I do not yet understand the concern being raised.
> 
>>
>>> +    help
>>> +      For the kexec_file_load path, specify the maximum number of
>>> +      memory regions, eg. as represented by the 'System RAM' entries
>>> +      in /proc/iomem, that the elfcorehdr buffer/segment can accommodate.
>>> +      This value is combined with NR_CPUS and multiplied by Elf64_Phdr
>>> +      size to determine the final buffer size.
>>> +
>>>   config KEXEC_JUMP
>>>       bool "kexec jump"
>>>       depends on KEXEC && HIBERNATION
>>> diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
>>> index 8b6bd63530dc..96051d8e4b45 100644
>>> --- a/arch/x86/include/asm/crash.h
>>> +++ b/arch/x86/include/asm/crash.h
>>> @@ -9,4 +9,24 @@ int crash_setup_memmap_entries(struct kimage *image,
>>>           struct boot_params *params);
>>>   void crash_smp_send_stop(void);
>>> +void *arch_map_crash_pages(unsigned long paddr, unsigned long size);
>>> +#define arch_map_crash_pages arch_map_crash_pages
>>> +
>>> +void arch_unmap_crash_pages(void **ptr);
>>> +#define arch_unmap_crash_pages arch_unmap_crash_pages
>>> +
>>> +void arch_crash_handle_hotplug_event(struct kimage *image,
>>> +        unsigned int hp_action, unsigned int cpu);
>>> +#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
>>> +
>>> +#ifdef CONFIG_HOTPLUG_CPU
>>> +static inline int crash_hotplug_cpu_support(void) { return 1; }
>>> +#define crash_hotplug_cpu_support crash_hotplug_cpu_support
>>> +#endif
>>> +
>>> +#ifdef CONFIG_MEMORY_HOTPLUG
>>> +static inline int crash_hotplug_memory_support(void) { return 1; }
>>> +#define crash_hotplug_memory_support crash_hotplug_memory_support
>>> +#endif
>>> +
>>>   #endif /* _ASM_X86_CRASH_H */
>>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>>> index 9ceb93c176a6..55dda4fcde6e 100644
>>> --- a/arch/x86/kernel/crash.c
>>> +++ b/arch/x86/kernel/crash.c
>>> @@ -25,6 +25,7 @@
>>>   #include <linux/slab.h>
>>>   #include <linux/vmalloc.h>
>>>   #include <linux/memblock.h>
>>> +#include <linux/highmem.h>
>>>   #include <asm/processor.h>
>>>   #include <asm/hardirq.h>
>>> @@ -397,7 +398,17 @@ int crash_load_segments(struct kimage *image)
>>>       image->elf_headers = kbuf.buffer;
>>>       image->elf_headers_sz = kbuf.bufsz;
>>> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>>> +    /* Ensure elfcorehdr segment large enough for hotplug changes */
>>> +    kbuf.memsz = (CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr);
>>
>> Do we need to break the line to 80 chars?
> 
> Sure, I will do so.
> 
>>
>>> +    /* For marking as usable to crash kernel */
>>> +    image->elf_headers_sz = kbuf.memsz;
>>
>> Do we need this code comment?
> 
> Well, it did take me a while to figure this particular item out in order for all
> this code to work right (else the crash kernel would fail at boot time). So I
> think it best to keep this comment.
> 
>>
>>> +    /* Record the index of the elfcorehdr segment */
>>> +    image->elfcorehdr_index = image->nr_segments;
>>
>> And this place?
> 
> Not necessarily needed, but I've found it useful.
> 
>>
>>> +    image->elfcorehdr_index_valid = true;
>>> +#else
>>>       kbuf.memsz = kbuf.bufsz;
>>> +#endif
>>>       kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
>>>       kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>>>       ret = kexec_add_buffer(&kbuf);
>>> @@ -412,3 +423,107 @@ int crash_load_segments(struct kimage *image)
>>>       return ret;
>>>   }
>>>   #endif /* CONFIG_KEXEC_FILE */
>>> +
>>> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>>> +void *arch_map_crash_pages(unsigned long paddr, unsigned long size)
>>> +{
>>> +    /*
>>> +     * NOTE: The addresses and sizes passed to this routine have
>>> +     * already been fully aligned on page boundaries. There is no
>>> +     * need for massaging the address or size.
>>> +     */
>>
>> Can we move the code comment above function interface?
> 
> Yes
> 
>>
>>> +    void *ptr = NULL;
>>> +
>>> +    /* NOTE: requires arch_kexec_[un]protect_crashkres() for write access */
>>
>> Do we need this code comment? On ARCH where proctionion is made, we
>> surely need to the protect/unprotect.
> 
> I will remove this; I've mentioned this in handle_hotplug_event() where these
> protect/unprotect functions are called.
> 
>>
>>> +    if (size > 0) {
>>> +        struct page *page = pfn_to_page(paddr >> PAGE_SHIFT);
>>> +
>>> +        ptr = kmap_local_page(page);
>>> +    }
>>> +
>>> +    return ptr;
>>> +}
>>> +
>>> +void arch_unmap_crash_pages(void **ptr)
>>> +{
>>> +    if (ptr) {
>>> +        if (*ptr)
>>> +            kunmap_local(*ptr);
>>> +        *ptr = NULL;
>>> +    }
>>> +}
>>> +
>>> +/**
>>> + * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
>>> + * @image: the active struct kimage
>>> + * @hp_action: the hot un/plug action being handled
>>> + * @cpu: when KEXEC_CRASH_HP_ADD/REMOVE_CPU, the cpu affected
>>> + *
>>> + * To accurately reflect hot un/plug changes, the elfcorehdr (which
>>> + * is passed to the crash kernel via the elfcorehdr= parameter)
>>> + * must be updated with the new list of CPUs and memories. The new
>>> + * elfcorehdr is prepared in a kernel buffer, and then it is
>>> + * written on top of the existing/old elfcorehdr.
>>> + *
>>> + * For hotplug changes to elfcorehdr to work, two conditions are
>>> + * needed:
>>> + * First, the segment containing the elfcorehdr must be large enough
>>> + * to permit a growing number of resources. See the
>>> + * CONFIG_CRASH_MAX_MEMORY_RANGES description.
>>> + * Second, purgatory must explicitly exclude the elfcorehdr from the
>>> + * list of segments it checks (since the elfcorehdr changes and thus
>>> + * would require an update to purgatory itself to update the digest).
>>
>> Isn't this generic concept to crash hotplug? Should we move it out to
>> some generic place?
> 
> Yes, so I will relocate this.
> 
>>
>>> + *
>>> + */
>>> +void arch_crash_handle_hotplug_event(struct kimage *image,
>>> +    unsigned int hp_action, unsigned int cpu)
>>
>> The passed in 'cpu' is not used at all, what is it added for? I didn't
>> see explanation about it.
> 
> Well its not used for x86, but as I recall, Sourabh Jain needed it for the PowerPC handler.
> 
>>
>>> +{
>>> +    struct kexec_segment *ksegment;
>>> +    unsigned char *ptr = NULL;
>>> +    unsigned long elfsz = 0;
>>> +    void *elfbuf = NULL;
>>> +    unsigned long mem, memsz;
>>> +
>>> +    /*
>>> +     * Elfcorehdr_index_valid checked in crash_core:handle_hotplug_event()
>>> +     */
>>> +    ksegment = &image->segment[image->elfcorehdr_index];
>>> +    mem = ksegment->mem;
>>> +    memsz = ksegment->memsz;
>>> +
>>> +    /*
>>> +     * Create the new elfcorehdr reflecting the changes to CPU and/or
>>> +     * memory resources.
>>> +     */
>>> +    if (prepare_elf_headers(image, &elfbuf, &elfsz)) {
>>> +        pr_err("crash hp: unable to prepare elfcore headers");
>>> +        goto out;
>>> +    }
>>> +    if (elfsz > memsz) {
>>> +        pr_err("crash hp: update elfcorehdr elfsz %lu > memsz %lu",
>>> +            elfsz, memsz);
>>> +        goto out;
>>> +    }
>>> +
>>> +    /*
>>> +     * At this point, we are all but assured of success.
>>> +     * Copy new elfcorehdr into destination.
>>> +     */
>>> +    ptr = arch_map_crash_pages(mem, memsz);
>>> +    if (ptr) {
>>> +        /*
>>> +         * Temporarily invalidate the crash image while the
>>> +         * elfcorehdr is updated.
>>> +         */
>>> +        xchg(&kexec_crash_image, NULL);
>>> +        memcpy_flushcache((void *)ptr, elfbuf, elfsz);
>>> +        xchg(&kexec_crash_image, image);
>>> +    }
>>> +    arch_unmap_crash_pages((void **)&ptr);
>>> +    pr_debug("crash hp: re-loaded elfcorehdr at 0x%lx\n", mem);
>>> +
>>> +out:
>>> +    if (elfbuf)
>>> +        vfree(elfbuf);
>>> +}
>>> +#endif
>>> -- 
>>> 2.31.1
>>>
>>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support
  2022-08-16 15:23     ` Eric DeVolder
  2022-08-25 19:42       ` Eric DeVolder
@ 2022-08-26  4:35       ` Baoquan He
  1 sibling, 0 replies; 25+ messages in thread
From: Baoquan He @ 2022-08-26  4:35 UTC (permalink / raw)
  To: Eric DeVolder
  Cc: linux-kernel, x86, kexec, ebiederm, dyoung, vgoyal, tglx, mingo,
	bp, dave.hansen, hpa, nramas, thomas.lendacky, robh, efault,
	rppt, david, sourabhjain, konrad.wilk, boris.ostrovsky

On 08/16/22 at 10:23am, Eric DeVolder wrote:
> 
> 
> On 8/12/22 19:34, Baoquan He wrote:
> > On 07/21/22 at 02:17pm, Eric DeVolder wrote:
> > ...snip....
> > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > index e58798f636d4..bb59596c8bea 100644
> > > --- a/arch/x86/Kconfig
> > > +++ b/arch/x86/Kconfig
> > > @@ -2065,6 +2065,17 @@ config CRASH_DUMP
> > >   	  (CONFIG_RELOCATABLE=y).
> > >   	  For more details see Documentation/admin-guide/kdump/kdump.rst
> > > +config CRASH_MAX_MEMORY_RANGES
> > > +	depends on CRASH_DUMP && KEXEC_FILE && (HOTPLUG_CPU || MEMORY_HOTPLUG)
> > > +	int
> > > +	default 32768
> > 
> > Do we need to enforce the value with page align and minimal size? I
> 
> Are you asking about the value CRASH_MAX_MEMORY_RANGES? This value represents
> the maximum number of memory ranges, and there Elf64_Phdrs, that we need to
> allow for elfcorehdr memory. So I'm not sure what the concern for alignment
> is. I suppose we could also institute a minimum size for this value, say 1024.
> 
> > checked crash_load_segments() in arch/x86/kernel/crash.c, it does the
> > page size aligning in kexec_add_buffer(). And in
> > load_crashdump_segments() of
> > kexec-tools/kexec/arch/i386/crashdump-x86.c, it creates elfcorehdr at
> > below code, the align is 1024, and in generic add_buffer()
> > implementation, it enforces the memsz page aligned, and changes the
> > passed align as page alignment.
> > 
> > 
> > 	elfcorehdr = add_buffer(info, tmp, bufsz, memsz, align, min_base,
> >                                                          max_addr, -1);
> > 
> > Maybe we should at least mention this in the help text to notice people.
> 
> Unfortunately I do not yet understand the concern being raised.

Oops, never mind, I misunderstood CRASH_MAX_MEMORY_RANGES. Thought it's
the range vlaue of buffer containing elfcorehdr, I must be dizzy when
reading this part.

> 
> > 
...snip...
> > > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> > > index 9ceb93c176a6..55dda4fcde6e 100644
> > > --- a/arch/x86/kernel/crash.c
> > > +++ b/arch/x86/kernel/crash.c
> > > @@ -25,6 +25,7 @@
> > >   #include <linux/slab.h>
> > >   #include <linux/vmalloc.h>
> > >   #include <linux/memblock.h>
> > > +#include <linux/highmem.h>
> > >   #include <asm/processor.h>
> > >   #include <asm/hardirq.h>
> > > @@ -397,7 +398,17 @@ int crash_load_segments(struct kimage *image)
> > >   	image->elf_headers = kbuf.buffer;
> > >   	image->elf_headers_sz = kbuf.bufsz;
> > > +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
> > > +	/* Ensure elfcorehdr segment large enough for hotplug changes */
> > > +	kbuf.memsz = (CONFIG_NR_CPUS_DEFAULT + CONFIG_CRASH_MAX_MEMORY_RANGES) * sizeof(Elf64_Phdr);
> > 
> > Do we need to break the line to 80 chars?
> 
> Sure, I will do so.
> 
> > 
> > > +	/* For marking as usable to crash kernel */
> > > +	image->elf_headers_sz = kbuf.memsz;
> > 
> > Do we need this code comment?
> 
> Well, it did take me a while to figure this particular item out in order for all
> this code to work right (else the crash kernel would fail at boot time). So I
> think it best to keep this comment.
> 
> > 
> > > +	/* Record the index of the elfcorehdr segment */
> > > +	image->elfcorehdr_index = image->nr_segments;
> > 
> > And this place?
> 
> Not necessarily needed, but I've found it useful.
> 
> > 
> > > +	image->elfcorehdr_index_valid = true;
> > > +#else
> > >   	kbuf.memsz = kbuf.bufsz;
> > > +#endif
> > >   	kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
> > >   	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
> > >   	ret = kexec_add_buffer(&kbuf);
> > > @@ -412,3 +423,107 @@ int crash_load_segments(struct kimage *image)
> > >   	return ret;
> > >   }
> > >   #endif /* CONFIG_KEXEC_FILE */
> > > +
> > > +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
> > > +void *arch_map_crash_pages(unsigned long paddr, unsigned long size)
> > > +{
> > > +	/*
> > > +	 * NOTE: The addresses and sizes passed to this routine have
> > > +	 * already been fully aligned on page boundaries. There is no
> > > +	 * need for massaging the address or size.
> > > +	 */
> > 
> > Can we move the code comment above function interface?
> 
> Yes
> 
> > 
> > > +	void *ptr = NULL;
> > > +
> > > +	/* NOTE: requires arch_kexec_[un]protect_crashkres() for write access */
> > 
> > Do we need this code comment? On ARCH where proctionion is made, we
> > surely need to the protect/unprotect.
> 
> I will remove this; I've mentioned this in handle_hotplug_event() where these
> protect/unprotect functions are called.
> 
> > 
> > > +	if (size > 0) {
> > > +		struct page *page = pfn_to_page(paddr >> PAGE_SHIFT);
> > > +
> > > +		ptr = kmap_local_page(page);
> > > +	}
> > > +
> > > +	return ptr;
> > > +}
> > > +
> > > +void arch_unmap_crash_pages(void **ptr)
> > > +{
> > > +	if (ptr) {
> > > +		if (*ptr)
> > > +			kunmap_local(*ptr);
> > > +		*ptr = NULL;
> > > +	}
> > > +}
> > > +
> > > +/**
> > > + * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
> > > + * @image: the active struct kimage
> > > + * @hp_action: the hot un/plug action being handled
> > > + * @cpu: when KEXEC_CRASH_HP_ADD/REMOVE_CPU, the cpu affected
> > > + *
> > > + * To accurately reflect hot un/plug changes, the elfcorehdr (which
> > > + * is passed to the crash kernel via the elfcorehdr= parameter)
> > > + * must be updated with the new list of CPUs and memories. The new
> > > + * elfcorehdr is prepared in a kernel buffer, and then it is
> > > + * written on top of the existing/old elfcorehdr.
> > > + *
> > > + * For hotplug changes to elfcorehdr to work, two conditions are
> > > + * needed:
> > > + * First, the segment containing the elfcorehdr must be large enough
> > > + * to permit a growing number of resources. See the
> > > + * CONFIG_CRASH_MAX_MEMORY_RANGES description.
> > > + * Second, purgatory must explicitly exclude the elfcorehdr from the
> > > + * list of segments it checks (since the elfcorehdr changes and thus
> > > + * would require an update to purgatory itself to update the digest).
> > 
> > Isn't this generic concept to crash hotplug? Should we move it out to
> > some generic place?
> 
> Yes, so I will relocate this.
> 
> > 
> > > + *
> > > + */
> > > +void arch_crash_handle_hotplug_event(struct kimage *image,
> > > +	unsigned int hp_action, unsigned int cpu)
> > 
> > The passed in 'cpu' is not used at all, what is it added for? I didn't
> > see explanation about it.
> 
> Well its not used for x86, but as I recall, Sourabh Jain needed it for the PowerPC handler.

Then better mention this in log or add code comment, otherwise confusion
could be caused.


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2022-08-26  4:35 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-21 18:17 [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 1/8] crash: introduce arch/*/asm/crash.h Eric DeVolder
2022-08-08  3:25   ` Baoquan He
2022-08-08 15:18     ` Eric DeVolder
2022-08-08 15:44       ` Rob Herring
2022-08-12  9:46       ` Baoquan He
2022-08-12 21:23         ` Eric DeVolder
2022-08-13  0:24           ` Baoquan He
2022-08-16 15:23             ` Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 2/8] crash: move crash_prepare_elf64_headers Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 3/8] crash: prototype change for crash_prepare_elf64_headers Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 4/8] crash: add generic infrastructure for crash hotplug support Eric DeVolder
2022-08-08  9:30   ` Baoquan He
2022-08-08 15:20     ` Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 5/8] kexec: exclude elfcorehdr from the segment digest Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 6/8] kexec: exclude hot remove cpu from elfcorehdr notes Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 7/8] crash: memory and cpu hotplug sysfs attributes Eric DeVolder
2022-08-08 10:41   ` Baoquan He
2022-08-16 15:24     ` Eric DeVolder
2022-07-21 18:17 ` [PATCH v10 8/8] x86/crash: Add x86 crash hotplug support Eric DeVolder
2022-08-13  0:34   ` Baoquan He
2022-08-16 15:23     ` Eric DeVolder
2022-08-25 19:42       ` Eric DeVolder
2022-08-26  4:35       ` Baoquan He
2022-08-04 14:42 ` [PATCH v10 0/8] crash: Kernel handling of CPU and memory hot un/plug Eric DeVolder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).